MeshTaichi: A Compiler for Efficient Mesh-based Operations

Chang Yu UESTC Taichi Graphics

Yi Xu Tsinghua University Taichi Graphics

Ye Kuang Taichi Graphics

Yuanming Hu Taichi Graphics

Tiantian Liu Taichi Graphics

Top: We propose MeshTaichi, a high-level programming model for handling mesh-based operations efficiently. By exploiting data locality and the memory hierarchy, our compiler generates high-performance CPU/GPU computational kernels. As a result, we achieve much better performance than state-of-the-art compilers and data structures. Bottom: We implement the MLS-MPM algorithm with FEM-based Lagrangian forces using our programming language. The simulation consists of 17,010 armadillos with 222,048,540 vertices and 713,739,600 tetrahedrons in total. Each frame consists of 300 substeps and takes 2.4 minutes on average on an NVIDIA A100 Tensor Core GPU.

Abstract

Meshes are an indispensable representation in many graphics applications because they provide conformal spatial discretizations. However, mesh-based operations are often slow due to unstructured memory access patterns. We propose MeshTaichi, a novel mesh compiler that provides an intuitive programming model for efficient mesh-based operations. Our programming model hides the complex indexing system from users and allows users to write mesh-based operations using reference-style neighborhood queries. Our compiler achieves its high performance by exploiting data locality. We partition input meshes and prepare the required relations by inspecting users' code at compile time. At run time, we further utilize on-chip memory (shared memory on GPU and L1 cache on CPU) to access the required attributes of mesh elements efficiently. Our compiler decouples low-level optimization options from computations, so that users can explore different localized data attributes and different memory orderings without changing their computation code. As a result, users can write concise code using our programming model to generate efficient mesh-based computations on both CPU and GPU backends. We test MeshTaichi on a variety of physically-based simulation and geometry processing applications with both triangle and tetrahedron meshes. MeshTaichi achieves a consistent speedup ranging from 1.4 times to 6 times, compared to state-of-the-art mesh data structures and compilers.
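To make the idea of a neighborhood query concrete, the following is a minimal sketch in plain Python. It is not MeshTaichi code and does not use the MeshTaichi API; the toy mesh and the helper names (`build_vertex_to_cell`, `triangle_area`, `vertex_areas`) are hypothetical and chosen only for illustration. It shows the kind of relation (here vertex-to-cell, derived from cell-to-vertex) that a mesh-based operation typically needs, and a per-vertex computation written in terms of "the cells around this vertex" rather than hand-managed index arithmetic.

```python
from collections import defaultdict

# Toy mesh (assumption for illustration): two triangles sharing the edge (1, 2).
# cells[c] lists the vertex indices of cell c, i.e. the cell-to-vertex relation.
cells = [(0, 1, 2), (1, 3, 2)]
positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]


def build_vertex_to_cell(cells):
    """Invert the cell-to-vertex relation into a vertex-to-cell relation."""
    v2c = defaultdict(list)
    for c, verts in enumerate(cells):
        for v in verts:
            v2c[v].append(c)
    return dict(v2c)


def triangle_area(cell):
    """Area of a 2D triangle given by three vertex indices."""
    (x0, y0), (x1, y1), (x2, y2) = (positions[v] for v in cell)
    return 0.5 * abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))


def vertex_areas(cells, v2c):
    """Give each vertex one third of the area of every incident triangle."""
    return {
        v: sum(triangle_area(cells[c]) for c in incident) / 3.0
        for v, incident in v2c.items()
    }


v2c = build_vertex_to_cell(cells)
areas = vertex_areas(cells, v2c)
```

In this sketch the relation is built eagerly and stored in a dictionary. The abstract describes a different strategy: the compiler inspects user code to decide which relations are needed and keeps the corresponding data in on-chip memory per mesh partition, which this plain-Python version does not attempt to model.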

Recorded Full Talk

Publication

Chang Yu(*), Yi Xu(*), Ye Kuang, Yuanming Hu, Tiantian Liu. MeshTaichi: A Compiler for Efficient Mesh-based Operations. ACM Transactions on Graphics 41(6) [Proceedings of SIGGRAPH Asia], 2022.

Links and Downloads

Paper

BibTeX

Project Page

Acknowledgements

We thank Mingrui Zhang and Chuqiao Zhou for early-stage brainstorming, Xiuqi Yang for performance profiling, Jihua Liu and Shumu Xu for providing the OpenVDB-reconstructed mesh, Yun (Raymond) Fei and Ming Gao for discussions on atomic operations, Ahmed H. Mahmoud for answering our questions about RXMesh, and Haidong Lan and Bo Qiao for proofreading. We also thank the anonymous reviewers for their constructive feedback.