This is an automated email from the ASF dual-hosted git repository.
tqchen pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm-site.git
The following commit(s) were added to refs/heads/main by this push:
new a17b5d99c7 update
a17b5d99c7 is described below
commit a17b5d99c7e540bf553a686bc6544c00f21d9d51
Author: tqchen <[email protected]>
AuthorDate: Wed Oct 22 08:13:48 2025 -0700
update
---
_posts/2025-10-21-tvm-ffi.md | 26 ++++++++++++++++++--------
1 file changed, 18 insertions(+), 8 deletions(-)
diff --git a/_posts/2025-10-21-tvm-ffi.md b/_posts/2025-10-21-tvm-ffi.md
index c6e6252f81..98a0d824bc 100644
--- a/_posts/2025-10-21-tvm-ffi.md
+++ b/_posts/2025-10-21-tvm-ffi.md
@@ -7,17 +7,17 @@
-We are currently living in an exciting era for AI, where machine learning
systems and infrastructures are crucial for training and deploying efficient AI
models. The modern machine learning systems landscape comes rich with diverse
components, including popular ML frameworks and array libraries like JAX,
PyTorch, and CuPy. It also includes specialized libraries such as
FlashAttention, FlashInfer and cuDNN. Furthermore, there's a growing trend of
ML compilers and domain-specific languages [...]
+We are currently living in an exciting era for AI, where machine learning
systems and infrastructure are crucial for training and deploying efficient AI
models. The modern machine learning systems landscape is rich with diverse
components, including popular ML frameworks and array libraries like JAX,
PyTorch, and CuPy. It also includes specialized libraries such as
FlashAttention, FlashInfer, and cuDNN. Furthermore, there's a growing trend of
ML compilers and domain-specific languages [...]
-The exciting growth of the ecosystem is the reason for the fast pace of
innovation in AI today. However, it also presents a significant challenge:
**interoperability**. Many of those components need to integrate with each
other. For example, libraries such as FlashInfer, cuDNN needs to be integrated
into PyTorch, JAX, TensorRT’s runtime system, each may come with different
interface requirements. ML compilers and DSLs also usually expose Python JIT
binding support, while also need to bri [...]
+The exciting growth of the ecosystem is the reason for today's fast pace of
innovation in AI. However, it also presents a significant challenge:
**interoperability**. Many of those components need to integrate with each
other. For example, libraries such as FlashInfer and cuDNN need to be
integrated into PyTorch, JAX, and TensorRT's runtime system, each of which may
come with different interface requirements. ML compilers and DSLs also usually
expose Python JIT binding support, while als [...]
{: style="width: 70%; margin:
auto; display: block;" }
-The the core of these interoperability challenges are the **Application Binary
Interface (ABI)** and the **Foreign Function Interface (FFI)**. **ABI** defines
how data structures are stored in memory and precisely what occurs when a
function is called. For instance, the way torch stores Tensors may be different
from say cupy/numpy, so we cannot directly pass a torch.Tensor pointer and its
treatment as a cupy.NDArray. The very nature of machine learning applications
usually mandates cross [...]
+At the core of these interoperability challenges are the **Application Binary
Interface (ABI)** and the **Foreign Function Interface (FFI)**. **ABI** defines
how data structures are stored in memory and precisely what occurs when a
function is called. For instance, the way PyTorch stores Tensors may be
different from CuPy/NumPy, so we cannot directly pass a torch.Tensor pointer
and treat it as a cupy.NDArray. The very nature of machine learning
applications usually mandates cross-languag [...]
-All of the above observations call for a **need for ABI and FFI for the ML
systems** use-cases. Looking at the state today, luckily, we do have something
to start with – the C ABI, which every programming language speaks and remains
stable over time. Unfortunately, C only focuses on low-level data types such as
int, float and raw pointers. On the other end of the spectrum, we know that
python is something that must gain first-class support, but also there is still
a need for different-la [...]
+All of the above observations call for a **need for ABI and FFI for ML
systems** use cases. Looking at the current state, luckily, we do have
something to start with – the C ABI, which every programming language speaks
and which remains stable over time. Unfortunately, C only covers low-level
data types such as int, float, and raw pointers. On the other end of the
spectrum, we know that Python must gain first-class support, but there is
still a need for different-language [...]
-This post introduces TVM FFI, an **open ABI and FFI for machine learning
systems**. The project evolved from multiple years of ABI calling conventions
design iterations in the Apache TVM project. We find that the design can be
made generic, independent from the choice of compiler/language and should
benefit the ML systems community. As a result, we brought into a minimal
library built from the ground up with a clear intention to become an open,
standalone library that can be shared and e [...]
+This post introduces TVM FFI, an **open ABI and FFI for machine learning
systems**. The project evolved from multiple years of ABI calling convention
design iterations in the Apache TVM project. We find that the design can be
made generic, independent of the choice of compiler/language, and should
benefit the ML systems community. As a result, we built a minimal library from
the ground up with the clear intention of making it an open, standalone
library that can be shared and evolved together [...]
- **Stable, minimal C ABI** designed for kernels, DSLs, and runtime
extensibility.
- **Zero-copy interop** across PyTorch, JAX, and CuPy using [DLPack
protocol](https://data-apis.org/array-api/2024.12/design_topics/data_interchange.html).
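To make the zero-copy claim concrete, here is a minimal sketch of a DLPack exchange using NumPy on both ends (assuming NumPy >= 1.22 for `np.from_dlpack`); PyTorch, JAX, and CuPy implement the same `__dlpack__` protocol, which is what lets tensors cross framework boundaries without copying.

```python
# Minimal sketch of DLPack zero-copy exchange, using NumPy on both ends
# (requires NumPy >= 1.22 for np.from_dlpack). PyTorch, JAX, and CuPy
# implement the same __dlpack__ protocol.
import numpy as np

a = np.arange(6, dtype=np.float32)
b = np.from_dlpack(a)  # imports via a.__dlpack__(); no data copy happens

# Both arrays alias the same underlying buffer.
assert np.shares_memory(a, b)
assert b.dtype == np.float32 and b.shape == (6,)
```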
@@ -31,13 +31,13 @@ Importantly, the goal of the project is not to create
another framework or langu
## **Technical Design**
-To start with, we need a mechanism to store the values that are passing across
machine learning frameworks. It achieves this using a core data structure
called TVMFFIAny. It is a 16 bytes C structure that follows the design
principle of tagged-union
+To start with, we need a mechanism to store the values that are passed across
machine learning frameworks. TVM FFI achieves this using a core data structure
called TVMFFIAny, a 16-byte C structure that follows the design principle of a
tagged union.
{: style="width: 50%; margin: auto;
display: block;" }
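As an illustration, the 16-byte tagged-union layout described above can be sketched with ctypes; the field names here are hypothetical stand-ins, not the actual TVMFFIAny definition from the TVM FFI headers.

```python
# Hypothetical ctypes sketch of a 16-byte tagged-union value, modeled on
# the TVMFFIAny description above; field names are illustrative only.
import ctypes

class Payload(ctypes.Union):  # 8-byte payload slot
    _fields_ = [("v_int64", ctypes.c_int64),
                ("v_float64", ctypes.c_double),
                ("v_ptr", ctypes.c_void_p)]

class AnySketch(ctypes.Structure):
    _fields_ = [("type_index", ctypes.c_int32),  # tag: which kind of value is stored
                ("padding", ctypes.c_int32),     # reserved, keeps the payload 8-byte aligned
                ("payload", Payload)]

# The whole value fits in 16 bytes, so it can be passed cheaply by value.
assert ctypes.sizeof(AnySketch) == 16

x = AnySketch(type_index=1)
x.payload.v_int64 = 42
assert x.payload.v_int64 == 42
```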
-The objects in TVMFFIObject are managed as intrusive pointers, where
TVMFFIObject itself contains the header of the pointer that helps to manage
type information and deletion. This design allows us to use the same type_index
mechanism that allows for the future growth and recognition of new kinds of
objects within the FFI, ensuring extensibility. The standalone deleter ensures
objects can be safely allocated by one source or language and deleted in
another place.
+The objects in TVMFFIObject are managed as intrusive pointers, where
TVMFFIObject itself serves as the object header that manages type information
and deletion. This design lets us use the same type_index mechanism to
recognize new kinds of objects within the FFI in the future, ensuring
extensibility. The standalone deleter ensures objects can be safely allocated
by one source or language and deleted in another.
{: style="width: 50%; margin: auto;
display: block;" }
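The intrusive-pointer idea can be sketched in a few lines of Python: the header carries a type index, a reference count, and a deleter supplied by whoever allocated the object, so deallocation happens in the right place even when the last reference is dropped elsewhere. The names below are illustrative, not the actual TVM FFI API.

```python
# Hypothetical sketch of an intrusive object header with a stored deleter,
# following the TVMFFIObject description above; names are illustrative only.
class ObjHeaderSketch:
    def __init__(self, type_index, deleter):
        self.type_index = type_index  # identifies the kind of object
        self.ref_count = 1            # intrusive reference count
        self.deleter = deleter        # set by whoever allocated the object

    def inc_ref(self):
        self.ref_count += 1

    def dec_ref(self):
        self.ref_count -= 1
        if self.ref_count == 0:
            # The stored deleter frees the object in the module/language that
            # allocated it, even if the last dec_ref happens elsewhere.
            self.deleter(self)

freed = []
obj = ObjHeaderSketch(type_index=7, deleter=lambda o: freed.append(o.type_index))
obj.inc_ref()
obj.dec_ref()          # still one reference held
obj.dec_ref()          # count hits zero; deleter runs
assert freed == [7]
```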
@@ -97,8 +97,18 @@ Once DSL integrates with the ABI, we can leverage the same
flow to load back and
{: style="width: 40%; margin: auto;
display: block;" }
+## Core Design Principle and Applications
-As we can see, the common open ABI foundation offers numerous opportunities
for ML systems to interoperate. We anticipate that this solution can
significantly benefit various aspects of ML systems and AI infrastructure:
+
+Coming back to the high level, the core design principle of the TVM FFI ABI is
to decouple the ABI design from the binding itself.
+Most binding generators and connectors focus on point-to-point interop between
language A and framework B.
+By designing a common ABI foundation, we can transform point-to-point interop
into a mix-and-match approach, where
+n languages/frameworks connect to the ABI, which in turn connects to another m
DSLs/libraries. The most obvious use case
+is to expose C++ functions to Python; but we can also use the same mechanism
to expose C++ functions to Rust;
+the ABI helps expose WebAssembly/WebGPU to TypeScript in the recent WebLLM
project,
+or expose DSL-generated kernels to these environments. The ABI can also serve
as a common runtime foundation for compiler-runtime
+co-design in ML compilers and kernel DSLs. These are just some of the
opportunities we may unblock.
+In summary, the common open ABI foundation offers numerous opportunities for
ML systems to interoperate. We anticipate that this solution can significantly
benefit various aspects of ML systems and AI infrastructure:
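The mix-and-match idea can be sketched roughly as follows: every function is exposed through one ABI-stable entry signature, so each of the n frontends and m backends only has to target that single convention instead of n×m pairwise bridges. This is a hedged illustration using a simplified double-only signature, not the actual TVM FFI calling convention.

```python
# Rough sketch of a single shared calling convention: any language that can
# produce or consume a C function pointer with this signature can interoperate.
# The signature here is simplified (doubles only) and illustrative.
import ctypes

# One ABI-stable signature every function exposes:
#   int func(args_ptr, num_args, result_ptr)  -> 0 on success
SAFE_CALL = ctypes.CFUNCTYPE(ctypes.c_int,
                             ctypes.POINTER(ctypes.c_double),
                             ctypes.c_int,
                             ctypes.POINTER(ctypes.c_double))

def _add(args, num_args, result):
    result[0] = sum(args[i] for i in range(num_args))
    return 0

add_fn = SAFE_CALL(_add)  # in practice this pointer could come from any language

args = (ctypes.c_double * 2)(1.5, 2.5)
out = ctypes.c_double()
status = add_fn(args, 2, ctypes.byref(out))
assert status == 0
assert out.value == 4.0
```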
* **Kernel libraries**: Ship a single package to support multiple frameworks,
Python versions, and different languages.
* **Kernel DSLs**: A reusable ABI for exposing JIT- and AOT-compiled kernels
to frameworks and runtimes.