This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/tvm-site.git
The following commit(s) were added to refs/heads/asf-site by this push:
new e8f6b23997 Build at Wed Oct 22 15:14:45 UTC 2025
e8f6b23997 is described below
commit e8f6b23997db8c0f1a3a6c2905c616dc9e82fd88
Author: tvm-bot <[email protected]>
AuthorDate: Wed Oct 22 15:14:45 2025 +0000
Build at Wed Oct 22 15:14:45 UTC 2025
---
2025/10/21/tvm-ffi.html | 26 ++++++++++++++++++--------
atom.xml | 28 +++++++++++++++++++---------
feed.xml | 28 +++++++++++++++++++---------
rss.xml | 30 ++++++++++++++++++++----------
4 files changed, 76 insertions(+), 36 deletions(-)
diff --git a/2025/10/21/tvm-ffi.html b/2025/10/21/tvm-ffi.html
index 0fb4b19c97..fe9ae696f7 100644
--- a/2025/10/21/tvm-ffi.html
+++ b/2025/10/21/tvm-ffi.html
@@ -146,17 +146,17 @@
</p>
</br>
<div class="post-content">
- <p>We are currently living in an exciting era for AI, where machine
learning systems and infrastructures are crucial for training and deploying
efficient AI models. The modern machine learning systems landscape comes rich
with diverse components, including popular ML frameworks and array libraries
like JAX, PyTorch, and CuPy. It also includes specialized libraries such as
FlashAttention, FlashInfer and cuDNN. Furthermore, there’s a growing trend of
ML compilers and domain-specific [...]
+ <p>We are currently living in an exciting era for AI, where machine
learning systems and infrastructures are crucial for training and deploying
efficient AI models. The modern machine learning systems landscape comes rich
with diverse components, including popular ML frameworks and array libraries
like JAX, PyTorch, and CuPy. It also includes specialized libraries such as
FlashAttention, FlashInfer and cuDNN. Furthermore, there’s a growing trend of
ML compilers and domain-specific [...]
-<p>The exciting growth of the ecosystem is the reason for the fast pace of
innovation in AI today. However, it also presents a significant challenge:
<strong>interoperability</strong>. Many of those components need to integrate
with each other. For example, libraries such as FlashInfer, cuDNN needs to be
integrated into PyTorch, JAX, TensorRT’s runtime system, each may come with
different interface requirements. ML compilers and DSLs also usually expose
Python JIT binding support, while [...]
+<p>The exciting growth of the ecosystem is the reason for today’s fast pace of
innovation in AI. However, it also presents a significant challenge:
<strong>interoperability</strong>. Many of those components need to integrate
with each other. For example, libraries such as FlashInfer and cuDNN need to be
integrated into PyTorch, JAX, and TensorRT’s runtime system, each of which may
come with different interface requirements. ML compilers and DSLs also usually
expose Python JIT binding su [...]
<p><img src="/images/tvm-ffi/interop-challenge.png" alt="image" style="width:
70%; margin: auto; display: block;" /></p>
-<p>The the core of these interoperability challenges are the
<strong>Application Binary Interface (ABI)</strong> and the <strong>Foreign
Function Interface (FFI)</strong>. <strong>ABI</strong> defines how data
structures are stored in memory and precisely what occurs when a function is
called. For instance, the way torch stores Tensors may be different from say
cupy/numpy, so we cannot directly pass a torch.Tensor pointer and its treatment
as a cupy.NDArray. The very nature of machine le [...]
+<p>At the core of these interoperability challenges are the
<strong>Application Binary Interface (ABI)</strong> and the <strong>Foreign
Function Interface (FFI)</strong>. <strong>ABI</strong> defines how data
structures are stored in memory and precisely what occurs when a function is
called. For instance, the way PyTorch stores Tensors may be different from
CuPy/NumPy, so we cannot directly pass a torch.Tensor pointer and treat it as a
cupy.NDArray. The very nature of machine learning a [...]
-<p>All of the above observations call for a <strong>need for ABI and FFI for
the ML systems</strong> use-cases. Looking at the state today, luckily, we do
have something to start with – the C ABI, which every programming language
speaks and remains stable over time. Unfortunately, C only focuses on low-level
data types such as int, float and raw pointers. On the other end of the
spectrum, we know that python is something that must gain first-class support,
but also there is still a need [...]
+<p>All of the above observations point to a <strong>need for an ABI and FFI
for ML systems</strong> use cases. Looking at the current state, luckily, we do
have something to start with – the C ABI, which every programming language
speaks and which remains stable over time. Unfortunately, C only focuses on
low-level data types such as int, float and raw pointers. On the other end of
the spectrum, we know that Python must gain first-class support,
but there is still a need for dif [...]
-<p>This post introduces TVM FFI, an <strong>open ABI and FFI for machine
learning systems</strong>. The project evolved from multiple years of ABI
calling conventions design iterations in the Apache TVM project. We find that
the design can be made generic, independent from the choice of
compiler/language and should benefit the ML systems community. As a result, we
brought into a minimal library built from the ground up with a clear intention
to become an open, standalone library that can [...]
+<p>This post introduces TVM FFI, an <strong>open ABI and FFI for machine
learning systems</strong>. The project evolved from multiple years of ABI
calling-convention design iterations in the Apache TVM project. We find that
the design can be made generic, independent of the choice of compiler/language,
and should benefit the ML systems community. As a result, we built a minimal
library from the ground up with a clear intention to become an open, standalone
library that can be shared and [...]
<ul>
<li><strong>Stable, minimal C ABI</strong> designed for kernels, DSLs, and
runtime extensibility.</li>
@@ -171,11 +171,11 @@
<h2 id="technical-design"><strong>Technical Design</strong></h2>
-<p>To start with, we need a mechanism to store the values that are passing
across machine learning frameworks. It achieves this using a core data
structure called TVMFFIAny. It is a 16 bytes C structure that follows the
design principle of tagged-union</p>
+<p>To start with, we need a mechanism to store the values that are passed
across machine learning frameworks. TVM FFI achieves this using a core data
structure called TVMFFIAny, a 16-byte C structure that follows the design
principle of a tagged union.</p>
<p><img src="/images/tvm-ffi/tvmffiany.png" alt="image" style="width: 50%;
margin: auto; display: block;" /></p>
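To make the tagged-union layout above concrete, here is a minimal C sketch of
what such a 16-byte value can look like. The field names below are
illustrative assumptions for this post, not the project's exact public header:

    #include <stdint.h>

    /* Illustrative 16-byte tagged-union value (field names are
       assumptions, not the exact layout shipped by TVM FFI). */
    typedef struct {
      int32_t type_index;     /* tag: which kind of value is stored */
      int32_t zero_padding;   /* keeps the 8-byte payload aligned */
      union {                 /* payload selected by the tag */
        int64_t v_int64;      /* small POD values stored inline */
        double  v_float64;
        void*   v_ptr;        /* opaque handles, tensor pointers, ... */
      } value;                /* 4 + 4 + 8 = 16 bytes total */
    } ExampleFFIAny;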
-<p>The objects in TVMFFIObject are managed as intrusive pointers, where
TVMFFIObject itself contains the header of the pointer that helps to manage
type information and deletion. This design allows us to use the same type_index
mechanism that allows for the future growth and recognition of new kinds of
objects within the FFI, ensuring extensibility. The standalone deleter ensures
objects can be safely allocated by one source or language and deleted in
another place.</p>
+<p>Objects in TVM FFI are managed as intrusive pointers, where TVMFFIObject
itself serves as the object header that manages type information and deletion.
This design relies on a single type_index mechanism that allows for future
growth and recognition of new kinds of objects within the FFI, ensuring
extensibility. The standalone deleter ensures objects can be safely allocated
by one source or language and deleted in another.</p>
<p><img src="/images/tvm-ffi/tvmffiobject.png" alt="image" style="width: 50%;
margin: auto; display: block;" /></p>
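The intrusive header can be sketched the same way; again, the names below are
assumptions for illustration. The key property is that the header carries the
type_index together with a reference counter and a standalone deleter, so any
language that can call a C function pointer can retain and release the object:

    #include <stddef.h>
    #include <stdint.h>

    struct ExampleFFIObject;  /* illustrative, not the shipped layout */
    typedef void (*FFIDeleter)(struct ExampleFFIObject* self);

    /* Header embedded at the start of every heap object, so the
       object and its bookkeeping live in a single allocation. */
    typedef struct ExampleFFIObject {
      int32_t    type_index;  /* same tag space as the Any value */
      int32_t    ref_count;   /* strong references held anywhere */
      FFIDeleter deleter;     /* frees with the allocator that created
                                 the object, even when the last
                                 reference is dropped elsewhere */
    } ExampleFFIObject;

    static void example_dec_ref(ExampleFFIObject* obj) {
      /* single-threaded sketch; a real design would use atomics */
      if (obj != NULL && --obj->ref_count == 0) obj->deleter(obj);
    }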
@@ -231,7 +231,17 @@ Once DSL integrates with the ABI, we can leverage the same
flow to load back and
<p><img src="/images/tvm-ffi/mydsl.png" alt="image" style="width: 40%; margin:
auto; display: block;" /></p>
-<p>As we can see, the common open ABI foundation offers numerous opportunities
for ML systems to interoperate. We anticipate that this solution can
significantly benefit various aspects of ML systems and AI infrastructure:</p>
+<h2 id="core-design-principle-and-applications">Core Design Principle and
Applications</h2>
+
+<p>Stepping back to the high level, the core design principle of the TVM FFI
+ABI is to decouple the ABI design from the binding itself.
+Most binding generators or connectors focus on point-to-point interop between
+language A and framework B.
+By designing a common ABI foundation, we can transform point-to-point interop
+into a mix-and-match approach, where n languages/frameworks connect to the ABI
+on one side and m DSLs/libraries plug in on the other. The most obvious use
+case is exposing C++ functions to Python, but the same mechanism can expose
+C++ functions to Rust, bring WebAssembly/WebGPU to TypeScript as in the recent
+WebLLM project, or expose DSL-generated kernels to all of these environments.
+The ABI can also serve as a common runtime foundation for compiler-runtime
+co-design in ML compilers and kernel DSLs. These are just some of the
+opportunities we may unblock.
+In summary, the common open ABI foundation offers numerous opportunities for
+ML systems to interoperate. We anticipate that this solution can significantly
+benefit various aspects of ML systems and AI infrastructure:</p>
<ul>
<li><strong>Kernel libraries</strong>: Ship a single package to support
multiple frameworks, Python versions, and different languages.</li>
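To illustrate the mix-and-match point, here is a hedged sketch of the kind of
single C entry point such an ABI can revolve around; the names are invented
for this post, and ExampleFFIAny is the 16-byte value sketched earlier. A
binding written once per language can then reach every library that exports
this signature:

    /* Illustrative calling convention: arguments and the result all
       travel as the 16-byte tagged-union value, so one signature can
       carry tensors, ints, strings, and objects alike. */
    typedef int (*ExampleFFISafeCall)(
        void*                handle,    /* function/closure object */
        const ExampleFFIAny* args,      /* packed arguments */
        int32_t              num_args,
        ExampleFFIAny*       result);   /* out: return value */

Under this shape, n language bindings and m libraries meet at one signature
instead of requiring n times m pairwise bridges.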
diff --git a/atom.xml b/atom.xml
index 0cc512185b..1b28904c60 100644
--- a/atom.xml
+++ b/atom.xml
@@ -4,7 +4,7 @@
<title>TVM</title>
<link href="https://tvm.apache.org" rel="self"/>
<link href="https://tvm.apache.org"/>
- <updated>2025-10-21T20:51:57+00:00</updated>
+ <updated>2025-10-22T15:14:12+00:00</updated>
<id>https://tvm.apache.org</id>
<author>
<name></name>
@@ -17,17 +17,17 @@
<link href="https://tvm.apache.org/2025/10/21/tvm-ffi"/>
<updated>2025-10-21T00:00:00+00:00</updated>
<id>https://tvm.apache.org/2025/10/21/tvm-ffi</id>
- <content type="html"><p>We are currently living in an exciting era
for AI, where machine learning systems and infrastructures are crucial for
training and deploying efficient AI models. The modern machine learning systems
landscape comes rich with diverse components, including popular ML frameworks
and array libraries like JAX, PyTorch, and CuPy. It also includes specialized
libraries such as FlashAttention, FlashInfer and cuDNN. Furthermore, there’s a
growing trend of ML compil [...]
+ <content type="html"><p>We are currently living in an exciting era
for AI, where machine learning systems and infrastructures are crucial for
training and deploying efficient AI models. The modern machine learning systems
landscape comes rich with diverse components, including popular ML frameworks
and array libraries like JAX, PyTorch, and CuPy. It also includes specialized
libraries such as FlashAttention, FlashInfer and cuDNN. Furthermore, there’s a
growing trend of ML compil [...]
-<p>The exciting growth of the ecosystem is the reason for the fast pace
of innovation in AI today. However, it also presents a significant challenge:
<strong>interoperability</strong>. Many of those components need to
integrate with each other. For example, libraries such as FlashInfer, cuDNN
needs to be integrated into PyTorch, JAX, TensorRT’s runtime system, each may
come with different interface requirements. ML compilers and DSLs also usually
expose Python JIT bindi [...]
+<p>The exciting growth of the ecosystem is the reason for today’s fast
pace of innovation in AI. However, it also presents a significant challenge:
<strong>interoperability</strong>. Many of those components need to
integrate with each other. For example, libraries such as FlashInfer and cuDNN
need to be integrated into PyTorch, JAX, and TensorRT’s runtime system, each of
which may come with different interface requirements. ML compilers and DSLs
also usually expose Pyt [...]
<p><img src="/images/tvm-ffi/interop-challenge.png"
alt="image" style="width: 70%; margin: auto; display:
block;" /></p>
-<p>The the core of these interoperability challenges are the
<strong>Application Binary Interface (ABI)</strong> and the
<strong>Foreign Function Interface (FFI)</strong>.
<strong>ABI</strong> defines how data structures are stored in
memory and precisely what occurs when a function is called. For instance, the
way torch stores Tensors may be different from say cupy/numpy, so we cannot
directly pass a torch.Tensor pointer and its treatment as a c [...]
+<p>At the core of these interoperability challenges are the
<strong>Application Binary Interface (ABI)</strong> and the
<strong>Foreign Function Interface (FFI)</strong>.
<strong>ABI</strong> defines how data structures are stored in
memory and precisely what occurs when a function is called. For instance, the
way PyTorch stores Tensors may be different from CuPy/NumPy, so we cannot
directly pass a torch.Tensor pointer and treat it as a cupy.NDAr [...]
-<p>All of the above observations call for a <strong>need for ABI
and FFI for the ML systems</strong> use-cases. Looking at the state
today, luckily, we do have something to start with – the C ABI, which every
programming language speaks and remains stable over time. Unfortunately, C only
focuses on low-level data types such as int, float and raw pointers. On the
other end of the spectrum, we know that python is something that must gain
first-class support, but also ther [...]
+<p>All of the above observations point to a <strong>need for an ABI
and FFI for ML systems</strong> use cases. Looking at the current state,
luckily, we do have something to start with – the C ABI, which every
programming language speaks and which remains stable over time. Unfortunately,
C only focuses on low-level data types such as int, float and raw pointers. On
the other end of the spectrum, we know that Python must gain
first-class support, but there is st [...]
-<p>This post introduces TVM FFI, an <strong>open ABI and FFI for
machine learning systems</strong>. The project evolved from multiple
years of ABI calling conventions design iterations in the Apache TVM project.
We find that the design can be made generic, independent from the choice of
compiler/language and should benefit the ML systems community. As a result, we
brought into a minimal library built from the ground up with a clear intention
to become an open, standalon [...]
+<p>This post introduces TVM FFI, an <strong>open ABI and FFI for
machine learning systems</strong>. The project evolved from multiple
years of ABI calling-convention design iterations in the Apache TVM project.
We find that the design can be made generic, independent of the choice of
compiler/language, and should benefit the ML systems community. As a result,
we built a minimal library from the ground up with a clear intention to become
an open, standalone library that [...]
<ul>
<li><strong>Stable, minimal C ABI</strong> designed for
kernels, DSLs, and runtime extensibility.</li>
@@ -42,11 +42,11 @@
<h2 id="technical-design"><strong>Technical
Design</strong></h2>
-<p>To start with, we need a mechanism to store the values that are
passing across machine learning frameworks. It achieves this using a core data
structure called TVMFFIAny. It is a 16 bytes C structure that follows the
design principle of tagged-union</p>
+<p>To start with, we need a mechanism to store the values that are passed
across machine learning frameworks. TVM FFI achieves this using a core data
structure called TVMFFIAny, a 16-byte C structure that follows the design
principle of a tagged union.</p>
<p><img src="/images/tvm-ffi/tvmffiany.png"
alt="image" style="width: 50%; margin: auto; display:
block;" /></p>
-<p>The objects in TVMFFIObject are managed as intrusive pointers, where
TVMFFIObject itself contains the header of the pointer that helps to manage
type information and deletion. This design allows us to use the same type_index
mechanism that allows for the future growth and recognition of new kinds of
objects within the FFI, ensuring extensibility. The standalone deleter ensures
objects can be safely allocated by one source or language and deleted in
another place.</p>
+<p>Objects in TVM FFI are managed as intrusive pointers, where TVMFFIObject
itself serves as the object header that manages type information and deletion.
This design relies on a single type_index mechanism that allows for future
growth and recognition of new kinds of objects within the FFI, ensuring
extensibility. The standalone deleter ensures objects can be safely allocated
by one source or language and deleted in another.</p>
<p><img src="/images/tvm-ffi/tvmffiobject.png"
alt="image" style="width: 50%; margin: auto; display:
block;" /></p>
@@ -102,7 +102,17 @@ Once DSL integrates with the ABI, we can leverage the same
flow to load back and
<p><img src="/images/tvm-ffi/mydsl.png"
alt="image" style="width: 40%; margin: auto; display:
block;" /></p>
-<p>As we can see, the common open ABI foundation offers numerous
opportunities for ML systems to interoperate. We anticipate that this solution
can significantly benefit various aspects of ML systems and AI
infrastructure:</p>
+<h2 id="core-design-principle-and-applications">Core Design
Principle and Applications</h2>
+
+<p>Stepping back to the high level, the core design principle of the TVM FFI
+ABI is to decouple the ABI design from the binding itself.
+Most binding generators or connectors focus on point-to-point interop between
+language A and framework B.
+By designing a common ABI foundation, we can transform point-to-point interop
+into a mix-and-match approach, where n languages/frameworks connect to the ABI
+on one side and m DSLs/libraries plug in on the other. The most obvious use
+case is exposing C++ functions to Python, but the same mechanism can expose
+C++ functions to Rust, bring WebAssembly/WebGPU to TypeScript as in the recent
+WebLLM project, or expose DSL-generated kernels to all of these environments.
+The ABI can also serve as a common runtime foundation for compiler-runtime
+co-design in ML compilers and kernel DSLs. These are just some of the
+opportunities we may unblock.
+In summary, the common open ABI foundation offers numerous opportunities for
+ML systems to interoperate. We anticipate that this solution can significantly
+benefit various aspects of ML systems and AI infrastructure:</p>
<ul>
<li><strong>Kernel libraries</strong>: Ship a single
package to support multiple frameworks, Python versions, and different
languages.</li>
diff --git a/feed.xml b/feed.xml
index 77489431c4..cbed874716 100644
--- a/feed.xml
+++ b/feed.xml
@@ -1,14 +1,14 @@
-<?xml version="1.0" encoding="utf-8"?><feed
xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/"
version="4.4.1">Jekyll</generator><link href="/feed.xml" rel="self"
type="application/atom+xml" /><link href="/" rel="alternate" type="text/html"
/><updated>2025-10-21T20:51:57+00:00</updated><id>/feed.xml</id><title
type="html">TVM</title><author><name>{"name" =>
nil}</name></author><entry><title type="html">Building an Open ABI and FFI for
ML Systems</tit [...]
+<?xml version="1.0" encoding="utf-8"?><feed
xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/"
version="4.4.1">Jekyll</generator><link href="/feed.xml" rel="self"
type="application/atom+xml" /><link href="/" rel="alternate" type="text/html"
/><updated>2025-10-22T15:14:12+00:00</updated><id>/feed.xml</id><title
type="html">TVM</title><author><name>{"name" =>
nil}</name></author><entry><title type="html">Building an Open ABI and FFI for
ML Systems</tit [...]
-<p>The exciting growth of the ecosystem is the reason for the fast pace of
innovation in AI today. However, it also presents a significant challenge:
<strong>interoperability</strong>. Many of those components need to integrate
with each other. For example, libraries such as FlashInfer, cuDNN needs to be
integrated into PyTorch, JAX, TensorRT’s runtime system, each may come with
different interface requirements. ML compilers and DSLs also usually expose
Python JIT binding support, while [...]
+<p>The exciting growth of the ecosystem is the reason for today’s fast pace of
innovation in AI. However, it also presents a significant challenge:
<strong>interoperability</strong>. Many of those components need to integrate
with each other. For example, libraries such as FlashInfer and cuDNN need to be
integrated into PyTorch, JAX, and TensorRT’s runtime system, each of which may
come with different interface requirements. ML compilers and DSLs also usually
expose Python JIT binding su [...]
<p><img src="/images/tvm-ffi/interop-challenge.png" alt="image" style="width:
70%; margin: auto; display: block;" /></p>
-<p>The the core of these interoperability challenges are the
<strong>Application Binary Interface (ABI)</strong> and the <strong>Foreign
Function Interface (FFI)</strong>. <strong>ABI</strong> defines how data
structures are stored in memory and precisely what occurs when a function is
called. For instance, the way torch stores Tensors may be different from say
cupy/numpy, so we cannot directly pass a torch.Tensor pointer and its treatment
as a cupy.NDArray. The very nature of machine le [...]
+<p>At the core of these interoperability challenges are the
<strong>Application Binary Interface (ABI)</strong> and the <strong>Foreign
Function Interface (FFI)</strong>. <strong>ABI</strong> defines how data
structures are stored in memory and precisely what occurs when a function is
called. For instance, the way PyTorch stores Tensors may be different from
CuPy/NumPy, so we cannot directly pass a torch.Tensor pointer and treat it as a
cupy.NDArray. The very nature of machine learning a [...]
-<p>All of the above observations call for a <strong>need for ABI and FFI for
the ML systems</strong> use-cases. Looking at the state today, luckily, we do
have something to start with – the C ABI, which every programming language
speaks and remains stable over time. Unfortunately, C only focuses on low-level
data types such as int, float and raw pointers. On the other end of the
spectrum, we know that python is something that must gain first-class support,
but also there is still a need [...]
+<p>All of the above observations point to a <strong>need for an ABI and FFI
for ML systems</strong> use cases. Looking at the current state, luckily, we do
have something to start with – the C ABI, which every programming language
speaks and which remains stable over time. Unfortunately, C only focuses on
low-level data types such as int, float and raw pointers. On the other end of
the spectrum, we know that Python must gain first-class support,
but there is still a need for dif [...]
-<p>This post introduces TVM FFI, an <strong>open ABI and FFI for machine
learning systems</strong>. The project evolved from multiple years of ABI
calling conventions design iterations in the Apache TVM project. We find that
the design can be made generic, independent from the choice of
compiler/language and should benefit the ML systems community. As a result, we
brought into a minimal library built from the ground up with a clear intention
to become an open, standalone library that can [...]
+<p>This post introduces TVM FFI, an <strong>open ABI and FFI for machine
learning systems</strong>. The project evolved from multiple years of ABI
calling-convention design iterations in the Apache TVM project. We find that
the design can be made generic, independent of the choice of compiler/language,
and should benefit the ML systems community. As a result, we built a minimal
library from the ground up with a clear intention to become an open, standalone
library that can be shared and [...]
<ul>
<li><strong>Stable, minimal C ABI</strong> designed for kernels, DSLs, and
runtime extensibility.</li>
@@ -23,11 +23,11 @@
<h2 id="technical-design"><strong>Technical Design</strong></h2>
-<p>To start with, we need a mechanism to store the values that are passing
across machine learning frameworks. It achieves this using a core data
structure called TVMFFIAny. It is a 16 bytes C structure that follows the
design principle of tagged-union</p>
+<p>To start with, we need a mechanism to store the values that are passed
across machine learning frameworks. TVM FFI achieves this using a core data
structure called TVMFFIAny, a 16-byte C structure that follows the design
principle of a tagged union.</p>
<p><img src="/images/tvm-ffi/tvmffiany.png" alt="image" style="width: 50%;
margin: auto; display: block;" /></p>
-<p>The objects in TVMFFIObject are managed as intrusive pointers, where
TVMFFIObject itself contains the header of the pointer that helps to manage
type information and deletion. This design allows us to use the same type_index
mechanism that allows for the future growth and recognition of new kinds of
objects within the FFI, ensuring extensibility. The standalone deleter ensures
objects can be safely allocated by one source or language and deleted in
another place.</p>
+<p>Objects in TVM FFI are managed as intrusive pointers, where TVMFFIObject
itself serves as the object header that manages type information and deletion.
This design relies on a single type_index mechanism that allows for future
growth and recognition of new kinds of objects within the FFI, ensuring
extensibility. The standalone deleter ensures objects can be safely allocated
by one source or language and deleted in another.</p>
<p><img src="/images/tvm-ffi/tvmffiobject.png" alt="image" style="width: 50%;
margin: auto; display: block;" /></p>
@@ -83,7 +83,17 @@ Once DSL integrates with the ABI, we can leverage the same
flow to load back and
<p><img src="/images/tvm-ffi/mydsl.png" alt="image" style="width: 40%; margin:
auto; display: block;" /></p>
-<p>As we can see, the common open ABI foundation offers numerous opportunities
for ML systems to interoperate. We anticipate that this solution can
significantly benefit various aspects of ML systems and AI infrastructure:</p>
+<h2 id="core-design-principle-and-applications">Core Design Principle and
Applications</h2>
+
+<p>Stepping back to the high level, the core design principle of the TVM FFI
+ABI is to decouple the ABI design from the binding itself.
+Most binding generators or connectors focus on point-to-point interop between
+language A and framework B.
+By designing a common ABI foundation, we can transform point-to-point interop
+into a mix-and-match approach, where n languages/frameworks connect to the ABI
+on one side and m DSLs/libraries plug in on the other. The most obvious use
+case is exposing C++ functions to Python, but the same mechanism can expose
+C++ functions to Rust, bring WebAssembly/WebGPU to TypeScript as in the recent
+WebLLM project, or expose DSL-generated kernels to all of these environments.
+The ABI can also serve as a common runtime foundation for compiler-runtime
+co-design in ML compilers and kernel DSLs. These are just some of the
+opportunities we may unblock.
+In summary, the common open ABI foundation offers numerous opportunities for
+ML systems to interoperate. We anticipate that this solution can significantly
+benefit various aspects of ML systems and AI infrastructure:</p>
<ul>
<li><strong>Kernel libraries</strong>: Ship a single package to support
multiple frameworks, Python versions, and different languages.</li>
@@ -113,7 +123,7 @@ Please checkout the following resources:</p>
<p>The project draws collective wisdoms of the Machine Learning System
community and python open source ecosystem, including past development insights
of many developers from numpy, PyTorch, JAX, Caffe, mxnet, XGBoost, cuPy,
pybind11, nanobind and more.</p>
-<p>We would specifically like to thank the PyTorch team, JAX team, CUDA python
team, cuteDSL team, cuTile team, Apache TVM community, XGBoost team,
TileLang team, Triton distributed team, FlashInfer team, SGLang community,
TensorRT-LLM community, the vLLM community, for their their insightful
feedbacks.</p>]]></content><author><name>Apache TVM FFI
Community</name></author><summary type="html"><![CDATA[We are currently living
in an exciting era for AI, where machine learning systems [...]
+<p>We would specifically like to thank the PyTorch team, JAX team, CUDA Python
team, cuteDSL team, cuTile team, Apache TVM community, XGBoost team,
TileLang team, Triton distributed team, FlashInfer team, SGLang community,
TensorRT-LLM community, and the vLLM community for their insightful
feedback.</p>]]></content><author><name>Apache TVM FFI
Community</name></author><summary type="html"><![CDATA[We are currently living
in an exciting era for AI, where machine learning systems [...]
<h2 id="boundaries-in-the-modern-ml-system-stack">Boundaries in the Modern ML
System Stack</h2>
diff --git a/rss.xml b/rss.xml
index de2e7e52de..077169b0b3 100644
--- a/rss.xml
+++ b/rss.xml
@@ -5,24 +5,24 @@
<description>TVM - </description>
<link>https://tvm.apache.org</link>
<atom:link href="https://tvm.apache.org" rel="self"
type="application/rss+xml" />
- <lastBuildDate>Tue, 21 Oct 2025 20:51:57 +0000</lastBuildDate>
- <pubDate>Tue, 21 Oct 2025 20:51:57 +0000</pubDate>
+ <lastBuildDate>Wed, 22 Oct 2025 15:14:12 +0000</lastBuildDate>
+ <pubDate>Wed, 22 Oct 2025 15:14:12 +0000</pubDate>
<ttl>60</ttl>
<item>
<title>Building an Open ABI and FFI for ML Systems</title>
- <description><p>We are currently living in an exciting
era for AI, where machine learning systems and infrastructures are crucial for
training and deploying efficient AI models. The modern machine learning systems
landscape comes rich with diverse components, including popular ML frameworks
and array libraries like JAX, PyTorch, and CuPy. It also includes specialized
libraries such as FlashAttention, FlashInfer and cuDNN. Furthermore, there’s a
growing trend of ML c [...]
+ <description><p>We are currently living in an exciting
era for AI, where machine learning systems and infrastructures are crucial for
training and deploying efficient AI models. The modern machine learning systems
landscape comes rich with diverse components, including popular ML frameworks
and array libraries like JAX, PyTorch, and CuPy. It also includes specialized
libraries such as FlashAttention, FlashInfer and cuDNN. Furthermore, there’s a
growing trend of ML c [...]
-<p>The exciting growth of the ecosystem is the reason for the fast pace
of innovation in AI today. However, it also presents a significant challenge:
<strong>interoperability</strong>. Many of those components need to
integrate with each other. For example, libraries such as FlashInfer, cuDNN
needs to be integrated into PyTorch, JAX, TensorRT’s runtime system, each may
come with different interface requirements. ML compilers and DSLs also usually
expose Python JIT bindi [...]
+<p>The exciting growth of the ecosystem is the reason for today’s fast
pace of innovation in AI. However, it also presents a significant challenge:
<strong>interoperability</strong>. Many of those components need to
integrate with each other. For example, libraries such as FlashInfer and cuDNN
need to be integrated into PyTorch, JAX, and TensorRT’s runtime system, each of
which may come with different interface requirements. ML compilers and DSLs
also usually expose Pyt [...]
<p><img src="/images/tvm-ffi/interop-challenge.png"
alt="image" style="width: 70%; margin: auto; display:
block;" /></p>
-<p>The the core of these interoperability challenges are the
<strong>Application Binary Interface (ABI)</strong> and the
<strong>Foreign Function Interface (FFI)</strong>.
<strong>ABI</strong> defines how data structures are stored in
memory and precisely what occurs when a function is called. For instance, the
way torch stores Tensors may be different from say cupy/numpy, so we cannot
directly pass a torch.Tensor pointer and its treatment as a c [...]
+<p>At the core of these interoperability challenges are the
<strong>Application Binary Interface (ABI)</strong> and the
<strong>Foreign Function Interface (FFI)</strong>.
<strong>ABI</strong> defines how data structures are stored in
memory and precisely what occurs when a function is called. For instance, the
way PyTorch stores Tensors may be different from CuPy/NumPy, so we cannot
directly pass a torch.Tensor pointer and treat it as a cupy.NDAr [...]
-<p>All of the above observations call for a <strong>need for ABI
and FFI for the ML systems</strong> use-cases. Looking at the state
today, luckily, we do have something to start with – the C ABI, which every
programming language speaks and remains stable over time. Unfortunately, C only
focuses on low-level data types such as int, float and raw pointers. On the
other end of the spectrum, we know that python is something that must gain
first-class support, but also ther [...]
+<p>All of the above observations point to a <strong>need for an ABI
and FFI for ML systems</strong> use cases. Looking at the current state,
luckily, we do have something to start with – the C ABI, which every
programming language speaks and which remains stable over time. Unfortunately,
C only focuses on low-level data types such as int, float and raw pointers. On
the other end of the spectrum, we know that Python must gain
first-class support, but there is st [...]
-<p>This post introduces TVM FFI, an <strong>open ABI and FFI for
machine learning systems</strong>. The project evolved from multiple
years of ABI calling conventions design iterations in the Apache TVM project.
We find that the design can be made generic, independent from the choice of
compiler/language and should benefit the ML systems community. As a result, we
brought into a minimal library built from the ground up with a clear intention
to become an open, standalon [...]
+<p>This post introduces TVM FFI, an <strong>open ABI and FFI for
machine learning systems</strong>. The project evolved from multiple
years of ABI calling-convention design iterations in the Apache TVM project.
We find that the design can be made generic, independent of the choice of
compiler/language, and should benefit the ML systems community. As a result,
we built a minimal library from the ground up with a clear intention to become
an open, standalone library that [...]
<ul>
<li><strong>Stable, minimal C ABI</strong> designed for
kernels, DSLs, and runtime extensibility.</li>
@@ -37,11 +37,11 @@
<h2 id="technical-design"><strong>Technical
Design</strong></h2>
-<p>To start with, we need a mechanism to store the values that are
passing across machine learning frameworks. It achieves this using a core data
structure called TVMFFIAny. It is a 16 bytes C structure that follows the
design principle of tagged-union</p>
+<p>To start with, we need a mechanism to store the values that are passed
across machine learning frameworks. TVM FFI achieves this using a core data
structure called TVMFFIAny, a 16-byte C structure that follows the design
principle of a tagged union.</p>
<p><img src="/images/tvm-ffi/tvmffiany.png"
alt="image" style="width: 50%; margin: auto; display:
block;" /></p>
-<p>The objects in TVMFFIObject are managed as intrusive pointers, where
TVMFFIObject itself contains the header of the pointer that helps to manage
type information and deletion. This design allows us to use the same type_index
mechanism that allows for the future growth and recognition of new kinds of
objects within the FFI, ensuring extensibility. The standalone deleter ensures
objects can be safely allocated by one source or language and deleted in
another place.</p>
+<p>Objects in TVM FFI are managed as intrusive pointers, where TVMFFIObject
itself serves as the object header that manages type information and deletion.
This design relies on a single type_index mechanism that allows for future
growth and recognition of new kinds of objects within the FFI, ensuring
extensibility. The standalone deleter ensures objects can be safely allocated
by one source or language and deleted in another.</p>
<p><img src="/images/tvm-ffi/tvmffiobject.png"
alt="image" style="width: 50%; margin: auto; display:
block;" /></p>
@@ -97,7 +97,17 @@ Once DSL integrates with the ABI, we can leverage the same
flow to load back and
<p><img src="/images/tvm-ffi/mydsl.png"
alt="image" style="width: 40%; margin: auto; display:
block;" /></p>
-<p>As we can see, the common open ABI foundation offers numerous
opportunities for ML systems to interoperate. We anticipate that this solution
can significantly benefit various aspects of ML systems and AI
infrastructure:</p>
+<h2 id="core-design-principle-and-applications">Core Design
Principle and Applications</h2>
+
+<p>Stepping back to the high level, the core design principle of the TVM FFI
+ABI is to decouple the ABI design from the binding itself.
+Most binding generators or connectors focus on point-to-point interop between
+language A and framework B.
+By designing a common ABI foundation, we can transform point-to-point interop
+into a mix-and-match approach, where n languages/frameworks connect to the ABI
+on one side and m DSLs/libraries plug in on the other. The most obvious use
+case is exposing C++ functions to Python, but the same mechanism can expose
+C++ functions to Rust, bring WebAssembly/WebGPU to TypeScript as in the recent
+WebLLM project, or expose DSL-generated kernels to all of these environments.
+The ABI can also serve as a common runtime foundation for compiler-runtime
+co-design in ML compilers and kernel DSLs. These are just some of the
+opportunities we may unblock.
+In summary, the common open ABI foundation offers numerous opportunities for
+ML systems to interoperate. We anticipate that this solution can significantly
+benefit various aspects of ML systems and AI infrastructure:</p>
<ul>
<li><strong>Kernel libraries</strong>: Ship a single
package to support multiple frameworks, Python versions, and different
languages.</li>