This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/tvm-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new e8f6b23997 Build at Wed Oct 22 15:14:45 UTC 2025
e8f6b23997 is described below

commit e8f6b23997db8c0f1a3a6c2905c616dc9e82fd88
Author: tvm-bot <[email protected]>
AuthorDate: Wed Oct 22 15:14:45 2025 +0000

    Build at Wed Oct 22 15:14:45 UTC 2025
---
 2025/10/21/tvm-ffi.html | 26 ++++++++++++++++++--------
 atom.xml                | 28 +++++++++++++++++++---------
 feed.xml                | 28 +++++++++++++++++++---------
 rss.xml                 | 30 ++++++++++++++++++++----------
 4 files changed, 76 insertions(+), 36 deletions(-)

diff --git a/2025/10/21/tvm-ffi.html b/2025/10/21/tvm-ffi.html
index 0fb4b19c97..fe9ae696f7 100644
--- a/2025/10/21/tvm-ffi.html
+++ b/2025/10/21/tvm-ffi.html
@@ -146,17 +146,17 @@
         </p>
     </br>
     <div class="post-content">
-      <p>We are currently living in an exciting era for AI, where machine 
learning systems and infrastructures are crucial for training and deploying 
efficient AI models. The modern machine learning systems landscape comes rich 
with diverse components, including popular ML frameworks and array libraries 
like JAX, PyTorch, and CuPy. It also includes specialized libraries such as 
FlashAttention, FlashInfer and cuDNN. Furthermore, there’s a growing trend of 
ML compilers and domain-specific  [...]
+      <p>We are currently living in an exciting era for AI, where machine learning systems and infrastructures are crucial for training and deploying efficient AI models. The modern machine learning systems landscape is rich with diverse components, including popular ML frameworks and array libraries like JAX, PyTorch, and CuPy. It also includes specialized libraries such as FlashAttention, FlashInfer, and cuDNN. Furthermore, there’s a growing trend of ML compilers and domain-specific  [...]
 
-<p>The exciting growth of the ecosystem is the reason for the fast pace of 
innovation in AI today. However, it also presents a significant challenge: 
<strong>interoperability</strong>. Many of those components need to integrate 
with each other. For example, libraries such as FlashInfer, cuDNN needs to be 
integrated into PyTorch, JAX, TensorRT’s runtime system, each may come with 
different interface requirements. ML compilers and DSLs also usually expose 
Python JIT binding support, while  [...]
+<p>The exciting growth of the ecosystem is the reason for today’s fast pace of 
innovation in AI. However, it also presents a significant challenge: 
<strong>interoperability</strong>. Many of those components need to integrate 
with each other. For example, libraries such as FlashInfer and cuDNN need to be 
integrated into PyTorch, JAX, and TensorRT’s runtime system, each of which may 
come with different interface requirements. ML compilers and DSLs also usually 
expose Python JIT binding su [...]
 
 <p><img src="/images/tvm-ffi/interop-challenge.png" alt="image" style="width: 
70%; margin: auto; display: block;" /></p>
 
-<p>The the core of these interoperability challenges are the 
<strong>Application Binary Interface (ABI)</strong> and the <strong>Foreign 
Function Interface (FFI)</strong>. <strong>ABI</strong> defines how data 
structures are stored in memory and precisely what occurs when a function is 
called. For instance, the way torch stores Tensors may be different from say 
cupy/numpy, so we cannot directly pass a torch.Tensor pointer and its treatment 
as a cupy.NDArray. The very nature of machine le [...]
+<p>At the core of these interoperability challenges are the 
<strong>Application Binary Interface (ABI)</strong> and the <strong>Foreign 
Function Interface (FFI)</strong>. <strong>ABI</strong> defines how data 
structures are stored in memory and precisely what occurs when a function is 
called. For instance, the way PyTorch stores Tensors may be different from 
CuPy/NumPy, so we cannot directly pass a torch.Tensor pointer and treat it as a 
cupy.NDArray. The very nature of machine learning a [...]
 
-<p>All of the above observations call for a <strong>need for ABI and FFI for 
the ML systems</strong> use-cases. Looking at the state today, luckily, we do 
have something to start with – the C ABI, which every programming language 
speaks and remains stable over time. Unfortunately, C only focuses on low-level 
data types such as int, float and raw pointers. On the other end of the 
spectrum, we know that python is something that must gain first-class support, 
but also there is still a need  [...]
+<p>All of the above observations call for a <strong>need for ABI and FFI for 
ML systems</strong> use cases. Looking at the current state, luckily, we do 
have something to start with – the C ABI, which every programming language 
speaks and remains stable over time. Unfortunately, C only focuses on low-level 
data types such as int, float and raw pointers. On the other end of the 
spectrum, we know that Python is something that must gain first-class support, 
but there is still a need for dif [...]
 
-<p>This post introduces TVM FFI, an <strong>open ABI and FFI for machine 
learning systems</strong>. The project evolved from multiple years of ABI 
calling conventions design iterations in the Apache TVM project. We find that 
the design can be made generic, independent from the choice of 
compiler/language and should benefit the ML systems community. As a result, we 
brought into a minimal library built from the ground up with a clear intention 
to become an open, standalone library that can [...]
+<p>This post introduces TVM FFI, an <strong>open ABI and FFI for machine 
learning systems</strong>. The project evolved from multiple years of ABI 
calling conventions design iterations in the Apache TVM project. We find that 
the design can be made generic, independent of the choice of compiler/language 
and should benefit the ML systems community. As a result, we built a minimal 
library from the ground up with a clear intention to become an open, standalone 
library that can be shared and  [...]
 
 <ul>
   <li><strong>Stable, minimal C ABI</strong> designed for kernels, DSLs, and 
runtime extensibility.</li>
@@ -171,11 +171,11 @@
 
 <h2 id="technical-design"><strong>Technical Design</strong></h2>
 
-<p>To start with, we need a mechanism to store the values that are passing 
across machine learning frameworks. It achieves this using a core data 
structure called TVMFFIAny. It is a 16 bytes C structure that follows the 
design principle of tagged-union</p>
+<p>To start with, we need a mechanism to store the values that are passed across machine learning frameworks. TVM FFI achieves this using a core data structure called TVMFFIAny, a 16-byte C structure that follows the design principle of a tagged union:</p>
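As a rough illustration, a 16-byte tagged-union value like the one described can be mimicked with Python's `ctypes`; the field names below are invented for this sketch and are not taken from the actual TVM FFI headers.

```python
import ctypes

# Hypothetical sketch of a 16-byte tagged-union value in the spirit of
# TVMFFIAny. Field names are illustrative; consult the real TVM FFI
# headers for the actual layout.
class _Payload(ctypes.Union):
    _fields_ = [
        ("v_int64", ctypes.c_int64),     # small POD values live inline
        ("v_float64", ctypes.c_double),
        ("v_ptr", ctypes.c_void_p),      # heap objects referenced by pointer
    ]

class AnyValue(ctypes.Structure):
    _fields_ = [
        ("type_index", ctypes.c_int32),  # the tag: which kind of value is stored
        ("_reserved", ctypes.c_int32),   # padding keeps the payload 8-byte aligned
        ("payload", _Payload),           # 8-byte payload
    ]

assert ctypes.sizeof(AnyValue) == 16     # matches the stated 16-byte size

v = AnyValue(type_index=1)
v.payload.v_int64 = 42                   # store an int, tagged by type_index
```

A callee reads `type_index` first and then interprets the 8-byte payload accordingly, which is what makes the layout usable from any language that speaks the C ABI.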
 
 <p><img src="/images/tvm-ffi/tvmffiany.png" alt="image" style="width: 50%; 
margin: auto; display: block;" /></p>
 
-<p>The objects in TVMFFIObject are managed as intrusive pointers, where 
TVMFFIObject itself contains the header of the pointer that helps to manage 
type information and deletion. This design allows us to use the same type_index 
mechanism that allows for the future growth and recognition of new kinds of 
objects within the FFI, ensuring extensibility. The standalone deleter ensures 
objects can be safely allocated by one source or language and deleted in 
another place.</p>
+<p>The objects in TVMFFIObject are managed as intrusive pointers, where TVMFFIObject itself contains the header of the pointer that helps manage type information and deletion. This design lets us use a single type_index mechanism that supports future growth and recognition of new kinds of objects within the FFI, ensuring extensibility. The standalone deleter ensures objects can be safely allocated by one source or language and deleted in another place.</p>
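The lifetime rules can be sketched in plain Python; the class below is an invented stand-in for the C header, meant only to show how an intrusive count plus a standalone deleter decouples allocation from release.

```python
# Toy model of an intrusive object header: the object itself carries its
# type index, reference count, and the deleter to run when the count hits
# zero. Names are illustrative, not the real TVM FFI API.
class IntrusiveObject:
    def __init__(self, type_index, deleter):
        self.type_index = type_index  # identifies the kind of object
        self.ref_count = 1            # intrusive reference count
        self.deleter = deleter        # supplied by whoever allocated the object

    def inc_ref(self):
        self.ref_count += 1

    def dec_ref(self):
        self.ref_count -= 1
        if self.ref_count == 0:
            # Because the deleter travels with the object, the final
            # release can happen in a different language or library than
            # the one that allocated it.
            self.deleter(self)

deleted = []
obj = IntrusiveObject(type_index=7, deleter=deleted.append)
obj.inc_ref()   # a second owner appears
obj.dec_ref()   # first owner releases; count back to 1, not deleted yet
obj.dec_ref()   # last owner releases; deleter runs
assert deleted == [obj]
```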
 
 <p><img src="/images/tvm-ffi/tvmffiobject.png" alt="image" style="width: 50%; 
margin: auto; display: block;" /></p>
 
@@ -231,7 +231,17 @@ Once DSL integrates with the ABI, we can leverage the same 
flow to load back and
 
 <p><img src="/images/tvm-ffi/mydsl.png" alt="image" style="width: 40%; margin: 
auto; display: block;" /></p>
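In spirit, loading a DSL's output back reduces to resolving functions by name through one shared convention. The sketch below is a hypothetical Python analogue of that lookup-by-name idea; the `registry` table and the `mydsl.add` kernel name are invented for illustration.

```python
# Toy analogue of the shared-ABI idea: producers register functions in one
# common table; any consumer resolves them by name. The real mechanism
# operates at the C ABI level rather than through a Python dict.
registry = {}

def export(name):
    """Register a function under a global name, as a DSL or library would."""
    def wrap(fn):
        registry[name] = fn
        return fn
    return wrap

@export("mydsl.add")
def add(a, b):
    # Stand-in for a DSL-generated kernel exposed through the ABI.
    return a + b

# A "consumer" (Python, a Rust binding, a runtime) looks it up by name:
assert registry["mydsl.add"](2, 3) == 5
```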
 
-<p>As we can see, the common open ABI foundation offers numerous opportunities 
for ML systems to interoperate. We anticipate that this solution can 
significantly benefit various aspects of ML systems and AI infrastructure:</p>
+<h2 id="core-design-principle-and-applications">Core Design Principle and 
Applications</h2>
+
+<p>Coming back to the high level, the core design principle of the TVM FFI ABI is to decouple the ABI design from the binding itself.
+Most binding generators or connectors focus on point-to-point interop between language A and framework B.
+By designing a common ABI foundation, we can transform point-to-point interop into a mix-and-match approach, where
+n languages/frameworks connect to the ABI on one side and m DSLs/libraries connect on the other. The most obvious use case
+is to expose C++ functions to Python, but we can also use the same mechanism to expose C++ functions to Rust;
+the ABI also helps expose WebAssembly/WebGPU to TypeScript in the recent WebLLM project,
+and expose DSL-generated kernels to these environments. The ABI can further serve as a common runtime foundation for
+compiler-runtime co-design in ML compilers and kernel DSLs. These are just some of the opportunities we may unblock.
+In summary, the common open ABI foundation offers numerous opportunities for ML systems to interoperate. We anticipate that this solution can significantly benefit various aspects of ML systems and AI infrastructure:</p>
 
 <ul>
   <li><strong>Kernel libraries</strong>: Ship a single package to support 
multiple frameworks, Python versions, and different languages.</li>
diff --git a/atom.xml b/atom.xml
index 0cc512185b..1b28904c60 100644
--- a/atom.xml
+++ b/atom.xml
@@ -4,7 +4,7 @@
  <title>TVM</title>
  <link href="https://tvm.apache.org"; rel="self"/>
  <link href="https://tvm.apache.org"/>
- <updated>2025-10-21T20:51:57+00:00</updated>
+ <updated>2025-10-22T15:14:12+00:00</updated>
  <id>https://tvm.apache.org</id>
  <author>
    <name></name>
@@ -17,17 +17,17 @@
    <link href="https://tvm.apache.org/2025/10/21/tvm-ffi"/>
    <updated>2025-10-21T00:00:00+00:00</updated>
    <id>https://tvm.apache.org/2025/10/21/tvm-ffi</id>
-   <content type="html">&lt;p&gt;We are currently living in an exciting era 
for AI, where machine learning systems and infrastructures are crucial for 
training and deploying efficient AI models. The modern machine learning systems 
landscape comes rich with diverse components, including popular ML frameworks 
and array libraries like JAX, PyTorch, and CuPy. It also includes specialized 
libraries such as FlashAttention, FlashInfer and cuDNN. Furthermore, there’s a 
growing trend of ML compil [...]
+   <content type="html">&lt;p&gt;We are currently living in an exciting era for AI, where machine learning systems and infrastructures are crucial for training and deploying efficient AI models. The modern machine learning systems landscape is rich with diverse components, including popular ML frameworks and array libraries like JAX, PyTorch, and CuPy. It also includes specialized libraries such as FlashAttention, FlashInfer, and cuDNN. Furthermore, there’s a growing trend of ML compil [...]
 
-&lt;p&gt;The exciting growth of the ecosystem is the reason for the fast pace 
of innovation in AI today. However, it also presents a significant challenge: 
&lt;strong&gt;interoperability&lt;/strong&gt;. Many of those components need to 
integrate with each other. For example, libraries such as FlashInfer, cuDNN 
needs to be integrated into PyTorch, JAX, TensorRT’s runtime system, each may 
come with different interface requirements. ML compilers and DSLs also usually 
expose Python JIT bindi [...]
+&lt;p&gt;The exciting growth of the ecosystem is the reason for today’s fast 
pace of innovation in AI. However, it also presents a significant challenge: 
&lt;strong&gt;interoperability&lt;/strong&gt;. Many of those components need to 
integrate with each other. For example, libraries such as FlashInfer and cuDNN 
need to be integrated into PyTorch, JAX, and TensorRT’s runtime system, each of 
which may come with different interface requirements. ML compilers and DSLs 
also usually expose Pyt [...]
 
 &lt;p&gt;&lt;img src=&quot;/images/tvm-ffi/interop-challenge.png&quot; 
alt=&quot;image&quot; style=&quot;width: 70%; margin: auto; display: 
block;&quot; /&gt;&lt;/p&gt;
 
-&lt;p&gt;The the core of these interoperability challenges are the 
&lt;strong&gt;Application Binary Interface (ABI)&lt;/strong&gt; and the 
&lt;strong&gt;Foreign Function Interface (FFI)&lt;/strong&gt;. 
&lt;strong&gt;ABI&lt;/strong&gt; defines how data structures are stored in 
memory and precisely what occurs when a function is called. For instance, the 
way torch stores Tensors may be different from say cupy/numpy, so we cannot 
directly pass a torch.Tensor pointer and its treatment as a c [...]
+&lt;p&gt;At the core of these interoperability challenges are the 
&lt;strong&gt;Application Binary Interface (ABI)&lt;/strong&gt; and the 
&lt;strong&gt;Foreign Function Interface (FFI)&lt;/strong&gt;. 
&lt;strong&gt;ABI&lt;/strong&gt; defines how data structures are stored in 
memory and precisely what occurs when a function is called. For instance, the 
way PyTorch stores Tensors may be different from CuPy/NumPy, so we cannot 
directly pass a torch.Tensor pointer and treat it as a cupy.NDAr [...]
 
-&lt;p&gt;All of the above observations call for a &lt;strong&gt;need for ABI 
and FFI for the ML systems&lt;/strong&gt; use-cases. Looking at the state 
today, luckily, we do have something to start with – the C ABI, which every 
programming language speaks and remains stable over time. Unfortunately, C only 
focuses on low-level data types such as int, float and raw pointers. On the 
other end of the spectrum, we know that python is something that must gain 
first-class support, but also ther [...]
+&lt;p&gt;All of the above observations call for a &lt;strong&gt;need for ABI 
and FFI for ML systems&lt;/strong&gt; use cases. Looking at the current state, 
luckily, we do have something to start with – the C ABI, which every 
programming language speaks and remains stable over time. Unfortunately, C only 
focuses on low-level data types such as int, float and raw pointers. On the 
other end of the spectrum, we know that Python is something that must gain 
first-class support, but there is st [...]
 
-&lt;p&gt;This post introduces TVM FFI, an &lt;strong&gt;open ABI and FFI for 
machine learning systems&lt;/strong&gt;. The project evolved from multiple 
years of ABI calling conventions design iterations in the Apache TVM project. 
We find that the design can be made generic, independent from the choice of 
compiler/language and should benefit the ML systems community. As a result, we 
brought into a minimal library built from the ground up with a clear intention 
to become an open, standalon [...]
+&lt;p&gt;This post introduces TVM FFI, an &lt;strong&gt;open ABI and FFI for 
machine learning systems&lt;/strong&gt;. The project evolved from multiple 
years of ABI calling conventions design iterations in the Apache TVM project. 
We find that the design can be made generic, independent of the choice of 
compiler/language and should benefit the ML systems community. As a result, we 
built a minimal library from the ground up with a clear intention to become an 
open, standalone library that  [...]
 
 &lt;ul&gt;
   &lt;li&gt;&lt;strong&gt;Stable, minimal C ABI&lt;/strong&gt; designed for 
kernels, DSLs, and runtime extensibility.&lt;/li&gt;
@@ -42,11 +42,11 @@
 
 &lt;h2 id=&quot;technical-design&quot;&gt;&lt;strong&gt;Technical 
Design&lt;/strong&gt;&lt;/h2&gt;
 
-&lt;p&gt;To start with, we need a mechanism to store the values that are 
passing across machine learning frameworks. It achieves this using a core data 
structure called TVMFFIAny. It is a 16 bytes C structure that follows the 
design principle of tagged-union&lt;/p&gt;
+&lt;p&gt;To start with, we need a mechanism to store the values that are passed across machine learning frameworks. TVM FFI achieves this using a core data structure called TVMFFIAny, a 16-byte C structure that follows the design principle of a tagged union:&lt;/p&gt;
 
 &lt;p&gt;&lt;img src=&quot;/images/tvm-ffi/tvmffiany.png&quot; 
alt=&quot;image&quot; style=&quot;width: 50%; margin: auto; display: 
block;&quot; /&gt;&lt;/p&gt;
 
-&lt;p&gt;The objects in TVMFFIObject are managed as intrusive pointers, where 
TVMFFIObject itself contains the header of the pointer that helps to manage 
type information and deletion. This design allows us to use the same type_index 
mechanism that allows for the future growth and recognition of new kinds of 
objects within the FFI, ensuring extensibility. The standalone deleter ensures 
objects can be safely allocated by one source or language and deleted in 
another place.&lt;/p&gt;
+&lt;p&gt;The objects in TVMFFIObject are managed as intrusive pointers, where TVMFFIObject itself contains the header of the pointer that helps manage type information and deletion. This design lets us use a single type_index mechanism that supports future growth and recognition of new kinds of objects within the FFI, ensuring extensibility. The standalone deleter ensures objects can be safely allocated by one source or language and deleted in another place.&lt;/p&gt;
 
 &lt;p&gt;&lt;img src=&quot;/images/tvm-ffi/tvmffiobject.png&quot; 
alt=&quot;image&quot; style=&quot;width: 50%; margin: auto; display: 
block;&quot; /&gt;&lt;/p&gt;
 
@@ -102,7 +102,17 @@ Once DSL integrates with the ABI, we can leverage the same 
flow to load back and
 
 &lt;p&gt;&lt;img src=&quot;/images/tvm-ffi/mydsl.png&quot; 
alt=&quot;image&quot; style=&quot;width: 40%; margin: auto; display: 
block;&quot; /&gt;&lt;/p&gt;
 
-&lt;p&gt;As we can see, the common open ABI foundation offers numerous 
opportunities for ML systems to interoperate. We anticipate that this solution 
can significantly benefit various aspects of ML systems and AI 
infrastructure:&lt;/p&gt;
+&lt;h2 id=&quot;core-design-principle-and-applications&quot;&gt;Core Design 
Principle and Applications&lt;/h2&gt;
+
+&lt;p&gt;Coming back to the high level, the core design principle of the TVM FFI ABI is to decouple the ABI design from the binding itself.
+Most binding generators or connectors focus on point-to-point interop between language A and framework B.
+By designing a common ABI foundation, we can transform point-to-point interop into a mix-and-match approach, where
+n languages/frameworks connect to the ABI on one side and m DSLs/libraries connect on the other. The most obvious use case
+is to expose C++ functions to Python, but we can also use the same mechanism to expose C++ functions to Rust;
+the ABI also helps expose WebAssembly/WebGPU to TypeScript in the recent WebLLM project,
+and expose DSL-generated kernels to these environments. The ABI can further serve as a common runtime foundation for
+compiler-runtime co-design in ML compilers and kernel DSLs. These are just some of the opportunities we may unblock.
+In summary, the common open ABI foundation offers numerous opportunities for ML systems to interoperate. We anticipate that this solution can significantly benefit various aspects of ML systems and AI infrastructure:&lt;/p&gt;
 
 &lt;ul&gt;
   &lt;li&gt;&lt;strong&gt;Kernel libraries&lt;/strong&gt;: Ship a single 
package to support multiple frameworks, Python versions, and different 
languages.&lt;/li&gt;
diff --git a/feed.xml b/feed.xml
index 77489431c4..cbed874716 100644
--- a/feed.xml
+++ b/feed.xml
@@ -1,14 +1,14 @@
-<?xml version="1.0" encoding="utf-8"?><feed 
xmlns="http://www.w3.org/2005/Atom"; ><generator uri="https://jekyllrb.com/"; 
version="4.4.1">Jekyll</generator><link href="/feed.xml" rel="self" 
type="application/atom+xml" /><link href="/" rel="alternate" type="text/html" 
/><updated>2025-10-21T20:51:57+00:00</updated><id>/feed.xml</id><title 
type="html">TVM</title><author><name>{&quot;name&quot; =&gt; 
nil}</name></author><entry><title type="html">Building an Open ABI and FFI for 
ML Systems</tit [...]
+<?xml version="1.0" encoding="utf-8"?><feed 
xmlns="http://www.w3.org/2005/Atom"; ><generator uri="https://jekyllrb.com/"; 
version="4.4.1">Jekyll</generator><link href="/feed.xml" rel="self" 
type="application/atom+xml" /><link href="/" rel="alternate" type="text/html" 
/><updated>2025-10-22T15:14:12+00:00</updated><id>/feed.xml</id><title 
type="html">TVM</title><author><name>{&quot;name&quot; =&gt; 
nil}</name></author><entry><title type="html">Building an Open ABI and FFI for 
ML Systems</tit [...]
 
-<p>The exciting growth of the ecosystem is the reason for the fast pace of 
innovation in AI today. However, it also presents a significant challenge: 
<strong>interoperability</strong>. Many of those components need to integrate 
with each other. For example, libraries such as FlashInfer, cuDNN needs to be 
integrated into PyTorch, JAX, TensorRT’s runtime system, each may come with 
different interface requirements. ML compilers and DSLs also usually expose 
Python JIT binding support, while  [...]
+<p>The exciting growth of the ecosystem is the reason for today’s fast pace of 
innovation in AI. However, it also presents a significant challenge: 
<strong>interoperability</strong>. Many of those components need to integrate 
with each other. For example, libraries such as FlashInfer and cuDNN need to be 
integrated into PyTorch, JAX, and TensorRT’s runtime system, each of which may 
come with different interface requirements. ML compilers and DSLs also usually 
expose Python JIT binding su [...]
 
 <p><img src="/images/tvm-ffi/interop-challenge.png" alt="image" style="width: 
70%; margin: auto; display: block;" /></p>
 
-<p>The the core of these interoperability challenges are the 
<strong>Application Binary Interface (ABI)</strong> and the <strong>Foreign 
Function Interface (FFI)</strong>. <strong>ABI</strong> defines how data 
structures are stored in memory and precisely what occurs when a function is 
called. For instance, the way torch stores Tensors may be different from say 
cupy/numpy, so we cannot directly pass a torch.Tensor pointer and its treatment 
as a cupy.NDArray. The very nature of machine le [...]
+<p>At the core of these interoperability challenges are the 
<strong>Application Binary Interface (ABI)</strong> and the <strong>Foreign 
Function Interface (FFI)</strong>. <strong>ABI</strong> defines how data 
structures are stored in memory and precisely what occurs when a function is 
called. For instance, the way PyTorch stores Tensors may be different from 
CuPy/NumPy, so we cannot directly pass a torch.Tensor pointer and treat it as a 
cupy.NDArray. The very nature of machine learning a [...]
 
-<p>All of the above observations call for a <strong>need for ABI and FFI for 
the ML systems</strong> use-cases. Looking at the state today, luckily, we do 
have something to start with – the C ABI, which every programming language 
speaks and remains stable over time. Unfortunately, C only focuses on low-level 
data types such as int, float and raw pointers. On the other end of the 
spectrum, we know that python is something that must gain first-class support, 
but also there is still a need  [...]
+<p>All of the above observations call for a <strong>need for ABI and FFI for 
ML systems</strong> use cases. Looking at the current state, luckily, we do 
have something to start with – the C ABI, which every programming language 
speaks and remains stable over time. Unfortunately, C only focuses on low-level 
data types such as int, float and raw pointers. On the other end of the 
spectrum, we know that Python is something that must gain first-class support, 
but there is still a need for dif [...]
 
-<p>This post introduces TVM FFI, an <strong>open ABI and FFI for machine 
learning systems</strong>. The project evolved from multiple years of ABI 
calling conventions design iterations in the Apache TVM project. We find that 
the design can be made generic, independent from the choice of 
compiler/language and should benefit the ML systems community. As a result, we 
brought into a minimal library built from the ground up with a clear intention 
to become an open, standalone library that can [...]
+<p>This post introduces TVM FFI, an <strong>open ABI and FFI for machine 
learning systems</strong>. The project evolved from multiple years of ABI 
calling conventions design iterations in the Apache TVM project. We find that 
the design can be made generic, independent of the choice of compiler/language 
and should benefit the ML systems community. As a result, we built a minimal 
library from the ground up with a clear intention to become an open, standalone 
library that can be shared and  [...]
 
 <ul>
   <li><strong>Stable, minimal C ABI</strong> designed for kernels, DSLs, and 
runtime extensibility.</li>
@@ -23,11 +23,11 @@
 
 <h2 id="technical-design"><strong>Technical Design</strong></h2>
 
-<p>To start with, we need a mechanism to store the values that are passing 
across machine learning frameworks. It achieves this using a core data 
structure called TVMFFIAny. It is a 16 bytes C structure that follows the 
design principle of tagged-union</p>
+<p>To start with, we need a mechanism to store the values that are passed across machine learning frameworks. TVM FFI achieves this using a core data structure called TVMFFIAny, a 16-byte C structure that follows the design principle of a tagged union:</p>
 
 <p><img src="/images/tvm-ffi/tvmffiany.png" alt="image" style="width: 50%; 
margin: auto; display: block;" /></p>
 
-<p>The objects in TVMFFIObject are managed as intrusive pointers, where 
TVMFFIObject itself contains the header of the pointer that helps to manage 
type information and deletion. This design allows us to use the same type_index 
mechanism that allows for the future growth and recognition of new kinds of 
objects within the FFI, ensuring extensibility. The standalone deleter ensures 
objects can be safely allocated by one source or language and deleted in 
another place.</p>
+<p>The objects in TVMFFIObject are managed as intrusive pointers, where TVMFFIObject itself contains the header of the pointer that helps manage type information and deletion. This design lets us use a single type_index mechanism that supports future growth and recognition of new kinds of objects within the FFI, ensuring extensibility. The standalone deleter ensures objects can be safely allocated by one source or language and deleted in another place.</p>
 
 <p><img src="/images/tvm-ffi/tvmffiobject.png" alt="image" style="width: 50%; 
margin: auto; display: block;" /></p>
 
@@ -83,7 +83,17 @@ Once DSL integrates with the ABI, we can leverage the same 
flow to load back and
 
 <p><img src="/images/tvm-ffi/mydsl.png" alt="image" style="width: 40%; margin: 
auto; display: block;" /></p>
 
-<p>As we can see, the common open ABI foundation offers numerous opportunities 
for ML systems to interoperate. We anticipate that this solution can 
significantly benefit various aspects of ML systems and AI infrastructure:</p>
+<h2 id="core-design-principle-and-applications">Core Design Principle and 
Applications</h2>
+
+<p>Coming back to the high level, the core design principle of the TVM FFI ABI is to decouple the ABI design from the binding itself.
+Most binding generators or connectors focus on point-to-point interop between language A and framework B.
+By designing a common ABI foundation, we can transform point-to-point interop into a mix-and-match approach, where
+n languages/frameworks connect to the ABI on one side and m DSLs/libraries connect on the other. The most obvious use case
+is to expose C++ functions to Python, but we can also use the same mechanism to expose C++ functions to Rust;
+the ABI also helps expose WebAssembly/WebGPU to TypeScript in the recent WebLLM project,
+and expose DSL-generated kernels to these environments. The ABI can further serve as a common runtime foundation for
+compiler-runtime co-design in ML compilers and kernel DSLs. These are just some of the opportunities we may unblock.
+In summary, the common open ABI foundation offers numerous opportunities for ML systems to interoperate. We anticipate that this solution can significantly benefit various aspects of ML systems and AI infrastructure:</p>
 
 <ul>
   <li><strong>Kernel libraries</strong>: Ship a single package to support 
multiple frameworks, Python versions, and different languages.</li>
@@ -113,7 +123,7 @@ Please checkout the following resources:</p>
 
 <p>The project draws on the collective wisdom of the Machine Learning System community and the Python open source ecosystem, including past development insights of many developers from NumPy, PyTorch, JAX, Caffe, MXNet, XGBoost, CuPy, pybind11, nanobind, and more.</p>
 
-<p>We would specifically like to thank the PyTorch team, JAX team, CUDA python 
team,  cuteDSL team,  cuTile team,  Apache TVM community,  XGBoost team, 
TileLang team, Triton distributed team, FlashInfer team,  SGLang community,  
TensorRT-LLM community, the vLLM community, for their their insightful 
feedbacks.</p>]]></content><author><name>Apache TVM FFI 
Community</name></author><summary type="html"><![CDATA[We are currently living 
in an exciting era for AI, where machine learning systems [...]
+<p>We would specifically like to thank the PyTorch team, JAX team, CUDA Python team, cuteDSL team, cuTile team, Apache TVM community, XGBoost team, TileLang team, Triton distributed team, FlashInfer team, SGLang community, TensorRT-LLM community, and the vLLM community for their insightful feedback.</p>]]></content><author><name>Apache TVM FFI Community</name></author><summary type="html"><![CDATA[We are currently living in an exciting era for AI, where machine learning systems [...]
 
 <h2 id="boundaries-in-the-modern-ml-system-stack">Boundaries in the Modern ML 
System Stack</h2>
 
diff --git a/rss.xml b/rss.xml
index de2e7e52de..077169b0b3 100644
--- a/rss.xml
+++ b/rss.xml
@@ -5,24 +5,24 @@
         <description>TVM - </description>
         <link>https://tvm.apache.org</link>
         <atom:link href="https://tvm.apache.org"; rel="self" 
type="application/rss+xml" />
-        <lastBuildDate>Tue, 21 Oct 2025 20:51:57 +0000</lastBuildDate>
-        <pubDate>Tue, 21 Oct 2025 20:51:57 +0000</pubDate>
+        <lastBuildDate>Wed, 22 Oct 2025 15:14:12 +0000</lastBuildDate>
+        <pubDate>Wed, 22 Oct 2025 15:14:12 +0000</pubDate>
         <ttl>60</ttl>
 
 
         <item>
                 <title>Building an Open ABI and FFI for ML Systems</title>
-                <description>&lt;p&gt;We are currently living in an exciting 
era for AI, where machine learning systems and infrastructures are crucial for 
training and deploying efficient AI models. The modern machine learning systems 
landscape comes rich with diverse components, including popular ML frameworks 
and array libraries like JAX, PyTorch, and CuPy. It also includes specialized 
libraries such as FlashAttention, FlashInfer and cuDNN. Furthermore, there’s a 
growing trend of ML c [...]
+                <description>&lt;p&gt;We are currently living in an exciting era for AI, where machine learning systems and infrastructures are crucial for training and deploying efficient AI models. The modern machine learning systems landscape is rich with diverse components, including popular ML frameworks and array libraries like JAX, PyTorch, and CuPy. It also includes specialized libraries such as FlashAttention, FlashInfer, and cuDNN. Furthermore, there’s a growing trend of ML c [...]
 
-&lt;p&gt;The exciting growth of the ecosystem is the reason for the fast pace 
of innovation in AI today. However, it also presents a significant challenge: 
&lt;strong&gt;interoperability&lt;/strong&gt;. Many of those components need to 
integrate with each other. For example, libraries such as FlashInfer, cuDNN 
needs to be integrated into PyTorch, JAX, TensorRT’s runtime system, each may 
come with different interface requirements. ML compilers and DSLs also usually 
expose Python JIT bindi [...]
+&lt;p&gt;The exciting growth of the ecosystem is the reason for today’s fast 
pace of innovation in AI. However, it also presents a significant challenge: 
&lt;strong&gt;interoperability&lt;/strong&gt;. Many of those components need to 
integrate with each other. For example, libraries such as FlashInfer and cuDNN 
need to be integrated into PyTorch, JAX, and TensorRT’s runtime system, each of 
which may come with different interface requirements. ML compilers and DSLs 
also usually expose Pyt [...]
 
 &lt;p&gt;&lt;img src=&quot;/images/tvm-ffi/interop-challenge.png&quot; 
alt=&quot;image&quot; style=&quot;width: 70%; margin: auto; display: 
block;&quot; /&gt;&lt;/p&gt;
 
-&lt;p&gt;The the core of these interoperability challenges are the 
&lt;strong&gt;Application Binary Interface (ABI)&lt;/strong&gt; and the 
&lt;strong&gt;Foreign Function Interface (FFI)&lt;/strong&gt;. 
&lt;strong&gt;ABI&lt;/strong&gt; defines how data structures are stored in 
memory and precisely what occurs when a function is called. For instance, the 
way torch stores Tensors may be different from say cupy/numpy, so we cannot 
directly pass a torch.Tensor pointer and its treatment as a c [...]
+&lt;p&gt;At the core of these interoperability challenges are the 
&lt;strong&gt;Application Binary Interface (ABI)&lt;/strong&gt; and the 
&lt;strong&gt;Foreign Function Interface (FFI)&lt;/strong&gt;. 
&lt;strong&gt;ABI&lt;/strong&gt; defines how data structures are stored in 
memory and precisely what occurs when a function is called. For instance, the 
way PyTorch stores Tensors may be different from CuPy/NumPy, so we cannot 
directly pass a torch.Tensor pointer and treat it as a cupy.NDAr [...]
 
-&lt;p&gt;All of the above observations call for a &lt;strong&gt;need for ABI 
and FFI for the ML systems&lt;/strong&gt; use-cases. Looking at the state 
today, luckily, we do have something to start with – the C ABI, which every 
programming language speaks and remains stable over time. Unfortunately, C only 
focuses on low-level data types such as int, float and raw pointers. On the 
other end of the spectrum, we know that python is something that must gain 
first-class support, but also ther [...]
+&lt;p&gt;All of the above observations point to a &lt;strong&gt;need for an ABI 
and FFI for ML systems&lt;/strong&gt;. Looking at the current state, 
luckily, we do have something to start with – the C ABI, which every 
programming language speaks and remains stable over time. Unfortunately, C only 
focuses on low-level data types such as int, float and raw pointers. On the 
other end of the spectrum, we know that Python is something that must gain 
first-class support, but there is st [...]
 
-&lt;p&gt;This post introduces TVM FFI, an &lt;strong&gt;open ABI and FFI for 
machine learning systems&lt;/strong&gt;. The project evolved from multiple 
years of ABI calling conventions design iterations in the Apache TVM project. 
We find that the design can be made generic, independent from the choice of 
compiler/language and should benefit the ML systems community. As a result, we 
brought into a minimal library built from the ground up with a clear intention 
to become an open, standalon [...]
+&lt;p&gt;This post introduces TVM FFI, an &lt;strong&gt;open ABI and FFI for 
machine learning systems&lt;/strong&gt;. The project evolved from multiple 
years of ABI calling conventions design iterations in the Apache TVM project. 
We find that the design can be made generic, independent of the choice of 
compiler/language and should benefit the ML systems community. As a result, we 
built a minimal library from the ground up with a clear intention to become an 
open, standalone library that  [...]
 
 &lt;ul&gt;
   &lt;li&gt;&lt;strong&gt;Stable, minimal C ABI&lt;/strong&gt; designed for 
kernels, DSLs, and runtime extensibility.&lt;/li&gt;
@@ -37,11 +37,11 @@
 
 &lt;h2 id=&quot;technical-design&quot;&gt;&lt;strong&gt;Technical 
Design&lt;/strong&gt;&lt;/h2&gt;
 
-&lt;p&gt;To start with, we need a mechanism to store the values that are 
passing across machine learning frameworks. It achieves this using a core data 
structure called TVMFFIAny. It is a 16 bytes C structure that follows the 
design principle of tagged-union&lt;/p&gt;
+&lt;p&gt;To start with, we need a mechanism to store the values that are 
passed across machine learning frameworks. TVM FFI achieves this using a core 
data structure called TVMFFIAny, a 16-byte C structure that follows the design 
principle of a tagged union.&lt;/p&gt;
 
 &lt;p&gt;&lt;img src=&quot;/images/tvm-ffi/tvmffiany.png&quot; 
alt=&quot;image&quot; style=&quot;width: 50%; margin: auto; display: 
block;&quot; /&gt;&lt;/p&gt;
 
-&lt;p&gt;The objects in TVMFFIObject are managed as intrusive pointers, where 
TVMFFIObject itself contains the header of the pointer that helps to manage 
type information and deletion. This design allows us to use the same type_index 
mechanism that allows for the future growth and recognition of new kinds of 
objects within the FFI, ensuring extensibility. The standalone deleter ensures 
objects can be safely allocated by one source or language and deleted in 
another place.&lt;/p&gt;
+&lt;p&gt;The objects in TVMFFIObject are managed as intrusive pointers, where 
TVMFFIObject itself contains the header of the pointer that helps manage type 
information and deletion. This design relies on a shared type_index 
mechanism that allows for future growth and recognition of new kinds of objects 
within the FFI, ensuring extensibility. The standalone deleter ensures objects 
can be safely allocated by one source or language and deleted in another 
place.&lt;/p&gt;
 
 &lt;p&gt;&lt;img src=&quot;/images/tvm-ffi/tvmffiobject.png&quot; 
alt=&quot;image&quot; style=&quot;width: 50%; margin: auto; display: 
block;&quot; /&gt;&lt;/p&gt;
 
@@ -97,7 +97,17 @@ Once DSL integrates with the ABI, we can leverage the same 
flow to load back and
 
 &lt;p&gt;&lt;img src=&quot;/images/tvm-ffi/mydsl.png&quot; 
alt=&quot;image&quot; style=&quot;width: 40%; margin: auto; display: 
block;&quot; /&gt;&lt;/p&gt;
 
-&lt;p&gt;As we can see, the common open ABI foundation offers numerous 
opportunities for ML systems to interoperate. We anticipate that this solution 
can significantly benefit various aspects of ML systems and AI 
infrastructure:&lt;/p&gt;
+&lt;h2 id=&quot;core-design-principle-and-applications&quot;&gt;Core Design 
Principle and Applications&lt;/h2&gt;
+
+&lt;p&gt;Stepping back to the high level, the core design principle of the TVM 
FFI ABI is to decouple the ABI design from the binding itself.
+Most binding generators or connectors focus on point-to-point interop between 
language A and framework B.
+By designing a common ABI foundation, we can transform point-to-point interop 
into a mix-and-match approach, where
+we can have n languages/frameworks connect to the ABI and then back to another 
m DSLs/libraries. The most obvious use case
+is to expose C++ functions to Python; but we can also use the same mechanism 
to expose C++ functions to Rust;
+the ABI helps expose WebAssembly/WebGPU to TypeScript in the recent WebLLM 
project,
or expose DSL-generated kernels to these environments. We can also use the ABI 
as a common runtime foundation for compiler-runtime co-design in ML compilers 
and kernel DSLs. These are just some of the 
opportunities we may unblock.
+In summary, the common open ABI foundation offers numerous opportunities for 
ML systems to interoperate. We anticipate that this solution can significantly 
benefit various aspects of ML systems and AI infrastructure:&lt;/p&gt;
 
 &lt;ul&gt;
   &lt;li&gt;&lt;strong&gt;Kernel libraries&lt;/strong&gt;: Ship a single 
package to support multiple frameworks, Python versions, and different 
languages.&lt;/li&gt;

