junrushao commented on code in PR #402: URL: https://github.com/apache/tvm-ffi/pull/402#discussion_r2684015221
########## docs/concepts/abi_overview.rst: ########## @@ -0,0 +1,490 @@ +.. Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. + +ABI Overview in C +================= + +.. hint:: + + Authoritative ABI specifications are defined in + + - C header `tvm/ffi/c_api.h <https://github.com/apache/tvm-ffi/blob/main/include/tvm/ffi/c_api.h>`_, which contains the core ABI, and + - C header `tvm/ffi/extra/c_env_api.h <https://github.com/apache/tvm-ffi/blob/main/include/tvm/ffi/extra/c_env_api.h>`_, which contains extra support features. + +The TVM-FFI ABI is designed around the following key principles: + +- **Minimal and efficient.** Keep things simple and deliver close-to-metal performance. +- **Stability guarantee.** The ABI remains stable across compiler versions and is independent of host languages or frameworks. +- **Expressive for machine learning.** Native support for tensors, shapes, and data types commonly used in ML workloads. +- **Extensible.** The ABI supports user-defined types and features through a dynamic type registration system. + +This tutorial covers common concepts and usage patterns of the TVM-FFI ABI, with low-level C code examples for precise reference. + +Any and AnyView +--------------- + +.. 
seealso:: + + :doc:`any` for :cpp:class:`~tvm::ffi::Any` and :cpp:class:`~tvm::ffi::AnyView` usage patterns. + +At the core of TVM-FFI is :cpp:class:`TVMFFIAny`, a 16-byte tagged union that can hold any value +recognized by the FFI system. It enables type-erased value passing across language boundaries. + +.. dropdown:: C ABI Reference: :cpp:class:`TVMFFIAny` + :icon: code + + .. literalinclude:: ../../include/tvm/ffi/c_api.h + :language: c + :start-after: [TVMFFIAny.begin] + :end-before: [TVMFFIAny.end] + :caption: tvm/ffi/c_api.h + +**Ownership.** :cpp:class:`TVMFFIAny` has two variants with identical layout but different :ref:`ownership semantics <any-ownership>`: + +- **Owning:** :cpp:class:`tvm::ffi::Any` - reference-counted, manages object lifetime +- **Borrowing:** :cpp:class:`tvm::ffi::AnyView` - non-owning view, caller must ensure validity + +.. note:: + To convert a borrowing :cpp:class:`~tvm::ffi::AnyView` to an owning :cpp:class:`~tvm::ffi::Any`, use :cpp:func:`TVMFFIAnyViewToOwnedAny`. + +**Runtime Type Index.** The ``type_index`` field identifies what kind of value is stored: + +- :ref:`Atomic POD types <any-atomic-types>` (``type_index`` < :cpp:enumerator:`kTVMFFIStaticObjectBegin <TVMFFITypeIndex::kTVMFFIStaticObjectBegin>`): + Stored inline in the payload union without heap allocation or reference counting. +- :ref:`Object types <any-heap-allocated-objects>` (``type_index`` >= :cpp:enumerator:`kTVMFFIStaticObjectBegin <TVMFFITypeIndex::kTVMFFIStaticObjectBegin>`): + Stored as pointers to heap-allocated, reference-counted TVM-FFI objects. + +.. important:: + The TVM-FFI type index system does not rely on C++ RTTI. + + +Construct Any +~~~~~~~~~~~~~ + +**From atomic POD types.** The following C code constructs a :cpp:class:`TVMFFIAny` from an integer: + +.. 
literalinclude:: ../../examples/abi_overview/example_code.c + :language: c + :start-after: [Any_AnyView.FromInt_Float.begin] + :end-before: [Any_AnyView.FromInt_Float.end] + +Set the ``type_index`` from :cpp:enum:`TVMFFITypeIndex` and assign the corresponding payload field. + +.. important:: + + Always zero the ``zero_padding`` field and any unused bytes in the value union. + This invariant enables direct byte comparison and hashing of :cpp:class:`TVMFFIAny` values. + +**From object types.** The following C code constructs a :cpp:class:`TVMFFIAny` from a heap-allocated object: + +.. literalinclude:: ../../examples/abi_overview/example_code.c + :language: c + :start-after: [Any_AnyView.FromObjectPtr.begin] + :end-before: [Any_AnyView.FromObjectPtr.end] + +When ``IS_OWNING_ANY`` is ``true`` (owning :cpp:class:`~tvm::ffi::Any`), this increments the object's reference count. + +.. _abi-destruct-any: + +Destruct Any +~~~~~~~~~~~~ + +The following C code destroys a :cpp:class:`TVMFFIAny`: + +.. literalinclude:: ../../examples/abi_overview/example_code.c + :language: c + :start-after: [Any_AnyView.Destroy.begin] + :end-before: [Any_AnyView.Destroy.end] + +When ``IS_OWNING_ANY`` is ``true`` (owning :cpp:class:`~tvm::ffi::Any`), this decrements the object's reference count. + +Extract from Any +~~~~~~~~~~~~~~~~ + +**Extract an atomic POD.** The following C code extracts an integer or float from a :cpp:class:`TVMFFIAny`: + +.. literalinclude:: ../../examples/abi_overview/example_code.c + :language: c + :start-after: [Any_AnyView.GetInt_Float.begin] + :end-before: [Any_AnyView.GetInt_Float.end] + +Implicit type conversion may occur. For example, when extracting a float from a :cpp:class:`TVMFFIAny` +that holds an integer, the integer is cast to a float. + +**Extract a DLTensor.** A :c:struct:`DLTensor` may originate from either a raw pointer or a heap-allocated :cpp:class:`~tvm::ffi::TensorObj`: + +.. 
literalinclude:: ../../examples/abi_overview/example_code.c + :language: c + :start-after: [Any_AnyView.GetDLTensor.begin] + :end-before: [Any_AnyView.GetDLTensor.end] + +**Extract a TVM-FFI object.** TVM-FFI objects are always heap-allocated and reference-counted, +with ``type_index`` >= :cpp:enumerator:`kTVMFFIStaticObjectBegin <TVMFFITypeIndex::kTVMFFIStaticObjectBegin>`: + +.. literalinclude:: ../../examples/abi_overview/example_code.c + :language: c + :start-after: [Any_AnyView.GetObject.begin] + :end-before: [Any_AnyView.GetObject.end] + +To take ownership of the returned value, increment the reference count via :cpp:func:`TVMFFIObjectIncRef`. +Release ownership later via :cpp:func:`TVMFFIObjectDecRef`. + +.. _abi-object: + +Object +------ + +.. seealso:: + + :doc:`object_and_class` for the object system and reflection. + +TVM-FFI Object (:cpp:class:`TVMFFIObject`) is the cornerstone of TVM-FFI's stable yet extensible type system. + +.. dropdown:: C ABI Reference: :cpp:class:`TVMFFIObject` + :icon: code + + .. literalinclude:: ../../include/tvm/ffi/c_api.h + :language: c + :start-after: [TVMFFIObject.begin] + :end-before: [TVMFFIObject.end] + :caption: tvm/ffi/c_api.h + +All TVM-FFI objects share these characteristics: + +- Heap-allocated and reference-counted +- Layout-stable 24-byte header containing reference counts, type index, and deleter callback +- Type index >= :cpp:enumerator:`kTVMFFIStaticObjectBegin <TVMFFITypeIndex::kTVMFFIStaticObjectBegin>` + +**Dynamic Type System.** Classes can be registered at runtime via :cpp:func:`TVMFFITypeGetOrAllocIndex`, +with support for single inheritance. See :ref:`type-checking-and-casting` for usage details. + +.. _abi-object-ownership: + +Ownership Management +~~~~~~~~~~~~~~~~~~~~ + +Ownership is managed via reference counting. See :ref:`object-reference-counting` for details. 
+ +Two C APIs manage object ownership: + +- :cpp:func:`TVMFFIObjectIncRef`: Acquire ownership by incrementing the reference count +- :cpp:func:`TVMFFIObjectDecRef`: Release ownership by decrementing the reference count; the deleter callback (:cpp:member:`TVMFFIObject::deleter`) executes when the count reaches zero + +**Move ownership from Any/AnyView.** The following C code transfers ownership from an owning :cpp:class:`~tvm::ffi::Any` to an object pointer: + +.. literalinclude:: ../../examples/abi_overview/example_code.c + :language: c + :start-after: [Object.MoveFromAny.begin] + :end-before: [Object.MoveFromAny.end] + +Since :cpp:class:`~tvm::ffi::AnyView` is non-owning (``IS_OWNING_ANY`` is ``false``), +acquiring ownership requires explicitly incrementing the reference count. + +**Release ownership.** The following C code releases ownership of a TVM-FFI object: + +.. literalinclude:: ../../examples/abi_overview/example_code.c + :language: c + :name: ABI.Object.Destroy + :start-after: [Object.Destroy.begin] + :end-before: [Object.Destroy.end] + +Inheritance Checking +~~~~~~~~~~~~~~~~~~~~ + +TVM-FFI models single inheritance as a tree where each node points to its parent. +Each type has a unique type index, and the system tracks ancestors, inheritance depth, and other metadata. +This information is available via :cpp:func:`TVMFFIGetTypeInfo`. + +The following C code checks whether a type is a subclass of another: + +.. literalinclude:: ../../examples/abi_overview/example_code.c + :language: c + :start-after: [Object.IsInstance.begin] + :end-before: [Object.IsInstance.end] + +.. _abi-tensor: + +Tensor +------ + +.. seealso:: + + :doc:`tensor` for details about TVM-FFI tensors and DLPack interoperability. + +TVM-FFI provides :cpp:class:`tvm::ffi::TensorObj`, a DLPack-native tensor class that is also a standard TVM-FFI object. +This means tensors can be managed using the same reference counting mechanisms as other objects. + +.. 
dropdown:: C ABI Reference: :cpp:class:`tvm::ffi::TensorObj` + :icon: code + + .. code-block:: cpp + :caption: tvm/ffi/container/tensor.h + + class TensorObj : public Object, public DLTensor { + // no other members besides those from Object and DLTensor + }; + + +Access Tensor Metadata +~~~~~~~~~~~~~~~~~~~~~~ + +The following C code obtains a :c:struct:`DLTensor` pointer from a :cpp:class:`~tvm::ffi::TensorObj`: + +.. literalinclude:: ../../examples/abi_overview/example_code.c + :language: c + :start-after: [Tensor.AccessDLTensor.begin] + :end-before: [Tensor.AccessDLTensor.end] + +The :c:struct:`DLTensor` pointer provides access to shape, dtype, device, data pointer, and other tensor metadata. + +Construct Tensor +~~~~~~~~~~~~~~~~ + +The following C code constructs a :cpp:class:`~tvm::ffi::TensorObj` from a :c:struct:`DLManagedTensorVersioned`: + +.. literalinclude:: ../../examples/abi_overview/example_code.c + :language: c + :start-after: [Tensor_FromDLPack.begin] + :end-before: [Tensor_FromDLPack.end] + +.. hint:: + TVM-FFI's Python API automatically wraps framework tensors (e.g., :py:class:`torch.Tensor`) as :cpp:class:`~tvm::ffi::TensorObj`, + so manual conversion is typically unnecessary. + +Destruct Tensor +~~~~~~~~~~~~~~~ + +As a standard TVM-FFI object, :cpp:class:`~tvm::ffi::TensorObj` follows the :ref:`standard destruction pattern <ABI.Object.Destroy>`. +When the reference count reaches zero, the deleter callback (:cpp:member:`TVMFFIObject::deleter`) executes. + +Export Tensor to DLPack +~~~~~~~~~~~~~~~~~~~~~~~ + +To share a :cpp:class:`~tvm::ffi::TensorObj` with other frameworks, export it as a :c:struct:`DLManagedTensorVersioned`: + +.. literalinclude:: ../../examples/abi_overview/example_code.c + :language: c + :start-after: [Tensor_ToDLPackVersioned.begin] + :end-before: [Tensor_ToDLPackVersioned.end] + +.. _abi-function: + +Function +-------- + +.. seealso:: + + :ref:`sec:function` for a detailed description of TVM-FFI functions. 
+ +All functions in TVM-FFI follow a unified C calling convention that enables ABI-stable, +type-erased, and cross-language function calls, defined by :cpp:type:`TVMFFISafeCallType`. + +**Calling convention.** The signature includes: + +- ``handle`` (``void*``): Optional resource handle passed to the callee; typically ``NULL`` for exported symbols +- ``args`` (``TVMFFIAny*``) and ``num_args`` (``int``): Array of non-owning :cpp:class:`~tvm::ffi::AnyView` input arguments +- ``result`` (``TVMFFIAny*``): Owning :cpp:class:`~tvm::ffi::Any` output value +- Return value: ``0`` for success; ``-1`` or ``-2`` for errors (see :ref:`sec:exception`)

Review Comment:

```rst
This design is called a **packed function**, because it "packs" all arguments into a single array of type-erased :cpp:type:`tvm::ffi::AnyView`,
and unifies the calling convention across all languages without resorting to JIT compilation.

More specifically, this mechanism enables the following scenarios:

- **Dynamic languages**. Well-optimized bindings are provided (e.g. for Python) to translate arguments into the packed function format and to translate the return value back to the host language.
- **Static languages**. Metaprogramming techniques, such as C++ templates, are usually available to instantiate the packed format directly on the stack, avoiding the need for dynamic inspection.
- **Cross-language callbacks**. The language-agnostic :cpp:class:`tvm::ffi::Function` makes it easy to call between languages without depending on language-specific features such as the GIL.

**Performance Implications**. In practice, this approach is highly efficient for machine learning workloads.

- For Python/C++ calls, the overhead is at the microsecond level, which is generally similar to the overhead of eager mode;
- When both sides of a call are static languages, the overhead drops to tens of nanoseconds.

.. note::

   Although we found it less necessary in practice, further link-time optimization (LTO) is still theoretically possible
   when both sides are static languages, the symbol is known, and both are linked into a single binary.
   In this case, the callee can be inlined into the caller, and arguments stored in stack memory can be passed in registers instead.
```
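To make the "packed" idea concrete, here is a minimal sketch in plain C. It does not include `tvm/ffi/c_api.h`. Instead, `SketchAny`, `SketchSafeCall`, the `kSketch*` tags, `sketch_add`, and `sketch_call_add` are hypothetical stand-ins that only mirror the shape described in the hunk above (a 16-byte tagged union and a `handle` / `args` / `num_args` / `result` signature). The tag values are illustrative, and the error path is simplified to a bare `-1` with no error object recorded.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for TVMFFIAny: type tag, padding, 8-byte payload. */
typedef struct {
  int32_t type_index;    /* selects the active payload member */
  uint32_t zero_padding; /* kept at zero, as the ABI doc requires */
  union {
    int64_t v_int64;
    double v_float64;
    void* v_ptr;
  } v;
} SketchAny;

/* Illustrative tags only; real values are defined by TVMFFITypeIndex. */
enum { kSketchNone = 0, kSketchInt = 1, kSketchFloat = 2 };

/* Same shape as the calling convention listed above (assumed, not copied). */
typedef int (*SketchSafeCall)(void* handle, const SketchAny* args,
                              int32_t num_args, SketchAny* result);

/* Callee: checks the packed arguments at runtime, then writes the result. */
int sketch_add(void* handle, const SketchAny* args, int32_t num_args,
               SketchAny* result) {
  (void)handle; /* unused: no closure state in this sketch */
  if (num_args != 2 || args[0].type_index != kSketchInt ||
      args[1].type_index != kSketchInt) {
    return -1; /* simplified error path */
  }
  memset(result, 0, sizeof(*result)); /* zero padding and unused bytes */
  result->type_index = kSketchInt;
  result->v.v_int64 = args[0].v.v_int64 + args[1].v.v_int64;
  return 0;
}

/* Caller: packs two integers into a stack array, as a static-language
 * binding might do, and unpacks the integer result. */
int sketch_call_add(SketchSafeCall fn, int64_t a, int64_t b, int64_t* out) {
  SketchAny args[2];
  SketchAny result;
  memset(args, 0, sizeof(args));
  memset(&result, 0, sizeof(result)); /* starts as kSketchNone */
  args[0].type_index = kSketchInt;
  args[0].v.v_int64 = a;
  args[1].type_index = kSketchInt;
  args[1].v.v_int64 = b;
  int ret = fn(NULL, args, 2, &result);
  if (ret != 0) return ret;
  assert(result.type_index == kSketchInt);
  *out = result.v.v_int64;
  return 0;
}
```

The caller never needs the callee's typed signature: both sides agree only on the array-of-tagged-values shape, which is what lets the same entry point be reached from any host language.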
