barry-jin opened a new issue #19672:
URL: https://github.com/apache/incubator-mxnet/issues/19672


   ## Problem statement
   Currently MXNet is using PackedFunc based FFI[1] to improve performance of 
imperative execution. C APIs will be registered as PackedFunc in the runtime 
system and we can call any of these PackedFunc via only one exposed API. But, 
PackedFunc based FFI has only ADT(Algebatic data type), Integer and Float these 
contaienrs registered in runtime system, and also it is limited to only few 
data types and structures as input, like Integer, Float, NDArray, list of 
Integers and list of Floats, tuple. For some complex data structures, like 
python dictionary, list of strings, we need to parse them or unzip them in 
front end first and then feed into packed function, which may reduce code 
readability and introduce other overheads. To adopt PackedFunc based FFI on 
some existed imperative calls or some other API calls in the future, new 
containers are needed in runtime system, like map. 
   
   ## Current Containers
   In current MXNet FFI, we pack python arguments into C arguments that FFI 
accept through 
[`_make_mxnet_args`](https://github.com/apache/incubator-mxnet/blob/3746babc8fdb211584a9a661207061cb646b01a8/python/mxnet/_ffi/_ctypes/function.py#L39-L76)
 and convert `list` or `tuple` types into arguments that FFI accept through 
[`convert_to_node`](https://github.com/apache/incubator-mxnet/blob/3746babc8fdb211584a9a661207061cb646b01a8/python/mxnet/_ffi/node_generic.py#L40-L60)
 function. 
   
   Current container in use in PackedFunc based FFI is 
[ADT](https://github.com/apache/incubator-mxnet/blob/ed4aa23dba78def23abfe08c7e74eb4dbab1bbbb/src/api/_api_internal/_api_internal.cc#L56-L68),
 which is Algebatic data type. Since ADT container only accepts [Object 
types](https://github.com/apache/incubator-mxnet/blob/ed4aa23dba78def23abfe08c7e74eb4dbab1bbbb/python/mxnet/_ffi/_ctypes/object.py#L62-L92)
 as fields, [Integer and 
Float](https://github.com/apache/incubator-mxnet/blob/ed4aa23dba78def23abfe08c7e74eb4dbab1bbbb/src/api/_api_internal/_api_internal.cc#L36-L54)
 types are also registered as Objects so that we can feed list of Integers and 
Floats into PackedFunc based FFI. However, here are the limitations that only 
list of Integers and list of Floats are accepted by 
[`convert_to_node`](https://github.com/apache/incubator-mxnet/blob/3746babc8fdb211584a9a661207061cb646b01a8/python/mxnet/_ffi/node_generic.py#L40-L60)
 function, other structures like list of strings, dictionary wil
 l raise ValueError. 
   
   ## Proposed solutions
   Our PackedFunc based FFI is mainly adapted from TVM project. So, we can also 
adapt Map object and String object from TVM.  
   
   ## Examples
   ### A simple example of using Map container
   We can register a map related C function, such as sum of all the values in 
the Map. 
   ```C++
   MXNET_REGISTER_GLOBAL("_api.MapVSum")
   .set_body([](runtime::MXNetArgs args, runtime::MXNetRetValue* ret) {
     Object* ptr = static_cast<Object*>(args[0].value().v_handle);
     auto* n = static_cast<const runtime::MapObj*>(ptr);
     int all = 0;
     for (const auto& kv : *n) {
       runtime::Integer value = Downcast<runtime::Integer, 
ObjectRef>(kv.second);
       all = all + value->value;
     }
     *ret = all;
   });
   ```
   After compile and build the library, we can call `MapVSum` on Python side to 
sum all the diction values. 
   ```Python
   from mxnet._ffi import _init_api
   
   def map_sum():
       amap = mxnet._ffi.convert_to_node({"a": 2, "b": 3})
       # MapVSum: sum all the integer values
       return MapVSum(amap)
   
   _init_api("_api")
   ```
   
   ### Example for adopting new FFI on 
[CreateCachedOp](https://github.com/apache/incubator-mxnet/blob/ed4aa23dba78def23abfe08c7e74eb4dbab1bbbb/python/mxnet/_ctypes/ndarray.py#L106-L123)
 with map container
   ```Python
   # part of file incubator-mxnet/python/mxnet/_ctypes/ndarray.py
   # create CachedOp on python side
   from ..base import NDArrayHandle, CachedOpHandle, SymbolHandle
   from . import _api_internal
   
   class CachedOp(object):
       """Cached operator handle."""
       __slots__ = ["handle", "is_np_sym", "_monitor_callback"]
   
       def __init__(self, sym, flags=(), thread_safe=False):
           self._monitor_callback = None
   
           from ..symbol.numpy._symbol import _Symbol
           self.is_np_sym = bool(isinstance(sym, _Symbol))
           
           flags = dict([(key, str(value)) for key, value in flags])
           self.handle = CachedOpHandle(_api_internal._cached_op_create(
               sym.handle,
               len(flags),
               flags,
               thread_safe
           ))
   ```
   
   ```C++
   // register create cached_op as C function
   #include <mxnet/api_registry.h>
   #include <mxnet/runtime/packed_func.h>
   #include <mxnet/runtime/container_ext.h>
   #include "../imperative/cached_op.h"
   #include "../imperative/cached_op_threadsafe.h"
   
   namespace mxnet {
   
   MXNET_REGISTER_GLOBAL("_api._cached_op_create")
   .set_body([](runtime::MXNetArgs args, runtime::MXNetRetValue* ret) {
     nnvm::Symbol* sym = 
static_cast<nnvm::Symbol*>(static_cast<void*>(args[0]));
     int num_flags = args[1];
     Object* flags_ptr = static_cast<Object*>(args[2].value().v_handle);
     auto* n = static_cast<const runtime::MapObj*>(flags_ptr);
     bool thread_safe = args[3];
     std::vector<std::pair<std::string, std::string> > flags;
     flags.reserve(num_flags);
     for (const auto& kv : *n) {
       
flags.emplace_back(std::string(runtime::Downcast<runtime::String>(kv.first)),
                          
std::string(runtime::Downcast<runtime::String>(kv.second)));
     }
     mxnet::CachedOpPtr *out;
     if (!thread_safe) {
       out = new CachedOpPtr(new CachedOp(*sym, flags));
     } else {
       out = new CachedOpPtr(new CachedOpThreadSafe(*sym, flags));
     }
     *ret = static_cast<void*>(out);
   });
   
   }
   ```
   
   ## Some more discussions
   1. In order to visit the member of the container, like get attribute value 
for each entry in map from front end, we can improve our [object 
system](https://github.com/apache/incubator-mxnet/blob/master/include/mxnet/runtime/object.h)
 by adopting VisitAttrs from TVM, which should provide us with a reflection API 
to visit each member of the object. I think it should be necessary if we want 
to fully benefit from new FFI. 
   
   2. We may need to find  a way to make our containers capable of holding 
NDArrays. FFI containers are mainly adapted from TVM and can only hold 
ObjectRefs from [object 
system](https://github.com/apache/incubator-mxnet/blob/master/include/mxnet/runtime/object.h).
 Different from TVM's NDArray, which inherits from ObjectRef and there is 
customized smart pointer called ObjectPtr to count references, NDArray in MXNet 
uses [shared 
pointer](https://github.com/apache/incubator-mxnet/blob/765297bf5d00054fdc40f8987e8238fcfa5eca60/include/mxnet/ndarray.h#L1139)
 to hold internal data. Holding NDArray in container requires 
[NDArrayHandle](https://github.com/apache/incubator-mxnet/blob/master/src/runtime/ndarray_handle.cc)
 can affect NDArray's reference counting. 
   
   ## References
   - Reference and related literature
   [1] https://github.com/apache/incubator-mxnet/pull/17510
   
   - Known implementations
   TVM Runtime SystemL https://tvm.apache.org/docs/dev/runtime.html
   DGL FFI Containers: https://docs.dgl.ai/developer/ffi.html
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@mxnet.apache.org
For additional commands, e-mail: issues-h...@mxnet.apache.org

Reply via email to