kevingurney commented on code in PR #45973:
URL: https://github.com/apache/arrow/pull/45973#discussion_r2029311841
##########
matlab/doc/matlab_interface_for_apache_arrow_design.md:
##########
@@ -179,47 +177,65 @@ Roughly speaking, local memory sharing workflows can be divided into two categor
 To share a MATLAB `arrow.Array` with PyArrow efficiently, a user could use the `exportToCDataInterface` method to export the Arrow memory wrapped by an `arrow.Array` to the C Data Interface format, consisting of two C-style structs, [`ArrowArray`] and [`ArrowSchema`], which represent the Arrow data and associated metadata.
-Memory addresses to the `ArrowArray` and `ArrowSchema` structs are returned by the call to `exportToCDataInterface`. These addresses can be passed to Python directly, without having to make any copies of the underlying Arrow data structures that they refer to. A user can then wrap the underlying data pointed to by the `ArrowArray` struct (which is already in the [Arrow Columnar Format]), as well as extract the necessary metadata from the `ArrowSchema` struct, to create a `pyarrow.Array` by using the static method `py.pyarrow.Array._import_from_c`.
+Memory addresses to the `ArrowArray` and `ArrowSchema` structs are returned by the call to `export`. These addresses can be passed to Python directly, without having to make any copies of the underlying Arrow data structures that they refer to. A user can then wrap the underlying data pointed to by the `ArrowArray` struct (which is already in the [Arrow Columnar Format]), as well as extract the necessary metadata from the `ArrowSchema` struct, to create a `pyarrow.Array` by using the static method `pyarrow.Array._import_from_c`.
+
+Since we require multiple lines to import our matlab Arrow array into python, we'll use a function called `pyrunfile`, which allows us to [execute Python statements in a file supplied as an argument to the function](https://www.mathworks.com/help/matlab/ref/pyrunfile.html).
 ###### Example Code:
+
+```python
+# file located in same directory as the matlab file, named import_from_c.py
+import pyarrow as pa
+array = pa.Array._import_from_c(arrayMemoryAddress, schemaMemoryAddress)
+```
+
 ``` matlab
 % Create a MATLAB arrow.Array.
 >> AA = arrow.array([1, 2, 3, 4, 5]);
+% Export C Data Interface C-style structs for `arrow.array.Array` values and schema
+>> cArray = arrow.c.Array();
+>> cSchema = arrow.c.Schema();
+
 % Export the MATLAB arrow.Array to the C Data Interface format, returning the
 % memory addresses of the required ArrowArray and ArrowSchema C-style structs.
->> [arrayMemoryAddress, schemaMemoryAddress] = AA.exportToCDataInterface();
+>> AA.export(cArray.Address, cSchema.Address);
 % Import the memory addresses of the C Data Interface format structs to create a pyarrow.Array.
->> PA = py.pyarrow.Array._import_from_c(arrayMemoryAddress, schemaMemoryAddress);
+>> PA = pyrunfile("import_from_c.py", "array", arrayMemoryAddress=cArray.Address, schemaMemoryAddress=cSchema.Address);
 ```
 Conversely, a user can create an Arrow array using PyArrow and share it with MATLAB. To do this, they can call the method `_export_to_c` to export a `pyarrow.Array` to the C Data Interface format.
-The memory addresses to the `ArrowArray` and `ArrowSchema` structs populated by the call to `_export_to_c` can be passed to the static method `arrow.Array.importFromCDataInterface` to construct a MATLAB `arrow.Array` with zero copies.
+**NOTE:** Since the python calls to `_export_to_c` and `_import_from_c` have underscores at the beginning of their names, they cannot be called directly in MATLAB. MATLAB member functions or variables are [not allowed to start with an underscore](https://www.mathworks.com/help/matlab/matlab_prog/variable-names.html).
-The example code below is adapted from the [`test_cffi.py` test cases for PyArrow].
+To initialize a Python `pyarrow` array, the MATLAB `pyrunfile` command can be used, which supports the execution of Python code containing variables and functions with names that start with an underscore.

Review Comment:
```suggestion
To initialize a Python `pyarrow` array, `pyrunfile` can (again) be used to execute a Python script containing variables and functions with names that start with an underscore.
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
