Looking at the code, it looks simple to add. I will look into it this week and open a PR if I get something usable.

Tim Paine
tim.paine.nyc
908-721-1185

> On Sep 10, 2019, at 19:35, Wes McKinney <wesmck...@gmail.com> wrote:
>
> Hi Tim,
>
> I see what you're saying now, sorry that I didn't understand sooner.
>
> We actually need this feature to be able to pass instances of
> shared_ptr<T> (under very controlled conditions) into R using
> reticulate, where T is any of
>
> * Array
> * ChunkedArray
> * DataType
> * RecordBatch
> * Table
> * and some other classes
>
> I would suggest introducing a property on pyarrow Python objects that
> returns the memory address of the wrapped shared_ptr<T> (i.e. the
> integer leading to shared_ptr<T>*). Then you can create your copy of
> that. Would that work? The only reason this is not implemented is that
> no one has needed it yet, mechanically it does not strike me as that
> complex.
>
> See https://issues.apache.org/jira/browse/ARROW-3750. My comment in
> November 2018 "Methods would need to be added to the Cython extension
> types to give the memory address of the smart pointer object they
> contain". I agree with my younger self. Are you up to submit a PR?
>
> - Wes
>
>> On Tue, Sep 10, 2019 at 6:31 PM Tim Paine <t.paine...@gmail.com> wrote:
>>
>> The end goal is to go direct from pyarrow to wasm without intermediate
>> transforms. I can definitely make it work as is, we'll just have to be
>> careful that the code we compile to webassembly matches exactly either our
>> local copy of arrow if the user hasn't installed pyarrow, otherwise their
>> installed copy.
>>
>> Tim Paine
>> tim.paine.nyc
>> 908-721-1185
>>
>>> On Sep 10, 2019, at 19:12, Tim Paine <t.paine...@gmail.com> wrote:
>>>
>>> We're building webassembly, so we obviously don't want to introduce a
>>> pyarrow dependency. I don't want to do any pyarrow manipulations in c++,
>>> just get the c++ table. I was hoping pyarrow might expose a raw pointer or
>>> have something castable.
>>>
>>> It seems to be a big limitation, there is no way of communicating a pyarrow
>>> table to a c++ library that uses arrow without that library linking against
>>> pyarrow.
>>>
>>> Tim Paine
>>> tim.paine.nyc
>>> 908-721-1185
>>>
>>>> On Sep 10, 2019, at 17:44, Wes McKinney <wesmck...@gmail.com> wrote:
>>>>
>>>> The Python extension types are defined in Cython, not C or C++ so you need
>>>> to load the Cython extensions in order to instantiate the classes.
>>>>
>>>> Why do you have 2 copies of the C++ library? That seems easy to fix. If you
>>>> are using wheels from PyPI I would recommend that you switch to conda or
>>>> your own wheels without the C++ libraries bundled.
>>>>
>>>>> On Tue, Sep 10, 2019, 4:23 PM Tim Paine <t.paine...@gmail.com> wrote:
>>>>>
>>>>> Is there no way to do it without PyArrow? My C++ library is building arrow
>>>>> itself, which means if I use PyArrow I’ll end up having 2 copies: one from
>>>>> my local C++ only build, and one from PyArrow.
>>>>>
>>>>>> On Sep 10, 2019, at 5:18 PM, Wes McKinney <wesmck...@gmail.com> wrote:
>>>>>>
>>>>>> hi Tim,
>>>>>>
>>>>>> You can use the functions in
>>>>>>
>>>>>> https://github.com/apache/arrow/blob/master/cpp/src/arrow/python/pyarrow.h
>>>>>>
>>>>>> You need to call "import_pyarrow()" from C++ before these APIs can be
>>>>>> used. It's similar to the NumPy C API in that regard
>>>>>>
>>>>>> - Wes
>>>>>>
>>>>>>> On Tue, Sep 10, 2019 at 4:13 PM Tim Paine <t.paine...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hey all, following up on a question I asked on stack overflow
>>>>>>> <https://stackoverflow.com/questions/57863751/how-to-convert-pyarrow-table-to-arrow-table-when-interfacing-between-pyarrow-in>.
>>>>>>>
>>>>>>> It seems there is some code
>>>>>>> <https://arrow.apache.org/docs/python/extending.html#_CPPv412unwrap_tableP8PyObjectPNSt10shared_ptrI5TableEE>
>>>>>>> in PyArrow’s C++ to convert from a PyArrow table to an Arrow table. The
>>>>>>> problem with this is that my C++ library
>>>>>>> <https://github.com/finos/perspective> is going to build and link against
>>>>>>> Arrow on the C++ side rather than PyArrow side (because it will also be
>>>>>>> consumed in WebAssembly), so I want to avoid also linking against
>>>>>>> PyArrow’s copy of the arrow library. I also need to look for PyArrow’s
>>>>>>> header files, which might conflict with the version in the local C++ code.
>>>>>>>
>>>>>>> My solution right now is to just assert that PyArrow version == Arrow
>>>>>>> version and do some pruning (so I link against local libarrow and
>>>>>>> PyArrow’s libarrow_python rather than use PyArrow’s libarrow), but ideally
>>>>>>> it would be great if there was a clean way to hand a PyArrow Table over to
>>>>>>> C++ without requiring the C++ to have PyArrow (e.g. using only a
>>>>>>> PyObject *). Please forgive my ignorance/google skills if it’s already
>>>>>>> possible!
>>>>>>>
>>>>>>> unwrap_table code:
>>>>>>> https://github.com/apache/arrow/blob/c39e3508f93ea41410c2ae17783054d05592dc0e/python/pyarrow/public-api.pxi#L310
>>>>>>>
>>>>>>> library pruning:
>>>>>>> https://github.com/finos/perspective/blob/python_arrow/cmake/modules/FindPyArrow.cmake#L53
>>>>>>>
>>>>>>> Tim