Hi Tim,

I see what you're saying now; sorry that I didn't understand sooner.

We actually need this feature to be able to pass instances of
shared_ptr<T> (under very controlled conditions) into R using
reticulate, where T is any of

* Array
* ChunkedArray
* DataType
* RecordBatch
* Table
* and some other classes


I would suggest introducing a property on pyarrow Python objects that
returns the memory address of the wrapped shared_ptr<T> (i.e. the
shared_ptr<T>* value as an integer). You can then make your own copy
of the shared_ptr from that address. Would that work? The only reason
this is not implemented is that no one has needed it yet; mechanically
it does not strike me as that complex.
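
To make the idea concrete, here is a rough sketch of what the consuming
C++ side might look like. It assumes the address arrives as a plain
integer from a pyarrow property that does not exist yet. CopyFromAddress
and FakeTable are hypothetical names used only for illustration;
FakeTable stands in for arrow::Table so the snippet compiles without
Arrow.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for arrow::Table so the sketch compiles without Arrow.
struct FakeTable {
  int num_rows;
};

// Copy the shared_ptr<T> stored at `address`. The address is assumed to be
// the integer value of a shared_ptr<T>* handed over from the Python side.
// The returned copy shares ownership with the original, so the use count
// increases by one and the object stays alive after the Python wrapper dies.
template <typename T>
std::shared_ptr<T> CopyFromAddress(std::uintptr_t address) {
  assert(address != 0 && "expected a non-null shared_ptr<T>* address");
  const auto* source = reinterpret_cast<const std::shared_ptr<T>*>(address);
  return *source;
}
```

This is only safe if both sides agree on the layout of shared_ptr<T> and
of T itself, which in practice means the same C++ standard library and
matching Arrow versions on both sides, and the Python object must still
be alive at the moment the copy is made.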

See https://issues.apache.org/jira/browse/ARROW-3750. My comment there
from November 2018: "Methods would need to be added to the Cython
extension types to give the memory address of the smart pointer object
they contain". I agree with my younger self. Are you up for submitting
a PR?

- Wes

On Tue, Sep 10, 2019 at 6:31 PM Tim Paine <t.paine...@gmail.com> wrote:
>
> The end goal is to go direct from pyarrow to wasm without intermediate 
> transforms. I can definitely make it work as is, we'll just have to be 
> careful that the code we compile to webassembly matches exactly either our 
> local copy of arrow if the user hasn't installed pyarrow, otherwise their 
> installed copy.
>
> Tim Paine
> tim.paine.nyc
> 908-721-1185
>
> > On Sep 10, 2019, at 19:12, Tim Paine <t.paine...@gmail.com> wrote:
> >
> > We're building webassembly, so we obviously don't want to introduce a 
> > pyarrow dependency. I don't want to do any pyarrow manipulations in c++, 
> > just get the c++ table. I was hoping pyarrow might expose a raw pointer or 
> > have something castable.
> >
> > It seems to be a big limitation, there is no way of communicating a pyarrow 
> > table to a c++ library that uses arrow without that library linking against 
> > pyarrow.
> >
> > Tim Paine
> > tim.paine.nyc
> > 908-721-1185
> >
> >> On Sep 10, 2019, at 17:44, Wes McKinney <wesmck...@gmail.com> wrote:
> >>
> >> The Python extension types are defined in Cython, not C or C++ so you need
> >> to load the Cython extensions in order to instantiate the classes.
> >>
> >> Why do you have 2 copies of the C++ library? That seems easy to fix. If you
> >> are using wheels from PyPI I would recommend that you switch to conda or
> >> your own wheels without the C++ libraries bundled.
> >>
> >>> On Tue, Sep 10, 2019, 4:23 PM Tim Paine <t.paine...@gmail.com> wrote:
> >>>
> >>> Is there no way to do it without PyArrow? My C++ library is building arrow
> >>> itself, which means if I use PyArrow I’ll end up having 2 copies: one from
> >>> my local C++ only build, and one from PyArrow.
> >>>
> >>>> On Sep 10, 2019, at 5:18 PM, Wes McKinney <wesmck...@gmail.com> wrote:
> >>>>
> >>>> hi Tim,
> >>>>
> >>>> You can use the functions in
> >>>>
> >>>>
> >>> https://github.com/apache/arrow/blob/master/cpp/src/arrow/python/pyarrow.h
> >>>>
> >>>> You need to call "import_pyarrow()" from C++ before these APIs can be
> >>>> used. It's similar to the NumPy C API in that regard
> >>>>
> >>>> - Wes
> >>>>
> >>>>> On Tue, Sep 10, 2019 at 4:13 PM Tim Paine <t.paine...@gmail.com> wrote:
> >>>>>
> >>>>> Hey all, following up on a question I asked on stack overflow <
> >>> https://stackoverflow.com/questions/57863751/how-to-convert-pyarrow-table-to-arrow-table-when-interfacing-between-pyarrow-in
> >>>> .
> >>>>>
> >>>>> It seems there is some code <
> >>> https://arrow.apache.org/docs/python/extending.html#_CPPv412unwrap_tableP8PyObjectPNSt10shared_ptrI5TableEE>
> >>> in PyArrow’s C++ to convert from a PyArrow table to an Arrow table. The
> >>> problem with this is that my C++ library <
> >>> https://github.com/finos/perspective> is going to build and link against
> >>> Arrow on the C++ side rather than PyArrow side (because it will also be
> >>> consumed in WebAssembly), so I want to avoid also linking against PyArrow’s
> >>> copy of the arrow library. I also need to look for PyArrow’s header files,
> >>> which might conflict with the version in the local C++ code.
> >>>>>
> >>>>> My solution right now is to just assert that PyArrow version == Arrow
> >>> version and do some pruning (so I link against local libarrow and PyArrow’s
> >>> libarrow_python rather than use PyArrow’s libarrow), but ideally it would
> >>> be great if there was a clean way to hand a PyArrow Table over to C++
> >>> without requiring the C++ to have PyArrow (e.g. using only a PyObject *).
> >>> Please forgive my ignorance/google skills if it's already possible!
> >>>>>
> >>>>> unwrap_table code:
> >>>>>
> >>> https://github.com/apache/arrow/blob/c39e3508f93ea41410c2ae17783054d05592dc0e/python/pyarrow/public-api.pxi#L310
> >>>>>
> >>>>> library pruning:
> >>>>>
> >>> https://github.com/finos/perspective/blob/python_arrow/cmake/modules/FindPyArrow.cmake#L53
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> Tim
> >>>
> >>>
