I am extremely sorry for the late reply; I didn't get an email notification
about your response. Thanks for the links! This is exactly what I wanted. I
tried calling `_import_from_c` in my code the same way, but it raises an error
saying that `pyarrow.DataType._import_from_c` doesn't exist. I am running
pyarrow 0.16.0. Could this be a version mismatch?
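
One way to test the version-mismatch theory is to ask the installed pyarrow
directly whether the method exists. The snippet below is only a sketch:
`has_c_data_interface` is a hypothetical helper (not part of pyarrow), and it
assumes that a build without the C interface simply lacks the
`DataType._import_from_c` attribute shown in the linked source.

```python
def has_c_data_interface(pa_module):
    """Return True if a pyarrow-like module exposes DataType._import_from_c.

    Hypothetical helper for illustration; it only inspects attributes and
    never calls the (private) import function itself.
    """
    data_type = getattr(pa_module, "DataType", None)
    return data_type is not None and hasattr(data_type, "_import_from_c")


try:
    import pyarrow as pa
except ImportError:
    pa = None  # pyarrow is not installed in this environment

if pa is not None:
    # Report the installed version next to the result of the attribute check.
    print("pyarrow", pa.__version__,
          "has DataType._import_from_c:", has_c_data_interface(pa))
```

If the check reports that the attribute is missing, the installed release
most likely predates the commit linked in the reply below.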

On 2020/03/29 20:46:32, Wes McKinney <wesmck...@gmail.com> wrote: 
> To add to this, take a look at the C interface functions in pyarrow
> 
> Reconstruct pyarrow.DataType from C ArrowSchema
> 
> https://github.com/apache/arrow/blob/b07c2626cb3cdd3498b41da9feedf7c8319baa27/python/pyarrow/types.pxi#L203
> 
> Reconstruct pyarrow.Array from C ArrowArray
> 
> https://github.com/apache/arrow/blob/b07c2626cb3cdd3498b41da9feedf7c8319baa27/python/pyarrow/array.pxi#L1176
> 
> The idea is that a single ArrowSchema may correspond to a sequence of
> ArrowArray, so the data type (equivalently schema) is represented
> separately from the array data.
> 
> You can see examples of both of these in the unit tests (which use
> cffi to create the C structs)
> 
> https://github.com/apache/arrow/blob/master/python/pyarrow/tests/test_cffi.py
> 
> If you're having trouble getting things to work, it would be helpful
> if you could show what data exactly you are putting into the C
> structures and how it is not returning the expected result when
> imported into pyarrow.
> 
> On Sun, Mar 29, 2020 at 3:41 PM Neal Richardson
> <neal.p.richard...@gmail.com> wrote:
> >
> > Hi Anish,
> > You may be interested in how the Arrow R package uses the C interface to
> > pass data to/from pyarrow. Both sides use the Arrow C++ library's
> > implementation of the C interface. See
> > https://github.com/apache/arrow/blob/master/r/src/py-to-r.cpp and
> > https://github.com/apache/arrow/blob/master/r/R/py-to-r.R. The Arrow C++
> > implementation is in
> > https://github.com/apache/arrow/tree/master/cpp/src/arrow/c.
> >
> > Neal
> >
> > On Sun, Mar 29, 2020 at 12:14 PM Anish Biswas <anishbiswas...@gmail.com>
> > wrote:
> >
> > > > I have been trying to wrap my head around the CDataInterface.rst document
> > > > (https://github.com/apache/arrow/blob/master/docs/source/format/CDataInterface.rst)
> > > > for a few days now. What I am trying to do is use the C interface with
> > > > minimal dependencies to produce blocks of bytes that pyarrow can
> > > > reconstruct and work on as a normal pyarrow array (and vice versa, so
> > > > in both directions).
> > >
> > > Here's what I already tried doing.
> > >
> > > >    - Created a C library that contains the two structs ArrowSchema and
> > > >    ArrowArray and some functions to export an int64_t array as an Arrow
> > > >    Array. This is very similar to what the document did with int32_t
> > > >    arrays.
> > >    - Imported the C library in Python. Created an int64_t pyarrow.array.
> > >    Serialized it to read the bytes via Numpy and populated the C struct I
> > >    created using the C library function.
> > >
> > > What I expected was that the bytes would have some resemblance to each
> > > other and that pyarrow would have some utility to pick up the ArrowArray
> > > struct and treat it as an Arrow Array. But I couldn't get it to work.
> > >
> > > > I am also confused about how to use ArrowSchema properly. The ArrowSchema
> > > > is the only structure that distinguishes one ArrowArray format from
> > > > another. However, I am not using it anywhere with the ArrowArray struct,
> > > > nor in any kind of initialization that tells the Arrow library "the next
> > > > structure you encounter will be of the kind that this ArrowSchema
> > > > describes", and that doesn't seem correct to me.
> > >
> > > > It would really help me out if you could tell me whether I have
> > > > misinterpreted the doc or am doing something wrong. Thanks!
> > >
> 
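
The separation discussed in the thread, where one ArrowSchema describes the
type and any number of ArrowArray structs carry the data, can be sketched with
the standard-library `ctypes` module. The field layout follows the struct
definitions in CDataInterface.rst; the helper names (`make_int64_schema`,
`make_int64_array`, `read_int64_values`) are hypothetical. This is an
illustration of the layout only: `release` is left NULL, so these structs are
not valid input for a real consumer such as pyarrow, which requires the
producer to install a release callback.

```python
import ctypes


class ArrowSchema(ctypes.Structure):
    pass


class ArrowArray(ctypes.Structure):
    pass


# Field order mirrors the struct definitions in CDataInterface.rst.
ArrowSchema._fields_ = [
    ("format", ctypes.c_char_p),
    ("name", ctypes.c_char_p),
    ("metadata", ctypes.c_char_p),
    ("flags", ctypes.c_int64),
    ("n_children", ctypes.c_int64),
    ("children", ctypes.POINTER(ctypes.POINTER(ArrowSchema))),
    ("dictionary", ctypes.POINTER(ArrowSchema)),
    ("release", ctypes.CFUNCTYPE(None, ctypes.POINTER(ArrowSchema))),
    ("private_data", ctypes.c_void_p),
]

ArrowArray._fields_ = [
    ("length", ctypes.c_int64),
    ("null_count", ctypes.c_int64),
    ("offset", ctypes.c_int64),
    ("n_buffers", ctypes.c_int64),
    ("n_children", ctypes.c_int64),
    ("buffers", ctypes.POINTER(ctypes.c_void_p)),
    ("children", ctypes.POINTER(ctypes.POINTER(ArrowArray))),
    ("dictionary", ctypes.POINTER(ArrowArray)),
    ("release", ctypes.CFUNCTYPE(None, ctypes.POINTER(ArrowArray))),
    ("private_data", ctypes.c_void_p),
]


def make_int64_schema():
    """Describe the type once: format string "l" means int64 in the spec."""
    return ArrowSchema(format=b"l", name=b"", flags=0, n_children=0)


def make_int64_array(values):
    """Fill an ArrowArray for non-null int64 values (release left NULL).

    Returns the struct plus the Python objects that own its memory; the
    caller must keep them alive for as long as the struct is in use.
    """
    data = (ctypes.c_int64 * len(values))(*values)
    # Primitive layout: buffer 0 is the validity bitmap (NULL because there
    # are no nulls), buffer 1 is the contiguous data.
    buffers = (ctypes.c_void_p * 2)(None, ctypes.addressof(data))
    array = ArrowArray(
        length=len(values), null_count=0, offset=0, n_buffers=2, n_children=0,
        buffers=ctypes.cast(buffers, ctypes.POINTER(ctypes.c_void_p)),
    )
    return array, (data, buffers)


def read_int64_values(array):
    """Read the data buffer back, interpreting it as the schema says (int64)."""
    data_ptr = ctypes.cast(array.buffers[1], ctypes.POINTER(ctypes.c_int64))
    return [data_ptr[array.offset + i] for i in range(array.length)]
```

Here a single `make_int64_schema()` result would be paired with every struct
returned by `make_int64_array`, which is why the type and the data are passed
to the importing side as two separate pointers.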
