I need to cut another RC due to a compile error in the ORC adapter, so
I can include the patch for this issue as well.

On Wed, Apr 21, 2021 at 4:00 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
> Definitely a bug. I just opened
>
> https://issues.apache.org/jira/browse/ARROW-12495
>
> If there's a chance to get this into 4.0.0 this would be a nice one
> but I suspect the next RC is already under way (it need not block
> since this bug has been present a long time)
>
> On Wed, Apr 21, 2021 at 3:31 AM Antoine Pitrou <anto...@python.org> wrote:
> >
> >
> > It sounds like a bug if is_mutable_ is true but mutable_data_ is nullptr.
> >
> > Regards
> >
> > Antoine.
> >
> >
> > Le 21/04/2021 à 03:17, Weston Pace a écrit :
> > > If it comes from pandas (and is eligible for zero-copy) then the
> > > buffer implementation will be `NumPyBuffer`.  Printing one in GDB
> > > yields...
> > >
> > > ```
> > > $12 = {_vptr.Buffer = 0x7f0b66e147f8 <vtable for
> > > arrow::py::NumPyBuffer+16>, is_mutable_ = true, is_cpu_ = true, data_
> > > = 0x55b71f901a70 "\001", mutable_data_ = 0x0, size_ = 16, capacity_ =
> > > 16,
> > >    parent_ = {<std::__shared_ptr<arrow::Buffer,
> > > (__gnu_cxx::_Lock_policy)2>> =
> > > {<std::__shared_ptr_access<arrow::Buffer, (__gnu_cxx::_Lock_policy)2,
> > > false, false>> = {<No data fields>}, _M_ptr = 0x0,
> > >        _M_refcount = {_M_pi = 0x0}}, <No data fields>},
> > >    memory_manager_ = {<std::__shared_ptr<arrow::MemoryManager,
> > > (__gnu_cxx::_Lock_policy)2>> =
> > > {<std::__shared_ptr_access<arrow::MemoryManager,
> > > (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>},
> > >        _M_ptr = 0x55b71fdca4e0, _M_refcount = {_M_pi =
> > > 0x55b71fb90640}}, <No data fields>}}
> > > ```
> > >
> > > Notice that `is_cpu_` and `is_mutable_` are both `true`.  It's maybe a
> > > bug that `is_mutable_` is true.  Although maybe not as it appears to
> > > be telling whether the underlying numpy buffer itself is mutable or
> > > not...
> > >
> > > ```
> > > if (PyArray_FLAGS(ndarray) & NPY_ARRAY_WRITEABLE) {
> > >      is_mutable_ = true;
> > > }
> > > ```
> > >
> > >
> > > On Tue, Apr 20, 2021 at 2:15 PM Niranda Perera <niranda.per...@gmail.com> 
> > > wrote:
> > >>
> > >> Hi all,
> > >>
> > >> We have been using Arrow v2.0.0 and we encountered the following issue.
> > >>
> > >> I was reading a table with numeric data using pandas.read_csv and then
> > >> converting it into pyarrow table. In our application (Cylon
> > >> <https://github.com/cylondata/cylon>), we are accessing this pyarrow 
> > >> table
> > >> from c++. We want to access the mutable data of the arrays in the pyarrow
> > >> table.
> > >>
> > >> But the following returns a nullptr.
> > >> T *mutable_data = array->data()->GetMutableValues<T>(1); // returns 
> > >> nullptr
> > >>
> > >> Interestingly,
> > >> array->data()->buffers[1]->IsMutable(); // returns true
> > >> array->data()->buffers[1]->IsCpu(); // returns true
> > >>
> > >> This only happens when I use pandas df to create a pyarrow table. It
> > >> wouldn't happen when I use pyarrow.read_csv. So, I am guessing there's 
> > >> some
> > >> issue in the buffer creation from pandas df.
> > >>
> > >> Is this an expected behavior? or has this been resolved in v2.0< 
> > >> releases?
> > >>
> > >> Best
> > >> --
> > >> Niranda Perera
> > >> https://niranda.dev/
> > >> @n1r44 <https://twitter.com/N1R44>

Reply via email to