Thanks for the pointers! The migration is going well.

We have been using Arrow 0.16.0 RecordBatchStreamWriter
<https://github.com/Paradigm4/bridge/blob/master/src/PhysicalXSave.cpp#L450>
with & without CompressedOutputStream and wrote the resulting Arrow Buffer
data to S3
<https://github.com/Paradigm4/bridge/blob/master/src/S3Driver.cpp#L168> or file
system
<https://github.com/Paradigm4/bridge/blob/master/src/FSDriver.cpp#L156>. We
have a sizable amount of data saved this way.

Once we upgrade our C++ code to use Arrow 3.0.0 or 4.0.0, will it be
possible to read the Arrow stream files written with Arrow 0.16.0?

Thank you!
Rares

On Thu, May 27, 2021 at 1:44 PM Benjamin Kietzman <bengil...@gmail.com>
wrote:

> Yes this is an adaptation of ARROW_ASSIGN_OR_RAISE for
> their bridge, which seems to throw exceptions instead of returning
> Status/Result
>
> On Thu, May 27, 2021 at 4:42 PM Micah Kornfield <emkornfi...@gmail.com>
> wrote:
>
> > For the macro, I believe ARROW_ASSIGN_OR_RAISE already does this?
> >
> > On Thu, May 27, 2021 at 1:19 PM Benjamin Kietzman <bengil...@gmail.com>
> > wrote:
> >
> > > unique_ptr is used to designate unique ownership of the buffer
> > > just created. It's fairly compatible with shared_ptr since
> > > unique_ptr can convert implicitly to shared_ptr.
> > >
> > > One other refactoring in play here: we've been moving from
> > > Status-returning-out-argument functions to the more ergonomic
> > > Result<T>. I'd recommend you write a new macro for dealing with
> > > Result<T>s, like:
> > >
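> > >     // Two-step concatenation so __COUNTER__ expands before pasting.
> > >     #define ASSIGN_OR_THROW_CONCAT_INNER(x, y) x##y
> > >     #define ASSIGN_OR_THROW_CONCAT(x, y) ASSIGN_OR_THROW_CONCAT_INNER(x, y)
> > >     #define ASSIGN_OR_THROW_IMPL(result_name, lhs, rexpr) \
> > >         auto&& result_name = (rexpr); \
> > >         THROW_NOT_OK((result_name).status()); \
> > >         lhs = std::move(result_name).ValueUnsafe();
> > >     #define ASSIGN_OR_THROW(lhs, rexpr) \
> > >         ASSIGN_OR_THROW_IMPL( \
> > >             ASSIGN_OR_THROW_CONCAT(_maybe, __COUNTER__), lhs, rexpr)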
> > >
> > > Then lines such as
> > > https://github.com/Paradigm4/bridge/blob/master/src/Driver.h#L196
> > > can be rewritten as:
> > >
> > >     ASSIGN_OR_THROW(buffer, arrow::AllocateBuffer(length));
> > >
> > > Does that help?
> > >
> > > On Thu, May 27, 2021 at 3:47 PM Rares Vernica <rvern...@gmail.com>
> > > wrote:
> > >
> > > > Hello,
> > > >
> > > > We are trying to migrate from Arrow 0.16.0 to a newer version,
> > > > hopefully up to 4.0.0. The Arrow 0.17.0 change in AllocateBuffer from
> > > > taking a shared_ptr<Buffer> to returning a unique_ptr<Buffer> is making
> > > > things very difficult. We wonder if there is a strong reason behind the
> > > > change from shared_ptr to unique_ptr and if there is an easier path
> > > > forward for us.
> > > >
> > > > In our code, we interchangeably use Buffer and ResizableBuffer. We pass
> > > > around these pointers across a number of classes. They are allocated or
> > > > resized here
> > > > https://github.com/Paradigm4/bridge/blob/master/src/Driver.h#L191
> > > > Moreover, we cast the ResizableBuffer instance to Buffer in order to
> > > > have all our methods only deal with Buffer, here
> > > > https://github.com/Paradigm4/bridge/blob/master/src/Driver.h#L151
> > > >
> > > > In Arrow 0.16.0 AllocateBuffer took a shared_ptr<Buffer> and this works
> > > > fine. In Arrow 0.17.0 AllocateBuffer returns a unique_ptr<Buffer>. Our
> > > > cast from ResizableBuffer to Buffer won't work on unique_ptr and we
> > > > won't be able to pass the Buffer around so easily.
> > > >
> > > > I noticed that there is another AllocateBuffer in MemoryManager that
> > > > returns a shared_ptr.
> > > > https://arrow.apache.org/docs/cpp/api/memory.html?highlight=resizablebuffer#_CPPv4N5arrow13MemoryManager14AllocateBufferE7int64_t
> > > > Is this a better alternative to allocate a buffer? Is there a similar
> > > > method to allocate a resizable buffer?
> > > >
> > > > Thank you,
> > > > Rares
> > > >
> > >
> >
>

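For reference, on the point quoted above that unique_ptr converts implicitly
to shared_ptr: below is a minimal, self-contained sketch of that conversion,
including the derived-to-base step (ResizableBuffer to Buffer). It does not
use Arrow at all. Buffer, ResizableBuffer, AllocateResizable, AllocateShared
and Consume are hypothetical stand-ins written only for illustration, and the
sketch assumes C++14 for std::make_unique.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a read-only buffer interface (not arrow::Buffer).
struct Buffer {
  virtual ~Buffer() = default;
  virtual std::int64_t size() const = 0;
};

// Hypothetical stand-in for a resizable buffer (not arrow::ResizableBuffer).
struct ResizableBuffer : Buffer {
  explicit ResizableBuffer(std::int64_t n)
      : bytes_(static_cast<std::size_t>(n)) {}
  std::int64_t size() const override {
    return static_cast<std::int64_t>(bytes_.size());
  }
  void Resize(std::int64_t n) { bytes_.resize(static_cast<std::size_t>(n)); }

 private:
  std::vector<std::uint8_t> bytes_;
};

// Hypothetical allocator in the "returns a unique_ptr" style.
std::unique_ptr<ResizableBuffer> AllocateResizable(std::int64_t n) {
  return std::make_unique<ResizableBuffer>(n);
}

// The returned unique_ptr<ResizableBuffer> rvalue converts implicitly to
// shared_ptr<Buffer>; no explicit cast is needed.
std::shared_ptr<Buffer> AllocateShared(std::int64_t n) {
  return AllocateResizable(n);
}

// Existing-style code that only ever sees shared_ptr<Buffer>.
std::int64_t Consume(const std::shared_ptr<Buffer>& buf) {
  return buf->size();
}
```

A named unique_ptr needs std::move to hand over ownership, for example
`std::shared_ptr<Buffer> shared = std::move(owned);`, after which `owned` is
null. Resizing has to happen while the pointer is still typed as
ResizableBuffer, since the stand-in Buffer interface has no Resize.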