Re: [C++] Building a ChunkedArray with allocation size control

Eric Jacobs Fri, 05 Jul 2024 09:35:25 -0700

Felipe Oliveira Carvalho wrote:

Hi,
The builders can't really know the size of the buffers when nested types are involved. The general solution would be an expensive traversal of the entire tree of builders (e.g. struct builder of nested column types like strings) on every append.

I understand that the number and structure of the buffers used will be different depending on the datatype of the arrays, and I'm okay with doing a traversal of the builder tree to identify all of the buffers in use. However, I'm not seeing why it would be necessary on every append, since the topology wouldn't be changing during the build of a single chunk (correct me if I'm wrong). A re-traversal of the builder tree at a coarser granularity (e.g. in between chunks) would be acceptable.
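To make the idea concrete, here is a minimal sketch of "traverse once per chunk, then do a cheap check on every append". It uses only the standard library: BuilderNode, CollectBuffers and AnyBufferOver are hypothetical names for illustration, not Arrow classes or functions, and the sketch assumes the tree topology and the per-node buffer count stay fixed while a chunk is being built.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a builder node; this is NOT an Arrow class.
struct BuilderNode {
  std::vector<std::size_t> buffer_sizes;  // bytes currently used, one entry per buffer
  std::vector<std::unique_ptr<BuilderNode>> children;
};

// Walk the tree once (e.g. at the start of each chunk) and cache a pointer
// to every buffer-size counter. The pointers stay valid only as long as the
// topology and the number of buffers per node do not change.
void CollectBuffers(const BuilderNode& node,
                    std::vector<const std::size_t*>& out) {
  for (const std::size_t& size : node.buffer_sizes) out.push_back(&size);
  for (const auto& child : node.children) CollectBuffers(*child, out);
}

// Per-append check: a flat scan over the cached pointers, no tree walk.
bool AnyBufferOver(const std::vector<const std::size_t*>& buffers,
                   std::size_t limit) {
  for (const std::size_t* size : buffers) {
    if (*size > limit) return true;
  }
  return false;
}
```

The intended use would be to call CollectBuffers once after starting a chunk, then call AnyBufferOver after each append and finish the chunk when it returns true.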

Also make sure you allow length to be > 0 because if a single string is bigger than X MB, you will *have to* violate this max buffer constraint. It can only be a soft constraint in a robust solution.

If there's no way that the constraint can be maintained as per the Arrow in-memory format, it will throw an error out from my MemoryPool, and in that case it just won't be supported here.
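For comparison, the soft-constraint behavior Felipe describes could look roughly like the sketch below. It is plain standard-library C++ rather than Arrow code, and ChunkBySoftLimit is a hypothetical helper name. It counts only the string bytes, which is an assumption; a real builder would also have offsets and validity buffers to account for.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Split values into chunks whose total byte size stays within soft_limit.
// A single value larger than soft_limit still gets a chunk of its own,
// which is why the limit can only be "soft".
std::vector<std::vector<std::string>> ChunkBySoftLimit(
    const std::vector<std::string>& values, std::size_t soft_limit) {
  std::vector<std::vector<std::string>> chunks;
  std::vector<std::string> current;
  std::size_t current_bytes = 0;
  for (const std::string& value : values) {
    // Close the chunk only if it already holds something; an empty chunk
    // must accept the value even when the value alone exceeds the limit.
    if (!current.empty() && current_bytes + value.size() > soft_limit) {
      chunks.push_back(std::move(current));
      current.clear();  // reset the moved-from vector to a known empty state
      current_bytes = 0;
    }
    current.push_back(value);
    current_bytes += value.size();
  }
  if (!current.empty()) chunks.push_back(std::move(current));
  return chunks;
}
```

Turning this into the hard-limit variant would mean rejecting the oversized value with an error instead of giving it its own chunk.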


Thanks,
-Eric

__
Felipe

On Thu, Jul 4, 2024 at 3:12 PM Eric Jacobs wrote:


    Hi,
    I would like to build a ChunkedArray but I need to limit the maximum
    size of each buffer (somewhere in the low MB's). Ending the current
    chunk and starting a new one is straightforward, but I'm having some
    difficulty detecting when the current buffer(s) are close to getting
    full. If I had the Builders I could check the length() as they are
    going along, but I'm not sure how I can get access to those as
    ChunkedArray is being built via the API.

    The size control doesn't have to be precise in my case; it just
    needs to be conservative as a limit (i.e. the builder cannot go over
    X MB).

    Any advice would be appreciated.
    Thanks,
    -Eric
