Good to know. IPC in general should be better. The worst-case scenario I've seen is the row-wise population situation.
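If you want to see that growth directly, something like the sketch below should print the allocation steps in Go. It's untested and the import path depends on which Arrow release you're on, but CheckedAllocator is handy for exactly this kind of measurement:

    package main

    import (
        "fmt"

        "github.com/apache/arrow/go/v15/arrow/array"
        "github.com/apache/arrow/go/v15/arrow/memory"
    )

    func main() {
        // Wrap the Go allocator so we can watch how many bytes the
        // builder holds as values are appended one at a time.
        mem := memory.NewCheckedAllocator(memory.NewGoAllocator())

        bld := array.NewFloat64Builder(mem)
        defer bld.Release()

        // Appending row-wise: whenever capacity runs out, the builder
        // reallocates (roughly doubling), so the held bytes step up in
        // big jumps instead of growing 8 bytes per value.
        prev := 0
        for i := 0; i < 1_000_000; i++ {
            bld.Append(float64(i))
            if cur := mem.CurrentAlloc(); cur != prev {
                fmt.Printf("after %8d appends: %10d bytes held\n", i+1, cur)
                prev = cur
            }
        }

        // Calling bld.Reserve(n) up front with an estimated n avoids
        // the doubling entirely -- that's the "preallocate from an
        // estimate" approach.
    }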
On Thu, Mar 14, 2024, 2:52 PM Greg Lowe <greg.l...@gmail.com> wrote:

> I had a quick look at the Arrow Go source code. In the IPC case, when
> using the Go allocator, it looks like it allocates to the nearest
> multiple of 64 bytes. I'm not very familiar with the details of how the
> Go runtime handles large byte array allocations, but from a quick scan
> of the docs, I believe these get rounded up to the nearest page size of
> 4K. So I don't think there's a power-of-two issue when reading record
> batches via IPC.
>
> On Fri, 15 Mar 2024 at 13:34, Jacques Nadeau <jacq...@apache.org> wrote:
>
>> I would expect Go to allocate to the IPC size, but the underlying
>> allocator behavior will still be present. It seems like the golang
>> runtime allocator is based on tcmalloc, so it would probably round up
>> to the next size class. I'd assume the waste increases at larger
>> allocation sizes, but you'd have to review the details to understand
>> it better.
>>
>> On Thu, Mar 14, 2024, 2:15 PM Greg Lowe <greg.l...@gmail.com> wrote:
>>
>>> Note, I'm mostly concerned about constraining memory use when
>>> reading record batches from the IPC format. I'm not so concerned
>>> about memory use by the builders while writing them.
>>>
>>> Is the power-of-two allocation also used when reading a record batch
>>> from an IPC file? I would have assumed that wouldn't be necessary,
>>> since the required sizes would be known up front and encoded in the
>>> IPC format.
>>>
>>> On Fri, 15 Mar 2024 at 11:33, Jacques Nadeau <jacq...@apache.org> wrote:
>>>
>>>> It depends on the implementation, but some implementations use
>>>> power-of-two allocations or similar (not sure on the golang front).
>>>> So one might start with space for 80 integers, and then once you get
>>>> to 81, the allocation doubles to 160 integers. I know the Java
>>>> library historically operated this way (albeit not exactly a power
>>>> of two, because of space related to colocated allocations for
>>>> nullability). So trying to constrain memory with record-at-a-time
>>>> writing/reallocation will likely turn out pretty poorly. I recommend
>>>> you initially preallocate your batch size to your max memory based
>>>> on estimates, fill things in, and then adjust your estimation
>>>> algorithm over time.
>>>>
>>>> On Thu, Mar 14, 2024, 12:25 PM Greg Lowe <greg.l...@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I'm aiming to reply to the following thread. Not sure if this
>>>>> message will appear in the right place.
>>>>> https://lists.apache.org/thread/93kg641xk52lm5m11vwodbyc1hzvbnf3
>>>>>
>>>>> I've implemented a workaround for a similar use case. I thought I'd
>>>>> share it, as someone may be able to recommend a better solution
>>>>> using the existing API, or perhaps we can discuss additions to the
>>>>> API that would make this easier.
>>>>>
>>>>> In my use case, the limitation is the memory available when reading
>>>>> a record batch. I'd like to keep the in-memory size of each record
>>>>> batch within a maximum number of bytes. Note, I'm not concerned
>>>>> about the on-disk size (which will be smaller due to LZ4
>>>>> compression).
>>>>>
>>>>> So when appending values, I'd like to be able to specify a maximum
>>>>> size, say 500MB, and then once that's exceeded, write the record
>>>>> batch to disk.
>>>>>
>>>>> The data types I need to support are float64, int64, bool,
>>>>> listof(float64), listof(int64), listof(bool), and strings.
>>>>>
>>>>> In my use case, I'm writing to a builder in a row-wise fashion.
>>>>> My current approach is: as I write each cell, I increment a
>>>>> variable that tracks the approximate memory used, in bytes.
>>>>> Luckily, for the types I need to support, this is fairly simple to
>>>>> track approximately.
>>>>>
>>>>> i.e. a float64 is "+8", a list of float64 is "len(floats)*8+8".
>>>>>
>>>>> Is there a better way to do this using the existing API?
>>>>>
>>>>> Would it make sense for this to be supported natively by the API?
>>>>>
>>>>> I'm using the Go implementation, but I guess this applies equally
>>>>> to C++, and maybe other implementations too.
>>>>>
>>>>> Thanks for taking the time to read this.
>>>>>
>>>>> Cheers,
>>>>> Greg
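P.S. For anyone finding this thread later, here's roughly what the per-cell accounting Greg describes could look like in Go. The type and method names are made up for illustration, the byte costs mirror his "+8" / "len(floats)*8+8" figures, and it deliberately ignores validity bitmaps and allocator rounding, so treat it as an estimate rather than an exact accounting:

    package main

    import "fmt"

    // batchSizeEstimator tracks the approximate in-memory size of a
    // record batch as cells are appended row-wise. Fixed-width values
    // cost their width; lists cost their element bytes plus an offset.
    type batchSizeEstimator struct {
        bytes int64
        limit int64
    }

    func (e *batchSizeEstimator) addFloat64() { e.bytes += 8 }
    func (e *batchSizeEstimator) addInt64()   { e.bytes += 8 }

    // Arrow stores bools as 1 bit; counting a full byte over-estimates,
    // which errs on the safe side for a memory cap.
    func (e *batchSizeEstimator) addBool() { e.bytes++ }

    // String data plus a 32-bit offset entry (use 8 for LargeString).
    func (e *batchSizeEstimator) addString(s string) { e.bytes += int64(len(s)) + 4 }

    func (e *batchSizeEstimator) addFloat64List(v []float64) { e.bytes += int64(len(v))*8 + 8 }
    func (e *batchSizeEstimator) addInt64List(v []int64)     { e.bytes += int64(len(v))*8 + 8 }
    func (e *batchSizeEstimator) addBoolList(v []bool)       { e.bytes += int64(len(v)) + 8 }

    // full reports whether the current batch should be flushed to disk
    // and a fresh builder started.
    func (e *batchSizeEstimator) full() bool { return e.bytes >= e.limit }

    func (e *batchSizeEstimator) reset() { e.bytes = 0 }

    func main() {
        est := &batchSizeEstimator{limit: 500 << 20} // 500MB cap per batch

        row := make([]float64, 100) // stand-in for one list cell per row
        for i := 0; i < 2_000_000; i++ {
            est.addFloat64()
            est.addFloat64List(row)
            if est.full() {
                fmt.Printf("flush batch (~%d bytes) after %d rows\n", est.bytes, i+1)
                est.reset()
            }
        }
    }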