Hi,

I wasn't aware that jemalloc automatically uses mmap for larger
allocations, and I haven't tested this yet.

The approach could differ in that we know which parts of the buffers are
going to be used next (the buffers are append-only) and which parts won't
be needed until the row group is actually flushed (and when flushing, we
also know the order). But I'm not sure whether that knowledge helps much
in a) saving memory compared to a generic allocator or b) improving
performance. In addition, communicating this knowledge to the
implementation will be tricky in the general case, I guess.
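To illustrate what I mean by communicating that knowledge: for an
mmap-backed, append-only buffer, one could hint finished regions to the OS
roughly like this (a minimal POSIX sketch; the helper and its call sites
are hypothetical, not existing Arrow code):

    #include <sys/mman.h>
    #include <cstddef>

    // Hypothetical helper: a region of a shared, file-backed mapping has
    // been fully written and won't be touched again until the row group
    // is flushed, so its pages may be reclaimed from RAM.
    void DropFinishedPages(char* map_base, std::size_t begin,
                           std::size_t end, std::size_t page_size) {
      // Only whole pages strictly inside [begin, end) are safe to drop;
      // round inward to page boundaries so live (appendable) data
      // stays resident.
      std::size_t first = (begin + page_size - 1) / page_size * page_size;
      std::size_t last = end / page_size * page_size;
      if (first >= last) return;
      // Schedule writeback of dirty pages, then let the kernel reclaim
      // them; faulting the pages back in later rereads them from the file.
      msync(map_base + first, last - first, MS_ASYNC);
      madvise(map_base + first, last - first, MADV_DONTNEED);
    }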

Regarding substituting another memory pool: I was unsure whether that
pool would also be used for further allocations where the default memory
pool would be more appropriate. If not, then setting the memory pool in
the writer properties should indeed work well.
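For reference, this is the kind of wiring I have in mind, using the
builder's memory_pool() option from [1] (the custom pool passed in is just
a placeholder for one that would hand out mmap-backed memory):

    #include <memory>
    #include "arrow/memory_pool.h"
    #include "parquet/properties.h"

    // Sketch: route the writer's internal buffer allocations through a
    // custom pool instead of arrow::default_memory_pool().
    std::shared_ptr<parquet::WriterProperties> MakeWriterProps(
        arrow::MemoryPool* custom_pool) {
      parquet::WriterProperties::Builder builder;
      builder.memory_pool(custom_pool);
      return builder.build();
    }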

Maybe I should just play a bit with the different memory pool options and
see how they behave. It makes more sense to discuss further ideas once I
have some performance numbers.
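Concretely, I'd run the same conversion against the stock pools and
compare the high-water marks, along these lines (jemalloc_memory_pool() is
only available when Arrow is built with jemalloc):

    #include <iostream>
    #include "arrow/memory_pool.h"

    // Sketch: after running the writer workload with the given pool (set
    // via the WriterProperties builder above), report its statistics.
    void ReportPool(arrow::MemoryPool* pool, const char* label) {
      std::cout << label << ": peak " << pool->max_memory() << " bytes, "
                << "still allocated " << pool->bytes_allocated()
                << " bytes\n";
    }

    int main() {
      ReportPool(arrow::system_memory_pool(), "system");
      arrow::MemoryPool* je = nullptr;
      if (arrow::jemalloc_memory_pool(&je).ok()) {
        ReportPool(je, "jemalloc");
      }
      return 0;
    }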

Thanks,
Roman


On Fri, Jul 10, 2020 at 6:47 AM Micah Kornfield <
emkornfi...@gmail.com> wrote:

> +parquet-dev, as this seems more concerned with the non-Arrow pieces of
> Parquet.
>
> Hi Roman,
> Answers inline.
>
> > One way to solve that problem would be to use memory-mapped files
> > instead of plain memory buffers. That way, the amount of required
> > memory can be limited to the number of columns times the OS page size,
> > which would be independent of the row group size. Consequently, large
> > row group sizes pose no problem with respect to RAM consumption.
>
> I was under the impression that modern allocators (e.g., jemalloc)
> already use mmap for large allocations.  How would this approach differ
> from the way allocators use it?  Have you prototyped this approach to see
> if it allows for better scalability?
>
>
> > After a quick look at how the buffers are managed inside Arrow
> > (allocated from a default memory pool), I have the impression that an
> > implementation of this idea could be a rather huge change. I still
> > wanted to know whether that is something you could see being integrated
> > or whether it is out of scope for Arrow.
>
>
> A huge change probably isn't a great idea unless we've validated the
> approach along with alternatives.  Is there currently code that doesn't
> make use of the MemoryPool [1] provided by WriterProperties? If so, we
> should probably fix it.  Otherwise, is there a reason you can't
> substitute a customized memory pool on WriterProperties?
>
> Thanks,
> Micah
>
> [1]
>
> https://github.com/apache/arrow/blob/5602c459eb8773b6be8059b1b118175e9f16b7a3/cpp/src/parquet/properties.h#L447
>
> On Thu, Jul 9, 2020 at 8:35 AM Roman Karlstetter <
> roman.karlstet...@gmail.com> wrote:
>
> > Hi everyone,
> >
> > For some time now, parquet::ParquetFileWriter has had the option to
> > create buffered row groups with AppendBufferedRowGroup(), which lets
> > you write to the columns in any order you like (in contrast to the
> > previous requirement of writing one column after the other). This is
> > cool since it spares the caller from having to create an in-memory
> > columnar representation of its data.
> >
> > However, when the data size is huge compared to the available system
> > memory (due to a wide schema or a large row group size), this is
> > problematic, as the internally allocated buffers can take up a large
> > portion of the RAM of the machine the conversion is running on.
> >
> > One way to solve that problem would be to use memory-mapped files
> > instead of plain memory buffers. That way, the amount of required
> > memory can be limited to the number of columns times the OS page size,
> > which would be independent of the row group size. Consequently, large
> > row group sizes pose no problem with respect to RAM consumption.
> >
> > I wonder what you generally think about the idea of integrating an
> > AppendFileBufferedRowGroup() (or similarly named) method that gives the
> > user the option of having the internal buffers backed by memory-mapped
> > files.
> >
> > After a quick look at how the buffers are managed inside Arrow
> > (allocated from a default memory pool), I have the impression that an
> > implementation of this idea could be a rather huge change. I still
> > wanted to know whether that is something you could see being integrated
> > or whether it is out of scope for Arrow.
> >
> > Thanks in advance and kind regards,
> > Roman
> >
>
