Question about merging groupby v2 spill files

Will Lauer Mon, 09 Aug 2021 14:30:42 -0700

I recently submitted an issue about "Too many open files" errors in GroupBy v2
(https://github.com/apache/druid/issues/11558) and have been investigating a
solution. The problem appears to happen because the code eagerly opens all
the spill files for reading. When there are a huge number of spill files (in
our case, a single query generates 110k of them), this causes the "too many
open files" error even when the open-files ulimit is set to an otherwise
reasonable number. We can work around this for now by setting "ulimit -n" to
a huge value (like 1 million), but I was hoping for a better solution.


In https://github.com/apache/druid/pull/11559, I attempted to fix this by
opening each file lazily, only when it is ready to be read, and closing it
as soon as it has been fully read. While this looks like it fixes the issue
in a few edge cases, it isn't a general solution. Many queries end up
calling CloseableIterators.mergeSorted() to merge all the spill files
together, and a sorted merge has to read from all the files at once, which
causes the "too many open files" error again. It looks like mergeSorted() is
called because the grouping code frequently assumes the results should be
sorted and calls ConcurrentGrouper.parallelSortAndGetGroupersIterator().
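
To illustrate why lazy opening alone does not help a sorted merge, here is a
minimal Java sketch. It is not Druid's code: SpillMergeSketch, LazySource,
concat() and kWayMerge() are hypothetical stand-ins, in-memory lists play the
role of spill files, and the merge is a generic priority-queue k-way merge
that I am assuming behaves like mergeSorted() in the respect that matters
here (it needs the head row of every input before it can emit anything).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

public class SpillMergeSketch {
    // Hypothetical stand-in for a spill file: "opens" on the first read and
    // "closes" as soon as its last row has been read.
    static final class LazySource implements Iterator<Integer> {
        static int open = 0;      // sources currently open
        static int peakOpen = 0;  // highest number open at the same time
        private final List<Integer> rows;
        private int pos = 0;
        private boolean opened = false;

        LazySource(List<Integer> rows) { this.rows = rows; }

        static void reset() { open = 0; peakOpen = 0; }

        @Override public boolean hasNext() { return pos < rows.size(); }

        @Override public Integer next() {
            if (!hasNext()) throw new NoSuchElementException();
            if (!opened) { opened = true; open++; peakOpen = Math.max(peakOpen, open); }
            Integer value = rows.get(pos++);
            if (pos == rows.size()) open--; // exhausted: close immediately
            return value;
        }
    }

    // Reads the sources one after another; no ordering across sources.
    static List<Integer> concat(List<LazySource> sources) {
        List<Integer> out = new ArrayList<>();
        for (LazySource s : sources) while (s.hasNext()) out.add(s.next());
        return out;
    }

    // Generic k-way merge of individually sorted sources.
    static List<Integer> kWayMerge(List<LazySource> sources) {
        // Heap entry: {head value, index of the source it came from}.
        PriorityQueue<int[]> heap =
            new PriorityQueue<>(Comparator.comparingInt((int[] e) -> e[0]));
        // Priming the heap reads one row from every source, so every
        // source is open before the first row can be emitted.
        for (int i = 0; i < sources.size(); i++) {
            if (sources.get(i).hasNext()) heap.add(new int[] {sources.get(i).next(), i});
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] e = heap.poll();
            out.add(e[0]);
            LazySource s = sources.get(e[1]);
            if (s.hasNext()) heap.add(new int[] {s.next(), e[1]});
        }
        return out;
    }

    // Three sorted "spill files" with three rows each (made-up data).
    static List<LazySource> sampleSources() {
        return List.of(
            new LazySource(List.of(1, 4, 7)),
            new LazySource(List.of(2, 5, 8)),
            new LazySource(List.of(3, 6, 9)));
    }

    static int peakOpenForConcat() {
        LazySource.reset();
        concat(sampleSources());
        return LazySource.peakOpen;
    }

    static int peakOpenForMerge() {
        LazySource.reset();
        kWayMerge(sampleSources());
        return LazySource.peakOpen;
    }

    public static void main(String[] args) {
        System.out.println("peak open, sequential read: " + peakOpenForConcat());
        System.out.println("peak open, sorted merge:    " + peakOpenForMerge());
    }
}
```

With the three sample sources, the sequential read never has more than one
source open, while the sorted merge has all three open at once. Scaled up to
110k spill files, that is the same pattern that hits the open-files limit.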

My question is: can anyone think of a way to avoid sorting at this level,
so that we don't need to open all the spill files at once? Given how
sketches work in Druid right now, I don't see an easy way to reduce the
number of spill files we are seeing, so I was hoping to address this on the
grouper side, but so far I can't see a solution there that makes this any
better. We aren't blocked, because we can raise the open-files limit to a
much larger number, but that is an unpalatable long-term solution.

Will



Will Lauer

Senior Principal Architect, Audience & Advertising Reporting
Data Platforms & Systems Engineering

M 508 561 6427
1908 S. First St
Champaign, IL 61822
