Thanks for the suggestions. I will look into reducing planner.width.max_per_node for now, and bump it back up once smaller parquet files get rolled out.
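For the archives, here is roughly how I plan to check and change it. This is just a sketch; the value 4 below is only an illustration, not a recommendation:

    -- Check the current value (sys.options lists system and session options).
    SELECT * FROM sys.options WHERE name = 'planner.width.max_per_node';

    -- Try it for a single session first.
    ALTER SESSION SET `planner.width.max_per_node` = 4;

    -- Apply it cluster-wide once we are happy with the response times.
    ALTER SYSTEM SET `planner.width.max_per_node` = 4;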
On Thu, Jun 29, 2017 at 11:21 AM, Andries Engelbrecht <[email protected]> wrote:

> With limited memory and what seems to be higher concurrency, you may want
> to reduce the minor fragments (threads) per query per node.
> See if you can reduce planner.width.max_per_node on the cluster without
> too much impact on response times.
>
> Slightly smaller (512 MB) parquet files may also help, but restructuring
> the data is usually harder than changing system settings.
>
> --Andries
>
>
>
> On 6/29/17, 7:39 AM, "François Méthot" <[email protected]> wrote:
>
>     Hi,
>
>     I am investigating an issue where we started getting Out of Heap space
>     errors when querying parquet files in Drill 1.10. It is currently set
>     to 8 GB heap and 20 GB off-heap. We can't spare more.
>
>     We usually query 0.7 to 1.2 GB parquet files; recently we have been
>     more on the 1.2 GB side, for the same number of files.
>
>     It now fails on a simple
>     select bunch of fields ... where ... needle-in-haystack type of params.
>
>     Drill is configured with the old reader:
>     store.parquet_use_reader=false
>     because of the bug DRILL-5435 (LIMIT causes a memory leak).
>
>     I have temporarily set the max number of large queries to 2 instead
>     of 10. It has helped so far.
>
>     My questions:
>     Could parquet file size be related to these new exceptions?
>     Would reducing the max file size help improve the robustness of
>     queries in Drill (at the expense of having more files to scan)?
>
>     Thanks,
>     Francois
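PS for the archives: the "max number of large queries" change I mentioned in my earlier message (quoted above) was done through the query queue options. I am assuming here that the relevant knobs are exec.queue.enable and exec.queue.large, so treat this as a sketch rather than a verified recipe:

    -- Make sure the query queue is enabled so the limit takes effect
    -- (assumption: it may be off by default).
    ALTER SYSTEM SET `exec.queue.enable` = true;

    -- Allow at most 2 concurrent large queries instead of the default 10.
    ALTER SYSTEM SET `exec.queue.large` = 2;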
