Hi Paul,

Thanks for the response. I will attach the OOM logs to the ticket once I get them from my ops team.
Thanks,
Nitin

On Wed, Jan 8, 2020 at 12:27 AM Paul Rogers <[email protected]> wrote:

> Hi Nitin,
>
> Thanks for letting us know about the OOM issues. These are serious and we
> should focus on finding the cause and fixing them. In general, it is the
> goal of the Drill project that Drill suffer no OOM errors on a cluster
> configured properly for your target workload.
>
> Thank you for filing a JIRA ticket. The stack trace in that ticket
> describes a connection shutdown. Your e-mail mentioned an OOM error. Can
> you attach a stack trace or log entry that led you to believe you were
> getting an OOM error? How many queries are running at the time of the error?
>
> As you know, Drill uses two kinds of memory: heap and off-heap (AKA
> "direct" or "unsafe"). Generally, you want much more off-heap than heap
> memory. But, until we know which kind is being exhausted, it is hard to say
> what to adjust.
>
> If a Drillbit fails, all queries anywhere on the cluster will fail. The
> reason is simple: all queries are distributed across all nodes. This is why
> we must find and fix the underlying OOM error.
>
> On a 64 GB machine, if you are running only Drill, you can give most of
> the memory to Drill itself. Determine how much your OS and other processes
> need. Then, split the rest between heap and off-heap. It is very likely you
> have already customized the Drill memory settings: it is the first thing
> everyone does when deploying. [1] Check your settings.
>
> Until we know if you are running out of heap vs. off-heap, it is hard to
> suggest which setting to adjust. If it is heap memory that is affected,
> then you can increase the heap memory setting to see what effect that has
> on Drillbit lifetime.
>
> Thanks,
> - Paul
>
> [1] http://drill.apache.org/docs/configuring-drill-memory/
>
> On Tuesday, January 7, 2020, 08:45:46 AM PST, Nitin Pawar <[email protected]> wrote:
>
> Hello Team,
> We recently upgraded from drill-1.13 to drill-1.16 and have started to
> notice lots of OOM issues. It is the same setup with changed binaries.
> Until we figure out what the issue is, we want to keep restarting
> Drillbits with cron jobs.
>
> My question is: *If a Drillbit is restarted, would the queries with this
> node as foreman be resubmitted automatically?*
>
> Also, we have 64 GB RAM machines. Can someone recommend memory settings for
> this environment?
>
> --
> Nitin Pawar

--
Nitin Pawar
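
For anyone following the thread, the sizing advice above (reserve memory for the OS and other processes, then split the rest between heap and off-heap, with much more off-heap than heap) can be sketched as simple arithmetic. This is only an illustration: the 8 GB OS reserve and the 25% heap share are assumptions chosen for the example, not values recommended anywhere in this thread, and the helper names are hypothetical. DRILL_HEAP and DRILL_MAX_DIRECT_MEMORY are the variables set in drill-env.sh, as described in the memory configuration page linked as [1].

```python
def split_drill_memory(total_gb, os_reserve_gb, heap_fraction=0.25):
    """Split machine memory into (heap_gb, direct_gb) for a Drillbit.

    heap_fraction is an assumed share of the remaining memory given to
    the JVM heap; the rest goes to off-heap (direct) memory.
    """
    available = total_gb - os_reserve_gb
    if available <= 0:
        raise ValueError("OS reserve leaves no memory for Drill")
    if not 0 < heap_fraction < 0.5:
        # Keep heap the smaller share: off-heap should be much larger.
        raise ValueError("heap_fraction should be between 0 and 0.5")
    heap_gb = max(1, round(available * heap_fraction))
    direct_gb = available - heap_gb
    return heap_gb, direct_gb


def drill_env_lines(heap_gb, direct_gb):
    """Format the two memory settings as lines for drill-env.sh."""
    return [
        f'export DRILL_HEAP="{heap_gb}G"',
        f'export DRILL_MAX_DIRECT_MEMORY="{direct_gb}G"',
    ]


if __name__ == "__main__":
    # Assumed example: 64 GB machine, 8 GB kept for the OS and other processes.
    heap, direct = split_drill_memory(total_gb=64, os_reserve_gb=8)
    for line in drill_env_lines(heap, direct):
        print(line)
```

The actual split should be adjusted once the logs show whether heap or off-heap memory is the one being exhausted.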
