I logged https://issues.apache.org/jira/browse/DRILL-4723
Specific logs can be requested. I may have to sanitize some of the logs, so if it's requested in the case itself, let me know, and I will sanitize and post. (MapR can get the full dump if they want to play through things... NDAs and such :) John On Tue, Jun 14, 2016 at 1:44 PM, Parth Chandra <pchan...@maprtech.com> wrote: > John, can you log a JIRA and attach all the logs you have to the JIRA? > > > > On Tue, Jun 14, 2016 at 11:43 AM, Parth Chandra <pchan...@maprtech.com> > wrote: > > > I can see how the GC errors will cause the world to stop spinning. The GC > > is itself not able to allocate memory which is not a great place to be > in. > > > > Sudheesh saw something similar in his branch. @Sudheesh is this possible > > we have a mem-leak in master? > > > > > > > > > > On Tue, Jun 14, 2016 at 11:37 AM, John Omernik <j...@omernik.com> wrote: > > > >> This is what I have thus far... I can provide more complete logs on a > one > >> on one basis. > >> > >> The cluster was completely mine, with fresh logs. I ran a CTAS query on > a > >> large table that over 100 fields. This query works well in other cases, > >> however I was working with the Block size, both in MapR FS and Drill > >> Parquet. I had successfully tested 512m on each, this case was > different. > >> Here are the facts in this setup: > >> > >> - No Compression in MapRFS - Using Standard Parquet Snappy Compression > >> - MapR Block Size 1024m > >> - Parquet Block size 1024m > >> - Query ends up disappearing in the profiles > >> > >> - The UI page listing bits only show 4 bits however 5 are running (node > 03 > >> process is running, but no longer in the bit) > >> > >> - Error (copied below) from rest API > >> > >> - No output in STD out or STD error on node3. Only two nodes actually > had > >> "Parquet Writing" logs. The other three on Stdout, did not have any > >> issues/errors/ > >> > >> - I have standard log files, gclogs, the profile.json (before it > >> disappeared), and the physical plan. Only some components that looked > >> possibly at issue included here > >> > >> - The Node 3 GC log shows a bunch of "Full GC Allocation Failures" that > >> take 4 seconds or more (When I've seen this in other cases, this time > can > >> balloon to 8 secs or more) > >> > >> - The node 3 output log show some issues with really long RPC issues. > >> Perhaps the GCs prevent RPC communication and create a snowball loop > >> effect? > >> > >> > >> Other logs if people are interested can be provided upon request. I just > >> didn't want to flood the whole list with all the logs. > >> > >> > >> Thanks! > >> > >> > >> John > >> > >> > >> > >> > >> > >> > >> Rest Error: > >> > >> ./load_day.py 2016-05-09 > >> > >> Drill Rest Endpoint: https://drillprod.marathonprod.zeta:20000 > >> <https://drillprod.marathonprod.zeta.ctu-bo.secureworks.net:20000/> > >> > >> Day: 2016-05-09 > >> > >> /usr/lib/python2.7/site-packages/urllib3/connectionpool.py:769: > >> InsecureRequestWarning: Unverified HTTPS request is being made. Adding > >> certificate verification is strongly advised. See: > >> https://urllib3.readthedocs.org/en/latest/security.html > >> > >> InsecureRequestWarning) > >> > >> Authentication successful > >> > >> Error encountered: 500 > >> > >> { > >> > >> "errorMessage" : "SYSTEM ERROR: ForemanException: One more more nodes > >> lost connectivity during query. Identified nodes were > >> [atl1ctuzeta03:20001].\n\n\n[Error Id: > >> d7dd0120-f7c0-44ef-ac54-29c746b26354 > >> on atl1ctuzeta01 <http://atl1ctuzeta01.ctu-bo.secureworks.net:20001/ > >> >:20001" > >> > >> } > >> > >> > >> Possible issue in Node3 Log: > >> > >> > >> 2016-06-14 17:25:27,860 [289fc208-7266-6a81-73a1-709efff6c412:frag:1:90] > >> INFO o.a.d.e.w.f.FragmentStatusReporter - > >> 289fc208-7266-6a81-73a1-709efff6c412:1:90: State to report: RUNNING > >> > >> 2016-06-14 17:25:27,871 [289fc208-7266-6a81-73a1-709efff6c412:frag:1:70] > >> INFO o.a.d.e.w.fragment.FragmentExecutor - > >> 289fc208-7266-6a81-73a1-709efff6c412:1:70: State change requested > >> AWAITING_ALLOCATION --> RUNNING > >> > >> 2016-06-14 17:25:27,871 [289fc208-7266-6a81-73a1-709efff6c412:frag:1:70] > >> INFO o.a.d.e.w.f.FragmentStatusReporter - > >> 289fc208-7266-6a81-73a1-709efff6c412:1:70: State to report: RUNNING > >> > >> 2016-06-14 17:43:41,869 [BitServer-4] WARN > >> o.a.d.exec.rpc.control.ControlClient - Message of mode RESPONSE of rpc > >> type > >> 1 took longer than 500ms. Actual duration was 4192ms. > >> > >> 2016-06-14 17:45:36,720 [CONTROL-rpc-event-queue] INFO > >> o.a.d.e.w.fragment.FragmentExecutor - > >> 289fc208-7266-6a81-73a1-709efff6c412:1:0: State change requested RUNNING > >> --> CANCELLATION_REQUESTED > >> > >> 2016-06-14 17:45:45,740 [CONTROL-rpc-event-queue] INFO > >> o.a.d.e.w.f.FragmentStatusReporter - > >> 289fc208-7266-6a81-73a1-709efff6c412:1:0: State to report: > >> CANCELLATION_REQUESTED > >> > >> 2016-06-14 17:46:15,318 [BitServer-3] WARN > >> o.a.d.exec.rpc.control.ControlServer - Message of mode REQUEST of rpc > type > >> 6 took longer than 500ms. Actual duration was 55328ms. > >> > >> 2016-06-14 17:46:36,057 [CONTROL-rpc-event-queue] INFO > >> o.a.d.e.w.fragment.FragmentExecutor - > >> 289fc208-7266-6a81-73a1-709efff6c412:1:5: State change requested RUNNING > >> --> CANCELLATION_REQUESTED > >> > >> 2016-06-14 17:46:44,620 [CONTROL-rpc-event-queue] INFO > >> o.a.d.e.w.f.FragmentStatusReporter - > >> 289fc208-7266-6a81-73a1-709efff6c412:1:5: State to report: > >> CANCELLATION_REQUESTED > >> > >> 2016-06-14 17:47:01,393 [BitServer-3] WARN > >> o.a.d.exec.rpc.control.ControlServer - Message of mode REQUEST of rpc > type > >> 6 took longer than 500ms. Actual duration was 29781ms. > >> > >> 2016-06-14 17:47:09,463 [CONTROL-rpc-event-queue] INFO > >> o.a.d.e.w.fragment.FragmentExecutor - > >> 289fc208-7266-6a81-73a1-709efff6c412:1:10: State change requested > RUNNING > >> --> CANCELLATION_REQUESTED > >> > >> 2016-06-14 17:47:26,967 [CONTROL-rpc-event-queue] INFO > >> o.a.d.e.w.f.FragmentStatusReporter - > >> 289fc208-7266-6a81-73a1-709efff6c412:1:10: State to report: > >> CANCELLATION_REQUESTED > >> > >> 2016-06-14 17:47:55,593 [BitServer-3] WARN > >> o.a.d.exec.rpc.control.ControlServer - Message of mode REQUEST of rpc > type > >> 6 took longer than 500ms. Actual duration was 46130ms. > >> > >> 2016-06-14 17:48:04,497 [CONTROL-rpc-event-queue] INFO > >> o.a.d.e.w.fragment.FragmentExecutor - > >> 289fc208-7266-6a81-73a1-709efff6c412:1:15: State change requested > RUNNING > >> --> CANCELLATION_REQUESTED > >> > >> 2016-06-14 17:48:12,742 [BitServer-4] WARN > >> o.a.d.exec.rpc.control.ControlClient - Message of mode RESPONSE of rpc > >> type > >> 1 took longer than 500ms. Actual duration was 4236ms. > >> > >> 2016-06-14 17:48:42,328 [CONTROL-rpc-event-queue] INFO > >> o.a.d.e.w.f.FragmentStatusReporter - > >> 289fc208-7266-6a81-73a1-709efff6c412:1:15: State to report: > >> CANCELLATION_REQUESTED > >> > >> 2016-06-14 17:49:36,351 [BitServer-4] WARN > >> o.a.d.exec.rpc.control.ControlClient - Message of mode RESPONSE of rpc > >> type > >> 1 took longer than 500ms. Actual duration was 4260ms. > >> > >> 2016-06-14 17:49:36,351 [BitServer-3] WARN > >> o.a.d.exec.rpc.control.ControlServer - Message of mode REQUEST of rpc > type > >> 6 took longer than 500ms. Actual duration was 91854ms. > >> > >> 2016-06-14 17:50:35,273 [CONTROL-rpc-event-queue] INFO > >> o.a.d.e.w.fragment.FragmentExecutor - > >> 289fc208-7266-6a81-73a1-709efff6c412:1:20: State change requested > RUNNING > >> --> CANCELLATION_REQUESTED > >> > >> 2016-06-14 17:50:39,322 [289fc208-7266-6a81-73a1-709efff6c412:frag:1:15] > >> INFO o.a.d.e.w.fragment.FragmentExecutor - > >> 289fc208-7266-6a81-73a1-709efff6c412:1:15: State change requested > >> CANCELLATION_REQUESTED --> FAILED > >> > >> 2016-06-14 17:50:51,546 [CONTROL-rpc-event-queue] INFO > >> o.a.d.e.w.f.FragmentStatusReporter - > >> 289fc208-7266-6a81-73a1-709efff6c412:1:20: State to report: > >> CANCELLATION_REQUESTED > >> > >> 2016-06-14 17:51:36,905 [BitServer-4] WARN > >> o.a.d.exec.rpc.control.ControlClient - Message of mode RESPONSE of rpc > >> type > >> 1 took longer than 500ms. Actual duration was 4426ms. > >> > >> 2016-06-14 17:52:30,805 [BitServer-3] WARN > >> o.a.d.exec.rpc.control.ControlServer - Message of mode REQUEST of rpc > type > >> 6 took longer than 500ms. Actual duration was 98767ms. > >> > >> 2016-06-14 17:52:47,042 [BitServer-4] WARN > >> o.a.d.exec.rpc.control.ControlClient - Message of mode RESPONSE of rpc > >> type > >> 1 took longer than 500ms. Actual duration was 12041ms. > >> > >> Possible issues in Node3 gclog: > >> > >> > >> 1819.137: [Full GC (Allocation Failure) 23G->23G(24G), 4.0657064 secs] > >> > >> 1823.205: [GC concurrent-mark-abort] > >> > >> 1823.221: [GC pause (G1 Evacuation Pause) (young)-- 23G->23G(24G), > >> 0.0382934 secs] > >> > >> 1823.271: [GC pause (G1 Evacuation Pause) (young) (initial-mark) > >> 23G->23G(24G), 0.0250215 secs] > >> > >> 1823.296: [GC concurrent-root-region-scan-start] > >> > >> 1823.296: [GC concurrent-root-region-scan-end, 0.0000105 secs] > >> > >> 1823.296: [GC concurrent-mark-start] > >> > >> 1823.308: [Full GC (Allocation Failure) 23G->23G(24G), 4.3719713 secs] > >> > >> 1827.694: [GC pause (G1 Evacuation Pause) (young)-- 23G->23G(24G), > >> 0.0312666 secs] > >> > >> 1827.727: [GC concurrent-mark-abort] > >> > >> 1827.735: [GC pause (G1 Evacuation Pause) (young) 23G->23G(24G), > 0.0212359 > >> secs] > >> > >> 1827.766: [Full GC (Allocation Failure) 23G->23G(24G), 3.9308980 secs] > >> > >> 1831.710: [GC pause (G1 Evacuation Pause) (young)-- 23G->23G(24G), > >> 0.0298229 secs] > >> > >> 1831.750: [GC pause (G1 Evacuation Pause) (young) (initial-mark) > >> 23G->23G(24G), 0.0268410 secs] > >> > >> 1831.777: [GC concurrent-root-region-scan-start] > >> > >> 1831.777: [GC concurrent-root-region-scan-end, 0.0000212 secs] > >> > >> 1831.777: [GC concurrent-mark-start] > >> > >> 1831.789: [Full GC (Allocation Failure) 23G->23G(24G), 3.9250410 secs] > >> > >> > >> > >> On Mon, Jun 13, 2016 at 6:54 PM, John Omernik <j...@omernik.com> wrote: > >> > >> > Yep, I will create all clean logs tomorrow run the query that caused > it. > >> > Thanks. John > >> > > >> > > >> > On Monday, June 13, 2016, Parth Chandra <pchan...@maprtech.com> > wrote: > >> > > >> >> Yes, we can discuss this on the hangout. > >> >> You're right, there are two issues - > >> >> Limiting memory usage to a maximum limit should be the goal of > every > >> >> component. We are not there yet with Drill though. > >> >> Getting an Out of Memory Error and having the Drillbit become > >> >> unresponsive is something we should rarely see as either the Drill > >> >> allocator or the JVM successfully catch the condition. Can you grep > >> your > >> >> logs so we can see if that indeed is what happened? > >> >> > >> >> > >> >> > >> >> On Mon, Jun 13, 2016 at 4:27 PM, John Omernik <j...@omernik.com> > >> wrote: > >> >> > >> >> > I'd like to talk about that on the hangout. Drill should do better > >> at > >> >> > failing with a clean oom error rather then having a bit go > >> unresponsive. > >> >> > Can just that bit be restarted to return to a copacetic state? As > an > >> >> admin, > >> >> > if this is the case, how do I find this bit? > >> >> > > >> >> > Other than adding RAM, are there any query tuning settings that > could > >> >> help > >> >> > prevent the unresponsive bit? ( I see this as two issues, the > memory > >> >> > settings for the 1024m block size CTAS and the how can we prevent a > >> bit > >> >> > from going unresponsive? ) > >> >> > On Jun 13, 2016 6:19 PM, "Parth Chandra" <pchan...@maprtech.com> > >> wrote: > >> >> > > >> >> > The only time I've seen a drillbit get unresponsive is when you run > >> out > >> >> of > >> >> > Direct memory. Did you see any 'Out of Memory Error' in your logs? > If > >> >> you > >> >> > see those then you need to increase the Direct memory setting for > the > >> >> JVM. > >> >> > ( DRILL_MAX_DIRECT_MEMORY in drill-env.sh) > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > On Mon, Jun 13, 2016 at 4:10 PM, John Omernik <j...@omernik.com> > >> wrote: > >> >> > > >> >> > > The 512m block size worked. My issue with the 1024m block size > >> was on > >> >> > the > >> >> > > writing using a CTAS.... that's where my nodes got into a bad > >> >> > state....thus > >> >> > > I am wondering what setting on drill would be the right setting > to > >> >> help > >> >> > > node memory pressures on a CTAs using 1024m block size > >> >> > > On Jun 13, 2016 6:06 PM, "Parth Chandra" <pchan...@maprtech.com> > >> >> wrote: > >> >> > > > >> >> > > In general, you want to make the Parquet block size and the HDFS > >> block > >> >> > size > >> >> > > the same. A Parquet block size that is larger than the HDFS block > >> size > >> >> > can > >> >> > > split a Parquet block ( i.e. row_group ) across nodes and that > will > >> >> > > severely affect performance as data reads will no longer be > local. > >> >> 512 MB > >> >> > > is a pretty good setting. > >> >> > > > >> >> > > Note that you need to ensure the Parquet block size in the source > >> file > >> >> > > which (maybe) was produced outside of Drill. So you will need to > >> make > >> >> the > >> >> > > change in the application used to write the Parquet file. > >> >> > > > >> >> > > If you're using Drill to write the source file as well then, of > >> >> course, > >> >> > the > >> >> > > block size setting will be used by the writer. > >> >> > > > >> >> > > If you're using the new reader, then there is really no knob you > >> have > >> >> to > >> >> > > tweak. Is parquet-tools able to read the file(s)? > >> >> > > > >> >> > > > >> >> > > > >> >> > > On Mon, Jun 13, 2016 at 1:59 PM, John Omernik <j...@omernik.com> > >> >> wrote: > >> >> > > > >> >> > > > I am doing some performance testing, and per the Impala > >> >> documentation, > >> >> > I > >> >> > > am > >> >> > > > trying to use a block size of 1024m in both Drill and MapR FS. > >> >> When I > >> >> > > set > >> >> > > > the MFS block size to 512 and the Drill (default) block size I > >> saw > >> >> some > >> >> > > > performance improvements, and wanted to try the 1024 to see how > >> it > >> >> > > worked, > >> >> > > > however, my query hung and I got into that "bad state" where > the > >> >> nodes > >> >> > > are > >> >> > > > not responding right and I have to restart my whole cluster > (This > >> >> > really > >> >> > > > bothers me that a query can make the cluster be unresponsive) > >> >> > > > > >> >> > > > That said, what memory settings can I tweak to help the query > >> work. > >> >> > This > >> >> > > is > >> >> > > > quite a bit of data, a CTAS from Parquet to Parquet, 100-130G > of > >> >> data > >> >> > per > >> >> > > > data (I am doing a day at a time), 103 columns. I have to use > >> the > >> >> > > > "use_new_reader" option due to my other issues, but other than > >> that > >> >> I > >> >> > am > >> >> > > > just setting the block size on MFS and then updating the block > >> size > >> >> in > >> >> > > > Drill, and it's dying. Since this is a simple CTAS (no sort) > >> which > >> >> > > settings > >> >> > > > can be beneficial for what is happening here? > >> >> > > > > >> >> > > > Thanks > >> >> > > > > >> >> > > > John > >> >> > > > > >> >> > > > >> >> > > >> >> > >> > > >> > > >> > -- > >> > Sent from my iThing > >> > > >> > > > > >