Good digs, Dylan. I don't think it's too rare to matter. I notice often during MR jobs, and there's usually a point where I give up and just start writing RFiles.
It could possibly be related to what I saw back in the dayoday with: https://mail-archives.apache.org/mod_mbox/accumulo-user/201406.mbox/%3ccamz+duvmmhegon9ejehr9h_rrpp50l2qz53bbdruvo0pira...@mail.gmail.com%3E On Tue, Apr 19, 2016 at 6:26 PM, Josh Elser <josh.el...@gmail.com> wrote: > Nice findings. Sorry I haven't had any cycles to dig into this myself. > > I look forward to hearing what you find :) > > > Dylan Hutchison wrote: > >> I investigated a bit more and I am pretty sure the problem is that the >> BatchWriter is not recognizing that the tablet vb<< split into vb;2436< >> and >> vb<;2436. It keeps trying to update the closed tablet vb<<. Each update >> writes 0 mutations and records a failure at the tablet server >> UpdateSession >> because vb<< is closed. >> >> I'm not sure why this is happening because the BatchWriter should have >> invalidated its tablet locator cache upon recognizing a failure. Then it >> would recognize that the entries it wants to write fall into the new >> tablets vb;2436< and vb<;2436. I think there is a timing bug for this >> edge >> case, when a table split occurs during heavy writes. >> >> I will write this up if I can reproduce it. Maybe it is too rare to >> matter. >> >> Cheers, Dylan >> >> On Mon, Apr 18, 2016 at 2:38 PM, Dylan Hutchison< >> dhutc...@cs.washington.edu >> >>> wrote: >>> >> >> Hi devs, >>> >>> I'd like to ask your help in figuring out what is happening to a >>> BatchWriter. The following gives my reasoning so far. >>> >>> In Accumulo 1.7.1, I have a BatchWriter that is stuck in WAITING status >>> in >>> its addMutation method. I saw that it is stuck by jstack'ing the >>> Accumulo >>> client. It's been stuck like this for 16 hours. >>> >>> The BatchWriter is supposed to wait when a mutation is added if no >>> failures have recorded and either (a) the total memory used exceeds the >>> maximum allowed for the BatchWriter, or (b) the batchwriter is currently >>> flushed. So we conclude that one of (a) or (b) have occurred and no >>> failures were recorded, at the time when addMutation was called. I think >>> (a) is likely. >>> >>> The BatchWriter is supposed to notify itself when either (1) a flush >>> finishes, (2) a constraint violation or authorization failure or server >>> error or unknown error occurs, (3) memory usage decreases, which happens >>> when entries successfully send to the tablet server. Since the >>> BatchWriter >>> is stuck on WAITING, none of these conditions are occurring. >>> >>> The BatchWriter has 3 write threads (the default number). All three have >>> status TIMED_WAITING (parked) in jstack. Their stack traces don't give >>> useful information. >>> >>> Here's what I can tell from the tserver logs. A new table (and tablet) >>> was created successfully. The BatchWriter started writing to this tablet >>> steadily. The logs show that the tablet (vb<<) flushed every 5 seconds >>> or >>> so and major compacted at a steady periodic rate. >>> >>> Everything looks good, until vb<< grew large enough that it needed >>> splitting. This occurred about 42 minutes after the BatchWriter started >>> writing entries. The logs show a failure in an UpdateSession that popped >>> up in the middle of the split operation. This failure continues to show >>> for the next 15 hours. >>> >>> I copied the portion of the tserver logs that look relevant to the split >>> below. I highlighted the line reporting the first failure. It occurs in >>> between when the split starts and when it finishes. >>> >>> Any idea what could have caused this? I don't know if the failure is >>> related to the BatchWriter being stuck in WAITING. It seems likely. I >>> think it is weird that the 3 write threads are all idle; at least one of >>> them should be doing something if the thread calling addMutation() is >>> waiting. >>> >>> Here is a pastebin of the jstack<http://pastebin.com/TLfdrDwX>, though I >>> think I wrote the useful parts from it. >>> >>> 2016-04-17 22:38:06,436 [tablet.Tablet] TABLET_HIST: vb<< closed >>> 2016-04-17 22:38:06,439 [tablet.Tablet] DEBUG: Files for low split >>> vb;2436< >>> [hdfs://localhost:9000/accumulo/tables/vb/default_tablet/C00008lh.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/C00009iz.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/C0000a08.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/C0000a4t.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/C0000a7m.rf, >>> hdfs: >>> //localhost:9000/accumulo/tables/vb/default_tablet/C0000a8f.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/C0000a8n.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/F0000a8p.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/F0000a8q.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/M0000a93.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/M0000a9b.rf, >>> hdfs://localhost:90 >>> 00/accumulo/tables/vb/default_tablet/M0000a9g.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/M0000aa7.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/M0000aap.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/M0000abe.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/M0000abn.rf] >>> 2016-04-17 22:38:06,439 [tablet.Tablet] DEBUG: Files for high split >>> vb<;2436 >>> [hdfs://localhost:9000/accumulo/tables/vb/default_tablet/C00008lh.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/C00009iz.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/C0000a08.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/C0000a4t.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/C0000a7m.rf, hdfs >>> ://localhost:9000/accumulo/tables/vb/default_tablet/C0000a8f.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/C0000a8n.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/F0000a8p.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/F0000a8q.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/M0000a93.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/M0000a9b.rf, >>> hdfs://localhost:9 >>> 000/accumulo/tables/vb/default_tablet/M0000a9g.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/M0000aa7.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/M0000aap.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/M0000abe.rf, >>> hdfs://localhost:9000/accumulo/tables/vb/default_tablet/M0000abn.rf] >>> 2016-04-17 22:38:06,440 [tserver.TabletServer] DEBUG: ScanSess tid >>> 127.0.0.1:42302 !0 0 entries in 0.00 secs, nbTimes = [0 0 0.00 1] >>> 2016-04-17 22:38:06,472 [tablet.Tablet] TABLET_HIST: vb<< split vb;2436< >>> vb<;2436 >>> 2016-04-17 22:38:06,472 [tablet.Tablet] DEBUG: offline split time : >>> 3.22 >>> secs >>> *2016-04-17 22:38:06,472 [tserver.TabletServer] INFO : Starting split: >>> vb<<* >>> 2016-04-17 22:38:06,472 [tserver.TabletServer] DEBUG: UpSess >>> 127.0.0.1:41090 0 in 3.059s, at=[0 0 0.00 1] ft=3.010s(pt=3.010s >>> lt=0.000s ct=0.000s) >>> 2016-04-17 22:38:06,472 [tserver.NativeMap] DEBUG: Allocated native map >>> 0x0000000009a20e60 >>> >>> *2016-04-17 22:38:06,473 [tserver.TabletServer] DEBUG: Failures: 1, first >>> extent vb<< successful commits: 0* >>> >>> 2016-04-17 22:38:06,473 [tablet.Tablet] DEBUG: Reloading constraints for >>> extent: vb;2436< >>> 2016-04-17 22:38:06,473 [constraints.ConstraintChecker] DEBUG: Loaded >>> constraint org.apache.accumulo.core.constraints.DefaultKeySizeConstraint >>> for vb >>> 2016-04-17 22:38:06,474 [tablet.Tablet] TABLET_HIST: vb;2436< opened >>> 2016-04-17 22:38:06,474 [tserver.NativeMap] DEBUG: Allocated native map >>> 0x0000000003edf090 >>> 2016-04-17 22:38:06,474 [tablet.Tablet] DEBUG: Reloading constraints for >>> extent: vb<;2436 >>> 2016-04-17 22:38:06,474 [constraints.ConstraintChecker] DEBUG: Loaded >>> constraint org.apache.accumulo.core.constraints.DefaultKeySizeConstraint >>> for vb >>> 2016-04-17 22:38:06,475 [tablet.Tablet] TABLET_HIST: vb<;2436 opened >>> *2016-04-17 22:38:06,475 [tserver.TabletServer] INFO : Tablet split: vb<< >>> size0 539900144 size1 539900148 time 8268ms* >>> 2016-04-17 22:38:06,594 [tserver.TabletServer] DEBUG: MultiScanSess >>> 127.0.0.1:51092 0 entries in 0.00 secs (lookup_time:0.00 secs tablets:1 >>> ranges:1) >>> 2016-04-17 22:38:06,597 [tserver.TabletServer] DEBUG: MultiScanSess >>> 127.0.0.1:51092 0 entries in 0.00 secs (lookup_time:0.00 secs tablets:1 >>> ranges:1) >>> 2016-04-17 22:38:07,568 [tserver.TabletServer] DEBUG: ScanSess tid >>> 127.0.0.1:42307 !0 0 entries in 0.00 secs, nbTimes = [1 1 1.00 1] >>> 2016-04-17 22:38:07,571 [zookeeper.DistributedWorkQueue] DEBUG: Looking >>> for work in >>> /accumulo/c7998e9d-129d-4c61-9732-45bb6bcab3f0/bulk_failed_copyq >>> 2016-04-17 22:38:07,940 [tserver.TabletServer] DEBUG: UpSess >>> 127.0.0.1:41090 0 in 0.067s, at=[1 1 1.00 1] ft=0.000s(pt=0.000s >>> lt=0.000s ct=0.000s) >>> 2016-04-17 22:38:07,941 [tserver.TabletServer] DEBUG: Failures: 1, first >>> extent vb<< successful commits: 0 >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>