After the update, the Master web interface shows that every region server is now on 1.4.7 and there are no RITs.
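For reference, one way to confirm the secondary indexes actually came back after such a recovery is to check their state in SYSTEM.CATALOG and, if needed, kick off a manual rebuild. This is only a sketch: the KM/KM_IDX1 names are the ones from the traces in this thread, and 'a' = ACTIVE is my understanding of the serialized PIndexState code in Phoenix 4.x; verify against your release.

```sql
-- Sketch: list index states after recovery ('a' should mean ACTIVE in Phoenix 4.x).
SELECT TABLE_NAME, DATA_TABLE_NAME, INDEX_STATE
FROM SYSTEM.CATALOG
WHERE INDEX_STATE IS NOT NULL;

-- If an index is still disabled, a rebuild can be triggered manually:
ALTER INDEX IF EXISTS KM_IDX1 ON KM REBUILD;
```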
The cluster recovered only after we restarted all region servers 4 times...

> On 11 Sep 2018, at 04:08, Josh Elser <els...@apache.org> wrote:
>
> Did you update the HBase jars on all RegionServers?
>
> Make sure that you have all of the Regions assigned (no RITs). There could be
> a pretty simple explanation as to why the index can't be written to.
>
> On 9/9/18 3:46 PM, Batyrshin Alexander wrote:
>> Correct me if I'm wrong, but it looks like the following situation is possible
>> when region servers A and B both host an index and its primary table:
>> A and B are both taking writes on a table with indexes.
>> A crashes.
>> B fails an index update because A is not operating, so B starts aborting.
>> A, after restart, tries to rebuild the index from the WAL, but B is aborting at that moment,
>> so A starts aborting too.
>> From this moment on nothing happens (0 requests to the region servers), and A and B
>> are unresponsive in the Master-status web interface.
>>> On 9 Sep 2018, at 04:38, Batyrshin Alexander <0x62...@gmail.com
>>> <mailto:0x62...@gmail.com>> wrote:
>>>
>>> After the update we still can't recover the HBase cluster.
>>> Our region servers keep ABORTING over and over:
>>>
>>> prod003:
>>> Sep 09 02:51:27 prod003 hbase[1440]: 2018-09-09 02:51:27,395 FATAL [RpcServer.default.FPBQ.Fifo.handler=92,queue=2,port=60020] regionserver.HRegionServer: ABORTING region server prod003,60020,1536446665703: Could not update the index table, killing server region because couldn't write to an index table
>>> Sep 09 02:51:27 prod003 hbase[1440]: 2018-09-09 02:51:27,395 FATAL [RpcServer.default.FPBQ.Fifo.handler=77,queue=7,port=60020] regionserver.HRegionServer: ABORTING region server prod003,60020,1536446665703: Could not update the index table, killing server region because couldn't write to an index table
>>> Sep 09 02:52:19 prod003 hbase[1440]: 2018-09-09 02:52:19,224 FATAL [RpcServer.default.FPBQ.Fifo.handler=82,queue=2,port=60020] regionserver.HRegionServer: ABORTING region server prod003,60020,1536446665703: Could not update the index table, killing server region because couldn't write to an index table
>>> Sep 09 02:52:28 prod003 hbase[1440]: 2018-09-09 02:52:28,922 FATAL [RpcServer.default.FPBQ.Fifo.handler=94,queue=4,port=60020] regionserver.HRegionServer: ABORTING region server prod003,60020,1536446665703: Could not update the index table, killing server region because couldn't write to an index table
>>> Sep 09 02:55:02 prod003 hbase[957]: 2018-09-09 02:55:02,096 FATAL [RpcServer.default.FPBQ.Fifo.handler=95,queue=5,port=60020] regionserver.HRegionServer: ABORTING region server prod003,60020,1536450772841: Could not update the index table, killing server region because couldn't write to an index table
>>> Sep 09 02:55:18 prod003 hbase[957]: 2018-09-09 02:55:18,793 FATAL [RpcServer.default.FPBQ.Fifo.handler=97,queue=7,port=60020] regionserver.HRegionServer: ABORTING region server prod003,60020,1536450772841: Could not update the index table, killing server region because couldn't write to an index table
>>>
>>> prod004:
>>> Sep 09 02:52:13 prod004 hbase[4890]: 2018-09-09 02:52:13,541 FATAL [RpcServer.default.FPBQ.Fifo.handler=83,queue=3,port=60020] regionserver.HRegionServer: ABORTING region server prod004,60020,1536446387325: Could not update the index table, killing server region because couldn't write to an index table
>>> Sep 09 02:52:50 prod004 hbase[4890]: 2018-09-09 02:52:50,264 FATAL [RpcServer.default.FPBQ.Fifo.handler=75,queue=5,port=60020] regionserver.HRegionServer: ABORTING region server prod004,60020,1536446387325: Could not update the index table, killing server region because couldn't write to an index table
>>> Sep 09 02:53:40 prod004 hbase[4890]: 2018-09-09 02:53:40,709 FATAL [RpcServer.default.FPBQ.Fifo.handler=66,queue=6,port=60020] regionserver.HRegionServer: ABORTING region server prod004,60020,1536446387325: Could not update the index table, killing server region because couldn't write to an index table
>>> Sep 09 02:54:00 prod004 hbase[4890]: 2018-09-09 02:54:00,060 FATAL [RpcServer.default.FPBQ.Fifo.handler=89,queue=9,port=60020] regionserver.HRegionServer: ABORTING region server prod004,60020,1536446387325: Could not update the index table, killing server region because couldn't write to an index table
>>>
>>> prod005:
>>> Sep 09 02:52:50 prod005 hbase[3772]: 2018-09-09 02:52:50,661 FATAL [RpcServer.default.FPBQ.Fifo.handler=65,queue=5,port=60020] regionserver.HRegionServer: ABORTING region server prod005,60020,1536446400009: Could not update the index table, killing server region because couldn't write to an index table
>>> Sep 09 02:53:27 prod005 hbase[3772]: 2018-09-09 02:53:27,542 FATAL [RpcServer.default.FPBQ.Fifo.handler=90,queue=0,port=60020] regionserver.HRegionServer: ABORTING region server prod005,60020,1536446400009: Could not update the index table, killing server region because couldn't write to an index table
>>> Sep 09 02:54:00 prod005 hbase[3772]: 2018-09-09 02:53:59,915 FATAL [RpcServer.default.FPBQ.Fifo.handler=7,queue=7,port=60020] regionserver.HRegionServer: ABORTING region server prod005,60020,1536446400009: Could not update the index table, killing server region because couldn't write to an index table
>>> Sep 09 02:54:30 prod005 hbase[3772]: 2018-09-09 02:54:30,058 FATAL [RpcServer.default.FPBQ.Fifo.handler=16,queue=6,port=60020] regionserver.HRegionServer: ABORTING region server prod005,60020,1536446400009: Could not update the index table, killing server region because couldn't write to an index table
>>>
>>> And so on...
>>>
>>> The trace is the same everywhere:
>>>
>>> Sep 09 02:54:30 prod005 hbase[3772]: org.apache.phoenix.hbase.index.exception.MultiIndexWriteFailureException: disableIndexOnFailure=true, Failed to write to multiple index tables: [KM_IDX1, KM_IDX2, KM_HISTORY_IDX1, KM_HISTORY_IDX2, KM_HISTORY_IDX3]
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.phoenix.hbase.index.write.TrackingParallelWriterIndexCommitter.write(TrackingParallelWriterIndexCommitter.java:235)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.phoenix.hbase.index.write.IndexWriter.write(IndexWriter.java:195)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.phoenix.hbase.index.write.IndexWriter.writeAndKillYourselfOnFailure(IndexWriter.java:156)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.phoenix.hbase.index.write.IndexWriter.writeAndKillYourselfOnFailure(IndexWriter.java:145)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.phoenix.hbase.index.Indexer.doPostWithExceptions(Indexer.java:620)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.phoenix.hbase.index.Indexer.doPost(Indexer.java:595)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.phoenix.hbase.index.Indexer.postBatchMutateIndispensably(Indexer.java:578)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$37.call(RegionCoprocessorHost.java:1048)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionOperation.call(RegionCoprocessorHost.java:1711)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1789)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1745)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postBatchMutateIndispensably(RegionCoprocessorHost.java:1044)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3646)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3108)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3050)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver.commitBatch(UngroupedAggregateRegionObserver.java:271)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver.commitBatchWithRetries(UngroupedAggregateRegionObserver.java:241)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver.rebuildIndices(UngroupedAggregateRegionObserver.java:1068)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver.doPostScannerOpen(UngroupedAggregateRegionObserver.java:386)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.phoenix.coprocessor.BaseScannerRegionObserver$RegionScannerHolder.overrideDelegate(BaseScannerRegionObserver.java:239)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.phoenix.coprocessor.BaseScannerRegionObserver$RegionScannerHolder.nextRaw(BaseScannerRegionObserver.java:287)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2843)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3080)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36613)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2354)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:297)
>>> Sep 09 02:54:30 prod005 hbase[3772]:         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:277)
>>>
>>>> On 9 Sep 2018, at 01:44, Batyrshin Alexander <0x62...@gmail.com <mailto:0x62...@gmail.com>> wrote:
>>>>
>>>> Thank you.
>>>> We're updating our cluster right now...
>>>>
>>>>> On 9 Sep 2018, at 01:39, Ted Yu <yuzhih...@gmail.com <mailto:yuzhih...@gmail.com>> wrote:
>>>>>
>>>>> It seems you should deploy hbase with the following fix:
>>>>>
>>>>> HBASE-21069 NPE in StoreScanner.updateReaders causes RS to crash
>>>>>
>>>>> 1.4.7 was recently released.
>>>>>
>>>>> FYI
>>>>>
>>>>> On Sat, Sep 8, 2018 at 3:32 PM Batyrshin Alexander <0x62...@gmail.com <mailto:0x62...@gmail.com>> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> We got this exception from the *prod006* server:
>>>>>
>>>>> Sep 09 00:38:02 prod006 hbase[18907]: 2018-09-09 00:38:02,532 FATAL [MemStoreFlusher.1] regionserver.HRegionServer: ABORTING region server prod006,60020,1536235102833: Replay of WAL required. Forcing server shutdown
>>>>> Sep 09 00:38:02 prod006 hbase[18907]: org.apache.hadoop.hbase.DroppedSnapshotException: region: KM,c\xEF\xBF\xBD\x16I7\xEF\xBF\xBD\x0A"A\xEF\xBF\xBDd\xEF\xBF\xBD\xEF\xBF\xBD\x19\x07t,1536178245576.60c121ba50e67f2429b9ca2ba2a11bad.
>>>>> Sep 09 00:38:02 prod006 hbase[18907]:         at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2645)
>>>>> Sep 09 00:38:02 prod006 hbase[18907]:         at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2322)
>>>>> Sep 09 00:38:02 prod006 hbase[18907]:         at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2284)
>>>>> Sep 09 00:38:02 prod006 hbase[18907]:         at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2170)
>>>>> Sep 09 00:38:02 prod006 hbase[18907]:         at org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:2095)
>>>>> Sep 09 00:38:02 prod006 hbase[18907]:         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:508)
>>>>> Sep 09 00:38:02 prod006 hbase[18907]:         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:478)
>>>>> Sep 09 00:38:02 prod006 hbase[18907]:         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$900(MemStoreFlusher.java:76)
>>>>> Sep 09 00:38:02 prod006 hbase[18907]:         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:264)
>>>>> Sep 09 00:38:02 prod006 hbase[18907]:         at java.lang.Thread.run(Thread.java:748)
>>>>> Sep 09 00:38:02 prod006 hbase[18907]: Caused by: java.lang.NullPointerException
>>>>> Sep 09 00:38:02 prod006 hbase[18907]:         at java.util.ArrayList.<init>(ArrayList.java:178)
>>>>> Sep 09 00:38:02 prod006 hbase[18907]:         at org.apache.hadoop.hbase.regionserver.StoreScanner.updateReaders(StoreScanner.java:863)
>>>>> Sep 09 00:38:02 prod006 hbase[18907]:         at org.apache.hadoop.hbase.regionserver.HStore.notifyChangedReadersObservers(HStore.java:1172)
>>>>> Sep 09 00:38:02 prod006 hbase[18907]:         at org.apache.hadoop.hbase.regionserver.HStore.updateStorefiles(HStore.java:1145)
>>>>> Sep 09 00:38:02 prod006 hbase[18907]:         at org.apache.hadoop.hbase.regionserver.HStore.access$900(HStore.java:122)
>>>>> Sep 09 00:38:02 prod006 hbase[18907]:         at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.commit(HStore.java:2505)
>>>>> Sep 09 00:38:02 prod006 hbase[18907]:         at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2600)
>>>>> Sep 09 00:38:02 prod006 hbase[18907]:         ... 9 more
>>>>> Sep 09 00:38:02 prod006 hbase[18907]: 2018-09-09 00:38:02,532 FATAL [MemStoreFlusher.1] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache.hadoop.hbase.regionserver.IndexHalfStoreFileReaderGenerator, org.apache.phoenix.coprocessor.SequenceRegionObserver, org.apache.phoenix.c
>>>>>
>>>>> After that we got ABORTING on almost every region server in the cluster, with different reasons:
>>>>>
>>>>> *prod003*
>>>>> Sep 09 01:12:11 prod003 hbase[11552]: 2018-09-09 01:12:11,799 FATAL [PostOpenDeployTasks:88bfac1dfd807c4cd1e9c1f31b4f053f] regionserver.HRegionServer: ABORTING region server prod003,60020,1536444066291: Exception running postOpenDeployTasks; region=88bfac1dfd807c4cd1e9c1f31b4f053f
>>>>> Sep 09 01:12:11 prod003 hbase[11552]: java.io.InterruptedIOException: #139, interrupted. currentNumberOfTask=8
>>>>> Sep 09 01:12:11 prod003 hbase[11552]:         at org.apache.hadoop.hbase.client.AsyncProcess.waitForMaximumCurrentTasks(AsyncProcess.java:1853)
>>>>> Sep 09 01:12:11 prod003 hbase[11552]:         at org.apache.hadoop.hbase.client.AsyncProcess.waitForMaximumCurrentTasks(AsyncProcess.java:1823)
>>>>> Sep 09 01:12:11 prod003 hbase[11552]:         at org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1899)
>>>>> Sep 09 01:12:11 prod003 hbase[11552]:         at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:250)
>>>>> Sep 09 01:12:11 prod003 hbase[11552]:         at org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:213)
>>>>> Sep 09 01:12:11 prod003 hbase[11552]:         at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1484)
>>>>> Sep 09 01:12:11 prod003 hbase[11552]:         at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1031)
>>>>> Sep 09 01:12:11 prod003 hbase[11552]:         at org.apache.hadoop.hbase.MetaTableAccessor.put(MetaTableAccessor.java:1033)
>>>>> Sep 09 01:12:11 prod003 hbase[11552]:         at org.apache.hadoop.hbase.MetaTableAccessor.putToMetaTable(MetaTableAccessor.java:1023)
>>>>> Sep 09 01:12:11 prod003 hbase[11552]:         at org.apache.hadoop.hbase.MetaTableAccessor.updateLocation(MetaTableAccessor.java:1433)
>>>>> Sep 09 01:12:11 prod003 hbase[11552]:         at org.apache.hadoop.hbase.MetaTableAccessor.updateRegionLocation(MetaTableAccessor.java:1400)
>>>>> Sep 09 01:12:11 prod003 hbase[11552]:         at org.apache.hadoop.hbase.regionserver.HRegionServer.postOpenDeployTasks(HRegionServer.java:2041)
>>>>> Sep 09 01:12:11 prod003 hbase[11552]:         at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler$PostOpenDeployTasksThread.run(OpenRegionHandler.java:329)
>>>>>
>>>>> *prod002*
>>>>> Sep 09 01:12:30 prod002 hbase[29056]: 2018-09-09 01:12:30,144 FATAL [RpcServer.default.FPBQ.Fifo.handler=36,queue=6,port=60020] regionserver.HRegionServer: ABORTING region server prod002,60020,1536235138673: Could not update the index table, killing server region because couldn't write to an index table
>>>>> Sep 09 01:12:30 prod002 hbase[29056]: org.apache.phoenix.hbase.index.exception.MultiIndexWriteFailureException: disableIndexOnFailure=true, Failed to write to multiple index tables: [KM_IDX1, KM_IDX2, KM_HISTORY1, KM_HISTORY2,
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.phoenix.hbase.index.write.TrackingParallelWriterIndexCommitter.write(TrackingParallelWriterIndexCommitter.java:235)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.phoenix.hbase.index.write.IndexWriter.write(IndexWriter.java:195)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.phoenix.hbase.index.write.IndexWriter.writeAndKillYourselfOnFailure(IndexWriter.java:156)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.phoenix.hbase.index.write.IndexWriter.writeAndKillYourselfOnFailure(IndexWriter.java:145)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.phoenix.hbase.index.Indexer.doPostWithExceptions(Indexer.java:620)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.phoenix.hbase.index.Indexer.doPost(Indexer.java:595)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.phoenix.hbase.index.Indexer.postBatchMutateIndispensably(Indexer.java:578)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$37.call(RegionCoprocessorHost.java:1048)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionOperation.call(RegionCoprocessorHost.java:1711)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1789)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1745)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postBatchMutateIndispensably(RegionCoprocessorHost.java:1044)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3646)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3108)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3050)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver.commitBatch(UngroupedAggregateRegionObserver.java:271)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver.access$000(UngroupedAggregateRegionObserver.java:164)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver$1.doMutation(UngroupedAggregateRegionObserver.java:246)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.phoenix.index.PhoenixIndexFailurePolicy.doBatchWithRetries(PhoenixIndexFailurePolicy.java:455)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver.handleIndexWriteException(UngroupedAggregateRegionObserver.java:929)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver.commitBatchWithRetries(UngroupedAggregateRegionObserver.java:243)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver.rebuildIndices(UngroupedAggregateRegionObserver.java:1077)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver.doPostScannerOpen(UngroupedAggregateRegionObserver.java:386)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.phoenix.coprocessor.BaseScannerRegionObserver$RegionScannerHolder.overrideDelegate(BaseScannerRegionObserver.java:239)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.phoenix.coprocessor.BaseScannerRegionObserver$RegionScannerHolder.nextRaw(BaseScannerRegionObserver.java:287)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2843)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3080)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36613)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2354)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:297)
>>>>> Sep 09 01:12:30 prod002 hbase[29056]:         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:277)
>>>>>
>>>>> And so on...
>>>>>
>>>>> The Master-status web interface shows that contact was lost with these aborted servers.
>>>>
>>>