[ https://issues.apache.org/jira/browse/ACCUMULO-331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217628#comment-13217628 ]
Keith Turner commented on ACCUMULO-331: --------------------------------------- Saw this again {noformat} 24 22:05:22,648 [tabletserver.Tablet] TABLET_HIST: n4;a200;a100 closed 24 22:05:23,672 [tabletserver.Tablet] TABLET_HIST: n4;a200;a100 opened 24 22:05:23,800 [tabletserver.TabletServer] WARN : exception while scanning tablet n4;a200;a100 java.io.IOException: Stream closed at java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:145) at java.io.BufferedInputStream.fill(BufferedInputStream.java:189) at java.io.BufferedInputStream.read(BufferedInputStream.java:237) at java.io.FilterInputStream.read(FilterInputStream.java:66) at java.io.DataInputStream.readByte(DataInputStream.java:248) at org.apache.accumulo.core.file.rfile.RelativeKey.readFields(RelativeKey.java:116) at org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader._next(RFile.java:584) at org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader.next(RFile.java:556) at org.apache.accumulo.core.iterators.system.ColumnFamilySkippingIterator.consume(ColumnFamilySkippingIterator.java:64) at org.apache.accumulo.core.iterators.SkippingIterator.seek(SkippingIterator.java:38) at org.apache.accumulo.core.iterators.system.ColumnFamilySkippingIterator.seek(ColumnFamilySkippingIterator.java:125) at org.apache.accumulo.core.file.rfile.RFile$Reader.seek(RFile.java:1020) at org.apache.accumulo.server.problems.ProblemReportingIterator.seek(ProblemReportingIterator.java:94) at org.apache.accumulo.core.iterators.system.MultiIterator.seek(MultiIterator.java:105) at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92) at org.apache.accumulo.core.iterators.system.DeletingIterator.seek(DeletingIterator.java:67) at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92) at org.apache.accumulo.core.iterators.Filter.seek(Filter.java:64) at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92) at org.apache.accumulo.core.iterators.Filter.seek(Filter.java:64) at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92) at org.apache.accumulo.core.iterators.user.VersioningIterator.seek(VersioningIterator.java:78) at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.readNext(SourceSwitchingIterator.java:116) at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.seek(SourceSwitchingIterator.java:168) at org.apache.accumulo.server.tabletserver.Tablet.nextBatch(Tablet.java:1743) at org.apache.accumulo.server.tabletserver.Tablet.access$3200(Tablet.java:143) at org.apache.accumulo.server.tabletserver.Tablet$Scanner.read(Tablet.java:1883) at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler$NextBatchTask.run(TabletServer.java:905) at cloudtrace.instrument.TraceRunnable.run(TraceRunnable.java:47) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) 24 22:05:23,948 [tabletserver.Tablet] TABLET_HIST: n4;a200;a100 closed {noformat} > Scan of metadata table failed because tablet closed > --------------------------------------------------- > > Key: ACCUMULO-331 > URL: https://issues.apache.org/jira/browse/ACCUMULO-331 > Project: Accumulo > Issue Type: Bug > Components: tserver > Environment: Running random walk test on 10 node cluster > Reporter: Keith Turner > Assignee: Keith Turner > Labels: 14_qa_bug > Fix For: 1.4.0 > > > The security random walk test failed to drop a table. > {noformat} > 19 05:56:04,122 [randomwalk.Framework] ERROR: Error during random walk > java.lang.Exception: Error running node Security.xml > at > org.apache.accumulo.server.test.randomwalk.Module.visit(Module.java:253) > at > org.apache.accumulo.server.test.randomwalk.Framework.run(Framework.java:61) > at > org.apache.accumulo.server.test.randomwalk.Framework.main(Framework.java:114) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.accumulo.start.Main$1.run(Main.java:89) > at java.lang.Thread.run(Thread.java:662) > Caused by: java.lang.Exception: Error running node security.TableOp > at > org.apache.accumulo.server.test.randomwalk.Module.visit(Module.java:253) > at > org.apache.accumulo.server.test.randomwalk.Module.visit(Module.java:249) > ... 8 more > Caused by: org.apache.accumulo.core.client.AccumuloException: Internal error > processing waitForTableOperation > at > org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:293) > at > org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:261) > at > org.apache.accumulo.core.client.admin.TableOperationsImpl.delete(TableOperationsImpl.java:508) > at > org.apache.accumulo.server.test.randomwalk.security.DropTable.dropTable(DropTable.java:66) > at > org.apache.accumulo.server.test.randomwalk.security.TableOp.visit(TableOp.java:204) > at > org.apache.accumulo.server.test.randomwalk.Module.visit(Module.java:249) > ... 9 more > Caused by: org.apache.thrift.TApplicationException: Internal error processing > waitForTableOperation > at > org.apache.thrift.TApplicationException.read(TApplicationException.java:108) > at > org.apache.accumulo.core.master.thrift.MasterClientService$Client.recv_waitForTableOperation(MasterClientService.java:684) > at > org.apache.accumulo.core.master.thrift.MasterClientService$Client.waitForTableOperation(MasterClientService.java:665) > at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at cloudtrace.instrument.thrift.TraceWrap$2.invoke(TraceWrap.java:83) > at $Proxy1.waitForTableOperation(Unknown Source) > at > org.apache.accumulo.core.client.admin.TableOperationsImpl.waitForTableOperation(TableOperationsImpl.java:233) > at > org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:275) > ... 14 more > {noformat} > Looking in the master logs see the following indicating a scan failed while > trying to delete the table. > {noformat} > 19 05:56:04,098 [thrift.MasterClientService$Processor] ERROR: Internal error > processing waitForTableOperation > java.lang.RuntimeException: > org.apache.accumulo.core.client.impl.AccumuloServerException: Error on server > xxx.xxx.xxx.xxx:9997 > at > org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.hasNext(TabletServerBatchReaderIterator.java:188) > at > org.apache.accumulo.server.master.state.MetaDataTableScanner.hasNext(MetaDataTableScanner.java:97) > at > org.apache.accumulo.server.master.tableOps.CleanUp.isReady(DeleteTable.java:94) > at > org.apache.accumulo.server.master.tableOps.CleanUp.isReady(DeleteTable.java:55) > at > org.apache.accumulo.server.master.tableOps.TraceRepo.isReady(TraceRepo.java:50) > at > org.apache.accumulo.server.fate.Fate$TransactionRunner.run(Fate.java:62) > at > org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34) > at java.lang.Thread.run(Thread.java:662) > Caused by: org.apache.accumulo.core.client.impl.AccumuloServerException: > Error on server xxx.xxx.xxx.xxx:9997 > at > org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:579) > at > org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:335) > at cloudtrace.instrument.TraceRunnable.run(TraceRunnable.java:47) > at > java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) > ... 1 more > Caused by: org.apache.thrift.TApplicationException: Internal error processing > startMultiScan > at > org.apache.thrift.TApplicationException.read(TApplicationException.java:108) > at > org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startMultiScan(TabletClientService.java:298) > at > org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startMultiScan(TabletClientService.java:274) > at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at cloudtrace.instrument.thrift.TraceWrap$2.invoke(TraceWrap.java:83) > at $Proxy4.startMultiScan(Unknown Source) > at > org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:539) > ... 5 more > {noformat} > Looking on the tablet server where the scan failed, see the following > {noformat} > 19 05:56:04,062 [tabletserver.Tablet] DEBUG: initiateClose(saveState=true > queueMinC=false disableWrites=false) !0;~;!0< > 19 05:56:04,062 [tabletserver.Tablet] DEBUG: completeClose(saveState=true > completeClose=true) !0;~;!0< > 19 05:56:04,063 [problems.ProblemReports] DEBUG: Filing problem report !0 > FILE_READ /accumulo/tables/!0/table_info/F0002xng.rf > 19 05:56:04,072 [tabletserver.Tablet] TABLET_HIST: !0;~;!0< closed > 19 05:56:04,072 [tabletserver.TabletServer] DEBUG: Unassigning > !0;~;!0<@(null,192.168.117.13:9997[134d7425fc5940d],null) > 19 05:56:04,073 [tabletserver.TabletServer] WARN : lookup failed for tablet > !0;~;!0< > java.io.EOFException > at java.io.DataInputStream.readByte(DataInputStream.java:250) > at > org.apache.accumulo.core.file.rfile.RelativeKey.fastSkip(RelativeKey.java:249) > at > org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader._seek(RFile.java:677) > at > org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader.seek(RFile.java:614) > at > org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92) > at > org.apache.accumulo.core.iterators.SkippingIterator.seek(SkippingIterator.java:37) > at > org.apache.accumulo.core.iterators.system.ColumnFamilySkippingIterator.seek(ColumnFamilySkippingIterator.java:125) > at > org.apache.accumulo.core.file.rfile.RFile$Reader.seek(RFile.java:1020) > at > org.apache.accumulo.server.problems.ProblemReportingIterator.seek(ProblemReportingIterator.java:94) > at > org.apache.accumulo.core.iterators.system.MultiIterator.seek(MultiIterator.java:105) > at > org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92) > at > org.apache.accumulo.core.iterators.system.DeletingIterator.seek(DeletingIterator.java:67) > at > org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92) > at org.apache.accumulo.core.iterators.Filter.seek(Filter.java:64) > at > org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92) > at org.apache.accumulo.core.iterators.Filter.seek(Filter.java:64) > at > org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92) > at > org.apache.accumulo.core.iterators.user.VersioningIterator.seek(VersioningIterator.java:89) > at > org.apache.accumulo.core.iterators.user.WholeRowIterator.seek(WholeRowIterator.java:240) > at > org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92) > at > org.apache.accumulo.core.iterators.SkippingIterator.seek(SkippingIterator.java:37) > at > org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.readNext(SourceSwitchingIterator.java:116) > at > org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.seek(SourceSwitchingIterator.java:168) > at > org.apache.accumulo.server.tabletserver.Tablet.lookup(Tablet.java:1607) > at > org.apache.accumulo.server.tabletserver.Tablet.lookup(Tablet.java:1710) > at > org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler$LookupTask.run(TabletServer.java:988) > at cloudtrace.instrument.TraceRunnable.run(TraceRunnable.java:47) > at > java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) > at java.lang.Thread.run(Thread.java:662) > 19 05:56:04,076 [tabletserver.TabletServer] INFO : unloaded !0;~;!0< > 19 05:56:04,076 [tabletserver.TabletServer] INFO : told to unload tablet that > was not being served !0;~;!0< > 19 05:56:04,084 [thrift.TabletClientService$Processor] ERROR: Internal error > processing startMultiScan > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: java.io.EOFException > at > org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler.continueMultiScan(TabletServer.java:1321) > at > org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler.startMultiScan(TabletServer.java:1272) > at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at cloudtrace.instrument.thrift.TraceWrap$1.invoke(TraceWrap.java:58) > at $Proxy2.startMultiScan(Unknown Source) > at > org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$startMultiScan.process(TabletClientService.java:2164) > at > org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor.process(TabletClientService.java:2037) > at > org.apache.accumulo.server.util.TServerUtils$TimedProcessor.process(TServerUtils.java:151) > at > org.apache.thrift.server.TNonblockingServer$FrameBuffer.invoke(TNonblockingServer.java:631) > at > org.apache.accumulo.server.util.TServerUtils$THsHaServer$Invocation.run(TServerUtils.java:199) > at > java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) > at > org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34) > at java.lang.Thread.run(Thread.java:662) > Caused by: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: java.io.EOFException > at > org.apache.accumulo.server.tabletserver.TabletServer$ScanTask.get(TabletServer.java:661) > at > org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler.continueMultiScan(TabletServer.java:1308) > ... 15 more > Caused by: java.lang.RuntimeException: java.io.EOFException > at > org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler$LookupTask.run(TabletServer.java:999) > at cloudtrace.instrument.TraceRunnable.run(TraceRunnable.java:47) > at > java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) > ... 1 more > Caused by: java.io.EOFException > at java.io.DataInputStream.readByte(DataInputStream.java:250) > at > org.apache.accumulo.core.file.rfile.RelativeKey.fastSkip(RelativeKey.java:249) > at > org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader._seek(RFile.java:677) > at > org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader.seek(RFile.java:614) > at > org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92) > at > org.apache.accumulo.core.iterators.SkippingIterator.seek(SkippingIterator.java:37) > at > org.apache.accumulo.core.iterators.system.ColumnFamilySkippingIterator.seek(ColumnFamilySkippingIterator.java:125) > at > org.apache.accumulo.core.file.rfile.RFile$Reader.seek(RFile.java:1020) > at > org.apache.accumulo.server.problems.ProblemReportingIterator.seek(ProblemReportingIterator.java:94) > at > org.apache.accumulo.core.iterators.system.MultiIterator.seek(MultiIterator.java:105) > at > org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92) > at > org.apache.accumulo.core.iterators.system.DeletingIterator.seek(DeletingIterator.java:67) > at > org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92) > at org.apache.accumulo.core.iterators.Filter.seek(Filter.java:64) > at > org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92) > at org.apache.accumulo.core.iterators.Filter.seek(Filter.java:64) > at > org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92) > at > org.apache.accumulo.core.iterators.user.VersioningIterator.seek(VersioningIterator.java:89) > at > org.apache.accumulo.core.iterators.user.WholeRowIterator.seek(WholeRowIterator.java:240) > at > org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:92) > at > org.apache.accumulo.core.iterators.SkippingIterator.seek(SkippingIterator.java:37) > at > org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.readNext(SourceSwitchingIterator.java:116) > at > org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.seek(SourceSwitchingIterator.java:168) > at > org.apache.accumulo.server.tabletserver.Tablet.lookup(Tablet.java:1607) > at > org.apache.accumulo.server.tabletserver.Tablet.lookup(Tablet.java:1710) > at > org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler$LookupTask.run(TabletServer.java:988) > ... 4 more > 19 05:56:04,120 [tabletserver.TabletServer] INFO : Loading tablet !0;~;!0< > 19 05:56:04,128 [tabletserver.Tablet] TABLET_HIST: !0;~;!0< opened > {noformat} > The metadata tablet were being frequently loaded and unloaded as descried in > ACCUMULO-329 when this happened. > The file in question was major compacted away on another tablet server. > Shortly after this. > {noformat} > 19 05:57:54,187 [tabletserver.Tablet] DEBUG: Starting MajC !0;~;!0< > [/table_info/A0002xm4.rf, /table_info/C0002xz3.rf, /table_info/C0002y0d.rf, > /table_info/F0002xng.rf, /table_info/M0002xzv.rf, /table_info/M0002y6z.rf] > --> /table_info/A0002y0e.rf_tmp > 19 05:57:54,309 [tabletserver.Tablet] TABLET_HIST: !0;~;!0< MajC > [/table_info/A0002xm4.rf, /table_info/C0002xz3.rf, /table_info/C0002y0d.rf, > /table_info/F0002xng.rf, /table_info/M0002xzv.rf, /table_info/M0002y6z.rf] > --> /table_info/A0002y0e.rf > {noformat} > {noformat} > 19 06:10:29,546 [gc.SimpleGarbageCollector] DEBUG: Deleting > /accumulo/tables/!0/table_info/F0002xng.rf > {noformat} > The file was garbage collected later. The GC and Majc show that the file > existed while the EOF happened. So this is probably either a bug in RFile > fast skip or a bug related to the tablet closing. Probably related to the > close w/ the close temporal relationship. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira