Shard random walk test failed w/ TableDeleted exception
-------------------------------------------------------
Key: ACCUMULO-383
URL: https://issues.apache.org/jira/browse/ACCUMULO-383
Project: Accumulo
Issue Type: Bug
Reporter: Keith Turner
While running the random walk test on a 10-node cluster, the shard test failed
with the following errors.
{noformat}
07 23:43:38,483 [shard.CloneIndex] DEBUG: Cloned ST_index_26192_1328649015908_tmp from ST_index_26192_1328649015908 flush: 5825ms clone: 96ms
07 23:43:38,503 [impl.ThriftScanner] DEBUG: Failed to locate tablet for table : rr row :
07 23:49:52,795 [impl.ThriftScanner] DEBUG: Scan failed, not serving tablet (ln<<,xxx.xxx.xxx.10:9997)
07 23:51:11,539 [shard.VerifyIndex] DEBUG: Verified 65438656 index entries
07 23:51:12,052 [shard.Search] DEBUG: Looking up terms [67x, 1o3g, dl4, e4m] expect to find 2682000000000000
07 23:51:12,110 [impl.TabletServerBatchReaderIterator] DEBUG: Server : xxx.xxx.xxx.8:9997 msg : null
ThriftSecurityException(user:root, code:PERMISSION_DENIED)
at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$startMultiScan_result.read(TabletClientService.java:7586)
at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startMultiScan(TabletClientService.java:306)
at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startMultiScan(TabletClientService.java:274)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at cloudtrace.instrument.thrift.TraceWrap$2.invoke(TraceWrap.java:83)
at $Proxy2.startMultiScan(Unknown Source)
at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:539)
at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:335)
at cloudtrace.instrument.TraceRunnable.run(TraceRunnable.java:47)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
07 23:51:12,113 [impl.TabletServerBatchReaderIterator] DEBUG: Error PERMISSION_DENIED - User does not have permission to perform this action
org.apache.accumulo.core.client.AccumuloSecurityException: Error PERMISSION_DENIED - User does not have permission to perform this action
at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:576)
at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:335)
at cloudtrace.instrument.TraceRunnable.run(TraceRunnable.java:47)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: ThriftSecurityException(user:root, code:PERMISSION_DENIED)
at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$startMultiScan_result.read(TabletClientService.java:7586)
at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startMultiScan(TabletClientService.java:306)
at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startMultiScan(TabletClientService.java:274)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at cloudtrace.instrument.thrift.TraceWrap$2.invoke(TraceWrap.java:83)
at $Proxy2.startMultiScan(Unknown Source)
at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:539)
... 5 more
07 23:51:12,114 [randomwalk.Module] DEBUG: Properties for node: shard.Search
07 23:51:12,114 [randomwalk.Module] DEBUG: teardown:
07 23:51:12,114 [randomwalk.Module] DEBUG: maxSec:
07 23:51:12,114 [randomwalk.Module] DEBUG: maxHops:
07 23:51:12,114 [randomwalk.Module] DEBUG: Properties for node: Shard.xml
07 23:51:12,114 [randomwalk.Module] DEBUG: teardown:
07 23:51:12,114 [randomwalk.Module] DEBUG: maxSec:
07 23:51:12,114 [randomwalk.Module] DEBUG: maxHops:
07 23:51:12,114 [randomwalk.Framework] ERROR: Error during random walk
java.lang.Exception: Error running node Shard.xml
at org.apache.accumulo.server.test.randomwalk.Module.visit(Module.java:259)
at org.apache.accumulo.server.test.randomwalk.Framework.run(Framework.java:61)
at org.apache.accumulo.server.test.randomwalk.Framework.main(Framework.java:114)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.accumulo.start.Main$1.run(Main.java:89)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.Exception: Error running node shard.Search
at org.apache.accumulo.server.test.randomwalk.Module.visit(Module.java:259)
at org.apache.accumulo.server.test.randomwalk.Module.visit(Module.java:251)
... 8 more
Caused by: org.apache.accumulo.core.client.TableDeletedException
at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:357)
at cloudtrace.instrument.TraceRunnable.run(TraceRunnable.java:47)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
... 1 more
{noformat}
After seeing this, I checked and found that the table still existed. So for
some reason a tablet server and a client process did not see the metadata
related to the table. This happened after a delete and rename. One possibility
is that the client was using the old table id from before the delete and
rename, although I am not sure how that could have happened. If the log
messages printed the table id, this would be easy to verify. Another
possibility is that it is some sort of ZooKeeper hiccup.
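
To make the stale table id theory concrete, here is a minimal sketch of how it
could play out. This is a toy model only, not Accumulo code: Catalog,
CachingClient, and every method on them are hypothetical names made up for
illustration, and the real client may resolve and cache ids quite differently.
It also shows why printing the table id in the message would settle the
question quickly.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the hypothesis; none of these classes exist in Accumulo.
class StaleTableIdDemo {

    /** Hypothetical server-side view: current table name -> table id. */
    static class Catalog {
        private final Map<String, String> nameToId = new HashMap<>();
        private int nextId = 1;

        String create(String name) {
            String id = Integer.toString(nextId++);
            nameToId.put(name, id);
            return id;
        }

        void delete(String name) {
            nameToId.remove(name);
        }

        void rename(String from, String to) {
            // The id travels with the table; only the name changes.
            nameToId.put(to, nameToId.remove(from));
        }

        String lookup(String name) {
            return nameToId.get(name);
        }

        boolean idExists(String id) {
            return nameToId.containsValue(id);
        }
    }

    /** Hypothetical client that caches the first id it resolves for a name. */
    static class CachingClient {
        private final Catalog catalog;
        private final Map<String, String> cache = new HashMap<>();

        CachingClient(Catalog catalog) {
            this.catalog = catalog;
        }

        String scan(String name) {
            // Assumption: the id is resolved once and then reused.
            String id = cache.computeIfAbsent(name, catalog::lookup);
            // Including the id in the message is what makes a stale id visible.
            if (!catalog.idExists(id)) {
                return "TableDeleted name=" + name + " id=" + id;
            }
            return "ok name=" + name + " id=" + id;
        }

        void clearCache() {
            cache.clear();
        }
    }

    /** Replays create, clone, delete, rename against a caching client. */
    static String[] demo() {
        Catalog catalog = new Catalog();
        CachingClient client = new CachingClient(catalog);

        catalog.create("index");                 // gets id 1
        String before = client.scan("index");    // client caches id 1

        catalog.create("index_tmp");             // stands in for the clone, id 2
        catalog.delete("index");                 // id 1 is gone
        catalog.rename("index_tmp", "index");    // the name now maps to id 2

        String stale = client.scan("index");     // still uses cached id 1
        client.clearCache();
        String fresh = client.scan("index");     // re-resolves to id 2

        return new String[] {before, stale, fresh};
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

In this sketch the table exists by name the whole time, yet the second scan
reports it as deleted because the cached id no longer exists. That matches the
symptom above, but it does not show that this is what actually happened.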