Hole in metadata table occurred during random walk test
-------------------------------------------------------

                 Key: ACCUMULO-315
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-315
             Project: Accumulo
          Issue Type: Bug
          Components: master, tserver
         Environment: Running 1.4.0 SNAPSHOT on 10 node cluster.
            Reporter: Keith Turner
            Assignee: Keith Turner
            Priority: Critical
             Fix For: 1.4.0


While running the random walk test, a hole occurred in the metadata table.  A 
client tried to delete the table with the hole and the FATE op got stuck.  The 
master logs continually showed the following.

{noformat}
14 00:02:11,273 [tableOps.CleanUp] DEBUG: Still waiting for table to be deleted: 4ct locationState: 4ct;4d2d3be2823b0bf4;27b693c626c2d4ef@(null,xxx.xxx.xxx.xxx:9997[134d7425fc503e1],null)
{noformat}

The metadata table contained the following.  Tablet 4ct;4d2d3be2823b0bf4 had a 
location.

{noformat}
4ct;262249211a62cd6f ~tab:~pr []    \x011819e56edae21302
4ct;27b693c626c2d4ef ~tab:~pr []    \x01262249211a62cd6f
4ct;43422047c78fa52b ~tab:~pr []    \x0141ea825af0f262d9
4ct;4d2d3be2823b0bf4 ~tab:~pr []    \x0127b693c626c2d4ef
4ct;4f89df61392bb311 ~tab:~pr []    \x014d2d3be2823b0bf4
{noformat}
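
The break in the chain can be located mechanically.  The following is a minimal 
Python sketch, not Accumulo code: it assumes each ~tab:~pr value is a \x01 
marker (taken here to mean "previous row present") followed by the previous 
end row, that the tablet end row is the part of the row after the table id, 
and that plain string ordering matches the table's sort order for these 
fixed-length hex rows.  The names ROWS, prev_row and find_breaks are 
hypothetical.

```python
# Metadata rows copied from the listing above: (tablet row, ~tab:~pr value).
ROWS = [
    ("4ct;262249211a62cd6f", r"\x011819e56edae21302"),
    ("4ct;27b693c626c2d4ef", r"\x01262249211a62cd6f"),
    ("4ct;43422047c78fa52b", r"\x0141ea825af0f262d9"),
    ("4ct;4d2d3be2823b0bf4", r"\x0127b693c626c2d4ef"),
    ("4ct;4f89df61392bb311", r"\x014d2d3be2823b0bf4"),
]

MARKER = r"\x01"  # assumption: literal marker meaning "previous row present"


def prev_row(value):
    """Return the previous end row, or None if the marker is absent."""
    return value[len(MARKER):] if value.startswith(MARKER) else None


def find_breaks(rows):
    """Compare each tablet's previous row with the end row of the tablet before it.

    Returns (kind, end_row_before, tablet_row, tablet_prev_row) tuples.
    The first row is not checked because the listing may start mid-table.
    """
    breaks = []
    ordered = sorted(rows)
    for (before, _), (row, value) in zip(ordered, ordered[1:]):
        before_end = before.split(";", 1)[1]
        prev = prev_row(value)
        if prev == before_end:
            continue
        # A previous row beyond the preceding end row leaves a gap (hole);
        # one before it (or missing) means the two tablets overlap.
        kind = "hole" if prev is not None and prev > before_end else "overlap"
        breaks.append((kind, before_end, row, prev))
    return breaks


for entry in find_breaks(ROWS):
    print(entry)
```

On the rows above this reports a hole between 27b693c626c2d4ef and 
41ea825af0f262d9 (no tablet ends at 41ea825af0f262d9), and an overlap at 
4ct;4d2d3be2823b0bf4, whose previous row is still 27b693c626c2d4ef.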

Found the following events on a tablet server.

{noformat}
21:36:04,369 [tabletserver.Tablet] TABLET_HIST: 4ct;4d2d3be2823b0bf4;27b693c626c2d4ef split 4ct;41ea825af0f262d9;27b693c626c2d4ef 4ct;4d2d3be2823b0bf4;41ea825af0f262d9

21:36:06,351 [tabletserver.Tablet] TABLET_HIST: 4ct;4d2d3be2823b0bf4;41ea825af0f262d9 split 4ct;43422047c78fa52b;41ea825af0f262d9 4ct;4d2d3be2823b0bf4;43422047c78fa52b
{noformat}
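
Replaying the two split events shows what the metadata table should have held 
for this range.  This is a rough Python sketch under assumptions inferred from 
the log lines: an extent is written as table;endRow;prevEndRow, and a 
TABLET_HIST split line lists the parent followed by its two children.  The 
names SPLITS, OBSERVED, parse_extent and replay are hypothetical.

```python
def parse_extent(text):
    """Split 'table;endRow;prevEndRow' into a tuple (assumed log format)."""
    table, end_row, prev_end_row = text.split(";")
    return (table, end_row, prev_end_row)


# (parent, low child, high child), copied from the TABLET_HIST lines.
SPLITS = [
    ("4ct;4d2d3be2823b0bf4;27b693c626c2d4ef",
     "4ct;41ea825af0f262d9;27b693c626c2d4ef",
     "4ct;4d2d3be2823b0bf4;41ea825af0f262d9"),
    ("4ct;4d2d3be2823b0bf4;41ea825af0f262d9",
     "4ct;43422047c78fa52b;41ea825af0f262d9",
     "4ct;4d2d3be2823b0bf4;43422047c78fa52b"),
]

# Extents actually present in the metadata listing for the same key range.
OBSERVED = {
    ("4ct", "43422047c78fa52b", "41ea825af0f262d9"),
    ("4ct", "4d2d3be2823b0bf4", "27b693c626c2d4ef"),
}


def replay(splits):
    """Apply each split in order and return the resulting extents, sorted."""
    tablets = {parse_extent(splits[0][0])}
    for parent, low, high in splits:
        tablets.remove(parse_extent(parent))  # the parent is replaced ...
        tablets.update({parse_extent(low), parse_extent(high)})  # ... by its children
    return sorted(tablets)


expected = replay(SPLITS)
missing = sorted(set(expected) - OBSERVED)     # expected but not in metadata
unexpected = sorted(OBSERVED - set(expected))  # in metadata but not expected

print("expected:  ", expected)
print("missing:   ", missing)
print("unexpected:", unexpected)
```

Under these assumptions the replay expects three tablets ending at 
41ea825af0f262d9, 43422047c78fa52b and 4d2d3be2823b0bf4.  The metadata listing 
lacks the first of these and holds 4ct;4d2d3be2823b0bf4 with its pre-split 
previous row 27b693c626c2d4ef, which is the same extent the master reports in 
the CleanUp message.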

The tablet server serving the metadata tablet logged the following around the 
time of the splits.  Not sure if this is related.

{noformat}
13 21:36:10,956 [server.TNonblockingServer] WARN : Got an IOException in internalRead!
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcher.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:198)
        at sun.nio.ch.IOUtil.read(IOUtil.java:171)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:243)
        at org.apache.thrift.transport.TNonblockingSocket.read(TNonblockingSocket.java:141)
        at org.apache.thrift.server.TNonblockingServer$FrameBuffer.internalRead(TNonblockingServer.java:668)
        at org.apache.thrift.server.TNonblockingServer$FrameBuffer.read(TNonblockingServer.java:457)
        at org.apache.thrift.server.TNonblockingServer$SelectThread.handleRead(TNonblockingServer.java:358)
        at org.apache.thrift.server.TNonblockingServer$SelectThread.select(TNonblockingServer.java:303)
        at org.apache.thrift.server.TNonblockingServer$SelectThread.run(TNonblockingServer.java:242)
{noformat}

Not sure what caused the metadata problem.  Further investigation is needed.  
Also, while I was debugging, the master started assigning and unassigning 
metadata tablets rapidly.  I did not get a chance to investigate this; it 
stopped when I stopped the random walk test.


        
