[ 
https://issues.apache.org/jira/browse/ACCUMULO-3774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated ACCUMULO-3774:
---------------------------------
    Fix Version/s:     (was: 1.7.0)
                   1.8.0

> Deadlock after recovering root tablet
> -------------------------------------
>
>                 Key: ACCUMULO-3774
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3774
>             Project: Accumulo
>          Issue Type: Bug
>         Environment: Hadoop 2.7.0, ZK 3.4.6, Accumulo 
> 83d1b8388ad807d678c9a3a922e5025faa9a5933, 20 node m3.large EC2 cluster
>            Reporter: Keith Turner
>            Assignee: Eric Newton
>            Priority: Blocker
>              Labels: 1.7.0_QA
>             Fix For: 1.8.0
>
>         Attachments: ACCUMULO-3774-01.patch
>
>
> I started CI running against 1.7.0-SNAP.  After CI had run for a while, I 
> started agitation.  Then everything froze up.  The root tablet node was 
> killed, the root tablet had a lot of walogs (will open a separate issue for 
> this), and the root tablet was reloaded on another machine.  However, it hung 
> while loading with the following issue: the minor compaction after recovery 
> was trying to write to the root tablet, and this happened before the root 
> tablet location was set.
> {noformat}
> "Minor compacting +r<<" daemon prio=10 tid=0x00000000046cd800 nid=0x3508 in Object.wait() [0x00007fb0ac3b1000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at java.lang.Object.wait(Object.java:503)
>         at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.waitRTE(TabletServerBatchWriter.java:459)
>         at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.close(TabletServerBatchWriter.java:352)
>         - locked <0x000000078d154840> (a org.apache.accumulo.core.client.impl.TabletServerBatchWriter)
>         at org.apache.accumulo.core.client.impl.BatchWriterImpl.close(BatchWriterImpl.java:54)
>         at org.apache.accumulo.server.util.MetadataTableUtil.markLogUnused(MetadataTableUtil.java:1131)
>         at org.apache.accumulo.tserver.TabletServer.markUnusedWALs(TabletServer.java:3032)
>         at org.apache.accumulo.tserver.TabletServer.minorCompactionFinished(TabletServer.java:2917)
>         at org.apache.accumulo.tserver.tablet.DatafileManager.bringMinorCompactionOnline(DatafileManager.java:440)
>         at org.apache.accumulo.tserver.tablet.Tablet.minorCompact(Tablet.java:956)
>         at org.apache.accumulo.tserver.tablet.MinorCompactionTask.run(MinorCompactionTask.java:84)
>         at org.apache.accumulo.tserver.tablet.Tablet.minorCompactNow(Tablet.java:1080)
>         at org.apache.accumulo.tserver.TabletServer$AssignmentHandler.run(TabletServer.java:2124)
>         at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler$3.run(TabletServer.java:1510)
> {noformat}
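
The ordering problem described above can be illustrated with a small standalone sketch. It is not Accumulo code: the class RootTabletLoadSketch, its methods, and the latch standing in for "root tablet location is set" are all hypothetical names chosen for illustration. The real writer in the trace waits without a deadline; a timeout is assumed here only so the example terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical model of the load sequence in the report; not Accumulo code.
public class RootTabletLoadSketch {
    // Stand-in for "the root tablet location has been set".
    private final CountDownLatch locationSet = new CountDownLatch(1);

    // Stand-in for a write that cannot complete until the root tablet
    // can be located. Returns true if the write completed in time.
    public boolean writeToRootTablet(long timeoutMillis) {
        try {
            return locationSet.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Runs the post-recovery write and sets the location, in either order.
    public boolean load(boolean setLocationFirst) {
        if (setLocationFirst) {
            locationSet.countDown();
        }
        // Models the minor compaction after recovery, which writes to the
        // root tablet on the same thread that will later set the location.
        boolean wrote = writeToRootTablet(200);
        // In the reported ordering, the location is only set at this point.
        locationSet.countDown();
        return wrote;
    }

    public static void main(String[] args) {
        System.out.println("location set after compaction:  write completed = "
                + new RootTabletLoadSketch().load(false));
        System.out.println("location set before compaction: write completed = "
                + new RootTabletLoadSketch().load(true));
    }
}
```

With the reported ordering the write times out, because the only thread that would set the location is the one blocked on the write. The second call only shows that the wait disappears when the order is reversed; it makes no claim about how the attached patch resolves the issue.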



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
