[ https://issues.apache.org/jira/browse/HBASE-20616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488782#comment-16488782 ]
Ted Yu commented on HBASE-20616:
--------------------------------

{code}
+  @VisibleForTesting
+  protected TableDescriptor tableDescriptor;
{code}
Is this accessed in a test? If not, this change is not needed.
{code}
+    RegionInfo regionInfo = regions.get(0);
{code}
Since the new test doesn't modify regions, you can add a test-visible getter for the first region instead of exposing regions.

> TruncateTableProcedure is stuck in retry loop in TRUNCATE_TABLE_CREATE_FS_LAYOUT state
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-20616
>                 URL: https://issues.apache.org/jira/browse/HBASE-20616
>             Project: HBase
>          Issue Type: Bug
>          Components: amv2
>         Environment: HDP-2.5.3
>            Reporter: Toshihiro Suzuki
>            Assignee: Toshihiro Suzuki
>            Priority: Major
>         Attachments: HBASE-20616.master.001.patch, HBASE-20616.master.002.patch
>
>
> At first, TruncateTableProcedure failed to write some files to HDFS in
> the TRUNCATE_TABLE_CREATE_FS_LAYOUT state for some reason.
> {code:java}
> 2018-05-15 08:00:25,346 WARN  [ProcedureExecutorThread-8] procedure.TruncateTableProcedure: Retriable error trying to truncate table=<namespace>:<table> state=TRUNCATE_TABLE_CREATE_FS_LAYOUT
> java.io.IOException: java.util.concurrent.ExecutionException: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /apps/hbase/data/.tmp/data/<namespace>/<table>/<region>/.regioninfo could only be replicated to 0 nodes instead of minReplication (=1).  There are <the number of DNs> datanode(s) running and no node(s) are excluded in this operation.
> ...
> {code}
> At this time, however, it seems that some of the files were written to HDFS
> successfully.
> After that, TruncateTableProcedure was stuck in a retry loop in the
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state. At this point, the following log
> messages were shown repeatedly in the master log:
> {code:java}
> 2018-05-15 08:00:25,463 WARN  [ProcedureExecutorThread-8] procedure.TruncateTableProcedure: Retriable error trying to truncate table=<namespace>:<table> state=TRUNCATE_TABLE_CREATE_FS_LAYOUT
> java.io.IOException: java.util.concurrent.ExecutionException: java.io.IOException: The specified region already exists on disk: hdfs://<name>/apps/hbase/data/.tmp/data/<namespace>/<table>/<region>
> ...
> {code}
> This seems to happen because TruncateTableProcedure tries to write again the
> files that were already written successfully in the first try.
> I think we need to delete all the files and directories that were written
> successfully in the previous try before retrying the
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state.
> This issue was observed in HDP-2.5.3, but I think upstream has the same
> issue. Also, it looks to me like CreateTableProcedure has a similar issue.
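
For illustration only, the cleanup-before-retry idea proposed in the description could be sketched as below. This is not the code from the attached patches: it uses java.nio.file on a local filesystem as a stand-in for HDFS, and the class and method names (RetrySafeLayout, createRegionDirs, createFsLayout) are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch: make a "create FS layout" step safe to retry. */
public class RetrySafeLayout {

  /** Deletes dir and everything under it, if it exists (children first). */
  public static void deleteRecursively(Path dir) throws IOException {
    if (!Files.exists(dir)) {
      return;
    }
    List<Path> paths;
    try (Stream<Path> walk = Files.walk(dir)) {
      // Reverse order puts deeper paths before their parents.
      paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : paths) {
      Files.delete(p);
    }
  }

  /**
   * Non-idempotent step: creates one directory per region and fails with
   * FileAlreadyExistsException if a region directory is left over from an
   * earlier attempt, similar to the "already exists on disk" error above.
   */
  public static void createRegionDirs(Path tmpTableDir, List<String> regionNames)
      throws IOException {
    Files.createDirectories(tmpTableDir);
    for (String region : regionNames) {
      Path regionDir = tmpTableDir.resolve(region);
      Files.createDirectory(regionDir);
      Files.write(regionDir.resolve(".regioninfo"),
          region.getBytes(StandardCharsets.UTF_8));
    }
  }

  /** Retry-safe step: remove leftovers of a previous attempt, then create. */
  public static void createFsLayout(Path tmpTableDir, List<String> regionNames)
      throws IOException {
    deleteRecursively(tmpTableDir);
    createRegionDirs(tmpTableDir, regionNames);
  }
}
```

With this shape, calling createFsLayout a second time after a partial failure starts from a clean temporary directory, so the retry no longer trips over region directories written by the first attempt.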