[ https://issues.apache.org/jira/browse/HBASE-21149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652171#comment-16652171 ]
Ted Yu commented on HBASE-21149: -------------------------------- Added more DEBUG log: {code} 2018-10-16 17:59:05,032 DEBUG [Time-limited test] mapreduce.MapReduceBackupCopyJob$BackupDistCp(317): HdfsNamedFileStatus{path=hdfs://localhost:37318/user/hbase/test-data/f61cbd60-4f59-dea6-e789-213dedb22d55/data/default/test-1539712693301/db94181f831f30089fa1ff6d0973d754/f/d0512abaaf69408b9d0edcb4c3c4cb46_SeqId_205_; isDirectory=false; length=5100; replication=1; blocksize=134217728; modification_time=1539712730259; access_time=1539712729843; owner=hbase; group=supergroup; permission=rwxrwxrwx; isSymlink=false; hasAcl=false; isEncrypted=false; isErasureCoded=false} 2018-10-16 17:59:05,033 DEBUG [Time-limited test] mapreduce.MapReduceBackupCopyJob$BackupDistCp(317): HdfsNamedFileStatus{path=hdfs://localhost:37318/user/hbase/test-data/f61cbd60-4f59-dea6-e789-213dedb22d55/data/default/test-1539712693301/db94181f831f30089fa1ff6d0973d754/f/fabb7b1dbe2845ca90c66d4e34987da3_SeqId_205_; isDirectory=false; length=5142; replication=1; blocksize=134217728; modification_time=1539712730686; access_time=1539712730274; owner=hbase; group=supergroup; permission=rwxrwxrwx; isSymlink=false; hasAcl=false; isEncrypted=false; isErasureCoded=false} 2018-10-16 17:59:06,222 WARN [Thread-933] mapred.LocalJobRunner$Job(590): job_local1245512275_0004 java.io.IOException: Inconsistent sequence file: current chunk file org.apache.hadoop.tools.CopyListingFileStatus@2bf8b9e{hdfs://localhost:37318/user/hbase/test-data/ f61cbd60-4f59-dea6-e789-213dedb22d55/data/default/test-1539712693301/db94181f831f30089fa1ff6d0973d754/f/fabb7b1dbe2845ca90c66d4e34987da3_SeqId_205_ length = 5142 aclEntries = null, xAttrs = null} doesnt match prior entry org.apache.hadoop.tools.CopyListingFileStatus@70f30098{hdfs://localhost:37318/user/hbase/test-data/f61cbd60-4f59-dea6-e789-213dedb22d55/data/default/test-1539712693301/db94181f831f30089fa1ff6d0973d754/f/d0512abaaf69408b9d0edcb4c3c4cb46_SeqId_205_ length = 5100 aclEntries = null, xAttrs = null} at org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276) at org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567) {code} We can see that the respective length of bulk loaded hfiles didn't change from when they were recorded in the listing file to when the exception was thrown. We should find a way to skip the concatenation performed by DictCp. Alternatively, we can merge the bulk loaded hfiles prior to copying - but that would require modifying the backup table which introduces more complexity in terms of maintaining consistency for backup images. > TestIncrementalBackupWithBulkLoad may fail due to file copy failure > ------------------------------------------------------------------- > > Key: HBASE-21149 > URL: https://issues.apache.org/jira/browse/HBASE-21149 > Project: HBase > Issue Type: Test > Components: backup&restore > Reporter: Ted Yu > Assignee: Ted Yu > Priority: Major > Fix For: 3.0.0 > > Attachments: 21149.v2.txt, HBASE-21149-v1.patch, > testIncrementalBackupWithBulkLoad-output.txt > > > From > https://builds.apache.org/job/HBase%20Nightly/job/master/471/testReport/junit/org.apache.hadoop.hbase.backup/TestIncrementalBackupWithBulkLoad/TestIncBackupDeleteTable/ > : > {code} > 2018-09-03 11:54:30,526 ERROR [Time-limited test] > impl.TableBackupClient(235): Unexpected Exception : Failed copy from > hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/0f626c66493649daaf84057b8dd71a30_SeqId_205_,hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/ad8df6415bd9459d9b3df76c588d79df_SeqId_205_ > to hdfs://localhost:53075/backupUT/backup_1535975655488 > java.io.IOException: Failed copy from > hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/0f626c66493649daaf84057b8dd71a30_SeqId_205_,hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/ad8df6415bd9459d9b3df76c588d79df_SeqId_205_ > to hdfs://localhost:53075/backupUT/backup_1535975655488 > at > org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.incrementalCopyHFiles(IncrementalTableBackupClient.java:351) > at > org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.copyBulkLoadedFiles(IncrementalTableBackupClient.java:219) > at > org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.handleBulkLoad(IncrementalTableBackupClient.java:198) > at > org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:320) > at > org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:605) > at > org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad.TestIncBackupDeleteTable(TestIncrementalBackupWithBulkLoad.java:104) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > {code} > However, some part of the test output was lost: > {code} > 2018-09-03 11:53:36,793 DEBUG [RS:0;765c9ca5ea28:36357] regions > ...[truncated 398396 chars]... > 8) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)