[ https://issues.apache.org/jira/browse/HBASE-21077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vladimir Rodionov updated HBASE-21077:
--------------------------------------
    Attachment: HBASE-21077-v1.patch

> MR job launched by hbase incremental backup command failed with FileNotFoundException
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-21077
>                 URL: https://issues.apache.org/jira/browse/HBASE-21077
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>            Priority: Major
>         Attachments: HBASE-21077-v1.patch
>
>
> Discovered during internal testing by Romil Choksi.
> The MR job launched by the hbase incremental backup command failed with a FileNotFoundException.
> From the test console log:
> {code}
> 2018-06-12 04:27:31,160|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:31,160 INFO [main] mapreduce.JobSubmitter: Submitting tokens for job: job_1528766389356_0044
> 2018-06-12 04:27:31,186|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:31,184 INFO [main] mapreduce.JobSubmitter: Executing with tokens: [Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:ns1, Ident: (token for hbase: HDFS_DELEGATION_TOKEN owner=hb...@example.com, renewer=yarn, realUser=, issueDate=1528777648605, maxDate=1529382448605, sequenceNumber=175, masterKeyId=2), Kind: kms-dt, Service: 172.27.68.203:9393, Ident: (kms-dt owner=hbase, renewer=yarn, realUser=, issueDate=1528777649149, maxDate=1529382449149, sequenceNumber=49, masterKeyId=2), Kind: HBASE_AUTH_TOKEN, Service: bc71e347-78ff-4f95-af44-006f9b549a84, Ident: (org.apache.hadoop.hbase.security.token.AuthenticationTokenIdentifier@5), Kind: kms-dt, Service: 172.27.52.14:9393, Ident: (kms-dt owner=hbase, renewer=yarn, realUser=, issueDate=1528777648918, maxDate=1529382448918, sequenceNumber=50, masterKeyId=2)]
> 2018-06-12 04:27:31,477|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:31,477 INFO [main] conf.Configuration: found resource resource-types.xml at file:/etc/hadoop/3.0.0.0-1476/0/resource-types.xml
> 2018-06-12 04:27:31,527|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:31,527 INFO [main] impl.TimelineClientImpl: Timeline service address: ctr-e138-1518143905142-359429-01-000004.hwx.site:8190
> 2018-06-12 04:27:32,563|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:32,562 INFO [main] impl.YarnClientImpl: Submitted application application_1528766389356_0044
> 2018-06-12 04:27:32,634|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:32,634 INFO [main] mapreduce.Job: The url to track the job: https://ctr-e138-1518143905142-359429-01-000003.hwx.site:8090/proxy/application_1528766389356_0044/
> 2018-06-12 04:27:32,635|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:32,635 INFO [main] mapreduce.Job: Running job: job_1528766389356_0044
> 2018-06-12 04:27:44,807|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:44,806 INFO [main] mapreduce.Job: Job job_1528766389356_0044 running in uber mode : false
> 2018-06-12 04:27:44,809|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:44,809 INFO [main] mapreduce.Job: map 0% reduce 0%
> 2018-06-12 04:27:54,926|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:54,925 INFO [main] mapreduce.Job: map 5% reduce 0%
> 2018-06-12 04:27:56,950|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:56,950 INFO [main] mapreduce.Job: Task Id : attempt_1528766389356_0044_m_000002_0, Status : FAILED
> 2018-06-12 04:27:56,979|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|Error: java.io.FileNotFoundException: File does not exist: hdfs://ns1/apps/hbase/data/oldWALs/ctr-e138-1518143905142-359429-01-000004.hwx.site%2C16020%2C1528776085205.1528776160915
> 2018-06-12 04:27:56,979|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1583)
> 2018-06-12 04:27:56,980|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1576)
> 2018-06-12 04:27:56,980|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> 2018-06-12 04:27:56,980|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1591)
> 2018-06-12 04:27:56,980|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
> 2018-06-12 04:27:56,981|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:165)
> 2018-06-12 04:27:56,981|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:289)
> 2018-06-12 04:27:56,981|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:271)
> 2018-06-12 04:27:56,981|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:259)
> 2018-06-12 04:27:56,981|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:395)
> 2018-06-12 04:27:56,982|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.openReader(AbstractFSWALProvider.java:449)
> 2018-06-12 04:27:56,982|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hbase.mapreduce.WALInputFormat$WALRecordReader.openReader(WALInputFormat.java:166)
> 2018-06-12 04:27:56,982|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hbase.mapreduce.WALInputFormat$WALRecordReader.initialize(WALInputFormat.java:158)
> 2018-06-12 04:27:56,983|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:560)
> 2018-06-12 04:27:56,983|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:798)
> 2018-06-12 04:27:56,983|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
> 2018-06-12 04:27:56,983|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
> 2018-06-12 04:27:56,983|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at java.security.AccessController.doPrivileged(Native Method)
> 2018-06-12 04:27:56,984|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at javax.security.auth.Subject.doAs(Subject.java:422)
> 2018-06-12 04:27:56,984|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1686)
> 2018-06-12 04:27:56,984|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> 2018-06-12 04:27:56,984|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|
> {code}
> However, job setup apparently did find the file on HDFS; the test script runs the incremental backup command as the HBase user:
> {code}
> 2018-06-12 04:27:30,756|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:30,755 DEBUG [main] mapreduce.WALInputFormat: Scanning hdfs://ns1/apps/hbase/data/oldWALs/ctr-e138-1518143905142-359429-01-000004.hwx.site%2C16020%2C1528776085205.1528776160915 for WAL files
> 2018-06-12 04:27:30,758|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:30,758 INFO [main] mapreduce.WALInputFormat: Found: HdfsLocatedFileStatus{path=hdfs://ns1/apps/hbase/data/oldWALs/ctr-e138-1518143905142-359429-01-000004.hwx.site%2C16020%2C1528776085205.1528776160915; isDirectory=false; length=18031; replication=3; blocksize=268435456; modification_time=1528776689363; access_time=1528776160921; owner=hbase; group=hdfs; permission=rwx--x--x; isSymlink=false; hasAcl=false; isEncrypted=false; isErasureCoded=false}
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
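A note on the file name in the failure: the failing WAL path embeds the region server name ("host,port,startcode") with its commas percent-encoded as %2C, and the file is reported both as found (during input scanning) and as missing (in the map task). The following is only an illustration, in plain Python rather than HBase code, of why such percent-encoded names are fragile: if any layer between job planning and the filesystem lookup decodes the name, the lookup targets a string that no longer matches the on-disk file. Whether that is the actual root cause here is an assumption, not something the log proves.

```python
# Hypothetical illustration (not HBase code) of the %2C-encoded WAL name
# visible in the log above, and of how one stray decode changes the name.
from urllib.parse import quote, unquote

# Server name as HBase formats it: host,port,startcode (values from the log).
server_name = "ctr-e138-1518143905142-359429-01-000004.hwx.site,16020,1528776085205"

# Percent-encode the commas, keeping dots and hyphens, then append the
# WAL timestamp suffix; this reproduces the on-disk file name shape.
wal_name = quote(server_name, safe=".-") + ".1528776160915"
assert "%2C" in wal_name  # matches ...hwx.site%2C16020%2C... in the log

# If a URI/Path round trip decodes the name before the HDFS lookup,
# the commas reappear and the lookup no longer matches the on-disk entry.
decoded = unquote(wal_name)
assert decoded != wal_name  # a lookup for `decoded` would not find the file
```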