arw357 edited a comment on issue #590: Failed to APPEND_FILE /HUDI_MOR/2019/03/01/.abca886a-e213-4199-b571-475919d34fe5_20190301123358.log.1 for DFSClient_NONMAPREDUCE_-263902482_1 on 172.29.0.10 because lease recovery is in progress
URL: https://github.com/apache/incubator-hudi/issues/590#issuecomment-516377266
 
 
   I ran into something similar. I have only one datanode, and it happens when the archive log is being appended. I can reproduce it.
   I see that this build failed as well: https://api.travis-ci.org/v3/job/547004744/log.txt
   
   ```
   11:08:04.613 [DataLakeSystem-akka.actor.default-dispatcher-5]  INFO com.uber.hoodie.common.table.log.HoodieLogFormat$WriterBuilder  - Building HoodieLogFormat Writer
   11:08:04.613 [DataLakeSystem-akka.actor.default-dispatcher-5]  INFO com.uber.hoodie.common.table.log.HoodieLogFormat$WriterBuilder  - Computing the next log version for commits in hdfs://namenode:8020/data/lake/3ecfc2fc-8f48-46e1-8e0c-be8fc13eb596/converted/.hoodie/archived
   11:08:04.621 [DataLakeSystem-akka.actor.default-dispatcher-5]  INFO com.uber.hoodie.common.table.log.HoodieLogFormat$WriterBuilder  - Computed the next log version for commits in hdfs://namenode:8020/data/lake/3ecfc2fc-8f48-46e1-8e0c-be8fc13eb596/converted/.hoodie/archived as 1
   11:08:04.621 [DataLakeSystem-akka.actor.default-dispatcher-5]  INFO com.uber.hoodie.common.table.log.HoodieLogFormat$WriterBuilder  - HoodieLogFile on path hdfs://namenode:8020/data/lake/3ecfc2fc-8f48-46e1-8e0c-be8fc13eb596/converted/.hoodie/archived/.commits_.archive.1
   11:08:04.630 [DataLakeSystem-akka.actor.default-dispatcher-5]  INFO com.uber.hoodie.common.table.log.HoodieLogFormatWriter  - HoodieLogFile {hdfs://namenode:8020/data/lake/3ecfc2fc-8f48-46e1-8e0c-be8fc13eb596/converted/.hoodie/archived/.commits_.archive.1} exists. Appending to existing file
   11:08:04.652 [DataLakeSystem-akka.actor.default-dispatcher-5]  INFO com.uber.hoodie.io.HoodieCommitArchiveLog  - Archiving instants [[20190730104047__clean__COMPLETED], [20190730104512__clean__COMPLETED], [20190730104605__clean__COMPLETED], [20190730104628__clean__COMPLETED], [20190730104651__clean__COMPLETED], [20190730104721__clean__COMPLETED], [20190730104758__clean__COMPLETED], [20190730104821__clean__COMPLETED], [20190730104843__clean__COMPLETED], [20190730104905__clean__COMPLETED], [20190730104926__clean__COMPLETED]]
   11:08:04.653 [DataLakeSystem-akka.actor.default-dispatcher-5]  INFO com.uber.hoodie.io.HoodieCommitArchiveLog  - Wrapper schema 
{"type":"record","name":"HoodieArchivedMetaEntry","namespace":"com.uber.hoodie.avro.model","fields":[{"name":"hoodieCommitMetadata","type":["null",{"type":"record","name":"HoodieCommitMetadata","fields":[{"name":"partitionToWriteStats","type":["null",{"type":"map","values":{"type":"array","items":{"type":"record","name":"HoodieWriteStat","fields":[{"name":"fileId","type":["null",{"type":"string","avro.java.string":"String"}],"default":null},{"name":"path","type":["null",{"type":"string","avro.java.string":"String"}],"default":null},{"name":"prevCommit","type":["null",{"type":"string","avro.java.string":"String"}],"default":null},{"name":"numWrites","type":["null","long"],"default":null},{"name":"numDeletes","type":["null","long"],"default":null},{"name":"numUpdateWrites","type":["null","long"],"default":null},{"name":"totalWriteBytes","type":["null","long"],"default":null},{"name":"totalWriteErrors","type":["null","long"],"default":null},{"name":"partitionPath","type":["null",{"type":"string","avro.java.string":"String"}],"default":null},{"name":"totalLogRecords","type":["null","long"],"default":null},{"name":"totalLogFiles","type":["null","long"],"default":null},{"name":"totalUpdatedRecordsCompacted","type":["null","long"],"default":null},{"name":"numInserts","type":["null","long"],"default":null},{"name":"totalLogBlocks","type":["null","long"],"default":null},{"name":"totalCorruptLogBlock","type":["null","long"],"default":null},{"name":"totalRollbackBlocks","type":["null","long"],"default":null}]}},"avro.java.string":"String"}]},{"name":"extraMetadata","type":["null",{"type":"map","values":{"type":"string","avro.java.string":"String"},"avro.java.string":"String"}]}]}],"default":"null"},{"name":"hoodieCleanMetadata","type":["null",{"type":"record","name":"HoodieCleanMetadata","fields":[{"name":"startCleanTime","type":{"type":"string","avro.java.string":"String"}},{"name":"timeTakenInMillis","type":"long"},{"name":"totalFilesDeleted","type":"int"},{"name":"earliestCommitToRetain","type":{"type":"string","avro.java.string":"String"}},{"name":"partitionMetadata","type":{"type":"map","values":{"type":"record","name":"HoodieCleanPartitionMetadata","fields":[{"name":"partitionPath","type":{"type":"string","avro.java.string":"String"}},{"name":"policy","type":{"type":"string","avro.java.string":"String"}},{"name":"deletePathPatterns","type":{"type":"array","items":{"type":"string","avro.java.string":"String"}}},{"name":"successDeleteFiles","type":{"type":"array","items":{"type":"string","avro.java.string":"String"}}},{"name":"failedDeleteFiles","type":{"type":"array","items":{"type":"string","avro.java.string":"String"}}}]},"avro.java.string":"String"}}]}],"default":"null"},{"name":"hoodieCompactionMetadata","type":["null",{"type":"record","name":"HoodieCompactionMetadata","fields":[{"name":"partitionToCompactionWriteStats","type":["null",{"type":"map","values":{"type":"array","items":{"type":"record","name":"HoodieCompactionWriteStat","fields":[{"name":"partitionPath","type":["null",{"type":"string","avro.java.string":"String"}]},{"name":"totalLogRecords","type":["null","long"]},{"name":"totalLogFiles","type":["null","long"]},{"name":"totalUpdatedRecordsCompacted","type":["null","long"]},{"name":"hoodieWriteStat","type":["null","HoodieWriteStat"]}]}},"avro.java.string":"String"}]}]}],"default":"null"},{"name":"hoodieRollbackMetadata","type":["null",{"type":"record","name":"HoodieRollbackMetadata","fields":[{"name":"startRollbackTime","type":{"type":"string","avro.java.string":"String"}},{"name":
"timeTakenInMillis","type":"long"},{"name":"totalFilesDeleted","type":"int"},{"name":"commitsRollback","type":{"type":"array","items":{"type":"string","avro.java.string":"String"}}},{"name":"partitionMetadata","type":{"type":"map","values":{"type":"record","name":"HoodieRollbackPartitionMetadata","fields":[{"name":"partitionPath","type":{"type":"string","avro.java.string":"String"}},{"name":"successDeleteFiles","type":{"type":"array","items":{"type":"string","avro.java.string":"String"}}},{"name":"failedDeleteFiles","type":{"type":"array","items":{"type":"string","avro.java.string":"String"}}}]},"avro.java.string":"String"}}]}],"default":"null"},{"name":"hoodieSavePointMetadata","type":["null",{"type":"record","name":"HoodieSavepointMetadata","fields":[{"name":"savepointedBy","type":{"type":"string","avro.java.string":"String"}},{"name":"savepointedAt","type":"long"},{"name":"comments","type":{"type":"string","avro.java.string":"String"}},{"name":"partitionMetadata","type":{"type":"map","values":{"type":"record","name":"HoodieSavepointPartitionMetadata","fields":[{"name":"partitionPath","type":{"type":"string","avro.java.string":"String"}},{"name":"savepointDataFile","type":{"type":"array","items":{"type":"string","avro.java.string":"String"}}}]},"avro.java.string":"String"}}]}],"default":"null"},{"name":"commitTime","type":["null",{"type":"string","avro.java.string":"String"}]},{"name":"actionType","type":["null",{"type":"string","avro.java.string":"String"}]}]}
   11:08:04.706 [DataLakeSystem-akka.actor.default-dispatcher-5] ERROR  
   com.uber.hoodie.exception.HoodieException: Unable to close HoodieLogFormat writer
        at com.uber.hoodie.io.HoodieCommitArchiveLog.close(HoodieCommitArchiveLog.java:103)
        at com.uber.hoodie.io.HoodieCommitArchiveLog.archiveIfRequired(HoodieCommitArchiveLog.java:124)
        at com.uber.hoodie.HoodieWriteClient.commit(HoodieWriteClient.java:542)
        at com.uber.hoodie.HoodieWriteClient.commit(HoodieWriteClient.java:489)
        at com.uber.hoodie.HoodieWriteClient.commit(HoodieWriteClient.java:480)
        at com.uber.hoodie.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:155)
        at com.uber.hoodie.DefaultSource.createRelation(DefaultSource.scala:91)
        at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
        at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
        at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
        at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
        at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
        at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
        at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
        at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
        at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
        at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
        at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:654)
        at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:654)
        at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
        at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:654)
        at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:273)
        at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:267)
        at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:225)
        
        ....
        at akka.stream.impl.fusing.Map$$anon$10.onPush(Ops.scala:52)
        at akka.stream.impl.fusing.GraphInterpreter.processPush(GraphInterpreter.scala:519)
        at akka.stream.impl.fusing.GraphInterpreter.execute(GraphInterpreter.scala:411)
        at akka.stream.impl.fusing.GraphInterpreterShell.runBatch(ActorGraphInterpreter.scala:588)
        at akka.stream.impl.fusing.GraphInterpreterShell$AsyncInput.execute(ActorGraphInterpreter.scala:472)
        at akka.stream.impl.fusing.GraphInterpreterShell.processEvent(ActorGraphInterpreter.scala:563)
        at akka.stream.impl.fusing.ActorGraphInterpreter.akka$stream$impl$fusing$ActorGraphInterpreter$$processEvent(ActorGraphInterpreter.scala:745)
        at akka.stream.impl.fusing.ActorGraphInterpreter$$anonfun$receive$1.applyOrElse(ActorGraphInterpreter.scala:760)
        at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
        at akka.stream.impl.fusing.ActorGraphInterpreter.aroundReceive(ActorGraphInterpreter.scala:670)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
        at akka.actor.ActorCell.invoke(ActorCell.scala:561)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
        at akka.dispatch.Mailbox.run(Mailbox.scala:225)
        at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
        at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
   Caused by: java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[172.21.0.14:9866,DS-3b46b080-0c8e-4d74-a090-734caebd4a16,DISK]], original=[DatanodeInfoWithStorage[172.21.0.14:9866,DS-3b46b080-0c8e-4d74-a090-734caebd4a16,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:1044)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1107)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1276)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:560)
   
   ```
   
   Any clue?
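   
   For reference, the `Caused by` message points at `dfs.client.block.write.replace-datanode-on-failure.policy`. With a single datanode, the DEFAULT policy can never find a replacement node when an append/lease-recovery pipeline has to be rebuilt, so a possible workaround might be to relax that policy on the client side. This is only an untested sketch: the property names are standard HDFS client settings (not anything Hudi-specific), and the `spark` SparkSession variable is assumed to exist in the writing job.
   
   ```scala
   // Sketch: relax the datanode-replacement policy for a single-datanode cluster.
   // These are standard HDFS client properties, set on the Hadoop Configuration
   // that the Spark job (and therefore the Hudi/DFS client) uses.
   val hadoopConf = spark.sparkContext.hadoopConfiguration
   
   // NEVER = keep writing with the datanodes already in the pipeline instead of
   // failing when no replacement datanode is available (the situation above).
   hadoopConf.set("dfs.client.block.write.replace-datanode-on-failure.policy", "NEVER")
   hadoopConf.set("dfs.client.block.write.replace-datanode-on-failure.enable", "true")
   ```
   
   On a real multi-node cluster the DEFAULT policy is usually what you want, so this would only be a dev/single-node workaround.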
