[ 
https://issues.apache.org/jira/browse/HDDS-3180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17059327#comment-17059327
 ] 

Yiqun Lin edited comment on HDDS-3180 at 3/14/20, 12:29 PM:
------------------------------------------------------------

We need to add an additional log message for the inconsistent state, because 
this state causes the Datanode to fail to start.

A friendlier message, tested in my local environment:
{noformat}
2020-03-14 04:41:27,249 [main] INFO  (HddsVolume.java:177)     - Creating Volume: /tmp/hadoop-hdfs/dfs/data/hdds of storage type : DISK and capacity : 9997713408
2020-03-14 04:41:27,250 [main] WARN  (HddsVolume.java:252)     - VERSION file does not exist in volume /tmp/hadoop-hdfs/dfs/data/hdds, current volume state: INCONSISTENT.
2020-03-14 04:41:27,257 [main] ERROR (MutableVolumeSet.java:202)     - Failed to parse the storage location: file:///tmp/hadoop-hdfs/dfs/data
java.io.IOException: Volume is in an INCONSISTENT state. Skipped loading volume: /tmp/hadoop-hdfs/dfs/data/hdds
        at org.apache.hadoop.ozone.container.common.volume.HddsVolume.initialize(HddsVolume.java:226)
        at org.apache.hadoop.ozone.container.common.volume.HddsVolume.<init>(HddsVolume.java:180)
        at org.apache.hadoop.ozone.container.common.volume.HddsVolume.<init>(HddsVolume.java:71)
        at org.apache.hadoop.ozone.container.common.volume.HddsVolume$Builder.build(HddsVolume.java:158)
        at org.apache.hadoop.ozone.container.common.volume.MutableVolumeSet.createVolume(MutableVolumeSet.java:336)
{noformat}
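
For illustration only, here is a minimal sketch of the kind of check that 
would produce the WARN line above. It is not the actual HddsVolume code: the 
class name VolumeStateSketch, the method analyzeVolumeState, and every enum 
value other than INCONSISTENT are assumptions made for this example, and it 
uses java.util.logging only to stay self-contained.

```java
import java.io.File;
import java.util.logging.Logger;

/** Hypothetical sketch; names here are assumptions, not the real HddsVolume API. */
final class VolumeStateSketch {
    private static final Logger LOG =
        Logger.getLogger(VolumeStateSketch.class.getName());

    /** Only INCONSISTENT appears in the log above; the other values are assumed. */
    enum VolumeState { NON_EXISTENT, NOT_FORMATTED, INCONSISTENT, NORMAL }

    /** Classifies the volume root and logs the reason whenever it is INCONSISTENT. */
    static VolumeState analyzeVolumeState(File hddsRoot) {
        if (!hddsRoot.exists()) {
            return VolumeState.NON_EXISTENT;
        }
        if (!hddsRoot.isDirectory()) {
            LOG.warning("Volume root " + hddsRoot
                + " is not a directory, current volume state: "
                + VolumeState.INCONSISTENT + ".");
            return VolumeState.INCONSISTENT;
        }
        File[] files = hddsRoot.listFiles();
        if (files == null || files.length == 0) {
            // An empty directory is treated as a fresh, unformatted volume.
            return VolumeState.NOT_FORMATTED;
        }
        File versionFile = new File(hddsRoot, "VERSION");
        if (!versionFile.exists()) {
            // The key message: say why the volume is considered inconsistent.
            LOG.warning("VERSION file does not exist in volume " + hddsRoot
                + ", current volume state: " + VolumeState.INCONSISTENT + ".");
            return VolumeState.INCONSISTENT;
        }
        return VolumeState.NORMAL;
    }
}
```

The point of the sketch is that the reason is logged at the place where the 
state is decided, so the later IOException no longer appears without context.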


was (Author: linyiqun):
We need to add an additional log message for the inconsistent state, because 
this state causes the Datanode to fail to start.

> Datanode fails to start due to confused inconsistent volume state
> -----------------------------------------------------------------
>
>                 Key: HDDS-3180
>                 URL: https://issues.apache.org/jira/browse/HDDS-3180
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>    Affects Versions: 0.4.1
>            Reporter: Yiqun Lin
>            Assignee: Yiqun Lin
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> I hit an error in my test Ozone cluster when restarting a datanode. The log 
> reports an inconsistent volume state but gives no other helpful detail:
> {noformat}
> 2020-03-14 02:31:46,204 [main] INFO  (LogAdapter.java:51)     - registered UNIX signal handlers for [TERM, HUP, INT]
> 2020-03-14 02:31:46,736 [main] INFO  (HddsDatanodeService.java:204)     - HddsDatanodeService host:lyq-xx.xx.xx.xx ip:xx.xx.xx.xx
> 2020-03-14 02:31:46,784 [main] INFO  (HddsVolume.java:177)     - Creating Volume: /tmp/hadoop-hdfs/dfs/data/hdds of storage type : DISK and capacity : 20063645696
> 2020-03-14 02:31:46,786 [main] ERROR (MutableVolumeSet.java:202)     - Failed to parse the storage location: file:///tmp/hadoop-hdfs/dfs/data
> java.io.IOException: Volume is in an INCONSISTENT state. Skipped loading volume: /tmp/hadoop-hdfs/dfs/data/hdds
>         at org.apache.hadoop.ozone.container.common.volume.HddsVolume.initialize(HddsVolume.java:226)
>         at org.apache.hadoop.ozone.container.common.volume.HddsVolume.<init>(HddsVolume.java:180)
>         at org.apache.hadoop.ozone.container.common.volume.HddsVolume.<init>(HddsVolume.java:71)
>         at org.apache.hadoop.ozone.container.common.volume.HddsVolume$Builder.build(HddsVolume.java:158)
>         at org.apache.hadoop.ozone.container.common.volume.MutableVolumeSet.createVolume(MutableVolumeSet.java:336)
>         at org.apache.hadoop.ozone.container.common.volume.MutableVolumeSet.initializeVolumeSet(MutableVolumeSet.java:183)
>         at org.apache.hadoop.ozone.container.common.volume.MutableVolumeSet.<init>(MutableVolumeSet.java:139)
>         at org.apache.hadoop.ozone.container.common.volume.MutableVolumeSet.<init>(MutableVolumeSet.java:111)
>         at org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.<init>(OzoneContainer.java:97)
>         at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.<init>(DatanodeStateMachine.java:128)
>         at org.apache.hadoop.ozone.HddsDatanodeService.start(HddsDatanodeService.java:235)
>         at org.apache.hadoop.ozone.HddsDatanodeService.start(HddsDatanodeService.java:179)
>         at org.apache.hadoop.ozone.HddsDatanodeService.call(HddsDatanodeService.java:154)
>         at org.apache.hadoop.ozone.HddsDatanodeService.call(HddsDatanodeService.java:78)
>         at picocli.CommandLine.execute(CommandLine.java:1173)
>         at picocli.CommandLine.access$800(CommandLine.java:141)
>         at picocli.CommandLine$RunLast.handle(CommandLine.java:1367)
>         at picocli.CommandLine$RunLast.handle(CommandLine.java:1335)
>         at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:1243)
>         at picocli.CommandLine.parseWithHandlers(CommandLine.java:1526)
>         at picocli.CommandLine.parseWithHandler(CommandLine.java:1465)
>         at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:65)
>         at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:56)
>         at org.apache.hadoop.ozone.HddsDatanodeService.main(HddsDatanodeService.java:137)
> 2020-03-14 02:31:46,795 [shutdown-hook-0] INFO  (LogAdapter.java:51)     - SHUTDOWN_MSG:
> {noformat}
> I then looked into the code; the root cause was that the VERSION file was 
> missing on that node.
> We should also log this key information so users can quickly identify the 
> root cause.


