[jira] [Updated] (HDDS-4269) Ozone DataNode thinks a volume is failed if an unexpected file is in the HDDS root directory

2020-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-4269:
---------------------------------
Labels: newbie pull-request-available  (was: newbie)

> Ozone DataNode thinks a volume is failed if an unexpected file is in the HDDS 
> root directory
> 
>
>                  Key: HDDS-4269
>                  URL: https://issues.apache.org/jira/browse/HDDS-4269
>              Project: Hadoop Distributed Data Store
>           Issue Type: Bug
>           Components: Ozone Datanode
>     Affects Versions: 1.1.0
>             Reporter: Wei-Chiu Chuang
>             Assignee: Zheng Huang-Mu
>             Priority: Major
>               Labels: newbie, pull-request-available
>
> Took me some time to debug a trivial bug.
> DataNode crashes after this mysterious error, with no explanation:
> {noformat}
> 10:11:44.382 PM   INFO    MutableVolumeSet        Moving Volume : /var/lib/hadoop-ozone/fake_datanode/data/hdds to failed Volumes
> 10:11:46.287 PM   ERROR   StateContext            Critical error occurred in StateMachine, setting shutDownMachine
> 10:11:46.287 PM   ERROR   DatanodeStateMachine    DatanodeStateMachine Shutdown due to an critical error
> {noformat}
> Turns out that if there are unexpected files under the hdds directory 
> ($hdds.datanode.dir/hdds), the DN thinks the volume is bad and moves it to the 
> failed volume list without logging an explanation. I was editing the VERSION 
> file and vim created a temp file under the directory. This is impossible to 
> debug without reading the code.
> {code:java|title=HddsVolumeUtil#checkVolume()}
> } else if (hddsFiles.length == 2) {
>   // The files should be Version and SCM directory
>   if (scmDir.exists()) {
>     return true;
>   } else {
>     logger.error("Volume {} is in Inconsistent state, expected scm " +
>         "directory {} does not exist", volumeRoot, scmDir
>         .getAbsolutePath());
>     return false;
>   }
> } else {
>   // The hdds root dir should always have 2 files. One is Version file
>   // and other is SCM directory.
>   // <-- HERE!
>   return false;
> }
> {code}
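
For illustration only, here is a minimal sketch of how the silent branch could report what it found. The class HddsRootCheck and the method describeProblem are hypothetical names, not part of Ozone, and the sketch covers only the two branches quoted above, not the rest of HddsVolumeUtil#checkVolume(). It assumes the only goal is to name the unexpected entries so the operator can see why the volume was rejected.

```java
import java.io.File;
import java.util.Arrays;
import java.util.Optional;
import java.util.stream.Collectors;

/** Hypothetical helper (not Ozone code): explains why an hdds root looks inconsistent. */
final class HddsRootCheck {

  private HddsRootCheck() {
  }

  /**
   * Returns an empty Optional when the root holds exactly two entries and the
   * SCM directory exists. Otherwise returns a message suitable for logging.
   */
  static Optional<String> describeProblem(File hddsRoot, File scmDir) {
    File[] hddsFiles = hddsRoot.listFiles();
    if (hddsFiles == null) {
      // listFiles() returns null when the path is not a readable directory.
      return Optional.of("Cannot list " + hddsRoot.getAbsolutePath());
    }
    if (hddsFiles.length == 2) {
      if (scmDir.exists()) {
        return Optional.empty();
      }
      return Optional.of("Volume " + hddsRoot.getAbsolutePath()
          + " is in inconsistent state, expected scm directory "
          + scmDir.getAbsolutePath() + " does not exist");
    }
    // The branch that was silent in the report: list the entries actually found.
    String found = Arrays.stream(hddsFiles)
        .map(File::getName)
        .sorted()
        .collect(Collectors.joining(", "));
    return Optional.of("Volume " + hddsRoot.getAbsolutePath()
        + " should contain exactly 2 entries (version file and SCM directory)"
        + " but contains " + hddsFiles.length + ": [" + found + "]");
  }
}
```

A caller would log the returned message at ERROR level before treating the volume as failed. In the vim case described above, the message would then include the name of the stray temp file.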



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-4269) Ozone DataNode thinks a volume is failed if an unexpected file is in the HDDS root directory

2020-09-22 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDDS-4269:
----------------------------------
Labels: newbie  (was: )

> Ozone DataNode thinks a volume is failed if an unexpected file is in the HDDS 
> root directory
> 
>
>                  Key: HDDS-4269
>                  URL: https://issues.apache.org/jira/browse/HDDS-4269
>              Project: Hadoop Distributed Data Store
>           Issue Type: Bug
>           Components: Ozone Datanode
>     Affects Versions: 1.1.0
>             Reporter: Wei-Chiu Chuang
>             Priority: Major
>               Labels: newbie


