[jira] [Commented] (HDFS-5745) Unnecessary disk check triggered when socket operation has problem.

2015-05-08 Thread jun aoki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535331#comment-14535331
 ] 

jun aoki commented on HDFS-5745:


Around January 2014, when the ticket was submitted:

SocketOutputStream throws an IOException with the message ["The stream is 
closed"|https://github.com/apache/hadoop/blob/f3ee35ab288a171e2a4b8633b3417025a2ba97ab/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/SocketOutputStream.java#L118].
This should be categorized as a network-related issue.
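For reference, the guard at the linked line is roughly the following (a paraphrased sketch, not the verbatim source; the method shape is my assumption):

{code}
// Paraphrased sketch of the linked SocketOutputStream guard (shape assumed):
private void checkStream() throws IOException {
  if (closed) {
    // this is the message the DataNode later has to classify:
    throw new IOException("The stream is closed");
  }
}
{code}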

The DataTransfer.run() thread catches IOExceptions and passes them to 
DataNode.checkDiskError(Exception). This call is entirely *synchronous*.
DataNode.checkDiskError(Exception) then inspects the exception:
- If the exception is not identified as network-related, it calls 
DataNode.checkDiskError(), which I assume is an *expensive* disk check operation.
- If the exception is network-related, it logs a warning and ignores it.
IOExceptions from SocketOutputStream should be identified as network issues, so 
[they should not cause the disk 
check|https://github.com/apache/hadoop/blob/f3ee35ab288a171e2a4b8633b3417025a2ba97ab/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java#L1333]
(this is what [~kennethxian] pointed out originally).
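Putting it together, the old flow looked roughly like this (a sketch reconstructed from the patch quoted in the issue description below, not the exact branch-1 source):

{code}
// Reconstruction of the old synchronous path (see the quoted patch below):
protected void checkDiskError(Exception e) {
  String msg = e.getMessage();
  if (msg != null
      && (msg.startsWith("Broken pipe")
          || msg.startsWith("Connection reset")
          || msg.contains("java.nio.channels.SocketChannel"))) {
    // network-related: warn and ignore
    LOG.info("Not checking disk as checkDiskError was called on a network"
        + " related exception");
    return;
  }
  // anything else falls through to the expensive, synchronous disk check
  checkDiskError();
}
{code}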


Now, at HEAD:

DataNode.checkDiskError(Exception) is gone, so the synchronous check no longer 
exists. DataNode.checkDiskError() still exists, and it is called only from:
1. [initBlockPool()|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java#L1371],
which is (probably) called once when the DataNode starts, and
2. the asynchronous 
[checkDiskErrorThread|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java#L3216],
sketched below.
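I have not traced the thread in detail, but the pattern is roughly the following (an illustrative sketch only; the field and method names here are my assumptions, not the actual trunk code):

{code}
// Illustrative sketch of an asynchronous disk-check thread (names assumed):
private Thread checkDiskErrorThread;                 // assumed: owned by DataNode
private volatile boolean checkDiskErrorFlag = false; // assumed: set by I/O error handlers
private volatile boolean shouldRun = true;           // assumed: cleared on shutdown
private long checkDiskErrorInterval = 5000;          // assumed pacing, in ms

private void startCheckDiskErrorThread() {
  checkDiskErrorThread = new Thread(new Runnable() {
    @Override
    public void run() {
      while (shouldRun) {
        if (checkDiskErrorFlag) {
          checkDiskErrorFlag = false;
          // the thorough, possibly slow scan runs here, off the I/O path
          checkDiskError();
        }
        try {
          Thread.sleep(checkDiskErrorInterval);
        } catch (InterruptedException ie) {
          return;
        }
      }
    }
  }, "checkDiskErrorThread");
  checkDiskErrorThread.setDaemon(true);
  checkDiskErrorThread.start();
}
{code}

With this shape, an I/O error on the hot path at most sets a flag; the expensive scan happens asynchronously regardless of what kind of exception triggered it, which is why classifying the message no longer matters.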

*Conclusion*:
checkDiskError() is no longer synchronous; it is now intended to take some time 
and thoroughly check the disks, so it no longer needs to identify whether an 
IOException is network-related or disk-related. I'd therefore think this ticket 
is no longer applicable and should be closed.
(This is my research outcome from the Hadoop Bug Bash 2015 at Altiscale, and as 
I'm a newbie to HDFS I would like to leave the decision to someone more 
insightful.)


> Unnecessary disk check triggered when socket operation has problem.
> ---
>
> Key: HDFS-5745
> URL: https://issues.apache.org/jira/browse/HDFS-5745
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 1.2.1
>Reporter: MaoYuan Xian
>Assignee: jun aoki
> Attachments: HDFS-5745.patch
>
>
> When a BlockReceiver data transfer fails, SocketOutputStream translates the 
> failure into an IOException with the message "The stream is closed":
> 2014-01-06 11:48:04,716 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> IOException in BlockReceiver.run():
> java.io.IOException: The stream is closed
> at org.apache.hadoop.net.SocketOutputStream.write
> at java.io.BufferedOutputStream.flushBuffer
> at java.io.BufferedOutputStream.flush
> at java.io.DataOutputStream.flush
> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run
> at java.lang.Thread.run
> This causes the checkDiskError method of DataNode to be called, which triggers 
> a disk scan.
> Can we make a modification like the one below in checkDiskError to avoid this 
> unnecessary disk scan?:
> {code}
> --- a/src/hdfs/org/apache/hadoop/hdfs/server/datanode/DataNode.java
> +++ b/src/hdfs/org/apache/hadoop/hdfs/server/datanode/DataNode.java
> @@ -938,7 +938,8 @@ public class DataNode extends Configured
>           || e.getMessage().startsWith("An established connection was aborted")
>           || e.getMessage().startsWith("Broken pipe")
>           || e.getMessage().startsWith("Connection reset")
> -         || e.getMessage().contains("java.nio.channels.SocketChannel")) {
> +         || e.getMessage().contains("java.nio.channels.SocketChannel")
> +         || e.getMessage().startsWith("The stream is closed")) {
>        LOG.info("Not checking disk as checkDiskError was called on a network"
>            + " related exception");
>        return;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-5745) Unnecessary disk check triggered when socket operation has problem.

2015-05-08 Thread jun aoki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jun aoki updated HDFS-5745:
---
Labels: BB2015-05-RFC  (was: )



[jira] [Commented] (HDFS-5745) Unnecessary disk check triggered when socket operation has problem.

2015-05-07 Thread jun aoki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14533754#comment-14533754
 ] 

jun aoki commented on HDFS-5745:


Hi [~kennethxian], this was labeled BB2015-05-TBR and I have taken this jira to 
follow up on for the bug bash. Let me know if you want to take it back.



[jira] [Updated] (HDFS-5745) Unnecessary disk check triggered when socket operation has problem.

2015-05-07 Thread jun aoki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jun aoki updated HDFS-5745:
---
Status: Open  (was: Patch Available)



[jira] [Updated] (HDFS-5745) Unnecessary disk check triggered when socket operation has problem.

2015-05-07 Thread jun aoki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jun aoki updated HDFS-5745:
---
Labels:   (was: BB2015-05-TBR)



[jira] [Assigned] (HDFS-5745) Unnecessary disk check triggered when socket operation has problem.

2015-05-07 Thread jun aoki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jun aoki reassigned HDFS-5745:
--

Assignee: jun aoki



[jira] [Commented] (HDFS-6103) FSImage file system image version check throw a (slightly) wrong parameter.

2014-03-14 Thread jun aoki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935892#comment-13935892
 ] 

jun aoki commented on HDFS-6103:


Hi [~ajisakaa], thank you for clarifying. I'm using Bigtop.
Let's focus on StartupOption.UPGRADE in this ticket.



[jira] [Commented] (HDFS-6103) FSImage file system image version check throw a (slightly) wrong parameter.

2014-03-14 Thread jun aoki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935747#comment-13935747
 ] 

jun aoki commented on HDFS-6103:


Hi [~vinayrpet], I got the error message when I executed
{code}
sudo service hadoop-hdfs-namenode start
{code}

Then I found that I'd have to execute
{code}
sudo service hadoop-hdfs-namenode upgrade #(1)
{code}
Note that this form does not take a hyphen (i.e. it is not "-upgrade").
I have also found that users can run hadoop-daemon.sh. I've never tried it this 
way, but it would be something like
{code}
hadoop-daemon.sh --config /etc/hadoop start namenode -upgrade # (2)
{code}
and this form does require the hyphen.
I thought (1) was the preferred way, hence this ticket, but if I'm wrong and (2) 
is equally or more preferred, please let me know.






[jira] [Created] (HDFS-6103) FSImage file system image version check throw a (slightly) wrong parameter.

2014-03-13 Thread jun aoki (JIRA)
jun aoki created HDFS-6103:
--

 Summary: FSImage file system image version check throw a 
(slightly) wrong parameter.
 Key: HDFS-6103
 URL: https://issues.apache.org/jira/browse/HDFS-6103
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.2.0
Reporter: jun aoki
Priority: Trivial


Trivial error message issue:
When upgrading HDFS, say from 2.0.5 to 2.2.0, users need to start the namenode 
with the "upgrade" option, e.g.
{code}
sudo service namenode upgrade
{code}

However, the actual error shown when starting without that option says 
"-upgrade" (with a hyphen):
{code}
2014-03-13 23:38:15,488 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.io.IOException:
File system image contains an old layout version -40.
An upgrade to version -47 is required.
Please restart NameNode with -upgrade option.
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:221)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:787)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:568)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:443)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:491)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:684)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:669)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1254)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1320)
2014-03-13 23:38:15,492 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2014-03-13 23:38:15,493 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at nn1/192.168.2.202
************************************************************/
{code}

I'm referring to 2.0.5 above: 
https://github.com/apache/hadoop-common/blob/branch-2.0.5/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java#L225

I haven't tried trunk, but it seems to return "UPGRADE" (all upper case), which 
is again another slightly wrong error description:

https://github.com/apache/hadoop-common/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java#L232
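For context, I'd guess the check in FSImage looks something like the following (a hypothetical reconstruction based on the log above; the constant and variable names are my assumptions, not the actual source):

{code}
// Hypothetical reconstruction (names assumed). Layout versions are negative,
// so an *older* image has a larger value than the current LAYOUT_VERSION.
if (startOpt != StartupOption.UPGRADE
    && layoutVersion > HdfsConstants.LAYOUT_VERSION) {
  throw new IOException("\nFile system image contains an old layout version "
      + layoutVersion + ".\nAn upgrade to version "
      + HdfsConstants.LAYOUT_VERSION + " is required.\n"
      + "Please restart NameNode with -upgrade option.");
}
{code}

If trunk builds that last line from the StartupOption enum rather than a literal, that would explain why it prints "UPGRADE" in upper case.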



