[jira] [Commented] (HDFS-3091) Failed to add new DataNode in pipeline and will be resulted into write failure.

2012-03-19 Thread liaowenrui (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232474#comment-13232474
 ] 

liaowenrui commented on HDFS-3091:
--

Yeah, I agree with him! Thank you for your answer!

The design idea is good, but this implementation has a defect.

I think this feature is meant to guarantee replication reliability.

Assume the cluster size is 10 and the user sets the replication value to 10. When 9 
replicas have been written and one of them goes bad, do you think the write should succeed?

 Failed to add new DataNode in pipeline and will be resulted into write 
 failure.
 ---

 Key: HDFS-3091
 URL: https://issues.apache.org/jira/browse/HDFS-3091
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, hdfs client, name-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Uma Maheswara Rao G

 When verifying the HDFS-1606 feature, we observed a couple of issues.
 Presently, the ReplaceDatanodeOnFailure policy can be satisfied even though the 
 cluster does not have enough DNs to replace a failed one, and the result is a 
 write failure.
 {quote}
 12/03/13 14:27:12 WARN hdfs.DFSClient: DataStreamer Exception
 java.io.IOException: Failed to add a datanode: nodes.length != 
 original.length + 1, nodes=[xx.xx.xx.xx:50010], original=[xx.xx.xx.xx1:50010]
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:778)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:834)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:930)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:741)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:416)
 {quote}
 Let's take some cases:
 1) The replication factor is 3, the cluster size is also 3, and unfortunately 
 the pipeline drops to 1.
 ReplaceDatanodeOnFailure will be satisfied because *existings(1) <= 
 replication/2 (3/2==1)*.
 But when it tries to find a new node for the replacement, obviously it cannot 
 find one, and the sanity check will fail.
 This results in a write failure.
 2) The replication factor is 10 (the user accidentally sets the replication 
 factor higher than the cluster size), and the cluster has only 5 datanodes.
 Here the write will fail even if only one node fails, for the same reason: 
 the pipeline can contain at most 5 nodes, so after one datanode is killed, 
 existings will be 4.
 *existings(4) <= replication/2 (10/2==5)* will be satisfied, and obviously the 
 failed node cannot be replaced, as no extra nodes exist in the cluster. 
 This results in a write failure.
 3) Sync-related operations also fail in these situations (will post the 
 exact scenarios).
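
To make the arithmetic above concrete, here is a minimal sketch of the 
replacement condition. It is illustrative only: the class and method names are 
made up, and the real ReplaceDatanodeOnFailure policy has additional triggers 
beyond the *existings <= replication/2* check quoted above.

{code:java}
/** Minimal sketch of the replacement condition discussed above; the names
 *  are illustrative, not the real Hadoop API. */
public class ReplacePolicySketch {

  /** DEFAULT-policy arithmetic: try to replace when existings <= replication/2. */
  static boolean shouldTryToReplace(int replication, int existings) {
    return existings <= replication / 2;
  }

  public static void main(String[] args) {
    // Case 1: r=3, pipeline drops to 1 -> 1 <= 1, so a replacement is
    // attempted and fails on a 3-node cluster (no spare node).
    System.out.println(shouldTryToReplace(3, 1));   // true
    // Case 2: r=10 on a 5-node cluster, pipeline drops to 4 -> 4 <= 5.
    System.out.println(shouldTryToReplace(10, 4));  // true
    // The question above: r=10, pipeline drops to 9 -> 9 <= 5 is false,
    // so no replacement is attempted and the write continues.
    System.out.println(shouldTryToReplace(10, 9));  // false
  }
}
{code}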

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3113) httpfs does not support delegation tokens

2012-03-19 Thread Alejandro Abdelnur (Created) (JIRA)
httpfs does not support delegation tokens
-

 Key: HDFS-3113
 URL: https://issues.apache.org/jira/browse/HDFS-3113
 Project: Hadoop HDFS
  Issue Type: New Feature
Affects Versions: 0.23.3
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 0.24.0, 0.23.3


httpfs supports neither calls to get/renew delegation tokens nor delegation 
token authentication.
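
For illustration, this is roughly what a client would do once the feature 
exists; a sketch under assumptions, not HttpFS's current behavior: the 
webhdfs:// URI, host, and port 14000 are placeholders, and the token calls 
shown are the generic FileSystem/Token APIs.

{code:java}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.token.Token;

public class HttpFsTokenSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Hypothetical HttpFS endpoint; HttpFS serves the webhdfs protocol
    // on its own port (14000 in most setups).
    FileSystem fs =
        FileSystem.get(URI.create("webhdfs://httpfs-host:14000"), conf);

    // Obtain a delegation token from the server, then renew it later
    // without re-authenticating -- exactly the calls HttpFS rejects today.
    Token<?> token = fs.getDelegationToken("renewer-user");
    long nextExpiry = token.renew(conf);
    System.out.println("token renewed until " + nextExpiry);
  }
}
{code}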





[jira] [Created] (HDFS-3114) Remove implementing Writable interface for the internal data types in HDFS

2012-03-19 Thread Suresh Srinivas (Created) (JIRA)
Remove implementing Writable interface for the internal data types in HDFS
--

 Key: HDFS-3114
 URL: https://issues.apache.org/jira/browse/HDFS-3114
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: data-node, name-node
Affects Versions: 0.24.0
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas


With the changes done in 0.23 and trunk, there is a clear separation between 
wire types and implementation types. Given this, a lot of the Writable code 
associated with internal types can be removed.
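
As a concrete (and purely illustrative) example of what gets deleted: an 
internal type implementing org.apache.hadoop.io.Writable carries serialization 
code that the protobuf wire types now make redundant. The class below is a 
stand-in, not a real HDFS type.

{code:java}
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;

// Illustrative internal type, not a real HDFS class.
class InternalBlockInfo implements Writable {
  long blockId;
  long numBytes;

  // With protobuf wire types and translators owning serialization, these two
  // methods -- and the Writable interface itself -- become dead code of the
  // kind this issue proposes to remove.
  @Override
  public void write(DataOutput out) throws IOException {
    out.writeLong(blockId);
    out.writeLong(numBytes);
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    blockId = in.readLong();
    numBytes = in.readLong();
  }
}
{code}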





[jira] [Commented] (HDFS-3107) HDFS truncate

2012-03-19 Thread Milind Bhandarkar (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232725#comment-13232725
 ] 

Milind Bhandarkar commented on HDFS-3107:
-

This will be a great addition to HDFS for a couple of reasons:

1. Having append without truncate is a serious deficiency.
2. If a user mistakenly starts appending data to an existing large file and 
discovers the mistake, the only recourse is to recreate that file by rewriting 
its contents. This is very inefficient.

 HDFS truncate
 -

 Key: HDFS-3107
 URL: https://issues.apache.org/jira/browse/HDFS-3107
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: data-node, name-node
Reporter: Lei Chang
 Attachments: HDFS_truncate_semantics_Mar15.pdf

   Original Estimate: 1,344h
  Remaining Estimate: 1,344h

 Systems with transaction support often need to undo changes made to the 
 underlying storage when a transaction is aborted. Currently HDFS does not 
 support truncate (a standard POSIX operation), the reverse operation of 
 append, which forces upper-layer applications to use ugly workarounds (such as 
 keeping track of the discarded byte range per file in a separate metadata 
 store, and periodically running a vacuum process to rewrite compacted files) 
 to overcome this limitation of HDFS.
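
For a sense of the API shape being asked for, here is a sketch modeled on 
POSIX ftruncate. The signature and its contract are assumptions for 
discussion, not an existing HDFS call.

{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.Path;

// Hypothetical interface sketch; HDFS has no truncate call at this point.
interface TruncatableFileSystem {
  /**
   * Shrink the file at 'file' to newLength bytes, discarding the tail.
   * One plausible contract: return true if the truncate took effect
   * immediately, false if block recovery must complete first.
   */
  boolean truncate(Path file, long newLength) throws IOException;
}
{code}

With such a call, an aborted transaction could undo an over-append with a 
single metadata operation instead of rewriting the whole file.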





[jira] [Commented] (HDFS-3105) Add DatanodeStorage information to block recovery

2012-03-19 Thread Suresh Srinivas (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232737#comment-13232737
 ] 

Suresh Srinivas commented on HDFS-3105:
---

Comments:
# Not sure how UpdateReplicaUnderRecoveryResponseProto can have storage instead 
of block. Also, do you need DatanodeStorage, or is just the storageID sufficient?
# Please do not update the service protocol version, as this is within a 
release. It is not used any more, and we need to clean it up at some point.
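
To illustrate the second half of comment 1, a hypothetical response shape; 
these are stand-ins for discussion, not the real protobuf-generated classes.

{code:java}
// Hypothetical stand-in for the recovery response under discussion.
class UpdateReplicaUnderRecoveryResponseSketch {
  // A bare storage identifier is enough for the namenode to file the
  // recovered replica under the right per-storage block list; the full
  // DatanodeStorage (id + type + state) would be redundant here.
  final String storageID;

  UpdateReplicaUnderRecoveryResponseSketch(String storageID) {
    this.storageID = storageID;
  }
}
{code}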


 Add DatanodeStorage information to block recovery
 -

 Key: HDFS-3105
 URL: https://issues.apache.org/jira/browse/HDFS-3105
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: data-node, hdfs client
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h3105_20120315.patch, h3105_20120315b.patch, 
 h3105_20120316.patch, h3105_20120316b.patch


 When recovering a block, the namenode and client do not have the datanode 
 storage information for the block, so the namenode cannot add the block to the 
 corresponding datanode storage's block list.





[jira] [Created] (HDFS-3115) Update hdfs design doc to consider HA NNs

2012-03-19 Thread Todd Lipcon (Created) (JIRA)
Update hdfs design doc to consider HA NNs
-

 Key: HDFS-3115
 URL: https://issues.apache.org/jira/browse/HDFS-3115
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.24.0, 0.23.3
Reporter: Todd Lipcon
Priority: Minor


The hdfs_design_doc.xml still references the NN as an SPOF, which is no longer 
true. We should also sweep the docs for anything else that is out of date with 
respect to HA.





[jira] [Commented] (HDFS-3091) Failed to add new DataNode in pipeline and will be resulted into write failure.

2012-03-19 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232786#comment-13232786
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-3091:
--

 What if we log the cluster size and recommend disabling this 
 feature for smaller clusters? ...

Sure, let's add some comments for this.

 Assume the cluster size is 10 and the user sets the replication value to 10. When 9 
 replicas have been written and one of them goes bad, do you think the write should succeed?

If the DEFAULT policy is used, the pipeline won't fail until the number of 
datanodes N drops to 5, as described in case (2) of the description.  In your 
example, the write would only fail if the user had set replication to 18 and N 
dropped to 9.  (A small sketch of the client-side configuration follows.)
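
For reference, a client on a small cluster could act on that recommendation 
like this; a minimal sketch, assuming only the two client-side properties this 
feature is configured with in hdfs-default.xml, with the values shown being 
illustrative.

{code:java}
import org.apache.hadoop.conf.Configuration;

public class SmallClusterClientConf {
  public static Configuration create() {
    Configuration conf = new Configuration();
    // On a cluster too small to ever supply a replacement datanode,
    // skip replacement entirely instead of failing the write.
    conf.setBoolean(
        "dfs.client.block.write.replace-datanode-on-failure.enable", false);
    // Alternatively, leave it enabled and pick a policy explicitly:
    // DEFAULT, NEVER or ALWAYS.
    // conf.set("dfs.client.block.write.replace-datanode-on-failure.policy",
    //     "NEVER");
    return conf;
  }
}
{code}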


 Failed to add new DataNode in pipeline and will be resulted into write 
 failure.
 ---

 Key: HDFS-3091
 URL: https://issues.apache.org/jira/browse/HDFS-3091
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, hdfs client, name-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Uma Maheswara Rao G





[jira] [Updated] (HDFS-3091) Failed to add new DataNode in pipeline and will be resulted into write failure.

2012-03-19 Thread Tsz Wo (Nicholas), SZE (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3091:
-

Attachment: h3091_20120319.patch

h3091_20120319.patch: add comments for small clusters.

 Failed to add new DataNode in pipeline and will be resulted into write 
 failure.
 ---

 Key: HDFS-3091
 URL: https://issues.apache.org/jira/browse/HDFS-3091
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, hdfs client, name-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Uma Maheswara Rao G
 Attachments: h3091_20120319.patch






[jira] [Updated] (HDFS-3091) Failed to add new DataNode in pipeline and will be resulted into write failure.

2012-03-19 Thread Tsz Wo (Nicholas), SZE (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3091:
-

Attachment: h3091_20120319.patch

 Failed to add new DataNode in pipeline and will be resulted into write 
 failure.
 ---

 Key: HDFS-3091
 URL: https://issues.apache.org/jira/browse/HDFS-3091
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, hdfs client, name-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Uma Maheswara Rao G
 Attachments: h3091_20120319.patch, h3091_20120319.patch






[jira] [Updated] (HDFS-3091) Failed to add new DataNode in pipeline and will be resulted into write failure.

2012-03-19 Thread Tsz Wo (Nicholas), SZE (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3091:
-

Attachment: (was: h3091_20120319.patch)

 Failed to add new DataNode in pipeline and will be resulted into write 
 failure.
 ---

 Key: HDFS-3091
 URL: https://issues.apache.org/jira/browse/HDFS-3091
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, hdfs client, name-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Uma Maheswara Rao G
 Attachments: h3091_20120319.patch






[jira] [Commented] (HDFS-3091) Failed to add new DataNode in pipeline and will be resulted into write failure.

2012-03-19 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232817#comment-13232817
 ] 

Uma Maheswara Rao G commented on HDFS-3091:
---

Thanks a lot Nicholas.
Patch looks good.
+1

 Failed to add new DataNode in pipeline and will be resulted into write 
 failure.
 ---

 Key: HDFS-3091
 URL: https://issues.apache.org/jira/browse/HDFS-3091
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, hdfs client, name-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Uma Maheswara Rao G
 Attachments: h3091_20120319.patch






[jira] [Updated] (HDFS-3091) Update the usage limitations of ReplaceDatanodeOnFailure policy in the config description for the smaller clusters.

2012-03-19 Thread Uma Maheswara Rao G (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-3091:
--

Target Version/s: 0.24.0, 0.23.3  (was: 0.23.3, 0.24.0)
 Summary: Update the usage limitations of ReplaceDatanodeOnFailure 
policy in the config description for the smaller clusters.  (was: Failed to add 
new DataNode in pipeline and will be resulted into write failure.)

 Update the usage limitations of ReplaceDatanodeOnFailure policy in the config 
 description for the smaller clusters.
 ---

 Key: HDFS-3091
 URL: https://issues.apache.org/jira/browse/HDFS-3091
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, hdfs client, name-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Uma Maheswara Rao G
 Attachments: h3091_20120319.patch






[jira] [Updated] (HDFS-3105) Add DatanodeStorage information to block recovery

2012-03-19 Thread Tsz Wo (Nicholas), SZE (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3105:
-

Attachment: h3105_20120319.patch

Thanks Suresh for the review.

h3105_20120319.patch: returns storageID instead of DatanodeStorage and reverts 
the versionID changes.

 Add DatanodeStorage information to block recovery
 -

 Key: HDFS-3105
 URL: https://issues.apache.org/jira/browse/HDFS-3105
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: data-node, hdfs client
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h3105_20120315.patch, h3105_20120315b.patch, 
 h3105_20120316.patch, h3105_20120316b.patch, h3105_20120319.patch






[jira] [Updated] (HDFS-3004) Implement Recovery Mode

2012-03-19 Thread Eli Collins (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-3004:
--

Status: Patch Available  (was: Open)

 Implement Recovery Mode
 ---

 Key: HDFS-3004
 URL: https://issues.apache.org/jira/browse/HDFS-3004
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: tools
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3004.010.patch, HDFS-3004.011.patch, 
 HDFS-3004.012.patch, HDFS-3004.013.patch, HDFS-3004.015.patch, 
 HDFS-3004.016.patch, HDFS-3004.017.patch, HDFS-3004.018.patch, 
 HDFS-3004.019.patch, HDFS-3004__namenode_recovery_tool.txt


 When the NameNode metadata is corrupt for some reason, we want to be able to 
 fix it.  Obviously, we would prefer never to get into this situation.  In a 
 perfect world, we never would.  However, bad data on disk can happen from time 
 to time, because of hardware errors or misconfigurations.  In the past we have 
 had to correct it manually, which is time-consuming and which can result in 
 downtime.
 Recovery mode is initiated by the system administrator.  When the NameNode 
 starts up in Recovery Mode, it will try to load the FSImage file, apply all 
 the edits from the edit log, and then write out a new image.  Then it will 
 shut down.
 Unlike in the normal startup process, the recovery mode startup process will 
 be interactive.  When the NameNode finds something that is inconsistent, it 
 will prompt the operator as to what it should do.   The operator can also 
 choose to take the first option for all prompts by starting up with the '-f' 
 flag, or typing 'a' at one of the prompts.
 I have reused as much code as possible from the NameNode in this tool.  
 Hopefully, the effort that was spent developing this will also make the 
 NameNode editLog and image processing even more robust than it already is.
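
The interactive flow described above can be sketched as follows; this is an 
illustration of the prompt semantics ('-f' and 'a' both switch to 
take-the-first-option mode), not the actual recovery-tool code.

{code:java}
import java.util.Scanner;

// Illustrative prompt loop, not the real NameNode recovery code.
public class RecoveryPromptSketch {
  private boolean alwaysTakeFirst;                 // set by the '-f' flag
  private final Scanner in = new Scanner(System.in);

  public RecoveryPromptSketch(boolean forceFlag) {
    this.alwaysTakeFirst = forceFlag;
  }

  /** Ask the operator what to do; option 0 is the first/default choice. */
  int prompt(String question, String... options) {
    if (alwaysTakeFirst) {
      return 0;                                    // '-f' behavior
    }
    System.out.println(question);
    for (int i = 0; i < options.length; i++) {
      System.out.println("  " + i + ") " + options[i]);
    }
    System.out.print("choice (or 'a' to always take the first option): ");
    String answer = in.nextLine().trim();
    if (answer.equalsIgnoreCase("a")) {
      alwaysTakeFirst = true;                      // 'a' upgrades to '-f' mode
      return 0;
    }
    return Integer.parseInt(answer);
  }
}
{code}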





[jira] [Commented] (HDFS-3091) Update the usage limitations of ReplaceDatanodeOnFailure policy in the config description for the smaller clusters.

2012-03-19 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232844#comment-13232844
 ] 

Uma Maheswara Rao G commented on HDFS-3091:
---

Updated the title.

Committed to trunk. Thanks Nicholas for the patch.

 Update the usage limitations of ReplaceDatanodeOnFailure policy in the config 
 description for the smaller clusters.
 ---

 Key: HDFS-3091
 URL: https://issues.apache.org/jira/browse/HDFS-3091
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, hdfs client, name-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Uma Maheswara Rao G
 Attachments: h3091_20120319.patch






[jira] [Resolved] (HDFS-3091) Update the usage limitations of ReplaceDatanodeOnFailure policy in the config description for the smaller clusters.

2012-03-19 Thread Uma Maheswara Rao G (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G resolved HDFS-3091.
---

  Resolution: Fixed
Assignee: Tsz Wo (Nicholas), SZE
Target Version/s: 0.24.0, 0.23.3  (was: 0.23.3, 0.24.0)
Hadoop Flags: Reviewed

 Update the usage limitations of ReplaceDatanodeOnFailure policy in the config 
 description for the smaller clusters.
 ---

 Key: HDFS-3091
 URL: https://issues.apache.org/jira/browse/HDFS-3091
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, hdfs client, name-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Uma Maheswara Rao G
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h3091_20120319.patch






[jira] [Updated] (HDFS-3091) Update the usage limitations of ReplaceDatanodeOnFailure policy in the config description for the smaller clusters.

2012-03-19 Thread Uma Maheswara Rao G (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-3091:
--

Target Version/s: 0.24.0, 0.23.3  (was: 0.23.3, 0.24.0)
   Fix Version/s: 0.24.0

 Update the usage limitations of ReplaceDatanodeOnFailure policy in the config 
 description for the smaller clusters.
 ---

 Key: HDFS-3091
 URL: https://issues.apache.org/jira/browse/HDFS-3091
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, hdfs client, name-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Uma Maheswara Rao G
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0

 Attachments: h3091_20120319.patch






[jira] [Commented] (HDFS-3091) Update the usage limitations of ReplaceDatanodeOnFailure policy in the config description for the smaller clusters.

2012-03-19 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232845#comment-13232845
 ] 

Uma Maheswara Rao G commented on HDFS-3091:
---

Tomorrow, I will back-port this to the 0.23 branch as well.

 Update the usage limitations of ReplaceDatanodeOnFailure policy in the config 
 description for the smaller clusters.
 ---

 Key: HDFS-3091
 URL: https://issues.apache.org/jira/browse/HDFS-3091
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, hdfs client, name-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Uma Maheswara Rao G
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0

 Attachments: h3091_20120319.patch






[jira] [Commented] (HDFS-3091) Update the usage limitations of ReplaceDatanodeOnFailure policy in the config description for the smaller clusters.

2012-03-19 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232849#comment-13232849
 ] 

Hudson commented on HDFS-3091:
--

Integrated in Hadoop-Common-trunk-Commit #1900 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1900/])
HDFS-3091. Update the usage limitations of ReplaceDatanodeOnFailure policy 
in the config description for the smaller clusters. Contributed by Nicholas. 
(Revision 1302624)

 Result = SUCCESS
umamahesh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1302624
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml


 Update the usage limitations of ReplaceDatanodeOnFailure policy in the config 
 description for the smaller clusters.
 ---

 Key: HDFS-3091
 URL: https://issues.apache.org/jira/browse/HDFS-3091
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, hdfs client, name-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Uma Maheswara Rao G
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0

 Attachments: h3091_20120319.patch






[jira] [Commented] (HDFS-3091) Update the usage limitations of ReplaceDatanodeOnFailure policy in the config description for the smaller clusters.

2012-03-19 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232854#comment-13232854
 ] 

Hudson commented on HDFS-3091:
--

Integrated in Hadoop-Hdfs-trunk-Commit #1974 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1974/])
HDFS-3091. Update the usage limitations of ReplaceDatanodeOnFailure policy 
in the config description for the smaller clusters. Contributed by Nicholas. 
(Revision 1302624)

 Result = SUCCESS
umamahesh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1302624
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml


 Update the usage limitations of ReplaceDatanodeOnFailure policy in the config 
 description for the smaller clusters.
 ---

 Key: HDFS-3091
 URL: https://issues.apache.org/jira/browse/HDFS-3091
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, hdfs client, name-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Uma Maheswara Rao G
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0

 Attachments: h3091_20120319.patch






[jira] [Commented] (HDFS-3091) Update the usage limitations of ReplaceDatanodeOnFailure policy in the config description for the smaller clusters.

2012-03-19 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232858#comment-13232858
 ] 

Uma Maheswara Rao G commented on HDFS-3091:
---

Just merged to 0.23 also.

 Update the usage limitations of ReplaceDatanodeOnFailure policy in the config 
 description for the smaller clusters.
 ---

 Key: HDFS-3091
 URL: https://issues.apache.org/jira/browse/HDFS-3091
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, hdfs client, name-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Uma Maheswara Rao G
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0

 Attachments: h3091_20120319.patch






[jira] [Updated] (HDFS-3091) Update the usage limitations of ReplaceDatanodeOnFailure policy in the config description for the smaller clusters.

2012-03-19 Thread Tsz Wo (Nicholas), SZE (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3091:
-

Target Version/s: 0.24.0, 0.23.3  (was: 0.23.3, 0.24.0)
   Fix Version/s: 0.23.3
  Issue Type: Improvement  (was: Bug)

 Update the usage limitations of ReplaceDatanodeOnFailure policy in the config 
 description for the smaller clusters.
 ---

 Key: HDFS-3091
 URL: https://issues.apache.org/jira/browse/HDFS-3091
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node, hdfs client, name-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Uma Maheswara Rao G
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0, 0.23.3

 Attachments: h3091_20120319.patch






[jira] [Commented] (HDFS-3091) Update the usage limitations of ReplaceDatanodeOnFailure policy in the config description for the smaller clusters.

2012-03-19 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232867#comment-13232867
 ] 

Hudson commented on HDFS-3091:
--

Integrated in Hadoop-Hdfs-0.23-Commit #693 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/693/])
Merge HDFS-3091. Update the usage limitations of ReplaceDatanodeOnFailure 
policy in the config description for the smaller clusters. Contributed by 
Nicholas. (Revision 1302633)

 Result = SUCCESS
umamahesh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1302633
Files : 
* /hadoop/common/branches/branch-0.23
* /hadoop/common/branches/branch-0.23/hadoop-common-project
* /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-auth
* /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common
* 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/docs
* 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java
* 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/test/core
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/native
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/datanode
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/hdfs
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/bin
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/conf
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-examples
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/c++
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/block_forensics
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/build-contrib.xml
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/build.xml
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/data_join
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/eclipse-plugin
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/index
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/vaidya
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/examples
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/fs
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/hdfs
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/ipc
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/webapps/job
* /hadoop/common/branches/branch-0.23/hadoop-project
* /hadoop/common/branches/branch-0.23/hadoop-project/src/site


 Update the usage limitations of ReplaceDatanodeOnFailure policy in the config 
 description for the smaller clusters.
 ---

 Key: HDFS-3091
 URL: https://issues.apache.org/jira/browse/HDFS-3091
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node, hdfs client, name-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Uma Maheswara Rao G
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0, 0.23.3

 Attachments: h3091_20120319.patch


 When verifying the HDFS-1606 feature, observed a couple of 

[jira] [Commented] (HDFS-3091) Update the usage limitations of ReplaceDatanodeOnFailure policy in the config description for the smaller clusters.

2012-03-19 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232871#comment-13232871
 ] 

Hudson commented on HDFS-3091:
--

Integrated in Hadoop-Common-0.23-Commit #702 (See 
[https://builds.apache.org/job/Hadoop-Common-0.23-Commit/702/])
Merge HDFS-3091. Update the usage limitations of ReplaceDatanodeOnFailure 
policy in the config description for the smaller clusters. Contributed by 
Nicholas. (Revision 1302633)

 Result = SUCCESS
umamahesh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1302633
Files : 
* /hadoop/common/branches/branch-0.23
* /hadoop/common/branches/branch-0.23/hadoop-common-project
* /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-auth
* /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common
* 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/docs
* 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java
* 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/test/core
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/native
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/datanode
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/hdfs
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/bin
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/conf
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-examples
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/c++
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/block_forensics
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/build-contrib.xml
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/build.xml
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/data_join
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/eclipse-plugin
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/index
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/vaidya
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/examples
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/fs
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/hdfs
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/ipc
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/webapps/job
* /hadoop/common/branches/branch-0.23/hadoop-project
* /hadoop/common/branches/branch-0.23/hadoop-project/src/site


 Update the usage limitations of ReplaceDatanodeOnFailure policy in the config 
 description for the smaller clusters.
 ---

 Key: HDFS-3091
 URL: https://issues.apache.org/jira/browse/HDFS-3091
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node, hdfs client, name-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Uma Maheswara Rao G
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0, 0.23.3

 Attachments: h3091_20120319.patch


 When verifying the HDFS-1606 feature, observed a couple 

[jira] [Commented] (HDFS-3105) Add DatanodeStorage information to block recovery

2012-03-19 Thread Suresh Srinivas (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232873#comment-13232873
 ] 

Suresh Srinivas commented on HDFS-3105:
---

+1 for the patch.

 Add DatanodeStorage information to block recovery
 -

 Key: HDFS-3105
 URL: https://issues.apache.org/jira/browse/HDFS-3105
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: data-node, hdfs client
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h3105_20120315.patch, h3105_20120315b.patch, 
 h3105_20120316.patch, h3105_20120316b.patch, h3105_20120319.patch


 When recovering a block, the namenode and client do not have the datanode 
 storage information of the block.  So the namenode cannot add the block to the 
 corresponding datanode storage block list.
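
 As a purely illustrative sketch of the kind of change this implies (the 
 signature below is an assumption, not the committed API), the inter-datanode 
 recovery call could return the ID of the storage holding the recovered 
 replica, so the namenode can file the block under the right DatanodeStorage:

{code}
import java.io.IOException;
import org.apache.hadoop.hdfs.protocol.ExtendedBlock;

// Illustrative sketch only: report the storage ID along with the recovery
// result so the namenode can attribute the block to the right storage.
interface RecoveryWithStorage {
  /** @return the ID of the DatanodeStorage holding the recovered replica */
  String updateReplicaUnderRecovery(ExtendedBlock oldBlock, long recoveryId,
      long newLength) throws IOException;
}
{code}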





[jira] [Commented] (HDFS-2386) with security enabled fsck calls lead to handshake_failure and hftp fails throwing the same exception in the logs

2012-03-19 Thread Joey Echeverria (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232875#comment-13232875
 ] 

Joey Echeverria commented on HDFS-2386:
---

From the testing I've been doing, it looks like KSSL won't work without at 
least one of the DES encryption types enabled (e.g. DES_CBC_CRC). This looks 
like it's caused by a bug in the JDK. Basically, AES and RC4 don't pad unless 
they encrypt a message whose length is not a multiple of the block size. 
However, the JDK assumes that the PreMasterSecret will be padded and that the 
last byte in the decrypted secret is the length of the padding. When using AES 
or RC4, this ends up being a random byte and usually causes the JDK to end up 
with an invalid PreMasterSecret. To defend against this, the JDK generates a 
random secret, which then causes the handshake to fail later on. I need to do 
some more testing with another version of Kerberos, but I plan on filing a JDK 
bug.
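
For reference, keeping a DES enctype enabled on the client side looks roughly 
like the following in krb5.conf (illustrative values only; the exact enctype 
lists will vary by environment, and on MIT krb5 1.8+ allow_weak_crypto is also 
needed before DES can be used at all):

{code}
[libdefaults]
  # Keep at least one DES enctype available so KSSL negotiates a cipher
  # the JDK's padding assumption handles correctly (illustrative values).
  default_tkt_enctypes = des-cbc-crc aes256-cts rc4-hmac
  default_tgs_enctypes = des-cbc-crc aes256-cts rc4-hmac
  allow_weak_crypto = true
{code}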

 with security enabled fsck calls lead to handshake_failure and hftp fails 
 throwing the same exception in the logs
 -

 Key: HDFS-2386
 URL: https://issues.apache.org/jira/browse/HDFS-2386
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.20.205.0
Reporter: Arpit Gupta







[jira] [Commented] (HDFS-3091) Update the usage limitations of ReplaceDatanodeOnFailure policy in the config description for the smaller clusters.

2012-03-19 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232885#comment-13232885
 ] 

Hudson commented on HDFS-3091:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #1908 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1908/])
HDFS-3091. Update the usage limitations of ReplaceDatanodeOnFailure policy 
in the config description for the smaller clusters. Contributed by Nicholas. 
(Revision 1302624)

 Result = ABORTED
umamahesh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1302624
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml


 Update the usage limitations of ReplaceDatanodeOnFailure policy in the config 
 description for the smaller clusters.
 ---

 Key: HDFS-3091
 URL: https://issues.apache.org/jira/browse/HDFS-3091
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node, hdfs client, name-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Uma Maheswara Rao G
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0, 0.23.3

 Attachments: h3091_20120319.patch


 When verifying the HDFS-1606 feature, observed a couple of issues.
 Presently the ReplaceDatanodeOnFailure policy is satisfied even though we don't 
 have enough DNs in the cluster to replace a failed one, which results in a write failure.
 {quote}
 12/03/13 14:27:12 WARN hdfs.DFSClient: DataStreamer Exception
 java.io.IOException: Failed to add a datanode: nodes.length != 
 original.length + 1, nodes=[xx.xx.xx.xx:50010], original=[xx.xx.xx.xx1:50010]
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:778)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:834)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:930)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:741)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:416)
 {quote}
 Let's take some cases:
 1) Replication factor is 3 and cluster size is also 3, and unfortunately the 
 pipeline drops to 1.
 ReplaceDatanodeOnFailure will be satisfied because *existings(1) <= 
 replication/2 (3/2==1)*.
 But when it tries to find a new node for the replacement it obviously cannot 
 find one, and the sanity check will fail.
 This will result in a write failure.
 2) Replication factor is 10 (the user accidentally sets the replication factor 
 higher than the cluster size),
   and the cluster has only 5 datanodes.
   Here the write will fail for the same reason even if only one node fails.
   Because the pipeline max will be 5, after one datanode is killed the 
 existings count will be 4.
   *existings(4) <= replication/2 (10/2==5)* will be satisfied, and obviously it 
 cannot replace the failed node with a new one since no extra nodes exist in the 
 cluster. This will result in a write failure.
 3) sync related operations also fail in these situations (will post the 
 clear scenarios)





[jira] [Commented] (HDFS-3105) Add DatanodeStorage information to block recovery

2012-03-19 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232897#comment-13232897
 ] 

Hadoop QA commented on HDFS-3105:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12518930/h3105_20120319.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 12 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2034//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2034//console

This message is automatically generated.

 Add DatanodeStorage information to block recovery
 -

 Key: HDFS-3105
 URL: https://issues.apache.org/jira/browse/HDFS-3105
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: data-node, hdfs client
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h3105_20120315.patch, h3105_20120315b.patch, 
 h3105_20120316.patch, h3105_20120316b.patch, h3105_20120319.patch


 When recovering a block, the namenode and client do not have the datanode 
 storage information of the block.  So the namenode cannot add the block to the 
 corresponding datanode storage block list.





[jira] [Created] (HDFS-3116) Typo in fetchdt error message

2012-03-19 Thread Aaron T. Myers (Created) (JIRA)
Typo in fetchdt error message
-

 Key: HDFS-3116
 URL: https://issues.apache.org/jira/browse/HDFS-3116
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs client
Affects Versions: 0.24.0
Reporter: Aaron T. Myers
Priority: Trivial


In {{DelegationTokenFetcher.java}} there's the following typo of the word 
exactly:

{code}
System.err.println("ERROR: Must specify exacltly one token file");
{code}
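
The fix is just the spelling correction, e.g.:

{code}
System.err.println("ERROR: Must specify exactly one token file");
{code}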





[jira] [Commented] (HDFS-3004) Implement Recovery Mode

2012-03-19 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232927#comment-13232927
 ] 

Hadoop QA commented on HDFS-3004:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12518778/HDFS-3004.019.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 24 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated 1 warning messages.

-1 javac.  The patch appears to cause tar ant target to fail.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

-1 findbugs.  The patch appears to cause Findbugs (version 1.3.9) to fail.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hdfs.server.namenode.TestFSEditLogLoader
  org.apache.hadoop.hdfs.server.namenode.TestNameNodeRecovery
  org.apache.hadoop.hdfs.server.namenode.TestEditLog

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2035//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2035//console

This message is automatically generated.

 Implement Recovery Mode
 ---

 Key: HDFS-3004
 URL: https://issues.apache.org/jira/browse/HDFS-3004
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: tools
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3004.010.patch, HDFS-3004.011.patch, 
 HDFS-3004.012.patch, HDFS-3004.013.patch, HDFS-3004.015.patch, 
 HDFS-3004.016.patch, HDFS-3004.017.patch, HDFS-3004.018.patch, 
 HDFS-3004.019.patch, HDFS-3004__namenode_recovery_tool.txt


 When the NameNode metadata is corrupt for some reason, we want to be able to 
 fix it.  Obviously, we would prefer never to get in this case.  In a perfect 
 world, we never would.  However, bad data on disk can happen from time to 
 time, because of hardware errors or misconfigurations.  In the past we have 
 had to correct it manually, which is time-consuming and which can result in 
 downtime.
 Recovery mode is initiated by the system administrator.  When the NameNode 
 starts up in Recovery Mode, it will try to load the FSImage file, apply all 
 the edits from the edits log, and then write out a new image.  Then it will 
 shut down.
 Unlike in the normal startup process, the recovery mode startup process will 
 be interactive.  When the NameNode finds something that is inconsistent, it 
 will prompt the operator as to what it should do.   The operator can also 
 choose to take the first option for all prompts by starting up with the '-f' 
 flag, or typing 'a' at one of the prompts.
 I have reused as much code as possible from the NameNode in this tool.  
 Hopefully, the effort that was spent developing this will also make the 
 NameNode editLog and image processing even more robust than it already is.





[jira] [Commented] (HDFS-3091) Update the usage limitations of ReplaceDatanodeOnFailure policy in the config description for the smaller clusters.

2012-03-19 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232936#comment-13232936
 ] 

Hudson commented on HDFS-3091:
--

Integrated in Hadoop-Mapreduce-0.23-Commit #710 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/710/])
Merge HDFS-3091. Update the usage limitations of ReplaceDatanodeOnFailure 
policy in the config description for the smaller clusters. Contributed by 
Nicholas. (Revision 1302633)

 Result = ABORTED
umamahesh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1302633
Files : 
* /hadoop/common/branches/branch-0.23
* /hadoop/common/branches/branch-0.23/hadoop-common-project
* /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-auth
* /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common
* 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/docs
* 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java
* 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/test/core
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/native
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/datanode
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/hdfs
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/bin
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/conf
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-examples
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/c++
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/block_forensics
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/build-contrib.xml
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/build.xml
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/data_join
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/eclipse-plugin
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/index
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/vaidya
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/examples
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/fs
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/hdfs
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/ipc
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/webapps/job
* /hadoop/common/branches/branch-0.23/hadoop-project
* /hadoop/common/branches/branch-0.23/hadoop-project/src/site


 Update the usage limitations of ReplaceDatanodeOnFailure policy in the config 
 description for the smaller clusters.
 ---

 Key: HDFS-3091
 URL: https://issues.apache.org/jira/browse/HDFS-3091
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node, hdfs client, name-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Uma Maheswara Rao G
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0, 0.23.3

 Attachments: h3091_20120319.patch


 When verifying the HDFS-1606 feature, observed 

[jira] [Updated] (HDFS-3094) add -nonInteractive and -force option to namenode -format command

2012-03-19 Thread Arpit Gupta (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Gupta updated HDFS-3094:
--

Attachment: (was: HDFS-3094.docs.patch)

 add -nonInteractive and -force option to namenode -format command
 -

 Key: HDFS-3094
 URL: https://issues.apache.org/jira/browse/HDFS-3094
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 0.24.0, 1.0.2
Reporter: Arpit Gupta
Assignee: Arpit Gupta
 Attachments: HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, 
 HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, 
 HDFS-3094.branch-1.0.patch, HDFS-3094.patch, HDFS-3094.patch, HDFS-3094.patch


 Currently the bin/hadoop namenode -format prompts the user for a Y/N to set up 
 the directories in the local file system.
 -force : the namenode formats the directories without prompting
 -nonInteractive : the namenode format will return with an exit code of 1 if the 
 dirs already exist.
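
 A usage sketch (the scripting around the flags is illustrative; the flags 
 themselves are the ones proposed above):

{code}
# Format without prompting, e.g. from an automated setup script
bin/hadoop namenode -format -force

# Fail fast instead of prompting when the name dirs already exist
bin/hadoop namenode -format -nonInteractive
if [ $? -eq 1 ]; then
  echo "name directories already exist; skipping format"
fi
{code}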





[jira] [Commented] (HDFS-3094) add -nonInteractive and -force option to namenode -format command

2012-03-19 Thread Arpit Gupta (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232956#comment-13232956
 ] 

Arpit Gupta commented on HDFS-3094:
---

{code}
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hdfs.TestLeaseRecovery2

+1 contrib tests. The patch passed contrib unit tests.
{code}

I reran the test class multiple times and it went through. I have also created 
HADOOP-8185 for documentation changes to trunk.

 add -nonInteractive and -force option to namenode -format command
 -

 Key: HDFS-3094
 URL: https://issues.apache.org/jira/browse/HDFS-3094
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 0.24.0, 1.0.2
Reporter: Arpit Gupta
Assignee: Arpit Gupta
 Attachments: HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, 
 HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, 
 HDFS-3094.branch-1.0.patch, HDFS-3094.patch, HDFS-3094.patch, HDFS-3094.patch


 Currently the bin/hadoop namenode -format prompts the user for a Y/N to set up 
 the directories in the local file system.
 -force : the namenode formats the directories without prompting
 -nonInteractive : the namenode format will return with an exit code of 1 if the 
 dirs already exist.





[jira] [Updated] (HDFS-3105) Add DatanodeStorage information to block recovery

2012-03-19 Thread Tsz Wo (Nicholas), SZE (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3105:
-

   Resolution: Fixed
Fix Version/s: 0.23.3
   0.24.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I have committed this to trunk and 0.23.

 Add DatanodeStorage information to block recovery
 -

 Key: HDFS-3105
 URL: https://issues.apache.org/jira/browse/HDFS-3105
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: data-node, hdfs client
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0, 0.23.3

 Attachments: h3105_20120315.patch, h3105_20120315b.patch, 
 h3105_20120316.patch, h3105_20120316b.patch, h3105_20120319.patch


 When recovering a block, the namenode and client do not have the datanode 
 storage information of the block.  So the namenode cannot add the block to the 
 corresponding datanode storage block list.





[jira] [Commented] (HDFS-3105) Add DatanodeStorage information to block recovery

2012-03-19 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232960#comment-13232960
 ] 

Hudson commented on HDFS-3105:
--

Integrated in Hadoop-Hdfs-trunk-Commit #1976 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1976/])
HDFS-3105.  Add DatanodeStorage information to block recovery. (Revision 
1302683)

 Result = SUCCESS
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1302683
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolClientSideTranslatorPB.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolServerSideTranslatorPB.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolTranslatorPB.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDatasetInterface.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/DatanodeProtocol.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/InterDatanodeProtocol.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/DatanodeProtocol.proto
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/InterDatanodeProtocol.proto
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestInterDatanodeProtocol.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestPipelinesFailover.java


 Add DatanodeStorage information to block recovery
 -

 Key: HDFS-3105
 URL: https://issues.apache.org/jira/browse/HDFS-3105
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: data-node, hdfs client
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0, 0.23.3

 Attachments: h3105_20120315.patch, h3105_20120315b.patch, 
 h3105_20120316.patch, h3105_20120316b.patch, h3105_20120319.patch


 When recovering a block, the namenode and client do not have the datanode 
 storage information of the block.  So the namenode cannot add the block to the 
 corresponding datanode storage block list.





[jira] [Commented] (HDFS-3105) Add DatanodeStorage information to block recovery

2012-03-19 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232963#comment-13232963
 ] 

Hudson commented on HDFS-3105:
--

Integrated in Hadoop-Common-trunk-Commit #1902 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1902/])
HDFS-3105.  Add DatanodeStorage information to block recovery. (Revision 
1302683)

 Result = SUCCESS
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1302683
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolClientSideTranslatorPB.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolServerSideTranslatorPB.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolTranslatorPB.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDatasetInterface.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/DatanodeProtocol.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/InterDatanodeProtocol.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/DatanodeProtocol.proto
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/InterDatanodeProtocol.proto
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestInterDatanodeProtocol.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestPipelinesFailover.java


 Add DatanodeStorage information to block recovery
 -

 Key: HDFS-3105
 URL: https://issues.apache.org/jira/browse/HDFS-3105
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: data-node, hdfs client
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0, 0.23.3

 Attachments: h3105_20120315.patch, h3105_20120315b.patch, 
 h3105_20120316.patch, h3105_20120316b.patch, h3105_20120319.patch


 When recovering a block, the namenode and client do not have the datanode 
 storage information of the block.  So the namenode cannot add the block to the 
 corresponding datanode storage block list.





[jira] [Commented] (HDFS-3105) Add DatanodeStorage information to block recovery

2012-03-19 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232976#comment-13232976
 ] 

Hudson commented on HDFS-3105:
--

Integrated in Hadoop-Common-0.23-Commit #704 (See 
[https://builds.apache.org/job/Hadoop-Common-0.23-Commit/704/])
svn merge -c 1302683 from trunk for HDFS-3105. (Revision 1302685)

 Result = SUCCESS
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1302685
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolClientSideTranslatorPB.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolServerSideTranslatorPB.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolTranslatorPB.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDatasetInterface.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/DatanodeProtocol.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/InterDatanodeProtocol.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/DatanodeProtocol.proto
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/InterDatanodeProtocol.proto
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestInterDatanodeProtocol.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestPipelinesFailover.java


 Add DatanodeStorage information to block recovery
 -

 Key: HDFS-3105
 URL: https://issues.apache.org/jira/browse/HDFS-3105
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: data-node, hdfs client
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0, 0.23.3

 Attachments: h3105_20120315.patch, h3105_20120315b.patch, 
 h3105_20120316.patch, h3105_20120316b.patch, h3105_20120319.patch


 When recovering a block, the namenode and client do not have the datanode 
 storage information of the block.  So the namenode cannot add the block to the 
 corresponding datanode storage block list.





[jira] [Commented] (HDFS-3105) Add DatanodeStorage information to block recovery

2012-03-19 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232977#comment-13232977
 ] 

Hudson commented on HDFS-3105:
--

Integrated in Hadoop-Hdfs-0.23-Commit #695 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/695/])
svn merge -c 1302683 from trunk for HDFS-3105. (Revision 1302685)

 Result = SUCCESS
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1302685
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolClientSideTranslatorPB.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolServerSideTranslatorPB.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolTranslatorPB.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDatasetInterface.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/DatanodeProtocol.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/InterDatanodeProtocol.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/DatanodeProtocol.proto
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/InterDatanodeProtocol.proto
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestInterDatanodeProtocol.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestPipelinesFailover.java


 Add DatanodeStorage information to block recovery
 -

 Key: HDFS-3105
 URL: https://issues.apache.org/jira/browse/HDFS-3105
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: data-node, hdfs client
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0, 0.23.3

 Attachments: h3105_20120315.patch, h3105_20120315b.patch, 
 h3105_20120316.patch, h3105_20120316b.patch, h3105_20120319.patch


 When recovering a block, the namenode and client do not have the datanode 
 storage information of the block.  So the namenode cannot add the block to the 
 corresponding datanode storage block list.





[jira] [Commented] (HDFS-309) FSEditLog should log progress during replay

2012-03-19 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232982#comment-13232982
 ] 

Todd Lipcon commented on HDFS-309:
--

Hi Sho. This patch fell out of date when we merged the HA branch, I believe. 
Would you mind updating it against the current trunk?

 FSEditLog should log progress during replay
 ---

 Key: HDFS-309
 URL: https://issues.apache.org/jira/browse/HDFS-309
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Todd Lipcon
Assignee: Sho Shimauchi
  Labels: newbie
 Attachments: HDFS-309.txt, HDFS-309.txt, HDFS-309.txt


 When the NameNode is replaying a long edit log, it's handy to have reports on 
 how far through it is, so you can judge how much time is remaining.
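
 A minimal sketch of the kind of progress report being asked for (the 
 counters, helpers, and interval below are illustrative, not the Hadoop code):

{code}
// Periodically log replay progress so operators can estimate time remaining.
// LOG is the class's Commons Logging instance; hasNextOp()/nextOp()/
// getBytesRead()/getTotalBytes() are hypothetical helpers for this sketch.
long numEdits = 0;
long lastLogTime = System.currentTimeMillis();
final long LOG_INTERVAL_MS = 60000; // illustrative: report once a minute

while (editLog.hasNextOp()) {
  applyEditLogOp(editLog.nextOp());
  numEdits++;
  long now = System.currentTimeMillis();
  if (now - lastLogTime >= LOG_INTERVAL_MS) {
    LOG.info("Replayed " + numEdits + " edits, "
        + (100 * editLog.getBytesRead() / editLog.getTotalBytes())
        + "% of the edit log");
    lastLogTime = now;
  }
}
{code}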





[jira] [Updated] (HDFS-2983) Relax the build version check to permit rolling upgrades within a release

2012-03-19 Thread Eli Collins (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-2983:
--

Target Version/s: 1.1.0, 0.23.2  (was: 0.23.2)

 Relax the build version check to permit rolling upgrades within a release
 -

 Key: HDFS-2983
 URL: https://issues.apache.org/jira/browse/HDFS-2983
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 0.23.0
Reporter: Eli Collins

 Currently the version check for DN/NN communication is strict (it checks the 
 exact svn revision or git hash; Storage#getBuildVersion calls 
 VersionInfo#getRevision), which prevents rolling upgrades across any 
 releases. Once we have the PB-based RPC in place (coming soon to branch-23) 
 we'll have the necessary pieces to loosen this restriction, though perhaps 
 it will take another 0.23 minor release or so before we're ready to commit 
 to making the minor versions compatible.
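
 One way to picture the relaxation, as a sketch under the assumption that 
 compatibility would be keyed on the release version rather than the exact 
 build (not the actual patch):

{code}
// Strict check today: the exact build (svn revision / git hash) must match.
static boolean strictCompatible(String nnBuild, String dnBuild) {
  return nnBuild.equals(dnBuild);
}

// Relaxed check: any two builds of the same major.minor release, e.g.
// 1.1.0 and 1.1.1, may interoperate during a rolling upgrade.
static boolean relaxedCompatible(String nnVersion, String dnVersion) {
  String[] nn = nnVersion.split("\\.");
  String[] dn = dnVersion.split("\\.");
  return nn.length >= 2 && dn.length >= 2
      && nn[0].equals(dn[0]) && nn[1].equals(dn[1]);
}
{code}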





[jira] [Commented] (HDFS-3105) Add DatanodeStorage information to block recovery

2012-03-19 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232991#comment-13232991
 ] 

Hudson commented on HDFS-3105:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #1910 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1910/])
HDFS-3105.  Add DatanodeStorage information to block recovery. (Revision 
1302683)

 Result = ABORTED
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1302683
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolClientSideTranslatorPB.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolServerSideTranslatorPB.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolTranslatorPB.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDatasetInterface.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/DatanodeProtocol.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/InterDatanodeProtocol.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/DatanodeProtocol.proto
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/InterDatanodeProtocol.proto
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestInterDatanodeProtocol.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestPipelinesFailover.java


 Add DatanodeStorage information to block recovery
 -

 Key: HDFS-3105
 URL: https://issues.apache.org/jira/browse/HDFS-3105
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: data-node, hdfs client
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0, 0.23.3

 Attachments: h3105_20120315.patch, h3105_20120315b.patch, 
 h3105_20120316.patch, h3105_20120316b.patch, h3105_20120319.patch


 When recovering a block, the namenode and client do not have the datanode 
 storage information of the block.  So the namenode cannot add the block to the 
 corresponding datanode storage block list.





[jira] [Commented] (HDFS-3105) Add DatanodeStorage information to block recovery

2012-03-19 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233002#comment-13233002
 ] 

Hudson commented on HDFS-3105:
--

Integrated in Hadoop-Mapreduce-0.23-Commit #711 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/711/])
svn merge -c 1302683 from trunk for HDFS-3105. (Revision 1302685)

 Result = ABORTED
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1302685
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolClientSideTranslatorPB.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolServerSideTranslatorPB.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolTranslatorPB.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDatasetInterface.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/DatanodeProtocol.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/InterDatanodeProtocol.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/DatanodeProtocol.proto
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/InterDatanodeProtocol.proto
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestInterDatanodeProtocol.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestPipelinesFailover.java


 Add DatanodeStorage information to block recovery
 -

 Key: HDFS-3105
 URL: https://issues.apache.org/jira/browse/HDFS-3105
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: data-node, hdfs client
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0, 0.23.3

 Attachments: h3105_20120315.patch, h3105_20120315b.patch, 
 h3105_20120316.patch, h3105_20120316b.patch, h3105_20120319.patch


 When recovering a block, the namenode and client do not have the datanode 
 storage information for the block, so the namenode cannot add the block to the 
 corresponding datanode storage's block list.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3083) HA+security: failed to run a mapred job from yarn after a manual failover

2012-03-19 Thread Aaron T. Myers (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3083:
-

Attachment: HDFS-3083-combined.patch

Here's a patch which addresses the issue. It includes changes in both HDFS and 
Common projects, so test-patch isn't going to work. I can create separate JIRAs 
if folks want, but I figure reviewing it would be easier as a single patch.

No tests are included since security has to be enabled to verify the fix. To 
test it out, I ran the DT test script attached to HDFS-2904, with the following 
extra test case appended:

{code}
# Token issued by nn2 should work when nn2 still active
kinit -k -t ~/keytabs/$ADMIN.keytab $ADMIN/simon
kinit -R
# make nn2 the active NN
hdfs haadmin -failover nn1 nn2
rm -f /tmp/token
# fetch a delegation token from the now-active nn2
hdfs fetchdt --renewer $RENEWER /tmp/token
# drop the Kerberos credentials so only the delegation token remains
kdestroy
HADOOP_TOKEN_FILE_LOCATION=/tmp/token hadoop fs -ls /
{code}

All of the tests in the test script passed with this patch applied. The above 
test fails without the patch, and passes with it. I also successfully ran some 
MR jobs with the second-listed NN in the active state, and confirmed that 
everything worked as expected.

 HA+security: failed to run a mapred job from yarn after a manual failover
 -

 Key: HDFS-3083
 URL: https://issues.apache.org/jira/browse/HDFS-3083
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha, security
Affects Versions: 0.24.0, 0.23.3
Reporter: Mingjie Lai
Assignee: Aaron T. Myers
Priority: Critical
 Fix For: 0.24.0, 0.23.3

 Attachments: HDFS-3083-combined.patch


 Steps to reproduce:
 - turned on ha and security
 - run a mapred job, and wait to finish
 - failover to another namenode
 - run the mapred job again, it fails. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2802) Support for RW/RO snapshots in HDFS

2012-03-19 Thread Hari Mankude (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Mankude updated HDFS-2802:
---

Attachment: snapshot-one-pager.pdf

 Support for RW/RO snapshots in HDFS
 ---

 Key: HDFS-2802
 URL: https://issues.apache.org/jira/browse/HDFS-2802
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: data-node, name-node
Affects Versions: 0.24.0
Reporter: Hari Mankude
Assignee: Hari Mankude
 Attachments: snapshot-one-pager.pdf


 Snapshots are point in time images of parts of the filesystem or the entire 
 filesystem. Snapshots can be a read-only or a read-write point in time copy 
 of the filesystem. There are several use cases for snapshots in HDFS. I will 
 post a detailed write-up soon with more information.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS

2012-03-19 Thread Hari Mankude (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233044#comment-13233044
 ] 

Hari Mankude commented on HDFS-2802:


Uploaded the one-pager. A more detailed design doc and the first version of 
the patch are in the works.

 Support for RW/RO snapshots in HDFS
 ---

 Key: HDFS-2802
 URL: https://issues.apache.org/jira/browse/HDFS-2802
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: data-node, name-node
Affects Versions: 0.24.0
Reporter: Hari Mankude
Assignee: Hari Mankude
 Attachments: snapshot-one-pager.pdf


 Snapshots are point in time images of parts of the filesystem or the entire 
 filesystem. Snapshots can be a read-only or a read-write point in time copy 
 of the filesystem. There are several use cases for snapshots in HDFS. I will 
 post a detailed write-up soon with more information.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3100) failed to append data using webhdfs

2012-03-19 Thread Tsz Wo (Nicholas), SZE (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3100:
-

Description: 
STEP:
1, deploy a single node hdfs  0.23.1 cluster and configure hdfs as:
A) enable webhdfs
B) enable append
C) disable permissions
2, start hdfs
3, run the test script as attached

RESULT:
expected: a file named testFile should be created and populated with 32K * 5000 
zeros, HDFS should be OK.
I got: the script cannot finish; the file is created but not populated as 
expected, because the append operation failed.

The datanode log shows that the block scanner reported a bad replica and the 
namenode decided to delete it. Since it is a single-node cluster, the append 
fails. The script fails this way every time.

Datanode and Namenode logs are attached.

  was:

STEP:
1, deploy a single node hdfs  0.23.1 cluster and configure hdfs as:
A) enable webhdfs
B) enable append
C) disable permissions
2, start hdfs
3, run the test script as attached

RESULT:
expected: a file named testFile should be created and populated with 32K * 5000 
zeros, HDFS should be OK.
I got: the script cannot finish; the file is created but not populated as 
expected, because the append operation failed.

The datanode log shows that the block scanner reported a bad replica and the 
namenode decided to delete it. Since it is a single-node cluster, the append 
fails. The script fails this way every time.

Datanode and Namenode logs are attached.

   Assignee: Brandon Li  (was: Tsz Wo (Nicholas), SZE)

 failed to append data using webhdfs
 ---

 Key: HDFS-3100
 URL: https://issues.apache.org/jira/browse/HDFS-3100
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node
Affects Versions: 0.23.1
Reporter: Zhanwei.Wang
Assignee: Brandon Li
 Attachments: hadoop-wangzw-datanode-ubuntu.log, 
 hadoop-wangzw-namenode-ubuntu.log, test.sh, testAppend.patch


 STEP:
 1, deploy a single node hdfs  0.23.1 cluster and configure hdfs as:
 A) enable webhdfs
 B) enable append
 C) disable permissions
 2, start hdfs
 3, run the test script as attached
 RESULT:
 expected: a file named testFile should be created and populated with 32K * 
 5000 zeros, HDFS should be OK.
 I got: the script cannot finish; the file is created but not populated as 
 expected, because the append operation failed.
 The datanode log shows that the block scanner reported a bad replica and the 
 namenode decided to delete it. Since it is a single-node cluster, the append 
 fails. The script fails this way every time.
 Datanode and Namenode logs are attached.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3100) failed to append data using webhdfs

2012-03-19 Thread Brandon Li (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-3100:
-

Affects Version/s: 0.24.0

 failed to append data using webhdfs
 ---

 Key: HDFS-3100
 URL: https://issues.apache.org/jira/browse/HDFS-3100
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node
Affects Versions: 0.24.0, 0.23.1
Reporter: Zhanwei.Wang
Assignee: Brandon Li
 Attachments: hadoop-wangzw-datanode-ubuntu.log, 
 hadoop-wangzw-namenode-ubuntu.log, test.sh, testAppend.patch


 STEP:
 1, deploy a single node hdfs  0.23.1 cluster and configure hdfs as:
 A) enable webhdfs
 B) enable append
 C) disable permissions
 2, start hdfs
 3, run the test script as attached
 RESULT:
 expected: a file named testFile should be created and populated with 32K * 
 5000 zeros, HDFS should be OK.
 I got: the script cannot finish; the file is created but not populated as 
 expected, because the append operation failed.
 The datanode log shows that the block scanner reported a bad replica and the 
 namenode decided to delete it. Since it is a single-node cluster, the append 
 fails. The script fails this way every time.
 Datanode and Namenode logs are attached.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3089) Move FSDatasetInterface and other related classes/interfaces to a package

2012-03-19 Thread Tsz Wo (Nicholas), SZE (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3089:
-

Attachment: svn_mv.sh
h3089_20120319_svn_mv.patch

svn_mv.sh: a script to run svn mv
h3089_20120319_svn_mv.patch: updated with trunk.


 Move FSDatasetInterface and other related classes/interfaces to a package
 -

 Key: HDFS-3089
 URL: https://issues.apache.org/jira/browse/HDFS-3089
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: data-node
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h3089_20120316_svn_mv.patch, 
 h3089_20120319_svn_mv.patch, svn_mv.sh




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3100) failed to append data using webhdfs

2012-03-19 Thread Brandon Li (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-3100:
-

Attachment: HDFS-3100.patch

Attached patch for the trunk.

 failed to append data using webhdfs
 ---

 Key: HDFS-3100
 URL: https://issues.apache.org/jira/browse/HDFS-3100
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node
Affects Versions: 0.24.0, 0.23.1
Reporter: Zhanwei.Wang
Assignee: Brandon Li
 Attachments: HDFS-3100.patch, hadoop-wangzw-datanode-ubuntu.log, 
 hadoop-wangzw-namenode-ubuntu.log, test.sh, testAppend.patch


 STEP:
 1, deploy a single node hdfs  0.23.1 cluster and configure hdfs as:
 A) enable webhdfs
 B) enable append
 C) disable permissions
 2, start hdfs
 3, run the test script as attached
 RESULT:
 expected: a file named testFile should be created and populated with 32K * 
 5000 zeros, HDFS should be OK.
 I got: the script cannot finish; the file is created but not populated as 
 expected, because the append operation failed.
 The datanode log shows that the block scanner reported a bad replica and the 
 namenode decided to delete it. Since it is a single-node cluster, the append 
 fails. The script fails this way every time.
 Datanode and Namenode logs are attached.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3100) failed to append data using webhdfs

2012-03-19 Thread Tsz Wo (Nicholas), SZE (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3100:
-

Status: Patch Available  (was: Open)

 failed to append data using webhdfs
 ---

 Key: HDFS-3100
 URL: https://issues.apache.org/jira/browse/HDFS-3100
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node
Affects Versions: 0.23.1, 0.24.0
Reporter: Zhanwei.Wang
Assignee: Brandon Li
 Attachments: HDFS-3100.patch, hadoop-wangzw-datanode-ubuntu.log, 
 hadoop-wangzw-namenode-ubuntu.log, test.sh, testAppend.patch


 STEP:
 1, deploy a single node hdfs  0.23.1 cluster and configure hdfs as:
 A) enable webhdfs
 B) enable append
 C) disable permissions
 2, start hdfs
 3, run the test script as attached
 RESULT:
 expected: a file named testFile should be created and populated with 32K * 
 5000 zeros, HDFS should be OK.
 I got: the script cannot finish; the file is created but not populated as 
 expected, because the append operation failed.
 The datanode log shows that the block scanner reported a bad replica and the 
 namenode decided to delete it. Since it is a single-node cluster, the append 
 fails. The script fails this way every time.
 Datanode and Namenode logs are attached.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3100) failed to append data using webhdfs

2012-03-19 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233068#comment-13233068
 ] 

Hadoop QA commented on HDFS-3100:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12518984/HDFS-3100.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 11 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2036//console

This message is automatically generated.

 failed to append data using webhdfs
 ---

 Key: HDFS-3100
 URL: https://issues.apache.org/jira/browse/HDFS-3100
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node
Affects Versions: 0.24.0, 0.23.1
Reporter: Zhanwei.Wang
Assignee: Brandon Li
 Attachments: HDFS-3100.patch, hadoop-wangzw-datanode-ubuntu.log, 
 hadoop-wangzw-namenode-ubuntu.log, test.sh, testAppend.patch


 STEP:
 1, deploy a single node hdfs  0.23.1 cluster and configure hdfs as:
 A) enable webhdfs
 B) enable append
 C) disable permissions
 2, start hdfs
 3, run the test script as attached
 RESULT:
 expected: a file named testFile should be created and populated with 32K * 
 5000 zeros, HDFS should be OK.
 I got: the script cannot finish; the file is created but not populated as 
 expected, because the append operation failed.
 The datanode log shows that the block scanner reported a bad replica and the 
 namenode decided to delete it. Since it is a single-node cluster, the append 
 fails. The script fails this way every time.
 Datanode and Namenode logs are attached.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3004) Implement Recovery Mode

2012-03-19 Thread Colin Patrick McCabe (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233088#comment-13233088
 ] 

Colin Patrick McCabe commented on HDFS-3004:


Hi Todd,

Thanks for looking at this.  We'll have to chat about EditLogInputException, 
since there are a few things that are unclear to me about that exception.  It's 
used almost nowhere in the code.  Pretty much every deserialization error shows 
up as an IOException.  If the intention was that deserialization errors would 
be EditLogInputExceptions, we need to make that clear and actually implement 
it.  That will be quite a large amount of work, though; probably a patch at 
least as big as this one, maybe more.

I don't really understand how EditLogTailer is used in practice, so I can't 
evaluate how reasonable this is.

C.

 Implement Recovery Mode
 ---

 Key: HDFS-3004
 URL: https://issues.apache.org/jira/browse/HDFS-3004
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: tools
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3004.010.patch, HDFS-3004.011.patch, 
 HDFS-3004.012.patch, HDFS-3004.013.patch, HDFS-3004.015.patch, 
 HDFS-3004.016.patch, HDFS-3004.017.patch, HDFS-3004.018.patch, 
 HDFS-3004.019.patch, HDFS-3004__namenode_recovery_tool.txt


 When the NameNode metadata is corrupt for some reason, we want to be able to 
 fix it.  Obviously, we would prefer never to get in this case.  In a perfect 
 world, we never would.  However, bad data on disk can happen from time to 
 time, because of hardware errors or misconfigurations.  In the past we have 
 had to correct it manually, which is time-consuming and which can result in 
 downtime.
 Recovery mode is initialized by the system administrator.  When the NameNode 
 starts up in Recovery Mode, it will try to load the FSImage file, apply all 
 the edits from the edits log, and then write out a new image.  Then it will 
 shut down.
 Unlike in the normal startup process, the recovery mode startup process will 
 be interactive.  When the NameNode finds something that is inconsistent, it 
 will prompt the operator as to what it should do.   The operator can also 
 choose to take the first option for all prompts by starting up with the '-f' 
 flag, or typing 'a' at one of the prompts.
 I have reused as much code as possible from the NameNode in this tool.  
 Hopefully, the effort that was spent developing this will also make the 
 NameNode editLog and image processing even more robust than it already is.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2834) ByteBuffer-based read API for DFSInputStream

2012-03-19 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233095#comment-13233095
 ] 

jirapos...@reviews.apache.org commented on HDFS-2834:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4212/#review6103
---


Real close now!


hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java
https://reviews.apache.org/r/4212/#comment13128

this comment seems like it's in the wrong spot, since the code that comes 
after it doesn't reference offsetFromChunkBoundary.



hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderLocal.java
https://reviews.apache.org/r/4212/#comment13130

shouldn't this be true?



hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderLocal.java
https://reviews.apache.org/r/4212/#comment13132

no reason to use DFSClient here. Instead you can just use the filesystem, 
right? Then downcast the stream you get back?



hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderLocal.java
https://reviews.apache.org/r/4212/#comment13131

don't you want an assert on sawException here? You can also use 
GenericTestUtils.assertExceptionContains() if you want to check the text of it


- Todd


On 2012-03-09 00:47:24, Henry Robinson wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4212/
bq.  ---
bq.  
bq.  (Updated 2012-03-09 00:47:24)
bq.  
bq.  
bq.  Review request for hadoop-hdfs and Todd Lipcon.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  New patch for HDFS-2834 (I can't update the old review request).
bq.  
bq.  
bq.  This addresses bug HDFS-2834.
bq.  http://issues.apache.org/jira/browse/HDFS-2834
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReader.java
 dfab730 
bq.
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java
 cc61697 
bq.
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
 4187f1c 
bq.
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java
 2b817ff 
bq.
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java
 b7da8d4 
bq.
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java
 ea24777 
bq.
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/BlockReaderTestUtil.java
 9d4f4a2 
bq.
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderLocal.java
 PRE-CREATION 
bq.
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestParallelRead.java
 bbd0012 
bq.
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestShortCircuitLocalRead.java
 eb2a1d8 
bq.  
bq.  Diff: https://reviews.apache.org/r/4212/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Henry
bq.  
bq.



 ByteBuffer-based read API for DFSInputStream
 

 Key: HDFS-2834
 URL: https://issues.apache.org/jira/browse/HDFS-2834
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Henry Robinson
Assignee: Henry Robinson
 Attachments: HDFS-2834-no-common.patch, HDFS-2834.3.patch, 
 HDFS-2834.4.patch, HDFS-2834.5.patch, HDFS-2834.6.patch, HDFS-2834.7.patch, 
 HDFS-2834.8.patch, HDFS-2834.9.patch, HDFS-2834.patch, HDFS-2834.patch, 
 hdfs-2834-libhdfs-benchmark.png


 The {{DFSInputStream}} read-path always copies bytes into a JVM-allocated 
 {{byte[]}}. Although for many clients this is desired behaviour, in certain 
 situations, such as native-reads through libhdfs, this imposes an extra copy 
 penalty since the {{byte[]}} needs to be copied out again into a natively 
 readable memory area. 
 For these cases, it would be preferable to allow the client to supply its own 
 buffer, wrapped in a {{ByteBuffer}}, to avoid that final copy overhead. 
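
 To make the intended usage concrete, here is a minimal client-side sketch, 
 assuming the patched stream supports the proposed read(ByteBuffer) call 
 (streams that do not could reasonably throw UnsupportedOperationException):
{code}
import java.nio.ByteBuffer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ByteBufferReadSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // A direct buffer lives outside the JVM heap, so a native consumer
    // (e.g. libhdfs) can use the bytes without the extra byte[] copy.
    ByteBuffer buf = ByteBuffer.allocateDirect(128 * 1024);
    FSDataInputStream in = fs.open(new Path(args[0]));
    try {
      int n;
      // Assumed API: fills buf from its current position and returns the
      // number of bytes read, or -1 at end of stream.
      while ((n = in.read(buf)) > 0) {
        buf.flip();
        // ... hand buf to the native / zero-copy consumer ...
        buf.clear();
      }
    } finally {
      in.close();
    }
  }
}
{code}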

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3004) Implement Recovery Mode

2012-03-19 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233099#comment-13233099
 ] 

Todd Lipcon commented on HDFS-3004:
---

The EditLogInputExceptions are currently being thrown by this code:
{code}
try {
  if ((op = in.readOp()) == null) {
    break;
  }
} catch (IOException ioe) {
  long badTxId = txId + 1; // because txId hasn't been incremented yet
  String errorMessage = formatEditLogReplayError(in,
      recentOpcodeOffsets, badTxId);
  FSImage.LOG.error(errorMessage);
  throw new EditLogInputException(errorMessage,
      ioe, numEdits);
}
{code}
It indicates that whatever exception happened was due to a deserialization 
error, which is distinct from an application error.

EditLogTailer is used by the HA StandbyNode to tail the edits out of the edit 
log and apply them to the SBN's namespace. Since it's reading the same log that 
the active is writing, it's possible that it can see a partial edit at the end 
of the file, in which case it will generally see an IOException. The fact that 
it's being wrapped with EditLogInputException indicates that it was some 
problem reading the edits and can likely be retried. If the EditLogTailer gets 
a different type of exception, though, indicating that the _application_ of 
the edit failed, then it will exit, because it may have left the namespace in 
an inconsistent state and thus is no longer a candidate for failover.
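
To make the retry-vs-abort distinction concrete, here is a minimal sketch of 
the tailer behavior described above. This is illustrative only: the method and 
field names (doTailEdits, image.loadEdits, editStreams, terminate) are 
assumptions for the example, not the actual EditLogTailer code.
{code}
// Hedged sketch: a read error is retried, an apply error is fatal.
private void doTailEdits() {
  try {
    // May throw EditLogInputException on a partial or undecodable edit.
    long editsLoaded = image.loadEdits(editStreams, namesystem);
    LOG.info("Loaded " + editsLoaded + " edits");
  } catch (EditLogInputException elie) {
    // Likely raced the active writer and saw a partial edit at the tail
    // of the in-progress log: safe to retry on the next iteration.
    LOG.warn("Error while reading edits; will retry", elie);
  } catch (Throwable t) {
    // Applying an edit failed: the namespace may be inconsistent, so this
    // node must stop tailing and give up its candidacy for failover.
    LOG.fatal("Unrecoverable error replaying edits", t);
    terminate(t);
  }
}
{code}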

 Implement Recovery Mode
 ---

 Key: HDFS-3004
 URL: https://issues.apache.org/jira/browse/HDFS-3004
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: tools
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3004.010.patch, HDFS-3004.011.patch, 
 HDFS-3004.012.patch, HDFS-3004.013.patch, HDFS-3004.015.patch, 
 HDFS-3004.016.patch, HDFS-3004.017.patch, HDFS-3004.018.patch, 
 HDFS-3004.019.patch, HDFS-3004__namenode_recovery_tool.txt


 When the NameNode metadata is corrupt for some reason, we want to be able to 
 fix it.  Obviously, we would prefer never to get in this case.  In a perfect 
 world, we never would.  However, bad data on disk can happen from time to 
 time, because of hardware errors or misconfigurations.  In the past we have 
 had to correct it manually, which is time-consuming and which can result in 
 downtime.
 Recovery mode is initialized by the system administrator.  When the NameNode 
 starts up in Recovery Mode, it will try to load the FSImage file, apply all 
 the edits from the edits log, and then write out a new image.  Then it will 
 shut down.
 Unlike in the normal startup process, the recovery mode startup process will 
 be interactive.  When the NameNode finds something that is inconsistent, it 
 will prompt the operator as to what it should do.   The operator can also 
 choose to take the first option for all prompts by starting up with the '-f' 
 flag, or typing 'a' at one of the prompts.
 I have reused as much code as possible from the NameNode in this tool.  
 Hopefully, the effort that was spent developing this will also make the 
 NameNode editLog and image processing even more robust than it already is.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3089) Move FSDatasetInterface and other related classes/interfaces to a package

2012-03-19 Thread Tsz Wo (Nicholas), SZE (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3089:
-

Attachment: h3089_20120319.patch

h3089_20120319.patch: generated by svn rm/add for Jenkins.

 Move FSDatasetInterface and other related classes/interfaces to a package
 -

 Key: HDFS-3089
 URL: https://issues.apache.org/jira/browse/HDFS-3089
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: data-node
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h3089_20120316_svn_mv.patch, h3089_20120319.patch, 
 h3089_20120319_svn_mv.patch, svn_mv.sh




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3094) add -nonInteractive and -force option to namenode -format command

2012-03-19 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233127#comment-13233127
 ] 

Todd Lipcon commented on HDFS-3094:
---

{code}
+NONINTERACTIVE("-nonInterActive");
{code}
should be {{-nonInteractive}} (not a capital 'A')

{code}
+//default force to false
+private boolean isForce=false;
+//default interactive to true
+private boolean isInteractive=true;
{code}
These comments are superfluous, since they just say the same thing as the code. 
Also, please add spaces before and after the '=' in the variable definitions.

{code}
+public boolean getisForce() {
+  return isForce;
+}
+
+public void setisForce(boolean force) {
+  isForce = force;
+}
{code}
Rename {{getisForce}} to just {{isForce}} or {{isForceEnabled()}}. Rename 
{{setisForce}} to {{setForceEnabled()}} or {{setForce()}}. Same goes for 
{{isInteractive}}/{{setisInteractive}} below it.


{code}
+//by default force is off and interactive is on
+startOpt.setisForce(false);
+startOpt.setisInteractive(true);
{code}
you already have these defaults in the variable declarations, no need to 
duplicate them

- It looks like if you specify invalid options, it won't give any kind of 
useful error message. You should probably be throwing 
HadoopIllegalArgumentException instead of returning null in several of these 
cases.
- I don't follow the following comment:
{{+//make sure the user did not sent force or noninteractive as the 
clusterid or an empty clusterid}}

Can you clarify it?

In one of your test cases, you make a new thread and then sleep. This is not a 
reliable way of testing, especially since it wants to get user input. This 
won't work well in many test environments. I'd suggest we just use manual tests 
for this, or else set up a way to override System.in for the purpose of the 
test, so you can test without spawning a new thread.


Style nits: please make the code look like the surrounding style in the rest of 
the codebase. Spaces around '=' signs. No spaces after '(' in if statements. 
Maximum 80 characters in a line, etc. No tabs (two space indentation). Space 
after '//'. Please read over your comments for typos as well.
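
For illustration, here is a minimal, self-contained sketch of the shape 
suggested above: the enum constant spelled -nonInteractive, defaults stated 
once in the declarations, and accessors named isForceEnabled()/
setForceEnabled(). This is a hypothetical example of the convention, not the 
patch itself.
{code}
enum StartupOption {
  FORMAT("-format"),
  FORCE("-force"),
  NONINTERACTIVE("-nonInteractive");

  private final String name;
  private boolean forceEnabled = false; // defaults live here, nowhere else
  private boolean interactive = true;

  StartupOption(String name) { this.name = name; }

  String getName() { return name; }
  boolean isForceEnabled() { return forceEnabled; }
  void setForceEnabled(boolean force) { this.forceEnabled = force; }
  boolean isInteractive() { return interactive; }
  void setInteractive(boolean interactive) { this.interactive = interactive; }
}
{code}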


 add -nonInteractive and -force option to namenode -format command
 -

 Key: HDFS-3094
 URL: https://issues.apache.org/jira/browse/HDFS-3094
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 0.24.0, 1.0.2
Reporter: Arpit Gupta
Assignee: Arpit Gupta
 Attachments: HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, 
 HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, 
 HDFS-3094.branch-1.0.patch, HDFS-3094.patch, HDFS-3094.patch, HDFS-3094.patch


 Currently the bin/hadoop namenode -format prompts the user for a Y/N to set 
 up the directories in the local file system.
 -force : namenode formats the directories without prompting
 -nonInterActive : namenode format will return with an exit code of 1 if the 
 dir exists.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3107) HDFS truncate

2012-03-19 Thread Suresh Srinivas (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233134#comment-13233134
 ] 

Suresh Srinivas commented on HDFS-3107:
---

bq. if a user mistakenly starts to append data to an existing large file, and 
discovers the mistake, the only recourse is to recreate that file, by rewriting 
the contents. This is very inefficient.

What if a user accidentally truncates a file? :-)

 HDFS truncate
 -

 Key: HDFS-3107
 URL: https://issues.apache.org/jira/browse/HDFS-3107
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: data-node, name-node
Reporter: Lei Chang
 Attachments: HDFS_truncate_semantics_Mar15.pdf

   Original Estimate: 1,344h
  Remaining Estimate: 1,344h

 Systems with transaction support often need to undo changes made to the 
 underlying storage when a transaction is aborted. Currently HDFS does not 
 support truncate (a standard POSIX operation), which is the reverse of 
 append. This forces upper-layer applications to use ugly workarounds (such as 
 keeping track of the discarded byte range per file in a separate metadata 
 store, and periodically running a vacuum process to rewrite compacted files) 
 to overcome this limitation of HDFS.
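
 As a concrete (and purely hypothetical) illustration of what the feature 
 would look like to such an application, here is a sketch assuming a 
 truncate(Path, long) method on FileSystem; the signature is an assumption 
 for the example, not a committed interface.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TruncateSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path log = new Path("/txlog/segment-0001"); // hypothetical file
    long lastCommitOffset = 1024L * 1024L;      // known commit point
    // Assumed API: discard everything written after the last commit,
    // instead of rewriting the whole file.
    fs.truncate(log, lastCommitOffset);
    // The file could then be re-opened for append and writing resumed.
  }
}
{code}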

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3004) Implement Recovery Mode

2012-03-19 Thread Colin Patrick McCabe (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233146#comment-13233146
 ] 

Colin Patrick McCabe commented on HDFS-3004:


Ok, I think I see what you are trying to express with this exception: 
exceptions reading the edit log, as opposed to exceptions applying the edits.  
Since I only looked at what was in trunk, I didn't really see where it was 
useful, but now I understand.

I do kind of wonder if readOp itself should be doing this, just for 
consistency's sake.

C.

 Implement Recovery Mode
 ---

 Key: HDFS-3004
 URL: https://issues.apache.org/jira/browse/HDFS-3004
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: tools
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3004.010.patch, HDFS-3004.011.patch, 
 HDFS-3004.012.patch, HDFS-3004.013.patch, HDFS-3004.015.patch, 
 HDFS-3004.016.patch, HDFS-3004.017.patch, HDFS-3004.018.patch, 
 HDFS-3004.019.patch, HDFS-3004__namenode_recovery_tool.txt


 When the NameNode metadata is corrupt for some reason, we want to be able to 
 fix it.  Obviously, we would prefer never to get in this case.  In a perfect 
 world, we never would.  However, bad data on disk can happen from time to 
 time, because of hardware errors or misconfigurations.  In the past we have 
 had to correct it manually, which is time-consuming and which can result in 
 downtime.
 Recovery mode is initialized by the system administrator.  When the NameNode 
 starts up in Recovery Mode, it will try to load the FSImage file, apply all 
 the edits from the edits log, and then write out a new image.  Then it will 
 shut down.
 Unlike in the normal startup process, the recovery mode startup process will 
 be interactive.  When the NameNode finds something that is inconsistent, it 
 will prompt the operator as to what it should do.   The operator can also 
 choose to take the first option for all prompts by starting up with the '-f' 
 flag, or typing 'a' at one of the prompts.
 I have reused as much code as possible from the NameNode in this tool.  
 Hopefully, the effort that was spent developing this will also make the 
 NameNode editLog and image processing even more robust than it already is.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3107) HDFS truncate

2012-03-19 Thread Milind Bhandarkar (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233148#comment-13233148
 ] 

Milind Bhandarkar commented on HDFS-3107:
-

What if a user accidentally deletes a directory? You never supported me when I 
asked for a file-by-file deletion that could be aborted in time to save 70 
percent of users' time, right? Instead you have always supported directory 
deletion with a single misdirected RPC.

Anyway, to answer your question: if a user accidentally truncates, he or she 
can always append again, without losing any efficiency.

Can we have some mature discussion on this JIRA, please?

--
Milind Bhandarkar
Chief Architect, Greenplum Labs,
Data Computing Division, EMC
+1-650-523-3858 (W)
+1-408-666-8483 (C)



 HDFS truncate
 -

 Key: HDFS-3107
 URL: https://issues.apache.org/jira/browse/HDFS-3107
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: data-node, name-node
Reporter: Lei Chang
 Attachments: HDFS_truncate_semantics_Mar15.pdf

   Original Estimate: 1,344h
  Remaining Estimate: 1,344h

 Systems with transaction support often need to undo changes made to the 
 underlying storage when a transaction is aborted. Currently HDFS does not 
 support truncate (a standard POSIX operation), which is the reverse of 
 append. This forces upper-layer applications to use ugly workarounds (such as 
 keeping track of the discarded byte range per file in a separate metadata 
 store, and periodically running a vacuum process to rewrite compacted files) 
 to overcome this limitation of HDFS.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3117) clean cache and can't start hadoop

2012-03-19 Thread cldoltd (Created) (JIRA)
clean cache and can't start hadoop
--

 Key: HDFS-3117
 URL: https://issues.apache.org/jira/browse/HDFS-3117
 Project: Hadoop HDFS
  Issue Type: Task
Reporter: cldoltd


I used the command echo 3 > /proc/sys/vm/drop_caches to clear the cache.
Now I can't start Hadoop.
Thanks

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3094) add -nonInteractive and -force option to namenode -format command

2012-03-19 Thread Arpit Gupta (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233160#comment-13233160
 ] 

Arpit Gupta commented on HDFS-3094:
---

Thanks for the review Todd. I will make appropriate changes to the branch 1 and 
trunk.

bq. I don't follow the following comment:
+ //make sure the user did not sent force or noninteractive as the clusterid or 
an empty clusterid
Can you clarify it?

What I mean there is the case where the user enters a malformed command that 
does not specify a clusterid, e.g.:

{code}
./bin/hadoop namenode -format -clusterid -force
{code}



 add -nonInteractive and -force option to namenode -format command
 -

 Key: HDFS-3094
 URL: https://issues.apache.org/jira/browse/HDFS-3094
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 0.24.0, 1.0.2
Reporter: Arpit Gupta
Assignee: Arpit Gupta
 Attachments: HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, 
 HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, 
 HDFS-3094.branch-1.0.patch, HDFS-3094.patch, HDFS-3094.patch, HDFS-3094.patch


 Currently the bin/hadoop namenode -format prompts the user for a Y/N to set 
 up the directories in the local file system.
 -force : namenode formats the directories without prompting
 -nonInterActive : namenode format will return with an exit code of 1 if the 
 dir exists.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3071) haadmin failover command does not provide enough detail for when target NN is not ready to be active

2012-03-19 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3071:
--

Attachment: hdfs-3071.txt

Here's a patch which addresses the issue. Unfortunately it's cross-project, and 
there's no real way to split it up without breaking one or the other on commit.

As an experiment, I made the change in such a way that it wouldn't break 
protocol compatibility. This resulted in a sort of strange API naming. Let me 
know if you think it's better to just break the wire protocol (since we haven't 
had an Apache release with HA yet, it's probably acceptable).
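
For illustration, here is a minimal sketch of the client-side shape of this 
change. HAServiceStatus, isReadyToBecomeActive() and getNotReadyReason() are 
assumed names for the example; treat this as a sketch of the idea, not the 
actual patch.
{code}
// Hedged sketch: surface *why* the target is not ready, not just that it isn't.
HAServiceStatus status = toSvc.getServiceStatus();
if (!status.isReadyToBecomeActive()) {
  throw new ServiceFailedException(
      toSvcName + " is not ready to become active: "
      + status.getNotReadyReason());
}
{code}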

 haadmin failover command does not provide enough detail for when target NN is 
 not ready to be active
 

 Key: HDFS-3071
 URL: https://issues.apache.org/jira/browse/HDFS-3071
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha
Affects Versions: 0.24.0
Reporter: Philip Zeyliger
Assignee: Todd Lipcon
 Attachments: hdfs-3071.txt


 When running the failover command, you can get an error message like the 
 following:
 {quote}
 $ hdfs --config $(pwd) haadmin -failover namenode2 namenode1
 Failover failed: xxx.yyy/1.2.3.4:8020 is not ready to become active
 {quote}
 Unfortunately, the error message doesn't describe why that node isn't ready 
 to be active.  In my case, the target namenode's logs don't indicate anything 
 either. It turned out that the issue was "Safe mode is ON. Resources are low 
 on NN. Safe mode must be turned off manually.", but ideally the user would be 
 told that at the time of the failover.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3071) haadmin failover command does not provide enough detail for when target NN is not ready to be active

2012-03-19 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233193#comment-13233193
 ] 

Todd Lipcon commented on HDFS-3071:
---

I tested this manually in addition to the unit tests. For the manual test, I 
put one of the NNs in safemode and then issued the failover command:
{code}
todd@todd-w510:~/git/hadoop-common/hadoop-dist/target/hadoop-0.24.0-SNAPSHOT$ 
./bin/hdfs haadmin -failover nn2 nn1
Failover failed: todd-w510/127.0.0.1:8021 is not ready to become active: Not 
ready to go active, since the node is in safemode. Use "hdfs dfsadmin -safemode 
leave" to turn safe mode off.
{code}

 haadmin failover command does not provide enough detail for when target NN is 
 not ready to be active
 

 Key: HDFS-3071
 URL: https://issues.apache.org/jira/browse/HDFS-3071
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha
Affects Versions: 0.24.0
Reporter: Philip Zeyliger
Assignee: Todd Lipcon
 Attachments: hdfs-3071.txt


 When running the failover command, you can get an error message like the 
 following:
 {quote}
 $ hdfs --config $(pwd) haadmin -failover namenode2 namenode1
 Failover failed: xxx.yyy/1.2.3.4:8020 is not ready to become active
 {quote}
 Unfortunately, the error message doesn't describe why that node isn't ready 
 to be active.  In my case, the target namenode's logs don't indicate anything 
 either. It turned out that the issue was "Safe mode is ON. Resources are low 
 on NN. Safe mode must be turned off manually.", but ideally the user would be 
 told that at the time of the failover.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3081) SshFenceByTcpPort uses netcat incorrectly

2012-03-19 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3081:
--

Status: Patch Available  (was: Open)

 SshFenceByTcpPort uses netcat incorrectly
 -

 Key: HDFS-3081
 URL: https://issues.apache.org/jira/browse/HDFS-3081
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 0.24.0
Reporter: Philip Zeyliger
Assignee: Todd Lipcon
 Attachments: hdfs-3081.txt


 SshFenceByTcpPort currently assumes that the NN is listening on localhost.  
 Typical setups have the namenode listening just on the hostname of the 
 namenode, which means nc -z would not catch it.
 Here's an example in which the NN is running, listening on 8020, but doesn't 
 respond to localhost 8020.
 {noformat}
 [root@xxx ~]# lsof -P -p 5286 | grep -i listen
 java  5286  root  110u  IPv4  1772357  TCP xxx:8020 (LISTEN)
 java  5286  root  121u  IPv4  1772397  TCP xxx:50070 (LISTEN)
 [root@xxx ~]# nc -z localhost 8020
 [root@xxx ~]# nc -z xxx 8020
 Connection to xxx 8020 port [tcp/intu-ec-svcdisc] succeeded!
 {noformat}
 Here's the likely offending code:
 {code}
 LOG.info("Indeterminate response from trying to kill service. " +
     "Verifying whether it is running using nc...");
 rc = execCommand(session, "nc -z localhost 8020");
 {code}
 Naively, we could rely on netcat to the correct hostname (since the NN ought 
 to be listening on the hostname it's configured as), or just use fuser.  
 Fuser catches ports independently of which IPs they're bound to:
 {noformat}
 [root@xxx ~]# fuser 1234/tcp
 1234/tcp: 6766  6768
 [root@xxx ~]# jobs
 [1]-  Running  nc -l localhost 1234 &
 [2]+  Running  nc -l rhel56-18.ent.cloudera.com 1234 &
 [root@xxx ~]# sudo lsof -P | grep -i LISTEN | grep -i 1234
 nc    6766  root  3u  IPv4  2563626  TCP localhost:1234 (LISTEN)
 nc    6768  root  3u  IPv4  2563671  TCP xxx:1234 (LISTEN)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3081) SshFenceByTcpPort uses netcat incorrectly

2012-03-19 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3081:
--

Attachment: hdfs-3081.txt

Attached patch fixes the problem.

I am still using nc to verify that it's down, since it's possible that, if the 
configured user is wrong, fuser won't be able to find the listening process (it 
has to run as either the same user or root).

I tested locally by using my external hostname and verifying the following in 
the logs:

12/03/19 21:40:19 INFO ha.SshFenceByTcpPort: Connected to todd-w510
12/03/19 21:40:19 INFO ha.SshFenceByTcpPort: Looking for process running on 
port 8020
12/03/19 21:40:19 DEBUG ha.SshFenceByTcpPort: Running cmd: 
PATH=$PATH:/sbin:/usr/sbin fuser -v -k -n tcp 8020
12/03/19 21:40:19 INFO ha.SshFenceByTcpPort: Indeterminate response from trying 
to kill service. Verifying whether it is running using nc...
12/03/19 21:40:19 DEBUG ha.SshFenceByTcpPort: Running cmd: nc -z todd-w510 8020
12/03/19 21:40:19 INFO ha.SshFenceByTcpPort: Verified that the service is down.
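
For readers following along, here is a hedged sketch of the fuser-then-nc 
sequence this patch performs; execCommand is the helper already quoted in the 
description, and host/port stand in for the target NN's address. The exact 
command strings are assumptions for the example.
{code}
// Try to kill whatever owns the port, regardless of which IP it is bound to.
int rc = execCommand(session,
    "PATH=$PATH:/sbin:/usr/sbin fuser -v -k -n tcp " + port);
if (rc == 0) {
  LOG.info("Successfully killed the process listening on port " + port);
} else {
  LOG.info("Indeterminate response from trying to kill service. " +
      "Verifying whether it is running using nc...");
  // Probe the NN's actual hostname rather than localhost, so a NN bound
  // only to its external address is still detected.
  rc = execCommand(session, "nc -z " + host + " " + port);
  // rc != 0 here means nothing accepted the connection: the service is down.
}
{code}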


 SshFenceByTcpPort uses netcat incorrectly
 -

 Key: HDFS-3081
 URL: https://issues.apache.org/jira/browse/HDFS-3081
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 0.24.0
Reporter: Philip Zeyliger
Assignee: Todd Lipcon
 Attachments: hdfs-3081.txt


 SshFenceByTcpPort currently assumes that the NN is listening on localhost.  
 Typical setups have the namenode listening just on the hostname of the 
 namenode, which means nc -z would not catch it.
 Here's an example in which the NN is running, listening on 8020, but doesn't 
 respond to localhost 8020.
 {noformat}
 [root@xxx ~]# lsof -P -p 5286 | grep -i listen
 java  5286  root  110u  IPv4  1772357  TCP xxx:8020 (LISTEN)
 java  5286  root  121u  IPv4  1772397  TCP xxx:50070 (LISTEN)
 [root@xxx ~]# nc -z localhost 8020
 [root@xxx ~]# nc -z xxx 8020
 Connection to xxx 8020 port [tcp/intu-ec-svcdisc] succeeded!
 {noformat}
 Here's the likely offending code:
 {code}
 LOG.info("Indeterminate response from trying to kill service. " +
     "Verifying whether it is running using nc...");
 rc = execCommand(session, "nc -z localhost 8020");
 {code}
 Naively, we could rely on netcat to the correct hostname (since the NN ought 
 to be listening on the hostname it's configured as), or just use fuser.  
 Fuser catches ports independently of which IPs they're bound to:
 {noformat}
 [root@xxx ~]# fuser 1234/tcp
 1234/tcp: 6766  6768
 [root@xxx ~]# jobs
 [1]-  Running  nc -l localhost 1234 &
 [2]+  Running  nc -l rhel56-18.ent.cloudera.com 1234 &
 [root@xxx ~]# sudo lsof -P | grep -i LISTEN | grep -i 1234
 nc    6766  root  3u  IPv4  2563626  TCP localhost:1234 (LISTEN)
 nc    6768  root  3u  IPv4  2563671  TCP xxx:1234 (LISTEN)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HDFS-3084) FenceMethod.tryFence() and ShellCommandFencer should pass namenodeId as well as host:port

2012-03-19 Thread Todd Lipcon (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned HDFS-3084:
-

Assignee: Todd Lipcon

 FenceMethod.tryFence() and ShellCommandFencer should pass namenodeId as well 
 as host:port
 -

 Key: HDFS-3084
 URL: https://issues.apache.org/jira/browse/HDFS-3084
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha
Affects Versions: 0.24.0, 0.23.3
Reporter: Philip Zeyliger
Assignee: Todd Lipcon

 The FenceMethod interface passes along the host:port of the NN that needs to 
 be fenced.  That's great for the common case.  However, it's likely necessary 
 to have extra configuration parameters for fencing, and these are typically 
 keyed off the nameserviceId.namenodeId (if, for nothing else, consistency 
 with all the other parameters that are keyed off of nameserviceId.namenodeId).  
 Obviously this can be backed out from the host:port, but it's inconvenient, 
 and requires iterating through all the configs.
 The shell interface exhibits the same issue: host:port is great for most 
 fencers, but if you need extra configs (like the host:port of the power 
 supply unit), those are harder to pipe through without the namenodeId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3081) SshFenceByTcpPort uses netcat incorrectly

2012-03-19 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233206#comment-13233206
 ] 

Hadoop QA commented on HDFS-3081:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12519021/hdfs-3081.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2038//console

This message is automatically generated.

 SshFenceByTcpPort uses netcat incorrectly
 -

 Key: HDFS-3081
 URL: https://issues.apache.org/jira/browse/HDFS-3081
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 0.24.0
Reporter: Philip Zeyliger
Assignee: Todd Lipcon
 Attachments: hdfs-3081.txt


 SshFenceByTcpPort currently assumes that the NN is listening on localhost.  
 Typical setups have the namenode listening just on the hostname of the 
 namenode, which means nc -z would not catch it.
 Here's an example in which the NN is running, listening on 8020, but doesn't 
 respond to localhost 8020.
 {noformat}
 [root@xxx ~]# lsof -P -p 5286 | grep -i listen
 java  5286  root  110u  IPv4  1772357  TCP xxx:8020 (LISTEN)
 java  5286  root  121u  IPv4  1772397  TCP xxx:50070 (LISTEN)
 [root@xxx ~]# nc -z localhost 8020
 [root@xxx ~]# nc -z xxx 8020
 Connection to xxx 8020 port [tcp/intu-ec-svcdisc] succeeded!
 {noformat}
 Here's the likely offending code:
 {code}
 LOG.info("Indeterminate response from trying to kill service. " +
     "Verifying whether it is running using nc...");
 rc = execCommand(session, "nc -z localhost 8020");
 {code}
 Naively, we could rely on netcat to the correct hostname (since the NN ought 
 to be listening on the hostname it's configured as), or just use fuser.  
 Fuser catches ports independently of which IPs they're bound to:
 {noformat}
 [root@xxx ~]# fuser 1234/tcp
 1234/tcp: 6766  6768
 [root@xxx ~]# jobs
 [1]-  Running  nc -l localhost 1234 &
 [2]+  Running  nc -l rhel56-18.ent.cloudera.com 1234 &
 [root@xxx ~]# sudo lsof -P | grep -i LISTEN | grep -i 1234
 nc    6766  root  3u  IPv4  2563626  TCP localhost:1234 (LISTEN)
 nc    6768  root  3u  IPv4  2563671  TCP xxx:1234 (LISTEN)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3081) SshFenceByTcpPort uses netcat incorrectly

2012-03-19 Thread Philip Zeyliger (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233211#comment-13233211
 ] 

Philip Zeyliger commented on HDFS-3081:
---

Patch looks good to me; thanks!

 SshFenceByTcpPort uses netcat incorrectly
 -

 Key: HDFS-3081
 URL: https://issues.apache.org/jira/browse/HDFS-3081
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 0.24.0
Reporter: Philip Zeyliger
Assignee: Todd Lipcon
 Attachments: hdfs-3081.txt


 SshFenceByTcpPort currently assumes that the NN is listening on localhost.  
 Typical setups have the namenode listening just on the hostname of the 
 namenode, which means nc -z would not catch it.
 Here's an example in which the NN is running, listening on 8020, but doesn't 
 respond to localhost 8020.
 {noformat}
 [root@xxx ~]# lsof -P -p 5286 | grep -i listen
 java  5286  root  110u  IPv4  1772357  TCP xxx:8020 (LISTEN)
 java  5286  root  121u  IPv4  1772397  TCP xxx:50070 (LISTEN)
 [root@xxx ~]# nc -z localhost 8020
 [root@xxx ~]# nc -z xxx 8020
 Connection to xxx 8020 port [tcp/intu-ec-svcdisc] succeeded!
 {noformat}
 Here's the likely offending code:
 {code}
 LOG.info("Indeterminate response from trying to kill service. " +
     "Verifying whether it is running using nc...");
 rc = execCommand(session, "nc -z localhost 8020");
 {code}
 Naively, we could rely on netcat to the correct hostname (since the NN ought 
 to be listening on the hostname it's configured as), or just use fuser.  
 Fuser catches ports independently of which IPs they're bound to:
 {noformat}
 [root@xxx ~]# fuser 1234/tcp
 1234/tcp: 6766  6768
 [root@xxx ~]# jobs
 [1]-  Running  nc -l localhost 1234 &
 [2]+  Running  nc -l rhel56-18.ent.cloudera.com 1234 &
 [root@xxx ~]# sudo lsof -P | grep -i LISTEN | grep -i 1234
 nc    6766  root  3u  IPv4  2563626  TCP localhost:1234 (LISTEN)
 nc    6768  root  3u  IPv4  2563671  TCP xxx:1234 (LISTEN)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira