[jira] [Updated] (HDFS-4832) Namenode doesn't change the number of missing blocks in safemode when DNs rejoin or leave

2013-06-07 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-4832:
---

Status: Open  (was: Patch Available)

 Namenode doesn't change the number of missing blocks in safemode when DNs 
 rejoin or leave
 -

 Key: HDFS-4832
 URL: https://issues.apache.org/jira/browse/HDFS-4832
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.23.7, 3.0.0, 2.1.0-beta
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Priority: Critical
 Attachments: HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch, 
 HDFS-4832.patch


 Courtesy Karri VRK Reddy!
 {quote}
 1. Namenode lost datanodes causing missing blocks
 2. Namenode was put in safe mode
 3. Datanode restarted on dead nodes 
 4. Waited for lots of time for the NN UI to reflect the recovered blocks.
 5. Forced NN out of safe mode and suddenly,  no more missing blocks anymore.
 {quote}
 I was able to replicate this on 0.23 and trunk. I set 
 dfs.namenode.heartbeat.recheck-interval to 1 and killed the DN to simulate a 
 lost datanode. The opposite case also has problems: a datanode failing while 
 the NN is in safemode doesn't lead to a missing blocks message.
 Without the NN updating this list of missing blocks, the grid admins will not 
 know when to take the cluster out of safemode.
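 As a rough reproduction sketch (not the attached patch's test; MiniDFSCluster helper 
 names are assumed and imports are elided):
 {code}
 // Make the NN declare dead DNs quickly, then create a singly-replicated file.
 Configuration conf = new HdfsConfiguration();
 conf.setInt(DFSConfigKeys.DFS_NAMENODE_HEARTBEAT_RECHECK_INTERVAL_KEY, 1);
 conf.setLong(DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY, 1L);
 MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
 try {
   DistributedFileSystem fs = cluster.getFileSystem();
   DFSTestUtil.createFile(fs, new Path("/f"), 1024L, (short) 1, 0L);

   // Kill the only DN and wait until the NN marks it dead, so the block goes missing
   // (a real test would poll getMissingBlocksCount() instead of sleeping).
   MiniDFSCluster.DataNodeProperties dnProps = cluster.stopDataNode(0);
   Thread.sleep(15000);
   fs.setSafeMode(HdfsConstants.SafeModeAction.SAFEMODE_ENTER);
   long missing = cluster.getNamesystem().getMissingBlocksCount();   // expect 1

   // Restart the DN while still in safemode. Once its block report is processed the
   // count should drop back to 0, but per this bug it only changes after leaving safemode.
   cluster.restartDataNode(dnProps, true);
   Thread.sleep(15000);
   missing = cluster.getNamesystem().getMissingBlocksCount();
 } finally {
   cluster.shutdown();
 }
 {code}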

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4832) Namenode doesn't change the number of missing blocks in safemode when DNs rejoin or leave

2013-06-07 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677842#comment-13677842
 ] 

Ravi Prakash commented on HDFS-4832:


Oops! I mean the patch ported to branch-0.23


 Namenode doesn't change the number of missing blocks in safemode when DNs 
 rejoin or leave
 -

 Key: HDFS-4832
 URL: https://issues.apache.org/jira/browse/HDFS-4832
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.7, 2.1.0-beta
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Priority: Critical
 Attachments: HDFS-4832.branch-0.23.patch, HDFS-4832.patch, 
 HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch


 Courtesy Karri VRK Reddy!
 {quote}
 1. Namenode lost datanodes causing missing blocks
 2. Namenode was put in safe mode
 3. Datanode restarted on dead nodes 
 4. Waited for lots of time for the NN UI to reflect the recovered blocks.
 5. Forced NN out of safe mode and suddenly,  no more missing blocks anymore.
 {quote}
 I was able to replicate this on 0.23 and trunk. I set 
 dfs.namenode.heartbeat.recheck-interval to 1 and killed the DN to simulate a 
 lost datanode. The opposite case also has problems: a datanode failing while 
 the NN is in safemode doesn't lead to a missing blocks message.
 Without the NN updating this list of missing blocks, the grid admins will not 
 know when to take the cluster out of safemode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4832) Namenode doesn't change the number of missing blocks in safemode when DNs rejoin or leave

2013-06-07 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-4832:
---

Attachment: HDFS-4832.branch-0.23.patch

The patch ported to trunk

 Namenode doesn't change the number of missing blocks in safemode when DNs 
 rejoin or leave
 -

 Key: HDFS-4832
 URL: https://issues.apache.org/jira/browse/HDFS-4832
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.7, 2.1.0-beta
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Priority: Critical
 Attachments: HDFS-4832.branch-0.23.patch, HDFS-4832.patch, 
 HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch


 Courtesy Karri VRK Reddy!
 {quote}
 1. Namenode lost datanodes causing missing blocks
 2. Namenode was put in safe mode
 3. Datanode restarted on dead nodes 
 4. Waited for lots of time for the NN UI to reflect the recovered blocks.
 5. Forced NN out of safe mode and suddenly,  no more missing blocks anymore.
 {quote}
 I was able to replicate this on 0.23 and trunk. I set 
 dfs.namenode.heartbeat.recheck-interval to 1 and killed the DN to simulate a 
 lost datanode. The opposite case also has problems: a datanode failing while 
 the NN is in safemode doesn't lead to a missing blocks message.
 Without the NN updating this list of missing blocks, the grid admins will not 
 know when to take the cluster out of safemode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4832) Namenode doesn't change the number of missing blocks in safemode when DNs rejoin or leave

2013-06-07 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-4832:
---

Status: Patch Available  (was: Open)

 Namenode doesn't change the number of missing blocks in safemode when DNs 
 rejoin or leave
 -

 Key: HDFS-4832
 URL: https://issues.apache.org/jira/browse/HDFS-4832
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.23.7, 3.0.0, 2.1.0-beta
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Priority: Critical
 Attachments: HDFS-4832.branch-0.23.patch, HDFS-4832.patch, 
 HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch


 Courtesy Karri VRK Reddy!
 {quote}
 1. Namenode lost datanodes causing missing blocks
 2. Namenode was put in safe mode
 3. Datanode restarted on dead nodes 
 4. Waited for lots of time for the NN UI to reflect the recovered blocks.
 5. Forced NN out of safe mode and suddenly,  no more missing blocks anymore.
 {quote}
 I was able to replicate this on 0.23 and trunk. I set 
 dfs.namenode.heartbeat.recheck-interval to 1 and killed the DN to simulate a 
 lost datanode. The opposite case also has problems: a datanode failing while 
 the NN is in safemode doesn't lead to a missing blocks message.
 Without the NN updating this list of missing blocks, the grid admins will not 
 know when to take the cluster out of safemode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4832) Namenode doesn't change the number of missing blocks in safemode when DNs rejoin or leave

2013-06-07 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-4832:
---

Attachment: HDFS-4832.patch

The patch for trunk and branch-2

 Namenode doesn't change the number of missing blocks in safemode when DNs 
 rejoin or leave
 -

 Key: HDFS-4832
 URL: https://issues.apache.org/jira/browse/HDFS-4832
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.7, 2.1.0-beta
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Priority: Critical
 Attachments: HDFS-4832.branch-0.23.patch, HDFS-4832.patch, 
 HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch


 Courtesy Karri VRK Reddy!
 {quote}
 1. Namenode lost datanodes causing missing blocks
 2. Namenode was put in safe mode
 3. Datanode restarted on dead nodes 
 4. Waited for lots of time for the NN UI to reflect the recovered blocks.
 5. Forced NN out of safe mode and suddenly,  no more missing blocks anymore.
 {quote}
 I was able to replicate this on 0.23 and trunk. I set 
 dfs.namenode.heartbeat.recheck-interval to 1 and killed the DN to simulate a 
 lost datanode. The opposite case also has problems: a datanode failing while 
 the NN is in safemode doesn't lead to a missing blocks message.
 Without the NN updating this list of missing blocks, the grid admins will not 
 know when to take the cluster out of safemode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4882) Namenode LeaseManager checkLeases() runs into infinite loop

2013-06-07 Thread Zesheng Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zesheng Wu updated HDFS-4882:
-

Attachment: 4882.patch

The main consideration of the patch:
When a DN in the pipeline goes down and the pipeline stage is PIPELINE_CLOSE, the 
client triggers the data replication itself rather than waiting for the NN to do it 
(the NN needs the file to be finalized before it will replicate, but finalizing 
requires all blocks to have at least dfs.namenode.replication.min (=2) replicas, so 
the two conditions contradict each other).
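In other words, the idea is roughly the following on the client side (a simplified 
sketch only; these method and field names are illustrative, not the actual 
DFSOutputStream code):
{code}
// Illustrative pipeline-recovery logic when the stage is PIPELINE_CLOSE.
private void recoverPipeline() throws IOException {
  DatanodeInfo[] alive = removeDeadNodes(pipelineNodes);
  if (stage == BlockConstructionStage.PIPELINE_CLOSE && alive.length < minReplication) {
    // Today the client does not add a replacement DN in PIPELINE_CLOSE and leaves
    // re-replication to the NN; but the NN will not replicate until the file is
    // finalized, and finalization needs >= dfs.namenode.replication.min replicas.
    // The patch's idea: have the client copy the finished block to a new DN itself.
    DatanodeInfo target = askNamenodeForAdditionalDatanode();
    transferBlock(alive[0], target);
    alive = append(alive, target);
  }
  setupPipeline(alive);
}
{code}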

 Namenode LeaseManager checkLeases() runs into infinite loop
 ---

 Key: HDFS-4882
 URL: https://issues.apache.org/jira/browse/HDFS-4882
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, namenode
Affects Versions: 2.0.0-alpha
Reporter: Zesheng Wu
 Attachments: 4882.1.patch, 4882.patch


 Scenario:
 1. cluster with 4 DNs
 2. the size of the file to be written is a little more than one block
 3. write the first block to 3 DNs, DN1->DN2->DN3
 4. all the data packets of the first block are successfully acked and the client 
 sets the pipeline stage to PIPELINE_CLOSE, but the last packet isn't sent out
 5. DN2 and DN3 go down
 6. the client recovers the pipeline, but no new DN is added to the pipeline 
 because the current pipeline stage is PIPELINE_CLOSE
 7. the client keeps writing the last block and tries to close the file after 
 writing all the data
 8. NN finds that the penultimate block doesn't have enough replicas (our 
 dfs.namenode.replication.min=2), so the client's close runs into an indefinite 
 loop (HDFS-2936), and at the same time the NN marks the last block's state as 
 COMPLETE
 9. shut down the client
 10. the file's lease exceeds the hard limit
 11. the LeaseManager notices this and begins lease recovery by calling 
 fsnamesystem.internalReleaseLease()
 12. but the last block's state is COMPLETE, which triggers the lease manager's 
 infinite loop and prints massive logs like this:
 {noformat}
 2013-06-05,17:42:25,695 INFO 
 org.apache.hadoop.hdfs.server.namenode.LeaseManager: Lease [Lease.  Holder: 
 DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1] has expired hard
  limit
 2013-06-05,17:42:25,695 INFO 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering lease=[Lease. 
  Holder: DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1], src=
 /user/h_wuzesheng/test.dat
 2013-06-05,17:42:25,695 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
 NameSystem.internalReleaseLease: File = /user/h_wuzesheng/test.dat, block 
 blk_-7028017402720175688_1202597,
 lastBLockState=COMPLETE
 2013-06-05,17:42:25,695 INFO 
 org.apache.hadoop.hdfs.server.namenode.LeaseManager: Started block recovery 
 for file /user/h_wuzesheng/test.dat lease [Lease.  Holder: DFSClient_NONM
 APREDUCE_-1252656407_1, pendingcreates: 1]
 {noformat}
 (the 3rd line log is a debug log added by us)
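 For context, the hard-limit check has roughly this shape (heavily simplified, not 
 the real LeaseManager source): if internalReleaseLease() neither closes the file nor 
 reassigns the lease, the same lease stays at the head of sortedLeases and the loop 
 spins forever.
 {code}
 // Simplified sketch of the shape of the loop, not the actual implementation.
 synchronized void checkLeases() {
   while (!sortedLeases.isEmpty() && sortedLeases.first().expiredHardLimit()) {
     Lease oldest = sortedLeases.first();
     for (String path : oldest.getPaths()) {
       // Expected to either close the file or start block recovery and move the
       // lease to a new holder. When the last block is already COMPLETE but the
       // file stays under construction, neither happens and the lease is never
       // removed, so the outer while loop keeps re-processing the same entry.
       namesystem.internalReleaseLease(oldest, path, null);
     }
   }
 }
 {code}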

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4882) Namenode LeaseManager checkLeases() runs into infinite loop

2013-06-07 Thread Zesheng Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zesheng Wu updated HDFS-4882:
-

Status: Patch Available  (was: Open)

 Namenode LeaseManager checkLeases() runs into infinite loop
 ---

 Key: HDFS-4882
 URL: https://issues.apache.org/jira/browse/HDFS-4882
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, namenode
Affects Versions: 2.0.0-alpha
Reporter: Zesheng Wu
 Attachments: 4882.1.patch, 4882.patch


 Scenario:
 1. cluster with 4 DNs
 2. the size of the file to be written is a little more than one block
 3. write the first block to 3 DNs, DN1->DN2->DN3
 4. all the data packets of the first block are successfully acked and the client 
 sets the pipeline stage to PIPELINE_CLOSE, but the last packet isn't sent out
 5. DN2 and DN3 go down
 6. the client recovers the pipeline, but no new DN is added to the pipeline 
 because the current pipeline stage is PIPELINE_CLOSE
 7. the client keeps writing the last block and tries to close the file after 
 writing all the data
 8. NN finds that the penultimate block doesn't have enough replicas (our 
 dfs.namenode.replication.min=2), so the client's close runs into an indefinite 
 loop (HDFS-2936), and at the same time the NN marks the last block's state as 
 COMPLETE
 9. shut down the client
 10. the file's lease exceeds the hard limit
 11. the LeaseManager notices this and begins lease recovery by calling 
 fsnamesystem.internalReleaseLease()
 12. but the last block's state is COMPLETE, which triggers the lease manager's 
 infinite loop and prints massive logs like this:
 {noformat}
 2013-06-05,17:42:25,695 INFO 
 org.apache.hadoop.hdfs.server.namenode.LeaseManager: Lease [Lease.  Holder: 
 DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1] has expired hard
  limit
 2013-06-05,17:42:25,695 INFO 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering lease=[Lease. 
  Holder: DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1], src=
 /user/h_wuzesheng/test.dat
 2013-06-05,17:42:25,695 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
 NameSystem.internalReleaseLease: File = /user/h_wuzesheng/test.dat, block 
 blk_-7028017402720175688_1202597,
 lastBLockState=COMPLETE
 2013-06-05,17:42:25,695 INFO 
 org.apache.hadoop.hdfs.server.namenode.LeaseManager: Started block recovery 
 for file /user/h_wuzesheng/test.dat lease [Lease.  Holder: DFSClient_NONM
 APREDUCE_-1252656407_1, pendingcreates: 1]
 {noformat}
 (the 3rd line log is a debug log added by us)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3447) StandbyException should not be logged at ERROR level on server

2013-06-07 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HDFS-3447:
--

Labels:   (was: newbie)

 StandbyException should not be logged at ERROR level on server
 --

 Key: HDFS-3447
 URL: https://issues.apache.org/jira/browse/HDFS-3447
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha
Affects Versions: 2.0.0-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor

 Currently, the standby NN will log StandbyExceptions at ERROR level any time 
 a client tries to connect to it. So, if the second NN in an HA pair is 
 active, the first NN will spew a lot of these errors in the log, as each 
 client gets redirected to the proper NN. Instead, this should be at INFO 
 level, and should probably be logged in a less scary manner (e.g. "Received 
 READ request from client 1.2.3.4, but in Standby state. Redirecting client to 
 the other NameNode.")
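 A minimal sketch of the kind of change being asked for (illustrative only; the names 
 here are not the real ipc.Server internals, and the actual fix may instead just mark 
 StandbyException as a terse exception):
 {code}
 try {
   processCall(call);
 } catch (StandbyException se) {
   // Expected whenever a client probes the standby NN during failover: log quietly.
   LOG.info("Received " + call.getMethodName() + " from " + clientAddr
       + ", but in Standby state. Redirecting client to the other NameNode.");
   throw se;
 } catch (IOException ioe) {
   // Everything else is still worth an ERROR with the stack trace.
   LOG.error("Error processing " + call.getMethodName() + " from " + clientAddr, ioe);
   throw ioe;
 }
 {code}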

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4873) callGetBlockLocations returns incorrect number of blocks for snapshotted files

2013-06-07 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677879#comment-13677879
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-4873:
--

If a file is under construction when a snapshot is taken, should we keep it as 
an under construction file or a closed file?

 callGetBlockLocations returns incorrect number of blocks for snapshotted files
 --

 Key: HDFS-4873
 URL: https://issues.apache.org/jira/browse/HDFS-4873
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Hari Mankude
Assignee: Jing Zhao
 Attachments: HDFS-4873.001.patch


 callGetBlockLocations() returns all the blocks of a file even when they are 
 not present in the snap version

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-4895) Add a -fixmisplaced function to fsck for fixing misplaced blocks

2013-06-07 Thread Junping Du (JIRA)
Junping Du created HDFS-4895:


 Summary: Add a -fixmisplaced function to fsck for fixing 
misplaced blocks
 Key: HDFS-4895
 URL: https://issues.apache.org/jira/browse/HDFS-4895
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Junping Du
Assignee: Junping Du




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4885) Update verifyBlockPlacement() API in BlockPlacementPolicy

2013-06-07 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677916#comment-13677916
 ] 

Junping Du commented on HDFS-4885:
--

As the replica number is a per-file/directory property, and minRacks is decided by 
that property, I think we should still keep minRacks as an input parameter.

 Update verifyBlockPlacement() API in BlockPlacementPolicy
 -

 Key: HDFS-4885
 URL: https://issues.apache.org/jira/browse/HDFS-4885
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Junping Du
Assignee: Junping Du

 verifyBlockPlacement() has an unused parameter, srcPath, since its responsibility 
 is just to verify a single block rather than the files under a specific path. Also, 
 the int return value does not make sense, as block placement can be violated for 
 reasons other than the number of racks, so a boolean return value would be better.
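 Concretely, the change being discussed is roughly this signature change in 
 BlockPlacementPolicy (a sketch; the attached patch may differ in details):
 {code}
 // Current shape (simplified): srcPath is unused and the int only counts racks.
 abstract public int verifyBlockPlacement(String srcPath, LocatedBlock lBlk, int minRacks);

 // Proposed shape per this JIRA: drop srcPath, report pass/fail as a boolean.
 abstract public boolean verifyBlockPlacement(LocatedBlock lBlk, int minRacks);
 {code}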

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4885) Update verifyBlockPlacement() API in BlockPlacementPolicy

2013-06-07 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-4885:
-

Attachment: HDFS-4885.patch

 Update verifyBlockPlacement() API in BlockPlacementPolicy
 -

 Key: HDFS-4885
 URL: https://issues.apache.org/jira/browse/HDFS-4885
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Junping Du
Assignee: Junping Du
 Attachments: HDFS-4885.patch


 verifyBlockPlacement() has an unused parameter, srcPath, since its responsibility 
 is just to verify a single block rather than the files under a specific path. Also, 
 the int return value does not make sense, as block placement can be violated for 
 reasons other than the number of racks, so a boolean return value would be better.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4885) Update verifyBlockPlacement() API in BlockPlacementPolicy

2013-06-07 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-4885:
-

  Labels: BlockPlacementPolicy  (was: )
Target Version/s: 3.0.0
  Status: Patch Available  (was: Open)

 Update verifyBlockPlacement() API in BlockPlacementPolicy
 -

 Key: HDFS-4885
 URL: https://issues.apache.org/jira/browse/HDFS-4885
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Junping Du
Assignee: Junping Du
  Labels: BlockPlacementPolicy
 Attachments: HDFS-4885.patch


 verifyBlockPlacement() has an unused parameter, srcPath, since its responsibility 
 is just to verify a single block rather than the files under a specific path. Also, 
 the int return value does not make sense, as block placement can be violated for 
 reasons other than the number of racks, so a boolean return value would be better.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4882) Namenode LeaseManager checkLeases() runs into infinite loop

2013-06-07 Thread Zesheng Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zesheng Wu updated HDFS-4882:
-

Attachment: 4882.patch

 Namenode LeaseManager checkLeases() runs into infinite loop
 ---

 Key: HDFS-4882
 URL: https://issues.apache.org/jira/browse/HDFS-4882
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, namenode
Affects Versions: 2.0.0-alpha
Reporter: Zesheng Wu
 Attachments: 4882.1.patch, 4882.patch, 4882.patch


 Scenario:
 1. cluster with 4 DNs
 2. the size of the file to be written is a little more than one block
 3. write the first block to 3 DNs, DN1->DN2->DN3
 4. all the data packets of the first block are successfully acked and the client 
 sets the pipeline stage to PIPELINE_CLOSE, but the last packet isn't sent out
 5. DN2 and DN3 go down
 6. the client recovers the pipeline, but no new DN is added to the pipeline 
 because the current pipeline stage is PIPELINE_CLOSE
 7. the client keeps writing the last block and tries to close the file after 
 writing all the data
 8. NN finds that the penultimate block doesn't have enough replicas (our 
 dfs.namenode.replication.min=2), so the client's close runs into an indefinite 
 loop (HDFS-2936), and at the same time the NN marks the last block's state as 
 COMPLETE
 9. shut down the client
 10. the file's lease exceeds the hard limit
 11. the LeaseManager notices this and begins lease recovery by calling 
 fsnamesystem.internalReleaseLease()
 12. but the last block's state is COMPLETE, which triggers the lease manager's 
 infinite loop and prints massive logs like this:
 {noformat}
 2013-06-05,17:42:25,695 INFO 
 org.apache.hadoop.hdfs.server.namenode.LeaseManager: Lease [Lease.  Holder: 
 DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1] has expired hard
  limit
 2013-06-05,17:42:25,695 INFO 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering lease=[Lease. 
  Holder: DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1], src=
 /user/h_wuzesheng/test.dat
 2013-06-05,17:42:25,695 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
 NameSystem.internalReleaseLease: File = /user/h_wuzesheng/test.dat, block 
 blk_-7028017402720175688_1202597,
 lastBLockState=COMPLETE
 2013-06-05,17:42:25,695 INFO 
 org.apache.hadoop.hdfs.server.namenode.LeaseManager: Started block recovery 
 for file /user/h_wuzesheng/test.dat lease [Lease.  Holder: DFSClient_NONM
 APREDUCE_-1252656407_1, pendingcreates: 1]
 {noformat}
 (the 3rd line log is a debug log added by us)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4832) Namenode doesn't change the number of missing blocks in safemode when DNs rejoin or leave

2013-06-07 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-4832:
---

Status: Open  (was: Patch Available)

 Namenode doesn't change the number of missing blocks in safemode when DNs 
 rejoin or leave
 -

 Key: HDFS-4832
 URL: https://issues.apache.org/jira/browse/HDFS-4832
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.23.7, 3.0.0, 2.1.0-beta
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Priority: Critical
 Attachments: HDFS-4832.branch-0.23.patch, HDFS-4832.patch, 
 HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch


 Courtesy Karri VRK Reddy!
 {quote}
 1. Namenode lost datanodes causing missing blocks
 2. Namenode was put in safe mode
 3. Datanode restarted on dead nodes 
 4. Waited for lots of time for the NN UI to reflect the recovered blocks.
 5. Forced NN out of safe mode and suddenly,  no more missing blocks anymore.
 {quote}
 I was able to replicate this on 0.23 and trunk. I set 
 dfs.namenode.heartbeat.recheck-interval to 1 and killed the DN to simulate a 
 lost datanode. The opposite case also has problems: a datanode failing while 
 the NN is in safemode doesn't lead to a missing blocks message.
 Without the NN updating this list of missing blocks, the grid admins will not 
 know when to take the cluster out of safemode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4832) Namenode doesn't change the number of missing blocks in safemode when DNs rejoin or leave

2013-06-07 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-4832:
---

Status: Patch Available  (was: Open)

 Namenode doesn't change the number of missing blocks in safemode when DNs 
 rejoin or leave
 -

 Key: HDFS-4832
 URL: https://issues.apache.org/jira/browse/HDFS-4832
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.23.7, 3.0.0, 2.1.0-beta
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Priority: Critical
 Attachments: HDFS-4832.branch-0.23.patch, HDFS-4832.patch, 
 HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch


 Courtesy Karri VRK Reddy!
 {quote}
 1. Namenode lost datanodes causing missing blocks
 2. Namenode was put in safe mode
 3. Datanode restarted on dead nodes 
 4. Waited for lots of time for the NN UI to reflect the recovered blocks.
 5. Forced NN out of safe mode and suddenly,  no more missing blocks anymore.
 {quote}
 I was able to replicate this on 0.23 and trunk. I set 
 dfs.namenode.heartbeat.recheck-interval to 1 and killed the DN to simulate a 
 lost datanode. The opposite case also has problems: a datanode failing while 
 the NN is in safemode doesn't lead to a missing blocks message.
 Without the NN updating this list of missing blocks, the grid admins will not 
 know when to take the cluster out of safemode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3905) Secure cluster cannot use hftp to an insecure cluster

2013-06-07 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678033#comment-13678033
 ] 

Suresh Srinivas commented on HDFS-3905:
---

Is this change not needed for trunk and branch-2?

 Secure cluster cannot use hftp to an insecure cluster
 -

 Key: HDFS-3905
 URL: https://issues.apache.org/jira/browse/HDFS-3905
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, security
Affects Versions: 0.23.3
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Critical
 Fix For: 0.23.5

 Attachments: HDFS-3905.patch


 HDFS-3873 fixed the case where all exceptions acquiring tokens for hftp were 
 ignored.  Jobs would be submitted sans tokens, and then the tasks would 
 eventually all fail trying to get the missing token.  HDFS-3873 made jobs 
 fail to submit if the remote cluster is secure.
 Unfortunately it regressed the ability for a secure cluster to access an 
 insecure cluster over hftp.  The issue is unique to 23 due to KSSL.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3905) Secure cluster cannot use hftp to an insecure cluster

2013-06-07 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678043#comment-13678043
 ] 

Daryn Sharp commented on HDFS-3905:
---

I don't think so.  The issue is specific to KSSL, which I don't think 
trunk/branch-2 supports?

 Secure cluster cannot use hftp to an insecure cluster
 -

 Key: HDFS-3905
 URL: https://issues.apache.org/jira/browse/HDFS-3905
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, security
Affects Versions: 0.23.3
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Critical
 Fix For: 0.23.5

 Attachments: HDFS-3905.patch


 HDFS-3873 fixed the case where all exceptions acquiring tokens for hftp were 
 ignored.  Jobs would be submitted sans tokens, and then the tasks would 
 eventually all fail trying to get the missing token.  HDFS-3873 made jobs 
 fail to submit if the remote cluster is secure.
 Unfortunately it regressed the ability for a secure cluster to access an 
 insecure cluster over hftp.  The issue is unique to 23 due to KSSL.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4832) Namenode doesn't change the number of missing blocks in safemode when DNs rejoin or leave

2013-06-07 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-4832:
---

Attachment: HDFS-4832.patch

Y u no test my patch Hadoop QA?

Uploading the same patch. Maybe this time it will get picked up

 Namenode doesn't change the number of missing blocks in safemode when DNs 
 rejoin or leave
 -

 Key: HDFS-4832
 URL: https://issues.apache.org/jira/browse/HDFS-4832
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.7, 2.1.0-beta
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Priority: Critical
 Attachments: HDFS-4832.branch-0.23.patch, HDFS-4832.patch, 
 HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch, 
 HDFS-4832.patch


 Courtesy Karri VRK Reddy!
 {quote}
 1. Namenode lost datanodes causing missing blocks
 2. Namenode was put in safe mode
 3. Datanode restarted on dead nodes 
 4. Waited for lots of time for the NN UI to reflect the recovered blocks.
 5. Forced NN out of safe mode and suddenly,  no more missing blocks anymore.
 {quote}
 I was able to replicate this on 0.23 and trunk. I set 
 dfs.namenode.heartbeat.recheck-interval to 1 and killed the DN to simulate a 
 lost datanode. The opposite case also has problems: a datanode failing while 
 the NN is in safemode doesn't lead to a missing blocks message.
 Without the NN updating this list of missing blocks, the grid admins will not 
 know when to take the cluster out of safemode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4878) On Remove Block, Block is not Removed from neededReplications queue

2013-06-07 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678092#comment-13678092
 ] 

Ravi Prakash commented on HDFS-4878:


Do we need to call decrementReplicationIndex() after the remove? I am not sure 
I understand the whole mechanism, but could you please check?

 On Remove Block, Block is not Removed from neededReplications queue
 ---

 Key: HDFS-4878
 URL: https://issues.apache.org/jira/browse/HDFS-4878
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0
Reporter: Tao Luo
Assignee: Tao Luo
 Fix For: 3.0.0

 Attachments: HDFS-4878_branch2.patch, HDFS-4878.patch, HDFS-4878.patch


 Remove block removes the specified block from pendingReplications, but not 
 from neededReplications queue. 
 The fix would be to remove from neededReplications as well.
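 Conceptually the fix is a one-liner along these lines (a sketch with an assumed 
 method name; see the attached patches for the real change):
 {code}
 // Sketch: when a block is removed (e.g. its file is deleted), drop it from both queues.
 void removeBlockFromReplicationQueues(Block block) {
   pendingReplications.remove(block);
   // UnderReplicatedBlocks.LEVEL as the priority hint makes remove(Block, int)
   // search every priority queue for the block.
   neededReplications.remove(block, UnderReplicatedBlocks.LEVEL);
 }
 {code}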

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4880) Diagnostic logging while loading name/edits files

2013-06-07 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678121#comment-13678121
 ] 

Arpit Agarwal commented on HDFS-4880:
-

No new tests should be necessary since this change just adds some logging.

 Diagnostic logging while loading name/edits files
 -

 Key: HDFS-4880
 URL: https://issues.apache.org/jira/browse/HDFS-4880
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0, 1.3.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Attachments: HDFS-4880.branch-1.patch, HDFS-4880.branch-1.patch, 
 HDFS-4880.trunk.patch


 Add some minimal diagnostic logging to help determine location of the files 
 being loaded.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4880) Diagnostic logging while loading name/edits files

2013-06-07 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678183#comment-13678183
 ] 

Suresh Srinivas commented on HDFS-4880:
---

+1 for the trunk patch. This is very useful information for checking where the 
Namenode loaded the image and edits from.

 Diagnostic logging while loading name/edits files
 -

 Key: HDFS-4880
 URL: https://issues.apache.org/jira/browse/HDFS-4880
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0, 1.3.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Attachments: HDFS-4880.branch-1.patch, HDFS-4880.branch-1.patch, 
 HDFS-4880.trunk.patch


 Add some minimal diagnostic logging to help determine location of the files 
 being loaded.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4880) Diagnostic logging while loading name/edits files

2013-06-07 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678184#comment-13678184
 ] 

Suresh Srinivas commented on HDFS-4880:
---

+1 for the branch-1 patch as well.

 Diagnostic logging while loading name/edits files
 -

 Key: HDFS-4880
 URL: https://issues.apache.org/jira/browse/HDFS-4880
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0, 1.3.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Attachments: HDFS-4880.branch-1.patch, HDFS-4880.branch-1.patch, 
 HDFS-4880.trunk.patch


 Add some minimal diagnostic logging to help determine location of the files 
 being loaded.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HDFS-4880) Diagnostic logging while loading name/edits files

2013-06-07 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas reassigned HDFS-4880:
-

Assignee: Suresh Srinivas  (was: Arpit Agarwal)

 Diagnostic logging while loading name/edits files
 -

 Key: HDFS-4880
 URL: https://issues.apache.org/jira/browse/HDFS-4880
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0, 1.3.0
Reporter: Arpit Agarwal
Assignee: Suresh Srinivas
 Attachments: HDFS-4880.branch-1.patch, HDFS-4880.branch-1.patch, 
 HDFS-4880.trunk.patch


 Add some minimal diagnostic logging to help determine location of the files 
 being loaded.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4880) Diagnostic logging while loading name/edits files

2013-06-07 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-4880:
--

   Resolution: Fixed
Fix Version/s: 1.3.0
   2.1.0-beta
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I have committed the patch to trunk, branch-2, branch-2.1 and branch-1. 

Thank you Arpit!

 Diagnostic logging while loading name/edits files
 -

 Key: HDFS-4880
 URL: https://issues.apache.org/jira/browse/HDFS-4880
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0, 1.3.0
Reporter: Arpit Agarwal
Assignee: Suresh Srinivas
 Fix For: 2.1.0-beta, 1.3.0

 Attachments: HDFS-4880.branch-1.patch, HDFS-4880.branch-1.patch, 
 HDFS-4880.trunk.patch


 Add some minimal diagnostic logging to help determine location of the files 
 being loaded.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4878) On Remove Block, Block is not Removed from neededReplications queue

2013-06-07 Thread Plamen Jeliazkov (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678283#comment-13678283
 ] 

Plamen Jeliazkov commented on HDFS-4878:


The only place where the replicationIndex is used is for this segment in 
computeUnderReplicatedBlocks(int).
{code}
Integer replIndex = priorityToReplIdx.get(priority);

// skip to the first unprocessed block, which is at replIndex
for (int i = 0; i < replIndex && neededReplicationsIterator.hasNext(); i++) {
  neededReplicationsIterator.next();
}
{code}

Because our condition includes .hasNext() we will stop before trying to go 
outside our iterator. So we are safe with this change.

The reason we cannot decrementReplicationIndex(), or why it would be difficult to, 
is that we cannot guarantee we actually removed the block from any SPECIFIC priority 
within neededReplications. We do not know what the priority level of this block is 
within neededReplications when we call BlockManager.removeBlock(Block). If you look 
at remove(Block, int) in UnderReplicatedBlocks, it tries to remove the block from the 
given priority's queue first, and if that fails, it tries to remove it from all the 
queues. Therefore we cannot determine WHICH replicationIndex to decrement. However, 
as noted above, we are safeguarded by the iterator nonetheless.

Now with that being said, I can understand from a consistency point of view why we 
would want to decrement the replicationIndex. However, in order to do it, we will 
need to add a method to UnderReplicatedBlocks that attempts to remove the block from 
a given priority and returns true ONLY if it actually removed it from that priority; 
then we can safely decrement the replicationIndex of that priority.
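Something along those lines would look like this (a hypothetical helper; the method 
name and field are assumptions, not existing code):
{code}
/**
 * Hypothetical UnderReplicatedBlocks helper: remove the block from ONE given
 * priority queue and report whether that specific queue actually contained it,
 * so the caller can safely decrement the replication index of that priority.
 */
synchronized boolean removeFromPriority(Block block, int priority) {
  if (priority < 0 || priority >= LEVEL) {
    return false;
  }
  return priorityQueues.get(priority).remove(block);
}
{code}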

 On Remove Block, Block is not Removed from neededReplications queue
 ---

 Key: HDFS-4878
 URL: https://issues.apache.org/jira/browse/HDFS-4878
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0
Reporter: Tao Luo
Assignee: Tao Luo
 Fix For: 3.0.0

 Attachments: HDFS-4878_branch2.patch, HDFS-4878.patch, HDFS-4878.patch


 Remove block removes the specified block from pendingReplications, but not 
 from neededReplications queue. 
 The fix would be to remove from neededReplications as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4873) callGetBlockLocations returns incorrect number of blocks for snapshotted files

2013-06-07 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678286#comment-13678286
 ] 

Jing Zhao commented on HDFS-4873:
-

Thanks for the comments Nicholas!

Yeah, currently we record whether the file was under construction when the snapshot 
was taken. However, maybe we should still return the status of a snapshotted file as 
a closed file here, since our snapshots are read-only?

 callGetBlockLocations returns incorrect number of blocks for snapshotted files
 --

 Key: HDFS-4873
 URL: https://issues.apache.org/jira/browse/HDFS-4873
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Hari Mankude
Assignee: Jing Zhao
 Attachments: HDFS-4873.001.patch


 callGetBlockLocations() returns all the blocks of a file even when they are 
 not present in the snap version

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3905) Secure cluster cannot use hftp to an insecure cluster

2013-06-07 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678287#comment-13678287
 ] 

Suresh Srinivas commented on HDFS-3905:
---

Thanks Daryn.

 Secure cluster cannot use hftp to an insecure cluster
 -

 Key: HDFS-3905
 URL: https://issues.apache.org/jira/browse/HDFS-3905
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, security
Affects Versions: 0.23.3
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Critical
 Fix For: 0.23.5

 Attachments: HDFS-3905.patch


 HDFS-3873 fixed the case where all exceptions acquiring tokens for hftp were 
 ignored.  Jobs would be submitted sans tokens, and then the tasks would 
 eventually all fail trying to get the missing token.  HDFS-3873 made jobs 
 fail to submit if the remote cluster is secure.
 Unfortunately it regressed the ability for a secure cluster to access an 
 insecure cluster over hftp.  The issue is unique to 23 due to KSSL.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4866) Protocol buffer support cannot compile under C

2013-06-07 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678291#comment-13678291
 ] 

Suresh Srinivas commented on HDFS-4866:
---

This is a blocker for 2.1.0-beta.

 Protocol buffer support cannot compile under C
 --

 Key: HDFS-4866
 URL: https://issues.apache.org/jira/browse/HDFS-4866
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.0.4-alpha
Reporter: Ralph Castain
Assignee: Arpit Agarwal
 Attachments: NamenodeProtocol.pb-c.c, NamenodeProtocol.pb-c.h, 
 pcreate.pl


 When compiling Hadoop's .proto descriptions for use in C, an error occurs 
 because one of the RPCs in NamenodeProtocol.proto is named "register". This 
 name is a reserved word in languages such as C. When using the Java and C++ 
 languages, the name is hidden inside a class and therefore doesn't cause an 
 error. Unfortunately, that is not the case in non-class languages such as C.
 Note: generating the C translation of the .proto files requires installing 
 the protobuf-c package from Google:
 http://code.google.com/p/protobuf-c/

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4866) Protocol buffer support cannot compile under C

2013-06-07 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-4866:
--

 Priority: Blocker  (was: Major)
Affects Version/s: (was: 2.0.4-alpha)
   2.1.0-beta

 Protocol buffer support cannot compile under C
 --

 Key: HDFS-4866
 URL: https://issues.apache.org/jira/browse/HDFS-4866
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Ralph Castain
Assignee: Arpit Agarwal
Priority: Blocker
 Attachments: NamenodeProtocol.pb-c.c, NamenodeProtocol.pb-c.h, 
 pcreate.pl


 When compiling Hadoop's .proto descriptions for use in C, an error occurs 
 because one of the RPCs in NamenodeProtocol.proto is named "register". This 
 name is a reserved word in languages such as C. When using the Java and C++ 
 languages, the name is hidden inside a class and therefore doesn't cause an 
 error. Unfortunately, that is not the case in non-class languages such as C.
 Note: generating the C translation of the .proto files requires installing 
 the protobuf-c package from Google:
 http://code.google.com/p/protobuf-c/

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4882) Namenode LeaseManager checkLeases() runs into infinite loop

2013-06-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678330#comment-13678330
 ] 

Hadoop QA commented on HDFS-4882:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12586700/4882.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/4497//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4497//console

This message is automatically generated.

 Namenode LeaseManager checkLeases() runs into infinite loop
 ---

 Key: HDFS-4882
 URL: https://issues.apache.org/jira/browse/HDFS-4882
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, namenode
Affects Versions: 2.0.0-alpha
Reporter: Zesheng Wu
 Attachments: 4882.1.patch, 4882.patch, 4882.patch


 Scenario:
 1. cluster with 4 DNs
 2. the size of the file to be written is a little more than one block
 3. write the first block to 3 DNs, DN1->DN2->DN3
 4. all the data packets of the first block are successfully acked and the client 
 sets the pipeline stage to PIPELINE_CLOSE, but the last packet isn't sent out
 5. DN2 and DN3 go down
 6. the client recovers the pipeline, but no new DN is added to the pipeline 
 because the current pipeline stage is PIPELINE_CLOSE
 7. the client keeps writing the last block and tries to close the file after 
 writing all the data
 8. NN finds that the penultimate block doesn't have enough replicas (our 
 dfs.namenode.replication.min=2), so the client's close runs into an indefinite 
 loop (HDFS-2936), and at the same time the NN marks the last block's state as 
 COMPLETE
 9. shut down the client
 10. the file's lease exceeds the hard limit
 11. the LeaseManager notices this and begins lease recovery by calling 
 fsnamesystem.internalReleaseLease()
 12. but the last block's state is COMPLETE, which triggers the lease manager's 
 infinite loop and prints massive logs like this:
 {noformat}
 2013-06-05,17:42:25,695 INFO 
 org.apache.hadoop.hdfs.server.namenode.LeaseManager: Lease [Lease.  Holder: 
 DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1] has expired hard
  limit
 2013-06-05,17:42:25,695 INFO 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering lease=[Lease. 
  Holder: DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1], src=
 /user/h_wuzesheng/test.dat
 2013-06-05,17:42:25,695 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
 NameSystem.internalReleaseLease: File = /user/h_wuzesheng/test.dat, block 
 blk_-7028017402720175688_1202597,
 lastBLockState=COMPLETE
 2013-06-05,17:42:25,695 INFO 
 org.apache.hadoop.hdfs.server.namenode.LeaseManager: Started block recovery 
 for file /user/h_wuzesheng/test.dat lease [Lease.  Holder: DFSClient_NONM
 APREDUCE_-1252656407_1, pendingcreates: 1]
 {noformat}
 (the 3rd line log is a debug log added by us)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4885) Update verifyBlockPlacement() API in BlockPlacementPolicy

2013-06-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678333#comment-13678333
 ] 

Hadoop QA commented on HDFS-4885:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12586691/HDFS-4885.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/4496//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4496//console

This message is automatically generated.

 Update verifyBlockPlacement() API in BlockPlacementPolicy
 -

 Key: HDFS-4885
 URL: https://issues.apache.org/jira/browse/HDFS-4885
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Junping Du
Assignee: Junping Du
  Labels: BlockPlacementPolicy
 Attachments: HDFS-4885.patch


 verifyBlockPlacement() has an unused parameter, srcPath, since its responsibility 
 is just to verify a single block rather than the files under a specific path. Also, 
 the int return value does not make sense, as block placement can be violated for 
 reasons other than the number of racks, so a boolean return value would be better.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4832) Namenode doesn't change the number of missing blocks in safemode when DNs rejoin or leave

2013-06-07 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678334#comment-13678334
 ] 

Kihwal Lee commented on HDFS-4832:
--

Your precommit build is running right now. 
https://builds.apache.org/job/PreCommit-HDFS-Build/449

 Namenode doesn't change the number of missing blocks in safemode when DNs 
 rejoin or leave
 -

 Key: HDFS-4832
 URL: https://issues.apache.org/jira/browse/HDFS-4832
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.7, 2.1.0-beta
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Priority: Critical
 Attachments: HDFS-4832.branch-0.23.patch, HDFS-4832.patch, 
 HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch, 
 HDFS-4832.patch


 Courtesy Karri VRK Reddy!
 {quote}
 1. Namenode lost datanodes causing missing blocks
 2. Namenode was put in safe mode
 3. Datanode restarted on dead nodes 
 4. Waited for lots of time for the NN UI to reflect the recovered blocks.
 5. Forced NN out of safe mode and suddenly,  no more missing blocks anymore.
 {quote}
 I was able to replicate this on 0.23 and trunk. I set 
 dfs.namenode.heartbeat.recheck-interval to 1 and killed the DN to simulate a 
 lost datanode. The opposite case also has problems: a datanode failing while 
 the NN is in safemode doesn't lead to a missing blocks message.
 Without the NN updating this list of missing blocks, the grid admins will not 
 know when to take the cluster out of safemode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4832) Namenode doesn't change the number of missing blocks in safemode when DNs rejoin or leave

2013-06-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678335#comment-13678335
 ] 

Hadoop QA commented on HDFS-4832:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12586729/HDFS-4832.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/4498//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4498//console

This message is automatically generated.

 Namenode doesn't change the number of missing blocks in safemode when DNs 
 rejoin or leave
 -

 Key: HDFS-4832
 URL: https://issues.apache.org/jira/browse/HDFS-4832
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.7, 2.1.0-beta
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Priority: Critical
 Attachments: HDFS-4832.branch-0.23.patch, HDFS-4832.patch, 
 HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch, 
 HDFS-4832.patch


 Courtesy Karri VRK Reddy!
 {quote}
 1. Namenode lost datanodes causing missing blocks
 2. Namenode was put in safe mode
 3. Datanode restarted on dead nodes 
 4. Waited for lots of time for the NN UI to reflect the recovered blocks.
 5. Forced NN out of safe mode and suddenly,  no more missing blocks anymore.
 {quote}
 I was able to replicate this on 0.23 and trunk. I set 
 dfs.namenode.heartbeat.recheck-interval to 1 and killed the DN to simulate 
 lost datanode. The opposite case also has problems (i.e. Datanode failing 
 when NN is in safemode, doesn't lead to a missing blocks message)
 Without the NN updating this list of missing blocks, the grid admins will not 
 know when to take the cluster out of safemode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4880) Diagnostic logging while loading name/edits files

2013-06-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678349#comment-13678349
 ] 

Hudson commented on HDFS-4880:
--

Integrated in Hadoop-trunk-Commit #3880 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3880/])
HDFS-4880. Print the image and edits file loaded by the namenode in the 
logs. Contributed by Arpit Agarwal. (Revision 1490746)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1490746
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java


 Diagnostic logging while loading name/edits files
 -

 Key: HDFS-4880
 URL: https://issues.apache.org/jira/browse/HDFS-4880
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0, 1.3.0
Reporter: Arpit Agarwal
Assignee: Suresh Srinivas
 Fix For: 2.1.0-beta, 1.3.0

 Attachments: HDFS-4880.branch-1.patch, HDFS-4880.branch-1.patch, 
 HDFS-4880.trunk.patch


 Add some minimal diagnostic logging to help determine location of the files 
 being loaded.
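Purely as an illustration of the kind of logging this asks for, a hedged sketch follows; the helper name is hypothetical and this is not the actual FSImageFormat/FSEditLogLoader change.

{code}
import java.io.File;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

// Hypothetical sketch: log the exact on-disk location (and size) of the
// fsimage or edits file the namenode is about to load.
class ImageLoadLoggingSketch {
  private static final Log LOG = LogFactory.getLog(ImageLoadLoggingSketch.class);

  static void logLoadingFile(String kind, File f) {
    LOG.info("Loading " + kind + " file " + f.getAbsolutePath()
        + " of size " + f.length() + " bytes");
  }
}
{code}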

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4832) Namenode doesn't change the number of missing blocks in safemode when DNs rejoin or leave

2013-06-07 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678369#comment-13678369
 ] 

Kihwal Lee commented on HDFS-4832:
--

+1 the patch looks good.

 Namenode doesn't change the number of missing blocks in safemode when DNs 
 rejoin or leave
 -

 Key: HDFS-4832
 URL: https://issues.apache.org/jira/browse/HDFS-4832
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.7, 2.1.0-beta
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Priority: Critical
 Attachments: HDFS-4832.branch-0.23.patch, HDFS-4832.patch, 
 HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch, 
 HDFS-4832.patch


 Courtesy Karri VRK Reddy!
 {quote}
 1. Namenode lost datanodes causing missing blocks
 2. Namenode was put in safe mode
 3. Datanode restarted on dead nodes 
 4. Waited for lots of time for the NN UI to reflect the recovered blocks.
 5. Forced NN out of safe mode and suddenly,  no more missing blocks anymore.
 {quote}
 I was able to replicate this on 0.23 and trunk. I set 
 dfs.namenode.heartbeat.recheck-interval to 1 and killed the DN to simulate 
 lost datanode. The opposite case also has problems (i.e. Datanode failing 
 when NN is in safemode, doesn't lead to a missing blocks message)
 Without the NN updating this list of missing blocks, the grid admins will not 
 know when to take the cluster out of safemode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4874) create with OVERWRITE deletes existing file without checking the lease: feature or a bug.

2013-06-07 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678385#comment-13678385
 ] 

Colin Patrick McCabe commented on HDFS-4874:


As Daryn said, this has been discussed before in other JIRAs (HDFS-4437 is a 
good one).

I think it's simpler to make leases act on inode IDs, not on paths.  Then all 
these problems go away without hacks (again, as noted in HDFS-4437).
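A toy sketch of that idea, under the assumption that leases are keyed by inode id; the names are illustrative only and this is not the real LeaseManager.

{code}
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: track leases by inode id rather than by path, so a
// rename or path-based overwrite cannot bypass the lease check.
class LeaseByInodeSketch {
  private final Map<Long, String> leaseHolderByInode = new HashMap<Long, String>();

  void addLease(long inodeId, String holder) {
    leaseHolderByInode.put(inodeId, holder);
  }

  boolean isHeldBy(long inodeId, String holder) {
    return holder.equals(leaseHolderByInode.get(inodeId));
  }
}
{code}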

 create with OVERWRITE deletes existing file without checking the lease: 
 feature or a bug.
 -

 Key: HDFS-4874
 URL: https://issues.apache.org/jira/browse/HDFS-4874
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.0.4-alpha
Reporter: Konstantin Shvachko

 create with the OVERWRITE flag will remove a file under construction even if the 
 issuing client does not hold a lease on the file.
 It could be a bug, or a feature that applications rely upon.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4832) Namenode doesn't change the number of missing blocks in safemode when DNs rejoin or leave

2013-06-07 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-4832:
-

   Resolution: Fixed
Fix Version/s: 0.23.9
   2.1.0-beta
   3.0.0
 Release Note: This change makes the namenode keep its internal replication 
queues and datanode state updated while in manual safe mode. This allows metrics and 
the UI to present up-to-date information while in safe mode. The behavior during 
start-up safe mode is unchanged. 
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I've committed this to trunk, branch-2, branch-2.1.0-beta, and branch-0.23. 
Thanks for working on this patch, Ravi.

 Namenode doesn't change the number of missing blocks in safemode when DNs 
 rejoin or leave
 -

 Key: HDFS-4832
 URL: https://issues.apache.org/jira/browse/HDFS-4832
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.7, 2.1.0-beta
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Priority: Critical
 Fix For: 3.0.0, 2.1.0-beta, 0.23.9

 Attachments: HDFS-4832.branch-0.23.patch, HDFS-4832.patch, 
 HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch, 
 HDFS-4832.patch


 Courtesy Karri VRK Reddy!
 {quote}
 1. Namenode lost datanodes causing missing blocks
 2. Namenode was put in safe mode
 3. Datanode restarted on dead nodes 
 4. Waited for lots of time for the NN UI to reflect the recovered blocks.
 5. Forced NN out of safe mode and suddenly,  no more missing blocks anymore.
 {quote}
 I was able to replicate this on 0.23 and trunk. I set 
 dfs.namenode.heartbeat.recheck-interval to 1 and killed the DN to simulate 
 lost datanode. The opposite case also has problems (i.e. Datanode failing 
 when NN is in safemode, doesn't lead to a missing blocks message)
 Without the NN updating this list of missing blocks, the grid admins will not 
 know when to take the cluster out of safemode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4867) metaSave NPEs when there are invalid blocks in repl queue.

2013-06-07 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678390#comment-13678390
 ] 

Kihwal Lee commented on HDFS-4867:
--

bq. It looks like the change was added to the CHANGES.TXT in mapreduce, not 
hdfs in branch-0.23.
Fixed it.

 metaSave NPEs when there are invalid blocks in repl queue.
 --

 Key: HDFS-4867
 URL: https://issues.apache.org/jira/browse/HDFS-4867
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.23.7, 2.0.4-alpha, 0.23.8
Reporter: Kihwal Lee
Assignee: Plamen Jeliazkov
 Fix For: 2.1.0-beta, 0.23.9

 Attachments: HDFS-4867.branch-0.23.patch, 
 HDFS-4867.branch-0.23.patch, HDFS-4867.branch-0.23.patch, 
 HDFS-4867.branch-0.23.patch, HDFS-4867.branch-2.patch, 
 HDFS-4867.branch2.patch, HDFS-4867.branch2.patch, HDFS-4867.branch2.patch, 
 HDFS-4867.trunk.patch, HDFS-4867.trunk.patch, HDFS-4867.trunk.patch, 
 HDFS-4867.trunk.patch, testMetaSave.log


 Since metaSave cannot get the inode holding an orphaned/invalid block, it NPEs 
 and stops generating the rest of the report. Normally ReplicationMonitor removes 
 such blocks quickly, but if the queue is huge, it can take a very long time. Also, 
 in safe mode they stay.
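For illustration, a self-contained sketch of the kind of null guard that lets the report continue past orphaned blocks; the block-to-inode lookup here is a hypothetical stand-in, not the actual BlockManager code.

{code}
import java.io.PrintWriter;
import java.util.Map;

class MetaSaveSketch {
  // Dump one block's metadata; skip blocks whose owning inode can no longer
  // be resolved instead of dereferencing null and aborting the whole report.
  static void dumpBlock(String block, Map<String, String> inodeByBlock, PrintWriter out) {
    String inodePath = inodeByBlock.get(block);  // may be null for orphaned/invalid blocks
    if (inodePath == null) {
      out.println(block + " has no owning inode (orphaned), skipping");
      return;
    }
    out.println(block + " belongs to " + inodePath);
  }
}
{code}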

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4832) Namenode doesn't change the number of missing blocks in safemode when DNs rejoin or leave

2013-06-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678412#comment-13678412
 ] 

Hudson commented on HDFS-4832:
--

Integrated in Hadoop-trunk-Commit #3881 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3881/])
HDFS-4832. Namenode doesn't change the number of missing blocks in safemode 
when DNs rejoin or leave. Contributed by Ravi Prakash. (Revision 1490803)

 Result = SUCCESS
kihwal : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1490803
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSNamesystem.java


 Namenode doesn't change the number of missing blocks in safemode when DNs 
 rejoin or leave
 -

 Key: HDFS-4832
 URL: https://issues.apache.org/jira/browse/HDFS-4832
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.7, 2.1.0-beta
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Priority: Critical
 Fix For: 3.0.0, 2.1.0-beta, 0.23.9

 Attachments: HDFS-4832.branch-0.23.patch, HDFS-4832.patch, 
 HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch, 
 HDFS-4832.patch


 Courtesy Karri VRK Reddy!
 {quote}
 1. Namenode lost datanodes causing missing blocks
 2. Namenode was put in safe mode
 3. Datanode restarted on dead nodes 
 4. Waited for lots of time for the NN UI to reflect the recovered blocks.
 5. Forced NN out of safe mode and suddenly,  no more missing blocks anymore.
 {quote}
 I was able to replicate this on 0.23 and trunk. I set 
 dfs.namenode.heartbeat.recheck-interval to 1 and killed the DN to simulate 
 lost datanode. The opposite case also has problems (i.e. Datanode failing 
 when NN is in safemode, doesn't lead to a missing blocks message)
 Without the NN updating this list of missing blocks, the grid admins will not 
 know when to take the cluster out of safemode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4866) Protocol buffer support cannot compile under C

2013-06-07 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-4866:


Affects Version/s: 3.0.0

 Protocol buffer support cannot compile under C
 --

 Key: HDFS-4866
 URL: https://issues.apache.org/jira/browse/HDFS-4866
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Ralph Castain
Assignee: Arpit Agarwal
Priority: Blocker
 Attachments: NamenodeProtocol.pb-c.c, NamenodeProtocol.pb-c.h, 
 pcreate.pl


 When compiling Hadoop's .proto descriptions for use in C, an error occurs 
 because one of the RPC's in NamenodeProtocol.proto is named register. This 
 name is a reserved word in languages such as C. When using the Java and C++ 
 languages, the name is hidden inside a class and therefore doesn't cause an 
 error. Unfortunately, that is not the case in non-class languages such as C.
 Note: generating the C translation of the .proto files requires installation 
 of the protobuf-c package from google:
 http://code.google.com/p/protobuf-c/

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4866) Protocol buffer support cannot compile under C

2013-06-07 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-4866:


Fix Version/s: 2.1.0-beta

 Protocol buffer support cannot compile under C
 --

 Key: HDFS-4866
 URL: https://issues.apache.org/jira/browse/HDFS-4866
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Ralph Castain
Assignee: Arpit Agarwal
Priority: Blocker
 Fix For: 2.1.0-beta

 Attachments: NamenodeProtocol.pb-c.c, NamenodeProtocol.pb-c.h, 
 pcreate.pl


 When compiling Hadoop's .proto descriptions for use in C, an error occurs 
 because one of the RPC's in NamenodeProtocol.proto is named register. This 
 name is a reserved word in languages such as C. When using the Java and C++ 
 languages, the name is hidden inside a class and therefore doesn't cause an 
 error. Unfortunately, that is not the case in non-class languages such as C.
 Note: generating the C translation of the .proto files requires installation 
 of the protobuf-c package from google:
 http://code.google.com/p/protobuf-c/

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4866) Protocol buffer support cannot compile under C

2013-06-07 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-4866:


Fix Version/s: 3.0.0

 Protocol buffer support cannot compile under C
 --

 Key: HDFS-4866
 URL: https://issues.apache.org/jira/browse/HDFS-4866
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Ralph Castain
Assignee: Arpit Agarwal
Priority: Blocker
 Fix For: 3.0.0, 2.1.0-beta

 Attachments: NamenodeProtocol.pb-c.c, NamenodeProtocol.pb-c.h, 
 pcreate.pl


 When compiling Hadoop's .proto descriptions for use in C, an error occurs 
 because one of the RPC's in NamenodeProtocol.proto is named register. This 
 name is a reserved word in languages such as C. When using the Java and C++ 
 languages, the name is hidden inside a class and therefore doesn't cause an 
 error. Unfortunately, that is not the case in non-class languages such as C.
 Note: generating the C translation of the .proto files requires installation 
 of the protobuf-c package from google:
 http://code.google.com/p/protobuf-c/

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4895) Add a function of -fixmisplaced to fsck for fixing blocks in mistaken placed

2013-06-07 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678436#comment-13678436
 ] 

Colin Patrick McCabe commented on HDFS-4895:


Isn't this the job of the Balancer rather than of fsck?

 Add a function of -fixmisplaced to fsck for fixing blocks in mistaken placed
 --

 Key: HDFS-4895
 URL: https://issues.apache.org/jira/browse/HDFS-4895
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Junping Du
Assignee: Junping Du
  Labels: BlockPlacementPolicy



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4866) Protocol buffer support cannot compile under C

2013-06-07 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-4866:


Fix Version/s: (was: 2.1.0-beta)
   (was: 3.0.0)

 Protocol buffer support cannot compile under C
 --

 Key: HDFS-4866
 URL: https://issues.apache.org/jira/browse/HDFS-4866
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Ralph Castain
Assignee: Arpit Agarwal
Priority: Blocker
 Attachments: HDFS-4866.branch-2.001.patch, HDFS-4866.trunk.001.patch, 
 NamenodeProtocol.pb-c.c, NamenodeProtocol.pb-c.h, pcreate.pl


 When compiling Hadoop's .proto descriptions for use in C, an error occurs 
 because one of the RPC's in NamenodeProtocol.proto is named register. This 
 name is a reserved word in languages such as C. When using the Java and C++ 
 languages, the name is hidden inside a class and therefore doesn't cause an 
 error. Unfortunately, that is not the case in non-class languages such as C.
 Note: generating the C translation of the .proto files requires installation 
 of the protobuf-c package from google:
 http://code.google.com/p/protobuf-c/

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4866) Protocol buffer support cannot compile under C

2013-06-07 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-4866:


Attachment: HDFS-4866.trunk.001.patch
HDFS-4866.branch-2.001.patch

Rename register to registerSubordinateNamenode.

 Protocol buffer support cannot compile under C
 --

 Key: HDFS-4866
 URL: https://issues.apache.org/jira/browse/HDFS-4866
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Ralph Castain
Assignee: Arpit Agarwal
Priority: Blocker
 Attachments: HDFS-4866.branch-2.001.patch, HDFS-4866.trunk.001.patch, 
 NamenodeProtocol.pb-c.c, NamenodeProtocol.pb-c.h, pcreate.pl


 When compiling Hadoop's .proto descriptions for use in C, an error occurs 
 because one of the RPC's in NamenodeProtocol.proto is named register. This 
 name is a reserved word in languages such as C. When using the Java and C++ 
 languages, the name is hidden inside a class and therefore doesn't cause an 
 error. Unfortunately, that is not the case in non-class languages such as C.
 Note: generating the C translation of the .proto files requires installation 
 of the protobuf-c package from google:
 http://code.google.com/p/protobuf-c/

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4866) Protocol buffer support cannot compile under C

2013-06-07 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-4866:


Target Version/s: 2.1.0-beta

 Protocol buffer support cannot compile under C
 --

 Key: HDFS-4866
 URL: https://issues.apache.org/jira/browse/HDFS-4866
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Ralph Castain
Assignee: Arpit Agarwal
Priority: Blocker
 Attachments: HDFS-4866.branch-2.001.patch, HDFS-4866.trunk.001.patch, 
 NamenodeProtocol.pb-c.c, NamenodeProtocol.pb-c.h, pcreate.pl


 When compiling Hadoop's .proto descriptions for use in C, an error occurs 
 because one of the RPC's in NamenodeProtocol.proto is named register. This 
 name is a reserved word in languages such as C. When using the Java and C++ 
 languages, the name is hidden inside a class and therefore doesn't cause an 
 error. Unfortunately, that is not the case in non-class languages such as C.
 Note: generating the C translation of the .proto files requires installation 
 of the protobuf-c package from google:
 http://code.google.com/p/protobuf-c/

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4866) Protocol buffer support cannot compile under C

2013-06-07 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-4866:


Status: Patch Available  (was: Open)

 Protocol buffer support cannot compile under C
 --

 Key: HDFS-4866
 URL: https://issues.apache.org/jira/browse/HDFS-4866
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Ralph Castain
Assignee: Arpit Agarwal
Priority: Blocker
 Attachments: HDFS-4866.branch-2.001.patch, HDFS-4866.trunk.001.patch, 
 NamenodeProtocol.pb-c.c, NamenodeProtocol.pb-c.h, pcreate.pl


 When compiling Hadoop's .proto descriptions for use in C, an error occurs 
 because one of the RPC's in NamenodeProtocol.proto is named register. This 
 name is a reserved word in languages such as C. When using the Java and C++ 
 languages, the name is hidden inside a class and therefore doesn't cause an 
 error. Unfortunately, that is not the case in non-class languages such as C.
 Note: generating the C translation of the .proto files requires installation 
 of the protobuf-c package from google:
 http://code.google.com/p/protobuf-c/

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-4896) dfs -command webhdfs:// is broke for secure cluster

2013-06-07 Thread yeshavora (JIRA)
yeshavora created HDFS-4896:
---

 Summary: dfs -command webhdfs:// is broke for secure cluster
 Key: HDFS-4896
 URL: https://issues.apache.org/jira/browse/HDFS-4896
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: yeshavora
 Fix For: 2.1.0-beta


Running:
hadoop dfs -copyToLocal webhdfs://node1:port1/File1 /tmp/File1
13/06/07 21:54:58 WARN util.ShutdownHookManager: ShutdownHook 'ClientFinalizer' 
failed, java.lang.IllegalStateException: Shutdown in progress, cannot add a 
shutdownHook
java.lang.IllegalStateException: Shutdown in progress, cannot add a shutdownHook
at 
org.apache.hadoop.util.ShutdownHookManager.addShutdownHook(ShutdownHookManager.java:152)
at 
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2401)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2373)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:352)
at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$DtRenewer.getWebHdfs(WebHdfsFileSystem.java:1003)
at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$DtRenewer.cancel(WebHdfsFileSystem.java:1015)
at org.apache.hadoop.security.token.Token.cancel(Token.java:382)
at 
org.apache.hadoop.fs.DelegationTokenRenewer$RenewAction.cancel(DelegationTokenRenewer.java:152)
at 
org.apache.hadoop.fs.DelegationTokenRenewer$RenewAction.access$200(DelegationTokenRenewer.java:58)
at 
org.apache.hadoop.fs.DelegationTokenRenewer.removeRenewAction(DelegationTokenRenewer.java:241)
at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem.close(WebHdfsFileSystem.java:824)
at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2447)
at 
org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2464)
at 
org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
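
As a minimal, standalone illustration of the JVM behaviour behind this trace: registering a new shutdown hook from inside a running shutdown hook throws IllegalStateException. This sketch only mimics the nested hook registration, not the WebHDFS token-renewer path that triggers it here.

{code}
public class ShutdownHookDemo {
  public static void main(String[] args) {
    Runtime.getRuntime().addShutdownHook(new Thread() {
      @Override
      public void run() {
        try {
          // Simulates FileSystem.get() trying to register ClientFinalizer
          // while the JVM is already shutting down.
          Runtime.getRuntime().addShutdownHook(new Thread());
        } catch (IllegalStateException e) {
          System.err.println("Cannot add a shutdown hook now: " + e.getMessage());
        }
      }
    });
    System.out.println("exiting; the nested-hook failure is reported on stderr");
  }
}
{code}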

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4841) FsShell commands using secure webhfds fail ClientFinalizer shutdown hook

2013-06-07 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-4841:
--

Summary: FsShell commands using secure webhfds fail ClientFinalizer 
shutdown hook  (was: FsShell commands using webhfds fail ClientFinalizer 
shutdown hook)

 FsShell commands using secure webhfds fail ClientFinalizer shutdown hook
 

 Key: HDFS-4841
 URL: https://issues.apache.org/jira/browse/HDFS-4841
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: security, webhdfs
Affects Versions: 3.0.0
Reporter: Stephen Chu
 Attachments: core-site.xml, 
 hadoop-root-namenode-hdfs-upgrade-pseudo.ent.cloudera.com.out, hdfs-site.xml, 
 jsvc.out


 Hadoop version:
 {code}
 bash-4.1$ $HADOOP_HOME/bin/hadoop version
 Hadoop 3.0.0-SNAPSHOT
 Subversion git://github.com/apache/hadoop-common.git -r 
 d5373b9c550a355d4e91330ba7cc8f4c7c3aac51
 Compiled by root on 2013-05-22T08:06Z
 From source with checksum 8c4cc9b1e8d6e8361431e00f64483f
 This command was run using 
 /var/lib/hadoop-hdfs/hadoop-3.0.0-SNAPSHOT/share/hadoop/common/hadoop-common-3.0.0-SNAPSHOT.jar
 {code}
 I'm seeing a problem when issuing FsShell commands using the webhdfs:// URI 
 when security is enabled. The command completes but leaves a warning that 
 ShutdownHook 'ClientFinalizer' failed.
 {code}
 bash-4.1$ hadoop-3.0.0-SNAPSHOT/bin/hadoop fs -ls 
 webhdfs://hdfs-upgrade-pseudo.ent.cloudera.com:50070/
 2013-05-22 09:46:55,710 INFO  [main] util.Shell 
 (Shell.java:isSetsidSupported(311)) - setsid exited with exit code 0
 Found 3 items
 drwxr-xr-x   - hbase supergroup  0 2013-05-22 09:46 
 webhdfs://hdfs-upgrade-pseudo.ent.cloudera.com:50070/hbase
 drwxr-xr-x   - hdfs  supergroup  0 2013-05-22 09:46 
 webhdfs://hdfs-upgrade-pseudo.ent.cloudera.com:50070/tmp
 drwxr-xr-x   - hdfs  supergroup  0 2013-05-22 09:46 
 webhdfs://hdfs-upgrade-pseudo.ent.cloudera.com:50070/user
 2013-05-22 09:46:58,660 WARN  [Thread-3] util.ShutdownHookManager 
 (ShutdownHookManager.java:run(56)) - ShutdownHook 'ClientFinalizer' failed, 
 java.lang.IllegalStateException: Shutdown in progress, cannot add a 
 shutdownHook
 java.lang.IllegalStateException: Shutdown in progress, cannot add a 
 shutdownHook
   at 
 org.apache.hadoop.util.ShutdownHookManager.addShutdownHook(ShutdownHookManager.java:152)
   at 
 org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2400)
   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2372)
   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:352)
   at 
 org.apache.hadoop.hdfs.web.WebHdfsFileSystem$DtRenewer.getWebHdfs(WebHdfsFileSystem.java:1001)
   at 
 org.apache.hadoop.hdfs.web.WebHdfsFileSystem$DtRenewer.cancel(WebHdfsFileSystem.java:1013)
   at org.apache.hadoop.security.token.Token.cancel(Token.java:382)
   at 
 org.apache.hadoop.fs.DelegationTokenRenewer$RenewAction.cancel(DelegationTokenRenewer.java:152)
   at 
 org.apache.hadoop.fs.DelegationTokenRenewer$RenewAction.access$200(DelegationTokenRenewer.java:58)
   at 
 org.apache.hadoop.fs.DelegationTokenRenewer.removeRenewAction(DelegationTokenRenewer.java:241)
   at 
 org.apache.hadoop.hdfs.web.WebHdfsFileSystem.close(WebHdfsFileSystem.java:822)
   at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2446)
   at 
 org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2463)
   at 
 org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
 {code}
 I've checked that FsShell + hdfs:// commands and WebHDFS operations through 
 curl work successfully:
 {code}
 bash-4.1$ hadoop-3.0.0-SNAPSHOT/bin/hadoop fs -ls /
 2013-05-22 09:46:43,663 INFO  [main] util.Shell 
 (Shell.java:isSetsidSupported(311)) - setsid exited with exit code 0
 Found 3 items
 drwxr-xr-x   - hbase supergroup  0 2013-05-22 09:46 /hbase
 drwxr-xr-x   - hdfs  supergroup  0 2013-05-22 09:46 /tmp
 drwxr-xr-x   - hdfs  supergroup  0 2013-05-22 09:46 /user
 bash-4.1$ curl -i --negotiate -u : 
 http://hdfs-upgrade-pseudo.ent.cloudera.com:50070/webhdfs/v1/?op=GETHOMEDIRECTORY;
 HTTP/1.1 401 
 Cache-Control: must-revalidate,no-cache,no-store
 Date: Wed, 22 May 2013 16:47:14 GMT
 Pragma: no-cache
 Date: Wed, 22 May 2013 16:47:14 GMT
 Pragma: no-cache
 Content-Type: text/html; charset=iso-8859-1
 WWW-Authenticate: Negotiate
 Set-Cookie: hadoop.auth=;Path=/;Expires=Thu, 01-Jan-1970 00:00:00 GMT
 Content-Length: 1358
 Server: Jetty(6.1.26)
 HTTP/1.1 200 OK
 Cache-Control: no-cache
 Expires: Thu, 01-Jan-1970 00:00:00 GMT
 Date: Wed, 22 May 2013 16:47:14 GMT
 Pragma: no-cache
 Date: Wed, 22 May 2013 16:47:14 GMT
 Pragma: no-cache
 Content-Type: application/json
 Set-Cookie: 
 

[jira] [Commented] (HDFS-4896) dfs -command webhdfs:// is broke for secure cluster

2013-06-07 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678495#comment-13678495
 ] 

Stephen Chu commented on HDFS-4896:
---

Same problem as HDFS-4841.

 dfs -command webhdfs:// is broke for secure cluster
 -

 Key: HDFS-4896
 URL: https://issues.apache.org/jira/browse/HDFS-4896
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: yeshavora
 Fix For: 2.1.0-beta


 Running:
 hadoop dfs -copyToLocal webhdfs://node1:port1/File1 /tmp/File1
 13/06/07 21:54:58 WARN util.ShutdownHookManager: ShutdownHook 
 'ClientFinalizer' failed, java.lang.IllegalStateException: Shutdown in 
 progress, cannot add a shutdownHook
 java.lang.IllegalStateException: Shutdown in progress, cannot add a 
 shutdownHook
   at 
 org.apache.hadoop.util.ShutdownHookManager.addShutdownHook(ShutdownHookManager.java:152)
   at 
 org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2401)
   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2373)
   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:352)
   at 
 org.apache.hadoop.hdfs.web.WebHdfsFileSystem$DtRenewer.getWebHdfs(WebHdfsFileSystem.java:1003)
   at 
 org.apache.hadoop.hdfs.web.WebHdfsFileSystem$DtRenewer.cancel(WebHdfsFileSystem.java:1015)
   at org.apache.hadoop.security.token.Token.cancel(Token.java:382)
   at 
 org.apache.hadoop.fs.DelegationTokenRenewer$RenewAction.cancel(DelegationTokenRenewer.java:152)
   at 
 org.apache.hadoop.fs.DelegationTokenRenewer$RenewAction.access$200(DelegationTokenRenewer.java:58)
   at 
 org.apache.hadoop.fs.DelegationTokenRenewer.removeRenewAction(DelegationTokenRenewer.java:241)
   at 
 org.apache.hadoop.hdfs.web.WebHdfsFileSystem.close(WebHdfsFileSystem.java:824)
   at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2447)
   at 
 org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2464)
   at 
 org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4896) dfs -command webhdfs:// gives for secure cluster

2013-06-07 Thread yeshavora (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yeshavora updated HDFS-4896:


Summary: dfs -command webhdfs:// gives for secure cluster  (was: dfs 
-command webhdfs:// is broke for secure cluster)

 dfs -command webhdfs:// gives for secure cluster
 --

 Key: HDFS-4896
 URL: https://issues.apache.org/jira/browse/HDFS-4896
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: yeshavora
 Fix For: 2.1.0-beta


 Running:
 hadoop dfs -copyToLocal webhdfs://node1:port1/File1 /tmp/File1
 13/06/07 21:54:58 WARN util.ShutdownHookManager: ShutdownHook 
 'ClientFinalizer' failed, java.lang.IllegalStateException: Shutdown in 
 progress, cannot add a shutdownHook
 java.lang.IllegalStateException: Shutdown in progress, cannot add a 
 shutdownHook
   at 
 org.apache.hadoop.util.ShutdownHookManager.addShutdownHook(ShutdownHookManager.java:152)
   at 
 org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2401)
   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2373)
   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:352)
   at 
 org.apache.hadoop.hdfs.web.WebHdfsFileSystem$DtRenewer.getWebHdfs(WebHdfsFileSystem.java:1003)
   at 
 org.apache.hadoop.hdfs.web.WebHdfsFileSystem$DtRenewer.cancel(WebHdfsFileSystem.java:1015)
   at org.apache.hadoop.security.token.Token.cancel(Token.java:382)
   at 
 org.apache.hadoop.fs.DelegationTokenRenewer$RenewAction.cancel(DelegationTokenRenewer.java:152)
   at 
 org.apache.hadoop.fs.DelegationTokenRenewer$RenewAction.access$200(DelegationTokenRenewer.java:58)
   at 
 org.apache.hadoop.fs.DelegationTokenRenewer.removeRenewAction(DelegationTokenRenewer.java:241)
   at 
 org.apache.hadoop.hdfs.web.WebHdfsFileSystem.close(WebHdfsFileSystem.java:824)
   at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2447)
   at 
 org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2464)
   at 
 org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4896) dfs -command webhdfs:// gives IllegalStateException for secure cluster

2013-06-07 Thread yeshavora (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yeshavora updated HDFS-4896:


Summary: dfs -command webhdfs:// gives IllegalStateException for secure 
cluster  (was: dfs -command webhdfs:// gives for secure cluster)

 dfs -command webhdfs:// gives IllegalStateException for secure cluster
 

 Key: HDFS-4896
 URL: https://issues.apache.org/jira/browse/HDFS-4896
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: yeshavora
 Fix For: 2.1.0-beta


 Running:
 hadoop dfs -copyToLocal webhdfs://node1:port1/File1 /tmp/File1
 13/06/07 21:54:58 WARN util.ShutdownHookManager: ShutdownHook 
 'ClientFinalizer' failed, java.lang.IllegalStateException: Shutdown in 
 progress, cannot add a shutdownHook
 java.lang.IllegalStateException: Shutdown in progress, cannot add a 
 shutdownHook
   at 
 org.apache.hadoop.util.ShutdownHookManager.addShutdownHook(ShutdownHookManager.java:152)
   at 
 org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2401)
   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2373)
   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:352)
   at 
 org.apache.hadoop.hdfs.web.WebHdfsFileSystem$DtRenewer.getWebHdfs(WebHdfsFileSystem.java:1003)
   at 
 org.apache.hadoop.hdfs.web.WebHdfsFileSystem$DtRenewer.cancel(WebHdfsFileSystem.java:1015)
   at org.apache.hadoop.security.token.Token.cancel(Token.java:382)
   at 
 org.apache.hadoop.fs.DelegationTokenRenewer$RenewAction.cancel(DelegationTokenRenewer.java:152)
   at 
 org.apache.hadoop.fs.DelegationTokenRenewer$RenewAction.access$200(DelegationTokenRenewer.java:58)
   at 
 org.apache.hadoop.fs.DelegationTokenRenewer.removeRenewAction(DelegationTokenRenewer.java:241)
   at 
 org.apache.hadoop.hdfs.web.WebHdfsFileSystem.close(WebHdfsFileSystem.java:824)
   at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2447)
   at 
 org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2464)
   at 
 org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HDFS-4896) dfs -command webhdfs:// gives IllegalStateException for secure cluster

2013-06-07 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas resolved HDFS-4896.
---

Resolution: Duplicate

Resolving this as duplicate of HDFS-4841.

 dfs -command webhdfs:// gives IllegalStateException for secure cluster
 

 Key: HDFS-4896
 URL: https://issues.apache.org/jira/browse/HDFS-4896
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: yeshavora
 Fix For: 2.1.0-beta


 Running:
 hadoop dfs -copyToLocal webhdfs://node1:port1/File1 /tmp/File1
 13/06/07 21:54:58 WARN util.ShutdownHookManager: ShutdownHook 
 'ClientFinalizer' failed, java.lang.IllegalStateException: Shutdown in 
 progress, cannot add a shutdownHook
 java.lang.IllegalStateException: Shutdown in progress, cannot add a 
 shutdownHook
   at 
 org.apache.hadoop.util.ShutdownHookManager.addShutdownHook(ShutdownHookManager.java:152)
   at 
 org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2401)
   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2373)
   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:352)
   at 
 org.apache.hadoop.hdfs.web.WebHdfsFileSystem$DtRenewer.getWebHdfs(WebHdfsFileSystem.java:1003)
   at 
 org.apache.hadoop.hdfs.web.WebHdfsFileSystem$DtRenewer.cancel(WebHdfsFileSystem.java:1015)
   at org.apache.hadoop.security.token.Token.cancel(Token.java:382)
   at 
 org.apache.hadoop.fs.DelegationTokenRenewer$RenewAction.cancel(DelegationTokenRenewer.java:152)
   at 
 org.apache.hadoop.fs.DelegationTokenRenewer$RenewAction.access$200(DelegationTokenRenewer.java:58)
   at 
 org.apache.hadoop.fs.DelegationTokenRenewer.removeRenewAction(DelegationTokenRenewer.java:241)
   at 
 org.apache.hadoop.hdfs.web.WebHdfsFileSystem.close(WebHdfsFileSystem.java:824)
   at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2447)
   at 
 org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2464)
   at 
 org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4866) Protocol buffer support cannot compile under C

2013-06-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678549#comment-13678549
 ] 

Hadoop QA commented on HDFS-4866:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12586801/HDFS-4866.trunk.001.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/4499//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4499//console

This message is automatically generated.

 Protocol buffer support cannot compile under C
 --

 Key: HDFS-4866
 URL: https://issues.apache.org/jira/browse/HDFS-4866
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Ralph Castain
Assignee: Arpit Agarwal
Priority: Blocker
 Attachments: HDFS-4866.branch-2.001.patch, HDFS-4866.trunk.001.patch, 
 NamenodeProtocol.pb-c.c, NamenodeProtocol.pb-c.h, pcreate.pl


 When compiling Hadoop's .proto descriptions for use in C, an error occurs 
 because one of the RPC's in NamenodeProtocol.proto is named register. This 
 name is a reserved word in languages such as C. When using the Java and C++ 
 languages, the name is hidden inside a class and therefore doesn't cause an 
 error. Unfortunately, that is not the case in non-class languages such as C.
 Note: generating the C translation of the .proto files requires installation 
 of the protobuf-c package from google:
 http://code.google.com/p/protobuf-c/

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4866) Protocol buffer support cannot compile under C

2013-06-07 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678556#comment-13678556
 ] 

Arpit Agarwal commented on HDFS-4866:
-

I believe no new tests are needed.

 Protocol buffer support cannot compile under C
 --

 Key: HDFS-4866
 URL: https://issues.apache.org/jira/browse/HDFS-4866
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Ralph Castain
Assignee: Arpit Agarwal
Priority: Blocker
 Attachments: HDFS-4866.branch-2.001.patch, HDFS-4866.trunk.001.patch, 
 NamenodeProtocol.pb-c.c, NamenodeProtocol.pb-c.h, pcreate.pl


 When compiling Hadoop's .proto descriptions for use in C, an error occurs 
 because one of the RPC's in NamenodeProtocol.proto is named register. This 
 name is a reserved word in languages such as C. When using the Java and C++ 
 languages, the name is hidden inside a class and therefore doesn't cause an 
 error. Unfortunately, that is not the case in non-class languages such as C.
 Note: generating the C translation of the .proto files requires installation 
 of the protobuf-c package from google:
 http://code.google.com/p/protobuf-c/

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4876) The javadoc of FileWithSnapshot is incorrect

2013-06-07 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678599#comment-13678599
 ] 

Jing Zhao commented on HDFS-4876:
-

Thanks Nicholas! I also committed this to branch-2.1-beta.

 The javadoc of FileWithSnapshot is incorrect
 

 Key: HDFS-4876
 URL: https://issues.apache.org/jira/browse/HDFS-4876
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: snapshots
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
Priority: Minor
 Fix For: 2.1.0-beta

 Attachments: h4876_20130604_branch-2.patch, h4876_20130604.patch


 The javadoc said that snapshot files and the original file form a circular 
 linked list.  It is no longer true.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4895) Add a function of -fixmisplaced to fsck for fixing blocks in mistaken placed

2013-06-07 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678627#comment-13678627
 ] 

Junping Du commented on HDFS-4895:
--

Hi Colin, that is also an option. However, IMO the Balancer is more about evening 
out the distribution of data: its current algorithm matches above-average nodes with 
under-average nodes and then scans blocks on those specific nodes, rather than 
scanning all blocks in HDFS or under a specific directory. fsck may be more suitable, 
since its responsibility is to scan all blocks under a specific directory, check the 
health of block replicas, and report corrupt ones. Thoughts?

 Add a function of -fixmisplaced to fsck for fixing blocks in mistaken placed
 --

 Key: HDFS-4895
 URL: https://issues.apache.org/jira/browse/HDFS-4895
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Junping Du
Assignee: Junping Du
  Labels: BlockPlacementPolicy



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira