[jira] [Updated] (HDFS-4762) Provide HDFS based NFSv3 and Mountd implementation
[ https://issues.apache.org/jira/browse/HDFS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-4762: - Attachment: HDFS-4762.patch.4 Provide HDFS based NFSv3 and Mountd implementation -- Key: HDFS-4762 URL: https://issues.apache.org/jira/browse/HDFS-4762 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-4762.patch, HDFS-4762.patch.2, HDFS-4762.patch.3, HDFS-4762.patch.3, HDFS-4762.patch.4 This is to track the implementation of NFSv3 to HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4762) Provide HDFS based NFSv3 and Mountd implementation
[ https://issues.apache.org/jira/browse/HDFS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13696601#comment-13696601 ] Brandon Li commented on HDFS-4762: -- Uploaded new patch to address Nicholas' comments. Thanks! Provide HDFS based NFSv3 and Mountd implementation -- Key: HDFS-4762 URL: https://issues.apache.org/jira/browse/HDFS-4762 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-4762.patch, HDFS-4762.patch.2, HDFS-4762.patch.3, HDFS-4762.patch.3, HDFS-4762.patch.4 This is to track the implementation of NFSv3 to HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4762) Provide HDFS based NFSv3 and Mountd implementation
[ https://issues.apache.org/jira/browse/HDFS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13696620#comment-13696620 ] Hadoop QA commented on HDFS-4762: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12590225/HDFS-4762.patch.4 against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-nfs hadoop-hdfs-project/hadoop-hdfs-nfs: org.apache.hadoop.hdfs.nfs.nfs3.TestOffsetRange {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4581//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/4581//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs-nfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4581//console This message is automatically generated. Provide HDFS based NFSv3 and Mountd implementation -- Key: HDFS-4762 URL: https://issues.apache.org/jira/browse/HDFS-4762 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-4762.patch, HDFS-4762.patch.2, HDFS-4762.patch.3, HDFS-4762.patch.3, HDFS-4762.patch.4 This is to track the implementation of NFSv3 to HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4946) Allow preferLocalNode in BlockPlacementPolicyDefault to be configurable
[ https://issues.apache.org/jira/browse/HDFS-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Kinley updated HDFS-4946: --- Summary: Allow preferLocalNode in BlockPlacementPolicyDefault to be configurable (was: Allow preferLocalNode to be configurable in BlockPlacementPolicyDefault) Allow preferLocalNode in BlockPlacementPolicyDefault to be configurable --- Key: HDFS-4946 URL: https://issues.apache.org/jira/browse/HDFS-4946 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.0.0-alpha Reporter: James Kinley Allow preferLocalNode in BlockPlacementPolicyDefault to be disabled in configuration to prevent a client from writing the first replica of every block (i.e. the entire file) to the local DataNode. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-4946) Allow preferLocalNode to be configurable in BlockPlacementPolicyDefault
James Kinley created HDFS-4946: -- Summary: Allow preferLocalNode to be configurable in BlockPlacementPolicyDefault Key: HDFS-4946 URL: https://issues.apache.org/jira/browse/HDFS-4946 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.0.0-alpha Reporter: James Kinley Allow preferLocalNode in BlockPlacementPolicyDefault to be disabled in configuration to prevent a client from writing the first replica of every block (i.e. the entire file) to the local DataNode. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
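A minimal sketch of how such a switch could be wired up, assuming a hypothetical configuration key (the key name below is illustrative only and not part of the proposal):

{code}
import org.apache.hadoop.conf.Configuration;

public class PreferLocalNodeExample {
  // Hypothetical key name used only for illustration; the real name would be
  // decided by the patch for this JIRA.
  static final String PREFER_LOCAL_NODE_KEY = "dfs.block-placement.prefer-local-node";

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Disable writing the first replica of every block to the local DataNode.
    conf.setBoolean(PREFER_LOCAL_NODE_KEY, false);

    // BlockPlacementPolicyDefault#initialize could then read the flag like this,
    // defaulting to the current behavior (true) when the key is absent.
    boolean preferLocalNode = conf.getBoolean(PREFER_LOCAL_NODE_KEY, true);
    System.out.println("preferLocalNode = " + preferLocalNode);
  }
}
{code}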
[jira] [Commented] (HDFS-4797) BlockScanInfo does not override equals(..) and hashCode() consistently
[ https://issues.apache.org/jira/browse/HDFS-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696718#comment-13696718 ] Hudson commented on HDFS-4797: -- Integrated in Hadoop-Yarn-trunk #257 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/257/]) HDFS-4797. BlockScanInfo does not override equals(..) and hashCode() consistently. (Revision 1498202) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1498202 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceScanner.java BlockScanInfo does not override equals(..) and hashCode() consistently -- Key: HDFS-4797 URL: https://issues.apache.org/jira/browse/HDFS-4797 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Fix For: 2.2.0 Attachments: h4797_20130513b.patch, h4797_20130513.patch In the code below, equals(..) compares lastScanTime but hashCode() is computed using block ID. Therefore, it could have two BlockScanInfo objects which are equal but have two different hash codes. {code} //BlockScanInfo @Override public int hashCode() { return block.hashCode(); } @Override public boolean equals(Object other) { return other instanceof BlockScanInfo && compareTo((BlockScanInfo)other) == 0; } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
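For context on the contract being fixed here, a small self-contained sketch (not the actual BlockScanInfo class) showing equals(..) and hashCode() both derived from the block identity, so that objects that compare equal always share a hash code while compareTo(..) still orders by scan time:

{code}
public class ScanInfoSketch implements Comparable<ScanInfoSketch> {
  private final long blockId; // stands in for the wrapped Block
  long lastScanTime;

  ScanInfoSketch(long blockId) {
    this.blockId = blockId;
  }

  @Override
  public int compareTo(ScanInfoSketch that) {
    // Order by last scan time, breaking ties by block id so the ordering
    // stays consistent with equals(..).
    int d = Long.compare(lastScanTime, that.lastScanTime);
    return d != 0 ? d : Long.compare(blockId, that.blockId);
  }

  @Override
  public int hashCode() {
    // Same identity as equals(..): two equal objects get the same hash code.
    return Long.hashCode(blockId);
  }

  @Override
  public boolean equals(Object obj) {
    return obj instanceof ScanInfoSketch && blockId == ((ScanInfoSketch) obj).blockId;
  }
}
{code}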
[jira] [Commented] (HDFS-4915) Add config to ZKFC to disable fencing
[ https://issues.apache.org/jira/browse/HDFS-4915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696779#comment-13696779 ] Uma Maheswara Rao G commented on HDFS-4915: --- I think this is the same as HDFS-3862, right? Add config to ZKFC to disable fencing - Key: HDFS-4915 URL: https://issues.apache.org/jira/browse/HDFS-4915 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 3.0.0 Reporter: Todd Lipcon With QuorumJournalManager, it's not important for the ZKFCs to perform any fencing. We currently work around this by setting the fencer to /bin/true, but the ZKFC still does things like create breadcrumb znodes, etc. It would be simpler to add a config to disable fencing, and then the ZKFC's job would be simpler. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4945) A Distributed and Cooperative NameNode Cluster for a Highly-Available HDFS
[ https://issues.apache.org/jira/browse/HDFS-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696782#comment-13696782 ] Uma Maheswara Rao G commented on HDFS-4945: --- Suresh has already asked most of the questions for more clarity on this feature. I have one additional question: {quote} When each fragment has k replicas, the file system can tolerate up to floor(k/2 - 1) faulty NameNodes. {quote} How/where will you manage the metadata for these fragments? Regards, Uma A Distributed and Cooperative NameNode Cluster for a Highly-Available HDFS -- Key: HDFS-4945 URL: https://issues.apache.org/jira/browse/HDFS-4945 Project: Hadoop HDFS Issue Type: New Feature Components: auto-failover Affects Versions: HA branch (HDFS-1623) Reporter: Yonghwan Kim Labels: documentation See the following comment for a detailed description. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4797) BlockScanInfo does not override equals(..) and hashCode() consistently
[ https://issues.apache.org/jira/browse/HDFS-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696784#comment-13696784 ] Hudson commented on HDFS-4797: -- Integrated in Hadoop-Hdfs-trunk #1447 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1447/]) HDFS-4797. BlockScanInfo does not override equals(..) and hashCode() consistently. (Revision 1498202) Result = FAILURE szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1498202 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceScanner.java BlockScanInfo does not override equals(..) and hashCode() consistently -- Key: HDFS-4797 URL: https://issues.apache.org/jira/browse/HDFS-4797 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Fix For: 2.2.0 Attachments: h4797_20130513b.patch, h4797_20130513.patch In the code below, equals(..) compares lastScanTime but hashCode() is computed using block ID. Therefore, it could have two BlockScanInfo objects which are equal but have two different hash codes. {code} //BlockScanInfo @Override public int hashCode() { return block.hashCode(); } @Override public boolean equals(Object other) { return other instanceof BlockScanInfo && compareTo((BlockScanInfo)other) == 0; } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
[ https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696797#comment-13696797 ] Uma Maheswara Rao G commented on HDFS-4937: --- Hi Kihwal, you said in the comment that the operator added a large number of new nodes, right? Even then, was it not able to choose at least from them? ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom() -- Key: HDFS-4937 URL: https://issues.apache.org/jira/browse/HDFS-4937 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.0.4-alpha, 0.23.8 Reporter: Kihwal Lee When a large number of nodes are removed by refreshing node lists, the network topology is updated. If the refresh happens at the right moment, the replication monitor thread may get stuck in the while loop of {{chooseRandom()}}. This is because the cached cluster size is used in the terminal condition check of the loop. This usually happens when a block with a high replication factor is being processed. Since replicas/rack is also calculated beforehand, no node choice may satisfy the goodness criteria if refreshing removed racks. All nodes will end up in the excluded list, but the size will still be less than the cached cluster size, so it will loop infinitely. This was observed in a production environment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4797) BlockScanInfo does not override equals(..) and hashCode() consistently
[ https://issues.apache.org/jira/browse/HDFS-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696834#comment-13696834 ] Hudson commented on HDFS-4797: -- Integrated in Hadoop-Mapreduce-trunk #1474 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1474/]) HDFS-4797. BlockScanInfo does not override equals(..) and hashCode() consistently. (Revision 1498202) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1498202 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceScanner.java BlockScanInfo does not override equals(..) and hashCode() consistently -- Key: HDFS-4797 URL: https://issues.apache.org/jira/browse/HDFS-4797 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Fix For: 2.2.0 Attachments: h4797_20130513b.patch, h4797_20130513.patch In the code below, equals(..) compares lastScanTime but hashCode() is computed using block ID. Therefore, it could have two BlockScanInfo objects which are equal but have two different hash codes. {code} //BlockScanInfo @Override public int hashCode() { return block.hashCode(); } @Override public boolean equals(Object other) { return other instanceof BlockScanInfo && compareTo((BlockScanInfo)other) == 0; } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2856) Fix block protocol so that Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696838#comment-13696838 ] Daryn Sharp commented on HDFS-2856: --- I haven't digested the whole jira, but want to request more info about: bq. The only potential downside I see is that if we ever pipeline multiple operations over a single connection, then we'd need to renegotiate SASL per operation, because the authorization decision may be different per block I've made some RPCv9 changes to allow the future possibility to multiplex connections. Will multiplexing help with this jira's use case? If so, SASL negotiation per operation should not be necessary as negotiation will occur per virtual stream. Fix block protocol so that Datanodes don't require root or jsvc --- Key: HDFS-2856 URL: https://issues.apache.org/jira/browse/HDFS-2856 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, security Reporter: Owen O'Malley Assignee: Chris Nauroth Attachments: Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, Datanode-Security-Design.pdf Since we send the block tokens unencrypted to the datanode, we currently start the datanode as root using jsvc and get a secure (< 1024) port. If we have the datanode generate a nonce and send it on the connection, and the client sends an hmac of the nonce back instead of the block token, it won't reveal any secrets. Thus, we wouldn't require a secure port and would not require root or jsvc. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
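As a rough illustration of the nonce/HMAC exchange described in this issue, here is a self-contained sketch; it is not the DataTransferProtocol code, and the shared secret is a stand-in for whatever the block token secret manager would provide:

{code}
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class NonceHmacSketch {
  public static void main(String[] args) throws Exception {
    // Datanode side: generate a nonce and send it to the client in the clear.
    byte[] nonce = new byte[16];
    new SecureRandom().nextBytes(nonce);

    // Stand-in for the secret shared via the block token mechanism.
    byte[] sharedSecret = "example-shared-secret".getBytes("UTF-8");

    // Client side: reply with an HMAC of the nonce instead of the raw block token.
    Mac clientMac = Mac.getInstance("HmacSHA1");
    clientMac.init(new SecretKeySpec(sharedSecret, "HmacSHA1"));
    byte[] clientReply = clientMac.doFinal(nonce);

    // Datanode side: recompute the HMAC with its copy of the secret and compare.
    Mac serverMac = Mac.getInstance("HmacSHA1");
    serverMac.init(new SecretKeySpec(sharedSecret, "HmacSHA1"));
    boolean authorized = Arrays.equals(clientReply, serverMac.doFinal(nonce));
    System.out.println("authorized = " + authorized);
  }
}
{code}

Since only the nonce and its HMAC cross the wire, no secret is exposed even on an unencrypted, unprivileged port, which is the motivation stated in the issue description.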
[jira] [Updated] (HDFS-4696) Branch 0.23 Patch for Block Replication Policy Implementation May Skip Higher-Priority Blocks for Lower-Priority Blocks
[ https://issues.apache.org/jira/browse/HDFS-4696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated HDFS-4696: Target Version/s: 0.23.10 (was: 0.23.9) Branch 0.23 Patch for Block Replication Policy Implementation May Skip Higher-Priority Blocks for Lower-Priority Blocks - Key: HDFS-4696 URL: https://issues.apache.org/jira/browse/HDFS-4696 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.5 Reporter: Derek Dagit Assignee: Derek Dagit This JIRA tracks the solution to HDFS-4366 for the 0.23 branch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4887) TestNNThroughputBenchmark exits abruptly
[ https://issues.apache.org/jira/browse/HDFS-4887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13696863#comment-13696863 ] Kihwal Lee commented on HDFS-4887: -- bq. For the new patch, do we need to declare checkNSRunning as volatile, since it can be set and retrieved by different threads? The replication monitor thread accesses this variable only once when terminating, so there will be no issue. TestNNThroughputBenchmark exits abruptly Key: HDFS-4887 URL: https://issues.apache.org/jira/browse/HDFS-4887 Project: Hadoop HDFS Issue Type: Bug Components: benchmarks, test Affects Versions: 3.0.0, 2.1.0-beta Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HDFS-4887.patch, HDFS-4887.patch After HDFS-4840, TestNNThroughputBenchmark exits in the middle. This is because ReplicationMonitor is being stopped while NN is still running. This is only valid during testing. In normal cases, ReplicationMonitor thread runs all the time once started. In standby or safemode, it just skips calculating DN work. I think NNThroughputBenchmark needs to use ExitUtil to prevent termination, rather than modifying ReplicationMonitor. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
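For readers following the volatile question above: a volatile flag is the usual way to publish a stop signal across threads when the flag is read repeatedly in a loop; the point being made is that a single read at termination does not need it. A generic sketch (not the NNThroughputBenchmark or ReplicationMonitor code):

{code}
public class ShutdownFlagSketch {
  // volatile guarantees the worker thread sees the update without extra locking.
  private volatile boolean running = true;

  void stop() {
    running = false; // set by the controlling thread
  }

  void monitorLoop() {
    while (running) { // read repeatedly by the worker thread
      try {
        Thread.sleep(100); // placeholder for periodic replication work
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
    }
  }

  public static void main(String[] args) throws InterruptedException {
    ShutdownFlagSketch sketch = new ShutdownFlagSketch();
    Thread worker = new Thread(sketch::monitorLoop);
    worker.start();
    Thread.sleep(300);
    sketch.stop();
    worker.join();
  }
}
{code}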
[jira] [Commented] (HDFS-4762) Provide HDFS based NFSv3 and Mountd implementation
[ https://issues.apache.org/jira/browse/HDFS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13696877#comment-13696877 ] Tsz Wo (Nicholas), SZE commented on HDFS-4762: -- It seems that TestOffsetRange is incorrect: r2 and r4 have overlap but the compareTo(..) method does not allow it. Please also fix the findbugs warnings. Provide HDFS based NFSv3 and Mountd implementation -- Key: HDFS-4762 URL: https://issues.apache.org/jira/browse/HDFS-4762 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-4762.patch, HDFS-4762.patch.2, HDFS-4762.patch.3, HDFS-4762.patch.3, HDFS-4762.patch.4 This is to track the implementation of NFSv3 to HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4888) Refactor and fix FSNamesystem.getTurnOffTip to sanity
[ https://issues.apache.org/jira/browse/HDFS-4888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13696891#comment-13696891 ] Kihwal Lee commented on HDFS-4888: -- +1 Refactor and fix FSNamesystem.getTurnOffTip to sanity - Key: HDFS-4888 URL: https://issues.apache.org/jira/browse/HDFS-4888 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.9 Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: HDFS-4888.patch, HDFS-4888.patch, HDFS-4888.patch e.g. When resources are low, the command to leave safe mode is not printed. This method is unnecessarily complex -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
[ https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696922#comment-13696922 ] Kihwal Lee commented on HDFS-4937: -- bq. Even then, was it not able to choose at least from them? It couldn't pick enough nodes because the max replicas/rack was already calculated. I think it worked fine for the majority of blocks with 3 replicas, since the cluster had more than 3 racks even after the refresh. The issue was with blocks with many more replicas. But picking enough nodes is just one condition; the other is checking for the exhaustion of candidate nodes. It would have bailed out of the while loop if the cached cluster size were updated inside the loop. To avoid a frequent cluster-size refresh for this rare condition, we can make it update the cached value after {{dfs.replication.max}} iterations, within which most blocks should find all they need. If the NN hits this issue, it will loop {{dfs.replication.max}} times and break out. I prefer this over adding locking, which would slow down normal cases. ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom() -- Key: HDFS-4937 URL: https://issues.apache.org/jira/browse/HDFS-4937 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.0.4-alpha, 0.23.8 Reporter: Kihwal Lee When a large number of nodes are removed by refreshing node lists, the network topology is updated. If the refresh happens at the right moment, the replication monitor thread may get stuck in the while loop of {{chooseRandom()}}. This is because the cached cluster size is used in the terminal condition check of the loop. This usually happens when a block with a high replication factor is being processed. Since replicas/rack is also calculated beforehand, no node choice may satisfy the goodness criteria if refreshing removed racks. All nodes will end up in the excluded list, but the size will still be less than the cached cluster size, so it will loop infinitely. This was observed in a production environment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
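A simplified, self-contained sketch of the bounded-retry idea suggested in the comment above (stand-in fields and methods, not the actual BlockPlacementPolicyDefault code): the cached cluster size is re-read only after a bounded number of iterations, so a topology refresh during the loop cannot make it spin forever.

{code}
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

public class ChooseRandomSketch {
  private final Random rand = new Random();
  private volatile int clusterSize = 100; // stand-in for NetworkTopology#getNumOfLeaves()
  private final int maxReplication = 512; // stand-in for dfs.replication.max

  /** Returns a chosen node id, or -1 once the candidate space is exhausted. */
  int chooseRandom(Set<Integer> excluded) {
    int cachedSize = clusterSize;
    int iterations = 0;
    while (excluded.size() < cachedSize) {
      int candidate = rand.nextInt(cachedSize);
      if (!excluded.contains(candidate) && isGoodTarget(candidate)) {
        return candidate;
      }
      excluded.add(candidate);
      // Bound the retries: periodically re-read the (possibly shrunk) cluster
      // size instead of trusting the value cached before the loop started.
      if (++iterations >= maxReplication) {
        cachedSize = clusterSize;
        iterations = 0;
      }
    }
    return -1;
  }

  private boolean isGoodTarget(int candidate) {
    return candidate % 2 == 0; // placeholder goodness check
  }

  public static void main(String[] args) {
    System.out.println(new ChooseRandomSketch().chooseRandom(new HashSet<Integer>()));
  }
}
{code}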
[jira] [Commented] (HDFS-4851) Deadlock in pipeline recovery
[ https://issues.apache.org/jira/browse/HDFS-4851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697062#comment-13697062 ] Andrew Wang commented on HDFS-4851: --- Hey Uma, thanks for taking a look! I may not understand your proposal entirely, but I found it pretty complex to interrupt while not holding the lock (see the patch in HDFS-3655 for the general idea). The core issue is that more recovery threads can keep coming in, so even if we interrupt the current old writer, by the time we re-get the FSD lock to rbw.setWriter to ourselves, some other recovery thread might have again come in and we need to interrupt them too. Repeating the stopWriter requires re-doing the precondition checks in the three places we call stopWriter, each of which have different preconditions. Would love if a simpler or better solution is present though, so please let me know if I missed something. Deadlock in pipeline recovery - Key: HDFS-4851 URL: https://issues.apache.org/jira/browse/HDFS-4851 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0, 2.0.4-alpha Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-4851-1.patch Here's a deadlock scenario that cropped up during pipeline recovery, debugged through jstacks. Todd tipped me off to this one. # Pipeline fails, client initiates recovery. We have the old leftover DataXceiver, and a new one doing recovery. # New DataXceiver does {{recoverRbw}}, grabbing the {{FsDatasetImpl}} lock # Old DataXceiver is in {{BlockReceiver#computePartialChunkCrc}}, calls {{FsDatasetImpl#getTmpInputStreams}} and blocks on the {{FsDatasetImpl}} lock. # New DataXceiver {{ReplicaInPipeline#stopWriter}}, interrupting the old DataXceiver and then joining on it. # Boom, deadlock. New DX holds the {{FsDatasetImpl}} lock and is joining on the old DX, which is in turn waiting on the {{FsDatasetImpl}} lock. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2856) Fix block protocol so that Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697123#comment-13697123 ] Chris Nauroth commented on HDFS-2856: - {quote} Will multiplexing help with this jira's use case? {quote} My comment referred to the fact that block-level operations, like readBlock and writeBlock, require a unique authorization decision per block, using a different block access token for each one. If multiple readBlock/writeBlock calls were pipelined over a single connection, then we'd need to check authorization on each one. If authorization for DataTransferProtocol is moving fully to SASL, then this implies to me that we would need to renegotiate SASL at the start of each block-level operation. I don't see a way for multiplexing to help with this problem, because there would still be the problem that we don't know what block the client requested until we start inspecting the front of the message. I haven't followed the RPCv9 changes closely though, so if I'm misunderstanding, please let me know. Thanks, Daryn. Fix block protocol so that Datanodes don't require root or jsvc --- Key: HDFS-2856 URL: https://issues.apache.org/jira/browse/HDFS-2856 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, security Reporter: Owen O'Malley Assignee: Chris Nauroth Attachments: Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, Datanode-Security-Design.pdf Since we send the block tokens unencrypted to the datanode, we currently start the datanode as root using jsvc and get a secure (< 1024) port. If we have the datanode generate a nonce and send it on the connection, and the client sends an hmac of the nonce back instead of the block token, it won't reveal any secrets. Thus, we wouldn't require a secure port and would not require root or jsvc. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4888) Refactor and fix FSNamesystem.getTurnOffTip to sanity
[ https://issues.apache.org/jira/browse/HDFS-4888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697134#comment-13697134 ] Hudson commented on HDFS-4888: -- Integrated in Hadoop-trunk-Commit #4025 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4025/]) HDFS-4888. Refactor and fix FSNamesystem.getTurnOffTip. Contributed by Ravi Prakash. (Revision 1498665) Result = SUCCESS kihwal : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1498665 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestSafeMode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestHASafeMode.java Refactor and fix FSNamesystem.getTurnOffTip to sanity - Key: HDFS-4888 URL: https://issues.apache.org/jira/browse/HDFS-4888 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.9 Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: HDFS-4888.patch, HDFS-4888.patch, HDFS-4888.patch e.g. When resources are low, the command to leave safe mode is not printed. This method is unnecessarily complex -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4888) Refactor and fix FSNamesystem.getTurnOffTip to sanity
[ https://issues.apache.org/jira/browse/HDFS-4888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-4888: - Resolution: Fixed Fix Version/s: 2.1.0-beta 3.0.0 Status: Resolved (was: Patch Available) Refactor and fix FSNamesystem.getTurnOffTip to sanity - Key: HDFS-4888 URL: https://issues.apache.org/jira/browse/HDFS-4888 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.9 Reporter: Ravi Prakash Assignee: Ravi Prakash Fix For: 3.0.0, 2.1.0-beta Attachments: HDFS-4888.patch, HDFS-4888.patch, HDFS-4888.patch e.g. When resources are low, the command to leave safe mode is not printed. This method is unnecessarily complex -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4888) Refactor and fix FSNamesystem.getTurnOffTip to sanity
[ https://issues.apache.org/jira/browse/HDFS-4888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697150#comment-13697150 ] Kihwal Lee commented on HDFS-4888: -- Thanks for working on the fix, Ravi. I've committed this to trunk, branch-2 and branch-2.1-beta. Refactor and fix FSNamesystem.getTurnOffTip to sanity - Key: HDFS-4888 URL: https://issues.apache.org/jira/browse/HDFS-4888 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.9 Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: HDFS-4888.patch, HDFS-4888.patch, HDFS-4888.patch e.g. When resources are low, the command to leave safe mode is not printed. This method is unnecessarily complex -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4905) Add appendToFile command to hdfs dfs
[ https://issues.apache.org/jira/browse/HDFS-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697180#comment-13697180 ] Arpit Agarwal commented on HDFS-4905: - Thanks for the great feedback Chris, all reasonable points. I'll post an updated patch soon. Add appendToFile command to hdfs dfs -- Key: HDFS-4905 URL: https://issues.apache.org/jira/browse/HDFS-4905 Project: Hadoop HDFS Issue Type: Bug Components: tools Affects Versions: 3.0.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Priority: Minor Attachments: HDFS-4905.patch A hdfs dfs -appendToFile... option would be quite useful for quick testing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4940) namenode OOMs under Bigtop's TestCLI
[ https://issues.apache.org/jira/browse/HDFS-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697268#comment-13697268 ] Suresh Srinivas commented on HDFS-4940: --- bq. Have you checked out the heap dump? Yes. I agree with your assessment based on that alone. bq. I'm still not sure how the test is causing this problem It would be good to get Bigtop results for this to understand what causes this. namenode OOMs under Bigtop's TestCLI Key: HDFS-4940 URL: https://issues.apache.org/jira/browse/HDFS-4940 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Roman Shaposhnik Priority: Blocker Fix For: 2.1.0-beta Bigtop's TestCLI, when executed against Hadoop 2.1.0, seems to make it OOM quite reliably regardless of the heap size settings. I'm attaching a heap dump URL. Alternatively, anybody can just take Bigtop's tests, compile them against Hadoop 2.1.0 bits, and try to reproduce it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4762) Provide HDFS based NFSv3 and Mountd implementation
[ https://issues.apache.org/jira/browse/HDFS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-4762: - Attachment: HDFS-4762.patch.5 Provide HDFS based NFSv3 and Mountd implementation -- Key: HDFS-4762 URL: https://issues.apache.org/jira/browse/HDFS-4762 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-4762.patch, HDFS-4762.patch.2, HDFS-4762.patch.3, HDFS-4762.patch.3, HDFS-4762.patch.4, HDFS-4762.patch.5 This is to track the implementation of NFSv3 to HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4762) Provide HDFS based NFSv3 and Mountd implementation
[ https://issues.apache.org/jira/browse/HDFS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697276#comment-13697276 ] Brandon Li commented on HDFS-4762: -- {quote}It seems that TestOffsetRange is incorrect: r2 and r4 have overlap but the compareTo(..) method does not allow it.{quote} Updated the patch to handle overlap outside OffsetRange class. Provide HDFS based NFSv3 and Mountd implementation -- Key: HDFS-4762 URL: https://issues.apache.org/jira/browse/HDFS-4762 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-4762.patch, HDFS-4762.patch.2, HDFS-4762.patch.3, HDFS-4762.patch.3, HDFS-4762.patch.4, HDFS-4762.patch.5 This is to track the implementation of NFSv3 to HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4762) Provide HDFS based NFSv3 and Mountd implementation
[ https://issues.apache.org/jira/browse/HDFS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697286#comment-13697286 ] Hadoop QA commented on HDFS-4762: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12590316/HDFS-4762.patch.5 against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-nfs hadoop-hdfs-project/hadoop-hdfs-nfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4582//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4582//console This message is automatically generated. Provide HDFS based NFSv3 and Mountd implementation -- Key: HDFS-4762 URL: https://issues.apache.org/jira/browse/HDFS-4762 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-4762.patch, HDFS-4762.patch.2, HDFS-4762.patch.3, HDFS-4762.patch.3, HDFS-4762.patch.4, HDFS-4762.patch.5 This is to track the implementation of NFSv3 to HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4943) WebHdfsFileSystem does not work when original file path has encoded chars
[ https://issues.apache.org/jira/browse/HDFS-4943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697369#comment-13697369 ] Jerry He commented on HDFS-4943: Attached v2 with unit test. WebHdfsFileSystem does not work when original file path has encoded chars -- Key: HDFS-4943 URL: https://issues.apache.org/jira/browse/HDFS-4943 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 1.2.0, 1.1.2, 2.0.4-alpha Reporter: Jerry He Priority: Minor Fix For: 2.1.0-beta Attachments: HDFS-4943-trunk.patch, HDFS-4943-trunk-v2.patch In HBase, the WAL (hlog) file name on hdfs is URL encoded. For example, hdtest010%2C60020%2C1371000602151.1371058984668 When we use webhdfs client to access the hlog file via httpfs, it does not work in this case. $ hadoop fs -ls hdfs:///user/biadmin/hbase_hlogs Found 1 items -rw-r--r-- 3 biadmin supergroup 15049470 2013-06-12 10:45 /user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 $ hadoop fs -ls hdfs:///user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 Found 1 items -rw-r--r-- 3 biadmin supergroup 15049470 2013-06-12 10:45 /user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 $ hadoop fs -ls webhdfs://hdtest010:14000/user/biadmin/hbase_hlogs Found 1 items -rw-r--r-- 3 biadmin supergroup 15049470 2013-06-12 10:45 /user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 $ $ hadoop fs -ls webhdfs://hdtest010:14000/user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 13/06/27 18:36:08 DEBUG web.WebHdfsFileSystem: Original exception is org.apache.hadoop.ipc.RemoteException:java.io.FileNotFoundException:File does not exist: /user/biadmin/hbase_hlogs/hdtest010,60020,1371000602151.1371058984668 at org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:114) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:299) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$500(WebHdfsFileSystem.java:104) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$Runner.getResponse(WebHdfsFileSystem.java:641) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$Runner.run(WebHdfsFileSystem.java:538) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:468) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:662) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:673) at org.apache.hadoop.fs.FileSystem.getFileStatus(FileSystem.java:1365) at org.apache.hadoop.fs.FileSystem.globStatusInternal(FileSystem.java:1048) at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:987) at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:965) at org.apache.hadoop.fs.FsShell.ls(FsShell.java:573) at org.apache.hadoop.fs.FsShell.doall(FsShell.java:1571) at org.apache.hadoop.fs.FsShell.run(FsShell.java:1789) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79) at org.apache.hadoop.fs.FsShell.main(FsShell.java:1895) ls: Cannot access webhdfs://hdtest010:14000/user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668: No such file or directory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4943) WebHdfsFileSystem does not work when original file path has encoded chars
[ https://issues.apache.org/jira/browse/HDFS-4943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerry He updated HDFS-4943: --- Attachment: HDFS-4943-trunk-v2.patch WebHdfsFileSystem does not work when original file path has encoded chars -- Key: HDFS-4943 URL: https://issues.apache.org/jira/browse/HDFS-4943 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 1.2.0, 1.1.2, 2.0.4-alpha Reporter: Jerry He Priority: Minor Fix For: 2.1.0-beta Attachments: HDFS-4943-trunk.patch, HDFS-4943-trunk-v2.patch In HBase, the WAL (hlog) file name on hdfs is URL encoded. For example, hdtest010%2C60020%2C1371000602151.1371058984668 When we use webhdfs client to access the hlog file via httpfs, it does not work in this case. $ hadoop fs -ls hdfs:///user/biadmin/hbase_hlogs Found 1 items -rw-r--r-- 3 biadmin supergroup 15049470 2013-06-12 10:45 /user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 $ hadoop fs -ls hdfs:///user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 Found 1 items -rw-r--r-- 3 biadmin supergroup 15049470 2013-06-12 10:45 /user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 $ hadoop fs -ls webhdfs://hdtest010:14000/user/biadmin/hbase_hlogs Found 1 items -rw-r--r-- 3 biadmin supergroup 15049470 2013-06-12 10:45 /user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 $ $ hadoop fs -ls webhdfs://hdtest010:14000/user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 13/06/27 18:36:08 DEBUG web.WebHdfsFileSystem: Original exception is org.apache.hadoop.ipc.RemoteException:java.io.FileNotFoundException:File does not exist: /user/biadmin/hbase_hlogs/hdtest010,60020,1371000602151.1371058984668 at org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:114) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:299) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$500(WebHdfsFileSystem.java:104) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$Runner.getResponse(WebHdfsFileSystem.java:641) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$Runner.run(WebHdfsFileSystem.java:538) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:468) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:662) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:673) at org.apache.hadoop.fs.FileSystem.getFileStatus(FileSystem.java:1365) at org.apache.hadoop.fs.FileSystem.globStatusInternal(FileSystem.java:1048) at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:987) at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:965) at org.apache.hadoop.fs.FsShell.ls(FsShell.java:573) at org.apache.hadoop.fs.FsShell.doall(FsShell.java:1571) at org.apache.hadoop.fs.FsShell.run(FsShell.java:1789) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79) at org.apache.hadoop.fs.FsShell.main(FsShell.java:1895) ls: Cannot access webhdfs://hdtest010:14000/user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668: No such file or directory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
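To see why a literal %2C in a file name trips up webhdfs access, the percent sign itself has to be re-encoded when the path is embedded in a URL; otherwise the server decodes %2C back to a comma and looks up the wrong name, as in the FileNotFoundException above. A small illustration using the standard java.net.URLEncoder (this is only a demonstration of the encoding issue, not the actual WebHdfsFileSystem fix):

{code}
import java.net.URLEncoder;

public class EncodedPathSketch {
  public static void main(String[] args) throws Exception {
    // The HBase WAL file name already contains percent-encoded commas.
    String fileName = "hdtest010%2C60020%2C1371000602151.1371058984668";

    // Embedding it verbatim lets the server decode %2C to ',' and miss the file.
    System.out.println("naive  : /webhdfs/v1/user/biadmin/hbase_hlogs/" + fileName);

    // Re-encoding the '%' (to %25) preserves the original name after one
    // server-side decode: %252C decodes back to %2C.
    String reEncoded = URLEncoder.encode(fileName, "UTF-8");
    System.out.println("encoded: /webhdfs/v1/user/biadmin/hbase_hlogs/" + reEncoded);
  }
}
{code}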
[jira] [Commented] (HDFS-4940) namenode OOMs under Bigtop's TestCLI
[ https://issues.apache.org/jira/browse/HDFS-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697371#comment-13697371 ] Roman Shaposhnik commented on HDFS-4940: HADOOP-9676 now allows isolating this down to a few tests. I'll keep you guys posted. namenode OOMs under Bigtop's TestCLI Key: HDFS-4940 URL: https://issues.apache.org/jira/browse/HDFS-4940 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Roman Shaposhnik Priority: Blocker Fix For: 2.1.0-beta Bigtop's TestCLI, when executed against Hadoop 2.1.0, seems to make it OOM quite reliably regardless of the heap size settings. I'm attaching a heap dump URL. Alternatively, anybody can just take Bigtop's tests, compile them against Hadoop 2.1.0 bits, and try to reproduce it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4943) WebHdfsFileSystem does not work when original file path has encoded chars
[ https://issues.apache.org/jira/browse/HDFS-4943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697426#comment-13697426 ] Hadoop QA commented on HDFS-4943: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12590344/HDFS-4943-trunk-v2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4583//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4583//console This message is automatically generated. WebHdfsFileSystem does not work when original file path has encoded chars -- Key: HDFS-4943 URL: https://issues.apache.org/jira/browse/HDFS-4943 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 1.2.0, 1.1.2, 2.0.4-alpha Reporter: Jerry He Priority: Minor Fix For: 2.1.0-beta Attachments: HDFS-4943-trunk.patch, HDFS-4943-trunk-v2.patch In HBase, the WAL (hlog) file name on hdfs is URL encoded. For example, hdtest010%2C60020%2C1371000602151.1371058984668 When we use webhdfs client to access the hlog file via httpfs, it does not work in this case. 
$ hadoop fs -ls hdfs:///user/biadmin/hbase_hlogs Found 1 items -rw-r--r-- 3 biadmin supergroup 15049470 2013-06-12 10:45 /user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 $ hadoop fs -ls hdfs:///user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 Found 1 items -rw-r--r-- 3 biadmin supergroup 15049470 2013-06-12 10:45 /user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 $ hadoop fs -ls webhdfs://hdtest010:14000/user/biadmin/hbase_hlogs Found 1 items -rw-r--r-- 3 biadmin supergroup 15049470 2013-06-12 10:45 /user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 $ $ hadoop fs -ls webhdfs://hdtest010:14000/user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 13/06/27 18:36:08 DEBUG web.WebHdfsFileSystem: Original exception is org.apache.hadoop.ipc.RemoteException:java.io.FileNotFoundException:File does not exist: /user/biadmin/hbase_hlogs/hdtest010,60020,1371000602151.1371058984668 at org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:114) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:299) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$500(WebHdfsFileSystem.java:104) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$Runner.getResponse(WebHdfsFileSystem.java:641) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$Runner.run(WebHdfsFileSystem.java:538) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:468) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:662) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:673) at org.apache.hadoop.fs.FileSystem.getFileStatus(FileSystem.java:1365) at org.apache.hadoop.fs.FileSystem.globStatusInternal(FileSystem.java:1048) at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:987) at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:965) at org.apache.hadoop.fs.FsShell.ls(FsShell.java:573) at org.apache.hadoop.fs.FsShell.doall(FsShell.java:1571) at org.apache.hadoop.fs.FsShell.run(FsShell.java:1789) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79) at org.apache.hadoop.fs.FsShell.main(FsShell.java:1895) ls: Cannot access webhdfs://hdtest010:14000/user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668: No such file or directory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4762) Provide HDFS based NFSv3 and Mountd implementation
[ https://issues.apache.org/jira/browse/HDFS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697469#comment-13697469 ] Tsz Wo (Nicholas), SZE commented on HDFS-4762: -- - OffsetRange.compareTo(..) only compares min. It considers two ranges equal if they have the same min but different max values. However, OpenFileCtx.pendingWrites uses OffsetRange as a key type. Is it possible to have two writes with the same min? In such a case, the writes will be considered equal and checkRepeatedWriteRequest(..) may be incorrect. If it is possible to have two writes with the same min, I suggest also comparing the max, i.e. {code} private static int compareTo(long left, long right) { if (left < right) { return -1; } else if (left > right) { return 1; } else { return 0; } } @Override public int compareTo(OffsetRange other) { final int d = compareTo(min, other.getMin()); return d != 0? d: compareTo(max, other.getMax()); } {code} BTW, the comment above OffsetRange.compareTo(..) is invalid. - In OpenFileCtx.checkDump(..), {code} try { if (dumpFile.exists()) { throw new RuntimeException("The dump file should not exist: " + dumpFilePath); } dumpOut = new FileOutputStream(dumpFile); if (dumpFile.createNewFile()) { LOG.error("Can't create dump file: " + dumpFilePath); } } catch (IOException e) { LOG.error("Got failure when creating dump stream " + dumpFilePath + " with error: " + e); enabledDump = false; if (dumpOut != null) { try { dumpOut.close(); } catch (IOException e1) { LOG.error("Can't close dump stream " + dumpFilePath + " with error: " + e); } } return; } {code} -* The second if-statement should be if (!dumpFile.createNewFile()). Also, createNewFile() ensures that the file does not exist. So the first if-statement may not be needed. -* Use IOUtils.cleanup(LOG, dumpOut); to close dumpOut. -* Is it okay to return when there is an exception? Should it re-throw the exception? - WriteManager.shutdownAsyncDataService() is not used. Provide HDFS based NFSv3 and Mountd implementation -- Key: HDFS-4762 URL: https://issues.apache.org/jira/browse/HDFS-4762 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-4762.patch, HDFS-4762.patch.2, HDFS-4762.patch.3, HDFS-4762.patch.3, HDFS-4762.patch.4, HDFS-4762.patch.5 This is to track the implementation of NFSv3 to HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4762) Provide HDFS based NFSv3 and Mountd implementation
[ https://issues.apache.org/jira/browse/HDFS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697488#comment-13697488 ] Tsz Wo (Nicholas), SZE commented on HDFS-4762: -- - OpenFileCtx.getPendingWrites() is not needed since it is only used inside OpenFileCtx. - offset should not be cast to an int below. {code} // OpenFileCtx.processPerfectOverWrite(..) readCount = fis.read((int) offset, readbuffer, 0, count); {code} -* Use IOUtils.cleanup(..) to close fis. Provide HDFS based NFSv3 and Mountd implementation -- Key: HDFS-4762 URL: https://issues.apache.org/jira/browse/HDFS-4762 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-4762.patch, HDFS-4762.patch.2, HDFS-4762.patch.3, HDFS-4762.patch.3, HDFS-4762.patch.4, HDFS-4762.patch.5 This is to track the implementation of NFSv3 to HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4372) Track NameNode startup progress
[ https://issues.apache.org/jira/browse/HDFS-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-4372: Attachment: HDFS-4372.4.patch Thanks again, Jing. Here is version 4 of the patch to address your feedback. {quote} 1. For FSImageFormat#save, do we also need to change the counter value when we finish the saving process (the same thing done to the counter in loading)? {quote} Yes, you're right. The new patch has additional changes to fix this in {{FSImageFormat#Saver#save}} and {{FSImageFormat#Saver#saveINode2Image}}. {quote} 2. For FSImageFormat#loadXXX(), looks like the parameter step has not been used. Can we remove it and only add a new parameter counter? {quote} Yes, I had forgotten to clean this up. The new patch fixes this for both the load methods and the save methods. Track NameNode startup progress --- Key: HDFS-4372 URL: https://issues.apache.org/jira/browse/HDFS-4372 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 3.0.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-4372.1.patch, HDFS-4372.2.patch, HDFS-4372.3.patch, HDFS-4372.4.patch Track detailed progress information about the steps of NameNode startup to enable display to users. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4940) namenode OOMs under Bigtop's TestCLI
[ https://issues.apache.org/jira/browse/HDFS-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697533#comment-13697533 ] Suresh Srinivas commented on HDFS-4940: --- [~rvs] Thanks for the update. I am really interested in seeing what is causing the memory growth. Namenode logs with HADOOP-9676 will help understand the issue. namenode OOMs under Bigtop's TestCLI Key: HDFS-4940 URL: https://issues.apache.org/jira/browse/HDFS-4940 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Roman Shaposhnik Priority: Blocker Fix For: 2.1.0-beta Bigtop's TestCLI, when executed against Hadoop 2.1.0, seems to make it OOM quite reliably regardless of the heap size settings. I'm attaching a heap dump URL. Alternatively, anybody can just take Bigtop's tests, compile them against Hadoop 2.1.0 bits, and try to reproduce it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira