[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB

2014-03-12 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6097:
---

Status: Patch Available  (was: Open)

> zero-copy reads are incorrectly disabled on file offsets above 2GB
> --
>
> Key: HDFS-6097
> URL: https://issues.apache.org/jira/browse/HDFS-6097
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.4.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-6097.003.patch
>
>
> Zero-copy reads are incorrectly disabled on file offsets above 2GB. The check
> that is supposed to disable zero-copy reads only when the offset within a
> block file exceeds 2GB (because MappedByteBuffer segments are limited to that
> size) is mistakenly applied to the file offset instead.
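
For reference, here is a minimal standalone sketch of the JDK constraint behind
that check (the file path and sizes are made up for the demonstration; this is
not the HDFS client code): a single mapped region cannot exceed
Integer.MAX_VALUE bytes, so the limit has to be applied per block-file offset.

{code}
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;

public class MmapLimitSketch {
  public static void main(String[] args) throws Exception {
    try (RandomAccessFile raf = new RandomAccessFile("/tmp/mmap-limit-demo", "rw");
         FileChannel channel = raf.getChannel()) {
      long threeGb = 3L * 1024 * 1024 * 1024;
      raf.setLength(threeGb);  // sparse file; takes no real disk space on most filesystems
      // Throws IllegalArgumentException: a single MappedByteBuffer is capped at
      // Integer.MAX_VALUE (~2GB) bytes, regardless of how large the file is.
      channel.map(FileChannel.MapMode.READ_ONLY, 0L, threeGb);
    }
  }
}
{code}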



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6010) Make balancer able to balance data among specified servers

2014-03-12 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932891#comment-13932891
 ] 

Yu Li commented on HDFS-6010:
-

{quote}
You know how things work when there are deadlines to meet
{quote}
Totally understand, no problem :-)

{quote}
1. How would you maintain the mapping of files to groups?
{quote}
We don't maintain the mapping in HDFS; we reuse the regionserver group
information. In other words, in our use case this is used along with the
regionserver group feature: the admin gets the RS group information through an
hbase shell command and passes the server list to the balancer. To make it
easier, we actually wrote a simple script that drives the whole process, so the
admin only needs to enter an RS group name to balance the data. For more
details, please refer to the answer to question #4.
\\
{quote}
wondering whether it makes sense to have the tool take paths for balancing as 
opposed to servers
{quote}
In our hbase use case this is OK, but I think it might be better to make the
tool more general. There might be other scenarios that require balancing data
among a subset of the datanodes rather than the full set, although I cannot
name one for now. :-)

{quote}
2. Are these mappings set up by some admin?
{quote}
Yes, as described in the comments above.

{quote}
3. Would you expand a group when it is nearing capacity?
{quote}
Yes, we can change the settings of an RS group, for example moving one RS from
groupA to groupB; we would then use the HDFS-6012 tool to move blocks so as to
preserve "group-block-locality". We'll come back to this topic in the answer to
question #5.

{quote}
4. How does someone like HBase use this? Is HBase going to have visibility into 
the mappings as well (to take care of HBASE-6721 and favored-nodes for writes)?
{quote}
Yes. Through HBASE-6721 (we have actually made quite a few improvements to it
to make it simpler and more suitable for our production environment, but that's
another topic and won't be discussed here :-)) we can group regionservers to
provide multi-tenant service: one application uses one RS group (the regions of
all of that application's tables are served only by the RS in its own group)
and writes data to the mapped DNs through the favored-node feature. To be more
specific, it is an "app-regionserverGroup-datanodeGroup" mapping; all hfiles of
one application's tables are located only on the DNs of its RS group.

{quote}
5. Would you need a higher level balancer for keeping the whole cluster 
balanced (do migrations of blocks associated with certain paths from one group 
to another)? Otherwise, there would be skews in the block distribution. 
{quote}
You have really got the point here :-) Actually, the biggest downside of this
solution for IO isolation is that it causes data imbalance when viewed across
the whole HDFS cluster. In our use case, we recommend that the admin not run
the balancer over all DNs. Instead, as mentioned in the answer to question #3,
if we find one group with high disk usage while another group is relatively
"empty", the admin can reconfigure the groups to move an RS/DN server around.
The HDFS-6010 tool plus the HDFS-6012 tool make this work.

{quote}
6. When there is a failure of a datanode in a group, how would you choose which 
datanodes to replicate the blocks to. The choice would be somewhat important 
given that some target datanodes might be busy serving requests
{quote}
Currently we don't control re-replication after a datanode failure; we use the
HDFS default policy. So the only impact a datanode failure has on isolation is
that blocks might be re-replicated outside the group, which is why we need the
HDFS-6012 tool to periodically check for and move "cross-group" blocks back.

[~devaraj] I hope the above answers your questions; feel free to let me know if
you have any further comments. :-)

> Make balancer able to balance data among specified servers
> --
>
> Key: HDFS-6010
> URL: https://issues.apache.org/jira/browse/HDFS-6010
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer
>Affects Versions: 2.3.0
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Minor
> Attachments: HDFS-6010-trunk.patch
>
>
> Currently, the balancer tool balances data among all datanodes. However, in
> some particular cases we need to balance data only among a specified set of
> nodes instead of the whole cluster.
> In this JIRA, a new "-servers" option is introduced to implement this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-12 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932797#comment-13932797
 ] 

haosdent commented on HDFS-6092:


[~te...@apache.org] Your patch looks good, but I think my approach may be
clearer. I also added test cases in my patch (haosdent-HDFS-6092-v2.patch).
Maybe this sounds rude, but let's pick the better approach to avoid confusion
here. Adding some comments here may also be necessary. :-P

> DistributedFileSystem#getCanonicalServiceName() and 
> DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
> --
>
> Key: HDFS-6092
> URL: https://issues.apache.org/jira/browse/HDFS-6092
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Ted Yu
> Attachments: haosdent-HDFS-6092-v2.patch, haosdent-HDFS-6092.patch, 
> hdfs-6092-v1.txt, hdfs-6092-v2.txt, hdfs-6092-v3.txt
>
>
> I discovered this when working on HBASE-10717
> Here is sample code to reproduce the problem:
> {code}
> Path desPath = new Path("hdfs://127.0.0.1/");
> FileSystem desFs = desPath.getFileSystem(conf);
> 
> String s = desFs.getCanonicalServiceName();
> URI uri = desFs.getUri();
> {code}
> The canonical service name string contains the default port (8020),
> but the URI doesn't contain a port.
> This would result in the following exception:
> {code}
> testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
> 0.001 sec  <<< ERROR!
> java.lang.IllegalArgumentException: port out of range:-1
> at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
> at java.net.InetSocketAddress.(InetSocketAddress.java:224)
> at 
> org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
> {code}
> Thanks to Brandon Li, who helped debug this.
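
As an illustration of the kind of guard a caller needs (this is not the
attached patch; the class name below is hypothetical), falling back to the
NameNode default port when the URI omits it avoids passing -1 to
InetSocketAddress:

{code}
import java.net.InetSocketAddress;
import java.net.URI;

public class PortGuardSketch {
  private static final int NN_DEFAULT_PORT = 8020;

  // Use the URI's port when present, otherwise fall back to the default.
  static InetSocketAddress toAddress(URI uri) {
    int port = uri.getPort() == -1 ? NN_DEFAULT_PORT : uri.getPort();
    return new InetSocketAddress(uri.getHost(), port);
  }

  public static void main(String[] args) {
    System.out.println(toAddress(URI.create("hdfs://127.0.0.1/")));      // falls back to 8020
    System.out.println(toAddress(URI.create("hdfs://127.0.0.1:9000/"))); // keeps 9000
  }
}
{code}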



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB

2014-03-12 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6097:
---

Attachment: HDFS-6097.003.patch

fix log message

> zero-copy reads are incorrectly disabled on file offsets above 2GB
> --
>
> Key: HDFS-6097
> URL: https://issues.apache.org/jira/browse/HDFS-6097
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.4.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-6097.003.patch
>
>
> Zero-copy reads are incorrectly disabled on file offsets above 2GB. The check
> that is supposed to disable zero-copy reads only when the offset within a
> block file exceeds 2GB (because MappedByteBuffer segments are limited to that
> size) is mistakenly applied to the file offset instead.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB

2014-03-12 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6097:
---

Attachment: (was: HDFS-6097.002.patch)

> zero-copy reads are incorrectly disabled on file offsets above 2GB
> --
>
> Key: HDFS-6097
> URL: https://issues.apache.org/jira/browse/HDFS-6097
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.4.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-6097.003.patch
>
>
> Zero-copy reads are incorrectly disabled on file offsets above 2GB. The check
> that is supposed to disable zero-copy reads only when the offset within a
> block file exceeds 2GB (because MappedByteBuffer segments are limited to that
> size) is mistakenly applied to the file offset instead.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB

2014-03-12 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6097:
---

Attachment: (was: HDFS-6097.001.patch)

> zero-copy reads are incorrectly disabled on file offsets above 2GB
> --
>
> Key: HDFS-6097
> URL: https://issues.apache.org/jira/browse/HDFS-6097
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.4.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-6097.002.patch
>
>
> Zero-copy reads are incorrectly disabled on file offsets above 2GB. The check
> that is supposed to disable zero-copy reads only when the offset within a
> block file exceeds 2GB (because MappedByteBuffer segments are limited to that
> size) is mistakenly applied to the file offset instead.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB

2014-03-12 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6097:
---

Attachment: HDFS-6097.002.patch

fix typo

> zero-copy reads are incorrectly disabled on file offsets above 2GB
> --
>
> Key: HDFS-6097
> URL: https://issues.apache.org/jira/browse/HDFS-6097
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.4.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-6097.002.patch
>
>
> Zero-copy reads are incorrectly disabled on file offsets above 2GB. The check
> that is supposed to disable zero-copy reads only when the offset within a
> block file exceeds 2GB (because MappedByteBuffer segments are limited to that
> size) is mistakenly applied to the file offset instead.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB

2014-03-12 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6097:
---

Attachment: HDFS-6097.001.patch

> zero-copy reads are incorrectly disabled on file offsets above 2GB
> --
>
> Key: HDFS-6097
> URL: https://issues.apache.org/jira/browse/HDFS-6097
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.4.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-6097.001.patch
>
>
> Zero-copy reads are incorrectly disabled on file offsets above 2GB. The check
> that is supposed to disable zero-copy reads only when the offset within a
> block file exceeds 2GB (because MappedByteBuffer segments are limited to that
> size) is mistakenly applied to the file offset instead.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6101) TestReplaceDatanodeOnFailure fails occasionally

2014-03-12 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932722#comment-13932722
 ] 

Arpit Agarwal commented on HDFS-6101:
-

Exception details:
{code}
testReplaceDatanodeOnFailure(org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure)  Time elapsed: 25.176 sec  <<< FAILURE!
java.lang.AssertionError: expected:<3> but was:<2>
  at org.junit.Assert.fail(Assert.java:93)
  at org.junit.Assert.failNotEquals(Assert.java:647)
  at org.junit.Assert.assertEquals(Assert.java:128)
  at org.junit.Assert.assertEquals(Assert.java:472)
  at org.junit.Assert.assertEquals(Assert.java:456)
  at org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure$SlowWriter.checkReplication(TestReplaceDatanodeOnFailure.java:234)
  at org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure.testReplaceDatanodeOnFailure(TestReplaceDatanodeOnFailure.java:153)
{code}

> TestReplaceDatanodeOnFailure fails occasionally
> ---
>
> Key: HDFS-6101
> URL: https://issues.apache.org/jira/browse/HDFS-6101
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Arpit Agarwal
> Attachments: TestReplaceDatanodeOnFailure.log
>
>
> Exception details in a comment below.
> The failure repros on both OS X and Linux if I run the test ~10 times in a 
> loop.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6101) TestReplaceDatanodeOnFailure fails occasionally

2014-03-12 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6101:


Description: 
Exception details in a comment below.

The failure repros on both OS X and Linux if I run the test ~10 times in a loop.

  was:
Exception details:

{code}
testReplaceDatanodeOnFailure(org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure)  Time elapsed: 25.176 sec  <<< FAILURE!
java.lang.AssertionError: expected:<3> but was:<2>
  at org.junit.Assert.fail(Assert.java:93)
  at org.junit.Assert.failNotEquals(Assert.java:647)
  at org.junit.Assert.assertEquals(Assert.java:128)
  at org.junit.Assert.assertEquals(Assert.java:472)
  at org.junit.Assert.assertEquals(Assert.java:456)
  at org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure$SlowWriter.checkReplication(TestReplaceDatanodeOnFailure.java:234)
  at org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure.testReplaceDatanodeOnFailure(TestReplaceDatanodeOnFailure.java:153)
{code}


> TestReplaceDatanodeOnFailure fails occasionally
> ---
>
> Key: HDFS-6101
> URL: https://issues.apache.org/jira/browse/HDFS-6101
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Arpit Agarwal
> Attachments: TestReplaceDatanodeOnFailure.log
>
>
> Exception details in a comment below.
> The failure repros on both OS X and Linux if I run the test ~10 times in a 
> loop.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6101) TestReplaceDatanodeOnFailure fails occasionally

2014-03-12 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDFS-6101:
---

 Summary: TestReplaceDatanodeOnFailure fails occasionally
 Key: HDFS-6101
 URL: https://issues.apache.org/jira/browse/HDFS-6101
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Arpit Agarwal
 Attachments: TestReplaceDatanodeOnFailure.log

Exception details:

{code}
testReplaceDatanodeOnFailure(org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure)  Time elapsed: 25.176 sec  <<< FAILURE!
java.lang.AssertionError: expected:<3> but was:<2>
  at org.junit.Assert.fail(Assert.java:93)
  at org.junit.Assert.failNotEquals(Assert.java:647)
  at org.junit.Assert.assertEquals(Assert.java:128)
  at org.junit.Assert.assertEquals(Assert.java:472)
  at org.junit.Assert.assertEquals(Assert.java:456)
  at org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure$SlowWriter.checkReplication(TestReplaceDatanodeOnFailure.java:234)
  at org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure.testReplaceDatanodeOnFailure(TestReplaceDatanodeOnFailure.java:153)
{code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6101) TestReplaceDatanodeOnFailure fails occasionally

2014-03-12 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6101:


Attachment: TestReplaceDatanodeOnFailure.log

Full log from test run attached.

> TestReplaceDatanodeOnFailure fails occasionally
> ---
>
> Key: HDFS-6101
> URL: https://issues.apache.org/jira/browse/HDFS-6101
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Arpit Agarwal
> Attachments: TestReplaceDatanodeOnFailure.log
>
>
> Exception details:
> {code}
> testReplaceDatanodeOnFailure(org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure)  Time elapsed: 25.176 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<3> but was:<2>
>   at org.junit.Assert.fail(Assert.java:93)
>   at org.junit.Assert.failNotEquals(Assert.java:647)
>   at org.junit.Assert.assertEquals(Assert.java:128)
>   at org.junit.Assert.assertEquals(Assert.java:472)
>   at org.junit.Assert.assertEquals(Assert.java:456)
>   at org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure$SlowWriter.checkReplication(TestReplaceDatanodeOnFailure.java:234)
>   at org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure.testReplaceDatanodeOnFailure(TestReplaceDatanodeOnFailure.java:153)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6100) webhdfs filesystem does not failover in HA mode

2014-03-12 Thread Arpit Gupta (JIRA)
Arpit Gupta created HDFS-6100:
-

 Summary: webhdfs filesystem does not failover in HA mode
 Key: HDFS-6100
 URL: https://issues.apache.org/jira/browse/HDFS-6100
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Haohui Mai


While running slive with a webhdfs file system, reducers fail because they keep
trying to write to the standby namenode.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6096) TestWebHdfsTokens may timeout

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932662#comment-13932662
 ] 

Hudson commented on HDFS-6096:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5315 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5315/])
HDFS-6096. TestWebHdfsTokens may timeout. (Contributed by szetszwo) (arp: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576999)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsTokens.java


> TestWebHdfsTokens may timeout
> -
>
> Key: HDFS-6096
> URL: https://issues.apache.org/jira/browse/HDFS-6096
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 3.0.0, 2.4.0
>
> Attachments: h6096_20140312.patch
>
>
> The timeout of TestWebHdfsTokens is set to 1 second.  It is too short for 
> some machines.
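
For illustration, this is the kind of JUnit per-test timeout involved (a
hypothetical test, not TestWebHdfsTokens itself): a 1000 ms budget is easily
exceeded on a slow or heavily loaded machine, so the test times out even though
nothing is actually broken.

{code}
import org.junit.Test;

public class TimeoutSketch {
  // A 1-second budget for a test that does real RPC or I/O can expire spuriously.
  @Test(timeout = 1000)
  public void slowButCorrect() throws Exception {
    Thread.sleep(1500);  // reported as a test timeout, not a real failure
  }
}
{code}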



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5705) TestSecondaryNameNodeUpgrade#testChangeNsIDFails may fail due to ConcurrentModificationException

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932649#comment-13932649
 ] 

Hudson commented on HDFS-5705:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5314 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5314/])
HDFS-5705. Update CHANGES.txt for merging the original fix (r1555190) to 
branch-2 and branch-2.4. (arp: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576989)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> TestSecondaryNameNodeUpgrade#testChangeNsIDFails may fail due to 
> ConcurrentModificationException
> 
>
> Key: HDFS-5705
> URL: https://issues.apache.org/jira/browse/HDFS-5705
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 3.0.0, 2.4.0
>
> Attachments: hdfs-5705.html, hdfs-5705.txt
>
>
> From 
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/1626/testReport/org.apache.hadoop.hdfs.server.namenode/TestSecondaryNameNodeUpgrade/testChangeNsIDFails/
>  :
> {code}
> java.util.ConcurrentModificationException: null
>   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
>   at java.util.HashMap$EntryIterator.next(HashMap.java:834)
>   at java.util.HashMap$EntryIterator.next(HashMap.java:832)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:218)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1414)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1309)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1464)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1439)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1423)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade.doIt(TestSecondaryNameNodeUpgrade.java:97)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade.testChangeNsIDFails(TestSecondaryNameNodeUpgrade.java:116)
> {code}
> The above happens when shutdown() is called in parallel to addBlockPool() or 
> shutdownBlockPool().



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6096) TestWebHdfsTokens may timeout

2014-03-12 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6096:


      Resolution: Fixed
   Fix Version/s: 2.4.0, 3.0.0
Target Version/s: 2.4.0
    Hadoop Flags: Reviewed
          Status: Resolved  (was: Patch Available)

Thanks Nicholas and Haohui! I committed this to trunk through branch-2.4.

> TestWebHdfsTokens may timeout
> -
>
> Key: HDFS-6096
> URL: https://issues.apache.org/jira/browse/HDFS-6096
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 3.0.0, 2.4.0
>
> Attachments: h6096_20140312.patch
>
>
> The timeout of TestWebHdfsTokens is set to 1 second.  It is too short for 
> some machines.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-5705) TestSecondaryNameNodeUpgrade#testChangeNsIDFails may fail due to ConcurrentModificationException

2014-03-12 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-5705:


Target Version/s: 2.4.0
   Fix Version/s: 2.4.0

Merged to branch-2 and branch-2.4

> TestSecondaryNameNodeUpgrade#testChangeNsIDFails may fail due to 
> ConcurrentModificationException
> 
>
> Key: HDFS-5705
> URL: https://issues.apache.org/jira/browse/HDFS-5705
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 3.0.0, 2.4.0
>
> Attachments: hdfs-5705.html, hdfs-5705.txt
>
>
> From 
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/1626/testReport/org.apache.hadoop.hdfs.server.namenode/TestSecondaryNameNodeUpgrade/testChangeNsIDFails/
>  :
> {code}
> java.util.ConcurrentModificationException: null
>   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
>   at java.util.HashMap$EntryIterator.next(HashMap.java:834)
>   at java.util.HashMap$EntryIterator.next(HashMap.java:832)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:218)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1414)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1309)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1464)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1439)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1423)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade.doIt(TestSecondaryNameNodeUpgrade.java:97)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade.testChangeNsIDFails(TestSecondaryNameNodeUpgrade.java:116)
> {code}
> The above happens when shutdown() is called in parallel to addBlockPool() or 
> shutdownBlockPool().



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-5477) Block manager as a service

2014-03-12 Thread Amir Langer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amir Langer updated HDFS-5477:
--

Attachment: Block Manager as a Service - Implementation decisions.pdf

Attached some more documentation (Block Manager as a Service - Implementation
decisions.pdf) covering the design and implementation decisions behind the
attached patches. This document should complement the RemoteBM.pdf design doc.


> Block manager as a service
> --
>
> Key: HDFS-5477
> URL: https://issues.apache.org/jira/browse/HDFS-5477
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: Block Manager as a Service - Implementation 
> decisions.pdf, Proposal.pdf, Proposal.pdf, Remote BM.pdf, Standalone BM.pdf, 
> Standalone BM.pdf, patches.tar.gz
>
>
> The block manager needs to evolve towards having the ability to run as a 
> standalone service to improve NN vertical and horizontal scalability.  The 
> goal is reducing the memory footprint of the NN proper to support larger 
> namespaces, and improve overall performance by decoupling the block manager 
> from the namespace and its lock.  Ideally, a distinct BM will be transparent 
> to clients and DNs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6096) TestWebHdfsTokens may timeout

2014-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932620#comment-13932620
 ] 

Hadoop QA commented on HDFS-6096:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12634240/h6096_20140312.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6387//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6387//console

This message is automatically generated.

> TestWebHdfsTokens may timeout
> -
>
> Key: HDFS-6096
> URL: https://issues.apache.org/jira/browse/HDFS-6096
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Attachments: h6096_20140312.patch
>
>
> The timeout of TestWebHdfsTokens is set to 1 second.  It is too short for 
> some machines.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6098) ConcurrentModificationException exception during DataNode shutdown

2014-03-12 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6098:


Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Duplicate of HDFS-5075. I'll merge the other Jira down to branch-2.4.

> ConcurrentModificationException exception during DataNode shutdown
> --
>
> Key: HDFS-6098
> URL: https://issues.apache.org/jira/browse/HDFS-6098
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.4.0
>Reporter: Arpit Agarwal
> Attachments: HDFS-6098.01.patch
>
>
> Exception hit during DN shutdown while running 
> {{TestWebHdfsWithMultipleNameNodes}}:
> {code}
> java.util.ConcurrentModificationException: null
>   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
>   at java.util.HashMap$EntryIterator.next(HashMap.java:934)
>   at java.util.HashMap$EntryIterator.next(HashMap.java:932)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:249)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1376)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1301)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1523)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1498)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1482)
>   at 
> org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes.shutdownCluster(TestWebHdfsWithMultipleNameNodes.java:99)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6079) Timeout for getFileBlockStorageLocations does not work

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932603#comment-13932603
 ] 

Hudson commented on HDFS-6079:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5313 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5313/])
HDFS-6079. Timeout for getFileBlockStorageLocations does not work. Contributed 
by Andrew Wang. (wang: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576979)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockStorageLocationUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNodeFaultInjector.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java


> Timeout for getFileBlockStorageLocations does not work
> --
>
> Key: HDFS-6079
> URL: https://issues.apache.org/jira/browse/HDFS-6079
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.3.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Fix For: 2.4.0
>
> Attachments: hdfs-6079-1.patch
>
>
> {{DistributedFileSystem#getFileBlockStorageLocations}} has a config value 
> which lets clients set a timeout, but it's not being enforced correctly.
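
For context, this is roughly how a client exercises the call whose timeout was
not being enforced. It is only a sketch: the config key name and its unit are
recalled from memory and may differ between versions, the path is made up, and
it assumes fs.defaultFS points at an HDFS cluster.

{code}
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.BlockStorageLocation;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class StorageLocationSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Assumed key: bounds how long the client waits for the per-datanode
    // block-metadata RPCs issued by getFileBlockStorageLocations().
    conf.setInt("dfs.client.file-block-storage-locations.timeout", 60);

    DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);
    BlockLocation[] blocks =
        dfs.getFileBlockLocations(new Path("/some/file"), 0, Long.MAX_VALUE);
    // These per-datanode RPCs are what the timeout is supposed to bound.
    BlockStorageLocation[] volumeLocations =
        dfs.getFileBlockStorageLocations(Arrays.asList(blocks));
    System.out.println("Blocks with volume info: " + volumeLocations.length);
  }
}
{code}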



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6079) Timeout for getFileBlockStorageLocations does not work

2014-03-12 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-6079:
--

   Resolution: Fixed
Fix Version/s: 2.4.0
   Status: Resolved  (was: Patch Available)

Thanks ATM, committed this back through branch-2.4.

> Timeout for getFileBlockStorageLocations does not work
> --
>
> Key: HDFS-6079
> URL: https://issues.apache.org/jira/browse/HDFS-6079
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.3.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Fix For: 2.4.0
>
> Attachments: hdfs-6079-1.patch
>
>
> {{DistributedFileSystem#getFileBlockStorageLocations}} has a config value 
> which lets clients set a timeout, but it's not being enforced correctly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6098) ConcurrentModificationException exception during DataNode shutdown

2014-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932599#comment-13932599
 ] 

Hadoop QA commented on HDFS-6098:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12634287/HDFS-6098.01.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6388//console

This message is automatically generated.

> ConcurrentModificationException exception during DataNode shutdown
> --
>
> Key: HDFS-6098
> URL: https://issues.apache.org/jira/browse/HDFS-6098
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.4.0
>Reporter: Arpit Agarwal
> Attachments: HDFS-6098.01.patch
>
>
> Exception hit during DN shutdown while running 
> {{TestWebHdfsWithMultipleNameNodes}}:
> {code}
> java.util.ConcurrentModificationException: null
>   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
>   at java.util.HashMap$EntryIterator.next(HashMap.java:934)
>   at java.util.HashMap$EntryIterator.next(HashMap.java:932)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:249)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1376)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1301)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1523)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1498)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1482)
>   at 
> org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes.shutdownCluster(TestWebHdfsWithMultipleNameNodes.java:99)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6079) Timeout for getFileBlockStorageLocations does not work

2014-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932580#comment-13932580
 ] 

Hadoop QA commented on HDFS-6079:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12634239/hdfs-6079-1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6386//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6386//console

This message is automatically generated.

> Timeout for getFileBlockStorageLocations does not work
> --
>
> Key: HDFS-6079
> URL: https://issues.apache.org/jira/browse/HDFS-6079
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.3.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hdfs-6079-1.patch
>
>
> {{DistributedFileSystem#getFileBlockStorageLocations}} has a config value 
> which lets clients set a timeout, but it's not being enforced correctly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6098) ConcurrentModificationException exception during DataNode shutdown

2014-03-12 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6098:


Status: Patch Available  (was: Open)

> ConcurrentModificationException exception during DataNode shutdown
> --
>
> Key: HDFS-6098
> URL: https://issues.apache.org/jira/browse/HDFS-6098
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.4.0
>Reporter: Arpit Agarwal
> Attachments: HDFS-6098.01.patch
>
>
> Exception hit during DN shutdown while running 
> {{TestWebHdfsWithMultipleNameNodes}}:
> {code}
> java.util.ConcurrentModificationException: null
>   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
>   at java.util.HashMap$EntryIterator.next(HashMap.java:934)
>   at java.util.HashMap$EntryIterator.next(HashMap.java:932)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:249)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1376)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1301)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1523)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1498)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1482)
>   at 
> org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes.shutdownCluster(TestWebHdfsWithMultipleNameNodes.java:99)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6098) ConcurrentModificationException exception during DataNode shutdown

2014-03-12 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6098:


Attachment: HDFS-6098.01.patch

Trivial patch to make {{bpSlices}} a {{ConcurrentHashMap}}. I think it is 
sufficient here.
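
A standalone illustration of why the swap is sufficient (this is not the
attached patch; the names below are made up): ConcurrentHashMap iterators are
weakly consistent, so one thread can walk the map while another modifies it
without ever seeing a ConcurrentModificationException, which is exactly the
race hit during shutdown above.

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class WeaklyConsistentIterationSketch {
  public static void main(String[] args) throws Exception {
    final Map<String, String> slices = new ConcurrentHashMap<String, String>();
    slices.put("BP-0", "slice");

    // Stand-in for a thread adding/removing block pools while the DN shuts down.
    Thread mutator = new Thread(new Runnable() {
      public void run() {
        for (int i = 1; i < 100000; i++) {
          slices.put("BP-" + i, "slice");
          slices.remove("BP-" + i);
        }
      }
    });
    mutator.start();

    // Stand-in for FsVolumeImpl#shutdown() walking bpSlices.
    for (int i = 0; i < 1000; i++) {
      for (Map.Entry<String, String> e : slices.entrySet()) {
        e.getValue();
      }
    }
    mutator.join();
    System.out.println("Iterated concurrently without ConcurrentModificationException");
  }
}
{code}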

> ConcurrentModificationException exception during DataNode shutdown
> --
>
> Key: HDFS-6098
> URL: https://issues.apache.org/jira/browse/HDFS-6098
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.4.0
>Reporter: Arpit Agarwal
> Attachments: HDFS-6098.01.patch
>
>
> Exception hit during DN shutdown while running 
> {{TestWebHdfsWithMultipleNameNodes}}:
> {code}
> java.util.ConcurrentModificationException: null
>   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
>   at java.util.HashMap$EntryIterator.next(HashMap.java:934)
>   at java.util.HashMap$EntryIterator.next(HashMap.java:932)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:249)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1376)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1301)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1523)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1498)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1482)
>   at 
> org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes.shutdownCluster(TestWebHdfsWithMultipleNameNodes.java:99)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6007) Update documentation about short-circuit local reads

2014-03-12 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932518#comment-13932518
 ] 

Masatake Iwasaki commented on HDFS-6007:


Thanks again for your comments.

bq. Please skip the configuration tables. They just duplicate hdfs-default.xml

Most of the properties in the table are not in hdfs-default.xml but only in
DFSConfigKeys. Should I add them to hdfs-default.xml?

> Update documentation about short-circuit local reads
> 
>
> Key: HDFS-6007
> URL: https://issues.apache.org/jira/browse/HDFS-6007
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Masatake Iwasaki
>Priority: Minor
> Attachments: HDFS-6007-0.patch, HDFS-6007-1.patch, HDFS-6007-2.patch, 
> HDFS-6007-3.patch
>
>
> updating the contents of "HDFS SHort-Circuit Local Reads" based on the 
> changes in HDFS-4538 and HDFS-4953.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (HDFS-6098) ConcurrentModificationException exception during DataNode shutdown

2014-03-12 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932485#comment-13932485
 ] 

Arpit Agarwal edited comment on HDFS-6098 at 3/12/14 10:13 PM:
---

The synchronization of {{FsVolumeImpl#bpSlices}} looks wrong. In different 
locations it is modified without a lock or synchronized via 
{{FsVolumeImpl#dataset}}. Also reads are unsynchronized.


was (Author: arpitagarwal):
The synchronization of {{FsDatasetImpl#bpSlices}} looks wrong. In different 
locations it is modified without a lock or synchronized via 
{{FsDatasetImpl#dataset}}. Also reads are unsynchronized.

> ConcurrentModificationException exception during DataNode shutdown
> --
>
> Key: HDFS-6098
> URL: https://issues.apache.org/jira/browse/HDFS-6098
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.4.0
>Reporter: Arpit Agarwal
>
> Exception hit during DN shutdown while running 
> {{TestWebHdfsWithMultipleNameNodes}}:
> {code}
> java.util.ConcurrentModificationException: null
>   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
>   at java.util.HashMap$EntryIterator.next(HashMap.java:934)
>   at java.util.HashMap$EntryIterator.next(HashMap.java:932)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:249)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1376)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1301)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1523)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1498)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1482)
>   at 
> org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes.shutdownCluster(TestWebHdfsWithMultipleNameNodes.java:99)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6098) ConcurrentModificationException exception during DataNode shutdown

2014-03-12 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932498#comment-13932498
 ] 

Arpit Agarwal commented on HDFS-6098:
-

Correction - modification of the map is not synchronized at all.

> ConcurrentModificationException exception during DataNode shutdown
> --
>
> Key: HDFS-6098
> URL: https://issues.apache.org/jira/browse/HDFS-6098
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.4.0
>Reporter: Arpit Agarwal
>
> Exception hit during DN shutdown while running 
> {{TestWebHdfsWithMultipleNameNodes}}:
> {code}
> java.util.ConcurrentModificationException: null
>   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
>   at java.util.HashMap$EntryIterator.next(HashMap.java:934)
>   at java.util.HashMap$EntryIterator.next(HashMap.java:932)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:249)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1376)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1301)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1523)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1498)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1482)
>   at 
> org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes.shutdownCluster(TestWebHdfsWithMultipleNameNodes.java:99)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6079) Timeout for getFileBlockStorageLocations does not work

2014-03-12 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932502#comment-13932502
 ] 

Aaron T. Myers commented on HDFS-6079:
--

Patch looks good to me. +1 pending Jenkins.

> Timeout for getFileBlockStorageLocations does not work
> --
>
> Key: HDFS-6079
> URL: https://issues.apache.org/jira/browse/HDFS-6079
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.3.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hdfs-6079-1.patch
>
>
> {{DistributedFileSystem#getFileBlockStorageLocations}} has a config value 
> which lets clients set a timeout, but it's not being enforced correctly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (HDFS-6098) ConcurrentModificationException exception during DataNode shutdown

2014-03-12 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932485#comment-13932485
 ] 

Arpit Agarwal edited comment on HDFS-6098 at 3/12/14 10:08 PM:
---

The synchronization of {{FsDatasetImpl#bpSlices}} looks wrong. In different 
locations it is modified without a lock or synchronized via 
{{FsDatasetImpl#dataset}}. Also reads are unsynchronized.


was (Author: arpitagarwal):
The synchronization of {{FsDatasetImpl#bpSlices}} looks wront. In different 
locations it is modified without a lock or synchronized via 
{{FsDatasetImpl#dataset}}. Also reads are unsynchronized.

> ConcurrentModificationException exception during DataNode shutdown
> --
>
> Key: HDFS-6098
> URL: https://issues.apache.org/jira/browse/HDFS-6098
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.4.0
>Reporter: Arpit Agarwal
>
> Exception hit during DN shutdown while running 
> {{TestWebHdfsWithMultipleNameNodes}}:
> {code}
> java.util.ConcurrentModificationException: null
>   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
>   at java.util.HashMap$EntryIterator.next(HashMap.java:934)
>   at java.util.HashMap$EntryIterator.next(HashMap.java:932)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:249)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1376)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1301)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1523)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1498)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1482)
>   at 
> org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes.shutdownCluster(TestWebHdfsWithMultipleNameNodes.java:99)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6098) ConcurrentModificationException exception during DataNode shutdown

2014-03-12 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932485#comment-13932485
 ] 

Arpit Agarwal commented on HDFS-6098:
-

The synchronization of {{FsDatasetImpl#bpSlices}} looks wront. In different 
locations it is modified without a lock or synchronized via 
{{FsDatasetImpl#dataset}}. Also reads are unsynchronized.

> ConcurrentModificationException exception during DataNode shutdown
> --
>
> Key: HDFS-6098
> URL: https://issues.apache.org/jira/browse/HDFS-6098
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.4.0
>Reporter: Arpit Agarwal
>
> Exception hit during DN shutdown while running 
> {{TestWebHdfsWithMultipleNameNodes}}:
> {code}
> java.util.ConcurrentModificationException: null
>   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
>   at java.util.HashMap$EntryIterator.next(HashMap.java:934)
>   at java.util.HashMap$EntryIterator.next(HashMap.java:932)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:249)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1376)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1301)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1523)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1498)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1482)
>   at 
> org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes.shutdownCluster(TestWebHdfsWithMultipleNameNodes.java:99)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5244) TestNNStorageRetentionManager#testPurgeMultipleDirs fails because incorrectly expects Hashmap values to have order.

2014-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932472#comment-13932472
 ] 

Hadoop QA commented on HDFS-5244:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604323/HDFS-5244.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6385//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6385//console

This message is automatically generated.

> TestNNStorageRetentionManager#testPurgeMultipleDirs fails because incorrectly 
> expects Hashmap values to have order. 
> 
>
> Key: HDFS-5244
> URL: https://issues.apache.org/jira/browse/HDFS-5244
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.1.0-beta
> Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM java 1.6
>Reporter: Jinghui Wang
>Assignee: Jinghui Wang
> Fix For: 3.0.0, 2.1.0-beta, 2.4.0
>
> Attachments: HDFS-5244.patch
>
>
> The test o.a.h.hdfs.server.namenode.TestNNStorageRetentionManager uses a 
> HashMap(dirRoots) to store the root storages to be mocked for the purging 
> test, which does not have any predictable order. The directories needs be 
> purged are stored in a LinkedHashSet, which has a predictable order. So, when 
> the directories get mocked for the test, they could be already out of
> the order that they were added. Thus, the order that the directories were
> actually purged and the order of them being added to the LinkedHashList could
> be different and cause the test to fail.
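
A standalone illustration of the ordering mismatch described above (the names
are made up): HashMap iteration order is unspecified and can differ from
insertion order, while LinkedHashSet always preserves it.

{code}
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class IterationOrderSketch {
  public static void main(String[] args) {
    Map<String, Integer> dirRoots = new HashMap<String, Integer>();
    Set<String> purgeOrder = new LinkedHashSet<String>();
    for (String dir : new String[] {"name3", "name1", "name2"}) {
      dirRoots.put(dir, 0);
      purgeOrder.add(dir);
    }
    System.out.println(dirRoots.keySet()); // order unspecified; may vary by JVM
    System.out.println(purgeOrder);        // always [name3, name1, name2]
  }
}
{code}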



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6098) ConcurrentModificationException exception during DataNode shutdown

2014-03-12 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDFS-6098:
---

 Summary: ConcurrentModificationException exception during DataNode 
shutdown
 Key: HDFS-6098
 URL: https://issues.apache.org/jira/browse/HDFS-6098
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.4.0
Reporter: Arpit Agarwal


Exception hit during DN shutdown while running 
{{TestWebHdfsWithMultipleNameNodes}}:

{code}
java.util.ConcurrentModificationException: null
at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
at java.util.HashMap$EntryIterator.next(HashMap.java:934)
at java.util.HashMap$EntryIterator.next(HashMap.java:932)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:249)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1376)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1301)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1523)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1498)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1482)
at 
org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes.shutdownCluster(TestWebHdfsWithMultipleNameNodes.java:99)
{code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6098) ConcurrentModificationException exception during DataNode shutdown

2014-03-12 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932443#comment-13932443
 ] 

Arpit Agarwal commented on HDFS-6098:
-

Faulting function.

{code}
  void shutdown() {
    cacheExecutor.shutdown();
    Set<Entry<String, BlockPoolSlice>> set = bpSlices.entrySet();
    for (Entry<String, BlockPoolSlice> entry : set) {
      entry.getValue().shutdown();
    }
  }
{code}
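
For reference, a minimal sketch of one way to avoid the exception, assuming the concurrent modification comes from another thread touching {{bpSlices}} while shutdown is iterating (this is only an illustration, not the committed fix): iterate over a snapshot of the entries.

{code}
  void shutdown() {
    cacheExecutor.shutdown();
    // Copy the entries first (java.util.ArrayList) so a concurrent change to
    // bpSlices cannot throw ConcurrentModificationException mid-iteration.
    for (Entry<String, BlockPoolSlice> entry :
        new ArrayList<Entry<String, BlockPoolSlice>>(bpSlices.entrySet())) {
      entry.getValue().shutdown();
    }
  }
{code}

Of course, the copy itself can still race with a writer, so the real fix likely needs synchronization on {{bpSlices}} or a concurrent map; the sketch only shows the shape of the problem.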

> ConcurrentModificationException exception during DataNode shutdown
> --
>
> Key: HDFS-6098
> URL: https://issues.apache.org/jira/browse/HDFS-6098
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.4.0
>Reporter: Arpit Agarwal
>
> Exception hit during DN shutdown while running 
> {{TestWebHdfsWithMultipleNameNodes}}:
> {code}
> java.util.ConcurrentModificationException: null
>   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
>   at java.util.HashMap$EntryIterator.next(HashMap.java:934)
>   at java.util.HashMap$EntryIterator.next(HashMap.java:932)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:249)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1376)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1301)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1523)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1498)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1482)
>   at 
> org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes.shutdownCluster(TestWebHdfsWithMultipleNameNodes.java:99)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6099) HDFS file system limits not enforced on renames.

2014-03-12 Thread Chris Nauroth (JIRA)
Chris Nauroth created HDFS-6099:
---

 Summary: HDFS file system limits not enforced on renames.
 Key: HDFS-6099
 URL: https://issues.apache.org/jira/browse/HDFS-6099
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.3.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth


{{dfs.namenode.fs-limits.max-component-length}} and 
{{dfs.namenode.fs-limits.max-directory-items}} are not enforced on the 
destination path during rename operations.  This means that it's still possible 
to create files that violate these limits.
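
As an illustration of the loophole (a hypothetical client-side sketch; the path names and the limit value are made up for this example):

{code}
// Assumes dfs.namenode.fs-limits.max-component-length is set to e.g. 20 on the
// NameNode. All paths below are hypothetical.
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);

Path shortName = new Path("/tmp/ok");
Path tooLong   = new Path("/tmp/a-component-name-well-over-twenty-characters");

fs.create(shortName).close();   // succeeds: the component is within the limit
// Creating tooLong directly would be rejected by the component-length check,
// but the rename below currently succeeds because the destination path is not
// validated against the fs-limits settings.
fs.rename(shortName, tooLong);
{code}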



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5274) Add Tracing to HDFS

2014-03-12 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932436#comment-13932436
 ] 

stack commented on HDFS-5274:
-

[~iwasakims] That is a beautiful png

> Add Tracing to HDFS
> ---
>
> Key: HDFS-5274
> URL: https://issues.apache.org/jira/browse/HDFS-5274
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 2.1.1-beta
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: 3node_get_200mb.png, 3node_put_200mb.png, 
> 3node_put_200mb.png, HDFS-5274-0.patch, HDFS-5274-1.patch, 
> HDFS-5274-10.patch, HDFS-5274-11.txt, HDFS-5274-12.patch, HDFS-5274-13.patch, 
> HDFS-5274-2.patch, HDFS-5274-3.patch, HDFS-5274-4.patch, HDFS-5274-5.patch, 
> HDFS-5274-6.patch, HDFS-5274-7.patch, HDFS-5274-8.patch, HDFS-5274-8.patch, 
> HDFS-5274-9.patch, Zipkin   Trace a06e941b0172ec73.png, Zipkin   Trace 
> d0f0d66b8a258a69.png, ss-5274v8-get.png, ss-5274v8-put.png
>
>
> Since Google's Dapper paper has shown the benefits of tracing for a large 
> distributed system, it seems like a good time to add tracing to HDFS.  HBase 
> has added tracing using HTrace.  I propose that the same can be done within 
> HDFS.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB

2014-03-12 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-6097:
--

 Summary: zero-copy reads are incorrectly disabled on file offsets 
above 2GB
 Key: HDFS-6097
 URL: https://issues.apache.org/jira/browse/HDFS-6097
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe


Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some 
code that is supposed to disable zero-copy reads on offsets in block files 
greater than 2GB (because MappedByteBuffer segments are limited to that size).
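
For context: a {{MappedByteBuffer}} is indexed by an {{int}}, so a single mapping cannot exceed {{Integer.MAX_VALUE}} bytes; that limit applies to the offset and length within one block file, not to the reader's absolute position in the HDFS file. A rough sketch of the distinction (hypothetical numbers and variable names, not the actual client code):

{code}
// Reading at 3 GB into a file whose current block starts at 2.5 GB.
long filePos       = 3L * 1024 * 1024 * 1024;         // absolute file position
long blockStartPos = filePos - 512L * 1024 * 1024;    // block start in the file
long offsetInBlock = filePos - blockStartPos;         // 512 MB, well under 2 GB

// Only the per-block offset/length is constrained by the mmap limit...
boolean mappable = offsetInBlock <= Integer.MAX_VALUE;
// ...so a check against filePos (3 GB > 2 GB) would wrongly disable zero-copy.
{code}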



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6094) The same block can be counted twice towards safe mode threshold

2014-03-12 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932341#comment-13932341
 ] 

Arpit Agarwal commented on HDFS-6094:
-

No concrete diagnosis of this issue yet, I am still investigating.

> The same block can be counted twice towards safe mode threshold
> ---
>
> Key: HDFS-6094
> URL: https://issues.apache.org/jira/browse/HDFS-6094
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
>
> {{BlockManager#addStoredBlock}} can cause the same block to be counted twice 
> towards the safe mode threshold. We see this manifest via 
> {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More 
> details to follow in a comment.
> Exception details:
> {code}
>   Time elapsed: 12.874 sec  <<< FAILURE!
> java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported 
> blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of 
> live datanodes 3 has reached the minimum number 0. Safe mode will be turned 
> off automatically in 28 seconds.'
> at org.junit.Assert.fail(Assert.java:93)
> at org.junit.Assert.assertTrue(Assert.java:43)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6094) The same block can be counted twice towards safe mode threshold

2014-03-12 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6094:


Description: 
{{BlockManager#addStoredBlock}} can cause the same block to be counted twice 
towards the safe mode threshold. We see this manifest via 
{{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More 
details to follow in a comment.

Exception details:
{code}
  Time elapsed: 12.874 sec  <<< FAILURE!
java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported 
blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of live 
datanodes 3 has reached the minimum number 0. Safe mode will be turned off 
automatically in 28 seconds.'
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at 
org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493)
at 
org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660)
{code}

  was:{{BlockManager#addStoredBlock}} can cause the same block can be counted 
towards safe mode threshold. We see this manifest via 
{{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More 
details to follow in a comment.


> The same block can be counted twice towards safe mode threshold
> ---
>
> Key: HDFS-6094
> URL: https://issues.apache.org/jira/browse/HDFS-6094
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
>
> {{BlockManager#addStoredBlock}} can cause the same block to be counted twice 
> towards the safe mode threshold. We see this manifest via 
> {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More 
> details to follow in a comment.
> Exception details:
> {code}
>   Time elapsed: 12.874 sec  <<< FAILURE!
> java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported 
> blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of 
> live datanodes 3 has reached the minimum number 0. Safe mode will be turned 
> off automatically in 28 seconds.'
> at org.junit.Assert.fail(Assert.java:93)
> at org.junit.Assert.assertTrue(Assert.java:43)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6075) Introducing "non-replication mode"

2014-03-12 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932300#comment-13932300
 ] 

Ravi Prakash commented on HDFS-6075:


I feel another option to do this would be to disable replication temporarily on 
a set of *nodes* (not cluster-wide), i.e. specify the list of nodes and a 
timeout after which replication should resume.

> Introducing "non-replication mode"
> --
>
> Key: HDFS-6075
> URL: https://issues.apache.org/jira/browse/HDFS-6075
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Reporter: Adam Kawa
>Priority: Minor
>
> Afaik, HDFS does not provide an easy way to temporarily disable the 
> replication of missing blocks.
> If you would like to temporarily disable the replication, you would have to
> * set dfs.namenode.replication.interval (_The periodicity in seconds with 
> which the namenode computes replication work for datanodes_, default: 3) to 
> something very high. *Disadvantage*: you have to restart the NN
> * go into safe mode. *Disadvantage*: all write operations will fail
> We have the situation that we need to replace our top-of-rack switches for 
> each rack. Replacing a switch should take around 30 minutes. Each rack has 
> around 0.6 PB of data. We would like to avoid an expensive replication, since 
> we know that we will put this rack online quickly. To avoid any downtime, or 
> excessive network transfer, we think that temporarily disabling the 
> replication would suit us.
> The default block placement policy puts blocks into two racks, so when one 
> rack temporarily goes offline, we still have access to at least one replica of 
> each block. Of course, if we lose this replica, then we would have to wait 
> until the rack goes back online. This is what the administrator should be 
> aware of.
> This feature could disable the replication
> * globally - for a whole cluster
> * partially - e.g. only for missing blocks that come from a specified set of 
> DataNodes. So a file like "we_will_be_back_soon" :) could be introduced, 
> similar to include and exclude.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6096) TestWebHdfsTokens may timeout

2014-03-12 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932301#comment-13932301
 ] 

Haohui Mai commented on HDFS-6096:
--

+1 pending jenkins.

> TestWebHdfsTokens may timeout
> -
>
> Key: HDFS-6096
> URL: https://issues.apache.org/jira/browse/HDFS-6096
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Attachments: h6096_20140312.patch
>
>
> The timeout of TestWebHdfsTokens is set to 1 second.  It is too short for 
> some machines.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6096) TestWebHdfsTokens may timeout

2014-03-12 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-6096:
--

Attachment: h6096_20140312.patch

h6096_20140312.patch: increases the timeout to 5 seconds and removes 
unnecessary @SuppressWarnings("unchecked").
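
For reference, a minimal sketch of what such a change looks like with JUnit 4 per-test timeouts (the class and method names here are hypothetical, not the actual test):

{code}
import org.junit.Test;

public class TimeoutBumpExample {
  // Was: @Test(timeout = 1000) -- 1 second is too tight on slow machines.
  @Test(timeout = 5000)
  public void tokenRoundTrip() throws Exception {
    // test body elided
  }
}
{code}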

> TestWebHdfsTokens may timeout
> -
>
> Key: HDFS-6096
> URL: https://issues.apache.org/jira/browse/HDFS-6096
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Attachments: h6096_20140312.patch
>
>
> The timeout of TestWebHdfsTokens is set to 1 second.  It is too short for 
> some machines.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6096) TestWebHdfsTokens may timeout

2014-03-12 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-6096:
--

Status: Patch Available  (was: Open)

> TestWebHdfsTokens may timeout
> -
>
> Key: HDFS-6096
> URL: https://issues.apache.org/jira/browse/HDFS-6096
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Attachments: h6096_20140312.patch
>
>
> The timeout of TestWebHdfsTokens is set to 1 second.  It is too short for 
> some machines.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6096) TestWebHdfsTokens may timeout

2014-03-12 Thread Tsz Wo Nicholas Sze (JIRA)
Tsz Wo Nicholas Sze created HDFS-6096:
-

 Summary: TestWebHdfsTokens may timeout
 Key: HDFS-6096
 URL: https://issues.apache.org/jira/browse/HDFS-6096
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor


The timeout of TestWebHdfsTokens is set to 1 second.  It is too short for some 
machines.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended

2014-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932285#comment-13932285
 ] 

Hadoop QA commented on HDFS-6089:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12634208/HDFS-6089.001.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6384//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6384//console

This message is automatically generated.

> Standby NN while transitioning to active throws a connection refused error 
> when the prior active NN process is suspended
> 
>
> Key: HDFS-6089
> URL: https://issues.apache.org/jira/browse/HDFS-6089
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Affects Versions: 2.4.0
>Reporter: Arpit Gupta
>Assignee: Jing Zhao
> Attachments: HDFS-6089.000.patch, HDFS-6089.001.patch
>
>
> The following scenario was tested:
> * Determine Active NN and suspend the process (kill -19)
> * Wait about 60s to let the standby transition to active
> * Get the service state for nn1 and nn2 and make sure nn2 has transitioned to 
> active.
> What was noticed was that sometimes the call to get the service state of nn2 got 
> a socket timeout exception.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6079) Timeout for getFileBlockStorageLocations does not work

2014-03-12 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-6079:
--

Status: Patch Available  (was: Open)

> Timeout for getFileBlockStorageLocations does not work
> --
>
> Key: HDFS-6079
> URL: https://issues.apache.org/jira/browse/HDFS-6079
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.3.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hdfs-6079-1.patch
>
>
> {{DistributedFileSystem#getFileBlockStorageLocations}} has a config value 
> which lets clients set a timeout, but it's not being enforced correctly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6079) Timeout for getFileBlockStorageLocations does not work

2014-03-12 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-6079:
--

Attachment: hdfs-6079-1.patch

Patch attached. The fix is pretty simple: we just need to catch the 
CancellationException that was previously bubbling all the way up.
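
A sketch of that shape of fix (illustrative names only, not the actual DFSClient code): treat a cancelled future like a timed-out call instead of letting the CancellationException propagate to the caller.

{code}
// Illustrative fragment. Assumes "future" is a java.util.concurrent.Future
// for one datanode query and "timeoutMs" is the configured timeout.
try {
  result = future.get(timeoutMs, TimeUnit.MILLISECONDS);
} catch (CancellationException e) {
  // Previously this bubbled all the way up to the caller; handle like a timeout.
  result = null;
} catch (TimeoutException e) {
  future.cancel(true);
  result = null;
} catch (InterruptedException | ExecutionException e) {
  // Real code would log and, for the interrupt, restore the thread's flag.
  result = null;
}
{code}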

> Timeout for getFileBlockStorageLocations does not work
> --
>
> Key: HDFS-6079
> URL: https://issues.apache.org/jira/browse/HDFS-6079
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.3.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hdfs-6079-1.patch
>
>
> {{DistributedFileSystem#getFileBlockStorageLocations}} has a config value 
> which lets clients set a timeout, but it's not being enforced correctly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5244) TestNNStorageRetentionManager#testPurgeMultipleDirs fails because incorrectly expects Hashmap values to have order.

2014-03-12 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932228#comment-13932228
 ] 

Suresh Srinivas commented on HDFS-5244:
---

+1 for the patch. Will commit it shortly once Jenkins +1s the patch.

> TestNNStorageRetentionManager#testPurgeMultipleDirs fails because incorrectly 
> expects Hashmap values to have order. 
> 
>
> Key: HDFS-5244
> URL: https://issues.apache.org/jira/browse/HDFS-5244
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.1.0-beta
> Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM java 1.6
>Reporter: Jinghui Wang
>Assignee: Jinghui Wang
> Fix For: 3.0.0, 2.1.0-beta, 2.4.0
>
> Attachments: HDFS-5244.patch
>
>
> The test o.a.h.hdfs.server.namenode.TestNNStorageRetentionManager uses a 
> HashMap (dirRoots) to store the root storages to be mocked for the purging 
> test, which does not have any predictable order. The directories that need to 
> be purged are stored in a LinkedHashSet, which has a predictable order. So, 
> when the directories get mocked for the test, they could already be out of the 
> order in which they were added. Thus, the order in which the directories were 
> actually purged and the order in which they were added to the LinkedHashSet 
> could differ and cause the test to fail.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-5244) TestNNStorageRetentionManager#testPurgeMultipleDirs fails because incorrectly expects Hashmap values to have order.

2014-03-12 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-5244:
--

Target Version/s: 2.1.1-beta, 3.0.0  (was: 3.0.0, 2.1.1-beta)
  Status: Patch Available  (was: Open)

> TestNNStorageRetentionManager#testPurgeMultipleDirs fails because incorrectly 
> expects Hashmap values to have order. 
> 
>
> Key: HDFS-5244
> URL: https://issues.apache.org/jira/browse/HDFS-5244
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.1.0-beta
> Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM java 1.6
>Reporter: Jinghui Wang
>Assignee: Jinghui Wang
> Fix For: 3.0.0, 2.4.0, 2.1.0-beta
>
> Attachments: HDFS-5244.patch
>
>
> The test o.a.h.hdfs.server.namenode.TestNNStorageRetentionManager uses a 
> HashMap (dirRoots) to store the root storages to be mocked for the purging 
> test, which does not have any predictable order. The directories that need to 
> be purged are stored in a LinkedHashSet, which has a predictable order. So, 
> when the directories get mocked for the test, they could already be out of the 
> order in which they were added. Thus, the order in which the directories were 
> actually purged and the order in which they were added to the LinkedHashSet 
> could differ and cause the test to fail.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932212#comment-13932212
 ] 

Hadoop QA commented on HDFS-6092:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12634196/hdfs-6092-v3.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6383//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6383//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6383//console

This message is automatically generated.

> DistributedFileSystem#getCanonicalServiceName() and 
> DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
> --
>
> Key: HDFS-6092
> URL: https://issues.apache.org/jira/browse/HDFS-6092
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Ted Yu
> Attachments: haosdent-HDFS-6092-v2.patch, haosdent-HDFS-6092.patch, 
> hdfs-6092-v1.txt, hdfs-6092-v2.txt, hdfs-6092-v3.txt
>
>
> I discovered this when working on HBASE-10717
> Here is sample code to reproduce the problem:
> {code}
> Path desPath = new Path("hdfs://127.0.0.1/");
> FileSystem desFs = desPath.getFileSystem(conf);
> 
> String s = desFs.getCanonicalServiceName();
> URI uri = desFs.getUri();
> {code}
> Canonical name string contains the default port - 8020
> But uri doesn't contain port.
> This would result in the following exception:
> {code}
> testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
> 0.001 sec  <<< ERROR!
> java.lang.IllegalArgumentException: port out of range:-1
> at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
> at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
> at 
> org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
> {code}
> Thanks to Brando Li who helped debug this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932209#comment-13932209
 ] 

Hadoop QA commented on HDFS-6092:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12634196/hdfs-6092-v3.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6382//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6382//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6382//console

This message is automatically generated.

> DistributedFileSystem#getCanonicalServiceName() and 
> DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
> --
>
> Key: HDFS-6092
> URL: https://issues.apache.org/jira/browse/HDFS-6092
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Ted Yu
> Attachments: haosdent-HDFS-6092-v2.patch, haosdent-HDFS-6092.patch, 
> hdfs-6092-v1.txt, hdfs-6092-v2.txt, hdfs-6092-v3.txt
>
>
> I discovered this when working on HBASE-10717
> Here is sample code to reproduce the problem:
> {code}
> Path desPath = new Path("hdfs://127.0.0.1/");
> FileSystem desFs = desPath.getFileSystem(conf);
> 
> String s = desFs.getCanonicalServiceName();
> URI uri = desFs.getUri();
> {code}
> Canonical name string contains the default port - 8020
> But uri doesn't contain port.
> This would result in the following exception:
> {code}
> testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
> 0.001 sec  <<< ERROR!
> java.lang.IllegalArgumentException: port out of range:-1
> at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
> at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
> at 
> org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
> {code}
> Thanks to Brando Li who helped debug this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6007) Update documentation about short-circuit local reads

2014-03-12 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932178#comment-13932178
 ] 

Colin Patrick McCabe commented on HDFS-6007:


Thanks for looking at this.  I think we should limit the scope here to just 
adding a sentence about shared-memory segments, and adding some documentation 
about the legacy short-circuit implementation.

I think the zero-copy API should get its own document.  Putting it in here just 
seems like information overload.

{code}
+  Client and DataNode uses shared memory segments
+  to communicate short-circuit read.
{code}

How about "The client and the DataNode exchange information via a shared memory 
segment."

{code}
+  if /dev/shm is not world writable or does not exist in your environment,
+  You can change the paths on which shared memory segments are created by
+  setting the value of <<>>
+  to comma separated paths like <<>>.
+  It tries paths in order until creation of shared memory segment succeeds.
{code}

Can we skip this section?  99.999% of users will never need to change that 
config value, and there's documentation in hdfs-defaults.xml for those who do.  
The number of UNIX systems without /tmp must be pretty small indeed.

{code}
+  Legacy short-circuit local reads implementation
+  on which clients directly open HDFS block files is still available
+  for platforms other than Linux.
{code}

Missing 'the'

I think we need a sentence or two explaining that the old short-circuit 
implementation is insecure, because it allows users to directly access the 
blocks.  We also need some explanation about how you have to chmod the blocks 
into the correct UNIX group so that they are accessible.

Please skip the configuration tables.  They just duplicate hdfs-default.xml

> Update documentation about short-circuit local reads
> 
>
> Key: HDFS-6007
> URL: https://issues.apache.org/jira/browse/HDFS-6007
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Masatake Iwasaki
>Priority: Minor
> Attachments: HDFS-6007-0.patch, HDFS-6007-1.patch, HDFS-6007-2.patch, 
> HDFS-6007-3.patch
>
>
> updating the contents of "HDFS SHort-Circuit Local Reads" based on the 
> changes in HDFS-4538 and HDFS-4953.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-5477) Block manager as a service

2014-03-12 Thread Amir Langer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amir Langer updated HDFS-5477:
--

Attachment: Remote BM.pdf

Attached the design doc Remote BM.pdf to this JIRA.

> Block manager as a service
> --
>
> Key: HDFS-5477
> URL: https://issues.apache.org/jira/browse/HDFS-5477
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: Proposal.pdf, Proposal.pdf, Remote BM.pdf, Standalone 
> BM.pdf, Standalone BM.pdf, patches.tar.gz
>
>
> The block manager needs to evolve towards having the ability to run as a 
> standalone service to improve NN vertical and horizontal scalability.  The 
> goal is reducing the memory footprint of the NN proper to support larger 
> namespaces, and improve overall performance by decoupling the block manager 
> from the namespace and its lock.  Ideally, a distinct BM will be transparent 
> to clients and DNs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6095) to add missing description of default value of config properties in the document

2014-03-12 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-6095:


Affects Version/s: 2.3.0
Fix Version/s: (was: 2.3.0)

> to add missing description of default value of config properties in the 
> document
> 
>
> Key: HDFS-6095
> URL: https://issues.apache.org/jira/browse/HDFS-6095
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.3.0
>Reporter: Yongjun Zhang
>Priority: Minor
>
> We should describe the default value of each config property in the document 
> when appropriate.
> As an example, default value of config property dfs.webhdfs.enabled is 
> changed from false to true by HDFS-5532. The document
> http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-hdfs/WebHDFS.html
> (and various versions) listed the property with no default value described.
> I hope different documents can be reviewed and updated per this JIRA request.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6095) to add missing description of default value of config properties in the document

2014-03-12 Thread Yongjun Zhang (JIRA)
Yongjun Zhang created HDFS-6095:
---

 Summary: to add missing description of default value of config 
properties in the document
 Key: HDFS-6095
 URL: https://issues.apache.org/jira/browse/HDFS-6095
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Reporter: Yongjun Zhang
Priority: Minor
 Fix For: 2.3.0


We should describe the default value of each config property in the document 
when appropriate.

As an example, default value of config property dfs.webhdfs.enabled is changed 
from false to true by HDFS-5532. The document
http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-hdfs/WebHDFS.html
(and various versions) listed the property with no default value described.

I hope different documents can be reviewed and updated per this JIRA request.

Thanks.




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6010) Make balancer able to balance data among specified servers

2014-03-12 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932055#comment-13932055
 ] 

Devaraj Das commented on HDFS-6010:
---

[~carp84], sorry for the delay in getting back. You know how things work when 
there are deadlines to meet :-)  I have some follow up questions for my 
understanding.

1. How would you maintain the mapping of files to groups? (for HDFS-6012 to 
work). If the mapping is maintained, wondering whether it makes sense to have 
the tool take paths for balancing as opposed to servers. Then maybe you can 
also combine the tool that does group management (HDFS-6012) into the balancer.
2. Are these mappings set up by some admin?
3. Would you expand a group when it is nearing capacity?
4. How does someone like HBase use this? Is HBase going to have visibility into 
the mappings as well (to take care of HBASE-6721 and favored-nodes for writes)?
5. Would you need a higher level balancer for keeping the whole cluster 
balanced (do migrations of blocks associated with certain paths from one group 
to another)? Otherwise, there would be skews in the block distribution. 
6. When there is a failure of a datanode in a group, how would you choose which 
datanodes to replicate the blocks to? The choice would be somewhat important 
given that some target datanodes might be busy serving requests for apps in 
their group. Adding some more work to these datanodes might make apps in the 
other group suffer. But maybe it's not that big a deal. On the other hand, if 
the group still has capacity, and the failure zones are still intact for the 
members in the group, then the replication could take into account the mapping 
in (1).

> Make balancer able to balance data among specified servers
> --
>
> Key: HDFS-6010
> URL: https://issues.apache.org/jira/browse/HDFS-6010
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer
>Affects Versions: 2.3.0
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Minor
> Attachments: HDFS-6010-trunk.patch
>
>
> Currently, the balancer tool balances data among all datanodes. However, in 
> some particular cases, we would need to balance data only among specified 
> nodes instead of the whole set.
> In this JIRA, a new "-servers" option would be introduced to implement this.
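
(For illustration only: the proposed option might be invoked as something like {{hdfs balancer -servers dn1.example.com,dn2.example.com,dn3.example.com}}; the exact syntax is up to the patch.)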



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended

2014-03-12 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-6089:


Attachment: HDFS-6089.001.patch

Fix unit tests.

> Standby NN while transitioning to active throws a connection refused error 
> when the prior active NN process is suspended
> 
>
> Key: HDFS-6089
> URL: https://issues.apache.org/jira/browse/HDFS-6089
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Affects Versions: 2.4.0
>Reporter: Arpit Gupta
>Assignee: Jing Zhao
> Attachments: HDFS-6089.000.patch, HDFS-6089.001.patch
>
>
> The following scenario was tested:
> * Determine Active NN and suspend the process (kill -19)
> * Wait about 60s to let the standby transition to active
> * Get the service state for nn1 and nn2 and make sure nn2 has transitioned to 
> active.
> What was noticed was that sometimes the call to get the service state of nn2 got 
> a socket timeout exception.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6009) Tools based on favored node feature for isolation

2014-03-12 Thread Thanh Do (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932037#comment-13932037
 ] 

Thanh Do commented on HDFS-6009:


Thank you!

> Tools based on favored node feature for isolation
> -
>
> Key: HDFS-6009
> URL: https://issues.apache.org/jira/browse/HDFS-6009
> Project: Hadoop HDFS
>  Issue Type: Task
>Affects Versions: 2.3.0
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Minor
>
> There are scenarios, like those mentioned in HBASE-6721 and HBASE-4210, where 
> in multi-tenant deployments of HBase we prefer to specify several groups of 
> regionservers to serve different applications, to achieve some kind of 
> isolation or resource allocation. However, although the regionservers are 
> grouped, the datanodes which store the data are not, which leads to the case 
> that one datanode failure affects multiple applications, as we have already 
> observed in our production environment.
> To relieve the above issue, we could make use of the favored node feature 
> (HDFS-2576) to make regionservers able to locate data within their group, or 
> in other words make the datanodes also grouped (passively), to form some level 
> of isolation.
> In this case, or any other case that needs datanodes to be grouped, we would 
> need a bunch of tools to maintain the "group", including:
> 1. Making the balancer able to balance data among specified servers, rather 
> than the whole set
> 2. Setting the balance bandwidth for specified servers, rather than the whole 
> set
> 3. Some tool to check whether a block is placed "cross-group", and move it 
> back if so
> This JIRA is an umbrella for the above tools.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6009) Tools based on favored node feature for isolation

2014-03-12 Thread Sirianni, Eric (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932042#comment-13932042
 ] 

Sirianni, Eric commented on HDFS-6009:
--

Thanks for emailing NetApp. The email inbox you have attempted to reach has 
been deactivated.


> Tools based on favored node feature for isolation
> -
>
> Key: HDFS-6009
> URL: https://issues.apache.org/jira/browse/HDFS-6009
> Project: Hadoop HDFS
>  Issue Type: Task
>Affects Versions: 2.3.0
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Minor
>
> There are scenarios, like those mentioned in HBASE-6721 and HBASE-4210, where 
> in multi-tenant deployments of HBase we prefer to specify several groups of 
> regionservers to serve different applications, to achieve some kind of 
> isolation or resource allocation. However, although the regionservers are 
> grouped, the datanodes which store the data are not, which leads to the case 
> that one datanode failure affects multiple applications, as we have already 
> observed in our production environment.
> To relieve the above issue, we could make use of the favored node feature 
> (HDFS-2576) to make regionservers able to locate data within their group, or 
> in other words make the datanodes also grouped (passively), to form some level 
> of isolation.
> In this case, or any other case that needs datanodes to be grouped, we would 
> need a bunch of tools to maintain the "group", including:
> 1. Making the balancer able to balance data among specified servers, rather 
> than the whole set
> 2. Setting the balance bandwidth for specified servers, rather than the whole 
> set
> 3. Some tool to check whether a block is placed "cross-group", and move it 
> back if so
> This JIRA is an umbrella for the above tools.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-4564) Webhdfs returns incorrect http response codes for denied operations

2014-03-12 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932008#comment-13932008
 ] 

Arun C Murthy commented on HDFS-4564:
-

How is this looking [~daryn]? Thanks.

(Doing a pass over 2.4 blockers)

> Webhdfs returns incorrect http response codes for denied operations
> ---
>
> Key: HDFS-4564
> URL: https://issues.apache.org/jira/browse/HDFS-4564
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: webhdfs
>Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Blocker
> Attachments: HDFS-4564.branch-23.patch, HDFS-4564.branch-23.patch, 
> HDFS-4564.branch-23.patch, HDFS-4564.patch
>
>
> Webhdfs is returning 401 (Unauthorized) instead of 403 (Forbidden) when it's 
> denying operations.  Examples including rejecting invalid proxy user attempts 
> and renew/cancel with an invalid user.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-03-12 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932001#comment-13932001
 ] 

Suresh Srinivas commented on HDFS-5840:
---

[~atm], most of us are swamped with wrapping up rolling upgrades and testing 
it. Can you please look into this?

> Follow-up to HDFS-5138 to improve error handling during partial upgrade 
> failures
> 
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 3.0.0
>
> Attachments: HDFS-5840.patch
>
>
> Suresh posted some good comment in HDFS-5138 after that patch had already 
> been committed to trunk. This JIRA is to address those. See the first comment 
> of this JIRA for the full content of the review.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5138) Support HDFS upgrade in HA

2014-03-12 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931998#comment-13931998
 ] 

Suresh Srinivas commented on HDFS-5138:
---

Sorry, I meant the above comment to be in HDFS-5840.

> Support HDFS upgrade in HA
> --
>
> Key: HDFS-5138
> URL: https://issues.apache.org/jira/browse/HDFS-5138
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.1.1-beta
>Reporter: Kihwal Lee
>Assignee: Aaron T. Myers
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, 
> HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, 
> HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, 
> hdfs-5138-branch-2.txt
>
>
> With HA enabled, NN won't start with "-upgrade". Since there has been a layout 
> version change between 2.0.x and 2.1.x, starting NN in upgrade mode was 
> necessary when deploying 2.1.x to an existing 2.0.x cluster. But the only way 
> to get around this was to disable HA and upgrade. 
> The NN and the cluster cannot be flipped back to HA until the upgrade is 
> finalized. If HA is disabled only on NN for layout upgrade and HA is turned 
> back on without involving DNs, things will work, but finalizeUpgrade won't 
> work (the NN is in HA and it cannot be in upgrade mode) and DNs' upgrade 
> snapshots won't get removed.
> We will need a different way of doing layout upgrades and upgrade snapshots.  
> I am marking this as a 2.1.1-beta blocker based on feedback from others.  If 
> there is a reasonable workaround that does not increase maintenance window 
> greatly, we can lower its priority from blocker to critical.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5138) Support HDFS upgrade in HA

2014-03-12 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931985#comment-13931985
 ] 

Suresh Srinivas commented on HDFS-5138:
---

[~atm], most of us are swamped with wrapping up rolling upgrades and testing 
it. Can you please look into this?

> Support HDFS upgrade in HA
> --
>
> Key: HDFS-5138
> URL: https://issues.apache.org/jira/browse/HDFS-5138
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.1.1-beta
>Reporter: Kihwal Lee
>Assignee: Aaron T. Myers
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, 
> HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, 
> HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, 
> hdfs-5138-branch-2.txt
>
>
> With HA enabled, NN won't start with "-upgrade". Since there has been a layout 
> version change between 2.0.x and 2.1.x, starting NN in upgrade mode was 
> necessary when deploying 2.1.x to an existing 2.0.x cluster. But the only way 
> to get around this was to disable HA and upgrade. 
> The NN and the cluster cannot be flipped back to HA until the upgrade is 
> finalized. If HA is disabled only on NN for layout upgrade and HA is turned 
> back on without involving DNs, things will work, but finalizeUpgrade won't 
> work (the NN is in HA and it cannot be in upgrade mode) and DNs' upgrade 
> snapshots won't get removed.
> We will need a different way of doing layout upgrades and upgrade snapshots.  
> I am marking this as a 2.1.1-beta blocker based on feedback from others.  If 
> there is a reasonable workaround that does not increase maintenance window 
> greatly, we can lower its priority from blocker to critical.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6010) Make balancer able to balance data among specified servers

2014-03-12 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931974#comment-13931974
 ] 

Yu Li commented on HDFS-6010:
-

Hi [~devaraj], it seems we are waiting for your comment here. :-)

[~szetszwo], any review comments on the patch attached here? Or do we need to 
wait for Das' comments before starting the code review? Thanks.

> Make balancer able to balance data among specified servers
> --
>
> Key: HDFS-6010
> URL: https://issues.apache.org/jira/browse/HDFS-6010
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer
>Affects Versions: 2.3.0
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Minor
> Attachments: HDFS-6010-trunk.patch
>
>
> Currently, the balancer tool balances data among all datanodes. However, in 
> some particular cases, we would need to balance data only among specified 
> nodes instead of the whole set.
> In this JIRA, a new "-servers" option would be introduced to implement this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6009) Tools based on favored node feature for isolation

2014-03-12 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931955#comment-13931955
 ] 

Yu Li commented on HDFS-6009:
-

Hi [~thanhdo],

Yes, the data are replicated, so there won't be data loss. However, since one 
datanode might carry data of multiple applications, a datanode failure will 
cause the read requests of *several* applications to retry until they time out 
and switch to another datanode, while we'd like to reduce the impact range.

Another scenario we experienced here is that application A read data heavily 
from one DN, occupying almost all of the network bandwidth, while application B 
tried to write data to the same DN and was blocked for a long time.

As I mentioned in HDFS-6010, people might ask why we don't just use physically 
separated clusters in this case; the answer is that it's more convenient and 
saves resources to manage one big cluster rather than several small ones.

There are also other solutions, like HDFS-5776, to reduce the impact of a bad 
datanode, but I believe there are still scenarios that need stricter I/O 
isolation, so I think it's still valuable to contribute our tools.

Hope this answers your question. :-)

> Tools based on favored node feature for isolation
> -
>
> Key: HDFS-6009
> URL: https://issues.apache.org/jira/browse/HDFS-6009
> Project: Hadoop HDFS
>  Issue Type: Task
>Affects Versions: 2.3.0
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Minor
>
> There are scenarios, like those mentioned in HBASE-6721 and HBASE-4210, where 
> in multi-tenant deployments of HBase we prefer to specify several groups of 
> regionservers to serve different applications, to achieve some kind of 
> isolation or resource allocation. However, although the regionservers are 
> grouped, the datanodes which store the data are not, which leads to the case 
> that one datanode failure affects multiple applications, as we have already 
> observed in our production environment.
> To relieve the above issue, we could make use of the favored node feature 
> (HDFS-2576) to make regionservers able to locate data within their group, or 
> in other words make the datanodes also grouped (passively), to form some level 
> of isolation.
> In this case, or any other case that needs datanodes to be grouped, we would 
> need a bunch of tools to maintain the "group", including:
> 1. Making the balancer able to balance data among specified servers, rather 
> than the whole set
> 2. Setting the balance bandwidth for specified servers, rather than the whole 
> set
> 3. Some tool to check whether a block is placed "cross-group", and move it 
> back if so
> This JIRA is an umbrella for the above tools.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931930#comment-13931930
 ] 

Ted Yu commented on HDFS-6092:
--

How about patch v3?
I tried to be a little more generic by detecting the (port) number in the service name.
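
A rough sketch of what detecting the port in the service name could look like (hypothetical code, not necessarily what v3 does):

{code}
// Only append a port to the URI when the canonical service name actually ends
// in a numeric port (HA logical service names do not).
String service = fs.getCanonicalServiceName();   // e.g. "127.0.0.1:8020"
URI uri = fs.getUri();                           // e.g. hdfs://127.0.0.1 (no port)

int idx = (service == null) ? -1 : service.lastIndexOf(':');
String portPart = (idx > 0) ? service.substring(idx + 1) : "";
if (uri.getPort() == -1 && portPart.matches("\\d+")) {
  uri = URI.create(uri.getScheme() + "://" + uri.getHost() + ":" + portPart
      + uri.getPath());
}
{code}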

> DistributedFileSystem#getCanonicalServiceName() and 
> DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
> --
>
> Key: HDFS-6092
> URL: https://issues.apache.org/jira/browse/HDFS-6092
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Ted Yu
> Attachments: haosdent-HDFS-6092-v2.patch, haosdent-HDFS-6092.patch, 
> hdfs-6092-v1.txt, hdfs-6092-v2.txt, hdfs-6092-v3.txt
>
>
> I discovered this when working on HBASE-10717
> Here is sample code to reproduce the problem:
> {code}
> Path desPath = new Path("hdfs://127.0.0.1/");
> FileSystem desFs = desPath.getFileSystem(conf);
> 
> String s = desFs.getCanonicalServiceName();
> URI uri = desFs.getUri();
> {code}
> Canonical name string contains the default port - 8020
> But uri doesn't contain port.
> This would result in the following exception:
> {code}
> testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
> 0.001 sec  <<< ERROR!
> java.lang.IllegalArgumentException: port out of range:-1
> at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
> at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
> at 
> org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
> {code}
> Thanks to Brando Li who helped debug this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-12 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HDFS-6092:
-

Attachment: hdfs-6092-v3.txt

> DistributedFileSystem#getCanonicalServiceName() and 
> DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
> --
>
> Key: HDFS-6092
> URL: https://issues.apache.org/jira/browse/HDFS-6092
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Ted Yu
> Attachments: haosdent-HDFS-6092-v2.patch, haosdent-HDFS-6092.patch, 
> hdfs-6092-v1.txt, hdfs-6092-v2.txt, hdfs-6092-v3.txt
>
>
> I discovered this when working on HBASE-10717
> Here is sample code to reproduce the problem:
> {code}
> Path desPath = new Path("hdfs://127.0.0.1/");
> FileSystem desFs = desPath.getFileSystem(conf);
> 
> String s = desFs.getCanonicalServiceName();
> URI uri = desFs.getUri();
> {code}
> Canonical name string contains the default port - 8020
> But uri doesn't contain port.
> This would result in the following exception:
> {code}
> testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
> 0.001 sec  <<< ERROR!
> java.lang.IllegalArgumentException: port out of range:-1
> at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
> at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
> at 
> org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
> {code}
> Thanks to Brando Li who helped debug this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-12 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931913#comment-13931913
 ] 

haosdent commented on HDFS-6092:


[~te...@apache.org] Sorry, I misunderstood the title before. I replaced your 
code with this snippet.

{code}
if (dfs.getCanonicalServiceName() != null
    && !dfs.getCanonicalServiceName().startsWith(HdfsConstants.HA_DT_SERVICE_PREFIX)
    && uri.getPort() == -1) {
  uri = UriBuilder.fromUri(uri).port(NameNode.DEFAULT_PORT).build();
}
{code}

> DistributedFileSystem#getCanonicalServiceName() and 
> DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
> --
>
> Key: HDFS-6092
> URL: https://issues.apache.org/jira/browse/HDFS-6092
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Ted Yu
> Attachments: haosdent-HDFS-6092-v2.patch, haosdent-HDFS-6092.patch, 
> hdfs-6092-v1.txt, hdfs-6092-v2.txt
>
>
> I discovered this when working on HBASE-10717
> Here is sample code to reproduce the problem:
> {code}
> Path desPath = new Path("hdfs://127.0.0.1/");
> FileSystem desFs = desPath.getFileSystem(conf);
> 
> String s = desFs.getCanonicalServiceName();
> URI uri = desFs.getUri();
> {code}
> Canonical name string contains the default port - 8020
> But uri doesn't contain port.
> This would result in the following exception:
> {code}
> testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
> 0.001 sec  <<< ERROR!
> java.lang.IllegalArgumentException: port out of range:-1
> at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
> at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
> at 
> org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
> {code}
> Thanks to Brando Li who helped debug this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-12 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated HDFS-6092:
---

Attachment: haosdent-HDFS-6092-v2.patch

> DistributedFileSystem#getCanonicalServiceName() and 
> DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
> --
>
> Key: HDFS-6092
> URL: https://issues.apache.org/jira/browse/HDFS-6092
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Ted Yu
> Attachments: haosdent-HDFS-6092-v2.patch, haosdent-HDFS-6092.patch, 
> hdfs-6092-v1.txt, hdfs-6092-v2.txt
>
>
> I discovered this when working on HBASE-10717
> Here is sample code to reproduce the problem:
> {code}
> Path desPath = new Path("hdfs://127.0.0.1/");
> FileSystem desFs = desPath.getFileSystem(conf);
> 
> String s = desFs.getCanonicalServiceName();
> URI uri = desFs.getUri();
> {code}
> Canonical name string contains the default port - 8020
> But uri doesn't contain port.
> This would result in the following exception:
> {code}
> testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
> 0.001 sec  <<< ERROR!
> java.lang.IllegalArgumentException: port out of range:-1
> at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
> at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
> at 
> org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
> {code}
> Thanks to Brando Li who helped debug this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6072) Clean up dead code of FSImage

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931841#comment-13931841
 ] 

Hudson commented on HDFS-6072:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1724 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1724/])
HDFS-6072. Clean up dead code of FSImage. Contributed by Haohui Mai. (wheat9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576513)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/delegation/DelegationTokenSecretManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageSerialization.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiff.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectoryWithSnapshotFeature.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileDiff.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/Snapshot.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotFSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java


> Clean up dead code of FSImage
> -
>
> Key: HDFS-6072
> URL: https://issues.apache.org/jira/browse/HDFS-6072
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.4.0
>
> Attachments: HDFS-6072.000.patch, HDFS-6072.001.patch, 
> HDFS-6072.002.patch
>
>
> After HDFS-5698, HDFS stores the FSImage in protobuf format. The old code for 
> saving the FSImage is now dead and should be removed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5638) HDFS implementation of FileContext API for ACLs.

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931837#comment-13931837
 ] 

Hudson commented on HDFS-5638:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1724 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1724/])
HDFS-5638. HDFS implementation of FileContext API for ACLs. Contributed by 
Vinayakumar B. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576405)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/fs/Hdfs.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileContextAcl.java


> HDFS implementation of FileContext API for ACLs.
> 
>
> Key: HDFS-5638
> URL: https://issues.apache.org/jira/browse/HDFS-5638
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Affects Versions: HDFS ACLs (HDFS-4685)
>Reporter: Chris Nauroth
>Assignee: Vinayakumar B
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HDFS-5638.2.patch, HDFS-5638.patch, HDFS-5638.patch, 
> HDFS-5638.patch
>
>
> Add new methods to {{AbstractFileSystem}} and {{FileContext}} for 
> manipulating ACLs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6086) Fix a case where zero-copy or no-checksum reads were not allowed even when the block was cached

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931840#comment-13931840
 ] 

Hudson commented on HDFS-6086:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1724 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1724/])
HDFS-6086. Fix a case where zero-copy or no-checksum reads were not allowed 
even when the block was cached. (cmccabe) (cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576533)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/ShortCircuitReplica.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ShortCircuitRegistry.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetCache.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestEnhancedByteBufferAccess.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java


> Fix a case where zero-copy or no-checksum reads were not allowed even when 
> the block was cached
> ---
>
> Key: HDFS-6086
> URL: https://issues.apache.org/jira/browse/HDFS-6086
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 2.4.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 2.4.0
>
> Attachments: HDFS-6086.001.patch, HDFS-6086.002.patch
>
>
> We need to fix a case where zero-copy or no-checksum reads are not allowed 
> even when the block is cached.  The case is when the block is cached before 
> the {{REQUEST_SHORT_CIRCUIT_FDS}} operation begins.  In this case, 
> {{DataXceiver}} needs to consult the {{ShortCircuitRegistry}} to see if the 
> block is cached, rather than relying on a callback.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5638) HDFS implementation of FileContext API for ACLs.

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931773#comment-13931773
 ] 

Hudson commented on HDFS-5638:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1699 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1699/])
HDFS-5638. HDFS implementation of FileContext API for ACLs. Contributed by 
Vinayakumar B. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576405)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/fs/Hdfs.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileContextAcl.java


> HDFS implementation of FileContext API for ACLs.
> 
>
> Key: HDFS-5638
> URL: https://issues.apache.org/jira/browse/HDFS-5638
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Affects Versions: HDFS ACLs (HDFS-4685)
>Reporter: Chris Nauroth
>Assignee: Vinayakumar B
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HDFS-5638.2.patch, HDFS-5638.patch, HDFS-5638.patch, 
> HDFS-5638.patch
>
>
> Add new methods to {{AbstractFileSystem}} and {{FileContext}} for 
> manipulating ACLs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6072) Clean up dead code of FSImage

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931777#comment-13931777
 ] 

Hudson commented on HDFS-6072:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1699 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1699/])
HDFS-6072. Clean up dead code of FSImage. Contributed by Haohui Mai. (wheat9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576513)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/delegation/DelegationTokenSecretManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageSerialization.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiff.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectoryWithSnapshotFeature.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileDiff.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/Snapshot.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotFSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java


> Clean up dead code of FSImage
> -
>
> Key: HDFS-6072
> URL: https://issues.apache.org/jira/browse/HDFS-6072
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.4.0
>
> Attachments: HDFS-6072.000.patch, HDFS-6072.001.patch, 
> HDFS-6072.002.patch
>
>
> After HDFS-5698, HDFS stores the FSImage in protobuf format. The old code for 
> saving the FSImage is now dead and should be removed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6086) Fix a case where zero-copy or no-checksum reads were not allowed even when the block was cached

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931776#comment-13931776
 ] 

Hudson commented on HDFS-6086:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1699 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1699/])
HDFS-6086. Fix a case where zero-copy or no-checksum reads were not allowed 
even when the block was cached. (cmccabe) (cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576533)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/ShortCircuitReplica.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ShortCircuitRegistry.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetCache.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestEnhancedByteBufferAccess.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java


> Fix a case where zero-copy or no-checksum reads were not allowed even when 
> the block was cached
> ---
>
> Key: HDFS-6086
> URL: https://issues.apache.org/jira/browse/HDFS-6086
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 2.4.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 2.4.0
>
> Attachments: HDFS-6086.001.patch, HDFS-6086.002.patch
>
>
> We need to fix a case where zero-copy or no-checksum reads are not allowed 
> even when the block is cached.  The case is when the block is cached before 
> the {{REQUEST_SHORT_CIRCUIT_FDS}} operation begins.  In this case, 
> {{DataXceiver}} needs to consult the {{ShortCircuitRegistry}} to see if the 
> block is cached, rather than relying on a callback.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931766#comment-13931766
 ] 

Ted Yu commented on HDFS-6092:
--

[~haosd...@gmail.com]:
With your patch, desFs's URI still doesn't have a port.
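
To make that feedback concrete, here is a minimal regression check, a sketch 
only: the test class name is invented, and it assumes JUnit 4 plus the public 
{{FileSystem}} API. It fails as long as the two calls disagree about the port:

{code}
import static org.junit.Assert.assertEquals;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.junit.Test;

public class TestServiceNameUriPort {
  @Test
  public void portInServiceNameMatchesUri() throws Exception {
    FileSystem fs = new Path("hdfs://127.0.0.1/").getFileSystem(new Configuration());
    String service = fs.getCanonicalServiceName();           // e.g. "127.0.0.1:8020"
    int servicePort = Integer.parseInt(service.substring(service.indexOf(':') + 1));
    // Fails while the bug is present: getUri() reports port -1.
    assertEquals(servicePort, fs.getUri().getPort());
  }
}
{code}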

> DistributedFileSystem#getCanonicalServiceName() and 
> DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
> --
>
> Key: HDFS-6092
> URL: https://issues.apache.org/jira/browse/HDFS-6092
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Ted Yu
> Attachments: haosdent-HDFS-6092.patch, hdfs-6092-v1.txt, 
> hdfs-6092-v2.txt
>
>
> I discovered this when working on HBASE-10717
> Here is sample code to reproduce the problem:
> {code}
> Path desPath = new Path("hdfs://127.0.0.1/");
> FileSystem desFs = desPath.getFileSystem(conf);
> 
> String s = desFs.getCanonicalServiceName();
> URI uri = desFs.getUri();
> {code}
> Canonical name string contains the default port - 8020
> But uri doesn't contain port.
> This would result in the following exception:
> {code}
> testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
> 0.001 sec  <<< ERROR!
> java.lang.IllegalArgumentException: port out of range:-1
> at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
> at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
> at 
> org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
> {code}
> Thanks to Brando Li who helped debug this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-12 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931669#comment-13931669
 ] 

haosdent commented on HDFS-6092:


[~te...@apache.org] I think adding this to HBase is fine, but it would be a bit 
of a hack to add it to HDFS, because I think we should fix this at the source 
of the problem in HDFS.

{code}
String str = uri.getScheme() + "://" + uri.getAuthority();
this.uri = URI.create(str);
if (uri.getPort() == -1) {
  String svcName = this.dfs.getCanonicalServiceName();
  int idx = svcName.indexOf(':');
  if (idx > 0) {
    str = str + svcName.substring(idx);
    this.uri = URI.create(str);
  }
}
{code}
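
Pulled out of {{DistributedFileSystem#initialize}} for clarity, the idea above 
can be exercised on its own. The helper below is a hypothetical, self-contained 
restatement (class and method names are invented), not the code that would ship:

{code}
import java.net.URI;

public final class ServiceNamePortFix {
  private ServiceNamePortFix() {}

  // If the filesystem URI carries no port, borrow it from the canonical
  // service name ("host:port"); otherwise leave the URI untouched.
  static URI adoptServicePort(URI uri, String canonicalServiceName) {
    if (uri.getPort() != -1 || canonicalServiceName == null) {
      return uri;
    }
    int idx = canonicalServiceName.indexOf(':');
    if (idx <= 0) {
      return uri;                                   // no port to borrow
    }
    return URI.create(uri.getScheme() + "://" + uri.getAuthority()
        + canonicalServiceName.substring(idx) + uri.getPath());
  }

  public static void main(String[] args) {
    URI fixed = adoptServicePort(URI.create("hdfs://127.0.0.1/"), "127.0.0.1:8020");
    System.out.println(fixed);                      // hdfs://127.0.0.1:8020/
  }
}
{code}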

> DistributedFileSystem#getCanonicalServiceName() and 
> DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
> --
>
> Key: HDFS-6092
> URL: https://issues.apache.org/jira/browse/HDFS-6092
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Ted Yu
> Attachments: haosdent-HDFS-6092.patch, hdfs-6092-v1.txt, 
> hdfs-6092-v2.txt
>
>
> I discovered this when working on HBASE-10717
> Here is sample code to reproduce the problem:
> {code}
> Path desPath = new Path("hdfs://127.0.0.1/");
> FileSystem desFs = desPath.getFileSystem(conf);
> 
> String s = desFs.getCanonicalServiceName();
> URI uri = desFs.getUri();
> {code}
> Canonical name string contains the default port - 8020
> But uri doesn't contain port.
> This would result in the following exception:
> {code}
> testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
> 0.001 sec  <<< ERROR!
> java.lang.IllegalArgumentException: port out of range:-1
> at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
> at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
> at 
> org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
> {code}
> Thanks to Brando Li who helped debug this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-12 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated HDFS-6092:
---

Attachment: haosdent-HDFS-6092.patch

{code}
public static Text buildTokenService(InetSocketAddress addr, boolean isForceUseIp) {
{code}

I added this method to fix this.
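
Since the patch itself is only linked as an attachment, here is a rough guess 
at what an overload like that could do: when {{isForceUseIp}} is set, build the 
token service from the numeric address so the service name and the URI cannot 
drift apart. The class name and body below are hypothetical and are not the 
attached change; only the existing single-argument 
{{SecurityUtil#buildTokenService(InetSocketAddress)}} is taken as given:

{code}
import java.net.InetSocketAddress;
import org.apache.hadoop.io.Text;

public final class TokenServiceSketch {
  private TokenServiceSketch() {}

  // Hypothetical variant: force the host part to the numeric IP when asked,
  // instead of whatever form the caller's address happens to carry.
  public static Text buildTokenService(InetSocketAddress addr, boolean isForceUseIp) {
    String host = (isForceUseIp && addr.getAddress() != null)
        ? addr.getAddress().getHostAddress()   // numeric form, e.g. 127.0.0.1
        : addr.getHostName();                  // host name as resolved/given
    return new Text(host + ":" + addr.getPort());
  }

  public static void main(String[] args) {
    InetSocketAddress addr = new InetSocketAddress("localhost", 8020);
    System.out.println(buildTokenService(addr, true));   // 127.0.0.1:8020
  }
}
{code}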

> DistributedFileSystem#getCanonicalServiceName() and 
> DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
> --
>
> Key: HDFS-6092
> URL: https://issues.apache.org/jira/browse/HDFS-6092
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Ted Yu
> Attachments: haosdent-HDFS-6092.patch, hdfs-6092-v1.txt, 
> hdfs-6092-v2.txt
>
>
> I discovered this when working on HBASE-10717
> Here is sample code to reproduce the problem:
> {code}
> Path desPath = new Path("hdfs://127.0.0.1/");
> FileSystem desFs = desPath.getFileSystem(conf);
> 
> String s = desFs.getCanonicalServiceName();
> URI uri = desFs.getUri();
> {code}
> Canonical name string contains the default port - 8020
> But uri doesn't contain port.
> This would result in the following exception:
> {code}
> testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
> 0.001 sec  <<< ERROR!
> java.lang.IllegalArgumentException: port out of range:-1
> at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
> at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
> at 
> org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
> {code}
> Thanks to Brando Li who helped debug this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-12 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931663#comment-13931663
 ] 

haosdent commented on HDFS-6092:


I have another idea to fix this. Let me attach my patch.

> DistributedFileSystem#getCanonicalServiceName() and 
> DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
> --
>
> Key: HDFS-6092
> URL: https://issues.apache.org/jira/browse/HDFS-6092
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Ted Yu
> Attachments: hdfs-6092-v1.txt, hdfs-6092-v2.txt
>
>
> I discovered this when working on HBASE-10717
> Here is sample code to reproduce the problem:
> {code}
> Path desPath = new Path("hdfs://127.0.0.1/");
> FileSystem desFs = desPath.getFileSystem(conf);
> 
> String s = desFs.getCanonicalServiceName();
> URI uri = desFs.getUri();
> {code}
> Canonical name string contains the default port - 8020
> But uri doesn't contain port.
> This would result in the following exception:
> {code}
> testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
> 0.001 sec  <<< ERROR!
> java.lang.IllegalArgumentException: port out of range:-1
> at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
> at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
> at 
> org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
> {code}
> Thanks to Brando Li who helped debug this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6072) Clean up dead code of FSImage

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931639#comment-13931639
 ] 

Hudson commented on HDFS-6072:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #507 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/507/])
HDFS-6072. Clean up dead code of FSImage. Contributed by Haohui Mai. (wheat9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576513)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/delegation/DelegationTokenSecretManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageSerialization.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiff.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectoryWithSnapshotFeature.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileDiff.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/Snapshot.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotFSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java


> Clean up dead code of FSImage
> -
>
> Key: HDFS-6072
> URL: https://issues.apache.org/jira/browse/HDFS-6072
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.4.0
>
> Attachments: HDFS-6072.000.patch, HDFS-6072.001.patch, 
> HDFS-6072.002.patch
>
>
> After HDFS-5698, HDFS stores the FSImage in protobuf format. The old code for 
> saving the FSImage is now dead and should be removed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5638) HDFS implementation of FileContext API for ACLs.

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931635#comment-13931635
 ] 

Hudson commented on HDFS-5638:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #507 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/507/])
HDFS-5638. HDFS implementation of FileContext API for ACLs. Contributed by 
Vinayakumar B. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576405)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/fs/Hdfs.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileContextAcl.java


> HDFS implementation of FileContext API for ACLs.
> 
>
> Key: HDFS-5638
> URL: https://issues.apache.org/jira/browse/HDFS-5638
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Affects Versions: HDFS ACLs (HDFS-4685)
>Reporter: Chris Nauroth
>Assignee: Vinayakumar B
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HDFS-5638.2.patch, HDFS-5638.patch, HDFS-5638.patch, 
> HDFS-5638.patch
>
>
> Add new methods to {{AbstractFileSystem}} and {{FileContext}} for 
> manipulating ACLs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6086) Fix a case where zero-copy or no-checksum reads were not allowed even when the block was cached

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931638#comment-13931638
 ] 

Hudson commented on HDFS-6086:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #507 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/507/])
HDFS-6086. Fix a case where zero-copy or no-checksum reads were not allowed 
even when the block was cached. (cmccabe) (cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576533)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/ShortCircuitReplica.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ShortCircuitRegistry.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetCache.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestEnhancedByteBufferAccess.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java


> Fix a case where zero-copy or no-checksum reads were not allowed even when 
> the block was cached
> ---
>
> Key: HDFS-6086
> URL: https://issues.apache.org/jira/browse/HDFS-6086
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 2.4.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 2.4.0
>
> Attachments: HDFS-6086.001.patch, HDFS-6086.002.patch
>
>
> We need to fix a case where zero-copy or no-checksum reads are not allowed 
> even when the block is cached.  The case is when the block is cached before 
> the {{REQUEST_SHORT_CIRCUIT_FDS}} operation begins.  In this case, 
> {{DataXceiver}} needs to consult the {{ShortCircuitRegistry}} to see if the 
> block is cached, rather than relying on a callback.



--
This message was sent by Atlassian JIRA
(v6.2#6252)