[jira] [Commented] (HDFS-6088) Add configurable maximum block count for datanode

2014-03-12 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931464#comment-13931464
 ] 

Todd Lipcon commented on HDFS-6088:
---

Any chance we could determine this automatically based on heap size? Would be 
nice to avoid having yet another config that users have to set.
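Todd's idea could be prototyped roughly as below. This is a hypothetical sketch, not Hadoop code: the class and method names are invented, and the per-block heap cost and heap fraction are assumed placeholder values, not measured Hadoop numbers.

```java
// Hypothetical sketch: derive a DataNode block limit from the JVM heap
// instead of introducing another config knob. EST_BYTES_PER_BLOCK and
// HEAP_FRACTION are assumed placeholder values for illustration.
public class BlockLimitSketch {
    // Assumed average on-heap cost of tracking one replica, in bytes.
    static final long EST_BYTES_PER_BLOCK = 200;
    // Assumed fraction of the heap allowed for replica metadata.
    static final double HEAP_FRACTION = 0.25;

    static long maxBlocksFor(long maxHeapBytes) {
        return (long) (maxHeapBytes * HEAP_FRACTION) / EST_BYTES_PER_BLOCK;
    }

    public static void main(String[] args) {
        // Use the running JVM's configured max heap (-Xmx).
        long heap = Runtime.getRuntime().maxMemory();
        System.out.println("derived max block count: " + maxBlocksFor(heap));
    }
}
```

With a 1 GiB heap and these placeholder constants, the limit works out to roughly 1.3 million blocks; the real constants would have to come from measuring replica-tracking overhead.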

 Add configurable maximum block count for datanode
 -

 Key: HDFS-6088
 URL: https://issues.apache.org/jira/browse/HDFS-6088
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kihwal Lee

 Currently datanode resources are protected by the free space check and the 
 balancer.  But datanodes can run out of memory simply by storing too many 
 blocks. If the sizes of blocks are small, datanodes will appear to have 
 plenty of space to put more blocks.
 I propose adding a configurable max block count to the datanode. Since 
 datanodes can have different heap configurations, it makes sense to enforce 
 this at the datanode level, rather than at the namenode.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5638) HDFS implementation of FileContext API for ACLs.

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931635#comment-13931635
 ] 

Hudson commented on HDFS-5638:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #507 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/507/])
HDFS-5638. HDFS implementation of FileContext API for ACLs. Contributed by 
Vinayakumar B. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576405)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/fs/Hdfs.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileContextAcl.java


 HDFS implementation of FileContext API for ACLs.
 

 Key: HDFS-5638
 URL: https://issues.apache.org/jira/browse/HDFS-5638
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Vinayakumar B
 Fix For: 3.0.0, 2.4.0

 Attachments: HDFS-5638.2.patch, HDFS-5638.patch, HDFS-5638.patch, 
 HDFS-5638.patch


 Add new methods to {{AbstractFileSystem}} and {{FileContext}} for 
 manipulating ACLs.





[jira] [Commented] (HDFS-6086) Fix a case where zero-copy or no-checksum reads were not allowed even when the block was cached

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931638#comment-13931638
 ] 

Hudson commented on HDFS-6086:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #507 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/507/])
HDFS-6086. Fix a case where zero-copy or no-checksum reads were not allowed 
even when the block was cached. (cmccabe) (cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576533)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/ShortCircuitReplica.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ShortCircuitRegistry.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetCache.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestEnhancedByteBufferAccess.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java


 Fix a case where zero-copy or no-checksum reads were not allowed even when 
 the block was cached
 ---

 Key: HDFS-6086
 URL: https://issues.apache.org/jira/browse/HDFS-6086
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Fix For: 2.4.0

 Attachments: HDFS-6086.001.patch, HDFS-6086.002.patch


 We need to fix a case where zero-copy or no-checksum reads are not allowed 
 even when the block is cached.  The case is when the block is cached before 
 the {{REQUEST_SHORT_CIRCUIT_FDS}} operation begins.  In this case, 
 {{DataXceiver}} needs to consult the {{ShortCircuitRegistry}} to see if the 
 block is cached, rather than relying on a callback.
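The consult-the-registry-instead-of-a-callback idea above can be mocked up in a few lines. This is a hypothetical, self-contained sketch: `CacheRegistry` and `requestShortCircuitFds` are invented stand-ins, not the real `ShortCircuitRegistry` or `DataXceiver` APIs.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical mock of the fix's logic: instead of relying on a callback
// fired when a block is cached, the request path queries the registry
// directly, so blocks cached *before* REQUEST_SHORT_CIRCUIT_FDS begins
// are still seen. Names here are illustrative, not real Hadoop classes.
public class ShortCircuitSketch {
    static class CacheRegistry {
        private final Set<Long> cachedBlocks = new HashSet<>();
        void blockCached(long blockId) { cachedBlocks.add(blockId); }
        boolean isCached(long blockId) { return cachedBlocks.contains(blockId); }
    }

    // Returns true if the reader may use no-checksum / zero-copy reads
    // because the block is already cached.
    static boolean requestShortCircuitFds(CacheRegistry registry, long blockId) {
        // Consult the registry at request time, covering the case where
        // the block was cached before this request started.
        return registry.isCached(blockId);
    }

    public static void main(String[] args) {
        CacheRegistry reg = new CacheRegistry();
        reg.blockCached(42L);                                 // cached before the request
        System.out.println(requestShortCircuitFds(reg, 42L)); // true
        System.out.println(requestShortCircuitFds(reg, 7L));  // false
    }
}
```

A callback-only design would miss block 42 here, because the cache event fired before any request was listening.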





[jira] [Commented] (HDFS-6072) Clean up dead code of FSImage

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931639#comment-13931639
 ] 

Hudson commented on HDFS-6072:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #507 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/507/])
HDFS-6072. Clean up dead code of FSImage. Contributed by Haohui Mai. (wheat9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576513)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/delegation/DelegationTokenSecretManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageSerialization.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiff.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectoryWithSnapshotFeature.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileDiff.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/Snapshot.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotFSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java


 Clean up dead code of FSImage
 -

 Key: HDFS-6072
 URL: https://issues.apache.org/jira/browse/HDFS-6072
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 2.4.0

 Attachments: HDFS-6072.000.patch, HDFS-6072.001.patch, 
 HDFS-6072.002.patch


 After HDFS-5698, HDFS stores the FSImage in protobuf format. The old code for 
 saving the FSImage is now dead and should be removed.





[jira] [Commented] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-12 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931663#comment-13931663
 ] 

haosdent commented on HDFS-6092:


I have another idea for fixing this. Let me attach my patch.

 DistributedFileSystem#getCanonicalServiceName() and 
 DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
 --

 Key: HDFS-6092
 URL: https://issues.apache.org/jira/browse/HDFS-6092
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Ted Yu
 Attachments: hdfs-6092-v1.txt, hdfs-6092-v2.txt


 I discovered this when working on HBASE-10717
 Here is sample code to reproduce the problem:
 {code}
 Path desPath = new Path("hdfs://127.0.0.1/");
 FileSystem desFs = desPath.getFileSystem(conf);
 
 String s = desFs.getCanonicalServiceName();
 URI uri = desFs.getUri();
 {code}
 The canonical service name string contains the default port (8020), but the 
 URI doesn't contain a port. This results in the following exception:
 {code}
 testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
 0.001 sec  <<< ERROR!
 java.lang.IllegalArgumentException: port out of range:-1
 at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
 at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
 at 
 org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
 {code}
 Thanks to Brandon Li, who helped debug this.
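The stack trace above can be reproduced without any Hadoop code: `URI#getPort()` returns -1 when the URI carries no explicit port, and feeding that -1 into `InetSocketAddress` throws exactly the "port out of range" error. A minimal, self-contained demonstration:

```java
import java.net.InetSocketAddress;
import java.net.URI;

// Minimal reproduction of why the mismatch bites: a port-less URI reports
// port -1, and constructing an InetSocketAddress from it fails with the
// IllegalArgumentException seen in the report.
public class PortMismatchDemo {
    public static void main(String[] args) {
        URI uri = URI.create("hdfs://127.0.0.1/");   // no explicit port
        System.out.println(uri.getPort());           // -1
        try {
            new InetSocketAddress(uri.getHost(), uri.getPort());
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());      // port out of range:-1
        }
    }
}
```

Callers like HBase's `FSHDFSUtils.getNNAddresses` therefore need either a URI that always carries a port, or a default-port fallback before building the socket address.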





[jira] [Updated] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-12 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated HDFS-6092:
---

Attachment: haosdent-HDFS-6092.patch

{code}
public static Text buildTokenService(InetSocketAddress addr, boolean isForceUseIp) {
{code}

I add this method to fix this.

 DistributedFileSystem#getCanonicalServiceName() and 
 DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
 --

 Key: HDFS-6092
 URL: https://issues.apache.org/jira/browse/HDFS-6092
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Ted Yu
 Attachments: haosdent-HDFS-6092.patch, hdfs-6092-v1.txt, 
 hdfs-6092-v2.txt


 I discovered this when working on HBASE-10717
 Here is sample code to reproduce the problem:
 {code}
 Path desPath = new Path("hdfs://127.0.0.1/");
 FileSystem desFs = desPath.getFileSystem(conf);
 
 String s = desFs.getCanonicalServiceName();
 URI uri = desFs.getUri();
 {code}
 The canonical service name string contains the default port (8020), but the 
 URI doesn't contain a port. This results in the following exception:
 {code}
 testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
 0.001 sec  <<< ERROR!
 java.lang.IllegalArgumentException: port out of range:-1
 at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
 at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
 at 
 org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
 {code}
 Thanks to Brandon Li, who helped debug this.





[jira] [Commented] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-12 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931669#comment-13931669
 ] 

haosdent commented on HDFS-6092:


[~te...@apache.org] I think adding this to HBase would be fine, but it would be 
a bit of a trick to add it to HDFS, because I think we should fix this at the 
source of the problem in HDFS.

{code}
String str = uri.getScheme() + "://" + uri.getAuthority();
this.uri = URI.create(str);
if (uri.getPort() == -1) {
  String svcName = this.dfs.getCanonicalServiceName();
  int idx = svcName.indexOf(':');
  if (idx > 0) {
    str = str + svcName.substring(idx);
    this.uri = URI.create(str);
  }
}
{code}

 DistributedFileSystem#getCanonicalServiceName() and 
 DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
 --

 Key: HDFS-6092
 URL: https://issues.apache.org/jira/browse/HDFS-6092
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Ted Yu
 Attachments: haosdent-HDFS-6092.patch, hdfs-6092-v1.txt, 
 hdfs-6092-v2.txt


 I discovered this when working on HBASE-10717
 Here is sample code to reproduce the problem:
 {code}
 Path desPath = new Path("hdfs://127.0.0.1/");
 FileSystem desFs = desPath.getFileSystem(conf);
 
 String s = desFs.getCanonicalServiceName();
 URI uri = desFs.getUri();
 {code}
 The canonical service name string contains the default port (8020), but the 
 URI doesn't contain a port. This results in the following exception:
 {code}
 testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
 0.001 sec  <<< ERROR!
 java.lang.IllegalArgumentException: port out of range:-1
 at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
 at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
 at 
 org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
 {code}
 Thanks to Brandon Li, who helped debug this.





[jira] [Commented] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931766#comment-13931766
 ] 

Ted Yu commented on HDFS-6092:
--

[~haosd...@gmail.com]:
With your patch, desFs's URI still doesn't have a port.

 DistributedFileSystem#getCanonicalServiceName() and 
 DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
 --

 Key: HDFS-6092
 URL: https://issues.apache.org/jira/browse/HDFS-6092
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Ted Yu
 Attachments: haosdent-HDFS-6092.patch, hdfs-6092-v1.txt, 
 hdfs-6092-v2.txt


 I discovered this when working on HBASE-10717
 Here is sample code to reproduce the problem:
 {code}
 Path desPath = new Path("hdfs://127.0.0.1/");
 FileSystem desFs = desPath.getFileSystem(conf);
 
 String s = desFs.getCanonicalServiceName();
 URI uri = desFs.getUri();
 {code}
 The canonical service name string contains the default port (8020), but the 
 URI doesn't contain a port. This results in the following exception:
 {code}
 testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
 0.001 sec  <<< ERROR!
 java.lang.IllegalArgumentException: port out of range:-1
 at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
 at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
 at 
 org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
 {code}
 Thanks to Brandon Li, who helped debug this.





[jira] [Commented] (HDFS-6072) Clean up dead code of FSImage

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931777#comment-13931777
 ] 

Hudson commented on HDFS-6072:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1699 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1699/])
HDFS-6072. Clean up dead code of FSImage. Contributed by Haohui Mai. (wheat9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576513)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/delegation/DelegationTokenSecretManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageSerialization.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiff.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectoryWithSnapshotFeature.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileDiff.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/Snapshot.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotFSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java


 Clean up dead code of FSImage
 -

 Key: HDFS-6072
 URL: https://issues.apache.org/jira/browse/HDFS-6072
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 2.4.0

 Attachments: HDFS-6072.000.patch, HDFS-6072.001.patch, 
 HDFS-6072.002.patch


 After HDFS-5698, HDFS stores the FSImage in protobuf format. The old code for 
 saving the FSImage is now dead and should be removed.





[jira] [Commented] (HDFS-6086) Fix a case where zero-copy or no-checksum reads were not allowed even when the block was cached

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931776#comment-13931776
 ] 

Hudson commented on HDFS-6086:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1699 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1699/])
HDFS-6086. Fix a case where zero-copy or no-checksum reads were not allowed 
even when the block was cached. (cmccabe) (cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576533)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/ShortCircuitReplica.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ShortCircuitRegistry.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetCache.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestEnhancedByteBufferAccess.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java


 Fix a case where zero-copy or no-checksum reads were not allowed even when 
 the block was cached
 ---

 Key: HDFS-6086
 URL: https://issues.apache.org/jira/browse/HDFS-6086
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Fix For: 2.4.0

 Attachments: HDFS-6086.001.patch, HDFS-6086.002.patch


 We need to fix a case where zero-copy or no-checksum reads are not allowed 
 even when the block is cached.  The case is when the block is cached before 
 the {{REQUEST_SHORT_CIRCUIT_FDS}} operation begins.  In this case, 
 {{DataXceiver}} needs to consult the {{ShortCircuitRegistry}} to see if the 
 block is cached, rather than relying on a callback.





[jira] [Commented] (HDFS-5638) HDFS implementation of FileContext API for ACLs.

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931773#comment-13931773
 ] 

Hudson commented on HDFS-5638:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1699 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1699/])
HDFS-5638. HDFS implementation of FileContext API for ACLs. Contributed by 
Vinayakumar B. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576405)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/fs/Hdfs.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileContextAcl.java


 HDFS implementation of FileContext API for ACLs.
 

 Key: HDFS-5638
 URL: https://issues.apache.org/jira/browse/HDFS-5638
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Vinayakumar B
 Fix For: 3.0.0, 2.4.0

 Attachments: HDFS-5638.2.patch, HDFS-5638.patch, HDFS-5638.patch, 
 HDFS-5638.patch


 Add new methods to {{AbstractFileSystem}} and {{FileContext}} for 
 manipulating ACLs.





[jira] [Commented] (HDFS-5638) HDFS implementation of FileContext API for ACLs.

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931837#comment-13931837
 ] 

Hudson commented on HDFS-5638:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1724 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1724/])
HDFS-5638. HDFS implementation of FileContext API for ACLs. Contributed by 
Vinayakumar B. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576405)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/fs/Hdfs.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileContextAcl.java


 HDFS implementation of FileContext API for ACLs.
 

 Key: HDFS-5638
 URL: https://issues.apache.org/jira/browse/HDFS-5638
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Vinayakumar B
 Fix For: 3.0.0, 2.4.0

 Attachments: HDFS-5638.2.patch, HDFS-5638.patch, HDFS-5638.patch, 
 HDFS-5638.patch


 Add new methods to {{AbstractFileSystem}} and {{FileContext}} for 
 manipulating ACLs.





[jira] [Commented] (HDFS-6086) Fix a case where zero-copy or no-checksum reads were not allowed even when the block was cached

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931840#comment-13931840
 ] 

Hudson commented on HDFS-6086:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1724 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1724/])
HDFS-6086. Fix a case where zero-copy or no-checksum reads were not allowed 
even when the block was cached. (cmccabe) (cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576533)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/ShortCircuitReplica.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ShortCircuitRegistry.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetCache.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestEnhancedByteBufferAccess.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java


 Fix a case where zero-copy or no-checksum reads were not allowed even when 
 the block was cached
 ---

 Key: HDFS-6086
 URL: https://issues.apache.org/jira/browse/HDFS-6086
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Fix For: 2.4.0

 Attachments: HDFS-6086.001.patch, HDFS-6086.002.patch


 We need to fix a case where zero-copy or no-checksum reads are not allowed 
 even when the block is cached.  The case is when the block is cached before 
 the {{REQUEST_SHORT_CIRCUIT_FDS}} operation begins.  In this case, 
 {{DataXceiver}} needs to consult the {{ShortCircuitRegistry}} to see if the 
 block is cached, rather than relying on a callback.





[jira] [Commented] (HDFS-6072) Clean up dead code of FSImage

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931841#comment-13931841
 ] 

Hudson commented on HDFS-6072:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1724 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1724/])
HDFS-6072. Clean up dead code of FSImage. Contributed by Haohui Mai. (wheat9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576513)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/delegation/DelegationTokenSecretManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageSerialization.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiff.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectoryWithSnapshotFeature.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileDiff.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/Snapshot.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotFSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java


 Clean up dead code of FSImage
 -

 Key: HDFS-6072
 URL: https://issues.apache.org/jira/browse/HDFS-6072
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 2.4.0

 Attachments: HDFS-6072.000.patch, HDFS-6072.001.patch, 
 HDFS-6072.002.patch


 After HDFS-5698, HDFS stores the FSImage in protobuf format. The old code for 
 saving the FSImage is now dead and should be removed.





[jira] [Updated] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-12 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated HDFS-6092:
---

Attachment: haosdent-HDFS-6092-v2.patch

 DistributedFileSystem#getCanonicalServiceName() and 
 DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
 --

 Key: HDFS-6092
 URL: https://issues.apache.org/jira/browse/HDFS-6092
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Ted Yu
 Attachments: haosdent-HDFS-6092-v2.patch, haosdent-HDFS-6092.patch, 
 hdfs-6092-v1.txt, hdfs-6092-v2.txt


 I discovered this when working on HBASE-10717
 Here is sample code to reproduce the problem:
 {code}
 Path desPath = new Path("hdfs://127.0.0.1/");
 FileSystem desFs = desPath.getFileSystem(conf);
 
 String s = desFs.getCanonicalServiceName();
 URI uri = desFs.getUri();
 {code}
 The canonical service name string contains the default port (8020), but the 
 URI doesn't contain a port. This results in the following exception:
 {code}
 testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
 0.001 sec  <<< ERROR!
 java.lang.IllegalArgumentException: port out of range:-1
 at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
 at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
 at 
 org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
 {code}
 Thanks to Brandon Li, who helped debug this.





[jira] [Commented] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-12 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931913#comment-13931913
 ] 

haosdent commented on HDFS-6092:


[~te...@apache.org] Sorry, I misunderstood the title before. I replaced your 
code with this snippet.

{code}
if (dfs.getCanonicalServiceName() != null
    && !(dfs.getCanonicalServiceName().startsWith(HdfsConstants.HA_DT_SERVICE_PREFIX))
    && uri.getPort() == -1) {
  uri = UriBuilder.fromUri(uri).port(NameNode.DEFAULT_PORT).build();
}
```
{code}
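The guard in the snippet above can be rendered in a self-contained form, with the Hadoop references stubbed out. This is an illustrative sketch: the `HA_DT_SERVICE_PREFIX` value and default port are assumptions standing in for `HdfsConstants` and `NameNode.DEFAULT_PORT`, and plain `URI` is used in place of the JAX-RS `UriBuilder`.

```java
import java.net.URI;

// Self-contained rendering of the guard in the snippet: append the default
// port to a port-less URI, but leave HA logical URIs (whose canonical
// service names carry an "ha-hdfs:" style prefix) untouched. Constants are
// assumed values, not the real Hadoop definitions.
public class CanonicalPortGuard {
    static final String HA_DT_SERVICE_PREFIX = "ha-hdfs:";  // assumed value
    static final int DEFAULT_PORT = 8020;                   // assumed value

    static URI normalize(URI uri, String canonicalServiceName) {
        if (canonicalServiceName != null
            && !canonicalServiceName.startsWith(HA_DT_SERVICE_PREFIX)
            && uri.getPort() == -1) {
            // Rebuild the URI with the default port filled in.
            return URI.create(uri.getScheme() + "://" + uri.getHost()
                    + ":" + DEFAULT_PORT + uri.getPath());
        }
        return uri;
    }
}
```

The HA exclusion matters because a logical HA URI like `hdfs://mycluster/` has no meaningful port to fill in, so the guard must leave it alone.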



[jira] [Updated] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-12 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HDFS-6092:
-

Attachment: hdfs-6092-v3.txt



[jira] [Commented] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931930#comment-13931930
 ] 

Ted Yu commented on HDFS-6092:
--

How about patch v3?
I tried to be a little more generic by detecting the (port) number in the service name.
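As a rough illustration of that approach, here is a self-contained sketch (the helper name and plain-string parsing are hypothetical, not code from the attached patch) of detecting a port in the canonical service name and propagating it into the URI:

```java
import java.net.URI;

public class ServicePortSketch {
    // Hypothetical helper: if the canonical service name ("host:port")
    // carries an explicit port but the filesystem URI does not, rebuild
    // the URI so both report the same port.
    static URI withServicePort(URI uri, String serviceName) throws Exception {
        int colon = serviceName.lastIndexOf(':');
        if (colon < 0 || uri.getPort() != -1) {
            return uri; // no port in the service name, or URI already has one
        }
        int port = Integer.parseInt(serviceName.substring(colon + 1));
        return new URI(uri.getScheme(), uri.getUserInfo(), uri.getHost(),
                       port, uri.getPath(), uri.getQuery(), uri.getFragment());
    }

    public static void main(String[] args) throws Exception {
        // Mirrors the repro: URI without a port, service name with 8020.
        URI uri = new URI("hdfs://127.0.0.1/");
        System.out.println(withServicePort(uri, "127.0.0.1:8020"));
    }
}
```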



[jira] [Commented] (HDFS-6009) Tools based on favored node feature for isolation

2014-03-12 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931955#comment-13931955
 ] 

Yu Li commented on HDFS-6009:
-

Hi [~thanhdo],

Yes, the data are replicated, so there won't be data loss. However, since one 
datanode might carry data of multiple applications, a datanode failure will 
cause *several* applications' read requests to retry until timeout and switch 
to another datanode, while we'd like to reduce the impact range.

Another scenario we experienced here is that application A was reading data 
from one DN so heavily that it occupied almost all network bandwidth, while 
application B tried to write data to the same DN but was blocked for a long time.

As I mentioned in HDFS-6010, people might ask why we don't use physically 
separated clusters in this case; the answer is that it's more convenient and 
saves resources to manage one big cluster rather than several small ones.

There are also other solutions, like HDFS-5776, to reduce the impact of a bad 
datanode, but I believe there are still scenarios which need stricter I/O 
isolation, so I think it's still valuable to contribute our tools.

Hope this answers your question. :-)

 Tools based on favored node feature for isolation
 -

 Key: HDFS-6009
 URL: https://issues.apache.org/jira/browse/HDFS-6009
 Project: Hadoop HDFS
  Issue Type: Task
Affects Versions: 2.3.0
Reporter: Yu Li
Assignee: Yu Li
Priority: Minor

 There are scenarios, like those mentioned in HBASE-6721 and HBASE-4210, where in 
 multi-tenant deployments of HBase we prefer to specify several groups of 
 regionservers to serve different applications, to achieve some kind of 
 isolation or resource allocation. However, although the regionservers are 
 grouped, the datanodes which store the data are not, which leads to the case 
 that one datanode failure affects multiple applications, as we have already 
 observed in our production environment.
 To relieve the above issue, we could make use of the favored node feature 
 (HDFS-2576) to make regionservers able to locate data within their group, or in 
 other words make datanodes also grouped (passively), to form some level of isolation.
 In this case, or any other case that needs datanodes to group, we would need 
 a bunch of tools to maintain the group, including:
 1. Making balancer able to balance data among specified servers, rather than 
 the whole set
 2. Set balance bandwidth for specified servers, rather than the whole set
 3. Some tool to check whether the block is cross-group placed, and move it 
 back if so
 This JIRA is an umbrella for the above tools.





[jira] [Commented] (HDFS-6010) Make balancer able to balance data among specified servers

2014-03-12 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931974#comment-13931974
 ] 

Yu Li commented on HDFS-6010:
-

Hi [~devaraj], it seems we are waiting for your comment here. :-)

[~szetszwo], any review comments about the patch attached here? Or do we need to 
wait for Das' comments before starting the code review? Thanks.

 Make balancer able to balance data among specified servers
 --

 Key: HDFS-6010
 URL: https://issues.apache.org/jira/browse/HDFS-6010
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: balancer
Affects Versions: 2.3.0
Reporter: Yu Li
Assignee: Yu Li
Priority: Minor
 Attachments: HDFS-6010-trunk.patch


 Currently, the balancer tool balances data among all datanodes. However, in 
 some particular cases, we need to balance data only among specified 
 nodes instead of the whole set.
 In this JIRA, a new -servers option would be introduced to implement this.
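To make the intent concrete, here is a minimal, self-contained sketch (hypothetical names, not code from the attached patch) of the kind of filtering a -servers option implies: restrict the balancer's view of the cluster to the named hosts and move blocks only among them.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class ServerFilterSketch {
    // Hypothetical: keep only the datanodes whose hostname appears in the
    // comma-separated -servers argument; the balancer would then move
    // blocks among these nodes only.
    static List<String> filterDatanodes(List<String> allNodes, String serversArg) {
        Set<String> wanted = new HashSet<>(Arrays.asList(serversArg.split(",")));
        return allNodes.stream()
                .filter(n -> wanted.contains(n.split(":")[0]))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("dn1:50010", "dn2:50010", "dn3:50010");
        System.out.println(filterDatanodes(nodes, "dn1,dn3"));
    }
}
```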





[jira] [Commented] (HDFS-5138) Support HDFS upgrade in HA

2014-03-12 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931985#comment-13931985
 ] 

Suresh Srinivas commented on HDFS-5138:
---

[~atm], most of us are swamped with wrapping up rolling upgrades and testing 
it. Can you please look into this?

 Support HDFS upgrade in HA
 --

 Key: HDFS-5138
 URL: https://issues.apache.org/jira/browse/HDFS-5138
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.1.1-beta
Reporter: Kihwal Lee
Assignee: Aaron T. Myers
Priority: Blocker
 Fix For: 3.0.0

 Attachments: HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, 
 HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, 
 HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, 
 hdfs-5138-branch-2.txt


 With HA enabled, the NN won't start with -upgrade. Since there has been a layout 
 version change between 2.0.x and 2.1.x, starting the NN in upgrade mode was 
 necessary when deploying 2.1.x to an existing 2.0.x cluster. But the only way 
 to get around this was to disable HA and upgrade. 
 The NN and the cluster cannot be flipped back to HA until the upgrade is 
 finalized. If HA is disabled only on the NN for the layout upgrade and HA is turned 
 back on without involving DNs, things will work, but finalizeUpgrade won't 
 work (the NN is in HA and it cannot be in upgrade mode) and the DNs' upgrade 
 snapshots won't get removed.
 We will need a different way of doing layout upgrades and upgrade snapshots.  
 I am marking this as a 2.1.1-beta blocker based on feedback from others.  If 
 there is a reasonable workaround that does not increase the maintenance window 
 greatly, we can lower its priority from blocker to critical.





[jira] [Commented] (HDFS-5138) Support HDFS upgrade in HA

2014-03-12 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931998#comment-13931998
 ] 

Suresh Srinivas commented on HDFS-5138:
---

Sorry, I meant the above comment to be in HDFS-5840.



[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-03-12 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932001#comment-13932001
 ] 

Suresh Srinivas commented on HDFS-5840:
---

[~atm], most of us are swamped with wrapping up rolling upgrades and testing 
it. Can you please look into this?

 Follow-up to HDFS-5138 to improve error handling during partial upgrade 
 failures
 

 Key: HDFS-5840
 URL: https://issues.apache.org/jira/browse/HDFS-5840
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
 Fix For: 3.0.0

 Attachments: HDFS-5840.patch


 Suresh posted some good comments in HDFS-5138 after that patch had already 
 been committed to trunk. This JIRA is to address those. See the first comment 
 of this JIRA for the full content of the review.





[jira] [Commented] (HDFS-4564) Webhdfs returns incorrect http response codes for denied operations

2014-03-12 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932008#comment-13932008
 ] 

Arun C Murthy commented on HDFS-4564:
-

How is this looking [~daryn]? Thanks.

(Doing a pass over 2.4 blockers)

 Webhdfs returns incorrect http response codes for denied operations
 ---

 Key: HDFS-4564
 URL: https://issues.apache.org/jira/browse/HDFS-4564
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: webhdfs
Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: HDFS-4564.branch-23.patch, HDFS-4564.branch-23.patch, 
 HDFS-4564.branch-23.patch, HDFS-4564.patch


 Webhdfs is returning 401 (Unauthorized) instead of 403 (Forbidden) when it 
 denies operations.  Examples include rejecting invalid proxy user attempts 
 and renew/cancel requests with an invalid user.





[jira] [Commented] (HDFS-6009) Tools based on favored node feature for isolation

2014-03-12 Thread Sirianni, Eric (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932042#comment-13932042
 ] 

Sirianni, Eric commented on HDFS-6009:
--

Thanks for emailing NetApp. The email inbox you have attempted to reach has 
been deactivated.




[jira] [Commented] (HDFS-6009) Tools based on favored node feature for isolation

2014-03-12 Thread Thanh Do (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932037#comment-13932037
 ] 

Thanh Do commented on HDFS-6009:


Thank you!



[jira] [Updated] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended

2014-03-12 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-6089:


Attachment: HDFS-6089.001.patch

Fix unit tests.

 Standby NN while transitioning to active throws a connection refused error 
 when the prior active NN process is suspended
 

 Key: HDFS-6089
 URL: https://issues.apache.org/jira/browse/HDFS-6089
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Jing Zhao
 Attachments: HDFS-6089.000.patch, HDFS-6089.001.patch


 The following scenario was tested:
 * Determine Active NN and suspend the process (kill -19)
 * Wait about 60s to let the standby transition to active
 * Get the service state for nn1 and nn2 and make sure nn2 has transitioned to 
 active.
 What was noticed was that sometimes the call to get the service state of nn2 
 got a socket timeout exception.





[jira] [Commented] (HDFS-6010) Make balancer able to balance data among specified servers

2014-03-12 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932055#comment-13932055
 ] 

Devaraj Das commented on HDFS-6010:
---

[~carp84], sorry for the delay in getting back. You know how things work when 
there are deadlines to meet :-)  I have some follow up questions for my 
understanding.

1. How would you maintain the mapping of files to groups (for HDFS-6012 to 
work)? If the mapping is maintained, I wonder whether it makes sense to have 
the tool take paths for balancing as opposed to servers. Then maybe you could 
also combine the tool that does group management (HDFS-6012) into the balancer.
2. Are these mappings set up by some admin?
3. Would you expand a group when it is nearing capacity?
4. How does someone like HBase use this? Is HBase going to have visibility into 
the mappings as well (to take care of HBASE-6721 and favored-nodes for writes)?
5. Would you need a higher level balancer for keeping the whole cluster 
balanced (do migrations of blocks associated with certain paths from one group 
to another)? Otherwise, there would be skews in the block distribution. 
6. When there is a failure of a datanode in a group, how would you choose which 
datanodes to replicate the blocks to? The choice would be somewhat important 
given that some target datanodes might be busy serving requests for apps for 
its group. Adding some more work to these datanodes might make apps in the 
other group suffer. But maybe it's not that big a deal. On the other hand, if 
the group still has capacity, and the failure zones are still intact for the 
members in the group, then the replication could take into account the mapping 
in (1).



[jira] [Created] (HDFS-6095) to add missing description of default value of config properties in the document

2014-03-12 Thread Yongjun Zhang (JIRA)
Yongjun Zhang created HDFS-6095:
---

 Summary: to add missing description of default value of config 
properties in the document
 Key: HDFS-6095
 URL: https://issues.apache.org/jira/browse/HDFS-6095
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Reporter: Yongjun Zhang
Priority: Minor
 Fix For: 2.3.0


We should describe the default value of each config property in the document when 
appropriate.

As an example, the default value of the config property dfs.webhdfs.enabled was 
changed from false to true by HDFS-5532. The document
http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-hdfs/WebHDFS.html
(and various versions) lists the property with no default value described.

I hope different documents can be reviewed and updated per this JIRA request.

Thanks.






[jira] [Updated] (HDFS-6095) to add missing description of default value of config properties in the document

2014-03-12 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-6095:


Affects Version/s: 2.3.0
Fix Version/s: (was: 2.3.0)



[jira] [Updated] (HDFS-5477) Block manager as a service

2014-03-12 Thread Amir Langer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amir Langer updated HDFS-5477:
--

Attachment: Remote BM.pdf

Attached design doc Remote BM.pdf to this JIRA.

 Block manager as a service
 --

 Key: HDFS-5477
 URL: https://issues.apache.org/jira/browse/HDFS-5477
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
 Attachments: Proposal.pdf, Proposal.pdf, Remote BM.pdf, Standalone 
 BM.pdf, Standalone BM.pdf, patches.tar.gz


 The block manager needs to evolve towards having the ability to run as a 
 standalone service to improve NN vertical and horizontal scalability.  The 
 goal is reducing the memory footprint of the NN proper to support larger 
 namespaces, and improve overall performance by decoupling the block manager 
 from the namespace and its lock.  Ideally, a distinct BM will be transparent 
 to clients and DNs.





[jira] [Commented] (HDFS-6007) Update documentation about short-circuit local reads

2014-03-12 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932178#comment-13932178
 ] 

Colin Patrick McCabe commented on HDFS-6007:


Thanks for looking at this.  I think we should limit the scope here to just 
adding a sentence about shared-memory segments, and adding some documentation 
about the legacy short-circuit implementation.

I think the zero-copy API should get its own document.  Putting it in here just 
seems like information overload.

{code}
+  Client and DataNode uses shared memory segments
+  to communicate short-circuit read.
{code}

How about "The client and the DataNode exchange information via a shared memory 
segment."?

{code}
+  if /dev/shm is not world writable or does not exist in your environment,
+  You can change the paths on which shared memory segments are created by
+  setting the value of dfs.datanode.shared.file.descriptor.paths
+  to comma separated paths like /dev/shm,/tmp.
+  It tries paths in order until creation of shared memory segment succeeds.
{code}

Can we skip this section?  99.999% of users will never need to change that 
config value, and there's documentation in hdfs-default.xml for those who do.  
The number of UNIX systems without /tmp must be pretty small indeed.

{code}
+  Legacy short-circuit local reads implementation
+  on which clients directly open HDFS block files is still available
+  for platforms other than Linux.
{code}

Missing 'the'

I think we need a sentence or two explaining that the old short-circuit 
implementation is insecure, because it allows users to directly access the 
blocks.  We also need some explanation about how you have to chmod the blocks 
into the correct UNIX group so that they are accessible.

Please skip the configuration tables.  They just duplicate hdfs-default.xml

 Update documentation about short-circuit local reads
 

 Key: HDFS-6007
 URL: https://issues.apache.org/jira/browse/HDFS-6007
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Reporter: Masatake Iwasaki
Priority: Minor
 Attachments: HDFS-6007-0.patch, HDFS-6007-1.patch, HDFS-6007-2.patch, 
 HDFS-6007-3.patch


 Updating the contents of "HDFS Short-Circuit Local Reads" based on the 
 changes in HDFS-4538 and HDFS-4953.





[jira] [Commented] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932209#comment-13932209
 ] 

Hadoop QA commented on HDFS-6092:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12634196/hdfs-6092-v3.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6382//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6382//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6382//console

This message is automatically generated.



[jira] [Commented] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932212#comment-13932212
 ] 

Hadoop QA commented on HDFS-6092:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12634196/hdfs-6092-v3.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6383//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6383//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6383//console

This message is automatically generated.

 DistributedFileSystem#getCanonicalServiceName() and 
 DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
 --

 Key: HDFS-6092
 URL: https://issues.apache.org/jira/browse/HDFS-6092
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Ted Yu
 Attachments: haosdent-HDFS-6092-v2.patch, haosdent-HDFS-6092.patch, 
 hdfs-6092-v1.txt, hdfs-6092-v2.txt, hdfs-6092-v3.txt


 I discovered this when working on HBASE-10717
 Here is sample code to reproduce the problem:
 {code}
 Path desPath = new Path("hdfs://127.0.0.1/");
 FileSystem desFs = desPath.getFileSystem(conf);
 
 String s = desFs.getCanonicalServiceName();
 URI uri = desFs.getUri();
 {code}
 The canonical service name string contains the default port (8020), but the 
 URI doesn't contain a port.
 This would result in the following exception:
 {code}
 testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
 0.001 sec  <<< ERROR!
 java.lang.IllegalArgumentException: port out of range:-1
 at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
 at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
 at 
 org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
 {code}
 Thanks to Brandon Li, who helped debug this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-5244) TestNNStorageRetentionManager#testPurgeMultipleDirs fails because incorrectly expects Hashmap values to have order.

2014-03-12 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-5244:
--

Target Version/s: 2.1.1-beta, 3.0.0  (was: 3.0.0, 2.1.1-beta)
  Status: Patch Available  (was: Open)

 TestNNStorageRetentionManager#testPurgeMultipleDirs fails because incorrectly 
 expects Hashmap values to have order. 
 

 Key: HDFS-5244
 URL: https://issues.apache.org/jira/browse/HDFS-5244
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.1.0-beta
 Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM java 1.6
Reporter: Jinghui Wang
Assignee: Jinghui Wang
 Fix For: 3.0.0, 2.4.0, 2.1.0-beta

 Attachments: HDFS-5244.patch


 The test o.a.h.hdfs.server.namenode.TestNNStorageRetentionManager uses a 
 HashMap (dirRoots) to store the root storages to be mocked for the purging 
 test, and a HashMap does not have a predictable iteration order. The 
 directories that need to be purged are stored in a LinkedHashSet, which does 
 have a predictable order. So, when the directories get mocked for the test, 
 they may already be out of the order in which they were added. Thus, the order 
 in which the directories are actually purged can differ from the order in 
 which they were added to the LinkedHashSet, causing the test to fail.
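
 The ordering contrast can be shown with plain JDK collections. A minimal 
 sketch (the directory names are illustrative, not taken from the test):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class OrderDemo {
    public static void main(String[] args) {
        // LinkedHashSet guarantees iteration in insertion order.
        Set<String> purgeOrder = new LinkedHashSet<>();
        purgeOrder.add("dir2");
        purgeOrder.add("dir0");
        purgeOrder.add("dir1");
        System.out.println(new ArrayList<>(purgeOrder)); // [dir2, dir0, dir1]

        // HashMap makes no ordering promise at all: iteration order depends
        // on hashing, so it may or may not match insertion order.
        Map<String, String> dirRoots = new HashMap<>();
        dirRoots.put("dir2", "/root2");
        dirRoots.put("dir0", "/root0");
        dirRoots.put("dir1", "/root1");
        System.out.println(dirRoots.keySet()); // unspecified order
    }
}
```

 A test that compares the two iteration orders element by element is therefore 
 inherently flaky.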



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5244) TestNNStorageRetentionManager#testPurgeMultipleDirs fails because incorrectly expects Hashmap values to have order.

2014-03-12 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932228#comment-13932228
 ] 

Suresh Srinivas commented on HDFS-5244:
---

+1 for the patch. Will commit it shortly once Jenkins +1s the patch.

 TestNNStorageRetentionManager#testPurgeMultipleDirs fails because incorrectly 
 expects Hashmap values to have order. 
 

 Key: HDFS-5244
 URL: https://issues.apache.org/jira/browse/HDFS-5244
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.1.0-beta
 Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM java 1.6
Reporter: Jinghui Wang
Assignee: Jinghui Wang
 Fix For: 3.0.0, 2.1.0-beta, 2.4.0

 Attachments: HDFS-5244.patch


 The test o.a.h.hdfs.server.namenode.TestNNStorageRetentionManager uses a 
 HashMap (dirRoots) to store the root storages to be mocked for the purging 
 test, and a HashMap does not have a predictable iteration order. The 
 directories that need to be purged are stored in a LinkedHashSet, which does 
 have a predictable order. So, when the directories get mocked for the test, 
 they may already be out of the order in which they were added. Thus, the order 
 in which the directories are actually purged can differ from the order in 
 which they were added to the LinkedHashSet, causing the test to fail.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6079) Timeout for getFileBlockStorageLocations does not work

2014-03-12 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-6079:
--

Status: Patch Available  (was: Open)

 Timeout for getFileBlockStorageLocations does not work
 --

 Key: HDFS-6079
 URL: https://issues.apache.org/jira/browse/HDFS-6079
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.3.0
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-6079-1.patch


 {{DistributedFileSystem#getFileBlockStorageLocations}} has a config value 
 which lets clients set a timeout, but it's not being enforced correctly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6079) Timeout for getFileBlockStorageLocations does not work

2014-03-12 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-6079:
--

Attachment: hdfs-6079-1.patch

Patch attached. The fix is pretty simple: we just need to catch the 
CancellationException that was previously bubbling all the way up.
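
The general pattern looks like the following JDK-only sketch (this is not the 
actual DFSClient code; the task and timeout values are illustrative):

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeoutDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<String> f = pool.submit(() -> {
            Thread.sleep(5_000); // stands in for a slow datanode RPC
            return "locations";
        });

        String result = null;
        try {
            // Enforce the client-side timeout on the future itself.
            result = f.get(50, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            f.cancel(true); // give up on the slow call
        }

        // A later get() on a cancelled future throws CancellationException
        // (unchecked), which must be caught or it bubbles all the way up.
        try {
            result = f.get();
        } catch (CancellationException e) {
            result = null; // treat as "no result within the timeout"
        }

        pool.shutdownNow();
        System.out.println(result); // null
    }
}
```

The key point is the second catch: cancelling a timed-out future is not enough 
if a subsequent get() can still surface the unchecked CancellationException.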

 Timeout for getFileBlockStorageLocations does not work
 --

 Key: HDFS-6079
 URL: https://issues.apache.org/jira/browse/HDFS-6079
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.3.0
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-6079-1.patch


 {{DistributedFileSystem#getFileBlockStorageLocations}} has a config value 
 which lets clients set a timeout, but it's not being enforced correctly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended

2014-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932285#comment-13932285
 ] 

Hadoop QA commented on HDFS-6089:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12634208/HDFS-6089.001.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6384//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6384//console

This message is automatically generated.

 Standby NN while transitioning to active throws a connection refused error 
 when the prior active NN process is suspended
 

 Key: HDFS-6089
 URL: https://issues.apache.org/jira/browse/HDFS-6089
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Jing Zhao
 Attachments: HDFS-6089.000.patch, HDFS-6089.001.patch


 The following scenario was tested:
 * Determine Active NN and suspend the process (kill -19)
 * Wait about 60s to let the standby transition to active
 * Get the service state for nn1 and nn2 and make sure nn2 has transitioned to 
 active.
 What was noticed is that sometimes the call to get the service state of nn2 
 got a socket timeout exception.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6096) TestWebHdfsTokens may timeout

2014-03-12 Thread Tsz Wo Nicholas Sze (JIRA)
Tsz Wo Nicholas Sze created HDFS-6096:
-

 Summary: TestWebHdfsTokens may timeout
 Key: HDFS-6096
 URL: https://issues.apache.org/jira/browse/HDFS-6096
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor


The timeout of TestWebHdfsTokens is set to 1 second.  It is too short for some 
machines.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6096) TestWebHdfsTokens may timeout

2014-03-12 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-6096:
--

Status: Patch Available  (was: Open)

 TestWebHdfsTokens may timeout
 -

 Key: HDFS-6096
 URL: https://issues.apache.org/jira/browse/HDFS-6096
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Attachments: h6096_20140312.patch


 The timeout of TestWebHdfsTokens is set to 1 second.  It is too short for 
 some machines.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6096) TestWebHdfsTokens may timeout

2014-03-12 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-6096:
--

Attachment: h6096_20140312.patch

h6096_20140312.patch: increases the timeout to 5 seconds and removes an 
unnecessary @SuppressWarnings("unchecked").

 TestWebHdfsTokens may timeout
 -

 Key: HDFS-6096
 URL: https://issues.apache.org/jira/browse/HDFS-6096
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Attachments: h6096_20140312.patch


 The timeout of TestWebHdfsTokens is set to 1 second.  It is too short for 
 some machines.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6096) TestWebHdfsTokens may timeout

2014-03-12 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932301#comment-13932301
 ] 

Haohui Mai commented on HDFS-6096:
--

+1 pending jenkins.

 TestWebHdfsTokens may timeout
 -

 Key: HDFS-6096
 URL: https://issues.apache.org/jira/browse/HDFS-6096
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Attachments: h6096_20140312.patch


 The timeout of TestWebHdfsTokens is set to 1 second.  It is too short for 
 some machines.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6075) Introducing non-replication mode

2014-03-12 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932300#comment-13932300
 ] 

Ravi Prakash commented on HDFS-6075:


I feel another option to do this would be to disable replication on a set of 
*nodes* temporarily (not cluster-wide), i.e. specify the list of nodes and a 
timeout after which replication should resume.

 Introducing non-replication mode
 --

 Key: HDFS-6075
 URL: https://issues.apache.org/jira/browse/HDFS-6075
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode, namenode
Reporter: Adam Kawa
Priority: Minor

 Afaik, HDFS does not provide an easy way to temporarily disable the 
 replication of missing blocks.
 If you would like to temporarily disable the replication, you would have to
 * set dfs.namenode.replication.interval (_The periodicity in seconds with 
 which the namenode computes replication work for datanodes_ Default 3) to 
 something very high. *Disadvantage*: you have to restart the NN
 * go into the safe-mode. *Disadvantage*: all write operations will fail
 We have the situation that we need to replace our top-of-rack switches for 
 each rack. Replacing a switch should take around 30 minutes. Each rack has 
 around 0.6 PB of data. We would like to avoid an expensive replication, since 
 we know that we will put this rack online quickly. To avoid any downtime, or 
 excessive network transfer, we think that temporarily disabling the 
 replication could fit us.
 The default block placement policy puts blocks into two racks, so when one 
 rack temporarily goes offline, we still have access to at least one replica of 
 each block. Of course, if we lose this replica, then we would have to wait 
 until the rack goes back online. This is what the administrator should be 
 aware of.
 This feature could disable the replication
 * globally - for a whole cluster
 * partially - e.g. only for missing blocks that come from a specified set of 
 DataNodes. So a file like we_will_be_back_soon :) could be introduced, 
 similar to include and exclude.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6094) The same block can be counted twice towards safe mode threshold

2014-03-12 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6094:


Description: 
{{BlockManager#addStoredBlock}} can cause the same block to be counted twice 
towards the safe mode threshold. We see this manifest via 
{{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More 
details to follow in a comment.

Exception details:
{code}
  Time elapsed: 12.874 sec  <<< FAILURE!
java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported 
blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of live 
datanodes 3 has reached the minimum number 0. Safe mode will be turned off 
automatically in 28 seconds.'
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at 
org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493)
at 
org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660)
{code}

  was:{{BlockManager#addStoredBlock}} can cause the same block can be counted 
towards safe mode threshold. We see this manifest via 
{{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More 
details to follow in a comment.


 The same block can be counted twice towards safe mode threshold
 ---

 Key: HDFS-6094
 URL: https://issues.apache.org/jira/browse/HDFS-6094
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.4.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal

 {{BlockManager#addStoredBlock}} can cause the same block to be counted twice 
 towards the safe mode threshold. We see this manifest via 
 {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More 
 details to follow in a comment.
 Exception details:
 {code}
   Time elapsed: 12.874 sec  <<< FAILURE!
 java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported 
 blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of 
 live datanodes 3 has reached the minimum number 0. Safe mode will be turned 
 off automatically in 28 seconds.'
 at org.junit.Assert.fail(Assert.java:93)
 at org.junit.Assert.assertTrue(Assert.java:43)
 at 
 org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493)
 at 
 org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6094) The same block can be counted twice towards safe mode threshold

2014-03-12 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932341#comment-13932341
 ] 

Arpit Agarwal commented on HDFS-6094:
-

No concrete diagnosis of this issue yet; I am still investigating.

 The same block can be counted twice towards safe mode threshold
 ---

 Key: HDFS-6094
 URL: https://issues.apache.org/jira/browse/HDFS-6094
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.4.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal

 {{BlockManager#addStoredBlock}} can cause the same block to be counted twice 
 towards the safe mode threshold. We see this manifest via 
 {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More 
 details to follow in a comment.
 Exception details:
 {code}
   Time elapsed: 12.874 sec  <<< FAILURE!
 java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported 
 blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of 
 live datanodes 3 has reached the minimum number 0. Safe mode will be turned 
 off automatically in 28 seconds.'
 at org.junit.Assert.fail(Assert.java:93)
 at org.junit.Assert.assertTrue(Assert.java:43)
 at 
 org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493)
 at 
 org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB

2014-03-12 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-6097:
--

 Summary: zero-copy reads are incorrectly disabled on file offsets 
above 2GB
 Key: HDFS-6097
 URL: https://issues.apache.org/jira/browse/HDFS-6097
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe


Zero-copy reads are incorrectly disabled on file offsets above 2GB. The check 
that is supposed to disable zero-copy reads for offsets greater than 2GB 
*within a block file* (because MappedByteBuffer segments are limited to that 
size) is mistakenly applied to the absolute file offset.
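
A sketch of the reference-point mistake (the helper name and the block-relative 
fix are assumptions for illustration; the real logic lives in the HDFS client's 
mmap path):

```java
public class ZeroCopyCheckDemo {
    // A MappedByteBuffer is indexed by int, so a single mapping can cover at
    // most Integer.MAX_VALUE bytes. The eligibility check must therefore look
    // at the offset *within the block file*, not the absolute file offset.
    static boolean zeroCopyEligible(long fileOffset, long blockStart, int readLen) {
        long offsetInBlock = fileOffset - blockStart;
        return offsetInBlock + readLen <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long threeGiB = 3L << 30;
        // Reading at file offset 3GiB, but at the *start* of its block:
        // zero-copy is fine, since the mapped region begins near offset 0.
        System.out.println(zeroCopyEligible(threeGiB, threeGiB, 4096)); // true
        // Same read 3GiB deep into a single giant block file: correctly
        // disabled, since the mapping would exceed the int-indexed limit.
        System.out.println(zeroCopyEligible(threeGiB, 0L, 4096));       // false
    }
}
```

Checking the absolute file offset instead of offsetInBlock would wrongly return 
false in the first case, which matches the reported symptom.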



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5274) Add Tracing to HDFS

2014-03-12 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932436#comment-13932436
 ] 

stack commented on HDFS-5274:
-

[~iwasakims] That is a beautiful png

 Add Tracing to HDFS
 ---

 Key: HDFS-5274
 URL: https://issues.apache.org/jira/browse/HDFS-5274
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode, namenode
Affects Versions: 2.1.1-beta
Reporter: Elliott Clark
Assignee: Elliott Clark
 Attachments: 3node_get_200mb.png, 3node_put_200mb.png, 
 3node_put_200mb.png, HDFS-5274-0.patch, HDFS-5274-1.patch, 
 HDFS-5274-10.patch, HDFS-5274-11.txt, HDFS-5274-12.patch, HDFS-5274-13.patch, 
 HDFS-5274-2.patch, HDFS-5274-3.patch, HDFS-5274-4.patch, HDFS-5274-5.patch, 
 HDFS-5274-6.patch, HDFS-5274-7.patch, HDFS-5274-8.patch, HDFS-5274-8.patch, 
 HDFS-5274-9.patch, Zipkin   Trace a06e941b0172ec73.png, Zipkin   Trace 
 d0f0d66b8a258a69.png, ss-5274v8-get.png, ss-5274v8-put.png


 Since Google's Dapper paper has shown the benefits of tracing for a large 
 distributed system, it seems like a good time to add tracing to HDFS.  HBase 
 has added tracing using HTrace.  I propose that the same can be done within 
 HDFS.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6099) HDFS file system limits not enforced on renames.

2014-03-12 Thread Chris Nauroth (JIRA)
Chris Nauroth created HDFS-6099:
---

 Summary: HDFS file system limits not enforced on renames.
 Key: HDFS-6099
 URL: https://issues.apache.org/jira/browse/HDFS-6099
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.3.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth


{{dfs.namenode.fs-limits.max-component-length}} and 
{{dfs.namenode.fs-limits.max-directory-items}} are not enforced on the 
destination path during rename operations.  This means that it's still possible 
to create files that violate these limits.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6098) ConcurrentModificationException exception during DataNode shutdown

2014-03-12 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932443#comment-13932443
 ] 

Arpit Agarwal commented on HDFS-6098:
-

Faulting function.

{code}
  void shutdown() {
    cacheExecutor.shutdown();
    Set<Entry<String, BlockPoolSlice>> set = bpSlices.entrySet();
    for (Entry<String, BlockPoolSlice> entry : set) {
      entry.getValue().shutdown();
    }
  }
{code}
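
The failure mode, and one safe pattern, can be reproduced with plain JDK maps. 
A sketch (snapshotting the entry set is one option; making bpSlices a 
ConcurrentHashMap would be another):

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

public class ShutdownIterationDemo {
    public static void main(String[] args) {
        Map<String, String> bpSlices = new HashMap<>();
        bpSlices.put("bp-1", "slice1");
        bpSlices.put("bp-2", "slice2");

        // Structurally modifying a HashMap while iterating its entry set
        // fails fast with ConcurrentModificationException.
        try {
            for (Map.Entry<String, String> e : bpSlices.entrySet()) {
                bpSlices.remove(e.getKey()); // what a racing shutdown path does
            }
        } catch (ConcurrentModificationException e) {
            System.out.println("CME as expected");
        }

        // Safe pattern: iterate over a snapshot, mutate the live map freely.
        for (Map.Entry<String, String> e : new ArrayList<>(bpSlices.entrySet())) {
            bpSlices.remove(e.getKey());
        }
        System.out.println(bpSlices.isEmpty()); // true
    }
}
```

In the DN case the mutation comes from another thread rather than the loop 
body, so the snapshot alone is not sufficient without consistent locking, but 
it illustrates why the iteration in FsVolumeImpl#shutdown can blow up.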

 ConcurrentModificationException exception during DataNode shutdown
 --

 Key: HDFS-6098
 URL: https://issues.apache.org/jira/browse/HDFS-6098
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.4.0
Reporter: Arpit Agarwal

 Exception hit during DN shutdown while running 
 {{TestWebHdfsWithMultipleNameNodes}}:
 {code}
 java.util.ConcurrentModificationException: null
   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
   at java.util.HashMap$EntryIterator.next(HashMap.java:934)
   at java.util.HashMap$EntryIterator.next(HashMap.java:932)
   at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251)
   at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:249)
   at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1376)
   at 
 org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1301)
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1523)
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1498)
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1482)
   at 
 org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes.shutdownCluster(TestWebHdfsWithMultipleNameNodes.java:99)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5244) TestNNStorageRetentionManager#testPurgeMultipleDirs fails because incorrectly expects Hashmap values to have order.

2014-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932472#comment-13932472
 ] 

Hadoop QA commented on HDFS-5244:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604323/HDFS-5244.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6385//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6385//console

This message is automatically generated.

 TestNNStorageRetentionManager#testPurgeMultipleDirs fails because incorrectly 
 expects Hashmap values to have order. 
 

 Key: HDFS-5244
 URL: https://issues.apache.org/jira/browse/HDFS-5244
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.1.0-beta
 Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM java 1.6
Reporter: Jinghui Wang
Assignee: Jinghui Wang
 Fix For: 3.0.0, 2.1.0-beta, 2.4.0

 Attachments: HDFS-5244.patch


 The test o.a.h.hdfs.server.namenode.TestNNStorageRetentionManager uses a 
 HashMap (dirRoots) to store the root storages to be mocked for the purging 
 test, and a HashMap does not have a predictable iteration order. The 
 directories that need to be purged are stored in a LinkedHashSet, which does 
 have a predictable order. So, when the directories get mocked for the test, 
 they may already be out of the order in which they were added. Thus, the order 
 in which the directories are actually purged can differ from the order in 
 which they were added to the LinkedHashSet, causing the test to fail.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (HDFS-6098) ConcurrentModificationException exception during DataNode shutdown

2014-03-12 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932485#comment-13932485
 ] 

Arpit Agarwal edited comment on HDFS-6098 at 3/12/14 10:08 PM:
---

The synchronization of {{FsDatasetImpl#bpSlices}} looks wrong. In different 
locations it is modified without a lock or synchronized via 
{{FsDatasetImpl#dataset}}. Also reads are unsynchronized.


was (Author: arpitagarwal):
The synchronization of {{FsDatasetImpl#bpSlices}} looks wront. In different 
locations it is modified without a lock or synchronized via 
{{FsDatasetImpl#dataset}}. Also reads are unsynchronized.

 ConcurrentModificationException exception during DataNode shutdown
 --

 Key: HDFS-6098
 URL: https://issues.apache.org/jira/browse/HDFS-6098
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.4.0
Reporter: Arpit Agarwal

 Exception hit during DN shutdown while running 
 {{TestWebHdfsWithMultipleNameNodes}}:
 {code}
 java.util.ConcurrentModificationException: null
   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
   at java.util.HashMap$EntryIterator.next(HashMap.java:934)
   at java.util.HashMap$EntryIterator.next(HashMap.java:932)
   at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251)
   at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:249)
   at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1376)
   at 
 org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1301)
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1523)
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1498)
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1482)
   at 
 org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes.shutdownCluster(TestWebHdfsWithMultipleNameNodes.java:99)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6098) ConcurrentModificationException exception during DataNode shutdown

2014-03-12 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932485#comment-13932485
 ] 

Arpit Agarwal commented on HDFS-6098:
-

The synchronization of {{FsDatasetImpl#bpSlices}} looks wrong. In different 
locations it is modified without a lock or synchronized via 
{{FsDatasetImpl#dataset}}. Also reads are unsynchronized.

 ConcurrentModificationException exception during DataNode shutdown
 --

 Key: HDFS-6098
 URL: https://issues.apache.org/jira/browse/HDFS-6098
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.4.0
Reporter: Arpit Agarwal

 Exception hit during DN shutdown while running 
 {{TestWebHdfsWithMultipleNameNodes}}:
 {code}
 java.util.ConcurrentModificationException: null
   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
   at java.util.HashMap$EntryIterator.next(HashMap.java:934)
   at java.util.HashMap$EntryIterator.next(HashMap.java:932)
   at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251)
   at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:249)
   at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1376)
   at 
 org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1301)
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1523)
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1498)
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1482)
   at 
 org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes.shutdownCluster(TestWebHdfsWithMultipleNameNodes.java:99)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6079) Timeout for getFileBlockStorageLocations does not work

2014-03-12 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932502#comment-13932502
 ] 

Aaron T. Myers commented on HDFS-6079:
--

Patch looks good to me. +1 pending Jenkins.

 Timeout for getFileBlockStorageLocations does not work
 --

 Key: HDFS-6079
 URL: https://issues.apache.org/jira/browse/HDFS-6079
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.3.0
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-6079-1.patch


 {{DistributedFileSystem#getFileBlockStorageLocations}} has a config value 
 which lets clients set a timeout, but it's not being enforced correctly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6098) ConcurrentModificationException exception during DataNode shutdown

2014-03-12 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932498#comment-13932498
 ] 

Arpit Agarwal commented on HDFS-6098:
-

Correction - modification of the map is not synchronized at all.

 ConcurrentModificationException exception during DataNode shutdown
 --

 Key: HDFS-6098
 URL: https://issues.apache.org/jira/browse/HDFS-6098
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.4.0
Reporter: Arpit Agarwal

 Exception hit during DN shutdown while running 
 {{TestWebHdfsWithMultipleNameNodes}}:
 {code}
 java.util.ConcurrentModificationException: null
   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
   at java.util.HashMap$EntryIterator.next(HashMap.java:934)
   at java.util.HashMap$EntryIterator.next(HashMap.java:932)
   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251)
   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:249)
   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1376)
   at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1301)
   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1523)
   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1498)
   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1482)
   at org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes.shutdownCluster(TestWebHdfsWithMultipleNameNodes.java:99)
 {code}





[jira] [Comment Edited] (HDFS-6098) ConcurrentModificationException exception during DataNode shutdown

2014-03-12 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932485#comment-13932485
 ] 

Arpit Agarwal edited comment on HDFS-6098 at 3/12/14 10:13 PM:
---

The synchronization of {{FsVolumeImpl#bpSlices}} looks wrong. In different 
locations it is modified without a lock or synchronized via 
{{FsVolumeImpl#dataset}}. Also reads are unsynchronized.


was (Author: arpitagarwal):
The synchronization of {{FsDatasetImpl#bpSlices}} looks wrong. In different 
locations it is modified without a lock or synchronized via 
{{FsDatasetImpl#dataset}}. Also reads are unsynchronized.
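Whichever map is at fault, the underlying hazard is the same: if any access path touches the map without holding the lock the other paths use, iteration can race with modification. A minimal sketch of consistent monitor-based guarding (hypothetical names, not the actual FsVolumeImpl code):

```java
import java.util.HashMap;
import java.util.Map;

public class GuardedMap {
    private final Object lock = new Object();
    private final Map<String, String> bpSlices = new HashMap<>(); // guarded by lock

    // Every access path -- reads included -- must hold the same monitor,
    // otherwise a shutdown-time iteration can race with a concurrent put/remove.
    void addBlockPool(String bpid, String slice) {
        synchronized (lock) { bpSlices.put(bpid, slice); }
    }

    String getSlice(String bpid) {
        synchronized (lock) { return bpSlices.get(bpid); }
    }

    int shutdownAll() {
        synchronized (lock) {
            int n = bpSlices.size();
            bpSlices.clear();   // safe: no other thread can be iterating right now
            return n;
        }
    }
}
```

The alternative, taken by the attached patch, is to replace the map with a concurrency-safe implementation so that no external lock is needed for this access pattern.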

 ConcurrentModificationException exception during DataNode shutdown
 --

 Key: HDFS-6098
 URL: https://issues.apache.org/jira/browse/HDFS-6098
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.4.0
Reporter: Arpit Agarwal

 Exception hit during DN shutdown while running 
 {{TestWebHdfsWithMultipleNameNodes}}:
 {code}
 java.util.ConcurrentModificationException: null
   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
   at java.util.HashMap$EntryIterator.next(HashMap.java:934)
   at java.util.HashMap$EntryIterator.next(HashMap.java:932)
   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251)
   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:249)
   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1376)
   at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1301)
   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1523)
   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1498)
   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1482)
   at org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes.shutdownCluster(TestWebHdfsWithMultipleNameNodes.java:99)
 {code}





[jira] [Commented] (HDFS-6007) Update documentation about short-circuit local reads

2014-03-12 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932518#comment-13932518
 ] 

Masatake Iwasaki commented on HDFS-6007:


Thanks again for your comments.

bq. Please skip the configuration tables. They just duplicate hdfs-default.xml

Most of the properties in the table are not in hdfs-default.xml but only in 
DFSConfigKeys. Should I add them to hdfs-default.xml?

 Update documentation about short-circuit local reads
 

 Key: HDFS-6007
 URL: https://issues.apache.org/jira/browse/HDFS-6007
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Reporter: Masatake Iwasaki
Priority: Minor
 Attachments: HDFS-6007-0.patch, HDFS-6007-1.patch, HDFS-6007-2.patch, 
 HDFS-6007-3.patch


 Updating the contents of HDFS Short-Circuit Local Reads based on the 
 changes in HDFS-4538 and HDFS-4953.





[jira] [Updated] (HDFS-6098) ConcurrentModificationException exception during DataNode shutdown

2014-03-12 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6098:


Attachment: HDFS-6098.01.patch

Trivial patch to make {{bpSlices}} a {{ConcurrentHashMap}}. I think it is 
sufficient here.
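A small demonstration of why that is sufficient: iterating a plain {{HashMap}} while it is structurally modified throws {{ConcurrentModificationException}}, whereas {{ConcurrentHashMap}}'s weakly consistent iterators tolerate concurrent modification. This is a standalone sketch, not the patch itself:

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CmeDemo {
    // Simulates shutdown() iterating the block-pool map while another
    // code path removes an entry. Returns true if iteration survives.
    static boolean iterateWhileRemoving(Map<String, String> map) {
        map.put("bp-1", "slice1");
        map.put("bp-2", "slice2");
        try {
            Iterator<Map.Entry<String, String>> it = map.entrySet().iterator();
            while (it.hasNext()) {
                it.next();
                map.remove("bp-2"); // structural modification during iteration
            }
            return true;
        } catch (ConcurrentModificationException e) {
            return false; // fail-fast iterator detected the modification
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap survives: "
                + iterateWhileRemoving(new HashMap<>()));           // false
        System.out.println("ConcurrentHashMap survives: "
                + iterateWhileRemoving(new ConcurrentHashMap<>())); // true
    }
}
```

The trade-off is that a weakly consistent iterator may or may not reflect entries removed after it was created, which is acceptable for a best-effort shutdown path like this one.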

 ConcurrentModificationException exception during DataNode shutdown
 --

 Key: HDFS-6098
 URL: https://issues.apache.org/jira/browse/HDFS-6098
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.4.0
Reporter: Arpit Agarwal
 Attachments: HDFS-6098.01.patch


 Exception hit during DN shutdown while running 
 {{TestWebHdfsWithMultipleNameNodes}}:
 {code}
 java.util.ConcurrentModificationException: null
   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
   at java.util.HashMap$EntryIterator.next(HashMap.java:934)
   at java.util.HashMap$EntryIterator.next(HashMap.java:932)
   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251)
   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:249)
   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1376)
   at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1301)
   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1523)
   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1498)
   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1482)
   at org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes.shutdownCluster(TestWebHdfsWithMultipleNameNodes.java:99)
 {code}





[jira] [Commented] (HDFS-6079) Timeout for getFileBlockStorageLocations does not work

2014-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932580#comment-13932580
 ] 

Hadoop QA commented on HDFS-6079:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12634239/hdfs-6079-1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6386//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6386//console

This message is automatically generated.

 Timeout for getFileBlockStorageLocations does not work
 --

 Key: HDFS-6079
 URL: https://issues.apache.org/jira/browse/HDFS-6079
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.3.0
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-6079-1.patch


 {{DistributedFileSystem#getFileBlockStorageLocations}} has a config value 
 which lets clients set a timeout, but it's not being enforced correctly.





[jira] [Commented] (HDFS-6098) ConcurrentModificationException exception during DataNode shutdown

2014-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932599#comment-13932599
 ] 

Hadoop QA commented on HDFS-6098:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12634287/HDFS-6098.01.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6388//console

This message is automatically generated.

 ConcurrentModificationException exception during DataNode shutdown
 --

 Key: HDFS-6098
 URL: https://issues.apache.org/jira/browse/HDFS-6098
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.4.0
Reporter: Arpit Agarwal
 Attachments: HDFS-6098.01.patch


 Exception hit during DN shutdown while running 
 {{TestWebHdfsWithMultipleNameNodes}}:
 {code}
 java.util.ConcurrentModificationException: null
   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
   at java.util.HashMap$EntryIterator.next(HashMap.java:934)
   at java.util.HashMap$EntryIterator.next(HashMap.java:932)
   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251)
   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:249)
   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1376)
   at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1301)
   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1523)
   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1498)
   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1482)
   at org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes.shutdownCluster(TestWebHdfsWithMultipleNameNodes.java:99)
 {code}





[jira] [Updated] (HDFS-6079) Timeout for getFileBlockStorageLocations does not work

2014-03-12 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-6079:
--

   Resolution: Fixed
Fix Version/s: 2.4.0
   Status: Resolved  (was: Patch Available)

Thanks ATM, committed this back through branch-2.4.

 Timeout for getFileBlockStorageLocations does not work
 --

 Key: HDFS-6079
 URL: https://issues.apache.org/jira/browse/HDFS-6079
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.3.0
Reporter: Andrew Wang
Assignee: Andrew Wang
 Fix For: 2.4.0

 Attachments: hdfs-6079-1.patch


 {{DistributedFileSystem#getFileBlockStorageLocations}} has a config value 
 which lets clients set a timeout, but it's not being enforced correctly.





[jira] [Commented] (HDFS-6079) Timeout for getFileBlockStorageLocations does not work

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932603#comment-13932603
 ] 

Hudson commented on HDFS-6079:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5313 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5313/])
HDFS-6079. Timeout for getFileBlockStorageLocations does not work. Contributed 
by Andrew Wang. (wang: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576979)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockStorageLocationUtil.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNodeFaultInjector.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java


 Timeout for getFileBlockStorageLocations does not work
 --

 Key: HDFS-6079
 URL: https://issues.apache.org/jira/browse/HDFS-6079
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.3.0
Reporter: Andrew Wang
Assignee: Andrew Wang
 Fix For: 2.4.0

 Attachments: hdfs-6079-1.patch


 {{DistributedFileSystem#getFileBlockStorageLocations}} has a config value 
 which lets clients set a timeout, but it's not being enforced correctly.





[jira] [Updated] (HDFS-6098) ConcurrentModificationException exception during DataNode shutdown

2014-03-12 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6098:


Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Duplicate of HDFS-5075. I'll merge the other Jira down to branch-2.4.

 ConcurrentModificationException exception during DataNode shutdown
 --

 Key: HDFS-6098
 URL: https://issues.apache.org/jira/browse/HDFS-6098
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.4.0
Reporter: Arpit Agarwal
 Attachments: HDFS-6098.01.patch


 Exception hit during DN shutdown while running 
 {{TestWebHdfsWithMultipleNameNodes}}:
 {code}
 java.util.ConcurrentModificationException: null
   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
   at java.util.HashMap$EntryIterator.next(HashMap.java:934)
   at java.util.HashMap$EntryIterator.next(HashMap.java:932)
   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251)
   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:249)
   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1376)
   at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1301)
   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1523)
   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1498)
   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1482)
   at org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes.shutdownCluster(TestWebHdfsWithMultipleNameNodes.java:99)
 {code}





[jira] [Commented] (HDFS-6096) TestWebHdfsTokens may timeout

2014-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932620#comment-13932620
 ] 

Hadoop QA commented on HDFS-6096:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12634240/h6096_20140312.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6387//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6387//console

This message is automatically generated.

 TestWebHdfsTokens may timeout
 -

 Key: HDFS-6096
 URL: https://issues.apache.org/jira/browse/HDFS-6096
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Attachments: h6096_20140312.patch


 The timeout of TestWebHdfsTokens is set to 1 second.  It is too short for 
 some machines.





[jira] [Updated] (HDFS-5477) Block manager as a service

2014-03-12 Thread Amir Langer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amir Langer updated HDFS-5477:
--

Attachment: Block Manager as a Service - Implementation decisions.pdf

Attached some more documentation about the design and implementation decisions 
behind the attached patches (Block Manager as a Service - Implementation 
decisions.pdf).
This document should complement the RemoteBM.pdf design doc.


 Block manager as a service
 --

 Key: HDFS-5477
 URL: https://issues.apache.org/jira/browse/HDFS-5477
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
 Attachments: Block Manager as a Service - Implementation 
 decisions.pdf, Proposal.pdf, Proposal.pdf, Remote BM.pdf, Standalone BM.pdf, 
 Standalone BM.pdf, patches.tar.gz


 The block manager needs to evolve towards having the ability to run as a 
 standalone service to improve NN vertical and horizontal scalability.  The 
 goal is reducing the memory footprint of the NN proper to support larger 
 namespaces, and improve overall performance by decoupling the block manager 
 from the namespace and its lock.  Ideally, a distinct BM will be transparent 
 to clients and DNs.





[jira] [Updated] (HDFS-5705) TestSecondaryNameNodeUpgrade#testChangeNsIDFails may fail due to ConcurrentModificationException

2014-03-12 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-5705:


Target Version/s: 2.4.0
   Fix Version/s: 2.4.0

Merged to branch-2 and branch-2.4

 TestSecondaryNameNodeUpgrade#testChangeNsIDFails may fail due to 
 ConcurrentModificationException
 

 Key: HDFS-5705
 URL: https://issues.apache.org/jira/browse/HDFS-5705
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 3.0.0
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 3.0.0, 2.4.0

 Attachments: hdfs-5705.html, hdfs-5705.txt


 From 
 https://builds.apache.org/job/Hadoop-Hdfs-trunk/1626/testReport/org.apache.hadoop.hdfs.server.namenode/TestSecondaryNameNodeUpgrade/testChangeNsIDFails/
  :
 {code}
 java.util.ConcurrentModificationException: null
   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
   at java.util.HashMap$EntryIterator.next(HashMap.java:834)
   at java.util.HashMap$EntryIterator.next(HashMap.java:832)
   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251)
   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:218)
   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1414)
   at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1309)
   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1464)
   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1439)
   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1423)
   at org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade.doIt(TestSecondaryNameNodeUpgrade.java:97)
   at org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade.testChangeNsIDFails(TestSecondaryNameNodeUpgrade.java:116)
 {code}
 The above happens when shutdown() is called in parallel to addBlockPool() or 
 shutdownBlockPool().





[jira] [Updated] (HDFS-6096) TestWebHdfsTokens may timeout

2014-03-12 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6096:


  Resolution: Fixed
   Fix Version/s: 2.4.0
  3.0.0
Target Version/s: 2.4.0
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Thanks Nicholas and Haohui! I committed this to trunk through branch-2.4.

 TestWebHdfsTokens may timeout
 -

 Key: HDFS-6096
 URL: https://issues.apache.org/jira/browse/HDFS-6096
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 3.0.0, 2.4.0

 Attachments: h6096_20140312.patch


 The timeout of TestWebHdfsTokens is set to 1 second.  It is too short for 
 some machines.





[jira] [Commented] (HDFS-5705) TestSecondaryNameNodeUpgrade#testChangeNsIDFails may fail due to ConcurrentModificationException

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932649#comment-13932649
 ] 

Hudson commented on HDFS-5705:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5314 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5314/])
HDFS-5705. Update CHANGES.txt for merging the original fix (r1555190) to 
branch-2 and branch-2.4. (arp: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576989)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 TestSecondaryNameNodeUpgrade#testChangeNsIDFails may fail due to 
 ConcurrentModificationException
 

 Key: HDFS-5705
 URL: https://issues.apache.org/jira/browse/HDFS-5705
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 3.0.0
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 3.0.0, 2.4.0

 Attachments: hdfs-5705.html, hdfs-5705.txt


 From 
 https://builds.apache.org/job/Hadoop-Hdfs-trunk/1626/testReport/org.apache.hadoop.hdfs.server.namenode/TestSecondaryNameNodeUpgrade/testChangeNsIDFails/
  :
 {code}
 java.util.ConcurrentModificationException: null
   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
   at java.util.HashMap$EntryIterator.next(HashMap.java:834)
   at java.util.HashMap$EntryIterator.next(HashMap.java:832)
   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251)
   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:218)
   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1414)
   at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1309)
   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1464)
   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1439)
   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1423)
   at org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade.doIt(TestSecondaryNameNodeUpgrade.java:97)
   at org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade.testChangeNsIDFails(TestSecondaryNameNodeUpgrade.java:116)
 {code}
 The above happens when shutdown() is called in parallel to addBlockPool() or 
 shutdownBlockPool().





[jira] [Commented] (HDFS-6096) TestWebHdfsTokens may timeout

2014-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932662#comment-13932662
 ] 

Hudson commented on HDFS-6096:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5315 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5315/])
HDFS-6096. TestWebHdfsTokens may timeout. (Contributed by szetszwo) (arp: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576999)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsTokens.java


 TestWebHdfsTokens may timeout
 -

 Key: HDFS-6096
 URL: https://issues.apache.org/jira/browse/HDFS-6096
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 3.0.0, 2.4.0

 Attachments: h6096_20140312.patch


 The timeout of TestWebHdfsTokens is set to 1 second.  It is too short for 
 some machines.





[jira] [Created] (HDFS-6100) webhdfs filesystem does not failover in HA mode

2014-03-12 Thread Arpit Gupta (JIRA)
Arpit Gupta created HDFS-6100:
-

 Summary: webhdfs filesystem does not failover in HA mode
 Key: HDFS-6100
 URL: https://issues.apache.org/jira/browse/HDFS-6100
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Haohui Mai


While running slive with a webhdfs file system, reducers fail because they keep 
trying to write to the standby namenode.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6101) TestReplaceDatanodeOnFailure fails occasionally

2014-03-12 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6101:


Attachment: TestReplaceDatanodeOnFailure.log

Full log from test run attached.

 TestReplaceDatanodeOnFailure fails occasionally
 ---

 Key: HDFS-6101
 URL: https://issues.apache.org/jira/browse/HDFS-6101
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Arpit Agarwal
 Attachments: TestReplaceDatanodeOnFailure.log


 Exception details:
 {code}
 testReplaceDatanodeOnFailure(org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure)  Time elapsed: 25.176 sec  <<< FAILURE!
 java.lang.AssertionError: expected:<3> but was:<2>
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.failNotEquals(Assert.java:647)
   at org.junit.Assert.assertEquals(Assert.java:128)
   at org.junit.Assert.assertEquals(Assert.java:472)
   at org.junit.Assert.assertEquals(Assert.java:456)
   at org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure$SlowWriter.checkReplication(TestReplaceDatanodeOnFailure.java:234)
   at org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure.testReplaceDatanodeOnFailure(TestReplaceDatanodeOnFailure.java:153)
 {code}





[jira] [Created] (HDFS-6101) TestReplaceDatanodeOnFailure fails occasionally

2014-03-12 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDFS-6101:
---

 Summary: TestReplaceDatanodeOnFailure fails occasionally
 Key: HDFS-6101
 URL: https://issues.apache.org/jira/browse/HDFS-6101
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Arpit Agarwal
 Attachments: TestReplaceDatanodeOnFailure.log

Exception details:

{code}
testReplaceDatanodeOnFailure(org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure)  Time elapsed: 25.176 sec  <<< FAILURE!
java.lang.AssertionError: expected:<3> but was:<2>
  at org.junit.Assert.fail(Assert.java:93)
  at org.junit.Assert.failNotEquals(Assert.java:647)
  at org.junit.Assert.assertEquals(Assert.java:128)
  at org.junit.Assert.assertEquals(Assert.java:472)
  at org.junit.Assert.assertEquals(Assert.java:456)
  at org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure$SlowWriter.checkReplication(TestReplaceDatanodeOnFailure.java:234)
  at org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure.testReplaceDatanodeOnFailure(TestReplaceDatanodeOnFailure.java:153)
{code}





[jira] [Updated] (HDFS-6101) TestReplaceDatanodeOnFailure fails occasionally

2014-03-12 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6101:


Description: 
Exception details in a comment below.

The failure repros on both OS X and Linux if I run the test ~10 times in a loop.

  was:
Exception details:

{code}
testReplaceDatanodeOnFailure(org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure)  Time elapsed: 25.176 sec  <<< FAILURE!
java.lang.AssertionError: expected:<3> but was:<2>
  at org.junit.Assert.fail(Assert.java:93)
  at org.junit.Assert.failNotEquals(Assert.java:647)
  at org.junit.Assert.assertEquals(Assert.java:128)
  at org.junit.Assert.assertEquals(Assert.java:472)
  at org.junit.Assert.assertEquals(Assert.java:456)
  at org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure$SlowWriter.checkReplication(TestReplaceDatanodeOnFailure.java:234)
  at org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure.testReplaceDatanodeOnFailure(TestReplaceDatanodeOnFailure.java:153)
{code}


 TestReplaceDatanodeOnFailure fails occasionally
 ---

 Key: HDFS-6101
 URL: https://issues.apache.org/jira/browse/HDFS-6101
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Arpit Agarwal
 Attachments: TestReplaceDatanodeOnFailure.log


 Exception details in a comment below.
 The failure repros on both OS X and Linux if I run the test ~10 times in a 
 loop.





[jira] [Commented] (HDFS-6101) TestReplaceDatanodeOnFailure fails occasionally

2014-03-12 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932722#comment-13932722
 ] 

Arpit Agarwal commented on HDFS-6101:
-

Exception details:
{code}
testReplaceDatanodeOnFailure(org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure)  Time elapsed: 25.176 sec  <<< FAILURE!
java.lang.AssertionError: expected:<3> but was:<2>
  at org.junit.Assert.fail(Assert.java:93)
  at org.junit.Assert.failNotEquals(Assert.java:647)
  at org.junit.Assert.assertEquals(Assert.java:128)
  at org.junit.Assert.assertEquals(Assert.java:472)
  at org.junit.Assert.assertEquals(Assert.java:456)
  at org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure$SlowWriter.checkReplication(TestReplaceDatanodeOnFailure.java:234)
  at org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure.testReplaceDatanodeOnFailure(TestReplaceDatanodeOnFailure.java:153)
{code}

 TestReplaceDatanodeOnFailure fails occasionally
 ---

 Key: HDFS-6101
 URL: https://issues.apache.org/jira/browse/HDFS-6101
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Arpit Agarwal
 Attachments: TestReplaceDatanodeOnFailure.log


 Exception details in a comment below.
 The failure repros on both OS X and Linux if I run the test ~10 times in a 
 loop.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB

2014-03-12 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6097:
---

Attachment: (was: HDFS-6097.001.patch)

 zero-copy reads are incorrectly disabled on file offsets above 2GB
 --

 Key: HDFS-6097
 URL: https://issues.apache.org/jira/browse/HDFS-6097
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6097.002.patch


 Zero-copy reads are incorrectly disabled on file offsets above 2GB due to 
 some code that is supposed to disable zero-copy reads on offsets in block 
 files greater than 2GB (because MappedByteBuffer segments are limited to that 
 size).
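
A minimal, hypothetical Java sketch of the failure mode described above (this is not the actual DFSInputStream code; the class and method names are illustrative only): the 2GB limit applies to the *length* of a single MappedByteBuffer mapping, so an eligibility check keyed on the absolute file offset wrongly disables zero-copy reads for any position past 2GB, even though the requested mapping length is tiny.

```java
// Illustrative only: shows why gating zero-copy on the absolute offset
// (rather than the mapping length) disables it for offsets above 2GB.
public class ZeroCopyCheck {
    // A single MappedByteBuffer can cover at most Integer.MAX_VALUE bytes.
    static final long MAX_MAP_LENGTH = Integer.MAX_VALUE;

    // Buggy variant: rejects any read whose absolute offset exceeds 2GB.
    static boolean canZeroCopyBuggy(long offsetInFile, int readLength) {
        return offsetInFile + readLength <= MAX_MAP_LENGTH;
    }

    // Fixed variant: only the requested mapping length is constrained;
    // the mapping can start at any offset the OS supports.
    static boolean canZeroCopyFixed(long offsetInFile, int readLength) {
        return readLength <= MAX_MAP_LENGTH;
    }

    public static void main(String[] args) {
        long offset = 3L << 30; // a position 3GB into the file
        System.out.println(canZeroCopyBuggy(offset, 4096)); // zero-copy wrongly disabled
        System.out.println(canZeroCopyFixed(offset, 4096)); // zero-copy allowed
    }
}
```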



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB

2014-03-12 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6097:
---

Attachment: HDFS-6097.002.patch

fix typo

 zero-copy reads are incorrectly disabled on file offsets above 2GB
 --

 Key: HDFS-6097
 URL: https://issues.apache.org/jira/browse/HDFS-6097
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6097.002.patch


 Zero-copy reads are incorrectly disabled on file offsets above 2GB due to 
 some code that is supposed to disable zero-copy reads on offsets in block 
 files greater than 2GB (because MappedByteBuffer segments are limited to that 
 size).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB

2014-03-12 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6097:
---

Attachment: HDFS-6097.003.patch

fix log message

 zero-copy reads are incorrectly disabled on file offsets above 2GB
 --

 Key: HDFS-6097
 URL: https://issues.apache.org/jira/browse/HDFS-6097
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6097.003.patch


 Zero-copy reads are incorrectly disabled on file offsets above 2GB due to 
 some code that is supposed to disable zero-copy reads on offsets in block 
 files greater than 2GB (because MappedByteBuffer segments are limited to that 
 size).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB

2014-03-12 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6097:
---

Attachment: (was: HDFS-6097.002.patch)

 zero-copy reads are incorrectly disabled on file offsets above 2GB
 --

 Key: HDFS-6097
 URL: https://issues.apache.org/jira/browse/HDFS-6097
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6097.003.patch


 Zero-copy reads are incorrectly disabled on file offsets above 2GB due to 
 some code that is supposed to disable zero-copy reads on offsets in block 
 files greater than 2GB (because MappedByteBuffer segments are limited to that 
 size).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-12 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932797#comment-13932797
 ] 

haosdent commented on HDFS-6092:


[~te...@apache.org] Your patch looks good, but I think my approach may be clearer. I also added test cases in my patch (haosdent-HDFS-6092-v2.patch). Apologies if this sounds rude, but let's use the clearer approach to avoid confusion. Adding some comments here may also be necessary. :-P

 DistributedFileSystem#getCanonicalServiceName() and 
 DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
 --

 Key: HDFS-6092
 URL: https://issues.apache.org/jira/browse/HDFS-6092
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Ted Yu
 Attachments: haosdent-HDFS-6092-v2.patch, haosdent-HDFS-6092.patch, 
 hdfs-6092-v1.txt, hdfs-6092-v2.txt, hdfs-6092-v3.txt


 I discovered this when working on HBASE-10717
 Here is sample code to reproduce the problem:
 {code}
 Path desPath = new Path("hdfs://127.0.0.1/");
 FileSystem desFs = desPath.getFileSystem(conf);
 
 String s = desFs.getCanonicalServiceName();
 URI uri = desFs.getUri();
 {code}
 The canonical service name string contains the default port (8020),
 but the URI doesn't contain a port.
 This would result in the following exception:
 {code}
 testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 0.001 sec  <<< ERROR!
 java.lang.IllegalArgumentException: port out of range:-1
 at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
 at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
 at org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
 {code}
 Thanks to Brandon Li, who helped debug this.
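
A standalone illustration of the port mismatch using plain `java.net.URI` (this is not the HDFS client itself; the helper name and the normalization step are assumptions): a URI written without an explicit port reports `getPort() == -1`, so any caller that feeds `uri.getPort()` straight into `InetSocketAddress` must first substitute the default NameNode port (8020), as the canonical service name does.

```java
import java.net.URI;

// Demonstrates that a port-less URI yields -1 from getPort(), and sketches
// the normalization a caller would need before building a socket address.
public class PortDemo {
    static final int DEFAULT_NN_PORT = 8020; // HDFS default NameNode RPC port

    // Hypothetical helper: normalize a missing port to the default.
    static int effectivePort(URI uri) {
        int p = uri.getPort();
        return p == -1 ? DEFAULT_NN_PORT : p;
    }

    public static void main(String[] args) {
        URI uri = URI.create("hdfs://127.0.0.1/");
        System.out.println(uri.getPort());      // -1: no explicit port in the URI
        System.out.println(effectivePort(uri)); // 8020 after normalization
    }
}
```

Passing the raw `-1` to `new InetSocketAddress(host, port)` is exactly what produces the "port out of range:-1" exception above.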



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6010) Make balancer able to balance data among specified servers

2014-03-12 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932891#comment-13932891
 ] 

Yu Li commented on HDFS-6010:
-

{quote}
You know how things work when there are deadlines to meet
{quote}
Totally understand, no problem :-)

{quote}
1. How would you maintain the mapping of files to groups?
{quote}
We don't maintain the mapping in HDFS; we use the regionserver group 
information instead. In other words, in our use case this feature is used along 
with regionserver groups: the admin gets the RS group information through an 
hbase shell command and passes the server list to the balancer. To make this 
easier, we wrote a simple script that does the whole process, so the admin only 
needs to enter a RS group name for data balancing. For more details, please see 
the answer to question #4.
\\
{quote}
wondering whether it makes sense to have the tool take paths for balancing as 
opposed to servers
{quote}
In our hbase use case, this is OK, but I think it might be better to make the 
tool more general. There might be other scenarios requiring balancing data 
among a subset instead of the full set of datanodes, although I cannot give one 
for now. :-)

{quote}
2. Are these mappings set up by some admin?
{quote}
Yes, as described in the comments above.

{quote}
3. Would you expand a group when it is nearing capacity?
{quote}
Yes, we could change the settings of a RS group, for example moving one RS from 
groupA to groupB; we would then need to use the HDFS-6012 tool to move blocks 
around to ensure group-block locality. We'll come back to this topic in the 
answer to question #5.

{quote}
4. How does someone like HBase use this? Is HBase going to have visibility into 
the mappings as well (to take care of HBASE-6721 and favored-nodes for writes)?
{quote}
Yes. Through HBASE-6721 (we have actually made quite a few improvements to it 
to make it simpler and more suitable for our production environment, but that's 
another topic and won't be discussed here :-)) we can group regionservers to 
provide multi-tenant service: each application uses one RS group (regions of 
all of that application's tables are served only by RS in its own group) and 
writes data to the mapped DNs through the favored-nodes feature. To be more 
specific, it's an app-regionserverGroup-datanodeGroup mapping: all hfiles of an 
application's tables are located only on the DNs of its RS group.

{quote}
5. Would you need a higher level balancer for keeping the whole cluster 
balanced (do migrations of blocks associated with certain paths from one group 
to another)? Otherwise, there would be skews in the block distribution. 
{quote}
You have really got the point here :-) The biggest downside of this solution 
for IO isolation is that it causes data imbalance across the whole HDFS 
cluster. In our use case, we recommend that the admin not run the balancer over 
all DNs. Instead, as mentioned in the answer to question #3, if one group has 
high disk usage while another is relatively empty, the admin can reconfigure 
the groups to move a RS/DN server around; the HDFS-6010 tool plus the HDFS-6012 
tool make this work.

{quote}
6. When there is a failure of a datanode in a group, how would you choose which 
datanodes to replicate the blocks to. The choice would be somewhat important 
given that some target datanodes might be busy serving requests
{quote}
Currently we don't control replication for failed datanodes, but use the 
default HDFS policy. So the only impact a datanode failure has on isolation is 
that blocks might be re-replicated outside the group; that's why we need the 
HDFS-6012 tool to periodically check for cross-group blocks and move them back.

[~devaraj] I hope the comments above answer your questions; feel free to let me 
know if you have any further comments. :-)
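
The idea behind the proposed -servers option can be sketched as follows. This is a hedged illustration only, not the real Balancer (which lives in org.apache.hadoop.hdfs.server.balancer.Balancer); the class, method, and node names below are all assumptions. The key point is that over- and under-utilized nodes are classified relative to the average utilization of the admin-supplied subset, and nodes outside the subset are ignored entirely.

```java
import java.util.*;

// Toy sketch: classify datanodes as over- or under-utilized, but only
// within an explicitly included subset (e.g. one regionserver group).
public class SubsetBalancerSketch {
    // utilization: datanode -> disk usage percent; threshold in percent points.
    static Map<String, List<String>> classify(Map<String, Double> utilization,
                                              Set<String> includedServers,
                                              double threshold) {
        // Average is computed over the subset only, not the whole cluster.
        double avg = includedServers.stream()
                .mapToDouble(utilization::get).average().orElse(0.0);
        List<String> over = new ArrayList<>(), under = new ArrayList<>();
        for (String dn : includedServers) {
            double u = utilization.get(dn);
            if (u > avg + threshold) over.add(dn);
            else if (u < avg - threshold) under.add(dn);
        }
        Map<String, List<String>> result = new HashMap<>();
        result.put("over", over);
        result.put("under", under);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Double> util = Map.of(
                "dn1", 90.0, "dn2", 10.0, "dn3", 50.0, "dn4", 99.0);
        // Only dn1 and dn2 belong to the group being balanced; dn4 is
        // ignored even though it is the fullest node in the cluster.
        Map<String, List<String>> r = classify(util, Set.of("dn1", "dn2"), 10.0);
        System.out.println(r.get("over"));   // [dn1]
        System.out.println(r.get("under"));  // [dn2]
    }
}
```

As the discussion above notes, balancing within a subset necessarily leaves the cluster as a whole skewed; that trade-off is accepted in exchange for IO isolation per group.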

 Make balancer able to balance data among specified servers
 --

 Key: HDFS-6010
 URL: https://issues.apache.org/jira/browse/HDFS-6010
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: balancer
Affects Versions: 2.3.0
Reporter: Yu Li
Assignee: Yu Li
Priority: Minor
 Attachments: HDFS-6010-trunk.patch


 Currently, the balancer tool balances data among all datanodes. However, in 
 some particular case, we would need to balance data only among specified 
 nodes instead of the whole set.
 In this JIRA, a new -servers option would be introduced to implement this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB

2014-03-12 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6097:
---

Status: Patch Available  (was: Open)

 zero-copy reads are incorrectly disabled on file offsets above 2GB
 --

 Key: HDFS-6097
 URL: https://issues.apache.org/jira/browse/HDFS-6097
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6097.003.patch


 Zero-copy reads are incorrectly disabled on file offsets above 2GB due to 
 some code that is supposed to disable zero-copy reads on offsets in block 
 files greater than 2GB (because MappedByteBuffer segments are limited to that 
 size).



--
This message was sent by Atlassian JIRA
(v6.2#6252)