[jira] [Commented] (HDFS-7228) Add an SSD policy into the default BlockStoragePolicySuite

2014-10-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170618#comment-14170618
 ] 

Hadoop QA commented on HDFS-7228:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12674691/HDFS-7228.003.patch
  against trunk revision 5faaba0.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestBlockStoragePolicy
  
org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
  org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8416//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8416//console

This message is automatically generated.

> Add an SSD policy into the default BlockStoragePolicySuite
> --
>
> Key: HDFS-7228
> URL: https://issues.apache.org/jira/browse/HDFS-7228
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.6.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-7228.000.patch, HDFS-7228.001.patch, 
> HDFS-7228.002.patch, HDFS-7228.003.patch
>
>
> Currently in the default BlockStoragePolicySuite, we've defined 4 storage 
> policies: LAZY_PERSIST, HOT, WARM, and COLD. Since we have already defined 
> the SSD storage type, it will be useful to also include an SSD-related storage 
> policy in the default suite.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7228) Add an SSD policy into the default BlockStoragePolicySuite

2014-10-14 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-7228:

Attachment: HDFS-7228.003.patch

Update TestBlockStoragePolicy#testDefaultPolicies to make the policy names 
consistent.

> Add an SSD policy into the default BlockStoragePolicySuite
> --
>
> Key: HDFS-7228
> URL: https://issues.apache.org/jira/browse/HDFS-7228
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.6.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-7228.000.patch, HDFS-7228.001.patch, 
> HDFS-7228.002.patch, HDFS-7228.003.patch, HDFS-7228.003.patch
>
>
> Currently in the default BlockStoragePolicySuite, we've defined 4 storage 
> policies: LAZY_PERSIST, HOT, WARM, and COLD. Since we have already defined 
> the SSD storage type, it will be useful to also include an SSD-related storage 
> policy in the default suite.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7242) Code improvement for FSN#checkUnreadableBySuperuser

2014-10-14 Thread Yi Liu (JIRA)
Yi Liu created HDFS-7242:


 Summary: Code improvement for FSN#checkUnreadableBySuperuser
 Key: HDFS-7242
 URL: https://issues.apache.org/jira/browse/HDFS-7242
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.6.0
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Minor


_checkUnreadableBySuperuser_ checks whether a user can access a specific path. 
The current logic is not efficient: it iterates over the inode's xattrs for every 
user, when we only need to perform the check for the _super user_ and can save a 
few CPU cycles.
{code}
private void checkUnreadableBySuperuser(FSPermissionChecker pc,
  INode inode, int snapshotId)
  throws IOException {
for (XAttr xattr : dir.getXAttrs(inode, snapshotId)) {
  if (XAttrHelper.getPrefixName(xattr).
  equals(SECURITY_XATTR_UNREADABLE_BY_SUPERUSER)) {
if (pc.isSuperUser()) {
  throw new AccessControlException("Access is denied for " +
  pc.getUser() + " since the superuser is not allowed to " +
  "perform this operation.");
}
  }
}
  }
{code}
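
As a minimal sketch (not necessarily what the attached patch does), the rearrangement 
suggested above is to hoist the {{pc.isSuperUser()}} check out of the xattr loop, so 
that ordinary users return immediately without iterating the xattrs at all. The field 
and helper names are the same ones used in the snippet above:
{code}
// Illustrative rearrangement only. Non-superusers now skip the xattr iteration
// entirely; the behaviour for the superuser is unchanged.
private void checkUnreadableBySuperuser(FSPermissionChecker pc,
    INode inode, int snapshotId) throws IOException {
  if (!pc.isSuperUser()) {
    return; // only the superuser can be denied by this xattr
  }
  for (XAttr xattr : dir.getXAttrs(inode, snapshotId)) {
    if (XAttrHelper.getPrefixName(xattr).
        equals(SECURITY_XATTR_UNREADABLE_BY_SUPERUSER)) {
      throw new AccessControlException("Access is denied for " +
          pc.getUser() + " since the superuser is not allowed to " +
          "perform this operation.");
    }
  }
}
{code}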



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7242) Code improvement for FSN#checkUnreadableBySuperuser

2014-10-14 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-7242:
-
Attachment: HDFS-7242.001.patch

> Code improvement for FSN#checkUnreadableBySuperuser
> ---
>
> Key: HDFS-7242
> URL: https://issues.apache.org/jira/browse/HDFS-7242
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
> Attachments: HDFS-7242.001.patch
>
>
> _checkUnreadableBySuperuser_ checks whether a user can access a specific 
> path. The current logic is not efficient: it iterates over the inode's xattrs 
> for every user, when we only need to perform the check for the _super user_ 
> and can save a few CPU cycles.
> {code}
> private void checkUnreadableBySuperuser(FSPermissionChecker pc,
>   INode inode, int snapshotId)
>   throws IOException {
> for (XAttr xattr : dir.getXAttrs(inode, snapshotId)) {
>   if (XAttrHelper.getPrefixName(xattr).
>   equals(SECURITY_XATTR_UNREADABLE_BY_SUPERUSER)) {
> if (pc.isSuperUser()) {
>   throw new AccessControlException("Access is denied for " +
>   pc.getUser() + " since the superuser is not allowed to " +
>   "perform this operation.");
> }
>   }
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7242) Code improvement for FSN#checkUnreadableBySuperuser

2014-10-14 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-7242:
-
Status: Patch Available  (was: Open)

> Code improvement for FSN#checkUnreadableBySuperuser
> ---
>
> Key: HDFS-7242
> URL: https://issues.apache.org/jira/browse/HDFS-7242
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
> Attachments: HDFS-7242.001.patch
>
>
> _checkUnreadableBySuperuser_ checks whether a user can access a specific 
> path. The current logic is not efficient: it iterates over the inode's xattrs 
> for every user, when we only need to perform the check for the _super user_ 
> and can save a few CPU cycles.
> {code}
> private void checkUnreadableBySuperuser(FSPermissionChecker pc,
>   INode inode, int snapshotId)
>   throws IOException {
> for (XAttr xattr : dir.getXAttrs(inode, snapshotId)) {
>   if (XAttrHelper.getPrefixName(xattr).
>   equals(SECURITY_XATTR_UNREADABLE_BY_SUPERUSER)) {
> if (pc.isSuperUser()) {
>   throw new AccessControlException("Access is denied for " +
>   pc.getUser() + " since the superuser is not allowed to " +
>   "perform this operation.");
> }
>   }
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7242) Code improvement for FSN#checkUnreadableBySuperuser

2014-10-14 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-7242:
-
Description: 
_checkUnreadableBySuperuser_ checks whether the super user can access a specific 
path. The current logic is not efficient: it iterates over the inode's xattrs for 
every user, when we only need to perform the check for the _super user_ and can 
save a few CPU cycles.
{code}
private void checkUnreadableBySuperuser(FSPermissionChecker pc,
  INode inode, int snapshotId)
  throws IOException {
for (XAttr xattr : dir.getXAttrs(inode, snapshotId)) {
  if (XAttrHelper.getPrefixName(xattr).
  equals(SECURITY_XATTR_UNREADABLE_BY_SUPERUSER)) {
if (pc.isSuperUser()) {
  throw new AccessControlException("Access is denied for " +
  pc.getUser() + " since the superuser is not allowed to " +
  "perform this operation.");
}
  }
}
  }
{code}

  was:
_checkUnreadableBySuperuser_ checks whether a user can access a specific path. 
The current logic is not efficient: it iterates over the inode's xattrs for every 
user, when we only need to perform the check for the _super user_ and can save a 
few CPU cycles.
{code}
private void checkUnreadableBySuperuser(FSPermissionChecker pc,
  INode inode, int snapshotId)
  throws IOException {
for (XAttr xattr : dir.getXAttrs(inode, snapshotId)) {
  if (XAttrHelper.getPrefixName(xattr).
  equals(SECURITY_XATTR_UNREADABLE_BY_SUPERUSER)) {
if (pc.isSuperUser()) {
  throw new AccessControlException("Access is denied for " +
  pc.getUser() + " since the superuser is not allowed to " +
  "perform this operation.");
}
  }
}
  }
{code}


> Code improvement for FSN#checkUnreadableBySuperuser
> ---
>
> Key: HDFS-7242
> URL: https://issues.apache.org/jira/browse/HDFS-7242
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
> Attachments: HDFS-7242.001.patch
>
>
> _checkUnreadableBySuperuser_ checks whether the super user can access a 
> specific path. The current logic is not efficient: it iterates over the 
> inode's xattrs for every user, when we only need to perform the check for 
> the _super user_ and can save a few CPU cycles.
> {code}
> private void checkUnreadableBySuperuser(FSPermissionChecker pc,
>   INode inode, int snapshotId)
>   throws IOException {
> for (XAttr xattr : dir.getXAttrs(inode, snapshotId)) {
>   if (XAttrHelper.getPrefixName(xattr).
>   equals(SECURITY_XATTR_UNREADABLE_BY_SUPERUSER)) {
> if (pc.isSuperUser()) {
>   throw new AccessControlException("Access is denied for " +
>   pc.getUser() + " since the superuser is not allowed to " +
>   "perform this operation.");
> }
>   }
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7243) HDFS concat operation should not be allowed in Encryption Zone

2014-10-14 Thread Yi Liu (JIRA)
Yi Liu created HDFS-7243:


 Summary: HDFS concat operation should not be allowed in Encryption 
Zone
 Key: HDFS-7243
 URL: https://issues.apache.org/jira/browse/HDFS-7243
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: encryption, namenode
Affects Versions: 2.6.0
Reporter: Yi Liu
Assignee: Yi Liu


For HDFS encryption at rest, files in an encryption zone use different data 
encryption keys, so concat should be disallowed.
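
As a rough illustration of the kind of guard this calls for (names and wiring are 
assumptions for this sketch, not the eventual patch), concat could be rejected up 
front whenever the target lies inside an encryption zone:
{code}
// Hypothetical pre-check for concat, for illustration only. The boolean flag
// stands in for however the namenode decides that the path is inside an
// encryption zone; the real fix may hook into the namenode differently.
private static void verifyNotInEncryptionZone(String target,
    boolean targetInEncryptionZone) throws IOException {
  if (targetInEncryptionZone) {
    throw new IOException(
        "concat can not be called for files in an encryption zone: " + target);
  }
}
{code}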



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7242) Code improvement for FSN#checkUnreadableBySuperuser

2014-10-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170766#comment-14170766
 ] 

Hadoop QA commented on HDFS-7242:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12674720/HDFS-7242.001.patch
  against trunk revision 5faaba0.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
  org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing

  The following test timeouts occurred in 
hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8418//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8418//console

This message is automatically generated.

> Code improvement for FSN#checkUnreadableBySuperuser
> ---
>
> Key: HDFS-7242
> URL: https://issues.apache.org/jira/browse/HDFS-7242
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
> Attachments: HDFS-7242.001.patch
>
>
> _checkUnreadableBySuperuser_ checks whether the super user can access a 
> specific path. The current logic is not efficient: it iterates over the 
> inode's xattrs for every user, when we only need to perform the check for 
> the _super user_ and can save a few CPU cycles.
> {code}
> private void checkUnreadableBySuperuser(FSPermissionChecker pc,
>   INode inode, int snapshotId)
>   throws IOException {
> for (XAttr xattr : dir.getXAttrs(inode, snapshotId)) {
>   if (XAttrHelper.getPrefixName(xattr).
>   equals(SECURITY_XATTR_UNREADABLE_BY_SUPERUSER)) {
> if (pc.isSuperUser()) {
>   throw new AccessControlException("Access is denied for " +
>   pc.getUser() + " since the superuser is not allowed to " +
>   "perform this operation.");
> }
>   }
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7228) Add an SSD policy into the default BlockStoragePolicySuite

2014-10-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170765#comment-14170765
 ] 

Hadoop QA commented on HDFS-7228:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12674717/HDFS-7228.003.patch
  against trunk revision 5faaba0.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
  org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing

  The following test timeouts occurred in 
hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8417//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8417//console

This message is automatically generated.

> Add an SSD policy into the default BlockStoragePolicySuite
> --
>
> Key: HDFS-7228
> URL: https://issues.apache.org/jira/browse/HDFS-7228
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.6.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-7228.000.patch, HDFS-7228.001.patch, 
> HDFS-7228.002.patch, HDFS-7228.003.patch, HDFS-7228.003.patch
>
>
> Currently in the default BlockStoragePolicySuite, we've defined 4 storage 
> policies: LAZY_PERSIST, HOT, WARM, and COLD. Since we have already defined 
> the SSD storage type, it will be useful to also include an SSD-related storage 
> policy in the default suite.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7244) Reduce Namenode memory using Flyweight pattern

2014-10-14 Thread Amir Langer (JIRA)
Amir Langer created HDFS-7244:
-

 Summary: Reduce Namenode memory using Flyweight pattern
 Key: HDFS-7244
 URL: https://issues.apache.org/jira/browse/HDFS-7244
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Amir Langer


Using the flyweight pattern can dramatically reduce memory usage in the 
Namenode. The pattern also abstracts the actual storage, so whether it is 
off-heap or not, and which serialisation mechanism is used, can be configured 
per deployment.

The idea is to move all BlockInfo data (as a first step) to this storage using 
the flyweight pattern. The cost of doing this is higher latency when accessing 
or modifying a block. The idea is that this will be offset by a reduction in 
memory, and in the off-heap case a dramatic reduction (effectively, the memory 
used for BlockInfo would shrink to a very small constant value).
This reduction will also have a huge impact on latency, as GC pauses will be 
reduced considerably and may even end up giving better latency results than 
the original code.

I wrote a stand-alone project as a proof of concept, to show the pattern, the 
data structure we can use, and what the performance costs of this approach 
would be.

See [Slab|https://github.com/langera/slab]
and [Slab performance 
results|https://github.com/langera/slab/wiki/Performance-Results].

Slab abstracts the storage, provides several storage implementations, and 
implements the flyweight pattern for the application (the Namenode in our case).
The stages to incorporate Slab into the Namenode are outlined in the sub-task 
JIRAs.
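
To make the pattern concrete, here is a toy sketch (invented for illustration; it 
does not mirror the Slab API or the real BlockInfo layout): per-block fields live 
in one flat primitive array addressed by id, and a reusable flyweight view decodes 
a record on demand instead of holding a per-block object:
{code}
// Toy flyweight-over-slab storage. Field layout, names and sizes are assumptions.
final class BlockSlab {
  private static final int FIELDS = 3;   // e.g. blockId, numBytes, genStamp
  private final long[] storage;          // one contiguous slab, no per-block objects

  BlockSlab(int capacity) { storage = new long[capacity * FIELDS]; }

  void put(int id, long blockId, long numBytes, long genStamp) {
    final int base = id * FIELDS;
    storage[base] = blockId;
    storage[base + 1] = numBytes;
    storage[base + 2] = genStamp;
  }

  /** Reusable view: callers position it on an id instead of keeping a reference. */
  final class Flyweight {
    private int base;
    Flyweight moveTo(int id) { base = id * FIELDS; return this; }
    long blockId()  { return storage[base]; }
    long numBytes() { return storage[base + 1]; }
    long genStamp() { return storage[base + 2]; }
  }
}
{code}
The trade-off described above is visible in the sketch: every access becomes an 
index computation plus an array read rather than a pointer dereference, but the 
per-block object headers and the associated GC pressure disappear, and the backing 
array could just as well be an off-heap buffer.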





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7244) Reduce Namenode memory using Flyweight pattern

2014-10-14 Thread Amir Langer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170801#comment-14170801
 ] 

Amir Langer commented on HDFS-7244:
---

Although it does not focus on off-heap specifically, this JIRA could be the 
mechanism for having off-heap data structures in the Namenode.
Ultimately, the goal of this JIRA and HDFS-6709 is the same.


> Reduce Namenode memory using Flyweight pattern
> --
>
> Key: HDFS-7244
> URL: https://issues.apache.org/jira/browse/HDFS-7244
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Amir Langer
>
> Using the flyweight pattern can dramatically reduce memory usage in the 
> Namenode. The pattern also abstracts the actual storage, so whether it is 
> off-heap or not, and which serialisation mechanism is used, can be configured 
> per deployment.
> The idea is to move all BlockInfo data (as a first step) to this storage 
> using the flyweight pattern. The cost of doing this is higher latency when 
> accessing or modifying a block. The idea is that this will be offset by a 
> reduction in memory, and in the off-heap case a dramatic reduction 
> (effectively, the memory used for BlockInfo would shrink to a very small 
> constant value).
> This reduction will also have a huge impact on latency, as GC pauses will be 
> reduced considerably and may even end up giving better latency results than 
> the original code.
> I wrote a stand-alone project as a proof of concept, to show the pattern, the 
> data structure we can use, and what the performance costs of this approach 
> would be.
> See [Slab|https://github.com/langera/slab]
> and [Slab performance 
> results|https://github.com/langera/slab/wiki/Performance-Results].
> Slab abstracts the storage, provides several storage implementations, and 
> implements the flyweight pattern for the application (the Namenode in our case).
> The stages to incorporate Slab into the Namenode are outlined in the sub-task 
> JIRAs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7245) Introduce Slab code in HDFS

2014-10-14 Thread Amir Langer (JIRA)
Amir Langer created HDFS-7245:
-

 Summary: Introduce Slab code in HDFS
 Key: HDFS-7245
 URL: https://issues.apache.org/jira/browse/HDFS-7245
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: performance
Reporter: Amir Langer


see [Slab|https://github.com/langera/slab]




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7246) Use ids for DatanodeStorageInfo in the BlockInfo triplets

2014-10-14 Thread Amir Langer (JIRA)
Amir Langer created HDFS-7246:
-

 Summary: Use ids for DatanodeStorageInfo in the BlockInfo triplets
 Key: HDFS-7246
 URL: https://issues.apache.org/jira/browse/HDFS-7246
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Amir Langer


Identical to HDFS-6660




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7247) Use ids for Block collection in Block

2014-10-14 Thread Amir Langer (JIRA)
Amir Langer created HDFS-7247:
-

 Summary: Use ids for Block collection in Block
 Key: HDFS-7247
 URL: https://issues.apache.org/jira/browse/HDFS-7247
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Amir Langer


Getting the BlockCollection will be done via lookup by Id.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7248) Use ids for blocks in InodeFile

2014-10-14 Thread Amir Langer (JIRA)
Amir Langer created HDFS-7248:
-

 Summary: Use ids for blocks in InodeFile
 Key: HDFS-7248
 URL: https://issues.apache.org/jira/browse/HDFS-7248
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Amir Langer


Getting access to a block will be via lookup in the BlocksMap by id.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7249) Define block slabs per replication factor and initialise them in advance (inc. size config)

2014-10-14 Thread Amir Langer (JIRA)
Amir Langer created HDFS-7249:
-

 Summary: Define block slabs per replication factor and initialise 
them in advance (inc. size config)
 Key: HDFS-7249
 URL: https://issues.apache.org/jira/browse/HDFS-7249
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Amir Langer


The plan is to create a slab per replication factor inside the BlocksMap.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7250) Store blocks in slabs rather than a Map inside BlocksMap

2014-10-14 Thread Amir Langer (JIRA)
Amir Langer created HDFS-7250:
-

 Summary: Store blocks in slabs rather than a Map inside BlocksMap
 Key: HDFS-7250
 URL: https://issues.apache.org/jira/browse/HDFS-7250
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Amir Langer


The key to every block is the replication factor + slab key (address).
When setting a different replication factor for a block, we need to move its 
data from one slab to another.
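
A small sketch of one possible encoding of such a key (the bit split here is an 
assumption made purely for illustration):
{code}
// Composite key: high 16 bits hold the replication factor, low 48 bits the slab
// address. Changing the replication factor means copying the record into the
// slab for the new factor and re-encoding the key with the new address.
final class SlabKey {
  static long encode(int replication, long slabAddress) {
    return ((long) replication << 48) | (slabAddress & 0xFFFFFFFFFFFFL);
  }
  static int replication(long key)   { return (int) (key >>> 48); }
  static long slabAddress(long key)  { return key & 0xFFFFFFFFFFFFL; }
}
{code}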





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6544) Broken Link for GFS in package.html

2014-10-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170814#comment-14170814
 ] 

Hudson commented on HDFS-6544:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #711 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/711/])
HDFS-6544. Broken Link for GFS in package.html. Contributed by Suraj Nayak M. 
(wheat9: rev 53100318ea20c53c4d810dedfd50b88f9f32c1dc)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/package.html
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Broken Link for GFS in package.html
> ---
>
> Key: HDFS-6544
> URL: https://issues.apache.org/jira/browse/HDFS-6544
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Suraj Nayak M
>Assignee: Suraj Nayak M
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: HDFS-6544.patch
>
>
> The link to GFS is currently pointing to 
> http://labs.google.com/papers/gfs.html, which is broken. Change it to 
> http://research.google.com/archive/gfs.html, which has the abstract of the GFS 
> paper along with a link to the PDF version of the paper.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7090) Use unbuffered writes when persisting in-memory replicas

2014-10-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170810#comment-14170810
 ] 

Hudson commented on HDFS-7090:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #711 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/711/])
HDFS-7090. Use unbuffered writes when persisting in-memory replicas. 
Contributed by Xiaoyu Yao. (cnauroth: rev 
1770bb942f9ebea38b6811ba0bc3cc249ef3ccbb)
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/nativeio/NativeIO.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/nativeio/TestNativeIO.java
* 
hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/nativeio/errno_enum.c
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/nativeio/Errno.java
* 
hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/nativeio/NativeIO.c
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/Storage.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Use unbuffered writes when persisting in-memory replicas
> 
>
> Key: HDFS-7090
> URL: https://issues.apache.org/jira/browse/HDFS-7090
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 2.6.0
>Reporter: Arpit Agarwal
>Assignee: Xiaoyu Yao
> Fix For: 3.0.0
>
> Attachments: HDFS-7090.0.patch, HDFS-7090.1.patch, HDFS-7090.2.patch, 
> HDFS-7090.3.patch, HDFS-7090.4.patch
>
>
> The LazyWriter thread just uses {{FileUtils.copyFile}} to copy block files to 
> persistent storage. It would be better to use unbuffered writes to avoid 
> churning page cache.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7244) Reduce Namenode memory using Flyweight pattern

2014-10-14 Thread Amir Langer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170806#comment-14170806
 ] 

Amir Langer commented on HDFS-7244:
---

Although orthogonal, HDFS-6658 is a first step in this direction, especially 
the move towards using ids rather than object references to define BlockInfo state.


> Reduce Namenode memory using Flyweight pattern
> --
>
> Key: HDFS-7244
> URL: https://issues.apache.org/jira/browse/HDFS-7244
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Amir Langer
>
> Using the flyweight pattern can dramatically reduce memory usage in the 
> Namenode. The pattern also abstracts the actual storage, so whether it is 
> off-heap or not, and which serialisation mechanism is used, can be configured 
> per deployment.
> The idea is to move all BlockInfo data (as a first step) to this storage 
> using the flyweight pattern. The cost of doing this is higher latency when 
> accessing or modifying a block. The idea is that this will be offset by a 
> reduction in memory, and in the off-heap case a dramatic reduction 
> (effectively, the memory used for BlockInfo would shrink to a very small 
> constant value).
> This reduction will also have a huge impact on latency, as GC pauses will be 
> reduced considerably and may even end up giving better latency results than 
> the original code.
> I wrote a stand-alone project as a proof of concept, to show the pattern, the 
> data structure we can use, and what the performance costs of this approach 
> would be.
> See [Slab|https://github.com/langera/slab]
> and [Slab performance 
> results|https://github.com/langera/slab/wiki/Performance-Results].
> Slab abstracts the storage, provides several storage implementations, and 
> implements the flyweight pattern for the application (the Namenode in our case).
> The stages to incorporate Slab into the Namenode are outlined in the sub-task 
> JIRAs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7237) namenode -rollingUpgrade throws ArrayIndexOutOfBoundsException

2014-10-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170817#comment-14170817
 ] 

Hudson commented on HDFS-7237:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #711 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/711/])
HDFS-7237. The command "hdfs namenode -rollingUpgrade" throws 
ArrayIndexOutOfBoundsException. (szetszwo: rev 
f6d0b8892ab116514fd031a61441141ac3bdfeb5)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HdfsServerConstants.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestHdfsServerConstants.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeOptionParsing.java


> namenode -rollingUpgrade throws ArrayIndexOutOfBoundsException
> --
>
> Key: HDFS-7237
> URL: https://issues.apache.org/jira/browse/HDFS-7237
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: h7237_20141013.patch, h7237_20141013b.patch
>
>
> Run "hdfs namenode -rollingUpgrade"
> {noformat}
> 14/10/13 11:30:50 INFO namenode.NameNode: createNameNode [-rollingUpgrade]
> 14/10/13 11:30:50 FATAL namenode.NameNode: Exception in namenode join
> java.lang.ArrayIndexOutOfBoundsException: 1
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.parseArguments(NameNode.java:1252)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1367)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1501)
> 14/10/13 11:30:50 INFO util.ExitUtil: Exiting with status 1
> {noformat}
> Although the command is illegal (missing rolling upgrade startup option), it 
> should print a better error message.
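
As a generic sketch of the defensive parsing the last point calls for (placeholder 
names only, not the NameNode's actual parser), the option value should be checked 
before it is read so that a missing startup option produces a clear message instead 
of an ArrayIndexOutOfBoundsException:
{code}
// Illustrative only; "-rollingUpgrade" normally takes a startup option argument.
static String parseRollingUpgradeArg(String[] args, int i) {
  if (i + 1 >= args.length) {
    throw new IllegalArgumentException(
        "-rollingUpgrade requires a startup option argument");
  }
  return args[i + 1];
}
{code}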



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7236) Fix TestOpenFilesWithSnapshot#testOpenFilesWithMultipleSnapshots

2014-10-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170815#comment-14170815
 ] 

Hudson commented on HDFS-7236:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #711 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/711/])
HDFS-7236. Fix TestOpenFilesWithSnapshot#testOpenFilesWithMultipleSnapshots. 
Contributed by Yongjun Zhang. (jing9: rev 
98ac9f26c5b3bceb073ce444e42dc89d19132a1f)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestOpenFilesWithSnapshot.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Fix TestOpenFilesWithSnapshot#testOpenFilesWithMultipleSnapshots
> 
>
> Key: HDFS-7236
> URL: https://issues.apache.org/jira/browse/HDFS-7236
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Fix For: 2.6.0
>
> Attachments: HDFS-7236.001.patch
>
>
> Per the following report
> {code}
> Recently FAILED builds in url: 
> https://builds.apache.org/job/Hadoop-Hdfs-trunk
> THERE ARE 4 builds (out of 5) that have failed tests in the past 7 days, 
> as listed below:
> ===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1898/testReport 
> (2014-10-11 04:30:40)
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testOpenFilesWithMultipleSnapshots
> ===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1897/testReport 
> (2014-10-10 04:30:40)
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.TestDeadDatanode.testDeadDatanode
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend
> Failed test: org.apache.hadoop.tracing.TestTracing.testReadTraceHooks
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testOpenFilesWithMultipleSnapshots
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress
> Failed test: org.apache.hadoop.tracing.TestTracing.testWriteTraceHooks
> ...
> Among 5 runs examined, all failed tests <#failedRuns: testName>:
> 4: 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress
> 2: 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend
> 2: 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testOpenFilesWithMultipleSnapshots
> 1: 
> org.apache.hadoop.hdfs.server.namenode.TestDeadDatanode.testDeadDatanode
> ...
> {code}
> TestOpenFilesWithSnapshot.testOpenFilesWithMultipleSnapshots failed in the most 
> recent two runs in trunk. Creating this jira for it. (The other two tests that 
> failed more often were reported in the separate jiras HDFS-7221 and HDFS-7226.)
> Symptom:
> {code}
> Error Message
> Timed out waiting for Mini HDFS Cluster to start
> Stacktrace
> java.io.IOException: Timed out waiting for Mini HDFS Cluster to start
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.waitClusterUp(MiniDFSCluster.java:1194)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:1819)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:1789)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.doTestMultipleSnapshots(TestOpenFilesWithSnapshot.java:184)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testOpenFilesWithMultipleSnapshots(TestOpenFilesWithSnapshot.java:162)
> {code}
> AND
> {code}
> 2014-10-11 12:38:24,385 ERROR datanode.DataNode (DataXceiver.java:run(243)) - 
> 127.0.0.1:55303:DataXceiver error processing WRITE_BLOCK operation  src: 
> /127.0.0.1:32949 dst: /127.0.0.1:55303
> java.io.IOException: Premature EOF from inputStream
>   at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:196)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:468)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:772)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:720)
>   at 
> org.a

[jira] [Created] (HDFS-7251) Hadoop fs -put documentation issue

2014-10-14 Thread Sai Srikanth (JIRA)
Sai Srikanth created HDFS-7251:
--

 Summary: Hadoop fs -put documentation issue
 Key: HDFS-7251
 URL: https://issues.apache.org/jira/browse/HDFS-7251
 Project: Hadoop HDFS
  Issue Type: Task
  Components: nfs
Reporter: Sai Srikanth
Priority: Minor


In the Hadoop fs -put command documentation, most versions state that the 
source should be a file.

https://hadoop.apache.org/docs/r2.5.1/hadoop-project-dist/hadoop-common/FileSystemShell.html#put

Usage: hdfs dfs -put <localsrc> ... <dst>

Copy single src, or multiple srcs from local file system to the destination 
file system. Also reads input from stdin and writes to destination file system.

hdfs dfs -put localfile /user/hadoop/hadoopfile
hdfs dfs -put localfile1 localfile2 /user/hadoop/hadoopdir
hdfs dfs -put localfile hdfs://nn.example.com/hadoop/hadoopfile
hdfs dfs -put - hdfs://nn.example.com/hadoop/hadoopfile Reads the input from 
stdin.


I have tested with a directory as the source and it worked fine. I think the 
documentation needs to be updated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7222) Expose DataNode network errors as a metric

2014-10-14 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-7222:
---
Attachment: HDFS-7222.002.patch

Added a fix for the failing TestDNFencingWithReplication. The other 3 failures 
seem to be unrelated.


> Expose DataNode network errors as a metric
> --
>
> Key: HDFS-7222
> URL: https://issues.apache.org/jira/browse/HDFS-7222
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode
>Affects Versions: 2.7.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
>Priority: Minor
> Attachments: HDFS-7222.001.patch, HDFS-7222.002.patch
>
>
> It would be useful to track datanode network errors and expose them as a 
> metric.
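
A minimal sketch of how such a counter could be exposed through the metrics2 API 
(illustrative only; the source and metric names are assumptions, and the real patch 
would presumably wire the counter into the existing DataNode metrics rather than a 
new source):
{code}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
import org.apache.hadoop.metrics2.lib.MutableCounterLong;

@Metrics(about = "Example DataNode network error metrics", context = "dfs")
public class ExampleNetworkErrorMetrics {
  @Metric("Count of network-level errors seen while serving clients")
  MutableCounterLong networkErrors;

  public static ExampleNetworkErrorMetrics create() {
    // Registering the source lets the metrics system instantiate the annotated
    // @Metric fields and publish them (e.g. over JMX).
    return DefaultMetricsSystem.instance().register("ExampleNetworkErrorMetrics",
        "Example network error metrics", new ExampleNetworkErrorMetrics());
  }

  // Call from the I/O paths whenever a network-level error is caught.
  public void incrNetworkErrors() {
    networkErrors.incr();
  }
}
{code}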



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7236) Fix TestOpenFilesWithSnapshot#testOpenFilesWithMultipleSnapshots

2014-10-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170961#comment-14170961
 ] 

Hudson commented on HDFS-7236:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1901 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1901/])
HDFS-7236. Fix TestOpenFilesWithSnapshot#testOpenFilesWithMultipleSnapshots. 
Contributed by Yongjun Zhang. (jing9: rev 
98ac9f26c5b3bceb073ce444e42dc89d19132a1f)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestOpenFilesWithSnapshot.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Fix TestOpenFilesWithSnapshot#testOpenFilesWithMultipleSnapshots
> 
>
> Key: HDFS-7236
> URL: https://issues.apache.org/jira/browse/HDFS-7236
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Fix For: 2.6.0
>
> Attachments: HDFS-7236.001.patch
>
>
> Per the following report
> {code}
> Recently FAILED builds in url: 
> https://builds.apache.org/job/Hadoop-Hdfs-trunk
> THERE ARE 4 builds (out of 5) that have failed tests in the past 7 days, 
> as listed below:
> ===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1898/testReport 
> (2014-10-11 04:30:40)
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testOpenFilesWithMultipleSnapshots
> ===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1897/testReport 
> (2014-10-10 04:30:40)
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.TestDeadDatanode.testDeadDatanode
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend
> Failed test: org.apache.hadoop.tracing.TestTracing.testReadTraceHooks
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testOpenFilesWithMultipleSnapshots
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress
> Failed test: org.apache.hadoop.tracing.TestTracing.testWriteTraceHooks
> ...
> Among 5 runs examined, all failed tests <#failedRuns: testName>:
> 4: 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress
> 2: 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend
> 2: 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testOpenFilesWithMultipleSnapshots
> 1: 
> org.apache.hadoop.hdfs.server.namenode.TestDeadDatanode.testDeadDatanode
> ...
> {code}
> TestOpenFilesWithSnapshot.testOpenFilesWithMultipleSnapshots failed in the most 
> recent two runs in trunk. Creating this jira for it. (The other two tests that 
> failed more often were reported in the separate jiras HDFS-7221 and HDFS-7226.)
> Symptom:
> {code}
> Error Message
> Timed out waiting for Mini HDFS Cluster to start
> Stacktrace
> java.io.IOException: Timed out waiting for Mini HDFS Cluster to start
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.waitClusterUp(MiniDFSCluster.java:1194)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:1819)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:1789)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.doTestMultipleSnapshots(TestOpenFilesWithSnapshot.java:184)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testOpenFilesWithMultipleSnapshots(TestOpenFilesWithSnapshot.java:162)
> {code}
> AND
> {code}
> 2014-10-11 12:38:24,385 ERROR datanode.DataNode (DataXceiver.java:run(243)) - 
> 127.0.0.1:55303:DataXceiver error processing WRITE_BLOCK operation  src: 
> /127.0.0.1:32949 dst: /127.0.0.1:55303
> java.io.IOException: Premature EOF from inputStream
>   at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:196)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:468)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:772)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:720)
>   at 
> org

[jira] [Commented] (HDFS-7090) Use unbuffered writes when persisting in-memory replicas

2014-10-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170956#comment-14170956
 ] 

Hudson commented on HDFS-7090:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1901 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1901/])
HDFS-7090. Use unbuffered writes when persisting in-memory replicas. 
Contributed by Xiaoyu Yao. (cnauroth: rev 
1770bb942f9ebea38b6811ba0bc3cc249ef3ccbb)
* 
hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/nativeio/NativeIO.c
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/nativeio/Errno.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/nativeio/errno_enum.c
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/nativeio/NativeIO.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/Storage.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/nativeio/TestNativeIO.java


> Use unbuffered writes when persisting in-memory replicas
> 
>
> Key: HDFS-7090
> URL: https://issues.apache.org/jira/browse/HDFS-7090
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 2.6.0
>Reporter: Arpit Agarwal
>Assignee: Xiaoyu Yao
> Fix For: 3.0.0
>
> Attachments: HDFS-7090.0.patch, HDFS-7090.1.patch, HDFS-7090.2.patch, 
> HDFS-7090.3.patch, HDFS-7090.4.patch
>
>
> The LazyWriter thread just uses {{FileUtils.copyFile}} to copy block files to 
> persistent storage. It would be better to use unbuffered writes to avoid 
> churning page cache.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7237) namenode -rollingUpgrade throws ArrayIndexOutOfBoundsException

2014-10-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170963#comment-14170963
 ] 

Hudson commented on HDFS-7237:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1901 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1901/])
HDFS-7237. The command "hdfs namenode -rollingUpgrade" throws 
ArrayIndexOutOfBoundsException. (szetszwo: rev 
f6d0b8892ab116514fd031a61441141ac3bdfeb5)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HdfsServerConstants.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeOptionParsing.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestHdfsServerConstants.java


> namenode -rollingUpgrade throws ArrayIndexOutOfBoundsException
> --
>
> Key: HDFS-7237
> URL: https://issues.apache.org/jira/browse/HDFS-7237
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: h7237_20141013.patch, h7237_20141013b.patch
>
>
> Run "hdfs namenode -rollingUpgrade"
> {noformat}
> 14/10/13 11:30:50 INFO namenode.NameNode: createNameNode [-rollingUpgrade]
> 14/10/13 11:30:50 FATAL namenode.NameNode: Exception in namenode join
> java.lang.ArrayIndexOutOfBoundsException: 1
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.parseArguments(NameNode.java:1252)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1367)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1501)
> 14/10/13 11:30:50 INFO util.ExitUtil: Exiting with status 1
> {noformat}
> Although the command is illegal (missing rolling upgrade startup option), it 
> should print a better error message.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6544) Broken Link for GFS in package.html

2014-10-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170960#comment-14170960
 ] 

Hudson commented on HDFS-6544:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1901 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1901/])
HDFS-6544. Broken Link for GFS in package.html. Contributed by Suraj Nayak M. 
(wheat9: rev 53100318ea20c53c4d810dedfd50b88f9f32c1dc)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/package.html
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Broken Link for GFS in package.html
> ---
>
> Key: HDFS-6544
> URL: https://issues.apache.org/jira/browse/HDFS-6544
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Suraj Nayak M
>Assignee: Suraj Nayak M
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: HDFS-6544.patch
>
>
> The link to GFS is currently pointing to 
> http://labs.google.com/papers/gfs.html, which is broken. Change it to 
> http://research.google.com/archive/gfs.html, which has the abstract of the GFS 
> paper along with a link to the PDF version of the paper.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7251) Hadoop fs -put documentation issue

2014-10-14 Thread Sai Srikanth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sai Srikanth updated HDFS-7251:
---
Assignee: Birender Saini

> Hadoop fs -put documentation issue
> --
>
> Key: HDFS-7251
> URL: https://issues.apache.org/jira/browse/HDFS-7251
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: nfs
>Reporter: Sai Srikanth
>Assignee: Birender Saini
>Priority: Minor
>
> In the Hadoop fs -put command documentation, most versions state that the 
> source should be a file.
> https://hadoop.apache.org/docs/r2.5.1/hadoop-project-dist/hadoop-common/FileSystemShell.html#put
> Usage: hdfs dfs -put <localsrc> ... <dst>
> Copy single src, or multiple srcs from local file system to the destination 
> file system. Also reads input from stdin and writes to destination file 
> system.
> hdfs dfs -put localfile /user/hadoop/hadoopfile
> hdfs dfs -put localfile1 localfile2 /user/hadoop/hadoopdir
> hdfs dfs -put localfile hdfs://nn.example.com/hadoop/hadoopfile
> hdfs dfs -put - hdfs://nn.example.com/hadoop/hadoopfile Reads the input from 
> stdin.
> I have tested with a directory as the source and it worked fine. I think the 
> documentation needs to be updated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7251) Hadoop fs -put documentation issue

2014-10-14 Thread Sai Srikanth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sai Srikanth updated HDFS-7251:
---
Assignee: (was: Birender Saini)

> Hadoop fs -put documentation issue
> --
>
> Key: HDFS-7251
> URL: https://issues.apache.org/jira/browse/HDFS-7251
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: nfs
>Reporter: Sai Srikanth
>Priority: Minor
>
> In the Hadoop fs -put command documentation, most versions state that the 
> source should be a file.
> https://hadoop.apache.org/docs/r2.5.1/hadoop-project-dist/hadoop-common/FileSystemShell.html#put
> Usage: hdfs dfs -put <localsrc> ... <dst>
> Copy single src, or multiple srcs from local file system to the destination 
> file system. Also reads input from stdin and writes to destination file 
> system.
> hdfs dfs -put localfile /user/hadoop/hadoopfile
> hdfs dfs -put localfile1 localfile2 /user/hadoop/hadoopdir
> hdfs dfs -put localfile hdfs://nn.example.com/hadoop/hadoopfile
> hdfs dfs -put - hdfs://nn.example.com/hadoop/hadoopfile Reads the input from 
> stdin.
> I have tested with a directory as the source and it worked fine. I think the 
> documentation needs to be updated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6544) Broken Link for GFS in package.html

2014-10-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171040#comment-14171040
 ] 

Hudson commented on HDFS-6544:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1926 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1926/])
HDFS-6544. Broken Link for GFS in package.html. Contributed by Suraj Nayak M. 
(wheat9: rev 53100318ea20c53c4d810dedfd50b88f9f32c1dc)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/package.html


> Broken Link for GFS in package.html
> ---
>
> Key: HDFS-6544
> URL: https://issues.apache.org/jira/browse/HDFS-6544
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Suraj Nayak M
>Assignee: Suraj Nayak M
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: HDFS-6544.patch
>
>
> The link to GFS is currently pointing to 
> http://labs.google.com/papers/gfs.html, which is broken. Change it to 
> http://research.google.com/archive/gfs.html, which has the abstract of the GFS 
> paper along with a link to the PDF version of the paper.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7090) Use unbuffered writes when persisting in-memory replicas

2014-10-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171036#comment-14171036
 ] 

Hudson commented on HDFS-7090:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1926 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1926/])
HDFS-7090. Use unbuffered writes when persisting in-memory replicas. 
Contributed by Xiaoyu Yao. (cnauroth: rev 
1770bb942f9ebea38b6811ba0bc3cc249ef3ccbb)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/nativeio/NativeIO.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/nativeio/TestNativeIO.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/Storage.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* 
hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/nativeio/errno_enum.c
* 
hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/nativeio/NativeIO.c
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/nativeio/Errno.java


> Use unbuffered writes when persisting in-memory replicas
> 
>
> Key: HDFS-7090
> URL: https://issues.apache.org/jira/browse/HDFS-7090
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 2.6.0
>Reporter: Arpit Agarwal
>Assignee: Xiaoyu Yao
> Fix For: 3.0.0
>
> Attachments: HDFS-7090.0.patch, HDFS-7090.1.patch, HDFS-7090.2.patch, 
> HDFS-7090.3.patch, HDFS-7090.4.patch
>
>
> The LazyWriter thread just uses {{FileUtils.copyFile}} to copy block files to 
> persistent storage. It would be better to use unbuffered writes to avoid 
> churning page cache.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7237) namenode -rollingUpgrade throws ArrayIndexOutOfBoundsException

2014-10-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171043#comment-14171043
 ] 

Hudson commented on HDFS-7237:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1926 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1926/])
HDFS-7237. The command "hdfs namenode -rollingUpgrade" throws 
ArrayIndexOutOfBoundsException. (szetszwo: rev 
f6d0b8892ab116514fd031a61441141ac3bdfeb5)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestHdfsServerConstants.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeOptionParsing.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HdfsServerConstants.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> namenode -rollingUpgrade throws ArrayIndexOutOfBoundsException
> --
>
> Key: HDFS-7237
> URL: https://issues.apache.org/jira/browse/HDFS-7237
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: h7237_20141013.patch, h7237_20141013b.patch
>
>
> Run "hdfs namenode -rollingUpgrade"
> {noformat}
> 14/10/13 11:30:50 INFO namenode.NameNode: createNameNode [-rollingUpgrade]
> 14/10/13 11:30:50 FATAL namenode.NameNode: Exception in namenode join
> java.lang.ArrayIndexOutOfBoundsException: 1
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.parseArguments(NameNode.java:1252)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1367)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1501)
> 14/10/13 11:30:50 INFO util.ExitUtil: Exiting with status 1
> {noformat}
> Although the command is illegal (missing rolling upgrade startup option), it 
> should print a better error message.
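
For illustration, a defensive parse of the sort the fix calls for might look 
like the sketch below (illustrative only -- the method name and the usage text 
are assumptions, not the committed NameNode code):
{code}
// Check that the value expected after -rollingUpgrade actually exists
// before indexing into args, instead of letting args[i + 1] throw
// ArrayIndexOutOfBoundsException.
static String parseRollingUpgradeArg(String[] args, int i) {
  if (i + 1 >= args.length) {
    System.err.println("Usage: -rollingUpgrade requires a startup option "
        + "(for example \"started\" or \"rollback\")");
    return null;  // caller turns null into a clean usage error and exit
  }
  return args[i + 1];
}
{code}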



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7236) Fix TestOpenFilesWithSnapshot#testOpenFilesWithMultipleSnapshots

2014-10-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171041#comment-14171041
 ] 

Hudson commented on HDFS-7236:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1926 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1926/])
HDFS-7236. Fix TestOpenFilesWithSnapshot#testOpenFilesWithMultipleSnapshots. 
Contributed by Yongjun Zhang. (jing9: rev 
98ac9f26c5b3bceb073ce444e42dc89d19132a1f)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestOpenFilesWithSnapshot.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Fix TestOpenFilesWithSnapshot#testOpenFilesWithMultipleSnapshots
> 
>
> Key: HDFS-7236
> URL: https://issues.apache.org/jira/browse/HDFS-7236
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Fix For: 2.6.0
>
> Attachments: HDFS-7236.001.patch
>
>
> Per the following report
> {code}
> Recently FAILED builds in url: 
> https://builds.apache.org/job/Hadoop-Hdfs-trunk
> THERE ARE 4 builds (out of 5) that have failed tests in the past 7 days, 
> as listed below:
> ===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1898/testReport 
> (2014-10-11 04:30:40)
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testOpenFilesWithMultipleSnapshots
> ===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1897/testReport 
> (2014-10-10 04:30:40)
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.TestDeadDatanode.testDeadDatanode
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend
> Failed test: org.apache.hadoop.tracing.TestTracing.testReadTraceHooks
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testOpenFilesWithMultipleSnapshots
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress
> Failed test: org.apache.hadoop.tracing.TestTracing.testWriteTraceHooks
> ...
> Among 5 runs examined, all failed tests <#failedRuns: testName>:
> 4: 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress
> 2: 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend
> 2: 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testOpenFilesWithMultipleSnapshots
> 1: 
> org.apache.hadoop.hdfs.server.namenode.TestDeadDatanode.testDeadDatanode
> ...
> {code}
> TestOpenFilesWithSnapshot.testOpenFilesWithMultipleSnapshots failed in the 
> most recent two runs in trunk. Creating this jira for it. (The other two tests 
> that failed more often were reported in separate jiras, HDFS-7221 and HDFS-7226.)
> Symptom:
> {code}
> Error Message
> Timed out waiting for Mini HDFS Cluster to start
> Stacktrace
> java.io.IOException: Timed out waiting for Mini HDFS Cluster to start
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.waitClusterUp(MiniDFSCluster.java:1194)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:1819)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:1789)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.doTestMultipleSnapshots(TestOpenFilesWithSnapshot.java:184)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testOpenFilesWithMultipleSnapshots(TestOpenFilesWithSnapshot.java:162)
> {code}
> AND
> {code}
> 2014-10-11 12:38:24,385 ERROR datanode.DataNode (DataXceiver.java:run(243)) - 
> 127.0.0.1:55303:DataXceiver error processing WRITE_BLOCK operation  src: 
> /127.0.0.1:32949 dst: /127.0.0.1:55303
> java.io.IOException: Premature EOF from inputStream
>   at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:196)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:468)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:772)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:720)
>  

[jira] [Commented] (HDFS-7235) Can not decommission DN which has invalid block due to bad disk

2014-10-14 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171051#comment-14171051
 ] 

Yongjun Zhang commented on HDFS-7235:
-

Hi [~cmccabe],

Thanks for the review and discussion yesterday. I was in a rush to leave when I 
posted the previous comment and patch. Here is some more info to share:

* You said that external users might be deriving from the FsDatasetSpi 
interface, so any change to this interface could break compatibility. This is 
a very good point, so it would indeed be nice if we could avoid changing 
FsDatasetSpi.
* If we use the {{FsDatasetSpi#getLength}} method to check file existence, it 
is not guaranteed that the replica state is FINALIZED, so that is not 
sufficient for the fix here. 
* Without changing FsDatasetSpi, we need to add logic similar to what I did in 
rev 001 to DataNode.java (see the sketch after this comment). To check the 
replica state in DataNode.java, I had to use the deprecated method 
getReplica(). 
* Having this logic in DataNode.java is a bit of a concern to me: DataNode is 
supposed to use only the FsDatasetSpi interface, but now we would incorporate 
logic specific to FsDatasetImpl in DataNode.java. If a user derives from 
FsDatasetSpi and writes their own implementation, its logic may not match 
FsDatasetImpl's, which could cause problems. This is the point I was trying to 
make in the last comment.

Would you please comment again?

Thanks.
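
To illustrate the rev-001-style check mentioned above, here is a small, 
self-contained sketch (simplified stand-in types, not the actual 
DataNode/FsDatasetImpl code): the interesting case is a replica that is 
recorded as FINALIZED while its block file is gone, which is exactly the 
situation that should be escalated to the NN.
{code}
import java.io.File;

// Simplified stand-ins for the real HDFS classes; illustration only.
class MissingBlockFileCheckSketch {
  enum ReplicaState { FINALIZED, RBW, TEMPORARY }

  static class Replica {
    final ReplicaState state;
    final File blockFile;
    Replica(ReplicaState state, File blockFile) {
      this.state = state;
      this.blockFile = blockFile;
    }
  }

  /**
   * True when the replica is recorded as FINALIZED but its block file is
   * missing (the bad-disk case), i.e. the DN should report the block to
   * the NN as corrupt rather than silently failing the transfer.
   */
  static boolean shouldReportAsBad(Replica replica) {
    return replica != null
        && replica.state == ReplicaState.FINALIZED
        && !replica.blockFile.exists();
  }
}
{code}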
 


> Can not decommission DN which has invalid block due to bad disk
> ---
>
> Key: HDFS-7235
> URL: https://issues.apache.org/jira/browse/HDFS-7235
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 2.6.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-7235.001.patch, HDFS-7235.002.patch
>
>
> When decommissioning a DN, the process hangs. 
> What happens is, when the NN chooses a replica as a source to replicate data 
> on the to-be-decommissioned DN to other DNs, it favors choosing this 
> to-be-decommissioned DN as the source of the transfer (see BlockManager.java). 
> However, because of the bad disk, the DN would detect the source block to be 
> transferred as an invalid block with the following logic in FsDatasetImpl.java:
> {code}
> /** Does the block exist and have the given state? */
>   private boolean isValid(final ExtendedBlock b, final ReplicaState state) {
> final ReplicaInfo replicaInfo = volumeMap.get(b.getBlockPoolId(), 
> b.getLocalBlock());
> return replicaInfo != null
> && replicaInfo.getState() == state
> && replicaInfo.getBlockFile().exists();
>   }
> {code}
> The reason this method returns false (detecting an invalid block) is that 
> the block file doesn't exist, due to the bad disk in this case. 
> The key issue we found here is that after the DN detects an invalid block for 
> the above reason, it doesn't report the invalid block back to the NN, so the 
> NN doesn't know that the block is corrupted and keeps sending the data 
> transfer request to the same to-be-decommissioned DN, again and again. This 
> causes an infinite loop, so the decommission process hangs.
> Thanks [~qwertymaniac] for reporting the issue and initial analysis.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7241) Unable to create encryption zone for viewfs:// after namenode federation is enabled

2014-10-14 Thread Charles Lamb (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171193#comment-14171193
 ] 

Charles Lamb commented on HDFS-7241:


[~zhos],

You can create the encryption zone as part of the underlying cluster 
configuration. Then you should be able to access it using viewfs.



> Unable to create encryption zone for viewfs:// after namenode federation is 
> enabled
> ---
>
> Key: HDFS-7241
> URL: https://issues.apache.org/jira/browse/HDFS-7241
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, federation
>Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
>Reporter: Xiaomin Zhang
>
> After configuring namenode federation for the cluster, I also enabled the 
> client mount table and viewfs as the default URI. The hdfs crypto commands 
> now fail with the error below:
> # hdfs crypto -createZone -keyName key1 -path /user/test
> IllegalArgumentException: FileSystem viewfs://cluster18/ is not an HDFS file 
> system
> This blocks the whole encryption at-rest feature as no zone could be defined



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7222) Expose DataNode network errors as a metric

2014-10-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171205#comment-14171205
 ] 

Hadoop QA commented on HDFS-7222:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12674769/HDFS-7222.002.patch
  against trunk revision 5faaba0.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
  org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8419//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8419//console

This message is automatically generated.

> Expose DataNode network errors as a metric
> --
>
> Key: HDFS-7222
> URL: https://issues.apache.org/jira/browse/HDFS-7222
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode
>Affects Versions: 2.7.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
>Priority: Minor
> Attachments: HDFS-7222.001.patch, HDFS-7222.002.patch
>
>
> It would be useful to track datanode network errors and expose them as a 
> metric.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7222) Expose DataNode network errors as a metric

2014-10-14 Thread Charles Lamb (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171211#comment-14171211
 ] 

Charles Lamb commented on HDFS-7222:


The two test failures are unrelated.


> Expose DataNode network errors as a metric
> --
>
> Key: HDFS-7222
> URL: https://issues.apache.org/jira/browse/HDFS-7222
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode
>Affects Versions: 2.7.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
>Priority: Minor
> Attachments: HDFS-7222.001.patch, HDFS-7222.002.patch
>
>
> It would be useful to track datanode network errors and expose them as a 
> metric.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7242) Code improvement for FSN#checkUnreadableBySuperuser

2014-10-14 Thread Charles Lamb (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171222#comment-14171222
 ] 

Charles Lamb commented on HDFS-7242:


Good catch, Yi. Non-binding +1 from me.


> Code improvement for FSN#checkUnreadableBySuperuser
> ---
>
> Key: HDFS-7242
> URL: https://issues.apache.org/jira/browse/HDFS-7242
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
> Attachments: HDFS-7242.001.patch
>
>
> _checkUnreadableBySuperuser_ checks whether the super user can access a 
> specific path. The code logic is not efficient: it does the xattr iteration 
> check for every user, but we actually only need to check the _super user_, 
> which saves a few CPU cycles.
> {code}
> private void checkUnreadableBySuperuser(FSPermissionChecker pc,
>   INode inode, int snapshotId)
>   throws IOException {
> for (XAttr xattr : dir.getXAttrs(inode, snapshotId)) {
>   if (XAttrHelper.getPrefixName(xattr).
>   equals(SECURITY_XATTR_UNREADABLE_BY_SUPERUSER)) {
> if (pc.isSuperUser()) {
>   throw new AccessControlException("Access is denied for " +
>   pc.getUser() + " since the superuser is not allowed to " +
>   "perform this operation.");
> }
>   }
> }
>   }
> {code}
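
A sketch of the kind of restructuring the issue suggests, reusing the 
identifiers from the snippet above (an illustration, not the committed patch): 
test {{pc.isSuperUser()}} first and return early, so the xattr scan only runs 
for the super user.
{code}
private void checkUnreadableBySuperuser(FSPermissionChecker pc,
    INode inode, int snapshotId) throws IOException {
  // Only the super user can ever trip this check, so bail out early for
  // everyone else and skip the xattr iteration entirely.
  if (!pc.isSuperUser()) {
    return;
  }
  for (XAttr xattr : dir.getXAttrs(inode, snapshotId)) {
    if (XAttrHelper.getPrefixName(xattr)
        .equals(SECURITY_XATTR_UNREADABLE_BY_SUPERUSER)) {
      throw new AccessControlException("Access is denied for " +
          pc.getUser() + " since the superuser is not allowed to " +
          "perform this operation.");
    }
  }
}
{code}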



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-7243) HDFS concat operation should not be allowed in Encryption Zone

2014-10-14 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb reassigned HDFS-7243:
--

Assignee: Charles Lamb  (was: Yi Liu)

> HDFS concat operation should not be allowed in Encryption Zone
> --
>
> Key: HDFS-7243
> URL: https://issues.apache.org/jira/browse/HDFS-7243
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, namenode
>Affects Versions: 2.6.0
>Reporter: Yi Liu
>Assignee: Charles Lamb
>
> For HDFS encryption at rest, files in an encryption zone are using different 
> data encryption keys, so concat should be disallowed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7228) Add an SSD policy into the default BlockStoragePolicySuite

2014-10-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171248#comment-14171248
 ] 

Hudson commented on HDFS-7228:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6259 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6259/])
HDFS-7228. Add an SSD policy into the default BlockStoragePolicySuite. 
Contributed by Jing Zhao. (jing9: rev 7dcad84143a9eef059688570cd0f9cf73747f2de)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockStoragePolicy.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestStorageMover.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsConstants.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockStoragePolicySuite.java


> Add an SSD policy into the default BlockStoragePolicySuite
> --
>
> Key: HDFS-7228
> URL: https://issues.apache.org/jira/browse/HDFS-7228
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.6.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-7228.000.patch, HDFS-7228.001.patch, 
> HDFS-7228.002.patch, HDFS-7228.003.patch, HDFS-7228.003.patch
>
>
> Currently in the default BlockStoragePolicySuite, we've defined 4 storage 
> policies: LAZY_PERSIST, HOT, WARM, and COLD. Since we have already defined 
> the SSD storage type, it will be useful to also include a SSD related storage 
> policy in the default suite.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6481) DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs

2014-10-14 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171256#comment-14171256
 ] 

Robert Metzger commented on HDFS-6481:
--

I'm running Hadoop 2.4.0 HDFS on my cluster and I'm getting this error quite 
frequently while writing to HDFS using Apache Flink.
Is there any workaround for this issue (an HDFS configuration setting, perhaps)?

> DatanodeManager#getDatanodeStorageInfos() should check the length of 
> storageIDs
> ---
>
> Key: HDFS-6481
> URL: https://issues.apache.org/jira/browse/HDFS-6481
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: hdfs-6481-v1.txt
>
>
> Ian Brooks reported the following stack trace:
> {code}
> 2014-06-03 13:05:03,915 WARN  [DataStreamer for file 
> /user/hbase/WALs/,16020,1401716790638/%2C16020%2C1401716790638.1401796562200
>  block BP-2121456822-10.143.38.149-1396953188241:blk_1074073683_332932] 
> hdfs.DFSClient: DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
>  0
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:594)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:430)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
> at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy13.getAdditionalDatanode(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolTranslatorPB.java:352)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy14.getAdditionalDatanode(Unknown Source)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:266)
> at com.sun.proxy.$Proxy15.getAdditionalDatanode(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:919)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:919)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1031)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:823)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:475)
> 2014-06-03 13:05:48,489 ERROR [RpcServer.handler=22,port=16020] wal.FSHLog: 
> syncer encountered error, will retry. txid=211
> org.apache.hadoop.ipc.RemoteException(j

[jira] [Updated] (HDFS-7228) Add an SSD policy into the default BlockStoragePolicySuite

2014-10-14 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-7228:

   Resolution: Fixed
Fix Version/s: 2.6.0
   Status: Resolved  (was: Patch Available)

I've committed this to trunk, branch-2, and branch-2.6. Thanks Nicholas and 
Suresh for review!

> Add an SSD policy into the default BlockStoragePolicySuite
> --
>
> Key: HDFS-7228
> URL: https://issues.apache.org/jira/browse/HDFS-7228
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.6.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: 2.6.0
>
> Attachments: HDFS-7228.000.patch, HDFS-7228.001.patch, 
> HDFS-7228.002.patch, HDFS-7228.003.patch, HDFS-7228.003.patch
>
>
> Currently in the default BlockStoragePolicySuite, we've defined 4 storage 
> policies: LAZY_PERSIST, HOT, WARM, and COLD. Since we have already defined 
> the SSD storage type, it will be useful to also include a SSD related storage 
> policy in the default suite.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7056) Snapshot support for truncate

2014-10-14 Thread Hari Mankude (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171346#comment-14171346
 ] 

Hari Mankude commented on HDFS-7056:


Is the proposal to copy the list of blocks to the snapshot copy only when the 
file is truncated, or whenever a snapshot is taken, irrespective of whether the 
file is truncated?

After the file is truncated and then appended again, will all subsequent 
snapshots of the file get a copy of the block list?

> Snapshot support for truncate
> -
>
> Key: HDFS-7056
> URL: https://issues.apache.org/jira/browse/HDFS-7056
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Konstantin Shvachko
>
> Implementation of truncate in HDFS-3107 does not allow truncating files which 
> are in a snapshot. It is desirable to be able to truncate and still keep the 
> old file state of the file in the snapshot.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5194) Robust support for alternate FsDatasetSpi implementations

2014-10-14 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171374#comment-14171374
 ] 

Yongjun Zhang commented on HDFS-5194:
-

Hi [~dep],

I was working on HDFS-7235 and found this jira. Thanks for the great work you 
have done here.

Some questions:

1. {{FsDatasetSpi}} currently has the {{@InterfaceAudience.Private}} 
annotation. I assume this means we can still modify the FsDatasetSpi interface 
without breaking compatibility from external users' point of view, am I 
correct? 

2. For HDFS-7235, {{DataNode#transferBlock}} calls isValidBlock(). When it 
returns false, the DN reports the bad block to the NN in some cases, and only 
sends an error report back to the NN in other cases. It fails to report the 
bad block to the NN for the case described in HDFS-7235. I'm thinking about 
adding a new method like {{needToReportInvalidBlock()}} to FsDatasetSpi, so 
that when {{isValidBlock()}} returns false, {{needToReportInvalidBlock()}} can 
be called to find out whether the DN needs to report it as a bad block (see 
the sketch below). What do you think?

Thanks.
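
To make question 2 concrete, here is a rough sketch of what such an addition 
could look like -- purely a proposal illustration with hypothetical names, not 
an existing FsDatasetSpi method:
{code}
import org.apache.hadoop.hdfs.protocol.ExtendedBlock;

// Hypothetical sketch of the proposed addition; not part of FsDatasetSpi today.
public interface InvalidBlockReporting {
  /**
   * Intended to be called after isValidBlock(b) has returned false.
   * Returns true when the replica is known to the dataset but unusable
   * (e.g. recorded as FINALIZED while its block file is missing), so the
   * DN should report the block to the NN as corrupt instead of only
   * sending an error report.
   */
  boolean needToReportInvalidBlock(ExtendedBlock b);
}
{code}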


> Robust support for alternate FsDatasetSpi implementations
> -
>
> Key: HDFS-5194
> URL: https://issues.apache.org/jira/browse/HDFS-5194
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, hdfs-client
>Reporter: David Powell
>Assignee: David Powell
>Priority: Minor
> Attachments: HDFS-5194.design.01222014.pdf, 
> HDFS-5194.design.09112013.pdf, HDFS-5194.patch.09112013
>
>
> The existing FsDatasetSpi interface is well-positioned to permit extending 
> Hadoop to run natively on non-traditional storage architectures.  Before this 
> can be done, however, a number of gaps need to be addressed.  This JIRA 
> documents those gaps, suggests some solutions, and puts forth a sample 
> implementation of some of the key changes needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7235) Can not decommission DN which has invalid block due to bad disk

2014-10-14 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171379#comment-14171379
 ] 

Yongjun Zhang commented on HDFS-7235:
-

FYI [~cmccabe], I found a related jira HDFS-5194 and posted a comment there.


> Can not decommission DN which has invalid block due to bad disk
> ---
>
> Key: HDFS-7235
> URL: https://issues.apache.org/jira/browse/HDFS-7235
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 2.6.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-7235.001.patch, HDFS-7235.002.patch
>
>
> When decommissioning a DN, the process hangs. 
> What happens is, when the NN chooses a replica as a source to replicate data 
> on the to-be-decommissioned DN to other DNs, it favors choosing this 
> to-be-decommissioned DN as the source of the transfer (see BlockManager.java). 
> However, because of the bad disk, the DN would detect the source block to be 
> transferred as an invalid block with the following logic in FsDatasetImpl.java:
> {code}
> /** Does the block exist and have the given state? */
>   private boolean isValid(final ExtendedBlock b, final ReplicaState state) {
> final ReplicaInfo replicaInfo = volumeMap.get(b.getBlockPoolId(), 
> b.getLocalBlock());
> return replicaInfo != null
> && replicaInfo.getState() == state
> && replicaInfo.getBlockFile().exists();
>   }
> {code}
> The reason this method returns false (detecting an invalid block) is that 
> the block file doesn't exist, due to the bad disk in this case. 
> The key issue we found here is that after the DN detects an invalid block for 
> the above reason, it doesn't report the invalid block back to the NN, so the 
> NN doesn't know that the block is corrupted and keeps sending the data 
> transfer request to the same to-be-decommissioned DN, again and again. This 
> causes an infinite loop, so the decommission process hangs.
> Thanks [~qwertymaniac] for reporting the issue and initial analysis.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7204) balancer doesn't run as a daemon

2014-10-14 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171390#comment-14171390
 ] 

Benoy Antony commented on HDFS-7204:


Thanks for the detailed reply, [~aw]. I'll update HDFS-7184 accordingly.

> balancer doesn't run as a daemon
> 
>
> Key: HDFS-7204
> URL: https://issues.apache.org/jira/browse/HDFS-7204
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
>Priority: Blocker
>  Labels: newbie
> Attachments: HDFS-7204-01.patch, HDFS-7204.patch
>
>
> From HDFS-7184, minor issues with balancer:
> * daemon isn't set to true in hdfs to enable daemonization
> * start-balancer script has usage instead of hadoop_usage



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7201) Fix typos in hdfs-default.xml

2014-10-14 Thread Dawson Choong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawson Choong updated HDFS-7201:

Attachment: HDFS-7201.patch

Attached a patch for fixing typos.

> Fix typos in hdfs-default.xml
> -
>
> Key: HDFS-7201
> URL: https://issues.apache.org/jira/browse/HDFS-7201
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.5.1
>Reporter: Konstantin Shvachko
>Assignee: Dawson Choong
>  Labels: newbie
> Attachments: HDFS-7201.patch
>
>
> Found the following typos in hdfs-default.xml:
> repliaction
> directoires
> teh
> tranfer
> spage



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5089) When a LayoutVersion support SNAPSHOT, it must support FSIMAGE_NAME_OPTIMIZATION.

2014-10-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171434#comment-14171434
 ] 

Tsz Wo Nicholas Sze commented on HDFS-5089:
---

If there are no further comments or questions, I am going to commit this 
tomorrow.

> When a LayoutVersion support SNAPSHOT, it must support 
> FSIMAGE_NAME_OPTIMIZATION.
> -
>
> Key: HDFS-5089
> URL: https://issues.apache.org/jira/browse/HDFS-5089
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Attachments: h5089_20130813.patch, h5089_20140325.patch
>
>
> The SNAPSHOT layout requires FSIMAGE_NAME_OPTIMIZATION as a prerequisite.  
> However, RESERVED_REL1_3_0 supports SNAPSHOT but not 
> FSIMAGE_NAME_OPTIMIZATION.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7204) balancer doesn't run as a daemon

2014-10-14 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171452#comment-14171452
 ] 

Yongjun Zhang commented on HDFS-7204:
-

Thanks [~benoyantony] for asking and [~aw] for the detailed explanation of the 
daemon variable.

It happens that the same string "daemon" is used both as a variable in the 
script and as a command-line switch, and they mean different things, which is a 
bit confusing. Maybe we could change the variable name "daemon" to something 
like "run_via_dh" (run via daemon handler) and add a comment like Allen 
summarized? Thanks.



> balancer doesn't run as a daemon
> 
>
> Key: HDFS-7204
> URL: https://issues.apache.org/jira/browse/HDFS-7204
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
>Priority: Blocker
>  Labels: newbie
> Attachments: HDFS-7204-01.patch, HDFS-7204.patch
>
>
> From HDFS-7184, minor issues with balancer:
> * daemon isn't set to true in hdfs to enable daemonization
> * start-balancer script has usage instead of hadoop_usage



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5089) When a LayoutVersion support SNAPSHOT, it must support FSIMAGE_NAME_OPTIMIZATION.

2014-10-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171498#comment-14171498
 ] 

Hadoop QA commented on HDFS-5089:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12636849/h5089_20140325.patch
  against trunk revision cdce883.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The test build failed in 
hadoop-hdfs-project/hadoop-hdfs 

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8421//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8421//console

This message is automatically generated.

> When a LayoutVersion support SNAPSHOT, it must support 
> FSIMAGE_NAME_OPTIMIZATION.
> -
>
> Key: HDFS-5089
> URL: https://issues.apache.org/jira/browse/HDFS-5089
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Attachments: h5089_20130813.patch, h5089_20140325.patch
>
>
> The SNAPSHOT layout requires FSIMAGE_NAME_OPTIMIZATION as a prerequisite.  
> However, RESERVED_REL1_3_0 supports SNAPSHOT but not 
> FSIMAGE_NAME_OPTIMIZATION.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6481) DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs

2014-10-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171499#comment-14171499
 ] 

Hadoop QA commented on HDFS-6481:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12648186/hdfs-6481-v1.txt
  against trunk revision 7dcad84.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
  org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8420//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8420//console

This message is automatically generated.

> DatanodeManager#getDatanodeStorageInfos() should check the length of 
> storageIDs
> ---
>
> Key: HDFS-6481
> URL: https://issues.apache.org/jira/browse/HDFS-6481
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: hdfs-6481-v1.txt
>
>
> Ian Brooks reported the following stack trace:
> {code}
> 2014-06-03 13:05:03,915 WARN  [DataStreamer for file 
> /user/hbase/WALs/,16020,1401716790638/%2C16020%2C1401716790638.1401796562200
>  block BP-2121456822-10.143.38.149-1396953188241:blk_1074073683_332932] 
> hdfs.DFSClient: DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
>  0
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:594)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:430)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
> at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy13.getAdditionalDatanode(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolTranslatorPB.java:352)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvoc

[jira] [Commented] (HDFS-7056) Snapshot support for truncate

2014-10-14 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171525#comment-14171525
 ] 

Konstantin Shvachko commented on HDFS-7056:
---

The proposal is to copy the block list only when the file is truncated. This 
allows us to avoid changes for append.
If the file is truncated and then appended, the subsequent snapshots will not 
need to have a copy of the block list. They will reference the file's original 
list until the file is truncated again.
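
As a toy illustration of the copy-on-truncate idea (conceptual only, with 
made-up names, and simplified to a single snapshot -- not the HDFS-7056 design 
itself):
{code}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class FileBlocksSketch {
  private List<Long> blocks = new ArrayList<>();  // current block ids
  private List<Long> snapshotCopy = null;         // filled only on truncate

  void takeSnapshot() {
    // Nothing is copied here; the snapshot simply references the current list.
  }

  void truncate(int newBlockCount) {
    if (snapshotCopy == null) {
      // First truncate after a snapshot: preserve the old list for the snapshot.
      snapshotCopy = new ArrayList<>(blocks);
    }
    blocks = new ArrayList<>(blocks.subList(0, newBlockCount));
  }

  void append(long newBlockId) {
    blocks.add(newBlockId);  // append alone never triggers a copy
  }

  List<Long> blocksVisibleToSnapshot() {
    // Until a truncate happens, the snapshot sees the live list.
    return Collections.unmodifiableList(
        snapshotCopy != null ? snapshotCopy : blocks);
  }
}
{code}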

> Snapshot support for truncate
> -
>
> Key: HDFS-7056
> URL: https://issues.apache.org/jira/browse/HDFS-7056
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Konstantin Shvachko
>
> Implementation of truncate in HDFS-3107 does not allow truncating files which 
> are in a snapshot. It is desirable to be able to truncate and still keep the 
> old file state of the file in the snapshot.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7201) Fix typos in hdfs-default.xml

2014-10-14 Thread Dawson Choong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawson Choong updated HDFS-7201:

Status: Patch Available  (was: Open)

Fixed typos in hdfs-default.xml.

> Fix typos in hdfs-default.xml
> -
>
> Key: HDFS-7201
> URL: https://issues.apache.org/jira/browse/HDFS-7201
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.5.1
>Reporter: Konstantin Shvachko
>Assignee: Dawson Choong
>  Labels: newbie
> Attachments: HDFS-7201.patch
>
>
> Found the following typos in hdfs-default.xml:
> repliaction
> directoires
> teh
> tranfer
> spage



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7201) Fix typos in hdfs-default.xml

2014-10-14 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171534#comment-14171534
 ] 

Konstantin Shvachko commented on HDFS-7201:
---

+1 Looks good. Pending Jenkins approval.

> Fix typos in hdfs-default.xml
> -
>
> Key: HDFS-7201
> URL: https://issues.apache.org/jira/browse/HDFS-7201
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.5.1
>Reporter: Konstantin Shvachko
>Assignee: Dawson Choong
>  Labels: newbie
> Attachments: HDFS-7201.patch
>
>
> Found the following typos in hdfs-default.xml:
> repliaction
> directoires
> teh
> tranfer
> spage



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7185) The active NameNode will not accept an fsimage sent from the standby during rolling upgrade

2014-10-14 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-7185:

Attachment: HDFS-7185.002.patch

Thanks Nicholas for the review! Update the patch to address the comments.

Because {{readProperties}} is called in multiple places, to bypass the 
layoutversion mismatch while rolling rollback, the current patch passes in the 
startup option and delay the layoutversion check while loading the fsimage.

I will do some more tests in my cluster later.

> The active NameNode will not accept an fsimage sent from the standby during 
> rolling upgrade
> ---
>
> Key: HDFS-7185
> URL: https://issues.apache.org/jira/browse/HDFS-7185
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Colin Patrick McCabe
>Assignee: Jing Zhao
> Attachments: HDFS-7185.000.patch, HDFS-7185.001.patch, 
> HDFS-7185.002.patch
>
>
> The active NameNode will not accept an fsimage sent from the standby during 
> rolling upgrade.  The active fails with the exception:
> {code}
> 18:25:07,620  WARN ImageServlet:198 - Received an invalid request file 
> transfer request from a secondary with storage info 
> -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
> 18:25:07,620  WARN log:76 - Committed before 410 PutImage failed. 
> java.io.IOException: This namenode has storage info 
> -55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6 but the secondary 
> expected -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-
> 0a6e431987f6
> at 
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.validateRequest(ImageServlet.java:200)
> at 
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.doPut(ImageServlet.java:443)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:730)
> {code}
> On the standby, the exception is:
> {code}
> java.io.IOException: Exception during image upload: 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException:
>  This namenode has storage info 
> -55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6 but the secondary 
> expected
>  -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.doCheckpoint(StandbyCheckpointer.java:218)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.access$1400(StandbyCheckpointer.java:62)
> {code}
> This seems to be a consequence of the fact that the VERSION file still is at 
> -55 (the old version) even after the rolling upgrade has started.  When the 
> rolling upgrade is finalized with {{hdfs dfsadmin -rollingUpgrade finalize}}, 
> both VERSION files get set to the new version, and the problem goes away.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-7185) The active NameNode will not accept an fsimage sent from the standby during rolling upgrade

2014-10-14 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171573#comment-14171573
 ] 

Jing Zhao edited comment on HDFS-7185 at 10/14/14 9:37 PM:
---

Thanks Nicholas for the review! Update the patch to address the comments.

Because {{readProperties}} is called in multiple places, to bypass the 
layoutversion mismatch while rolling rollback, the current patch passes in the 
startup option to {{readProperties}} and delay the layoutversion check until 
loading the fsimage.

I will do some more tests in my cluster later.


was (Author: jingzhao):
Thanks Nicholas for the review! Update the patch to address the comments.

Because {{readProperties}} is called in multiple places, to bypass the 
layoutversion mismatch while rolling rollback, the current patch passes in the 
startup option and delay the layoutversion check while loading the fsimage.

I will do some more tests in my cluster later.

> The active NameNode will not accept an fsimage sent from the standby during 
> rolling upgrade
> ---
>
> Key: HDFS-7185
> URL: https://issues.apache.org/jira/browse/HDFS-7185
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Colin Patrick McCabe
>Assignee: Jing Zhao
> Attachments: HDFS-7185.000.patch, HDFS-7185.001.patch, 
> HDFS-7185.002.patch
>
>
> The active NameNode will not accept an fsimage sent from the standby during 
> rolling upgrade.  The active fails with the exception:
> {code}
> 18:25:07,620  WARN ImageServlet:198 - Received an invalid request file 
> transfer request from a secondary with storage info 
> -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
> 18:25:07,620  WARN log:76 - Committed before 410 PutImage failed. 
> java.io.IOException: This namenode has storage info 
> -55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6 but the secondary 
> expected -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-
> 0a6e431987f6
> at 
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.validateRequest(ImageServlet.java:200)
> at 
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.doPut(ImageServlet.java:443)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:730)
> {code}
> On the standby, the exception is:
> {code}
> java.io.IOException: Exception during image upload: 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException:
>  This namenode has storage info 
> -55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6 but the secondary 
> expected
>  -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.doCheckpoint(StandbyCheckpointer.java:218)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.access$1400(StandbyCheckpointer.java:62)
> {code}
> This seems to be a consequence of the fact that the VERSION file still is at 
> -55 (the old version) even after the rolling upgrade has started.  When the 
> rolling upgrade is finalized with {{hdfs dfsadmin -rollingUpgrade finalize}}, 
> both VERSION files get set to the new version, and the problem goes away.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7185) The active NameNode will not accept an fsimage sent from the standby during rolling upgrade

2014-10-14 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-7185:
--
 Component/s: namenode
Hadoop Flags: Reviewed

+1 patch looks good.  Thanks!

> The active NameNode will not accept an fsimage sent from the standby during 
> rolling upgrade
> ---
>
> Key: HDFS-7185
> URL: https://issues.apache.org/jira/browse/HDFS-7185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Colin Patrick McCabe
>Assignee: Jing Zhao
> Attachments: HDFS-7185.000.patch, HDFS-7185.001.patch, 
> HDFS-7185.002.patch
>
>
> The active NameNode will not accept an fsimage sent from the standby during 
> rolling upgrade.  The active fails with the exception:
> {code}
> 18:25:07,620  WARN ImageServlet:198 - Received an invalid request file 
> transfer request from a secondary with storage info 
> -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
> 18:25:07,620  WARN log:76 - Committed before 410 PutImage failed. 
> java.io.IOException: This namenode has storage info 
> -55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6 but the secondary 
> expected -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-
> 0a6e431987f6
> at 
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.validateRequest(ImageServlet.java:200)
> at 
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.doPut(ImageServlet.java:443)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:730)
> {code}
> On the standby, the exception is:
> {code}
> java.io.IOException: Exception during image upload: 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException:
>  This namenode has storage info 
> -55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6 but the secondary 
> expected
>  -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.doCheckpoint(StandbyCheckpointer.java:218)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.access$1400(StandbyCheckpointer.java:62)
> {code}
> This seems to be a consequence of the fact that the VERSION file still is at 
> -55 (the old version) even after the rolling upgrade has started.  When the 
> rolling upgrade is finalized with {{hdfs dfsadmin -rollingUpgrade finalize}}, 
> both VERSION files get set to the new version, and the problem goes away.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-7241) Unable to create encryption zone for viewfs:// after namenode federation is enabled

2014-10-14 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb resolved HDFS-7241.

Resolution: Not a Problem
  Assignee: Charles Lamb

Since creating an encryption zone is an administrative function, it makes the 
most sense to just create it on the underlying HDFS namenode rather than 
through viewfs.


> Unable to create encryption zone for viewfs:// after namenode federation is 
> enabled
> ---
>
> Key: HDFS-7241
> URL: https://issues.apache.org/jira/browse/HDFS-7241
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, federation
>Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
>Reporter: Xiaomin Zhang
>Assignee: Charles Lamb
>
> After configuring namenode federation for the cluster, I also enabled client 
> mount table and viewfs as default URI. The hdfs crypto commands now failed 
> with below error:
> # hdfs crypto -createZone -keyName key1 -path /user/test
> IllegalArgumentException: FileSystem viewfs://cluster18/ is not an HDFS file 
> system
> This blocks the whole encryption at-rest feature as no zone could be defined



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7227) Fix findbugs warning about NP_DEREFERENCE_OF_READLINE_VALUE in SpanReceiverHost

2014-10-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171611#comment-14171611
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7227:
---

Our code style uses separate lines and { } for if-statements, i.e.
{code}
if (...) {
  ...
}
{code}

> Fix findbugs warning about NP_DEREFERENCE_OF_READLINE_VALUE in 
> SpanReceiverHost
> ---
>
> Key: HDFS-7227
> URL: https://issues.apache.org/jira/browse/HDFS-7227
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-7227.001.patch
>
>
> Fix findbugs warning about NP_DEREFERENCE_OF_READLINE_VALUE in 
> SpanReceiverHost



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7190) Bad use of Preconditions in startFileInternal()

2014-10-14 Thread Dawson Choong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawson Choong updated HDFS-7190:

Attachment: HDFS-7190.patch

Attached a patch that removes the unnecessary precondition.

> Bad use of Preconditions in startFileInternal()
> ---
>
> Key: HDFS-7190
> URL: https://issues.apache.org/jira/browse/HDFS-7190
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Konstantin Shvachko
>Assignee: Dawson Choong
>  Labels: newbie
> Attachments: HDFS-7190.patch
>
>
> The following precondition is in the middle of startFileInternal()
> {code}
> feInfo = new FileEncryptionInfo(suite, version,
> 
> Preconditions.checkNotNull(feInfo);
> {code}
> Preconditions are recommended to be used at the beginning of a method.
> In this case the check is a no-op anyway, because the variable has just been 
> constructed.
> It should simply be removed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7190) Bad use of Preconditions in startFileInternal()

2014-10-14 Thread Dawson Choong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawson Choong updated HDFS-7190:

Status: Patch Available  (was: Open)

> Bad use of Preconditions in startFileInternal()
> ---
>
> Key: HDFS-7190
> URL: https://issues.apache.org/jira/browse/HDFS-7190
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Konstantin Shvachko
>Assignee: Dawson Choong
>  Labels: newbie
> Attachments: HDFS-7190.patch
>
>
> The following precondition is in the middle of startFileInternal()
> {code}
> feInfo = new FileEncryptionInfo(suite, version,
> 
> Preconditions.checkNotNull(feInfo);
> {code}
> Preconditions are recommended to be used at the beginning of a method.
> In this case the check is a no-op anyway, because the variable has just been 
> constructed.
> It should simply be removed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7055) Add tracing to DFSInputStream

2014-10-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171629#comment-14171629
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7055:
---

> Nicholas, I apologize if these findbugs issues inconvenienced you. ...

The findbugs issues are minor but the way that the patches got committed ...

> ... I would appreciate a review on HDFS-7227.

I guess I may not be the best person to review it.  Anyway, I have just posted 
some comments.  Please find your favorite reviewers to look at it further.

> Thanks also to Yongjun for fixing HDFS-7194 (introduced by me) and HDFS-7169 
> (introduced by Nicholas).

Yongjun did not work or even comment on HDFS-7169.  Typo?

> Add tracing to DFSInputStream
> -
>
> Key: HDFS-7055
> URL: https://issues.apache.org/jira/browse/HDFS-7055
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 2.6.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 2.7.0
>
> Attachments: HDFS-7055.002.patch, HDFS-7055.003.patch, 
> HDFS-7055.004.patch, HDFS-7055.005.patch, screenshot-get-1mb.005.png, 
> screenshot-get-1mb.png
>
>
> Add tracing to DFSInputStream.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7239) Create a servlet for HDFS UI

2014-10-14 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171631#comment-14171631
 ] 

Haohui Mai commented on HDFS-7239:
--

The whole point of the proposed servlet is to push information that is not 
suitable for JMX to the UI. Because of the compatibility concerns, JMX should 
only contain information that has well-defined formats, and duplication should 
be avoided whenever possible.

Since the servlet does not guarantee compatibility, it has more freedom to push 
information that has a rich format (e.g., nntop), or information that can be 
parsed from the configuration (e.g., the nameservice id), to the UI. Information 
that has well-defined formats / semantics can continue to be pushed to the UI 
through JMX.
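
As a rough illustration of the idea only (the servlet name, URL, and JSON fields 
below are made up, not part of the actual proposal), such a UI-only endpoint 
could be as simple as:
{code}
// Hypothetical sketch; class name and payload are illustrative assumptions.
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class NameNodeUiInfoServlet extends HttpServlet {
  @Override
  protected void doGet(HttpServletRequest req, HttpServletResponse resp)
      throws IOException {
    // Serve loosely-structured, UI-only data. Unlike JMX, no compatibility
    // guarantee is implied for this output; the UI and servlet evolve together.
    resp.setContentType("application/json");
    resp.getWriter().write(
        "{\"nameserviceId\":\"ns1\",\"startupProgress\":\"complete\"}");
  }
}
{code}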

> Create a servlet for HDFS UI
> 
>
> Key: HDFS-7239
> URL: https://issues.apache.org/jira/browse/HDFS-7239
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>
> Currently the HDFS UI gathers most of its information from JMX. There are a 
> couple disadvantages:
> * JMX is also used by management tools, thus Hadoop needs to maintain 
> compatibility across minor releases.
> * JMX organizes information as key-value pairs. The organization does not 
> fit well with emerging use cases like startup progress report and nntop.
> This jira proposes to introduce a new servlet in the NN for the purpose of 
> serving information to the UI.
> It should be viewed as a part of the UI. There is *no* compatibility 
> guarantees for the output of the servlet.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7190) Bad use of Preconditions in startFileInternal()

2014-10-14 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171633#comment-14171633
 ] 

Haohui Mai commented on HDFS-7190:
--

+1 pending jenkins.

> Bad use of Preconditions in startFileInternal()
> ---
>
> Key: HDFS-7190
> URL: https://issues.apache.org/jira/browse/HDFS-7190
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Konstantin Shvachko
>Assignee: Dawson Choong
>  Labels: newbie
> Attachments: HDFS-7190.patch
>
>
> The following precondition is in the middle of startFileInternal()
> {code}
> feInfo = new FileEncryptionInfo(suite, version,
> 
> Preconditions.checkNotNull(feInfo);
> {code}
> Preconditions are recommended to be used at the beginning of the method.
> In this case the check is a no-op anyway, because the variable has just been 
> constructed.
> It should just be removed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7227) Fix findbugs warning about NP_DEREFERENCE_OF_READLINE_VALUE in SpanReceiverHost

2014-10-14 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171644#comment-14171644
 ] 

stack commented on HDFS-7227:
-

Patch LGTM, +1. An 'if' followed by a short clause all on one line is fine by 
me, but if the hdfs style guide says otherwise, add brackets on commit?
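
For reference, the NP_DEREFERENCE_OF_READLINE_VALUE pattern and its usual fix 
look roughly like this (a generic sketch, not the actual SpanReceiverHost change):
{code}
// Generic illustration of the findbugs pattern; not the SpanReceiverHost code itself.
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class ReadLineExample {
  static String readFirstLine(String path) throws IOException {
    try (BufferedReader in = new BufferedReader(new FileReader(path))) {
      // Findbugs flags "in.readLine().trim()" because readLine() returns null
      // at end of stream; checking for null first avoids the possible NPE.
      String line = in.readLine();
      if (line == null) return "";
      return line.trim();
    }
  }
}
{code}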

> Fix findbugs warning about NP_DEREFERENCE_OF_READLINE_VALUE in 
> SpanReceiverHost
> ---
>
> Key: HDFS-7227
> URL: https://issues.apache.org/jira/browse/HDFS-7227
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-7227.001.patch
>
>
> Fix findbugs warning about NP_DEREFERENCE_OF_READLINE_VALUE in 
> SpanReceiverHost



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7243) HDFS concat operation should not be allowed in Encryption Zone

2014-10-14 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-7243:
---
Attachment: HDFS-7243.001.patch

[~hitliuyi],

Good catch. I've attached a patch.



> HDFS concat operation should not be allowed in Encryption Zone
> --
>
> Key: HDFS-7243
> URL: https://issues.apache.org/jira/browse/HDFS-7243
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, namenode
>Affects Versions: 2.6.0
>Reporter: Yi Liu
>Assignee: Charles Lamb
> Attachments: HDFS-7243.001.patch
>
>
> For HDFS encryption at rest, files in an encryption zone are using different 
> data encryption keys, so concat should be disallowed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7243) HDFS concat operation should not be allowed in Encryption Zone

2014-10-14 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-7243:
---
Status: Patch Available  (was: Open)

> HDFS concat operation should not be allowed in Encryption Zone
> --
>
> Key: HDFS-7243
> URL: https://issues.apache.org/jira/browse/HDFS-7243
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, namenode
>Affects Versions: 2.6.0
>Reporter: Yi Liu
>Assignee: Charles Lamb
> Attachments: HDFS-7243.001.patch
>
>
> For HDFS encryption at rest, files in an encryption zone are using different 
> data encryption keys, so concat should be disallowed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7185) The active NameNode will not accept an fsimage sent from the standby during rolling upgrade

2014-10-14 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-7185:

Attachment: HDFS-7185.003.patch

Updated the patch: in {{updateStorageVersionForRollingUpgrade}} we should not 
check the layout version number for the rolling rollback case, because 1) we have 
already done this check before/while loading the fsimage, and 2) we have already 
used the software's layout version number to set the storage directory's layout 
version.

I have tested the patch on a 3-node cluster, and it looks like the rolling upgrade 
(including rollback) works fine with the patch.

> The active NameNode will not accept an fsimage sent from the standby during 
> rolling upgrade
> ---
>
> Key: HDFS-7185
> URL: https://issues.apache.org/jira/browse/HDFS-7185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Colin Patrick McCabe
>Assignee: Jing Zhao
> Attachments: HDFS-7185.000.patch, HDFS-7185.001.patch, 
> HDFS-7185.002.patch, HDFS-7185.003.patch
>
>
> The active NameNode will not accept an fsimage sent from the standby during 
> rolling upgrade.  The active fails with the exception:
> {code}
> 18:25:07,620  WARN ImageServlet:198 - Received an invalid request file 
> transfer request from a secondary with storage info 
> -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
> 18:25:07,620  WARN log:76 - Committed before 410 PutImage failed. 
> java.io.IOException: This namenode has storage info 
> -55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6 but the secondary 
> expected -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-
> 0a6e431987f6
> at 
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.validateRequest(ImageServlet.java:200)
> at 
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.doPut(ImageServlet.java:443)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:730)
> {code}
> On the standby, the exception is:
> {code}
> java.io.IOException: Exception during image upload: 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException:
>  This namenode has storage info 
> -55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6 but the secondary 
> expected
>  -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.doCheckpoint(StandbyCheckpointer.java:218)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.access$1400(StandbyCheckpointer.java:62)
> {code}
> This seems to be a consequence of the fact that the VERSION file still is at 
> -55 (the old version) even after the rolling upgrade has started.  When the 
> rolling upgrade is finalized with {{hdfs dfsadmin -rollingUpgrade finalize}}, 
> both VERSION files get set to the new version, and the problem goes away.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-7070) TestWebHdfsFileSystemContract fails occassionally

2014-10-14 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang resolved HDFS-7070.
-
Resolution: Cannot Reproduce

Haven't seen the reported tests fail for 3 weeks. The issue might have been 
addressed by some fix. Closing it for now. Please feel free to reopen if it 
happens again.


> TestWebHdfsFileSystemContract fails occassionally
> -
>
> Key: HDFS-7070
> URL: https://issues.apache.org/jira/browse/HDFS-7070
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.6.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>
> org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract.testResponseCode
> and  
> org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract.testRenameDirToSelf 
> failed recently.
> Need to determine whether it was introduced by some recent code change (e.g., a 
> file descriptor leak), or whether it is a similar issue to the one reported in 
> HDFS-6694.
> E.g. 
> https://builds.apache.org/job/PreCommit-HDFS-Build/8026/testReport/org.apache.hadoop.hdfs.web/TestWebHdfsFileSystemContract/testResponseCode/.
> {code}
> 2014-09-15 12:52:18,866 INFO  datanode.DataNode 
> (DataXceiver.java:writeBlock(749)) - opWriteBlock 
> BP-23833599-67.195.81.147-1410785517350:blk_1073741827_1461 received 
> exception java.io.IOException: Cannot run program "stat": 
> java.io.IOException: error=24, Too many open files
> 2014-09-15 12:52:18,867 ERROR datanode.DataNode (DataXceiver.java:run(243)) - 
> 127.0.0.1:47221:DataXceiver error processing WRITE_BLOCK operation  src: 
> /127.0.0.1:38112 dst: /127.0.0.1:47221
> java.io.IOException: Cannot run program "stat": java.io.IOException: 
> error=24, Too many open files
>   at java.lang.ProcessBuilder.start(ProcessBuilder.java:470)
>   at org.apache.hadoop.util.Shell.runCommand(Shell.java:485)
>   at org.apache.hadoop.util.Shell.run(Shell.java:455)
>   at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
>   at org.apache.hadoop.fs.HardLink.getLinkCount(HardLink.java:495)
>   at 
> org.apache.hadoop.hdfs.server.datanode.ReplicaInfo.unlinkBlock(ReplicaInfo.java:288)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.append(FsDatasetImpl.java:702)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.append(FsDatasetImpl.java:680)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.append(FsDatasetImpl.java:101)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:193)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:604)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:126)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:72)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:225)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: java.io.IOException: error=24, Too many open 
> files
>   at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
>   at java.lang.ProcessImpl.start(ProcessImpl.java:65)
>   at java.lang.ProcessBuilder.start(ProcessBuilder.java:452)
>   ... 14 more
> 2014-09-15 12:52:18,867 INFO  hdfs.DFSClient 
> (DFSOutputStream.java:createBlockOutputStream(1400)) - Exception in 
> createBlockOutputStream
> java.io.EOFException: Premature EOF: no length prefix available
>   at 
> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2101)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1368)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1210)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:530)
> 2014-09-15 12:52:18,870 WARN  hdfs.DFSClient (DFSOutputStream.java:run(883)) 
> - DFSOutputStream ResponseProcessor exception  for block 
> BP-23833599-67.195.81.147-1410785517350:blk_1073741827_1461
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2099)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:176)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:798)
> 2014-09-15 12:52:18,870 WARN  hdfs.DFSClient (DFSOutputStream.java:run(627)) 
> - DataStreamer Exception
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$Packet.writeTo(DFSOutputStream.java:273)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFS

[jira] [Commented] (HDFS-7185) The active NameNode will not accept an fsimage sent from the standby during rolling upgrade

2014-10-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171758#comment-14171758
 ] 

Hadoop QA commented on HDFS-7185:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12674855/HDFS-7185.002.patch
  against trunk revision cdce883.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
  org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8423//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8423//console

This message is automatically generated.

> The active NameNode will not accept an fsimage sent from the standby during 
> rolling upgrade
> ---
>
> Key: HDFS-7185
> URL: https://issues.apache.org/jira/browse/HDFS-7185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Colin Patrick McCabe
>Assignee: Jing Zhao
> Attachments: HDFS-7185.000.patch, HDFS-7185.001.patch, 
> HDFS-7185.002.patch, HDFS-7185.003.patch
>
>
> The active NameNode will not accept an fsimage sent from the standby during 
> rolling upgrade.  The active fails with the exception:
> {code}
> 18:25:07,620  WARN ImageServlet:198 - Received an invalid request file 
> transfer request from a secondary with storage info 
> -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
> 18:25:07,620  WARN log:76 - Committed before 410 PutImage failed. 
> java.io.IOException: This namenode has storage info 
> -55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6 but the secondary 
> expected -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-
> 0a6e431987f6
> at 
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.validateRequest(ImageServlet.java:200)
> at 
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.doPut(ImageServlet.java:443)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:730)
> {code}
> On the standby, the exception is:
> {code}
> java.io.IOException: Exception during image upload: 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException:
>  This namenode has storage info 
> -55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6 but the secondary 
> expected
>  -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.doCheckpoint(StandbyCheckpointer.java:218)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.access$1400(StandbyCheckpointer.java:62)
> {code}
> This seems to be a consequence of the fact that the VERSION file still is at 
> -55 (the old version) even after the rolling upgrade has started.  When the 
> rolling upgrade is finalized with {{hdfs dfsadmin -rollingUpgrade finalize}}, 
> both VERSION files get set to the new version, and the problem goes away.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7201) Fix typos in hdfs-default.xml

2014-10-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171757#comment-14171757
 ] 

Hadoop QA commented on HDFS-7201:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12674825/HDFS-7201.patch
  against trunk revision cdce883.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
12 warning messages.
See 
https://builds.apache.org/job/PreCommit-HDFS-Build/8422//artifact/patchprocess/diffJavadocWarnings.txt
 for details.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
  org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8422//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8422//console

This message is automatically generated.

> Fix typos in hdfs-default.xml
> -
>
> Key: HDFS-7201
> URL: https://issues.apache.org/jira/browse/HDFS-7201
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.5.1
>Reporter: Konstantin Shvachko
>Assignee: Dawson Choong
>  Labels: newbie
> Attachments: HDFS-7201.patch
>
>
> Found the following typos in hdfs-default.xml:
> repliaction
> directoires
> teh
> tranfer
> spage



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7201) Fix typos in hdfs-default.xml

2014-10-14 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171781#comment-14171781
 ] 

Haohui Mai commented on HDFS-7201:
--

The test failures and the javadoc warnings are unrelated. I'll commit the patch 
shortly.

> Fix typos in hdfs-default.xml
> -
>
> Key: HDFS-7201
> URL: https://issues.apache.org/jira/browse/HDFS-7201
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.5.1
>Reporter: Konstantin Shvachko
>Assignee: Dawson Choong
>  Labels: newbie
> Attachments: HDFS-7201.patch
>
>
> Found the following typos in hdfs-default.xml:
> repliaction
> directoires
> teh
> tranfer
> spage



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7201) Fix typos in hdfs-default.xml

2014-10-14 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7201:
-
   Resolution: Fixed
Fix Version/s: 2.7.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I've committed the patch to trunk and branch-2. Thanks [~shv] for the reviews, 
and [~dawson.choong] for the contribution.

> Fix typos in hdfs-default.xml
> -
>
> Key: HDFS-7201
> URL: https://issues.apache.org/jira/browse/HDFS-7201
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.5.1
>Reporter: Konstantin Shvachko
>Assignee: Dawson Choong
>  Labels: newbie
> Fix For: 2.7.0
>
> Attachments: HDFS-7201.patch
>
>
> Found the following typos in hdfs-default.xml:
> repliaction
> directoires
> teh
> tranfer
> spage



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7201) Fix typos in hdfs-default.xml

2014-10-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171797#comment-14171797
 ] 

Hudson commented on HDFS-7201:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6261 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6261/])
HDFS-7201. Fix typos in hdfs-default.xml. Contributed by Dawson Choong. 
(wheat9: rev 0260231667bc0da7e8738ed3a313c0b3d957143f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml


> Fix typos in hdfs-default.xml
> -
>
> Key: HDFS-7201
> URL: https://issues.apache.org/jira/browse/HDFS-7201
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.5.1
>Reporter: Konstantin Shvachko
>Assignee: Dawson Choong
>  Labels: newbie
> Fix For: 2.7.0
>
> Attachments: HDFS-7201.patch
>
>
> Found the following typos in hdfs-default.xml:
> repliaction
> directoires
> teh
> tranfer
> spage



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7190) Bad use of Preconditions in startFileInternal()

2014-10-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171826#comment-14171826
 ] 

Hadoop QA commented on HDFS-7190:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12674863/HDFS-7190.patch
  against trunk revision cdce883.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing
  
org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8424//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8424//console

This message is automatically generated.

> Bad use of Preconditions in startFileInternal()
> ---
>
> Key: HDFS-7190
> URL: https://issues.apache.org/jira/browse/HDFS-7190
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Konstantin Shvachko
>Assignee: Dawson Choong
>  Labels: newbie
> Attachments: HDFS-7190.patch
>
>
> The following precondition is in the middle of startFileInternal()
> {code}
> feInfo = new FileEncryptionInfo(suite, version,
> 
> Preconditions.checkNotNull(feInfo);
> {code}
> Preconditions are recommended to be used at the beginning of the method.
> In this case the check is a no-op anyway, because the variable has just been 
> constructed.
> It should just be removed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7243) HDFS concat operation should not be allowed in Encryption Zone

2014-10-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171858#comment-14171858
 ] 

Hadoop QA commented on HDFS-7243:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12674865/HDFS-7243.001.patch
  against trunk revision cdce883.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
  org.apache.hadoop.hdfs.server.namenode.TestHDFSConcat
  
org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer
  
org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
  org.apache.hadoop.hdfs.TestDFSInotifyEventInputStream
  org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache
  org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8425//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8425//console

This message is automatically generated.

> HDFS concat operation should not be allowed in Encryption Zone
> --
>
> Key: HDFS-7243
> URL: https://issues.apache.org/jira/browse/HDFS-7243
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, namenode
>Affects Versions: 2.6.0
>Reporter: Yi Liu
>Assignee: Charles Lamb
> Attachments: HDFS-7243.001.patch
>
>
> For HDFS encryption at rest, files in an encryption zone are using different 
> data encryption keys, so concat should be disallowed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7185) The active NameNode will not accept an fsimage sent from the standby during rolling upgrade

2014-10-14 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-7185:

Attachment: HDFS-7185.004.patch

After an offline discussion with Nicholas, this 004 patch adds a stricter 
check for rollingUpgrade rollback. Specifically, we check that the software's 
layout version is the same as the fsimage's layout version when we're doing a 
rolling rollback.
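
Roughly, the added check amounts to something like the following sketch (method 
and variable names are assumptions, not the actual patch):
{code}
// Illustrative sketch only; not the real HDFS-7185 change.
import java.io.IOException;

public class RollingRollbackCheck {
  /**
   * Reject a rolling rollback when the fsimage being loaded was written by a
   * different layout version than the currently running software.
   */
  static void checkLayoutVersionForRollback(int softwareLayoutVersion,
      int imageLayoutVersion) throws IOException {
    if (softwareLayoutVersion != imageLayoutVersion) {
      throw new IOException("Cannot perform rolling rollback: software layout version "
          + softwareLayoutVersion + " does not match fsimage layout version "
          + imageLayoutVersion);
    }
  }
}
{code}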

> The active NameNode will not accept an fsimage sent from the standby during 
> rolling upgrade
> ---
>
> Key: HDFS-7185
> URL: https://issues.apache.org/jira/browse/HDFS-7185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Colin Patrick McCabe
>Assignee: Jing Zhao
> Attachments: HDFS-7185.000.patch, HDFS-7185.001.patch, 
> HDFS-7185.002.patch, HDFS-7185.003.patch, HDFS-7185.004.patch
>
>
> The active NameNode will not accept an fsimage sent from the standby during 
> rolling upgrade.  The active fails with the exception:
> {code}
> 18:25:07,620  WARN ImageServlet:198 - Received an invalid request file 
> transfer request from a secondary with storage info 
> -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
> 18:25:07,620  WARN log:76 - Committed before 410 PutImage failed. 
> java.io.IOException: This namenode has storage info 
> -55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6 but the secondary 
> expected -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-
> 0a6e431987f6
> at 
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.validateRequest(ImageServlet.java:200)
> at 
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.doPut(ImageServlet.java:443)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:730)
> {code}
> On the standby, the exception is:
> {code}
> java.io.IOException: Exception during image upload: 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException:
>  This namenode has storage info 
> -55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6 but the secondary 
> expected
>  -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.doCheckpoint(StandbyCheckpointer.java:218)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.access$1400(StandbyCheckpointer.java:62)
> {code}
> This seems to be a consequence of the fact that the VERSION file still is at 
> -55 (the old version) even after the rolling upgrade has started.  When the 
> rolling upgrade is finalized with {{hdfs dfsadmin -rollingUpgrade finalize}}, 
> both VERSION files get set to the new version, and the problem goes away.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7185) The active NameNode will not accept an fsimage sent from the standby during rolling upgrade

2014-10-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171892#comment-14171892
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7185:
---

+1 the new patch looks good.  Thanks for the update.

> The active NameNode will not accept an fsimage sent from the standby during 
> rolling upgrade
> ---
>
> Key: HDFS-7185
> URL: https://issues.apache.org/jira/browse/HDFS-7185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Colin Patrick McCabe
>Assignee: Jing Zhao
> Attachments: HDFS-7185.000.patch, HDFS-7185.001.patch, 
> HDFS-7185.002.patch, HDFS-7185.003.patch, HDFS-7185.004.patch
>
>
> The active NameNode will not accept an fsimage sent from the standby during 
> rolling upgrade.  The active fails with the exception:
> {code}
> 18:25:07,620  WARN ImageServlet:198 - Received an invalid request file 
> transfer request from a secondary with storage info 
> -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
> 18:25:07,620  WARN log:76 - Committed before 410 PutImage failed. 
> java.io.IOException: This namenode has storage info 
> -55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6 but the secondary 
> expected -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-
> 0a6e431987f6
> at 
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.validateRequest(ImageServlet.java:200)
> at 
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.doPut(ImageServlet.java:443)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:730)
> {code}
> On the standby, the exception is:
> {code}
> java.io.IOException: Exception during image upload: 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException:
>  This namenode has storage info 
> -55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6 but the secondary 
> expected
>  -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.doCheckpoint(StandbyCheckpointer.java:218)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.access$1400(StandbyCheckpointer.java:62)
> {code}
> This seems to be a consequence of the fact that the VERSION file still is at 
> -55 (the old version) even after the rolling upgrade has started.  When the 
> rolling upgrade is finalized with {{hdfs dfsadmin -rollingUpgrade finalize}}, 
> both VERSION files get set to the new version, and the problem goes away.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7185) The active NameNode will not accept an fsimage sent from the standby during rolling upgrade

2014-10-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171896#comment-14171896
 ] 

Hadoop QA commented on HDFS-7185:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12674875/HDFS-7185.003.patch
  against trunk revision cdce883.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.cli.TestAclCLI
  
org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
  org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8426//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8426//console

This message is automatically generated.

> The active NameNode will not accept an fsimage sent from the standby during 
> rolling upgrade
> ---
>
> Key: HDFS-7185
> URL: https://issues.apache.org/jira/browse/HDFS-7185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Colin Patrick McCabe
>Assignee: Jing Zhao
> Attachments: HDFS-7185.000.patch, HDFS-7185.001.patch, 
> HDFS-7185.002.patch, HDFS-7185.003.patch, HDFS-7185.004.patch
>
>
> The active NameNode will not accept an fsimage sent from the standby during 
> rolling upgrade.  The active fails with the exception:
> {code}
> 18:25:07,620  WARN ImageServlet:198 - Received an invalid request file 
> transfer request from a secondary with storage info 
> -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
> 18:25:07,620  WARN log:76 - Committed before 410 PutImage failed. 
> java.io.IOException: This namenode has storage info 
> -55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6 but the secondary 
> expected -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-
> 0a6e431987f6
> at 
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.validateRequest(ImageServlet.java:200)
> at 
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.doPut(ImageServlet.java:443)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:730)
> {code}
> On the standby, the exception is:
> {code}
> java.io.IOException: Exception during image upload: 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException:
>  This namenode has storage info 
> -55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6 but the secondary 
> expected
>  -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.doCheckpoint(StandbyCheckpointer.java:218)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.access$1400(StandbyCheckpointer.java:62)
> {code}
> This seems to be a consequence of the fact that the VERSION file still is at 
> -55 (the old version) even after the rolling upgrade has started.  When the 
> rolling upgrade is finalized with {{hdfs dfsadmin -rollingUpgrade finalize}}, 
> both VERSION files get set to the new version, and the problem goes away.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7185) The active NameNode will not accept an fsimage sent from the standby during rolling upgrade

2014-10-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171898#comment-14171898
 ] 

Hadoop QA commented on HDFS-7185:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12674902/HDFS-7185.004.patch
  against trunk revision 0260231.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8427//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8427//console

This message is automatically generated.

> The active NameNode will not accept an fsimage sent from the standby during 
> rolling upgrade
> ---
>
> Key: HDFS-7185
> URL: https://issues.apache.org/jira/browse/HDFS-7185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Colin Patrick McCabe
>Assignee: Jing Zhao
> Attachments: HDFS-7185.000.patch, HDFS-7185.001.patch, 
> HDFS-7185.002.patch, HDFS-7185.003.patch, HDFS-7185.004.patch
>
>
> The active NameNode will not accept an fsimage sent from the standby during 
> rolling upgrade.  The active fails with the exception:
> {code}
> 18:25:07,620  WARN ImageServlet:198 - Received an invalid request file 
> transfer request from a secondary with storage info 
> -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
> 18:25:07,620  WARN log:76 - Committed before 410 PutImage failed. 
> java.io.IOException: This namenode has storage info 
> -55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6 but the secondary 
> expected -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-
> 0a6e431987f6
> at 
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.validateRequest(ImageServlet.java:200)
> at 
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.doPut(ImageServlet.java:443)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:730)
> {code}
> On the standby, the exception is:
> {code}
> java.io.IOException: Exception during image upload: 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException:
>  This namenode has storage info 
> -55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6 but the secondary 
> expected
>  -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.doCheckpoint(StandbyCheckpointer.java:218)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.access$1400(StandbyCheckpointer.java:62)
> {code}
> This seems to be a consequence of the fact that the VERSION file still is at 
> -55 (the old version) even after the rolling upgrade has started.  When the 
> rolling upgrade is finalized with {{hdfs dfsadmin -rollingUpgrade finalize}}, 
> both VERSION files get set to the new version, and the problem goes away.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7252) small refine for use of isInAnEZ in FSNamesystem

2014-10-14 Thread Yi Liu (JIRA)
Yi Liu created HDFS-7252:


 Summary: small refine for use of isInAnEZ in FSNamesystem
 Key: HDFS-7252
 URL: https://issues.apache.org/jira/browse/HDFS-7252
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Trivial


In {{FSN#startFileInt}}, _EncryptionZoneManager#getEncryptionZoneForPath_ is 
invoked 3 times (_dir.isInAnEZ(iip)_, _dir.getEZForPath(iip)_, 
_dir.getKeyName(iip)_) in the following code; actually we just need one invocation.
{code}
if (dir.isInAnEZ(iip)) {
  EncryptionZone zone = dir.getEZForPath(iip);
  protocolVersion = chooseProtocolVersion(zone, supportedVersions);
  suite = zone.getSuite();
  ezKeyName = dir.getKeyName(iip);

  Preconditions.checkNotNull(protocolVersion);
  Preconditions.checkNotNull(suite);
  Preconditions.checkArgument(!suite.equals(CipherSuite.UNKNOWN),
  "Chose an UNKNOWN CipherSuite!");
  Preconditions.checkNotNull(ezKeyName);
}
{code}
Also it is invoked twice in the following code, but we just need one invocation:
{code}
if (dir.isInAnEZ(iip)) {
  // The path is now within an EZ, but we're missing encryption parameters
  if (suite == null || edek == null) {
throw new RetryStartFileException();
  }
  // Path is within an EZ and we have provided encryption parameters.
  // Make sure that the generated EDEK matches the settings of the EZ.
  String ezKeyName = dir.getKeyName(iip);
  if (!ezKeyName.equals(edek.getEncryptionKeyName())) {
throw new RetryStartFileException();
  }
  feInfo = new FileEncryptionInfo(suite, version,
  edek.getEncryptedKeyVersion().getMaterial(),
  edek.getEncryptedKeyIv(),
  ezKeyName, edek.getEncryptionKeyVersionName());
  Preconditions.checkNotNull(feInfo);
}
{code}
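
A hedged sketch of the kind of refactoring this suggests (the actual patch may 
look different; it assumes {{getEZForPath}} returns null when the path is not in 
an EZ, and that the zone object carries its key name):
{code}
// Illustrative only; not the attached patch.
EncryptionZone zone = dir.getEZForPath(iip);   // single lookup
if (zone != null) {
  protocolVersion = chooseProtocolVersion(zone, supportedVersions);
  suite = zone.getSuite();
  ezKeyName = zone.getKeyName();               // reuse the zone instead of dir.getKeyName(iip)

  Preconditions.checkNotNull(protocolVersion);
  Preconditions.checkNotNull(suite);
  Preconditions.checkArgument(!suite.equals(CipherSuite.UNKNOWN),
      "Chose an UNKNOWN CipherSuite!");
  Preconditions.checkNotNull(ezKeyName);
}
{code}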




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7252) small refine for use of isInAnEZ in FSNamesystem

2014-10-14 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-7252:
-
Attachment: HDFS-7252.001.patch

> small refine for use of isInAnEZ in FSNamesystem
> 
>
> Key: HDFS-7252
> URL: https://issues.apache.org/jira/browse/HDFS-7252
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.6.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Trivial
> Attachments: HDFS-7252.001.patch
>
>
> In {{FSN#startFileInt}}, _EncryptionZoneManager#getEncryptionZoneForPath_ is 
> invoked 3 times (_dir.isInAnEZ(iip)_, _dir.getEZForPath(iip)_, 
> _dir.getKeyName(iip)_) in the following code; actually we just need one invocation.
> {code}
> if (dir.isInAnEZ(iip)) {
>   EncryptionZone zone = dir.getEZForPath(iip);
>   protocolVersion = chooseProtocolVersion(zone, supportedVersions);
>   suite = zone.getSuite();
>   ezKeyName = dir.getKeyName(iip);
>   Preconditions.checkNotNull(protocolVersion);
>   Preconditions.checkNotNull(suite);
>   Preconditions.checkArgument(!suite.equals(CipherSuite.UNKNOWN),
>   "Chose an UNKNOWN CipherSuite!");
>   Preconditions.checkNotNull(ezKeyName);
> }
> {code}
> Also it is invoked twice in the following code, but we just need one invocation:
> {code}
> if (dir.isInAnEZ(iip)) {
>   // The path is now within an EZ, but we're missing encryption parameters
>   if (suite == null || edek == null) {
> throw new RetryStartFileException();
>   }
>   // Path is within an EZ and we have provided encryption parameters.
>   // Make sure that the generated EDEK matches the settings of the EZ.
>   String ezKeyName = dir.getKeyName(iip);
>   if (!ezKeyName.equals(edek.getEncryptionKeyName())) {
> throw new RetryStartFileException();
>   }
>   feInfo = new FileEncryptionInfo(suite, version,
>   edek.getEncryptedKeyVersion().getMaterial(),
>   edek.getEncryptedKeyIv(),
>   ezKeyName, edek.getEncryptionKeyVersionName());
>   Preconditions.checkNotNull(feInfo);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7252) small refine for use of isInAnEZ in FSNamesystem

2014-10-14 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-7252:
-
Status: Patch Available  (was: Open)

> small refine for use of isInAnEZ in FSNamesystem
> 
>
> Key: HDFS-7252
> URL: https://issues.apache.org/jira/browse/HDFS-7252
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.6.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Trivial
> Attachments: HDFS-7252.001.patch
>
>
> In {{FSN#startFileInt}}, _EncryptionZoneManager#getEncryptionZoneForPath_ is 
> invoked 3 times (_dir.isInAnEZ(iip)_, _dir.getEZForPath(iip)_, 
> _dir.getKeyName(iip)_) in the following code; actually we just need one invocation.
> {code}
> if (dir.isInAnEZ(iip)) {
>   EncryptionZone zone = dir.getEZForPath(iip);
>   protocolVersion = chooseProtocolVersion(zone, supportedVersions);
>   suite = zone.getSuite();
>   ezKeyName = dir.getKeyName(iip);
>   Preconditions.checkNotNull(protocolVersion);
>   Preconditions.checkNotNull(suite);
>   Preconditions.checkArgument(!suite.equals(CipherSuite.UNKNOWN),
>   "Chose an UNKNOWN CipherSuite!");
>   Preconditions.checkNotNull(ezKeyName);
> }
> {code}
> Also it is invoked twice in the following code, but we just need one invocation:
> {code}
> if (dir.isInAnEZ(iip)) {
>   // The path is now within an EZ, but we're missing encryption parameters
>   if (suite == null || edek == null) {
> throw new RetryStartFileException();
>   }
>   // Path is within an EZ and we have provided encryption parameters.
>   // Make sure that the generated EDEK matches the settings of the EZ.
>   String ezKeyName = dir.getKeyName(iip);
>   if (!ezKeyName.equals(edek.getEncryptionKeyName())) {
> throw new RetryStartFileException();
>   }
>   feInfo = new FileEncryptionInfo(suite, version,
>   edek.getEncryptedKeyVersion().getMaterial(),
>   edek.getEncryptedKeyIv(),
>   ezKeyName, edek.getEncryptionKeyVersionName());
>   Preconditions.checkNotNull(feInfo);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7252) small refine for use of isInAnEZ in FSNamesystem

2014-10-14 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171916#comment-14171916
 ] 

Yi Liu commented on HDFS-7252:
--

A small code refinement.

> small refine for use of isInAnEZ in FSNamesystem
> 
>
> Key: HDFS-7252
> URL: https://issues.apache.org/jira/browse/HDFS-7252
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.6.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Trivial
> Attachments: HDFS-7252.001.patch
>
>
> In {{FSN#startFileInt}}, _EncryptionZoneManager#getEncryptionZoneForPath_ is 
> invoked 3 times (_dir.isInAnEZ(iip)_, _dir.getEZForPath(iip)_, 
> _dir.getKeyName(iip)_) in the following code; actually we just need one invocation.
> {code}
> if (dir.isInAnEZ(iip)) {
>   EncryptionZone zone = dir.getEZForPath(iip);
>   protocolVersion = chooseProtocolVersion(zone, supportedVersions);
>   suite = zone.getSuite();
>   ezKeyName = dir.getKeyName(iip);
>   Preconditions.checkNotNull(protocolVersion);
>   Preconditions.checkNotNull(suite);
>   Preconditions.checkArgument(!suite.equals(CipherSuite.UNKNOWN),
>   "Chose an UNKNOWN CipherSuite!");
>   Preconditions.checkNotNull(ezKeyName);
> }
> {code}
> Also it is invoked twice in the following code, but we just need one invocation:
> {code}
> if (dir.isInAnEZ(iip)) {
>   // The path is now within an EZ, but we're missing encryption parameters
>   if (suite == null || edek == null) {
> throw new RetryStartFileException();
>   }
>   // Path is within an EZ and we have provided encryption parameters.
>   // Make sure that the generated EDEK matches the settings of the EZ.
>   String ezKeyName = dir.getKeyName(iip);
>   if (!ezKeyName.equals(edek.getEncryptionKeyName())) {
> throw new RetryStartFileException();
>   }
>   feInfo = new FileEncryptionInfo(suite, version,
>   edek.getEncryptedKeyVersion().getMaterial(),
>   edek.getEncryptedKeyIv(),
>   ezKeyName, edek.getEncryptionKeyVersionName());
>   Preconditions.checkNotNull(feInfo);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7208) NN doesn't schedule replication when a DN storage fails

2014-10-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171928#comment-14171928
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7208:
---

Hi Ming, thanks for working on this.  The patch looks pretty good.  Some 
comments:

- For heartbeatedSinceRegistration == false, let's check failed storage anyway, 
i.e. no need to compare storageMap.size() > reports.length.
- The method removeBlocksOnDatanodeStorage(..) does not use anything in 
DatanodeManager.  We may move the code to 
BlockManager.removeBlocksAssociatedTo(..).
- In HeartbeatManager.heartbeatCheck(), allAlive should be changed to allAlive 
= dead == null && failedStorage == null (see the sketch after this list).
- In DatanodeDescriptor.updateFailedStorage(..), check if a storage was already 
failed.  Log and update the state only if it was not already failed.
- HeartbeatManager.register(..) also calls 
DatanodeDescriptor.updateHeartbeat(..).  So setting 
heartbeatedSinceRegistration = true in updateHeartbeat(..) is wrong.  Need to 
fix it.
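
As a rough sketch of the allAlive suggestion above (the helper names are made 
up; this is not the actual HeartbeatManager code):
{code}
// Illustrative sketch only.
void heartbeatCheck() {
  boolean allAlive = false;
  while (!allAlive) {
    DatanodeDescriptor dead = findDeadDatanode();            // hypothetical helper
    DatanodeStorageInfo failedStorage = findFailedStorage(); // hypothetical helper

    // Terminate only when there is neither a dead node nor a failed storage left.
    allAlive = dead == null && failedStorage == null;

    if (dead != null) {
      removeDeadDatanode(dead);                              // hypothetical helper
    }
    if (failedStorage != null) {
      blockManager.removeBlocksAssociatedTo(failedStorage);  // per the comment above
    }
  }
}
{code}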


> NN doesn't schedule replication when a DN storage fails
> ---
>
> Key: HDFS-7208
> URL: https://issues.apache.org/jira/browse/HDFS-7208
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-7208.patch
>
>
> We found the following problem. When a storage device on a DN fails, NN 
> continues to believe replicas of those blocks on that storage are valid and 
> doesn't schedule replication.
> A DN has 12 storage disks. So there is one blockReport for each storage. When 
> a disk fails, # of blockReport from that DN is reduced from 12 to 11. Given 
> dfs.datanode.failed.volumes.tolerated is configured to be > 0, NN still 
> considers that DN healthy.
> 1. A disk failed. All blocks of that disk are removed from DN dataset.
>  
> {noformat}
> 2014-10-04 02:11:12,626 WARN 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removing 
> replica BP-1748500278-xx.xx.xx.xxx-1377803467793:1121568886 on failed volume 
> /data/disk6/dfs/current
> {noformat}
> 2. NN receives DatanodeProtocol.DISK_ERROR. But that isn't enough to have NN 
> remove the DN and the replicas from the BlocksMap. In addition, blockReport 
> doesn't provide the diff given that is done per storage.
> {noformat}
> 2014-10-04 02:11:12,681 WARN org.apache.hadoop.hdfs.server.namenode.NameNode: 
> Disk error on DatanodeRegistration(xx.xx.xx.xxx, 
> datanodeUuid=f3b8a30b-e715-40d6-8348-3c766f9ba9ab, infoPort=50075, 
> ipcPort=50020, 
> storageInfo=lv=-55;cid=CID-e3c38355-fde5-4e3a-b7ce-edacebdfa7a1;nsid=420527250;c=1410283484939):
>  DataNode failed volumes:/data/disk6/dfs/current
> {noformat}
> 3. Run fsck on the file and confirm the NN's BlocksMap still has that replica.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7243) HDFS concat operation should not be allowed in Encryption Zone

2014-10-14 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171934#comment-14171934
 ] 

Yi Liu commented on HDFS-7243:
--

Thanks Charles for posting the patch.
*1.*
{quote}
target.substring(0, target.lastIndexOf(Path.SEPARATOR_CHAR))
{quote}
could be an empty string (if the parent is the root), which will then cause 
{{getEZForPath}} to fail.
*2.*
Don't invoke FSN#getEZForPath directly; it's a bit heavy, and more importantly it 
has its own permission check, which will cause some issues: for example, concat 
just needs "write" permission for the target, and "read, parent write" permission 
for the srcs.  If we invoke {{getEZForPath}} on the target or its parent directly, 
it requires "read" permission on the path. 
Why not do a lightweight check of whether the target is in an encryption zone in 
{{concatInternal}}?
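
For illustration, the lightweight check being suggested could look roughly like 
this inside {{concatInternal}} ({{targetIIP}} is an assumed variable name for the 
target's resolved INodesInPath):
{code}
// Illustrative sketch only, not the actual patch.
if (dir.isInAnEZ(targetIIP)) {
  throw new HadoopIllegalArgumentException(
      "concat can not be called for files in an encryption zone: " + target);
}
{code}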


> HDFS concat operation should not be allowed in Encryption Zone
> --
>
> Key: HDFS-7243
> URL: https://issues.apache.org/jira/browse/HDFS-7243
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, namenode
>Affects Versions: 2.6.0
>Reporter: Yi Liu
>Assignee: Charles Lamb
> Attachments: HDFS-7243.001.patch
>
>
> For HDFS encryption at rest, files in an encryption zone are using different 
> data encryption keys, so concat should be disallowed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7208) NN doesn't schedule replication when a DN storage fails

2014-10-14 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-7208:
--
Attachment: HDFS-7208-2.patch

Thanks Nicholas for the review.

The latest patch addresses all your comments, except for the allAlive one. The 
reason is that the patch handles dead nodes separately from failed storages.



> NN doesn't schedule replication when a DN storage fails
> ---
>
> Key: HDFS-7208
> URL: https://issues.apache.org/jira/browse/HDFS-7208
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-7208-2.patch, HDFS-7208.patch
>
>
> We found the following problem. When a storage device on a DN fails, NN 
> continues to believe replicas of those blocks on that storage are valid and 
> doesn't schedule replication.
> A DN has 12 storage disks. So there is one blockReport for each storage. When 
> a disk fails, # of blockReport from that DN is reduced from 12 to 11. Given 
> dfs.datanode.failed.volumes.tolerated is configured to be > 0, NN still 
> considers that DN healthy.
> 1. A disk failed. All blocks of that disk are removed from DN dataset.
>  
> {noformat}
> 2014-10-04 02:11:12,626 WARN 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removing 
> replica BP-1748500278-xx.xx.xx.xxx-1377803467793:1121568886 on failed volume 
> /data/disk6/dfs/current
> {noformat}
> 2. NN receives DatanodeProtocol.DISK_ERROR. But that isn't enough to have NN 
> remove the DN and the replicas from the BlocksMap. In addition, blockReport 
> doesn't provide the diff given that is done per storage.
> {noformat}
> 2014-10-04 02:11:12,681 WARN org.apache.hadoop.hdfs.server.namenode.NameNode: 
> Disk error on DatanodeRegistration(xx.xx.xx.xxx, 
> datanodeUuid=f3b8a30b-e715-40d6-8348-3c766f9ba9ab, infoPort=50075, 
> ipcPort=50020, 
> storageInfo=lv=-55;cid=CID-e3c38355-fde5-4e3a-b7ce-edacebdfa7a1;nsid=420527250;c=1410283484939):
>  DataNode failed volumes:/data/disk6/dfs/current
> {noformat}
> 3. Run fsck on the file and confirm the NN's BlocksMap still has that replica.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7208) NN doesn't schedule replication when a DN storage fails

2014-10-14 Thread cho ju il (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171994#comment-14171994
 ] 

cho ju il commented on HDFS-7208:
-

In which version does this bug occur?
My cluster runs version 2.4.1.
Can I apply the patch without service downtime?

> NN doesn't schedule replication when a DN storage fails
> ---
>
> Key: HDFS-7208
> URL: https://issues.apache.org/jira/browse/HDFS-7208
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-7208-2.patch, HDFS-7208.patch
>
>
> We found the following problem: when a storage device on a DN fails, the NN 
> continues to believe the replicas on that storage are valid and doesn't 
> schedule replication.
> A DN has 12 storage disks. So there is one blockReport for each storage. When 
> a disk fails, # of blockReport from that DN is reduced from 12 to 11. Given 
> dfs.datanode.failed.volumes.tolerated is configured to be > 0, NN still 
> considers that DN healthy.
> 1. A disk failed. All blocks of that disk are removed from DN dataset.
>  
> {noformat}
> 2014-10-04 02:11:12,626 WARN 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removing 
> replica BP-1748500278-xx.xx.xx.xxx-1377803467793:1121568886 on failed volume 
> /data/disk6/dfs/current
> {noformat}
> 2. NN receives DatanodeProtocol.DISK_ERROR. But that isn't enough to have NN 
> remove the DN and the replicas from the BlocksMap. In addition, blockReport 
> doesn't provide the diff given that is done per storage.
> {noformat}
> 2014-10-04 02:11:12,681 WARN org.apache.hadoop.hdfs.server.namenode.NameNode: 
> Disk error on DatanodeRegistration(xx.xx.xx.xxx, 
> datanodeUuid=f3b8a30b-e715-40d6-8348-3c766f9ba9ab, infoPort=50075, 
> ipcPort=50020, 
> storageInfo=lv=-55;cid=CID-e3c38355-fde5-4e3a-b7ce-edacebdfa7a1;nsid=420527250;c=1410283484939):
>  DataNode failed volumes:/data/disk6/dfs/current
> {noformat}
> 3. Run fsck on the file and confirm the NN's BlocksMap still has that replica.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7252) small refine for use of isInAnEZ in FSNamesystem

2014-10-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172030#comment-14172030
 ] 

Hadoop QA commented on HDFS-7252:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12674906/HDFS-7252.001.patch
  against trunk revision 0260231.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
  org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8428//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8428//console

This message is automatically generated.

> small refine for use of isInAnEZ in FSNamesystem
> 
>
> Key: HDFS-7252
> URL: https://issues.apache.org/jira/browse/HDFS-7252
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.6.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Trivial
> Attachments: HDFS-7252.001.patch
>
>
> In {{FSN#startFileInt}}, _EncryptionZoneManager#getEncryptionZoneForPath_ is 
> invoked 3 times (_dir.isInAnEZ(iip)_, _dir.getEZForPath(iip)_, 
> _dir.getKeyName(iip)_) in the following code; actually we just need one call.
> {code}
> if (dir.isInAnEZ(iip)) {
>   EncryptionZone zone = dir.getEZForPath(iip);
>   protocolVersion = chooseProtocolVersion(zone, supportedVersions);
>   suite = zone.getSuite();
>   ezKeyName = dir.getKeyName(iip);
>   Preconditions.checkNotNull(protocolVersion);
>   Preconditions.checkNotNull(suite);
>   Preconditions.checkArgument(!suite.equals(CipherSuite.UNKNOWN),
>   "Chose an UNKNOWN CipherSuite!");
>   Preconditions.checkNotNull(ezKeyName);
> }
> {code}
> It is also invoked twice in the following code, but only one call is needed:
> {code}
> if (dir.isInAnEZ(iip)) {
>   // The path is now within an EZ, but we're missing encryption parameters
>   if (suite == null || edek == null) {
> throw new RetryStartFileException();
>   }
>   // Path is within an EZ and we have provided encryption parameters.
>   // Make sure that the generated EDEK matches the settings of the EZ.
>   String ezKeyName = dir.getKeyName(iip);
>   if (!ezKeyName.equals(edek.getEncryptionKeyName())) {
> throw new RetryStartFileException();
>   }
>   feInfo = new FileEncryptionInfo(suite, version,
>   edek.getEncryptedKeyVersion().getMaterial(),
>   edek.getEncryptedKeyIv(),
>   ezKeyName, edek.getEncryptionKeyVersionName());
>   Preconditions.checkNotNull(feInfo);
> }
> {code}
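
For readers skimming the thread, a hedged sketch of the consolidation proposed 
for the first snippet might look like the following. It assumes a single 
{{getEZForPath}} call can stand in for all three lookups and that 
{{EncryptionZone}} exposes the key name directly; if either assumption does not 
hold, the existing {{isInAnEZ}} guard would simply be kept as the one extra call:

{code}
// Sketch only, not the attached patch: resolve the encryption zone once and
// derive the cipher suite and key name from the returned EncryptionZone.
final EncryptionZone zone = dir.getEZForPath(iip);
if (zone != null) {                       // assumes null means "not in an EZ"
  protocolVersion = chooseProtocolVersion(zone, supportedVersions);
  suite = zone.getSuite();
  ezKeyName = zone.getKeyName();          // replaces the extra dir.getKeyName(iip) lookup
  Preconditions.checkNotNull(protocolVersion);
  Preconditions.checkNotNull(suite);
  Preconditions.checkArgument(!suite.equals(CipherSuite.UNKNOWN),
      "Chose an UNKNOWN CipherSuite!");
  Preconditions.checkNotNull(ezKeyName);
}
{code}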



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7208) NN doesn't schedule replication when a DN storage fails

2014-10-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172029#comment-14172029
 ] 

Hadoop QA commented on HDFS-7208:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12674917/HDFS-7208-2.patch
  against trunk revision 0260231.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The test build failed in 
hadoop-hdfs-project/hadoop-hdfs 

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8429//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8429//console

This message is automatically generated.

> NN doesn't schedule replication when a DN storage fails
> ---
>
> Key: HDFS-7208
> URL: https://issues.apache.org/jira/browse/HDFS-7208
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-7208-2.patch, HDFS-7208.patch
>
>
> We found the following problem: when a storage device on a DN fails, the NN 
> continues to believe the replicas on that storage are valid and doesn't 
> schedule replication.
> A DN has 12 storage disks. So there is one blockReport for each storage. When 
> a disk fails, # of blockReport from that DN is reduced from 12 to 11. Given 
> dfs.datanode.failed.volumes.tolerated is configured to be > 0, NN still 
> considers that DN healthy.
> 1. A disk failed. All blocks of that disk are removed from DN dataset.
>  
> {noformat}
> 2014-10-04 02:11:12,626 WARN 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removing 
> replica BP-1748500278-xx.xx.xx.xxx-1377803467793:1121568886 on failed volume 
> /data/disk6/dfs/current
> {noformat}
> 2. NN receives DatanodeProtocol.DISK_ERROR. But that isn't enough to have NN 
> remove the DN and the replicas from the BlocksMap. In addition, blockReport 
> doesn't provide the diff given that is done per storage.
> {noformat}
> 2014-10-04 02:11:12,681 WARN org.apache.hadoop.hdfs.server.namenode.NameNode: 
> Disk error on DatanodeRegistration(xx.xx.xx.xxx, 
> datanodeUuid=f3b8a30b-e715-40d6-8348-3c766f9ba9ab, infoPort=50075, 
> ipcPort=50020, 
> storageInfo=lv=-55;cid=CID-e3c38355-fde5-4e3a-b7ce-edacebdfa7a1;nsid=420527250;c=1410283484939):
>  DataNode failed volumes:/data/disk6/dfs/current
> {noformat}
> 3. Run fsck on the file and confirm the NN's BlocksMap still has that replica.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6606) Optimize HDFS Encrypted Transport performance

2014-10-14 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172032#comment-14172032
 ] 

Chris Nauroth commented on HDFS-6606:
-

Hello Yi, and all of the reviewers.  What work remains for this patch, and can 
I do anything to help?  I had been +1 way back on patch version 3, but then 
there were a few more rounds of feedback.  Has all of the feedback been 
addressed?  This is great work, and I'd love to see it go into 2.6.0.

> Optimize HDFS Encrypted Transport performance
> -
>
> Key: HDFS-6606
> URL: https://issues.apache.org/jira/browse/HDFS-6606
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, hdfs-client, security
>Reporter: Yi Liu
>Assignee: Yi Liu
> Attachments: HDFS-6606.001.patch, HDFS-6606.002.patch, 
> HDFS-6606.003.patch, HDFS-6606.004.patch, HDFS-6606.005.patch, 
> HDFS-6606.006.patch, HDFS-6606.007.patch, HDFS-6606.008.patch, 
> OptimizeHdfsEncryptedTransportperformance.pdf
>
>
> In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol, 
> which was great work.
> It utilizes the SASL {{Digest-MD5}} mechanism (with Qop: auth-conf) and 
> supports three security strengths:
> * high  3des   or rc4 (128bits)
> * medium des or rc4(56bits)
> * low   rc4(40bits)
> 3des and rc4 are slow, only *tens of MB/s*; see: 
> http://www.javamex.com/tutorials/cryptography/ciphers.shtml
> http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/
> I will provide more detailed performance data in the future. This is clearly 
> a bottleneck and will vastly affect end-to-end performance. 
> AES (Advanced Encryption Standard) is recommended as a replacement for DES 
> and is more secure; with AES-NI support, throughput can reach nearly *2GB/s*, 
> so it will no longer be the bottleneck. AES and CryptoCodec work is covered 
> in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (we may need to add new mode 
> support for AES). 
> This JIRA will use AES with AES-NI support as the encryption algorithm for 
> the DataTransferProtocol.
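
As a rough illustration of the cipher family being proposed here, the following 
is a plain JCE round-trip demo of AES in CTR mode (the mode AES-NI accelerates); 
it is not the HDFS-6606 patch, and the class name is made up:

{code}
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class AesCtrDemo {
  public static void main(String[] args) throws Exception {
    // Random 128-bit key and 128-bit counter block (CTR IV).
    byte[] key = new byte[16];
    byte[] iv = new byte[16];
    SecureRandom rng = new SecureRandom();
    rng.nextBytes(key);
    rng.nextBytes(iv);

    Cipher enc = Cipher.getInstance("AES/CTR/NoPadding");
    enc.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
        new IvParameterSpec(iv));
    Cipher dec = Cipher.getInstance("AES/CTR/NoPadding");
    dec.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
        new IvParameterSpec(iv));

    byte[] plain = "block data going over DataTransferProtocol".getBytes("UTF-8");
    byte[] roundTrip = dec.doFinal(enc.doFinal(plain));

    // Prints true: CTR with the same key/IV round-trips the data exactly.
    System.out.println(java.util.Arrays.equals(plain, roundTrip));
  }
}
{code}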



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7190) Bad use of Preconditions in startFileInternal()

2014-10-14 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7190:
-
   Resolution: Fixed
Fix Version/s: 2.7.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I've committed the patch to trunk and branch-2. Thanks [~dawson.choong] for the 
contribution.

> Bad use of Preconditions in startFileInternal()
> ---
>
> Key: HDFS-7190
> URL: https://issues.apache.org/jira/browse/HDFS-7190
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Konstantin Shvachko
>Assignee: Dawson Choong
>  Labels: newbie
> Fix For: 2.7.0
>
> Attachments: HDFS-7190.patch
>
>
> The following precondition is in the middle of startFileInternal()
> {code}
> feInfo = new FileEncryptionInfo(suite, version,
> 
> Preconditions.checkNotNull(feInfo);
> {code}
> Preconditions are recommended to be used at the beginning of a method.
> In this case the check is a no-op anyway, because the variable has just been 
> constructed.
> It should simply be removed.
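
A tiny, self-contained illustration of why that check can never fire (this uses 
Guava's Preconditions directly; the stand-in object is made up):

{code}
import com.google.common.base.Preconditions;

public class NoOpCheckDemo {
  public static void main(String[] args) {
    // A successful constructor call can never yield null, so checkNotNull on a
    // freshly constructed object either returns it unchanged or is never
    // reached (the constructor would have thrown first).
    StringBuilder justConstructed = new StringBuilder("feInfo stand-in");
    Preconditions.checkNotNull(justConstructed);
    System.out.println("checkNotNull on a just-constructed object never fires");
  }
}
{code}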



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7190) Bad use of Preconditions in startFileInternal()

2014-10-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172072#comment-14172072
 ] 

Hudson commented on HDFS-7190:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6263 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6263/])
HDFS-7190. Bad use of Preconditions in startFileInternal(). Contributed by 
Dawson Choong. (wheat9: rev 128ace10cdde4e966e30ac429c9a65ab8ace2d6c)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Bad use of Preconditions in startFileInternal()
> ---
>
> Key: HDFS-7190
> URL: https://issues.apache.org/jira/browse/HDFS-7190
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Konstantin Shvachko
>Assignee: Dawson Choong
>  Labels: newbie
> Fix For: 2.7.0
>
> Attachments: HDFS-7190.patch
>
>
> The following precondition is in the middle of startFileInternal()
> {code}
> feInfo = new FileEncryptionInfo(suite, version,
> 
> Preconditions.checkNotNull(feInfo);
> {code}
> Preconditions are recommended to be used at the beginning of a method.
> In this case the check is a no-op anyway, because the variable has just been 
> constructed.
> It should simply be removed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6606) Optimize HDFS Encrypted Transport performance

2014-10-14 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172073#comment-14172073
 ] 

Yi Liu commented on HDFS-6606:
--

Thanks Chris :)  All of the feedback has been addressed. ATM gave the latest 
feedback and said he would take another look after I updated the patch, but he 
has been busy recently (I talked with him offline). 
So let me ping him again and ask him to take another look in the next couple 
of days.

> Optimize HDFS Encrypted Transport performance
> -
>
> Key: HDFS-6606
> URL: https://issues.apache.org/jira/browse/HDFS-6606
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, hdfs-client, security
>Reporter: Yi Liu
>Assignee: Yi Liu
> Attachments: HDFS-6606.001.patch, HDFS-6606.002.patch, 
> HDFS-6606.003.patch, HDFS-6606.004.patch, HDFS-6606.005.patch, 
> HDFS-6606.006.patch, HDFS-6606.007.patch, HDFS-6606.008.patch, 
> OptimizeHdfsEncryptedTransportperformance.pdf
>
>
> In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol, 
> which was great work.
> It utilizes the SASL {{Digest-MD5}} mechanism (with Qop: auth-conf) and 
> supports three security strengths:
> * high  3des   or rc4 (128bits)
> * medium des or rc4(56bits)
> * low   rc4(40bits)
> 3des and rc4 are slow, only *tens of MB/s*; see: 
> http://www.javamex.com/tutorials/cryptography/ciphers.shtml
> http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/
> I will provide more detailed performance data in the future. This is clearly 
> a bottleneck and will vastly affect end-to-end performance. 
> AES (Advanced Encryption Standard) is recommended as a replacement for DES 
> and is more secure; with AES-NI support, throughput can reach nearly *2GB/s*, 
> so it will no longer be the bottleneck. AES and CryptoCodec work is covered 
> in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (we may need to add new mode 
> support for AES). 
> This JIRA will use AES with AES-NI support as the encryption algorithm for 
> the DataTransferProtocol.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)