[jira] [Updated] (HDFS-7585) TestEnhancedByteBufferAccess hard code the block size

2015-01-06 Thread sam liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sam liu updated HDFS-7585:
--
Attachment: HDFS-7585.001.patch

> TestEnhancedByteBufferAccess hard code the block size
> -
>
> Key: HDFS-7585
> URL: https://issues.apache.org/jira/browse/HDFS-7585
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.6.0
>Reporter: sam liu
>Assignee: sam liu
>Priority: Blocker
> Attachments: HDFS-7585.001.patch
>
>
> The test TestEnhancedByteBufferAccess hard-codes the block size, and it fails 
> with exceptions on POWER Linux.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7585) TestEnhancedByteBufferAccess hard code the block size

2015-01-06 Thread sam liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sam liu updated HDFS-7585:
--
Status: Patch Available  (was: Open)

The solution is to remove the hard-coded block size and use the native OS page 
size instead. That way, the test can pass on both the x86 and POWER platforms.
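
For illustration only, here is a minimal sketch (not the attached HDFS-7585 patch) 
of deriving a test block size from the native OS page size rather than hard-coding 
it; using {{sun.misc.Unsafe#pageSize()}} is just one assumed way to query the page 
size from Java.

{code}
// Minimal sketch, assuming sun.misc.Unsafe is available (an internal JDK API):
// derive the block size from the OS page size so it is a 4 KB multiple on x86
// but a 64 KB multiple on POWER kernels that use 64 KB pages.
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class PageSizedBlockSize {
  static int osPageSize() throws Exception {
    Field f = Unsafe.class.getDeclaredField("theUnsafe");
    f.setAccessible(true);
    return ((Unsafe) f.get(null)).pageSize();
  }

  public static void main(String[] args) throws Exception {
    int pageSize = osPageSize();
    long blockSize = 4L * pageSize;   // block size as a multiple of the page size
    System.out.println("page size = " + pageSize + ", test block size = " + blockSize);
  }
}
{code}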

> TestEnhancedByteBufferAccess hard code the block size
> -
>
> Key: HDFS-7585
> URL: https://issues.apache.org/jira/browse/HDFS-7585
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.6.0
>Reporter: sam liu
>Assignee: sam liu
>Priority: Blocker
> Attachments: HDFS-7585.001.patch
>
>
> The test TestEnhancedByteBufferAccess hard-codes the block size, and it fails 
> with exceptions on POWER Linux.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7585) TestEnhancedByteBufferAccess hard code the block size

2015-01-06 Thread sam liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sam liu updated HDFS-7585:
--
Status: Patch Available  (was: Open)

> TestEnhancedByteBufferAccess hard code the block size
> -
>
> Key: HDFS-7585
> URL: https://issues.apache.org/jira/browse/HDFS-7585
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.6.0
>Reporter: sam liu
>Assignee: sam liu
>Priority: Blocker
> Attachments: HDFS-7585.001.patch
>
>
> The test TestEnhancedByteBufferAccess hard-codes the block size, and it fails 
> with exceptions on POWER Linux.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7585) TestEnhancedByteBufferAccess hard code the block size

2015-01-06 Thread sam liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sam liu updated HDFS-7585:
--
Status: Open  (was: Patch Available)

> TestEnhancedByteBufferAccess hard code the block size
> -
>
> Key: HDFS-7585
> URL: https://issues.apache.org/jira/browse/HDFS-7585
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.6.0
>Reporter: sam liu
>Assignee: sam liu
>Priority: Blocker
> Attachments: HDFS-7585.001.patch
>
>
> The test TestEnhancedByteBufferAccess hard-codes the block size, and it fails 
> with exceptions on POWER Linux.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7564) NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map

2015-01-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265858#comment-14265858
 ] 

Hadoop QA commented on HDFS-7564:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12690277/HDFS-7564.003.patch
  against trunk revision 4cd66f7.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9146//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9146//console

This message is automatically generated.

> NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map
> 
>
> Key: HDFS-7564
> URL: https://issues.apache.org/jira/browse/HDFS-7564
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Affects Versions: 2.6.0
> Environment: HDP 2.2
>Reporter: Hari Sekhon
>Assignee: Yongjun Zhang
>Priority: Minor
> Attachments: HDFS-7564.001.patch, HDFS-7564.002.patch, 
> HDFS-7564.003.patch
>
>
> Add dynamic reload of the NFS gateway UID/GID mappings file /etc/nfs.map 
> (default for static.id.mapping.file).
> It seems that the mappings file is currently only read upon restart of the 
> NFS gateway which would cause any active clients NFS mount points to hang or 
> fail.
> Regards,
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon
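
As a rough illustration of the requested behaviour (an assumed approach, not the 
attached patches), the reload could be driven by the mapping file's modification 
time; class and method names below are made up for illustration.

{code}
// Hedged sketch: re-read /etc/nfs.map whenever its mtime changes instead of
// only at gateway startup.
import java.io.File;

class StaticMappingReloader {
  private final File mappingFile;
  private long lastLoadedMtime = -1L;

  StaticMappingReloader(String path) {
    this.mappingFile = new File(path);
  }

  /** Returns true if the file changed since the last load and was re-read. */
  synchronized boolean reloadIfChanged() {
    long mtime = mappingFile.lastModified();
    if (mtime != lastLoadedMtime) {
      // parse the UID/GID mappings here (omitted)
      lastLoadedMtime = mtime;
      return true;
    }
    return false;
  }
}
{code}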



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7575) NameNode not handling heartbeats properly after HDFS-2832

2015-01-06 Thread Lars Francke (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265903#comment-14265903
 ] 

Lars Francke commented on HDFS-7575:


I don't object at all, quite the opposite. Thanks for taking care of this.

> NameNode not handling heartbeats properly after HDFS-2832
> -
>
> Key: HDFS-7575
> URL: https://issues.apache.org/jira/browse/HDFS-7575
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.4.0, 2.5.0, 2.6.0
>Reporter: Lars Francke
>Assignee: Arpit Agarwal
>Priority: Critical
>
> Before HDFS-2832 each DataNode would have a unique storageId which included 
> its IP address. Since HDFS-2832 the DataNodes have a unique storageId per 
> storage directory which is just a random UUID.
> They send reports per storage directory in their heartbeats. This heartbeat 
> is processed on the NameNode in the 
> {{DatanodeDescriptor#updateHeartbeatState}} method. Pre HDFS-2832 this would 
> just store the information per Datanode. After the patch though each DataNode 
> can have multiple different storages so it's stored in a map keyed by the 
> storage Id.
> This works fine for all clusters that have been installed post HDFS-2832 as 
> they get a UUID for their storage Id. So a DN with 8 drives has a map with 8 
> different keys. On each Heartbeat the Map is searched and updated 
> ({{DatanodeStorageInfo storage = storageMap.get(s.getStorageID());}}):
> {code:title=DatanodeStorageInfo}
>   void updateState(StorageReport r) {
> capacity = r.getCapacity();
> dfsUsed = r.getDfsUsed();
> remaining = r.getRemaining();
> blockPoolUsed = r.getBlockPoolUsed();
>   }
> {code}
> On clusters that were upgraded from a pre-HDFS-2832 version, though, the 
> storage Id has not been rewritten (at least not on the four clusters I 
> checked), so each directory will have the exact same storageId. That means 
> there'll be only a single entry in the {{storageMap}} and it'll be 
> overwritten by a random {{StorageReport}} from the DataNode. This can be seen 
> in the {{updateState}} method above: it just assigns the capacity from the 
> received report, whereas it should probably sum it up per received heartbeat.
> The Balancer seems to be one of the only things that actually uses this 
> information so it now considers the utilization of a random drive per 
> DataNode for balancing purposes.
> Things get even worse when a drive has been added or replaced, as this will 
> now get a new storage Id, so there'll be two entries in the storageMap. As new 
> drives are usually empty, this skews the balancer's decision in a way that this 
> node will never be considered over-utilized.
> Another problem is that old StorageReports are never removed from the 
> storageMap. So if I replace a drive and it gets a new storage Id the old one 
> will still be in place and used for all calculations by the Balancer until a 
> restart of the NameNode.
> I can try providing a patch that does the following:
> * Instead of using a Map I could just store the array we receive or instead 
> of storing an array sum up the values for reports with the same Id
> * On each heartbeat clear the map (so we know we have up to date information)
> Does that sound sensible?
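
To make the proposed aggregation concrete, here is a hedged sketch of summing 
reports that share a storage Id; the types below are simplified stand-ins, not 
the real {{DatanodeStorageInfo}} or {{StorageReport}} classes.

{code}
// Sketch of the "sum instead of overwrite" idea from the description above.
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ReportLite {
  final String storageId;
  final long capacity, dfsUsed, remaining;
  ReportLite(String storageId, long capacity, long dfsUsed, long remaining) {
    this.storageId = storageId;
    this.capacity = capacity;
    this.dfsUsed = dfsUsed;
    this.remaining = remaining;
  }
}

class HeartbeatAggregator {
  // Rebuilt on every heartbeat, so stale storages disappear automatically.
  Map<String, long[]> aggregate(List<ReportLite> reports) {
    Map<String, long[]> byId = new HashMap<>();
    for (ReportLite r : reports) {
      long[] totals = byId.computeIfAbsent(r.storageId, k -> new long[3]);
      totals[0] += r.capacity;   // summed, not overwritten
      totals[1] += r.dfsUsed;
      totals[2] += r.remaining;
    }
    return byId;
  }
}
{code}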



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7585) TestEnhancedByteBufferAccess hard code the block size

2015-01-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265963#comment-14265963
 ] 

Hadoop QA commented on HDFS-7585:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12690280/HDFS-7585.001.patch
  against trunk revision 4cd66f7.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9147//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9147//console

This message is automatically generated.

> TestEnhancedByteBufferAccess hard code the block size
> -
>
> Key: HDFS-7585
> URL: https://issues.apache.org/jira/browse/HDFS-7585
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.6.0
>Reporter: sam liu
>Assignee: sam liu
>Priority: Blocker
> Attachments: HDFS-7585.001.patch
>
>
> The test TestEnhancedByteBufferAccess hard-codes the block size, and it fails 
> with exceptions on POWER Linux.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7583) Fix findbug in TransferFsImage.java

2015-01-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265979#comment-14265979
 ] 

Hudson commented on HDFS-7583:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #799 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/799/])
HDFS-7583. Fix findbug in TransferFsImage.java (Contributed by Vinayakumar B) 
(vinayakumarb: rev 4cd66f7fb280e53e2d398a62e922a8d68d150679)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/TransferFsImage.java


> Fix findbug in TransferFsImage.java
> ---
>
> Key: HDFS-7583
> URL: https://issues.apache.org/jira/browse/HDFS-7583
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: HDFS-7583-001.patch, HDFS-7583-002.patch
>
>
> Fix following findbug resulting in recent jenkins runs
> {noformat}Exceptional return value of java.io.File.delete() ignored in 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage.deleteTmpFiles(List)
> Bug type RV_RETURN_VALUE_IGNORED_BAD_PRACTICE (click for details) 
> In class org.apache.hadoop.hdfs.server.namenode.TransferFsImage
> In method 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage.deleteTmpFiles(List)
> Called method java.io.File.delete()
> At TransferFsImage.java:[line 577]{noformat}
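
For reference, the kind of change this warning calls for looks roughly like the 
sketch below (an illustration, not the attached patch): check the return value 
of {{File.delete()}} and report failures instead of ignoring them.

{code}
// Hedged sketch: handle File.delete()'s boolean result.
import java.io.File;
import java.util.List;

class TmpFileCleaner {
  static void deleteTmpFiles(List<File> files) {
    for (File f : files) {
      if (!f.delete()) {                 // return value no longer ignored
        System.err.println("WARN: could not delete temporary file " + f);
      }
    }
  }
}
{code}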



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7572) TestLazyPersistFiles#testDnRestartWithSavedReplicas is flaky on Windows

2015-01-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265985#comment-14265985
 ] 

Hudson commented on HDFS-7572:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #799 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/799/])
HDFS-7572. TestLazyPersistFiles#testDnRestartWithSavedReplicas is flaky on 
Windows. Contributed by Arpit Agarwal. (cnauroth: rev 
dfd2589bcb0e83f073eab30e32badcf2e9f75a62)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestLazyPersistFiles.java


> TestLazyPersistFiles#testDnRestartWithSavedReplicas is flaky on Windows
> ---
>
> Key: HDFS-7572
> URL: https://issues.apache.org/jira/browse/HDFS-7572
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.6.0
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: 2.7.0
>
> Attachments: HDFS-7572.001.patch
>
>
> *Error Message*
> Expected: is 
>  but: was 
> *Stacktrace*
> java.lang.AssertionError: 
> Expected: is 
>  but: was 
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>   at org.junit.Assert.assertThat(Assert.java:865)
>   at org.junit.Assert.assertThat(Assert.java:832)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.LazyPersistTestCase.ensureFileReplicasOnStorageType(LazyPersistTestCase.java:129)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles.testDnRestartWithSavedReplicas(TestLazyPersistFiles.java:668)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7583) Fix findbug in TransferFsImage.java

2015-01-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265994#comment-14265994
 ] 

Hudson commented on HDFS-7583:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #65 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/65/])
HDFS-7583. Fix findbug in TransferFsImage.java (Contributed by Vinayakumar B) 
(vinayakumarb: rev 4cd66f7fb280e53e2d398a62e922a8d68d150679)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/TransferFsImage.java


> Fix findbug in TransferFsImage.java
> ---
>
> Key: HDFS-7583
> URL: https://issues.apache.org/jira/browse/HDFS-7583
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: HDFS-7583-001.patch, HDFS-7583-002.patch
>
>
> Fix following findbug resulting in recent jenkins runs
> {noformat}Exceptional return value of java.io.File.delete() ignored in 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage.deleteTmpFiles(List)
> Bug type RV_RETURN_VALUE_IGNORED_BAD_PRACTICE (click for details) 
> In class org.apache.hadoop.hdfs.server.namenode.TransferFsImage
> In method 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage.deleteTmpFiles(List)
> Called method java.io.File.delete()
> At TransferFsImage.java:[line 577]{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7572) TestLazyPersistFiles#testDnRestartWithSavedReplicas is flaky on Windows

2015-01-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266000#comment-14266000
 ] 

Hudson commented on HDFS-7572:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #65 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/65/])
HDFS-7572. TestLazyPersistFiles#testDnRestartWithSavedReplicas is flaky on 
Windows. Contributed by Arpit Agarwal. (cnauroth: rev 
dfd2589bcb0e83f073eab30e32badcf2e9f75a62)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestLazyPersistFiles.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> TestLazyPersistFiles#testDnRestartWithSavedReplicas is flaky on Windows
> ---
>
> Key: HDFS-7572
> URL: https://issues.apache.org/jira/browse/HDFS-7572
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.6.0
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: 2.7.0
>
> Attachments: HDFS-7572.001.patch
>
>
> *Error Message*
> Expected: is 
>  but: was 
> *Stacktrace*
> java.lang.AssertionError: 
> Expected: is 
>  but: was 
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>   at org.junit.Assert.assertThat(Assert.java:865)
>   at org.junit.Assert.assertThat(Assert.java:832)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.LazyPersistTestCase.ensureFileReplicasOnStorageType(LazyPersistTestCase.java:129)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles.testDnRestartWithSavedReplicas(TestLazyPersistFiles.java:668)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7480) Namenodes loops on 'block does not belong to any file' after deleting many files

2015-01-06 Thread Frode Halvorsen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266043#comment-14266043
 ] 

Frode Halvorsen commented on HDFS-7480:
---

2.6.1 is not out yet, but one thought: this fix might resolve the issue when 
namenodes are started with a lot of incoming information about 'loose' 
data-blocks, but it probably won't resolve the issue that causes the namenodes 
to be killed by zookeeper when I delete a lot of files.
At the delete moment, I don't think the logging is that problematic.
The logging issue, I believe, is secondary. I believe that the active namenode 
gets busy calculating/distributing delete orders to datanodes when I drop 
500,000 files at once, and that this is the cause of the zookeeper shutdown. 
When the namenode gets overloaded with calculating/distributing those 
delete orders, it doesn't keep up with responses to zookeeper, which then kills 
the namenode in order to fail over to NN2.

> Namenodes loops on 'block does not belong to any file' after deleting many 
> files
> 
>
> Key: HDFS-7480
> URL: https://issues.apache.org/jira/browse/HDFS-7480
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.5.0
> Environment: CentOS - HDFS-HA (journal), zookeeper
>Reporter: Frode Halvorsen
>
> A small cluster has 8 servers with 32 GB RAM.
> Two are namenodes (HA-configured), six are datanodes (8x3 TB disks configured 
> with RAID as one 21 TB drive).
> The cluster receives on average 400,000 small files each day. I started 
> archiving (HAR) each day as separate archives. After deleting the original 
> files for one month, the namenodes started acting up really badly.
> When restarting those, both active and passive nodes seem to work OK for some 
> time, but then start to report a lot of blocks belonging to no files, and 
> the namenode just spins those messages in a massive loop. If the passive 
> node is first, it also influences the active node in such a way that it's no 
> longer possible to archive new files. If the active node also starts in this 
> loop, it suddenly dies without any error message.
> The only way I'm able to get rid of the problem is to start decommissioning 
> nodes, watching the cluster closely to avoid downtime, and make sure every 
> datanode gets a 'clean' start. After all datanodes have been decommissioned 
> (in turns) and restarted with clean disks, the problem is gone. But if I then 
> delete a lot of files in a short time, the problem starts again...
> The main problem (I think) is that the receiving and reporting of those 
> blocks takes so many resources that the namenodes are too busy to tell the 
> datanodes to delete those blocks.
> If the active namenode starts on the loop, it does the 'right' thing by 
> telling the datanode to invalidate the block, but the amount of blocks is so 
> massive that the namenode doesn't do anything else. Just now, I have about 
> 1200-1400 log entries per second in the passive node.
> Update:
> Just got the active namenode in the loop - it logs 1000 lines per second: 
> 500 'BlockStateChange: BLOCK* processReport: blk_1080796332_7056241 on 
> x.x.x.x:50010 size 1742 does not belong to any file'
> and 
> 500 ' BlockStateChange: BLOCK* InvalidateBlocks: add blk_1080796332_7056241 
> to x.x.x.x:50010'



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7586) HFTP does not work when namenode bind on wildcard

2015-01-06 Thread Benoit Perroud (JIRA)
Benoit Perroud created HDFS-7586:


 Summary: HFTP does not work when namenode bind on wildcard
 Key: HDFS-7586
 URL: https://issues.apache.org/jira/browse/HDFS-7586
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0, 2.5.0, 2.4.0, 2.3.0, 2.2.0
Reporter: Benoit Perroud
Priority: Minor


When wildcard binding for NameNode RPC is turned on (i.e.  
dfs.namenode.rpc-address=0.0.0.0:8020), HFTP download is failing.

Call to http://namenode:50070/data/.. returns the header Location with 
parameter nnaddr=0.0.0.0:8020, which is unlikely to ever succeed :)

The idea would be, if wildcard binding is enabled, to read the IP address the 
request actually arrived on from the HttpServletRequest and return that one 
instead.

WDYT?
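
A minimal sketch of that idea, assuming the standard Servlet API (this is not 
the attached patch; class and method names are illustrative):

{code}
// When the configured RPC address is the wildcard, fall back to the local
// address the HTTP request actually arrived on.
import javax.servlet.http.HttpServletRequest;

public class NnAddrResolver {
  static String resolveNnAddr(String configuredRpcAddr, HttpServletRequest request) {
    if (configuredRpcAddr.startsWith("0.0.0.0")) {
      String port = configuredRpcAddr.substring(configuredRpcAddr.indexOf(':') + 1);
      // getLocalAddr() returns the IP of the interface that received this request.
      return request.getLocalAddr() + ":" + port;
    }
    return configuredRpcAddr;
  }
}
{code}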

How to reproduce:

1. Turn on wildcard binding
{code}dfs.namenode.rpc-address=0.0.0.0:8020{code}

2. Upload a file
{code}$ echo "123" | hdfs dfs -put - /tmp/randomFile.txt{code}

3. Validate it's failing
{code}
$ hdfs dfs -cat hftp://namenode1/tmp/randomFile.txt
{code}

4. Get more details via curl
{code}
$ curl -vv http://namenode1:50070/data/tmp/randomFile.txt?ugi=hdfs | grep 
"Location:"
 Location: 
http://datanode003:50075/streamFile/tmp/randomFile.txt?ugi=hdfs&nnaddr=0.0.0.0:8020
{code}

We can clearly see the 0.0.0.0 returned as the NN ip.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7586) HFTP does not work when namenode bind on wildcard

2015-01-06 Thread Benoit Perroud (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoit Perroud updated HDFS-7586:
-
Attachment: HDFS-7586-v0.1.txt

Draft patch. The idea is to read the address from the HttpServletRequest when 
the namenode URL is 0.0.0.0.

As the MiniDFSCluster is hard-coded to bind to 127.0.0.1, it's not completely 
trivial to test.

> HFTP does not work when namenode bind on wildcard
> -
>
> Key: HDFS-7586
> URL: https://issues.apache.org/jira/browse/HDFS-7586
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.2.0, 2.3.0, 2.4.0, 2.5.0, 2.6.0
>Reporter: Benoit Perroud
>Priority: Minor
> Attachments: HDFS-7586-v0.1.txt
>
>
> When wildcard binding for NameNode RPC is turned on (i.e.  
> dfs.namenode.rpc-address=0.0.0.0:8020), HFTP download is failing.
> Call to http://namenode:50070/data/.. returns the header Location with 
> parameter nnaddr=0.0.0.0:8020, which is unlikely to ever succeed :)
> The idea would be, if wildcard binding is enabled, to read the IP address the 
> request actually arrived on from the HttpServletRequest and return that one 
> instead.
> WDYT?
> How to reproduce:
> 1. Turn on wildcard binding
> {code}dfs.namenode.rpc-address=0.0.0.0:8020{code}
> 2. Upload a file
> {code}$ echo "123" | hdfs dfs -put - /tmp/randomFile.txt{code}
> 3. Validate it's failing
> {code}
> $ hdfs dfs -cat hftp://namenode1/tmp/randomFile.txt
> {code}
> 4. Get more details via curl
> {code}
> $ curl -vv http://namenode1:50070/data/tmp/randomFile.txt?ugi=hdfs | grep 
> "Location:"
>  Location: 
> http://datanode003:50075/streamFile/tmp/randomFile.txt?ugi=hdfs&nnaddr=0.0.0.0:8020
> {code}
> We can clearly see the 0.0.0.0 returned as the NN ip.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7583) Fix findbug in TransferFsImage.java

2015-01-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266145#comment-14266145
 ] 

Hudson commented on HDFS-7583:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1997 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1997/])
HDFS-7583. Fix findbug in TransferFsImage.java (Contributed by Vinayakumar B) 
(vinayakumarb: rev 4cd66f7fb280e53e2d398a62e922a8d68d150679)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/TransferFsImage.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Fix findbug in TransferFsImage.java
> ---
>
> Key: HDFS-7583
> URL: https://issues.apache.org/jira/browse/HDFS-7583
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: HDFS-7583-001.patch, HDFS-7583-002.patch
>
>
> Fix following findbug resulting in recent jenkins runs
> {noformat}Exceptional return value of java.io.File.delete() ignored in 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage.deleteTmpFiles(List)
> Bug type RV_RETURN_VALUE_IGNORED_BAD_PRACTICE (click for details) 
> In class org.apache.hadoop.hdfs.server.namenode.TransferFsImage
> In method 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage.deleteTmpFiles(List)
> Called method java.io.File.delete()
> At TransferFsImage.java:[line 577]{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7572) TestLazyPersistFiles#testDnRestartWithSavedReplicas is flaky on Windows

2015-01-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266151#comment-14266151
 ] 

Hudson commented on HDFS-7572:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1997 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1997/])
HDFS-7572. TestLazyPersistFiles#testDnRestartWithSavedReplicas is flaky on 
Windows. Contributed by Arpit Agarwal. (cnauroth: rev 
dfd2589bcb0e83f073eab30e32badcf2e9f75a62)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestLazyPersistFiles.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> TestLazyPersistFiles#testDnRestartWithSavedReplicas is flaky on Windows
> ---
>
> Key: HDFS-7572
> URL: https://issues.apache.org/jira/browse/HDFS-7572
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.6.0
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: 2.7.0
>
> Attachments: HDFS-7572.001.patch
>
>
> *Error Message*
> Expected: is 
>  but: was 
> *Stacktrace*
> java.lang.AssertionError: 
> Expected: is 
>  but: was 
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>   at org.junit.Assert.assertThat(Assert.java:865)
>   at org.junit.Assert.assertThat(Assert.java:832)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.LazyPersistTestCase.ensureFileReplicasOnStorageType(LazyPersistTestCase.java:129)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles.testDnRestartWithSavedReplicas(TestLazyPersistFiles.java:668)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7501) TransactionsSinceLastCheckpoint can be negative on SBNs

2015-01-06 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266158#comment-14266158
 ] 

Daryn Sharp commented on HDFS-7501:
---

I don't agree with returning a hardcoded 0 on the standby.  I'd like to see the 
correct metric returned on both active and standby.

> TransactionsSinceLastCheckpoint can be negative on SBNs
> ---
>
> Key: HDFS-7501
> URL: https://issues.apache.org/jira/browse/HDFS-7501
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.0
>Reporter: Harsh J
>Assignee: Gautam Gopalakrishnan
>Priority: Trivial
> Attachments: HDFS-7501-2.patch, HDFS-7501.patch
>
>
> The metric TransactionsSinceLastCheckpoint is derived as FSEditLog.txid minus 
> NNStorage.mostRecentCheckpointTxId.
> In Standby mode, the former does not increment beyond the loaded or 
> last-when-active value, but the latter does change due to checkpoints done 
> regularly in this mode. Thereby, the SBN will eventually end up showing 
> negative values for TransactionsSinceLastCheckpoint.
> This is not an issue as the metric only makes sense to be monitored on the 
> Active NameNode, but we should perhaps just show the value 0 by detecting if 
> the NN is in SBN form, as allowing a negative number is confusing to view 
> within a chart that tracks it.
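
A tiny worked example of the arithmetic described above, with assumed values:

{code}
// On a standby NameNode the last written txid stays frozen while checkpoints
// keep advancing mostRecentCheckpointTxId, so the difference goes negative.
public class TxnsSinceCheckpoint {
  public static void main(String[] args) {
    long lastWrittenTxId = 10000L;           // frozen at the loaded value on the SBN
    long mostRecentCheckpointTxId = 10500L;  // advanced by a later checkpoint
    long txnsSinceLastCheckpoint = lastWrittenTxId - mostRecentCheckpointTxId;
    System.out.println(txnsSinceLastCheckpoint);  // prints -500
  }
}
{code}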



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7583) Fix findbug in TransferFsImage.java

2015-01-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266168#comment-14266168
 ] 

Hudson commented on HDFS-7583:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #62 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/62/])
HDFS-7583. Fix findbug in TransferFsImage.java (Contributed by Vinayakumar B) 
(vinayakumarb: rev 4cd66f7fb280e53e2d398a62e922a8d68d150679)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/TransferFsImage.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Fix findbug in TransferFsImage.java
> ---
>
> Key: HDFS-7583
> URL: https://issues.apache.org/jira/browse/HDFS-7583
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: HDFS-7583-001.patch, HDFS-7583-002.patch
>
>
> Fix following findbug resulting in recent jenkins runs
> {noformat}Exceptional return value of java.io.File.delete() ignored in 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage.deleteTmpFiles(List)
> Bug type RV_RETURN_VALUE_IGNORED_BAD_PRACTICE (click for details) 
> In class org.apache.hadoop.hdfs.server.namenode.TransferFsImage
> In method 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage.deleteTmpFiles(List)
> Called method java.io.File.delete()
> At TransferFsImage.java:[line 577]{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7572) TestLazyPersistFiles#testDnRestartWithSavedReplicas is flaky on Windows

2015-01-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266174#comment-14266174
 ] 

Hudson commented on HDFS-7572:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #62 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/62/])
HDFS-7572. TestLazyPersistFiles#testDnRestartWithSavedReplicas is flaky on 
Windows. Contributed by Arpit Agarwal. (cnauroth: rev 
dfd2589bcb0e83f073eab30e32badcf2e9f75a62)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestLazyPersistFiles.java


> TestLazyPersistFiles#testDnRestartWithSavedReplicas is flaky on Windows
> ---
>
> Key: HDFS-7572
> URL: https://issues.apache.org/jira/browse/HDFS-7572
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.6.0
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: 2.7.0
>
> Attachments: HDFS-7572.001.patch
>
>
> *Error Message*
> Expected: is 
>  but: was 
> *Stacktrace*
> java.lang.AssertionError: 
> Expected: is 
>  but: was 
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>   at org.junit.Assert.assertThat(Assert.java:865)
>   at org.junit.Assert.assertThat(Assert.java:832)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.LazyPersistTestCase.ensureFileReplicasOnStorageType(LazyPersistTestCase.java:129)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles.testDnRestartWithSavedReplicas(TestLazyPersistFiles.java:668)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7572) TestLazyPersistFiles#testDnRestartWithSavedReplicas is flaky on Windows

2015-01-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266215#comment-14266215
 ] 

Hudson commented on HDFS-7572:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #66 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/66/])
HDFS-7572. TestLazyPersistFiles#testDnRestartWithSavedReplicas is flaky on 
Windows. Contributed by Arpit Agarwal. (cnauroth: rev 
dfd2589bcb0e83f073eab30e32badcf2e9f75a62)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestLazyPersistFiles.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> TestLazyPersistFiles#testDnRestartWithSavedReplicas is flaky on Windows
> ---
>
> Key: HDFS-7572
> URL: https://issues.apache.org/jira/browse/HDFS-7572
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.6.0
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: 2.7.0
>
> Attachments: HDFS-7572.001.patch
>
>
> *Error Message*
> Expected: is 
>  but: was 
> *Stacktrace*
> java.lang.AssertionError: 
> Expected: is 
>  but: was 
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>   at org.junit.Assert.assertThat(Assert.java:865)
>   at org.junit.Assert.assertThat(Assert.java:832)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.LazyPersistTestCase.ensureFileReplicasOnStorageType(LazyPersistTestCase.java:129)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles.testDnRestartWithSavedReplicas(TestLazyPersistFiles.java:668)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7583) Fix findbug in TransferFsImage.java

2015-01-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266209#comment-14266209
 ] 

Hudson commented on HDFS-7583:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #66 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/66/])
HDFS-7583. Fix findbug in TransferFsImage.java (Contributed by Vinayakumar B) 
(vinayakumarb: rev 4cd66f7fb280e53e2d398a62e922a8d68d150679)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/TransferFsImage.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Fix findbug in TransferFsImage.java
> ---
>
> Key: HDFS-7583
> URL: https://issues.apache.org/jira/browse/HDFS-7583
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: HDFS-7583-001.patch, HDFS-7583-002.patch
>
>
> Fix following findbug resulting in recent jenkins runs
> {noformat}Exceptional return value of java.io.File.delete() ignored in 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage.deleteTmpFiles(List)
> Bug type RV_RETURN_VALUE_IGNORED_BAD_PRACTICE (click for details) 
> In class org.apache.hadoop.hdfs.server.namenode.TransferFsImage
> In method 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage.deleteTmpFiles(List)
> Called method java.io.File.delete()
> At TransferFsImage.java:[line 577]{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7574) Make cmake work in Windows Visual Studio 2010

2015-01-06 Thread Thanh Do (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266247#comment-14266247
 ] 

Thanh Do commented on HDFS-7574:


Hi [~cmccabe]. On Windows, the existing test (in 
{{CMakeTestCompileStrerror.cpp}}) won't work because {{strerror}} has a different 
signature. Specifically, Windows does not have {{strerror_r(errorno, buf, 
len)}}. The equivalent is {{strerror_s(buf, len, errorno)}}, with a different 
parameter order. This makes the test fail, and {{STRERROR_R_RETURN_INT}} always 
equals {{NO}}.

A cleaner fix may be to put a few lines in {{CMakeTestCompileStrerror}}:
{code}
#ifdef _WIN32
#define strerror_r(errnum, buf, buflen) strerror_s((buf), (buflen), (errnum))
#endif
{code}

Thoughts?



> Make cmake work in Windows Visual Studio 2010
> -
>
> Key: HDFS-7574
> URL: https://issues.apache.org/jira/browse/HDFS-7574
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
> Environment: Windows Visual Studio 2010
>Reporter: Thanh Do
>Assignee: Thanh Do
> Attachments: HDFS-7574-branch-HDFS-6994-1.patch
>
>
> Cmake should be able to generate a solution file in Windows Visual Studio 
> 2010. This is the first step in a series of steps making libhdfs3 built 
> successfully in Windows. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7583) Fix findbug in TransferFsImage.java

2015-01-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266252#comment-14266252
 ] 

Hudson commented on HDFS-7583:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2016 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2016/])
HDFS-7583. Fix findbug in TransferFsImage.java (Contributed by Vinayakumar B) 
(vinayakumarb: rev 4cd66f7fb280e53e2d398a62e922a8d68d150679)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/TransferFsImage.java


> Fix findbug in TransferFsImage.java
> ---
>
> Key: HDFS-7583
> URL: https://issues.apache.org/jira/browse/HDFS-7583
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: HDFS-7583-001.patch, HDFS-7583-002.patch
>
>
> Fix following findbug resulting in recent jenkins runs
> {noformat}Exceptional return value of java.io.File.delete() ignored in 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage.deleteTmpFiles(List)
> Bug type RV_RETURN_VALUE_IGNORED_BAD_PRACTICE (click for details) 
> In class org.apache.hadoop.hdfs.server.namenode.TransferFsImage
> In method 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage.deleteTmpFiles(List)
> Called method java.io.File.delete()
> At TransferFsImage.java:[line 577]{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7572) TestLazyPersistFiles#testDnRestartWithSavedReplicas is flaky on Windows

2015-01-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266258#comment-14266258
 ] 

Hudson commented on HDFS-7572:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2016 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2016/])
HDFS-7572. TestLazyPersistFiles#testDnRestartWithSavedReplicas is flaky on 
Windows. Contributed by Arpit Agarwal. (cnauroth: rev 
dfd2589bcb0e83f073eab30e32badcf2e9f75a62)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestLazyPersistFiles.java


> TestLazyPersistFiles#testDnRestartWithSavedReplicas is flaky on Windows
> ---
>
> Key: HDFS-7572
> URL: https://issues.apache.org/jira/browse/HDFS-7572
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.6.0
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: 2.7.0
>
> Attachments: HDFS-7572.001.patch
>
>
> *Error Message*
> Expected: is 
>  but: was 
> *Stacktrace*
> java.lang.AssertionError: 
> Expected: is 
>  but: was 
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>   at org.junit.Assert.assertThat(Assert.java:865)
>   at org.junit.Assert.assertThat(Assert.java:832)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.LazyPersistTestCase.ensureFileReplicasOnStorageType(LazyPersistTestCase.java:129)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles.testDnRestartWithSavedReplicas(TestLazyPersistFiles.java:668)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7586) HFTP does not work when namenode bind on wildcard

2015-01-06 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266315#comment-14266315
 ] 

Daryn Sharp commented on HDFS-7586:
---

I think this is due to a misconfig.  The rpc-address key should be an actual 
ip/host:port.  There is a rpc-bind-host key that should be set to 0.0.0.0 for 
multihoming.  For more details, see:

http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html
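
For example, a multihomed setup along the lines described above might look like 
this (illustrative host name and values; see the linked document for the 
authoritative keys):

{code}
dfs.namenode.rpc-address=nn1.example.com:8020
dfs.namenode.rpc-bind-host=0.0.0.0
{code}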



> HFTP does not work when namenode bind on wildcard
> -
>
> Key: HDFS-7586
> URL: https://issues.apache.org/jira/browse/HDFS-7586
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.2.0, 2.3.0, 2.4.0, 2.5.0, 2.6.0
>Reporter: Benoit Perroud
>Priority: Minor
> Attachments: HDFS-7586-v0.1.txt
>
>
> When wildcard binding for NameNode RPC is turned on (i.e.  
> dfs.namenode.rpc-address=0.0.0.0:8020), HFTP download is failing.
> Call to http://namenode:50070/data/.. returns the header Location with 
> parameter nnaddr=0.0.0.0:8020, which is unlikely to ever succeed :)
> The idea would be, if wildcard binding is enabled, to read from the 
> HttpServletRequest the IP address the request is actually connected to, and 
> return that one.
> WDYT?
> How to reproduce:
> 1. Turn on wildcard binding
> {code}dfs.namenode.rpc-address=0.0.0.0:8020{code}
> 2. Upload a file
> {code}$ echo "123" | hdfs dfs -put - /tmp/randomFile.txt{code}
> 3. Validate it's failing
> {code}
> $ hdfs dfs -cat hftp://namenode1/tmp/randomFile.txt
> {code}
> 4. Get more details via curl
> {code}
> $ curl -vv http://namenode1:50070/data/tmp/randomFile.txt?ugi=hdfs | grep 
> "Location:"
>  Location: 
> http://datanode003:50075/streamFile/tmp/randomFile.txt?ugi=hdfs&nnaddr=0.0.0.0:8020
> {code}
> We can clearly see the 0.0.0.0 returned as the NN ip.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7579) Improve log reporting during block report rpc failure

2015-01-06 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-7579:
---
Attachment: HDFS-7579.000.patch

The attached patch modifies BPServiceActor so that even if one or more of the 
block report rpcs fails, a LOG.info message will still be displayed. This will 
help diagnose cases where the RPC throws an exception. This patch also adds a 
toString() to ServerCommand so that the LOG.info message displays something 
reasonable for the commands that it received back rather than just 
Object.toString().
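
To illustrate the general pattern (a toy sketch only, not the attached patch; 
all class and method names here are made up), the log call moves into a finally 
block so that it fires even when the RPC throws:

{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ReportLoggingSketch {
  private static final Logger LOG = LoggerFactory.getLogger(ReportLoggingSketch.class);

  // Stand-in for the block report RPC; fails when the payload is too large.
  static String sendRpc(int numRecords) {
    if (numRecords > 1000) {
      throw new IllegalStateException("RPC payload exceeded the maximum length");
    }
    return "ACK";
  }

  // Logs the outcome even if the RPC throws, which is the point of the change.
  static void sendReport(int numRecords) {
    long startNanos = System.nanoTime();
    boolean success = false;
    String reply = null;
    try {
      reply = sendRpc(numRecords);
      success = true;
    } finally {
      LOG.info("Sent report of {} records in {} ms, success={}, reply={}",
          numRecords, (System.nanoTime() - startNanos) / 1_000_000, success, reply);
    }
  }

  public static void main(String[] args) {
    try {
      sendReport(5000);   // too large: the RPC throws...
    } catch (IllegalStateException e) {
      LOG.warn("report failed: {}", e.getMessage());   // ...but the info line above was still emitted
    }
  }
}
{code}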


> Improve log reporting during block report rpc failure
> -
>
> Key: HDFS-7579
> URL: https://issues.apache.org/jira/browse/HDFS-7579
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
>Priority: Minor
> Attachments: HDFS-7579.000.patch
>
>
> During block reporting, if the block report RPC fails, for example because it 
> exceeded the max rpc len, we should still produce some sort of LOG.info 
> output to help with debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7579) Improve log reporting during block report rpc failure

2015-01-06 Thread Charles Lamb (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266331#comment-14266331
 ] 

Charles Lamb commented on HDFS-7579:


Also, since this is just a remodularization of the LOG.info call, I did not add 
a unit test.


> Improve log reporting during block report rpc failure
> -
>
> Key: HDFS-7579
> URL: https://issues.apache.org/jira/browse/HDFS-7579
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
>Priority: Minor
> Attachments: HDFS-7579.000.patch
>
>
> During block reporting, if the block report RPC fails, for example because it 
> exceeded the max rpc len, we should still produce some sort of LOG.info 
> output to help with debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7579) Improve log reporting during block report rpc failure

2015-01-06 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-7579:
---
Status: Patch Available  (was: Open)

> Improve log reporting during block report rpc failure
> -
>
> Key: HDFS-7579
> URL: https://issues.apache.org/jira/browse/HDFS-7579
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
>Priority: Minor
>  Labels: supportability
> Attachments: HDFS-7579.000.patch
>
>
> During block reporting, if the block report RPC fails, for example because it 
> exceeded the max rpc len, we should still produce some sort of LOG.info 
> output to help with debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7579) Improve log reporting during block report rpc failure

2015-01-06 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-7579:
---
Labels: supportability  (was: )

> Improve log reporting during block report rpc failure
> -
>
> Key: HDFS-7579
> URL: https://issues.apache.org/jira/browse/HDFS-7579
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
>Priority: Minor
>  Labels: supportability
> Attachments: HDFS-7579.000.patch
>
>
> During block reporting, if the block report RPC fails, for example because it 
> exceeded the max rpc len, we should still produce some sort of LOG.info 
> output to help with debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7586) HFTP does not work when namenode bind on wildcard

2015-01-06 Thread Benoit Perroud (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoit Perroud updated HDFS-7586:
-
Affects Version/s: (was: 2.6.0)
   (was: 2.5.0)

> HFTP does not work when namenode bind on wildcard
> -
>
> Key: HDFS-7586
> URL: https://issues.apache.org/jira/browse/HDFS-7586
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.2.0, 2.3.0, 2.4.0
>Reporter: Benoit Perroud
>Priority: Minor
> Attachments: HDFS-7586-v0.1.txt
>
>
> When wildcard binding for NameNode RPC is turned on (i.e.  
> dfs.namenode.rpc-address=0.0.0.0:8020), HFTP download is failing.
> Call to http://namenode:50070/data/.. returns the header Location with 
> parameter nnaddr=0.0.0.0:8020, which is unlikely to ever succeed :)
> The idea would be, if wildcard binding is enabled, to read from the 
> HttpServletRequest the IP address the request is actually connected to, and 
> return that one.
> WDYT?
> How to reproduce:
> 1. Turn on wildcard binding
> {code}dfs.namenode.rpc-address=0.0.0.0:8020{code}
> 2. Upload a file
> {code}$ echo "123" | hdfs dfs -put - /tmp/randomFile.txt{code}
> 3. Validate it's failing
> {code}
> $ hdfs dfs -cat hftp://namenode1/tmp/randomFile.txt
> {code}
> 4. Get more details via curl
> {code}
> $ curl -vv http://namenode1:50070/data/tmp/randomFile.txt?ugi=hdfs | grep 
> "Location:"
>  Location: 
> http://datanode003:50075/streamFile/tmp/randomFile.txt?ugi=hdfs&nnaddr=0.0.0.0:8020
> {code}
> We can clearly see the 0.0.0.0 returned as the NN ip.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7586) HFTP does not work when namenode bind on wildcard

2015-01-06 Thread Benoit Perroud (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266360#comment-14266360
 ] 

Benoit Perroud commented on HDFS-7586:
--

Thanks for the pointer. You're right for >=2.5.0. In <2.5, the way to do it was 
setting 0.0.0.0:8020, which leads to the issue described here.
Has multihoming been tested with HFTP too?


> HFTP does not work when namenode bind on wildcard
> -
>
> Key: HDFS-7586
> URL: https://issues.apache.org/jira/browse/HDFS-7586
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.2.0, 2.3.0, 2.4.0
>Reporter: Benoit Perroud
>Priority: Minor
> Attachments: HDFS-7586-v0.1.txt
>
>
> When wildcard binding for NameNode RPC is turned on (i.e.  
> dfs.namenode.rpc-address=0.0.0.0:8020), HFTP download is failing.
> Call to http://namenode:50070/data/.. returns the header Location with 
> parameter nnaddr=0.0.0.0:8020, which is unlikely to ever succeed :)
> The idea would be, if wildcard binding is enabled, to read from the 
> HttpServletRequest the IP address the request is actually connected to, and 
> return that one.
> WDYT?
> How to reproduce:
> 1. Turn on wildcard binding
> {code}dfs.namenode.rpc-address=0.0.0.0:8020{code}
> 2. Upload a file
> {code}$ echo "123" | hdfs dfs -put - /tmp/randomFile.txt{code}
> 3. Validate it's failing
> {code}
> $ hdfs dfs -cat hftp://namenode1/tmp/randomFile.txt
> {code}
> 4. Get more details via curl
> {code}
> $ curl -vv http://namenode1:50070/data/tmp/randomFile.txt?ugi=hdfs | grep 
> "Location:"
>  Location: 
> http://datanode003:50075/streamFile/tmp/randomFile.txt?ugi=hdfs&nnaddr=0.0.0.0:8020
> {code}
> We can clearly see the 0.0.0.0 returned as the NN ip.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7018) Implement C interface for libhdfs3

2015-01-06 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266434#comment-14266434
 ] 

Colin Patrick McCabe commented on HDFS-7018:


Sorry, this has been on my queue to review for a while, but stuff kept coming 
up.  I'll take a look today.

> Implement C interface for libhdfs3
> --
>
> Key: HDFS-7018
> URL: https://issues.apache.org/jira/browse/HDFS-7018
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Zhanwei Wang
>Assignee: Zhanwei Wang
> Attachments: HDFS-7018-pnative.002.patch, 
> HDFS-7018-pnative.003.patch, HDFS-7018.patch
>
>
> Implement C interface for libhdfs3



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7586) HFTP does not work when namenode bind on wildcard

2015-01-06 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266446#comment-14266446
 ] 

Yongjun Zhang commented on HDFS-7586:
-

HI [~bperroud],

Thanks for reporting the issue. I once ran into the same issue myself, and 
found out it was caused by an incorrect setting. As [~daryn] pointed out, "The 
rpc-address key should be an actual ip/host:port. There is a rpc-bind-host key 
that should be set to 0.0.0.0 for multihoming". The rpc-bind-host key was 
introduced by HDFS-5128, which is at least in 2.3.0, if not earlier. IIRC, my 
testing with 2.3 was successful after fixing the setting. Thanks.




> HFTP does not work when namenode bind on wildcard
> -
>
> Key: HDFS-7586
> URL: https://issues.apache.org/jira/browse/HDFS-7586
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.2.0, 2.3.0, 2.4.0
>Reporter: Benoit Perroud
>Priority: Minor
> Attachments: HDFS-7586-v0.1.txt
>
>
> When wildcard binding for NameNode RPC is turned on (i.e.  
> dfs.namenode.rpc-address=0.0.0.0:8020), HFTP download is failing.
> Call to http://namenode:50070/data/.. returns the header Location with 
> parameter nnaddr=0.0.0.0:8020, which is unlikely to ever succeed :)
> The idea would be, if wildcard binding is enabled, to read from the 
> HttpServletRequest the IP address the request is actually connected to, and 
> return that one.
> WDYT?
> How to reproduce:
> 1. Turn on wildcard binding
> {code}dfs.namenode.rpc-address=0.0.0.0:8020{code}
> 2. Upload a file
> {code}$ echo "123" | hdfs dfs -put - /tmp/randomFile.txt{code}
> 3. Validate it's failing
> {code}
> $ hdfs dfs -cat hftp://namenode1/tmp/randomFile.txt
> {code}
> 4. Get more details via curl
> {code}
> $ curl -vv http://namenode1:50070/data/tmp/randomFile.txt?ugi=hdfs | grep 
> "Location:"
>  Location: 
> http://datanode003:50075/streamFile/tmp/randomFile.txt?ugi=hdfs&nnaddr=0.0.0.0:8020
> {code}
> We can clearly see the 0.0.0.0 returned as the NN ip.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7018) Implement C interface for libhdfs3

2015-01-06 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266484#comment-14266484
 ] 

Colin Patrick McCabe commented on HDFS-7018:


Thanks, this looks a lot better.

The CMake changes look good.

{code}
64  static inline char *Strdup(const char *str) {
65  if (str == NULL) {
66  return NULL;
67  }
68  
69  int len = strlen(str);
70  char *retval = new char[len + 1];
71  memcpy(retval, str, len + 1);
72  return retval;
73  }
{code}

Please, let's just use the regular {{strdup}} provided by the system.  Then we 
also don't have to worry about "array delete" versus regular delete either.

{{struct hdfsFile_internal}}: we can simplify this a lot.  Rather than having 
setters, just have a constructor that takes an {{InputStream}}, and another one 
that takes an {{OutputStream}}.  We shouldn't need to alter the streams after 
the {{hdfsFile_internal}} object has been created.  Using a "union" here is 
overly complicated, and not really saving any space.  On a 64-bit machine, the 
boolean you need to select which type is in the union pads the structure out to 
16 bytes anyway.  Just have a pointer to an input stream, a pointer to an 
output stream, and the invariant that one of them is always {{null}}.

{code}
166 class DefaultConfig {
167 public:
168 DefaultConfig() : reportError(false) {
169 const char *env = getenv("LIBHDFS3_CONF");
170 std::string confPath = env ? env : "";
{code}
We should be looking at CLASSPATH and searching all those directories for XML 
files, so that we can be compatible with the existing libhdfs code.  Also, 
Hadoop configurations include multiple files, not just a single file.  You can 
look at how I did it in the HADOOP-10388 branch, which has a working 
implementation of this.  Alternately we could punt this to a follow-on JIRA.

{code}
224 struct hdfsBuilder {
225 public:
226 hdfsBuilder(const Config &conf) : conf(conf), port(0) {
227 }
228 
229 ~hdfsBuilder() {
230 }
231 
232 public:
{code}
We don't need line 232.  It's a bit confusing because I expected the line to 
say "private"

{{PARAMETER_ASSERT}}: this isn't what people usually mean by an {{assert}}.  
Usually an {{assert}} is something that only takes effect in debug builds, and 
is used to guard against programmer mistakes.   In contrast, this is validating 
a parameter passed in by the library user.  I would prefer not to have this 
macro at all since I think we ought to actually provide a detailed error 
message explaining what is wrong.  This macro will just fill in something like 
"invalid parameter" for EINVAL, which is not very informative.  I also think 
it's confusing to have "return" statements in macros... maybe we can do this 
occasionally, but only for a VERY good reason or in a unit test.

{code}
519 } catch (const std::bad_alloc &e) {
520 SetErrorMessage("Out of memory");
521 errno = ENOMEM;
522 } catch (...) {
523 SetErrorMessage("Unknown Error");
524 errno = EINTERNAL;
525 }
{code}

I see this repeated a lot in the code.  Why can't we use 
{{CreateStatusFromException}} to figure out exactly what is wrong, and derive 
the errno and error message from the Status object?

Since we're adopting the Google C\+\+ style, we will eventually remove the 
throw statements from other parts of the code, and then these "outer catch 
blocks" in the C API will be the only catch blocks left, and the only users of 
{{CreateStatusFromException}}, right?

{{hdfs.h}}: it's problematic to add stuff to this file until the other 
implementations support it.  We could get away with returning ENOTSUP from 
these functions.  But I think we need to discuss what some of them do... I'm 
not familiar with the "get delegation token", "free delegation token" APIs and 
we need to discuss what they do and if we want them, etc.  I think it's best to 
file a follow-on for that and leave it out for now.

> Implement C interface for libhdfs3
> --
>
> Key: HDFS-7018
> URL: https://issues.apache.org/jira/browse/HDFS-7018
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Zhanwei Wang
>Assignee: Zhanwei Wang
> Attachments: HDFS-7018-pnative.002.patch, 
> HDFS-7018-pnative.003.patch, HDFS-7018.patch
>
>
> Implement C interface for libhdfs3



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7339) NameNode support for erasure coding block groups

2015-01-06 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266566#comment-14266566
 ] 

Andrew Wang commented on HDFS-7339:
---

Thanks for working on this Zhe, here are some quick thoughts about the patch:

* Could make this into an INode Feature, like how we do ACLs and XAttrs. I think 
we can get rid of isStriped then too.
* Need to wire up getAdditionalBlockGroups; previous handling also needs to 
account for block groups.
* LocatedBlockGroup is also missing a bunch of functionality from LocatedBlock, 
which I think we need. Check around a bit in the client for how it uses 
LocatedBlock too; we will want comparable functionality for erasure-coded and 
non-erasure-coded files.
* Would prefer to throw UnsupportedOperationException for stubbed methods, to 
be very clear.
* Since BlockGroupManager#chooseNewGroupTargets is called without any locks 
held, we need to make sure it is thread-safe. Worth adding a comment?
* What's the interaction between the two SequentialBlockIDGenerator classes? 
Since they don't use the same count, there will be conflicts.
* Why do we have both BlockGroupInfo and BlockGroup? If we put BlockInfos 
rather than Blocks in BlockGroup, wouldn't that fill the need? We could then 
move BlockGroup to the blockmanagement package too.

> NameNode support for erasure coding block groups
> 
>
> Key: HDFS-7339
> URL: https://issues.apache.org/jira/browse/HDFS-7339
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-7339-001.patch, HDFS-7339-002.patch, 
> Meta-striping.jpg, NN-stripping.jpg
>
>
> All erasure codec operations center around the concept of _block group_; they 
> are formed in initial encoding and looked up in recoveries and conversions. A 
> lightweight class {{BlockGroup}} is created to record the original and parity 
> blocks in a coding group, as well as a pointer to the codec schema (pluggable 
> codec schemas will be supported in HDFS-7337). With the striping layout, the 
> HDFS client needs to operate on all blocks in a {{BlockGroup}} concurrently. 
> Therefore we propose to extend a file’s inode to switch between _contiguous_ 
> and _striping_ modes, with the current mode recorded in a binary flag. An 
> array of BlockGroups (or BlockGroup IDs) is added, which remains empty for 
> “traditional” HDFS files with contiguous block layout.
> The NameNode creates and maintains {{BlockGroup}} instances through the new 
> {{ECManager}} component; the attached figure has an illustration of the 
> architecture. As a simple example, when a {_Striping+EC_} file is created and 
> written to, it will serve requests from the client to allocate new 
> {{BlockGroups}} and store them under the {{INodeFile}}. In the current phase, 
> {{BlockGroups}} are allocated both in initial online encoding and in the 
> conversion from replication to EC. {{ECManager}} also facilitates the lookup 
> of {{BlockGroup}} information for block recovery work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7579) Improve log reporting during block report rpc failure

2015-01-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266641#comment-14266641
 ] 

Hadoop QA commented on HDFS-7579:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12690360/HDFS-7579.000.patch
  against trunk revision 4cd66f7.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.qjournal.TestNNWithQJM

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9148//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9148//console

This message is automatically generated.

> Improve log reporting during block report rpc failure
> -
>
> Key: HDFS-7579
> URL: https://issues.apache.org/jira/browse/HDFS-7579
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
>Priority: Minor
>  Labels: supportability
> Attachments: HDFS-7579.000.patch
>
>
> During block reporting, if the block report RPC fails, for example because it 
> exceeded the max rpc len, we should still produce some sort of LOG.info 
> output to help with debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7587) Edit log corruption can happen if append fails with a quota violation

2015-01-06 Thread Kihwal Lee (JIRA)
Kihwal Lee created HDFS-7587:


 Summary: Edit log corruption can happen if append fails with a 
quota violation
 Key: HDFS-7587
 URL: https://issues.apache.org/jira/browse/HDFS-7587
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kihwal Lee
Priority: Blocker


We have seen a standby namenode crashing due to edit log corruption. It was 
complaining that {{OP_CLOSE}} cannot be applied because the file is not 
under-construction.

When a client was trying to append to the file, the remaining space quota was 
very small. This caused a failure in {{prepareFileForWrite()}}, but only after 
the inode had already been converted for writing and a lease had been added. 
Since these were not undone when the quota violation was detected, the file was 
left under construction with an active lease, without {{OP_ADD}} being edit-logged.

A subsequent {{append()}} eventually caused a lease recovery after the soft 
limit period. This resulted in {{commitBlockSynchronization()}}, which closed 
the file with {{OP_CLOSE}} being logged.  Since there was no corresponding 
{{OP_ADD}}, edit replaying could not apply this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7587) Edit log corruption can happen if append fails with a quota violation

2015-01-06 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266694#comment-14266694
 ] 

Kihwal Lee commented on HDFS-7587:
--

This is a side-effect of HDFS-6423. [~daryn] has suggested that the quota check 
be done before converting inode/block. If something goes wrong, undoing the 
quota update is easier.
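
To make the suggested ordering concrete, here is a toy sketch (not the actual 
namenode code; all names are illustrative): everything that can fail is checked 
before any state is mutated, so a quota violation leaves nothing half-applied 
and unlogged.

{code}
public class AppendOrderSketch {
  static class QuotaExceededException extends Exception {
    QuotaExceededException(String msg) { super(msg); }
  }

  private long remainingQuota = 100;          // illustrative state
  private boolean underConstruction = false;
  private boolean leaseHeld = false;
  private boolean opAddLogged = false;

  // Check the quota *before* converting the file for writing and adding a lease,
  // so a violation cannot leave an under-construction file with no OP_ADD logged.
  void prepareForAppend(long requestedBytes) throws QuotaExceededException {
    if (requestedBytes > remainingQuota) {
      throw new QuotaExceededException("only " + remainingQuota + " bytes left");
    }
    // Mutations happen only after every check has passed.
    underConstruction = true;
    leaseHeld = true;
    opAddLogged = true;                       // stands in for logging OP_ADD
  }
}
{code}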

> Edit log corruption can happen if append fails with a quota violation
> -
>
> Key: HDFS-7587
> URL: https://issues.apache.org/jira/browse/HDFS-7587
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Priority: Blocker
>
> We have seen a standby namenode crashing due to edit log corruption. It was 
> complaining that {{OP_CLOSE}} cannot be applied because the file is not 
> under-construction.
> When a client was trying to append to the file, the remaining space quota was 
> very small. This caused a failure in {{prepareFileForWrite()}}, but only after 
> the inode had already been converted for writing and a lease had been added. 
> Since these were not undone when the quota violation was detected, the file was 
> left under construction with an active lease, without {{OP_ADD}} being edit-logged.
> A subsequent {{append()}} eventually caused a lease recovery after the soft 
> limit period. This resulted in {{commitBlockSynchronization()}}, which closed 
> the file with {{OP_CLOSE}} being logged.  Since there was no corresponding 
> {{OP_ADD}}, edit replaying could not apply this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7579) Improve log reporting during block report rpc failure

2015-01-06 Thread Charles Lamb (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266709#comment-14266709
 ] 

Charles Lamb commented on HDFS-7579:


The test failure is a timeout and appears to be unrelated. I reran it myself on 
my local machine with the patch applied and it passed.


> Improve log reporting during block report rpc failure
> -
>
> Key: HDFS-7579
> URL: https://issues.apache.org/jira/browse/HDFS-7579
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
>Priority: Minor
>  Labels: supportability
> Attachments: HDFS-7579.000.patch
>
>
> During block reporting, if the block report RPC fails, for example because it 
> exceeded the max rpc len, we should still produce some sort of LOG.info 
> output to help with debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7572) TestLazyPersistFiles#testDnRestartWithSavedReplicas is flaky on Windows

2015-01-06 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266772#comment-14266772
 ] 

Arpit Agarwal commented on HDFS-7572:
-

Thanks Chris for committing this!

> TestLazyPersistFiles#testDnRestartWithSavedReplicas is flaky on Windows
> ---
>
> Key: HDFS-7572
> URL: https://issues.apache.org/jira/browse/HDFS-7572
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.6.0
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: 2.7.0
>
> Attachments: HDFS-7572.001.patch
>
>
> *Error Message*
> Expected: is 
>  but: was 
> *Stacktrace*
> java.lang.AssertionError: 
> Expected: is 
>  but: was 
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>   at org.junit.Assert.assertThat(Assert.java:865)
>   at org.junit.Assert.assertThat(Assert.java:832)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.LazyPersistTestCase.ensureFileReplicasOnStorageType(LazyPersistTestCase.java:129)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles.testDnRestartWithSavedReplicas(TestLazyPersistFiles.java:668)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7466) Allow different values for dfs.datanode.balance.max.concurrent.moves per datanode

2015-01-06 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266805#comment-14266805
 ] 

Benoy Antony commented on HDFS-7466:


[~szetszwo], My use case is as follows:

We have a DISK tier and an ARCHIVE tier. The ARCHIVE nodes don't have YARN 
containers running on them, and reads of the ARCHIVE data are infrequent. The 
major activity on ARCHIVE nodes is when someone moves data between tiers. The 
DISK nodes have 12 drives whereas the ARCHIVE nodes have 60 drives.

We would like to keep dfs.datanode.balance.max.concurrent.moves on the ARCHIVE 
nodes at around 60, while the DISK nodes use the default value of 5.
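
If the value were read from each datanode's own configuration, as proposed in 
this JIRA, the intent could be expressed roughly as follows (shorthand notation, 
not a literal config file):

{code}
# ARCHIVE nodes (60 drives each)
dfs.datanode.balance.max.concurrent.moves=60
# DISK nodes (12 drives each): keep the default of 5
{code}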



> Allow different values for dfs.datanode.balance.max.concurrent.moves per 
> datanode
> -
>
> Key: HDFS-7466
> URL: https://issues.apache.org/jira/browse/HDFS-7466
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: Benoy Antony
>Assignee: Benoy Antony
>
> It is possible to configure different values for  
> _dfs.datanode.balance.max.concurrent.moves_ per datanode.  But the value will 
> be used by balancer/mover which obtains the value from its own configuration. 
> The correct approach will be to obtain the value from the datanode itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6292) Display HDFS per user and per group usage on the webUI

2015-01-06 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-6292:
---
Attachment: HDFS-6292.01.patch

Ok! Here's the skeleton code that has come out of my attempt to add this 
functionality to the NameNode. DISCLAIMER: This patch is not ready and I'm 
uploading it only so that you folks can see what I'm thinking so far.

I would request feedback on the following (and whatever else you think of):
1. Should HdfsUsageMetricsSource be thread-safe, or should I just assume the FSN 
write lock is always held when calling into here? (One concurrent-structure 
option is sketched after this list.)
2. I understand that we need to plug into a LOT of places to correctly update 
the stats. I have only plugged into 2-3 places (so obviously the usage will be 
incorrect if you venture outside of those ops: create / delete / chown files+dirs, 
and even these have wrinkles I need to smooth). I propose we do this all as 
another sub-task after the framework gets committed.
3. I still need to figure out how best to let this be configurable for any of 
the HDFS daemons: NameNode/Standby/SecondaryNamenode
4. Enable and disable this feature dynamically.
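
Regarding question 1 above, if we do not want to rely on the FSN write lock, one 
option is to keep the per-user counters in concurrent structures. A rough, 
hypothetical sketch (class and method names are made up and not from the 
attached patch):

{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: per-user byte counters that are safe to update from any
// thread, whether or not the FSNamesystem write lock happens to be held.
public class PerUserUsageSketch {
  private final ConcurrentHashMap<String, AtomicLong> bytesByUser =
      new ConcurrentHashMap<String, AtomicLong>();

  public void addBytes(String user, long delta) {
    AtomicLong counter = bytesByUser.get(user);
    if (counter == null) {
      AtomicLong fresh = new AtomicLong();
      AtomicLong previous = bytesByUser.putIfAbsent(user, fresh);
      counter = (previous != null) ? previous : fresh;
    }
    counter.addAndGet(delta);
  }

  public long getBytes(String user) {
    AtomicLong counter = bytesByUser.get(user);
    return counter == null ? 0L : counter.get();
  }
}
{code}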


> Display HDFS per user and per group usage on the webUI
> --
>
> Key: HDFS-6292
> URL: https://issues.apache.org/jira/browse/HDFS-6292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.4.0
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Attachments: HDFS-6292.01.patch, HDFS-6292.patch, HDFS-6292.png
>
>
> It would be nice to show HDFS usage per user and per group on a web ui.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7337) Configurable and pluggable Erasure Codec and schema

2015-01-06 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266816#comment-14266816
 ] 

Andrew Wang commented on HDFS-7337:
---

Hey Kai, thanks for getting us started here. I gave this a quick look, had a 
few comments:

* Could you generate normal plaintext diffs rather than a zip? We might also 
want to reorganize things into existing packages. The rawcoder stuff could go 
somewhere in hadoop-common, for instance, and we could move the block grouper 
classes into blockmanagement, etc.
* I see mixed tabs and spaces; we use spaces only in Hadoop.
* Since the LRC stuff is still up in the air, could we defer everything related 
to that to a later JIRA?
* In RSBlockGrouper, using ExtendedBlockId is overkill, since the bpid is the 
same for everything.

Configuration
* The XML file approach seems potentially error-prone. IIUC, after a set of 
parameters is assigned to a schema name, the parameters should never be 
changed. We thus also need to keep the XML file in sync between the NN, DN, and 
client. The client part is especially troublesome. Are we planning to put this 
into the editlog/image down the road, like how we do storage policies?
* Also, I think we want to separate out the type of erasure coding from the 
implementation. The schema definition from the PDF encodes both together, e.g. 
JerasureRS. While it's not possible to change the RS part, the user might want 
to swap out Jerasure for ISA-L, which should be allowed. This is sort of like 
how we did things for encryption; we define a CipherSuite (i.e. AES-CTR) and 
then the user can choose among the multiple pluggable implementations for that 
cipher (a rough sketch of this separation is at the end of this comment).

BlockGroup:
* Zhe told me this is a placeholder class, but a few comments nonetheless.
* Can we just set the two fields in the constructor? They should also be final.
* Since the schema encodes the layout, does SubBlockGroup need to encode both 
data and parity? Do we even need SubBlockGroup? Seems like a single array and a 
schema (a concrete object, which also encodes the RS or LRC parameters) tells 
you the layout, which is sufficient. This will save some memory.
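
To illustrate the CipherSuite analogy mentioned above, a very rough, hypothetical 
shape (all names invented; not a proposal for the actual API) could separate the 
coding kind from its provider:

{code}
// Hypothetical sketch only: the coding "kind" fixes the layout and semantics,
// while the factory/provider (pure Java, Jerasure, ISA-L, ...) stays swappable.
public final class CodecSeparationSketch {
  enum ErasureCodingKind { REED_SOLOMON }

  interface RawErasureEncoder {
    void encode(byte[][] dataUnits, byte[][] parityUnits);
  }

  interface RawErasureCoderFactory {
    ErasureCodingKind kind();                // what the encoded layout means
    RawErasureEncoder createEncoder(int numDataUnits, int numParityUnits);
  }
}
{code}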

> Configurable and pluggable Erasure Codec and schema
> ---
>
> Key: HDFS-7337
> URL: https://issues.apache.org/jira/browse/HDFS-7337
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Kai Zheng
> Attachments: HDFS-7337-prototype-v1.patch, 
> HDFS-7337-prototype-v2.zip, HDFS-7337-prototype-v3.zip, 
> PluggableErasureCodec.pdf
>
>
> According to HDFS-7285 and the design, this considers to support multiple 
> Erasure Codecs via pluggable approach. It allows to define and configure 
> multiple codec schemas with different coding algorithms and parameters. The 
> resultant codec schemas can be utilized and specified via command tool for 
> different file folders. While design and implement such pluggable framework, 
> it’s also to implement a concrete codec by default (Reed Solomon) to prove 
> the framework is useful and workable. Separate JIRA could be opened for the 
> RS codec implementation.
> Note HDFS-7353 will focus on the very low level codec API and implementation 
> to make concrete vendor libraries transparent to the upper layer. This JIRA 
> focuses on high level stuffs that interact with configuration, schema and etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7467) Provide storage tier information for a directory via fsck

2015-01-06 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266871#comment-14266871
 ] 

Benoy Antony commented on HDFS-7467:


1. 
{quote}
Are all storage policies in fallback storage equivalent to other storage 
policies that this output can always be fully described by the percentages that 
Tsz has suggested?
{quote}

There is a possibility that some storage tier combination may not belong to a 
storage policy. 
My recommendation is to display the policy along with the combination if 
possible. If not, display the combination. Lowercase for policy name is 
intentional.
{code}
Storage Policy  # of blocks   % of blocks
cold (DISK:1,ARCHIVE:2)  340730   97.7393%
frozen (ARCHIVE:3)   39281.1268%
DISK:2,ARCHIVE:231220.8956%
warm (DISK:2,ARCHIVE:1)   7480.2146%
DISK:1,ARCHIVE:3 440.0126%
DISK:3,ARCHIVE:2 300.0086%
DISK:3,ARCHIVE:1   90.0026% 

{code}

2.
{quote}
There should also be some warning messages as well in fsck for all files that 
are unable to meet the requested ideal for their storage policy and are using 
fallback storage, perhaps with a switch since that could become overly volumous 
output.
{quote}

This is a nice feature. Will look into that .

> Provide storage tier information for a directory via fsck
> -
>
> Key: HDFS-7467
> URL: https://issues.apache.org/jira/browse/HDFS-7467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: Benoy Antony
>Assignee: Benoy Antony
> Attachments: HDFS-7467.patch
>
>
> Currently _fsck_  provides information regarding blocks for a directory.
> It should be augmented to provide storage tier information (optionally). 
> The sample report could be as follows :
> {code}
> Storage Tier Combination# of blocks   % of blocks
> DISK:1,ARCHIVE:2  340730   97.7393%
>  
> ARCHIVE:3   39281.1268%
>  
> DISK:2,ARCHIVE:231220.8956%
>  
> DISK:2,ARCHIVE:1 7480.2146%
>  
> DISK:1,ARCHIVE:3  440.0126%
>  
> DISK:3,ARCHIVE:2  300.0086%
>  
> DISK:3,ARCHIVE:1   90.0026%
> {code}
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-7467) Provide storage tier information for a directory via fsck

2015-01-06 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266871#comment-14266871
 ] 

Benoy Antony edited comment on HDFS-7467 at 1/6/15 10:16 PM:
-

1. 
{quote}
Are all storage policies in fallback storage equivalent to other storage 
policies that this output can always be fully described by the percentages that 
Tsz has suggested?
{quote}

There is a possibility that some storage tier combination may not belong to a 
storage policy. 
My recommendation is to display the policy along with the combination if 
possible. If not, display the combination. Lowercase for policy name is 
intentional.
{code}
Storage Policy  # of blocks   % of blocks
cold (DISK:1,ARCHIVE:2)340730   97.7393%
frozen (ARCHIVE:3) 39281.1268%
DISK:2,ARCHIVE:2  31220.8956%
warm (DISK:2,ARCHIVE:1)7480.2146%
DISK:1,ARCHIVE:3  440.0126%
DISK:3,ARCHIVE:2  300.0086%
DISK:3,ARCHIVE:190.0026%
 
{code}

2.
{quote}
There should also be some warning messages as well in fsck for all files that 
are unable to meet the requested ideal for their storage policy and are using 
fallback storage, perhaps with a switch since that could become overly volumous 
output.
{quote}

This is a nice feature. Will look into that .


was (Author: benoyantony):
1. 
{quote}
Are all storage policies in fallback storage equivalent to other storage 
policies that this output can always be fully described by the percentages that 
Tsz has suggested?
{quote}

There is a possibility that some storage tier combination may not belong to a 
storage policy. 
My recommendation is to display the policy along with the combination if 
possible. If not, display the combination. Lowercase for policy name is 
intentional.
{code}
Storage Policy  # of blocks   % of blocks
cold (DISK:1,ARCHIVE:2)  340730   97.7393%
frozen (ARCHIVE:3)   39281.1268%
DISK:2,ARCHIVE:231220.8956%
warm (DISK:2,ARCHIVE:1)   7480.2146%
DISK:1,ARCHIVE:3 440.0126%
DISK:3,ARCHIVE:2 300.0086%
DISK:3,ARCHIVE:1   90.0026% 

{code}

2.
{quote}
There should also be some warning messages as well in fsck for all files that 
are unable to meet the requested ideal for their storage policy and are using 
fallback storage, perhaps with a switch since that could become overly volumous 
output.
{quote}

This is a nice feature. Will look into that .

> Provide storage tier information for a directory via fsck
> -
>
> Key: HDFS-7467
> URL: https://issues.apache.org/jira/browse/HDFS-7467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: Benoy Antony
>Assignee: Benoy Antony
> Attachments: HDFS-7467.patch
>
>
> Currently _fsck_  provides information regarding blocks for a directory.
> It should be augmented to provide storage tier information (optionally). 
> The sample report could be as follows :
> {code}
> Storage Tier Combination# of blocks   % of blocks
> DISK:1,ARCHIVE:2  340730   97.7393%
>  
> ARCHIVE:3   39281.1268%
>  
> DISK:2,ARCHIVE:231220.8956%
>  
> DISK:2,ARCHIVE:1 7480.2146%
>  
> DISK:1,ARCHIVE:3  440.0126%
>  
> DISK:3,ARCHIVE:2  300.0086%
>  
> DISK:3,ARCHIVE:1   90.0026%
> {code}
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-7467) Provide storage tier information for a directory via fsck

2015-01-06 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266871#comment-14266871
 ] 

Benoy Antony edited comment on HDFS-7467 at 1/6/15 10:17 PM:
-

1. 
{quote}
Are all storage policies in fallback storage equivalent to other storage 
policies that this output can always be fully described by the percentages that 
Tsz has suggested?
{quote}

There is a possibility that some storage tier combination may not belong to a 
storage policy. 
My recommendation is to display the policy along with the combination if 
possible. If not, display the combination. Lowercase for policy name is 
intentional.
{code}
Storage Policy  # of blocks   % of blocks
cold(DISK:1,ARCHIVE:2) 340730   97.7393%
frozen(ARCHIVE:3)  39281.1268%
DISK:2,ARCHIVE:2  31220.8956%
warm(DISK:2,ARCHIVE:1) 7480.2146%
DISK:1,ARCHIVE:3  440.0126%
DISK:3,ARCHIVE:2  300.0086%
DISK:3,ARCHIVE:190.0026%
 
{code}

2.
{quote}
There should also be some warning messages as well in fsck for all files that 
are unable to meet the requested ideal for their storage policy and are using 
fallback storage, perhaps with a switch since that could become overly volumous 
output.
{quote}

This is a nice feature. Will look into that .


was (Author: benoyantony):
1. 
{quote}
Are all storage policies in fallback storage equivalent to other storage 
policies that this output can always be fully described by the percentages that 
Tsz has suggested?
{quote}

There is a possibility that some storage tier combination may not belong to a 
storage policy. 
My recommendation is to display the policy along with the combination if 
possible. If not, display the combination. Lowercase for policy name is 
intentional.
{code}
Storage Policy  # of blocks   % of blocks
cold (DISK:1,ARCHIVE:2)340730   97.7393%
frozen (ARCHIVE:3) 39281.1268%
DISK:2,ARCHIVE:2  31220.8956%
warm (DISK:2,ARCHIVE:1)7480.2146%
DISK:1,ARCHIVE:3  440.0126%
DISK:3,ARCHIVE:2  300.0086%
DISK:3,ARCHIVE:190.0026%
 
{code}

2.
{quote}
There should also be some warning messages as well in fsck for all files that 
are unable to meet the requested ideal for their storage policy and are using 
fallback storage, perhaps with a switch since that could become overly volumous 
output.
{quote}

This is a nice feature. Will look into that .

> Provide storage tier information for a directory via fsck
> -
>
> Key: HDFS-7467
> URL: https://issues.apache.org/jira/browse/HDFS-7467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: Benoy Antony
>Assignee: Benoy Antony
> Attachments: HDFS-7467.patch
>
>
> Currently _fsck_  provides information regarding blocks for a directory.
> It should be augmented to provide storage tier information (optionally). 
> The sample report could be as follows :
> {code}
> Storage Tier Combination# of blocks   % of blocks
> DISK:1,ARCHIVE:2  340730   97.7393%
>  
> ARCHIVE:3   39281.1268%
>  
> DISK:2,ARCHIVE:231220.8956%
>  
> DISK:2,ARCHIVE:1 7480.2146%
>  
> DISK:1,ARCHIVE:3  440.0126%
>  
> DISK:3,ARCHIVE:2  300.0086%
>  
> DISK:3,ARCHIVE:1   90.0026%
> {code}
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-7467) Provide storage tier information for a directory via fsck

2015-01-06 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266871#comment-14266871
 ] 

Benoy Antony edited comment on HDFS-7467 at 1/6/15 10:18 PM:
-

1. 
{quote}
Are all storage policies in fallback storage equivalent to other storage 
policies that this output can always be fully described by the percentages that 
Tsz has suggested?
{quote}

There is a possibility that some storage tier combination may not belong to a 
storage policy. 
My recommendation is to display the policy along with the combination if 
possible. If not, display the combination. Lowercase for policy name is 
intentional.
{code}
Storage Policy  # of blocks   % of blocks
cold(DISK:1,ARCHIVE:2) 340730  97.7393%
frozen(ARCHIVE:3)  39281.1268%
DISK:2,ARCHIVE:2  31220.8956%
warm(DISK:2,ARCHIVE:1) 7480.2146%
DISK:1,ARCHIVE:3  440.0126%
DISK:3,ARCHIVE:2  300.0086%
DISK:3,ARCHIVE:190.0026%
 
{code}

2.
{quote}
There should also be some warning messages as well in fsck for all files that 
are unable to meet the requested ideal for their storage policy and are using 
fallback storage, perhaps with a switch since that could become overly volumous 
output.
{quote}

This is a nice feature. Will look into that .


was (Author: benoyantony):
1. 
{quote}
Are all storage policies in fallback storage equivalent to other storage 
policies that this output can always be fully described by the percentages that 
Tsz has suggested?
{quote}

There is a possibility that some storage tier combination may not belong to a 
storage policy. 
My recommendation is to display the policy along with the combination if 
possible. If not, display the combination. Lowercase for policy name is 
intentional.
{code}
Storage Policy  # of blocks   % of blocks
cold(DISK:1,ARCHIVE:2) 340730   97.7393%
frozen(ARCHIVE:3)  39281.1268%
DISK:2,ARCHIVE:2  31220.8956%
warm(DISK:2,ARCHIVE:1) 7480.2146%
DISK:1,ARCHIVE:3  440.0126%
DISK:3,ARCHIVE:2  300.0086%
DISK:3,ARCHIVE:190.0026%
 
{code}

2.
{quote}
There should also be some warning messages as well in fsck for all files that 
are unable to meet the requested ideal for their storage policy and are using 
fallback storage, perhaps with a switch since that could become overly volumous 
output.
{quote}

This is a nice feature. Will look into that .

> Provide storage tier information for a directory via fsck
> -
>
> Key: HDFS-7467
> URL: https://issues.apache.org/jira/browse/HDFS-7467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: Benoy Antony
>Assignee: Benoy Antony
> Attachments: HDFS-7467.patch
>
>
> Currently _fsck_  provides information regarding blocks for a directory.
> It should be augmented to provide storage tier information (optionally). 
> The sample report could be as follows :
> {code}
> Storage Tier Combination# of blocks   % of blocks
> DISK:1,ARCHIVE:2  340730   97.7393%
>  
> ARCHIVE:3   39281.1268%
>  
> DISK:2,ARCHIVE:231220.8956%
>  
> DISK:2,ARCHIVE:1 7480.2146%
>  
> DISK:1,ARCHIVE:3  440.0126%
>  
> DISK:3,ARCHIVE:2  300.0086%
>  
> DISK:3,ARCHIVE:1   90.0026%
> {code}
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-7467) Provide storage tier information for a directory via fsck

2015-01-06 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266871#comment-14266871
 ] 

Benoy Antony edited comment on HDFS-7467 at 1/6/15 10:22 PM:
-

1. 
{quote}
Are all storage policies in fallback storage equivalent to other storage 
policies that this output can always be fully described by the percentages that 
Tsz has suggested?
{quote}

There is a possibility that some storage tier combination may not belong to a 
storage policy. 
My recommendation is to display the policy along with the combination if 
possible. If not, display the combination. Lowercase for policy name is 
intentional.
{code}
Storage Policy   # of blocks   % of blocks
cold(DISK:1,ARCHIVE:2)  340730  97.7393%
frozen(ARCHIVE:3)   39281.1268%
DISK:2,ARCHIVE:2   31220.8956%
warm(DISK:2,ARCHIVE:1)   7480.2146%
DISK:1,ARCHIVE:3440.0126%
DISK:3,ARCHIVE:2300.0086%
DISK:3,ARCHIVE:1  90.0026%
{code}

2.
{quote}
There should also be some warning messages as well in fsck for all files that 
are unable to meet the requested ideal for their storage policy and are using 
fallback storage, perhaps with a switch since that could become overly volumous 
output.
{quote}

This is a nice feature. Will look into that .


was (Author: benoyantony):
1. 
{quote}
Are all storage policies in fallback storage equivalent to other storage 
policies that this output can always be fully described by the percentages that 
Tsz has suggested?
{quote}

There is a possibility that some storage tier combination may not belong to a 
storage policy. 
My recommendation is to display the policy along with the combination if 
possible. If not, display the combination. Lowercase for policy name is 
intentional.
{code}
Storage Policy  # of blocks   % of blocks
cold(DISK:1,ARCHIVE:2) 340730  97.7393%
frozen(ARCHIVE:3)  39281.1268%
DISK:2,ARCHIVE:2  31220.8956%
warm(DISK:2,ARCHIVE:1) 7480.2146%
DISK:1,ARCHIVE:3  440.0126%
DISK:3,ARCHIVE:2  300.0086%
DISK:3,ARCHIVE:190.0026%
 
{code}

2.
{quote}
There should also be some warning messages as well in fsck for all files that 
are unable to meet the requested ideal for their storage policy and are using 
fallback storage, perhaps with a switch since that could become overly volumous 
output.
{quote}

This is a nice feature. Will look into that .

> Provide storage tier information for a directory via fsck
> -
>
> Key: HDFS-7467
> URL: https://issues.apache.org/jira/browse/HDFS-7467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: Benoy Antony
>Assignee: Benoy Antony
> Attachments: HDFS-7467.patch
>
>
> Currently _fsck_  provides information regarding blocks for a directory.
> It should be augmented to provide storage tier information (optionally). 
> The sample report could be as follows :
> {code}
> Storage Tier Combination# of blocks   % of blocks
> DISK:1,ARCHIVE:2  340730   97.7393%
>  
> ARCHIVE:3   39281.1268%
>  
> DISK:2,ARCHIVE:231220.8956%
>  
> DISK:2,ARCHIVE:1 7480.2146%
>  
> DISK:1,ARCHIVE:3  440.0126%
>  
> DISK:3,ARCHIVE:2  300.0086%
>  
> DISK:3,ARCHIVE:1   90.0026%
> {code}
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7587) Edit log corruption can happen if append fails with a quota violation

2015-01-06 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-7587:
-
Assignee: Daryn Sharp

> Edit log corruption can happen if append fails with a quota violation
> -
>
> Key: HDFS-7587
> URL: https://issues.apache.org/jira/browse/HDFS-7587
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Daryn Sharp
>Priority: Blocker
>
> We have seen a standby namenode crashing due to edit log corruption. It was 
> complaining that {{OP_CLOSE}} cannot be applied because the file is not 
> under-construction.
> When a client was trying to append to the file, the remaining space quota was 
> very small. This caused a failure in {{prepareFileForWrite()}}, but only after 
> the inode had already been converted for writing and a lease had been added. 
> Since these were not undone when the quota violation was detected, the file was 
> left under construction with an active lease, without {{OP_ADD}} being edit-logged.
> A subsequent {{append()}} eventually caused a lease recovery after the soft 
> limit period. This resulted in {{commitBlockSynchronization()}}, which closed 
> the file with {{OP_CLOSE}} being logged.  Since there was no corresponding 
> {{OP_ADD}}, edit replaying could not apply this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7588) Improve the HDFS Web UI browser to allow chowning / chmoding, creating dirs and uploading files

2015-01-06 Thread Ravi Prakash (JIRA)
Ravi Prakash created HDFS-7588:
--

 Summary: Improve the HDFS Web UI browser to allow chowning / 
chmoding, creating dirs and uploading files
 Key: HDFS-7588
 URL: https://issues.apache.org/jira/browse/HDFS-7588
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ravi Prakash


The new HTML5 web browser is neat, however it lacks a few features that might 
make it more useful:
1. chown
2. chmod
3. Uploading files
4. mkdir



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7564) NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map

2015-01-06 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267042#comment-14267042
 ] 

Brandon Li commented on HDFS-7564:
--

+1. I will commit the patch soon.

> NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map
> 
>
> Key: HDFS-7564
> URL: https://issues.apache.org/jira/browse/HDFS-7564
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Affects Versions: 2.6.0
> Environment: HDP 2.2
>Reporter: Hari Sekhon
>Assignee: Yongjun Zhang
>Priority: Minor
> Attachments: HDFS-7564.001.patch, HDFS-7564.002.patch, 
> HDFS-7564.003.patch
>
>
> Add dynamic reload of the NFS gateway UID/GID mappings file /etc/nfs.map 
> (default for static.id.mapping.file).
> It seems that the mappings file is currently only read upon restart of the 
> NFS gateway which would cause any active clients NFS mount points to hang or 
> fail.
> Regards,
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7564) NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map

2015-01-06 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-7564:
-
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

> NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map
> 
>
> Key: HDFS-7564
> URL: https://issues.apache.org/jira/browse/HDFS-7564
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Affects Versions: 2.6.0
> Environment: HDP 2.2
>Reporter: Hari Sekhon
>Assignee: Yongjun Zhang
>Priority: Minor
> Attachments: HDFS-7564.001.patch, HDFS-7564.002.patch, 
> HDFS-7564.003.patch
>
>
> Add dynamic reload of the NFS gateway UID/GID mappings file /etc/nfs.map 
> (default for static.id.mapping.file).
> It seems that the mappings file is currently only read upon restart of the 
> NFS gateway which would cause any active clients NFS mount points to hang or 
> fail.
> Regards,
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7564) NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map

2015-01-06 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267052#comment-14267052
 ] 

Brandon Li commented on HDFS-7564:
--

I've committed the patch. Thank you, [~yzhangal], for the contribution!


> NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map
> 
>
> Key: HDFS-7564
> URL: https://issues.apache.org/jira/browse/HDFS-7564
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Affects Versions: 2.6.0
> Environment: HDP 2.2
>Reporter: Hari Sekhon
>Assignee: Yongjun Zhang
>Priority: Minor
> Attachments: HDFS-7564.001.patch, HDFS-7564.002.patch, 
> HDFS-7564.003.patch
>
>
> Add dynamic reload of the NFS gateway UID/GID mappings file /etc/nfs.map 
> (default for static.id.mapping.file).
> It seems that the mappings file is currently only read upon restart of the 
> NFS gateway which would cause any active clients NFS mount points to hang or 
> fail.
> Regards,
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7564) NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map

2015-01-06 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-7564:
-
Fix Version/s: 2.7.0

> NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map
> 
>
> Key: HDFS-7564
> URL: https://issues.apache.org/jira/browse/HDFS-7564
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Affects Versions: 2.6.0
> Environment: HDP 2.2
>Reporter: Hari Sekhon
>Assignee: Yongjun Zhang
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: HDFS-7564.001.patch, HDFS-7564.002.patch, 
> HDFS-7564.003.patch
>
>
> Add dynamic reload of the NFS gateway UID/GID mappings file /etc/nfs.map 
> (default for static.id.mapping.file).
> It seems that the mappings file is currently only read upon restart of the 
> NFS gateway which would cause any active clients NFS mount points to hang or 
> fail.
> Regards,
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7564) NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map

2015-01-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267058#comment-14267058
 ] 

Hudson commented on HDFS-7564:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6820 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6820/])
HDFS-7564. NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map. 
Contributed by Yongjun Zhang (brandonli: rev 
788ee35e2bf0f3d445e03e6ea9bd02c40c8fdfe3)
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedIdMapping.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestShellBasedIdMapping.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
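
For readers following along, a minimal sketch of how such a reload can work without restarting the gateway is below. It is illustrative only and not the committed ShellBasedIdMapping change; the class and method names are hypothetical, and it simply re-parses the mapping file whenever its modification time changes.

{code}
// Minimal sketch of mtime-based lazy reload of a static UID/GID mapping file.
// Illustrative only: IdMappingCache, parseStaticMap() and the field names are
// hypothetical, not the actual ShellBasedIdMapping implementation.
import java.io.File;
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class IdMappingCache {
  private final File staticMapFile = new File("/etc/nfs.map");
  private volatile long lastLoadedMtime = -1;
  private volatile Map<Integer, Integer> uidMap = new ConcurrentHashMap<>();

  /** Re-read the mapping file only when its modification time has changed. */
  private synchronized void reloadIfChanged() throws IOException {
    long mtime = staticMapFile.lastModified();
    if (mtime != lastLoadedMtime) {
      uidMap = parseStaticMap(staticMapFile);   // hypothetical parser
      lastLoadedMtime = mtime;
    }
  }

  public Integer mapUid(int remoteUid) throws IOException {
    reloadIfChanged();                          // lazy reload on each lookup
    return uidMap.get(remoteUid);
  }

  private Map<Integer, Integer> parseStaticMap(File f) throws IOException {
    // parse "uid <remote> <local>" style lines; omitted for brevity
    return new ConcurrentHashMap<>();
  }
}
{code}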


> NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map
> 
>
> Key: HDFS-7564
> URL: https://issues.apache.org/jira/browse/HDFS-7564
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Affects Versions: 2.6.0
> Environment: HDP 2.2
>Reporter: Hari Sekhon
>Assignee: Yongjun Zhang
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: HDFS-7564.001.patch, HDFS-7564.002.patch, 
> HDFS-7564.003.patch
>
>
> Add dynamic reload of the NFS gateway UID/GID mappings file /etc/nfs.map 
> (default for static.id.mapping.file).
> It seems that the mappings file is currently only read upon restart of the 
> NFS gateway which would cause any active clients NFS mount points to hang or 
> fail.
> Regards,
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7585) TestEnhancedByteBufferAccess hard code the block size

2015-01-06 Thread sam liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267151#comment-14267151
 ] 

sam liu commented on HDFS-7585:
---

Could someone please help review this patch? 

Thanks!

> TestEnhancedByteBufferAccess hard code the block size
> -
>
> Key: HDFS-7585
> URL: https://issues.apache.org/jira/browse/HDFS-7585
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.6.0
>Reporter: sam liu
>Assignee: sam liu
>Priority: Blocker
> Attachments: HDFS-7585.001.patch
>
>
> The test TestEnhancedByteBufferAccess hard code the block size, and it fails 
> with exceptions on power linux.
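
As background, POWER Linux commonly uses a 64 KB default page size rather than the 4 KB typical on x86, so sizes hard-coded around a 4 KB assumption end up misaligned for mmap-based zero-copy reads. A minimal sketch of deriving the test's block size from the OS page size follows; it assumes NativeIO.POSIX.getCacheManipulator().getOperatingSystemPageSize() is available in this Hadoop version, and is not the attached patch itself.

{code}
// Sketch: derive the test's block size from the OS page size instead of
// hard-coding a value that only matches the 4 KB x86 page size.
// Assumes NativeIO.POSIX.getCacheManipulator().getOperatingSystemPageSize()
// exists; any other reliable page-size source could be substituted.
import org.apache.hadoop.io.nativeio.NativeIO;

public class BlockSizeForTest {
  public static long blockSize() {
    long pageSize = NativeIO.POSIX.getCacheManipulator()
        .getOperatingSystemPageSize();   // e.g. 4096 on x86, 65536 on POWER
    return pageSize * 1024;              // a block of 1024 pages, always aligned
  }
}
{code}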



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7564) NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map

2015-01-06 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267198#comment-14267198
 ] 

Yongjun Zhang commented on HDFS-7564:
-

Many thanks Brandon!


> NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map
> 
>
> Key: HDFS-7564
> URL: https://issues.apache.org/jira/browse/HDFS-7564
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Affects Versions: 2.6.0
> Environment: HDP 2.2
>Reporter: Hari Sekhon
>Assignee: Yongjun Zhang
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: HDFS-7564.001.patch, HDFS-7564.002.patch, 
> HDFS-7564.003.patch
>
>
> Add dynamic reload of the NFS gateway UID/GID mappings file /etc/nfs.map 
> (default for static.id.mapping.file).
> It seems that the mappings file is currently only read upon restart of the 
> NFS gateway which would cause any active clients NFS mount points to hang or 
> fail.
> Regards,
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7467) Provide storage tier information for a directory via fsck

2015-01-06 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267222#comment-14267222
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7467:
---

> My recommendation is to display the policy along with the combination if 
> possible. ...

It is a good idea.  We should also consider a file's specified storage policy 
and its actual storage media.  If a file does not satisfy the specified policy, 
fsck should show that.  E.g. if the specified storage policy of file foo is hot 
but all the replicas are stored in ARCHIVE, it should not be counted as 
"frozen"; it should be counted as "ARCHIVE:3" in order to indicate that it does 
not satisfy the specified policy.
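
A rough sketch of how such a tally could look is below. It is illustrative only, with a simplified StorageType enum and a boolean policy check standing in for the real fsck logic; blocks that violate their policy are reported under the actual combination, as suggested above.

{code}
// Illustrative sketch of tallying storage-type combinations per block and
// flagging blocks that do not satisfy the file's specified storage policy.
// The types and the policy check are simplified; this is not the fsck code.
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TierTally {
  enum StorageType { DISK, ARCHIVE, SSD, RAM_DISK }

  private final Map<String, Long> combinationCounts = new HashMap<>();

  /** Record one block's replica storage types, e.g. [DISK, ARCHIVE, ARCHIVE]. */
  public void addBlock(List<StorageType> replicaTypes, boolean satisfiesPolicy) {
    Map<StorageType, Integer> perType = new HashMap<>();
    for (StorageType t : replicaTypes) {
      perType.merge(t, 1, Integer::sum);
    }
    List<String> parts = new ArrayList<>();
    for (Map.Entry<StorageType, Integer> e : perType.entrySet()) {
      parts.add(e.getKey() + ":" + e.getValue());
    }
    Collections.sort(parts);
    String key = String.join(",", parts);
    // A policy-violating block is counted under its actual combination
    // (e.g. "ARCHIVE:3"), not under the policy name.
    combinationCounts.merge(satisfiesPolicy ? key : key + " (policy violated)",
        1L, Long::sum);
  }

  public Map<String, Long> getCombinationCounts() {
    return combinationCounts;
  }
}
{code}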

> Provide storage tier information for a directory via fsck
> -
>
> Key: HDFS-7467
> URL: https://issues.apache.org/jira/browse/HDFS-7467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: Benoy Antony
>Assignee: Benoy Antony
> Attachments: HDFS-7467.patch
>
>
> Currently _fsck_  provides information regarding blocks for a directory.
> It should be augmented to provide storage tier information (optionally). 
> The sample report could be as follows :
> {code}
> Storage Tier Combination# of blocks   % of blocks
> DISK:1,ARCHIVE:2  340730   97.7393%
>  
> ARCHIVE:3   39281.1268%
>  
> DISK:2,ARCHIVE:231220.8956%
>  
> DISK:2,ARCHIVE:1 7480.2146%
>  
> DISK:1,ARCHIVE:3  440.0126%
>  
> DISK:3,ARCHIVE:2  300.0086%
>  
> DISK:3,ARCHIVE:1   90.0026%
> {code}
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7587) Edit log corruption can happen if append fails with a quota violation

2015-01-06 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267227#comment-14267227
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7587:
---

> ... . Daryn Sharp has suggested that the quota check be done before 
> converting inode/block. ...

Sounds good.  All the checks (quota, permission, etc.) should be performed 
before any change to the namespace.
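
To make the ordering concrete, a minimal sketch is below. The method and exception names are hypothetical stubs, not the NameNode code; the point is that every check runs before the inode is converted and a lease is added, so a failed check cannot leave an under-construction file without a logged OP_ADD.

{code}
// Sketch of the check-before-mutate ordering discussed above.
public class AppendOrderingSketch {
  static class QuotaExceededException extends Exception {}

  void prepareFileForAppend(String src) throws QuotaExceededException {
    checkPermission(src);           // may throw; nothing mutated yet
    verifyQuotaForAppend(src);      // may throw; nothing mutated yet

    // Only after every check passes do we mutate and log, together:
    convertToUnderConstruction(src);
    addLease(src);
    logOpAdd(src);
  }

  // Hypothetical stubs standing in for the real namesystem operations.
  void checkPermission(String src) {}
  void verifyQuotaForAppend(String src) throws QuotaExceededException {}
  void convertToUnderConstruction(String src) {}
  void addLease(String src) {}
  void logOpAdd(String src) {}
}
{code}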

> Edit log corruption can happen if append fails with a quota violation
> -
>
> Key: HDFS-7587
> URL: https://issues.apache.org/jira/browse/HDFS-7587
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Kihwal Lee
>Assignee: Daryn Sharp
>Priority: Blocker
>
> We have seen a standby namenode crashing due to edit log corruption. It was 
> complaining that {{OP_CLOSE}} cannot be applied because the file is not 
> under-construction.
> When a client was trying to append to the file, the remaining space quota was 
> very small. This caused a failure in {{prepareFileForWrite()}}, but after the 
> inode was already converted for writing and a lease added. Since these were 
> not undone when the quota violation was detected, the file was left in 
> under-construction with an active lease without edit logging {{OP_ADD}}.
> A subsequent {{append()}} eventually caused a lease recovery after the soft 
> limit period. This resulted in {{commitBlockSynchronization()}}, which closed 
> the file with {{OP_CLOSE}} being logged.  Since there was no corresponding 
> {{OP_ADD}}, edit replaying could not apply this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7587) Edit log corruption can happen if append fails with a quota violation

2015-01-06 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-7587:
--
Component/s: namenode

> Edit log corruption can happen if append fails with a quota violation
> -
>
> Key: HDFS-7587
> URL: https://issues.apache.org/jira/browse/HDFS-7587
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Kihwal Lee
>Assignee: Daryn Sharp
>Priority: Blocker
>
> We have seen a standby namenode crashing due to edit log corruption. It was 
> complaining that {{OP_CLOSE}} cannot be applied because the file is not 
> under-construction.
> When a client was trying to append to the file, the remaining space quota was 
> very small. This caused a failure in {{prepareFileForWrite()}}, but after the 
> inode was already converted for writing and a lease added. Since these were 
> not undone when the quota violation was detected, the file was left in 
> under-construction with an active lease without edit logging {{OP_ADD}}.
> A subsequent {{append()}} eventually caused a lease recovery after the soft 
> limit period. This resulted in {{commitBlockSynchronization()}}, which closed 
> the file with {{OP_CLOSE}} being logged.  Since there was no corresponding 
> {{OP_ADD}}, edit replaying could not apply this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7466) Allow different values for dfs.datanode.balance.max.concurrent.moves per datanode

2015-01-06 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267239#comment-14267239
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7466:
---

Benoy, thanks for showing the use case.

Approach 1 sounds good.  When should the mover/balancer contact the datanodes?  
How about contacting a datanode when dispatching a move, i.e. 
PendingMove.dispatch()?  In that way, the datanode queries are run in parallel 
and are executed on-demand.
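
A rough sketch of that on-demand lookup is below. getMaxConcurrentMovesFromDatanode() is a hypothetical query, not an existing DataNode protocol method; the idea is simply that each dispatch asks (and caches) the datanode's own limit instead of reading the balancer's configuration.

{code}
// Illustrative sketch of querying each datanode's own concurrent-move limit
// at dispatch time.  The per-datanode query is hypothetical.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PerDatanodeMoveLimit {
  private final Map<String, Integer> cachedLimits = new ConcurrentHashMap<>();

  /** Called from something like PendingMove.dispatch(); runs per move, in parallel. */
  public int limitFor(String datanodeId) {
    return cachedLimits.computeIfAbsent(datanodeId,
        this::getMaxConcurrentMovesFromDatanode);
  }

  // Hypothetical query; a real implementation would be an RPC to the datanode.
  private int getMaxConcurrentMovesFromDatanode(String datanodeId) {
    return 5;   // placeholder for the datanode's configured value
  }
}
{code}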

> Allow different values for dfs.datanode.balance.max.concurrent.moves per 
> datanode
> -
>
> Key: HDFS-7466
> URL: https://issues.apache.org/jira/browse/HDFS-7466
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: Benoy Antony
>Assignee: Benoy Antony
>
> It is possible to configure different values for  
> _dfs.datanode.balance.max.concurrent.moves_ per datanode.  But the value will 
> be used by balancer/mover which obtains the value from its own configuration. 
> The correct approach will be to obtain the value from the datanode itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5631) Expose interfaces required by FsDatasetSpi implementations

2015-01-06 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267241#comment-14267241
 ] 

Tsz Wo Nicholas Sze commented on HDFS-5631:
---

Changing ChunkChecksum and BlockMetadataHeader's readHeader() and writeHeader() 
to public sounds good.

For the test, how would someone add new methods to FsDatasetSpi or change the 
existing methods?  Are they supposed to update the test at the same time?

> Expose interfaces required by FsDatasetSpi implementations
> --
>
> Key: HDFS-5631
> URL: https://issues.apache.org/jira/browse/HDFS-5631
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 3.0.0
>Reporter: David Powell
>Assignee: David Powell
>Priority: Minor
> Attachments: HDFS-5631.patch, HDFS-5631.patch
>
>
> This sub-task addresses section 4.1 of the document attached to HDFS-5194,
> the exposure of interfaces needed by a FsDatasetSpi implementation.
> Specifically it makes ChunkChecksum public and BlockMetadataHeader's
> readHeader() and writeHeader() methods public.
> The changes to BlockReaderUtil (and related classes) discussed by section
> 4.1 are only needed if supporting short-circuit, and should be addressed
> as part of an effort to provide such support rather than this JIRA.
> To help ensure these changes are complete and are not regressed in the
> future, tests that gauge the accessibility (though *not* behavior)
> of interfaces needed by a FsDatasetSpi subclass are also included.
> These take the form of a dummy FsDatasetSpi subclass -- a successful
> compilation is effectively a pass.  Trivial unit tests are included so
> that there is something tangible to track.
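
To illustrate the "successful compilation is the test" idea, a minimal sketch follows. The interface here is a simplified stand-in, not the real FsDatasetSpi: the dummy subclass exists purely so that a visibility regression in the referenced members breaks the build.

{code}
// Sketch of an accessibility test: the dummy implementation only references
// members that must stay publicly accessible, so losing visibility on any of
// them makes this file stop compiling.  MiniDatasetSpi is a simplified
// stand-in for FsDatasetSpi.
public class AccessibilitySketch {
  interface MiniDatasetSpi {
    long getLength(String blockId);
  }

  static class DummyDataset implements MiniDatasetSpi {
    @Override
    public long getLength(String blockId) {
      return 0;
    }
  }

  // A trivial runnable check so there is something tangible to track.
  public static void main(String[] args) {
    assert new DummyDataset().getLength("blk_1") == 0;
  }
}
{code}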



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5631) Expose interfaces required by FsDatasetSpi implementations

2015-01-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267244#comment-14267244
 ] 

Hadoop QA commented on HDFS-5631:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12625158/HDFS-5631.patch
  against trunk revision 788ee35.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9149//console

This message is automatically generated.

> Expose interfaces required by FsDatasetSpi implementations
> --
>
> Key: HDFS-5631
> URL: https://issues.apache.org/jira/browse/HDFS-5631
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 3.0.0
>Reporter: David Powell
>Assignee: David Powell
>Priority: Minor
> Attachments: HDFS-5631.patch, HDFS-5631.patch
>
>
> This sub-task addresses section 4.1 of the document attached to HDFS-5194,
> the exposure of interfaces needed by a FsDatasetSpi implementation.
> Specifically it makes ChunkChecksum public and BlockMetadataHeader's
> readHeader() and writeHeader() methods public.
> The changes to BlockReaderUtil (and related classes) discussed by section
> 4.1 are only needed if supporting short-circuit, and should be addressed
> as part of an effort to provide such support rather than this JIRA.
> To help ensure these changes are complete and are not regressed in the
> future, tests that gauge the accessibility (though *not* behavior)
> of interfaces needed by a FsDatasetSpi subclass are also included.
> These take the form of a dummy FsDatasetSpi subclass -- a successful
> compilation is effectively a pass.  Trivial unit tests are included so
> that there is something tangible to track.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7565) NFS gateway UID overflow

2015-01-06 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267248#comment-14267248
 ] 

Yongjun Zhang commented on HDFS-7565:
-

HI [~harisekhon],

I assume you applied your own fix of HDFS-7563 and then saw this problem, 
right? What exactly does your static map file look like? (Would you please cat 
the file and paste it here?)

Would you please try "getent passwd hari", "getent passwd 10002", and "getent 
passwd <4B>" (where <4B> is the 4 billion number you are using) on the node 
that runs the NFS gateway, and share the results here?

Thanks.






> NFS gateway UID overflow
> 
>
> Key: HDFS-7565
> URL: https://issues.apache.org/jira/browse/HDFS-7565
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.6.0
> Environment: HDP 2.2 (Apache Hadoop 2.6.0)
>Reporter: Hari Sekhon
>Assignee: Yongjun Zhang
>
> It appears that my Windows 7 workstation is passing a UID around 4 billion to 
> the NFS gateway and the getUserName() method is being passed "-2", so it 
> looks like the UID is an int and is overflowing:
> {code}security.ShellBasedIdMapping 
> (ShellBasedIdMapping.java:getUserName(358)) - Can't find user name for uid 
> -2. Use default user name nobody{code}
> Regards,
> Hari Sekhon
> http://www.linkedin.com/in/harisekon
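
For context, the overflow described above can be reproduced in isolation: an unsigned 32-bit UID of 4294967294 does not fit in a signed Java int and reads back as -2, while reinterpreting the same bits as unsigned recovers the original value. A small demo, purely illustrative:

{code}
// Illustration of the reported overflow: 4294967294 stored in a signed int
// becomes -2; Integer.toUnsignedLong() recovers the original unsigned value.
public class UidOverflowDemo {
  public static void main(String[] args) {
    long unsignedUid = 4294967294L;            // what the client actually sent
    int asSignedInt = (int) unsignedUid;       // how a plain int stores it
    System.out.println(asSignedInt);                         // prints -2
    System.out.println(Integer.toUnsignedLong(asSignedInt)); // prints 4294967294
  }
}
{code}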



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-7360) Test libhdfs3 against MiniDFSCluster

2015-01-06 Thread Zhanwei Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhanwei Wang reassigned HDFS-7360:
--

Assignee: Zhanwei Wang

> Test libhdfs3 against MiniDFSCluster
> 
>
> Key: HDFS-7360
> URL: https://issues.apache.org/jira/browse/HDFS-7360
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Haohui Mai
>Assignee: Zhanwei Wang
>Priority: Critical
>
> Currently the branch has enough code to interact with HDFS servers. We should 
> test the code against MiniDFSCluster to ensure the correctness of the code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HDFS-7360) Test libhdfs3 against MiniDFSCluster

2015-01-06 Thread Zhanwei Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-7360 started by Zhanwei Wang.
--
> Test libhdfs3 against MiniDFSCluster
> 
>
> Key: HDFS-7360
> URL: https://issues.apache.org/jira/browse/HDFS-7360
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Haohui Mai
>Assignee: Zhanwei Wang
>Priority: Critical
>
> Currently the branch has enough code to interact with HDFS servers. We should 
> test the code against MiniDFSCluster to ensure the correctness of the code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7589) Break the dependency between libnative_mini_dfs and libhdfs

2015-01-06 Thread Zhanwei Wang (JIRA)
Zhanwei Wang created HDFS-7589:
--

 Summary: Break the dependency between libnative_mini_dfs and 
libhdfs
 Key: HDFS-7589
 URL: https://issues.apache.org/jira/browse/HDFS-7589
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Reporter: Zhanwei Wang


Currently libnative_mini_dfs links with libhdfs to reuse some common code. 
As a result, other applications which want to use libnative_mini_dfs also have 
to link against libhdfs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7589) Break the dependency between libnative_mini_dfs and libhdfs

2015-01-06 Thread Zhanwei Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhanwei Wang updated HDFS-7589:
---
Attachment: HDFS-7589.patch

> Break the dependency between libnative_mini_dfs and libhdfs
> ---
>
> Key: HDFS-7589
> URL: https://issues.apache.org/jira/browse/HDFS-7589
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Zhanwei Wang
> Attachments: HDFS-7589.patch
>
>
> Currently libnative_mini_dfs links with libhdfs to reuse some common code. 
> Other applications which want to use libnative_mini_dfs have to link to 
> libhdfs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HDFS-7589) Break the dependency between libnative_mini_dfs and libhdfs

2015-01-06 Thread Zhanwei Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-7589 started by Zhanwei Wang.
--
> Break the dependency between libnative_mini_dfs and libhdfs
> ---
>
> Key: HDFS-7589
> URL: https://issues.apache.org/jira/browse/HDFS-7589
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Zhanwei Wang
>Assignee: Zhanwei Wang
> Attachments: HDFS-7589.patch
>
>
> Currently libnative_mini_dfs links with libhdfs to reuse some common code. 
> Other applications which want to use libnative_mini_dfs have to link to 
> libhdfs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-7589) Break the dependency between libnative_mini_dfs and libhdfs

2015-01-06 Thread Zhanwei Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhanwei Wang reassigned HDFS-7589:
--

Assignee: Zhanwei Wang

> Break the dependency between libnative_mini_dfs and libhdfs
> ---
>
> Key: HDFS-7589
> URL: https://issues.apache.org/jira/browse/HDFS-7589
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Zhanwei Wang
>Assignee: Zhanwei Wang
> Attachments: HDFS-7589.patch
>
>
> Currently libnative_mini_dfs links with libhdfs to reuse some common code. 
> Other applications which want to use libnative_mini_dfs have to link to 
> libhdfs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7501) TransactionsSinceLastCheckpoint can be negative on SBNs

2015-01-06 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267346#comment-14267346
 ] 

Harsh J commented on HDFS-7501:
---

[~daryn] - Won't the metric lag on the Standby even if we were to correct it 
(for that metric) during checkpoints? Is a laggy metric OK to display 
(better than negatives, but still)?

> TransactionsSinceLastCheckpoint can be negative on SBNs
> ---
>
> Key: HDFS-7501
> URL: https://issues.apache.org/jira/browse/HDFS-7501
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.0
>Reporter: Harsh J
>Assignee: Gautam Gopalakrishnan
>Priority: Trivial
> Attachments: HDFS-7501-2.patch, HDFS-7501.patch
>
>
> The metric TransactionsSinceLastCheckpoint is derived as FSEditLog.txid minus 
> NNStorage.mostRecentCheckpointTxId.
> In Standby mode, the former does not increment beyond the loaded or 
> last-when-active value, but the latter does change due to checkpoints done 
> regularly in this mode. Thereby, the SBN will eventually end up showing 
> negative values for TransactionsSinceLastCheckpoint.
> This is not an issue as the metric only makes sense to be monitored on the 
> Active NameNode, but we should perhaps just show the value 0 by detecting if 
> the NN is in SBN form, as allowing a negative number is confusing to view 
> within a chart that tracks it.
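
A minimal sketch of the direction suggested in the description is below. The names are illustrative, not the actual FSNamesystem metric code: the metric is still the difference of the two transaction ids, but a standby NN (or a negative difference) reports 0 instead of a misleading negative value.

{code}
// Sketch: report 0 for TransactionsSinceLastCheckpoint on a standby NN, or
// clamp a negative difference, instead of exposing a negative metric.
public class CheckpointMetricSketch {
  private long lastWrittenTxId;            // analogous to FSEditLog.txid
  private long mostRecentCheckpointTxId;   // analogous to NNStorage's value
  private boolean inStandbyState;

  public long getTransactionsSinceLastCheckpoint() {
    if (inStandbyState) {
      return 0;   // the metric is only meaningful on the active NN
    }
    return Math.max(0, lastWrittenTxId - mostRecentCheckpointTxId);
  }
}
{code}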



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7589) Break the dependency between libnative_mini_dfs and libhdfs

2015-01-06 Thread Zhanwei Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhanwei Wang updated HDFS-7589:
---
Attachment: (was: HDFS-7589.patch)

> Break the dependency between libnative_mini_dfs and libhdfs
> ---
>
> Key: HDFS-7589
> URL: https://issues.apache.org/jira/browse/HDFS-7589
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Zhanwei Wang
>Assignee: Zhanwei Wang
> Attachments: HDFS-7589.patch
>
>
> Currently libnative_mini_dfs links with libhdfs to reuse some common code. 
> Other applications which want to use libnative_mini_dfs have to link to 
> libhdfs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7589) Break the dependency between libnative_mini_dfs and libhdfs

2015-01-06 Thread Zhanwei Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhanwei Wang updated HDFS-7589:
---
Attachment: HDFS-7589.patch

> Break the dependency between libnative_mini_dfs and libhdfs
> ---
>
> Key: HDFS-7589
> URL: https://issues.apache.org/jira/browse/HDFS-7589
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Zhanwei Wang
>Assignee: Zhanwei Wang
> Attachments: HDFS-7589.patch
>
>
> Currently libnative_mini_dfs links with libhdfs to reuse some common code. 
> Other applications which want to use libnative_mini_dfs have to link to 
> libhdfs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7360) Test libhdfs3 against MiniDFSCluster

2015-01-06 Thread Zhanwei Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhanwei Wang updated HDFS-7360:
---
Attachment: HDFS-7360.patch

> Test libhdfs3 against MiniDFSCluster
> 
>
> Key: HDFS-7360
> URL: https://issues.apache.org/jira/browse/HDFS-7360
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Haohui Mai
>Assignee: Zhanwei Wang
>Priority: Critical
> Attachments: HDFS-7360.patch
>
>
> Currently the branch has enough code to interact with HDFS servers. We should 
> test the code against MiniDFSCluster to ensure the correctness of the code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)