[jira] [Updated] (HDFS-8836) Skip newline on empty files with getMerge -nl

2015-12-14 Thread Kanaka Kumar Avvaru (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kanaka Kumar Avvaru updated HDFS-8836:
--
Attachment: HDFS-8836-07.patch

> Skip newline on empty files with getMerge -nl
> -
>
> Key: HDFS-8836
> URL: https://issues.apache.org/jira/browse/HDFS-8836
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.6.0, 2.7.1
>Reporter: Jan Filipiak
>Assignee: Kanaka Kumar Avvaru
>Priority: Trivial
> Attachments: HDFS-8836-01.patch, HDFS-8836-02.patch, 
> HDFS-8836-03.patch, HDFS-8836-04.patch, HDFS-8836-05.patch, 
> HDFS-8836-06.patch, HDFS-8836-07.patch
>
>
> Hello everyone,
> I recently needed to use the newline option -nl with getMerge because the
> files I needed to merge simply didn't end with one. I was merging all the
> files from one directory, and unfortunately this directory also included
> empty files, which effectively led to multiple newlines being appended after
> some files. I needed to remove them manually afterwards.
> In this situation it might be good to have another argument that allows
> skipping empty files.
> Things one could try to implement this feature:
> The call IOUtils.copyBytes(in, out, getConf(), false); doesn't return the
> number of bytes copied, which would be convenient, as one could skip
> appending the newline when 0 bytes were copied. Alternatively, one could
> check the file size beforehand.
> I posted this idea on the mailing list
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201507.mbox/%3C55B25140.3060005%40trivago.com%3E
>  but I didn't really get many responses, so I thought I might try this way.
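
The behaviour requested above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the attached patch: the class `GetMergeSketch`, the method `merge`, and the `skipEmpty` flag are hypothetical names, and in-memory byte arrays stand in for the files that getMerge would read from HDFS.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the getMerge copy loop; not the actual Hadoop code.
final class GetMergeSketch {

  // Concatenates the given file contents. When addNewline is true, a '\n' is
  // appended after each file, except that empty files are skipped entirely
  // (no bytes, no separator) when skipEmpty is set.
  static byte[] merge(List<byte[]> files, boolean addNewline, boolean skipEmpty) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (byte[] content : files) {
      if (skipEmpty && content.length == 0) {
        continue; // zero bytes to copy, so no newline either
      }
      out.write(content, 0, content.length);
      if (addNewline) {
        out.write('\n');
      }
    }
    return out.toByteArray();
  }

  public static void main(String[] args) {
    List<byte[]> files = Arrays.asList(
        "a".getBytes(StandardCharsets.UTF_8),
        new byte[0], // an empty file in the middle of the directory
        "b".getBytes(StandardCharsets.UTF_8));
    System.out.print(new String(merge(files, true, true), StandardCharsets.UTF_8));
  }
}
```

Checking the length up front corresponds to the "check the file size beforehand" option in the description; the other option, a copy routine that reports how many bytes it wrote, would lead to the same branch.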



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9493) Test o.a.h.hdfs.server.namenode.TestMetaSave fails in trunk

2015-12-14 Thread Tony Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057224#comment-15057224
 ] 

Tony Wu commented on HDFS-9493:
---

The failed tests are not related to the patch, as the patch only updated 
TestMetaSave.java with a new helper function that no other tests use.

> Test o.a.h.hdfs.server.namenode.TestMetaSave fails in trunk
> ---
>
> Key: HDFS-9493
> URL: https://issues.apache.org/jira/browse/HDFS-9493
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Mingliang Liu
>Assignee: Tony Wu
> Attachments: HDFS-9493.001.patch, HDFS-9493.002.patch
>
>
> Tested in both Gentoo Linux and Mac.
> {quote}
> ---
>  T E S T S
> ---
> Running org.apache.hadoop.hdfs.server.namenode.TestMetaSave
> Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 34.159 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.server.namenode.TestMetaSave
> testMetasaveAfterDelete(org.apache.hadoop.hdfs.server.namenode.TestMetaSave)  
> Time elapsed: 15.318 sec  <<< FAILURE!
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestMetaSave.testMetasaveAfterDelete(TestMetaSave.java:154)
> {quote}





[jira] [Commented] (HDFS-8562) HDFS Performance is impacted by FileInputStream Finalizer

2015-12-14 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057250#comment-15057250
 ] 

Kai Zheng commented on HDFS-8562:
-

The tricky thing about calling the internal method *FileChannelImpl.open()* is 
that its signature may differ between JDK versions. 
For JDK 7, see 
https://github.com/openjdk-mirror/jdk7u-jdk/blob/master/src/share/classes/sun/nio/ch/FileChannelImpl.java
 
For JDK 8, see 
http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/4a1e42601d61/src/share/classes/sun/nio/ch/FileChannelImpl.java
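
For comparison, the public `java.nio.channels.FileChannel.open(Path, OpenOption...)` entry point (available since Java 7) has a stable signature. The following is a minimal sketch, assuming the caller can supply a `Path` rather than a `FileInputStream`, which is not how `MappableBlock.load` is called today. The class `ChannelOpenSketch` and the method `mapReadOnly` are hypothetical names. No `FileInputStream` is allocated on this path, so no `FileInputStream` finalizer is registered for it.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper; it only shows the public channel API, not the HDFS patch.
final class ChannelOpenSketch {

  // Maps the first `length` bytes of the file read-only. The channel is
  // closed on return; a mapping stays valid after its channel is closed.
  static MappedByteBuffer mapReadOnly(Path file, long length) throws IOException {
    try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
      return channel.map(FileChannel.MapMode.READ_ONLY, 0, length);
    }
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("block", ".bin");
    tmp.toFile().deleteOnExit();
    Files.write(tmp, "hello".getBytes(StandardCharsets.UTF_8));
    MappedByteBuffer mmap = mapReadOnly(tmp, 5);
    System.out.println(mmap.remaining()); // number of mapped bytes
  }
}
```

Whether this fits the datanode code depends on how the block and meta files reach `load`, which this sketch does not address.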

> HDFS Performance is impacted by FileInputStream Finalizer
> -
>
> Key: HDFS-8562
> URL: https://issues.apache.org/jira/browse/HDFS-8562
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.5.0
> Environment: Impact any application that uses HDFS
>Reporter: Yanping Wang
> Attachments: HDFS-8562.002b.patch, HDFS-8562.01.patch
>
>
> While running HBase with HDFS datanodes, we noticed excessively high GC
> pause spikes. For example, with JDK 8 update 40 and the G1 collector, we saw
> datanode GC pauses spike toward 160 milliseconds when they should be around
> 20 milliseconds.
> We tracked this down in the GC logs and found that those long GC pauses were
> spent processing a high number of final references.
> For example, this Young GC:
> 2715.501: [GC pause (G1 Evacuation Pause) (young) 0.1529017 secs]
> 2715.572: [SoftReference, 0 refs, 0.0001034 secs]
> 2715.572: [WeakReference, 0 refs, 0.123 secs]
> 2715.572: [FinalReference, 8292 refs, 0.0748194 secs]
> 2715.647: [PhantomReference, 0 refs, 160 refs, 0.0001333 secs]
> 2715.647: [JNI Weak Reference, 0.140 secs]
> [Ref Proc: 122.3 ms]
> [Eden: 910.0M(910.0M)->0.0B(911.0M) Survivors: 11.0M->10.0M Heap: 951.1M(1536.0M)->40.2M(1536.0M)]
> [Times: user=0.47 sys=0.01, real=0.15 secs]
> This young GC caused a 152.9 millisecond STW pause, of which 122.3
> milliseconds were spent in Ref Proc, which processed 8292 FinalReferences in
> 74.8 milliseconds plus some overhead.
> We used JFR and JMAP with Memory Analyzer to track this down and found that
> those FinalReferences were all from FileInputStream.  We checked the HDFS
> code and saw the use of FileInputStream in the datanode:
> https://apache.googlesource.com/hadoop-common/+/refs/heads/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MappableBlock.java
> {code}
> public static MappableBlock load(long length,
>     FileInputStream blockIn, FileInputStream metaIn,
>     String blockFileName) throws IOException {
>   MappableBlock mappableBlock = null;
>   MappedByteBuffer mmap = null;
>   FileChannel blockChannel = null;
>   try {
>     blockChannel = blockIn.getChannel();
>     if (blockChannel == null) {
>       throw new IOException("Block InputStream has no FileChannel.");
>     }
>     mmap = blockChannel.map(MapMode.READ_ONLY, 0, length);
>     NativeIO.POSIX.getCacheManipulator().mlock(blockFileName, mmap, length);
>     verifyChecksum(length, metaIn, blockChannel, blockFileName);
>     mappableBlock = new MappableBlock(mmap, length);
>   } finally {
>     IOUtils.closeQuietly(blockChannel);
>     if (mappableBlock == null) {
>       if (mmap != null) {
>         NativeIO.POSIX.munmap(mmap); // unmapping also unlocks
>       }
>     }
>   }
>   return mappableBlock;
> }
> {code}
> We looked up
> https://docs.oracle.com/javase/7/docs/api/java/io/FileInputStream.html  and
> http://hg.openjdk.java.net/jdk7/jdk7/jdk/file/23bdcede4e39/src/share/classes/java/io/FileInputStream.java
>  and noticed that FileInputStream relies on the Finalizer to release its
> resource. When an instance of a class that has a finalizer is created, an
> entry for that instance is put on a queue in the JVM so the JVM knows it has
> a finalizer that needs to be executed.
> The current issue is: even when programmers do call close() after using
> FileInputStream, its finalize() method will still be called. In other words,
> we still get the side effect of the FinalReference being registered at
> FileInputStream allocation time, and also the reference processing to reclaim
> the FinalReference during GC (any GC solution has to deal with this).
> We can imagine that when running an industry deployment of HDFS, millions of
> files could be opened and closed, resulting in a very large number of
> finalizers being registered and subsequently executed.  That could cause very
> long GC pause times.
> We tried to use Files.newInputStream() to replace FileInputStream, but it was
> clear we could not replace FileInputStream in
> hdfs/server/datanode/fsdataset/impl/MappableBlock.java
> We notified the Oracle JVM team of this performance issue, which impacts all Big

[jira] [Commented] (HDFS-7661) Support read when a EC file is being written

2015-12-14 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057063#comment-15057063
 ] 

Zhe Zhang commented on HDFS-7661:
-

Thanks for the design Rui. I haven't finished reading it. But some quick 
comments below:
# Sorry I didn't get back to your above question in time. Yes I think we are 
already reading the correct visible length for EC files, assuming no hflush.
# I guess we should rename this JIRA to "Support hflush/hsync"? Should also 
consider merging it with HDFS-7691 ([~vinayrpet] are you still working on it?)

> Support read when a EC file is being written
> 
>
> Key: HDFS-7661
> URL: https://issues.apache.org/jira/browse/HDFS-7661
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Tsz Wo Nicholas Sze
>Assignee: GAO Rui
> Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, 
> HDFS-7661-unitTest-wip-trunk.patch, 
> HDFS-EC-file-flush-sync-design-version1.1.pdf
>
>
> We also need to support hflush/hsync and visible length. 





[jira] [Commented] (HDFS-9371) Code cleanup for DatanodeManager

2015-12-14 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057076#comment-15057076
 ] 

Jing Zhao commented on HDFS-9371:
-

The javac warning is about {{BlockManager#setBlockToken}}, which has not been 
touched by the patch. The failed unit test is also unrelated.

> Code cleanup for DatanodeManager
> 
>
> Key: HDFS-9371
> URL: https://issues.apache.org/jira/browse/HDFS-9371
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-9371.000.patch, HDFS-9371.001.patch, 
> HDFS-9371.002.patch, HDFS-9371.003.patch, HDFS-9371.004.patch
>
>
> Some code cleanup for DatanodeManager. The main changes include:
> # make the synchronization of {{datanodeMap}} and 
> {{datanodesSoftwareVersions}} consistent
> # remove unnecessary lock in {{handleHeartbeat}}





[jira] [Updated] (HDFS-8562) HDFS Performance is impacted by FileInputStream Finalizer

2015-12-14 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-8562:

Assignee: (was: Wei Zhou)

> HDFS Performance is impacted by FileInputStream Finalizer
> -
>
> Key: HDFS-8562
> URL: https://issues.apache.org/jira/browse/HDFS-8562
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.5.0
> Environment: Impact any application that uses HDFS
>Reporter: Yanping Wang
> Attachments: HDFS-8562.002b.patch, HDFS-8562.01.patch
>
>
> While running HBase with HDFS datanodes, we noticed excessively high GC
> pause spikes. For example, with JDK 8 update 40 and the G1 collector, we saw
> datanode GC pauses spike toward 160 milliseconds when they should be around
> 20 milliseconds.
> We tracked this down in the GC logs and found that those long GC pauses were
> spent processing a high number of final references.
> For example, this Young GC:
> 2715.501: [GC pause (G1 Evacuation Pause) (young) 0.1529017 secs]
> 2715.572: [SoftReference, 0 refs, 0.0001034 secs]
> 2715.572: [WeakReference, 0 refs, 0.123 secs]
> 2715.572: [FinalReference, 8292 refs, 0.0748194 secs]
> 2715.647: [PhantomReference, 0 refs, 160 refs, 0.0001333 secs]
> 2715.647: [JNI Weak Reference, 0.140 secs]
> [Ref Proc: 122.3 ms]
> [Eden: 910.0M(910.0M)->0.0B(911.0M) Survivors: 11.0M->10.0M Heap: 951.1M(1536.0M)->40.2M(1536.0M)]
> [Times: user=0.47 sys=0.01, real=0.15 secs]
> This young GC caused a 152.9 millisecond STW pause, of which 122.3
> milliseconds were spent in Ref Proc, which processed 8292 FinalReferences in
> 74.8 milliseconds plus some overhead.
> We used JFR and JMAP with Memory Analyzer to track this down and found that
> those FinalReferences were all from FileInputStream.  We checked the HDFS
> code and saw the use of FileInputStream in the datanode:
> https://apache.googlesource.com/hadoop-common/+/refs/heads/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MappableBlock.java
> {code}
> public static MappableBlock load(long length,
>     FileInputStream blockIn, FileInputStream metaIn,
>     String blockFileName) throws IOException {
>   MappableBlock mappableBlock = null;
>   MappedByteBuffer mmap = null;
>   FileChannel blockChannel = null;
>   try {
>     blockChannel = blockIn.getChannel();
>     if (blockChannel == null) {
>       throw new IOException("Block InputStream has no FileChannel.");
>     }
>     mmap = blockChannel.map(MapMode.READ_ONLY, 0, length);
>     NativeIO.POSIX.getCacheManipulator().mlock(blockFileName, mmap, length);
>     verifyChecksum(length, metaIn, blockChannel, blockFileName);
>     mappableBlock = new MappableBlock(mmap, length);
>   } finally {
>     IOUtils.closeQuietly(blockChannel);
>     if (mappableBlock == null) {
>       if (mmap != null) {
>         NativeIO.POSIX.munmap(mmap); // unmapping also unlocks
>       }
>     }
>   }
>   return mappableBlock;
> }
> {code}
> We looked up
> https://docs.oracle.com/javase/7/docs/api/java/io/FileInputStream.html  and
> http://hg.openjdk.java.net/jdk7/jdk7/jdk/file/23bdcede4e39/src/share/classes/java/io/FileInputStream.java
>  and noticed that FileInputStream relies on the Finalizer to release its
> resource. When an instance of a class that has a finalizer is created, an
> entry for that instance is put on a queue in the JVM so the JVM knows it has
> a finalizer that needs to be executed.
> The current issue is: even when programmers do call close() after using
> FileInputStream, its finalize() method will still be called. In other words,
> we still get the side effect of the FinalReference being registered at
> FileInputStream allocation time, and also the reference processing to reclaim
> the FinalReference during GC (any GC solution has to deal with this).
> We can imagine that when running an industry deployment of HDFS, millions of
> files could be opened and closed, resulting in a very large number of
> finalizers being registered and subsequently executed.  That could cause very
> long GC pause times.
> We tried to use Files.newInputStream() to replace FileInputStream, but it was
> clear we could not replace FileInputStream in
> hdfs/server/datanode/fsdataset/impl/MappableBlock.java
> We notified the Oracle JVM team of this performance issue, which impacts all
> Big Data applications using HDFS. We recommended a proper fix in Java SE
> FileInputStream, because (1) there is really nothing wrong with using
> FileInputStream in the above datanode code, and (2) as an object with a
> finalizer is registered with the finalizer list within the JVM at object
> allocation time, if someone makes an explicit call to close or free the
> resources that are to be
> done 

[jira] [Commented] (HDFS-8836) Skip newline on empty files with getMerge -nl

2015-12-14 Thread Kanaka Kumar Avvaru (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057225#comment-15057225
 ] 

Kanaka Kumar Avvaru commented on HDFS-8836:
---

OK, fine. Thanks for the comment [~ajisakaa]. Please review the updated patch. 
The only change is to make the field private.

> Skip newline on empty files with getMerge -nl
> -
>
> Key: HDFS-8836
> URL: https://issues.apache.org/jira/browse/HDFS-8836
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.6.0, 2.7.1
>Reporter: Jan Filipiak
>Assignee: Kanaka Kumar Avvaru
>Priority: Trivial
> Attachments: HDFS-8836-01.patch, HDFS-8836-02.patch, 
> HDFS-8836-03.patch, HDFS-8836-04.patch, HDFS-8836-05.patch, 
> HDFS-8836-06.patch, HDFS-8836-07.patch
>
>
> Hello everyone,
> I recently needed to use the newline option -nl with getMerge because the
> files I needed to merge simply didn't end with one. I was merging all the
> files from one directory, and unfortunately this directory also included
> empty files, which effectively led to multiple newlines being appended after
> some files. I needed to remove them manually afterwards.
> In this situation it might be good to have another argument that allows
> skipping empty files.
> Things one could try to implement this feature:
> The call IOUtils.copyBytes(in, out, getConf(), false); doesn't return the
> number of bytes copied, which would be convenient, as one could skip
> appending the newline when 0 bytes were copied. Alternatively, one could
> check the file size beforehand.
> I posted this idea on the mailing list
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201507.mbox/%3C55B25140.3060005%40trivago.com%3E
>  but I didn't really get many responses, so I thought I might try this way.





[jira] [Updated] (HDFS-8020) Erasure Coding: restore BlockGroup and schema info from stripping coding command

2015-12-14 Thread Kai Sasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Sasaki updated HDFS-8020:
-
Description: As a task of HDFS-7348, to process *stripping* coding commands 
from NameNode or other scheduler services/tools, we need to first be able to 
restore BlockGroup and schema information in DataNode, which will be used to 
construct and perform coding work using {{ErasureCoder}} API.  (was: As a task 
of HDFS-7344, to process *stripping* coding commands from NameNode or other 
scheduler services/tools, we need to first be able to restore BlockGroup and 
schema information in DataNode, which will be used to construct and perform 
coding work using {{ErasureCoder}} API.)

> Erasure Coding: restore BlockGroup and schema info from stripping coding 
> command
> 
>
> Key: HDFS-8020
> URL: https://issues.apache.org/jira/browse/HDFS-8020
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Sasaki
>
> As a task of HDFS-7348, to process *stripping* coding commands from NameNode 
> or other scheduler services/tools, we need to first be able to restore 
> BlockGroup and schema information in DataNode, which will be used to 
> construct and perform coding work using {{ErasureCoder}} API.





[jira] [Commented] (HDFS-9535) Newly completed blocks in IBR should not be considered under-replicated too quickly

2015-12-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057036#comment-15057036
 ] 

Hudson commented on HDFS-9535:
--

ABORTED: Integrated in Hadoop-Hdfs-trunk-Java8 #692 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/692/])
HDFS-9535. Newly completed blocks in IBR should not be considered (jing9: rev 
e53456981474d6e16e3c134e3777b3588dc6fedf)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Newly completed blocks in IBR should not be considered under-replicated too 
> quickly
> ---
>
> Key: HDFS-9535
> URL: https://issues.apache.org/jira/browse/HDFS-9535
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9535.000.patch, HDFS-9535.001.patch, 
> HDFS-9535.002.patch
>
>
> TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate failed in 
> several Jenkins runs (e.g., 
> https://builds.apache.org/job/PreCommit-HDFS-Build/13818/testReport/). The 
> failure is on the last {{assertNoReplicationWasPerformed}} check.
> This test failure reveals a scenario that HDFS-1172 missed: if a block is 
> first committed by the client, and then the first IBR comes to the NN, as 
> proposed by HDFS-1172, we should still put the remaining expected replicas 
> into the pending queue, instead of the under-replicated queue. Please see 
> [~liuml07]'s comment 
> [here|https://issues.apache.org/jira/browse/HDFS-9535?focusedCommentId=15052397=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15052397]
>  for more details.





[jira] [Commented] (HDFS-8860) Remove unused Replica copyOnWrite code

2015-12-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057038#comment-15057038
 ] 

Hudson commented on HDFS-8860:
--

ABORTED: Integrated in Hadoop-Hdfs-trunk-Java8 #692 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/692/])
Revert "Revert "HDFS-8860. Remove unused Replica copyOnWrite code (Lei (lei: 
rev de522d2cd46be13806d13aa5f373b310e0ad6693)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestDatanodeRestart.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ReplicaInfo.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ReplicaWaitingToBeRecovered.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ReplicaUnderRecovery.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FinalizedReplica.java


> Remove unused Replica copyOnWrite code
> --
>
> Key: HDFS-8860
> URL: https://issues.apache.org/jira/browse/HDFS-8860
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.8.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Fix For: 2.8.0
>
> Attachments: HDFS-8860.0.patch
>
>
> {{ReplicaInfo#unlinkBlock()}} is effectively disabled by the following code, 
> because {{isUnlinked()}} always returns true.
> {code}
> if (isUnlinked()) {
>   return false;
> }
> {code}
> Several test cases, e.g., {{TestFileAppend#testCopyOnWrite}} and 
> {{TestDatanodeRestart#testRecoverReplicas}} are testing against the unlink 
> Let's remove the relevant code to eliminate the confusion. 





[jira] [Commented] (HDFS-9281) Change TestDeleteBlockPool to not explicitly use File to check block pool existence.

2015-12-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057035#comment-15057035
 ] 

Hudson commented on HDFS-9281:
--

ABORTED: Integrated in Hadoop-Hdfs-trunk-Java8 #692 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/692/])
HDFS-9281. Change TestDeleteBlockPool to not explicitly use File to (lei: rev 
f229772f99d1751e6b2152b6e3ac9c9f7844c15d)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImplTestUtils.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDeleteBlockPool.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/FsDatasetTestUtils.java


> Change TestDeleteBlockPool to not explicitly use File to check block pool 
> existence.
> 
>
> Key: HDFS-9281
> URL: https://issues.apache.org/jira/browse/HDFS-9281
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Fix For: 3.0.0, 2.9.0
>
> Attachments: HDFS-9281.00.patch, HDFS-9281.02.patch, 
> HDFS-9281.03.patch, HDFS-9281.combo.00.patch
>
>
> {{TestDeleteBlockPool}} checks the existence of a block pool by checking that 
> the directories in the file-based block pool exist. However, this does not 
> apply to non-file-based fsdatasets. 
> We can fix it by abstracting the checking logic behind {{FsDatasetTestUtils}}.





[jira] [Commented] (HDFS-9038) DFS reserved space is erroneously counted towards non-DFS used.

2015-12-14 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057077#comment-15057077
 ] 

Arpit Agarwal commented on HDFS-9038:
-

[~brahma], thanks for your patience as we work through the math. It may be 
useful to describe your proposed derivation as a Jira comment before you post 
another patch.

[~cnauroth] I think you are right in conclusion but there is a misstep in the 
equations in [this 
comment|https://issues.apache.org/jira/browse/HDFS-9038?focusedCommentId=15051894=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15051894].
 {{reserved}} _should_ cancel out and it should not factor in the final 
computation.

The problem with the v005 patch we missed earlier is that getCapacity() 
subtracts reserved space. We should use the raw capacity.

One way to fix it is:
{code}
public long getNonDfsUsed() throws IOException {
  long totalFreeSpace = currentDir.getUsableSpace();
  long nonDfsUsed = getCapacity() + reserved - getDfsUsed() - totalFreeSpace;
  return (nonDfsUsed >= 0) ? nonDfsUsed : 0;
}
{code}

Then the derivation becomes:
{code}
1: non-DFS used = getCapacity() + reserved - getDfsUsed() - totalFreeSpace
2:              = usage.getCapacity() - reserved + reserved - getDfsUsed() - totalFreeSpace
3:              = usage.getCapacity() - getDfsUsed() - totalFreeSpace
4:              = File#getTotalSpace - getDfsUsed() - File#getFreeSpace
{code}

Hope that makes sense.
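
As a quick arithmetic check of the derivation, here is a small self-contained sketch with made-up numbers. The class `NonDfsUsedSketch` and its parameter names are hypothetical; the only assumption carried over from the comment is that `getCapacity()` returns the raw capacity minus the reserved space.

```java
// Hypothetical model of the proposed computation; the values are illustrative only.
final class NonDfsUsedSketch {

  // rawCapacity plays the role of usage.getCapacity() / File#getTotalSpace,
  // freeSpace the role of File#getFreeSpace.
  static long nonDfsUsed(long rawCapacity, long reserved, long dfsUsed, long freeSpace) {
    long capacity = rawCapacity - reserved; // assumed behaviour of getCapacity()
    long nonDfsUsed = capacity + reserved - dfsUsed - freeSpace;
    return Math.max(nonDfsUsed, 0); // same clamp as (nonDfsUsed >= 0) ? nonDfsUsed : 0
  }

  public static void main(String[] args) {
    // 1000 total, 100 reserved, 300 DFS used, 500 free
    System.out.println(nonDfsUsed(1000, 100, 300, 500)); // prints 200
    // Changing reserved does not change the result, since it cancels out.
    System.out.println(nonDfsUsed(1000, 0, 300, 500)); // prints 200
  }
}
```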

> DFS reserved space is erroneously counted towards non-DFS used.
> ---
>
> Key: HDFS-9038
> URL: https://issues.apache.org/jira/browse/HDFS-9038
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Chris Nauroth
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-9038-002.patch, HDFS-9038-003.patch, 
> HDFS-9038-004.patch, HDFS-9038-005.patch, HDFS-9038-006.patch, 
> HDFS-9038-007.patch, HDFS-9038.patch
>
>
> HDFS-5215 changed the DataNode volume available space calculation to consider 
> the reserved space held by the {{dfs.datanode.du.reserved}} configuration 
> property.  As a side effect, reserved space is now counted towards non-DFS 
> used.  I don't believe it was intentional to change the definition of non-DFS 
> used.  This issue proposes restoring the prior behavior: do not count 
> reserved space towards non-DFS used.





[jira] [Comment Edited] (HDFS-9038) DFS reserved space is erroneously counted towards non-DFS used.

2015-12-14 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057077#comment-15057077
 ] 

Arpit Agarwal edited comment on HDFS-9038 at 12/15/15 12:46 AM:


[~brahma], thanks for your patience as we work through the math. It may be 
useful to describe your proposed derivation as a Jira comment before you post 
another patch.

[~cnauroth] I think you are right in conclusion but there is a misstep in the 
equations in [this 
comment|https://issues.apache.org/jira/browse/HDFS-9038?focusedCommentId=15051894=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15051894].
 {{reserved}} _should_ cancel out and it should not factor in the final 
computation.

The problem with the v005 patch we missed earlier is that getCapacity() 
subtracts reserved space. We should use the raw capacity.

One way to fix it is:
{code}
public long getNonDfsUsed() throws IOException {
  long totalFreeSpace = currentDir.getFreeSpace();
  long nonDfsUsed = getCapacity() + reserved - getDfsUsed() - totalFreeSpace;
  return (nonDfsUsed >= 0) ? nonDfsUsed : 0;
}
{code}

Then the derivation becomes:
{code}
1: non-DFS used = getCapacity() + reserved - getDfsUsed() - totalFreeSpace
2:              = usage.getCapacity() - reserved + reserved - getDfsUsed() - totalFreeSpace
3:              = usage.getCapacity() - getDfsUsed() - totalFreeSpace
4:              = File#getTotalSpace - getDfsUsed() - File#getFreeSpace
{code}

Hope that makes sense.


was (Author: arpitagarwal):
[~brahma], thanks for your patience as we work through the math. It may be 
useful to describe your proposed derivation as a Jira comment before you post 
another patch.

[~cnauroth] I think you are right in conclusion but there is a misstep in the 
equations in [this 
comment|https://issues.apache.org/jira/browse/HDFS-9038?focusedCommentId=15051894=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15051894].
 {{reserved}} _should_ cancel out and it should not factor in the final 
computation.

The problem with the v005 patch we missed earlier is that getCapacity() 
subtracts reserved space. We should use the raw capacity.

One way to fix it is:
{code}
public long getNonDfsUsed() throws IOException {
  long totalFreeSpace = currentDir.getUsableSpace();
  long nonDfsUsed = getCapacity() + reserved - getDfsUsed() - totalFreeSpace;
  return (nonDfsUsed >= 0) ? nonDfsUsed : 0;
}
{code}

Then the derivation becomes:
{code}
1: non-DFS used = getCapacity() + reserved - getDfsUsed() - totalFreeSpace
2:              = usage.getCapacity() - reserved + reserved - getDfsUsed() - totalFreeSpace
3:              = usage.getCapacity() - getDfsUsed() - totalFreeSpace
4:              = File#getTotalSpace - getDfsUsed() - File#getFreeSpace
{code}

Hope that makes sense.

> DFS reserved space is erroneously counted towards non-DFS used.
> ---
>
> Key: HDFS-9038
> URL: https://issues.apache.org/jira/browse/HDFS-9038
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Chris Nauroth
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-9038-002.patch, HDFS-9038-003.patch, 
> HDFS-9038-004.patch, HDFS-9038-005.patch, HDFS-9038-006.patch, 
> HDFS-9038-007.patch, HDFS-9038.patch
>
>
> HDFS-5215 changed the DataNode volume available space calculation to consider 
> the reserved space held by the {{dfs.datanode.du.reserved}} configuration 
> property.  As a side effect, reserved space is now counted towards non-DFS 
> used.  I don't believe it was intentional to change the definition of non-DFS 
> used.  This issue proposes restoring the prior behavior: do not count 
> reserved space towards non-DFS used.





[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )

2015-12-14 Thread GAO Rui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

GAO Rui updated HDFS-9494:
--
Status: In Progress  (was: Patch Available)

> Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
> 
>
> Key: HDFS-9494
> URL: https://issues.apache.org/jira/browse/HDFS-9494
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: GAO Rui
>Assignee: GAO Rui
>Priority: Minor
> Attachments: HDFS-9494-origin-trunk.00.patch, 
> HDFS-9494-origin-trunk.01.patch
>
>
> Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and 
> wait for flushInternal( ) in sequence. So the runtime flow is like:
> {code}
> Streamer0#flushInternal( )
> Streamer0#waitForAckedSeqno( )
> Streamer1#flushInternal( )
> Streamer1#waitForAckedSeqno( )
> …
> Streamer8#flushInternal( )
> Streamer8#waitForAckedSeqno( )
> {code}
> It could be better to trigger all the streamers to flushInternal( ) and
> wait for all of them to return from waitForAckedSeqno( ),  and then 
> flushAllInternals( ) returns.
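
The ordering change proposed in the description can be sketched as follows. This is only an illustration under stated assumptions: the `Streamer` interface and the recording fakes are hypothetical stand-ins, not the real HDFS streamer classes, and the sketch assumes that triggering a flush does not block while waiting for the ack does.

```java
import java.util.ArrayList;
import java.util.List;

class ParallelFlushSketch {
    /** Hypothetical stand-in for a streamer; not the real HDFS class. */
    interface Streamer {
        long flushInternal();                // trigger a flush, return the seqno to wait for
        void waitForAckedSeqno(long seqno);  // block until that seqno is acked
    }

    /** Flow described as current: flush and wait, one streamer at a time. */
    static void flushSequentially(List<Streamer> streamers) {
        for (Streamer s : streamers) {
            s.waitForAckedSeqno(s.flushInternal());
        }
    }

    /** Proposed flow: trigger every flush first, then wait for every ack. */
    static void flushOverlapped(List<Streamer> streamers) {
        long[] seqnos = new long[streamers.size()];
        for (int i = 0; i < seqnos.length; i++) {
            seqnos[i] = streamers.get(i).flushInternal();
        }
        for (int i = 0; i < seqnos.length; i++) {
            streamers.get(i).waitForAckedSeqno(seqnos[i]);
        }
    }

    /** Builds n fake streamers that record each call into the given log. */
    static List<Streamer> recordingStreamers(int n, final List<String> log) {
        List<Streamer> result = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            final int id = i;
            result.add(new Streamer() {
                public long flushInternal() { log.add("flush" + id); return id; }
                public void waitForAckedSeqno(long seqno) { log.add("wait" + seqno); }
            });
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        flushOverlapped(recordingStreamers(3, log));
        System.out.println(log); // [flush0, flush1, flush2, wait0, wait1, wait2]
    }
}
```

In the overlapped variant the waits for different streamers can overlap in time, so the total latency is closer to the slowest streamer than to the sum of all of them. No extra threads are needed as long as the flush trigger itself does not block.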





[jira] [Resolved] (HDFS-8860) Remove unused Replica copyOnWrite code

2015-12-14 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu resolved HDFS-8860.
-
Resolution: Fixed

I had a discussion with [~cmccabe] offline and learned that 
{{ReplicaInfo#unlinkBlock}} was designed for the append workload before 
HDFS-1700. It was not designed to remove the hardlinks created by a {{DN}} 
upgrade.

Since the code that created hardlinks when appending to a file is gone, the 
patch is still valid as a removal of dead code.



> Remove unused Replica copyOnWrite code
> --
>
> Key: HDFS-8860
> URL: https://issues.apache.org/jira/browse/HDFS-8860
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.8.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Fix For: 2.8.0
>
> Attachments: HDFS-8860.0.patch
>
>
> {{ReplicaInfo#unlinkBlock()}} is effectively disabled by the following code, 
> because {{isUnlinked()}} always returns true.
> {code}
> if (isUnlinked()) {
>   return false;
> }
> {code}
> Several test cases, e.g., {{TestFileAppend#testCopyOnWrite}} and 
> {{TestDatanodeRestart#testRecoverReplicas}} are testing against the unlink 
> Let's remove the relevant code to eliminate the confusion. 





[jira] [Updated] (HDFS-8562) HDFS Performance is impacted by FileInputStream Finalizer

2015-12-14 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-8562:
---
Attachment: HDFS-8562.002.patch

> HDFS Performance is impacted by FileInputStream Finalizer
> -
>
> Key: HDFS-8562
> URL: https://issues.apache.org/jira/browse/HDFS-8562
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.5.0
> Environment: Impact any application that uses HDFS
>Reporter: Yanping Wang
>Assignee: Wei Zhou
> Attachments: HDFS-8562.002.patch, HDFS-8562.01.patch
>
>
> While running HBase on top of HDFS, we noticed excessively high GC pause 
> spikes on the datanodes. For example, with JDK 8 update 40 and the G1 
> collector, we saw datanode GC pauses spike toward 160 milliseconds when they 
> should be around 20 milliseconds. 
> We tracked this down in the GC logs and found that those long GC pauses were 
> spent processing a high number of final references. 
> For example, this young GC:
> 2715.501: [GC pause (G1 Evacuation Pause) (young) 0.1529017 secs]
> 2715.572: [SoftReference, 0 refs, 0.0001034 secs]
> 2715.572: [WeakReference, 0 refs, 0.123 secs]
> 2715.572: [FinalReference, 8292 refs, 0.0748194 secs]
> 2715.647: [PhantomReference, 0 refs, 160 refs, 0.0001333 secs]
> 2715.647: [JNI Weak Reference, 0.140 secs]
> [Ref Proc: 122.3 ms]
> [Eden: 910.0M(910.0M)->0.0B(911.0M) Survivors: 11.0M->10.0M Heap: 
> 951.1M(1536.0M)->40.2M(1536.0M)]
> [Times: user=0.47 sys=0.01, real=0.15 secs]
> This young GC caused a 152.9 millisecond STW pause, of which 122.3 
> milliseconds were spent in Ref Proc, which processed 8292 FinalReferences in 
> 74.8 milliseconds plus some overhead.
> We used JFR and jmap with Memory Analyzer to track this down and found that 
> those FinalReferences all came from FileInputStream.  We checked the HDFS code 
> and saw the use of FileInputStream in the datanode:
> https://apache.googlesource.com/hadoop-common/+/refs/heads/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MappableBlock.java
> {code}
> public static MappableBlock load(long length,
>     FileInputStream blockIn, FileInputStream metaIn,
>     String blockFileName) throws IOException {
>   MappableBlock mappableBlock = null;
>   MappedByteBuffer mmap = null;
>   FileChannel blockChannel = null;
>   try {
>     blockChannel = blockIn.getChannel();
>     if (blockChannel == null) {
>       throw new IOException("Block InputStream has no FileChannel.");
>     }
>     mmap = blockChannel.map(MapMode.READ_ONLY, 0, length);
>     NativeIO.POSIX.getCacheManipulator().mlock(blockFileName, mmap, length);
>     verifyChecksum(length, metaIn, blockChannel, blockFileName);
>     mappableBlock = new MappableBlock(mmap, length);
>   } finally {
>     IOUtils.closeQuietly(blockChannel);
>     if (mappableBlock == null) {
>       if (mmap != null) {
>         NativeIO.POSIX.munmap(mmap); // unmapping also unlocks
>       }
>     }
>   }
>   return mappableBlock;
> }
> {code}
> We looked up 
> https://docs.oracle.com/javase/7/docs/api/java/io/FileInputStream.html  and
> http://hg.openjdk.java.net/jdk7/jdk7/jdk/file/23bdcede4e39/src/share/classes/java/io/FileInputStream.java
>  and noticed that FileInputStream relies on the finalizer to release its 
> resource. When an instance of a class that has a finalizer is created, an 
> entry for that instance is put on a queue in the JVM so the JVM knows it has 
> a finalizer that needs to be executed.   
> The current issue is: even when programmers do call close() after using 
> FileInputStream, its finalize() method will still be called. In other words, 
> we still get the side effect of the FinalReference being registered at 
> FileInputStream allocation time, and also the reference processing needed to 
> reclaim the FinalReference during GC (any GC solution has to deal with this). 
> We can imagine that when running HDFS in an industry deployment, millions of 
> files could be opened and closed, resulting in a very large number of 
> finalizers being registered and subsequently executed.  That could cause very 
> long GC pause times.
> We tried to use Files.newInputStream() to replace FileInputStream, but it was 
> clear we could not replace FileInputStream in 
> hdfs/server/datanode/fsdataset/impl/MappableBlock.java 
> We notified the Oracle JVM team of this performance issue, which impacts all 
> Big Data applications using HDFS. We recommended a proper fix in Java SE 
> FileInputStream, because (1) there is really nothing wrong with using 
> FileInputStream in the above datanode code, (2) as the object with a finalizer 
> is registered with the finalizer list within the JVM at object allocation time, if 
> someone makes an explicit call 

[jira] [Commented] (HDFS-8562) HDFS Performance is impacted by FileInputStream Finalizer

2015-12-14 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057187#comment-15057187
 ] 

Kai Zheng commented on HDFS-8562:
-

Note the patch doesn't compile for Java 7 in the places using 
FileChannelImpl.open().
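
For comparison, the public NIO entry point {{java.nio.channels.FileChannel.open(Path, OpenOption...)}} has been part of the JDK since Java 7 and returns a channel that is not tied to a {{FileInputStream}} object. The following is a minimal sketch of mapping a file read-only through it. Whether this call suits the patch is an assumption; the class and helper names are hypothetical and nothing here is taken from the attached patches.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChannelOpenSketch {
    // Opens the file through the public NIO API, so no FileInputStream
    // instance is allocated just to obtain the channel.
    static byte[] readViaMappedChannel(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mmap =
                channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            byte[] data = new byte[mmap.remaining()];
            mmap.get(data);
            return data;
        }
    }

    // Hypothetical helper for trying the sketch: writes the content to a
    // temporary file and reads it back through the mapped channel.
    static byte[] roundTrip(byte[] content) {
        try {
            Path file = Files.createTempFile("channel-open-sketch", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, content);
            return readViaMappedChannel(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new byte[] {1, 2, 3}).length); // prints 3
    }
}
```

The sketch only shows the API shape. It does not measure GC behavior, and it leaves out the mlock and checksum steps of {{MappableBlock#load}}.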

> HDFS Performance is impacted by FileInputStream Finalizer
> -
>
> Key: HDFS-8562
> URL: https://issues.apache.org/jira/browse/HDFS-8562
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.5.0
> Environment: Impact any application that uses HDFS
>Reporter: Yanping Wang
> Attachments: HDFS-8562.002b.patch, HDFS-8562.01.patch
>
>

[jira] [Commented] (HDFS-8562) HDFS Performance is impacted by FileInputStream Finalizer

2015-12-14 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057031#comment-15057031
 ] 

Kai Zheng commented on HDFS-8562:
-

Thanks Colin for providing the formal patch. It looks like a good rationale to 
focus just on the key path and limit the scope. Please take it accordingly if 
you'd like to. Thanks.

> HDFS Performance is impacted by FileInputStream Finalizer
> -
>
> Key: HDFS-8562
> URL: https://issues.apache.org/jira/browse/HDFS-8562
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.5.0
> Environment: Impact any application that uses HDFS
>Reporter: Yanping Wang
> Attachments: HDFS-8562.002b.patch, HDFS-8562.01.patch
>
>

[jira] [Commented] (HDFS-8562) HDFS Performance is impacted by FileInputStream Finalizer

2015-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057094#comment-15057094
 ] 

Hadoop QA commented on HDFS-8562:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 59s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 16s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 40s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 42s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 59s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 28s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 28s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 24s {color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 24s {color} | {color:red} hadoop-hdfs-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 30s {color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 37s {color} | {color:red} root in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 37s {color} | {color:red} root in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 39s {color} | {color:red} root in the patch failed with JDK v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 39s {color} | {color:red} root in the patch failed with JDK v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 2s {color} | {color:red} Patch generated 18 new checkstyle issues in root (total was 251, now 266). {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 26s {color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 26s {color} | {color:red} hadoop-hdfs-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 31s {color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 40s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 23s {color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 24s {color} | {color:red} hadoop-hdfs-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 29s {color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 31s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 7m 52s {color} | {color:red} hadoop-common-project_hadoop-common-jdk1.7.0_91 with JDK v1.7.0_91 generated 1 new issues (was 13, now 14). {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 7m 52s {color} | {color:red} hadoop-hdfs-project_hadoop-hdfs-client-jdk1.7.0_91 with JDK v1.7.0_91 generated 1 new issues (was 1, now 2). {color} |
| 

[jira] [Commented] (HDFS-9038) DFS reserved space is erroneously counted towards non-DFS used.

2015-12-14 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056966#comment-15056966
 ] 

Chris Nauroth commented on HDFS-9038:
-

bq. So if actual non-dfs usage is within this reserved limit, metric should 
show 0. right?

Yes, that's my understanding, and that's the behavior pre-HDFS-5215.

bq. In that case, wouldn't it be better to rename the 'NonDfsUsage' metric to 
'UnexpectedNonDfsUsage' to make it more clear? Or mention somewhere to clear 
confusion in people like me.

I don't think it could be renamed easily due to backwards-compatibility, but I 
do think we could update Metrics.md.






[jira] [Updated] (HDFS-8562) HDFS Performance is impacted by FileInputStream Finalizer

2015-12-14 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-8562:
---
Status: Patch Available  (was: Open)

> HDFS Performance is impacted by FileInputStream Finalizer
> -
>
> Key: HDFS-8562
> URL: https://issues.apache.org/jira/browse/HDFS-8562
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.5.0
> Environment: Impact any application that uses HDFS
>Reporter: Yanping Wang
>Assignee: Wei Zhou
> Attachments: HDFS-8562.002b.patch, HDFS-8562.01.patch
>
>

[jira] [Updated] (HDFS-8562) HDFS Performance is impacted by FileInputStream Finalizer

2015-12-14 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-8562:
---
Attachment: HDFS-8562.002b.patch

> HDFS Performance is impacted by FileInputStream Finalizer
> -
>
> Key: HDFS-8562
> URL: https://issues.apache.org/jira/browse/HDFS-8562
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.5.0
> Environment: Impact any application that uses HDFS
>Reporter: Yanping Wang
>Assignee: Wei Zhou
> Attachments: HDFS-8562.002b.patch, HDFS-8562.01.patch
>
>

[jira] [Updated] (HDFS-8562) HDFS Performance is impacted by FileInputStream Finalizer

2015-12-14 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-8562:
---
Attachment: (was: HDFS-8562.002.patch)

> HDFS Performance is impacted by FileInputStream Finalizer
> -
>
> Key: HDFS-8562
> URL: https://issues.apache.org/jira/browse/HDFS-8562
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.5.0
> Environment: Impact any application that uses HDFS
>Reporter: Yanping Wang
>Assignee: Wei Zhou
> Attachments: HDFS-8562.002b.patch, HDFS-8562.01.patch
>
>
> While running HBase on HDFS datanodes, we noticed excessively high GC 
> pause spikes. For example, with JDK 8 update 40 and the G1 collector, we saw 
> datanode GC pauses spike toward 160 milliseconds when they should be around 
> 20 milliseconds.
> We dug into the GC logs and found that those long GC pauses were devoted to 
> processing a high number of final references.
> For example, this Young GC:
> 2715.501: [GC pause (G1 Evacuation Pause) (young) 0.1529017 secs]
> 2715.572: [SoftReference, 0 refs, 0.0001034 secs]
> 2715.572: [WeakReference, 0 refs, 0.123 secs]
> 2715.572: [FinalReference, 8292 refs, 0.0748194 secs]
> 2715.647: [PhantomReference, 0 refs, 160 refs, 0.0001333 secs]
> 2715.647: [JNI Weak Reference, 0.140 secs]
> [Ref Proc: 122.3 ms]
> [Eden: 910.0M(910.0M)->0.0B(911.0M) Survivors: 11.0M->10.0M Heap: 
> 951.1M(1536.0M)->40.2M(1536.0M)]
> [Times: user=0.47 sys=0.01, real=0.15 secs]
> This young GC caused a 152.9 millisecond STW pause, of which 122.3 
> milliseconds were spent in Ref Proc, which processed 8292 FinalReferences in 
> 74.8 milliseconds plus some overhead.
> We used JFR and jmap with Memory Analyzer to track this down and found that 
> those FinalReferences were all from FileInputStream.  We checked the HDFS code 
> and saw the use of FileInputStream in the datanode:
> https://apache.googlesource.com/hadoop-common/+/refs/heads/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MappableBlock.java
> {code}
> public static MappableBlock load(long length,
>     FileInputStream blockIn, FileInputStream metaIn,
>     String blockFileName) throws IOException {
>   MappableBlock mappableBlock = null;
>   MappedByteBuffer mmap = null;
>   FileChannel blockChannel = null;
>   try {
>     blockChannel = blockIn.getChannel();
>     if (blockChannel == null) {
>       throw new IOException("Block InputStream has no FileChannel.");
>     }
>     mmap = blockChannel.map(MapMode.READ_ONLY, 0, length);
>     NativeIO.POSIX.getCacheManipulator().mlock(blockFileName, mmap, length);
>     verifyChecksum(length, metaIn, blockChannel, blockFileName);
>     mappableBlock = new MappableBlock(mmap, length);
>   } finally {
>     IOUtils.closeQuietly(blockChannel);
>     if (mappableBlock == null) {
>       if (mmap != null) {
>         NativeIO.POSIX.munmap(mmap); // unmapping also unlocks
>       }
>     }
>   }
>   return mappableBlock;
> }
> {code}
> We looked up 
> https://docs.oracle.com/javase/7/docs/api/java/io/FileInputStream.html  and
> http://hg.openjdk.java.net/jdk7/jdk7/jdk/file/23bdcede4e39/src/share/classes/java/io/FileInputStream.java
>  and noticed that FileInputStream relies on a finalizer to release its 
> resources. When an instance of a class that has a finalizer is created, an 
> entry for that instance is put on a queue in the JVM so the JVM knows it has 
> a finalizer that needs to be executed.
> The current issue is: even when programmers do call close() after using 
> FileInputStream, its finalize() method will still be called. In other words, 
> we still get the side effect of the FinalReference being registered at 
> FileInputStream allocation time, and also the reference processing needed to 
> reclaim the FinalReference during GC (any GC solution has to deal with this).
> In an industrial HDFS deployment, millions of files could be opened and 
> closed, which results in a very large number of finalizers being registered 
> and subsequently executed.  That could cause very long GC pause times.
> We tried to use Files.newInputStream() to replace FileInputStream, but it was 
> clear we could not replace FileInputStream in 
> hdfs/server/datanode/fsdataset/impl/MappableBlock.java.
> We notified the Oracle JVM team of this performance issue, which impacts all 
> Big Data applications using HDFS. We recommended a proper fix in Java SE 
> FileInputStream, because (1) there is really nothing wrong with using 
> FileInputStream in the datanode code above, and (2) as an object with a 
> finalizer is registered with the finalizer list within the JVM at object 
> allocation time, if 
> someone makes an 
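
The HDFS-8562 description above mentions Files.newInputStream() as an alternative to FileInputStream. The following is a minimal sketch of that idea, not the HDFS patch: it reads a file through Files.newInputStream() and memory-maps it through FileChannel.open(), so no FileInputStream object is allocated. The class and method names (FinalizerFreeRead, sumViaStream, sumViaMmap) are hypothetical, and the assumption that these NIO paths avoid finalizer registration should be verified against the JDK in use.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class for illustration; not part of HDFS.
final class FinalizerFreeRead {

  /** Sums all bytes of a file read through Files.newInputStream(). */
  static long sumViaStream(Path file) throws IOException {
    long sum = 0;
    try (InputStream in = Files.newInputStream(file)) {
      int b;
      while ((b = in.read()) != -1) {
        sum += b; // read() returns the byte as 0..255
      }
    }
    return sum;
  }

  /** Sums all bytes of a file through a read-only memory mapping. */
  static long sumViaMmap(Path file) throws IOException {
    // FileChannel.open() yields a channel without going through FileInputStream.
    try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
      MappedByteBuffer mmap =
          ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
      long sum = 0;
      while (mmap.hasRemaining()) {
        sum += mmap.get() & 0xFF; // treat each byte as unsigned
      }
      return sum;
    }
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("block", ".bin");
    try {
      Files.write(tmp, new byte[] {1, 2, 3, 4});
      System.out.println(sumViaStream(tmp));
      System.out.println(sumViaMmap(tmp));
    } finally {
      Files.deleteIfExists(tmp);
    }
  }
}
```

Callers that already receive a FileInputStream from elsewhere, as MappableBlock.load() does, cannot adopt this without changing their signatures, which may be part of why the reporters found the replacement impractical there.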

[jira] [Commented] (HDFS-9493) Test o.a.h.hdfs.server.namenode.TestMetaSave fails in trunk

2015-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057017#comment-15057017
 ] 

Hadoop QA commented on HDFS-9493:
-

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| +1 | mvninstall | 7m 47s | trunk passed |
| +1 | compile | 0m 41s | trunk passed with JDK v1.8.0_66 |
| +1 | compile | 0m 41s | trunk passed with JDK v1.7.0_91 |
| +1 | checkstyle | 0m 16s | trunk passed |
| +1 | mvnsite | 0m 51s | trunk passed |
| +1 | mvneclipse | 0m 14s | trunk passed |
| +1 | findbugs | 1m 52s | trunk passed |
| +1 | javadoc | 1m 6s | trunk passed with JDK v1.8.0_66 |
| +1 | javadoc | 1m 53s | trunk passed with JDK v1.7.0_91 |
| +1 | mvninstall | 0m 48s | the patch passed |
| +1 | compile | 0m 39s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 0m 39s | the patch passed |
| +1 | compile | 0m 43s | the patch passed with JDK v1.7.0_91 |
| +1 | javac | 0m 43s | the patch passed |
| +1 | checkstyle | 0m 16s | the patch passed |
| +1 | mvnsite | 0m 51s | the patch passed |
| +1 | mvneclipse | 0m 13s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 2m 0s | the patch passed |
| +1 | javadoc | 1m 5s | the patch passed with JDK v1.8.0_66 |
| +1 | javadoc | 1m 46s | the patch passed with JDK v1.7.0_91 |
| -1 | unit | 51m 49s | hadoop-hdfs in the patch failed with JDK v1.8.0_66. |
| -1 | unit | 50m 59s | hadoop-hdfs in the patch failed with JDK v1.7.0_91. |
| -1 | asflicense | 0m 19s | Patch generated 58 ASF License warnings. |
|   |   | 129m 37s |   |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes |
| JDK v1.7.0_91 Failed junit tests | hadoop.hdfs.server.datanode.TestBpServiceActorScheduler |
|   | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots |
|   | hadoop.hdfs.server.datanode.TestBlockReplacement |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12777591/HDFS-9493.002.patch |
| JIRA Issue | HDFS-9493 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux 814118cbff0b 3.13.0-36-lowlatency #63-Ubuntu SMP 

[jira] [Commented] (HDFS-9173) Erasure Coding: Lease recovery for striped file

2015-12-14 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057058#comment-15057058
 ] 

Jing Zhao commented on HDFS-9173:
-

bq. newLocs and newStorages should be constructed in the same loop as rurList. 
Something like below.

I took another look at the patch. Actually, in the first loop we still need 
to connect to all the datanodes for the recovery. We cannot assume that all 
the recoveries will succeed and prepare the new locations right away. Instead, 
we may also need to update the current patch to handle possible failures from 
{{callInitReplicaRecovery}}.

> Erasure Coding: Lease recovery for striped file
> ---
>
> Key: HDFS-9173
> URL: https://issues.apache.org/jira/browse/HDFS-9173
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Walter Su
>Assignee: Walter Su
> Attachments: HDFS-9173.00.wip.patch, HDFS-9173.01.patch, 
> HDFS-9173.02.step125.patch, HDFS-9173.03.patch, HDFS-9173.04.patch, 
> HDFS-9173.05.patch, HDFS-9173.06.patch, HDFS-9173.07.patch
>
>






[jira] [Work started] (HDFS-9521) TransferFsImage.receiveFile should account and log separate times for image download and fsync to disk

2015-12-14 Thread Wellington Chevreuil (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-9521 started by Wellington Chevreuil.
--
> TransferFsImage.receiveFile should account and log separate times for image 
> download and fsync to disk 
> ---
>
> Key: HDFS-9521
> URL: https://issues.apache.org/jira/browse/HDFS-9521
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Attachments: HDFS-9521.patch
>
>
> Currently, TransferFsImage.receiveFile is logging total transfer time as 
> below:
> {noformat}
> double xferSec = Math.max(
>     ((float)(Time.monotonicNow() - startTime)) / 1000.0, 0.001);
> long xferKb = received / 1024;
> LOG.info(String.format("Transfer took %.2fs at %.2f KB/s",
>     xferSec, xferKb / xferSec));
> {noformat}
> This is really useful, but it just measures the total method execution time, 
> which includes time taken to download the image and do an fsync to all the 
> namenode metadata directories.
> Sometimes, when troubleshooting these image transfer problems, it is 
> useful to know which part of the process is the bottleneck 
> (network or disk write).
> This patch accounts time for image download and fsync to each disk 
> separately, logging how much time each operation took.
>  
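
The HDFS-9521 description above proposes timing the image download and the fsync separately. A minimal sketch of that approach follows. It is not the attached patch: the class and method names (TransferTimingSketch, timeMillis, describe) and the log format are hypothetical, and System.nanoTime() stands in for Hadoop's Time.monotonicNow().

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names are illustrative, not from the HDFS-9521 patch.
final class TransferTimingSketch {

  /** Runs one phase and returns its elapsed wall-clock time in milliseconds. */
  static long timeMillis(Runnable phase) {
    long start = System.nanoTime();
    phase.run();
    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
  }

  /** Formats separate download and fsync timings for a log line. */
  static String describe(long receivedBytes, long downloadMs, long fsyncMs) {
    // Clamp to 1 ms, as the existing code does, to avoid dividing by zero.
    double downloadSec = Math.max(downloadMs / 1000.0, 0.001);
    double fsyncSec = Math.max(fsyncMs / 1000.0, 0.001);
    long xferKb = receivedBytes / 1024;
    return String.format(Locale.ROOT,
        "Download took %.2fs at %.2f KB/s; fsync took %.2fs",
        downloadSec, xferKb / downloadSec, fsyncSec);
  }

  public static void main(String[] args) {
    long downloadMs = timeMillis(() -> { /* stand-in for the image download */ });
    long fsyncMs = timeMillis(() -> { /* stand-in for fsync to one directory */ });
    System.out.println(describe(10 * 1024, downloadMs, fsyncMs));
  }
}
```

With one fsync measurement per metadata directory, a slow disk would show up as a single large fsync figure while the download figure stays unchanged.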





[jira] [Updated] (HDFS-9521) TransferFsImage.receiveFile should account and log separate times for image download and fsync to disk

2015-12-14 Thread Wellington Chevreuil (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HDFS-9521:
---
Status: Patch Available  (was: In Progress)

Initial patch version has been submitted.

> TransferFsImage.receiveFile should account and log separate times for image 
> download and fsync to disk 
> ---
>
> Key: HDFS-9521
> URL: https://issues.apache.org/jira/browse/HDFS-9521
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Attachments: HDFS-9521.patch
>
>
> Currently, TransferFsImage.receiveFile is logging total transfer time as 
> below:
> {noformat}
> double xferSec = Math.max(
>     ((float)(Time.monotonicNow() - startTime)) / 1000.0, 0.001);
> long xferKb = received / 1024;
> LOG.info(String.format("Transfer took %.2fs at %.2f KB/s",
>     xferSec, xferKb / xferSec));
> {noformat}
> This is really useful, but it just measures the total method execution time, 
> which includes time taken to download the image and do an fsync to all the 
> namenode metadata directories.
> Sometimes, when troubleshooting these image transfer problems, it is 
> useful to know which part of the process is the bottleneck 
> (network or disk write).
> This patch accounts time for image download and fsync to each disk 
> separately, logging how much time each operation took.
>  





[jira] [Commented] (HDFS-9555) LazyPersistFileScrubber should still sleep if there are errors in the clear progress

2015-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055859#comment-15055859
 ] 

Hadoop QA commented on HDFS-9555:
-

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 8m 3s | trunk passed |
| +1 | compile | 0m 43s | trunk passed with JDK v1.8.0_66 |
| +1 | compile | 0m 43s | trunk passed with JDK v1.7.0_91 |
| +1 | checkstyle | 0m 17s | trunk passed |
| +1 | mvnsite | 0m 57s | trunk passed |
| +1 | mvneclipse | 0m 14s | trunk passed |
| +1 | findbugs | 2m 2s | trunk passed |
| +1 | javadoc | 1m 10s | trunk passed with JDK v1.8.0_66 |
| +1 | javadoc | 1m 52s | trunk passed with JDK v1.7.0_91 |
| +1 | mvninstall | 0m 51s | the patch passed |
| +1 | compile | 0m 42s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 0m 42s | the patch passed |
| +1 | compile | 0m 45s | the patch passed with JDK v1.7.0_91 |
| +1 | javac | 0m 45s | the patch passed |
| -1 | checkstyle | 0m 16s | Patch generated 1 new checkstyle issues in hadoop-hdfs-project/hadoop-hdfs (total was 193, now 193). |
| +1 | mvnsite | 0m 55s | the patch passed |
| +1 | mvneclipse | 0m 15s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 2m 9s | the patch passed |
| +1 | javadoc | 1m 10s | the patch passed with JDK v1.8.0_66 |
| +1 | javadoc | 1m 53s | the patch passed with JDK v1.7.0_91 |
| -1 | unit | 55m 50s | hadoop-hdfs in the patch failed with JDK v1.8.0_66. |
| -1 | unit | 60m 39s | hadoop-hdfs in the patch failed with JDK v1.7.0_91. |
| -1 | asflicense | 0m 19s | Patch generated 56 ASF License warnings. |
|   |   | 144m 26s |   |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | hadoop.hdfs.server.datanode.TestBlockScanner |
|   | hadoop.hdfs.server.blockmanagement.TestReplicationPolicy |
|   | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
| JDK v1.7.0_91 Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.hdfs.server.blockmanagement.TestComputeInvalidateWork |
|   | 

[jira] [Updated] (HDFS-9537) libhdfs++: implement HDFSConfiguration class

2015-12-14 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-9537:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> libhdfs++: implement HDFSConfiguration class
> 
>
> Key: HDFS-9537
> URL: https://issues.apache.org/jira/browse/HDFS-9537
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9537.HDFS-8707.000.patch, 
> HDFS-9537.HDFS-8707.001.patch, HDFS-9537.HDFS-8707.002.patch, 
> HDFS-9537.HDFS-8707.003.patch, HDFS-9537.HDFS-8707.004.patch
>
>
> Create a class to encode the rules for interpreting a Configuration class to 
> create a libhdfs++ Options object





[jira] [Commented] (HDFS-9537) libhdfs++: implement HDFSConfiguration class

2015-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056079#comment-15056079
 ] 

Hadoop QA commented on HDFS-9537:
-

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 13m 1s | HDFS-8707 passed |
| +1 | compile | 3m 47s | HDFS-8707 passed with JDK v1.8.0_66 |
| +1 | compile | 3m 49s | HDFS-8707 passed with JDK v1.7.0_91 |
| +1 | mvnsite | 0m 18s | HDFS-8707 passed |
| +1 | mvneclipse | 0m 16s | HDFS-8707 passed |
| +1 | mvninstall | 0m 13s | the patch passed |
| +1 | compile | 3m 56s | the patch passed with JDK v1.8.0_66 |
| +1 | cc | 3m 56s | the patch passed |
| +1 | javac | 3m 56s | the patch passed |
| +1 | compile | 3m 56s | the patch passed with JDK v1.7.0_91 |
| +1 | cc | 3m 55s | the patch passed |
| +1 | javac | 3m 55s | the patch passed |
| +1 | mvnsite | 0m 15s | the patch passed |
| +1 | mvneclipse | 0m 12s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| -1 | unit | 4m 15s | hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_66. |
| -1 | unit | 4m 13s | hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_91. |
| +1 | asflicense | 0m 24s | Patch does not generate ASF License warnings. |
|   |   | 55m 59s |   |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12777468/HDFS-9537.HDFS-8707.004.patch |
| JIRA Issue | HDFS-9537 |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux ccacb29bf242 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-8707 / 7ad9b77 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/13859/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_66.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/13859/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.7.0_91.txt |
| JDK v1.7.0_91  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/13859/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: hadoop-hdfs-project/hadoop-hdfs-native-client |
| Max memory used | 78MB |
| Powered by | Apache Yetus 0.1.0   http://yetus.apache.org |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/13859/console |


This message was automatically generated.



> libhdfs++: implement HDFSConfiguration class
> 

[jira] [Updated] (HDFS-9533) seen_txid in the shared edits directory is modified during bootstrapping

2015-12-14 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-9533:
-
Status: Patch Available  (was: Open)

> seen_txid in the shared edits directory is modified during bootstrapping
> 
>
> Key: HDFS-9533
> URL: https://issues.apache.org/jira/browse/HDFS-9533
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, namenode
>Affects Versions: 2.6.0
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Attachments: HDFS-9533.patch
>
>
> The last known transaction id is stored in the seen_txid file of all known 
> directories of a NNStorage when starting a new edit segment. However, we have 
> seen a case where it contains an id that falls in the middle of an edit 
> segment. This was the seen_txid file in the shared edits directory.  The 
> active namenode's local storage contained a valid-looking seen_txid.





[jira] [Commented] (HDFS-9537) libhdfs++: implement HDFSConfiguration class

2015-12-14 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056217#comment-15056217
 ] 

James Clampffer commented on HDFS-9537:
---

+1, thanks for fixing the whitespace issues.  

Since the only thing that changed between HDFS-9537.HDFS-8707.002.patch and 
HDFS-9537.HDFS-8707.004.patch was whitespace, and HDFS-9537.HDFS-8707.002.patch 
has been up for a couple of days, I'm going to commit now.  Thanks for the patch 
[~bobthansen]!

> libhdfs++: implement HDFSConfiguration class
> 
>
> Key: HDFS-9537
> URL: https://issues.apache.org/jira/browse/HDFS-9537
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9537.HDFS-8707.000.patch, 
> HDFS-9537.HDFS-8707.001.patch, HDFS-9537.HDFS-8707.002.patch, 
> HDFS-9537.HDFS-8707.003.patch, HDFS-9537.HDFS-8707.004.patch
>
>
> Create a class to encode the rules for interpreting a Configuration class to 
> create a libhdfs++ Options object





[jira] [Commented] (HDFS-8836) Skip newline on empty files with getMerge -nl

2015-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057304#comment-15057304
 ] 

Hadoop QA commented on HDFS-8836:
-

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| +1 | mvninstall | 8m 38s | trunk passed |
| +1 | compile | 9m 45s | trunk passed with JDK v1.8.0_66 |
| +1 | compile | 10m 10s | trunk passed with JDK v1.7.0_91 |
| +1 | checkstyle | 0m 18s | trunk passed |
| +1 | mvnsite | 1m 16s | trunk passed |
| +1 | mvneclipse | 0m 15s | trunk passed |
| +1 | findbugs | 2m 12s | trunk passed |
| +1 | javadoc | 1m 0s | trunk passed with JDK v1.8.0_66 |
| +1 | javadoc | 1m 9s | trunk passed with JDK v1.7.0_91 |
| +1 | mvninstall | 1m 43s | the patch passed |
| +1 | compile | 9m 47s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 9m 47s | the patch passed |
| +1 | compile | 10m 8s | the patch passed with JDK v1.7.0_91 |
| +1 | javac | 10m 8s | the patch passed |
| +1 | checkstyle | 0m 19s | the patch passed |
| +1 | mvnsite | 1m 13s | the patch passed |
| +1 | mvneclipse | 0m 16s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 2m 24s | the patch passed |
| +1 | javadoc | 0m 58s | the patch passed with JDK v1.8.0_66 |
| +1 | javadoc | 1m 10s | the patch passed with JDK v1.7.0_91 |
| -1 | unit | 8m 41s | hadoop-common in the patch failed with JDK v1.8.0_66. |
| -1 | unit | 7m 50s | hadoop-common in the patch failed with JDK v1.7.0_91. |
| +1 | asflicense | 0m 22s | Patch does not generate ASF License warnings. |
|   |   | 80m 42s |   |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | hadoop.metrics2.impl.TestGangliaMetrics |
| JDK v1.7.0_91 Failed junit tests | hadoop.metrics2.impl.TestGangliaMetrics |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12777652/HDFS-8836-07.patch |
| JIRA Issue | HDFS-8836 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux 111a1f3ee02e 3.13.0-36-lowlatency 

[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )

2015-12-14 Thread GAO Rui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

GAO Rui updated HDFS-9494:
--
Status: Patch Available  (was: In Progress)

> Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
> 
>
> Key: HDFS-9494
> URL: https://issues.apache.org/jira/browse/HDFS-9494
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: GAO Rui
>Assignee: GAO Rui
>Priority: Minor
> Attachments: HDFS-9494-origin-trunk.00.patch, 
> HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch
>
>
> Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and 
> wait for flushInternal( ) in sequence. So the runtime flow is like:
> {code}
> Streamer0#flushInternal( )
> Streamer0#waitForAckedSeqno( )
> Streamer1#flushInternal( )
> Streamer1#waitForAckedSeqno( )
> …
> Streamer8#flushInternal( )
> Streamer8#waitForAckedSeqno( )
> {code}
> It could be better to trigger all the streamers to flushInternal( ) and
> wait for all of them to return from waitForAckedSeqno( ),  and then 
> flushAllInternals( ) returns.





[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )

2015-12-14 Thread GAO Rui (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057377#comment-15057377
 ] 

GAO Rui commented on HDFS-9494:
---

[~rakeshr], thank you very much for your detailed comments! I have addressed 
points 1, 3 and 4 in my previous comment. For the second one, I drafted the 
following code:
{code}
for (int i = 0; i < healthyStreamerCount; i++) {
  try {
    executorCompletionService.take().get();
  } catch (InterruptedException ie) {
    throw DFSUtilClient.toInterruptedIOException(
        "Interrupted during waiting all streamer flush. ", ie);
  } catch (ExecutionException ee) {
    LOG.warn("Caught ExecutionException during waiting all streamer " +
        "flush", ee);
  }
}
{code}
I think it should be enough for handling {{ExecutionException}}. What do you 
think?
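
For reference, here is a minimal standalone sketch of the submit-all-then-wait 
pattern discussed here, using only java.util.concurrent. The class name 
ParallelFlushSketch, the method flushAll and the Callable tasks standing in for 
the per-streamer flush are hypothetical. This is not the code from the patch, 
and it counts failures instead of logging them so the behaviour is easy to 
check:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ParallelFlushSketch {

  /**
   * Submits every flush task first, then waits for all of them.
   * Returns how many tasks ended with an ExecutionException.
   */
  static int flushAll(List<Callable<Void>> flushTasks, ExecutorService executor)
      throws InterruptedException {
    CompletionService<Void> completionService =
        new ExecutorCompletionService<>(executor);

    // Phase 1: trigger all tasks; count one only after submit() returned.
    int submitted = 0;
    for (Callable<Void> task : flushTasks) {
      completionService.submit(task);
      submitted++;
    }

    // Phase 2: wait for exactly as many completions as were submitted.
    int failed = 0;
    for (int i = 0; i < submitted; i++) {
      try {
        completionService.take().get();
      } catch (ExecutionException ee) {
        // The task itself threw; a real caller would log or handle ee.getCause().
        failed++;
      }
    }
    return failed;
  }

  public static void main(String[] args) throws Exception {
    ExecutorService executor = Executors.newFixedThreadPool(4);
    try {
      List<Callable<Void>> tasks = new ArrayList<>();
      for (int i = 0; i < 9; i++) {
        final int id = i;
        tasks.add(() -> {
          if (id == 3) {
            throw new IOException("streamer " + id + " failed to flush");
          }
          return null;
        });
      }
      System.out.println("failed flushes: " + flushAll(tasks, executor));
    } finally {
      executor.shutdown();
    }
  }
}
```

Counting a submission only after submit() returns keeps the wait loop from 
blocking on a task that was never queued.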

> Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
> 
>
> Key: HDFS-9494
> URL: https://issues.apache.org/jira/browse/HDFS-9494
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: GAO Rui
>Assignee: GAO Rui
>Priority: Minor
> Attachments: HDFS-9494-origin-trunk.00.patch, 
> HDFS-9494-origin-trunk.01.patch
>
>
> Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and 
> wait for flushInternal( ) in sequence. So the runtime flow is like:
> {code}
> Streamer0#flushInternal( )
> Streamer0#waitForAckedSeqno( )
> Streamer1#flushInternal( )
> Streamer1#waitForAckedSeqno( )
> …
> Streamer8#flushInternal( )
> Streamer8#waitForAckedSeqno( )
> {code}
> It could be better to trigger all the streamers to flushInternal( ) and
> wait for all of them to return from waitForAckedSeqno( ),  and then 
> flushAllInternals( ) returns.





[jira] [Commented] (HDFS-9038) DFS reserved space is erroneously counted towards non-DFS used.

2015-12-14 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057367#comment-15057367
 ] 

Vinayakumar B commented on HDFS-9038:
-

[~arpitagarwal], thanks for jumping in on the calculation.

Considering that we use {{File#getUsableSpace}} instead of 
{{File#getFreeSpace}}, I think the current code after HDFS-5215 also comes to 
the same equation as mentioned by [~arpitagarwal]. Brahma's patch doesn't 
change anything in this, except that it also subtracts 'reservedForReplicas'.

But I think what [~cnauroth] expects is: 
*Since {{reserved}} is hidden from HDFS in {{getCapacity()}} itself, we can 
treat it as already used for {{nonDfs}} and subtract it from the actual on-disk 
{{nonDfsUsage}}, showing only the HDFS-visible {{nonDfsUsage}}*. The 
{{nonDfsUsage}} metric will be positive only if the on-disk nonDfsUsage exceeds 
{{reserved}}, and it shows only the excess usage beyond {{reserved}}.
i.e. nonDfsUsage will be {{2GB}} instead of {{3GB}} in my earlier example, 
where {{1GB}} was reserved.

So the final equation would be {code}nonDfsUsage=File#getTotalSpace - dfsUsed - 
File#getUsableSpace - reserved{code}

The code would look like this:
{code}
  public long getNonDfsUsed() throws IOException {
    long nonDfsUsed = getCapacity() - getDfsUsed() - getAvailable() - reserved
        - getReservedForReplicas();
    return (nonDfsUsed > 0) ? nonDfsUsed : 0;
  }
{code}
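
As a quick check of the arithmetic, here is a small sketch of the equation 
above in plain Java. The class and method names and the disk sizes are made up 
for illustration (the sizes are chosen so that on-disk non-DFS usage is 3GB 
with 1GB reserved). reservedForReplicas is left out so that the sketch matches 
the equation rather than the snippet:

```java
class NonDfsUsedSketch {

  static final long GB = 1L << 30;

  /**
   * nonDfsUsage = totalSpace - dfsUsed - usableSpace - reserved,
   * clamped at zero so the metric never goes negative.
   */
  static long nonDfsUsed(long totalSpace, long dfsUsed, long usableSpace,
      long reserved) {
    long nonDfsUsed = totalSpace - dfsUsed - usableSpace - reserved;
    return Math.max(nonDfsUsed, 0L);
  }

  public static void main(String[] args) {
    // Hypothetical volume: 10GB total, 4GB DFS data, 3GB still usable,
    // so 3GB on disk is non-DFS data; 1GB of that is covered by reserved.
    long reported = nonDfsUsed(10 * GB, 4 * GB, 3 * GB, 1 * GB);
    System.out.println("reported nonDfsUsed (GB): " + reported / GB);

    // On-disk non-DFS usage (1GB here) below reserved (2GB) reports zero.
    long clamped = nonDfsUsed(10 * GB, 4 * GB, 5 * GB, 2 * GB);
    System.out.println("clamped nonDfsUsed (GB): " + clamped / GB);
  }
}
```

With these numbers the first call reports 2GB rather than 3GB, and the second 
call is clamped to 0 because the non-DFS data fits inside the reserved space.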

Am I right [~cnauroth]?

> DFS reserved space is erroneously counted towards non-DFS used.
> ---
>
> Key: HDFS-9038
> URL: https://issues.apache.org/jira/browse/HDFS-9038
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Chris Nauroth
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-9038-002.patch, HDFS-9038-003.patch, 
> HDFS-9038-004.patch, HDFS-9038-005.patch, HDFS-9038-006.patch, 
> HDFS-9038-007.patch, HDFS-9038.patch
>
>
> HDFS-5215 changed the DataNode volume available space calculation to consider 
> the reserved space held by the {{dfs.datanode.du.reserved}} configuration 
> property.  As a side effect, reserved space is now counted towards non-DFS 
> used.  I don't believe it was intentional to change the definition of non-DFS 
> used.  This issue proposes restoring the prior behavior: do not count 
> reserved space towards non-DFS used.





[jira] [Commented] (HDFS-7344) [umbrella] Erasure Coding worker and support in DataNode

2015-12-14 Thread GAO Rui (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057515#comment-15057515
 ] 

GAO Rui commented on HDFS-7344:
---

[~zhz] Thank you very much for the information. I am trying to draft a design 
and then implement the {{Converter}} mentioned in HDFS-7717. I think we should 
design it so that {{Converter}} distributes conversion tasks to several 
{{ErasureCodingWorker}} instances, which then actually carry out the 
conversion tasks. Could you share your opinion? If Erasure Coding is put into 
production clusters, {{Converter}} should be among the most used functions, 
since lots of replicated files would be converted to EC files. Without 
{{Converter}}, we can only use distcp. {{Converter}} should be much more 
efficient, right? 

> [umbrella] Erasure Coding worker and support in DataNode
> 
>
> Key: HDFS-7344
> URL: https://issues.apache.org/jira/browse/HDFS-7344
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Kai Zheng
>Assignee: Li Bo
> Attachments: ECWorker-design-v2.pdf, HDFS ECWorker Design.pdf, 
> hdfs-ec-datanode.0108.zip, hdfs-ec-datanode.0108.zip
>
>
> According to HDFS-7285 and the design, this handles DataNode side extension 
> and related support for Erasure Coding. More specifically, it implements 
> {{ECWorker}}, which reconstructs lost blocks (in striping layout).
> It generally needs to restore BlockGroup and schema information from coding 
> commands from NameNode or other entities, and construct specific coding work 
> to execute. The required block reader, writer, either local or remote, 
> encoder and decoder, will be implemented separately as sub-tasks. 
> This JIRA will track all the linked sub-tasks, and is responsible for general 
> discussions and integration for ECWorker. It won't resolve until all the 
> related tasks are done.





[jira] [Commented] (HDFS-7661) Support read when a EC file is being written

2015-12-14 Thread GAO Rui (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057439#comment-15057439
 ] 

GAO Rui commented on HDFS-7661:
---

Hi [~zhz], I agree with you about renaming this JIRA to "Support 
hflush/hsync". Let's wait for the reply from [~vinayrpet], then merge these two 
JIRAs and work together to get hflush/hsync done :)

> Support read when a EC file is being written
> 
>
> Key: HDFS-7661
> URL: https://issues.apache.org/jira/browse/HDFS-7661
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Tsz Wo Nicholas Sze
>Assignee: GAO Rui
> Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, 
> HDFS-7661-unitTest-wip-trunk.patch, 
> HDFS-EC-file-flush-sync-design-version1.1.pdf
>
>
> We also need to support hflush/hsync and visible length. 





[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )

2015-12-14 Thread GAO Rui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

GAO Rui updated HDFS-9494:
--
Attachment: HDFS-9494-origin-trunk.02.patch

New patch attached. Please feel free to give any comments, thanks.

> Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
> 
>
> Key: HDFS-9494
> URL: https://issues.apache.org/jira/browse/HDFS-9494
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: GAO Rui
>Assignee: GAO Rui
>Priority: Minor
> Attachments: HDFS-9494-origin-trunk.00.patch, 
> HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch
>
>
> Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and 
> wait for flushInternal( ) in sequence. So the runtime flow is like:
> {code}
> Streamer0#flushInternal( )
> Streamer0#waitForAckedSeqno( )
> Streamer1#flushInternal( )
> Streamer1#waitForAckedSeqno( )
> …
> Streamer8#flushInternal( )
> Streamer8#waitForAckedSeqno( )
> {code}
> It could be better to trigger all the streamers to flushInternal( ) and
> wait for all of them to return from waitForAckedSeqno( ),  and then 
> flushAllInternals( ) returns.





[jira] [Updated] (HDFS-9393) After choosing favored nodes, choosing nodes for remaining replicas should go through BlockPlacementPolicy

2015-12-14 Thread J.Andreina (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

J.Andreina updated HDFS-9393:
-
Attachment: HDFS-9393.2.patch

Updated the patch with latest code.
Please review.

> After choosing favored nodes, choosing nodes for remaining replicas should go 
> through BlockPlacementPolicy
> --
>
> Key: HDFS-9393
> URL: https://issues.apache.org/jira/browse/HDFS-9393
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: J.Andreina
>Assignee: J.Andreina
> Attachments: HDFS-9393.1.patch, HDFS-9393.2.patch
>
>
> Current behavior:
> After choosing replicas from the passed favored nodes, choosing nodes for the 
> remaining replicas does not go through BlockPlacementPolicy.
> Hence, even though a local client datanode is available and was not passed 
> as part of the favored nodes, the probability of choosing the local datanode 
> is low.





[jira] [Commented] (HDFS-9516) truncate file fails with data dirs on multiple disks

2015-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057418#comment-15057418
 ] 

Hadoop QA commented on HDFS-9516:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
48s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 49s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 57s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
38s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 25s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
31s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
49s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 3s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 32s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 2m 16s 
{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 38s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 38s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 52s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 52s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 40s 
{color} | {color:red} Patch generated 1 new checkstyle issues in 
hadoop-hdfs-project/hadoop-hdfs (total was 123, now 123). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
32s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
58s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 48s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 29s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 181m 43s 
{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 162m 3s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 51s 
{color} | {color:red} Patch generated 56 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 415m 40s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | hadoop.hdfs.TestDFSUpgradeFromImage |
|   | hadoop.hdfs.server.datanode.TestBlockScanner |
|   | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
|   | hadoop.hdfs.TestDFSUpgrade |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.TestPersistBlocks |
|   | hadoop.hdfs.TestDataTransferKeepalive |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.server.namenode.TestFsck |
|   | 

[jira] [Commented] (HDFS-9521) TransferFsImage.receiveFile should account and log separate times for image download and fsync to disk

2015-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055975#comment-15055975
 ] 

Hadoop QA commented on HDFS-9521:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
18s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
59s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 50s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
51s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 47s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 17s 
{color} | {color:red} Patch generated 3 new checkstyle issues in 
hadoop-hdfs-project/hadoop-hdfs (total was 26, now 29). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 54s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 55m 28s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 54m 32s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 20s 
{color} | {color:red} Patch generated 58 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 139m 23s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.hdfs.server.namenode.snapshot.TestSnapshot |
|   | hadoop.hdfs.server.datanode.TestDataNodeMetrics |
|   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
|   | hadoop.hdfs.server.datanode.TestIncrementalBrVariations |
| JDK v1.7.0_91 Failed junit tests | 
hadoop.hdfs.TestDFSStripedOutputStreamWithFailure130 |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshotDeletion |
|   | 

[jira] [Updated] (HDFS-9538) libhdfs++: load configuration from files

2015-12-14 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-9538:
-
Attachment: HDFS-9538.HDFS-9537.001.patch

New patch: rebased on latest HDFS-9537

> libhdfs++: load configuration from files
> 
>
> Key: HDFS-9538
> URL: https://issues.apache.org/jira/browse/HDFS-9538
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9538.HDFS-9537.000.patch, 
> HDFS-9538.HDFS-9537.001.patch
>
>
> One goal of the Configuration classes is to allow the consumers of the 
> libhdfs++ library to deploy client applications into hadoop edge nodes and 
> have them pick up the Hadoop configuration that has been deployed there.
> Note that we also need to support the use case where the consumer application 
> will manage Hadoop configuration files itself, or will handle all 
> configuration out-of-band.
> libhdfs++ should be able to read files that are found in the field and easily 
> construct an instance that will communicate with the cluster.





[jira] [Commented] (HDFS-9538) libhdfs++: load configuration from files

2015-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056027#comment-15056027
 ] 

Hadoop QA commented on HDFS-9538:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s {color} 
| {color:red} HDFS-9538 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12777470/HDFS-9538.HDFS-9537.001.patch
 |
| JIRA Issue | HDFS-9538 |
| Powered by | Apache Yetus 0.1.0   http://yetus.apache.org |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13860/console |


This message was automatically generated.



> libhdfs++: load configuration from files
> 
>
> Key: HDFS-9538
> URL: https://issues.apache.org/jira/browse/HDFS-9538
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9538.HDFS-9537.000.patch, 
> HDFS-9538.HDFS-9537.001.patch
>
>
> One goal of the Configuration classes is to allow the consumers of the 
> libhdfs++ library to deploy client applications into hadoop edge nodes and 
> have them pick up the Hadoop configuration that has been deployed there.
> Note that we also need to support the use case where the consumer application 
> will manage Hadoop configuration files itself, or will handle all 
> configuration out-of-band.
> libhdfs++ should be able to read files that are found in the field and easily 
> construct an instance that will communicate with the cluster.





[jira] [Commented] (HDFS-9514) TestDistributedFileSystem.testDFSClientPeerWriteTimeout failing; exception being swallowed

2015-12-14 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056034#comment-15056034
 ] 

Wei-Chiu Chuang commented on HDFS-9514:
---

Thanks [~yzhangal] for the review and commit, and [~steve_l] for reporting the 
issue!

> TestDistributedFileSystem.testDFSClientPeerWriteTimeout failing; exception 
> being swallowed
> --
>
> Key: HDFS-9514
> URL: https://issues.apache.org/jira/browse/HDFS-9514
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, test
>Affects Versions: 3.0.0
> Environment: jenkins
>Reporter: Steve Loughran
>Assignee: Wei-Chiu Chuang
> Fix For: 2.8.0
>
> Attachments: HDFS-9514.001.patch, HDFS-9514.002.patch, 
> HDFS-9514.003.patch
>
>
> {{TestDistributedFileSystem.testDFSClientPeerWriteTimeout}} is failing with 
> the wrong exception being raised...reporter isn't using the 
> {{GenericTestUtils}} code and so losing the details





[jira] [Updated] (HDFS-9537) libhdfs++: implement HDFSConfiguration class

2015-12-14 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-9537:
-
Attachment: HDFS-9537.HDFS-8707.004.patch

New patch: really fixed up whitespace errors this time

> libhdfs++: implement HDFSConfiguration class
> 
>
> Key: HDFS-9537
> URL: https://issues.apache.org/jira/browse/HDFS-9537
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9537.HDFS-8707.000.patch, 
> HDFS-9537.HDFS-8707.001.patch, HDFS-9537.HDFS-8707.002.patch, 
> HDFS-9537.HDFS-8707.003.patch, HDFS-9537.HDFS-8707.004.patch
>
>
> Create a class to encode the rules for interpreting a Configuration class to 
> create a libhdfs++ Options object





[jira] [Updated] (HDFS-7764) DirectoryScanner shouldn't abort the scan if one directory had an error

2015-12-14 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-7764:
---
Component/s: (was: datanode)

> DirectoryScanner shouldn't abort the scan if one directory had an error
> ---
>
> Key: HDFS-7764
> URL: https://issues.apache.org/jira/browse/HDFS-7764
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.0
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-7764-01.patch, HDFS-7764-02.patch, 
> HDFS-7764-03.patch, HDFS-7764.patch
>
>
> If there is an exception while preparing the ScanInfo for the blocks in the 
> directory, DirectoryScanner immediately throws an exception and exits the 
> current scan cycle. The idea of this jira is to discuss & improve the 
> exception handling mechanism.
> DirectoryScanner.java
> {code}
> for (Entry report :
>     compilersInProgress.entrySet()) {
>   try {
>     dirReports[report.getKey()] = report.getValue().get();
>   } catch (Exception ex) {
>     LOG.error("Error compiling report", ex);
>     // Propagate ex to DataBlockScanner to deal with
>     throw new RuntimeException(ex);
>   }
> {code}
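
One possible direction for the discussion, sketched as standalone Java: log the 
failure for the one directory and keep collecting the other reports. The class 
TolerantReportCollector and its generic collect method are hypothetical and 
only mirror the shape of the loop above. This is not the change made in any of 
the attached patches:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

class TolerantReportCollector {

  /**
   * Collects the result of every future that completed normally.
   * A future that failed is logged and skipped instead of aborting the loop.
   */
  static <T> Map<Integer, T> collect(Map<Integer, Future<T>> inProgress) {
    Map<Integer, T> reports = new LinkedHashMap<>();
    for (Map.Entry<Integer, Future<T>> report : inProgress.entrySet()) {
      try {
        reports.put(report.getKey(), report.getValue().get());
      } catch (InterruptedException ie) {
        // Restore the interrupt flag and stop waiting for further reports.
        Thread.currentThread().interrupt();
        break;
      } catch (ExecutionException ee) {
        // Only this entry failed; keep going with the remaining ones.
        System.err.println("Error compiling report for entry "
            + report.getKey() + ": " + ee.getCause());
      }
    }
    return reports;
  }
}
```

The caller can compare the key set of the returned map with the submitted 
entries to see which directories still need a rescan.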





[jira] [Commented] (HDFS-7344) [umbrella] Erasure Coding worker and support in DataNode

2015-12-14 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056353#comment-15056353
 ] 

Zhe Zhang commented on HDFS-7344:
-

{{ErasureCodingWorker}} was created under HDFS-7348. Maybe 1~2 patches have 
been committed to update it. Please check git history for that file.

> [umbrella] Erasure Coding worker and support in DataNode
> 
>
> Key: HDFS-7344
> URL: https://issues.apache.org/jira/browse/HDFS-7344
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Kai Zheng
>Assignee: Li Bo
> Attachments: ECWorker-design-v2.pdf, HDFS ECWorker Design.pdf, 
> hdfs-ec-datanode.0108.zip, hdfs-ec-datanode.0108.zip
>
>
> According to HDFS-7285 and the design, this handles DataNode side extension 
> and related support for Erasure Coding. More specifically, it implements 
> {{ECWorker}}, which reconstructs lost blocks (in striping layout).
> It generally needs to restore BlockGroup and schema information from coding 
> commands from NameNode or other entities, and construct specific coding work 
> to execute. The required block reader, writer, either local or remote, 
> encoder and decoder, will be implemented separately as sub-tasks. 
> This JIRA will track all the linked sub-tasks, and is responsible for general 
> discussions and integration for ECWorker. It won't resolve until all the 
> related tasks are done.





[jira] [Commented] (HDFS-8785) TestDistributedFileSystem is failing in trunk

2015-12-14 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056310#comment-15056310
 ] 

Yongjun Zhang commented on HDFS-8785:
-

Welcome Xiaoyu!


> TestDistributedFileSystem is failing in trunk
> -
>
> Key: HDFS-8785
> URL: https://issues.apache.org/jira/browse/HDFS-8785
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0, 2.8.0
>Reporter: Arpit Agarwal
>Assignee: Xiaoyu Yao
> Fix For: 2.8.0
>
> Attachments: HDFS-8785.00.patch, HDFS-8785.01.patch, 
> HDFS-8785.02.patch
>
>
> A newly added test case 
> {{TestDistributedFileSystem#testDFSClientPeerWriteTimeout}} is failing in 
> trunk.
> e.g. run
> https://builds.apache.org/job/PreCommit-HDFS-Build/11716/testReport/org.apache.hadoop.hdfs/TestDistributedFileSystem/testDFSClientPeerWriteTimeout/





[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )

2015-12-14 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056345#comment-15056345
 ] 

Rakesh R commented on HDFS-9494:


Thanks [~demongaorui], nice improvement. I have a few comments; please see:

# To be on the safe side, please increment {{healthyStreamerCount++;}} only 
after {{executorCompletionService.submit()}} has succeeded.
# The new logic masks any exception thrown by 
{{s.waitForAckedSeqno(toWaitFor);}}. Please catch {{ExecutionException}}. 
You could use something like:
{code}
try {
Future  f = executorCompletionService.take();
f.get();
}  catch (InterruptedException ie) {
  //
}  catch (ExecutionException e) {
  //
} 
{code}
# For javac warning, please add {{}} as follows:
{code}
CompletionService  executorCompletionService = new
ExecutorCompletionService<>(executor);
{code}
# Nit: Please correct formatting. Give one space in between the operators like,
{code}
int healthyStreamerCount = 0;
final long toWaitFor = flushInternalWithoutWaitingAck();
//
//
for (int i = 0; i < healthyStreamerCount; i++)
{code}
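
The review points above (count a task only after it was submitted, and do not let {{ExecutionException}} disappear) can be sketched in isolation. This is a minimal illustration, not the HDFS-9494 patch: the class name {{ParallelWaitSketch}}, the method {{runAll}} and the use of plain {{Callable}} tasks in place of streamers are all assumptions made for the example.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: each Callable stands in for one streamer's
// wait-for-ack call. None of these names come from the HDFS code base.
final class ParallelWaitSketch {

  /** Runs all tasks in parallel; returns how many finished without throwing. */
  static int runAll(List<Callable<Void>> tasks) {
    ExecutorService executor =
        Executors.newFixedThreadPool(Math.max(1, tasks.size()));
    CompletionService<Void> completionService =
        new ExecutorCompletionService<>(executor);
    int submitted = 0;
    int succeeded = 0;
    try {
      for (Callable<Void> task : tasks) {
        completionService.submit(task);
        submitted++; // counted only after submit() returned normally
      }
      for (int i = 0; i < submitted; i++) {
        Future<Void> f = completionService.take();
        try {
          f.get();
          succeeded++;
        } catch (ExecutionException e) {
          // The task's own exception is the cause; report it instead of
          // silently dropping it.
          System.err.println("task failed: " + e.getCause());
        }
      }
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt(); // keep the interrupt for callers
    } finally {
      executor.shutdownNow();
    }
    return succeeded;
  }

  public static void main(String[] args) {
    Callable<Void> ok = () -> null;
    Callable<Void> failing = () -> {
      throw new IOException("ack timed out");
    };
    System.out.println(runAll(Arrays.asList(ok, failing, ok)));
  }
}
```

Typing the service as {{CompletionService<Void>}} is one way to avoid a raw-type javac warning; whether the patch uses {{Void}} or another type parameter is not stated in the comment.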


> Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
> 
>
> Key: HDFS-9494
> URL: https://issues.apache.org/jira/browse/HDFS-9494
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: GAO Rui
>Assignee: GAO Rui
>Priority: Minor
> Attachments: HDFS-9494-origin-trunk.00.patch, 
> HDFS-9494-origin-trunk.01.patch
>
>
> Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and 
> wait for flushInternal( ) in sequence. So the runtime flow is like:
> {code}
> Streamer0#flushInternal( )
> Streamer0#waitForAckedSeqno( )
> Streamer1#flushInternal( )
> Streamer1#waitForAckedSeqno( )
> …
> Streamer8#flushInternal( )
> Streamer8#waitForAckedSeqno( )
> {code}
> It could be better to trigger all the streamers to flushInternal( ) and
> wait for all of them to return from waitForAckedSeqno( ),  and then 
> flushAllInternals( ) returns.





[jira] [Commented] (HDFS-8785) TestDistributedFileSystem is failing in trunk

2015-12-14 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056249#comment-15056249
 ] 

Xiaoyu Yao commented on HDFS-8785:
--

[~yzhangal], Thanks for committing this to branch-2/branch-2.8!

> TestDistributedFileSystem is failing in trunk
> -
>
> Key: HDFS-8785
> URL: https://issues.apache.org/jira/browse/HDFS-8785
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0, 2.8.0
>Reporter: Arpit Agarwal
>Assignee: Xiaoyu Yao
> Fix For: 2.8.0
>
> Attachments: HDFS-8785.00.patch, HDFS-8785.01.patch, 
> HDFS-8785.02.patch
>
>
> A newly added test case 
> {{TestDistributedFileSystem#testDFSClientPeerWriteTimeout}} is failing in 
> trunk.
> e.g. run
> https://builds.apache.org/job/PreCommit-HDFS-Build/11716/testReport/org.apache.hadoop.hdfs/TestDistributedFileSystem/testDFSClientPeerWriteTimeout/





[jira] [Commented] (HDFS-8791) block ID-based DN storage layout can be very slow for datanode on ext4

2015-12-14 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056342#comment-15056342
 ] 

Kihwal Lee commented on HDFS-8791:
--

For those who are interested, we have upgraded a 2000+ node busy cluster with 
this patch.  We had to do something extra to speed up the rolling upgrade 
process.
- Tune the kernel to be less aggressive on evicting vfs-related slab entries, 
then wait 6 hours for the DirectoryScanner to run and warm up the cache.
{noformat}
echo 2 > /proc/sys/vm/vfs_cache_pressure
{noformat}
- Use a custom tool to upgrade the volumes offline in parallel without 
scanning.  This tool utilizes the replica cache file that is created during 
upgrade-shutdown.

If a node was going through the slow (regular) upgrade path, it could have 
taken over an hour (9-11 minutes * n drives). Via the "fast" path, the layout 
upgrade finished in 2-3 minutes, depending on the size of drives. The offline 
layout upgrade was done in 3-4 seconds on a non-busy cluster.  Scanning blocks 
in the new layout was taking about 2 seconds (this is done in parallel), so 
datanodes were registering with the NNs in 6 seconds after startup.


> block ID-based DN storage layout can be very slow for datanode on ext4
> --
>
> Key: HDFS-8791
> URL: https://issues.apache.org/jira/browse/HDFS-8791
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.6.0, 2.8.0, 2.7.1
>Reporter: Nathan Roberts
>Assignee: Chris Trezzo
>Priority: Blocker
> Attachments: 32x32DatanodeLayoutTesting-v1.pdf, 
> 32x32DatanodeLayoutTesting-v2.pdf, HDFS-8791-trunk-v1.patch, 
> HDFS-8791-trunk-v2-bin.patch, HDFS-8791-trunk-v2.patch, 
> HDFS-8791-trunk-v2.patch, hadoop-56-layout-datanode-dir.tgz, 
> test-node-upgrade.txt
>
>
> We are seeing cases where the new directory layout causes the datanode to 
> basically cause the disks to seek for 10s of minutes. This can be when the 
> datanode is running du, and it can also be when it is performing a 
> checkDirs(). Both of these operations currently scan all directories in the 
> block pool and that's very expensive in the new layout.
> The new layout creates 256 subdirs, each with 256 subdirs. Essentially 64K 
> leaf directories where block files are placed.
> So, what we have on disk is:
> - 256 inodes for the first level directories
> - 256 directory blocks for the first level directories
> - 256*256 inodes for the second level directories
> - 256*256 directory blocks for the second level directories
> - Then the inodes and blocks to store the HDFS blocks themselves.
> The main problem is the 256*256 directory blocks. 
> inodes and dentries will be cached by linux and one can configure how likely 
> the system is to prune those entries (vfs_cache_pressure). However, ext4 
> relies on the buffer cache to cache the directory blocks and I'm not aware of 
> any way to tell linux to favor buffer cache pages (even if it did I'm not 
> sure I would want it to in general).
> Also, ext4 tries hard to spread directories evenly across the entire volume, 
> which basically means the 64K directory blocks are probably randomly spread 
> across the entire disk. A du type scan will look at directories one at a 
> time, so the ioscheduler can't optimize the corresponding seeks, meaning the 
> seeks will be random and far. 
> In a system I was using to diagnose this, I had 60K blocks. A DU when things 
> are hot is less than 1 second. When things are cold, about 20 minutes.
> How do things get cold?
> - A large set of tasks run on the node. This pushes almost all of the buffer 
> cache out, causing the next DU to hit this situation. We are seeing cases 
> where a large job can cause a seek storm across the entire cluster.
> Why didn't the previous layout see this?
> - It might have but it wasn't nearly as pronounced. The previous layout would 
> be a few hundred directory blocks. Even when completely cold, these would 
> only take a few hundred seeks, which would mean single-digit seconds.  
> - With only a few hundred directories, the odds of the directory blocks 
> getting modified is quite high, this keeps those blocks hot and much less 
> likely to be evicted.
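
A rough back-of-the-envelope check of the numbers in the description above. The 256 x 256 fan-out comes from the issue text; the cost model (one random seek per cold directory block, 10 ms per seek) is purely an assumption for illustration and was not measured in this issue.

```java
// Hypothetical estimate; class and method names are illustrative only.
final class ColdScanEstimate {

  /** Number of leaf directories for a two-level layout with this fan-out. */
  static long leafDirectories(int fanout) {
    return (long) fanout * fanout;
  }

  /** Seconds for a cold scan, assuming one seek per directory block. */
  static double coldScanSeconds(long directoryBlocks, double seekMillis) {
    return directoryBlocks * seekMillis / 1000.0;
  }

  public static void main(String[] args) {
    long leaves = leafDirectories(256); // 256 * 256 = 65536
    System.out.println(leaves + " leaf directories");
    // Assumed 10 ms per random seek; real disks and layouts will differ.
    System.out.printf("%.0f seconds cold%n", coldScanSeconds(leaves, 10.0));
  }
}
```

Under these assumptions a cold scan lands in the range of minutes per drive rather than seconds, which is the same order of magnitude as the figures reported in this thread.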





[jira] [Commented] (HDFS-8562) HDFS Performance is impacted by FileInputStream Finalizer

2015-12-14 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056755#comment-15056755
 ] 

Colin Patrick McCabe commented on HDFS-8562:


HBase keeps open many HDFS files.  Because of the way short-circuit reads work, 
this involves keeping a large cache of {{FileInputStream}} pairs-- sometimes as 
many as 128,000.  I think it's those finalizers that are the problem here.  
Those finalizers are located in the HDFS client, not in the DataNode.  In 
contrast to the HDFS client, the DataNode does not keep files open for very 
long (for better or for worse)... when it's done reading a block file, it 
immediately closes it.  So the DN is not likely to have more than a dozen or 
two {{FileInputStream}} objects around at once.  I do not think that that will 
have a major impact on GC.  DN heaps are typically small anyway, and almost 
never have GC problems.

I think the best thing to do would be to move towards using {{FileChannel}} in 
the HDFS client.  I have a patch to do this which I'll post shortly.
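
For readers unfamiliar with the alternative, here is a minimal sketch of reading a file through {{FileChannel}} without going through {{FileInputStream}}. It is not the patch mentioned above; the class {{ChannelReadSketch}} and its methods are hypothetical. To my understanding {{FileChannel.open}} does not construct a {{FileInputStream}}, so no finalizable stream object is allocated, but treat that as an assumption to verify against the JDK in use.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch; not taken from the HDFS client.
final class ChannelReadSketch {

  /** Reads up to maxBytes from the start of the file through a FileChannel. */
  static byte[] readPrefix(Path file, int maxBytes) {
    // Opened directly via NIO, so no FileInputStream is created here.
    try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
      ByteBuffer buf = ByteBuffer.allocate(maxBytes);
      long position = 0;
      while (buf.hasRemaining()) {
        int n = channel.read(buf, position); // positional read
        if (n < 0) {
          break; // end of file
        }
        position += n;
      }
      buf.flip();
      byte[] out = new byte[buf.remaining()];
      buf.get(out);
      return out;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Writes a small temp file and reads its first five bytes back. */
  static String demo() {
    try {
      Path tmp = Files.createTempFile("channel-sketch", ".bin");
      try {
        Files.write(tmp, "hello block".getBytes(StandardCharsets.UTF_8));
        return new String(readPrefix(tmp, 5), StandardCharsets.UTF_8);
      } finally {
        Files.deleteIfExists(tmp);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

The try-with-resources block closes the channel deterministically, so cleanup does not depend on the garbage collector at all.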

> HDFS Performance is impacted by FileInputStream Finalizer
> -
>
> Key: HDFS-8562
> URL: https://issues.apache.org/jira/browse/HDFS-8562
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, performance
>Affects Versions: 2.5.0
> Environment: Impact any application that uses HDFS
>Reporter: Yanping Wang
>Assignee: Wei Zhou
> Attachments: HDFS-8562.01.patch
>
>
> While running HBase using HDFS datanodes, we noticed excessively high GC 
> pause spikes. For example, with JDK 8 update 40 and the G1 collector, we saw 
> datanode GC pauses spike toward 160 milliseconds when they should be around 
> 20 milliseconds. 
> We dug into the GC logs and found that those long GC pauses were spent 
> processing a high number of final references. 
> For example, this Young GC:
> 2715.501: [GC pause (G1 Evacuation Pause) (young) 0.1529017 secs]
> 2715.572: [SoftReference, 0 refs, 0.0001034 secs]
> 2715.572: [WeakReference, 0 refs, 0.123 secs]
> 2715.572: [FinalReference, 8292 refs, 0.0748194 secs]
> 2715.647: [PhantomReference, 0 refs, 160 refs, 0.0001333 secs]
> 2715.647: [JNI Weak Reference, 0.140 secs]
> [Ref Proc: 122.3 ms]
> [Eden: 910.0M(910.0M)->0.0B(911.0M) Survivors: 11.0M->10.0M Heap: 
> 951.1M(1536.0M)->40.2M(1536.0M)]
> [Times: user=0.47 sys=0.01, real=0.15 secs]
> This young GC caused a 152.9 millisecond STW pause, of which 122.3 
> milliseconds were spent in Ref Proc, which processed 8292 FinalReferences in 
> 74.8 milliseconds plus some overhead.
> We used JFR and jmap with Memory Analyzer to track this down and found that 
> those FinalReferences were all from FileInputStream.  We checked the HDFS 
> code and saw the use of FileInputStream in the datanode:
> https://apache.googlesource.com/hadoop-common/+/refs/heads/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MappableBlock.java
> {code}
> public static MappableBlock load(long length,
>     FileInputStream blockIn, FileInputStream metaIn,
>     String blockFileName) throws IOException {
>   MappableBlock mappableBlock = null;
>   MappedByteBuffer mmap = null;
>   FileChannel blockChannel = null;
>   try {
>     blockChannel = blockIn.getChannel();
>     if (blockChannel == null) {
>       throw new IOException("Block InputStream has no FileChannel.");
>     }
>     mmap = blockChannel.map(MapMode.READ_ONLY, 0, length);
>     NativeIO.POSIX.getCacheManipulator().mlock(blockFileName, mmap, length);
>     verifyChecksum(length, metaIn, blockChannel, blockFileName);
>     mappableBlock = new MappableBlock(mmap, length);
>   } finally {
>     IOUtils.closeQuietly(blockChannel);
>     if (mappableBlock == null) {
>       if (mmap != null) {
>         NativeIO.POSIX.munmap(mmap); // unmapping also unlocks
>       }
>     }
>   }
>   return mappableBlock;
> }
> {code}
> We looked up 
> https://docs.oracle.com/javase/7/docs/api/java/io/FileInputStream.html  and
> http://hg.openjdk.java.net/jdk7/jdk7/jdk/file/23bdcede4e39/src/share/classes/java/io/FileInputStream.java
>  and noticed FileInputStream relies on the Finalizer to release its resource. 
> When an instance of a class that has a finalizer is created, an entry for 
> that instance is put on a queue in the JVM so the JVM knows it has a 
> finalizer that needs to be executed.   
> The current issue is: even when programmers do call close() after using 
> FileInputStream, its finalize() method will still be called. In other words, 
> we still get the side effect of the FinalReference being registered at 
> FileInputStream allocation time, and also reference processing to reclaim the 
> FinalReference during GC (any GC solution has to deal with 

[jira] [Commented] (HDFS-9538) libhdfs++: load configuration from files

2015-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056757#comment-15056757
 ] 

Hadoop QA commented on HDFS-9538:
-

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 7m 31s | HDFS-8707 passed |
| +1 | compile | 3m 52s | HDFS-8707 passed with JDK v1.8.0_66 |
| +1 | compile | 3m 53s | HDFS-8707 passed with JDK v1.7.0_91 |
| +1 | mvnsite | 0m 16s | HDFS-8707 passed |
| +1 | mvneclipse | 0m 11s | HDFS-8707 passed |
| +1 | mvninstall | 0m 13s | the patch passed |
| +1 | compile | 3m 58s | the patch passed with JDK v1.8.0_66 |
| +1 | cc | 3m 58s | the patch passed |
| +1 | javac | 3m 58s | the patch passed |
| +1 | compile | 4m 3s | the patch passed with JDK v1.7.0_91 |
| +1 | cc | 4m 3s | the patch passed |
| +1 | javac | 4m 3s | the patch passed |
| +1 | mvnsite | 0m 15s | the patch passed |
| +1 | mvneclipse | 0m 12s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| -1 | unit | 4m 16s | hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_66. |
| -1 | unit | 4m 24s | hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_91. |
| +1 | asflicense | 0m 24s | Patch does not generate ASF License warnings. |
|  |  | 35m 12s |  |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12777575/HDFS-9538.HDFS-8707.003.patch |
| JIRA Issue | HDFS-9538 |
| Optional Tests | asflicense compile cc mvnsite javac unit |
| uname | Linux 44d0a850080f 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-8707 / 4a0eea7 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/13868/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_66.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/13868/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.7.0_91.txt |
| JDK v1.7.0_91 Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/13868/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: hadoop-hdfs-project/hadoop-hdfs-native-client |
| Max memory used | 76MB |
| Powered by | Apache Yetus 0.1.0   http://yetus.apache.org |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/13868/console |


This message was automatically generated.



> libhdfs++: load configuration from files
> 

[jira] [Commented] (HDFS-9493) Test o.a.h.hdfs.server.namenode.TestMetaSave fails in trunk

2015-12-14 Thread Tony Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056768#comment-15056768
 ] 

Tony Wu commented on HDFS-9493:
---

Hi [~liuml07],

Thanks for your detailed comments. I have confirmed problem 2 & 3 after 
revisiting the test code.

For problem 3, I misread the {{@BeforeClass}} and {{@AfterClass}} tags, 
thinking {{MiniDFSCluster}} would be torn down and rebuilt for every test. That 
is not the case here. Instead, as you said, by the time {{testMetaSave()}} runs, 
{{testMetaSaveAfterDelete()}} has already removed the DN, and the 
{{cluster.stopDataNode()}} call in the second test is essentially a no-op. 
It looks like the tests just happened to work. The {{setup}} and 
{{tearDown}} functions should instead be executed before and after each test 
case.

For problem 1 and 2, to further reduce the wait time for this unit test, 
{{BlockManagerTestUtil#noticeDeadDatanode()}} will be useful. This helper 
function will set the DN to be dead right away instead of waiting for HB 
timeout. 

I will rework the current patch and incorporate the findings above.

> Test o.a.h.hdfs.server.namenode.TestMetaSave fails in trunk
> ---
>
> Key: HDFS-9493
> URL: https://issues.apache.org/jira/browse/HDFS-9493
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Mingliang Liu
>Assignee: Tony Wu
> Attachments: HDFS-9493.001.patch
>
>
> Tested in both Gentoo Linux and Mac.
> {quote}
> ---
>  T E S T S
> ---
> Running org.apache.hadoop.hdfs.server.namenode.TestMetaSave
> Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 34.159 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.server.namenode.TestMetaSave
> testMetasaveAfterDelete(org.apache.hadoop.hdfs.server.namenode.TestMetaSave)  
> Time elapsed: 15.318 sec  <<< FAILURE!
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestMetaSave.testMetasaveAfterDelete(TestMetaSave.java:154)
> {quote}





[jira] [Updated] (HDFS-9493) Test o.a.h.hdfs.server.namenode.TestMetaSave fails in trunk

2015-12-14 Thread Tony Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tony Wu updated HDFS-9493:
--
Attachment: HDFS-9493.002.patch

In v2 patch:
* Use {{BlockManagerTestUtil#noticeDeadDatanode()}} to reduce the test run 
time. After this call the NN will declare the DN dead right away, rather than 
waiting for HB timeout.
* Create a helper function {{stopDnAndWaitForNnToRemoveIt()}} for test cases to 
stop a DN and wait for it to be removed by the NN.
* Create a new {{MiniDFSCluster}} for every test case.

Verified the {{testMetaSave}} runs fine on OSX & Linux (CentOS). Verified the 
run time reduced from 30+ seconds to 17 seconds.

> Test o.a.h.hdfs.server.namenode.TestMetaSave fails in trunk
> ---
>
> Key: HDFS-9493
> URL: https://issues.apache.org/jira/browse/HDFS-9493
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Mingliang Liu
>Assignee: Tony Wu
> Attachments: HDFS-9493.001.patch, HDFS-9493.002.patch
>
>
> Tested in both Gentoo Linux and Mac.
> {quote}
> ---
>  T E S T S
> ---
> Running org.apache.hadoop.hdfs.server.namenode.TestMetaSave
> Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 34.159 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.server.namenode.TestMetaSave
> testMetasaveAfterDelete(org.apache.hadoop.hdfs.server.namenode.TestMetaSave)  
> Time elapsed: 15.318 sec  <<< FAILURE!
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestMetaSave.testMetasaveAfterDelete(TestMetaSave.java:154)
> {quote}





[jira] [Commented] (HDFS-9173) Erasure Coding: Lease recovery for striped file

2015-12-14 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056780#comment-15056780
 ] 

Zhe Zhang commented on HDFS-9173:
-

Thanks Jing for the rebase. The patch LGTM overall. I have a few comments and 
some suggestions for future work:

# The below should be removed:
{code}
  public Daemon recoverBlocks(String who, Collection blocks) {
return blockRecoveryWorker.recoverBlocks(who, blocks);
  }
{code}
# Since we are trimming null elements in 
{{DatanodeManager#getDatanodeStorageInfos}}, should change the below in 
{{BlockUnderConstructionFeature#setExpectedLocations}} to an assertion.
{code}
for (DatanodeStorageInfo target : targets) {
  if (target != null) {
    numLocations++;
  }
}
{code}
# {{FSNamesystem#commitBlockSynchronization}} also uses 
{{getDatanodeStorageInfos}}. The behavior is a little tricky. If some elements 
in {{newtargets}} are {{EMPTY_DATANODE_ID}}, or not found in DNManager, then 
{{dsInfos}} will be shorter than a full stripe. But we are always assigning 
block IDs based on an element's offset in {{dsInfos}}.
{code}
if (storedBlock.isStriped()) {
  bi.setBlockId(bi.getBlockId() + i);
}
{code}
# Should update to "If some internal blocks reach the safe length"
{code}
  // If some internal blocks are longer than safe length, convert them to
  // to RUR replicas.
{code}
# {{newLocs}} and {{newStorages}} should be constructed in the same loop as 
{{rurList}}, to avoid the double loop. Something like the below:
{code}
final DatanodeID[] newLocs = new DatanodeID[totalBlkNum];
Arrays.fill(newLocs, EMPTY_DATANODE_ID);
for (BlockRecord r : syncList) {
  ...
  if (r.rInfo.getNumBytes() >= newSize) {
    rurList.add(r);
    newLocs[blockIndex] = r.id;
    newStorages[blockIndex] = r.storageID;
  }
}
{code}
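
The concern about {{commitBlockSynchronization}} above (block IDs assigned by offset in a list that may have been shortened) can be illustrated with a toy model. Everything below is hypothetical: strings stand in for datanode IDs, {{EMPTY}} stands in for {{EMPTY_DATANODE_ID}}, and none of the method names exist in HDFS.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of the index-shift hazard; not HDFS code.
final class StripeIndexSketch {
  static final String EMPTY = ""; // stand-in for EMPTY_DATANODE_ID

  /** Drops EMPTY entries, the way a trimming lookup shortens its result. */
  static List<String> trim(String[] targets) {
    List<String> kept = new ArrayList<>();
    for (String t : targets) {
      if (!EMPTY.equals(t)) {
        kept.add(t);
      }
    }
    return kept;
  }

  /** Block ID taken from the position in the trimmed list (the risky way). */
  static long idFromTrimmedOffset(long groupId, String[] targets, String dn) {
    return groupId + trim(targets).indexOf(dn);
  }

  /** Block ID taken from the position in the full stripe. */
  static long idFromStripeIndex(long groupId, String[] targets, String dn) {
    for (int i = 0; i < targets.length; i++) {
      if (targets[i].equals(dn)) {
        return groupId + i;
      }
    }
    throw new IllegalArgumentException("not in stripe: " + dn);
  }

  public static void main(String[] args) {
    String[] targets = {"dn0", EMPTY, "dn2"};
    // The two results differ because trimming shifted dn2 from index 2 to 1.
    System.out.println(idFromTrimmedOffset(1000L, targets, "dn2"));
    System.out.println(idFromStripeIndex(1000L, targets, "dn2"));
  }
}
```

With one empty slot in the middle of the stripe, every entry after it is assigned an ID that is one too small when the trimmed offset is used.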

> Erasure Coding: Lease recovery for striped file
> ---
>
> Key: HDFS-9173
> URL: https://issues.apache.org/jira/browse/HDFS-9173
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Walter Su
>Assignee: Walter Su
> Attachments: HDFS-9173.00.wip.patch, HDFS-9173.01.patch, 
> HDFS-9173.02.step125.patch, HDFS-9173.03.patch, HDFS-9173.04.patch, 
> HDFS-9173.05.patch, HDFS-9173.06.patch, HDFS-9173.07.patch
>
>






[jira] [Commented] (HDFS-9493) Test o.a.h.hdfs.server.namenode.TestMetaSave fails in trunk

2015-12-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056792#comment-15056792
 ] 

Mingliang Liu commented on HDFS-9493:
-

Thanks for the prompt update.

+1 (non-binding) pending on Jenkins.


> Test o.a.h.hdfs.server.namenode.TestMetaSave fails in trunk
> ---
>
> Key: HDFS-9493
> URL: https://issues.apache.org/jira/browse/HDFS-9493
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Mingliang Liu
>Assignee: Tony Wu
> Attachments: HDFS-9493.001.patch, HDFS-9493.002.patch
>
>
> Tested in both Gentoo Linux and Mac.
> {quote}
> ---
>  T E S T S
> ---
> Running org.apache.hadoop.hdfs.server.namenode.TestMetaSave
> Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 34.159 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.server.namenode.TestMetaSave
> testMetasaveAfterDelete(org.apache.hadoop.hdfs.server.namenode.TestMetaSave)  
> Time elapsed: 15.318 sec  <<< FAILURE!
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestMetaSave.testMetasaveAfterDelete(TestMetaSave.java:154)
> {quote}





[jira] [Updated] (HDFS-9371) Code cleanup for DatanodeManager

2015-12-14 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-9371:

Attachment: HDFS-9371.004.patch

Thanks for the review, [~szetszwo]! Updated the patch to address your comments.

> Code cleanup for DatanodeManager
> 
>
> Key: HDFS-9371
> URL: https://issues.apache.org/jira/browse/HDFS-9371
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-9371.000.patch, HDFS-9371.001.patch, 
> HDFS-9371.002.patch, HDFS-9371.003.patch, HDFS-9371.004.patch
>
>
> Some code cleanup for DatanodeManager. The main changes include:
> # make the synchronization of {{datanodeMap}} and 
> {{datanodesSoftwareVersions}} consistent
> # remove unnecessary lock in {{handleHeartbeat}}





[jira] [Updated] (HDFS-9538) libhdfs++: load configuration from files

2015-12-14 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-9538:
-
Attachment: HDFS-9538.HDFS-9537.002.patch

> libhdfs++: load configuration from files
> 
>
> Key: HDFS-9538
> URL: https://issues.apache.org/jira/browse/HDFS-9538
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9538.HDFS-9537.000.patch, 
> HDFS-9538.HDFS-9537.001.patch, HDFS-9538.HDFS-9537.002.patch
>
>
> One goal of the Configuration classes is to allow the consumers of the 
> libhdfs++ library to deploy client applications into hadoop edge nodes and 
> have them pick up the Hadoop configuration that has been deployed there.
> Note that we also need to support the use case where the consumer application 
> will manage Hadoop configuration files itself, or will handle all 
> configuration out-of-band.
> libhdfs++ should be able to read files that are found in the field and easily 
> construct an instance that will communicate with the cluster.





[jira] [Commented] (HDFS-9538) libhdfs++: load configuration from files

2015-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056661#comment-15056661
 ] 

Hadoop QA commented on HDFS-9538:
-

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| -1 | patch | 0m 5s | HDFS-9538 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |

|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12777567/HDFS-9538.HDFS-9537.002.patch |
| JIRA Issue | HDFS-9538 |
| Powered by | Apache Yetus 0.1.0   http://yetus.apache.org |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/13866/console |


This message was automatically generated.



> libhdfs++: load configuration from files
> 
>
> Key: HDFS-9538
> URL: https://issues.apache.org/jira/browse/HDFS-9538
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9538.HDFS-9537.000.patch, 
> HDFS-9538.HDFS-9537.001.patch, HDFS-9538.HDFS-9537.002.patch
>
>
> One goal of the Configuration classes is to allow the consumers of the 
> libhdfs++ library to deploy client applications into hadoop edge nodes and 
> have them pick up the Hadoop configuration that has been deployed there.
> Note that we also need to support the use case where the consumer application 
> will manage Hadoop configuration files itself, or will handle all 
> configuration out-of-band.
> libhdfs++ should be able to read files that are found in the field and easily 
> construct an instance that will communicate with the cluster.





[jira] [Commented] (HDFS-9524) libhdfs++ deadlocks in Filesystem::New if NN conneciton fails

2015-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056663#comment-15056663
 ] 

Hadoop QA commented on HDFS-9524:
-

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 1s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 19m 59s | HDFS-8707 passed |
| +1 | compile | 6m 54s | HDFS-8707 passed with JDK v1.8.0_66 |
| +1 | compile | 6m 59s | HDFS-8707 passed with JDK v1.7.0_91 |
| +1 | mvnsite | 0m 43s | HDFS-8707 passed |
| +1 | mvneclipse | 0m 32s | HDFS-8707 passed |
| +1 | mvninstall | 0m 29s | the patch passed |
| +1 | compile | 6m 59s | the patch passed with JDK v1.8.0_66 |
| -1 | cc | 42m 41s | hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_66 with JDK v1.8.0_66 generated 1 new issues (was 3, now 3). |
| +1 | cc | 6m 59s | the patch passed |
| +1 | javac | 6m 59s | the patch passed |
| +1 | compile | 7m 1s | the patch passed with JDK v1.7.0_91 |
| +1 | cc | 7m 1s | the patch passed |
| +1 | javac | 7m 1s | the patch passed |
| +1 | mvnsite | 0m 53s | the patch passed |
| +1 | mvneclipse | 0m 31s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| -1 | unit | 8m 23s | hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_66. |
| -1 | unit | 8m 7s | hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_91. |
| +1 | asflicense | 1m 8s | Patch does not generate ASF License warnings. |
| | | 74m 31s | |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12777532/HDFS-9524.HDFS-8707.001.patch |
| JIRA Issue | HDFS-9524 |
| Optional Tests | asflicense compile cc mvnsite javac unit |
| uname | Linux b30fca8716fe 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-8707 / 4a0eea7 |
| cc | hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_66: https://builds.apache.org/job/PreCommit-HDFS-Build/13864/artifact/patchprocess/diff-compile-cc-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_66.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/13864/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_66.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/13864/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.7.0_91.txt |
| JDK v1.7.0_91  Test Results | 

[jira] [Commented] (HDFS-9538) libhdfs++: load configuration from files

2015-12-14 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056658#comment-15056658
 ] 

Bob Hansen commented on HDFS-9538:
--

New patch to address [~James Clampffer]'s comments.

bq. -It looks like ConfigurationLoader::SetSearchPath is breaking up the path 
into subpaths of the full path. What's the intended use case for this?
To be able to re-use the calculations for Java classpaths as a search path for 
the config files.

{quote}
-Should file_exists return false if the process doesn't have read permissions 
for the file? I'd vote yes
-Regarding the "Can we tell if it's a (transitive) symlink to a regular file?" 
comment in file_exists. I think a symlink that resolves to a file that would 
return true if used directly with file_exists should also return true. It looks 
like it could be done by checking S_ISLNK(my_stat_struct.st_mode) + realpath 
recursively with some reasonable depth limit to cover a few levels of 
indirection.
{quote}
I took out the "file_exists" call, and just try to read from the file.  If we 
can't then we go on as if it doesn't exist.

{quote}
-nftw_remove calls perror to handle error conditions. Should this have an 
#ifdef in case the user doesn't want things being printed to stderr?
-The return code for the mkdir call in TempDir::Tempdir() is never checked. 
Should this have a check that clears out path, or bails if path is already 
empty, in case of permissions issues?
{quote}
We now check the results and just fail the test if something goes wrong.

> libhdfs++: load configuration from files
> 
>
> Key: HDFS-9538
> URL: https://issues.apache.org/jira/browse/HDFS-9538
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9538.HDFS-9537.000.patch, 
> HDFS-9538.HDFS-9537.001.patch, HDFS-9538.HDFS-9537.002.patch
>
>
> One goal of the Configuration classes is to allow the consumers of the 
> libhdfs++ library to deploy client applications into hadoop edge nodes and 
> have them pick up the Hadoop configuration that has been deployed there.
> Note that we also need to support the use case where the consumer application 
> will manage Hadoop configuration files itself, or will handle all 
> configuration out-of-band.
> libhdfs++ should be able to read files that are found in the field and easily 
> construct an instance that will communicate with the cluster.





[jira] [Commented] (HDFS-9521) TransferFsImage.receiveFile should account and log separate times for image download and fsync to disk

2015-12-14 Thread Wellington Chevreuil (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055829#comment-15055829
 ] 

Wellington Chevreuil commented on HDFS-9521:


Thanks [~liuml07]. The original idea behind measuring download time and 
individual disk fsync time separately, and logging them at INFO, is to help 
troubleshoot issues when they occur. Leaving it at DEBUG would require 
restarting the Namenodes and hoping for the incident to recur. Do you think 
leaving this at INFO would add too much overhead?

> TransferFsImage.receiveFile should account and log separate times for image 
> download and fsync to disk 
> ---
>
> Key: HDFS-9521
> URL: https://issues.apache.org/jira/browse/HDFS-9521
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Attachments: HDFS-9521.patch
>
>
> Currently, TransferFsImage.receiveFile is logging total transfer time as 
> below:
> {noformat}
> double xferSec = Math.max(
>     ((float)(Time.monotonicNow() - startTime)) / 1000.0, 0.001);
> long xferKb = received / 1024;
> LOG.info(String.format("Transfer took %.2fs at %.2f KB/s",
>     xferSec, xferKb / xferSec));
> {noformat}
> This is really useful, but it just measures the total method execution time, 
> which includes time taken to download the image and do an fsync to all the 
> namenode metadata directories.
> Sometimes, when troubleshooting these image transfer problems, it's 
> useful to know which part of the process is the bottleneck 
> (network or disk write).
> This patch accounts time for image download and fsync to each disk 
> separately, logging how much time each operation took.
>  
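
As a rough illustration of the per-phase accounting described above, the 
following is a minimal sketch, not the code from the attached patch. The class 
and method names are hypothetical, System.nanoTime() stands in for Hadoop's 
Time.monotonicNow(), and the sleeps stand in for the real download and fsync 
work.

```java
import java.util.Locale;

class TransferTimingSketch {
    /** Elapsed milliseconds to seconds, clamped to at least 1 ms as in the quoted snippet. */
    static double toSeconds(long elapsedMs) {
        return Math.max(elapsedMs / 1000.0, 0.001);
    }

    /** Builds one log line for a single phase (download, or fsync to one directory). */
    static String report(String phase, long elapsedMs, long bytes) {
        double sec = toSeconds(elapsedMs);
        long kb = bytes / 1024;
        return String.format(Locale.ROOT, "%s took %.2fs at %.2f KB/s", phase, sec, kb / sec);
    }

    public static void main(String[] args) throws InterruptedException {
        long bytes = 4L * 1024 * 1024;   // hypothetical image size

        long t0 = System.nanoTime();
        Thread.sleep(20);                // stand-in for the network download
        long t1 = System.nanoTime();
        Thread.sleep(5);                 // stand-in for fsync to one metadata directory
        long t2 = System.nanoTime();

        // Each phase gets its own line, so the slow part is visible directly.
        System.out.println(report("Download", (t1 - t0) / 1_000_000, bytes));
        System.out.println(report("Fsync to dir 1", (t2 - t1) / 1_000_000, bytes));
    }
}
```

With one line per phase, an operator can tell from the log alone whether the 
network or a particular disk dominated the transfer.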





[jira] [Commented] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages

2015-12-14 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056403#comment-15056403
 ] 

Colin Patrick McCabe commented on HDFS-7923:


We have tested this on large (300 node) clusters.  If you want to disable this, 
then you can simply set dfs.namenode.max.full.block.report.leases to a very 
large value.  Then you can have the old behavior where there is no 
rate-limiting on the block reports that come into the NameNode.  I'm not sure 
why you would want that behavior, though.

> The DataNodes should rate-limit their full block reports by asking the NN on 
> heartbeat messages
> ---
>
> Key: HDFS-7923
> URL: https://issues.apache.org/jira/browse/HDFS-7923
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 2.8.0
>
> Attachments: HDFS-7923.000.patch, HDFS-7923.001.patch, 
> HDFS-7923.002.patch, HDFS-7923.003.patch, HDFS-7923.004.patch, 
> HDFS-7923.006.patch, HDFS-7923.007.patch
>
>
> The DataNodes should rate-limit their full block reports.  They can do this 
> by first sending a heartbeat message to the NN with an optional boolean set 
> which requests permission to send a full block report.  If the NN responds 
> with another optional boolean set, the DN will send an FBR... if not, it will 
> wait until later.  This can be done compatibly with optional fields.
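
The permission handshake described above can be pictured as a bounded pool of 
leases on the NN side. This is only a sketch under that assumption, not the 
NameNode implementation: the class and method names are hypothetical, and the 
constructor argument merely plays the role of 
dfs.namenode.max.full.block.report.leases.

```java
class BlockReportLeaseSketch {
    private final int maxLeases;   // upper bound on concurrent full block reports
    private int active;            // leases currently handed out

    BlockReportLeaseSketch(int maxLeases) {
        this.maxLeases = maxLeases;
    }

    /** Called when a heartbeat asks for permission; true means "send the FBR now". */
    synchronized boolean requestLease() {
        if (active < maxLeases) {
            active++;
            return true;
        }
        return false;              // the DN waits and asks again on a later heartbeat
    }

    /** Called once a full block report has been processed. */
    synchronized void release() {
        if (active > 0) {
            active--;
        }
    }
}
```

In this picture, a very large limit never refuses a request, which matches the 
"old behavior" mentioned in the comment above.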





[jira] [Commented] (HDFS-9535) Newly completed blocks in IBR should not be considered under-replicated too quickly

2015-12-14 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056469#comment-15056469
 ] 

Jing Zhao commented on HDFS-9535:
-

The fix looks good to me. The test failures should be unrelated. +1. I will 
commit the patch shortly.

> Newly completed blocks in IBR should not be considered under-replicated too 
> quickly
> ---
>
> Key: HDFS-9535
> URL: https://issues.apache.org/jira/browse/HDFS-9535
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Attachments: HDFS-9535.000.patch, HDFS-9535.001.patch, 
> HDFS-9535.002.patch
>
>
> TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate failed in 
> several Jenkins runs (e.g., 
> https://builds.apache.org/job/PreCommit-HDFS-Build/13818/testReport/). The 
> failure is on the last {{assertNoReplicationWasPerformed}} check.





[jira] [Updated] (HDFS-9535) Newly completed blocks in IBR should not be considered under-replicated too quickly

2015-12-14 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-9535:

Description: 
TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate failed in several 
Jenkins runs (e.g., 
https://builds.apache.org/job/PreCommit-HDFS-Build/13818/testReport/). The 
failure is on the last {{assertNoReplicationWasPerformed}} check.

This test failure reveals a scenario that HDFS-1172 missed. Please see 
[~liuml07]'s comment [here|


  was:
TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate failed in several 
Jenkins runs (e.g., 
https://builds.apache.org/job/PreCommit-HDFS-Build/13818/testReport/). The 
failure is on the last {{assertNoReplicationWasPerformed}} check.




> Newly completed blocks in IBR should not be considered under-replicated too 
> quickly
> ---
>
> Key: HDFS-9535
> URL: https://issues.apache.org/jira/browse/HDFS-9535
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Attachments: HDFS-9535.000.patch, HDFS-9535.001.patch, 
> HDFS-9535.002.patch
>
>
> TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate failed in 
> several Jenkins runs (e.g., 
> https://builds.apache.org/job/PreCommit-HDFS-Build/13818/testReport/). The 
> failure is on the last {{assertNoReplicationWasPerformed}} check.
> This test failure reveals a scenario that HDFS-1172 missed. Please see 
> [~liuml07]'s comment [here|





[jira] [Commented] (HDFS-9281) Change TestDeleteBlockPool to not explicitly use File to check block pool existence.

2015-12-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056522#comment-15056522
 ] 

Hudson commented on HDFS-9281:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8962 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8962/])
HDFS-9281. Change TestDeleteBlockPool to not explicitly use File to (lei: rev 
f229772f99d1751e6b2152b6e3ac9c9f7844c15d)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDeleteBlockPool.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImplTestUtils.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/FsDatasetTestUtils.java


> Change TestDeleteBlockPool to not explicitly use File to check block pool 
> existence.
> 
>
> Key: HDFS-9281
> URL: https://issues.apache.org/jira/browse/HDFS-9281
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Fix For: 3.0.0, 2.9.0
>
> Attachments: HDFS-9281.00.patch, HDFS-9281.02.patch, 
> HDFS-9281.03.patch, HDFS-9281.combo.00.patch
>
>
> {{TestDeleteBlockPool}} checks the existence of a block pool by checking that 
> the directories in the file-based block pool exist. However, this does not 
> apply to a non-file-based fsdataset. 
> We can fix it by abstracting the checking logic behind {{FsDatasetTestUtils}}.





[jira] [Updated] (HDFS-9535) Newly completed blocks in IBR should not be considered under-replicated too quickly

2015-12-14 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-9535:

Description: 
TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate failed in several 
Jenkins runs (e.g., 
https://builds.apache.org/job/PreCommit-HDFS-Build/13818/testReport/). The 
failure is on the last {{assertNoReplicationWasPerformed}} check.

This test failure reveals a scenario that HDFS-1172 missed: if a block is first 
committed by the client, and then the first IBR comes to the NN, as proposed by 
HDFS-1172, we should still put the remaining expected replicas into the pending 
queue, instead of the under-replicated queue. Please see [~liuml07]'s comment 
[here|https://issues.apache.org/jira/browse/HDFS-9535?focusedCommentId=15052397=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15052397]
 for more details.


  was:
TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate failed in several 
Jenkins runs (e.g., 
https://builds.apache.org/job/PreCommit-HDFS-Build/13818/testReport/). The 
failure is on the last {{assertNoReplicationWasPerformed}} check.

This test failure reveals a scenario that HDFS-1172 missed. Please see 
[~liuml07]'s comment [here|



> Newly completed blocks in IBR should not be considered under-replicated too 
> quickly
> ---
>
> Key: HDFS-9535
> URL: https://issues.apache.org/jira/browse/HDFS-9535
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Attachments: HDFS-9535.000.patch, HDFS-9535.001.patch, 
> HDFS-9535.002.patch
>
>
> TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate failed in 
> several Jenkins runs (e.g., 
> https://builds.apache.org/job/PreCommit-HDFS-Build/13818/testReport/). The 
> failure is on the last {{assertNoReplicationWasPerformed}} check.
> This test failure reveals a scenario that HDFS-1172 missed: if a block is 
> first committed by the client, and then the first IBR comes to the NN, as 
> proposed by HDFS-1172, we should still put the remaining expected replicas 
> into the pending queue, instead of the under-replicated queue. Please see 
> [~liuml07]'s comment 
> [here|https://issues.apache.org/jira/browse/HDFS-9535?focusedCommentId=15052397=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15052397]
>  for more details.





[jira] [Commented] (HDFS-8860) Remove unused Replica copyOnWrite code

2015-12-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056569#comment-15056569
 ] 

Hudson commented on HDFS-8860:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8963 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8963/])
Revert "Revert "HDFS-8860. Remove unused Replica copyOnWrite code (Lei (lei: 
rev de522d2cd46be13806d13aa5f373b310e0ad6693)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ReplicaInfo.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ReplicaWaitingToBeRecovered.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestDatanodeRestart.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ReplicaUnderRecovery.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FinalizedReplica.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java


> Remove unused Replica copyOnWrite code
> --
>
> Key: HDFS-8860
> URL: https://issues.apache.org/jira/browse/HDFS-8860
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.8.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Fix For: 2.8.0
>
> Attachments: HDFS-8860.0.patch
>
>
> {{ReplicaInfo#unlinkBlock()}} is effectively disabled by the following code, 
> because {{isUnlinked()}} always returns true.
> {code}
> if (isUnlinked()) {
>   return false;
> }
> {code}
> Several test cases, e.g., {{TestFileAppend#testCopyOnWrite}} and 
> {{TestDatanodeRestart#testRecoverReplicas}}, are testing against the unlink. 
> Let's remove the relevant code to eliminate the confusion. 





[jira] [Updated] (HDFS-9524) libhdfs++ deadlocks in Filesystem::New if NN conneciton fails

2015-12-14 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-9524:
-
Attachment: HDFS-9524.HDFS-8707.001.patch

> libhdfs++ deadlocks in Filesystem::New if NN conneciton fails
> -
>
> Key: HDFS-9524
> URL: https://issues.apache.org/jira/browse/HDFS-9524
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9524.HDFS-8707.000.patch, 
> HDFS-9524.HDFS-8707.001.patch
>
>
> FileSystem::New attempts to free the new FileSystem if the connection fails.  
> Unfortunately, it's in the middle of a callback from the filesystem's 
> threadpool, and attempts to join the worker thread while running the worker 
> thread.
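
The self-join described above is easiest to see in miniature. The following is 
a Java analogue only, since libhdfs++ itself is C++, and it is not the fix in 
the attached patch: all names are hypothetical. It shows the general guard of 
refusing to join a thread from within that same thread.

```java
class SelfJoinGuardSketch {
    /**
     * Joins the worker unless the caller is the worker itself.
     * Returns true if a join was attempted, false if it was skipped.
     */
    static boolean safeJoin(Thread worker) {
        if (Thread.currentThread() == worker) {
            return false;   // joining ourselves would never return
        }
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return true;
    }

    /** Runs safeJoin from inside the worker and reports what it returned there. */
    static boolean joinAttemptedFromWorker() {
        final boolean[] attempted = { true };
        final Thread[] holder = new Thread[1];
        // The worker plays the role of a callback running on the pool's own thread.
        holder[0] = new Thread(() -> attempted[0] = safeJoin(holder[0]));
        holder[0].start();
        safeJoin(holder[0]);   // called from outside the worker, so this really joins
        return attempted[0];
    }
}
```

Without the identity check, the call made from inside the worker would wait on 
its own termination and block forever, which is the shape of the deadlock 
reported here.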





[jira] [Commented] (HDFS-9524) libhdfs++ deadlocks in Filesystem::New if NN conneciton fails

2015-12-14 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056466#comment-15056466
 ] 

Bob Hansen commented on HDFS-9524:
--

Oh!  You are correct; that failure was masked by HDFS-9523.  New patch attached.

> libhdfs++ deadlocks in Filesystem::New if NN conneciton fails
> -
>
> Key: HDFS-9524
> URL: https://issues.apache.org/jira/browse/HDFS-9524
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9524.HDFS-8707.000.patch
>
>
> FileSystem::New attempts to free the new FileSystem if the connection fails.  
> Unfortunately, it's in the middle of a callback from the filesystem's 
> threadpool, and attempts to join the worker thread while running the worker 
> thread.





[jira] [Updated] (HDFS-9535) Newly completed blocks in IBR should not be considered under-replicated too quickly

2015-12-14 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-9535:

Description: 
TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate failed in several 
Jenkins runs (e.g., 
https://builds.apache.org/job/PreCommit-HDFS-Build/13818/testReport/). The 
failure is on the last {{assertNoReplicationWasPerformed}} check.



  was:TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate failed in 
several Jenkins runs (e.g., 
https://builds.apache.org/job/PreCommit-HDFS-Build/13818/testReport/). The 
failure is on the last {{assertNoReplicationWasPerformed}} check.


> Newly completed blocks in IBR should not be considered under-replicated too 
> quickly
> ---
>
> Key: HDFS-9535
> URL: https://issues.apache.org/jira/browse/HDFS-9535
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Attachments: HDFS-9535.000.patch, HDFS-9535.001.patch, 
> HDFS-9535.002.patch
>
>
> TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate failed in 
> several Jenkins runs (e.g., 
> https://builds.apache.org/job/PreCommit-HDFS-Build/13818/testReport/). The 
> failure is on the last {{assertNoReplicationWasPerformed}} check.





[jira] [Commented] (HDFS-9535) Newly completed blocks in IBR should not be considered under-replicated too quickly

2015-12-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056523#comment-15056523
 ] 

Hudson commented on HDFS-9535:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8962 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8962/])
HDFS-9535. Newly completed blocks in IBR should not be considered (jing9: rev 
e53456981474d6e16e3c134e3777b3588dc6fedf)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Newly completed blocks in IBR should not be considered under-replicated too 
> quickly
> ---
>
> Key: HDFS-9535
> URL: https://issues.apache.org/jira/browse/HDFS-9535
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Attachments: HDFS-9535.000.patch, HDFS-9535.001.patch, 
> HDFS-9535.002.patch
>
>
> TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate failed in 
> several Jenkins runs (e.g., 
> https://builds.apache.org/job/PreCommit-HDFS-Build/13818/testReport/). The 
> failure is on the last {{assertNoReplicationWasPerformed}} check.
> This test failure reveals a scenario that HDFS-1172 missed. Please see 
> [~liuml07]'s comment [here|





[jira] [Updated] (HDFS-9535) Newly completed blocks in IBR should not be considered under-replicated too quickly

2015-12-14 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-9535:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

I've committed the patch into trunk, branch-2, and branch-2.8.

> Newly completed blocks in IBR should not be considered under-replicated too 
> quickly
> ---
>
> Key: HDFS-9535
> URL: https://issues.apache.org/jira/browse/HDFS-9535
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9535.000.patch, HDFS-9535.001.patch, 
> HDFS-9535.002.patch
>
>
> TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate failed in 
> several Jenkins runs (e.g., 
> https://builds.apache.org/job/PreCommit-HDFS-Build/13818/testReport/). The 
> failure is on the last {{assertNoReplicationWasPerformed}} check.
> This test failure reveals a scenario that HDFS-1172 missed: if a block is 
> first committed by the client, and then the first IBR comes to the NN, as 
> proposed by HDFS-1172, we should still put the remaining expected replicas 
> into the pending queue, instead of the under-replicated queue. Please see 
> [~liuml07]'s comment 
> [here|https://issues.apache.org/jira/browse/HDFS-9535?focusedCommentId=15052397=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15052397]
>  for more details.





[jira] [Commented] (HDFS-9521) TransferFsImage.receiveFile should account and log separate times for image download and fsync to disk

2015-12-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056559#comment-15056559
 ] 

Mingliang Liu commented on HDFS-9521:
-

The overhead should be fine. Overall information may be useful so that the 
operator does not have to calculate from verbose information.

> TransferFsImage.receiveFile should account and log separate times for image 
> download and fsync to disk 
> ---
>
> Key: HDFS-9521
> URL: https://issues.apache.org/jira/browse/HDFS-9521
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Attachments: HDFS-9521.patch
>
>
> Currently, TransferFsImage.receiveFile is logging total transfer time as 
> below:
> {noformat}
> double xferSec = Math.max(
>((float)(Time.monotonicNow() - startTime)) / 1000.0, 0.001);
> long xferKb = received / 1024;
> LOG.info(String.format("Transfer took %.2fs at %.2f KB/s",
>     xferSec, xferKb / xferSec));
> {noformat}
> This is really useful, but it just measures the total method execution time, 
> which includes time taken to download the image and do an fsync to all the 
> namenode metadata directories.
> Sometimes, when troubleshooting these image transfer problems, it's 
> useful to know which part of the process is the bottleneck 
> (network or disk write).
> This patch accounts time for image download and fsync to each disk 
> separately, logging how much time each operation took.
>  





[jira] [Commented] (HDFS-9555) LazyPersistFileScrubber should still sleep if there are errors in the clear progress

2015-12-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056457#comment-15056457
 ] 

Mingliang Liu commented on HDFS-9555:
-

Thanks for reporting this.

I'm not sure swallowing {{Exception}} instead of {{IOException}} is a good idea 
after {{clearCorruptLazyPersistFiles}}, but skipping the sleep logic makes 
little sense in the current code.

+1 (non-binding).

> LazyPersistFileScrubber should still sleep if there are errors in the clear 
> progress
> 
>
> Key: HDFS-9555
> URL: https://issues.apache.org/jira/browse/HDFS-9555
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Phil Yang
>Assignee: Phil Yang
> Attachments: 9555-v1.patch
>
>
> If LazyPersistFileScrubber.clearCorruptLazyPersistFiles throws an exception 
> in run(), the sleep logic is skipped and the loop restarts immediately. It 
> may then fail again, so the namenode log fills with ERROR entries saying 
> "Ignoring exception in LazyPersistFileScrubber".
> We need to sleep even if we catch the exception.
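
The behavior asked for above can be sketched as a loop whose sleep sits outside 
the try block, so a failing iteration still waits before retrying. This is a 
minimal illustration with hypothetical names, not the code in 9555-v1.patch. 
The bounded iteration count exists only so the sketch terminates.

```java
class ScrubberLoopSketch {
    /** Stand-in for one pass of clearCorruptLazyPersistFiles. */
    interface Task {
        void run() throws Exception;
    }

    /** Runs the task a fixed number of times and returns how many sleeps completed. */
    static int runLoop(Task task, int iterations, long sleepMs) {
        int sleeps = 0;
        for (int i = 0; i < iterations; i++) {
            try {
                task.run();
            } catch (Exception e) {
                // Log and fall through: the sleep below still happens.
                System.err.println("Ignoring exception in scrubber sketch: " + e);
            }
            try {
                Thread.sleep(sleepMs);
                sleeps++;
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                break;   // stop the loop when asked to shut down
            }
        }
        return sleeps;
    }
}
```

Because the sleep is reached on both the success and failure paths, a task that 
fails every time produces at most one error line per interval instead of a 
tight stream of them.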





[jira] [Updated] (HDFS-9281) Change TestDeleteBlockPool to not explicitly use File to check block pool existence.

2015-12-14 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-9281:

   Resolution: Fixed
Fix Version/s: 2.9.0
   3.0.0
   Status: Resolved  (was: Patch Available)

Thanks a lot for the reviews and suggestions, [~cmccabe]!

I committed this to  {{trunk}} and {{branch-2}}.

> Change TestDeleteBlockPool to not explicitly use File to check block pool 
> existence.
> 
>
> Key: HDFS-9281
> URL: https://issues.apache.org/jira/browse/HDFS-9281
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Fix For: 3.0.0, 2.9.0
>
> Attachments: HDFS-9281.00.patch, HDFS-9281.02.patch, 
> HDFS-9281.03.patch, HDFS-9281.combo.00.patch
>
>
> {{TestDeleteBlockPool}} checks the existence of a block pool by checking that 
> the directories in the file-based block pool exist. However, this does not 
> apply to a non-file-based fsdataset. 
> We can fix it by abstracting the checking logic behind {{FsDatasetTestUtils}}.





[jira] [Commented] (HDFS-9538) libhdfs++: load configuration from files

2015-12-14 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056421#comment-15056421
 ] 

James Clampffer commented on HDFS-9538:
---

A couple more comments:

-nftw_remove calls perror to handle error conditions.  Should this have an 
#ifdef in case the user doesn't want things being printed to stderr?
-The return code for the mkdir call in TempDir::Tempdir() is never checked.  
Should this have a check that clears out path, or bails if path is already 
empty, in case of permissions issues?

> libhdfs++: load configuration from files
> 
>
> Key: HDFS-9538
> URL: https://issues.apache.org/jira/browse/HDFS-9538
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9538.HDFS-9537.000.patch, 
> HDFS-9538.HDFS-9537.001.patch
>
>
> One goal of the Configuration classes is to allow the consumers of the 
> libhdfs++ library to deploy client applications into hadoop edge nodes and 
> have them pick up the Hadoop configuration that has been deployed there.
> Note that we also need to support the use case where the consumer application 
> will manage Hadoop configuration files itself, or will handle all 
> configuration out-of-band.
> libhdfs++ should be able to read files that are found in the field and easily 
> construct an instance that will communicate with the cluster.





[jira] [Commented] (HDFS-9524) libhdfs++ deadlocks in Filesystem::New if NN connection fails

2015-12-14 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056453#comment-15056453
 ] 

James Clampffer commented on HDFS-9524:
---

This seems to cause libhdfs_threaded_hdfspp_test_shim_static to fail on my dev 
machine.  Might be related to the code snippet below.

In hdfsConnect:
{code}
if (fs->Connect(nn, port_as_string).ok()) {
  ReportError(ENODEV, "Unable to connect to NameNode.");
{code}
Should the if condition be negated here?

> libhdfs++ deadlocks in Filesystem::New if NN connection fails
> -
>
> Key: HDFS-9524
> URL: https://issues.apache.org/jira/browse/HDFS-9524
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9524.HDFS-8707.000.patch
>
>
> FileSystem::New attempts to free the new FileSystem if the connection fails.  
> Unfortunately, it's in the middle of a callback from the filesystem's 
> threadpool, and attempts to join the worker thread while running the worker 
> thread.
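
The self-join hazard described above can be sketched in a few lines. This is a Java analogue written for illustration only, not libhdfs++ code (which is C++); the class and method names are hypothetical. The idea is that cleanup code should check whether it is running on the worker thread before attempting to join it, and defer the join to another thread if so.

```java
// Java analogue of the self-join hazard; not libhdfs++ code, all names are hypothetical.
class SelfJoinSketch {

  /** Returns true only if the calling thread is not the thread it wants to join. */
  static boolean safeToJoin(Thread worker) {
    return Thread.currentThread() != worker;
  }

  /**
   * Runs a callback on a worker thread and records the guard's answer
   * from inside the callback (index 0) and from the owning thread (index 1).
   */
  static boolean[] demo() {
    final boolean[] seen = new boolean[2];
    Thread worker = new Thread(() -> {
      // Inside the callback we ARE the worker: an unconditional join here would never return.
      seen[0] = safeToJoin(Thread.currentThread());
    });
    worker.start();
    try {
      // Joining from the owning thread is fine and makes the write to seen[0] visible.
      worker.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for worker", e);
    }
    seen[1] = safeToJoin(worker);
    return seen;
  }

  public static void main(String[] args) {
    boolean[] seen = demo();
    System.out.println("safe inside callback: " + seen[0]);
    System.out.println("safe from owner thread: " + seen[1]);
  }
}
```

A guard like this only detects the problem; whether the cleanup is then deferred, handed to another thread, or skipped is a design decision the sketch leaves open.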





[jira] [Updated] (HDFS-7764) DirectoryScanner shouldn't abort the scan if one directory had an error

2015-12-14 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-7764:
---
Component/s: datanode

> DirectoryScanner shouldn't abort the scan if one directory had an error
> ---
>
> Key: HDFS-7764
> URL: https://issues.apache.org/jira/browse/HDFS-7764
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.0
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-7764-01.patch, HDFS-7764-02.patch, 
> HDFS-7764-03.patch, HDFS-7764.patch
>
>
> If an exception occurs while preparing the ScanInfo for the blocks in a 
> directory, DirectoryScanner immediately rethrows it and abandons the current 
> scan cycle. The idea of this jira is to discuss and improve the exception 
> handling mechanism.
> DirectoryScanner.java
> {code}
> for (Entry<Integer, Future<ScanInfoPerBlockPool>> report :
>     compilersInProgress.entrySet()) {
>   try {
>     dirReports[report.getKey()] = report.getValue().get();
>   } catch (Exception ex) {
>     LOG.error("Error compiling report", ex);
>     // Propagate ex to DataBlockScanner to deal with
>     throw new RuntimeException(ex);
>   }
> }
> {code}
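
One possible direction, sketched here only as an assumption and not as the contents of any attached patch, is to log the failure for the affected directory and keep collecting the remaining reports. The class below is a self-contained stand-in: it uses plain String reports instead of ScanInfoPerBlockPool, and the names TolerantScanSketch, collectReports, and demo are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for the scanner loop; String replaces ScanInfoPerBlockPool.
class TolerantScanSketch {

  /** Collects per-volume reports; a volume whose scan failed leaves a null slot. */
  static String[] collectReports(Map<Integer, Future<String>> inProgress, int volumes) {
    String[] reports = new String[volumes];
    for (Map.Entry<Integer, Future<String>> entry : inProgress.entrySet()) {
      try {
        reports[entry.getKey()] = entry.getValue().get();
      } catch (InterruptedException ex) {
        // Restore the interrupt flag and stop waiting for further reports.
        Thread.currentThread().interrupt();
        break;
      } catch (Exception ex) {
        // Log and continue: one bad directory no longer aborts the whole scan cycle.
        System.err.println("Error compiling report for volume " + entry.getKey() + ": " + ex);
      }
    }
    return reports;
  }

  /** Runs three fake volume scans, the second of which fails. */
  static String[] demo() {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      Callable<String> ok0 = () -> "report-0";
      Callable<String> bad = () -> {
        throw new IllegalStateException("disk error");
      };
      Callable<String> ok2 = () -> "report-2";
      Map<Integer, Future<String>> inProgress = new LinkedHashMap<>();
      inProgress.put(0, pool.submit(ok0));
      inProgress.put(1, pool.submit(bad));
      inProgress.put(2, pool.submit(ok2));
      return collectReports(inProgress, 3);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    for (String report : demo()) {
      System.out.println(report);
    }
  }
}
```

Callers would then need to tolerate null entries in the result, which is the trade-off this approach makes in exchange for completing the scan of the healthy directories.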





[jira] [Commented] (HDFS-9281) Change TestDeleteBlockPool to not explicitly use File to check block pool existence.

2015-12-14 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056383#comment-15056383
 ] 

Colin Patrick McCabe commented on HDFS-9281:


+1.  Thanks, [~eddyxu].

> Change TestDeleteBlockPool to not explicitly use File to check block pool 
> existence.
> 
>
> Key: HDFS-9281
> URL: https://issues.apache.org/jira/browse/HDFS-9281
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Attachments: HDFS-9281.00.patch, HDFS-9281.02.patch, 
> HDFS-9281.03.patch, HDFS-9281.combo.00.patch
>
>
> {{TestDeleteBlockPool}} checks the existence of a block pool by checking that 
> the directories in the file-based block pool exist. However, this does not 
> apply to non-file-based fsdataset implementations. 
> We can fix it by abstracting the checking logic behind {{FsDatasetTestUtils}}.





[jira] [Commented] (HDFS-9533) seen_txid in the shared edits directory is modified during bootstrapping

2015-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056456#comment-15056456
 ] 

Hadoop QA commented on HDFS-9533:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
7s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 9s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 10s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
25s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 39s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 22s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
3s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 5s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 5s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 56s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 9s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
37s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 34s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 30s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 16s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 21s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 27s 
{color} | {color:red} Patch generated 56 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 192m 33s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.TestDatanodeDeath |
|   | hadoop.hdfs.TestLeaseRecovery2 |
| JDK v1.7.0_91 Failed junit tests | 
hadoop.hdfs.server.namenode.TestFSImageWithAcl |
|   | hadoop.hdfs.qjournal.TestSecureNNWithQJM |
|   | hadoop.hdfs.TestDFSUpgradeFromImage |
|   | hadoop.hdfs.server.namenode.TestFsck |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12777179/HDFS-9533.patch |
| JIRA Issue | HDFS-9533 |
| Optional Tests |  asflicense  compile  

[jira] [Commented] (HDFS-9535) Newly completed blocks in IBR should not be considered under-replicated too quickly

2015-12-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056476#comment-15056476
 ] 

Mingliang Liu commented on HDFS-9535:
-

Thanks to [~jingzhao] for cutting this jira, reviewing and committing the 
patch. Thanks to [~iwasakims] for useful comments.

> Newly completed blocks in IBR should not be considered under-replicated too 
> quickly
> ---
>
> Key: HDFS-9535
> URL: https://issues.apache.org/jira/browse/HDFS-9535
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Attachments: HDFS-9535.000.patch, HDFS-9535.001.patch, 
> HDFS-9535.002.patch
>
>
> TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate failed in 
> several Jenkins runs (e.g., 
> https://builds.apache.org/job/PreCommit-HDFS-Build/13818/testReport/). The 
> failure is on the last {{assertNoReplicationWasPerformed}} check.





[jira] [Commented] (HDFS-7344) [umbrella] Erasure Coding worker and support in DataNode

2015-12-14 Thread GAO Rui (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055600#comment-15055600
 ] 

GAO Rui commented on HDFS-7344:
---

Hi [~drankye], [~libo-intel], do we still use this Jira to track the 
implementation of {{ErasureCodingWorker}}? Could you give me some advice, and 
point me to the Jiras that would help me catch up on the 
{{ErasureCodingWorker}} implementation?

> [umbrella] Erasure Coding worker and support in DataNode
> 
>
> Key: HDFS-7344
> URL: https://issues.apache.org/jira/browse/HDFS-7344
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Kai Zheng
>Assignee: Li Bo
> Attachments: ECWorker-design-v2.pdf, HDFS ECWorker Design.pdf, 
> hdfs-ec-datanode.0108.zip, hdfs-ec-datanode.0108.zip
>
>
> According to HDFS-7285 and the design, this handles DataNode side extension 
> and related support for Erasure Coding. More specifically, it implements 
> {{ECWorker}}, which reconstructs lost blocks (in striping layout).
> It generally needs to restore BlockGroup and schema information from coding 
> commands from NameNode or other entities, and construct specific coding work 
> to execute. The required block reader, writer, either local or remote, 
> encoder and decoder, will be implemented separately as sub-tasks. 
> This JIRA will track all the linked sub-tasks, and is responsible for general 
> discussions and integration for ECWorker. It won't resolve until all the 
> related tasks are done.





[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )

2015-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055603#comment-15055603
 ] 

Hadoop QA commented on HDFS-9494:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
37s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 55s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 17s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 25s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
18s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 33s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 9m 11s {color} 
| {color:red} hadoop-hdfs-project_hadoop-hdfs-client-jdk1.8.0_66 with JDK 
v1.8.0_66 generated 1 new issues (was 14, now 14). {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 33s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 5s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 10m 17s 
{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs-client-jdk1.7.0_91 with 
JDK v1.7.0_91 generated 1 new issues (was 14, now 14). {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 5s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 25s 
{color} | {color:red} Patch generated 3 new checkstyle issues in 
hadoop-hdfs-project/hadoop-hdfs-client (total was 31, now 34). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 21s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 14s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 43s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
58s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 58m 47s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes 

[jira] [Commented] (HDFS-9525) hadoop utilities need to support provided delegation tokens

2015-12-14 Thread HeeSoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056795#comment-15056795
 ] 

HeeSoo Kim commented on HDFS-9525:
--

[~daryn], [~aw] Thank you for your feedback.
{quote}
An enhanced fetchdt is probably the best solution to side step the lack of 
realm trust.
{quote}
That's right. We can use fetchdt to get the token from a cluster in an 
untrusted realm. However, WebHDFS still has a problem using a token that was 
obtained with fetchdt.

I changed the code to support the following features.
# It supports multiple token files when the delegation token is fetched from 
the target filesystem using fetchdt.
# When running distcp from a non-Kerberos cluster to a Kerberos cluster, 
WebHDFS does not use the delegation token even if the UGI holds the WebHDFS 
token. The patch adds support for using the token for WebHDFS on a 
non-Kerberos cluster.

> hadoop utilities need to support provided delegation tokens
> ---
>
> Key: HDFS-9525
> URL: https://issues.apache.org/jira/browse/HDFS-9525
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: HeeSoo Kim
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HDFS-7984.001.patch, HDFS-7984.002.patch, 
> HDFS-7984.003.patch, HDFS-7984.004.patch, HDFS-7984.005.patch, 
> HDFS-7984.006.patch, HDFS-7984.007.patch, HDFS-7984.patch
>
>
> When using the webhdfs:// filesystem (especially from distcp), we need the 
> ability to inject a delegation token rather than have webhdfs initialize its own.  
> This would allow for cross-authentication-zone file system accesses.





[jira] [Updated] (HDFS-9525) hadoop utilities need to support provided delegation tokens

2015-12-14 Thread HeeSoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeeSoo Kim updated HDFS-9525:
-
Attachment: HDFS-9525.008.patch

# It supports multiple token files when the delegation token is fetched from 
the target filesystem using fetchdt.
# When running distcp from a non-Kerberos cluster to a Kerberos cluster, 
WebHDFS does not use the delegation token even if the UGI holds the WebHDFS 
token. The patch adds support for using the token for WebHDFS on a 
non-Kerberos cluster.

> hadoop utilities need to support provided delegation tokens
> ---
>
> Key: HDFS-9525
> URL: https://issues.apache.org/jira/browse/HDFS-9525
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: HeeSoo Kim
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HDFS-7984.001.patch, HDFS-7984.002.patch, 
> HDFS-7984.003.patch, HDFS-7984.004.patch, HDFS-7984.005.patch, 
> HDFS-7984.006.patch, HDFS-7984.007.patch, HDFS-7984.patch, HDFS-9525.008.patch
>
>
> When using the webhdfs:// filesystem (especially from distcp), we need the 
> ability to inject a delegation token rather than have webhdfs initialize its own.  
> This would allow for cross-authentication-zone file system accesses.





[jira] [Commented] (HDFS-9173) Erasure Coding: Lease recovery for striped file

2015-12-14 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056836#comment-15056836
 ] 

Jing Zhao commented on HDFS-9173:
-

Thanks for the review, Zhe! 

bq. FSNamesystem#commitBlockSynchronization also uses getDatanodeStorageInfos. 
The behavior is a little tricky.

Good catch. This is a mistake I made during the rebase. Walter's original patch 
is correct. Will update the patch to address your comments.


> Erasure Coding: Lease recovery for striped file
> ---
>
> Key: HDFS-9173
> URL: https://issues.apache.org/jira/browse/HDFS-9173
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Walter Su
>Assignee: Walter Su
> Attachments: HDFS-9173.00.wip.patch, HDFS-9173.01.patch, 
> HDFS-9173.02.step125.patch, HDFS-9173.03.patch, HDFS-9173.04.patch, 
> HDFS-9173.05.patch, HDFS-9173.06.patch, HDFS-9173.07.patch
>
>






[jira] [Commented] (HDFS-9371) Code cleanup for DatanodeManager

2015-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056839#comment-15056839
 ] 

Hadoop QA commented on HDFS-9371:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 1s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 50s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 6m 32s {color} 
| {color:red} hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_66 with JDK v1.8.0_66 
generated 1 new issues (was 32, now 32). {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 7m 16s {color} 
| {color:red} hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_91 with JDK v1.7.0_91 
generated 1 new issues (was 34, now 34). {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 44s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 17s 
{color} | {color:red} Patch generated 8 new checkstyle issues in 
hadoop-hdfs-project/hadoop-hdfs (total was 474, now 453). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
1s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 9s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 56s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 57m 11s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 54m 38s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 19s 
{color} | {color:red} Patch generated 56 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 140m 24s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 

[jira] [Updated] (HDFS-9487) libhdfs++ Enable builds with no compiler optimizations

2015-12-14 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-9487:
-
Attachment: HDFS-9487.HDFS-8707.000.patch

Patch sets default optimization for debug builds to 0 and adds mvn properties 
for native_cmake_args, native_make_args, and native_ctest_args.

To build debug and spew to the console to make sure it worked, use:
{code}
mvn -pl :hadoop-hdfs-native-client -Pnative \
-Dnative_cmake_args="-DCMAKE_BUILD_TYPE=Debug"  \
-Dnative_make_args="VERBOSE=1" install
{code}

> libhdfs++ Enable builds with no compiler optimizations
> --
>
> Key: HDFS-9487
> URL: https://issues.apache.org/jira/browse/HDFS-9487
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-9487.HDFS-8707.000.patch
>
>
> The default build configuration uses -O2 -g.  To make 
> debugging easier it would be really nice to be able to produce builds with 
> -O0.
> I haven't found an existing flag to pass to maven or cmake to accomplish 
> this. 





[jira] [Updated] (HDFS-9552) Document types of permission checks performed for HDFS operations.

2015-12-14 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-9552:

Attachment: HDFS-9552.002.patch

I'm attaching patch v002 with a few additional clarifications about concat and 
setOwner.

> Document types of permission checks performed for HDFS operations.
> --
>
> Key: HDFS-9552
> URL: https://issues.apache.org/jira/browse/HDFS-9552
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-9552.001.patch, HDFS-9552.002.patch
>
>
> The HDFS permissions guide discusses our use of a POSIX-like model with read, 
> write and execute permissions associated with users, groups and the catch-all 
> other class.  However, there is no documentation that describes exactly what 
> permission checks are performed by user-facing HDFS operations.  This is a 
> frequent source of questions, so it would be good to document this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9552) Document types of permission checks performed for HDFS operations.

2015-12-14 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-9552:

Attachment: (was: hadoop-site.tar.bz2)

> Document types of permission checks performed for HDFS operations.
> --
>
> Key: HDFS-9552
> URL: https://issues.apache.org/jira/browse/HDFS-9552
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-9552.001.patch, HDFS-9552.002.patch
>
>
> The HDFS permissions guide discusses our use of a POSIX-like model with read, 
> write and execute permissions associated with users, groups and the catch-all 
> other class.  However, there is no documentation that describes exactly what 
> permission checks are performed by user-facing HDFS operations.  This is a 
> frequent source of questions, so it would be good to document this.





[jira] [Commented] (HDFS-9281) Change TestDeleteBlockPool to not explicitly use File to check block pool existence.

2015-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056736#comment-15056736
 ] 

Hadoop QA commented on HDFS-9281:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 11m 32s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 31s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 6s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 20s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 19s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 45s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 48s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 57s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 24s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 3s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 4s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 54s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 31s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 73m 15s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 55s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 26s {color} | {color:red} Patch generated 58 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 191m 27s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | hadoop.hdfs.server.namenode.TestRecoverStripedBlocks |
|   | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes |
|   | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
|   | hadoop.hdfs.server.namenode.ha.TestRequestHedgingProxyProvider |
|   | hadoop.hdfs.TestDistributedFileSystem |
| JDK v1.8.0_66 Timed out junit tests | org.apache.hadoop.hdfs.server.mover.TestStorageMover |
|   | org.apache.hadoop.hdfs.server.balancer.TestBalancer |
|   | org.apache.hadoop.hdfs.server.mover.TestMover |
|   | org.apache.hadoop.hdfs.TestEncryptedTransfer |
| JDK v1.7.0_91 Failed junit tests | hadoop.hdfs.server.namenode.TestRecoverStripedBlocks |
|   | 

[jira] [Commented] (HDFS-9524) libhdfs++ deadlocks in Filesystem::New if NN conneciton fails

2015-12-14 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056693#comment-15056693
 ] 

James Clampffer commented on HDFS-9524:
---

Thanks for the patch update [~bobthansen]; it works for me now.

As far as I can tell you'd have the same problem if you deleted the FileSystem 
in the FileSystem::Connect callback on a failed connect.  Maybe it's worth 
having a rule/comment about deleting the filesystem from within the context of 
a callback?

> libhdfs++ deadlocks in Filesystem::New if NN conneciton fails
> -
>
> Key: HDFS-9524
> URL: https://issues.apache.org/jira/browse/HDFS-9524
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9524.HDFS-8707.000.patch, 
> HDFS-9524.HDFS-8707.001.patch
>
>
> FileSystem::New attempts to free the new FileSystem if the connection fails.  
> Unfortunately, it's in the middle of a callback from the filesystem's 
> threadpool, and attempts to join the worker thread while running the worker 
> thread.
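
The failure mode described above, a worker thread trying to join itself from inside a callback, is not specific to C++. The following is a minimal Java sketch of the same hazard and one possible guard. It is not the libhdfs++ patch: the names SelfJoinGuard, Owner, and shutdownFromCallback are hypothetical and exist only for this illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

public class SelfJoinGuard {
    /** Hypothetical stand-in for an object that owns a single worker thread. */
    static final class Owner {
        private final Thread worker;

        Owner(Runnable task) {
            this.worker = new Thread(task, "worker");
        }

        void start() {
            worker.start();
        }

        /**
         * Returns true only if the worker was actually joined. When called
         * from the worker itself, the join is skipped, because a thread
         * that joins itself waits forever.
         */
        boolean shutdown() {
            if (Thread.currentThread() == worker) {
                return false;
            }
            try {
                worker.join();
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    /** Runs a callback on the worker that tries to shut down its own owner. */
    static boolean shutdownFromCallback() {
        AtomicReference<Owner> ref = new AtomicReference<>();
        AtomicBoolean joinedInsideCallback = new AtomicBoolean(true);
        Owner owner = new Owner(() -> joinedInsideCallback.set(ref.get().shutdown()));
        ref.set(owner);
        owner.start();
        owner.shutdown(); // called from the outside thread, so joining is safe
        return joinedInsideCallback.get();
    }

    public static void main(String[] args) {
        System.out.println("joined inside callback: " + shutdownFromCallback());
    }
}
```

The guard only avoids the hang. Whether the owner should instead defer its own destruction until the callback returns is a separate design question, which is what the suggested rule about deleting the filesystem from a callback would address.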





[jira] [Commented] (HDFS-9173) Erasure Coding: Lease recovery for striped file

2015-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056698#comment-15056698
 ] 

Hadoop QA commented on HDFS-9173:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 8 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 40s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 37s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 35s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 51s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 25s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 12s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 35s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 35s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 35s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 36s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 36s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 20s {color} | {color:red} Patch generated 15 new checkstyle issues in hadoop-hdfs-project (total was 595, now 602). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 5s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs introduced 1 new FindBugs issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 28s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 12s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 51s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 57m 31s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 57s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 52m 4s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 18s {color} | {color:red} Patch generated 59 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 149m 37s {color} | {color:black} {color} |
\\
\\
|| Reason 

[jira] [Updated] (HDFS-9538) libhdfs++: load configuration from files

2015-12-14 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-9538:
-
Attachment: HDFS-9538.HDFS-8707.003.patch

New patch: rebased onto HDFS-8707.  Should be landable now.

> libhdfs++: load configuration from files
> 
>
> Key: HDFS-9538
> URL: https://issues.apache.org/jira/browse/HDFS-9538
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9538.HDFS-8707.003.patch, 
> HDFS-9538.HDFS-9537.000.patch, HDFS-9538.HDFS-9537.001.patch, 
> HDFS-9538.HDFS-9537.002.patch
>
>
> One goal of the Configuration classes is to allow the consumers of the 
> libhdfs++ library to deploy client applications into hadoop edge nodes and 
> have them pick up the Hadoop configuration that has been deployed there.
> Note that we also need to support the use case where the consumer application 
> will manage Hadoop configuration files itself, or will handle all 
> configuration out-of-band.
> libhdfs++ should be able to read files that are found in the field and easily 
> construct an instance that will communicate with the cluster.





[jira] [Commented] (HDFS-9538) libhdfs++: load configuration from files

2015-12-14 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056713#comment-15056713
 ] 

James Clampffer commented on HDFS-9538:
---

"To be able to re-use the calculations for Java classpaths as a search path for 
the config files."
Sounds good, makes sense.

"I took out the "file_exists" call, and just try to read from the file. If we 
can't then we go on as if it doesn't exist."
Nice, that works.
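
The approach quoted above (attempt the read and treat a failure as "file absent", rather than checking for existence first) can be sketched in Java as follows. This is an illustration of the pattern only, not the libhdfs++ C++ code; the class TryReadConfig and its methods tryRead, loadFirst, and demo are hypothetical names, and "core-site.xml" is used merely as a familiar Hadoop config file name.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class TryReadConfig {
    /** Attempts the read directly; any I/O failure is treated as "not present". */
    static Optional<String> tryRead(Path path) {
        try {
            byte[] bytes = Files.readAllBytes(path);
            return Optional.of(new String(bytes, StandardCharsets.UTF_8));
        } catch (IOException e) {
            return Optional.empty();
        }
    }

    /** Returns the contents of the first readable fileName on the search path. */
    static Optional<String> loadFirst(List<Path> searchPath, String fileName) {
        for (Path dir : searchPath) {
            Optional<String> contents = tryRead(dir.resolve(fileName));
            if (contents.isPresent()) {
                return contents;
            }
        }
        return Optional.empty();
    }

    /** Builds two temporary directories, only the second of which has the file. */
    static String demo() {
        try {
            Path empty = Files.createTempDirectory("conf-empty");
            Path real = Files.createTempDirectory("conf-real");
            Files.write(real.resolve("core-site.xml"),
                    "<configuration/>".getBytes(StandardCharsets.UTF_8));
            return loadFirst(Arrays.asList(empty, real), "core-site.xml").orElse("");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Skipping the existence check also removes the window in which a file could disappear between the check and the read.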

"We now check the results and just fail the test if something goes wrong."
Sounds good to me.

+1, I'll commit sometime tomorrow unless someone else spots an issue.

> libhdfs++: load configuration from files
> 
>
> Key: HDFS-9538
> URL: https://issues.apache.org/jira/browse/HDFS-9538
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9538.HDFS-8707.003.patch, 
> HDFS-9538.HDFS-9537.000.patch, HDFS-9538.HDFS-9537.001.patch, 
> HDFS-9538.HDFS-9537.002.patch
>
>
> One goal of the Configuration classes is to allow the consumers of the 
> libhdfs++ library to deploy client applications into hadoop edge nodes and 
> have them pick up the Hadoop configuration that has been deployed there.
> Note that we also need to support the use case where the consumer application 
> will manage Hadoop configuration files itself, or will handle all 
> configuration out-of-band.
> libhdfs++ should be able to read files that are found in the field and easily 
> construct an instance that will communicate with the cluster.





[jira] [Updated] (HDFS-9552) Document types of permission checks performed for HDFS operations.

2015-12-14 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-9552:

Attachment: hadoop-site.tar.bz2

> Document types of permission checks performed for HDFS operations.
> --
>
> Key: HDFS-9552
> URL: https://issues.apache.org/jira/browse/HDFS-9552
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-9552.001.patch, HDFS-9552.002.patch, 
> hadoop-site.tar.bz2
>
>
> The HDFS permissions guide discusses our use of a POSIX-like model with read, 
> write and execute permissions associated with users, groups and the catch-all 
> other class.  However, there is no documentation that describes exactly what 
> permission checks are performed by user-facing HDFS operations.  This is a 
> frequent source of questions, so it would be good to document this.





[jira] [Created] (HDFS-9555) LazyPersistFileScrubber should still sleep if there are errors in the clear progress

2015-12-14 Thread Phil Yang (JIRA)

Phil Yang created HDFS-9555:
---

 Summary: LazyPersistFileScrubber should still sleep if there are 
errors in the clear progress
 Key: HDFS-9555
 URL: https://issues.apache.org/jira/browse/HDFS-9555
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Phil Yang
Assignee: Phil Yang


If LazyPersistFileScrubber.clearCorruptLazyPersistFiles throws an exception in 
run(), the sleep logic is skipped, so the scrubber restarts immediately. The 
retry may fail again, which floods the NameNode log with ERROR entries saying 
"Ignoring exception in LazyPersistFileScrubber".

We need to sleep if we catch the exception.
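
A minimal sketch of the proposed behavior is shown here, assuming a generic periodic loop rather than the actual LazyPersistFileScrubber source, which is not quoted in this message. ScrubberLoop, Task, and runWithBackoff are hypothetical names. The point is only that the sleep sits outside the try block, so it runs after a failed round as well as a successful one.

```java
public class ScrubberLoop {
    /** Hypothetical unit of periodic work that may fail. */
    public interface Task {
        void run() throws Exception;
    }

    /**
     * Runs the task up to maxRounds times and sleeps after every round,
     * whether the round succeeded or threw. Returns the number of failures.
     */
    static int runWithBackoff(Task task, int maxRounds, long sleepMillis) {
        int failures = 0;
        for (int round = 0; round < maxRounds; round++) {
            try {
                task.run();
            } catch (Exception e) {
                failures++;
                System.err.println("Ignoring exception in round " + round + ": " + e);
            }
            // The sleep is outside the try block, so a failing task cannot
            // turn the loop into a tight retry that floods the log.
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        int failures = runWithBackoff(() -> {
            throw new IllegalStateException("simulated scrubber failure");
        }, 3, 100L);
        System.out.println("failed rounds: " + failures);
    }
}
```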




