[jira] [Updated] (HDFS-9129) Move the safemode block count into BlockManager

2015-10-16 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-9129:

Attachment: HDFS-9129.006.patch

As per offline discussion, the v6 patch prefers a synchronized {{getSafeModeTip}} 
to volatile fields. Some public test-only methods in {{BlockManagerSafeMode}} 
are removed as well.
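
For illustration, a minimal sketch of that trade-off (the field names are 
hypothetical, not taken from the patch): a single synchronized getter replaces 
per-field volatile reads, so the monitor is only taken on the rarely-called tip 
path.

{code}
// Hypothetical sketch: counters guarded by the BlockManagerSafeMode monitor
// instead of being declared volatile; the tip getter synchronizes so it
// reads a consistent snapshot of both fields.
private long blockSafe;       // guarded by "this"
private long blockThreshold;  // guarded by "this"

synchronized String getSafeModeTip() {
  return "Safe mode ON. Reported blocks " + blockSafe
      + " needs to reach the threshold " + blockThreshold + ".";
}
{code}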

> Move the safemode block count into BlockManager
> ---
>
> Key: HDFS-9129
> URL: https://issues.apache.org/jira/browse/HDFS-9129
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Attachments: HDFS-9129.000.patch, HDFS-9129.001.patch, 
> HDFS-9129.002.patch, HDFS-9129.003.patch, HDFS-9129.004.patch, 
> HDFS-9129.005.patch, HDFS-9129.006.patch
>
>
> The {{SafeMode}} needs to track whether there are enough blocks so that the 
> NN can get out of safemode. These fields can be moved to the 
> {{BlockManager}} class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9250) LocatedBlock#addCachedLoc may throw ArrayStoreException when cache is empty

2015-10-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961728#comment-14961728
 ] 

Hadoop QA commented on HDFS-9250:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  26m  5s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warning. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   9m 25s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 52s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 26s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   3m  9s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 36s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 38s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 18s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  50m 24s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 30s | Tests passed in 
hadoop-hdfs-client. |
| | | 111m  0s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.blockmanagement.TestBlockManager |
|   | hadoop.hdfs.TestRenameWhileOpen |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12767182/HDFS-9250.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 58590fe |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13035/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13035/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13035/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13035/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13035/console |


This message was automatically generated.

> LocatedBlock#addCachedLoc may throw ArrayStoreException when cache is empty
> ---
>
> Key: HDFS-9250
> URL: https://issues.apache.org/jira/browse/HDFS-9250
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-9250.001.patch, HDFS-9250.002.patch
>
>
> We may see the following exception:
> {noformat}
> java.lang.ArrayStoreException
> at java.util.ArrayList.toArray(ArrayList.java:389)
> at 
> org.apache.hadoop.hdfs.protocol.LocatedBlock.addCachedLoc(LocatedBlock.java:205)
> at 
> org.apache.hadoop.hdfs.server.namenode.CacheManager.setCachedLocations(CacheManager.java:907)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1974)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873)
> {noformat}
> The cause is that in LocatedBlock.java, in {{addCachedLoc}}:
> - The passed-in parameter {{loc}}, of type {{DatanodeDescriptor}}, is added 
> to {{cachedList}}
> - {{cachedList}} was initialized from {{EMPTY_LOCS}}, an array of type 
> {{DatanodeInfoWithStorage}}.
> Both {{DatanodeDescriptor}} and {{DatanodeInfoWithStorage}} are subclasses of 
> {{DatanodeInfo}} but do not inherit from each other, resulting in the 
> ArrayStoreException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8671) Add client support for HTTP/2 stream channels

2015-10-16 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HDFS-8671:

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Pushed to HDFS-7966 branch. Thanks [~wheat9] for reviewing.

> Add client support for HTTP/2 stream channels
> -
>
> Key: HDFS-8671
> URL: https://issues.apache.org/jira/browse/HDFS-8671
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: HDFS-7966
>
> Attachments: HDFS-8671-v0.patch, HDFS-8671-v1.patch
>
>
> {{Http2StreamChannel}} was introduced in HDFS-8515 but can only be used on 
> the server side.
> Currently we implement Http2BlockReader using the jetty http2-client in the 
> POC branch, but the final version of jetty 9.3.0 requires Java 8.
> So here we plan to extend {{Http2StreamChannel}} to support 
> client-side usage and then implement Http2BlockReader on top of it. We will 
> still use the jetty http2-client to write test cases to ensure that our HTTP/2 
> implementation is valid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8836) Skip newline on empty files with getMerge -nl

2015-10-16 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961710#comment-14961710
 ] 

Akira AJISAKA commented on HDFS-8836:
-

Some comments from me.

{code}
+skipEmptyFileDelimiter = cf.getOpt("skip-empty-file") ? true : false;
{code}
1. {{? true : false}} is redundant and can be removed.
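A minimal simplification of the quoted line:
{code}
skipEmptyFileDelimiter = cf.getOpt("skip-empty-file");
{code}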

{code}
  if (skipEmptyFileDelimiter && src.stat.getLen() == 0) {
continue;
  }
  FSDataInputStream in = src.fs.open(src.path);
  try {
IOUtils.copyBytes(in, out, getConf(), false);
if (delimiter != null) {
  out.write(delimiter.getBytes("UTF-8"));
}
  } finally {
in.close();
  }
{code}
2. Can we skip opening the empty file when its length is zero, as follows?
{code}
if (src.stat.getLen() != 0) {
  try (FSDataInputStream in = src.fs.open(src.path)) {
IOUtils.copyBytes(in, out, getConf(), false);
writeDelimiter(out);
  }
} else if (!skipEmptyFileDelimiter) {
  writeDelimiter(out);
}

private void writeDelimiter(FSDataOutputStream out) {
  ...
}  
{code}

{code:title=TestFsShellCopy#testCopyMerge}
// directory with 3 files, should skip subdir
{code}
3. An empty file is added, so there are now 4 files; the comment should be updated.

> Skip newline on empty files with getMerge -nl
> -
>
> Key: HDFS-8836
> URL: https://issues.apache.org/jira/browse/HDFS-8836
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.6.0, 2.7.1
>Reporter: Jan Filipiak
>Assignee: Kanaka Kumar Avvaru
>Priority: Trivial
> Attachments: HDFS-8836-01.patch, HDFS-8836-02.patch, 
> HDFS-8836-03.patch, HDFS-8836-04.patch, HDFS-8836-05.patch
>
>
> Hello everyone,
> I recently needed to use the newline option -nl with getMerge 
> because the files I needed to merge simply didn't have one. I was merging all 
> the files from one directory, and unfortunately this directory also included 
> empty files, which effectively led to multiple newlines appended after some 
> files. I needed to remove them manually afterwards.
> In this situation it would be good to have another argument that allows 
> skipping empty files.
> One thing one could try to implement this feature:
> The call to IOUtils.copyBytes(in, out, getConf(), false); doesn't
> return the number of bytes copied, which would be convenient as one could
> skip appending the newline when 0 bytes were copied, or one could check the 
> file size beforehand.
> I posted this idea on the mailing list 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201507.mbox/%3C55B25140.3060005%40trivago.com%3E
>  but I didn't really get many responses, so I thought I might try this way.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8836) Skip newline on empty files with getMerge -nl

2015-10-16 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961682#comment-14961682
 ] 

Akira AJISAKA commented on HDFS-8836:
-

Sorry for the late response.

bq. One could set up many oozie coordinators that would wait for A/_SUCCESS and 
then start processing it. There would be no safe time to delete the file as one 
is always in danger of having one of the coordinators not executed as they 
didn't find its "dataset" file.
Sounds reasonable to me. I'll review your patch.

> Skip newline on empty files with getMerge -nl
> -
>
> Key: HDFS-8836
> URL: https://issues.apache.org/jira/browse/HDFS-8836
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.6.0, 2.7.1
>Reporter: Jan Filipiak
>Assignee: Kanaka Kumar Avvaru
>Priority: Trivial
> Attachments: HDFS-8836-01.patch, HDFS-8836-02.patch, 
> HDFS-8836-03.patch, HDFS-8836-04.patch, HDFS-8836-05.patch
>
>
> Hello everyone,
> I recently needed to use the newline option -nl with getMerge 
> because the files I needed to merge simply didn't have one. I was merging all 
> the files from one directory, and unfortunately this directory also included 
> empty files, which effectively led to multiple newlines appended after some 
> files. I needed to remove them manually afterwards.
> In this situation it would be good to have another argument that allows 
> skipping empty files.
> One thing one could try to implement this feature:
> The call to IOUtils.copyBytes(in, out, getConf(), false); doesn't
> return the number of bytes copied, which would be convenient as one could
> skip appending the newline when 0 bytes were copied, or one could check the 
> file size beforehand.
> I posted this idea on the mailing list 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201507.mbox/%3C55B25140.3060005%40trivago.com%3E
>  but I didn't really get many responses, so I thought I might try this way.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8880) NameNode metrics logging

2015-10-16 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961680#comment-14961680
 ] 

Arpit Agarwal commented on HDFS-8880:
-

I'll address #3 by eliminating the extra thread. I am not opposed to a more 
general solution; until we have one, this is still useful. I added this to 
scratch a personal itch, as I often missed having textual records of NN metrics 
stored with the service logs for easy grepping by metric name or timestamp. 
There was no intention to add this to every service.

The Coda Hale 
[Slf4jReporter|https://dropwizard.github.io/metrics/3.1.0/manual/core/#man-core-reporters-slf4j]
 looks particularly interesting, but IIUC the reporters also use a polling 
thread, and there'd be at least some code added to each service to instantiate 
the reporters. We can file a JIRA for a more general solution, as there was some 
community interest from YARN, and perhaps downstream.
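
For reference, a minimal sketch of wiring one up (against the dropwizard-metrics 
3.1.0 API; the registry and logger name below are illustrative):
{code}
// MetricRegistry/Slf4jReporter are from com.codahale.metrics,
// LoggerFactory from org.slf4j.
MetricRegistry registry = new MetricRegistry();
Slf4jReporter reporter = Slf4jReporter.forRegistry(registry)
    .outputTo(LoggerFactory.getLogger("NameNodeMetricsLog"))
    .convertRatesTo(TimeUnit.SECONDS)
    .convertDurationsTo(TimeUnit.MILLISECONDS)
    .build();
reporter.start(1, TimeUnit.MINUTES);  // this is the polling thread noted above
{code}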

> NameNode metrics logging
> 
>
> Key: HDFS-8880
> URL: https://issues.apache.org/jira/browse/HDFS-8880
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: 2.8.0
>
> Attachments: HDFS-8880.01.patch, HDFS-8880.02.patch, 
> HDFS-8880.03.patch, HDFS-8880.04.patch, namenode-metrics.log
>
>
> The NameNode can periodically log metrics to help debugging when the cluster 
> is not setup with another metrics monitoring scheme.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9250) LocatedBlock#addCachedLoc may throw ArrayStoreException when cache is empty

2015-10-16 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-9250:

Attachment: (was: HDFS-9250.002.patch)

> LocatedBlock#addCachedLoc may throw ArrayStoreException when cache is empty
> ---
>
> Key: HDFS-9250
> URL: https://issues.apache.org/jira/browse/HDFS-9250
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-9250.001.patch, HDFS-9250.002.patch
>
>
> We may see the following exception:
> {noformat}
> java.lang.ArrayStoreException
> at java.util.ArrayList.toArray(ArrayList.java:389)
> at 
> org.apache.hadoop.hdfs.protocol.LocatedBlock.addCachedLoc(LocatedBlock.java:205)
> at 
> org.apache.hadoop.hdfs.server.namenode.CacheManager.setCachedLocations(CacheManager.java:907)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1974)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873)
> {noformat}
> The cause is that in LocatedBlock.java, in {{addCachedLoc}}:
> - The passed-in parameter {{loc}}, of type {{DatanodeDescriptor}}, is added 
> to {{cachedList}}
> - {{cachedList}} was initialized from {{EMPTY_LOCS}}, an array of type 
> {{DatanodeInfoWithStorage}}.
> Both {{DatanodeDescriptor}} and {{DatanodeInfoWithStorage}} are subclasses of 
> {{DatanodeInfo}} but do not inherit from each other, resulting in the 
> ArrayStoreException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9250) LocatedBlock#addCachedLoc may throw ArrayStoreException when cache is empty

2015-10-16 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-9250:

Status: Patch Available  (was: Open)

> LocatedBlock#addCachedLoc may throw ArrayStoreException when cache is empty
> ---
>
> Key: HDFS-9250
> URL: https://issues.apache.org/jira/browse/HDFS-9250
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-9250.001.patch, HDFS-9250.002.patch
>
>
> We may see the following exception:
> {noformat}
> java.lang.ArrayStoreException
> at java.util.ArrayList.toArray(ArrayList.java:389)
> at 
> org.apache.hadoop.hdfs.protocol.LocatedBlock.addCachedLoc(LocatedBlock.java:205)
> at 
> org.apache.hadoop.hdfs.server.namenode.CacheManager.setCachedLocations(CacheManager.java:907)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1974)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873)
> {noformat}
> The cause is that in LocatedBlock.java, in {{addCachedLoc}}:
> - The passed-in parameter {{loc}}, of type {{DatanodeDescriptor}}, is added 
> to {{cachedList}}
> - {{cachedList}} was initialized from {{EMPTY_LOCS}}, an array of type 
> {{DatanodeInfoWithStorage}}.
> Both {{DatanodeDescriptor}} and {{DatanodeInfoWithStorage}} are subclasses of 
> {{DatanodeInfo}} but do not inherit from each other, resulting in the 
> ArrayStoreException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9250) LocatedBlock#addCachedLoc may throw ArrayStoreException when cache is empty

2015-10-16 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961646#comment-14961646
 ] 

Xiao Chen commented on HDFS-9250:
-

Hey [~andrew.wang],

Thanks again for bringing up HDFS-8646, which looks complete to me. The version 
in which I encountered the {{ArrayStoreException}} predates your fix, so I think 
it's possible that the location was added without a disk replica.

Patch 002 is attached. Your suggestion of adding a precondition check sounds 
great, since otherwise we know it would throw the {{ArrayStoreException}} for 
sure in that condition. I left the test case unchanged so that it runs into the 
precondition block. Please review. Thanks!
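
For anyone following along, a standalone illustration of the same mismatch with 
plain JDK types: {{ArrayList#toArray(T[])}} copies into an array whose runtime 
component type comes from the argument, so storing an element of an incompatible 
type fails at runtime.
{code}
import java.util.ArrayList;
import java.util.List;

List<Object> list = new ArrayList<>();
list.add(new Object());                      // element is not a String
String[] out = list.toArray(new String[0]);  // throws ArrayStoreException
{code}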

> LocatedBlock#addCachedLoc may throw ArrayStoreException when cache is empty
> ---
>
> Key: HDFS-9250
> URL: https://issues.apache.org/jira/browse/HDFS-9250
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-9250.001.patch, HDFS-9250.002.patch
>
>
> We may see the following exception:
> {noformat}
> java.lang.ArrayStoreException
> at java.util.ArrayList.toArray(ArrayList.java:389)
> at 
> org.apache.hadoop.hdfs.protocol.LocatedBlock.addCachedLoc(LocatedBlock.java:205)
> at 
> org.apache.hadoop.hdfs.server.namenode.CacheManager.setCachedLocations(CacheManager.java:907)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1974)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873)
> {noformat}
> The cause is that in LocatedBlock.java, in {{addCachedLoc}}:
> - The passed-in parameter {{loc}}, of type {{DatanodeDescriptor}}, is added 
> to {{cachedList}}
> - {{cachedList}} was initialized from {{EMPTY_LOCS}}, an array of type 
> {{DatanodeInfoWithStorage}}.
> Both {{DatanodeDescriptor}} and {{DatanodeInfoWithStorage}} are subclasses of 
> {{DatanodeInfo}} but do not inherit from each other, resulting in the 
> ArrayStoreException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9250) LocatedBlock#addCachedLoc may throw ArrayStoreException when cache is empty

2015-10-16 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-9250:

Attachment: HDFS-9250.002.patch

> LocatedBlock#addCachedLoc may throw ArrayStoreException when cache is empty
> ---
>
> Key: HDFS-9250
> URL: https://issues.apache.org/jira/browse/HDFS-9250
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-9250.001.patch, HDFS-9250.002.patch
>
>
> We may see the following exception:
> {noformat}
> java.lang.ArrayStoreException
> at java.util.ArrayList.toArray(ArrayList.java:389)
> at 
> org.apache.hadoop.hdfs.protocol.LocatedBlock.addCachedLoc(LocatedBlock.java:205)
> at 
> org.apache.hadoop.hdfs.server.namenode.CacheManager.setCachedLocations(CacheManager.java:907)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1974)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873)
> {noformat}
> The cause is that in LocatedBlock.java, in {{addCachedLoc}}:
> - The passed-in parameter {{loc}}, of type {{DatanodeDescriptor}}, is added 
> to {{cachedList}}
> - {{cachedList}} was initialized from {{EMPTY_LOCS}}, an array of type 
> {{DatanodeInfoWithStorage}}.
> Both {{DatanodeDescriptor}} and {{DatanodeInfoWithStorage}} are subclasses of 
> {{DatanodeInfo}} but do not inherit from each other, resulting in the 
> ArrayStoreException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9250) LocatedBlock#addCachedLoc may throw ArrayStoreException when cache is empty

2015-10-16 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-9250:

Attachment: HDFS-9250.002.patch

> LocatedBlock#addCachedLoc may throw ArrayStoreException when cache is empty
> ---
>
> Key: HDFS-9250
> URL: https://issues.apache.org/jira/browse/HDFS-9250
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-9250.001.patch, HDFS-9250.002.patch
>
>
> We may see the following exception:
> {noformat}
> java.lang.ArrayStoreException
> at java.util.ArrayList.toArray(ArrayList.java:389)
> at 
> org.apache.hadoop.hdfs.protocol.LocatedBlock.addCachedLoc(LocatedBlock.java:205)
> at 
> org.apache.hadoop.hdfs.server.namenode.CacheManager.setCachedLocations(CacheManager.java:907)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1974)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873)
> {noformat}
> The cause is that in LocatedBlock.java, in {{addCachedLoc}}:
> - The passed-in parameter {{loc}}, of type {{DatanodeDescriptor}}, is added 
> to {{cachedList}}
> - {{cachedList}} was initialized from {{EMPTY_LOCS}}, an array of type 
> {{DatanodeInfoWithStorage}}.
> Both {{DatanodeDescriptor}} and {{DatanodeInfoWithStorage}} are subclasses of 
> {{DatanodeInfo}} but do not inherit from each other, resulting in the 
> ArrayStoreException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8671) Add client support for HTTP/2 stream channels

2015-10-16 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961635#comment-14961635
 ] 

Haohui Mai commented on HDFS-8671:
--

LGTM. +1

> Add client support for HTTP/2 stream channels
> -
>
> Key: HDFS-8671
> URL: https://issues.apache.org/jira/browse/HDFS-8671
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: HDFS-7966
>
> Attachments: HDFS-8671-v0.patch, HDFS-8671-v1.patch
>
>
> {{Http2StreamChannel}} was introduced in HDFS-8515 but can only be used on 
> the server side.
> Currently we implement Http2BlockReader using the jetty http2-client in the 
> POC branch, but the final version of jetty 9.3.0 requires Java 8.
> So here we plan to extend {{Http2StreamChannel}} to support 
> client-side usage and then implement Http2BlockReader on top of it. We will 
> still use the jetty http2-client to write test cases to ensure that our HTTP/2 
> implementation is valid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9251) Refactor TestWriteToReplica and TestFsDatasetImpl to avoid explicitly creating Files in tests code.

2015-10-16 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961607#comment-14961607
 ] 

Colin Patrick McCabe commented on HDFS-9251:


Thanks, [~eddyxu].

{code}
Preconditions.checkArgument(volume instanceof FsVolumeImpl);
{code}
We should not have these lines.  This is {{FsDatasetImplTestUtils.java}}, 
so we know that the volume must be an instance of {{FsVolumeImpl}}.  The only 
way it could not be is if there was a bug, which we don't want to hide.
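
A minimal sketch of the alternative, where a bug fails loudly as a 
{{ClassCastException}} rather than being pre-validated:
{code}
// If volume is ever not an FsVolumeImpl, this throws ClassCastException,
// surfacing the bug directly instead of wrapping it in a precondition.
FsVolumeImpl vol = (FsVolumeImpl) volume;
{code}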

Looks good aside from that.

> Refactor TestWriteToReplica and TestFsDatasetImpl to avoid explicitly 
> creating Files in tests code.
> ---
>
> Key: HDFS-9251
> URL: https://issues.apache.org/jira/browse/HDFS-9251
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-9251.00.patch, HDFS-9251.01.patch
>
>
> In {{TestWriteToReplica}} and {{TestFsDatasetImpl}}, the tests directly create 
> block and metadata files:
> {code}
> replicaInfo.getBlockFile().createNewFile();
> replicaInfo.getMetaFile().createNewFile();
> {code}
> This leaks the implementation details of {{FsDatasetImpl}}. This JIRA proposes 
> to use {{FsDatasetImplTestUtils}} (HDFS-9188) to create replicas. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7087) Ability to list /.reserved

2015-10-16 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-7087:

Status: Patch Available  (was: Open)

> Ability to list /.reserved
> --
>
> Key: HDFS-7087
> URL: https://issues.apache.org/jira/browse/HDFS-7087
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.6.0
>Reporter: Andrew Wang
>Assignee: Xiao Chen
> Attachments: HDFS-7087.001.patch, HDFS-7087.002.patch, 
> HDFS-7087.draft.patch
>
>
> We have two special paths within /.reserved now, /.reserved/.inodes and 
> /.reserved/raw. It seems like we should be able to list /.reserved to see 
> them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9253) Refactor tests of libhdfs into a directory

2015-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961598#comment-14961598
 ] 

Hudson commented on HDFS-9253:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #507 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/507/])
HDFS-9253. Refactor tests of libhdfs into a directory. Contributed by (wheat9: 
rev 79b8d60d085ae196b05ff4ab511ff89f652e3c55)
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/include/hdfs/hdfs.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/expect.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/hdfs_http_client.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/native_mini_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_read.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/test/test_fuse_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/hdfs_test.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_htable.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test_libhdfs_threaded.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/expect.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_read.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_write.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_write.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs_test.h
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_read.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_ops.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_context_handle.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_ops.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_threaded.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/expect.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/exception.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/expect.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_stat_struct.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_file_handle.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_htable.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_connect.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_ops.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/hdfs_json_parser.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/pom.xml
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/test/fuse_workload.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_zerocopy.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_trash.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_trash.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/vecsum.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_native_mini_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/native_mini_dfs.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_zerocopy.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_write.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test_native_mini_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/native_mini_dfs.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/hdfs_web.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_threaded.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/vecsum.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-

[jira] [Commented] (HDFS-7964) Add support for async edit logging

2015-10-16 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961593#comment-14961593
 ] 

Jing Zhao commented on HDFS-7964:
-

Thanks for rebasing the patch, Daryn. The patch looks good to me. Some minor 
comments:
# The following code uses whether the current thread holds the monitor to 
decide whether the edit should be async or sync. This approach may not be 
straightforward to follow, and it also makes it hard to guarantee the 
correctness of future code. Can we simply make the decision based on the op 
itself? (See the sketch after these comments.)
{code}
// only rpc calls not explicitly sync'ed on the log will be async.
if (rpcCall != null && !Thread.holdsLock(this)) {
  edit = new AsyncEdit(this, op, rpcCall);
} else {
  edit = new SyncEdit(this, op);
}
{code}
# If requests keep coming but the traffic is slow, the sync will happen only 
when the buffer is full, which means the response may be delayed. This may be a 
rare case in practice, but maybe we should avoid it here. Can we make each 
iteration of the loop either fill the buffer or drain the pending queue?
{code}
if (edit != null) {
  // sync if requested by edit log.
  doSync = edit.logEdit();
  syncWaitQ.add(edit);
} else {
  // sync when editq runs dry, but have edits pending a sync.
  doSync = !syncWaitQ.isEmpty();
}
{code}
# The class {{InvalidOp}} is unused. We can either remove it or use it for 
{{OP_INVALID}}.
# Maybe we can do some further cleanup for {{RollingUpgradeOp}}. E.g., after 
adding classes like {{RollingUpgradeStartOp}} and {{RollingUpgradeFinalizeOp}}, 
we can put the {{getInstance}} methods there and remove {{getStartInstance}} and 
{{getFinalizeInstance}}.
# Is the main reason for having {{OpInstanceCache#get}} to minimize the code 
change?
# It would be helpful to add a comment explaining the calculation logic of 
{{editsBatchedInSync}}.
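
Regarding #1, a rough sketch of deciding on the op itself ({{isSyncRequired}} is 
a hypothetical flag, not something in the patch):
{code}
// Hypothetical: ops that must be durable before the response is sent are
// marked explicitly, instead of inferring intent from Thread.holdsLock(this).
if (rpcCall != null && !op.isSyncRequired()) {
  edit = new AsyncEdit(this, op, rpcCall);
} else {
  edit = new SyncEdit(this, op);
}
{code}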

> Add support for async edit logging
> --
>
> Key: HDFS-7964
> URL: https://issues.apache.org/jira/browse/HDFS-7964
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.0.2-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-7964.patch, HDFS-7964.patch
>
>
> Edit logging is a major source of contention within the NN.  logEdit is 
> called within the namespace write lock, while logSync is called outside of the 
> lock to allow greater concurrency.  The handler thread remains busy until 
> logSync returns to provide the client with a durability guarantee for the 
> response.
> Write heavy RPC load and/or slow IO causes handlers to stall in logSync.  
> Although the write lock is not held, readers are limited/starved and the call 
> queue fills.  Combining an edit log thread with postponed RPC responses from 
> HADOOP-10300 will provide the same durability guarantee but immediately free 
> up the handlers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-16 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961589#comment-14961589
 ] 

Mingliang Liu commented on HDFS-9184:
-

Thanks for your comment, [~daijy]. To address this, I think we have several 
options.
# One is to set the max length of the caller context to 128 bytes. The 
{{CallerContext.Builder}} would throw an exception if the end user tries to set 
a longer context (>128 bytes). This works just fine as long as we don't need 
the _configurability_. (A sketch follows below.)
# Another approach is to validate the length when we create an RPC 
{{Client$Connection}}. We can either truncate the caller context and log a 
warning, or we can throw an exception. We may have to change 
{{ProtoUtils#makeRpcRequestHeader}} for this validation, as we need to read the 
config keys.
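
A minimal sketch of option 1 (the constant and field names are illustrative, 
not from the patch):
{code}
// Hypothetical builder check: reject caller contexts longer than 128 bytes.
private static final int MAX_CONTEXT_LENGTH = 128;

public Builder setContext(String context) {
  int len = context.getBytes(StandardCharsets.UTF_8).length;
  Preconditions.checkArgument(len <= MAX_CONTEXT_LENGTH,
      "Caller context exceeds " + MAX_CONTEXT_LENGTH + " bytes");
  this.context = context;
  return this;
}
{code}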


> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9184.000.patch, HDFS-9184.001.patch, 
> HDFS-9184.002.patch, HDFS-9184.003.patch, HDFS-9184.004.patch, 
> HDFS-9184.005.patch, HDFS-9184.006.patch, HDFS-9184.007.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper level job issues it. The upper level callers may be specific 
> Oozie tasks, MR jobs, and hive queries. One scenario is that the namenode 
> (NN) is abused/spammed; the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which 
> is obviously not enough. It's common that the same user issues multiple jobs 
> at the same time. Even for a single top-level task, tracking back to a 
> specific caller in a chain of operations of the whole workflow (e.g. Oozie -> 
> Hive -> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. The span is created in many places, interconnected 
> like a tree structure, which relies on offline analysis across the RPC 
> boundary. For this use case, {{htrace}} has to be enabled at a 100% sampling 
> rate, which introduces significant overhead. Moreover, passing additional 
> information (via annotations) other than the span id from the root of the 
> tree to a leaf is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> is some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, kerberos 
> authenticated connections or insecure connections don't have tokens. 
> [HADOOP-8779] proposes to use tokens in all the scenarios, but that might 
> mean changes to several upstream projects and is a major change in their 
> security implementation.
> We propose another approach to address this problem. We also treat the HDFS 
> audit log as a good place for after-the-fact root cause analysis. We propose 
> to put the caller id (e.g. Hive query id) in threadlocals. Specifically, on 
> the client side the threadlocal object is passed to the NN as a part of the 
> RPC header (optional), while on the server side the NN retrieves it from the 
> header and puts it into the {{Handler}}'s threadlocals. Finally, in 
> {{FSNamesystem}}, the HDFS audit logger will record the caller context for 
> each operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client may choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis. 
> The NN is not responsible for validating the signature online.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-16 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961574#comment-14961574
 ] 

Daniel Dai commented on HDFS-9184:
--

If we want to impose a limit on the length, it is better to impose it on the 
client side explicitly rather than silently truncating on the datanode. This id 
will be used in other components for cross-referencing. If the hdfs audit log 
shows a truncated id, it would be hard to cross-reference it against the logs 
of other components.

> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9184.000.patch, HDFS-9184.001.patch, 
> HDFS-9184.002.patch, HDFS-9184.003.patch, HDFS-9184.004.patch, 
> HDFS-9184.005.patch, HDFS-9184.006.patch, HDFS-9184.007.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper level job issues it. The upper level callers may be specific 
> Oozie tasks, MR jobs, and hive queries. One scenario is that the namenode 
> (NN) is abused/spammed; the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which 
> is obviously not enough. It's common that the same user issues multiple jobs 
> at the same time. Even for a single top-level task, tracking back to a 
> specific caller in a chain of operations of the whole workflow (e.g. Oozie -> 
> Hive -> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. The span is created in many places, interconnected 
> like a tree structure, which relies on offline analysis across the RPC 
> boundary. For this use case, {{htrace}} has to be enabled at a 100% sampling 
> rate, which introduces significant overhead. Moreover, passing additional 
> information (via annotations) other than the span id from the root of the 
> tree to a leaf is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> is some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, kerberos 
> authenticated connections or insecure connections don't have tokens. 
> [HADOOP-8779] proposes to use tokens in all the scenarios, but that might 
> mean changes to several upstream projects and is a major change in their 
> security implementation.
> We propose another approach to address this problem. We also treat the HDFS 
> audit log as a good place for after-the-fact root cause analysis. We propose 
> to put the caller id (e.g. Hive query id) in threadlocals. Specifically, on 
> the client side the threadlocal object is passed to the NN as a part of the 
> RPC header (optional), while on the server side the NN retrieves it from the 
> header and puts it into the {{Handler}}'s threadlocals. Finally, in 
> {{FSNamesystem}}, the HDFS audit logger will record the caller context for 
> each operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client may choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis. 
> The NN is not responsible for validating the signature online.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-16 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-7435.003.patch

Added a null check when creating the iterator of storages.

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch
>
>
> This patch changes the data structures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC-friendly handling of full 
> block reports.
> Would like to hear people's feedback on this change, and also get some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.
> There seem to be some timing issues I hit when testing the patch; not sure 
> if it is a bug in the patch or something else (most likely the former)...
> Tests that fail for me:
> The issue seems to be that the blocks are not on any storage, so no 
> replication can occur, causing the tests to fail in different ways.
> TestDecommission.testDecommission
> If I add a little sleep after the cleanup/delete, things seem to work
> TestDFSStripedOutputStreamWithFailure
> A couple of tests fail in this class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9230) Report space overhead of unfinalized upgrade/rollingUpgrade

2015-10-16 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961561#comment-14961561
 ] 

Andrew Wang commented on HDFS-9230:
---

For hardlink upgrades, you could check the link count to see if a file in 
{{previous}} is still referenced in {{current}}. This is similar in cost to du.
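
A minimal sketch of such a check via NIO (requires a POSIX file system; the 
path below is illustrative):
{code}
// "unix:nlink" is the file's hard-link count; a count > 1 means another
// directory entry (e.g. under current/) still references the same inode.
Path blk = Paths.get("/data/dn/previous/finalized/blk_1073741825");
int nlink = (Integer) Files.getAttribute(blk, "unix:nlink");
boolean stillReferenced = nlink > 1;
{code}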

> Report space overhead of unfinalized upgrade/rollingUpgrade
> ---
>
> Key: HDFS-9230
> URL: https://issues.apache.org/jira/browse/HDFS-9230
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: HDFS
>Reporter: Xiaoyu Yao
>
> DataNodes do not delete block files during upgrades, to allow rollback. This 
> is often confusing to administrators, since they sometimes delete files before 
> finalizing the upgrade but don't see the DFS used space reduce.
> Ideally, HDFS should report the un-finalized upgrade overhead along with its 
> message on the NN UI, "Upgrade in progress. Not yet finalized." Or, this can 
> be improved with a better NN UI message and documentation that space won't be 
> reclaimed for deletions until the upgrade is finalized.
> For a non-rolling upgrade, it is not easy to track this due to hard links. 
> Say the NN initialized the upgrade at T1; the block files on DNs that existed 
> before T1 are still under the 'current' directory but are just hard links 
> into the 'previous' directory. When those files are deleted after T1, the 
> block file space on the DN won't be reclaimed until the upgrade is finalized. 
> So we need to bookkeep files created before T1 but deleted after T1 as 
> the un-finalized upgrade overhead here.
> For a rolling upgrade, it is relatively easy to track the space overhead as 
> we are not using hard links.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9241) HDFS clients can't construct HdfsConfiguration instances

2015-10-16 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961557#comment-14961557
 ] 

Mingliang Liu commented on HDFS-9241:
-

{quote}
Old applications can still depend on hadoop-hdfs and nothing will break. 
However, the application might need to change a couple lines of code if it only 
wants to depend on hadoop-hdfs-client. 
{quote}
It makes sense to me. Do you think we need to make {{HdfsConfigurationLoader}} 
public so that code depending on {{hadoop-hdfs-client}} is able to load the 
default resources explicitly if it needs to?
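
A sketch of the intended usage if it were public ({{HdfsConfigurationLoader.init()}} 
is the method the issue description mentions; calling it from client code is 
hypothetical today):
{code}
// Assuming the class were made public: force hdfs-default.xml and
// hdfs-site.xml onto the default resource list, then build a Configuration.
HdfsConfigurationLoader.init();
Configuration conf = new Configuration();
{code}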

> HDFS clients can't construct HdfsConfiguration instances
> 
>
> Key: HDFS-9241
> URL: https://issues.apache.org/jira/browse/HDFS-9241
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Steve Loughran
>Assignee: Mingliang Liu
> Attachments: HDFS-9241.000.patch
>
>
> The changes for the hdfs client classpath make instantiating 
> {{HdfsConfiguration}} from the client impossible; it now lives only on the 
> server side. This breaks any app which creates one.
> I know people will look at the {{@Private}} tag and say "don't do that then", 
> but it's worth considering precisely why I, at least, do this: it's the only 
> way to guarantee that the hdfs-default and hdfs-site resources get on the 
> classpath, including all the security settings. It's precisely the use case 
> which {{HdfsConfigurationLoader.init();}} offers internally to the hdfs code.
> What am I meant to do now? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9236) Missing sanity check for block size during block recovery

2015-10-16 Thread Tony Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961543#comment-14961543
 ] 

Tony Wu commented on HDFS-9236:
---

The checkstyle and pre-patch errors are not related to this patch.

> Missing sanity check for block size during block recovery
> -
>
> Key: HDFS-9236
> URL: https://issues.apache.org/jira/browse/HDFS-9236
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
> Attachments: HDFS-9236.001.patch, HDFS-9236.002.patch, 
> HDFS-9236.003.patch
>
>
> Ran into an issue while running a test against faulty data-node code. 
> Currently in DataNode.java:
> {code:java}
>   /** Block synchronization */
>   void syncBlock(RecoveringBlock rBlock,
> >  List<BlockRecord> syncList) throws IOException {
> …
> // Calculate the best available replica state.
> ReplicaState bestState = ReplicaState.RWR;
> …
> // Calculate list of nodes that will participate in the recovery
> // and the new block size
> > List<BlockRecord> participatingList = new ArrayList<BlockRecord>();
> final ExtendedBlock newBlock = new ExtendedBlock(bpid, blockId,
> -1, recoveryId);
> switch(bestState) {
> …
> case RBW:
> case RWR:
>   long minLength = Long.MAX_VALUE;
>   for(BlockRecord r : syncList) {
> ReplicaState rState = r.rInfo.getOriginalReplicaState();
> if(rState == bestState) {
>   minLength = Math.min(minLength, r.rInfo.getNumBytes());
>   participatingList.add(r);
> }
>   }
>   newBlock.setNumBytes(minLength);
>   break;
> …
> }
> …
> nn.commitBlockSynchronization(block,
> newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false,
> datanodes, storages);
>   }
> {code}
> This code is called by the DN coordinating the block recovery. In the above 
> case, it is possible for none of the rStates (reported by DNs with copies of 
> the replica being recovered) to match the bestState. This can be caused 
> either by faulty DN code or by stale/modified/corrupted files on the DN. When 
> this happens the DN will end up reporting a minLength of Long.MAX_VALUE.
> Unfortunately there is no check on the NN for replica length. See 
> FSNamesystem.java:
> {code:java}
>   void commitBlockSynchronization(ExtendedBlock oldBlock,
>   long newgenerationstamp, long newlength,
>   boolean closeFile, boolean deleteblock, DatanodeID[] newtargets,
>   String[] newtargetstorages) throws IOException {
> …
>   if (deleteblock) {
> Block blockToDel = ExtendedBlock.getLocalBlock(oldBlock);
> boolean remove = iFile.removeLastBlock(blockToDel) != null;
> if (remove) {
>   blockManager.removeBlock(storedBlock);
> }
>   } else {
> // update last block
> if(!copyTruncate) {
>   storedBlock.setGenerationStamp(newgenerationstamp);
>   
>   // XXX block length is updated without any check <<<
>   storedBlock.setNumBytes(newlength);
> }
> …
> if (closeFile) {
>   LOG.info("commitBlockSynchronization(oldBlock=" + oldBlock
>   + ", file=" + src
>   + (copyTruncate ? ", newBlock=" + truncatedBlock
>   : ", newgenerationstamp=" + newgenerationstamp)
>   + ", newlength=" + newlength
>   + ", newtargets=" + Arrays.asList(newtargets) + ") successful");
> } else {
>   LOG.info("commitBlockSynchronization(" + oldBlock + ") successful");
> }
>   }
> {code}
> After this point the block length becomes Long.MAX_VALUE. Any subsequent 
> block report (even with the correct length) will cause the block to be marked 
> as corrupted. This block could be the last block of the file; if this 
> happens and the client goes away, the NN won't be able to recover the lease 
> and close the file because the last block is under-replicated.
> I believe we need a sanity check for block size on both the DN and NN to 
> prevent such a case from happening.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9239) DataNode Lifeline Protocol: an alternative protocol for reporting DataNode liveness

2015-10-16 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961540#comment-14961540
 ] 

Jitendra Nath Pandey commented on HDFS-9239:


bq. .. Well before node liveness is affected by inundation of IBRs and FBRs, 
the namenode performance will degrade to unacceptable level...

  Yes, indeed. But if datanodes are marked as dead in that situation, that 
completely destabilizes the system. At that point, even if we kill certain 
offending jobs, it takes a while before the NN can come back to an acceptable 
service level. This proposal should help prevent those node deaths until the 
NN is past the overload scenario.

  I think the ZKFC healthcheck should also be separated into a different queue 
or port so that it is not blocked by other messages in the NN's call queue. A 
failover because the NN is busy is not very helpful: the other NN also gets 
busy, and we end up seeing active-standby flip-flop between the namenodes.

> DataNode Lifeline Protocol: an alternative protocol for reporting DataNode 
> liveness
> ---
>
> Key: HDFS-9239
> URL: https://issues.apache.org/jira/browse/HDFS-9239
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: DataNode-Lifeline-Protocol.pdf
>
>
> This issue proposes introduction of a new feature: the DataNode Lifeline 
> Protocol.  This is an RPC protocol that is responsible for reporting liveness 
> and basic health information about a DataNode to a NameNode.  Compared to the 
> existing heartbeat messages, it is lightweight and not prone to resource 
> contention problems that can harm accurate tracking of DataNode liveness 
> currently.  The attached design document contains more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9253) Refactor tests of libhdfs into a directory

2015-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961530#comment-14961530
 ] 

Hudson commented on HDFS-9253:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2444 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2444/])
HDFS-9253. Refactor tests of libhdfs into a directory. Contributed by (wheat9: 
rev 79b8d60d085ae196b05ff4ab511ff89f652e3c55)
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_trash.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_context_handle.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/include/hdfs/hdfs.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_htable.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/expect.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/native_mini_dfs.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs_test.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_write.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/exception.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/hdfs_web.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/native_mini_dfs.h
* hadoop-hdfs-project/hadoop-hdfs-native-client/pom.xml
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_write.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_htable.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/expect.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_write.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_threaded.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_zerocopy.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/hdfs_http_client.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test_native_mini_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/vecsum.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/vecsum.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/expect.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_read.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_threaded.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_native_mini_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_zerocopy.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/test/fuse_workload.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_trash.c
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/test/test_fuse_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_stat_struct.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test_libhdfs_threaded.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/hdfs_test.h
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/expect.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/hdfs_json_parser.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_ops.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_connect.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/native_mini_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_ops.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_file_handle.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_read.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_ops.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/native_mini

[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-16 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Description: 
This patch changes the data structures used for BlockInfos and Replicas to keep 
them sorted. This allows faster and more GC-friendly handling of full block 
reports.

Would like to hear people's feedback on this change, and also to get some help 
investigating/understanding a few outstanding issues if we are interested in 
moving forward with this.

There seem to be some timing issues I hit when testing the patch; not sure if 
it is a bug in the patch or something else (most likely the former)...

Tests that fail for me:

   The issue seems to be that the blocks are not on any storage, so no 
replication can occur, causing the tests to fail in different ways.

   TestDecomission.testDecommision
   If I add a little sleep after the cleanup/delete, things seem to work
   TestDFSStripedOutputStreamWithFailure
   A couple of tests fail in this class.



  was:
This patch changes the datastructures used for BlockInfos and Replicas to keep 
them sorted. This allows faster and more GC friendly handling of full block 
reports.

Would like to hear peoples feedback on this change and also some help 
investigating/understanding a few outstanding issues if we are interested in 
moving forward with this.

There seems to be some timing issues I hit when testing the patch, not sure if 
it is a bug in the patch or something else (most likely the earlier)...

Tests that fail for me:

   The issues seems to be that the blocks is not on any storage, so no 
replication can occurs causing the tests to fail in different ways.

   TestDecomission.testDecommision
   If I add a little sleep after the cleanup/delete things seem to work
   TestDFSStripedOutputStreamWithFailure
   A couple of tests fails in this class.




> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch
>
>
> This patch changes the data structures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC-friendly handling of full 
> block reports.
> Would like to hear people's feedback on this change, and also to get some 
> help investigating/understanding a few outstanding issues if we are 
> interested in moving forward with this.
> There seem to be some timing issues I hit when testing the patch; not sure 
> if it is a bug in the patch or something else (most likely the former)...
> Tests that fail for me:
>    The issue seems to be that the blocks are not on any storage, so no 
> replication can occur, causing the tests to fail in different ways.
>    TestDecomission.testDecommision
>    If I add a little sleep after the cleanup/delete, things seem to work
>    TestDFSStripedOutputStreamWithFailure
>    A couple of tests fail in this class.
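To illustrate the kind of handling the patch aims for, here is a toy sketch 
(an assumption about the approach, not code from the attached patch): matching 
a full block report against state kept in sorted primitive arrays, which 
avoids per-block object lookups and the GC pressure they create.

{code:java}
import java.util.Arrays;

final class SortedFbrSketch {
  /** Returns the stored block IDs that are absent from the report. */
  static long[] missingFromReport(long[] storedIdsSorted, long[] reportedIds) {
    long[] reportedSorted = reportedIds.clone();
    Arrays.sort(reportedSorted);  // sort once, then O(log n) lookups
    return Arrays.stream(storedIdsSorted)
        .filter(id -> Arrays.binarySearch(reportedSorted, id) < 0)
        .toArray();
  }
}
{code}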



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-16 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961480#comment-14961480
 ] 

Jitendra Nath Pandey commented on HDFS-9184:


I will commit it to trunk if there are no objections.
[~aw], I think the latest patch addresses your concern about the change in the 
audit log by keeping it disabled by default. If you are OK with it, I would 
like to commit this to branch-2 as well.

> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9184.000.patch, HDFS-9184.001.patch, 
> HDFS-9184.002.patch, HDFS-9184.003.patch, HDFS-9184.004.patch, 
> HDFS-9184.005.patch, HDFS-9184.006.patch, HDFS-9184.007.patch
>
>
> For a given HDFS operation (e.g. deleting a file), it's very helpful to track 
> which upper-level job issued it. The upper-level callers may be specific 
> Oozie tasks, MR jobs, or Hive queries. One scenario is that the namenode 
> (NN) is abused/spammed and the operator wants to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which is 
> obviously not enough. It's common for the same user to issue multiple jobs at 
> the same time. Even for a single top-level task, tracking back to a specific 
> caller in a chain of operations across the whole workflow (e.g. Oozie -> 
> Hive -> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. Spans are created in many places, interconnected like 
> a tree structure, and rely on offline analysis across RPC boundaries. For 
> this use case, {{htrace}} has to be enabled at a 100% sampling rate, which 
> introduces significant overhead. Moreover, passing additional information 
> (via annotations) other than the span id from the root of the tree to a leaf 
> is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there is 
> some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, 
> Kerberos-authenticated connections and insecure connections don't have 
> tokens. [HADOOP-8779] proposes to use tokens in all scenarios, but that might 
> mean changes to several upstream projects and a major change in their 
> security implementation.
> We propose another approach to address this problem. We also treat the HDFS 
> audit log as a good place for after-the-fact root cause analysis. We propose 
> to put the caller id (e.g. the Hive query id) in threadlocals. Specifically, 
> on the client side the threadlocal object is passed to the NN as an 
> (optional) part of the RPC header, while on the server side the NN retrieves 
> it from the header and puts it into the {{Handler}}'s threadlocals. Finally, 
> in {{FSNamesystem}}, the HDFS audit logger will record the caller context for 
> each operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client may choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis; 
> the NN is not responsible for validating the signature online.
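A minimal sketch of the threadlocal pattern described above (the names here 
are illustrative; the real patch wires the context through the Hadoop RPC 
header):

{code:java}
/** Minimal sketch of a threadlocal caller context (illustrative only). */
final class CallerContextSketch {
  private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

  static void setCurrent(String callerId) { CONTEXT.set(callerId); }
  static String getCurrent() { return CONTEXT.get(); }

  public static void main(String[] args) {
    // Client side: an upper-level app (e.g. Hive) tags the calling thread.
    setCurrent("hive_query_id:q_20151016_0001");
    // Server side: the NN handler would restore the id from the RPC header
    // before the audit logger formats the entry.
    System.out.println("audit: cmd=delete callerContext=" + getCurrent());
  }
}
{code}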



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-16 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961475#comment-14961475
 ] 

Jitendra Nath Pandey commented on HDFS-9184:


+1

> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9184.000.patch, HDFS-9184.001.patch, 
> HDFS-9184.002.patch, HDFS-9184.003.patch, HDFS-9184.004.patch, 
> HDFS-9184.005.patch, HDFS-9184.006.patch, HDFS-9184.007.patch
>
>
> For a given HDFS operation (e.g. deleting a file), it's very helpful to track 
> which upper-level job issued it. The upper-level callers may be specific 
> Oozie tasks, MR jobs, or Hive queries. One scenario is that the namenode 
> (NN) is abused/spammed and the operator wants to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which is 
> obviously not enough. It's common for the same user to issue multiple jobs at 
> the same time. Even for a single top-level task, tracking back to a specific 
> caller in a chain of operations across the whole workflow (e.g. Oozie -> 
> Hive -> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. Spans are created in many places, interconnected like 
> a tree structure, and rely on offline analysis across RPC boundaries. For 
> this use case, {{htrace}} has to be enabled at a 100% sampling rate, which 
> introduces significant overhead. Moreover, passing additional information 
> (via annotations) other than the span id from the root of the tree to a leaf 
> is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there is 
> some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, 
> Kerberos-authenticated connections and insecure connections don't have 
> tokens. [HADOOP-8779] proposes to use tokens in all scenarios, but that might 
> mean changes to several upstream projects and a major change in their 
> security implementation.
> We propose another approach to address this problem. We also treat the HDFS 
> audit log as a good place for after-the-fact root cause analysis. We propose 
> to put the caller id (e.g. the Hive query id) in threadlocals. Specifically, 
> on the client side the threadlocal object is passed to the NN as an 
> (optional) part of the RPC header, while on the server side the NN retrieves 
> it from the header and puts it into the {{Handler}}'s threadlocals. Finally, 
> in {{FSNamesystem}}, the HDFS audit logger will record the caller context for 
> each operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client may choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis; 
> the NN is not responsible for validating the signature online.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4015) Safemode should count and report orphaned blocks

2015-10-16 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961458#comment-14961458
 ] 

Arpit Agarwal commented on HDFS-4015:
-

bq. When the operator makes the name node leave safe mode manually, the -force 
option is not checked, even if there are orphaned blocks. Is this possible? If 
true, is it expected?
[~liuml07], you are right. It's admittedly odd for an administrator to enter 
safe mode manually during startup but we should guard against the sequence of 
steps you described.

I need to think about this some more, but we should be able to remove the 
{{isInStartupSafeMode()}} check from the clause below, i.e. never exit safe 
mode without the force flag if there are bytes with future generation stamps. 
(The rollback exception is already handled elsewhere.)

{code}
private synchronized void leave(boolean force) {
  ...
  if (!force && isInStartupSafeMode()
      && (blockManager.getBytesInFuture() > 0)) {
{code}
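A sketch of what the relaxed clause could look like (an assumption about the 
eventual fix, not a committed change):

{code:java}
// Hypothetical: drop the isInStartupSafeMode() condition so the guard holds
// whenever bytes with future generation stamps exist.
if (!force && blockManager.getBytesInFuture() > 0) {
  // Refuse to leave safe mode; the operator must pass the force flag.
  return;
}
{code}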

> Safemode should count and report orphaned blocks
> 
>
> Key: HDFS-4015
> URL: https://issues.apache.org/jira/browse/HDFS-4015
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Todd Lipcon
>Assignee: Anu Engineer
> Attachments: HDFS-4015.001.patch, HDFS-4015.002.patch, 
> HDFS-4015.003.patch, HDFS-4015.004.patch, HDFS-4015.005.patch
>
>
> The safemode status currently reports the number of unique reported blocks 
> compared to the total number of blocks referenced by the namespace. However, 
> it does not report the inverse: blocks which are reported by datanodes but 
> not referenced by the namespace.
> In the case that an admin accidentally starts up from an old image, this can 
> be confusing: safemode and fsck will show "corrupt files", which are the 
> files which actually have been deleted but got resurrected by restarting from 
> the old image. This will convince them that they can safely force leave 
> safemode and remove these files -- after all, they know that those files 
> should really have been deleted. However, they're not aware that leaving 
> safemode will also unrecoverably delete a bunch of other block files which 
> have been orphaned due to the namespace rollback.
> I'd like to consider reporting something like: "90 of expected 100 
> blocks have been reported. Additionally, 1 blocks have been reported 
> which do not correspond to any file in the namespace. Forcing exit of 
> safemode will unrecoverably remove those data blocks"
> Whether this statistic is also used for some kind of "inverse safe mode" is 
> the logical next step, but just reporting it as a warning seems easy enough 
> to accomplish and worth doing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-16 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961413#comment-14961413
 ] 

Mingliang Liu commented on HDFS-9184:
-

The failing tests seem unrelated and can pass locally (Gentoo Linux and Mac).

> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9184.000.patch, HDFS-9184.001.patch, 
> HDFS-9184.002.patch, HDFS-9184.003.patch, HDFS-9184.004.patch, 
> HDFS-9184.005.patch, HDFS-9184.006.patch, HDFS-9184.007.patch
>
>
> For a given HDFS operation (e.g. deleting a file), it's very helpful to track 
> which upper-level job issued it. The upper-level callers may be specific 
> Oozie tasks, MR jobs, or Hive queries. One scenario is that the namenode 
> (NN) is abused/spammed and the operator wants to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which is 
> obviously not enough. It's common for the same user to issue multiple jobs at 
> the same time. Even for a single top-level task, tracking back to a specific 
> caller in a chain of operations across the whole workflow (e.g. Oozie -> 
> Hive -> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. Spans are created in many places, interconnected like 
> a tree structure, and rely on offline analysis across RPC boundaries. For 
> this use case, {{htrace}} has to be enabled at a 100% sampling rate, which 
> introduces significant overhead. Moreover, passing additional information 
> (via annotations) other than the span id from the root of the tree to a leaf 
> is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there is 
> some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, 
> Kerberos-authenticated connections and insecure connections don't have 
> tokens. [HADOOP-8779] proposes to use tokens in all scenarios, but that might 
> mean changes to several upstream projects and a major change in their 
> security implementation.
> We propose another approach to address this problem. We also treat the HDFS 
> audit log as a good place for after-the-fact root cause analysis. We propose 
> to put the caller id (e.g. the Hive query id) in threadlocals. Specifically, 
> on the client side the threadlocal object is passed to the NN as an 
> (optional) part of the RPC header, while on the server side the NN retrieves 
> it from the header and puts it into the {{Handler}}'s threadlocals. Finally, 
> in {{FSNamesystem}}, the HDFS audit logger will record the caller context for 
> each operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client may choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis; 
> the NN is not responsible for validating the signature online.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9262) Reconfigure lazy writer interval on the fly

2015-10-16 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-9262:

Affects Version/s: 2.7.0

> Reconfigure lazy writer interval on the fly
> ---
>
> Key: HDFS-9262
> URL: https://issues.apache.org/jira/browse/HDFS-9262
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 2.7.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>
> This is to reconfigure
> {code}
> dfs.datanode.lazywriter.interval.sec
> {code}
> without restarting DN.
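For context, DataNode runtime reconfiguration is driven through {{dfsadmin}}; 
assuming this property joins the DataNode's reconfigurable set, usage would 
look roughly like:

{code}
# Edit dfs.datanode.lazywriter.interval.sec in hdfs-site.xml, then:
hdfs dfsadmin -reconfig datanode <dn-host:ipc-port> start
hdfs dfsadmin -reconfig datanode <dn-host:ipc-port> status
{code}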



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9262) Reconfigure lazy writer interval on the fly

2015-10-16 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-9262:

Description: 
This is to reconfigure
{code}
dfs.datanode.lazywriter.interval.sec
{code}
without restarting DN.

  was:
This is to reconfigure
dfs.datanode.lazywriter.interval.sec
without restarting DN.


> Reconfigure lazy writer interval on the fly
> ---
>
> Key: HDFS-9262
> URL: https://issues.apache.org/jira/browse/HDFS-9262
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 2.7.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>
> This is to reconfigure
> {code}
> dfs.datanode.lazywriter.interval.sec
> {code}
> without restarting DN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9262) Reconfigure lazy writer interval on the fly

2015-10-16 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-9262:

Description: 
This is to reconfigure
dfs.datanode.lazywriter.interval.sec
without restarting DN.

> Reconfigure lazy writer interval on the fly
> ---
>
> Key: HDFS-9262
> URL: https://issues.apache.org/jira/browse/HDFS-9262
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>
> This is to reconfigure
> dfs.datanode.lazywriter.interval.sec
> without restarting DN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9262) Reconfigure lazy writer interval on the fly

2015-10-16 Thread Xiaobing Zhou (JIRA)
Xiaobing Zhou created HDFS-9262:
---

 Summary: Reconfigure lazy writer interval on the fly
 Key: HDFS-9262
 URL: https://issues.apache.org/jira/browse/HDFS-9262
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Reporter: Xiaobing Zhou
Assignee: Xiaobing Zhou






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9245) Fix findbugs warnings in hdfs-nfs/WriteCtx

2015-10-16 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961358#comment-14961358
 ] 

Li Lu commented on HDFS-9245:
-

Yes I think using volatile here is appropriate. Findbugs also turned green for 
the fix. 
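For reference, a minimal sketch of the pattern being approved here (an assumed 
shape of the fix; the field names come from the warnings quoted below, and the 
types are inferred):

{code:java}
// Declaring the fields volatile lets unsynchronized readers observe the
// latest value written under the lock, silencing IS2_INCONSISTENT_SYNC.
private volatile long offset;
private volatile int originalCount;
{code}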

> Fix findbugs warnings in hdfs-nfs/WriteCtx
> --
>
> Key: HDFS-9245
> URL: https://issues.apache.org/jira/browse/HDFS-9245
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9245.000.patch
>
>
> There are findbugs warnings as follows, brought by [HDFS-9092].
> It seems fine to ignore them by writing a filter rule in the 
> {{findbugsExcludeFile.xml}} file. 
> {code:xml}
> <BugInstance instanceHash="592511935f7cb9e5f97ef4c99a6c46c2"
>     instanceOccurrenceNum="0" priority="2" abbrev="IS"
>     type="IS2_INCONSISTENT_SYNC" cweid="366" instanceOccurrenceMax="0">
>   <ShortMessage>Inconsistent synchronization</ShortMessage>
>   <LongMessage>Inconsistent synchronization of
>     org.apache.hadoop.hdfs.nfs.nfs3.WriteCtx.offset; locked 75% of time</LongMessage>
>   <SourceLine sourcepath="org/apache/hadoop/hdfs/nfs/nfs3/WriteCtx.java"
>       sourcefile="WriteCtx.java" end="314">
>     <Message>At WriteCtx.java:[lines 40-314]</Message>
>   </SourceLine>
>   <Message>In class org.apache.hadoop.hdfs.nfs.nfs3.WriteCtx</Message>
> </BugInstance>
> {code}
> and
> {code:xml}
> <BugInstance instanceHash="4f3daa339eb819220f26c998369b02fe"
>     instanceOccurrenceNum="0" priority="2" abbrev="IS"
>     type="IS2_INCONSISTENT_SYNC" cweid="366" instanceOccurrenceMax="0">
>   <ShortMessage>Inconsistent synchronization</ShortMessage>
>   <LongMessage>Inconsistent synchronization of
>     org.apache.hadoop.hdfs.nfs.nfs3.WriteCtx.originalCount; locked 50% of time</LongMessage>
>   <SourceLine sourcepath="org/apache/hadoop/hdfs/nfs/nfs3/WriteCtx.java"
>       sourcefile="WriteCtx.java" end="314">
>     <Message>At WriteCtx.java:[lines 40-314]</Message>
>   </SourceLine>
>   <Message>In class org.apache.hadoop.hdfs.nfs.nfs3.WriteCtx</Message>
>   <Field name="originalCount" primary="true" signature="I">
>     <SourceLine sourcepath="org/apache/hadoop/hdfs/nfs/nfs3/WriteCtx.java"
>         sourcefile="WriteCtx.java">
>       <Message>In WriteCtx.java</Message>
>     </SourceLine>
>     <Message>Field org.apache.hadoop.hdfs.nfs.nfs3.WriteCtx.originalCount</Message>
>   </Field>
> </BugInstance>
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9253) Refactor tests of libhdfs into a directory

2015-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961338#comment-14961338
 ] 

Hudson commented on HDFS-9253:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1279 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1279/])
HDFS-9253. Refactor tests of libhdfs into a directory. Contributed by (wheat9: 
rev 79b8d60d085ae196b05ff4ab511ff89f652e3c55)
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/native_mini_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_write.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_ops.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/vecsum.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/hdfs_test.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/native_mini_dfs.c
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/test/test_fuse_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test_native_mini_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_read.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_connect.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/expect.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_write.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_threaded.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/hdfs_web.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/expect.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_stat_struct.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/native_mini_dfs.h
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/expect.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_file_handle.h
* hadoop-hdfs-project/hadoop-hdfs-native-client/pom.xml
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_trash.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_read.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs_test.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_htable.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/exception.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_threaded.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_trash.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/test/fuse_workload.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/native_mini_dfs.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/hdfs_json_parser.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_ops.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_read.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_ops.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test_libhdfs_threaded.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/include/hdfs/hdfs.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_htable.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/vecsum.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_write.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/expect.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_context_handle.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/hdfs_http_client.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_zerocopy.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_native_mini_dfs.

[jira] [Commented] (HDFS-9257) improve error message for "Absolute path required" in INode.java to contain the rejected path

2015-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961317#comment-14961317
 ] 

Hudson commented on HDFS-9257:
--

ABORTED: Integrated in Hadoop-Hdfs-trunk-Java8 #506 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/506/])
HDFS-9257. improve error message for "Absolute path required" in (harsh: rev 
52ac73f344e822e41457582f82abb4f35eba9dec)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> improve error message for "Absolute path required" in INode.java to contain 
> the rejected path
> -
>
> Key: HDFS-9257
> URL: https://issues.apache.org/jira/browse/HDFS-9257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Marcell Szabo
>Assignee: Marcell Szabo
>Priority: Trivial
> Fix For: 2.8.0
>
> Attachments: HDFS-9257.000.patch
>
>
> throw new AssertionError("Absolute path required");
> The message should also include the rejected path to help debugging.
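A one-line sketch of the requested change (assumed form, with {{path}} being 
the rejected argument):

{code:java}
throw new AssertionError("Absolute path required, but got '" + path + "'");
{code}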



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9205) Do not schedule corrupt blocks for replication

2015-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961316#comment-14961316
 ] 

Hudson commented on HDFS-9205:
--

ABORTED: Integrated in Hadoop-Hdfs-trunk-Java8 #506 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/506/])
Revert "Move HDFS-9205 to trunk in CHANGES.txt." (szetszwo: rev 
a554701fe4402ae30461e2ef165cb60970a202a0)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Do not schedule corrupt blocks for replication
> --
>
> Key: HDFS-9205
> URL: https://issues.apache.org/jira/browse/HDFS-9205
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: h9205_20151007.patch, h9205_20151007b.patch, 
> h9205_20151008.patch, h9205_20151009.patch, h9205_20151009b.patch, 
> h9205_20151013.patch, h9205_20151015.patch
>
>
> Corrupted blocks by definition are blocks that cannot be read. As a 
> consequence, they cannot be replicated.  In UnderReplicatedBlocks, there is a 
> queue for QUEUE_WITH_CORRUPT_BLOCKS, and chooseUnderReplicatedBlocks may 
> choose blocks from it.  It seems that scheduling corrupted blocks for 
> replication wastes resources and potentially slows down replication of the 
> higher-priority blocks.
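A sketch of the idea (illustrative, not the attached patch; {{chooseFromQueue}} 
and {{priorityQueues}} are placeholder names): skip the corrupt-blocks queue 
when handing out replication work.

{code:java}
// Hypothetical selection loop: never schedule from the corrupt queue,
// since unreadable blocks cannot serve as replication sources.
for (int priority = 0; priority < LEVEL; priority++) {
  if (priority == QUEUE_WITH_CORRUPT_BLOCKS) {
    continue; // corrupt blocks are reported, not replicated
  }
  chooseFromQueue(priorityQueues.get(priority), blocksToProcess);
}
{code}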



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9259) Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario

2015-10-16 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-9259:
--
Assignee: Mingliang Liu

Thanks [~liuml07]! I have assigned it to you.

> Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario
> --
>
> Key: HDFS-9259
> URL: https://issues.apache.org/jira/browse/HDFS-9259
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Mingliang Liu
>
> We recently found that cross-DC hdfs writes could be really slow. Further 
> investigation identified that this is due to the SendBufferSize and 
> ReceiveBufferSize used for the hdfs write. The test ran "hadoop fs 
> -copyFromLocal" of a 256MB file across DCs with different SendBufferSize and 
> ReceiveBufferSize values. The results showed that c is much faster than b, 
> and b is faster than a.
> a. SendBufferSize=128k, ReceiveBufferSize=128k (hdfs default setting).
> b. SendBufferSize=128K, ReceiveBufferSize=not set(TCP auto tuning).
> c. SendBufferSize=not set, ReceiveBufferSize=not set(TCP auto tuning for both)
> HDFS-8829 has enabled scenario b. We would like to enable scenario c by 
> making SendBufferSize configurable at DFSClient side. Cc: [~cmccabe] [~He 
> Tianyi] [~kanaka] [~vinayrpet].
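A sketch of the client-side knob being proposed (the configuration key and 
handling here are assumptions for illustration only):

{code:java}
import java.net.Socket;
import org.apache.hadoop.conf.Configuration;

Configuration conf = new Configuration();
// Hypothetical key: a value <= 0 leaves SO_SNDBUF unset so the kernel's
// TCP auto-tuning stays in effect (scenario c above).
int sendBufSize = conf.getInt("dfs.client.socket.send.buffer.size", 0);
Socket sock = new Socket();
if (sendBufSize > 0) {
  sock.setSendBufferSize(sendBufSize); // a fixed size disables auto-tuning
}
{code}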



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9259) Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario

2015-10-16 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-9259:
--
Description: 
We recently found that cross-DC hdfs writes could be really slow. Further 
investigation identified that this is due to the SendBufferSize and 
ReceiveBufferSize used for the hdfs write. The test ran "hadoop fs 
-copyFromLocal" of a 256MB file across DCs with different SendBufferSize and 
ReceiveBufferSize values. The results showed that c is much faster than b, 
and b is faster than a.

a. SendBufferSize=128k, ReceiveBufferSize=128k (hdfs default setting).
b. SendBufferSize=128K, ReceiveBufferSize=not set(TCP auto tuning).
c. SendBufferSize=not set, ReceiveBufferSize=not set(TCP auto tuning for both)

HDFS-8829 has enabled scenario b. We would like to enable scenario c by making 
SendBufferSize configurable at DFSClient side. Cc: [~cmccabe] [~He Tianyi] 
[~kanaka] [~vinayrpet].

  was:
We recently found that cross-DC hdfs write could be really slow. Further 
investigation identified that is due to SendBufferSize and ReceiveBufferSize 
used for hdfs write. The test is to do "hadoop -fs -copyFromLocal" of a 256MB 
file across DC with different SendBufferSize and ReceiveBufferSize values. The 
results showed that c much faster than b; b is faster than a.

a. SendBufferSize=128k, ReceiveBufferSize=128k (hdfs default setting).
b. SendBufferSize=128K, ReceiveBufferSize=not set(TCP auto tuning).
c. SendBufferSize=not set, ReceiveBufferSize=not set(TCP auto tuning for both)

HDFS-8829 has enabled scenario b. We would like to enable scenario c to make 
SendBufferSize configurable at DFSClient side. Cc: [~cmccabe] [~He Tianyi] 
[~kanaka] [~vinayrpet].


> Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario
> --
>
> Key: HDFS-9259
> URL: https://issues.apache.org/jira/browse/HDFS-9259
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>
> We recently found that cross-DC hdfs writes could be really slow. Further 
> investigation identified that this is due to the SendBufferSize and 
> ReceiveBufferSize used for the hdfs write. The test ran "hadoop fs 
> -copyFromLocal" of a 256MB file across DCs with different SendBufferSize and 
> ReceiveBufferSize values. The results showed that c is much faster than b, 
> and b is faster than a.
> a. SendBufferSize=128k, ReceiveBufferSize=128k (hdfs default setting).
> b. SendBufferSize=128K, ReceiveBufferSize=not set(TCP auto tuning).
> c. SendBufferSize=not set, ReceiveBufferSize=not set(TCP auto tuning for both)
> HDFS-8829 has enabled scenario b. We would like to enable scenario c by 
> making SendBufferSize configurable at DFSClient side. Cc: [~cmccabe] [~He 
> Tianyi] [~kanaka] [~vinayrpet].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9253) Refactor tests of libhdfs into a directory

2015-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961301#comment-14961301
 ] 

Hudson commented on HDFS-9253:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #558 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/558/])
HDFS-9253. Refactor tests of libhdfs into a directory. Contributed by (wheat9: 
rev 79b8d60d085ae196b05ff4ab511ff89f652e3c55)
* hadoop-hdfs-project/hadoop-hdfs-native-client/pom.xml
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/exception.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/CMakeLists.txt
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/test/fuse_workload.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/native_mini_dfs.h
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_read.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/expect.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/native_mini_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/expect.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_write.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs_test.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_write.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_threaded.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_context_handle.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_connect.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/vecsum.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_file_handle.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_write.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test_libhdfs_threaded.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/expect.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_ops.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test_native_mini_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/hdfs_http_client.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/native_mini_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_zerocopy.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/vecsum.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/hdfs_web.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_read.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/native_mini_dfs.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_ops.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_trash.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/hdfs_json_parser.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/include/hdfs/hdfs.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_trash.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_zerocopy.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_ops.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_stat_struct.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_read.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_native_mini_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_htable.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_htable.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/hdfs_test.h
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/expect.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hd

[jira] [Commented] (HDFS-9253) Refactor tests of libhdfs into a directory

2015-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961279#comment-14961279
 ] 

Hudson commented on HDFS-9253:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #543 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/543/])
HDFS-9253. Refactor tests of libhdfs into a directory. Contributed by (wheat9: 
rev 79b8d60d085ae196b05ff4ab511ff89f652e3c55)
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs_test.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test_native_mini_dfs.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_threaded.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/expect.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/vecsum.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_context_handle.h
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/expect.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/pom.xml
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_zerocopy.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_ops.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/hdfs_json_parser.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_native_mini_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/native_mini_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_threaded.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/hdfs_test.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_ops.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_htable.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test_libhdfs_threaded.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/exception.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/vecsum.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_stat_struct.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/expect.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/native_mini_dfs.h
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/expect.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_ops.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_read.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/test/test_fuse_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_connect.c
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/hdfs_http_client.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_htable.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_write.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/hdfs_web.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_file_handle.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_read.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/test/fuse_workload.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_read.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/native_mini_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/native_mini_dfs.h
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_write.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_zerocopy.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_trash.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_trash.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/i

[jira] [Commented] (HDFS-9253) Refactor tests of libhdfs into a directory

2015-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961242#comment-14961242
 ] 

Hudson commented on HDFS-9253:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2492 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2492/])
HDFS-9253. Refactor tests of libhdfs into a directory. Contributed by (wheat9: 
rev 79b8d60d085ae196b05ff4ab511ff89f652e3c55)
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/expect.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_read.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_zerocopy.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_read.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_threaded.c
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/expect.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_context_handle.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_native_mini_dfs.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/expect.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_trash.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_htable.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_zerocopy.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_htable.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_connect.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs_test.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/hdfs_web.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/hdfs_test.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test_native_mini_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_write.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/test/fuse_workload.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_read.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/test/test_fuse_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/hdfs_http_client.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/hdfs_json_parser.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_trash.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_write.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/include/hdfs/hdfs.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_write.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_threaded.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/native_mini_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/native_mini_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_ops.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test_libhdfs_threaded.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/native_mini_dfs.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/exception.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_file_handle.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/vecsum.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/vecsum.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/expect.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_ops.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_ops.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-na

[jira] [Commented] (HDFS-9205) Do not schedule corrupt blocks for replication

2015-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961226#comment-14961226
 ] 

Hudson commented on HDFS-9205:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2443 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2443/])
Revert "Move HDFS-9205 to trunk in CHANGES.txt." (szetszwo: rev 
a554701fe4402ae30461e2ef165cb60970a202a0)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Do not schedule corrupt blocks for replication
> --
>
> Key: HDFS-9205
> URL: https://issues.apache.org/jira/browse/HDFS-9205
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: h9205_20151007.patch, h9205_20151007b.patch, 
> h9205_20151008.patch, h9205_20151009.patch, h9205_20151009b.patch, 
> h9205_20151013.patch, h9205_20151015.patch
>
>
> Corrupted blocks by definition are blocks that cannot be read. As a 
> consequence, they cannot be replicated.  In UnderReplicatedBlocks, there is a 
> queue for QUEUE_WITH_CORRUPT_BLOCKS, and chooseUnderReplicatedBlocks may 
> choose blocks from it.  It seems that scheduling corrupted blocks for 
> replication wastes resources and potentially slows down replication of the 
> higher-priority blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9257) improve error message for "Absolute path required" in INode.java to contain the rejected path

2015-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961227#comment-14961227
 ] 

Hudson commented on HDFS-9257:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2443 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2443/])
HDFS-9257. improve error message for "Absolute path required" in (harsh: rev 
52ac73f344e822e41457582f82abb4f35eba9dec)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> improve error message for "Absolute path required" in INode.java to contain 
> the rejected path
> -
>
> Key: HDFS-9257
> URL: https://issues.apache.org/jira/browse/HDFS-9257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Marcell Szabo
>Assignee: Marcell Szabo
>Priority: Trivial
> Fix For: 2.8.0
>
> Attachments: HDFS-9257.000.patch
>
>
> throw new AssertionError("Absolute path required");
> The message should also include the rejected path to help debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9249) NPE thrown if an IOException is thrown in NameNode.

2015-10-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961221#comment-14961221
 ] 

Hadoop QA commented on HDFS-9249:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  20m 16s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m 52s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  11m 33s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 26s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 37s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 46s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 37s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 46s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 35s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  53m  4s | Tests failed in hadoop-hdfs. |
| | | 104m 37s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.blockmanagement.TestNodeCount |
|   | hadoop.hdfs.server.namenode.TestBackupNode |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12767098/HDFS-9249.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 52ac73f |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13034/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13034/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13034/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13034/console |


This message was automatically generated.

> NPE thrown if an IOException is thrown in NameNode.
> -
>
> Key: HDFS-9249
> URL: https://issues.apache.org/jira/browse/HDFS-9249
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
>  Labels: supportability
> Attachments: HDFS-9249.001.patch, HDFS-9249.002.patch
>
>
> This issue was found when running test case 
> TestBackupNode.testCheckpointNode, but upon closer look, the problem is not 
> due to the test case.
> Looks like an IOException was thrown in
> try {
>   initializeGenericKeys(conf, nsId, namenodeId);
>   initialize(conf);
>   try {
> haContext.writeLock();
> state.prepareToEnterState(haContext);
> state.enterState(haContext);
>   } finally {
> haContext.writeUnlock();
>   }
> causing the namenode to stop before the namesystem was properly 
> instantiated, causing the NPE.
> I tried to reproduce locally, but to no avail.
> Because I could not reproduce the bug, and the log does not indicate what 
> caused the IOException, I suggest making this a supportability JIRA to log the 
> exception for future improvement.
> Stacktrace
> java.lang.NullPointerException: null
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.getFSImage(NameNode.java:906)
> at org.apache.hadoop.hdfs.server.namenode.BackupNode.stop(BackupNode.java:210)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:827)
> at 
> org.apache.hadoop.hdfs.server.namenode.BackupNode.(BackupNode.java:89)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1474)
> at 
> org.apache.hadoop.hdfs.server.namenode.TestBackupNode.startBackupNode(TestBackupNode.java:102)
> at 
> org.apache.hadoop.hdfs.server.namenode.TestBackupNode.testCheckpoint(TestBackupNode.java:298)
> at 
> org.apache.hadoop.hdfs.server.namenode.TestBackupNode.testCheckpointNode(TestBackupNode.java:130)
> The last few lines of log:
> 2015-10-14 19:45:07,807 INFO namenode.NameNode 
> (NameNode.java:createNameNode(1422)) - createNameNode 

[jira] [Commented] (HDFS-8766) Implement a libhdfs(3) compatible API

2015-10-16 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961217#comment-14961217
 ] 

Haohui Mai commented on HDFS-8766:
--

Thanks for updating the patch. Some comments:

1. Remove {{hdfs.h}}, {{c_api_test.cc}} in this patch and reuse the code in the 
existing repo, as HDFS-9207 and HDFS-9253 have landed.
2. Remove {{hdfs_macros.h}} and use {{unique_ptr}}.
3. Separate the bug fixes in 
{{hadoop-hdfs-project/hadoop-hdfs-client/src/main/native/libhdfspp/lib/reader/remote_block_reader_impl.h}}
 into another jira.
4. Rename {{HdfsInternal::pread}} to {{HdfsInternal::Pread}} to follow the 
Google C++ style guide.
5. Separate the implementation of the C API and the definition of 
{{HdfsInternal}} in different files.

Haven't closely looked into the logic yet -- the patch should be much smaller 
and cleaner after the above changes. Will do it in the next round.

> Implement a libhdfs(3) compatible API
> -
>
> Key: HDFS-8766
> URL: https://issues.apache.org/jira/browse/HDFS-8766
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-8766.HDFS-8707.000.patch, 
> HDFS-8766.HDFS-8707.001.patch, HDFS-8766.HDFS-8707.002.patch, 
> HDFS-8766.HDFS-8707.003.patch, HDFS-8766.HDFS-8707.004.patch, 
> HDFS-8766.HDFS-8707.005.patch, HDFS-8766.HDFS-8707.006.patch
>
>
> Add a synchronous API that is compatible with the hdfs.h header used in 
> libhdfs and libhdfs3.  This will make it possible for projects using 
> libhdfs/libhdfs3 to relink against libhdfspp with minimal changes.
> This also provides a pure C interface that can be linked against projects 
> that aren't built in C++11 mode for various reasons but use the same 
> compiler.  It also allows many other programming languages to access 
> libhdfspp through builtin FFI interfaces.
> The libhdfs API is very similar to the posix file API which makes it easier 
> for programs built using posix filesystem calls to be modified to access HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9255) Consolidate block recovery related implementation into a single class

2015-10-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961210#comment-14961210
 ] 

Hadoop QA commented on HDFS-9255:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  18m 13s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 54s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 33s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 25s | The applied patch generated  8 
new checkstyle issues (total was 512, now 493). |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 29s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   2m 34s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 13s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  50m 53s | Tests failed in hadoop-hdfs. |
| | |  97m 15s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.TestReplaceDatanodeOnFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12767097/HDFS-9255.03.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 52ac73f |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13033/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/13033/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13033/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13033/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13033/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13033/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13033/console |


This message was automatically generated.

> Consolidate block recovery related implementation into a single class
> -
>
> Key: HDFS-9255
> URL: https://issues.apache.org/jira/browse/HDFS-9255
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Walter Su
>Assignee: Walter Su
>Priority: Minor
> Attachments: HDFS-9255.01.patch, HDFS-9255.02.patch, 
> HDFS-9255.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9254) HDFS Secure Mode Documentation updates

2015-10-16 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961186#comment-14961186
 ] 

Arpit Agarwal commented on HDFS-9254:
-

The test failures do seem to be caused by my patch, oddly. I'll take a look.

bq. in the patch, you say that `d...@realm.tld` is allowed. I recall seeing 
some JIRAs where people were saying you get a stack trace unless you have the 
/HOST value of some kind or other
Let me verify that in a test cluster, thanks for looking at the patch.


> HDFS Secure Mode Documentation updates
> --
>
> Key: HDFS-9254
> URL: https://issues.apache.org/jira/browse/HDFS-9254
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.1
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-9254.01.patch
>
>
> Some Kerberos configuration parameters are not documented well enough. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9253) Refactor tests of libhdfs into a directory

2015-10-16 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-9253:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

I've committed the patch to trunk and branch-2. Thanks [~wheat9] for the 
contribution.

> Refactor tests of libhdfs into a directory
> --
>
> Key: HDFS-9253
> URL: https://issues.apache.org/jira/browse/HDFS-9253
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.8.0
>
> Attachments: HDFS-9253.000.patch, HDFS-9253.001.patch, 
> HDFS-9253.002.patch
>
>
> This jira proposes to refactor the current tests in libhdfs into a separate 
> directory. The refactor opens up the opportunity to reuse tests in libhdfs, 
> libwebhdfs and libhdfspp in HDFS-8707, and to also allow cross-validation of 
> different implementations of the libhdfs API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-9253) Refactor tests of libhdfs into a directory

2015-10-16 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961182#comment-14961182
 ] 

Haohui Mai edited comment on HDFS-9253 at 10/16/15 6:42 PM:


I've committed the patch to trunk and branch-2. Thanks Jing for the reviews.


was (Author: wheat9):
I've committed the patch to trunk and branch-2. Thanks [~wheat9] for the 
contribution.

> Refactor tests of libhdfs into a directory
> --
>
> Key: HDFS-9253
> URL: https://issues.apache.org/jira/browse/HDFS-9253
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.8.0
>
> Attachments: HDFS-9253.000.patch, HDFS-9253.001.patch, 
> HDFS-9253.002.patch
>
>
> This jira proposes to refactor the current tests in libhdfs into a separate 
> directory. The refactor opens up the opportunity to reuse tests in libhdfs, 
> libwebhdfs and libhdfspp in HDFS-8707, and to also allow cross-validation of 
> different implementations of the libhdfs API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-16 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-7435.002.patch

Merged with latest head

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch
>
>
> This patch changes the data structures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC-friendly handling of full 
> block reports.
> Would like to hear people's feedback on this change, and also to get some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.
> There seem to be some timing issues I hit when testing the patch; not sure 
> if it is a bug in the patch or something else (most likely the former)...
> Tests that fail for me:
> The issue seems to be that the blocks are not on any storage, so no 
> replication can occur, causing the tests to fail in different ways.
> TestDecomission.testDecommision
> If I add a little sleep after the cleanup/delete, things seem to work.
> TestDFSStripedOutputStreamWithFailure
> A couple of tests fail in this class.
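As a rough illustration of why keeping the structures sorted helps full block 
reports (simplified types, not the patch's actual code): a report sorted by 
block ID can be diffed against the sorted stored state in one linear pass, 
with no per-report hash tables to allocate and collect.

{code}
import java.util.ArrayList;
import java.util.List;

// Simplified diff of a sorted full block report against sorted stored
// replicas. Both arrays must be sorted ascending by block ID.
class FbrMergeSketch {
  static void diff(long[] reported, long[] stored,
                   List<Long> toAdd, List<Long> toRemove) {
    int i = 0, j = 0;
    while (i < reported.length && j < stored.length) {
      if (reported[i] == stored[j]) {        // replica already known
        i++; j++;
      } else if (reported[i] < stored[j]) {  // newly reported replica
        toAdd.add(reported[i++]);
      } else {                               // stored but no longer reported
        toRemove.add(stored[j++]);
      }
    }
    while (i < reported.length) { toAdd.add(reported[i++]); }
    while (j < stored.length) { toRemove.add(stored[j++]); }
  }

  public static void main(String[] args) {
    List<Long> add = new ArrayList<>();
    List<Long> remove = new ArrayList<>();
    diff(new long[] {1, 2, 4}, new long[] {2, 3, 4}, add, remove);
    System.out.println("add=" + add + ", remove=" + remove); // add=[1], remove=[3]
  }
}
{code}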



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9253) Refactor tests of libhdfs into a directory

2015-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961174#comment-14961174
 ] 

Hudson commented on HDFS-9253:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8652 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8652/])
HDFS-9253. Refactor tests of libhdfs into a directory. Contributed by (wheat9: 
rev 79b8d60d085ae196b05ff4ab511ff89f652e3c55)
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_ops.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_threaded.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/native_mini_dfs.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/expect.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test_libhdfs_threaded.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_read.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/test/fuse_workload.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/hdfs_test.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_threaded.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_write.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_trash.h
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test_native_mini_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_ops.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_native_mini_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/native_mini_dfs.h
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_trash.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/hdfs_http_client.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/hdfs_web.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/expect.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_zerocopy.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_ops.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/native_mini_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/hdfs_json_parser.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/vecsum.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_connect.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs_test.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_write.c
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/exception.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/contrib/libwebhdfs/src/test_libwebhdfs_read.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_stat_struct.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_zerocopy.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_file_handle.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/test/test_fuse_dfs.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_htable.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_read.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_libhdfs_write.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/CMakeLists.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/test/test_htable.c
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/expect.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_context_handle.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/expect.c
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/native_mini_dfs.h
* hadoop-hdfs-project/hadoop-hdfs-native-client/pom.xml
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/sr

[jira] [Commented] (HDFS-9208) Disabling atime may fail clients like distCp

2015-10-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961132#comment-14961132
 ] 

Hadoop QA commented on HDFS-9208:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  20m 45s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m 59s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  11m 40s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 39s | The applied patch generated  1 
new checkstyle issues (total was 14, now 15). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 42s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 38s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 53s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 41s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  69m 14s | Tests failed in hadoop-hdfs. |
| | | 121m 37s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestReplaceDatanodeOnFailure |
|   | hadoop.hdfs.TestRecoverStripedFile |
|   | hadoop.hdfs.TestCrcCorruption |
|   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12767090/HDFS-9208.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 52ac73f |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13032/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/13032/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13032/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13032/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13032/console |


This message was automatically generated.

> Disabling atime may fail clients like distCp
> 
>
> Key: HDFS-9208
> URL: https://issues.apache.org/jira/browse/HDFS-9208
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Attachments: HDFS-9208.patch
>
>
> When atime is disabled, {{setTimes()}} throws an exception if the passed-in 
> atime is not -1. But since the atime distCp passes is not -1, distCp fails 
> when it tries to set the mtime and atime.
> There are several options:
> 1) make distCp check for 0 atime and call {{setTimes()}} with -1. I am not 
> very enthusiastic about it.
> 2) make NN also accept 0 atime in addition to -1, when the atime support is 
> disabled.
> 3) support setting mtime & atime regardless of the atime support.  The main 
> reason why atime is disabled is to avoid edit logging/syncing during 
> {{getBlockLocations()}} read calls. Explicit setting can be allowed.
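For illustration, option 2 is a small relaxation of the existing check, roughly 
as below (hypothetical helper, not the actual namenode code):

{code}
import java.io.IOException;

class SetTimesSketch {
  // Sketch of option 2: when atime support is disabled
  // (dfs.namenode.accesstime.precision == 0), accept 0 in addition to the
  // usual -1 "don't touch atime" sentinel, so distCp's setTimes() call
  // no longer fails.
  static void checkAtimeArg(long atime, long accessTimePrecision)
      throws IOException {
    boolean atimeDisabled = accessTimePrecision <= 0;
    if (atimeDisabled && atime != -1 && atime != 0) {
      throw new IOException("Access time for hdfs is not configured. "
          + "Please set dfs.namenode.accesstime.precision.");
    }
  }
}
{code}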



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9261) Erasure Coding: Skip encoding the data cells if all the parity data streamers are failed for the current block group

2015-10-16 Thread Rakesh R (JIRA)
Rakesh R created HDFS-9261:
--

 Summary: Erasure Coding: Skip encoding the data cells if all the 
parity data streamers are failed for the current block group
 Key: HDFS-9261
 URL: https://issues.apache.org/jira/browse/HDFS-9261
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
Priority: Minor


{{DFSStripedOutputStream}} will continue writing with the minimum number 
(dataBlockNum) of live datanodes. It won't replace the failed datanodes 
immediately for the current block group. Consider a case where all the parity 
data streamers have failed; it is then unnecessary to encode the data block cells 
and generate the parity data. This is a corner case where it can skip the 
{{writeParityCells()}} step.
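The corner case comes down to a guard of the following shape (illustrative 
names; the real logic would live in {{DFSStripedOutputStream}}):

{code}
import java.util.List;

// Sketch: if every parity streamer is dead for the current block group,
// encoding the data cells would produce parity that nobody can write,
// so writeParityCells() can be skipped entirely.
class ParitySkipSketch {
  static class Streamer {
    private final boolean failed;
    Streamer(boolean failed) { this.failed = failed; }
    boolean isFailed() { return failed; }
  }

  static boolean shouldWriteParityCells(List<Streamer> parityStreamers) {
    for (Streamer s : parityStreamers) {
      if (!s.isFailed()) {
        return true;   // at least one parity target is still alive
      }
    }
    return false;      // all parity streamers failed: skip encoding
  }
}
{code}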



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9129) Move the safemode block count into BlockManager

2015-10-16 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961110#comment-14961110
 ] 

Mingliang Liu commented on HDFS-9129:
-

The failing tests pass locally. I'm addressing the findbugs and checkstyle 
warnings.

> Move the safemode block count into BlockManager
> ---
>
> Key: HDFS-9129
> URL: https://issues.apache.org/jira/browse/HDFS-9129
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Attachments: HDFS-9129.000.patch, HDFS-9129.001.patch, 
> HDFS-9129.002.patch, HDFS-9129.003.patch, HDFS-9129.004.patch, 
> HDFS-9129.005.patch
>
>
> The {{SafeMode}} needs to track whether there are enough blocks so that the 
> NN can get out of the safemode. These fields can be moved to the 
> {{BlockManager}} class.
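For context, the block-count side of safemode reduces to a threshold check of 
roughly this shape (a simplified sketch with illustrative field names, not the 
patch itself):

{code}
// Simplified sketch: the NN may leave safemode once enough blocks have
// reached minimal replication, per dfs.namenode.safemode.threshold-pct.
class SafeModeCountSketch {
  private final long blockTotal;    // total blocks expected in the namespace
  private final double threshold;   // fraction of blocks that must be safe
  private long blockSafe;           // blocks with >= minReplication replicas

  SafeModeCountSketch(long blockTotal, double threshold) {
    this.blockTotal = blockTotal;
    this.threshold = threshold;
  }

  synchronized void incrementSafeBlockCount() { blockSafe++; }
  synchronized void decrementSafeBlockCount() { blockSafe--; }

  synchronized boolean areThresholdsMet() {
    return blockSafe >= (long) Math.ceil(threshold * blockTotal);
  }
}
{code}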



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9098) Erasure coding: emulate race conditions among striped streamers in write pipeline

2015-10-16 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-9098:

Affects Version/s: 3.0.0

> Erasure coding: emulate race conditions among striped streamers in write 
> pipeline
> -
>
> Key: HDFS-9098
> URL: https://issues.apache.org/jira/browse/HDFS-9098
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>
> Apparently the interleaving of events among {{StripedDataStreamer}}s is very 
> tricky to handle. [~walter.k.su] and [~jingzhao] have discussed several race 
> conditions under HDFS-9040.
> Let's use FaultInjector to emulate different combinations of interleaved 
> events.
> In particular, we should consider injecting delays in the following places:
> # {{Streamer#endBlock}}
> # {{Streamer#locateFollowingBlock}}
> # {{Streamer#updateBlockForPipeline}}
> # {{Streamer#updatePipeline}}
> # {{OutputStream#writeChunk}}
> # {{OutputStream#close}}
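The injector pattern itself is small; a minimal sketch below, with hypothetical 
class names in the style of existing injectors such as 
{{DataNodeFaultInjector}}: production code calls no-op hooks, and a test swaps 
in a subclass that sleeps to force a particular interleaving.

{code}
// Minimal fault-injector sketch (hypothetical names).
class StripedStreamerInjectorSketch {
  private static StripedStreamerInjectorSketch instance =
      new StripedStreamerInjectorSketch();

  static StripedStreamerInjectorSketch get() { return instance; }
  static void set(StripedStreamerInjectorSketch injector) { instance = injector; }

  // No-op hooks, one per interesting point in the write pipeline.
  void beforeEndBlock() {}
  void beforeLocateFollowingBlock() {}
  void beforeUpdatePipeline() {}
}

// In a test: hold one streamer back to emulate a race.
class DelayingInjectorSketch extends StripedStreamerInjectorSketch {
  @Override
  void beforeEndBlock() {
    try {
      Thread.sleep(1000);   // delay this streamer's endBlock by one second
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
{code}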



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9098) Erasure coding: emulate race conditions among striped streamers in write pipeline

2015-10-16 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-9098:

Component/s: erasure-coding

> Erasure coding: emulate race conditions among striped streamers in write 
> pipeline
> -
>
> Key: HDFS-9098
> URL: https://issues.apache.org/jira/browse/HDFS-9098
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>
> Apparently the interleaving of events among {{StripedDataStreamer}}s is very 
> tricky to handle. [~walter.k.su] and [~jingzhao] have discussed several race 
> conditions under HDFS-9040.
> Let's use FaultInjector to emulate different combinations of interleaved 
> events.
> In particular, we should consider injecting delays in the following places:
> # {{Streamer#endBlock}}
> # {{Streamer#locateFollowingBlock}}
> # {{Streamer#updateBlockForPipeline}}
> # {{Streamer#updatePipeline}}
> # {{OutputStream#writeChunk}}
> # {{OutputStream#close}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9252) Change TestFileTruncate to use FsDatasetTestUtils to get block file size and genstamp.

2015-10-16 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-9252:

Summary: Change TestFileTruncate to use FsDatasetTestUtils to get block 
file size and genstamp.  (was: Change TestFileTruncate to FsDatasetTestUtils to 
get block file size and genstamp.)

> Change TestFileTruncate to use FsDatasetTestUtils to get block file size and 
> genstamp.
> --
>
> Key: HDFS-9252
> URL: https://issues.apache.org/jira/browse/HDFS-9252
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-9252.00.patch
>
>
> {{TestFileTruncate}} verifies block size and genstamp by directly accessing 
> the local filesystem, e.g.:
> {code}
> assertTrue(cluster.getBlockMetadataFile(dn0,
>     newBlock.getBlock()).getName().endsWith(
>     newBlock.getBlock().getGenerationStamp() + ".meta"));
> {code}
> Let's abstract the fsdataset-specific logic behind FsDatasetTestUtils.
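The abstraction might take roughly this shape (the method names are 
illustrative, not the actual FsDatasetTestUtils API): tests ask the utils for 
block metadata instead of parsing file names on the local disk.

{code}
import java.io.IOException;

// Illustrative sketch: an FsDataset-aware test interface, so assertions
// work even for dataset implementations that don't store blocks as local
// files. Method names here are hypothetical.
interface FsDatasetTestUtilsSketch {
  long getStoredDataLength(String blockPoolId, long blockId) throws IOException;
  long getStoredGenerationStamp(String blockPoolId, long blockId) throws IOException;
}
{code}

A test would then assert {{expectedGenStamp == utils.getStoredGenerationStamp(bpid, blockId)}} 
rather than checking that a file name ends with ".meta".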



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-16 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-9260:

Assignee: Staffan Friberg

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch
>
>
> This patch changes the data structures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC-friendly handling of full 
> block reports.
> Would like to hear people's feedback on this change, and also to get some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.
> There seem to be some timing issues I hit when testing the patch; not sure 
> if it is a bug in the patch or something else (most likely the former)...
> Tests that fail for me:
> The issue seems to be that the blocks are not on any storage, so no 
> replication can occur, causing the tests to fail in different ways.
> TestDecomission.testDecommision
> If I add a little sleep after the cleanup/delete, things seem to work.
> TestDFSStripedOutputStreamWithFailure
> A couple of tests fail in this class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9259) Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario

2015-10-16 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961091#comment-14961091
 ] 

Mingliang Liu commented on HDFS-9259:
-

Hi [~mingma], can I work on this, if we reach consensus on the issue itself?

> Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario
> --
>
> Key: HDFS-9259
> URL: https://issues.apache.org/jira/browse/HDFS-9259
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>
> We recently found that cross-DC hdfs write could be really slow. Further 
> investigation identified that this is due to the SendBufferSize and 
> ReceiveBufferSize used for hdfs write. The test is to do "hadoop fs 
> -copyFromLocal" of a 256MB file across DC with different SendBufferSize and 
> ReceiveBufferSize values. The results showed that c is much faster than b, 
> and b is faster than a.
> a. SendBufferSize=128k, ReceiveBufferSize=128k (hdfs default setting).
> b. SendBufferSize=128K, ReceiveBufferSize=not set(TCP auto tuning).
> c. SendBufferSize=not set, ReceiveBufferSize=not set(TCP auto tuning for both)
> HDFS-8829 has enabled scenario b. We would like to enable scenario c to make 
> SendBufferSize configurable at DFSClient side. Cc: [~cmccabe] [~He Tianyi] 
> [~kanaka] [~vinayrpet].
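At the socket level the proposal comes down to making one call conditional 
(plain java.net sketch; the configuration key feeding 
{{configuredSendBufferSize}} would be new and is assumed here):

{code}
import java.io.IOException;
import java.net.Socket;

class SendBufferSketch {
  // Only set SO_SNDBUF when a positive size is configured; leaving it
  // unset lets the kernel's TCP auto-tuning grow the buffer, which is
  // what made scenario (c) fast for cross-DC writes.
  static void maybeSetSendBuffer(Socket sock, int configuredSendBufferSize)
      throws IOException {
    if (configuredSendBufferSize > 0) {
      sock.setSendBufferSize(configuredSendBufferSize);
    }
  }
}
{code}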



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9257) improve error message for "Absolute path required" in INode.java to contain the rejected path

2015-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961087#comment-14961087
 ] 

Hudson commented on HDFS-9257:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2491 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2491/])
HDFS-9257. improve error message for "Absolute path required" in (harsh: rev 
52ac73f344e822e41457582f82abb4f35eba9dec)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java


> improve error message for "Absolute path required" in INode.java to contain 
> the rejected path
> -
>
> Key: HDFS-9257
> URL: https://issues.apache.org/jira/browse/HDFS-9257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Marcell Szabo
>Assignee: Marcell Szabo
>Priority: Trivial
> Fix For: 2.8.0
>
> Attachments: HDFS-9257.000.patch
>
>
> throw new AssertionError("Absolute path required");
> The message should also show the rejected path to help debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9205) Do not schedule corrupt blocks for replication

2015-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961086#comment-14961086
 ] 

Hudson commented on HDFS-9205:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2491 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2491/])
Revert "Move HDFS-9205 to trunk in CHANGES.txt." (szetszwo: rev 
a554701fe4402ae30461e2ef165cb60970a202a0)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Do not schedule corrupt blocks for replication
> --
>
> Key: HDFS-9205
> URL: https://issues.apache.org/jira/browse/HDFS-9205
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: h9205_20151007.patch, h9205_20151007b.patch, 
> h9205_20151008.patch, h9205_20151009.patch, h9205_20151009b.patch, 
> h9205_20151013.patch, h9205_20151015.patch
>
>
> Corrupted blocks by definition are blocks that cannot be read. As a consequence, 
> they cannot be replicated.  In UnderReplicatedBlocks, there is a queue for 
> QUEUE_WITH_CORRUPT_BLOCKS and chooseUnderReplicatedBlocks may choose blocks 
> from it.  It seems that scheduling corrupted blocks for replication wastes 
> resources and potentially slows down replication for the higher-priority blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9253) Refactor tests of libhdfs into a directory

2015-10-16 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961075#comment-14961075
 ] 

Jing Zhao commented on HDFS-9253:
-

+1

> Refactor tests of libhdfs into a directory
> --
>
> Key: HDFS-9253
> URL: https://issues.apache.org/jira/browse/HDFS-9253
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-9253.000.patch, HDFS-9253.001.patch, 
> HDFS-9253.002.patch
>
>
> This jira proposes to refactor the current tests in libhdfs into a separate 
> directory. The refactor opens up the opportunity to reuse tests in libhdfs, 
> libwebhdfs and libhdfspp in HDFS-8707, and to also allow cross-validation of 
> different implementations of the libhdfs API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9257) improve error message for "Absolute path required" in INode.java to contain the rejected path

2015-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961049#comment-14961049
 ] 

Hudson commented on HDFS-9257:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #542 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/542/])
HDFS-9257. improve error message for "Absolute path required" in (harsh: rev 
52ac73f344e822e41457582f82abb4f35eba9dec)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java


> improve error message for "Absolute path required" in INode.java to contain 
> the rejected path
> -
>
> Key: HDFS-9257
> URL: https://issues.apache.org/jira/browse/HDFS-9257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Marcell Szabo
>Assignee: Marcell Szabo
>Priority: Trivial
> Fix For: 2.8.0
>
> Attachments: HDFS-9257.000.patch
>
>
> throw new AssertionError("Absolute path required");
> The message should also show the rejected path to help debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9205) Do not schedule corrupt blocks for replication

2015-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961048#comment-14961048
 ] 

Hudson commented on HDFS-9205:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #542 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/542/])
Revert "Move HDFS-9205 to trunk in CHANGES.txt." (szetszwo: rev 
a554701fe4402ae30461e2ef165cb60970a202a0)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Do not schedule corrupt blocks for replication
> --
>
> Key: HDFS-9205
> URL: https://issues.apache.org/jira/browse/HDFS-9205
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: h9205_20151007.patch, h9205_20151007b.patch, 
> h9205_20151008.patch, h9205_20151009.patch, h9205_20151009b.patch, 
> h9205_20151013.patch, h9205_20151015.patch
>
>
> Corrupted blocks by definition are blocks that cannot be read. As a consequence, 
> they cannot be replicated.  In UnderReplicatedBlocks, there is a queue for 
> QUEUE_WITH_CORRUPT_BLOCKS and chooseUnderReplicatedBlocks may choose blocks 
> from it.  It seems that scheduling corrupted blocks for replication wastes 
> resources and potentially slows down replication for the higher-priority blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9208) Disabling atime may fail clients like distCp

2015-10-16 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-9208:
-
Target Version/s: 2.8.0
  Status: Patch Available  (was: Open)

> Disabling atime may fail clients like distCp
> 
>
> Key: HDFS-9208
> URL: https://issues.apache.org/jira/browse/HDFS-9208
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Attachments: HDFS-9208.patch
>
>
> When atime is disabled, {{setTimes()}} throws an exception if the passed-in 
> atime is not -1. But since the atime distCp passes is not -1, distCp fails 
> when it tries to set the mtime and atime.
> There are several options:
> 1) make distCp check for 0 atime and call {{setTimes()}} with -1. I am not 
> very enthusiastic about it.
> 2) make NN also accept 0 atime in addition to -1, when the atime support is 
> disabled.
> 3) support setting mtime & atime regardless of the atime support.  The main 
> reason why atime is disabled is to avoid edit logging/syncing during 
> {{getBlockLocations()}} read calls. Explicit setting can be allowed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-16 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS Block and Replica Management 20151013.pdf

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch
>
>
> This patch changes the data structures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC-friendly handling of full 
> block reports.
> Would like to hear people's feedback on this change, and also to get some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.
> There seem to be some timing issues I hit when testing the patch; not sure 
> if it is a bug in the patch or something else (most likely the former)...
> Tests that fail for me:
> The issue seems to be that the blocks are not on any storage, so no 
> replication can occur, causing the tests to fail in different ways.
> TestDecomission.testDecommision
> If I add a little sleep after the cleanup/delete, things seem to work.
> TestDFSStripedOutputStreamWithFailure
> A couple of tests fail in this class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-16 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: (was: HDFS Block and Replica Management 20151013.pdf)

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch
>
>
> This patch changes the data structures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC-friendly handling of full 
> block reports.
> Would like to hear people's feedback on this change, and also to get some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.
> There seem to be some timing issues I hit when testing the patch; not sure 
> if it is a bug in the patch or something else (most likely the former)...
> Tests that fail for me:
> The issue seems to be that the blocks are not on any storage, so no 
> replication can occur, causing the tests to fail in different ways.
> TestDecomission.testDecommision
> If I add a little sleep after the cleanup/delete, things seem to work.
> TestDFSStripedOutputStreamWithFailure
> A couple of tests fail in this class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-16 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS Block and Replica Management 20151013.pdf

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch
>
>
> This patch changes the data structures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC-friendly handling of full 
> block reports.
> Would like to hear people's feedback on this change, and also to get some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.
> There seem to be some timing issues I hit when testing the patch; not sure 
> if it is a bug in the patch or something else (most likely the former)...
> Tests that fail for me:
> The issue seems to be that the blocks are not on any storage, so no 
> replication can occur, causing the tests to fail in different ways.
> TestDecomission.testDecommision
> If I add a little sleep after the cleanup/delete, things seem to work.
> TestDFSStripedOutputStreamWithFailure
> A couple of tests fail in this class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-16 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-7435.001.patch

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
> Attachments: HDFS-7435.001.patch
>
>
> This patch changes the data structures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC-friendly handling of full 
> block reports.
> Would like to hear people's feedback on this change, and also to get some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.
> There seem to be some timing issues I hit when testing the patch; not sure 
> if it is a bug in the patch or something else (most likely the former)...
> Tests that fail for me:
> The issue seems to be that the blocks are not on any storage, so no 
> replication can occur, causing the tests to fail in different ways.
> TestDecomission.testDecommision
> If I add a little sleep after the cleanup/delete, things seem to work.
> TestDFSStripedOutputStreamWithFailure
> A couple of tests fail in this class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9257) improve error message for "Absolute path required" in INode.java to contain the rejected path

2015-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961037#comment-14961037
 ] 

Hudson commented on HDFS-9257:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #557 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/557/])
HDFS-9257. improve error message for "Absolute path required" in (harsh: rev 
52ac73f344e822e41457582f82abb4f35eba9dec)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java


> improve error message for "Absolute path required" in INode.java to contain 
> the rejected path
> -
>
> Key: HDFS-9257
> URL: https://issues.apache.org/jira/browse/HDFS-9257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Marcell Szabo
>Assignee: Marcell Szabo
>Priority: Trivial
> Fix For: 2.8.0
>
> Attachments: HDFS-9257.000.patch
>
>
> throw new AssertionError("Absolute path required");
> The message should also show the rejected path to help debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9205) Do not schedule corrupt blocks for replication

2015-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961036#comment-14961036
 ] 

Hudson commented on HDFS-9205:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #557 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/557/])
Revert "Move HDFS-9205 to trunk in CHANGES.txt." (szetszwo: rev 
a554701fe4402ae30461e2ef165cb60970a202a0)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Do not schedule corrupt blocks for replication
> --
>
> Key: HDFS-9205
> URL: https://issues.apache.org/jira/browse/HDFS-9205
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: h9205_20151007.patch, h9205_20151007b.patch, 
> h9205_20151008.patch, h9205_20151009.patch, h9205_20151009b.patch, 
> h9205_20151013.patch, h9205_20151015.patch
>
>
> Corrupted blocks by definition are blocks that cannot be read. As a consequence, 
> they cannot be replicated.  In UnderReplicatedBlocks, there is a queue for 
> QUEUE_WITH_CORRUPT_BLOCKS and chooseUnderReplicatedBlocks may choose blocks 
> from it.  It seems that scheduling corrupted blocks for replication wastes 
> resources and potentially slows down replication for the higher-priority blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-16 Thread Staffan Friberg (JIRA)
Staffan Friberg created HDFS-9260:
-

 Summary: Improve performance and GC friendliness of startup and 
FBRs
 Key: HDFS-9260
 URL: https://issues.apache.org/jira/browse/HDFS-9260
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, namenode, performance
Affects Versions: 2.7.1
Reporter: Staffan Friberg


This patch changes the data structures used for BlockInfos and Replicas to keep 
them sorted. This allows faster and more GC-friendly handling of full block 
reports.

Would like to hear people's feedback on this change, and also to get some help 
investigating/understanding a few outstanding issues if we are interested in 
moving forward with this.

There seem to be some timing issues I hit when testing the patch; not sure if 
it is a bug in the patch or something else (most likely the former)...

Tests that fail for me:

   The issue seems to be that the blocks are not on any storage, so no 
replication can occur, causing the tests to fail in different ways.

   TestDecomission.testDecommision
   If I add a little sleep after the cleanup/delete, things seem to work.
   TestDFSStripedOutputStreamWithFailure
   A couple of tests fail in this class.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9249) NPE thrown if an IOException is thrown in NameNode.

2015-10-16 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-9249:
--
Attachment: HDFS-9249.002.patch

Attaching rev2. This patch adds a test case that verifies the fix for the NPE 
when the authentication of the backup node is incorrectly configured.

Thanks [~steve_l] for thoughtful comments.
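For illustration, the supportability fix has two parts, sketched below with 
illustrative names (not the actual NameNode code): log the original IOException 
so the root cause is not masked, and make cleanup tolerate a partially 
initialized NameNode.

{code}
import java.io.IOException;
import java.util.logging.Logger;

class NameNodeInitSketch {
  private static final Logger LOG = Logger.getLogger("NameNodeInitSketch");
  private Object namesystem;   // stays null if initialize() throws early

  void construct() throws IOException {
    try {
      initialize();
    } catch (IOException e) {
      LOG.severe("Failed to start namenode: " + e);  // keep the root cause
      stop();                                        // must not NPE here
      throw e;
    }
  }

  void initialize() throws IOException {
    // may throw before namesystem is assigned
  }

  void stop() {
    if (namesystem != null) {
      // shut down only the services that were actually started
    }
  }
}
{code}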

> NPE thrown if an IOException is thrown in NameNode.
> -
>
> Key: HDFS-9249
> URL: https://issues.apache.org/jira/browse/HDFS-9249
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
>  Labels: supportability
> Attachments: HDFS-9249.001.patch, HDFS-9249.002.patch
>
>
> This issue was found when running test case 
> TestBackupNode.testCheckpointNode, but upon closer look, the problem is not 
> due to the test case.
> Looks like an IOException was thrown in
> try {
>   initializeGenericKeys(conf, nsId, namenodeId);
>   initialize(conf);
>   try {
> haContext.writeLock();
> state.prepareToEnterState(haContext);
> state.enterState(haContext);
>   } finally {
> haContext.writeUnlock();
>   }
> causing the namenode to stop before the namesystem was properly 
> instantiated, causing the NPE.
> I tried to reproduce locally, but to no avail.
> Because I could not reproduce the bug, and the log does not indicate what 
> caused the IOException, I suggest making this a supportability JIRA to log the 
> exception for future improvement.
> Stacktrace
> java.lang.NullPointerException: null
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.getFSImage(NameNode.java:906)
> at org.apache.hadoop.hdfs.server.namenode.BackupNode.stop(BackupNode.java:210)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:827)
> at 
> org.apache.hadoop.hdfs.server.namenode.BackupNode.(BackupNode.java:89)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1474)
> at 
> org.apache.hadoop.hdfs.server.namenode.TestBackupNode.startBackupNode(TestBackupNode.java:102)
> at 
> org.apache.hadoop.hdfs.server.namenode.TestBackupNode.testCheckpoint(TestBackupNode.java:298)
> at 
> org.apache.hadoop.hdfs.server.namenode.TestBackupNode.testCheckpointNode(TestBackupNode.java:130)
> The last few lines of log:
> 2015-10-14 19:45:07,807 INFO namenode.NameNode 
> (NameNode.java:createNameNode(1422)) - createNameNode [-checkpoint]
> 2015-10-14 19:45:07,807 INFO impl.MetricsSystemImpl 
> (MetricsSystemImpl.java:init(158)) - CheckpointNode metrics system started 
> (again)
> 2015-10-14 19:45:07,808 INFO namenode.NameNode 
> (NameNode.java:setClientNamenodeAddress(402)) - fs.defaultFS is 
> hdfs://localhost:37835
> 2015-10-14 19:45:07,808 INFO namenode.NameNode 
> (NameNode.java:setClientNamenodeAddress(422)) - Clients are to use 
> localhost:37835 to access this namenode/service.
> 2015-10-14 19:45:07,810 INFO hdfs.MiniDFSCluster 
> (MiniDFSCluster.java:shutdown(1708)) - Shutting down the Mini HDFS Cluster
> 2015-10-14 19:45:07,810 INFO namenode.FSNamesystem 
> (FSNamesystem.java:stopActiveServices(1298)) - Stopping services started for 
> active state
> 2015-10-14 19:45:07,811 INFO namenode.FSEditLog 
> (FSEditLog.java:endCurrentLogSegment(1228)) - Ending log segment 1
> 2015-10-14 19:45:07,811 INFO namenode.FSNamesystem 
> (FSNamesystem.java:run(5306)) - NameNodeEditLogRoller was interrupted, exiting
> 2015-10-14 19:45:07,811 INFO namenode.FSEditLog 
> (FSEditLog.java:printStatistics(703)) - Number of transactions: 3 Total time 
> for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of 
> syncs: 4 SyncTimes(ms): 2 1 
> 2015-10-14 19:45:07,811 INFO namenode.FSNamesystem 
> (FSNamesystem.java:run(5373)) - LazyPersistFileScrubber was interrupted, 
> exiting
> 2015-10-14 19:45:07,822 INFO namenode.FileJournalManager 
> (FileJournalManager.java:finalizeLogSegment(142)) - Finalizing edits file 
> /data/jenkins/workspace/CDH5.5.0-Hadoop-HDFS-2.6.0/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name1/current/edits_inprogress_001
>  -> 
> /data/jenkins/workspace/CDH5.5.0-Hadoop-HDFS-2.6.0/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name1/current/edits_001-003
> 2015-10-14 19:45:07,835 INFO namenode.FileJournalManager 
> (FileJournalManager.java:finalizeLogSegment(142)) - Finalizing edits file 
> /data/jenkins/workspace/CDH5.5.0-Hadoop-HDFS-2.6.0/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name2/current/edits_inprogress_001
>  -> 
> /data/jenkins/workspace/CDH5.5.0-Hadoop-HDFS-2.6.0/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name2/current/edits_

[jira] [Assigned] (HDFS-9098) Erasure coding: emulate race conditions among striped streamers in write pipeline

2015-10-16 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang reassigned HDFS-9098:
---

Assignee: Zhe Zhang

> Erasure coding: emulate race conditions among striped streamers in write 
> pipeline
> -
>
> Key: HDFS-9098
> URL: https://issues.apache.org/jira/browse/HDFS-9098
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>
> Apparently the interleaving of events among {{StripedDataStreamer}}s is very 
> tricky to handle. [~walter.k.su] and [~jingzhao] have discussed several race 
> conditions under HDFS-9040.
> Let's use FaultInjector to emulate different combinations of interleaved 
> events.
> In particular, we should consider injecting delays in the following places:
> # {{Streamer#endBlock}}
> # {{Streamer#locateFollowingBlock}}
> # {{Streamer#updateBlockForPipeline}}
> # {{Streamer#updatePipeline}}
> # {{OutputStream#writeChunk}}
> # {{OutputStream#close}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9254) HDFS Secure Mode Documentation updates

2015-10-16 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960995#comment-14960995
 ] 

Steve Loughran commented on HDFS-9254:
--

I'd thought "underdocumented" was a complete summary of the Kerberos info; good 
to see you trying to fix this.

In the patch, you say that `d...@realm.tld` is allowed. I recall seeing some 
JIRAs where people reported getting a stack trace unless they had a /HOST 
value of some kind in the principal.

> HDFS Secure Mode Documentation updates
> --
>
> Key: HDFS-9254
> URL: https://issues.apache.org/jira/browse/HDFS-9254
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.1
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-9254.01.patch
>
>
> Some Kerberos configuration parameters are not documented well enough. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9255) Consolidate block recovery related implementation into a single class

2015-10-16 Thread Walter Su (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Walter Su updated HDFS-9255:

Attachment: HDFS-9255.03.patch

> Consolidate block recovery related implementation into a single class
> -
>
> Key: HDFS-9255
> URL: https://issues.apache.org/jira/browse/HDFS-9255
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Walter Su
>Assignee: Walter Su
>Priority: Minor
> Attachments: HDFS-9255.01.patch, HDFS-9255.02.patch, 
> HDFS-9255.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9241) HDFS clients can't construct HdfsConfiguration instances

2015-10-16 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960987#comment-14960987
 ] 

Haohui Mai commented on HDFS-9241:
--

bq. the changes for the hdfs client classpath make instantiating 
HdfsConfiguration from the client impossible; it only lives server side. This 
breaks any app which creates one.

I'm trying to understand the use cases of applications creating an 
{{HdfsConfiguration}} instance. Is it because the apps need a way to force 
the hdfs configurations to be loaded?

Old applications can still depend on {{hadoop-hdfs}} and nothing will break. 
However, an application might need to change a couple of lines of code if it 
only wants to depend on {{hadoop-hdfs-client}}. Thoughts?
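
For what it's worth, if the only goal is to force hdfs-default.xml and 
hdfs-site.xml to be loaded, the couple-of-lines change could look roughly like 
this (a sketch using the public {{Configuration}} API instead of 
{{HdfsConfiguration}}):

{code}
import org.apache.hadoop.conf.Configuration;

public class ClientConf {
  static {
    // Register the hdfs resources as defaults once per JVM; this is what
    // instantiating HdfsConfiguration used to guarantee for callers.
    Configuration.addDefaultResource("hdfs-default.xml");
    Configuration.addDefaultResource("hdfs-site.xml");
  }

  public static Configuration create() {
    return new Configuration(); // now picks up hdfs-default/hdfs-site
  }
}
{code}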

> HDFS clients can't construct HdfsConfiguration instances
> 
>
> Key: HDFS-9241
> URL: https://issues.apache.org/jira/browse/HDFS-9241
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Steve Loughran
>Assignee: Mingliang Liu
> Attachments: HDFS-9241.000.patch
>
>
> the changes for the hdfs client classpath make instantiating 
> {{HdfsConfiguration}} from the client impossible; it only lives server side. 
> This breaks any app which creates one.
> I know people will look at the {{@Private}} tag and say "don't do that then", 
> but it's worth considering precisely why I, at least, do this: it's the only 
> way to guarantee that the hdfs-default and hdfs-site resources get on the 
> classpath, including all the security settings. It's precisely the use case 
> which {{HdfsConfigurationLoader.init();}} offers internally to the hdfs code.
> What am I meant to do now? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9241) HDFS clients can't construct HdfsConfiguration instances

2015-10-16 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960983#comment-14960983
 ] 

Haohui Mai commented on HDFS-9241:
--

bq. One other thing to consider is "would we expect thin clients to ever 
instantiate this class?". If so, should it be in that JAR.

My answer is no -- the current implementation has a class 
{{HdfsConfigurationLoader}} to load the configurations that serves the original 
purposes of {{HdfsConfiguration}} on the client side.

The reason is that {{HdfsConfiguration}} is used by both the client and the 
server side. It contains deprecated keys for the server side, which IMO should 
not be exposed to the clients at all.

> HDFS clients can't construct HdfsConfiguration instances
> 
>
> Key: HDFS-9241
> URL: https://issues.apache.org/jira/browse/HDFS-9241
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Steve Loughran
>Assignee: Mingliang Liu
> Attachments: HDFS-9241.000.patch
>
>
> the changes for the hdfs client classpath make instantiating 
> {{HdfsConfiguration}} from the client impossible; it only lives server side. 
> This breaks any app which creates one.
> I know people will look at the {{@Private}} tag and say "don't do that then", 
> but it's worth considering precisely why I, at least, do this: it's the only 
> way to guarantee that the hdfs-default and hdfs-site resources get on the 
> classpath, including all the security settings. It's precisely the use case 
> which {{HdfsConfigurationLoader.init();}} offers internally to the hdfs code.
> What am I meant to do now? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9250) LocatedBlock#addCachedLoc may throw ArrayStoreException when cache is empty

2015-10-16 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-9250:

Status: Open  (was: Patch Available)

> LocatedBlock#addCachedLoc may throw ArrayStoreException when cache is empty
> ---
>
> Key: HDFS-9250
> URL: https://issues.apache.org/jira/browse/HDFS-9250
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-9250.001.patch
>
>
> We may see the following exception:
> {noformat}
> java.lang.ArrayStoreException
> at java.util.ArrayList.toArray(ArrayList.java:389)
> at 
> org.apache.hadoop.hdfs.protocol.LocatedBlock.addCachedLoc(LocatedBlock.java:205)
> at 
> org.apache.hadoop.hdfs.server.namenode.CacheManager.setCachedLocations(CacheManager.java:907)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1974)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873)
> {noformat}
> The cause is that in LocatedBlock.java, in {{addCachedLoc}}:
> - The passed-in parameter {{loc}}, which is of type {{DatanodeDescriptor}}, is 
> added to {{cachedList}}
> - {{cachedList}} was assigned from {{EMPTY_LOCS}}, which is of type 
> {{DatanodeInfoWithStorage}}.
> Both {{DatanodeDescriptor}} and {{DatanodeInfoWithStorage}} are subclasses of 
> {{DatanodeInfo}} but do not inherit from each other, resulting in the 
> ArrayStoreException.
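
The failure mode is the standard {{List#toArray(T[])}} pitfall; a self-contained 
illustration with stand-in classes (not the real datanode types):

{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ArrayStoreDemo {
  static class Info {}                      // stands in for DatanodeInfo
  static class Descriptor extends Info {}   // stands in for DatanodeDescriptor
  static class WithStorage extends Info {}  // stands in for DatanodeInfoWithStorage

  public static void main(String[] args) {
    List<Info> cached = new ArrayList<>(Arrays.asList(new WithStorage[0]));
    cached.add(new Descriptor());
    // toArray(T[]) copies elements into the supplied WithStorage[]; a
    // Descriptor cannot be stored there, so this throws ArrayStoreException.
    Info[] out = cached.toArray(new WithStorage[cached.size()]);
    System.out.println(Arrays.toString(out));
  }
}
{code}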



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9205) Do not schedule corrupt blocks for replication

2015-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960974#comment-14960974
 ] 

Hudson commented on HDFS-9205:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1278 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1278/])
Revert "Move HDFS-9205 to trunk in CHANGES.txt." (szetszwo: rev 
a554701fe4402ae30461e2ef165cb60970a202a0)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Do not schedule corrupt blocks for replication
> --
>
> Key: HDFS-9205
> URL: https://issues.apache.org/jira/browse/HDFS-9205
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: h9205_20151007.patch, h9205_20151007b.patch, 
> h9205_20151008.patch, h9205_20151009.patch, h9205_20151009b.patch, 
> h9205_20151013.patch, h9205_20151015.patch
>
>
> Corrupted blocks by definition are blocks that cannot be read. As a 
> consequence, they cannot be replicated.  In UnderReplicatedBlocks, there is a 
> queue for QUEUE_WITH_CORRUPT_BLOCKS, and chooseUnderReplicatedBlocks may choose 
> blocks from it.  It seems that scheduling corrupted blocks for replication 
> wastes resources and potentially slows down replication for the 
> higher-priority blocks.
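
A sketch of the proposed behavior (the constant and method shape are simplified, 
not the exact UnderReplicatedBlocks internals):

{code}
import java.util.ArrayList;
import java.util.List;

class ReplicationChooser {
  // Priority levels 0..4; assume the last level holds corrupt blocks.
  static final int QUEUE_WITH_CORRUPT_BLOCKS = 4;

  // Pick up to n blocks to schedule, never from the corrupt queue:
  // corrupt replicas cannot be read, so copying them is wasted work.
  static <B> List<B> choose(List<List<B>> queues, int n) {
    List<B> chosen = new ArrayList<>();
    for (int prio = 0; prio < queues.size() && chosen.size() < n; prio++) {
      if (prio == QUEUE_WITH_CORRUPT_BLOCKS) {
        continue; // the proposed change: skip corrupt blocks entirely
      }
      for (B b : queues.get(prio)) {
        if (chosen.size() >= n) {
          break;
        }
        chosen.add(b);
      }
    }
    return chosen;
  }
}
{code}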



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9257) improve error message for "Absolute path required" in INode.java to contain the rejected path

2015-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960975#comment-14960975
 ] 

Hudson commented on HDFS-9257:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1278 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1278/])
HDFS-9257. improve error message for "Absolute path required" in (harsh: rev 
52ac73f344e822e41457582f82abb4f35eba9dec)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java


> improve error message for "Absolute path required" in INode.java to contain 
> the rejected path
> -
>
> Key: HDFS-9257
> URL: https://issues.apache.org/jira/browse/HDFS-9257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Marcell Szabo
>Assignee: Marcell Szabo
>Priority: Trivial
> Fix For: 2.8.0
>
> Attachments: HDFS-9257.000.patch
>
>
> throw new AssertionError("Absolute path required");
> message should also show the path to help debugging.
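
The fix amounts to including the rejected argument in the message, roughly (a 
sketch):

{code}
static void checkAbsolutePath(String path) {
  if (path == null || !path.startsWith("/")) {
    // include the offending path so the failure is debuggable
    throw new AssertionError("Absolute path required, but got '" + path + "'");
  }
}
{code}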



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9208) Disabling atime may fail clients like distCp

2015-10-16 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-9208:
-
Attachment: HDFS-9208.patch

The {{setTimes()}} call made through {{getBlockLocations()}} does not force an 
update, while an explicit call does. So it is a simple matter of removing the 
check.  The behavior of {{getBlockLocations()}} is not changed.
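
A rough sketch of the two call paths after the change (hypothetical names, not 
the actual FSNamesystem code):

{code}
class AtimeSketch {
  private final long accessTimePrecision; // 0 means atime support is disabled

  AtimeSketch(long precision) { this.accessTimePrecision = precision; }

  // Explicit client call: with the check removed, setTimes(mtime, atime)
  // succeeds even when atime support is disabled, so distCp works.
  void setTimes(String src, long mtime, long atime) {
    applyTimes(src, mtime, atime);
  }

  // Read path (getBlockLocations): unchanged and still guarded, so disabling
  // atime keeps avoiding edit logging/syncing on reads.
  void onRead(String src, long now) {
    if (accessTimePrecision > 0) {
      applyTimes(src, -1, now);
    }
  }

  private void applyTimes(String src, long mtime, long atime) {
    // update inode times and log the edit (elided)
  }
}
{code}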

> Disabling atime may fail clients like distCp
> 
>
> Key: HDFS-9208
> URL: https://issues.apache.org/jira/browse/HDFS-9208
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Attachments: HDFS-9208.patch
>
>
> When atime is disabled, {{setTimes()}} throws an exception if the passed-in 
> atime is not -1.  But since atime is not -1, distCp fails when it tries to 
> set the mtime and atime. 
> There are several options:
> 1) make distCp check for 0 atime and call {{setTimes()}} with -1. I am not 
> very enthusiastic about it.
> 2) make NN also accept 0 atime in addition to -1, when the atime support is 
> disabled.
> 3) support setting mtime & atime regardless of the atime support.  The main 
> reason why atime is disabled is to avoid edit logging/syncing during 
> {{getBlockLocations()}} read calls. Explicit setting can be allowed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9259) Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario

2015-10-16 Thread Ming Ma (JIRA)
Ming Ma created HDFS-9259:
-

 Summary: Make SO_SNDBUF size configurable at DFSClient side for 
hdfs write scenario
 Key: HDFS-9259
 URL: https://issues.apache.org/jira/browse/HDFS-9259
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ming Ma


We recently found that cross-DC hdfs writes could be really slow. Further 
investigation identified that this is due to the SendBufferSize and 
ReceiveBufferSize used for hdfs writes. The test is to do "hadoop fs 
-copyFromLocal" of a 256MB file across DCs with different SendBufferSize and 
ReceiveBufferSize values. The results showed that c is much faster than b, and b 
is faster than a:

a. SendBufferSize=128k, ReceiveBufferSize=128k (hdfs default setting).
b. SendBufferSize=128k, ReceiveBufferSize=not set (TCP auto tuning).
c. SendBufferSize=not set, ReceiveBufferSize=not set (TCP auto tuning for both).

HDFS-8829 has enabled scenario b. We would like to enable scenario c to make 
SendBufferSize configurable at DFSClient side. Cc: [~cmccabe] [~He Tianyi] 
[~kanaka] [~vinayrpet].
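
At the socket level, the knob in question is a one-liner; a minimal sketch of a 
client-side setting for scenario c (the configuration key is up to the patch):

{code}
import java.net.Socket;
import java.net.SocketException;

public class SendBufferDemo {
  public static void configure(Socket sock, int sndBufBytes)
      throws SocketException {
    if (sndBufBytes > 0) {
      // Fixed buffer: caps throughput on high bandwidth-delay-product links.
      sock.setSendBufferSize(sndBufBytes);
    }
    // Otherwise leave it unset so the OS TCP auto-tuning can grow the
    // buffer as needed (scenario c above).
  }
}
{code}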



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9254) HDFS Secure Mode Documentation updates

2015-10-16 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960901#comment-14960901
 ] 

Arpit Agarwal commented on HDFS-9254:
-

Documentation-only patch, needs no new tests. Test failures are unrelated to 
the patch.

> HDFS Secure Mode Documentation updates
> --
>
> Key: HDFS-9254
> URL: https://issues.apache.org/jira/browse/HDFS-9254
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.1
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-9254.01.patch
>
>
> Some Kerberos configuration parameters are not documented well enough. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8287) DFSStripedOutputStream.writeChunk should not wait for writing parity

2015-10-16 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960884#comment-14960884
 ] 

Rakesh R commented on HDFS-8287:


Thank you [~kaisasak] for taking care of this; the latest patch mostly looks 
fine to me. I have a few more comments, could you please take a look?

# There are a few minor checkstyle warnings; please fix them.
# I failed to understand the purpose of the synchronization here. Is it required?
{code}
synchronized public CellBuffers flip()
{code}
# During DFSStripedOutputStream#closeImpl, I could see a corner case - the 
number of bytes reaches the stripe boundary. Assume writeParityCells() has 
submitted a parity generator task, and assume further that the client has 
invoked the #close() function. Now, generateParityCellsForLastStripe() will 
return false, and it's not waiting for the parity gen task queued for the 
previous cell, right? IMHO, we should have a mechanism to wait for any 
previously submitted parity gen task before closing.
{code}
private boolean generateParityCellsForLastStripe() {
  final long lastStripeSize = currentBlockGroupBytes % stripeDataSize();
  if (lastStripeSize == 0) {
    return false;
  }
{code}
# I think the executor service can be moved to DFSClient, rather than being 
created again and again for every DFSStripedOutputStream, isn't it? 
{code}
  private final ExecutorService executorService;
{code}
Also, I have one comment about {{Executors.newCachedThreadPool}}: it is 
unbounded, which means that you're opening the door for anyone to cripple your 
JVM by simply injecting more work into the service (DoS attack). Is there any 
specific reason to use a cached thread pool? If not, I would prefer 
Executors.newFixedThreadPool, or a ThreadPoolExecutor with a set maximum number 
of threads; see the sketch after this list.
# {{public}} class DoubleCellBuffer: please make it {{private}}. Also, you can 
reduce the methods to private visibility.
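
To illustrate the bounded-executor suggestion above (a sketch; the pool and 
queue sizes are arbitrary):

{code}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedExecutorDemo {
  public static ExecutorService create() {
    // Unbounded alternative: Executors.newCachedThreadPool() creates new
    // threads without limit under load, which is the DoS concern above.
    // Bounded: a fixed pool plus a bounded queue; excess submissions run
    // in the caller's thread instead of growing the JVM without bound.
    return new ThreadPoolExecutor(
        4, 4,                          // core and maximum pool size
        60L, TimeUnit.SECONDS,         // keep-alive for idle threads
        new ArrayBlockingQueue<>(64),  // bounded work queue
        new ThreadPoolExecutor.CallerRunsPolicy());
  }
}
{code}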

> DFSStripedOutputStream.writeChunk should not wait for writing parity 
> -
>
> Key: HDFS-8287
> URL: https://issues.apache.org/jira/browse/HDFS-8287
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Kai Sasaki
> Attachments: HDFS-8287-HDFS-7285.00.patch, 
> HDFS-8287-HDFS-7285.01.patch, HDFS-8287-HDFS-7285.02.patch, 
> HDFS-8287-HDFS-7285.03.patch, HDFS-8287-HDFS-7285.04.patch, 
> HDFS-8287-HDFS-7285.05.patch, HDFS-8287-HDFS-7285.06.patch, 
> HDFS-8287-HDFS-7285.07.patch, HDFS-8287-HDFS-7285.08.patch, 
> HDFS-8287-HDFS-7285.09.patch, HDFS-8287-HDFS-7285.10.patch, 
> HDFS-8287-HDFS-7285.11.patch, HDFS-8287-HDFS-7285.WIP.patch, 
> HDFS-8287-performance-report.pdf, HDFS-8287.12.patch, h8287_20150911.patch, 
> jstack-dump.txt
>
>
> When a striping cell is full, writeChunk computes and generates parity 
> packets.  It sequentially calls waitAndQueuePacket, so the user client cannot 
> continue to write data until it finishes.
> We should allow the user client to continue writing instead of blocking it 
> while parity is being written.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9257) improve error message for "Absolute path required" in INode.java to contain the rejected path

2015-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960879#comment-14960879
 ] 

Hudson commented on HDFS-9257:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8651 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8651/])
HDFS-9257. improve error message for "Absolute path required" in (harsh: rev 
52ac73f344e822e41457582f82abb4f35eba9dec)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> improve error message for "Absolute path required" in INode.java to contain 
> the rejected path
> -
>
> Key: HDFS-9257
> URL: https://issues.apache.org/jira/browse/HDFS-9257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Marcell Szabo
>Assignee: Marcell Szabo
>Priority: Trivial
> Fix For: 2.8.0
>
> Attachments: HDFS-9257.000.patch
>
>
> throw new AssertionError("Absolute path required");
> message should also show the path to help debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9205) Do not schedule corrupt blocks for replication

2015-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960870#comment-14960870
 ] 

Hudson commented on HDFS-9205:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8650 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8650/])
Revert "Move HDFS-9205 to trunk in CHANGES.txt." (szetszwo: rev 
a554701fe4402ae30461e2ef165cb60970a202a0)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Do not schedule corrupt blocks for replication
> --
>
> Key: HDFS-9205
> URL: https://issues.apache.org/jira/browse/HDFS-9205
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: h9205_20151007.patch, h9205_20151007b.patch, 
> h9205_20151008.patch, h9205_20151009.patch, h9205_20151009b.patch, 
> h9205_20151013.patch, h9205_20151015.patch
>
>
> Corrupted blocks by definition are blocks that cannot be read. As a 
> consequence, they cannot be replicated.  In UnderReplicatedBlocks, there is a 
> queue for QUEUE_WITH_CORRUPT_BLOCKS, and chooseUnderReplicatedBlocks may choose 
> blocks from it.  It seems that scheduling corrupted blocks for replication 
> wastes resources and potentially slows down replication for the 
> higher-priority blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9257) improve error message for "Absolute path required" in INode.java to contain the rejected path

2015-10-16 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HDFS-9257:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2. Thank you for the improvement contribution, 
Marcell; hope to see more!

> improve error message for "Absolute path required" in INode.java to contain 
> the rejected path
> -
>
> Key: HDFS-9257
> URL: https://issues.apache.org/jira/browse/HDFS-9257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Marcell Szabo
>Assignee: Marcell Szabo
>Priority: Trivial
> Fix For: 2.8.0
>
> Attachments: HDFS-9257.000.patch
>
>
> throw new AssertionError("Absolute path required");
> message should also show the path to help debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9257) improve error message for "Absolute path required" in INode.java to contain the rejected path

2015-10-16 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960854#comment-14960854
 ] 

Harsh J commented on HDFS-9257:
---

+1, failed tests are unrelated. Tests shouldn't be necessary for the trivial 
message improvement. Committing shortly.

> improve error message for "Absolute path required" in INode.java to contain 
> the rejected path
> -
>
> Key: HDFS-9257
> URL: https://issues.apache.org/jira/browse/HDFS-9257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Marcell Szabo
>Assignee: Marcell Szabo
>Priority: Trivial
> Attachments: HDFS-9257.000.patch
>
>
> throw new AssertionError("Absolute path required");
> message should also show the path to help debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-9258) NN should indicate which nodes are stale

2015-10-16 Thread Kuhu Shukla (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kuhu Shukla reassigned HDFS-9258:
-

Assignee: Kuhu Shukla

> NN should indicate which nodes are stale
> 
>
> Key: HDFS-9258
> URL: https://issues.apache.org/jira/browse/HDFS-9258
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Kuhu Shukla
>
> Determining why the NN is not coming out of safemode is difficult - is it a 
> bug or pending block reports?  If the number of nodes appears sufficient, but 
> there are missing blocks, it would be nice to know which nodes haven't sent 
> block reports (stale).  Instead of forcing the NN to leave safemode 
> prematurely, the SE can first force block reports from the stale nodes.
> The datanode report and the web ui's node list should contain this 
> information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9258) NN should indicate which nodes are stale

2015-10-16 Thread Daryn Sharp (JIRA)
Daryn Sharp created HDFS-9258:
-

 Summary: NN should indicate which nodes are stale
 Key: HDFS-9258
 URL: https://issues.apache.org/jira/browse/HDFS-9258
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha
Reporter: Daryn Sharp


Determining why the NN is not coming out of safemode is difficult - is it a bug 
or pending block reports?  If the number of nodes appears sufficient, but there 
are missing blocks, it would be nice to know which nodes haven't sent block 
reports (stale).  Instead of forcing the NN to leave safemode prematurely, the 
SE can first force block reports from the stale nodes.

The datanode report and the web ui's node list should contain this information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8898) Create API and command-line argument to get quota without need to get file and directory counts

2015-10-16 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960823#comment-14960823
 ] 

Ming Ma commented on HDFS-8898:
---

Thanks [~kihwal]! I will update the patch with new unit tests.

> Create API and command-line argument to get quota without need to get file 
> and directory counts
> ---
>
> Key: HDFS-8898
> URL: https://issues.apache.org/jira/browse/HDFS-8898
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs
>Reporter: Joep Rottinghuis
> Attachments: HDFS-8898.patch
>
>
> On large directory structures it takes significant time to iterate through 
> the file and directory counts recursively to get a complete ContentSummary.
> When you want to just check for the quota on a higher level directory it 
> would be good to have an option to skip the file and directory counts.
> Moreover, currently one can only check the quota if you have access to all 
> the directories underneath. For example, if I have a large home directory 
> under /user/joep and I host some files for another user in a sub-directory, 
> the moment they create an unreadable sub-directory under my home I can no 
> longer check what my quota is. Understood that I cannot check the current 
> file counts unless I can iterate through all the usage, but for 
> administrative purposes it is nice to be able to get the current quota 
> setting on a directory without the need to iterate through and run into 
> permission issues on sub-directories.
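
A sketch of the kind of call being asked for; {{getContentSummary}} is today's 
recursive path, while the quota-only method named below is hypothetical:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.ContentSummary;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class QuotaCheck {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path home = new Path("/user/joep");

    // Today: a recursive walk that needs access to every subdirectory.
    ContentSummary cs = fs.getContentSummary(home);
    System.out.println("quota=" + cs.getQuota()
        + " spaceQuota=" + cs.getSpaceQuota());

    // Asked for: a call answered from the directory inode alone, with no
    // recursion and no permission checks underneath, e.g.
    // QuotaUsage qu = fs.getQuotaUsage(home);  // hypothetical method
  }
}
{code}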



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9257) improve error message for "Absolute path required" in INode.java to contain the rejected path

2015-10-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960788#comment-14960788
 ] 

Hadoop QA commented on HDFS-9257:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  17m 49s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 53s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 17s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 22s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 28s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 28s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  9s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  51m 44s | Tests failed in hadoop-hdfs. |
| | |  97m  9s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.shortcircuit.TestShortCircuitCache |
|   | hadoop.hdfs.server.blockmanagement.TestNodeCount |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12767066/HDFS-9257.000.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / cf23f2c |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13031/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13031/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13031/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13031/console |


This message was automatically generated.

> improve error message for "Absolute path required" in INode.java to contain 
> the rejected path
> -
>
> Key: HDFS-9257
> URL: https://issues.apache.org/jira/browse/HDFS-9257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Marcell Szabo
>Assignee: Marcell Szabo
>Priority: Trivial
> Attachments: HDFS-9257.000.patch
>
>
> throw new AssertionError("Absolute path required");
> message should also show the path to help debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9241) HDFS clients can't construct HdfsConfiguration instances

2015-10-16 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960785#comment-14960785
 ] 

Steve Loughran commented on HDFS-9241:
--

One other thing to consider is "would we expect thin clients to ever 
instantiate this class?". If so, should it be in that JAR? 

Until now, creating it has been the way to force hdfs-site in, just as creating 
a {{YarnConfiguration()}} forced that in. After hitting problems with race 
conditions in UGI init, I now load all of these on startup. Should this be 
necessary?

> HDFS clients can't construct HdfsConfiguration instances
> 
>
> Key: HDFS-9241
> URL: https://issues.apache.org/jira/browse/HDFS-9241
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Steve Loughran
>Assignee: Mingliang Liu
> Attachments: HDFS-9241.000.patch
>
>
> the changes for the hdfs client classpath make instantiating 
> {{HdfsConfiguration}} from the client impossible; it only lives server side. 
> This breaks any app which creates one.
> I know people will look at the {{@Private}} tag and say "don't do that then", 
> but it's worth considering precisely why I, at least, do this: it's the only 
> way to guarantee that the hdfs-default and hdfs-site resources get on the 
> classpath, including all the security settings. It's precisely the use case 
> which {{HdfsConfigurationLoader.init();}} offers internally to the hdfs code.
> What am I meant to do now? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9239) DataNode Lifeline Protocol: an alternative protocol for reporting DataNode liveness

2015-10-16 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960776#comment-14960776
 ] 

Kihwal Lee commented on HDFS-9239:
--

It may not help much on the namenode side. Even on extremely busy clusters, I 
have not seen nodes missing heartbeats and being considered dead because of the 
contention among heartbeats, incremental block reports (IBR) and full block 
reports (FBR).  Well before node liveness is affected by an inundation of IBRs 
and FBRs, the namenode performance will degrade to an unacceptable level. It is 
really easy to test this: create a wide job that creates a lot of small files. 

However, making it lighter on the datanode side is a good idea. We have seen 
many cases where nodes are declared dead because the service actor thread is 
delayed/blocked. 

> DataNode Lifeline Protocol: an alternative protocol for reporting DataNode 
> liveness
> ---
>
> Key: HDFS-9239
> URL: https://issues.apache.org/jira/browse/HDFS-9239
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: DataNode-Lifeline-Protocol.pdf
>
>
> This issue proposes introduction of a new feature: the DataNode Lifeline 
> Protocol.  This is an RPC protocol that is responsible for reporting liveness 
> and basic health information about a DataNode to a NameNode.  Compared to the 
> existing heartbeat messages, it is lightweight and not prone to resource 
> contention problems that can harm accurate tracking of DataNode liveness 
> currently.  The attached design document contains more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-9239) DataNode Lifeline Protocol: an alternative protocol for reporting DataNode liveness

2015-10-16 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960776#comment-14960776
 ] 

Kihwal Lee edited comment on HDFS-9239 at 10/16/15 2:24 PM:


It may not help much on the namenode side. Even on extremely busy clusters, I 
have not seen nodes missing heartbeats and being considered dead because of the 
contention among heartbeats, incremental block reports (IBR) and full block 
reports (FBR).  Well before node liveness is affected by an inundation of IBRs 
and FBRs, the namenode performance will degrade to an unacceptable level. It is 
really easy to test this: create a wide job that creates a lot of small files. 

However, making it lighter on the datanode side is a good idea. We have seen 
many cases where nodes are declared dead because the service actor thread is 
delayed/blocked. 


was (Author: kihwal):
It may not help much with the namenode side. Even on extremely busy clusters, I 
have not seen nodes missing heartbeat and considered dead because of the 
contention among heartbeats, incremental block reports (IBR) and full block 
reports (FBR).  Well before node liveness is affected by inundation of IBRs and 
FBRs, the namenode performance will degrade to unacceptable level. It is really 
easy to test this. Create a wide job that creates a lot small files. 

However,making it lighter on the datanode side is a good idea. We have seen 
many cases where nodes are declared dead because the service actor thread is 
delayed/blocked. 

> DataNode Lifeline Protocol: an alternative protocol for reporting DataNode 
> liveness
> ---
>
> Key: HDFS-9239
> URL: https://issues.apache.org/jira/browse/HDFS-9239
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: DataNode-Lifeline-Protocol.pdf
>
>
> This issue proposes introduction of a new feature: the DataNode Lifeline 
> Protocol.  This is an RPC protocol that is responsible for reporting liveness 
> and basic health information about a DataNode to a NameNode.  Compared to the 
> existing heartbeat messages, it is lightweight and not prone to resource 
> contention problems that can harm accurate tracking of DataNode liveness 
> currently.  The attached design document contains more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9249) NPE thrown if an IOException is thrown in NameNode.

2015-10-16 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960775#comment-14960775
 ] 

Steve Loughran commented on HDFS-9249:
--

Don't worry about being new: welcome to the fun of debugging Hadoop from stack 
traces.

I agree: you may not have found the cause, but you have certainly found what 
would have triggered it, and verified that there's logging and a shutdown. 

Would you be able to derive a test from that? We shouldn't need to have a 
miniHDFS cluster spun up; just try to start an NN with those configuration 
options.
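
Such a test might look roughly like this (a sketch, assuming an unusable 
name-dir setting makes initialize(conf) throw; not tied to the original 
failure):

{code}
import static org.junit.Assert.fail;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.server.namenode.NameNode;
import org.junit.Test;

public class TestNameNodeInitFailure {
  @Test
  public void testInitFailureDoesNotNPE() throws Exception {
    Configuration conf = new Configuration();
    // A deliberately unusable storage dir, so initialize(conf) fails.
    conf.set(DFSConfigKeys.DFS_NAMENODE_NAME_DIR_KEY, "/dev/null/name");
    try {
      new NameNode(conf);
      fail("expected NameNode construction to fail");
    } catch (IOException expected) {
      // desired behavior: a logged IOException, not an NPE from stop()
    }
  }
}
{code}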



> NPE thrown if an IOException is thrown in NameNode.
> -
>
> Key: HDFS-9249
> URL: https://issues.apache.org/jira/browse/HDFS-9249
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
>  Labels: supportability
> Attachments: HDFS-9249.001.patch
>
>
> This issue was found when running test case 
> TestBackupNode.testCheckpointNode, but upon closer look, the problem is not 
> due to the test case.
> Looks like an IOException was thrown in
> try {
>   initializeGenericKeys(conf, nsId, namenodeId);
>   initialize(conf);
>   try {
>     haContext.writeLock();
>     state.prepareToEnterState(haContext);
>     state.enterState(haContext);
>   } finally {
>     haContext.writeUnlock();
>   }
> causing the namenode to stop, but the namesystem was not yet properly 
> instantiated, causing NPE.
> I tried to reproduce locally, but to no avail.
> Because I could not reproduce the bug, and the log does not indicate what 
> caused the IOException, I suggest making this a supportability JIRA to log the 
> exception for future improvement.
> Stacktrace
> java.lang.NullPointerException: null
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.getFSImage(NameNode.java:906)
> at org.apache.hadoop.hdfs.server.namenode.BackupNode.stop(BackupNode.java:210)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:827)
> at 
> org.apache.hadoop.hdfs.server.namenode.BackupNode.(BackupNode.java:89)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1474)
> at 
> org.apache.hadoop.hdfs.server.namenode.TestBackupNode.startBackupNode(TestBackupNode.java:102)
> at 
> org.apache.hadoop.hdfs.server.namenode.TestBackupNode.testCheckpoint(TestBackupNode.java:298)
> at 
> org.apache.hadoop.hdfs.server.namenode.TestBackupNode.testCheckpointNode(TestBackupNode.java:130)
> The last few lines of log:
> 2015-10-14 19:45:07,807 INFO namenode.NameNode 
> (NameNode.java:createNameNode(1422)) - createNameNode [-checkpoint]
> 2015-10-14 19:45:07,807 INFO impl.MetricsSystemImpl 
> (MetricsSystemImpl.java:init(158)) - CheckpointNode metrics system started 
> (again)
> 2015-10-14 19:45:07,808 INFO namenode.NameNode 
> (NameNode.java:setClientNamenodeAddress(402)) - fs.defaultFS is 
> hdfs://localhost:37835
> 2015-10-14 19:45:07,808 INFO namenode.NameNode 
> (NameNode.java:setClientNamenodeAddress(422)) - Clients are to use 
> localhost:37835 to access this namenode/service.
> 2015-10-14 19:45:07,810 INFO hdfs.MiniDFSCluster 
> (MiniDFSCluster.java:shutdown(1708)) - Shutting down the Mini HDFS Cluster
> 2015-10-14 19:45:07,810 INFO namenode.FSNamesystem 
> (FSNamesystem.java:stopActiveServices(1298)) - Stopping services started for 
> active state
> 2015-10-14 19:45:07,811 INFO namenode.FSEditLog 
> (FSEditLog.java:endCurrentLogSegment(1228)) - Ending log segment 1
> 2015-10-14 19:45:07,811 INFO namenode.FSNamesystem 
> (FSNamesystem.java:run(5306)) - NameNodeEditLogRoller was interrupted, exiting
> 2015-10-14 19:45:07,811 INFO namenode.FSEditLog 
> (FSEditLog.java:printStatistics(703)) - Number of transactions: 3 Total time 
> for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of 
> syncs: 4 SyncTimes(ms): 2 1 
> 2015-10-14 19:45:07,811 INFO namenode.FSNamesystem 
> (FSNamesystem.java:run(5373)) - LazyPersistFileScrubber was interrupted, 
> exiting
> 2015-10-14 19:45:07,822 INFO namenode.FileJournalManager 
> (FileJournalManager.java:finalizeLogSegment(142)) - Finalizing edits file 
> /data/jenkins/workspace/CDH5.5.0-Hadoop-HDFS-2.6.0/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name1/current/edits_inprogress_001
>  -> 
> /data/jenkins/workspace/CDH5.5.0-Hadoop-HDFS-2.6.0/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name1/current/edits_001-003
> 2015-10-14 19:45:07,835 INFO namenode.FileJournalManager 
> (FileJournalManager.java:finalizeLogSegment(142)) - Finalizing edits file 
> /data/jenkins/workspace/CDH5.5.0-Hadoop-HDFS-2.6.0/hadoop-hdfs-project/had

[jira] [Commented] (HDFS-7964) Add support for async edit logging

2015-10-16 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960721#comment-14960721
 ] 

Daryn Sharp commented on HDFS-7964:
---

# The thread is the only one calling the real logEdit, so the latest txid is 
the one it last logged.
# I changed it because the bookkeeper tests emitted nothing at all, which makes 
it really hard to debug.  I can undo it or change it to INFO (what I intended) 
if you like.
# It depends on the latest changes in HADOOP-12483.  I renamed a new method in 
HADOOP-10300 to be more explicit about its purpose. 

> Add support for async edit logging
> --
>
> Key: HDFS-7964
> URL: https://issues.apache.org/jira/browse/HDFS-7964
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.0.2-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-7964.patch, HDFS-7964.patch
>
>
> Edit logging is a major source of contention within the NN.  LogEdit is 
> called within the namespace write lock, while logSync is called outside of the 
> lock to allow greater concurrency.  The handler thread remains busy until 
> logSync returns, to provide the client with a durability guarantee for the 
> response.
> Write-heavy RPC load and/or slow IO causes handlers to stall in logSync.  
> Although the write lock is not held, readers are limited/starved and the call 
> queue fills.  Combining an edit log thread with the postponed RPC responses 
> from HADOOP-10300 will provide the same durability guarantee but immediately 
> free up the handlers.
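
The pattern under discussion, reduced to a sketch (hypothetical names; the real 
patch ties into the edit log and the postponed responses from HADOOP-10300):

{code}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class AsyncEditLogSketch {
  static final class Edit {
    final Runnable sendResponse; // postponed RPC response, HADOOP-10300 style
    Edit(Runnable r) { this.sendResponse = r; }
  }

  private final BlockingQueue<Edit> queue = new LinkedBlockingQueue<>();

  // Handler thread: enqueue under the namespace lock and return at once,
  // freeing the handler instead of stalling it in logSync.
  void logEditAsync(Edit e) {
    queue.add(e);
  }

  // Single background thread: sync each edit to durable storage, then
  // release the postponed response -- the same durability guarantee.
  void run() throws InterruptedException {
    while (true) {
      Edit e = queue.take();
      syncToDisk(e);
      e.sendResponse.run();
    }
  }

  private void syncToDisk(Edit e) {
    // write and fsync elided
  }
}
{code}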



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8880) NameNode metrics logging

2015-10-16 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960683#comment-14960683
 ] 

Steve Loughran commented on HDFS-8880:
--

I'd be in favour of improving our own general metrics sinks, rather than adding 
new stuff to every service:
# it just adds more stuff to maintain, to test, to document and to debug
# it's not broadly re-usable
# it adds one thread per service, which in test runs could mean many more per VM.

I note that Coda Hale metrics has a [stdout 
streamer|https://dropwizard.github.io/metrics/3.1.0/getting-started/] for such 
purposes.

If we were to do things with metrics, codahale integration would seem a good 
strategy (though the ganglia reporter's transitive LGPL dependency is 
something to be aware of).
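
For reference, the stdout streaming mentioned above is only a few lines (a 
sketch against the Dropwizard Metrics 3.x API):

{code}
import java.util.concurrent.TimeUnit;

import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.MetricRegistry;

public class ConsoleMetricsDemo {
  public static void main(String[] args) throws InterruptedException {
    MetricRegistry registry = new MetricRegistry();
    registry.meter("requests").mark();

    // Stream all registered metrics to stdout once a minute.
    ConsoleReporter reporter = ConsoleReporter.forRegistry(registry)
        .convertRatesTo(TimeUnit.SECONDS)
        .convertDurationsTo(TimeUnit.MILLISECONDS)
        .build();
    reporter.start(1, TimeUnit.MINUTES);
    Thread.sleep(5000); // let the demo run briefly before the JVM exits
  }
}
{code}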

> NameNode metrics logging
> 
>
> Key: HDFS-8880
> URL: https://issues.apache.org/jira/browse/HDFS-8880
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: 2.8.0
>
> Attachments: HDFS-8880.01.patch, HDFS-8880.02.patch, 
> HDFS-8880.03.patch, HDFS-8880.04.patch, namenode-metrics.log
>
>
> The NameNode can periodically log metrics to help debugging when the cluster 
> is not setup with another metrics monitoring scheme.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

