[jira] [Commented] (HDFS-9185) TestRecoverStripedFile is failing

2015-10-01 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939438#comment-14939438
 ] 

Uma Maheswara Rao G commented on HDFS-9185:
---

Thank you, Rakesh, for reporting it. The changes look good to me. Let's wait for 
Jenkins to see whether this test failure is fixed.

> TestRecoverStripedFile is failing
> -
>
> Key: HDFS-9185
> URL: https://issues.apache.org/jira/browse/HDFS-9185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Critical
> Attachments: HDFS-9185-00.patch
>
>
> Below is the message taken from build:
> {code}
> Error Message
> Time out waiting for EC block recovery.
> Stacktrace
> java.io.IOException: Time out waiting for EC block recovery.
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:383)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:283)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverAnyBlocks1(TestRecoverStripedFile.java:168)
> {code}
> Reference : https://builds.apache.org/job/PreCommit-HDFS-Build/12758



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9178) Slow datanode I/O can cause a wrong node to be marked bad

2015-10-01 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939789#comment-14939789
 ] 

Kihwal Lee commented on HDFS-9178:
--

- release audit: caused by the EC branch merge
- checkstyle: file length, which was already over the "limit".
- test failures: mostly new EC-related tests. They seem to pass when run 
locally, including {{TestLazyWriter}}.

> Slow datanode I/O can cause a wrong node to be marked bad
> -
>
> Key: HDFS-9178
> URL: https://issues.apache.org/jira/browse/HDFS-9178
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Attachments: HDFS-9178.patch
>
>
> When a non-leaf datanode in a pipeline is slow or stuck on disk I/O, the 
> downstream node can time out reading packets, since even the heartbeat 
> packets will not be relayed down.
> The packet read timeout is set in {{DataXceiver#run()}}:
> {code}
>   peer.setReadTimeout(dnConf.socketTimeout);
> {code}
> When the downstream node times out and closes the connection to the upstream, 
> the upstream node's {{PacketResponder}} gets an {{EOFException}} and sends an 
> ack upstream with the downstream node's status set to {{ERROR}}.  This caused 
> the client to exclude the downstream node, even though the upstream node was 
> the one that got stuck.
> The connection to the downstream node has a longer timeout, so the downstream 
> node will always time out first. The downstream timeout is set in {{writeBlock()}}:
> {code}
>   int timeoutValue = dnConf.socketTimeout +
>   (HdfsConstants.READ_TIMEOUT_EXTENSION * targets.length);
>   int writeTimeout = dnConf.socketWriteTimeout +
>   (HdfsConstants.WRITE_TIMEOUT_EXTENSION * targets.length);
>   NetUtils.connect(mirrorSock, mirrorTarget, timeoutValue);
>   OutputStream unbufMirrorOut = NetUtils.getOutputStream(mirrorSock,
>   writeTimeout);
> {code}
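
As a rough illustration of why the downstream node always gives up first (the constants below are commonly cited defaults and are assumptions, not values taken from this report):

{code}
// Illustration only: assumed default values.
int socketTimeout = 60 * 1000;         // dfs.client.socket-timeout (assumed 60s)
int readTimeoutExtension = 5 * 1000;   // HdfsConstants.READ_TIMEOUT_EXTENSION (assumed 5s)
int targets = 2;                       // downstream nodes remaining in the pipeline

// What the downstream node uses to read packets (DataXceiver#run):
int downstreamReadTimeout = socketTimeout;                                   // 60s

// What the upstream node uses toward its mirror (writeBlock):
int connectToMirrorTimeout = socketTimeout + readTimeoutExtension * targets; // 70s
{code}

So when the upstream node stalls on disk I/O, the downstream node's plain read timeout fires before the upstream's extended timeouts toward its mirror, and the downstream node ends up being the one reported as {{ERROR}}.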



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8676) Delayed rolling upgrade finalization can cause heartbeat expiration

2015-10-01 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-8676:
-
Target Version/s: 2.7.2

> Delayed rolling upgrade finalization can cause heartbeat expiration
> ---
>
> Key: HDFS-8676
> URL: https://issues.apache.org/jira/browse/HDFS-8676
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Walter Su
>Priority: Critical
> Attachments: HDFS-8676.01.patch
>
>
> In big busy clusters where the deletion rate is also high, a lot of blocks 
> can pile up in the datanode trash directories until an upgrade is finalized.  
> When it is finally finalized, the deletion of trash is done in the service 
> actor thread's context synchronously.  This blocks the heartbeat and can 
> cause heartbeat expiration.  
> We have seen a namenode losing hundreds of nodes after a delayed upgrade 
> finalization.  The deletion of trash directories should be made asynchronous.
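
A minimal sketch of the asynchronous approach suggested above (illustrative only, not the attached patch; the class and method names are assumptions): hand the deletion off to a background thread so the service actor's heartbeat path is never blocked by a large trash tree.

{code}
import java.io.File;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.hadoop.fs.FileUtil;

class TrashFinalizer {
  // Single background thread dedicated to trash cleanup.
  private final ExecutorService trashCleaner = Executors.newSingleThreadExecutor();

  void clearTrashOnFinalize(final File trashDir) {
    trashCleaner.execute(new Runnable() {
      @Override
      public void run() {
        // Deleting a big trash tree can take minutes on busy disks; doing it
        // here keeps the heartbeat thread responsive.
        FileUtil.fullyDelete(trashDir);
      }
    });
  }
}
{code}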



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9185) TestRecoverStripedFile is failing

2015-10-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939724#comment-14939724
 ] 

Hadoop QA commented on HDFS-9185:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  20m 12s | Pre-patch trunk has 7 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   8m  2s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 54s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 15s | The applied patch generated 
1 release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 12s | The applied patch generated  1 
new checkstyle issues (total was 288, now 285). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 29s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 23s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  9s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 215m 47s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 31s | Tests passed in 
hadoop-hdfs-client. |
| | | 267m 17s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
|   | hadoop.hdfs.TestRollingUpgrade |
|   | hadoop.hdfs.util.TestByteArrayManager |
|   | hadoop.hdfs.server.blockmanagement.TestNodeCount |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764550/HDFS-9185-00.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 5db371f |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12760/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs-client.html
 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12760/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12760/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12760/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12760/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12760/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12760/console |


This message was automatically generated.

> TestRecoverStripedFile is failing
> -
>
> Key: HDFS-9185
> URL: https://issues.apache.org/jira/browse/HDFS-9185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Critical
> Attachments: HDFS-9185-00.patch
>
>
> Below is the message taken from build:
> {code}
> Error Message
> Time out waiting for EC block recovery.
> Stacktrace
> java.io.IOException: Time out waiting for EC block recovery.
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:383)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:283)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverAnyBlocks1(TestRecoverStripedFile.java:168)
> {code}
> Reference : https://builds.apache.org/job/PreCommit-HDFS-Build/12758



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8766) Implement a libhdfs(3) compatible API

2015-10-01 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-8766:
--
Attachment: HDFS-8766.HDFS-8707.003.patch

Added another patch

Forgot to change the include guard in bindings/hdfs.h from 
LIBHDFSPP_COMPATIBILITY_HDFS_H_ to LIBHDFSPP_BINDINGS_HDFS_H_ in the last patch.

> Implement a libhdfs(3) compatible API
> -
>
> Key: HDFS-8766
> URL: https://issues.apache.org/jira/browse/HDFS-8766
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-8766.HDFS-8707.000.patch, 
> HDFS-8766.HDFS-8707.001.patch, HDFS-8766.HDFS-8707.002.patch, 
> HDFS-8766.HDFS-8707.003.patch
>
>
> Add a synchronous API that is compatible with the hdfs.h header used in 
> libhdfs and libhdfs3.  This will make it possible for projects using 
> libhdfs/libhdfs3 to relink against libhdfspp with minimal changes.
> This also provides a pure C interface that can be linked against projects 
> that aren't built in C++11 mode for various reasons but use the same 
> compiler.  It also allows many other programming languages to access 
> libhdfspp through builtin FFI interfaces.
> The libhdfs API is very similar to the posix file API which makes it easier 
> for programs built using posix filesystem calls to be modified to access HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9186) Simplify embedding libhdfspp into other projects

2015-10-01 Thread James Clampffer (JIRA)
James Clampffer created HDFS-9186:
-

 Summary: Simplify embedding libhdfspp into other projects
 Key: HDFS-9186
 URL: https://issues.apache.org/jira/browse/HDFS-9186
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: James Clampffer
Assignee: James Clampffer


I'd like to add a script to the root libhdfspp directory that can prune 
anything that libhdfspp doesn't need to compile out of the hadoop source tree.  

This way the project is a lot smaller if it's going to be included in a 
third-party directory of another project.  The directory structure, aside from 
the pruned directories, is preserved so modifications can be diffed against a 
fresh checkout of the source.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8766) Implement a libhdfs(3) compatible API

2015-10-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939954#comment-14939954
 ] 

Hadoop QA commented on HDFS-8766:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   5m 14s | Pre-patch HDFS-8707 compilation 
is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:red}-1{color} | javac |   1m 22s | The patch appears to cause the 
build to fail. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764625/HDFS-8766.HDFS-8707.003.patch
 |
| Optional Tests | javac unit |
| git revision | HDFS-8707 / 3668778 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12761/console |


This message was automatically generated.

> Implement a libhdfs(3) compatible API
> -
>
> Key: HDFS-8766
> URL: https://issues.apache.org/jira/browse/HDFS-8766
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-8766.HDFS-8707.000.patch, 
> HDFS-8766.HDFS-8707.001.patch, HDFS-8766.HDFS-8707.002.patch, 
> HDFS-8766.HDFS-8707.003.patch
>
>
> Add a synchronous API that is compatible with the hdfs.h header used in 
> libhdfs and libhdfs3.  This will make it possible for projects using 
> libhdfs/libhdfs3 to relink against libhdfspp with minimal changes.
> This also provides a pure C interface that can be linked against projects 
> that aren't built in C++11 mode for various reasons but use the same 
> compiler.  It also allows many other programming languages to access 
> libhdfspp through builtin FFI interfaces.
> The libhdfs API is very similar to the posix file API which makes it easier 
> for programs built using posix filesystem calls to be modified to access HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8766) Implement a libhdfs(3) compatible API

2015-10-01 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940047#comment-14940047
 ] 

James Clampffer commented on HDFS-8766:
---

Thanks for the review Bob.

1) Thanks for pointing that out.  I'm not sure how the shared_ptr snuck back 
in; I'll remove that.
2) The thinking there was to avoid making the user do a cast if they just got 
some fresh space with malloc or wanted to do something else odd with memory in 
C.  I was on the fence about using char *.  I have no strong preference either 
way.
3) That was just a remnant of when I was doing a quick test/prototype; libhdfs 
used structs and I took the declaration from there.  I just left it that way 
because there wasn't much to hide yet.  Originally I had the exposed C 
functions touching input_stream_ and file_system_ on hdfs_internal and 
hdfsFile_internal a lot more so I wanted members to be public by default.  The 
only function that does that now is hdfsFileIsOpenForRead and the 
implementation for that should be in hdfsFile_internal anyway.  I can switch it 
over to a class and make the members private.

And yes, I'll put a note into HDFS-8790 about managed pointers and commit them 
with that patch.

> Implement a libhdfs(3) compatible API
> -
>
> Key: HDFS-8766
> URL: https://issues.apache.org/jira/browse/HDFS-8766
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-8766.HDFS-8707.000.patch, 
> HDFS-8766.HDFS-8707.001.patch, HDFS-8766.HDFS-8707.002.patch, 
> HDFS-8766.HDFS-8707.003.patch
>
>
> Add a synchronous API that is compatible with the hdfs.h header used in 
> libhdfs and libhdfs3.  This will make it possible for projects using 
> libhdfs/libhdfs3 to relink against libhdfspp with minimal changes.
> This also provides a pure C interface that can be linked against projects 
> that aren't built in C++11 mode for various reasons but use the same 
> compiler.  It also allows many other programming languages to access 
> libhdfspp through builtin FFI interfaces.
> The libhdfs API is very similar to the posix file API which makes it easier 
> for programs built using posix filesystem calls to be modified to access HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8164) cTime is 0 in VERSION file for newly formatted NameNode.

2015-10-01 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-8164:

Status: Open  (was: Patch Available)

> cTime is 0 in VERSION file for newly formatted NameNode.
> 
>
> Key: HDFS-8164
> URL: https://issues.apache.org/jira/browse/HDFS-8164
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.3-alpha
>Reporter: Chris Nauroth
>Assignee: Xiao Chen
>Priority: Minor
> Attachments: HDFS-8164.001.patch
>
>
> After formatting a NameNode and inspecting its VERSION file, the cTime 
> property shows 0.  The value does get updated to current time during an 
> upgrade, but I believe this is intended to be the creation time of the 
> cluster, and therefore the initial value of 0 before an upgrade can cause 
> confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8790) Add Filesystem level stress tests

2015-10-01 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-8790:
--
Description: 
I propose adding stress tests on the libhdfs(3) compatibility layer as well as the 
async calls.  These can also be used for basic performance metrics and inputs 
to profiling tools to see improvements over time.

I'd like to make these tests into a separate executable, or set of them, so 
that they can be used for longer-running tests on dedicated clusters that may 
already exist.  Each should provide a simple command line interface for 
scripted or manual use.

Basic tests would be:
looped open-read-close
sequential scans
small random reads 

All tests will be parameterized for number of threads, read size, and upper and 
lower offset bounds for a specified file.  This will make it much easier to 
detect and reproduce threading issues and resource leaks as well as provide a 
simple executable (or set of executables) that can be run with valgrind to gain 
a high confidence that the code is operating correctly.

I'd appreciate suggestions for any other simple stress tests.

HDFS-8766 intentionally avoided shared_ptr and unique_ptr in the C api to make 
debugging this a little easier in case memory stomps and dangling references 
show up in stress tests.  These will be added into the C API when the patch for 
this jira is submitted because things should be reasonably stable once the 
stress tests pass.

  was:
I propose adding stress tests on the libhdfs(3) compatibility layer as well as the 
async calls.  These can also be used for basic performance metrics and inputs 
to profiling tools to see improvements over time.

I'd like to make these tests into a separate executable, or set of them, so 
that they can be used for longer-running tests on dedicated clusters that may 
already exist.  Each should provide a simple command line interface for 
scripted or manual use.

Basic tests would be:
looped open-read-close
sequential scans
small random reads 

All tests will be parameterized for number of threads, read size, and upper and 
lower offset bounds for a specified file.  This will make it much easier to 
detect and reproduce threading issues and resource leaks as well as provide a 
simple executable (or set of executables) that can be run with valgrind to gain 
a high confidence that the code is operating correctly.

I'd appreciate suggestions for any other simple stress tests.


> Add Filesystem level stress tests
> -
>
> Key: HDFS-8790
> URL: https://issues.apache.org/jira/browse/HDFS-8790
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
>
> I propose adding stress tests on the libhdfs(3) compatibility layer as well as 
> the async calls.  These can also be used for basic performance metrics and 
> inputs to profiling tools to see improvements over time.
> I'd like to make these tests into a separate executable, or set of them, so 
> that they can be used for longer-running tests on dedicated clusters that may 
> already exist.  Each should provide a simple command line interface for 
> scripted or manual use.
> Basic tests would be:
> looped open-read-close
> sequential scans
> small random reads 
> All tests will be parameterized for number of threads, read size, and upper 
> and lower offset bounds for a specified file.  This will make it much easier 
> to detect and reproduce threading issues and resource leaks as well as 
> provide a simple executable (or set of executables) that can be run with 
> valgrind to gain a high confidence that the code is operating correctly.
> I'd appreciate suggestions for any other simple stress tests.
> HDFS-8766 intentionally avoided shared_ptr and unique_ptr in the C api to 
> make debugging this a little easier in case memory stomps and dangling 
> references show up in stress tests.  These will be added into the C API when 
> the patch for this jira is submitted because things should be reasonably 
> stable once the stress tests pass.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8766) Implement a libhdfs(3) compatible API

2015-10-01 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940089#comment-14940089
 ] 

Haohui Mai commented on HDFS-8766:
--

Thanks for updating the patch!
{code}
+  if (!s.ok()) {
+// for now assume this was a down DN, not going to reinvent HDFS-9103
+if (s.code() == Status::kResourceUnavailable) {
+  errno = ECOMM;
+  bad_datanodes_.insert(datanode);
+}
+// otherwise it was a real error so indicate that something's gone wrong
+return -1;
+  }
+
+  return (ssize_t)read_count;
+}
{code}

There are several issues:

(1) It makes sense to refactor the above code into a function.
(2) The errno should be {{EINTR}}.
(3) When a DN is added to the dead DN list, it needs to be removed from the list 
after a pre-configured time.
(4) This function needs a unit test in gmock.

Nit: Things like class names should be in camel case instead of underscores. In 
order to glue them with hdfs.h, an easy way is to add {{typedef ClassName 
hdfs_internal}} in the implementation.

> Implement a libhdfs(3) compatible API
> -
>
> Key: HDFS-8766
> URL: https://issues.apache.org/jira/browse/HDFS-8766
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-8766.HDFS-8707.000.patch, 
> HDFS-8766.HDFS-8707.001.patch, HDFS-8766.HDFS-8707.002.patch, 
> HDFS-8766.HDFS-8707.003.patch
>
>
> Add a synchronous API that is compatible with the hdfs.h header used in 
> libhdfs and libhdfs3.  This will make it possible for projects using 
> libhdfs/libhdfs3 to relink against libhdfspp with minimal changes.
> This also provides a pure C interface that can be linked against projects 
> that aren't built in C++11 mode for various reasons but use the same 
> compiler.  It also allows many other programming languages to access 
> libhdfspp through builtin FFI interfaces.
> The libhdfs API is very similar to the posix file API which makes it easier 
> for programs built using posix filesystem calls to be modified to access HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8676) Delayed rolling upgrade finalization can cause heartbeat expiration

2015-10-01 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940045#comment-14940045
 ] 

Walter Su commented on HDFS-8676:
-

Thanks Lee for reviewing. I'm on holiday. I'll update it soon.

> Delayed rolling upgrade finalization can cause heartbeat expiration
> ---
>
> Key: HDFS-8676
> URL: https://issues.apache.org/jira/browse/HDFS-8676
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Walter Su
>Priority: Critical
> Attachments: HDFS-8676.01.patch
>
>
> In big busy clusters where the deletion rate is also high, a lot of blocks 
> can pile up in the datanode trash directories until an upgrade is finalized.  
> When it is finally finalized, the deletion of trash is done in the service 
> actor thread's context synchronously.  This blocks the heartbeat and can 
> cause heartbeat expiration.  
> We have seen a namenode losing hundreds of nodes after a delayed upgrade 
> finalization.  The deletion of trash directories should be made asynchronous.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8164) cTime is 0 in VERSION file for newly formatted NameNode.

2015-10-01 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-8164:

Attachment: HDFS-8164.002.patch

The release audit warning and test failures are unrelated.
Uploading patch 002 to fix the whitespace error.

> cTime is 0 in VERSION file for newly formatted NameNode.
> 
>
> Key: HDFS-8164
> URL: https://issues.apache.org/jira/browse/HDFS-8164
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.3-alpha
>Reporter: Chris Nauroth
>Assignee: Xiao Chen
>Priority: Minor
> Attachments: HDFS-8164.001.patch, HDFS-8164.002.patch
>
>
> After formatting a NameNode and inspecting its VERSION file, the cTime 
> property shows 0.  The value does get updated to current time during an 
> upgrade, but I believe this is intended to be the creation time of the 
> cluster, and therefore the initial value of 0 before an upgrade can cause 
> confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-7268) Can't append with SYNC_BLOCK

2015-10-01 Thread Bogdan Raducanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bogdan Raducanu reassigned HDFS-7268:
-

Assignee: (was: Bogdan Raducanu)

> Can't append with SYNC_BLOCK
> 
>
> Key: HDFS-7268
> URL: https://issues.apache.org/jira/browse/HDFS-7268
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.4.1
>Reporter: Bogdan Raducanu
>
> It seems to me that currently it's not possible to start appending to a file 
> with CreateFlag.SYNC_BLOCK behavior (i.e. hsync after each block). So, I 
> think, the only way to guarantee durability when appending is if the user 
> calls hsync at the end of each block.
> FileSystem.append doesn't accept a CreateFlag argument.
> FileSystem.create, on the other hand, ignores CreateFlag.APPEND afaics. I 
> think the plan in HDFS-744 was to use this method if durability is needed.
> It seems it might work through FileContext but in the end DFSOutputStream 
> never sets shouldSyncBlock when appending anyway.
> Or am I missing something?
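
A hedged workaround sketch for the situation described above (illustrative only; {{fs}} and {{path}} are an assumed FileSystem and file, and the data-source helpers are placeholders): without SYNC_BLOCK support in append, durability can only be approximated by calling {{hsync()}} manually whenever roughly a block's worth of data has been written.

{code}
// Sketch only: hasMoreData()/nextChunk() stand in for the caller's data source.
FSDataOutputStream out = fs.append(path);      // org.apache.hadoop.fs.FileSystem
long blockSize = fs.getDefaultBlockSize(path);
long writtenSinceSync = 0;
while (hasMoreData()) {
  byte[] chunk = nextChunk();
  out.write(chunk);
  writtenSinceSync += chunk.length;
  if (writtenSinceSync >= blockSize) {
    out.hsync();                               // force the data out at a block boundary
    writtenSinceSync = 0;
  }
}
out.close();
{code}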



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8979) Clean up checkstyle warnings in hadoop-hdfs-client module

2015-10-01 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-8979:

Attachment: HDFS-8979.001.patch

The release audit is considered unrelated.

The v1 patch fixes the newly introduced checkstyle warnings. Some existing 
checkstyle warnings may be addressed separately, e.g. file/method length 
exceeding maximum lines, potential false positives for unused imports (used in 
javadoc), etc.

> Clean up checkstyle warnings in hadoop-hdfs-client module
> -
>
> Key: HDFS-8979
> URL: https://issues.apache.org/jira/browse/HDFS-8979
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-8979.000.patch, HDFS-8979.001.patch
>
>
> This jira tracks the effort of cleaning up checkstyle warnings in 
> {{hadoop-hdfs-client}} module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8164) cTime is 0 in VERSION file for newly formatted NameNode.

2015-10-01 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-8164:

Status: Patch Available  (was: Open)

> cTime is 0 in VERSION file for newly formatted NameNode.
> 
>
> Key: HDFS-8164
> URL: https://issues.apache.org/jira/browse/HDFS-8164
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.3-alpha
>Reporter: Chris Nauroth
>Assignee: Xiao Chen
>Priority: Minor
> Attachments: HDFS-8164.001.patch, HDFS-8164.002.patch
>
>
> After formatting a NameNode and inspecting its VERSION file, the cTime 
> property shows 0.  The value does get updated to current time during an 
> upgrade, but I believe this is intended to be the creation time of the 
> cluster, and therefore the initial value of 0 before an upgrade can cause 
> confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8766) Implement a libhdfs(3) compatible API

2015-10-01 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940090#comment-14940090
 ] 

Haohui Mai commented on HDFS-8766:
--

bq. For now I'd really like to avoid shared_ptr and unique_ptr until I move the 
stress tests from github to this for HDFS-8790. I wouldn't be surprised if some 
bugs shake out when running large stress tests so I'd like to make things 
easier to debug using the pointer invalidation macro.

I think it makes sense to commit clean code and separate the debugging code 
into a separate jira.

> Implement a libhdfs(3) compatible API
> -
>
> Key: HDFS-8766
> URL: https://issues.apache.org/jira/browse/HDFS-8766
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-8766.HDFS-8707.000.patch, 
> HDFS-8766.HDFS-8707.001.patch, HDFS-8766.HDFS-8707.002.patch, 
> HDFS-8766.HDFS-8707.003.patch
>
>
> Add a synchronous API that is compatible with the hdfs.h header used in 
> libhdfs and libhdfs3.  This will make it possible for projects using 
> libhdfs/libhdfs3 to relink against libhdfspp with minimal changes.
> This also provides a pure C interface that can be linked against projects 
> that aren't built in C++11 mode for various reasons but use the same 
> compiler.  It also allows many other programming languages to access 
> libhdfspp through builtin FFI interfaces.
> The libhdfs API is very similar to the posix file API which makes it easier 
> for programs built using posix filesystem calls to be modified to access HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8766) Implement a libhdfs(3) compatible API

2015-10-01 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940001#comment-14940001
 ] 

Bob Hansen commented on HDFS-8766:
--

Minor nits:
* Still have a shared_ptr to a promise in hdfsFile_internal::pread
* Is there a reason getErrorStr doesn't take a char *?
* Is there a reason hdfs_internal is a struct rather than a class?

Can you file a follow-up JIRA or add to HDFS-8790 to convert raw pointers to 
managed pointers?


> Implement a libhdfs(3) compatible API
> -
>
> Key: HDFS-8766
> URL: https://issues.apache.org/jira/browse/HDFS-8766
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-8766.HDFS-8707.000.patch, 
> HDFS-8766.HDFS-8707.001.patch, HDFS-8766.HDFS-8707.002.patch, 
> HDFS-8766.HDFS-8707.003.patch
>
>
> Add a synchronous API that is compatible with the hdfs.h header used in 
> libhdfs and libhdfs3.  This will make it possible for projects using 
> libhdfs/libhdfs3 to relink against libhdfspp with minimal changes.
> This also provides a pure C interface that can be linked against projects 
> that aren't built in C++11 mode for various reasons but use the same 
> compiler.  It also allows many other programming languages to access 
> libhdfspp through builtin FFI interfaces.
> The libhdfs API is very similar to the posix file API which makes it easier 
> for programs built using posix filesystem calls to be modified to access HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9180) Update excluded DataNodes in DFSStripedOutputStream based on failures in data streamers

2015-10-01 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940042#comment-14940042
 ] 

Walter Su commented on HDFS-9180:
-

Thanks Jing for the great work. I'm on holiday. I'll take a look tomorrow. 
Maybe others can help review together.

> Update excluded DataNodes in DFSStripedOutputStream based on failures in data 
> streamers
> ---
>
> Key: HDFS-9180
> URL: https://issues.apache.org/jira/browse/HDFS-9180
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-9180.000.patch
>
>
> This is a TODO in HDFS-9040: based on the failures all the striped data 
> streamers hit, the DFSStripedOutputStream should keep a record of all the 
> DataNodes that should be excluded.
> This jira will also fix several bugs in the DFSStripedOutputStream. Will 
> provide more details in the comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9185) TestRecoverStripedFile is failing

2015-10-01 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940084#comment-14940084
 ] 

Rakesh R commented on HDFS-9185:


Note: It seems the test case failures are not related to the patch. Also, the 
release audit and checkstyle warnings are unrelated.

> TestRecoverStripedFile is failing
> -
>
> Key: HDFS-9185
> URL: https://issues.apache.org/jira/browse/HDFS-9185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Critical
> Attachments: HDFS-9185-00.patch
>
>
> Below is the message taken from build:
> {code}
> Error Message
> Time out waiting for EC block recovery.
> Stacktrace
> java.io.IOException: Time out waiting for EC block recovery.
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:383)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:283)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverAnyBlocks1(TestRecoverStripedFile.java:168)
> {code}
> Reference : https://builds.apache.org/job/PreCommit-HDFS-Build/12758



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9187) Check if tracer is null before using it

2015-10-01 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940238#comment-14940238
 ] 

Masatake Iwasaki commented on HDFS-9187:


{{tracer}} seems to be null if the filesystem instance is not created by 
{{FileSystem#createFileSystem}}. Should we add a NullTracer (similar to 
NullScope) to HTrace and use its singleton as the default tracer rather than 
adding null checks everywhere?
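
A rough sketch of the null-object idea (illustrative only; the interface and names below are assumptions, not the actual HTrace API): every caller gets a tracer object, and the "disabled" case is a shared no-op singleton instead of null.

{code}
// Hypothetical names for illustration; not HTrace classes.
interface SpanTracer {
  void trace(String description);
}

final class NullSpanTracer implements SpanTracer {
  static final SpanTracer INSTANCE = new NullSpanTracer();
  private NullSpanTracer() {}
  @Override
  public void trace(String description) {
    // Intentionally a no-op: tracing is disabled or was never configured.
  }
}

// At construction time, fall back to the no-op singleton instead of null,
// so call sites like Globber#glob never need a null check:
//   this.tracer = (tracer != null) ? tracer : NullSpanTracer.INSTANCE;
{code}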

> Check if tracer is null before using it
> ---
>
> Key: HDFS-9187
> URL: https://issues.apache.org/jira/browse/HDFS-9187
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tracing
>Affects Versions: 2.8.0
>Reporter: stack
>
> Saw this where an hbase that has not been updated to htrace-4.0.1 was trying 
> to start:
> {code}
> Oct 1, 5:12:11.861 AM FATAL org.apache.hadoop.hbase.master.HMaster
> Failed to become active master
> java.lang.NullPointerException
> at org.apache.hadoop.fs.Globber.glob(Globber.java:145)
> at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1634)
> at org.apache.hadoop.hbase.util.FSUtils.getTableDirs(FSUtils.java:1372)
> at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:206)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:619)
> at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169)
> at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1481)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9187) Check if tracer is null before using it

2015-10-01 Thread stack (JIRA)
stack created HDFS-9187:
---

 Summary: Check if tracer is null before using it
 Key: HDFS-9187
 URL: https://issues.apache.org/jira/browse/HDFS-9187
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: tracing
Affects Versions: 2.8.0
Reporter: stack


Saw this where an hbase that has not been updated to htrace-4.0.1 was trying to 
start:

{code}
Oct 1, 5:12:11.861 AM FATAL org.apache.hadoop.hbase.master.HMaster
Failed to become active master
java.lang.NullPointerException
at org.apache.hadoop.fs.Globber.glob(Globber.java:145)
at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1634)
at org.apache.hadoop.hbase.util.FSUtils.getTableDirs(FSUtils.java:1372)
at 
org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:206)
at 
org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:619)
at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169)
at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1481)
at java.lang.Thread.run(Thread.java:745)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8312) Trash does not descent into child directories to check for permissions

2015-10-01 Thread Luis Fernando Antonioli (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940109#comment-14940109
 ] 

Luis Fernando Antonioli commented on HDFS-8312:
---

I have been able to reproduce the bug in Hadoop 2.6.0, but not by following the 
same steps you proposed. Following your steps, I got permission denied in both 
cases and could not delete the files.

In my test, I used the super user account to create a shared folder (every user 
can upload files to this directory) in the root directory of the HDFS and then 
used two different non-root accounts (user1 and user2) to upload files to this 
folder (one user does not have permission to edit the files of the other). 
Finally, I could reproduce the inconsistency: when the HDFS trash was disabled, 
I got permission denied when trying to delete the files with one of the non-root 
accounts, and when the trash was enabled I was able to move all the files to the 
trash folder. Although I cannot delete the files directly from the trash folder, 
they will be deleted when the deletion interval set in the Hadoop configuration 
is reached.

I could not reproduce this issue in Hadoop 2.7.1; I got permission denied in 
both cases. I think this bug was fixed in newer versions of Hadoop.


> Trash does not descent into child directories to check for permissions
> --
>
> Key: HDFS-8312
> URL: https://issues.apache.org/jira/browse/HDFS-8312
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS, security
>Affects Versions: 2.2.0, 2.6.0
>Reporter: Eric Yang
>
> HDFS trash does not descend into child directories to check if the user has 
> permission to delete files.  For example:
> Run the following command to initialize directory structure as super user:
> {code}
> hadoop fs -mkdir /BSS/level1
> hadoop fs -mkdir /BSS/level1/level2
> hadoop fs -mkdir /BSS/level1/level2/level3
> hadoop fs -put /tmp/appConfig.json /BSS/level1/level2/level3/testfile.txt
> hadoop fs -chown user1:users /BSS/level1/level2/level3/testfile.txt
> hadoop fs -chown -R user1:users /BSS/level1
> hadoop fs -chmod -R 750 /BSS/level1
> hadoop fs -chmod -R 640 /BSS/level1/level2/level3/testfile.txt
> hadoop fs -chmod 775 /BSS
> {code}
> Change to a normal user called user2. 
> When trash is enabled:
> {code}
> sudo su user2 -
> hadoop fs -rm -r /BSS/level1
> 15/05/01 16:51:20 INFO fs.TrashPolicyDefault: Namenode trash configuration: 
> Deletion interval = 3600 minutes, Emptier interval = 0 minutes.
> Moved: 'hdfs://bdvs323.svl.ibm.com:9000/BSS/level1' to trash at: 
> hdfs://bdvs323.svl.ibm.com:9000/user/user2/.Trash/Current
> {code}
> When trash is disabled:
> {code}
> /opt/ibm/biginsights/IHC/bin/hadoop fs -Dfs.trash.interval=0 -rm -r 
> /BSS/level1
> 15/05/01 16:58:31 INFO fs.TrashPolicyDefault: Namenode trash configuration: 
> Deletion interval = 0 minutes, Emptier interval = 0 minutes.
> rm: Permission denied: user=user2, access=ALL, 
> inode="/BSS/level1":user1:users:drwxr-x---
> {code}
> There is inconsistency between trash behavior and delete behavior.  When 
> trash is enabled, files owned by user1 are deleted by user2.  It looks like 
> trash does not recursively validate if the child directory files can be 
> removed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9185) TestRecoverStripedFile is failing

2015-10-01 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940200#comment-14940200
 ] 

Jing Zhao commented on HDFS-9185:
-

Thanks for working on this, [~rakeshr]. The changes look good to me. One 
comment is about the log level change. Changing the log level from debug to 
warn may generate unnecessary exception traces for DFSStripedInputStream since 
the failure can be covered by later decoding. So how about we change the log 
level only for the unit test? We can add the following code to 
{{TestRecoverStripedBlocks}}:
{code}
static {
  GenericTestUtils.setLogLevel(DFSClient.LOG, Level.ALL);
}
{code}

> TestRecoverStripedFile is failing
> -
>
> Key: HDFS-9185
> URL: https://issues.apache.org/jira/browse/HDFS-9185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Critical
> Attachments: HDFS-9185-00.patch
>
>
> Below is the message taken from build:
> {code}
> Error Message
> Time out waiting for EC block recovery.
> Stacktrace
> java.io.IOException: Time out waiting for EC block recovery.
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:383)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:283)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverAnyBlocks1(TestRecoverStripedFile.java:168)
> {code}
> Reference : https://builds.apache.org/job/PreCommit-HDFS-Build/12758



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9187) Check if tracer is null before using it

2015-10-01 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940338#comment-14940338
 ] 

stack commented on HDFS-9187:
-

[~iwasakims] Yeah, we should probably get a NullTracer in... Fewer null checks 
all around. Would that be htrace-4.0.2?

> Check if tracer is null before using it
> ---
>
> Key: HDFS-9187
> URL: https://issues.apache.org/jira/browse/HDFS-9187
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tracing
>Affects Versions: 2.8.0
>Reporter: stack
>
> Saw this where an hbase that has not been updated to htrace-4.0.1 was trying 
> to start:
> {code}
> Oct 1, 5:12:11.861 AM FATAL org.apache.hadoop.hbase.master.HMaster
> Failed to become active master
> java.lang.NullPointerException
> at org.apache.hadoop.fs.Globber.glob(Globber.java:145)
> at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1634)
> at org.apache.hadoop.hbase.util.FSUtils.getTableDirs(FSUtils.java:1372)
> at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:206)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:619)
> at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169)
> at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1481)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8979) Clean up checkstyle warnings in hadoop-hdfs-client module

2015-10-01 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-8979:

Attachment: HDFS-8979.002.patch

The v2 patch fixes whitespaces and more checkstyle warnings.

> Clean up checkstyle warnings in hadoop-hdfs-client module
> -
>
> Key: HDFS-8979
> URL: https://issues.apache.org/jira/browse/HDFS-8979
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-8979.000.patch, HDFS-8979.001.patch, 
> HDFS-8979.002.patch
>
>
> This jira tracks the effort of cleaning up checkstyle warnings in 
> {{hadoop-hdfs-client}} module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9188) Make block corruption related tests FsDataset-agnostic.

2015-10-01 Thread Lei (Eddy) Xu (JIRA)
Lei (Eddy) Xu created HDFS-9188:
---

 Summary: Make block corruption related tests FsDataset-agnostic. 
 Key: HDFS-9188
 URL: https://issues.apache.org/jira/browse/HDFS-9188
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: HDFS, test
Affects Versions: 2.7.1
Reporter: Lei (Eddy) Xu
Assignee: Lei (Eddy) Xu


Currently, HDFS does block corruption tests by directly accessing the files 
stored in the storage directories, which assumes {{FsDatasetImpl}} is the 
dataset implementation. However, with work like OZone (HDFS-7240) and 
HDFS-8679, there will be different FsDataset implementations.

So we need a general way to run whitebox tests like corrupting blocks and CRC 
files.
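
A minimal sketch of the kind of abstraction this could take (the names are illustrative assumptions, not an existing Hadoop API): tests ask the dataset for a way to corrupt a replica instead of reaching into the storage directories themselves, so any FsDataset implementation can plug in its own mechanics.

{code}
import java.io.IOException;
import org.apache.hadoop.hdfs.protocol.ExtendedBlock;

/** Hypothetical per-dataset hook that tests would use to damage replicas. */
interface ReplicaCorruptor {
  /** Flip some bytes of the replica's data so checksum verification fails. */
  void corruptData(ExtendedBlock block) throws IOException;

  /** Damage the replica's checksum (CRC) metadata. */
  void corruptMeta(ExtendedBlock block) throws IOException;

  /** Remove the replica entirely, as if the storage lost it. */
  void deleteReplica(ExtendedBlock block) throws IOException;
}
{code}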



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7342) Lease Recovery doesn't happen some times

2015-10-01 Thread Venkata Ganji (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940323#comment-14940323
 ] 

Venkata Ganji commented on HDFS-7342:
-

Hello [~raviprak], I have a quick question: will the above test reproduce the 
infinite loop of recovery by the lease manager? I am trying to reproduce this 
issue in Hadoop 2.0.0.

Thanks

> Lease Recovery doesn't happen some times
> 
>
> Key: HDFS-7342
> URL: https://issues.apache.org/jira/browse/HDFS-7342
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Attachments: HDFS-7342.04.patch, HDFS-7342.1.patch, 
> HDFS-7342.2.patch, HDFS-7342.3.patch
>
>
> In some cases, LeaseManager tries to recover a lease, but is not able to. 
> HDFS-4882 describes a possibility of that. We should fix this



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8979) Clean up checkstyle warnings in hadoop-hdfs-client module

2015-10-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940282#comment-14940282
 ] 

Hadoop QA commented on HDFS-8979:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  17m 17s | Pre-patch trunk has 7 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 51s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 17s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 15s | The applied patch generated 
1 release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 30s | The applied patch generated  
12 new checkstyle issues (total was 1912, now 1179). |
| {color:red}-1{color} | whitespace | 109m 33s | The patch has 2  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 15s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 13s | Pre-build of native portion |
| {color:green}+1{color} | hdfs tests |   0m 30s | Tests passed in 
hadoop-hdfs-client. |
| | | 155m  9s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764637/HDFS-8979.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / ecbfd68 |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12763/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs-client.html
 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12763/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12763/artifact/patchprocess/diffcheckstylehadoop-hdfs-client.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12763/artifact/patchprocess/whitespace.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12763/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12763/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12763/console |


This message was automatically generated.

> Clean up checkstyle warnings in hadoop-hdfs-client module
> -
>
> Key: HDFS-8979
> URL: https://issues.apache.org/jira/browse/HDFS-8979
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-8979.000.patch, HDFS-8979.001.patch
>
>
> This jira tracks the effort of cleaning up checkstyle warnings in 
> {{hadoop-hdfs-client}} module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8766) Implement a libhdfs(3) compatible API

2015-10-01 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-8766:
--
Attachment: HDFS-8766.HDFS-8707.004.patch

Thanks Bob and Haohui for the reviews.  The new patch should address all of 
Bob's suggestions and all but one of Haohui's (no gmock yet).

Haohui, I refactored the error detection code in the file handle 
implementation; are you saying I should write a gmock test just for that 
function?

I added a simple timeout to clear out the bad datanodes after a specified 
period of time (defaulting to 2 minutes).  I think that should be sufficient for 
the initial API. I plan on adding a lot more features (mostly namenode 
operations) to the C API after HDFS-8790, but that depends on this being in.  
I'll open a new jira for the API extensions.

Right now I think it's important to move on to HDFS-8790 as soon as possible.  
I'm seeing some odd behavior around the RpcConnectionImpl where virtual methods 
are being called mid-destruction when used against a real cluster.  I'd like to 
knock out those sorts of bugs while this API is still fairly simple.

> Implement a libhdfs(3) compatible API
> -
>
> Key: HDFS-8766
> URL: https://issues.apache.org/jira/browse/HDFS-8766
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-8766.HDFS-8707.000.patch, 
> HDFS-8766.HDFS-8707.001.patch, HDFS-8766.HDFS-8707.002.patch, 
> HDFS-8766.HDFS-8707.003.patch, HDFS-8766.HDFS-8707.004.patch
>
>
> Add a synchronous API that is compatible with the hdfs.h header used in 
> libhdfs and libhdfs3.  This will make it possible for projects using 
> libhdfs/libhdfs3 to relink against libhdfspp with minimal changes.
> This also provides a pure C interface that can be linked against projects 
> that aren't built in C++11 mode for various reasons but use the same 
> compiler.  It also allows many other programming languages to access 
> libhdfspp through builtin FFI interfaces.
> The libhdfs API is very similar to the posix file API which makes it easier 
> for programs built using posix filesystem calls to be modified to access HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9180) Update excluded DataNodes in DFSStripedOutputStream based on failures in data streamers

2015-10-01 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-9180:

Attachment: HDFS-9180.001.patch

Update the patch:
# The above fix #3 is unnecessary since we mark a data streamer as external 
error only when the streamer is already in the DATA_STREAMING stage, in which 
case the {{nodes}} should be non-empty.
# Fix another race case in {{writeChunk}}: the current streamer can become a 
healthy one after calling {{allocateNewBlock}}.
# Fix some issues in the current test code.

> Update excluded DataNodes in DFSStripedOutputStream based on failures in data 
> streamers
> ---
>
> Key: HDFS-9180
> URL: https://issues.apache.org/jira/browse/HDFS-9180
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-9180.000.patch, HDFS-9180.001.patch
>
>
> This is a TODO in HDFS-9040: based on the failures all the striped data 
> streamers hit, the DFSStripedOutputStream should keep a record of all the 
> DataNodes that should be excluded.
> This jira will also fix several bugs in the DFSStripedOutputStream. Will 
> provide more details in the comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8766) Implement a libhdfs(3) compatible API

2015-10-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940462#comment-14940462
 ] 

Hadoop QA commented on HDFS-8766:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   6m 28s | Pre-patch HDFS-8707 compilation 
is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:red}-1{color} | javac |   1m 35s | The patch appears to cause the 
build to fail. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764686/HDFS-8766.HDFS-8707.004.patch
 |
| Optional Tests | javac unit |
| git revision | HDFS-8707 / 3668778 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12765/console |


This message was automatically generated.

> Implement a libhdfs(3) compatible API
> -
>
> Key: HDFS-8766
> URL: https://issues.apache.org/jira/browse/HDFS-8766
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-8766.HDFS-8707.000.patch, 
> HDFS-8766.HDFS-8707.001.patch, HDFS-8766.HDFS-8707.002.patch, 
> HDFS-8766.HDFS-8707.003.patch, HDFS-8766.HDFS-8707.004.patch
>
>
> Add a synchronous API that is compatible with the hdfs.h header used in 
> libhdfs and libhdfs3.  This will make it possible for projects using 
> libhdfs/libhdfs3 to relink against libhdfspp with minimal changes.
> This also provides a pure C interface that can be linked against projects 
> that aren't built in C++11 mode for various reasons but use the same 
> compiler.  It also allows many other programming languages to access 
> libhdfspp through builtin FFI interfaces.
> The libhdfs API is very similar to the posix file API which makes it easier 
> for programs built using posix filesystem calls to be modified to access HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9187) Check if tracer is null before using it

2015-10-01 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9187:
---
Attachment: HDFS-9187.001.patch

A Tracer without receivers and samplers seems to work as an effectively no-op 
tracer. Maybe we could move such a singleton instance to a static field of Tracer later.

> Check if tracer is null before using it
> ---
>
> Key: HDFS-9187
> URL: https://issues.apache.org/jira/browse/HDFS-9187
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tracing
>Affects Versions: 2.8.0
>Reporter: stack
> Attachments: HDFS-9187.001.patch
>
>
> Saw this where an hbase that has not been updated to htrace-4.0.1 was trying 
> to start:
> {code}
> Oct 1, 5:12:11.861 AM FATAL org.apache.hadoop.hbase.master.HMaster
> Failed to become active master
> java.lang.NullPointerException
> at org.apache.hadoop.fs.Globber.glob(Globber.java:145)
> at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1634)
> at org.apache.hadoop.hbase.util.FSUtils.getTableDirs(FSUtils.java:1372)
> at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:206)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:619)
> at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169)
> at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1481)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9187) Check if tracer is null before using it

2015-10-01 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940594#comment-14940594
 ] 

Colin Patrick McCabe commented on HDFS-9187:


I think adding a NullTracer would make sense.  I also think there is an easier 
solution using FsTracer.java... let me check it out.

> Check if tracer is null before using it
> ---
>
> Key: HDFS-9187
> URL: https://issues.apache.org/jira/browse/HDFS-9187
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tracing
>Affects Versions: 2.8.0
>Reporter: stack
> Attachments: HDFS-9187.001.patch
>
>
> Saw this where an hbase that has not been updated to htrace-4.0.1 was trying 
> to start:
> {code}
> Oct 1, 5:12:11.861 AM FATAL org.apache.hadoop.hbase.master.HMaster
> Failed to become active master
> java.lang.NullPointerException
> at org.apache.hadoop.fs.Globber.glob(Globber.java:145)
> at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1634)
> at org.apache.hadoop.hbase.util.FSUtils.getTableDirs(FSUtils.java:1372)
> at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:206)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:619)
> at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169)
> at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1481)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9188) Make block corruption related tests FsDataset-agnostic.

2015-10-01 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-9188:

Attachment: HDFS-9188.000.patch

The patch:
* Provides a dataset-specific TestUtils interface.
* Provides a {{ReplicaToCorrupt}} interface to abstract out the block 
implementation details (e.g., blocks stored as {{File}}s).
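
Going only by the description above, the interfaces might take roughly this 
shape (the method names here are my own illustration, not the actual patch):
{code}
/** Dataset-specific helpers for whitebox tests, one implementation per FsDataset. */
public interface FsDatasetTestUtils {
  /** Locate a materialized replica of the given block so a test can corrupt it. */
  ReplicaToCorrupt getReplicaToCorrupt(ExtendedBlock block) throws IOException;
}

/** Abstracts how a replica is actually stored (e.g. as a File) away from the test. */
public interface ReplicaToCorrupt {
  void corruptData() throws IOException;  // damage the block data
  void corruptMeta() throws IOException;  // damage the checksum/meta file
  void deleteData() throws IOException;   // remove the block data entirely
}
{code}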

> Make block corruption related tests FsDataset-agnostic. 
> 
>
> Key: HDFS-9188
> URL: https://issues.apache.org/jira/browse/HDFS-9188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: HDFS, test
>Affects Versions: 2.7.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-9188.000.patch
>
>
> Currently, HDFS does block corruption tests by directly accessing the files 
> stored on the storage directories, which assumes {{FsDatasetImpl}} is the 
> dataset implementation. However, with works like OZone (HDFS-7240) and 
> HDFS-8679, there will be different FsDataset implementations. 
> So we need a general way to run whitebox tests like corrupting blocks and crc 
> files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9180) Update excluded DataNodes in DFSStripedOutputStream based on failures in data streamers

2015-10-01 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940564#comment-14940564
 ] 

Jing Zhao commented on HDFS-9180:
-

To help testing the patch, we can add the following test in 
{{TestDFSStripedOutputStreamWithFailure}}:
{code}
  @Test
  public void testDataFailureWithAllLengths() throws Exception {
for (int length : LENGTHS) {
  LOG.info("run test with length: " + length);
  runTest(length);
}
  }
{code}

The test may take a couple of hours.

> Update excluded DataNodes in DFSStripedOutputStream based on failures in data 
> streamers
> ---
>
> Key: HDFS-9180
> URL: https://issues.apache.org/jira/browse/HDFS-9180
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-9180.000.patch, HDFS-9180.001.patch
>
>
> This is a TODO in HDFS-9040: based on the failures all the striped data 
> streamers hit, the DFSStripedOutputStream should keep a record of all the 
> DataNodes that should be excluded.
> This jira will also fix several bugs in the DFSStripedOutputStream. Will 
> provide more details in the comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8164) cTime is 0 in VERSION file for newly formatted NameNode.

2015-10-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940480#comment-14940480
 ] 

Hadoop QA commented on HDFS-8164:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  18m 54s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m  8s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 34s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 16s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 27s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 30s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 17s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 240m 46s | Tests failed in hadoop-hdfs. |
| | | 288m  3s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestReplaceDatanodeOnFailure |
|   | hadoop.hdfs.server.blockmanagement.TestBlockManager |
|   | hadoop.hdfs.TestRecoverStripedFile |
|   | hadoop.hdfs.TestWriteReadStripedFile |
|   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
|   | hadoop.hdfs.server.blockmanagement.TestNodeCount |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764636/HDFS-8164.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / ecbfd68 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12762/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12762/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12762/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12762/console |


This message was automatically generated.

> cTime is 0 in VERSION file for newly formatted NameNode.
> 
>
> Key: HDFS-8164
> URL: https://issues.apache.org/jira/browse/HDFS-8164
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.3-alpha
>Reporter: Chris Nauroth
>Assignee: Xiao Chen
>Priority: Minor
> Attachments: HDFS-8164.001.patch, HDFS-8164.002.patch
>
>
> After formatting a NameNode and inspecting its VERSION file, the cTime 
> property shows 0.  The value does get updated to current time during an 
> upgrade, but I believe this is intended to be the creation time of the 
> cluster, and therefore the initial value of 0 before an upgrade can cause 
> confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8164) cTime is 0 in VERSION file for newly formatted NameNode.

2015-10-01 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940496#comment-14940496
 ] 

Xiao Chen commented on HDFS-8164:
-

The test failures and release audit warning are again unrelated to this fix.

> cTime is 0 in VERSION file for newly formatted NameNode.
> 
>
> Key: HDFS-8164
> URL: https://issues.apache.org/jira/browse/HDFS-8164
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.3-alpha
>Reporter: Chris Nauroth
>Assignee: Xiao Chen
>Priority: Minor
> Attachments: HDFS-8164.001.patch, HDFS-8164.002.patch
>
>
> After formatting a NameNode and inspecting its VERSION file, the cTime 
> property shows 0.  The value does get updated to current time during an 
> upgrade, but I believe this is intended to be the creation time of the 
> cluster, and therefore the initial value of 0 before an upgrade can cause 
> confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8164) cTime is 0 in VERSION file for newly formatted NameNode.

2015-10-01 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940536#comment-14940536
 ] 

Yongjun Zhang commented on HDFS-8164:
-

Hi [~xiaochen],

Thanks for working on this issue. The patch looks good; one comment here:
{code}
cluster.getNamesystem().getFSImage().getStorage().getCTime()
{code}
is quite a deep call chain. I'd suggest that we create accessors at the 
intermediate levels, such as
{code}
FSNamesystem.getCTime()
FSImage.getCTime()
{code}
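
A minimal sketch of what those delegating accessors could look like (just to 
illustrate the suggestion; not taken from any attached patch):
{code}
// In FSImage.java: expose the storage cTime directly.
public long getCTime() {
  return storage.getCTime();
}

// In FSNamesystem.java: delegate to the image.
public long getCTime() {
  return getFSImage().getCTime();
}
{code}
With that, the test could simply call {{cluster.getNamesystem().getCTime()}}.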

Hi [~cnauroth], thanks for creating this jira. Would you comment on whether my 
suggestion above makes sense to you?

Thanks a lot.




> cTime is 0 in VERSION file for newly formatted NameNode.
> 
>
> Key: HDFS-8164
> URL: https://issues.apache.org/jira/browse/HDFS-8164
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.3-alpha
>Reporter: Chris Nauroth
>Assignee: Xiao Chen
>Priority: Minor
> Attachments: HDFS-8164.001.patch, HDFS-8164.002.patch
>
>
> After formatting a NameNode and inspecting its VERSION file, the cTime 
> property shows 0.  The value does get updated to current time during an 
> upgrade, but I believe this is intended to be the creation time of the 
> cluster, and therefore the initial value of 0 before an upgrade can cause 
> confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-01 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940562#comment-14940562
 ] 

Colin Patrick McCabe commented on HDFS-9184:


HTrace doesn't need to be enabled at 100% sampling to detect abuse or spamming 
of requests.  If the spamming is significant enough to cause a problem, it will 
also show up in the sampled traces.

bq. Moreover, passing additional information (via annotations) other than span 
id from root of the tree to leaf is a significant additional work

Annotations aren't passed from root to leaf.  Annotations are properties of 
spans, and are sent to the span receiver.

bq. We propose another approach to address this problem. We also treat HDFS 
audit log as a good place for after-the-fact root cause analysis. We propose to 
put the caller id (e.g. Hive query id) in threadlocals. Specially, on client 
side the threadlocal object is passed to NN as a part of RPC header (optional), 
while on sever side NN retrieves it from header and put it to Handler's 
threadlocals. Finally in FSNamesystem, HDFS audit logger will record the caller 
context for each operation. In this way, the existing code is not affected.

I think this kind of full-system analysis should be handled by HTrace, not by 
ad-hoc solutions like this.  There are a lot of use-cases for Hive that don't 
involve HDFS at all, such as using Hive over HBase, or using Hive to access 
local filesystem resources.  We cannot use the HDFS audit log for that, because 
HDFS is not involved (or is involved only as the backend for another storage 
system).  And that's ignoring the significant compatibility, performance, and 
complexity problems of adding this to the NameNode.

> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9184.000.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper level job issues it. The upper level callers may be specific 
> Oozie tasks, MR jobs, and hive queries. One scenario is that the namenode 
> (NN) is abused/spammed, the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the users of the the operation which 
> is obviously not enough. It's common that the same user issues multiple jobs 
> at the same time. Even for a single top level task, tracking back to a 
> specific caller in a chain of operations of the whole workflow (e.g.Oozie -> 
> Hive -> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. The span is created in many places interconnected 
> like a tree structure which relies on offline analysis across RPC boundary. 
> For this use case, {{htrace}} has to be enabled at 100% sampling rate which 
> introduces significant overhead. Moreover, passing additional information 
> (via annotations) other than span id from root of the tree to leaf is a 
> significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> are some related discussion on this topic. The final patch implemented the 
> tracking id as a part of delegation token. This protects the tracking 
> information from being changed or impersonated. However, kerberos 
> authenticated connections or insecure connections don't have tokens. 
> [HADOOP-8779] proposes to use tokens in all the scenarios, but that might 
> mean changes to several upstream projects and is a major change in their 
> security implementation.
> We propose another approach to address this problem. We also treat HDFS audit 
> log as a good place for after-the-fact root cause analysis. We propose to put 
> the caller id (e.g. Hive query id) in threadlocals. Specially, on client side 
> the threadlocal object is passed to NN as a part of RPC header (optional), 
> while on sever side NN retrieves it from header and put it to {{Handler}}'s 
> threadlocals. Finally in {{FSNamesystem}}, HDFS audit logger will record the 
> caller context for each operation. In this way, the existing code is not 
> affected.
> It is still challenging to keep "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis. 

[jira] [Commented] (HDFS-6584) Support Archival Storage

2015-10-01 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939371#comment-14939371
 ] 

Yongjun Zhang commented on HDFS-6584:
-

Hi [~szetszwo],

Thanks for the work you and other folks did here. I have a question:

Per your comment:

https://issues.apache.org/jira/browse/HDFS-6584?focusedCommentId=14139690=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14139690

https://issues.apache.org/jira/browse/HDFS-6584?focusedCommentId=14148307=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14148307

you merged the feature branch to trunk and branch-2. 

When I look at the git log for trunk and branch-2, I can see that trunk has
{code}
commit 22a41dce4af4d5b533ba875b322551db1c152878
Author: Tsz-Wo Nicholas Sze 
Date:   Sun Sep 7 07:44:28 2014 +0800

HDFS-6997: add more tests for data migration and replicaion.
{code}
However, branch-2 doesn't have it.

I looked at branch-2 and saw that the HDFS-6997 code is there. I checked 
another subtask of HDFS-6584, and it's the same situation.

It looks like the commit history was collapsed during the branch-2 merge, but 
it was kept when doing the trunk merge. Is this intended? Would you please 
comment on what might have happened?

Thanks much.






> Support Archival Storage
> 
>
> Key: HDFS-6584
> URL: https://issues.apache.org/jira/browse/HDFS-6584
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: balancer & mover, namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Fix For: 2.6.0
>
> Attachments: HDFS-6584.000.patch, 
> HDFSArchivalStorageDesign20140623.pdf, HDFSArchivalStorageDesign20140715.pdf, 
> archival-storage-testplan.pdf, h6584_20140907.patch, h6584_20140908.patch, 
> h6584_20140908b.patch, h6584_20140911.patch, h6584_20140911b.patch, 
> h6584_20140915.patch, h6584_20140916.patch, h6584_20140916.patch, 
> h6584_20140917.patch, h6584_20140917b.patch, h6584_20140918.patch, 
> h6584_20140918b.patch
>
>
> In most of the Hadoop clusters, as more and more data is stored for longer 
> time, the demand for storage is outstripping the compute. Hadoop needs a cost 
> effective and easy to manage solution to meet this demand for storage. 
> Current solution is:
> - Delete the old unused data. This comes at operational cost of identifying 
> unnecessary data and deleting them manually.
> - Add more nodes to the clusters. This adds along with storage capacity 
> unnecessary compute capacity to the cluster.
> Hadoop needs a solution to decouple growing storage capacity from compute 
> capacity. Nodes with higher density and less expensive storage with low 
> compute power are becoming available and can be used as cold storage in the 
> clusters. Based on policy the data from hot storage can be moved to cold 
> storage. Adding more nodes to the cold storage can grow the storage 
> independent of the compute capacity in the cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9185) TestRecoverStripedFile is failing

2015-10-01 Thread Rakesh R (JIRA)
Rakesh R created HDFS-9185:
--

 Summary: TestRecoverStripedFile is failing
 Key: HDFS-9185
 URL: https://issues.apache.org/jira/browse/HDFS-9185
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: erasure-coding
Reporter: Rakesh R
Assignee: Rakesh R
Priority: Critical


Below is the message taken from build:
{code}
Error Message

Time out waiting for EC block recovery.
Stacktrace

java.io.IOException: Time out waiting for EC block recovery.
at 
org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:383)
at 
org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:283)
at 
org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverAnyBlocks1(TestRecoverStripedFile.java:168)
{code}

Reference : https://builds.apache.org/job/PreCommit-HDFS-Build/12758



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8979) Clean up checkstyle warnings in hadoop-hdfs-client module

2015-10-01 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-8979:

Status: Patch Available  (was: Open)

> Clean up checkstyle warnings in hadoop-hdfs-client module
> -
>
> Key: HDFS-8979
> URL: https://issues.apache.org/jira/browse/HDFS-8979
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-8979.000.patch
>
>
> This jira tracks the effort of cleaning up checkstyle warnings in 
> {{hadoop-hdfs-client}} module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9185) TestRecoverStripedFile is failing

2015-10-01 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-9185:
---
Attachment: HDFS-9185-00.patch

> TestRecoverStripedFile is failing
> -
>
> Key: HDFS-9185
> URL: https://issues.apache.org/jira/browse/HDFS-9185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Critical
> Attachments: HDFS-9185-00.patch
>
>
> Below is the message taken from build:
> {code}
> Error Message
> Time out waiting for EC block recovery.
> Stacktrace
> java.io.IOException: Time out waiting for EC block recovery.
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:383)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:283)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverAnyBlocks1(TestRecoverStripedFile.java:168)
> {code}
> Reference : https://builds.apache.org/job/PreCommit-HDFS-Build/12758



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9185) TestRecoverStripedFile is failing

2015-10-01 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-9185:
---
Target Version/s: 3.0.0
  Status: Patch Available  (was: Open)

> TestRecoverStripedFile is failing
> -
>
> Key: HDFS-9185
> URL: https://issues.apache.org/jira/browse/HDFS-9185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Critical
> Attachments: HDFS-9185-00.patch
>
>
> Below is the message taken from build:
> {code}
> Error Message
> Time out waiting for EC block recovery.
> Stacktrace
> java.io.IOException: Time out waiting for EC block recovery.
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:383)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:283)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverAnyBlocks1(TestRecoverStripedFile.java:168)
> {code}
> Reference : https://builds.apache.org/job/PreCommit-HDFS-Build/12758



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9053) Support large directories efficiently using B-Tree

2015-10-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939361#comment-14939361
 ] 

Hadoop QA commented on HDFS-9053:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  19m 49s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 8 new or modified test files. |
| {color:green}+1{color} | javac |   8m  5s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 16s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 16s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m  5s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m 10s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 38s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 23s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests |   7m 46s | Tests failed in 
hadoop-common. |
| {color:red}-1{color} | hdfs tests | 200m  8s | Tests failed in hadoop-hdfs. |
| | | 255m 18s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.metrics2.impl.TestGangliaMetrics |
|   | hadoop.hdfs.TestRecoverStripedFile |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764524/HDFS-9053.004.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 5db371f |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12758/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12758/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12758/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12758/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12758/console |


This message was automatically generated.

> Support large directories efficiently using B-Tree
> --
>
> Key: HDFS-9053
> URL: https://issues.apache.org/jira/browse/HDFS-9053
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Critical
> Attachments: HDFS-9053 (BTree with simple benchmark).patch, HDFS-9053 
> (BTree).patch, HDFS-9053.001.patch, HDFS-9053.002.patch, HDFS-9053.003.patch, 
> HDFS-9053.004.patch
>
>
> This is a long standing issue, we were trying to improve this in the past.  
> Currently we use an ArrayList for the children under a directory, and the 
> children are ordered in the list, for insert/delete, the time complexity is 
> O\(n), (the search is O(log n), but insertion/deleting causes re-allocations 
> and copies of arrays), for large directory, the operations are expensive.  If 
> the children grow to 1M size, the ArrayList will resize to > 1M capacity, so 
> need > 1M * 8bytes = 8M (the reference size is 8 for 64-bits system/JVM) 
> continuous heap memory, it easily causes full GC in HDFS cluster where 
> namenode heap memory is already highly used.  I recap the 3 main issues:
> # Insertion/deletion operations in large directories are expensive because 
> re-allocations and copies of big arrays.
> # Dynamically allocate several MB continuous heap memory which will be 
> long-lived can easily cause full GC problem.
> # Even most children are removed later, but the directory INode still 
> occupies same size heap memory, since the ArrayList will never shrink.
> This JIRA is similar to HDFS-7174 created by [~kihwal], but use B-Tree to 
> solve the problem suggested by [~shv]. 
> So the target of this JIRA is to implement a low memory footprint B-Tree and 
> use it to replace ArrayList. 
> If the elements size is not large (less than the maximum degree of B-Tree 
> node), the B-Tree only has one root node which contains an array for the 
> elements. And if the size grows large enough, it will split automatically, 
> and if elements are removed, then B-Tree nodes can merge automatically (see 
> more: 

[jira] [Commented] (HDFS-9185) TestRecoverStripedFile is failing

2015-10-01 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939402#comment-14939402
 ] 

Rakesh R commented on HDFS-9185:


Following is my analysis:

# ErasureCodingWorker is creating the {{RemoteBlockReader2}} with a null 
{{tracer}}, so the {{RemoteBlockReader2#read}} call hits an NPE and the 
recovery fails. To fix this, how about passing {{datanode#getTracer()}} to the 
reader?
{code}
ErasureCodingWorker.java

return RemoteBlockReader2.newBlockReader(
"dummy", block, blockToken, offsetInBlock, 
block.getNumBytes() - offsetInBlock, true,
"", newConnectedPeer(block, dnAddr, blockToken, dnInfo), dnInfo,
null, cachingStrategy, null);
{code}
{code}
RemoteBlockReader2.java

  public synchronized int read(ByteBuffer buf) throws IOException {
if (curDataSlice == null || curDataSlice.remaining() == 0 && 
bytesNeededToFinish > 0) {
  TraceScope scope = tracer.newScope(
  "RemoteBlockReader2#readNextPacket(" + blockId + ")");
  try {
readNextPacket();
  } finally {
scope.close();
  }
}
{code}
# The root cause is not visible in the log messages because 
StripedBlockUtil#getNextCompletedStripedRead() logs the exception at the 
{{DEBUG}} level. IMHO the log level has to be changed to {{INFO}} so that the 
failure reason is visible.
{code}
if (DFSClient.LOG.isDebugEnabled()) {
DFSClient.LOG.debug("ExecutionException " + e);
  }
{code}

I'll soon prepare a patch including these changes.
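
In other words, the patch would roughly change the two spots above like this 
(a sketch of the proposal, not the final patch):
{code}
// ErasureCodingWorker.java: pass the DataNode's tracer instead of null
return RemoteBlockReader2.newBlockReader(
    "dummy", block, blockToken, offsetInBlock,
    block.getNumBytes() - offsetInBlock, true,
    "", newConnectedPeer(block, dnAddr, blockToken, dnInfo), dnInfo,
    null, cachingStrategy, datanode.getTracer());

// StripedBlockUtil#getNextCompletedStripedRead(): surface the failure reason
DFSClient.LOG.info("ExecutionException " + e);
{code}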

> TestRecoverStripedFile is failing
> -
>
> Key: HDFS-9185
> URL: https://issues.apache.org/jira/browse/HDFS-9185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Critical
>
> Below is the message taken from build:
> {code}
> Error Message
> Time out waiting for EC block recovery.
> Stacktrace
> java.io.IOException: Time out waiting for EC block recovery.
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:383)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:283)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverAnyBlocks1(TestRecoverStripedFile.java:168)
> {code}
> Reference : https://builds.apache.org/job/PreCommit-HDFS-Build/12758



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8979) Clean up checkstyle warnings in hadoop-hdfs-client module

2015-10-01 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-8979:

Attachment: HDFS-8979.000.patch

The v0 patch addresses the following warnings in the {{hadoop-hdfs-client}} 
module. The fixes were mostly done automatically via the IntelliJ IDE or shell 
scripts. The changes are mainly as follows:

# Replace try-catch-finally blocks with try-with-resources (see the example 
after these lists)
# Format lines exceeding 80 columns
# Simplify if/foreach/return statements
# Remove unused imports
# Remove a redundant exception from a method header if a more general one is 
already listed
# Fix wrong indentation
# Remove unused parameters in private methods (public/deprecated methods are 
considered stable)
# Language simplification, such as type inference for generic instance creation 
and using string concatenation instead of a string builder for simple string 
construction
# Remove trailing whitespace

Some warnings are better fixed separately because they are case-by-case 
dependent, complex to fix automatically, or error-prone. Including but not 
limited to:
* Methods that are public but never used in {{hadoop-hdfs-client}}
* Potential false reports such as _value never used after assigned_ and _assert 
has side effects_
* Method parameters that could be removed because the passed value is always 
constant
* Non-Java (e.g. XML) problems
* Overrides of deprecated methods
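
As an illustration of item #1 above, the automated rewrites are of this kind 
(generic example, not a diff from the actual patch; {{path}} and {{process()}} 
are placeholders for whatever opens and consumes the stream):
{code}
// Before: explicit try-finally for stream cleanup
InputStream in = null;
try {
  in = new FileInputStream(path);
  process(in);
} finally {
  if (in != null) {
    in.close();
  }
}

// After: try-with-resources closes the stream automatically, even on exceptions
try (InputStream in = new FileInputStream(path)) {
  process(in);
}
{code}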

> Clean up checkstyle warnings in hadoop-hdfs-client module
> -
>
> Key: HDFS-8979
> URL: https://issues.apache.org/jira/browse/HDFS-8979
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-8979.000.patch
>
>
> This jira tracks the effort of cleaning up checkstyle warnings in 
> {{hadoop-hdfs-client}} module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9187) Check if tracer is null before using it

2015-10-01 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9187:
---
Assignee: Colin Patrick McCabe
  Status: Patch Available  (was: Open)

> Check if tracer is null before using it
> ---
>
> Key: HDFS-9187
> URL: https://issues.apache.org/jira/browse/HDFS-9187
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tracing
>Affects Versions: 2.8.0
>Reporter: stack
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-9187.001.patch, HDFS-9187.002.patch
>
>
> Saw this where an hbase that has not been updated to htrace-4.0.1 was trying 
> to start:
> {code}
> Oct 1, 5:12:11.861 AM FATAL org.apache.hadoop.hbase.master.HMaster
> Failed to become active master
> java.lang.NullPointerException
> at org.apache.hadoop.fs.Globber.glob(Globber.java:145)
> at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1634)
> at org.apache.hadoop.hbase.util.FSUtils.getTableDirs(FSUtils.java:1372)
> at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:206)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:619)
> at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169)
> at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1481)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9187) Check if tracer is null before using it

2015-10-01 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9187:
---
Attachment: HDFS-9187.002.patch

> Check if tracer is null before using it
> ---
>
> Key: HDFS-9187
> URL: https://issues.apache.org/jira/browse/HDFS-9187
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tracing
>Affects Versions: 2.8.0
>Reporter: stack
> Attachments: HDFS-9187.001.patch, HDFS-9187.002.patch
>
>
> Saw this where an hbase that has not been updated to htrace-4.0.1 was trying 
> to start:
> {code}
> Oct 1, 5:12:11.861 AM FATAL org.apache.hadoop.hbase.master.HMaster
> Failed to become active master
> java.lang.NullPointerException
> at org.apache.hadoop.fs.Globber.glob(Globber.java:145)
> at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1634)
> at org.apache.hadoop.hbase.util.FSUtils.getTableDirs(FSUtils.java:1372)
> at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:206)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:619)
> at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169)
> at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1481)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9187) Check if tracer is null before using it

2015-10-01 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940608#comment-14940608
 ] 

Colin Patrick McCabe commented on HDFS-9187:


This is a different approach that just removes the Tracer from FileSystem, given 
the difficulty of ensuring it's initialized.  (Other Tracer holders like DFSClient 
don't have this problem since their constructors ALWAYS initialize it... unlike 
FileSystem, which has a no-argument constructor.)  I also added a unit test which 
serves as a regression test for this bug.

> Check if tracer is null before using it
> ---
>
> Key: HDFS-9187
> URL: https://issues.apache.org/jira/browse/HDFS-9187
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tracing
>Affects Versions: 2.8.0
>Reporter: stack
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-9187.001.patch, HDFS-9187.002.patch
>
>
> Saw this where an hbase that has not been updated to htrace-4.0.1 was trying 
> to start:
> {code}
> Oct 1, 5:12:11.861 AM FATAL org.apache.hadoop.hbase.master.HMaster
> Failed to become active master
> java.lang.NullPointerException
> at org.apache.hadoop.fs.Globber.glob(Globber.java:145)
> at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1634)
> at org.apache.hadoop.hbase.util.FSUtils.getTableDirs(FSUtils.java:1372)
> at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:206)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:619)
> at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169)
> at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1481)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9187) Check if tracer is null before using it

2015-10-01 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940612#comment-14940612
 ] 

stack commented on HDFS-9187:
-

Globber is the only place that uses the FS tracer?

This patch is better than what I was thinking of doing.

This patch seems good to me. We should work on the [~iwasakims] idea in the 
meantime "for everyone else.." to save having to do null checks.

> Check if tracer is null before using it
> ---
>
> Key: HDFS-9187
> URL: https://issues.apache.org/jira/browse/HDFS-9187
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tracing
>Affects Versions: 2.8.0
>Reporter: stack
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-9187.001.patch, HDFS-9187.002.patch
>
>
> Saw this where an hbase that has not been updated to htrace-4.0.1 was trying 
> to start:
> {code}
> Oct 1, 5:12:11.861 AM FATAL org.apache.hadoop.hbase.master.HMaster
> Failed to become active master
> java.lang.NullPointerException
> at org.apache.hadoop.fs.Globber.glob(Globber.java:145)
> at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1634)
> at org.apache.hadoop.hbase.util.FSUtils.getTableDirs(FSUtils.java:1372)
> at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:206)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:619)
> at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169)
> at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1481)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9187) Check if tracer is null before using it

2015-10-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940651#comment-14940651
 ] 

Hadoop QA commented on HDFS-9187:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  18m  1s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m 28s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 25s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 16s | The applied patch generated 
1 release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 14s | The applied patch generated  1 
new checkstyle issues (total was 145, now 145). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 36s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   2m  6s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests |   6m 24s | Tests failed in 
hadoop-common. |
| | |  49m  8s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-common |
| Failed unit tests | hadoop.fs.TestFileUtil |
|   | hadoop.fs.TestLocalFSFileContextMainOperations |
|   | hadoop.fs.viewfs.TestFcMainOperationsLocalFs |
|   | hadoop.metrics2.sink.TestFileSink |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764711/HDFS-9187.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / fd026f5 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12767/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12767/artifact/patchprocess/diffcheckstylehadoop-common.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12767/artifact/patchprocess/newPatchFindbugsWarningshadoop-common.html
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12767/artifact/patchprocess/testrun_hadoop-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12767/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12767/console |


This message was automatically generated.

> Check if tracer is null before using it
> ---
>
> Key: HDFS-9187
> URL: https://issues.apache.org/jira/browse/HDFS-9187
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tracing
>Affects Versions: 2.8.0
>Reporter: stack
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-9187.001.patch, HDFS-9187.002.patch
>
>
> Saw this where an hbase that has not been updated to htrace-4.0.1 was trying 
> to start:
> {code}
> Oct 1, 5:12:11.861 AM FATAL org.apache.hadoop.hbase.master.HMaster
> Failed to become active master
> java.lang.NullPointerException
> at org.apache.hadoop.fs.Globber.glob(Globber.java:145)
> at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1634)
> at org.apache.hadoop.hbase.util.FSUtils.getTableDirs(FSUtils.java:1372)
> at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:206)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:619)
> at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169)
> at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1481)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-01 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940656#comment-14940656
 ] 

Jitendra Nath Pandey commented on HDFS-9184:


  I think the purpose is not only to detect abusive clients, but also to be 
able to audit where the requests are coming from. Users often wonder what is 
running in the cluster and how HDFS is being used. This feature will allow us 
to analyze how upstream components are using HDFS and what their load 
distribution looks like. Sampling will not work for this kind of analysis.
  HTrace is more of a profiling tool and is useful for analyzing the 
performance of individual spans, which are pretty low level. But it is overkill 
and doesn't really fit an audit purpose that needs to capture high-level 
contexts all the time.
  

> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9184.000.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper level job issues it. The upper level callers may be specific 
> Oozie tasks, MR jobs, and hive queries. One scenario is that the namenode 
> (NN) is abused/spammed, the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the users of the the operation which 
> is obviously not enough. It's common that the same user issues multiple jobs 
> at the same time. Even for a single top level task, tracking back to a 
> specific caller in a chain of operations of the whole workflow (e.g.Oozie -> 
> Hive -> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. The span is created in many places interconnected 
> like a tree structure which relies on offline analysis across RPC boundary. 
> For this use case, {{htrace}} has to be enabled at 100% sampling rate which 
> introduces significant overhead. Moreover, passing additional information 
> (via annotations) other than span id from root of the tree to leaf is a 
> significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> are some related discussion on this topic. The final patch implemented the 
> tracking id as a part of delegation token. This protects the tracking 
> information from being changed or impersonated. However, kerberos 
> authenticated connections or insecure connections don't have tokens. 
> [HADOOP-8779] proposes to use tokens in all the scenarios, but that might 
> mean changes to several upstream projects and is a major change in their 
> security implementation.
> We propose another approach to address this problem. We also treat HDFS audit 
> log as a good place for after-the-fact root cause analysis. We propose to put 
> the caller id (e.g. Hive query id) in threadlocals. Specially, on client side 
> the threadlocal object is passed to NN as a part of RPC header (optional), 
> while on sever side NN retrieves it from header and put it to {{Handler}}'s 
> threadlocals. Finally in {{FSNamesystem}}, HDFS audit logger will record the 
> caller context for each operation. In this way, the existing code is not 
> affected.
> It is still challenging to keep "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis. 
> The NN is not responsible for validating the signature online.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9180) Update excluded DataNodes in DFSStripedOutputStream based on failures in data streamers

2015-10-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940677#comment-14940677
 ] 

Hadoop QA commented on HDFS-9180:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  19m 39s | Pre-patch trunk has 7 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 3 new or modified test files. |
| {color:green}+1{color} | javac |   7m 45s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  1s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 16s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m 56s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 30s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 31s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 24s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 102m  5s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 36s | Tests passed in 
hadoop-hdfs-client. |
| | | 153m 24s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.ha.TestInitializeSharedEdits |
|   | hadoop.hdfs.server.mover.TestMover |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.server.namenode.ha.TestHAFsck |
|   | hadoop.hdfs.server.namenode.TestCreateEditsLog |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.server.namenode.ha.TestNNHealthCheck |
|   | hadoop.hdfs.server.namenode.ha.TestStandbyBlockManagement |
|   | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot |
|   | hadoop.hdfs.server.namenode.ha.TestStateTransitionFailure |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.server.namenode.TestBackupNode |
|   | hadoop.hdfs.server.namenode.ha.TestHAStateTransitions |
|   | hadoop.hdfs.server.namenode.ha.TestQuotasWithHA |
|   | hadoop.hdfs.server.namenode.ha.TestHAAppend |
|   | hadoop.hdfs.server.namenode.ha.TestHAMetrics |
|   | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
|   | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
|   | hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade |
|   | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogsDuringFailover |
| Timed out tests | 
org.apache.hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
|   | org.apache.hadoop.hdfs.server.namenode.TestFileTruncate |
|   | org.apache.hadoop.hdfs.server.mover.TestStorageMover |
|   | org.apache.hadoop.hdfs.server.namenode.ha.TestFailureOfSharedDir |
|   | org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyWriter |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764700/HDFS-9180.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / fd026f5 |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12766/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs-client.html
 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12766/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12766/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12766/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12766/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12766/console |


This message was automatically generated.

> Update excluded DataNodes in DFSStripedOutputStream based on failures in data 
> streamers
> ---
>
> Key: HDFS-9180
> URL: https://issues.apache.org/jira/browse/HDFS-9180
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: 

[jira] [Commented] (HDFS-9053) Support large directories efficiently using B-Tree

2015-10-01 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940673#comment-14940673
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9053:
---

For the common case where the #children is small, say < 4K, ArrayList is better 
than a B-Tree since it uses less memory and has similar running time.  A B-Tree 
is better for the special case where the #children is large.  How about keeping 
ArrayList when #children < 4K and switching to a B-Tree when #children >= 4K?
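
A toy sketch of that hybrid idea (purely illustrative; a {{TreeSet}} stands in 
for the proposed low-footprint B-Tree, and the real INode children list is 
sorted and binary-searched rather than scanned):
{code}
import java.util.ArrayList;
import java.util.Collection;
import java.util.TreeSet;

class HybridChildren<E extends Comparable<E>> {
  private static final int BTREE_THRESHOLD = 4096;  // ~4K, as suggested above

  private Collection<E> children = new ArrayList<>();

  void add(E child) {
    // Stay on the compact ArrayList while the directory is small;
    // migrate once the child count crosses the threshold.
    if (children instanceof ArrayList && children.size() >= BTREE_THRESHOLD) {
      children = new TreeSet<>(children);
    }
    children.add(child);
  }

  boolean contains(E child) {
    return children.contains(child);
  }
}
{code}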

> Support large directories efficiently using B-Tree
> --
>
> Key: HDFS-9053
> URL: https://issues.apache.org/jira/browse/HDFS-9053
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Critical
> Attachments: HDFS-9053 (BTree with simple benchmark).patch, HDFS-9053 
> (BTree).patch, HDFS-9053.001.patch, HDFS-9053.002.patch, HDFS-9053.003.patch, 
> HDFS-9053.004.patch
>
>
> This is a long-standing issue that we have tried to improve in the past.
> Currently we use an ArrayList for the children under a directory, and the
> children are kept ordered in the list. Insertion and deletion are O\(n)
> (the search is O(log n), but inserting or deleting causes re-allocations and
> copies of the array), so for a large directory these operations are expensive.
> If the children grow to 1M entries, the ArrayList resizes to > 1M capacity and
> so needs > 1M * 8 bytes = 8 MB of contiguous heap memory (the reference size
> is 8 bytes on a 64-bit system/JVM), which easily causes full GCs in an HDFS
> cluster where namenode heap memory is already highly used. To recap the 3
> main issues:
> # Insertion/deletion operations in large directories are expensive because of
> re-allocations and copies of big arrays.
> # Dynamically allocating several MB of contiguous, long-lived heap memory can
> easily cause full GC problems.
> # Even if most children are removed later, the directory INode still occupies
> the same amount of heap memory, since the ArrayList never shrinks.
> This JIRA is similar to HDFS-7174 created by [~kihwal], but uses a B-Tree to
> solve the problem, as suggested by [~shv].
> So the target of this JIRA is to implement a low-memory-footprint B-Tree and
> use it to replace the ArrayList.
> If the number of elements is not large (less than the maximum degree of a
> B-Tree node), the B-Tree has only one root node which contains an array for
> the elements. If the size grows large enough it splits automatically, and if
> elements are removed, B-Tree nodes can merge automatically (see more:
> https://en.wikipedia.org/wiki/B-tree). This will solve the above 3 issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9187) Check if tracer is null before using it

2015-10-01 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940616#comment-14940616
 ] 

Colin Patrick McCabe commented on HDFS-9187:


bq. Globber is only place that uses the FS tracer?

Yeah

bq. This patch seems good to me. We should work on the Masatake Iwasaki idea in 
the meantime "for everyone else.." to save having to do null checks.

Agree.  I filed HTRACE-275 for this.
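
A minimal sketch of the null-guard pattern under discussion, assuming the 
htrace 4 {{Tracer}}/{{TraceScope}} API; the {{Globber}}-like class below is a 
stand-in for illustration, not the actual patch:

{code}
import org.apache.htrace.core.TraceScope;
import org.apache.htrace.core.Tracer;

// Stand-in for Globber#glob(): only open a trace scope when the FileSystem
// actually supplied a tracer, so callers built by pre-htrace-4 code do not NPE.
class GlobberSketch {
  private final Tracer tracer;                 // may be null for old callers

  GlobberSketch(Tracer tracer) {
    this.tracer = tracer;
  }

  Object glob(String pattern) {
    TraceScope scope = (tracer == null) ? null : tracer.newScope("Globber#glob");
    try {
      return doGlob(pattern);                  // the real matching work
    } finally {
      if (scope != null) {
        scope.close();                         // only close what we opened
      }
    }
  }

  private Object doGlob(String pattern) {
    return pattern;                            // placeholder
  }
}
{code}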

> Check if tracer is null before using it
> ---
>
> Key: HDFS-9187
> URL: https://issues.apache.org/jira/browse/HDFS-9187
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tracing
>Affects Versions: 2.8.0
>Reporter: stack
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-9187.001.patch, HDFS-9187.002.patch
>
>
> Saw this where an hbase that has not been updated to htrace-4.0.1 was trying 
> to start:
> {code}
> Oct 1, 5:12:11.861 AM FATAL org.apache.hadoop.hbase.master.HMaster
> Failed to become active master
> java.lang.NullPointerException
> at org.apache.hadoop.fs.Globber.glob(Globber.java:145)
> at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1634)
> at org.apache.hadoop.hbase.util.FSUtils.getTableDirs(FSUtils.java:1372)
> at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:206)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:619)
> at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169)
> at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1481)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9188) Make block corruption related tests FsDataset-agnostic.

2015-10-01 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-9188:

Status: Patch Available  (was: Open)

> Make block corruption related tests FsDataset-agnostic. 
> 
>
> Key: HDFS-9188
> URL: https://issues.apache.org/jira/browse/HDFS-9188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: HDFS, test
>Affects Versions: 2.7.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-9188.000.patch
>
>
> Currently, HDFS does block corruption tests by directly accessing the files 
> stored on the storage directories, which assumes {{FsDatasetImpl}} is the 
> dataset implementation. However, with work like OZone (HDFS-7240) and
> HDFS-8679, there will be different FsDataset implementations.
> So we need a general way to run whitebox tests such as corrupting block and
> crc files.
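
For illustration only, the dataset-agnostic hook could look roughly like the 
sketch below; the interface and method names are hypothetical, not the actual 
HDFS-9188 API:

{code}
import java.io.IOException;

// Tests ask the dataset implementation for a handle to a materialized replica
// and corrupt it through that handle, instead of reaching into the on-disk
// layout directly (which only works for FsDatasetImpl).
interface MaterializedReplica {
  void corruptData() throws IOException;   // damage the block data
  void corruptMeta() throws IOException;   // damage the checksum (crc) file
  void deleteData() throws IOException;    // remove the block data entirely
}

interface FsDatasetTestHooks {
  // Each FsDataset implementation provides its own replica handle.
  MaterializedReplica getMaterializedReplica(String blockPoolId, long blockId)
      throws IOException;
}
{code}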



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9188) Make block corruption related tests FsDataset-agnostic.

2015-10-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940712#comment-14940712
 ] 

Hadoop QA commented on HDFS-9188:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  10m 23s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 11 new or modified test files. |
| {color:green}+1{color} | javac |   9m 58s | There were no new javac warning 
messages. |
| {color:red}-1{color} | release audit |   0m 16s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 47s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 43s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 37s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 51s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   1m 36s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  61m 25s | Tests failed in hadoop-hdfs. |
| | |  90m 43s | |
\\
\\
|| Reason || Tests ||
| Timed out tests | org.apache.hadoop.fs.TestGlobPaths |
|   | org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764687/HDFS-9188.000.patch |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / fd026f5 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12768/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12768/artifact/patchprocess/whitespace.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12768/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12768/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12768/console |


This message was automatically generated.

> Make block corruption related tests FsDataset-agnostic. 
> 
>
> Key: HDFS-9188
> URL: https://issues.apache.org/jira/browse/HDFS-9188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: HDFS, test
>Affects Versions: 2.7.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-9188.000.patch
>
>
> Currently, HDFS does block corruption tests by directly accessing the files 
> stored on the storage directories, which assumes {{FsDatasetImpl}} is the 
> dataset implementation. However, with work like OZone (HDFS-7240) and
> HDFS-8679, there will be different FsDataset implementations.
> So we need a general way to run whitebox tests such as corrupting block and
> crc files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9187) Check if tracer is null before using it

2015-10-01 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940622#comment-14940622
 ] 

stack commented on HDFS-9187:
-

Let me test to see if it fixes the issue I saw above...

> Check if tracer is null before using it
> ---
>
> Key: HDFS-9187
> URL: https://issues.apache.org/jira/browse/HDFS-9187
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tracing
>Affects Versions: 2.8.0
>Reporter: stack
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-9187.001.patch, HDFS-9187.002.patch
>
>
> Saw this where an hbase that has not been updated to htrace-4.0.1 was trying 
> to start:
> {code}
> Oct 1, 5:12:11.861 AM FATAL org.apache.hadoop.hbase.master.HMaster
> Failed to become active master
> java.lang.NullPointerException
> at org.apache.hadoop.fs.Globber.glob(Globber.java:145)
> at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1634)
> at org.apache.hadoop.hbase.util.FSUtils.getTableDirs(FSUtils.java:1372)
> at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:206)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:619)
> at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169)
> at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1481)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-01 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940639#comment-14940639
 ] 

Allen Wittenauer commented on HDFS-9184:


bq.  significant compatibility ... problems

I'm pretty much a big -1 (especially in branch-2) just because of that. This 
will break users in major ways, so the earliest anything like this could happen 
is in trunk.

> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9184.000.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper level job issues it. The upper level callers may be specific 
> Oozie tasks, MR jobs, and hive queries. One scenario is that the namenode 
> (NN) is abused/spammed, the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which is
> obviously not enough. It's common that the same user issues multiple jobs at
> the same time. Even for a single top-level task, tracking back to a specific
> caller in a chain of operations across the whole workflow (e.g. Oozie ->
> Hive -> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information
> across multiple layers. The spans are created in many places and
> interconnected like a tree structure, which relies on offline analysis across
> RPC boundaries. For this use case, {{htrace}} has to be enabled at a 100%
> sampling rate, which introduces significant overhead. Moreover, passing
> additional information (via annotations) other than the span id from the root
> of the tree to the leaves is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there is
> some related discussion on this topic. The final patch implemented the
> tracking id as a part of the delegation token. This protects the tracking
> information from being changed or impersonated. However, Kerberos
> authenticated connections or insecure connections don't have tokens.
> [HADOOP-8779] proposes to use tokens in all scenarios, but that might mean
> changes to several upstream projects and is a major change to their security
> implementation.
> We propose another approach to address this problem. We also treat the HDFS
> audit log as a good place for after-the-fact root cause analysis. We propose
> to put the caller id (e.g. the Hive query id) in thread-locals. Specifically,
> on the client side the thread-local object is passed to the NN as an
> (optional) part of the RPC header, while on the server side the NN retrieves
> it from the header and puts it into the {{Handler}}'s thread-locals. Finally,
> in {{FSNamesystem}}, the HDFS audit logger will record the caller context for
> each operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller
> context. Our proposal is to add a {{signature}} field to the caller context.
> The client chooses whether to provide its signature along with the caller id.
> The operator may need to validate the signature at the time of offline
> analysis. The NN is not responsible for validating the signature online.
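
As a rough sketch of the thread-local approach described above; the class and 
method names here are illustrative, not the actual HDFS-9184 patch:

{code}
// Illustrative sketch only: the caller context travels with the thread on the
// client, rides the RPC header, and is restored on the server handler thread
// so the audit logger can read it.
public final class CallerContextSketch {
  private static final ThreadLocal<CallerContextSketch> CURRENT = new ThreadLocal<>();

  private final String context;      // e.g. a Hive query id or MR job id
  private final byte[] signature;    // optional; validated offline by the operator

  public CallerContextSketch(String context, byte[] signature) {
    this.context = context;
    this.signature = signature;
  }

  public String getContext() { return context; }
  public byte[] getSignature() { return signature; }

  // Client side: set before issuing RPCs so it can be copied into the RPC header.
  public static void setCurrent(CallerContextSketch ctx) { CURRENT.set(ctx); }

  // Server side: the RPC handler restores it here; the audit logger reads it.
  public static CallerContextSketch getCurrent() { return CURRENT.get(); }
}
{code}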



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8979) Clean up checkstyle warnings in hadoop-hdfs-client module

2015-10-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940645#comment-14940645
 ] 

Hadoop QA commented on HDFS-8979:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  19m 59s | Pre-patch trunk has 7 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   8m 34s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 49s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 16s | The applied patch generated 
1 release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 43s | The applied patch generated  
11 new checkstyle issues (total was 1912, now 1172). |
| {color:green}+1{color} | whitespace | 135m 28s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   2m 56s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   1m 38s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 10s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   5m 27s | Pre-build of native portion |
| {color:green}+1{color} | hdfs tests |   0m 43s | Tests passed in 
hadoop-hdfs-client. |
| | | 192m  7s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764676/HDFS-8979.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / fd026f5 |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12764/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs-client.html
 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12764/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12764/artifact/patchprocess/diffcheckstylehadoop-hdfs-client.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12764/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12764/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12764/console |


This message was automatically generated.

> Clean up checkstyle warnings in hadoop-hdfs-client module
> -
>
> Key: HDFS-8979
> URL: https://issues.apache.org/jira/browse/HDFS-8979
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-8979.000.patch, HDFS-8979.001.patch, 
> HDFS-8979.002.patch
>
>
> This jira tracks the effort of cleaning up checkstyle warnings in 
> {{hadoop-hdfs-client}} module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9185) TestRecoverStripedFile is failing

2015-10-01 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-9185:
---
Attachment: HDFS-9185-01.patch

> TestRecoverStripedFile is failing
> -
>
> Key: HDFS-9185
> URL: https://issues.apache.org/jira/browse/HDFS-9185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Critical
> Attachments: HDFS-9185-00.patch, HDFS-9185-01.patch
>
>
> Below is the message taken from build:
> {code}
> Error Message
> Time out waiting for EC block recovery.
> Stacktrace
> java.io.IOException: Time out waiting for EC block recovery.
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:383)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:283)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverAnyBlocks1(TestRecoverStripedFile.java:168)
> {code}
> Reference : https://builds.apache.org/job/PreCommit-HDFS-Build/12758



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-247) A tool to plot the locations of the blocks of a directory

2015-10-01 Thread Pooja Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pooja Gupta reassigned HDFS-247:


Assignee: Pooja Gupta  (was: Avinash Desireddy)

> A tool to plot the locations of the blocks of a directory
> -
>
> Key: HDFS-247
> URL: https://issues.apache.org/jira/browse/HDFS-247
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Owen O'Malley
>Assignee: Pooja Gupta
>  Labels: newbie
>
> It would be very useful to have a command that we could give an HDFS
> directory to, that would use fsck to find the block locations of the data
> files in that directory, group them by host, and display the distribution
> graphically. We did this by hand and it was very useful for finding a skewed
> distribution that was causing performance problems. The tool should also be
> able to group by rack id and generate a similar plot.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9185) TestRecoverStripedFile is failing

2015-10-01 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940669#comment-14940669
 ] 

Rakesh R commented on HDFS-9185:


Thank you [~umamaheswararao], [~jingzhao] for the review comments.

Attached another patch addressing the above comment. Kindly review it again.

> TestRecoverStripedFile is failing
> -
>
> Key: HDFS-9185
> URL: https://issues.apache.org/jira/browse/HDFS-9185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Critical
> Attachments: HDFS-9185-00.patch, HDFS-9185-01.patch
>
>
> Below is the message taken from build:
> {code}
> Error Message
> Time out waiting for EC block recovery.
> Stacktrace
> java.io.IOException: Time out waiting for EC block recovery.
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:383)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:283)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverAnyBlocks1(TestRecoverStripedFile.java:168)
> {code}
> Reference : https://builds.apache.org/job/PreCommit-HDFS-Build/12758



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes

2015-10-01 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940676#comment-14940676
 ] 

Ming Ma commented on HDFS-7877:
---

Maybe we should try to support persistence for the timeout. We can persist the 
maintenance expiration UTC time via some new mechanism discussed in HDFS-9005. 
The clocks can be out of sync among NNs, but we can accept that, given that the 
maintenance timeout precision is on the order of minutes. [~ctrezzo] [~eddyxu], 
thoughts?
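
A tiny sketch of that idea, with made-up names, assuming the expiration is 
stored as an absolute UTC time so that any NN can evaluate it after a failover:

{code}
// Made-up names, for illustration only: persist the absolute deadline rather
// than a relative timeout, so it survives NN restarts/failovers. A few minutes
// of clock skew between NNs only shifts the deadline slightly, which is
// acceptable at minute-level precision.
class MaintenanceStateSketch {
  private final long expirationUtcMs;   // absolute wall-clock deadline

  MaintenanceStateSketch(long timeoutMs) {
    this.expirationUtcMs = System.currentTimeMillis() + timeoutMs;
  }

  boolean expired(long nowUtcMs) {
    return nowUtcMs >= expirationUtcMs;
  }
}
{code}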

> Support maintenance state for datanodes
> ---
>
> Key: HDFS-7877
> URL: https://issues.apache.org/jira/browse/HDFS-7877
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Ming Ma
> Attachments: HDFS-7877-2.patch, HDFS-7877.patch, 
> Supportmaintenancestatefordatanodes-2.pdf, 
> Supportmaintenancestatefordatanodes.pdf
>
>
> This requirement came up during the design for HDFS-7541. Given this feature 
> is mostly independent of upgrade domain feature, it is better to track it 
> under a separate jira. The design and draft patch will be available soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9180) Update excluded DataNodes in DFSStripedOutputStream based on failures in data streamers

2015-10-01 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940679#comment-14940679
 ] 

Yi Liu commented on HDFS-9180:
--

Thanks [~jingzhao] for the work. I have spent some time familiarizing myself 
with what you guys did in HDFS-9040 :).

The patch looks good to me. +1 pending new Jenkins. For the test, maybe we can 
add it and just comment out {{@Test}}.

> Update excluded DataNodes in DFSStripedOutputStream based on failures in data 
> streamers
> ---
>
> Key: HDFS-9180
> URL: https://issues.apache.org/jira/browse/HDFS-9180
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-9180.000.patch, HDFS-9180.001.patch
>
>
> This is a TODO in HDFS-9040: based on the failures all the striped data 
> streamers hit, the DFSStripedOutputStream should keep a record of all the 
> DataNodes that should be excluded.
> This jira will also fix several bugs in the DFSStripedOutputStream. Will 
> provide more details in the comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8979) Clean up checkstyle warnings in hadoop-hdfs-client module

2015-10-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939585#comment-14939585
 ] 

Hadoop QA commented on HDFS-8979:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  17m 28s | Pre-patch trunk has 7 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   8m  7s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 23s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 15s | The applied patch generated 
1 release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 36s | The applied patch generated  
31 new checkstyle issues (total was 1909, now 1267). |
| {color:green}+1{color} | whitespace |  99m 47s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m  5s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 22s | Pre-build of native portion |
| {color:green}+1{color} | hdfs tests |   0m 30s | Tests passed in 
hadoop-hdfs-client. |
| | | 146m  3s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764549/HDFS-8979.000.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 5db371f |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12759/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs-client.html
 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12759/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12759/artifact/patchprocess/diffcheckstylehadoop-hdfs-client.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12759/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12759/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12759/console |


This message was automatically generated.

> Clean up checkstyle warnings in hadoop-hdfs-client module
> -
>
> Key: HDFS-8979
> URL: https://issues.apache.org/jira/browse/HDFS-8979
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-8979.000.patch
>
>
> This jira tracks the effort of cleaning up checkstyle warnings in 
> {{hadoop-hdfs-client}} module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-1172) Blocks in newly completed files are considered under-replicated too quickly

2015-10-01 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940761#comment-14940761
 ] 

Masatake Iwasaki commented on HDFS-1172:


bq. We'd better place the new "adding block to pending replica queue" logic 
only in checkReplication.

Thanks for the comment again. We cannot get the expected nodes in 
{{BlockManager#checkReplication}} because the BlockUnderConstructionFeature has 
already been removed by {{BlockInfo#convertToCompleteBlock}} at that point. I'm 
now trying to update pendingReplications only in the {{completeFile}} code path.

> Blocks in newly completed files are considered under-replicated too quickly
> ---
>
> Key: HDFS-1172
> URL: https://issues.apache.org/jira/browse/HDFS-1172
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 0.21.0
>Reporter: Todd Lipcon
>Assignee: Masatake Iwasaki
> Attachments: HDFS-1172-150907.patch, HDFS-1172.008.patch, 
> HDFS-1172.009.patch, HDFS-1172.010.patch, HDFS-1172.patch, hdfs-1172.txt, 
> hdfs-1172.txt, replicateBlocksFUC.patch, replicateBlocksFUC1.patch, 
> replicateBlocksFUC1.patch
>
>
> I've seen this for a long time, and imagine it's a known issue, but couldn't 
> find an existing JIRA. It often happens that we see the NN schedule 
> replication on the last block of files very quickly after they're completed, 
> before the other DNs in the pipeline have a chance to report the new block. 
> This results in a lot of extra replication work on the cluster, as we 
> replicate the block and then end up with multiple excess replicas which are 
> very quickly deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-7275) Add TLSv1.1,TLSv1.2 to HttpFS

2015-10-01 Thread Vijay Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay Singh reassigned HDFS-7275:
-

Assignee: Vijay Singh

> Add TLSv1.1,TLSv1.2 to HttpFS
> -
>
> Key: HDFS-7275
> URL: https://issues.apache.org/jira/browse/HDFS-7275
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.0
>Reporter: Robert Kanter
>Assignee: Vijay Singh
>
> HDFS-7274 required us to specifically list the versions of TLS that HttpFS 
> supports. With Hadoop 2.7 dropping support for Java 6 and Java 7 supporting 
> TLSv1.1 and TLSv1.2, we should add them to the list.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7275) Add TLSv1.1,TLSv1.2 to HttpFS

2015-10-01 Thread Vijay Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940770#comment-14940770
 ] 

Vijay Singh commented on HDFS-7275:
---

Hi Robert,
I have tested this code change locally and it works fine. I am attaching the 
patch for everyone's review and feedback. Please let me know if you have any 
suggestions and I will make those changes.
For now the change involves modifying the file 
hadoop/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/tomcat/ssl-server.xml.conf
 to include entries for TLSv1.1 and TLSv1.2 on line 73.
This patch is needed by a couple of clients who run curl on Ubuntu or RHEL 7, 
which lets them specify the TLS level while fetching data from HttpFS.

Please provide feedback if any.
The code snippet change looks as follows:
{code:ssl-server.xml.conf|borderStyle=solid}

{code}
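
For reference, the relevant connector entry would look roughly like the 
following; everything except the {{sslEnabledProtocols}} list is a placeholder 
value, so treat this as a sketch rather than the exact patch content:

{code:xml}
<!-- Sketch only: placeholder port/keystore values; the substantive change is
     listing TLSv1, TLSv1.1 and TLSv1.2 in sslEnabledProtocols. -->
<Connector port="14443" protocol="HTTP/1.1" SSLEnabled="true"
           scheme="https" secure="true" clientAuth="false"
           sslEnabledProtocols="TLSv1,TLSv1.1,TLSv1.2"
           keystoreFile="/path/to/keystore.jks" keystorePass="changeit"/>
{code}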
 

> Add TLSv1.1,TLSv1.2 to HttpFS
> -
>
> Key: HDFS-7275
> URL: https://issues.apache.org/jira/browse/HDFS-7275
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.0
>Reporter: Robert Kanter
>Assignee: Vijay Singh
>
> HDFS-7274 required us to specifically list the versions of TLS that HttpFS 
> supports. With Hadoop 2.7 dropping support for Java 6 and Java 7 supporting 
> TLSv1.1 and TLSv1.2, we should add them to the list.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7275) Add TLSv1.1,TLSv1.2 to HttpFS

2015-10-01 Thread Vijay Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940772#comment-14940772
 ] 

Vijay Singh commented on HDFS-7275:
---

The formatting did not work; here is the code snippet again without formatting:
{noformat}

{noformat}


> Add TLSv1.1,TLSv1.2 to HttpFS
> -
>
> Key: HDFS-7275
> URL: https://issues.apache.org/jira/browse/HDFS-7275
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.0
>Reporter: Robert Kanter
>Assignee: Vijay Singh
>
> HDFS-7274 required us to specifically list the versions of TLS that HttpFS 
> supports. With Hadoop 2.7 dropping support for Java 6 and Java 7 supporting 
> TLSv1.1 and TLSv1.2, we should add them to the list.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9185) TestRecoverStripedFile is failing

2015-10-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940766#comment-14940766
 ] 

Hadoop QA commented on HDFS-9185:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  19m 33s | Pre-patch trunk has 7 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 55s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  3s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 15s | The applied patch generated 
1 release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m  9s | The applied patch generated  1 
new checkstyle issues (total was 288, now 285). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 30s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 30s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 10s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 181m  4s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 30s | Tests passed in 
hadoop-hdfs-client. |
| | | 232m  3s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.TestFSNamesystem |
| Timed out tests | org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics |
|   | org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement |
|   | org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache |
|   | org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764715/HDFS-9185-01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / fd026f5 |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12769/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs-client.html
 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12769/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12769/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12769/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12769/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12769/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12769/console |


This message was automatically generated.

> TestRecoverStripedFile is failing
> -
>
> Key: HDFS-9185
> URL: https://issues.apache.org/jira/browse/HDFS-9185
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Critical
> Attachments: HDFS-9185-00.patch, HDFS-9185-01.patch
>
>
> Below is the message taken from build:
> {code}
> Error Message
> Time out waiting for EC block recovery.
> Stacktrace
> java.io.IOException: Time out waiting for EC block recovery.
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.waitForRecoveryFinished(TestRecoverStripedFile.java:383)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.assertFileBlocksRecovery(TestRecoverStripedFile.java:283)
>   at 
> org.apache.hadoop.hdfs.TestRecoverStripedFile.testRecoverAnyBlocks1(TestRecoverStripedFile.java:168)
> {code}
> Reference : https://builds.apache.org/job/PreCommit-HDFS-Build/12758



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-4015) Safemode should count and report orphaned blocks

2015-10-01 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-4015:
---
Attachment: HDFS-4015.002.patch

Hi [~liuml07] [~arpitagarwal], thanks for your reviews. I have fixed all the 
issues mentioned by both of you in this new patch. Please take a look when you 
get a chance.

> Safemode should count and report orphaned blocks
> 
>
> Key: HDFS-4015
> URL: https://issues.apache.org/jira/browse/HDFS-4015
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Todd Lipcon
>Assignee: Anu Engineer
> Attachments: HDFS-4015.001.patch, HDFS-4015.002.patch, 
> dfsAdmin-report_with_forceExit.png, dfsHealth.html.message.png
>
>
> The safemode status currently reports the number of unique reported blocks 
> compared to the total number of blocks referenced by the namespace. However, 
> it does not report the inverse: blocks which are reported by datanodes but 
> not referenced by the namespace.
> In the case that an admin accidentally starts up from an old image, this can 
> be confusing: safemode and fsck will show "corrupt files", which are the 
> files which actually have been deleted but got resurrected by restarting from 
> the old image. This will convince them that they can safely force leave 
> safemode and remove these files -- after all, they know that those files 
> should really have been deleted. However, they're not aware that leaving 
> safemode will also unrecoverably delete a bunch of other block files which 
> have been orphaned due to the namespace rollback.
> I'd like to consider reporting something like: "90 of expected 100 
> blocks have been reported. Additionally, 1 blocks have been reported 
> which do not correspond to any file in the namespace. Forcing exit of 
> safemode will unrecoverably remove those data blocks"
> Whether this statistic is also used for some kind of "inverse safe mode" is 
> the logical next step, but just reporting it as a warning seems easy enough 
> to accomplish and worth doing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9179) fs.defaultFS should not be used on the server side

2015-10-01 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939505#comment-14939505
 ] 

Steve Loughran commented on HDFS-9179:
--

Dan, you're going to break so much stuff here people will hate you forever

> fs.defaultFS should not be used on the server side
> --
>
> Key: HDFS-9179
> URL: https://issues.apache.org/jira/browse/HDFS-9179
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>
> Currently the namenode will bind to the address given by defaultFS if no 
> rpc-address is given.  That behavior is an evolutionary artifact and should 
> be removed.  Instead, the rpc-address should be a required setting for the 
> server side configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)