[jira] [Commented] (HDFS-8716) introduce a new config specifically for safe mode block count

2015-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627589#comment-14627589
 ] 

Hadoop QA commented on HDFS-8716:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m  3s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 37s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 44s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 20s | The applied patch generated  1 
new checkstyle issues (total was 676, now 676). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 20s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 28s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  4s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 160m 56s | Tests failed in hadoop-hdfs. |
| | | 204m 32s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestAppendSnapshotTruncate |
|   | hadoop.hdfs.server.namenode.ha.TestDNFencing |
|   | hadoop.hdfs.TestDistributedFileSystem |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12745376/HDFS-8716.7.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 0a16ee6 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/11708/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11708/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11708/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11708/console |


This message was automatically generated.

> introduce a new config specifically for safe mode block count
> -
>
> Key: HDFS-8716
> URL: https://issues.apache.org/jira/browse/HDFS-8716
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: HDFS-8716.1.patch, HDFS-8716.2.patch, HDFS-8716.3.patch, 
> HDFS-8716.4.patch, HDFS-8716.5.patch, HDFS-8716.6.patch, HDFS-8716.7.patch, 
> HDFS-8716.7.patch
>
>
> During startup, the namenode waits for n replicas of each block to be 
> reported by datanodes before exiting safe mode. Currently n is tied to 
> the min replicas config. We could set min replicas to more than one, but we 
> might want to exit safe mode as soon as each block has one replica reported. 
> This can be worked out by introducing a new config variable for the safe 
> mode block count.
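For illustration, a minimal hdfs-site.xml sketch of the idea. The property name 
{{dfs.namenode.safemode.replication.min}} is an assumption for this example, 
not a confirmed name from the patch:
{code}
<!-- Hypothetical config sketch: decouple safe mode's per-block replica
     threshold from the minimal replication level. -->
<property>
  <name>dfs.namenode.replication.min</name>
  <value>2</value>
</property>
<property>
  <name>dfs.namenode.safemode.replication.min</name>
  <value>1</value>
  <description>Exit safe mode once each block has this many reported
  replicas, independently of dfs.namenode.replication.min.</description>
</property>
{code}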



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8762) Erasure Coding: the log of each streamer should show its index

2015-07-14 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-8762:

Status: Patch Available  (was: Open)

> Erasure Coding: the log of each streamer should show its index
> --
>
> Key: HDFS-8762
> URL: https://issues.apache.org/jira/browse/HDFS-8762
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Li Bo
>Assignee: Li Bo
> Attachments: HDFS-8762-HDFS-7285-001.patch
>
>
> The log in {{DataStreamer}} doesn't show which streamer it's generated from. 
> In order to make log information more convenient for debugging, each log 
> should include the index of the streamer it's generated from. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8762) Erasure Coding: the log of each streamer should show its index

2015-07-14 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-8762:

Attachment: HDFS-8762-HDFS-7285-001.patch

> Erasure Coding: the log of each streamer should show its index
> --
>
> Key: HDFS-8762
> URL: https://issues.apache.org/jira/browse/HDFS-8762
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Li Bo
>Assignee: Li Bo
> Attachments: HDFS-8762-HDFS-7285-001.patch
>
>
> The log in {{DataStreamer}} doesn't show which streamer it's generated from. 
> In order to make log information more convenient for debugging, each log 
> should include the index of the streamer it's generated from. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8742) Inotify: Support event for OP_TRUNCATE

2015-07-14 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627585#comment-14627585
 ] 

Surendra Singh Lilhore commented on HDFS-8742:
--

Thanks [~ajisakaa]  for review and commit

> Inotify: Support event for OP_TRUNCATE
> --
>
> Key: HDFS-8742
> URL: https://issues.apache.org/jira/browse/HDFS-8742
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
> Fix For: 2.8.0
>
> Attachments: HDFS-8742-001.patch, HDFS-8742.patch
>
>
> Currently inotify does not give any event for the truncate operation. The NN 
> should send an event for "Truncate".
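As a usage illustration, a minimal consumer sketch for the new event. It 
assumes the {{Event.EventType.TRUNCATE}} / {{Event.TruncateEvent}} API added by 
this patch, and "hdfs://nn:8020" is a placeholder NameNode address:
{code}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSInotifyEventInputStream;
import org.apache.hadoop.hdfs.client.HdfsAdmin;
import org.apache.hadoop.hdfs.inotify.Event;
import org.apache.hadoop.hdfs.inotify.EventBatch;

public class TruncateEventTail {
  public static void main(String[] args) throws Exception {
    HdfsAdmin admin =
        new HdfsAdmin(URI.create("hdfs://nn:8020"), new Configuration());
    DFSInotifyEventInputStream stream = admin.getInotifyEventStream();
    while (true) {
      EventBatch batch = stream.take();  // blocks until the NN has new events
      for (Event e : batch.getEvents()) {
        if (e.getEventType() == Event.EventType.TRUNCATE) {
          Event.TruncateEvent t = (Event.TruncateEvent) e;
          System.out.println("truncated " + t.getPath()
              + " to " + t.getFileSize() + " bytes");
        }
      }
    }
  }
}
{code}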



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6945) BlockManager should remove a block from excessReplicateMap and decrement ExcessBlocks metric when the block is removed

2015-07-14 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627539#comment-14627539
 ] 

Akira AJISAKA commented on HDFS-6945:
-

If my understanding is correct, HDFS-8616 suggests that this issue should be 
cherry-picked to branch-2.7. I'd like to do this on July 18 if there are no 
further comments.

> BlockManager should remove a block from excessReplicateMap and decrement 
> ExcessBlocks metric when the block is removed
> --
>
> Key: HDFS-6945
> URL: https://issues.apache.org/jira/browse/HDFS-6945
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.0
>Reporter: Akira AJISAKA
>Assignee: Akira AJISAKA
>Priority: Critical
>  Labels: metrics
> Fix For: 2.8.0
>
> Attachments: HDFS-6945-003.patch, HDFS-6945-004.patch, 
> HDFS-6945-005.patch, HDFS-6945.2.patch, HDFS-6945.patch
>
>
> I'm seeing the ExcessBlocks metric increase to more than 300K in some 
> clusters; however, there are no over-replicated blocks (confirmed by fsck).
> After further research, I noticed that when deleting a block, BlockManager 
> does not remove the block from excessReplicateMap or decrement 
> excessBlocksCount.
> Usually the metric is decremented when processing a block report; however, 
> if the block has already been deleted, BlockManager does not remove it from 
> excessReplicateMap or decrement the metric.
> That way the metric and excessReplicateMap can grow indefinitely (i.e., a 
> memory leak can occur).
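To make the leak concrete, here is a self-contained sketch (illustrative names, 
not the actual BlockManager code) of the bookkeeping rule the fix enforces: 
whatever deletes a block must also clean up the excess map and the counter:
{code}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

// Minimal model of the leak described above: if deleteBlock() is never
// called on deletion, both the map and the metric grow without bound.
public class ExcessMapSketch {
  private final Map<String, Set<Long>> excessReplicateMap = new HashMap<>();
  private final AtomicLong excessBlocksCount = new AtomicLong();

  void markExcess(String datanodeUuid, long blockId) {
    if (excessReplicateMap
        .computeIfAbsent(datanodeUuid, k -> new HashSet<>()).add(blockId)) {
      excessBlocksCount.incrementAndGet();
    }
  }

  // The rule the fix enforces: block deletion also removes the block from
  // the excess map and decrements the ExcessBlocks metric.
  void deleteBlock(long blockId) {
    excessReplicateMap.values().removeIf(blocks -> {
      if (blocks.remove(blockId)) {
        excessBlocksCount.decrementAndGet();
      }
      return blocks.isEmpty();
    });
  }
}
{code}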



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-8616) Cherry pick HDFS-6495 for excess block leak

2015-07-14 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA reassigned HDFS-8616:
---

Assignee: Akira AJISAKA

> Cherry pick HDFS-6495 for excess block leak
> ---
>
> Key: HDFS-8616
> URL: https://issues.apache.org/jira/browse/HDFS-8616
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Akira AJISAKA
>
> Busy clusters quickly leak tens or hundreds of thousands of excess blocks 
> which slow BR processing.  HDFS-6495 should be cherry picked into 2.7.x.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8616) Cherry pick HDFS-6495 for excess block leak

2015-07-14 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627537#comment-14627537
 ] 

Akira AJISAKA commented on HDFS-8616:
-

I'd like to cherry-pick HDFS-6945 to branch-2.7 on July 18 if there are no 
further comments.

> Cherry pick HDFS-6495 for excess block leak
> ---
>
> Key: HDFS-8616
> URL: https://issues.apache.org/jira/browse/HDFS-8616
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>
> Busy clusters quickly leak tens or hundreds of thousands of excess blocks 
> which slow BR processing.  HDFS-6495 should be cherry picked into 2.7.x.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7966) New Data Transfer Protocol via HTTP/2

2015-07-14 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627512#comment-14627512
 ] 

Duo Zhang commented on HDFS-7966:
-

I have to say sorry... the performance result above is useless, since the flow 
control part of my code did not work at that time. I found this when I tried 
to transfer a 512MB block and got an OOM.

I have rewritten the flow control part and set up a cluster with 1 NN and 1 DN 
to evaluate the performance. There is a netty bug 
(https://github.com/netty/netty/pull/3929), so I need to modify my code when 
running different tests.

The performance test code is here:
https://github.com/Apache9/hadoop/blob/HDFS-7966-POC/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/http2/PerformanceTest.java

First I ran a large read test on one file with a 1GB block. Each transport ran 
5 times with the commands
{noformat}
./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest http2 /test 1 1 
1073741824
./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest tcp /test 1 1 
1073741824
{noformat}

Note that I set {{dfs.datanode.transferTo.allowed}} to {{false}}, since the 
http2 implementation cannot use transferTo (I'm currently working on 
implementing {{FileRegion}} support in the netty http2 codec; see 
https://github.com/netty/netty/issues/3927).

The result is
{noformat}
*** time based on http2 9953
*** time based on http2 9967
*** time based on http2 9954
*** time based on http2 9985
*** time based on http2 9976

*** time based on tcp 9383
*** time based on tcp 9375
*** time based on tcp 9377
*** time based on tcp 9373
*** time based on tcp 9376
{noformat}

The average latency of http2 is 9967ms, and for tcp it is 9376.8ms.

9967/9376.8 = 1.063, so http2 is about 6% slower than tcp. I think this is an 
acceptable result?

Let me test small reads later and post the results here. Thanks.

> New Data Transfer Protocol via HTTP/2
> -
>
> Key: HDFS-7966
> URL: https://issues.apache.org/jira/browse/HDFS-7966
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Haohui Mai
>Assignee: Qianqian Shi
>  Labels: gsoc, gsoc2015, mentor
> Attachments: GSoC2015_Proposal.pdf, 
> TestHttp2LargeReadPerformance.svg, TestHttp2Performance.svg
>
>
> The current Data Transfer Protocol (DTP) implements a rich set of features 
> that span across multiple layers, including:
> * Connection pooling and authentication (session layer)
> * Encryption (presentation layer)
> * Data writing pipeline (application layer)
> All these features are HDFS-specific and defined by the implementation. As a 
> result, it requires a non-trivial amount of work to implement HDFS clients 
> and servers.
> This jira explores delegating the responsibilities of the session and 
> presentation layers to the HTTP/2 protocol. In particular, HTTP/2 handles 
> connection multiplexing, QoS, authentication and encryption, reducing the 
> scope of DTP to the application layer only. By leveraging an existing HTTP/2 
> library, it should simplify the implementation of both HDFS clients and 
> servers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8777) Erasure Coding: add tests for taking snapshots on EC files

2015-07-14 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8777:

Assignee: Rakesh R

> Erasure Coding: add tests for taking snapshots on EC files
> --
>
> Key: HDFS-8777
> URL: https://issues.apache.org/jira/browse/HDFS-8777
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Rakesh R
>
> We need to add more tests for (EC + snapshots). The tests need to verify the 
> fsimage saving/loading is correct.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8777) Erasure Coding: add tests for taking snapshots on EC files

2015-07-14 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627500#comment-14627500
 ] 

Jing Zhao commented on HDFS-8777:
-

Yes. Let me assign this jira to you.

> Erasure Coding: add tests for taking snapshots on EC files
> --
>
> Key: HDFS-8777
> URL: https://issues.apache.org/jira/browse/HDFS-8777
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>
> We need to add more tests for (EC + snapshots). The tests need to verify the 
> fsimage saving/loading is correct.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8260) Erasure Coding: system test of writing EC file

2015-07-14 Thread Xinwei Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627488#comment-14627488
 ] 

Xinwei Qin  commented on HDFS-8260:
---

Hi [~demongaorui], maybe I misunderstood what you mean. These jiras under the 
umbrella (HDFS-8197) need system test results from a real cluster, not a patch 
in code. So the attached patch does not match your intention, right?

> Erasure Coding:  system test of writing EC file
> ---
>
> Key: HDFS-8260
> URL: https://issues.apache.org/jira/browse/HDFS-8260
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: HDFS-7285
>Reporter: GAO Rui
>Assignee: Xinwei Qin 
> Attachments: HDFS-8260-HDFS-7285.001.patch
>
>
> 1. Normal writing of an EC file (writing without datanode failure).
> 2. Writing an EC file with a tolerable number of datanodes failing.
> 3. Writing an EC file with an intolerable number of datanodes failing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8260) Erasure Coding: system test of writing EC file

2015-07-14 Thread Xinwei Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xinwei Qin  updated HDFS-8260:
--
Attachment: HDFS-8260-HDFS-7285.001.patch

Attaching the system test patch for writing an EC file with some datanodes 
failing. Some tests cannot pass because the issues (HDFS-8704, HDFS-8383) have 
not been fixed yet.

> Erasure Coding:  system test of writing EC file
> ---
>
> Key: HDFS-8260
> URL: https://issues.apache.org/jira/browse/HDFS-8260
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: HDFS-7285
>Reporter: GAO Rui
>Assignee: Xinwei Qin 
> Attachments: HDFS-8260-HDFS-7285.001.patch
>
>
> 1. Normal writing of an EC file (writing without datanode failure).
> 2. Writing an EC file with a tolerable number of datanodes failing.
> 3. Writing an EC file with an intolerable number of datanodes failing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8778) TestBlockReportRateLimiting#testLeaseExpiration can deadlock

2015-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627453#comment-14627453
 ] 

Hadoop QA commented on HDFS-8778:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   7m 40s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 37s | There were no new javac warning 
messages. |
| {color:green}+1{color} | release audit |   0m 19s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 24s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 20s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 31s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   1m  5s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 161m 27s | Tests failed in hadoop-hdfs. |
| | | 184m  0s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestDistributedFileSystem |
|   | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12745354/HDFS-8778.01.patch |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / 0a16ee6 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11707/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11707/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11707/console |


This message was automatically generated.

> TestBlockReportRateLimiting#testLeaseExpiration can deadlock
> 
>
> Key: HDFS-8778
> URL: https://issues.apache.org/jira/browse/HDFS-8778
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.1
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-8778.01.patch
>
>
> {{requestBlockReportLease}} blocks on DataNode registration while holding the 
> NameSystem read lock.
> DataNode registration can block on the NameSystem read lock if a writer gets 
> in the queue.
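A self-contained Java sketch (not HDFS code) of the lock ordering described 
above, using a fair {{ReentrantReadWriteLock}} to stand in for the NameSystem 
lock. Once a writer is queued, the second read acquisition blocks and no 
thread can make progress:
{code}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadLockDeadlockSketch {
  public static void main(String[] args) throws Exception {
    ReentrantReadWriteLock ns = new ReentrantReadWriteLock(true); // fair mode
    CountDownLatch registered = new CountDownLatch(1);

    Thread lease = new Thread(() -> {
      ns.readLock().lock();    // "requestBlockReportLease" holds the read lock
      try {
        registered.await();    // ...and blocks waiting for DN registration
      } catch (InterruptedException ignored) {
      } finally {
        ns.readLock().unlock();
      }
    });
    lease.start();
    Thread.sleep(100);         // let the lease thread take the read lock

    new Thread(() -> {         // a writer gets in the queue
      ns.writeLock().lock();
      ns.writeLock().unlock();
    }).start();
    Thread.sleep(100);

    new Thread(() -> {         // "DN registration" queues behind the writer,
      ns.readLock().lock();    // so countDown() is never reached: deadlock
      registered.countDown();
      ns.readLock().unlock();
    }).start();
  }
}
{code}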



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8716) introduce a new config specifically for safe mode block count

2015-07-14 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated HDFS-8716:
---
Attachment: HDFS-8716.7.patch

> introduce a new config specifically for safe mode block count
> -
>
> Key: HDFS-8716
> URL: https://issues.apache.org/jira/browse/HDFS-8716
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: HDFS-8716.1.patch, HDFS-8716.2.patch, HDFS-8716.3.patch, 
> HDFS-8716.4.patch, HDFS-8716.5.patch, HDFS-8716.6.patch, HDFS-8716.7.patch, 
> HDFS-8716.7.patch
>
>
> During startup, the namenode waits for n replicas of each block to be 
> reported by datanodes before exiting safe mode. Currently n is tied to 
> the min replicas config. We could set min replicas to more than one, but we 
> might want to exit safe mode as soon as each block has one replica reported. 
> This can be worked out by introducing a new config variable for the safe 
> mode block count.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8763) Incremental blockreport order may replicate unnecessary block

2015-07-14 Thread jiangyu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627439#comment-14627439
 ] 

jiangyu commented on HDFS-8763:
---

Hi [~shv], the situation is RBW being reported after FINALIZE. It is rare when 
the cluster is small, but on a larger cluster it is easy to observe. I don't 
know the impact of changing dfs.namenode.replication.min; I will investigate 
it in the code and test.

> Incremental blockreport order may replicate unnecessary block
> -
>
> Key: HDFS-8763
> URL: https://issues.apache.org/jira/browse/HDFS-8763
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.4.0
>Reporter: jiangyu
>Priority: Minor
>
> For our cluster, the NameNode is always very busy, so for every incremental 
> block report the lock contention is heavy.
> The logic of an incremental block report is as follows: the client sends a 
> block to dn1, dn1 mirrors it to dn2, and dn2 mirrors it to dn3. After 
> finishing this block, all datanodes report the newly received block to the 
> namenode. On the NameNode side, all reports go to the method 
> processIncrementalBlockReport in the BlockManager class. But the status of 
> the block reported from dn2 and dn3 is RECEIVING_BLOCK, while for dn1 it is 
> RECEIVED_BLOCK. It is okay if dn2 and dn3 report before dn1 (which is 
> common), but in a busy environment it is easy to find dn1 reporting before 
> dn2 or dn3. Let's assume dn2 reports first, dn1 second, and dn3 third.
> So dn1's report will addStoredBlock and find that the replica count of this 
> block has not reached the expected number (which is 3), so the block is 
> added to the neededReplications structure, and soon some node in the 
> pipeline (dn1 or dn2) is asked to replicate it to dn4. After some time, dn4 
> and dn3 both report this block, and then one node is chosen to invalidate.
> Here is one log I found in our cluster:
> 2015-07-08 01:05:34,675 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> allocateBlock: 
> /logs/***_bigdata_spam/logs/application_1435099124107_470749/xx.xx.4.62_45454.tmp.
>  BP-1386326728-xx.xx.2.131-1382089338395 
> blk_3194502674_2121080184{blockUCState=UNDER_CONSTRUCTION, 
> primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-a7c0f8f6-2399-4980-9479-efa08487b7b3:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-c75145a0-ed63-4180-87ee-d48ccaa647c5:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-15a4dc8e-5b7d-449f-a941-6dced45e6f07:NORMAL|RBW]]}
> 2015-07-08 01:05:34,689 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: xx.xx.7.75:50010 is added to 
> blk_3194502674_2121080184{blockUCState=UNDER_CONSTRUCTION, 
> primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-15a4dc8e-5b7d-449f-a941-6dced45e6f07:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-74ed264f-da43-4cc3-9fa9-164ba99f752a:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-56121ce1-8991-45b3-95bc-2a5357991512:NORMAL|RBW]]}
>  size 0
> 2015-07-08 01:05:34,689 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: xx.xx.4.62:50010 is added to 
> blk_3194502674_2121080184{blockUCState=UNDER_CONSTRUCTION, 
> primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-15a4dc8e-5b7d-449f-a941-6dced45e6f07:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-74ed264f-da43-4cc3-9fa9-164ba99f752a:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-56121ce1-8991-45b3-95bc-2a5357991512:NORMAL|RBW]]}
>  size 0
> 2015-07-08 01:05:35,003 INFO BlockStateChange: BLOCK* ask xx.xx.4.62:50010 to 
> replicate blk_3194502674_2121080184 to datanode(s) xx.xx.4.65:50010
> 2015-07-08 01:05:35,403 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: xx.xx.7.73:50010 is added to blk_3194502674_2121080184 size 
> 67750
> 2015-07-08 01:05:35,833 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: xx.xx.4.65:50010 is added to blk_3194502674_2121080184 size 
> 67750
> 2015-07-08 01:05:35,833 INFO BlockStateChange: BLOCK* InvalidateBlocks: add 
> blk_3194502674_2121080184 to xx.xx.7.75:50010
> 2015-07-08 01:05:35,833 INFO BlockStateChange: BLOCK* chooseExcessReplicates: 
> (xx.xx.7.75:50010, blk_3194502674_2121080184) is added to invalidated blocks 
> set
> 2015-07-08 01:05:35,852 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> InvalidateBlocks: ask xx.xx.7.75:50010 to delete [blk_3194502674_2121080184, 
> blk_3194497594_2121075104]
> On some days the number of occurrences of this situation can be 40, which is 
> not good for performance and wastes network bandwidth.
> Our base version is Hadoop 2.4, and I have checked Hadoop 2.7.1 and did not 
> find any difference.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8777) Erasure Coding: add tests for taking snapshots on EC files

2015-07-14 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627440#comment-14627440
 ] 

Rakesh R commented on HDFS-8777:


[~jingzhao] I hope this task is to add unit tests. Could you please take a 
look at the HDFS-8266 system test task? As an initial attempt, I had attached 
a patch with unit tests for EC + snapshots in that jira. I think I will move 
the unit tests here, and that jira will focus only on system test cases. 
Probably I will include a few more test cases covering fsimage saving/loading 
scenarios. Does this make sense?

> Erasure Coding: add tests for taking snapshots on EC files
> --
>
> Key: HDFS-8777
> URL: https://issues.apache.org/jira/browse/HDFS-8777
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>
> We need to add more tests for (EC + snapshots). The tests need to verify the 
> fsimage saving/loading is correct.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile

2015-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627392#comment-14627392
 ] 

Hadoop QA commented on HDFS-8058:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  14m 58s | Findbugs (version ) appears to 
be broken on HDFS-7285. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 8 new or modified test files. |
| {color:green}+1{color} | javac |   7m 24s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 36s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 16s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 38s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | whitespace |   0m 11s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 36s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 23s | The patch appears to introduce 5 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 15s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 187m 37s | Tests failed in hadoop-hdfs. |
| | | 229m 33s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.server.namenode.TestFileTruncate |
| Timed out tests | 
org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12745335/HDFS-8058-HDFS-7285.009.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | HDFS-7285 / 0a93712 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11705/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11705/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11705/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11705/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11705/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11705/console |


This message was automatically generated.

> Erasure coding: use BlockInfo[] for both striped and contiguous blocks in 
> INodeFile
> ---
>
> Key: HDFS-8058
> URL: https://issues.apache.org/jira/browse/HDFS-8058
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Yi Liu
>Assignee: Zhe Zhang
> Attachments: HDFS-8058-HDFS-7285.003.patch, 
> HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, 
> HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, 
> HDFS-8058-HDFS-7285.008.patch, HDFS-8058-HDFS-7285.009.patch, 
> HDFS-8058.001.patch, HDFS-8058.002.patch
>
>
> This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous 
> blocks in INodeFile.
> Currently {{FileWithStripedBlocksFeature}} keeps a separate list for striped 
> blocks, the methods there duplicate those in INodeFile, and the current code 
> needs to check {{isStriped}} and then do different things. Also, if a file 
> is striped, the {{blocks}} field in INodeFile still occupies a reference's 
> worth of memory.
> This is not necessary, and we can use the same {{blocks}} to make the code 
> clearer.
> I keep {{FileWithStripedBlocksFeature}} empty for future use: I will file a 
> new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from 
> *BlockInfoStriped* to INodeFile, since ideally they are the same for all 
> striped blocks in a file, and storing them in each block wastes NN memory.
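As a rough sketch of the shape this refactoring gives the inode (names 
simplified for illustration; this is not the actual patch):
{code}
// Illustrative only: one blocks array in the inode; the element type, not a
// separate feature list, distinguishes striped from contiguous storage.
abstract class BlockInfo { /* common block metadata */ }
class BlockInfoContiguous extends BlockInfo { }
class BlockInfoStriped extends BlockInfo {
  // dataBlockNum/parityBlockNum may later move to the inode, per the
  // follow-up JIRA mentioned above.
}

class INodeFileSketch {
  private BlockInfo[] blocks;  // shared storage for both layouts

  boolean isStriped() {
    return blocks != null && blocks.length > 0
        && blocks[0] instanceof BlockInfoStriped;
  }
}
{code}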



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8768) Erasure Coding: block group ID displayed in WebUI is not consistent with fsck

2015-07-14 Thread Walter Su (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Walter Su updated HDFS-8768:

Summary: Erasure Coding: block group ID displayed in WebUI is not 
consistent with fsck  (was: The display of Erasure Code file block group ID in 
WebUI is not consistent with fsck command)

> Erasure Coding: block group ID displayed in WebUI is not consistent with fsck
> -
>
> Key: HDFS-8768
> URL: https://issues.apache.org/jira/browse/HDFS-8768
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: GAO Rui
> Attachments: Screen Shot 2015-07-14 at 15.33.08.png
>
>
> For example, in the WebUI (usually namenode port 50070), one erasure-coded 
> file with one block group was displayed as in the attached screenshot 
> [^Screen Shot 2015-07-14 at 15.33.08.png]. But with the fsck command, the 
> block group of the same file was displayed as: {{0. 
> BP-1130999596-172.23.38.10-1433791629728:blk_-9223372036854740160_3384 
> len=6438256640}}
> After checking block file names on the datanodes, we believe the WebUI may 
> have some problem with erasure-coded block group display.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8769) Erasure Coding: unit test for SequentialBlockGroupIdGenerator

2015-07-14 Thread Walter Su (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Walter Su updated HDFS-8769:

Summary: Erasure Coding: unit test for SequentialBlockGroupIdGenerator  
(was: unit test for SequentialBlockGroupIdGenerator)

> Erasure Coding: unit test for SequentialBlockGroupIdGenerator
> -
>
> Key: HDFS-8769
> URL: https://issues.apache.org/jira/browse/HDFS-8769
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Walter Su
>Assignee: Rakesh R
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7483) Display information per tier on the Namenode UI

2015-07-14 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627342#comment-14627342
 ] 

Benoy Antony commented on HDFS-7483:


I don't think the suggested approach will work, but I could be wrong. 
[~wheat9], could you please provide a working code snippet for this approach?

> Display information per tier on the Namenode UI
> ---
>
> Key: HDFS-7483
> URL: https://issues.apache.org/jira/browse/HDFS-7483
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Benoy Antony
>Assignee: Benoy Antony
> Attachments: HDFS-7483-001.patch, HDFS-7483-002.patch, overview.png, 
> storagetypes.png, storagetypes_withnostorage.png, withOneStorageType.png, 
> withTwoStorageType.png
>
>
> If cluster has different types of storage, it is useful to display the 
> storage information per type. 
> The information will be available via JMX (HDFS-7390)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7483) Display information per tier on the Namenode UI

2015-07-14 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627322#comment-14627322
 ] 

Haohui Mai commented on HDFS-7483:
--

{code}
+  var helpers = {
+'percentage': function(chunk, context, bodies, params) {
+  var a = dust.helpers.tap(params.a, chunk, context);
+  var b = dust.helpers.tap(params.b, chunk, context);
+  var f = parseFloat(a)/parseFloat(b);
+  return chunk.write(Math.round(f * 1) / 100 + '%');
+}
+  };
+
+  for(var key in helpers) {
+ dust.helpers[key] = helpers[key];
+  }
+
{code}

A better approach is to do it through the math helper 
(http://www.dustjs.com/guides/dust-helpers/) and to format it with the 
{{fmt_percentage}} helper.

> Display information per tier on the Namenode UI
> ---
>
> Key: HDFS-7483
> URL: https://issues.apache.org/jira/browse/HDFS-7483
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Benoy Antony
>Assignee: Benoy Antony
> Attachments: HDFS-7483-001.patch, HDFS-7483-002.patch, overview.png, 
> storagetypes.png, storagetypes_withnostorage.png, withOneStorageType.png, 
> withTwoStorageType.png
>
>
> If cluster has different types of storage, it is useful to display the 
> storage information per type. 
> The information will be available via JMX (HDFS-7390)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8619) Erasure Coding: revisit replica counting for striped blocks

2015-07-14 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8619:

Attachment: HDFS-8619-HDFS-7285.001.patch

Updated the patch. Instead of tracking BlockInfo in CorruptReplicasMap, the 
001 patch uses a Block with the same ID as the striped block group.

> Erasure Coding: revisit replica counting for striped blocks
> ---
>
> Key: HDFS-8619
> URL: https://issues.apache.org/jira/browse/HDFS-8619
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-8619-HDFS-7285.001.patch, HDFS-8619.000.patch
>
>
> Currently we use the same {{BlockManager#countNodes}} method for striped 
> blocks, which simply treats each internal block as a replica. However, for a 
> striped block we may have more complicated scenarios, e.g., multiple 
> replicas of the first internal block while some other internal blocks are 
> missing. Using the current {{countNodes}} method can lead to wrong decisions 
> in these scenarios.
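A minimal sketch of why naive replica counting misleads for striped blocks. 
The rule shown (count distinct internal-block indices) illustrates the problem 
and is not the patch itself:
{code}
import java.util.BitSet;

public class StripedCountSketch {
  // Count distinct internal blocks rather than raw storage locations.
  static int countDistinctInternalBlocks(int[] reportedIndices, int groupSize) {
    BitSet seen = new BitSet(groupSize);
    for (int idx : reportedIndices) {
      seen.set(idx);
    }
    return seen.cardinality();
  }

  public static void main(String[] args) {
    // Three copies of internal block 0 plus blocks 1 and 2 in a 9-block
    // RS(6,3) group: a naive replica count says 5, but only 3 distinct
    // internal blocks are present.
    int[] reported = {0, 0, 0, 1, 2};
    System.out.println(countDistinctInternalBlocks(reported, 9)); // prints 3
  }
}
{code}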



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8778) TestBlockReportRateLimiting#testLeaseExpiration can deadlock

2015-07-14 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-8778:

Attachment: HDFS-8778.01.patch

The fix is to make {{requestBlockReportLease}} non-blocking.

Also ensure that the cluster is shut down on test failure.



> TestBlockReportRateLimiting#testLeaseExpiration can deadlock
> 
>
> Key: HDFS-8778
> URL: https://issues.apache.org/jira/browse/HDFS-8778
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.1
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-8778.01.patch
>
>
> {{requestBlockReportLease}} blocks on DataNode registration while holding the 
> NameSystem read lock.
> DataNode registration can block on the NameSystem read lock if a writer gets 
> in the queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8778) TestBlockReportRateLimiting#testLeaseExpiration can deadlock

2015-07-14 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-8778:

Status: Patch Available  (was: Open)

> TestBlockReportRateLimiting#testLeaseExpiration can deadlock
> 
>
> Key: HDFS-8778
> URL: https://issues.apache.org/jira/browse/HDFS-8778
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.1
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-8778.01.patch
>
>
> {{requestBlockReportLease}} blocks on DataNode registration while holding the 
> NameSystem read lock.
> DataNode registration can block on the NameSystem read lock if a writer gets 
> in the queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8778) TestBlockReportRateLimiting#testLeaseExpiration can deadlock

2015-07-14 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDFS-8778:
---

 Summary: TestBlockReportRateLimiting#testLeaseExpiration can 
deadlock
 Key: HDFS-8778
 URL: https://issues.apache.org/jira/browse/HDFS-8778
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.7.1
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal


{{requestBlockReportLease}} blocks on DataNode registration while holding the 
NameSystem read lock.

DataNode registration can block on the NameSystem read lock if a writer gets in 
the queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8433) blockToken is not set in constructInternalBlock and parseStripedBlockGroup in StripedBlockUtil

2015-07-14 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627215#comment-14627215
 ] 

Walter Su commented on HDFS-8433:
-

I was thinking
{code}
  public void readFields(DataInput in) throws IOException {
...
for (int i = 0; i < length; i++) {
  modes.add(WritableUtils.readEnum(in, AccessMode.class));
}
+   idRange = WritableUtils.readVLong(in);
  }

  @Override
  public void write(DataOutput out) throws IOException {
...
for (AccessMode aMode : modes) {
  WritableUtils.writeEnum(out, aMode);
}
+   WritableUtils.writeVLong(out, idRange);
  }
{code}
A token generated by a new NN can be parsed by an old DN.
A token generated by an old DN can be parsed by an old DN.
A token generated by an old DN can't be parsed by a new DN.

I had hoped the DN did not generate tokens, in which case there would be no 
problem. Actually, {{DataNode.DataTransfer}} generates tokens, so now I think 
this is a bad idea. I miss protobuf now, but there is nothing we can do.

> blockToken is not set in constructInternalBlock and parseStripedBlockGroup in 
> StripedBlockUtil
> --
>
> Key: HDFS-8433
> URL: https://issues.apache.org/jira/browse/HDFS-8433
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Walter Su
> Attachments: HDFS-8433-HDFS-7285.02.patch, HDFS-8433.00.patch, 
> HDFS-8433.01.patch
>
>
> The blockToken provided in LocatedStripedBlock is not used to create 
> LocatedBlock in constructInternalBlock and parseStripedBlockGroup in 
> StripedBlockUtil.
> We should also add ec tests with security on.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8716) introduce a new config specifically for safe mode block count

2015-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627210#comment-14627210
 ] 

Hadoop QA commented on HDFS-8716:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  18m  6s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m  4s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  9s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 21s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 29s | The applied patch generated  1 
new checkstyle issues (total was 676, now 676). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 23s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 31s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  2s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 159m 31s | Tests failed in hadoop-hdfs. |
| | | 205m 14s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.TestSecondaryWebUi |
|   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
|   | hadoop.hdfs.server.namenode.TestQuotaByStorageType |
|   | hadoop.hdfs.server.namenode.TestLargeDirectoryDelete |
|   | hadoop.hdfs.TestDistributedFileSystem |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12745311/HDFS-8716.7.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 59388a8 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/11703/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11703/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11703/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11703/console |


This message was automatically generated.

> introduce a new config specifically for safe mode block count
> -
>
> Key: HDFS-8716
> URL: https://issues.apache.org/jira/browse/HDFS-8716
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: HDFS-8716.1.patch, HDFS-8716.2.patch, HDFS-8716.3.patch, 
> HDFS-8716.4.patch, HDFS-8716.5.patch, HDFS-8716.6.patch, HDFS-8716.7.patch
>
>
> During startup, the namenode waits for n replicas of each block to be 
> reported by datanodes before exiting safe mode. Currently n is tied to 
> the min replicas config. We could set min replicas to more than one, but we 
> might want to exit safe mode as soon as each block has one replica reported. 
> This can be worked out by introducing a new config variable for the safe 
> mode block count.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile

2015-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627196#comment-14627196
 ] 

Hadoop QA commented on HDFS-8058:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  17m 59s | Pre-patch HDFS-7285 has 5 
extant Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 8 new or modified test files. |
| {color:green}+1{color} | javac |   7m 38s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 38s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 15s | The applied patch generated 
1 release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 22s | The applied patch generated  
10 new checkstyle issues (total was 643, now 634). |
| {color:red}-1{color} | whitespace |   0m  8s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m 20s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 15s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 186m 20s | Tests failed in hadoop-hdfs. |
| | | 233m 10s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.TestFileTruncate |
| Timed out tests | 
org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12745307/HDFS-8058-HDFS-7285.008.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | HDFS-7285 / 6ff957b |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11702/artifact/patchprocess/HDFS-7285FindbugsWarningshadoop-hdfs.html
 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11702/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/11702/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11702/artifact/patchprocess/whitespace.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11702/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11702/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11702/console |


This message was automatically generated.

> Erasure coding: use BlockInfo[] for both striped and contiguous blocks in 
> INodeFile
> ---
>
> Key: HDFS-8058
> URL: https://issues.apache.org/jira/browse/HDFS-8058
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Yi Liu
>Assignee: Zhe Zhang
> Attachments: HDFS-8058-HDFS-7285.003.patch, 
> HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, 
> HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, 
> HDFS-8058-HDFS-7285.008.patch, HDFS-8058-HDFS-7285.009.patch, 
> HDFS-8058.001.patch, HDFS-8058.002.patch
>
>
> This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous 
> blocks in INodeFile.
> Currently {{FileWithStripedBlocksFeature}} keeps a separate list for striped 
> blocks, the methods there duplicate those in INodeFile, and the current code 
> needs to check {{isStriped}} and then do different things. Also, if a file 
> is striped, the {{blocks}} field in INodeFile still occupies a reference's 
> worth of memory.
> This is not necessary, and we can use the same {{blocks}} to make the code 
> clearer.
> I keep {{FileWithStripedBlocksFeature}} empty for future use: I will file a 
> new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from 
> *BlockInfoStriped* to INodeFile, since ideally they are the same for all 
> striped blocks in a file, and storing them in each block wastes NN memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile

2015-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627181#comment-14627181
 ] 

Hadoop QA commented on HDFS-8058:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  15m 59s | Findbugs (version ) appears to 
be broken on HDFS-7285. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 8 new or modified test files. |
| {color:green}+1{color} | javac |   7m 41s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 36s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 15s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 39s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | whitespace |   0m  7s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 23s | The patch appears to introduce 5 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 16s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |   0m 27s | Tests failed in hadoop-hdfs. |
| | |  43m 39s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed build | hadoop-hdfs |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12745335/HDFS-8058-HDFS-7285.009.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | HDFS-7285 / 0a93712 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11706/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11706/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11706/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11706/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11706/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11706/console |


This message was automatically generated.

> Erasure coding: use BlockInfo[] for both striped and contiguous blocks in 
> INodeFile
> ---
>
> Key: HDFS-8058
> URL: https://issues.apache.org/jira/browse/HDFS-8058
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Yi Liu
>Assignee: Zhe Zhang
> Attachments: HDFS-8058-HDFS-7285.003.patch, 
> HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, 
> HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, 
> HDFS-8058-HDFS-7285.008.patch, HDFS-8058-HDFS-7285.009.patch, 
> HDFS-8058.001.patch, HDFS-8058.002.patch
>
>
> This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous 
> blocks in INodeFile.
> Currently {{FileWithStripedBlocksFeature}} keeps a separate list for striped 
> blocks, its methods duplicate those in INodeFile, and the current code has 
> to check {{isStriped}} and then branch. Also, if a file is striped, the 
> {{blocks}} field in INodeFile still occupies a reference's worth of memory.
> None of this is necessary; we can use the same {{blocks}} array to make the 
> code clearer.
> I keep {{FileWithStripedBlocksFeature}} empty for future use: I will file a 
> new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from 
> *BlockInfoStriped* to INodeFile, since ideally they are the same for all 
> striped blocks in a file, and storing them per block wastes NN memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile

2015-07-14 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627180#comment-14627180
 ] 

Jing Zhao commented on HDFS-8058:
-

Looks like the timeout of TestDFSStripedOutputStreamWithFailure is related. 
Could you please take a look at it, Zhe?

> Erasure coding: use BlockInfo[] for both striped and contiguous blocks in 
> INodeFile
> ---
>
> Key: HDFS-8058
> URL: https://issues.apache.org/jira/browse/HDFS-8058
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Yi Liu
>Assignee: Zhe Zhang
> Attachments: HDFS-8058-HDFS-7285.003.patch, 
> HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, 
> HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, 
> HDFS-8058-HDFS-7285.008.patch, HDFS-8058-HDFS-7285.009.patch, 
> HDFS-8058.001.patch, HDFS-8058.002.patch
>
>
> This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous 
> blocks in INodeFile.
> Currently {{FileWithStripedBlocksFeature}} keeps a separate list for striped 
> blocks, its methods duplicate those in INodeFile, and the current code has 
> to check {{isStriped}} and then branch. Also, if a file is striped, the 
> {{blocks}} field in INodeFile still occupies a reference's worth of memory.
> None of this is necessary; we can use the same {{blocks}} array to make the 
> code clearer.
> I keep {{FileWithStripedBlocksFeature}} empty for future use: I will file a 
> new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from 
> *BlockInfoStriped* to INodeFile, since ideally they are the same for all 
> striped blocks in a file, and storing them per block wastes NN memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile

2015-07-14 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627112#comment-14627112
 ] 

Jing Zhao commented on HDFS-8058:
-

Created HDFS-8777 to add more tests for (snapshot + EC).

> Erasure coding: use BlockInfo[] for both striped and contiguous blocks in 
> INodeFile
> ---
>
> Key: HDFS-8058
> URL: https://issues.apache.org/jira/browse/HDFS-8058
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Yi Liu
>Assignee: Zhe Zhang
> Attachments: HDFS-8058-HDFS-7285.003.patch, 
> HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, 
> HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, 
> HDFS-8058-HDFS-7285.008.patch, HDFS-8058-HDFS-7285.009.patch, 
> HDFS-8058.001.patch, HDFS-8058.002.patch
>
>
> This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous 
> blocks in INodeFile.
> Currently {{FileWithStripedBlocksFeature}} keeps a separate list for striped 
> blocks, its methods duplicate those in INodeFile, and the current code has 
> to check {{isStriped}} and then branch. Also, if a file is striped, the 
> {{blocks}} field in INodeFile still occupies a reference's worth of memory.
> None of this is necessary; we can use the same {{blocks}} array to make the 
> code clearer.
> I keep {{FileWithStripedBlocksFeature}} empty for future use: I will file a 
> new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from 
> *BlockInfoStriped* to INodeFile, since ideally they are the same for all 
> striped blocks in a file, and storing them per block wastes NN memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8777) Erasure Coding: add tests for taking snapshots on EC files

2015-07-14 Thread Jing Zhao (JIRA)
Jing Zhao created HDFS-8777:
---

 Summary: Erasure Coding: add tests for taking snapshots on EC files
 Key: HDFS-8777
 URL: https://issues.apache.org/jira/browse/HDFS-8777
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao


We need to add more tests for (EC + snapshots). The tests need to verify that 
fsimage saving/loading is correct.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile

2015-07-14 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627104#comment-14627104
 ] 

Jing Zhao commented on HDFS-8058:
-

Thanks for the update, Zhe. +1 for the 09 patch.

{{testTruncateWithDataNodesRestartImmediately}} has been fixed in trunk 
recently so we can ignore it in the feature branch now.

> Erasure coding: use BlockInfo[] for both striped and contiguous blocks in 
> INodeFile
> ---
>
> Key: HDFS-8058
> URL: https://issues.apache.org/jira/browse/HDFS-8058
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Yi Liu
>Assignee: Zhe Zhang
> Attachments: HDFS-8058-HDFS-7285.003.patch, 
> HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, 
> HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, 
> HDFS-8058-HDFS-7285.008.patch, HDFS-8058-HDFS-7285.009.patch, 
> HDFS-8058.001.patch, HDFS-8058.002.patch
>
>
> This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous 
> blocks in INodeFile.
> Currently {{FileWithStripedBlocksFeature}} keeps a separate list for striped 
> blocks, its methods duplicate those in INodeFile, and the current code has 
> to check {{isStriped}} and then branch. Also, if a file is striped, the 
> {{blocks}} field in INodeFile still occupies a reference's worth of memory.
> None of this is necessary; we can use the same {{blocks}} array to make the 
> code clearer.
> I keep {{FileWithStripedBlocksFeature}} empty for future use: I will file a 
> new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from 
> *BlockInfoStriped* to INodeFile, since ideally they are the same for all 
> striped blocks in a file, and storing them per block wastes NN memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile

2015-07-14 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8058:

Attachment: HDFS-8058-HDFS-7285.009.patch

Good catch, Jing! Uploading the 09 patch to address the issue.

{{testTruncateWithDataNodesRestartImmediately}} fails even without the patch. 
We should do some more debugging around it. To verify the new change I ran 
{{TestFSImage}}, {{TestINodeFile}} and {{TestStripedINodeFile}} locally, and 
they all passed.

> Erasure coding: use BlockInfo[] for both striped and contiguous blocks in 
> INodeFile
> ---
>
> Key: HDFS-8058
> URL: https://issues.apache.org/jira/browse/HDFS-8058
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Yi Liu
>Assignee: Zhe Zhang
> Attachments: HDFS-8058-HDFS-7285.003.patch, 
> HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, 
> HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, 
> HDFS-8058-HDFS-7285.008.patch, HDFS-8058-HDFS-7285.009.patch, 
> HDFS-8058.001.patch, HDFS-8058.002.patch
>
>
> This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous 
> blocks in INodeFile.
> Currently {{FileWithStripedBlocksFeature}} keeps a separate list for striped 
> blocks, its methods duplicate those in INodeFile, and the current code has 
> to check {{isStriped}} and then branch. Also, if a file is striped, the 
> {{blocks}} field in INodeFile still occupies a reference's worth of memory.
> None of this is necessary; we can use the same {{blocks}} array to make the 
> code clearer.
> I keep {{FileWithStripedBlocksFeature}} empty for future use: I will file a 
> new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from 
> *BlockInfoStriped* to INodeFile, since ideally they are the same for all 
> striped blocks in a file, and storing them per block wastes NN memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8759) Implement remote block reader in libhdfspp

2015-07-14 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627037#comment-14627037
 ] 

Jing Zhao commented on HDFS-8759:
-

Agree with [~James Clampffer] that we need detailed documentation, especially 
for classes like Status. But it's fine to do this in a separate jira. 
+1 for the 001 patch.

> Implement remote block reader in libhdfspp
> --
>
> Key: HDFS-8759
> URL: https://issues.apache.org/jira/browse/HDFS-8759
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-8759.000.patch, HDFS-8759.001.patch
>
>
> This jira tracks the effort of implementing the remote block reader that 
> communicates with DN in libhdfspp.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile

2015-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627017#comment-14627017
 ] 

Hadoop QA commented on HDFS-8058:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  15m 13s | Findbugs (version ) appears to 
be broken on HDFS-7285. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 7 new or modified test files. |
| {color:green}+1{color} | javac |   7m 35s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 37s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 15s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 37s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  7s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 27s | The patch appears to introduce 5 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 16s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 188m 12s | Tests failed in hadoop-hdfs. |
| | | 230m 36s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.server.namenode.TestFileTruncate |
| Timed out tests | 
org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12745196/HDFS-8058-HDFS-7285.007.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | HDFS-7285 / b1e6429 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11701/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11701/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11701/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11701/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11701/console |


This message was automatically generated.

> Erasure coding: use BlockInfo[] for both striped and contiguous blocks in 
> INodeFile
> ---
>
> Key: HDFS-8058
> URL: https://issues.apache.org/jira/browse/HDFS-8058
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Yi Liu
>Assignee: Zhe Zhang
> Attachments: HDFS-8058-HDFS-7285.003.patch, 
> HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, 
> HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, 
> HDFS-8058-HDFS-7285.008.patch, HDFS-8058.001.patch, HDFS-8058.002.patch
>
>
> This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous 
> blocks in INodeFile.
> Currently {{FileWithStripedBlocksFeature}} keeps a separate list for striped 
> blocks, its methods duplicate those in INodeFile, and the current code has 
> to check {{isStriped}} and then branch. Also, if a file is striped, the 
> {{blocks}} field in INodeFile still occupies a reference's worth of memory.
> None of this is necessary; we can use the same {{blocks}} array to make the 
> code clearer.
> I keep {{FileWithStripedBlocksFeature}} empty for future use: I will file a 
> new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from 
> *BlockInfoStriped* to INodeFile, since ideally they are the same for all 
> striped blocks in a file, and storing them per block wastes NN memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8433) blockToken is not set in constructInternalBlock and parseStripedBlockGroup in StripedBlockUtil

2015-07-14 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627005#comment-14627005
 ] 

Jing Zhao commented on HDFS-8433:
-

bq. This time, we don't need BlockIdRange class. We extends the fields of 
BlockTokenIdentifier. (See BlockTokenIdentifier#readFields(..) / write(..) )

Could you share more details? I'm not sure I catch the idea here. How do we 
avoid changing readFields/writeFields while still adding a new field?
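
One common Writable pattern that may be what the quoted suggestion alludes to 
(a sketch with hypothetical names, not the actual patch): append the new 
field at the end of the serialized form and treat it as optional on read, so 
identifiers written by old writers still parse.

{code}
// Sketch only: the "optional trailing field" Writable pattern, with
// hypothetical names. New data is appended after all existing fields and
// treated as optional on read, so old serialized forms still parse.
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

public class OptionalTrailingField {
  long existingField;  // present in every serialized form
  long newField = -1;  // appended later; -1 means "absent"

  void readFields(DataInput in) throws IOException {
    existingField = in.readLong();  // read old fields exactly as before
    try {
      newField = in.readLong();     // new trailing field, if present
    } catch (EOFException e) {
      // Written by an old writer: trailing field absent, keep the default.
    }
  }

  public static void main(String[] args) throws IOException {
    byte[] oldForm = {0, 0, 0, 0, 0, 0, 0, 42};  // only the old field
    OptionalTrailingField t = new OptionalTrailingField();
    t.readFields(new DataInputStream(new ByteArrayInputStream(oldForm)));
    System.out.println(t.existingField + " " + t.newField);  // 42 -1
  }
}
{code}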

> blockToken is not set in constructInternalBlock and parseStripedBlockGroup in 
> StripedBlockUtil
> --
>
> Key: HDFS-8433
> URL: https://issues.apache.org/jira/browse/HDFS-8433
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Walter Su
> Attachments: HDFS-8433-HDFS-7285.02.patch, HDFS-8433.00.patch, 
> HDFS-8433.01.patch
>
>
> The blockToken provided in LocatedStripedBlock is not used to create 
> LocatedBlock in constructInternalBlock and parseStripedBlockGroup in 
> StripedBlockUtil.
> We should also add EC tests with security on.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8776) Decom manager should not be active on standby

2015-07-14 Thread Daryn Sharp (JIRA)
Daryn Sharp created HDFS-8776:
-

 Summary: Decom manager should not be active on standby
 Key: HDFS-8776
 URL: https://issues.apache.org/jira/browse/HDFS-8776
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.6.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp


The decommission manager should not be actively processing on the standby.

The decomm manager goes through the costly computation of determining that 
every block on the node requires replication, yet doesn't queue anything for 
replication because it's in standby. While doing so it holds the namesystem 
write lock, causing DNs to time out on heartbeats or IBRs; the NN purges the 
call queue of timed-out clients and processes some heartbeats/IBRs before the 
decomm manager locks up the namesystem again. Nodes attempting to register 
will be sending full BRs, which are more costly to send and discard than a 
heartbeat.

If a failover is required, the standby will likely struggle very hard not to 
GC while "catching up" on its queued IBRs while DNs continue to fill the call 
queue and time out.
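
The shape of the guard this issue argues for, as a minimal sketch with 
hypothetical names (the real decommission manager is considerably more 
involved):

{code}
// Sketch only (hypothetical names): skip decommission scanning entirely
// while in standby, rather than scanning and discarding the work.
public class DecomManagerSketch {
  interface HaState {
    boolean isStandby();
  }

  private final HaState ha;

  DecomManagerSketch(HaState ha) {
    this.ha = ha;
  }

  void runMonitorCycle() {
    if (ha.isStandby()) {
      // Don't take the namesystem write lock or scan blocks here:
      // replication work can't be queued on the standby anyway.
      return;
    }
    scanNodesAndQueueReplication();
  }

  private void scanNodesAndQueueReplication() {
    // expensive per-node, per-block scan elided
  }
}
{code}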



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8775) SASL support for data transfer protocol in libhdfspp

2015-07-14 Thread Haohui Mai (JIRA)
Haohui Mai created HDFS-8775:


 Summary: SASL support for data transfer protocol in libhdfspp
 Key: HDFS-8775
 URL: https://issues.apache.org/jira/browse/HDFS-8775
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai


This jira proposes to implement basic SASL support for the data transfer 
protocol, which will allow libhdfspp to talk to secure clusters.

Support for encryption is deferred to subsequent jiras.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8734) Erasure Coding: fix one cell need two packets

2015-07-14 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8734:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: HDFS-7285
   Status: Resolved  (was: Patch Available)

I've committed this to the feature branch. Thanks for the contribution, Walter! 
Thanks for the review, Bo!

> Erasure Coding: fix one cell need two packets
> -
>
> Key: HDFS-8734
> URL: https://issues.apache.org/jira/browse/HDFS-8734
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Walter Su
>Assignee: Walter Su
> Fix For: HDFS-7285
>
> Attachments: HDFS-8734-HDFS-7285.01.patch, HDFS-8734.01.patch
>
>
> The default WritePacketSize is 64k, and the current default cellSize is 
> also 64k. We would like one cell to consume exactly one packet; in fact it 
> does not. By default:
> chunkSize = 516 (512 data + 4 checksum)
> packetSize = 64k
> chunksPerPacket = 126 (see DFSOutputStream#computePacketChunkSize for 
> details)
> data bytes in one packet = 126 * 512 = 64512
> cellSize = 65536
> When the first packet is full (with 64512 data bytes), there are still 
> 65536 - 64512 = 1024 bytes left.
> {code}
> super.writeChunk(bytes, offset, len, checksum, ckoff, cklen);
> // cell is full and current packet has not been enqueued,
> if (cellFull && currentPacket != null) {
>   enqueueCurrentPacketFull();
> }
> {code}
> When the last 1024 bytes of the cell are written, we hit {{cellFull}} and 
> create another packet.
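
The arithmetic above can be checked directly; this sketch assumes the 33-byte 
maximum packet header reserved by PacketHeader.PKT_MAX_HEADER_LEN (treat that 
exact constant as an assumption):

{code}
// The arithmetic from the description, checked in code. The 33-byte
// maximum packet header is assumed here; everything else follows from it.
public class CellPacketMath {
  public static void main(String[] args) {
    int bytesPerChecksum = 512, checksumSize = 4;
    int chunkSize = bytesPerChecksum + checksumSize;         // 516
    int packetSize = 64 * 1024;                              // 65536
    int bodySize = packetSize - 33;                          // minus header
    int chunksPerPacket = Math.max(bodySize / chunkSize, 1); // 126
    int dataPerPacket = chunksPerPacket * bytesPerChecksum;  // 64512
    int cellSize = 64 * 1024;                                // 65536
    System.out.println("chunksPerPacket  = " + chunksPerPacket);
    System.out.println("data per packet  = " + dataPerPacket);
    System.out.println("left in the cell = " + (cellSize - dataPerPacket));
  }
}
{code}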



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8734) Erasure Coding: fix one cell need two packets

2015-07-14 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626973#comment-14626973
 ] 

Jing Zhao commented on HDFS-8734:
-

The patch looks good to me. +1.

One concern is that we now have more and more variables to track the states of 
the streamers. We can revisit them later and maybe do some code cleanup.

> Erasure Coding: fix one cell need two packets
> -
>
> Key: HDFS-8734
> URL: https://issues.apache.org/jira/browse/HDFS-8734
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Walter Su
>Assignee: Walter Su
> Attachments: HDFS-8734-HDFS-7285.01.patch, HDFS-8734.01.patch
>
>
> The default WritePacketSize is 64k, and the current default cellSize is 
> also 64k. We would like one cell to consume exactly one packet; in fact it 
> does not. By default:
> chunkSize = 516 (512 data + 4 checksum)
> packetSize = 64k
> chunksPerPacket = 126 (see DFSOutputStream#computePacketChunkSize for 
> details)
> data bytes in one packet = 126 * 512 = 64512
> cellSize = 65536
> When the first packet is full (with 64512 data bytes), there are still 
> 65536 - 64512 = 1024 bytes left.
> {code}
> super.writeChunk(bytes, offset, len, checksum, ckoff, cklen);
> // cell is full and current packet has not been enqueued,
> if (cellFull && currentPacket != null) {
>   enqueueCurrentPacketFull();
> }
> {code}
> When the last 1024 bytes of the cell are written, we hit {{cellFull}} and 
> create another packet.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8774) Implement FileSystem and InputStream API for libhdfspp

2015-07-14 Thread Haohui Mai (JIRA)
Haohui Mai created HDFS-8774:


 Summary: Implement FileSystem and InputStream API for libhdfspp
 Key: HDFS-8774
 URL: https://issues.apache.org/jira/browse/HDFS-8774
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: HDFS-8707


This jira proposes to implement FileSystem and InputStream APIs for libhdfspp.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client

2015-07-14 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626969#comment-14626969
 ] 

Arun Suresh commented on HDFS-7858:
---

[~arpitagarwal], apologies for sitting on this...

I was trying to refactor this as per [~jingzhao]'s suggestion (replacing 
RetryInvocationHandler with RequestHedgingInvocationHandler). Unfortunately, 
it turned out to have a much more far-reaching impact: technically, request 
hedging is different from retry, so the whole policy framework etc. would 
need to be refactored.

If everyone is OK with the current approach, we can punt the larger 
refactoring to another JIRA, and I can incorporate [~arpitagarwal]'s 
suggestion (skip the standby for subsequent requests) and provide a quick 
patch.

> Improve HA Namenode Failover detection on the client
> 
>
> Key: HDFS-7858
> URL: https://issues.apache.org/jira/browse/HDFS-7858
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>  Labels: BB2015-05-TBR
> Attachments: HDFS-7858.1.patch, HDFS-7858.2.patch, HDFS-7858.2.patch, 
> HDFS-7858.3.patch
>
>
> In an HA deployment, clients are configured with the hostnames of both the 
> Active and Standby Namenodes. Clients will first try one of the NNs 
> (non-deterministically), and if it is a standby NN, it will tell the client 
> to retry the request on the other Namenode.
> If the client happens to talk to the Standby first, and the standby is 
> undergoing GC or is otherwise busy, those clients might not get a response 
> soon enough to try the other NN.
> Proposed approach to solve this:
> 1) Since ZooKeeper is already used as the failover controller, the clients 
> could ask ZK which Namenode is active before contacting it.
> 2) Long-lived DFSClients would have a ZK watch configured which fires when 
> there is a failover, so they do not have to query ZK every time to find the 
> active NN.
> 3) Clients can also cache the last active NN in the user's home directory 
> (~/.lastNN) so that short-lived clients can try that Namenode first before 
> querying ZK.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client

2015-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626963#comment-14626963
 ] 

Hadoop QA commented on HDFS-7858:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12702886/HDFS-7858.3.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 979c9ca |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11704/console |


This message was automatically generated.

> Improve HA Namenode Failover detection on the client
> 
>
> Key: HDFS-7858
> URL: https://issues.apache.org/jira/browse/HDFS-7858
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>  Labels: BB2015-05-TBR
> Attachments: HDFS-7858.1.patch, HDFS-7858.2.patch, HDFS-7858.2.patch, 
> HDFS-7858.3.patch
>
>
> In an HA deployment, clients are configured with the hostnames of both the 
> Active and Standby Namenodes. Clients will first try one of the NNs 
> (non-deterministically), and if it is a standby NN, it will tell the client 
> to retry the request on the other Namenode.
> If the client happens to talk to the Standby first, and the standby is 
> undergoing GC or is otherwise busy, those clients might not get a response 
> soon enough to try the other NN.
> Proposed approach to solve this:
> 1) Since ZooKeeper is already used as the failover controller, the clients 
> could ask ZK which Namenode is active before contacting it.
> 2) Long-lived DFSClients would have a ZK watch configured which fires when 
> there is a failover, so they do not have to query ZK every time to find the 
> active NN.
> 3) Clients can also cache the last active NN in the user's home directory 
> (~/.lastNN) so that short-lived clients can try that Namenode first before 
> querying ZK.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8742) Inotify: Support event for OP_TRUNCATE

2015-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626957#comment-14626957
 ] 

Hudson commented on HDFS-8742:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8164 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8164/])
HDFS-8742. Inotify: Support event for OP_TRUNCATE. Contributed by Surendra 
Singh Lilhore. (aajisaka: rev 979c9ca2ca89e99dc7165abfa29c78d66de43d9a)
* hadoop-hdfs-project/hadoop-hdfs-client/src/main/proto/inotify.proto
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInotifyEventInputStream.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/InotifyFSEditLogOpTranslator.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/inotify/Event.java


> Inotify: Support event for OP_TRUNCATE
> --
>
> Key: HDFS-8742
> URL: https://issues.apache.org/jira/browse/HDFS-8742
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
> Fix For: 2.8.0
>
> Attachments: HDFS-8742-001.patch, HDFS-8742.patch
>
>
> Currently inotify does not emit any event for the truncate operation. The 
> NN should send an event for "Truncate".
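
For readers consuming the new event, a sketch of an inotify watcher; the 
TruncateEvent getter names are assumed to follow the pattern of the existing 
inotify events, so verify them against the committed Event.java. The NN URI 
is a placeholder.

{code}
// Sketch of a consumer watching for the new truncate events. The
// TruncateEvent getters are assumed from the pattern of existing events.
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSInotifyEventInputStream;
import org.apache.hadoop.hdfs.client.HdfsAdmin;
import org.apache.hadoop.hdfs.inotify.Event;
import org.apache.hadoop.hdfs.inotify.EventBatch;

public class TruncateWatcher {
  public static void main(String[] args) throws Exception {
    HdfsAdmin admin =
        new HdfsAdmin(URI.create("hdfs://nn:8020"), new Configuration());
    DFSInotifyEventInputStream stream = admin.getInotifyEventStream();
    while (true) {
      EventBatch batch = stream.take();  // blocks until the next batch
      for (Event e : batch.getEvents()) {
        if (e.getEventType() == Event.EventType.TRUNCATE) {
          Event.TruncateEvent te = (Event.TruncateEvent) e;
          System.out.println("truncated " + te.getPath()
              + " to " + te.getFileSize() + " bytes");
        }
      }
    }
  }
}
{code}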



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile

2015-07-14 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626956#comment-14626956
 ] 

Jing Zhao commented on HDFS-8058:
-

Here {{fileInPb.getIsStriped()}} should be {{file.isStriped}}, since we have 
not persisted the isStriped information into FileDiff. Alternatively, we can 
move "setIsStriped(n.isStriped())" into {{buildINodeFile}}; the latter is 
probably cleaner. Let's also create a separate jira to add more tests for the 
(EC + snapshot) scenario.
{code}
-  (byte)fileInPb.getStoragePolicyID(), xAttrs);
+  (byte)fileInPb.getStoragePolicyID(), xAttrs, 
fileInPb.getIsStriped());
{code}

Other than this, +1 if Jenkins runs fine.

> Erasure coding: use BlockInfo[] for both striped and contiguous blocks in 
> INodeFile
> ---
>
> Key: HDFS-8058
> URL: https://issues.apache.org/jira/browse/HDFS-8058
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Yi Liu
>Assignee: Zhe Zhang
> Attachments: HDFS-8058-HDFS-7285.003.patch, 
> HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, 
> HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, 
> HDFS-8058-HDFS-7285.008.patch, HDFS-8058.001.patch, HDFS-8058.002.patch
>
>
> This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous 
> blocks in INodeFile.
> Currently {{FileWithStripedBlocksFeature}} keeps a separate list for striped 
> blocks, its methods duplicate those in INodeFile, and the current code has 
> to check {{isStriped}} and then branch. Also, if a file is striped, the 
> {{blocks}} field in INodeFile still occupies a reference's worth of memory.
> None of this is necessary; we can use the same {{blocks}} array to make the 
> code clearer.
> I keep {{FileWithStripedBlocksFeature}} empty for future use: I will file a 
> new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from 
> *BlockInfoStriped* to INodeFile, since ideally they are the same for all 
> striped blocks in a file, and storing them per block wastes NN memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8764) Generate Hadoop RPC stubs from protobuf definitions

2015-07-14 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-8764:
-
Attachment: HDFS-8764.000.patch

> Generate Hadoop RPC stubs from protobuf definitions
> ---
>
> Key: HDFS-8764
> URL: https://issues.apache.org/jira/browse/HDFS-8764
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-8764.000.patch
>
>
> It would be nice to have the RPC stubs generated from the protobuf 
> definitions, similar to what HADOOP-10388 has achieved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client

2015-07-14 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626950#comment-14626950
 ] 

Arpit Agarwal commented on HDFS-7858:
-

Hi [~asuresh], were you thinking of posting an updated patch? The overall 
approach looks good.

One comment from a quick look: RequestHedgingProxyProvider sends every 
request to both NNs. Should it skip the standby for subsequent requests?
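
The hedging idea in miniature, as a sketch with hypothetical types (the real 
RequestHedgingProxyProvider works at the RPC invocation-handler layer): 
invoke both proxies concurrently and take the first successful answer; the 
caller can then remember which proxy won and prefer it, which is one way to 
"skip the standby for subsequent requests".

{code}
// Sketch only: request hedging in miniature, with hypothetical types.
// Both namenode proxies are invoked concurrently; a standby fails its
// call, so the active's response is the one returned.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class HedgedCall {
  interface NameNodeProxy {
    String getFileInfo(String path) throws Exception;
  }

  static String hedged(List<NameNodeProxy> proxies, String path)
      throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(proxies.size());
    try {
      List<Callable<String>> calls = new ArrayList<>();
      for (NameNodeProxy p : proxies) {
        calls.add(() -> p.getFileInfo(path));  // standby throws here
      }
      // invokeAny returns the first successful result, cancelling the
      // rest; a caller could remember the winning proxy afterwards.
      return pool.invokeAny(calls);
    } finally {
      pool.shutdownNow();
    }
  }
}
{code}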

> Improve HA Namenode Failover detection on the client
> 
>
> Key: HDFS-7858
> URL: https://issues.apache.org/jira/browse/HDFS-7858
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>  Labels: BB2015-05-TBR
> Attachments: HDFS-7858.1.patch, HDFS-7858.2.patch, HDFS-7858.2.patch, 
> HDFS-7858.3.patch
>
>
> In an HA deployment, clients are configured with the hostnames of both the 
> Active and Standby Namenodes. Clients will first try one of the NNs 
> (non-deterministically), and if it is a standby NN, it will tell the client 
> to retry the request on the other Namenode.
> If the client happens to talk to the Standby first, and the standby is 
> undergoing GC or is otherwise busy, those clients might not get a response 
> soon enough to try the other NN.
> Proposed approach to solve this:
> 1) Since ZooKeeper is already used as the failover controller, the clients 
> could ask ZK which Namenode is active before contacting it.
> 2) Long-lived DFSClients would have a ZK watch configured which fires when 
> there is a failover, so they do not have to query ZK every time to find the 
> active NN.
> 3) Clients can also cache the last active NN in the user's home directory 
> (~/.lastNN) so that short-lived clients can try that Namenode first before 
> querying ZK.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-8758) Implement the continuation library for libhdfspp

2015-07-14 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai resolved HDFS-8758.
--
  Resolution: Fixed
Hadoop Flags: Reviewed
   Fix Version/s: HDFS-8707
Target Version/s: HDFS-8707

Committed to HDFS-8707. Thanks Jing for reviews.

> Implement the continuation library for libhdfspp
> 
>
> Key: HDFS-8758
> URL: https://issues.apache.org/jira/browse/HDFS-8758
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: HDFS-8707
>
> Attachments: HDFS-8758.000.patch
>
>
> libhdfspp uses continuations as basic building blocks to implement 
> asynchronous operations. This jira imports the continuation library into the 
> repository.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8742) Inotify: Support event for OP_TRUNCATE

2015-07-14 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HDFS-8742:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

I've committed this to trunk and branch-2. Thanks [~surendrasingh] for the 
contribution!

> Inotify: Support event for OP_TRUNCATE
> --
>
> Key: HDFS-8742
> URL: https://issues.apache.org/jira/browse/HDFS-8742
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
> Fix For: 2.8.0
>
> Attachments: HDFS-8742-001.patch, HDFS-8742.patch
>
>
> Currently inotify does not emit any event for the truncate operation. The 
> NN should send an event for "Truncate".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8759) Implement remote block reader in libhdfspp

2015-07-14 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-8759:
-
Attachment: HDFS-8759.001.patch

> Implement remote block reader in libhdfspp
> --
>
> Key: HDFS-8759
> URL: https://issues.apache.org/jira/browse/HDFS-8759
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-8759.000.patch, HDFS-8759.001.patch
>
>
> This jira tracks the effort of implementing the remote block reader that 
> communicates with DN in libhdfspp.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8742) Inotify: Support event for OP_TRUNCATE

2015-07-14 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626921#comment-14626921
 ] 

Akira AJISAKA commented on HDFS-8742:
-

+1, the test failure looks unrelated to the patch. I confirmed the test passed 
locally.

> Inotify: Support event for OP_TRUNCATE
> --
>
> Key: HDFS-8742
> URL: https://issues.apache.org/jira/browse/HDFS-8742
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
> Attachments: HDFS-8742-001.patch, HDFS-8742.patch
>
>
> Currently inotify does not emit any event for the truncate operation. The 
> NN should send an event for "Truncate".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8722) Optimize datanode writes for small writes and flushes

2015-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626895#comment-14626895
 ] 

Hudson commented on HDFS-8722:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8163 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8163/])
HDFS-8722. Optimize datanode writes for small writes and flushes. Contributed 
by Kihwal Lee (kihwal: rev 59388a801514d6af64ef27fbf246d8054f1dcc74)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Optimize datanode writes for small writes and flushes
> -
>
> Key: HDFS-8722
> URL: https://issues.apache.org/jira/browse/HDFS-8722
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Fix For: 2.7.2
>
> Attachments: HDFS-8722.patch, HDFS-8722.v1.patch
>
>
> After the data corruption fix in HDFS-4660, the CRC recalculation for a 
> partial chunk is executed more frequently if the client repeatedly writes a 
> few bytes and calls hflush/hsync. This is because the generic logic forces 
> a CRC recalculation whenever the on-disk data is not CRC chunk aligned. 
> Prior to HDFS-4660, the datanode blindly accepted whatever CRC the client 
> provided if the incoming data was chunk-aligned; this was the source of the 
> corruption.
> We can still optimize for the most common case, where a client repeatedly 
> writes a small number of bytes followed by hflush/hsync with no pipeline 
> recovery or append, by allowing the previous behavior for this specific 
> case. If the incoming data has a duplicate portion and it starts at the 
> last chunk boundary before the partial chunk on disk, the datanode can use 
> the checksum supplied by the client without redoing the checksum on its 
> own. This reduces disk reads as well as CPU load for the checksum 
> calculation.
> If the incoming packet data goes back further than the last on-disk chunk 
> boundary, the datanode will still do a recalculation, but that happens only 
> rarely, during pipeline recoveries. Thus the optimization for this specific 
> case should be sufficient to speed up the vast majority of writes.
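
The boundary condition described above, reduced to arithmetic in a sketch 
with hypothetical names:

{code}
// Sketch (hypothetical names) of the boundary test described above: the
// client checksum can be reused only when the duplicate portion of the
// incoming packet starts exactly at the last chunk boundary preceding
// the partial chunk already on disk.
public class ChecksumReuseCheck {
  static boolean canReuseClientChecksum(long onDiskBytes,
                                        long packetStartOffset,
                                        int bytesPerChecksum) {
    long lastChunkBoundary =
        (onDiskBytes / bytesPerChecksum) * bytesPerChecksum;
    // Overlap reaching further back (e.g. pipeline recovery) still
    // forces a recalculation.
    return packetStartOffset == lastChunkBoundary;
  }

  public static void main(String[] args) {
    // 1000 bytes on disk, 512-byte chunks: boundary is at offset 512.
    System.out.println(canReuseClientChecksum(1000, 512, 512));  // true
    System.out.println(canReuseClientChecksum(1000, 0, 512));    // false
  }
}
{code}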



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8722) Optimize datanode writes for small writes and flushes

2015-07-14 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-8722:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.7.2
   Status: Resolved  (was: Patch Available)

> Optimize datanode writes for small writes and flushes
> -
>
> Key: HDFS-8722
> URL: https://issues.apache.org/jira/browse/HDFS-8722
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Fix For: 2.7.2
>
> Attachments: HDFS-8722.patch, HDFS-8722.v1.patch
>
>
> After the data corruption fix in HDFS-4660, the CRC recalculation for a 
> partial chunk is executed more frequently if the client repeatedly writes a 
> few bytes and calls hflush/hsync. This is because the generic logic forces 
> a CRC recalculation whenever the on-disk data is not CRC chunk aligned. 
> Prior to HDFS-4660, the datanode blindly accepted whatever CRC the client 
> provided if the incoming data was chunk-aligned; this was the source of the 
> corruption.
> We can still optimize for the most common case, where a client repeatedly 
> writes a small number of bytes followed by hflush/hsync with no pipeline 
> recovery or append, by allowing the previous behavior for this specific 
> case. If the incoming data has a duplicate portion and it starts at the 
> last chunk boundary before the partial chunk on disk, the datanode can use 
> the checksum supplied by the client without redoing the checksum on its 
> own. This reduces disk reads as well as CPU load for the checksum 
> calculation.
> If the incoming packet data goes back further than the last on-disk chunk 
> boundary, the datanode will still do a recalculation, but that happens only 
> rarely, during pipeline recoveries. Thus the optimization for this specific 
> case should be sufficient to speed up the vast majority of writes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8722) Optimize datanode writes for small writes and flushes

2015-07-14 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626888#comment-14626888
 ] 

Kihwal Lee commented on HDFS-8722:
--

Thanks for the review, Arpit. I've committed this to trunk, branch-2 and 
branch-2.7. 

> Optimize datanode writes for small writes and flushes
> -
>
> Key: HDFS-8722
> URL: https://issues.apache.org/jira/browse/HDFS-8722
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Fix For: 2.7.2
>
> Attachments: HDFS-8722.patch, HDFS-8722.v1.patch
>
>
> After the data corruption fix in HDFS-4660, the CRC recalculation for a 
> partial chunk is executed more frequently if the client repeatedly writes a 
> few bytes and calls hflush/hsync. This is because the generic logic forces 
> a CRC recalculation whenever the on-disk data is not CRC chunk aligned. 
> Prior to HDFS-4660, the datanode blindly accepted whatever CRC the client 
> provided if the incoming data was chunk-aligned; this was the source of the 
> corruption.
> We can still optimize for the most common case, where a client repeatedly 
> writes a small number of bytes followed by hflush/hsync with no pipeline 
> recovery or append, by allowing the previous behavior for this specific 
> case. If the incoming data has a duplicate portion and it starts at the 
> last chunk boundary before the partial chunk on disk, the datanode can use 
> the checksum supplied by the client without redoing the checksum on its 
> own. This reduces disk reads as well as CPU load for the checksum 
> calculation.
> If the incoming packet data goes back further than the last on-disk chunk 
> boundary, the datanode will still do a recalculation, but that happens only 
> rarely, during pipeline recoveries. Thus the optimization for this specific 
> case should be sufficient to speed up the vast majority of writes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8716) introduce a new config specifically for safe mode block count

2015-07-14 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated HDFS-8716:
---
Attachment: HDFS-8716.7.patch

Added a unit test.

> introduce a new config specifically for safe mode block count
> -
>
> Key: HDFS-8716
> URL: https://issues.apache.org/jira/browse/HDFS-8716
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: HDFS-8716.1.patch, HDFS-8716.2.patch, HDFS-8716.3.patch, 
> HDFS-8716.4.patch, HDFS-8716.5.patch, HDFS-8716.6.patch, HDFS-8716.7.patch
>
>
> During start-up, the namenode waits for n replicas of each block to be 
> reported by datanodes before exiting safe mode. Currently n is tied to the 
> min replicas config. We could set min replicas to more than one, but we 
> might want to exit safe mode as soon as each block has one replica 
> reported. This can be achieved by introducing a new config variable for the 
> safe mode block count.
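
What the decoupling could look like to an operator, as a sketch; the key name 
{{dfs.namenode.safemode.replication.min}} is the one this issue proposes and 
should be treated as tentative until the patch lands.

{code}
// Sketch: decoupling the safe-mode threshold from the write-path minimum.
// The key "dfs.namenode.safemode.replication.min" is tentative until the
// patch lands; "dfs.namenode.replication.min" is the standard key.
import org.apache.hadoop.conf.Configuration;

public class SafeModeReplicationConfig {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.setInt("dfs.namenode.replication.min", 2);           // write path
    conf.setInt("dfs.namenode.safemode.replication.min", 1);  // SM exit
    System.out.println(
        conf.getInt("dfs.namenode.safemode.replication.min", 1));
  }
}
{code}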



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8763) Incremental blockreport order may replicate unnecessary block

2015-07-14 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626839#comment-14626839
 ] 

Konstantin Shvachko commented on HDFS-8763:
---

Blocks should not be scheduled for replication while they are 
UNDER_CONSTRUCTION. Could you check whether that is the case? If so, it is a 
bug. If not, then your DNs seem to be quite busy because of too many small 
blocks.
- You can try increasing {{dfs.namenode.replication.min}} to 2 (the default 
is 1); that way the NN should wait for 2 replicas to be reported by DNs 
before scheduling replication. A minimal config sketch follows below.
- Also, as you know, HDFS does not "like" small files, where small means 
< 128MB. You may try to encourage your users to combine data into larger 
blobs.
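
A sketch of the first suggestion; {{dfs.namenode.replication.min}} is the 
standard key and can be set in hdfs-site.xml or programmatically:

{code}
// Sketch: raising the minimum replication the NN waits for before
// scheduling re-replication. The key below is the standard one; it can
// equally be set in hdfs-site.xml.
import org.apache.hadoop.conf.Configuration;

public class MinReplicationConfig {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.setInt("dfs.namenode.replication.min", 2);  // default is 1
    System.out.println(conf.getInt("dfs.namenode.replication.min", 1));
  }
}
{code}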

> Incremental blockreport order may replicate unnecessary block
> -
>
> Key: HDFS-8763
> URL: https://issues.apache.org/jira/browse/HDFS-8763
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.4.0
>Reporter: jiangyu
>Priority: Minor
>
> For our cluster, the NameNode is always very busy, so for every incremental 
> block report the lock contention is heavy.
> The logic of an incremental block report is as follows: the client sends 
> the block to dn1, dn1 mirrors it to dn2, and dn2 mirrors it to dn3. After 
> finishing the block, every datanode reports the newly received block to the 
> namenode, where all reports go through 
> BlockManager#processIncrementalBlockReport. The status of the block 
> reported from dn2 and dn3 is RECEIVING_BLOCK, while for dn1 it is 
> RECEIVED_BLOCK. It is fine if dn2 and dn3 report before dn1 (the common 
> case), but in a busy environment dn1 can easily report before dn2 or dn3. 
> Let's assume dn2 reports first, dn1 second, and dn3 third.
> When dn1 reports, addStoredBlock finds that the replica count of this block 
> has not reached the expected number (3), so the block is added to 
> neededReplications, and soon some node in the pipeline (dn1 or dn2) is 
> asked to replicate it to dn4. After some time, dn4 and dn3 both report the 
> block, and one replica is chosen to be invalidated.
> Here is one log I found in our cluster:
> 2015-07-08 01:05:34,675 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> allocateBlock: 
> /logs/***_bigdata_spam/logs/application_1435099124107_470749/xx.xx.4.62_45454.tmp.
>  BP-1386326728-xx.xx.2.131-1382089338395 
> blk_3194502674_2121080184{blockUCState=UNDER_CONSTRUCTION, 
> primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-a7c0f8f6-2399-4980-9479-efa08487b7b3:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-c75145a0-ed63-4180-87ee-d48ccaa647c5:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-15a4dc8e-5b7d-449f-a941-6dced45e6f07:NORMAL|RBW]]}
> 2015-07-08 01:05:34,689 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: xx.xx.7.75:50010 is added to 
> blk_3194502674_2121080184{blockUCState=UNDER_CONSTRUCTION, 
> primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-15a4dc8e-5b7d-449f-a941-6dced45e6f07:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-74ed264f-da43-4cc3-9fa9-164ba99f752a:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-56121ce1-8991-45b3-95bc-2a5357991512:NORMAL|RBW]]}
>  size 0
> 2015-07-08 01:05:34,689 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: xx.xx.4.62:50010 is added to 
> blk_3194502674_2121080184{blockUCState=UNDER_CONSTRUCTION, 
> primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-15a4dc8e-5b7d-449f-a941-6dced45e6f07:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-74ed264f-da43-4cc3-9fa9-164ba99f752a:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-56121ce1-8991-45b3-95bc-2a5357991512:NORMAL|RBW]]}
>  size 0
> 2015-07-08 01:05:35,003 INFO BlockStateChange: BLOCK* ask xx.xx.4.62:50010 to 
> replicate blk_3194502674_2121080184 to datanode(s) xx.xx.4.65:50010
> 2015-07-08 01:05:35,403 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: xx.xx.7.73:50010 is added to blk_3194502674_2121080184 size 
> 67750
> 2015-07-08 01:05:35,833 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: xx.xx.4.65:50010 is added to blk_3194502674_2121080184 size 
> 67750
> 2015-07-08 01:05:35,833 INFO BlockStateChange: BLOCK* InvalidateBlocks: add 
> blk_3194502674_2121080184 to xx.xx.7.75:50010
> 2015-07-08 01:05:35,833 INFO BlockStateChange: BLOCK* chooseExcessReplicates: 
> (xx.xx.7.75:50010, blk_3194502674_2121080184) is added to invalidated blocks 
> set
> 2015-07-08 01:05:35,852 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> InvalidateBlocks: ask xx.xx.7.75:50010 to delete [blk_3194502674_2121080184, 
> blk_3194497594_2121075104]
> On some days this situation can occur 40 times, which is not good for 
> the performance and wa

[jira] [Updated] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile

2015-07-14 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8058:

Attachment: HDFS-8058-HDFS-7285.008.patch

Thanks Jing for reviewing again! Uploading a new patch to address the two 
issues.

When creating a snapshot copy in {{FSImageFormat}}, {{isStriped}} is set to 
false. IIRC we don't support EC on non-PB images. Let me know if that's 
correct.

> Erasure coding: use BlockInfo[] for both striped and contiguous blocks in 
> INodeFile
> ---
>
> Key: HDFS-8058
> URL: https://issues.apache.org/jira/browse/HDFS-8058
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Yi Liu
>Assignee: Zhe Zhang
> Attachments: HDFS-8058-HDFS-7285.003.patch, 
> HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, 
> HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, 
> HDFS-8058-HDFS-7285.008.patch, HDFS-8058.001.patch, HDFS-8058.002.patch
>
>
> This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous 
> blocks in INodeFile.
> Currently {{FileWithStripedBlocksFeature}} keeps a separate list for striped 
> blocks, its methods duplicate those in INodeFile, and the current code has 
> to check {{isStriped}} and then branch. Also, if a file is striped, the 
> {{blocks}} field in INodeFile still occupies a reference's worth of memory.
> None of this is necessary; we can use the same {{blocks}} array to make the 
> code clearer.
> I keep {{FileWithStripedBlocksFeature}} empty for future use: I will file a 
> new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from 
> *BlockInfoStriped* to INodeFile, since ideally they are the same for all 
> striped blocks in a file, and storing them per block wastes NN memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile

2015-07-14 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626810#comment-14626810
 ] 

Jing Zhao commented on HDFS-8058:
-

The 007 patch looks good to me. Just two minor points:
# In {{INodeFileAttributes}}, it's better to keep the {{isStriped}} attribute 
of the snapshot copy the same as the INodeFile's. Maybe we can add a boolean 
parameter to {{INodeFileAttributes.SnapshotCopy}}'s constructor, and in 
{{FSImageFormatPBSnapshot#loadFileDiffList}} pass in {{file.isStriped}}.
{code}
-  header = HeaderFormat.toLong(preferredBlockSize, replication,
+  header = HeaderFormat.toLong(preferredBlockSize, replication, false,
   storagePolicyID);
{code}
# Looks like we do not need to add a new constructor to INodeFile.
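
For illustration, packing a layout flag into the long header looks roughly 
like this (a simplified, made-up bit layout, not the real HeaderFormat):

{code}
// Sketch: packing a layout flag into the long file header next to
// replication and preferred block size. The bit layout here is made up
// for illustration; the real HeaderFormat differs.
public class HeaderPacking {
  // [1 bit isStriped | 11 bits replication | 52 bits preferredBlockSize]
  static long toLong(long blockSize, int replication, boolean isStriped) {
    long h = blockSize & ((1L << 52) - 1);
    h |= ((long) replication & 0x7FF) << 52;
    if (isStriped) {
      h |= 1L << 63;
    }
    return h;
  }

  static boolean isStriped(long header) {
    return header < 0;  // top bit set
  }

  public static void main(String[] args) {
    long h = toLong(128L * 1024 * 1024, 3, true);
    System.out.println(isStriped(h));  // true
  }
}
{code}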

> Erasure coding: use BlockInfo[] for both striped and contiguous blocks in 
> INodeFile
> ---
>
> Key: HDFS-8058
> URL: https://issues.apache.org/jira/browse/HDFS-8058
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Yi Liu
>Assignee: Zhe Zhang
> Attachments: HDFS-8058-HDFS-7285.003.patch, 
> HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, 
> HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, 
> HDFS-8058.001.patch, HDFS-8058.002.patch
>
>
> This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous 
> blocks in INodeFile.
> Currently {{FileWithStripedBlocksFeature}} keeps a separate list for striped 
> blocks, its methods duplicate those in INodeFile, and the current code has 
> to check {{isStriped}} and then branch. Also, if a file is striped, the 
> {{blocks}} field in INodeFile still occupies a reference's worth of memory.
> None of this is necessary; we can use the same {{blocks}} array to make the 
> code clearer.
> I keep {{FileWithStripedBlocksFeature}} empty for future use: I will file a 
> new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from 
> *BlockInfoStriped* to INodeFile, since ideally they are the same for all 
> striped blocks in a file, and storing them per block wastes NN memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7608) hdfs dfsclient newConnectedPeer has no write timeout

2015-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626783#comment-14626783
 ] 

Hudson commented on HDFS-7608:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8162 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8162/])
HDFS-7608: hdfs dfsclient newConnectedPeer has no write timeout (Xiaoyu Yao via 
Colin P. McCabe) (cmccabe: rev 1d74ccececaefffaa90c0c18b40a3645dbc819d9)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
HDFS-7608: add CHANGES.txt (cmccabe: rev 
b7fb6ec4513de7d342c541eb3d9e14642286e2cf)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> hdfs dfsclient  newConnectedPeer has no write timeout
> -
>
> Key: HDFS-7608
> URL: https://issues.apache.org/jira/browse/HDFS-7608
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fuse-dfs, hdfs-client
>Affects Versions: 2.3.0, 2.6.0
> Environment: hdfs 2.3.0  hbase 0.98.6
>Reporter: zhangshilong
>Assignee: Xiaoyu Yao
> Fix For: 2.8.0
>
> Attachments: HDFS-7608.0.patch, HDFS-7608.1.patch, HDFS-7608.2.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> problem:
> hbase compactSplitThread may lock forever on reading datanode blocks.
> debug found: the epoll wait timeout was set to 0, so epoll_wait can never 
> time out.
> cause: in hdfs 2.3.0,
> hbase uses DFSClient to read and write blocks.
> DFSClient creates a socket using newConnectedPeer(addr), but sets no read 
> or write timeout.
> In v2.6.0, newConnectedPeer added a readTimeout to deal with the problem, 
> but did not add a writeTimeout. Why was no write timeout added?
> I think NioInetPeer needs a default socket timeout, so applications will not 
> need to force a timeout themselves.
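A minimal sketch of the idea behind the fix committed above, assuming the 
{{Peer}} interface's timeout setters; variable names are assumptions, and this 
is not the verbatim patch:
{code}
// In DFSClient#newConnectedPeer, once the Peer wrapping the socket exists,
// give it both timeouts so neither read() nor write() can block forever on
// a dead datanode.
peer.setReadTimeout(socketTimeout);   // already present since 2.6.0
peer.setWriteTimeout(socketTimeout);  // the piece HDFS-7608 adds
{code}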



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7608) hdfs dfsclient newConnectedPeer has no write timeout

2015-07-14 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7608:
---
   Resolution: Fixed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

committed to 2.8, thanks all

> hdfs dfsclient  newConnectedPeer has no write timeout
> -
>
> Key: HDFS-7608
> URL: https://issues.apache.org/jira/browse/HDFS-7608
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fuse-dfs, hdfs-client
>Affects Versions: 2.3.0, 2.6.0
> Environment: hdfs 2.3.0  hbase 0.98.6
>Reporter: zhangshilong
>Assignee: Xiaoyu Yao
> Fix For: 2.8.0
>
> Attachments: HDFS-7608.0.patch, HDFS-7608.1.patch, HDFS-7608.2.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> problem:
> hbase compactSplitThread may lock forever on reading datanode blocks.
> debug found: the epoll wait timeout was set to 0, so epoll_wait can never 
> time out.
> cause: in hdfs 2.3.0,
> hbase uses DFSClient to read and write blocks.
> DFSClient creates a socket using newConnectedPeer(addr), but sets no read 
> or write timeout.
> In v2.6.0, newConnectedPeer added a readTimeout to deal with the problem, 
> but did not add a writeTimeout. Why was no write timeout added?
> I think NioInetPeer needs a default socket timeout, so applications will not 
> need to force a timeout themselves.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8694) Expose the stats of IOErrors on each FsVolume through JMX

2015-07-14 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626770#comment-14626770
 ] 

Andrew Wang commented on HDFS-8694:
---

Hi Eddy, overall this patch looks great, thanks for working on it. Some review 
comments:

High-level:
* I have a hard time understanding when we should handle the disk error 
vs. just bubble it up; since the exception bubbles, there seems to be a danger 
of handling the same root IOE more than once. What's the methodology here? Is 
it possible to move handling to the top level somewhere? I can manually 
examine all the current callsites and callers, but that's not very 
future-proof.
* Related to the above, our unit tests do not cover anywhere close to all the 
locations that handle an IOError. Adding tests for all of these would be 
onerous, so I'm very interested in a solution to the above.
* Since we now have the volume as context, we should really move the disk 
checker to be per-volume rather than DN-wide. One volume throwing an error is 
no reason to check all of them. This can be deferred to a follow-up; I think 
it's a slam dunk.

Nits:

* FsDatasetImpl#moveBlockAcrossStorage: can we get rid of volume and move 
targetVolume's declaration outside of the try? It looks equal to volume.
* In places like BlockSender#close we could actually hit an IOE multiple times 
but only increment the counter once. Thoughts?
* Extra debug print in TestDataTransferKeepalive#testSlowReader.
* The linebreak here is kind of awkward; move the end parens or the try up? 
(One possible layout is sketched after this list.)

{code}
try (ReplicaHandler replica = dataset.append(
block, block.getGenerationStamp() + 1, block.getNumBytes())
) {
{code}

* TestMover adds a test timeout; that looks unrelated to this patch?
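
For the linebreak nit, one possible layout (whitespace only, behavior 
unchanged):
{code}
try (ReplicaHandler replica = dataset.append(block,
    block.getGenerationStamp() + 1, block.getNumBytes())) {
{code}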

> Expose the stats of IOErrors on each FsVolume through JMX
> -
>
> Key: HDFS-8694
> URL: https://issues.apache.org/jira/browse/HDFS-8694
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, HDFS
>Affects Versions: 2.7.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-8694.000.patch, HDFS-8694.001.patch
>
>
> Currently, once the DataNode hits an {{IOError}} when writing / reading block 
> files, it starts a background {{DiskChecker.checkDirs()}} thread. But even if 
> this thread finishes successfully, the DN does not record the {{IOError}}. 
> We need one measurement that counts all {{IOErrors}} for each volume.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8702) Erasure coding: update BlockManager.blockHasEnoughRacks(..) logic for striped block

2015-07-14 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8702:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: HDFS-7285
   Status: Resolved  (was: Patch Available)

I've committed this to the feature branch. Thanks for the contribution, 
[~kaisasak]! Thanks for the review, [~walter.k.su]!

> Erasure coding: update BlockManager.blockHasEnoughRacks(..) logic for striped 
> block
> ---
>
> Key: HDFS-8702
> URL: https://issues.apache.org/jira/browse/HDFS-8702
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Walter Su
>Assignee: Kai Sasaki
> Fix For: HDFS-7285
>
> Attachments: HDFS-8702-HDFS-7285.00.patch, 
> HDFS-8702-HDFS-7285.01.patch, HDFS-8702-HDFS-7285.02.patch, 
> HDFS-8702-HDFS-7285.03.patch, HDFS-8702-HDFS-7285.04.patch
>
>
> Currently blockHasEnoughRacks(..) only guarantees 2 racks. The logic needs to 
> be updated for striped blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8702) Erasure coding: update BlockManager.blockHasEnoughRacks(..) logic for striped block

2015-07-14 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626748#comment-14626748
 ] 

Jing Zhao commented on HDFS-8702:
-

bq. I think we can use getRealDataBlockNum();

Hmm, you're right. The parameter is the expected storage number, not the rack 
number. +1 for the latest patch. I will commit it shortly.

> Erasure coding: update BlockManager.blockHasEnoughRacks(..) logic for striped 
> block
> ---
>
> Key: HDFS-8702
> URL: https://issues.apache.org/jira/browse/HDFS-8702
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Walter Su
>Assignee: Kai Sasaki
> Attachments: HDFS-8702-HDFS-7285.00.patch, 
> HDFS-8702-HDFS-7285.01.patch, HDFS-8702-HDFS-7285.02.patch, 
> HDFS-8702-HDFS-7285.03.patch, HDFS-8702-HDFS-7285.04.patch
>
>
> Currently blockHasEnoughRacks(..) only guarantees 2 racks. The logic needs to 
> be updated for striped blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8722) Optimize datanode writes for small writes and flushes

2015-07-14 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626709#comment-14626709
 ] 

Arpit Agarwal commented on HDFS-8722:
-

+1 for the patch.

Verified that it brings unaligned write performance back on par with 2.6.0.

> Optimize datanode writes for small writes and flushes
> -
>
> Key: HDFS-8722
> URL: https://issues.apache.org/jira/browse/HDFS-8722
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Attachments: HDFS-8722.patch, HDFS-8722.v1.patch
>
>
> After the data corruption fix by HDFS-4660, the CRC recalculation for a 
> partial chunk is executed more frequently if the client repeatedly writes a 
> few bytes and calls hflush/hsync. This is because the generic logic forces 
> CRC recalculation whenever the on-disk data is not CRC-chunk aligned. Prior 
> to HDFS-4660, the datanode blindly accepted whatever CRC the client provided 
> if the incoming data was chunk-aligned. This was the source of the corruption.
> We can still optimize for the most common case, where a client repeatedly 
> writes a small number of bytes followed by hflush/hsync with no pipeline 
> recovery or append, by allowing the previous behavior for this specific case. 
> If the incoming data has a duplicate portion and that portion is at the last 
> chunk boundary before the partial chunk on disk, the datanode can use the 
> checksum supplied by the client without redoing the checksum on its own. 
> This reduces disk reads as well as CPU load for the checksum calculation.
> If the incoming packet data goes back further than the last on-disk chunk 
> boundary, the datanode will still do a recalculation, but this occurs rarely, 
> during pipeline recoveries. Thus the optimization for this specific case 
> should be sufficient to speed up the vast majority of cases.
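A hedged, purely illustrative piece of arithmetic for the boundary condition 
described above (512-byte chunks assumed; this is not the actual BlockReceiver 
code):
{code}
final int chunkSize = 512;
long onDiskLen = 1000;               // replica ends mid-chunk: bytes [512, 1000)
long lastChunkBoundary = onDiskLen - (onDiskLen % chunkSize);  // = 512

long packetStart = 512;              // offset in block of the incoming packet
boolean reuseClientChecksum = (packetStart == lastChunkBoundary);
// true here: the duplicated region begins exactly at the last chunk boundary,
// so the datanode can trust the client-supplied checksum. A packet starting
// before 512 (e.g. during pipeline recovery) still forces a recalculation.
{code}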



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8736) ability to deny access to HDFS filesystems

2015-07-14 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626713#comment-14626713
 ] 

Steve Loughran commented on HDFS-8736:
--

You will also need to guard against untrusted code trying to open a network 
port and talk to hadoop directly, and to do the same for webhdfs. Given a 
socket and sufficient code, I can talk to an HDFS filesystem.

There is a well-defined way to stop untrusted code talking to HDFS; it is 
called Kerberos. Yes, we all hate it. Yes, we all fear it. Yes, none of us 
understand it properly. But we know that it does lock things down, so that not 
only are untrusted applications forbidden access, the caller also gets only 
the specific rights associated with the identity of the user making the 
operation.

(There's also the little detail of that patch still being un-applicable, but 
that's a detail here.)

As I stated on the related MR JIRA, please file an uber-JIRA where the whole 
aspect of running Hadoop (client?) in a sandbox can be discussed, rather than 
piece-by-piece patches which will probably get rejected on a case-by-case 
basis.

> ability to deny access to HDFS filesystems
> --
>
> Key: HDFS-8736
> URL: https://issues.apache.org/jira/browse/HDFS-8736
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.5.0
>Reporter: Purvesh Patel
>Priority: Minor
>  Labels: security
> Attachments: HDFS-8736-1.patch
>
>
> In order to run in a secure context, we need the ability to deny non-trusted 
> code access to different filesystems (specifically the local file system). 
> This patch adds a new SecurityPermission class (AccessFileSystemPermission) 
> and checks the permission in FileSystem#get before returning a cached file 
> system or creating a new one. Please see the attached patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile

2015-07-14 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626645#comment-14626645
 ] 

Zhe Zhang commented on HDFS-8058:
-

Triggering Jenkins again. Last run generated a lot of "Class not found" errors.

> Erasure coding: use BlockInfo[] for both striped and contiguous blocks in 
> INodeFile
> ---
>
> Key: HDFS-8058
> URL: https://issues.apache.org/jira/browse/HDFS-8058
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Yi Liu
>Assignee: Zhe Zhang
> Attachments: HDFS-8058-HDFS-7285.003.patch, 
> HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, 
> HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, 
> HDFS-8058.001.patch, HDFS-8058.002.patch
>
>
> This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous 
> blocks in INodeFile.
> Currently {{FileWithStripedBlocksFeature}} keeps a separate list for striped 
> blocks, its methods duplicate those in INodeFile, and the current code needs 
> to check {{isStriped}} and then do different things. Also, if a file is 
> striped, the {{blocks}} field in INodeFile still occupies a reference's worth 
> of memory.
> None of this is necessary; we can use the same {{blocks}} to make the code 
> clearer.
> I keep {{FileWithStripedBlocksFeature}} empty for future use: I will file 
> a new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from 
> *BlockInfoStriped* to INodeFile, since ideally they are the same for all 
> striped blocks in a file, and storing them in each block would waste NN memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks

2015-07-14 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626642#comment-14626642
 ] 

Allen Wittenauer commented on HDFS-8344:


+1

> NameNode doesn't recover lease for files with missing blocks
> 
>
> Key: HDFS-8344
> URL: https://issues.apache.org/jira/browse/HDFS-8344
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, 
> HDFS-8344.03.patch, HDFS-8344.04.patch, HDFS-8344.05.patch
>
>
> I found another\(?) instance in which the lease is not recovered. This is 
> easily reproducible on a pseudo-distributed single-node cluster.
> # Before you start, it helps if you set the following lease periods. This is 
> not necessary, but it reduces how long you have to wait:
> {code}
>   public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000;
>   public static final long LEASE_HARDLIMIT_PERIOD = 2 * 
> LEASE_SOFTLIMIT_PERIOD;
> {code}
> # The client starts to write a file (could be less than 1 block, but it 
> hflushed, so some of the data has landed on the datanodes). (I'm copying the 
> client code I am using; I generate a jar and run it using $ hadoop jar 
> TestHadoop.jar. A hedged reconstruction is sketched below.)
> # The client crashes. (I simulate this by kill -9 on the $(hadoop jar 
> TestHadoop.jar) process after it has printed "Wrote to the bufferedWriter".)
> # Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was 
> only 1.)
> I believe the lease should be recovered and the block should be marked 
> missing. However, this is not happening: the lease is never recovered.
> The effect of this bug for us was that nodes could not be decommissioned 
> cleanly. Although we knew that the client had crashed, the Namenode never 
> released the leases (even after restarting the Namenode, even months 
> afterwards). There are actually several other cases too where we don't 
> consider what happens if ALL the datanodes die while the file is being 
> written, but I am going to punt on that for another time.
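A hedged reconstruction of the kind of client described in step 2; the 
original TestHadoop source is not in this thread, so the path, data, and 
structure are assumptions:
{code}
import java.io.BufferedWriter;
import java.io.OutputStreamWriter;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TestHadoop {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FSDataOutputStream out = fs.create(new Path("/tmp/lease-test"));
    BufferedWriter writer =
        new BufferedWriter(new OutputStreamWriter(out, "UTF-8"));
    writer.write("less than one block of data");
    writer.flush();                 // push it out of the BufferedWriter
    out.hflush();                   // land it on the datanode(s)
    System.out.println("Wrote to the bufferedWriter");
    Thread.sleep(Long.MAX_VALUE);   // hold the lease open until kill -9
  }
}
{code}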



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8736) ability to deny access to HDFS filesystems

2015-07-14 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626633#comment-14626633
 ] 

Allen Wittenauer commented on HDFS-8736:


Trying to solve server security problems from the client side never works.

bq. with the caveat that you'd need to also guard against users trying to 
instantiate the file system implementation directly using other permissions. 

... which is nearly impossible. It doesn't take a lot of work to do exactly 
that:

{{java -Dfs.hdfs.impl=myclass}}

or

{{java -Dfs.s3.impl=DistributedFileSystem}}

or whatever.

Now what?

> ability to deny access to HDFS filesystems
> --
>
> Key: HDFS-8736
> URL: https://issues.apache.org/jira/browse/HDFS-8736
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.5.0
>Reporter: Purvesh Patel
>Priority: Minor
>  Labels: security
> Attachments: HDFS-8736-1.patch
>
>
> In order to run in a secure context, ability to deny access to different 
> filesystems(specifically the local file system) to non-trusted code this 
> patch adds a new SecurityPermission class(AccessFileSystemPermission) and 
> checks the permission in FileSystem#get before returning a cached file system 
> or creating a new one. Please see attached patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8767) RawLocalFileSystem.listStatus() returns null for UNIX pipefile

2015-07-14 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626625#comment-14626625
 ] 

Haohui Mai commented on HDFS-8767:
--

It looks like a cleaner approach is to call {{list()}} only when the file is 
a directory.
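
A minimal sketch of that shape inside RawLocalFileSystem (hedged; this is not 
the committed patch):
{code}
public FileStatus[] listStatus(Path f) throws IOException {
  File localf = pathToFile(f);
  if (!localf.exists()) {
    throw new FileNotFoundException("File " + f + " does not exist");
  }
  if (!localf.isDirectory()) {
    // Regular file, symlink, or pipe: report the entry itself.
    return new FileStatus[] { getFileStatus(f) };
  }
  String[] names = localf.list();   // only directories ever reach list()
  FileStatus[] results = new FileStatus[names.length];
  for (int i = 0; i < names.length; i++) {
    results[i] = getFileStatus(new Path(f, names[i]));
  }
  return results;
}
{code}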

> RawLocalFileSystem.listStatus() returns null for UNIX pipefile
> --
>
> Key: HDFS-8767
> URL: https://issues.apache.org/jira/browse/HDFS-8767
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haohui Mai
>Assignee: kanaka kumar avvaru
>Priority: Critical
> Attachments: HDFS-8767-00.patch, HDFS-8767-01.patch
>
>
> Calling FileSystem.listStatus() on a UNIX pipe file returns null instead of 
> the file. The bug breaks Hive when Hive loads data from a UNIX pipe file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8433) blockToken is not set in constructInternalBlock and parseStripedBlockGroup in StripedBlockUtil

2015-07-14 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626596#comment-14626596
 ] 

Walter Su commented on HDFS-8433:
-

I have another idea: pick up the {{BlockIdRange}} idea from the 01 patch 
again. This time, we don't need a {{BlockIdRange}} class; we extend the fields 
of {{BlockTokenIdentifier}} (see BlockTokenIdentifier#readFields(..) / 
write(..)) and just add a field {{IdRange}} whose default value is 0.
I think the performance impact on contiguous blocks is small. And I think it 
also supports old DNs: an old DN just doesn't read the last field (a hedged 
sketch of this compatibility trick follows below).

I prefer to implement this against trunk, and just merge the 02 patch 
(multiple tokens method) into the feature branch to see how it works. Then we 
can decide whether it's worth picking up BlockIdRange.

Hi, [~jingzhao], [~szetszwo]! Any ideas?
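
The compatibility trick, sketched standalone (illustrative only, not the real 
{{BlockTokenIdentifier}}): a trailing optional field that old readers never 
look at and old writers never emit.
{code}
import java.io.DataInput;
import java.io.DataOutput;
import java.io.EOFException;
import java.io.IOException;

import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableUtils;

class TokenIdWithRange implements Writable {
  private long blockId;   // stands in for the existing fields
  private long idRange;   // the proposed new field, 0 when absent

  @Override
  public void write(DataOutput out) throws IOException {
    WritableUtils.writeVLong(out, blockId);   // existing fields first
    WritableUtils.writeVLong(out, idRange);   // new field appended last
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    blockId = WritableUtils.readVLong(in);
    try {
      idRange = WritableUtils.readVLong(in);  // present: written by new code
    } catch (EOFException e) {
      idRange = 0;                            // absent: written by old code
    }
  }
}
{code}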

> blockToken is not set in constructInternalBlock and parseStripedBlockGroup in 
> StripedBlockUtil
> --
>
> Key: HDFS-8433
> URL: https://issues.apache.org/jira/browse/HDFS-8433
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Walter Su
> Attachments: HDFS-8433-HDFS-7285.02.patch, HDFS-8433.00.patch, 
> HDFS-8433.01.patch
>
>
> The blockToken provided in LocatedStripedBlock is not used to create 
> LocatedBlock in constructInternalBlock and parseStripedBlockGroup in 
> StripedBlockUtil.
> We should also add ec tests with security on.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8767) RawLocalFileSystem.listStatus() returns null for UNIX pipefile

2015-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626500#comment-14626500
 ] 

Hadoop QA commented on HDFS-8767:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m 26s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 35s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 39s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m  7s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 21s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 51s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |  22m  4s | Tests passed in 
hadoop-common. |
| | |  61m  3s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12745256/HDFS-8767-01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 4084eaf |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11700/artifact/patchprocess/testrun_hadoop-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11700/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11700/console |


This message was automatically generated.

> RawLocalFileSystem.listStatus() returns null for UNIX pipefile
> --
>
> Key: HDFS-8767
> URL: https://issues.apache.org/jira/browse/HDFS-8767
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haohui Mai
>Assignee: kanaka kumar avvaru
>Priority: Critical
> Attachments: HDFS-8767-00.patch, HDFS-8767-01.patch
>
>
> Calling FileSystem.listStatus() on a UNIX pipe file returns null instead of 
> the file. The bug breaks Hive when Hive loads data from a UNIX pipe file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8541) Mover should exit with NO_MOVE_PROGRESS if there is no move progress

2015-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626490#comment-14626490
 ] 

Hudson commented on HDFS-8541:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #254 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/254/])
HDFS-8541. Mover should exit with NO_MOVE_PROGRESS if there is no move 
progress.  Contributed by Surendra Singh Lilhore (szetszwo: rev 
9ef03a4c5bb5573eadc7d04e371c4af2dc6bae37)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Mover should exit with NO_MOVE_PROGRESS if there is no move progress
> 
>
> Key: HDFS-8541
> URL: https://issues.apache.org/jira/browse/HDFS-8541
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-8541.patch, HDFS-8541_1.patch, HDFS-8541_2.patch
>
>
> HDFS-8143 changed Mover to exit after some retries when it fails to move 
> blocks. Two additional suggestions:
> # The Mover retry counter should be incremented only if all moves fail. If 
> there are some successful moves, the counter should be reset.
> # Mover should exit with NO_MOVE_PROGRESS instead of IO_EXCEPTION in case of 
> failure.
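A hedged sketch of the retry policy in suggestion 1; the names below are 
illustrative, not the actual Mover code:
{code}
final class RetryPolicySketch {
  enum Exit { SUCCESS, NO_MOVE_PROGRESS }

  interface Pass {                    // stands in for one Mover iteration
    boolean hasRemaining();
    boolean anyMoveSucceeded();
  }

  static Exit run(java.util.Iterator<Pass> passes, int maxRetries) {
    int retryCount = 0;
    while (passes.hasNext()) {
      Pass p = passes.next();
      if (!p.hasRemaining()) {
        return Exit.SUCCESS;            // nothing left to move
      }
      if (p.anyMoveSucceeded()) {
        retryCount = 0;                 // some progress: reset the counter
      } else if (++retryCount >= maxRetries) {
        return Exit.NO_MOVE_PROGRESS;   // no progress after max retries
      }
    }
    return Exit.NO_MOVE_PROGRESS;
  }
}
{code}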



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8143) HDFS Mover tool should exit after some retry when failed to move blocks.

2015-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626492#comment-14626492
 ] 

Hudson commented on HDFS-8143:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #254 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/254/])
Add HDFS-8143 to CHANGES.txt. (szetszwo: rev 
f7c8311e9836ad1a1a2ef6eca8b42fd61a688164)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> HDFS Mover tool should exit after some retry when failed to move blocks.
> 
>
> Key: HDFS-8143
> URL: https://issues.apache.org/jira/browse/HDFS-8143
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Fix For: 2.7.1
>
> Attachments: HDFS-8143.patch, HDFS-8143_1.patch, HDFS-8143_2.patch, 
> HDFS-8143_3.patch
>
>
> Mover does not exit when it fails to move blocks.
> {code}
> hasRemaining |= Dispatcher.waitForMoveCompletion(storages.targets.values());
> {code}
> {{Dispatcher.waitForMoveCompletion()}} will always return true if some block 
> migrations failed, so hasRemaining never becomes false.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8578) On upgrade, Datanode should process all storage/data dirs in parallel

2015-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626472#comment-14626472
 ] 

Hadoop QA commented on HDFS-8578:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m 57s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 3 new or modified test files. |
| {color:green}+1{color} | javac |   7m 34s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 33s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 21s | The applied patch generated  1 
new checkstyle issues (total was 597, now 592). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 20s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 37s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 29s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  3s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 161m  1s | Tests failed in hadoop-hdfs. |
| | | 204m 22s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12745242/HDFS-8578-07.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 4084eaf |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/11699/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11699/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11699/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11699/console |


This message was automatically generated.

> On upgrade, Datanode should process all storage/data dirs in parallel
> -
>
> Key: HDFS-8578
> URL: https://issues.apache.org/jira/browse/HDFS-8578
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Raju Bairishetti
>Assignee: Vinayakumar B
>Priority: Critical
> Attachments: HDFS-8578-01.patch, HDFS-8578-02.patch, 
> HDFS-8578-03.patch, HDFS-8578-04.patch, HDFS-8578-05.patch, 
> HDFS-8578-06.patch, HDFS-8578-07.patch, HDFS-8578-branch-2.6.0.patch
>
>
> Right now, during upgrades the datanode processes all the storage dirs 
> sequentially. Assume it takes ~20 mins to process a single storage dir; then 
> a datanode which has ~10 disks will take around 3 hours to come up.
> *BlockPoolSliceStorage.java*
> {code}
>    for (int idx = 0; idx < getNumStorageDirs(); idx++) {
>   doTransition(datanode, getStorageDir(idx), nsInfo, startOpt);
>   assert getCTime() == nsInfo.getCTime() 
>   : "Data-node and name-node CTimes must be the same.";
> }
> {code}
> It would save a lot of time during major upgrades if the datanode processed 
> all storage dirs/disks in parallel.
> Can we make the datanode process all storage dirs in parallel?
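A minimal, hedged sketch of one way to parallelize the loop above; this is not 
the committed patch, and it assumes doTransition() is safe to run concurrently 
for different storage dirs (imports from java.util and java.util.concurrent 
elided):
{code}
void doTransitionsInParallel(final DataNode datanode,
    final NamespaceInfo nsInfo, final StartupOption startOpt)
    throws IOException {
  ExecutorService pool = Executors.newFixedThreadPool(getNumStorageDirs());
  List<Future<Void>> results = new ArrayList<Future<Void>>();
  for (int idx = 0; idx < getNumStorageDirs(); idx++) {
    final StorageDirectory sd = getStorageDir(idx);
    results.add(pool.submit(new Callable<Void>() {
      @Override
      public Void call() throws IOException {
        doTransition(datanode, sd, nsInfo, startOpt);  // one dir per thread
        return null;
      }
    }));
  }
  try {
    for (Future<Void> r : results) {
      r.get();                        // wait for all dirs; rethrow failures
    }
  } catch (Exception e) {
    throw new IOException("Parallel storage dir upgrade failed", e);
  } finally {
    pool.shutdown();
  }
}
{code}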



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8736) ability to deny access to HDFS filesystems

2015-07-14 Thread Purvesh Patel (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626463#comment-14626463
 ] 

Purvesh Patel commented on HDFS-8736:
-

There is some confusion in the description of the issue. This patch is 
introduced to prevent untrusted user code from accessing HDFS, not the local 
file system. It's written in such a way as to potentially enable it to be used 
to block access to any type of FileSystem, with the caveat that you'd need to 
also guard against users trying to instantiate the file system implementation 
directly using other permissions.

The additional permission also prevents users from getting access to instances 
of the HDFS FileSystem that were created when the user code was off-stack and 
that have pre-cached network connections.
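
A hedged sketch of the check described above; {{AccessFileSystemPermission}} 
is the class named in the patch, but its constructor argument and exact 
placement here are assumptions. FileSystem#get would call this guard before 
touching the cache:
{code}
private static void checkAccessPermission(java.net.URI uri) {
  SecurityManager sm = System.getSecurityManager();
  if (sm == null) {
    return;  // no security manager installed: nothing to enforce
  }
  // Throws SecurityException when untrusted code is on the call stack.
  sm.checkPermission(new AccessFileSystemPermission(uri.getScheme()));
}
{code}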

> ability to deny access to HDFS filesystems
> --
>
> Key: HDFS-8736
> URL: https://issues.apache.org/jira/browse/HDFS-8736
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.5.0
>Reporter: Purvesh Patel
>Priority: Minor
>  Labels: security
> Attachments: HDFS-8736-1.patch
>
>
> In order to run in a secure context, we need the ability to deny non-trusted 
> code access to different filesystems (specifically the local file system). 
> This patch adds a new SecurityPermission class (AccessFileSystemPermission) 
> and checks the permission in FileSystem#get before returning a cached file 
> system or creating a new one. Please see the attached patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8143) HDFS Mover tool should exit after some retry when failed to move blocks.

2015-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626452#comment-14626452
 ] 

Hudson commented on HDFS-8143:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2202 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2202/])
Add HDFS-8143 to CHANGES.txt. (szetszwo: rev 
f7c8311e9836ad1a1a2ef6eca8b42fd61a688164)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> HDFS Mover tool should exit after some retry when failed to move blocks.
> 
>
> Key: HDFS-8143
> URL: https://issues.apache.org/jira/browse/HDFS-8143
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Fix For: 2.7.1
>
> Attachments: HDFS-8143.patch, HDFS-8143_1.patch, HDFS-8143_2.patch, 
> HDFS-8143_3.patch
>
>
> Mover does not exit when it fails to move blocks.
> {code}
> hasRemaining |= Dispatcher.waitForMoveCompletion(storages.targets.values());
> {code}
> {{Dispatcher.waitForMoveCompletion()}} will always return true if some block 
> migrations failed, so hasRemaining never becomes false.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8541) Mover should exit with NO_MOVE_PROGRESS if there is no move progress

2015-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626450#comment-14626450
 ] 

Hudson commented on HDFS-8541:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2202 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2202/])
HDFS-8541. Mover should exit with NO_MOVE_PROGRESS if there is no move 
progress.  Contributed by Surendra Singh Lilhore (szetszwo: rev 
9ef03a4c5bb5573eadc7d04e371c4af2dc6bae37)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java


> Mover should exit with NO_MOVE_PROGRESS if there is no move progress
> 
>
> Key: HDFS-8541
> URL: https://issues.apache.org/jira/browse/HDFS-8541
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-8541.patch, HDFS-8541_1.patch, HDFS-8541_2.patch
>
>
> HDFS-8143 changed Mover to exit after some retries when it fails to move 
> blocks. Two additional suggestions:
> # The Mover retry counter should be incremented only if all moves fail. If 
> there are some successful moves, the counter should be reset.
> # Mover should exit with NO_MOVE_PROGRESS instead of IO_EXCEPTION in case of 
> failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8736) ability to deny access to HDFS filesystems

2015-07-14 Thread Purvesh Patel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Purvesh Patel updated HDFS-8736:

Summary: ability to deny access to HDFS filesystems  (was: ability to deny 
access to different filesystems)

> ability to deny access to HDFS filesystems
> --
>
> Key: HDFS-8736
> URL: https://issues.apache.org/jira/browse/HDFS-8736
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.5.0
>Reporter: Purvesh Patel
>Priority: Minor
>  Labels: security
> Attachments: HDFS-8736-1.patch
>
>
> In order to run in a secure context, we need the ability to deny non-trusted 
> code access to different filesystems (specifically the local file system). 
> This patch adds a new SecurityPermission class (AccessFileSystemPermission) 
> and checks the permission in FileSystem#get before returning a cached file 
> system or creating a new one. Please see the attached patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8736) ability to deny access to HDFS filesystems

2015-07-14 Thread Purvesh Patel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Purvesh Patel updated HDFS-8736:

Attachment: (was: Patch.pdf)

> ability to deny access to HDFS filesystems
> --
>
> Key: HDFS-8736
> URL: https://issues.apache.org/jira/browse/HDFS-8736
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.5.0
>Reporter: Purvesh Patel
>Priority: Minor
>  Labels: security
> Attachments: HDFS-8736-1.patch
>
>
> In order to run in a secure context, we need the ability to deny non-trusted 
> code access to different filesystems (specifically the local file system). 
> This patch adds a new SecurityPermission class (AccessFileSystemPermission) 
> and checks the permission in FileSystem#get before returning a cached file 
> system or creating a new one. Please see the attached patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8736) ability to deny access to HDFS filesystems

2015-07-14 Thread Purvesh Patel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Purvesh Patel updated HDFS-8736:

Attachment: HDFS-8736-1.patch

> ability to deny access to HDFS filesystems
> --
>
> Key: HDFS-8736
> URL: https://issues.apache.org/jira/browse/HDFS-8736
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.5.0
>Reporter: Purvesh Patel
>Priority: Minor
>  Labels: security
> Attachments: HDFS-8736-1.patch
>
>
> In order to run in a secure context, we need the ability to deny non-trusted 
> code access to different filesystems (specifically the local file system). 
> This patch adds a new SecurityPermission class (AccessFileSystemPermission) 
> and checks the permission in FileSystem#get before returning a cached file 
> system or creating a new one. Please see the attached patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8541) Mover should exit with NO_MOVE_PROGRESS if there is no move progress

2015-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626401#comment-14626401
 ] 

Hudson commented on HDFS-8541:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #244 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/244/])
HDFS-8541. Mover should exit with NO_MOVE_PROGRESS if there is no move 
progress.  Contributed by Surendra Singh Lilhore (szetszwo: rev 
9ef03a4c5bb5573eadc7d04e371c4af2dc6bae37)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java


> Mover should exit with NO_MOVE_PROGRESS if there is no move progress
> 
>
> Key: HDFS-8541
> URL: https://issues.apache.org/jira/browse/HDFS-8541
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-8541.patch, HDFS-8541_1.patch, HDFS-8541_2.patch
>
>
> HDFS-8143 changed Mover to exit after some retries when it fails to move 
> blocks. Two additional suggestions:
> # The Mover retry counter should be incremented only if all moves fail. If 
> there are some successful moves, the counter should be reset.
> # Mover should exit with NO_MOVE_PROGRESS instead of IO_EXCEPTION in case of 
> failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8143) HDFS Mover tool should exit after some retry when failed to move blocks.

2015-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626403#comment-14626403
 ] 

Hudson commented on HDFS-8143:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #244 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/244/])
Add HDFS-8143 to CHANGES.txt. (szetszwo: rev 
f7c8311e9836ad1a1a2ef6eca8b42fd61a688164)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> HDFS Mover tool should exit after some retry when failed to move blocks.
> 
>
> Key: HDFS-8143
> URL: https://issues.apache.org/jira/browse/HDFS-8143
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Fix For: 2.7.1
>
> Attachments: HDFS-8143.patch, HDFS-8143_1.patch, HDFS-8143_2.patch, 
> HDFS-8143_3.patch
>
>
> Mover does not exit when it fails to move blocks.
> {code}
> hasRemaining |= Dispatcher.waitForMoveCompletion(storages.targets.values());
> {code}
> {{Dispatcher.waitForMoveCompletion()}} will always return true if some block 
> migrations failed, so hasRemaining never becomes false.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8767) RawLocalFileSystem.listStatus() returns null for UNIX pipefile

2015-07-14 Thread kanaka kumar avvaru (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626397#comment-14626397
 ] 

kanaka kumar avvaru commented on HDFS-8767:
---

Updated the patch with the test case for UNIX-based systems as per 
[~ste...@apache.org]'s comment.

> RawLocalFileSystem.listStatus() returns null for UNIX pipefile
> --
>
> Key: HDFS-8767
> URL: https://issues.apache.org/jira/browse/HDFS-8767
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haohui Mai
>Assignee: kanaka kumar avvaru
>Priority: Critical
> Attachments: HDFS-8767-00.patch, HDFS-8767-01.patch
>
>
> Calling FileSystem.listStatus() on a UNIX pipe file returns null instead of 
> the file. The bug breaks Hive when Hive loads data from a UNIX pipe file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8767) RawLocalFileSystem.listStatus() returns null for UNIX pipefile

2015-07-14 Thread kanaka kumar avvaru (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kanaka kumar avvaru updated HDFS-8767:
--
Attachment: HDFS-8767-01.patch

> RawLocalFileSystem.listStatus() returns null for UNIX pipefile
> --
>
> Key: HDFS-8767
> URL: https://issues.apache.org/jira/browse/HDFS-8767
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haohui Mai
>Assignee: kanaka kumar avvaru
>Priority: Critical
> Attachments: HDFS-8767-00.patch, HDFS-8767-01.patch
>
>
> Calling FileSystem.listStatus() on a UNIX pipe file returns null instead of 
> the file. The bug breaks Hive when Hive loads data from a UNIX pipe file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8143) HDFS Mover tool should exit after some retry when failed to move blocks.

2015-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626387#comment-14626387
 ] 

Hudson commented on HDFS-8143:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2183 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2183/])
Add HDFS-8143 to CHANGES.txt. (szetszwo: rev 
f7c8311e9836ad1a1a2ef6eca8b42fd61a688164)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> HDFS Mover tool should exit after some retry when failed to move blocks.
> 
>
> Key: HDFS-8143
> URL: https://issues.apache.org/jira/browse/HDFS-8143
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Fix For: 2.7.1
>
> Attachments: HDFS-8143.patch, HDFS-8143_1.patch, HDFS-8143_2.patch, 
> HDFS-8143_3.patch
>
>
> Mover does not exit when it fails to move blocks.
> {code}
> hasRemaining |= Dispatcher.waitForMoveCompletion(storages.targets.values());
> {code}
> {{Dispatcher.waitForMoveCompletion()}} will always return true if some block 
> migrations failed, so hasRemaining never becomes false.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8541) Mover should exit with NO_MOVE_PROGRESS if there is no move progress

2015-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626385#comment-14626385
 ] 

Hudson commented on HDFS-8541:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2183 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2183/])
HDFS-8541. Mover should exit with NO_MOVE_PROGRESS if there is no move 
progress.  Contributed by Surendra Singh Lilhore (szetszwo: rev 
9ef03a4c5bb5573eadc7d04e371c4af2dc6bae37)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Mover should exit with NO_MOVE_PROGRESS if there is no move progress
> 
>
> Key: HDFS-8541
> URL: https://issues.apache.org/jira/browse/HDFS-8541
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-8541.patch, HDFS-8541_1.patch, HDFS-8541_2.patch
>
>
> HDFS-8143 changed Mover to exit after some retries when it fails to move 
> blocks. Two additional suggestions:
> # The Mover retry counter should be incremented only if all moves fail. If 
> there are some successful moves, the counter should be reset.
> # Mover should exit with NO_MOVE_PROGRESS instead of IO_EXCEPTION in case of 
> failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8622) Implement GETCONTENTSUMMARY operation for WebImageViewe

2015-07-14 Thread kanaka kumar avvaru (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626381#comment-14626381
 ] 

kanaka kumar avvaru commented on HDFS-8622:
---

Thanks for the update, [~jagadesh.kiran]; "-02.patch" looks good to me. +1 
(non-binding). [~ajisakaa], please share your view.

> Implement GETCONTENTSUMMARY operation for WebImageViewe
> ---
>
> Key: HDFS-8622
> URL: https://issues.apache.org/jira/browse/HDFS-8622
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jagadesh Kiran N
>Assignee: Jagadesh Kiran N
> Attachments: HDFS-8622-00.patch, HDFS-8622-01.patch, 
> HDFS-8622-02.patch
>
>
>  it would be better for administrators if {{GETCONTENTSUMMARY}} were 
> supported.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8772) fix TestStandbyIsHot#testDatanodeRestarts which occasionally fails

2015-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626312#comment-14626312
 ] 

Hadoop QA commented on HDFS-8772:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |   5m 40s | Findbugs (version ) appears to 
be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 35s | There were no new javac warning 
messages. |
| {color:green}+1{color} | release audit |   0m 19s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 33s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 28s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 31s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   1m  7s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 160m 28s | Tests failed in hadoop-hdfs. |
| | | 180m 16s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestAppendSnapshotTruncate |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12745227/HDFS-8772.01.patch |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / ac94ba3 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11697/artifact/patchprocess/whitespace.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11697/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11697/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11697/console |


This message was automatically generated.

> fix TestStandbyIsHot#testDatanodeRestarts which occasionally fails  
> 
>
> Key: HDFS-8772
> URL: https://issues.apache.org/jira/browse/HDFS-8772
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Walter Su
>Assignee: Walter Su
> Attachments: HDFS-8772.01.patch
>
>
> https://builds.apache.org/job/PreCommit-HDFS-Build/11596/testReport/
> https://builds.apache.org/job/PreCommit-HDFS-Build/11598/testReport/
> https://builds.apache.org/job/PreCommit-HDFS-Build/11599/testReport/
> https://builds.apache.org/job/PreCommit-HDFS-Build/11600/testReport/
> https://builds.apache.org/job/PreCommit-HDFS-Build/11606/testReport/
> https://builds.apache.org/job/PreCommit-HDFS-Build/11608/testReport/
> https://builds.apache.org/job/PreCommit-HDFS-Build/11612/testReport/
> https://builds.apache.org/job/PreCommit-HDFS-Build/11618/testReport/
> https://builds.apache.org/job/PreCommit-HDFS-Build/11650/testReport/
> https://builds.apache.org/job/PreCommit-HDFS-Build/11655/testReport/
> https://builds.apache.org/job/PreCommit-HDFS-Build/11659/testReport/
> https://builds.apache.org/job/PreCommit-HDFS-Build/11663/testReport/
> https://builds.apache.org/job/PreCommit-HDFS-Build/11664/testReport/
> https://builds.apache.org/job/PreCommit-HDFS-Build/11667/testReport/
> https://builds.apache.org/job/PreCommit-HDFS-Build/11669/testReport/
> https://builds.apache.org/job/PreCommit-HDFS-Build/11676/testReport/
> https://builds.apache.org/job/PreCommit-HDFS-Build/11677/testReport/
> {noformat}
> java.lang.AssertionError: expected:<0> but was:<4>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyIsHot.testDatanodeRestarts(TestStandbyIsHot.java:188)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8773) Few FSNamesystem metrics are not documented in the Metrics page

2015-07-14 Thread Rakesh R (JIRA)
Rakesh R created HDFS-8773:
--

 Summary: Few FSNamesystem metrics are not documented in the 
Metrics page
 Key: HDFS-8773
 URL: https://issues.apache.org/jira/browse/HDFS-8773
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Reporter: Rakesh R
Assignee: Rakesh R


This jira is to document missing metrics in the [Metrics 
page|https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/Metrics.html#FSNamesystem].
 The following are not documented:
{code}
MissingReplOneBlocks
NumFilesUnderConstruction
NumActiveClients
HAState
FSState
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8475) Exception in createBlockOutputStream java.io.EOFException: Premature EOF: no length prefix available

2015-07-14 Thread Vinod Valecha (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626288#comment-14626288
 ] 

Vinod Valecha commented on HDFS-8475:
-

Hi team,

Could this be a configuration issue with hadoop? Can you please point us to 
the configuration that we should be looking at in order to correct this?
Thanks a lot!


> Exception in createBlockOutputStream java.io.EOFException: Premature EOF: no 
> length prefix available
> 
>
> Key: HDFS-8475
> URL: https://issues.apache.org/jira/browse/HDFS-8475
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Vinod Valecha
>Priority: Blocker
>
> Scenario:
> =
> write a file
> corrupt a block manually
> Exception stack trace- 
> 2015-05-24 02:31:55.291 INFO [T-33716795] 
> [org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer] Exception in 
> createBlockOutputStream
> java.io.EOFException: Premature EOF: no length prefix available
> at 
> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1492)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1155)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1088)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
> [5/24/15 2:31:55:291 UTC] 02027a3b DFSClient I 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer createBlockOutputStream 
> Exception in createBlockOutputStream
>  java.io.EOFException: Premature EOF: no 
> length prefix available
> at 
> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1492)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1155)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1088)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
> 2015-05-24 02:31:55.291 INFO [T-33716795] 
> [org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer] Abandoning 
> BP-176676314-10.108.106.59-1402620296713:blk_1404621403_330880579
> [5/24/15 2:31:55:291 UTC] 02027a3b DFSClient I 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer nextBlockOutputStream 
> Abandoning BP-176676314-10.108.106.59-1402620296713:blk_1404621403_330880579
> 2015-05-24 02:31:55.299 INFO [T-33716795] 
> [org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer] Excluding datanode 
> 10.108.106.59:50010
> [5/24/15 2:31:55:299 UTC] 02027a3b DFSClient I 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer nextBlockOutputStream 
> Excluding datanode 10.108.106.59:50010
> 2015-05-24 02:31:55.300 WARNING [T-33716795] 
> [org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer] DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File 
> /var/db/opera/files/B4889CCDA75F9751DDBB488E5AAB433E/BE4DAEF290B7136ED6EF3D4B157441A2/BE4DAEF290B7136ED6EF3D4B157441A2-4.pag
>  could only be replicated to 0 nodes instead of minReplication (=1).  There 
> are 1 datanode(s) running and 1 node(s) are excluded in this operation.
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
> [5/24/15 2:31:55:300 UTC] 02027a3b DFSClient W 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer run DataStreamer Exception
>  
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File 
> /var/db/opera/files/B4889CCDA75F9751DDBB488E5AAB433E/BE4DAEF290B7136ED6EF3D4B157441A2/BE4DAEF290B7136ED6EF3D4B157441A2-4.pag
>  could only be replicated to 0 nodes instead of minReplication (=1).  There 
> are 1 datanode(s) running and 1 node(s) are excluded in this operation.
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolPro

[jira] [Updated] (HDFS-8578) On upgrade, Datanode should process all storage/data dirs in parallel

2015-07-14 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-8578:

Attachment: HDFS-8578-07.patch

Fixed {{TestReplication}}, which fails intermittently; the failure is not 
exactly related to this Jira.

{{TestHDFSCLI}} failed due to a collision with the hadoop-common precommit job.
{{TestDataNodeRollingUpgrade}} passes locally.

> On upgrade, Datanode should process all storage/data dirs in parallel
> -
>
> Key: HDFS-8578
> URL: https://issues.apache.org/jira/browse/HDFS-8578
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Raju Bairishetti
>Assignee: Vinayakumar B
>Priority: Critical
> Attachments: HDFS-8578-01.patch, HDFS-8578-02.patch, 
> HDFS-8578-03.patch, HDFS-8578-04.patch, HDFS-8578-05.patch, 
> HDFS-8578-06.patch, HDFS-8578-07.patch, HDFS-8578-branch-2.6.0.patch
>
>
> Right now, during upgrades the datanode processes all the storage dirs 
> sequentially. Assume it takes ~20 mins to process a single storage dir; a 
> datanode with ~10 disks will then take around 3 hours to come up.
> *BlockPoolSliceStorage.java*
> {code}
>    for (int idx = 0; idx < getNumStorageDirs(); idx++) {
>      doTransition(datanode, getStorageDir(idx), nsInfo, startOpt);
>      assert getCTime() == nsInfo.getCTime()
>          : "Data-node and name-node CTimes must be the same.";
>    }
> {code}
> It would save a lot of time during major upgrades if the datanode processed 
> all storage dirs/disks in parallel.
> Can we make the datanode process all storage dirs in parallel?
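For reference, one common shape for this kind of change is to fan the 
per-directory {{doTransition()}} calls out to a thread pool and rejoin before 
the cTime check. The fragment below is a hedged sketch of that idea only, not 
the code in the attached patches; it assumes {{doTransition()}} is safe to run 
concurrently for distinct storage dirs, and the pool sizing is illustrative 
(imports from {{java.util}} and {{java.util.concurrent}} are assumed).

{code}
// Sketch only -- not the change in the attached patches.
// Assumes doTransition() may run concurrently for distinct storage dirs,
// and that datanode, nsInfo and startOpt are (effectively) final here.
ExecutorService pool = Executors.newFixedThreadPool(getNumStorageDirs());
List<Future<Void>> futures = new ArrayList<Future<Void>>();
for (int idx = 0; idx < getNumStorageDirs(); idx++) {
  final StorageDirectory sd = getStorageDir(idx);
  futures.add(pool.submit(new Callable<Void>() {
    @Override
    public Void call() throws IOException {
      doTransition(datanode, sd, nsInfo, startOpt);
      return null;
    }
  }));
}
try {
  for (Future<Void> f : futures) {
    f.get();  // re-throws if any directory's transition failed
  }
} catch (ExecutionException e) {
  throw new IOException("Storage directory transition failed", e.getCause());
} catch (InterruptedException e) {
  Thread.currentThread().interrupt();
  throw new IOException("Interrupted while upgrading storage dirs", e);
} finally {
  pool.shutdown();
}
assert getCTime() == nsInfo.getCTime()
    : "Data-node and name-node CTimes must be the same.";
{code}

With ten disks this turns ten sequential ~20-minute passes into roughly one, 
bounded by the slowest directory.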



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8143) HDFS Mover tool should exit after some retry when failed to move blocks.

2015-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626241#comment-14626241
 ] 

Hudson commented on HDFS-8143:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #986 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/986/])
Add HDFS-8143 to CHANGES.txt. (szetszwo: rev 
f7c8311e9836ad1a1a2ef6eca8b42fd61a688164)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> HDFS Mover tool should exit after some retry when failed to move blocks.
> 
>
> Key: HDFS-8143
> URL: https://issues.apache.org/jira/browse/HDFS-8143
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Fix For: 2.7.1
>
> Attachments: HDFS-8143.patch, HDFS-8143_1.patch, HDFS-8143_2.patch, 
> HDFS-8143_3.patch
>
>
> The Mover does not exit when it fails to move blocks.
> {code}
> hasRemaining |= Dispatcher.waitForMoveCompletion(storages.targets.values());
> {code}
> {{Dispatcher.waitForMoveCompletion()}} always returns true if some block 
> migrations failed, so {{hasRemaining}} never becomes false.
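In outline, "exit after some retry" means bounding the Mover's outer loop. The 
fragment below is a sketch of that shape only, not the attached patch; 
{{processNamespace()}} and {{maxRetries}} are stand-ins for the real Mover 
internals, and {{ExitStatus}} is the balancer's existing exit-code enum.

{code}
// Sketch only -- bounds the loop that previously never terminated because
// waitForMoveCompletion() kept reporting failed moves as "remaining".
int retryCount = 0;
final int maxRetries = 5;  // illustrative limit
while (true) {
  boolean hasRemaining = processNamespace();  // true if moves are pending or failed
  if (!hasRemaining) {
    return ExitStatus.SUCCESS.getExitCode();
  }
  if (++retryCount > maxRetries) {
    return ExitStatus.IO_EXCEPTION.getExitCode();  // give up instead of spinning
  }
}
{code}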



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8541) Mover should exit with NO_MOVE_PROGRESS if there is no move progress

2015-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626239#comment-14626239
 ] 

Hudson commented on HDFS-8541:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #986 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/986/])
HDFS-8541. Mover should exit with NO_MOVE_PROGRESS if there is no move 
progress.  Contributed by Surendra Singh Lilhore (szetszwo: rev 
9ef03a4c5bb5573eadc7d04e371c4af2dc6bae37)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java


> Mover should exit with NO_MOVE_PROGRESS if there is no move progress
> 
>
> Key: HDFS-8541
> URL: https://issues.apache.org/jira/browse/HDFS-8541
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-8541.patch, HDFS-8541_1.patch, HDFS-8541_2.patch
>
>
> HDFS-8143 changed the Mover to exit after some retries when it fails to move 
> blocks.  Two additional suggestions:
> # The Mover retry counter should be incremented only if all moves fail.  If 
> there are some successful moves, the counter should be reset.
> # The Mover should exit with NO_MOVE_PROGRESS instead of IO_EXCEPTION in 
> case of failure.
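Expressed as code, the two suggestions amount to resetting the counter on 
partial progress and changing the terminal exit code. A hedged sketch follows; 
{{noBlockMoved}} and {{maxRetries}} are stand-in names, not the committed 
change.

{code}
// Sketch of the two suggestions above, not the committed change.
// noBlockMoved is a stand-in flag: true when every scheduled move failed.
if (hasRemaining && noBlockMoved) {
  if (++retryCount > maxRetries) {
    return ExitStatus.NO_MOVE_PROGRESS.getExitCode();  // not IO_EXCEPTION
  }
} else {
  retryCount = 0;  // some moves succeeded, so start counting afresh
}
{code}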



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8143) HDFS Mover tool should exit after some retry when failed to move blocks.

2015-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626231#comment-14626231
 ] 

Hudson commented on HDFS-8143:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #256 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/256/])
Add HDFS-8143 to CHANGES.txt. (szetszwo: rev 
f7c8311e9836ad1a1a2ef6eca8b42fd61a688164)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> HDFS Mover tool should exit after some retry when failed to move blocks.
> 
>
> Key: HDFS-8143
> URL: https://issues.apache.org/jira/browse/HDFS-8143
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Fix For: 2.7.1
>
> Attachments: HDFS-8143.patch, HDFS-8143_1.patch, HDFS-8143_2.patch, 
> HDFS-8143_3.patch
>
>
> The Mover does not exit when it fails to move blocks.
> {code}
> hasRemaining |= Dispatcher.waitForMoveCompletion(storages.targets.values());
> {code}
> {{Dispatcher.waitForMoveCompletion()}} always returns true if some block 
> migrations failed, so {{hasRemaining}} never becomes false.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8541) Mover should exit with NO_MOVE_PROGRESS if there is no move progress

2015-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626228#comment-14626228
 ] 

Hudson commented on HDFS-8541:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #256 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/256/])
HDFS-8541. Mover should exit with NO_MOVE_PROGRESS if there is no move 
progress.  Contributed by Surendra Singh Lilhore (szetszwo: rev 
9ef03a4c5bb5573eadc7d04e371c4af2dc6bae37)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java


> Mover should exit with NO_MOVE_PROGRESS if there is no move progress
> 
>
> Key: HDFS-8541
> URL: https://issues.apache.org/jira/browse/HDFS-8541
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-8541.patch, HDFS-8541_1.patch, HDFS-8541_2.patch
>
>
> HDFS-8143 changed the Mover to exit after some retries when it fails to move 
> blocks.  Two additional suggestions:
> # The Mover retry counter should be incremented only if all moves fail.  If 
> there are some successful moves, the counter should be reset.
> # The Mover should exit with NO_MOVE_PROGRESS instead of IO_EXCEPTION in 
> case of failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8767) RawLocalFileSystem.listStatus() returns null for UNIX pipefile

2015-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626209#comment-14626209
 ] 

Hadoop QA commented on HDFS-8767:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 11s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   8m  6s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  2s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m  7s | The applied patch generated  1 
new checkstyle issues (total was 21, now 21). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 21s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 53s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests |  22m  4s | Tests failed in 
hadoop-common. |
| | |  62m 44s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.crypto.key.TestValueQueue |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12745231/HDFS-8767-00.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 4084eaf |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/11698/artifact/patchprocess/diffcheckstylehadoop-common.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11698/artifact/patchprocess/testrun_hadoop-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11698/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11698/console |


This message was automatically generated.

> RawLocalFileSystem.listStatus() returns null for UNIX pipefile
> --
>
> Key: HDFS-8767
> URL: https://issues.apache.org/jira/browse/HDFS-8767
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haohui Mai
>Assignee: kanaka kumar avvaru
>Priority: Critical
> Attachments: HDFS-8767-00.patch
>
>
> Calling FileSystem.listStatus() on a UNIX pipe file returns null instead of 
> the file's status. The bug breaks Hive when it loads data from a UNIX pipe file.
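A minimal reproduction of the report could create a FIFO and list it through 
the local filesystem. This is an illustrative sketch only; it assumes a POSIX 
host with {{mkfifo}} on the PATH, and the class name and pipe path are made up.

{code}
import java.io.File;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustrative repro only; assumes a POSIX host with mkfifo available.
public class PipeListStatusRepro {
  public static void main(String[] args) throws Exception {
    File pipe = new File("/tmp/repro-pipe-" + System.nanoTime());
    Runtime.getRuntime()
        .exec(new String[] {"mkfifo", pipe.getAbsolutePath()}).waitFor();

    FileSystem localFs = FileSystem.getLocal(new Configuration());
    // Per this report, listStatus() returns null for the pipe instead of
    // a single-element FileStatus array.
    FileStatus[] statuses = localFs.listStatus(new Path(pipe.getAbsolutePath()));
    System.out.println(statuses == null
        ? "null (bug reproduced)" : statuses.length + " entries");
    pipe.delete();
  }
}
{code}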



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8767) RawLocalFileSystem.listStatus() returns null for UNIX pipefile

2015-07-14 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626184#comment-14626184
 ] 

Steve Loughran commented on HDFS-8767:
--

This'll need a test for Unix which at least downgrades on Windows.
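Something like the JUnit sketch below would do that: gate the test with an 
assumption so it is skipped, not failed, on Windows. The class and method 
names are illustrative, not the test that eventually shipped.

{code}
// Sketch of the OS gating only; the body would create a FIFO and assert
// that listStatus() is non-null, as in the repro above.
import static org.junit.Assume.assumeTrue;

import org.apache.hadoop.util.Shell;
import org.junit.Test;

public class TestPipeFileListStatus {
  @Test
  public void testListStatusOnPipeFile() throws Exception {
    // There is no mkfifo on Windows, so downgrade the test to "skipped"
    // there instead of letting it fail.
    assumeTrue(!Shell.WINDOWS);
    // ... create the FIFO and exercise RawLocalFileSystem.listStatus() ...
  }
}
{code}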

> RawLocalFileSystem.listStatus() returns null for UNIX pipefile
> --
>
> Key: HDFS-8767
> URL: https://issues.apache.org/jira/browse/HDFS-8767
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haohui Mai
>Assignee: kanaka kumar avvaru
>Priority: Critical
> Attachments: HDFS-8767-00.patch
>
>
> Calling FileSystem.listStatus() on a UNIX pipe file returns null instead of 
> the file's status. The bug breaks Hive when it loads data from a UNIX pipe file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8578) On upgrade, Datanode should process all storage/data dirs in parallel

2015-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626168#comment-14626168
 ] 

Hadoop QA commented on HDFS-8578:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  16m  1s | Findbugs (version ) appears to 
be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 55s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  0s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 33s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 37s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  7s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 158m 41s | Tests failed in hadoop-hdfs. |
| | | 201m 29s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestReplication |
|   | hadoop.cli.TestHDFSCLI |
|   | hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12745204/HDFS-8578-06.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / a431ed9 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11695/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11695/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11695/console |


This message was automatically generated.

> On upgrade, Datanode should process all storage/data dirs in parallel
> -
>
> Key: HDFS-8578
> URL: https://issues.apache.org/jira/browse/HDFS-8578
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Raju Bairishetti
>Assignee: Vinayakumar B
>Priority: Critical
> Attachments: HDFS-8578-01.patch, HDFS-8578-02.patch, 
> HDFS-8578-03.patch, HDFS-8578-04.patch, HDFS-8578-05.patch, 
> HDFS-8578-06.patch, HDFS-8578-branch-2.6.0.patch
>
>
> Right now, during upgrades the datanode processes all the storage dirs 
> sequentially. Assume it takes ~20 mins to process a single storage dir; a 
> datanode with ~10 disks will then take around 3 hours to come up.
> *BlockPoolSliceStorage.java*
> {code}
>    for (int idx = 0; idx < getNumStorageDirs(); idx++) {
>      doTransition(datanode, getStorageDir(idx), nsInfo, startOpt);
>      assert getCTime() == nsInfo.getCTime()
>          : "Data-node and name-node CTimes must be the same.";
>    }
> {code}
> It would save a lot of time during major upgrades if the datanode processed 
> all storage dirs/disks in parallel.
> Can we make the datanode process all storage dirs in parallel?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

