[jira] [Commented] (HDFS-8925) Move BlockReader to hdfs-client

2015-08-19 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704396#comment-14704396
 ] 

Zhe Zhang commented on HDFS-8925:
-

Thanks Haohui for the feedback. With the current {{ErasureCodingWorker}} design 
I think we are almost certain to move {{BlockReader}} out of the client module 
again. I'll leave it to [~hitliuyi] to comment on whether it makes sense to 
reimplement a block reader for the DN.

> Move BlockReader to hdfs-client
> ---
>
> Key: HDFS-8925
> URL: https://issues.apache.org/jira/browse/HDFS-8925
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
>
> This jira tracks the effort of moving the {{BlockReader}} class into the 
> hdfs-client module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8927) CredentialsSys is not unix/linux compatible

2015-08-19 Thread Haim Helman (JIRA)
Haim Helman created HDFS-8927:
-

 Summary: CredentialsSys is not unix/linux compatible
 Key: HDFS-8927
 URL: https://issues.apache.org/jira/browse/HDFS-8927
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: nfs
Reporter: Haim Helman
Priority: Minor


When trying to connect to a Linux NFS server using AUTH_SYS and a hostname of 
33 bytes, I get:
bad auth_len gid 0 str 36 auth 53

Looking at the Unix/Linux code in svc_auth_unix.c, it looks like the hostname 
length is rounded up to the nearest multiple of 4:
str_len = RNDUP(str_len);

Perhaps CredentialsSys should do that too?
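
For illustration, a minimal sketch of that padding on the Java side. This is 
not the actual CredentialsSys code; the class and method names are 
hypothetical, and only the RNDUP arithmetic mirrors svc_auth_unix.c:

{code:java}
public final class RndupExample {
  // Round len up to the nearest multiple of 4, mirroring RNDUP(str_len).
  static int rndup(int len) {
    return (len + 3) & ~3;
  }

  // Write the machine name followed by zero padding into an XDR buffer.
  static void writeMachineName(java.nio.ByteBuffer xdr, String hostname) {
    byte[] name = hostname.getBytes(java.nio.charset.StandardCharsets.US_ASCII);
    xdr.putInt(name.length);      // XDR stores the unpadded length
    xdr.put(name);                // then the bytes themselves
    for (int i = name.length; i < rndup(name.length); i++) {
      xdr.put((byte) 0);          // then 0-3 bytes of zero padding
    }
  }

  public static void main(String[] args) {
    // A 33-byte hostname occupies 36 bytes on the wire, matching "str 36" above.
    System.out.println(rndup(33));  // prints 36
  }
}
{code}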



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS

2015-08-19 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704384#comment-14704384
 ] 

Zhe Zhang commented on HDFS-7285:
-

I just finished steps #1 and #2 above, and pushed the result of the "git merge" 
to the {{HDFS-7285-merge}} branch. The Jenkins 
[job|https://builds.apache.org/job/Hadoop-HDFS-7285-Merge/] has been triggered.

Because HDFS-8801 requires major changes to the branch, this "git merge" was 
against HDFS-6407, which immediately precedes HDFS-8801 in trunk. [~jingzhao] 
has created a patch under HDFS-8909 which should be able to merge HDFS-8801 to 
the branch (I think we should keep it as a separate JIRA because it does more 
than just merging).

> Erasure Coding Support inside HDFS
> --
>
> Key: HDFS-7285
> URL: https://issues.apache.org/jira/browse/HDFS-7285
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Weihua Jiang
>Assignee: Zhe Zhang
> Attachments: Consolidated-20150707.patch, 
> Consolidated-20150806.patch, Consolidated-20150810.patch, ECAnalyzer.py, 
> ECParser.py, HDFS-7285-initial-PoC.patch, 
> HDFS-7285-merge-consolidated-01.patch, 
> HDFS-7285-merge-consolidated-trunk-01.patch, 
> HDFS-7285-merge-consolidated.trunk.03.patch, 
> HDFS-7285-merge-consolidated.trunk.04.patch, 
> HDFS-EC-Merge-PoC-20150624.patch, HDFS-EC-merge-consolidated-01.patch, 
> HDFS-bistriped.patch, HDFSErasureCodingDesign-20141028.pdf, 
> HDFSErasureCodingDesign-20141217.pdf, HDFSErasureCodingDesign-20150204.pdf, 
> HDFSErasureCodingDesign-20150206.pdf, HDFSErasureCodingPhaseITestPlan.pdf, 
> fsimage-analysis-20150105.pdf
>
>
> Erasure Coding (EC) can greatly reduce storage overhead without sacrificing 
> data reliability, compared to the existing HDFS 3-replica approach. For 
> example, if we use a 10+4 Reed-Solomon coding, we can tolerate the loss of 4 
> blocks with a storage overhead of only 40%. This makes EC a quite attractive 
> alternative for big data storage, particularly for cold data. 
> Facebook had a related open source project called HDFS-RAID. It used to be 
> one of the contrib packages in HDFS but was removed in Hadoop 2.0 for 
> maintenance reasons. Its drawbacks are: 1) it sits on top of HDFS and depends 
> on MapReduce to do encoding and decoding tasks; 2) it can only be used for 
> cold files that will not be appended anymore; 3) the pure Java EC coding 
> implementation is extremely slow in practical use. For these reasons, it 
> might not be a good idea to simply bring HDFS-RAID back.
> We (Intel and Cloudera) are working on a design to build EC into HDFS that 
> gets rid of any external dependencies, making it self-contained and 
> independently maintained. This design lays the EC feature on top of the 
> storage-type support and is designed to be compatible with existing HDFS 
> features like caching, snapshots, encryption, and high availability. It will 
> also support different EC coding schemes, implementations, and policies for 
> different deployment scenarios. By utilizing advanced libraries (e.g. the 
> Intel ISA-L library), an implementation can greatly improve the performance 
> of EC encoding/decoding and make the EC solution even more attractive. We 
> will post the design document soon. 
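
For concreteness, the figures quoted above work out as follows: RS(10,4) stores 
4 parity blocks for every 10 data blocks, an overhead of 4/10 = 40%, whereas 
3-way replication stores 2 extra copies of every block, an overhead of 
2/1 = 200%.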



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build incremental copy list in distcp

2015-08-19 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704343#comment-14704343
 ] 

Yongjun Zhang commented on HDFS-8828:
-

+1 on rev 011. Will commit tomorrow morning.

Thanks [~yufeigu] and [~jingzhao]!



> Utilize Snapshot diff report to build incremental copy list in distcp
> -
>
> Key: HDFS-8828
> URL: https://issues.apache.org/jira/browse/HDFS-8828
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp, snapshots
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
> HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, 
> HDFS-8828.006.patch, HDFS-8828.007.patch, HDFS-8828.008.patch, 
> HDFS-8828.009.patch, HDFS-8828.010.patch, HDFS-8828.011.patch
>
>
> Some users have reported a huge time cost to build the file copy list in 
> distcp (30 hours for 1.6M files). We can leverage the snapshot diff report to 
> build a copy list containing only the files/dirs that changed between two 
> snapshots (or a snapshot and a normal dir). This speeds up the process in two 
> ways: 1. less copy-list building time; 2. fewer file-copy MR jobs.
> The HDFS snapshot diff report provides information about file/directory 
> creation, deletion, rename, and modification between two snapshots, or 
> between a snapshot and a normal directory. HDFS-7535 synchronizes deletion 
> and rename, then falls back to the default distcp, so it still relies on the 
> default distcp to build the complete list of files under the source dir. This 
> patch puts only created and modified files into the copy list, based on the 
> snapshot diff report, minimizing the number of files to copy.
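
To make the idea concrete, here is a minimal sketch assuming the public 
{{SnapshotDiffReport}} API; it illustrates the approach and is not the actual 
patch code:

{code:java}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.SnapshotDiffReport;
import org.apache.hadoop.hdfs.protocol.SnapshotDiffReport.DiffReportEntry;
import org.apache.hadoop.hdfs.protocol.SnapshotDiffReport.DiffType;

public class IncrementalCopyListSketch {
  static List<Path> buildCopyList(DistributedFileSystem fs, Path sourceDir,
      String fromSnapshot, String toSnapshot) throws IOException {
    SnapshotDiffReport report =
        fs.getSnapshotDiffReport(sourceDir, fromSnapshot, toSnapshot);
    List<Path> copyList = new ArrayList<>();
    for (DiffReportEntry entry : report.getDiffList()) {
      // Only creations and modifications need copying; deletions and renames
      // are synchronized on the target first (HDFS-7535).
      if (entry.getType() == DiffType.CREATE
          || entry.getType() == DiffType.MODIFY) {
        copyList.add(new Path(sourceDir,
            new String(entry.getSourcePath(), StandardCharsets.UTF_8)));
      }
    }
    return copyList;
  }
}
{code}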



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8924) Add pluggable interface for reading replicas in DFSClient

2015-08-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704321#comment-14704321
 ] 

Hadoop QA commented on HDFS-8924:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  19m 42s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:red}-1{color} | javac |   7m 54s | The applied patch generated  1  
additional warning messages. |
| {color:green}+1{color} | javadoc |   9m 55s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 25s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 29s | The applied patch generated  
22 new checkstyle issues (total was 40, now 62). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 41s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 31s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 17s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 203m 24s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 28s | Tests passed in 
hadoop-hdfs-client. |
| | | 254m 26s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.TestNameNodeMetricsLogger |
|   | hadoop.hdfs.server.namenode.TestFileTruncate |
| Timed out tests | org.apache.hadoop.cli.TestHDFSCLI |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12751359/HDFS-8924.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 4e14f79 |
| javac | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12052/artifact/patchprocess/diffJavacWarnings.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12052/artifact/patchprocess/diffcheckstylehadoop-hdfs-client.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12052/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12052/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12052/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12052/console |


This message was automatically generated.

> Add pluggable interface for reading replicas in DFSClient
> -
>
> Key: HDFS-8924
> URL: https://issues.apache.org/jira/browse/HDFS-8924
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-8924.001.patch
>
>
> We should add a pluggable interface for reading replicas in the DFSClient.  
> This could be used to implement short-circuit reads on systems without file 
> descriptors, or for other optimizations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8287) DFSStripedOutputStream.writeChunk should not wait for writing parity

2015-08-19 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-8287:
---
Target Version/s: HDFS-7285

> DFSStripedOutputStream.writeChunk should not wait for writing parity 
> -
>
> Key: HDFS-8287
> URL: https://issues.apache.org/jira/browse/HDFS-8287
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Kai Sasaki
> Attachments: HDFS-8287-HDFS-7285.00.patch, 
> HDFS-8287-HDFS-7285.01.patch, HDFS-8287-HDFS-7285.02.patch, 
> HDFS-8287-HDFS-7285.03.patch, HDFS-8287-HDFS-7285.04.patch, 
> HDFS-8287-HDFS-7285.05.patch
>
>
> When a striping cell is full, writeChunk computes and generates parity 
> packets.  It sequentially calls waitAndQueuePacket, so the user client cannot 
> continue to write data until it finishes.
> We should allow the user client to continue writing instead of blocking it 
> while parity is being written.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8287) DFSStripedOutputStream.writeChunk should not wait for writing parity

2015-08-19 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704315#comment-14704315
 ] 

Rakesh R commented on HDFS-8287:


bq. I think it might need some more rewrite, so it is better to do in separate 
JIRA. Is that okay?
OK, that makes sense to me.

Thanks [~kaisasak]. +1, the latest patch looks good.

> DFSStripedOutputStream.writeChunk should not wait for writing parity 
> -
>
> Key: HDFS-8287
> URL: https://issues.apache.org/jira/browse/HDFS-8287
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Kai Sasaki
> Attachments: HDFS-8287-HDFS-7285.00.patch, 
> HDFS-8287-HDFS-7285.01.patch, HDFS-8287-HDFS-7285.02.patch, 
> HDFS-8287-HDFS-7285.03.patch, HDFS-8287-HDFS-7285.04.patch, 
> HDFS-8287-HDFS-7285.05.patch
>
>
> When a striping cell is full, writeChunk computes and generates parity 
> packets.  It sequentially calls waitAndQueuePacket, so the user client cannot 
> continue to write data until it finishes.
> We should allow the user client to continue writing instead of blocking it 
> while parity is being written.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8306) Generate ACL and Xattr outputs in OIV XML outputs

2015-08-19 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704288#comment-14704288
 ] 

Akira AJISAKA commented on HDFS-8306:
-

Thanks [~eddyxu] for creating the patch. Could you skip outputting the contents 
of a feature (e.g. quota by storage type) instead of throwing an IOException if 
the layout version of the fsimage does not support the feature?
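
A hedged sketch of what that guard could look like; 
{{maybeDumpQuotaByStorageType}} and {{dumpQuotaByStorageType}} are hypothetical 
helpers, not the patch's actual code:

{code:java}
import java.io.IOException;

import org.apache.hadoop.hdfs.server.namenode.NameNodeLayoutVersion;

class OivFeatureGuardSketch {
  void maybeDumpQuotaByStorageType(int layoutVersion) throws IOException {
    // Emit the section only when the fsimage layout version supports the
    // feature; older fsimages simply omit it instead of throwing IOException.
    if (NameNodeLayoutVersion.supports(
        NameNodeLayoutVersion.Feature.QUOTA_BY_STORAGE_TYPE, layoutVersion)) {
      dumpQuotaByStorageType();
    }
  }

  void dumpQuotaByStorageType() throws IOException {
    // placeholder for the real XML-writing logic
  }
}
{code}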

> Generate ACL and Xattr outputs in OIV XML outputs
> -
>
> Key: HDFS-8306
> URL: https://issues.apache.org/jira/browse/HDFS-8306
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HDFS-8306.000.patch, HDFS-8306.001.patch, 
> HDFS-8306.002.patch, HDFS-8306.003.patch, HDFS-8306.004.patch, 
> HDFS-8306.005.patch, HDFS-8306.006.patch, HDFS-8306.007.patch, 
> HDFS-8306.008.patch, HDFS-8306.debug0.patch, HDFS-8306.debug1.patch
>
>
> Currently, in the {{hdfs oiv}} XML outputs, not all fields of the fsimage 
> are output. This makes inspecting an {{fsimage}} via its XML output less 
> practical, and it prevents recovering an fsimage from an XML file.
> This JIRA adds ACLs and XAttrs to the XML outputs as the first step toward 
> the goal described in HDFS-8061.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8823) Move replication factor into individual blocks

2015-08-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704275#comment-14704275
 ] 

Hadoop QA commented on HDFS-8823:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 57s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 7 new or modified test files. |
| {color:green}+1{color} | javac |   7m 58s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 52s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 25s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 24s | The applied patch generated  2 
new checkstyle issues (total was 648, now 645). |
| {color:green}+1{color} | whitespace |   0m  7s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   2m 38s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 13s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 176m 27s | Tests failed in hadoop-hdfs. |
| | | 222m 10s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Timed out tests | org.apache.hadoop.cli.TestHDFSCLI |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12751373/HDFS-8823.006.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 4e14f79 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12051/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12051/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12051/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12051/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12051/console |


This message was automatically generated.

> Move replication factor into individual blocks
> --
>
> Key: HDFS-8823
> URL: https://issues.apache.org/jira/browse/HDFS-8823
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-8823.000.patch, HDFS-8823.001.patch, 
> HDFS-8823.002.patch, HDFS-8823.003.patch, HDFS-8823.004.patch, 
> HDFS-8823.005.patch, HDFS-8823.006.patch
>
>
> This jira proposes to record the replication factor in the {{BlockInfo}} 
> class. The change has two advantages:
> * Decoupling the namespace from the block management layer. It is a 
> prerequisite step for moving block management off the heap or into a separate 
> process.
> * Increased flexibility in replicating blocks. Currently the replication 
> factors of all blocks in a file have to be the same, equal to the highest 
> replication factor across all snapshots. The change will allow blocks in a 
> file to have different replication factors, potentially saving some space.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build incremental copy list in distcp

2015-08-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704270#comment-14704270
 ] 

Hadoop QA commented on HDFS-8828:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 56s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 38s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 40s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 26s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m 10s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 28s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 47s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | tools/hadoop tests |   6m 26s | Tests passed in 
hadoop-distcp. |
| | |  43m 32s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12751374/HDFS-8828.011.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 36b1a1e |
| hadoop-distcp test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12053/artifact/patchprocess/testrun_hadoop-distcp.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12053/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12053/console |


This message was automatically generated.

> Utilize Snapshot diff report to build incremental copy list in distcp
> -
>
> Key: HDFS-8828
> URL: https://issues.apache.org/jira/browse/HDFS-8828
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp, snapshots
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
> HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, 
> HDFS-8828.006.patch, HDFS-8828.007.patch, HDFS-8828.008.patch, 
> HDFS-8828.009.patch, HDFS-8828.010.patch, HDFS-8828.011.patch
>
>
> Some users have reported a huge time cost to build the file copy list in 
> distcp (30 hours for 1.6M files). We can leverage the snapshot diff report to 
> build a copy list containing only the files/dirs that changed between two 
> snapshots (or a snapshot and a normal dir). This speeds up the process in two 
> ways: 1. less copy-list building time; 2. fewer file-copy MR jobs.
> The HDFS snapshot diff report provides information about file/directory 
> creation, deletion, rename, and modification between two snapshots, or 
> between a snapshot and a normal directory. HDFS-7535 synchronizes deletion 
> and rename, then falls back to the default distcp, so it still relies on the 
> default distcp to build the complete list of files under the source dir. This 
> patch puts only created and modified files into the copy list, based on the 
> snapshot diff report, minimizing the number of files to copy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8829) DataNode sets SO_RCVBUF explicitly is disabling tcp auto-tuning

2015-08-19 Thread He Tianyi (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704266#comment-14704266
 ] 

He Tianyi commented on HDFS-8829:
-

Completely agree with you, [~cmccabe].

> DataNode sets SO_RCVBUF explicitly is disabling tcp auto-tuning
> ---
>
> Key: HDFS-8829
> URL: https://issues.apache.org/jira/browse/HDFS-8829
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.3.0, 2.6.0
>Reporter: He Tianyi
>Assignee: kanaka kumar avvaru
>
> {code:java}
>   private void initDataXceiver(Configuration conf) throws IOException {
> // find free port or use privileged port provided
> TcpPeerServer tcpPeerServer;
> if (secureResources != null) {
>   tcpPeerServer = new TcpPeerServer(secureResources);
> } else {
>   tcpPeerServer = new TcpPeerServer(dnConf.socketWriteTimeout,
>   DataNode.getStreamingAddr(conf));
> }
> 
> tcpPeerServer.setReceiveBufferSize(HdfsConstants.DEFAULT_DATA_SOCKET_SIZE);
> {code}
> The last line sets SO_RCVBUF explicitly, thus disabling TCP auto-tuning on 
> some systems.
> Shall we make this behavior configurable?
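
One possible shape for that, continuing the snippet above (the config key name 
is illustrative, not an existing HDFS property):

{code:java}
// A hedged sketch of the configurable behavior suggested above.
int recvBufferSize = conf.getInt(
    "dfs.datanode.transfer.socket.recv.buffer.size",  // illustrative key name
    HdfsConstants.DEFAULT_DATA_SOCKET_SIZE);
if (recvBufferSize > 0) {
  // Setting SO_RCVBUF explicitly disables the kernel's TCP auto-tuning.
  tcpPeerServer.setReceiveBufferSize(recvBufferSize);
}
// With recvBufferSize <= 0, SO_RCVBUF is left unset so the OS can auto-tune.
{code}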



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-247) A tool to plot the locations of the blocks of a directory

2015-08-19 Thread Avinash Desireddy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avinash Desireddy reassigned HDFS-247:
--

Assignee: Avinash Desireddy

> A tool to plot the locations of the blocks of a directory
> -
>
> Key: HDFS-247
> URL: https://issues.apache.org/jira/browse/HDFS-247
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Owen O'Malley
>Assignee: Avinash Desireddy
>  Labels: newbie
>
> It would be very useful to have a command that we could point at an HDFS 
> directory, which would use fsck to find the block locations of the data files 
> in that directory, group them by host, and display the distribution 
> graphically. We did this by hand and it was very useful for finding a skewed 
> distribution that was causing performance problems. The tool should also be 
> able to group by rack id and generate a similar plot.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8388) Time and Date format need to be in sync in Namenode UI page

2015-08-19 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704264#comment-14704264
 ] 

Surendra Singh Lilhore commented on HDFS-8388:
--

Yes, I will do this and update the patch here soon.

> Time and Date format need to be in sync in Namenode UI page
> ---
>
> Key: HDFS-8388
> URL: https://issues.apache.org/jira/browse/HDFS-8388
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Archana T
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Attachments: HDFS-8388-002.patch, HDFS-8388-003.patch, 
> HDFS-8388.patch, HDFS-8388_1.patch, ScreenShot-InvalidDate.png
>
>
> In the NameNode UI page, the date and time formats displayed are currently 
> not in sync:
> Started: Wed May 13 12:28:02 IST 2015
> Compiled: 23 Apr 2015 12:22:59 
> Block Deletion Start Time: 13 May 2015 12:28:02
> We can keep a common format in all the above places.
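
A minimal illustration of applying one shared pattern to all three timestamps 
(the pattern chosen here is only an example, not a decided format):

{code:java}
import java.text.SimpleDateFormat;
import java.util.Date;

public class CommonDateFormatSketch {
  public static void main(String[] args) {
    // One shared pattern keeps every timestamp on the page consistent.
    SimpleDateFormat fmt = new SimpleDateFormat("EEE MMM dd HH:mm:ss zzz yyyy");
    Date now = new Date();
    System.out.println("Started:                   " + fmt.format(now));
    System.out.println("Compiled:                  " + fmt.format(now));
    System.out.println("Block Deletion Start Time: " + fmt.format(now));
  }
}
{code}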



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8388) Time and Date format need to be in sync in Namenode UI page

2015-08-19 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704184#comment-14704184
 ] 

Akira AJISAKA commented on HDFS-8388:
-

bq. I didn't see any document for NNStarted, so I didn't added for new metric.
I think {{NNStarted}} should be documented as well. Would you document 
both {{NNStarted}} and the new metric?

> Time and Date format need to be in sync in Namenode UI page
> ---
>
> Key: HDFS-8388
> URL: https://issues.apache.org/jira/browse/HDFS-8388
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Archana T
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Attachments: HDFS-8388-002.patch, HDFS-8388-003.patch, 
> HDFS-8388.patch, HDFS-8388_1.patch, ScreenShot-InvalidDate.png
>
>
> In the NameNode UI page, the date and time formats displayed are currently 
> not in sync:
> Started: Wed May 13 12:28:02 IST 2015
> Compiled: 23 Apr 2015 12:22:59 
> Block Deletion Start Time: 13 May 2015 12:28:02
> We can keep a common format in all the above places.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7116) Add a command to get the bandwidth of balancer

2015-08-19 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704182#comment-14704182
 ] 

Akira AJISAKA commented on HDFS-7116:
-

bq. How about exposing balancerBandwidth value as a Datanode metric?
I'm fine with your suggestion. Let's start with this first.
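
A hedged sketch of the metrics2 approach; the gauge name and wiring are 
illustrative, not the actual DataNodeMetrics change (in real code the class 
would also be registered with DefaultMetricsSystem):

{code:java}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.MutableGaugeLong;

@Metrics(name = "DataNodeActivity", context = "dfs")
class BalancerBandwidthMetricSketch {
  @Metric("Balancer bandwidth in bytes per second")
  MutableGaugeLong balancerBandwidth;

  // Called whenever the DataNode receives a new bandwidth value.
  void setBalancerBandwidth(long bytesPerSecond) {
    balancerBandwidth.set(bytesPerSecond);
  }
}
{code}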

> Add a command to get the bandwidth of balancer
> --
>
> Key: HDFS-7116
> URL: https://issues.apache.org/jira/browse/HDFS-7116
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: balancer & mover
>Reporter: Akira AJISAKA
>Assignee: Rakesh R
> Attachments: HDFS-7116-00.patch, HDFS-7116-01.patch
>
>
> Currently, reading the logs is the only way to check how the balancer 
> bandwidth is set. It would be useful for administrators if they could get the 
> parameter via the CLI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8892) ShortCircuitCache.CacheCleaner can add Slot.isInvalid() check too

2015-08-19 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704181#comment-14704181
 ] 

Colin Patrick McCabe commented on HDFS-8892:


bq. I assume slot-invalidation will happen during block-invalidation/deletes 
{Primarily triggered by compaction/shard-takeover etc..}

Yes.

I guess the good thing about this patch is that it might reduce fd consumption 
in some scenarios.  The bad thing about this approach is that instead of only 
checking the older replicas, we would have to iterate over all replicas in 
order to check every slot.  Also, currently the Runnable runs less often when 
the replica timeout is longer, so this logic would have to be changed.

> ShortCircuitCache.CacheCleaner can add Slot.isInvalid() check too
> -
>
> Key: HDFS-8892
> URL: https://issues.apache.org/jira/browse/HDFS-8892
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.7.1
>Reporter: Ravikumar
>Assignee: kanaka kumar avvaru
>Priority: Minor
>
> Currently the CacheCleaner thread checks only for cache-expiry times. It 
> would be nice if it handled invalid slots too, in an extra pass over the 
> evictable map:
> for (ShortCircuitReplica replica : evictable.values()) {
>   if (!replica.getSlot().isValid()) {
>     purge(replica);
>   }
> }
> // Existing code...
> int numDemoted = demoteOldEvictableMmaped(curMs);
> int numPurged = 0;
> Long evictionTimeNs = Long.valueOf(0);
> ...
> Apps like HBase can tweak the expiry/staleness/cache-size params in the 
> DFS-Client, so that a ShortCircuitReplica will never be closed except when 
> its Slot is declared invalid. 
> I assume slot-invalidation will happen during block-invalidation/deletes 
> {Primarily triggered by compaction/shard-takeover etc..}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8922) IBM Java requires libdl for linking in native_mini_dfs

2015-08-19 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704149#comment-14704149
 ] 

Colin Patrick McCabe commented on HDFS-8922:


+1

> IBM Java requires libdl for linking in native_mini_dfs
> --
>
> Key: HDFS-8922
> URL: https://issues.apache.org/jira/browse/HDFS-8922
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 2.7.1
> Environment: IBM Java RHEL7.1 
>Reporter: Ayappan
> Attachments: HDFS-8922.patch
>
>
> Building hadoop-hdfs-project with -Pnative option using IBM Java fails with 
> the following error
> [exec] Linking C executable test_native_mini_dfs
>  [exec] /usr/bin/cmake -E cmake_link_script 
> CMakeFiles/test_native_mini_dfs.dir/link.txt --verbose=1
>  [exec] /usr/bin/cc   -g -Wall -O2 -D_REENTRANT -D_GNU_SOURCE 
> -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -fvisibility=hidden
> CMakeFiles/test_native_mini_dfs.dir/main/native/libhdfs/test_native_mini_dfs.c.o
>   -o test_native_mini_dfs -rdynamic libnative_mini_dfs.a 
> /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so -lpthread 
> -Wl,-rpath,/home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic
>  [exec] make[2]: Leaving directory 
> `/home/ayappan/hadoop_2.7.1_new/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/native'
>  [exec] make[1]: Leaving directory 
> `/home/ayappan/hadoop_2.7.1_new/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/native'
>  [exec] 
> /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so: 
> undefined reference to `dlopen'
>  [exec] 
> /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so: 
> undefined reference to `dlclose'
>  [exec] 
> /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so: 
> undefined reference to `dlerror'
>  [exec] 
> /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so: 
> undefined reference to `dlsym'
>  [exec] 
> /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so: 
> undefined reference to `dladdr'
>  [exec] collect2: error: ld returned 1 exit status
>  [exec] make[2]: *** [test_native_mini_dfs] Error 1
>  [exec] make[1]: *** [CMakeFiles/test_native_mini_dfs.dir/all] Error 2
>  [exec] make: *** [all] Error 2
> It seems like the IBM jvm requires libdl for linking in native_mini_dfs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8924) Add pluggable interface for reading replicas in DFSClient

2015-08-19 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704147#comment-14704147
 ] 

Lei (Eddy) Xu commented on HDFS-8924:
-

This patch mostly proposes an interface; it looks good to me.

Only a few minor comments:
* Could you remove the change in {{ClientContext.java}}?
* Could {{ReplicaAccessor}} and {{ReplicaAccessorBuilder}} be declared as 
{{interface}}s?

+1 after these are addressed.
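
A rough sketch of what the proposed plugin surface might look like; the 
signatures below are illustrative, not the actual API in the patch:

{code:java}
// Builds an accessor for one replica; a plugin returns null from build()
// when it cannot serve the replica, so the client can fall back.
abstract class ReplicaAccessorBuilder {
  abstract ReplicaAccessorBuilder setBlock(long blockId, String blockPoolId);
  abstract ReplicaAccessorBuilder setVisibleLength(long visibleLength);
  abstract ReplicaAccessor build();
}

// Reads replica bytes without assuming a file descriptor underneath.
abstract class ReplicaAccessor implements java.io.Closeable {
  /** Read up to len bytes at pos; return the bytes read, or -1 at EOF. */
  abstract int read(long pos, byte[] buf, int off, int len)
      throws java.io.IOException;
}
{code}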

> Add pluggable interface for reading replicas in DFSClient
> -
>
> Key: HDFS-8924
> URL: https://issues.apache.org/jira/browse/HDFS-8924
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-8924.001.patch
>
>
> We should add a pluggable interface for reading replicas in the DFSClient.  
> This could be used to implement short-circuit reads on systems without file 
> descriptors, or for other optimizations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8920) Erasure Coding: when recovering lost blocks, logs can be too verbose and hurt performance

2015-08-19 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704148#comment-14704148
 ] 

Colin Patrick McCabe commented on HDFS-8920:


These log messages reflect possible data loss. I do not think we should change 
the log level here, although maybe there is another change we could make.

> Erasure Coding: when recovering lost blocks, logs can be too verbose and hurt 
> performance
> -
>
> Key: HDFS-8920
> URL: https://issues.apache.org/jira/browse/HDFS-8920
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rui Li
>Assignee: Rui Li
>
> When we test reading data with datanodes killed, 
> {{DFSInputStream::getBestNodeDNAddrPair}} becomes a hot spot method and 
> effectively blocks the client JVM. This log seems too verbose:
> {code}
> if (chosenNode == null) {
>   DFSClient.LOG.warn("No live nodes contain block " + block.getBlock() +
>   " after checking nodes = " + Arrays.toString(nodes) +
>   ", ignoredNodes = " + ignoredNodes);
>   return null;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections

2015-08-19 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704102#comment-14704102
 ] 

Xiaobing Zhou commented on HDFS-8855:
-

Thanks [~wheat9] and [~bobhansen] for the review. Tracing down, 
ProtobufRpcEngine calls Client.getConnection, which fetches the connection from 
a cache (com.google.common.cache.Cache) using ConnectionId as the key. The 
ConnectionId is different for every webhdfs request, even when the URL and user 
are the same. That's why the DN constantly creates new NN connections in this 
case. We need to refactor ConnectionId (hashCode or equals) to satisfy the 
comparison invariant the cache assumes, so that caching works properly.
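
A self-contained illustration of that invariant (all names here are 
hypothetical): a Guava cache can only reuse entries when the key type 
implements equals()/hashCode() over its logical identity:

{code:java}
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

public class ConnectionCacheSketch {
  static final class Key {
    final String address;
    final String user;
    Key(String address, String user) { this.address = address; this.user = user; }
    @Override
    public boolean equals(Object o) {
      if (!(o instanceof Key)) return false;
      Key k = (Key) o;
      return address.equals(k.address) && user.equals(k.user);
    }
    @Override
    public int hashCode() { return Objects.hash(address, user); }
  }

  public static void main(String[] args) throws Exception {
    Cache<Key, String> cache = CacheBuilder.newBuilder().build();
    final AtomicInteger opened = new AtomicInteger();
    for (int i = 0; i < 3; i++) {
      // Each request builds a new Key object; the cache still reuses the
      // entry because equals()/hashCode() match. Without them, every call
      // would miss and "open" a new connection, as seen in this issue.
      cache.get(new Key("nn:8020", "hdfs"),
          () -> "conn-" + opened.incrementAndGet());
    }
    System.out.println("connections opened: " + opened.get()); // prints 1
  }
}
{code}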

> Webhdfs client leaks active NameNode connections
> 
>
> Key: HDFS-8855
> URL: https://issues.apache.org/jira/browse/HDFS-8855
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
> Environment: HDP 2.2
>Reporter: Bob Hansen
>Assignee: Xiaobing Zhou
> Attachments: HDFS-8855.1.patch, HDFS_8855.prototype.patch
>
>
> The attached script simulates a process opening ~50 files via webhdfs and 
> performing random reads.  Note that there are at most 50 concurrent reads, 
> and all webhdfs sessions are kept open.  Each read is ~64k at a random 
> position.  
> The script periodically (once per second) shells into the NameNode and 
> produces a summary of the socket states.  For my test cluster with 5 nodes, 
> it took ~30 seconds for the NameNode to reach ~25000 active connections and 
> fail.
> It appears that each request to the webhdfs client is opening a new 
> connection to the NameNode and keeping it open after the request is complete. 
>  If the process continues to run, eventually (~30-60 seconds), all of the 
> open connections are closed and the NameNode recovers.  
> This smells like SoftReference reaping.  Are we using SoftReferences in the 
> webhdfs client to cache NameNode connections but never re-using them?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6481) DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs

2015-08-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704101#comment-14704101
 ] 

Hadoop QA commented on HDFS-6481:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | patch |   0m  1s | The patch file was not named 
according to hadoop's naming conventions. Please see 
https://wiki.apache.org/hadoop/HowToContribute for instructions. |
| {color:blue}0{color} | pre-patch |  18m 54s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   8m  9s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 34s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 27s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 46s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 16s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 181m 21s | Tests failed in hadoop-hdfs. |
| | | 228m 58s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.TestNameNodeMetricsLogger |
| Timed out tests | org.apache.hadoop.cli.TestHDFSCLI |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12648186/hdfs-6481-v1.txt |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 3aac475 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12047/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12047/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12047/console |


This message was automatically generated.

> DatanodeManager#getDatanodeStorageInfos() should check the length of 
> storageIDs
> ---
>
> Key: HDFS-6481
> URL: https://issues.apache.org/jira/browse/HDFS-6481
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>  Labels: BB2015-05-TBR
> Attachments: hdfs-6481-v1.txt
>
>
> Ian Brooks reported the following stack trace:
> {code}
> 2014-06-03 13:05:03,915 WARN  [DataStreamer for file 
> /user/hbase/WALs/,16020,1401716790638/%2C16020%2C1401716790638.1401796562200
>  block BP-2121456822-10.143.38.149-1396953188241:blk_1074073683_332932] 
> hdfs.DFSClient: DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
>  0
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:594)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:430)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security

[jira] [Commented] (HDFS-8809) HDFS fsck reports HBase WALs files (under construction) as "CORRUPT" (missing blocks) when HBase is running

2015-08-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704088#comment-14704088
 ] 

Hadoop QA commented on HDFS-8809:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  16m 22s | Findbugs (version ) appears to 
be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m 16s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 19s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 34s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 42s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 37s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 47s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 23s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 177m 26s | Tests failed in hadoop-hdfs. |
| | | 221m 53s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.hdfs.server.namenode.ha.TestBootstrapStandbyWithQJM |
| Timed out tests | org.apache.hadoop.cli.TestHDFSCLI |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12751321/HDFS-8809.000.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 3aac475 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12048/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12048/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12048/console |


This message was automatically generated.

> HDFS fsck reports HBase WALs files (under construction) as "CORRUPT" (missing 
> blocks) when HBase is running
> ---
>
> Key: HDFS-8809
> URL: https://issues.apache.org/jira/browse/HDFS-8809
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.7.0
> Environment: Hadoop 2.7.1 and HBase 1.1.1, on SUSE11sp3 (other 
> Linuxes not tested, probably not platform-dependent).  This did NOT happen 
> with Hadoop 2.4 and HBase 0.98.
>Reporter: Sudhir Prakash
>Assignee: Jing Zhao
> Attachments: HDFS-8809.000.patch
>
>
> Whenever HBase is running, "hdfs fsck /" reports four HBase-related files in 
> the path "hbase/data/WALs/" as CORRUPT. Even after letting the cluster sit 
> idle for a couple of hours, it is still in the corrupt state.  If HBase is 
> shut down, the problem goes away.  If HBase is then restarted, the problem 
> recurs.  This was observed with Hadoop 2.7.1 and HBase 1.1.1, and did NOT 
> happen with Hadoop 2.4 and HBase 0.98.
> {code}
> hades1:/var/opt/teradata/packages # su hdfs
> hdfs@hades1:/var/opt/teradata/packages> hdfs fsck /
> Connecting to namenode via 
> http://hades1.labs.teradata.com:50070/fsck?ugi=hdfs&path=%2F
> FSCK started by hdfs (auth:SIMPLE) from /39.0.8.2 for path / at Wed Jun 24 
> 20:40:17 GMT 2015
> ...
> /apps/hbase/data/WALs/hades4.labs.teradata.com,16020,1435168292684/hades4.labs.teradata.com%2C16020%2C1435168292684.default.1435175500556:
>  MISSING 1 blocks of total size 83 B.
> /apps/hbase/data/WALs/hades5.labs.teradata.com,16020,1435168290466/hades5.labs.teradata.com%2C16020%2C1435168290466..meta.1435175562144.meta:
>  MISSING 1 blocks of total size 83 B.
> /apps/hbase/data/WALs/hades5.labs.teradata.com,16020,1435168290466/hades5.labs.teradata.com%2C16020%2C1435168290466.default.1435175498500:
>  MISSING 1 blocks of total size 83 B.
> /apps/hbase/data/WALs/hades6.labs.teradata.com,16020,1435168292373/hades6.labs.teradata.com%2C16020%2C1435168292373.default.1435175500301:
>  MISSING 1 blocks of total size 83 
> B..
> 
> .

[jira] [Updated] (HDFS-8900) Optimize XAttr memory footprint.

2015-08-19 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-8900:
-
Summary: Optimize XAttr memory footprint.  (was: Improve XAttr memory 
footprint.)

> Optimize XAttr memory footprint.
> 
>
> Key: HDFS-8900
> URL: https://issues.apache.org/jira/browse/HDFS-8900
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
>
> {code}
> private final ImmutableList<XAttr> xAttrs;
> {code}
> Currently we use the above in XAttrFeature. It's not efficient from a memory 
> point of view, since {{ImmutableList}} and {{XAttr}} carry per-object memory 
> overhead, and each object is subject to memory alignment padding. 
> We can use a {{byte[]}} in XAttrFeature and do some compaction in {{XAttr}}.
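
An illustrative sketch of such a packed layout (not the actual HDFS-8900 
change; {{XAttrEntry}} is a hypothetical stand-in for 
{{org.apache.hadoop.fs.XAttr}}):

{code:java}
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.List;

public class PackedXAttrsSketch {
  static final class XAttrEntry {
    final byte namespace;   // e.g. USER=0, TRUSTED=1, SYSTEM=2, ...
    final byte[] name;
    final byte[] value;
    XAttrEntry(byte ns, byte[] n, byte[] v) { namespace = ns; name = n; value = v; }
  }

  // Serialize each xattr as [namespace][name length][name][value length][value]
  // into a single byte[], avoiding per-XAttr object headers and the list wrapper.
  static byte[] pack(List<XAttrEntry> xAttrs) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    for (XAttrEntry a : xAttrs) {
      out.writeByte(a.namespace);
      out.writeShort(a.name.length);
      out.write(a.name);
      out.writeShort(a.value.length);
      out.write(a.value);
    }
    return bytes.toByteArray();   // one flat array instead of N objects
  }
}
{code}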



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8926) Update the distcp document for new improvements by using snapshot diff report

2015-08-19 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated HDFS-8926:
---
Description: HDFS-8828 utilizes the snapshot diff report to build an 
incremental copy list in distcp. We should update the DistCp document to 
describe how to use this feature.

> Update the distcp document for new improvements by using snapshot diff report
> -
>
> Key: HDFS-8926
> URL: https://issues.apache.org/jira/browse/HDFS-8926
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp, documentation
>Reporter: Yufei Gu
>Assignee: Yufei Gu
>
> HDFS-8828 utilizes the snapshot diff report to build an incremental copy 
> list in distcp. We should update the DistCp document to describe how to use 
> this feature.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8926) Update the distcp document for new improvements by using snapshot diff report

2015-08-19 Thread Yufei Gu (JIRA)
Yufei Gu created HDFS-8926:
--

 Summary: Update the distcp document for new improvements by using 
snapshot diff report
 Key: HDFS-8926
 URL: https://issues.apache.org/jira/browse/HDFS-8926
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: distcp, documentation
Reporter: Yufei Gu
Assignee: Yufei Gu






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build incremental copy list in distcp

2015-08-19 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704039#comment-14704039
 ] 

Yufei Gu commented on HDFS-8828:


Thank you very much, [~jingzhao] and [~yzhangal]!

I've uploaded a new patch (011) addressing all your comments. I will create a 
follow-up jira for the documentation.

Thanks.

> Utilize Snapshot diff report to build incremental copy list in distcp
> -
>
> Key: HDFS-8828
> URL: https://issues.apache.org/jira/browse/HDFS-8828
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp, snapshots
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
> HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, 
> HDFS-8828.006.patch, HDFS-8828.007.patch, HDFS-8828.008.patch, 
> HDFS-8828.009.patch, HDFS-8828.010.patch, HDFS-8828.011.patch
>
>
> Some users have reported a huge time cost to build the file copy list in 
> distcp (30 hours for 1.6M files). We can leverage the snapshot diff report to 
> build a copy list containing only the files/dirs that changed between two 
> snapshots (or a snapshot and a normal dir). This speeds up the process in two 
> ways: 1. less copy-list building time; 2. fewer file-copy MR jobs.
> The HDFS snapshot diff report provides information about file/directory 
> creation, deletion, rename, and modification between two snapshots, or 
> between a snapshot and a normal directory. HDFS-7535 synchronizes deletion 
> and rename, then falls back to the default distcp, so it still relies on the 
> default distcp to build the complete list of files under the source dir. This 
> patch puts only created and modified files into the copy list, based on the 
> snapshot diff report, minimizing the number of files to copy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8828) Utilize Snapshot diff report to build incremental copy list in distcp

2015-08-19 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated HDFS-8828:
---
Attachment: HDFS-8828.011.patch

> Utilize Snapshot diff report to build incremental copy list in distcp
> -
>
> Key: HDFS-8828
> URL: https://issues.apache.org/jira/browse/HDFS-8828
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp, snapshots
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
> HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, 
> HDFS-8828.006.patch, HDFS-8828.007.patch, HDFS-8828.008.patch, 
> HDFS-8828.009.patch, HDFS-8828.010.patch, HDFS-8828.011.patch
>
>
> Some users have reported a huge time cost to build the file copy list in 
> distcp (30 hours for 1.6M files). We can leverage the snapshot diff report to 
> build a copy list containing only the files/dirs that changed between two 
> snapshots (or a snapshot and a normal dir). This speeds up the process in two 
> ways: 1. less copy-list building time; 2. fewer file-copy MR jobs.
> The HDFS snapshot diff report provides information about file/directory 
> creation, deletion, rename, and modification between two snapshots, or 
> between a snapshot and a normal directory. HDFS-7535 synchronizes deletion 
> and rename, then falls back to the default distcp, so it still relies on the 
> default distcp to build the complete list of files under the source dir. This 
> patch puts only created and modified files into the copy list, based on the 
> snapshot diff report, minimizing the number of files to copy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8823) Move replication factor into individual blocks

2015-08-19 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-8823:
-
Attachment: HDFS-8823.006.patch

> Move replication factor into individual blocks
> --
>
> Key: HDFS-8823
> URL: https://issues.apache.org/jira/browse/HDFS-8823
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-8823.000.patch, HDFS-8823.001.patch, 
> HDFS-8823.002.patch, HDFS-8823.003.patch, HDFS-8823.004.patch, 
> HDFS-8823.005.patch, HDFS-8823.006.patch
>
>
> This jira proposes to record the replication factor in the {{BlockInfo}} 
> class. The change has two advantages:
> * Decoupling the namespace from the block management layer. It is a 
> prerequisite step for moving block management off the heap or into a separate 
> process.
> * Increased flexibility in replicating blocks. Currently the replication 
> factors of all blocks in a file have to be the same, equal to the highest 
> replication factor across all snapshots. The change will allow blocks in a 
> file to have different replication factors, potentially saving some space.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build incremental copy list in distcp

2015-08-19 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704021#comment-14704021
 ] 

Yongjun Zhang commented on HDFS-8828:
-

Thanks a lot [~jingzhao]!

Hi [~yufeigu],

Would you please work out a new rev to address our latest comments? I will 
commit it after jenkins.

And would you please create a follow-up jira for document update afterwards?

Thanks.




> Utilize Snapshot diff report to build incremental copy list in distcp
> -
>
> Key: HDFS-8828
> URL: https://issues.apache.org/jira/browse/HDFS-8828
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp, snapshots
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
> HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, 
> HDFS-8828.006.patch, HDFS-8828.007.patch, HDFS-8828.008.patch, 
> HDFS-8828.009.patch, HDFS-8828.010.patch
>
>
> Some users reported a huge time cost to build the file copy list in distcp 
> (30 hours for 1.6M files). We can leverage the snapshot diff report to build 
> a copy list containing only the files/dirs that changed between two snapshots 
> (or a snapshot and a normal dir). This speeds up the process in two ways: 1. 
> less copy-list building time; 2. fewer file-copy MR jobs.
> The HDFS snapshot diff report provides information about file/directory 
> creation, deletion, rename and modification between two snapshots, or between 
> a snapshot and a normal directory. HDFS-7535 synchronizes deletions and 
> renames, then falls back to the default distcp, so it still relies on the 
> default distcp to build the complete list of files under the source dir. This 
> patch puts only created and modified files into the copy list, based on the 
> snapshot diff report, minimizing the number of files to copy. 
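
A minimal sketch of the core idea, using the public 
{{DistributedFileSystem#getSnapshotDiffReport}} API; the variables {{dfs}}, 
{{sourceDir}}, {{fromSnapshot}}, {{toSnapshot}} and {{copyList}} are 
illustrative assumptions, not the patch's actual code:

{code}
// Sketch: list only the paths created or modified between two snapshots,
// instead of walking the whole source tree.
SnapshotDiffReport report =
    dfs.getSnapshotDiffReport(sourceDir, fromSnapshot, toSnapshot);
List<Path> copyList = new ArrayList<>();
for (SnapshotDiffReport.DiffReportEntry entry : report.getDiffList()) {
  SnapshotDiffReport.DiffType type = entry.getType();
  if (type == SnapshotDiffReport.DiffType.CREATE
      || type == SnapshotDiffReport.DiffType.MODIFY) {
    copyList.add(new Path(sourceDir,
        DFSUtil.bytes2String(entry.getSourcePath())));
  }
}
{code}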



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8287) DFSStripedOutputStream.writeChunk should not wait for writing parity

2015-08-19 Thread Kai Sasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Sasaki updated HDFS-8287:
-
Attachment: HDFS-8287-HDFS-7285.05.patch

> DFSStripedOutputStream.writeChunk should not wait for writing parity 
> -
>
> Key: HDFS-8287
> URL: https://issues.apache.org/jira/browse/HDFS-8287
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Kai Sasaki
> Attachments: HDFS-8287-HDFS-7285.00.patch, 
> HDFS-8287-HDFS-7285.01.patch, HDFS-8287-HDFS-7285.02.patch, 
> HDFS-8287-HDFS-7285.03.patch, HDFS-8287-HDFS-7285.04.patch, 
> HDFS-8287-HDFS-7285.05.patch
>
>
> When a striping cell is full, writeChunk computes and generates parity 
> packets.  It sequentially calls waitAndQueuePacket, so the user client cannot 
> continue writing data until it finishes.
> We should allow the user client to continue writing instead of blocking it 
> while parity is being written.
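
A minimal sketch of the non-blocking idea; {{onCellFull}} and 
{{encodeAndEnqueueParity}} are hypothetical names, and the attached patches 
introduce a dedicated {{ParityGenerator}} rather than an executor:

{code}
// Sketch: offload parity encoding/queueing to a background worker so
// writeChunk can return to the user immediately.
private final ExecutorService parityExecutor =
    Executors.newSingleThreadExecutor();

void onCellFull(final ByteBuffer[] cellBuffers) {
  parityExecutor.submit(new Runnable() {
    @Override
    public void run() {
      encodeAndEnqueueParity(cellBuffers);  // hypothetical helper
    }
  });
  // the user thread keeps writing data instead of waiting for parity
}
{code}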



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8823) Move replication factor into individual blocks

2015-08-19 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704010#comment-14704010
 ] 

Jing Zhao commented on HDFS-8823:
-

The latest patch looks good to me. One comment is that for {{setReplication}}, 
instead of checking/updating quota block by block, we should calculate the 
total delta first and check if it breaks the quota limit.
{code}
// Ensure the quota does not exceed
if (oldBR < replication) {
  for (BlockInfo b : file.getBlocks()) {
fsd.updateCount(iip, 0L, b.getNumBytes(), oldBR, replication,
true);
  }
}
{code}
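
A minimal sketch of the suggested aggregate check, assuming {{updateCount}} is 
linear in its byte-count argument; this is illustrative, not the actual quota 
API usage in the patch:

{code}
// Sketch: sum the byte delta over all blocks first, then perform a
// single quota check/update instead of one per block.
if (oldBR < replication) {
  long totalBytes = 0;
  for (BlockInfo b : file.getBlocks()) {
    totalBytes += b.getNumBytes();
  }
  // one verification and one update for the whole file
  fsd.updateCount(iip, 0L, totalBytes, oldBR, replication, true);
}
{code}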

[~daryn], do you want to take a look at the patch?

> Move replication factor into individual blocks
> --
>
> Key: HDFS-8823
> URL: https://issues.apache.org/jira/browse/HDFS-8823
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-8823.000.patch, HDFS-8823.001.patch, 
> HDFS-8823.002.patch, HDFS-8823.003.patch, HDFS-8823.004.patch, 
> HDFS-8823.005.patch
>
>
> This jira proposes to record the replication factor in the {{BlockInfo}} 
> class. The changes have two advantages:
> * Decoupling the namespace and the block management layer. It is a 
> prerequisite step for moving block management off the heap or to a separate 
> process.
> * Increased flexibility in replicating blocks. Currently all blocks of a file 
> have to share the same replication factor, which is the highest replication 
> factor across all snapshots. The changes will allow blocks in a file to have 
> different replication factors, potentially saving some space.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8888) Support volumes in HDFS

2015-08-19 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703974#comment-14703974
 ] 

Andrew Wang commented on HDFS-:
---

I see encryption zones as the closest thing semantically to volumes right now 
because of the rename restriction, and they have been incompatible with some 
applications like Hive (which we fixed) and HDFS trash (which we haven't). 
Right now that pain is restricted to the subset of HDFS users who are also 
using encryption, but volumes as a first-class citizen will bring this into the 
spotlight. Volumes might be compelling enough to revisit the various rename 
assumptions in our app stack, but we need to think hard about the app changes 
that are required.

The motivations you've listed for the first phase of development reference 
simpler implementation and management. Regarding implementation, we've already 
implemented the additional complexity of doing it at the directory level, so 
what's the advantage of changing it up now? Management-wise, I don't quite 
understand why it's easier to manage volumes vs. folders. You can treat some 
folders as you would volumes and get the same properties, right?

The scalability motivations are more compelling to me since they offer 
something we can't do now, but there's still more vertical scalability work we 
can do first that preserves existing semantics. We should also decide whether 
we want to pursue volumes vs. a true distributed namespace implementation, 
which might preserve existing semantics.

Finally, is this going to be linked with viewfs improvements? If volumes are a 
first-class citizen and are being added and removed all the time, it'd be nice 
to have a centralized mount table rather than having to push out new client 
configs each time. We'd also need it to be able to, say, list the set of 
volumes, or automatically choose a NN when provisioning or rebalancing volumes.

> Support volumes in HDFS
> ---
>
> Key: HDFS-
> URL: https://issues.apache.org/jira/browse/HDFS-
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>
> There are multiple types of zones (e.g., snapshottable directories, 
> encryption zones, directories with quotas) which are conceptually close to 
> namespace volumes in traditional file systems.
> This jira proposes to introduce the concept of volume to simplify the 
> implementation of snapshots and encryption zones.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8924) Add pluggable interface for reading replicas in DFSClient

2015-08-19 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703956#comment-14703956
 ] 

Colin Patrick McCabe commented on HDFS-8924:


One example is a storage appliance which wanted to create a custom 
short-circuit read implementation.  Creating a new BlockReader does not satisfy 
this use case because that block reader would have to contain hardware-specific 
code which does not belong upstream.  For example, it would have dependencies 
on specific JNI and other libraries for interfacing with the hardware.

> Add pluggable interface for reading replicas in DFSClient
> -
>
> Key: HDFS-8924
> URL: https://issues.apache.org/jira/browse/HDFS-8924
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-8924.001.patch
>
>
> We should add a pluggable interface for reading replicas in the DFSClient.  
> This could be used to implement short-circuit reads on systems without file 
> descriptors, or for other optimizations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8925) Move BlockReader to hdfs-client

2015-08-19 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703957#comment-14703957
 ] 

Haohui Mai commented on HDFS-8925:
--

In general the development of feature branches does not block changes in trunk.

Feature branches need to continuously stay in sync with trunk. Please feel free 
to create a jira if you think you need one to track merging things back.

> Move BlockReader to hdfs-client
> ---
>
> Key: HDFS-8925
> URL: https://issues.apache.org/jira/browse/HDFS-8925
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
>
> This jira tracks the effort of moving the {{BlockReader}} class into the 
> hdfs-client module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8883) NameNode Metrics : Add FSNameSystem lock Queue Length

2015-08-19 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-8883:

Reporter: Arpit Agarwal  (was: Anu Engineer)

> NameNode Metrics : Add FSNameSystem lock Queue Length
> -
>
> Key: HDFS-8883
> URL: https://issues.apache.org/jira/browse/HDFS-8883
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Arpit Agarwal
> Fix For: 2.8.0
>
> Attachments: HDFS-8883.001.patch
>
>
> FSNameSystemLock can have contention when the NameNode is under load. This 
> patch adds LockQueueLength -- the number of threads waiting on 
> FSNameSystemLock -- as a metric in the NameNode. 
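
A minimal sketch of how the metric could be exposed, assuming the namesystem 
lock wraps a fair {{ReentrantLock}} and the enclosing class is registered with 
metrics2; the attached patch may wire this differently:

{code}
// Sketch: ReentrantLock already tracks an estimate of the number of
// threads blocked waiting to acquire it.
private final ReentrantLock fsLock = new ReentrantLock(true);

@Metric
public int getLockQueueLength() {
  return fsLock.getQueueLength();
}
{code}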



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8883) NameNode Metrics : Add FSNameSystem lock Queue Length

2015-08-19 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-8883:

Assignee: Anu Engineer

> NameNode Metrics : Add FSNameSystem lock Queue Length
> -
>
> Key: HDFS-8883
> URL: https://issues.apache.org/jira/browse/HDFS-8883
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Arpit Agarwal
>Assignee: Anu Engineer
> Fix For: 2.8.0
>
> Attachments: HDFS-8883.001.patch
>
>
> FSNameSystemLock can have contention when the NameNode is under load. This 
> patch adds LockQueueLength -- the number of threads waiting on 
> FSNameSystemLock -- as a metric in the NameNode. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8883) NameNode Metrics : Add FSNameSystem lock Queue Length

2015-08-19 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-8883:

Assignee: (was: Anu Engineer)

> NameNode Metrics : Add FSNameSystem lock Queue Length
> -
>
> Key: HDFS-8883
> URL: https://issues.apache.org/jira/browse/HDFS-8883
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Anu Engineer
> Fix For: 2.8.0
>
> Attachments: HDFS-8883.001.patch
>
>
> FSNameSystemLock can have contention when the NameNode is under load. This 
> patch adds LockQueueLength -- the number of threads waiting on 
> FSNameSystemLock -- as a metric in the NameNode. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8917) Cleanup BlockInfoUnderConstruction from comments and tests

2015-08-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703954#comment-14703954
 ] 

Hudson commented on HDFS-8917:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8324 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8324/])
HDFS-8917. Cleanup BlockInfoUnderConstruction from comments and tests. 
Contributed by Zhe Zhang. (jing9: rev 4e14f7982a6e57bf08deb3b266806c2b779a157d)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotTestHelper.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileUnderConstructionFeature.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoContiguous.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockUnderConstructionFeature.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockInfoUnderConstruction.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockUnderConstructionFeature.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java


> Cleanup BlockInfoUnderConstruction from comments and tests
> --
>
> Key: HDFS-8917
> URL: https://issues.apache.org/jira/browse/HDFS-8917
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-8917.00.patch
>
>
> HDFS-8801 eliminates the {{BlockInfoUnderConstruction}} class. This JIRA is a 
> follow-on to cleanup comments and tests which refer to the class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8883) NameNode Metrics : Add FSNameSystem lock Queue Length

2015-08-19 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-8883:

Reporter: Anu Engineer  (was: Arpit Agarwal)

> NameNode Metrics : Add FSNameSystem lock Queue Length
> -
>
> Key: HDFS-8883
> URL: https://issues.apache.org/jira/browse/HDFS-8883
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: 2.8.0
>
> Attachments: HDFS-8883.001.patch
>
>
> FSNameSystemLock can have contention when the NameNode is under load. This 
> patch adds LockQueueLength -- the number of threads waiting on 
> FSNameSystemLock -- as a metric in the NameNode. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8925) Move BlockReader to hdfs-client

2015-08-19 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703952#comment-14703952
 ] 

Zhe Zhang commented on HDFS-8925:
-

While {{BlockReader}} is only used on the client side in trunk, the HDFS-7285 
branch uses it in DN. Maybe we should reconsider this refactor? Thanks.

> Move BlockReader to hdfs-client
> ---
>
> Key: HDFS-8925
> URL: https://issues.apache.org/jira/browse/HDFS-8925
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
>
> This jira tracks the effort of moving the {{BlockReader}} class into the 
> hdfs-client module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8287) DFSStripedOutputStream.writeChunk should not wait for writing parity

2015-08-19 Thread Kai Sasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703946#comment-14703946
 ] 

Kai Sasaki commented on HDFS-8287:
--

[~rakeshr] Thank you for reviewing! 

{quote}
We should improve this by notifying the writers, isn't it?
{quote}

It's reasonable. Currently neither the client side nor the streamer itself can 
handle exceptions thrown by {{ParityGenerator}}. There are two things we can do 
to improve this. 

1. Put the handling logic together into an {{UncaughtExceptionHandler}} for 
maintainability and readability.
2. Notify the client side of the exception from the 
{{UncaughtExceptionHandler}}, as sketched below.

I think this might need some more rewriting, so it is better done in a separate 
JIRA. Is that okay?

I'll address the other points you raised in this JIRA. Thank you.
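
A minimal sketch of point 2, assuming a hypothetical {{parityFailure}} holder 
and a {{parityTask}} Runnable; the real {{ParityGenerator}} wiring may differ:

{code}
// Sketch: capture the parity thread's failure so the user-facing write
// path can rethrow it instead of silently losing it.
final AtomicReference<Throwable> parityFailure = new AtomicReference<>();

Thread parityThread = new Thread(parityTask, "ParityGenerator");
parityThread.setUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
  @Override
  public void uncaughtException(Thread t, Throwable e) {
    parityFailure.set(e);
  }
});
parityThread.start();

// Later, e.g. in writeChunk or close:
Throwable failure = parityFailure.get();
if (failure != null) {
  throw new IOException("Parity generation failed", failure);
}
{code}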

> DFSStripedOutputStream.writeChunk should not wait for writing parity 
> -
>
> Key: HDFS-8287
> URL: https://issues.apache.org/jira/browse/HDFS-8287
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Kai Sasaki
> Attachments: HDFS-8287-HDFS-7285.00.patch, 
> HDFS-8287-HDFS-7285.01.patch, HDFS-8287-HDFS-7285.02.patch, 
> HDFS-8287-HDFS-7285.03.patch, HDFS-8287-HDFS-7285.04.patch
>
>
> When a striping cell is full, writeChunk computes and generates parity 
> packets.  It sequentially calls waitAndQueuePacket, so the user client cannot 
> continue writing data until it finishes.
> We should allow the user client to continue writing instead of blocking it 
> while parity is being written.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6244) Make Trash Interval configurable for each of the namespaces

2015-08-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703943#comment-14703943
 ] 

Hadoop QA commented on HDFS-6244:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12751344/HDFS-6244.v6.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 4e14f79 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12050/console |


This message was automatically generated.

> Make Trash Interval configurable for each of the namespaces
> ---
>
> Key: HDFS-6244
> URL: https://issues.apache.org/jira/browse/HDFS-6244
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.5-alpha
>Reporter: Siqi Li
>Assignee: Siqi Li
>  Labels: BB2015-05-TBR
> Attachments: HDFS-6244.v1.patch, HDFS-6244.v2.patch, 
> HDFS-6244.v3.patch, HDFS-6244.v4.patch, HDFS-6244.v5.patch, HDFS-6244.v6.patch
>
>
> Somehow we need to avoid the cluster filling up.
> One solution is to have a different trash policy per namespace. However, if 
> we can simply make the property configurable per namespace, then the same 
> config can be rolled everywhere and we'd be done. This seems simple enough.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8890) Allow admin to specify which blockpools the balancer should run on

2015-08-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703942#comment-14703942
 ] 

Hadoop QA commented on HDFS-8890:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  16m 12s | Findbugs (version ) appears to 
be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   8m 10s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 11s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 26s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 33s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | whitespace |   0m  1s | The patch has 3  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 39s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 39s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 20s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 173m 56s | Tests failed in hadoop-hdfs. |
| | | 217m 45s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestWriteBlockGetsBlockLengthHint |
| Timed out tests | org.apache.hadoop.hdfs.server.balancer.TestBalancer |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12751316/HDFS-8890-trunk-v2.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 3aac475 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12046/artifact/patchprocess/whitespace.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12046/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12046/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12046/console |


This message was automatically generated.

> Allow admin to specify which blockpools the balancer should run on
> --
>
> Key: HDFS-8890
> URL: https://issues.apache.org/jira/browse/HDFS-8890
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: HDFS-8890-trunk-v1.patch, HDFS-8890-trunk-v2.patch
>
>
> Currently the balancer runs on all blockpools. Allow an admin to run the 
> balancer on a set of blockpools. This will enable the balancer to skip 
> blockpools that should not be balanced. For example, a tmp blockpool that has 
> a large amount of churn.
> An example of the command line interface would be an additional flag that 
> specifies the blockpools by id:
> -blockpools 
> BP-6299761-10.55.116.188-1415904647555,BP-47348528-10.51.120.139-1415904199257
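
A minimal sketch of how such a flag could be consumed; {{blockpoolsArg}}, 
{{allBlockpoolIds}} and {{balanceBlockpool}} are illustrative assumptions, not 
the patch's actual code:

{code}
// Sketch: turn the comma-separated ids into a set, then skip pools the
// admin did not select.
Set<String> selected =
    new HashSet<>(Arrays.asList(blockpoolsArg.split(",")));
for (String bpid : allBlockpoolIds) {
  if (!selected.isEmpty() && !selected.contains(bpid)) {
    continue;  // not selected for balancing
  }
  balanceBlockpool(bpid);  // hypothetical helper
}
{code}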



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8925) Move BlockReader to hdfs-client

2015-08-19 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-8925:

Description: This jira tracks the effort of moving the {{BlockReader}} 
class into the hdfs-client module.  (was: This jira tracks the effort of moving 
the {{DfsClientConf}} class into the hdfs-client module.)

> Move BlockReader to hdfs-client
> ---
>
> Key: HDFS-8925
> URL: https://issues.apache.org/jira/browse/HDFS-8925
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
>
> This jira tracks the effort of moving the {{BlockReader}} class into the 
> hdfs-client module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8925) Move BlockReader to hdfs-client

2015-08-19 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-8925:

Hadoop Flags:   (was: Reviewed)

> Move BlockReader to hdfs-client
> ---
>
> Key: HDFS-8925
> URL: https://issues.apache.org/jira/browse/HDFS-8925
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
>
> This jira tracks the effort of moving the {{DfsClientConf}} class into the 
> hdfs-client module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8925) Move BlockReader to hdfs-client

2015-08-19 Thread Mingliang Liu (JIRA)
Mingliang Liu created HDFS-8925:
---

 Summary: Move BlockReader to hdfs-client
 Key: HDFS-8925
 URL: https://issues.apache.org/jira/browse/HDFS-8925
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: build
Reporter: Mingliang Liu
Assignee: Mingliang Liu
 Fix For: 2.8.0


This jira tracks the effort of moving the {{DfsClientConf}} class into the 
hdfs-client module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-6290) File is not closed in OfflineImageViewerPB#run()

2015-08-19 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai resolved HDFS-6290.
--
Resolution: Won't Fix

I don't think this is worth fixing as the life cycle of the file closely 
matches the life cycle of the process. The file will be automatically closed 
when the process exits.

> File is not closed in OfflineImageViewerPB#run()
> 
>
> Key: HDFS-6290
> URL: https://issues.apache.org/jira/browse/HDFS-6290
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Reporter: Ted Yu
>Priority: Minor
>
> {code}
> } else if (processor.equals("XML")) {
>   new PBImageXmlWriter(conf, out).visit(
>       new RandomAccessFile(inputFile, "r"));
> {code}
> The RandomAccessFile instance should be closed before the method returns.
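
For reference, the fix implied by the report would be a try-with-resources; a 
minimal sketch:

{code}
} else if (processor.equals("XML")) {
  try (RandomAccessFile file = new RandomAccessFile(inputFile, "r")) {
    new PBImageXmlWriter(conf, out).visit(file);
  }
}
{code}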



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8924) Add pluggable interface for reading replicas in DFSClient

2015-08-19 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703926#comment-14703926
 ] 

Haohui Mai commented on HDFS-8924:
--

Do you have any specific use cases in mind? Will creating a new {{BlockReader}} 
satisfy your use case?

> Add pluggable interface for reading replicas in DFSClient
> -
>
> Key: HDFS-8924
> URL: https://issues.apache.org/jira/browse/HDFS-8924
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-8924.001.patch
>
>
> We should add a pluggable interface for reading replicas in the DFSClient.  
> This could be used to implement short-circuit reads on systems without file 
> descriptors, or for other optimizations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build incremental copy list in distcp

2015-08-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703923#comment-14703923
 ] 

Hadoop QA commented on HDFS-8828:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 43s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 47s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 53s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 27s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  6s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 29s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 47s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | tools/hadoop tests |   6m 24s | Tests passed in 
hadoop-distcp. |
| | |  43m 40s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12751292/HDFS-8828.010.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 4e14f79 |
| hadoop-distcp test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12049/artifact/patchprocess/testrun_hadoop-distcp.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12049/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12049/console |


This message was automatically generated.

> Utilize Snapshot diff report to build incremental copy list in distcp
> -
>
> Key: HDFS-8828
> URL: https://issues.apache.org/jira/browse/HDFS-8828
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp, snapshots
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
> HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, 
> HDFS-8828.006.patch, HDFS-8828.007.patch, HDFS-8828.008.patch, 
> HDFS-8828.009.patch, HDFS-8828.010.patch
>
>
> Some users reported a huge time cost to build the file copy list in distcp 
> (30 hours for 1.6M files). We can leverage the snapshot diff report to build 
> a copy list containing only the files/dirs that changed between two snapshots 
> (or a snapshot and a normal dir). This speeds up the process in two ways: 1. 
> less copy-list building time; 2. fewer file-copy MR jobs.
> The HDFS snapshot diff report provides information about file/directory 
> creation, deletion, rename and modification between two snapshots, or between 
> a snapshot and a normal directory. HDFS-7535 synchronizes deletions and 
> renames, then falls back to the default distcp, so it still relies on the 
> default distcp to build the complete list of files under the source dir. This 
> patch puts only created and modified files into the copy list, based on the 
> snapshot diff report, minimizing the number of files to copy. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8924) Add pluggable interface for reading replicas in DFSClient

2015-08-19 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703916#comment-14703916
 ] 

Colin Patrick McCabe commented on HDFS-8924:


This patch adds a pluggable {{ReplicaAccessorBuilder}} class which can be used 
to create {{ReplicaAccessor}} objects.  Unlike {{BlockReader}}, 
{{ReplicaAccessor}} is a stable API which is decoupled from internal 
implementation details and non-public classes.  {{BlockReaderFactory}} will ask 
all of the configured {{ReplicaAccessorBuilder}} objects to create a new 
{{ReplicaAccessor}}.  If none are configured, or none can create one, we use 
the existing block reader code.  Otherwise, we create an 
{{ExternalBlockReader}} wrapping the {{ReplicaAccessor}}.  I also added a 
reserved {{DataTransferProtocol}} opcode (127) in {{Op.java}}.  This will 
ensure that anyone adding a custom opcode will not conflict with other new 
opcodes added upstream.
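
A minimal sketch of the lookup flow described above; the class names follow the 
comment, while the {{build}} method, constructor arguments and surrounding 
variables are illustrative assumptions:

{code}
// Sketch: ask each configured builder for an accessor; fall back to the
// existing block reader path when none volunteers.
for (ReplicaAccessorBuilder builder : configuredBuilders) {
  ReplicaAccessor accessor = builder.build();  // hypothetical signature
  if (accessor != null) {
    return new ExternalBlockReader(accessor, length, startOffset);
  }
}
return buildDefaultBlockReader();  // short-circuit / TCP paths as before
{code}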

> Add pluggable interface for reading replicas in DFSClient
> -
>
> Key: HDFS-8924
> URL: https://issues.apache.org/jira/browse/HDFS-8924
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-8924.001.patch
>
>
> We should add a pluggable interface for reading replicas in the DFSClient.  
> This could be used to implement short-circuit reads on systems without file 
> descriptors, or for other optimizations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8924) Add pluggable interface for reading replicas in DFSClient

2015-08-19 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-8924:
---
Attachment: HDFS-8924.001.patch

> Add pluggable interface for reading replicas in DFSClient
> -
>
> Key: HDFS-8924
> URL: https://issues.apache.org/jira/browse/HDFS-8924
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-8924.001.patch
>
>
> We should add a pluggable interface for reading replicas in the DFSClient.  
> This could be used to implement short-circuit reads on systems without file 
> descriptors, or for other optimizations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8924) Add pluggable interface for reading replicas in DFSClient

2015-08-19 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-8924:
---
Status: Patch Available  (was: Open)

> Add pluggable interface for reading replicas in DFSClient
> -
>
> Key: HDFS-8924
> URL: https://issues.apache.org/jira/browse/HDFS-8924
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-8924.001.patch
>
>
> We should add a pluggable interface for reading replicas in the DFSClient.  
> This could be used to implement short-circuit reads on systems without file 
> descriptors, or for other optimizations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8924) Add pluggable interface for reading replicas in DFSClient

2015-08-19 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-8924:
--

 Summary: Add pluggable interface for reading replicas in DFSClient
 Key: HDFS-8924
 URL: https://issues.apache.org/jira/browse/HDFS-8924
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Affects Versions: 2.8.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe


We should add a pluggable interface for reading replicas in the DFSClient.  
This could be used to implement short-circuit reads on systems without file 
descriptors, or for other optimizations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build incremental copy list in distcp

2015-08-19 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703863#comment-14703863
 ] 

Jing Zhao commented on HDFS-8828:
-

Thanks for updating the patch, Yufei! The latest patch looks good to me. One 
nit: the two "= null" initializations can be skipped.
{code}
  DistCpSync(DistCpOptions options, Configuration conf) {
this.inputOptions = options;
this.conf = conf;
this.diffMap = null;
this.renameDiffs = null;
  }
{code}
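
Since Java initializes reference fields to null anyway, the constructor can 
simply be:

{code}
  DistCpSync(DistCpOptions options, Configuration conf) {
    this.inputOptions = options;
    this.conf = conf;
  }
{code}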

+1 after addressing this and Yongjun's comments. It also looks like we need to 
update the distcp doc for this functionality; please feel free to do that in a 
separate jira.

> Utilize Snapshot diff report to build incremental copy list in distcp
> -
>
> Key: HDFS-8828
> URL: https://issues.apache.org/jira/browse/HDFS-8828
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp, snapshots
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
> HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, 
> HDFS-8828.006.patch, HDFS-8828.007.patch, HDFS-8828.008.patch, 
> HDFS-8828.009.patch, HDFS-8828.010.patch
>
>
> Some users reported a huge time cost to build the file copy list in distcp 
> (30 hours for 1.6M files). We can leverage the snapshot diff report to build 
> a copy list containing only the files/dirs that changed between two snapshots 
> (or a snapshot and a normal dir). This speeds up the process in two ways: 1. 
> less copy-list building time; 2. fewer file-copy MR jobs.
> The HDFS snapshot diff report provides information about file/directory 
> creation, deletion, rename and modification between two snapshots, or between 
> a snapshot and a normal directory. HDFS-7535 synchronizes deletions and 
> renames, then falls back to the default distcp, so it still relies on the 
> default distcp to build the complete list of files under the source dir. This 
> patch puts only created and modified files into the copy list, based on the 
> snapshot diff report, minimizing the number of files to copy. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8923) Add -source flag to balancer usage message

2015-08-19 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703855#comment-14703855
 ] 

Chris Trezzo commented on HDFS-8923:


Test failure is unrelated. Patch should be good to go.

> Add -source flag to balancer usage message
> --
>
> Key: HDFS-8923
> URL: https://issues.apache.org/jira/browse/HDFS-8923
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Trivial
> Attachments: HDFS-8923-trunk-v1.patch
>
>
> HDFS-8826 added a -source flag to the balancer, but the usage message still 
> needs to be updated. See current usage message in trunk:
> {code}
> private static final String USAGE = "Usage: hdfs balancer"
>     + "\n\t[-policy <policy>]\tthe balancing policy: "
>     + BalancingPolicy.Node.INSTANCE.getName() + " or "
>     + BalancingPolicy.Pool.INSTANCE.getName()
>     + "\n\t[-threshold <threshold>]\tPercentage of disk capacity"
>     + "\n\t[-exclude [-f <hosts-file> | <comma-separated list of hosts>]]"
>     + "\tExcludes the specified datanodes."
>     + "\n\t[-include [-f <hosts-file> | <comma-separated list of hosts>]]"
>     + "\tIncludes only the specified datanodes."
>     + "\n\t[-idleiterations <idleiterations>]"
>     + "\tNumber of consecutive idle iterations (-1 for Infinite) before "
>     + "exit."
>     + "\n\t[-runDuringUpgrade]"
>     + "\tWhether to run the balancer during an ongoing HDFS upgrade."
>     + "This is usually not desired since it will not affect used space "
>     + "on over-utilized machines.";
> {code}
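
A minimal sketch of the missing entry, assuming {{-source}} keeps the host-list 
syntax that HDFS-8826 introduced (mirroring {{-exclude}}/{{-include}}):

{code}
+ "\n\t[-source [-f <hosts-file> | <comma-separated list of hosts>]]"
+ "\tPick only the specified datanodes as source nodes."
{code}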



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8917) Cleanup BlockInfoUnderConstruction from comments and tests

2015-08-19 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703851#comment-14703851
 ] 

Zhe Zhang commented on HDFS-8917:
-

Thanks Jing for reviewing!

> Cleanup BlockInfoUnderConstruction from comments and tests
> --
>
> Key: HDFS-8917
> URL: https://issues.apache.org/jira/browse/HDFS-8917
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-8917.00.patch
>
>
> HDFS-8801 eliminates the {{BlockInfoUnderConstruction}} class. This JIRA is a 
> follow-on to cleanup comments and tests which refer to the class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-8918) Convert BlockUnderConstructionFeature#replicas form list to array

2015-08-19 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang resolved HDFS-8918.
-
Resolution: Duplicate

> Convert BlockUnderConstructionFeature#replicas form list to array
> -
>
> Key: HDFS-8918
> URL: https://issues.apache.org/jira/browse/HDFS-8918
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>
> {{BlockInfoUnderConstruction}} / {{BlockUnderConstructionFeature}} uses a 
> List to store its {{replicas}}. To reduce memory usage, we can use an array 
> instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8917) Cleanup BlockInfoUnderConstruction from comments and tests

2015-08-19 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8917:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

I've committed this. Thanks Zhe for the contribution!

> Cleanup BlockInfoUnderConstruction from comments and tests
> --
>
> Key: HDFS-8917
> URL: https://issues.apache.org/jira/browse/HDFS-8917
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-8917.00.patch
>
>
> HDFS-8801 eliminates the {{BlockInfoUnderConstruction}} class. This JIRA is a 
> follow-on to cleanup comments and tests which refer to the class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8917) Cleanup BlockInfoUnderConstruction from comments and tests

2015-08-19 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703819#comment-14703819
 ] 

Jing Zhao commented on HDFS-8917:
-

+1. I will commit it shortly.

> Cleanup BlockInfoUnderConstruction from comments and tests
> --
>
> Key: HDFS-8917
> URL: https://issues.apache.org/jira/browse/HDFS-8917
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>Priority: Minor
> Attachments: HDFS-8917.00.patch
>
>
> HDFS-8801 eliminates the {{BlockInfoUnderConstruction}} class. This JIRA is a 
> follow-on to cleanup comments and tests which refer to the class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8909) Erasure coding: update BlockInfoContiguousUC and BlockInfoStripedUC to use BlockUnderConstructionFeature

2015-08-19 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8909:

Attachment: HDFS-8909.001.patch

Thanks for the review, Zhe! Update the patch to address your comments.

bq. separate ReplicaUnderConstruction

Currently I prefer separating it out as a standalone class. Maybe we can also 
do this in trunk.

bq. a contiguous block should not have different IDs reported. Should we add 
some assertion to be more clear

Here the issue is that the BlockUnderConstructionFeature itself does not know 
whether the block is striped or not, thus there is no way to do extra 
verification. But since the reported block is added as a replica only if its 
block id is mapped to the BlockInfo object, I do not think we need to add an 
extra assertion here.

bq. simplify BlockInfo#convertToBlockUnderConstruction

Currently I want to make sure that when we create a BlockUCFeature, the 
expected locations are already passed in. Thus I leave the storage array in the 
constructor parameter list. But please let me know if you feel strongly 
about it.

> Erasure coding: update BlockInfoContiguousUC and BlockInfoStripedUC to use 
> BlockUnderConstructionFeature
> 
>
> Key: HDFS-8909
> URL: https://issues.apache.org/jira/browse/HDFS-8909
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-7285
>Reporter: Zhe Zhang
>Assignee: Jing Zhao
> Attachments: HDFS-8909.000.patch, HDFS-8909.001.patch
>
>
> HDFS-8801 converts {{BlockInfoUC}} as a feature. We should consolidate 
> {{BlockInfoContiguousUC}} and {{BlockInfoStripedUC}} logics to use this 
> feature.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8923) Add -source flag to balancer usage message

2015-08-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703778#comment-14703778
 ] 

Hadoop QA commented on HDFS-8923:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  18m 35s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   8m 32s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 31s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 32s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 45s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 20s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  93m  2s | Tests failed in hadoop-hdfs. |
| | | 140m 52s | |
\\
\\
|| Reason || Tests ||
| Timed out tests | org.apache.hadoop.hdfs.TestPread |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12751312/HDFS-8923-trunk-v1.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 3aac475 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12045/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12045/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12045/console |


This message was automatically generated.

> Add -source flag to balancer usage message
> --
>
> Key: HDFS-8923
> URL: https://issues.apache.org/jira/browse/HDFS-8923
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Trivial
> Attachments: HDFS-8923-trunk-v1.patch
>
>
> HDFS-8826 added a -source flag to the balancer, but the usage message still 
> needs to be updated. See current usage message in trunk:
> {code}
> private static final String USAGE = "Usage: hdfs balancer"
>     + "\n\t[-policy <policy>]\tthe balancing policy: "
>     + BalancingPolicy.Node.INSTANCE.getName() + " or "
>     + BalancingPolicy.Pool.INSTANCE.getName()
>     + "\n\t[-threshold <threshold>]\tPercentage of disk capacity"
>     + "\n\t[-exclude [-f <hosts-file> | <comma-separated list of hosts>]]"
>     + "\tExcludes the specified datanodes."
>     + "\n\t[-include [-f <hosts-file> | <comma-separated list of hosts>]]"
>     + "\tIncludes only the specified datanodes."
>     + "\n\t[-idleiterations <idleiterations>]"
>     + "\tNumber of consecutive idle iterations (-1 for Infinite) before "
>     + "exit."
>     + "\n\t[-runDuringUpgrade]"
>     + "\tWhether to run the balancer during an ongoing HDFS upgrade."
>     + "This is usually not desired since it will not affect used space "
>     + "on over-utilized machines.";
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build incremental copy list in distcp

2015-08-19 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703777#comment-14703777
 ] 

Yongjun Zhang commented on HDFS-8828:
-

The build failure may be an intermittent one. I manually kicked off a new run at

https://builds.apache.org/job/PreCommit-HDFS-Build/12049/


> Utilize Snapshot diff report to build incremental copy list in distcp
> -
>
> Key: HDFS-8828
> URL: https://issues.apache.org/jira/browse/HDFS-8828
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp, snapshots
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
> HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, 
> HDFS-8828.006.patch, HDFS-8828.007.patch, HDFS-8828.008.patch, 
> HDFS-8828.009.patch, HDFS-8828.010.patch
>
>
> Some users reported a huge time cost to build the file copy list in distcp 
> (30 hours for 1.6M files). We can leverage the snapshot diff report to build 
> a copy list containing only the files/dirs that changed between two snapshots 
> (or a snapshot and a normal dir). This speeds up the process in two ways: 1. 
> less copy-list building time; 2. fewer file-copy MR jobs.
> The HDFS snapshot diff report provides information about file/directory 
> creation, deletion, rename and modification between two snapshots, or between 
> a snapshot and a normal directory. HDFS-7535 synchronizes deletions and 
> renames, then falls back to the default distcp, so it still relies on the 
> default distcp to build the complete list of files under the source dir. This 
> patch puts only created and modified files into the copy list, based on the 
> snapshot diff report, minimizing the number of files to copy. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8922) IBM Java requires libdl for linking in native_mini_dfs

2015-08-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703754#comment-14703754
 ] 

Hadoop QA commented on HDFS-8922:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   5m 44s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   8m 13s | There were no new javac warning 
messages. |
| {color:green}+1{color} | release audit |   0m 21s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 25s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | native |   1m  1s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 176m 59s | Tests failed in hadoop-hdfs. |
| | | 194m 21s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.hdfs.server.namenode.ha.TestBootstrapStandbyWithQJM |
| Timed out tests | org.apache.hadoop.cli.TestHDFSCLI |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12751299/HDFS-8922.patch |
| Optional Tests | javac unit |
| git revision | trunk / f61120d |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12043/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12043/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12043/console |


This message was automatically generated.

> IBM Java requires libdl for linking in native_mini_dfs
> --
>
> Key: HDFS-8922
> URL: https://issues.apache.org/jira/browse/HDFS-8922
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 2.7.1
> Environment: IBM Java RHEL7.1 
>Reporter: Ayappan
> Attachments: HDFS-8922.patch
>
>
> Building hadoop-hdfs-project with the -Pnative option using IBM Java fails 
> with the following error:
> [exec] Linking C executable test_native_mini_dfs
>  [exec] /usr/bin/cmake -E cmake_link_script 
> CMakeFiles/test_native_mini_dfs.dir/link.txt --verbose=1
>  [exec] /usr/bin/cc   -g -Wall -O2 -D_REENTRANT -D_GNU_SOURCE 
> -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -fvisibility=hidden
> CMakeFiles/test_native_mini_dfs.dir/main/native/libhdfs/test_native_mini_dfs.c.o
>   -o test_native_mini_dfs -rdynamic libnative_mini_dfs.a 
> /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so -lpthread 
> -Wl,-rpath,/home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic
>  [exec] make[2]: Leaving directory 
> `/home/ayappan/hadoop_2.7.1_new/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/native'
>  [exec] make[1]: Leaving directory 
> `/home/ayappan/hadoop_2.7.1_new/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/native'
>  [exec] 
> /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so: 
> undefined reference to `dlopen'
>  [exec] 
> /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so: 
> undefined reference to `dlclose'
>  [exec] 
> /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so: 
> undefined reference to `dlerror'
>  [exec] 
> /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so: 
> undefined reference to `dlsym'
>  [exec] 
> /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so: 
> undefined reference to `dladdr'
>  [exec] collect2: error: ld returned 1 exit status
>  [exec] make[2]: *** [test_native_mini_dfs] Error 1
>  [exec] make[1]: *** [CMakeFiles/test_native_mini_dfs.dir/all] Error 2
>  [exec] make: *** [all] Error 2
> It seems the IBM JVM requires libdl when linking native_mini_dfs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6244) Make Trash Interval configurable for each of the namespaces

2015-08-19 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated HDFS-6244:
--
Attachment: HDFS-6244.v6.patch

> Make Trash Interval configurable for each of the namespaces
> ---
>
> Key: HDFS-6244
> URL: https://issues.apache.org/jira/browse/HDFS-6244
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.5-alpha
>Reporter: Siqi Li
>Assignee: Siqi Li
>  Labels: BB2015-05-TBR
> Attachments: HDFS-6244.v1.patch, HDFS-6244.v2.patch, 
> HDFS-6244.v3.patch, HDFS-6244.v4.patch, HDFS-6244.v5.patch, HDFS-6244.v6.patch
>
>
> Somehow we need to avoid the cluster filling up.
> One solution is to have a different trash policy per namespace. However, if 
> we can simply make the property configurable per namespace, then the same 
> config can be rolled everywhere and we'd be done. This seems simple enough.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6244) Make Trash Interval configurable for each of the namespaces

2015-08-19 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated HDFS-6244:
--
Status: Patch Available  (was: Open)

> Make Trash Interval configurable for each of the namespaces
> ---
>
> Key: HDFS-6244
> URL: https://issues.apache.org/jira/browse/HDFS-6244
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.5-alpha
>Reporter: Siqi Li
>Assignee: Siqi Li
>  Labels: BB2015-05-TBR
> Attachments: HDFS-6244.v1.patch, HDFS-6244.v2.patch, 
> HDFS-6244.v3.patch, HDFS-6244.v4.patch, HDFS-6244.v5.patch, HDFS-6244.v6.patch
>
>
> Somehow we need to avoid the cluster filling up.
> One solution is to have a different trash policy per namespace. However, if 
> we can simply make the property configurable per namespace, then the same 
> config can be rolled everywhere and we'd be done. This seems simple enough.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8803) Move DfsClientConf to hdfs-client

2015-08-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703732#comment-14703732
 ] 

Hudson commented on HDFS-8803:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8323 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8323/])
HDFS-8803. Move DfsClientConf to hdfs-client. Contributed by Mingliang Liu. 
(wheat9: rev 3aac4758b007a56e3d66998d457b2156effca528)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderLocal.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend2.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPread.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestParallelRead.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/aop/org/apache/hadoop/hdfs/server/datanode/TestFiDataTransferProtocol2.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderFactory.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/aop/org/apache/hadoop/hdfs/server/datanode/TestFiDataTransferProtocol.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDisableConnCache.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRead.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/shortcircuit/TestShortCircuitCache.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/LazyPersistTestCase.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DNConf.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestCachingStrategy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDatanodeDeath.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/ByteArrayManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsConstants.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/FileAppendTest4.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCreation.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/viewfs/TestViewFsDefaultValue.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferKeepalive.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/ClientContext.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestDatanodeRestart.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockReplacement.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderFactory.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend4.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPipelines.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestUnbuffer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestConnCache.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/aop/org/apache/hadoop/hdfs/TestFiPipelines.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestParallelShortCircuitLegacyRead.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HdfsServerConstants.ja

[jira] [Commented] (HDFS-8867) Enable optimized block reports

2015-08-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703731#comment-14703731
 ] 

Hudson commented on HDFS-8867:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8323 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8323/])
HDFS-8867. Enable optimized block reports. Contributed by Daryn Sharp. (jing9: 
rev f61120d964a609ae5eabeb5c4d6c9afe0a15cad8)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/TestBlockListAsLongs.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/NamespaceInfo.java


> Enable optimized block reports
> --
>
> Key: HDFS-8867
> URL: https://issues.apache.org/jira/browse/HDFS-8867
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Rushabh S Shah
>Assignee: Daryn Sharp
> Fix For: 2.7.2
>
> Attachments: HDFS-8867.patch
>
>
> Opening this ticket on behalf of [~daryn]
> HDFS-7435 introduced a more efficiently encoded block report format designed 
> to improve performance and reduce GC load on the NN and DNs. The NN is not 
> advertising this capability to the DNs so old-style reports are still being 
> used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8828) Utilize Snapshot diff report to build incremental copy list in distcp

2015-08-19 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-8828:

Summary: Utilize Snapshot diff report to build incremental copy list in 
distcp  (was: Utilize Snapshot diff report to build copy list in distcp)

> Utilize Snapshot diff report to build incremental copy list in distcp
> -
>
> Key: HDFS-8828
> URL: https://issues.apache.org/jira/browse/HDFS-8828
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp, snapshots
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
> HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, 
> HDFS-8828.006.patch, HDFS-8828.007.patch, HDFS-8828.008.patch, 
> HDFS-8828.009.patch, HDFS-8828.010.patch
>
>
> Some users reported a huge time cost to build the file copy list in distcp 
> (30 hours for 1.6M files). We can leverage the snapshot diff report to build 
> a copy list containing only the files/dirs that changed between two snapshots 
> (or a snapshot and a normal dir). This speeds up the process in two ways: 1. 
> less copy-list building time; 2. fewer file-copy MR jobs.
> The HDFS snapshot diff report provides information about file/directory 
> creation, deletion, rename and modification between two snapshots, or between 
> a snapshot and a normal directory. HDFS-7535 synchronizes deletion and 
> rename, then falls back to the default distcp, so it still relies on the 
> default distcp to build a complete list of files under the source dir. This 
> patch puts only created and modified files into the copy list based on the 
> snapshot diff report, minimizing the number of files to copy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6481) DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs

2015-08-19 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HDFS-6481:
-
Description: 
Ian Brooks reported the following stack trace:

{code}
2014-06-03 13:05:03,915 WARN  [DataStreamer for file 
/user/hbase/WALs/,16020,1401716790638/%2C16020%2C1401716790638.1401796562200
 block BP-2121456822-10.143.38.149-1396953188241:blk_1074073683_332932] 
hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
 0
at 
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:594)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:430)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)

at org.apache.hadoop.ipc.Client.call(Client.java:1347)
at org.apache.hadoop.ipc.Client.call(Client.java:1300)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy13.getAdditionalDatanode(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolTranslatorPB.java:352)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy14.getAdditionalDatanode(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:266)
at com.sun.proxy.$Proxy15.getAdditionalDatanode(Unknown Source)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:919)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1031)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:823)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:475)
2014-06-03 13:05:48,489 ERROR [RpcServer.handler=22,port=16020] wal.FSHLog: 
syncer encountered error, will retry. txid=211
org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
 0
at 
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:594)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:430)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apach
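The guard suggested by the issue summary could look roughly like the following 
(a hedged sketch; parameter names follow the 
{{getDatanodeStorageInfos(DatanodeID[], String[])}} signature, and the 
committed check may differ):

{code:java}
// Validate the array lengths before indexing, instead of letting an
// ArrayIndexOutOfBoundsException escape to the client as in the trace above.
if (storageIDs == null || storageIDs.length < datanodeID.length) {
  throw new HadoopIllegalArgumentException("Expected " + datanodeID.length
      + " storageIDs but got "
      + (storageIDs == null ? 0 : storageIDs.length));
}
{code}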

[jira] [Commented] (HDFS-8888) Support volumes in HDFS

2015-08-19 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703624#comment-14703624
 ] 

Colin Patrick McCabe commented on HDFS-8888:


bq. Each volume could become its own RW lock within the NN. This would improve 
parallelism within NN without much additional effort.

Given the problems we already have with large NN heaps, perhaps we would be 
better off running multiple Namenode processes than trying to manage multiple 
independent subtrees in a single process.

I am also worried that a lot of the changes here seem incompatible.  If we are 
going to break backwards compatibility, why wouldn't we push people towards 
something like Ozone, which does have a better horizontal scalability story?

It seems like we should have a design meeting about this before we do any work 
in this direction.

> Support volumes in HDFS
> ---
>
> Key: HDFS-8888
> URL: https://issues.apache.org/jira/browse/HDFS-8888
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>
> There are multiple types of zones (e.g., snapshottable directories, 
> encryption zones, directories with quotas) which are conceptually close to 
> namespace volumes in traditional file systems.
> This jira proposes to introduce the concept of volume to simplify the 
> implementation of snapshots and encryption zones.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6264) Provide FileSystem#create() variant which throws exception if parent directory doesn't exist

2015-08-19 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HDFS-6264:
-
Description: 
FileSystem#createNonRecursive() is deprecated.
However, there is no DistributedFileSystem#create() implementation which throws 
exception if parent directory doesn't exist.
This limits clients' migration away from the deprecated method.

For HBase, IO fencing relies on the behavior of FileSystem#createNonRecursive().

Variant of create() method should be added which throws exception if parent 
directory doesn't exist.

  was:
FileSystem#createNonRecursive() is deprecated.

However, there is no DistributedFileSystem#create() implementation which throws 
exception if parent directory doesn't exist.
This limits clients' migration away from the deprecated method.

For HBase, IO fencing relies on the behavior of FileSystem#createNonRecursive().

Variant of create() method should be added which throws exception if parent 
directory doesn't exist.


> Provide FileSystem#create() variant which throws exception if parent 
> directory doesn't exist
> 
>
> Key: HDFS-6264
> URL: https://issues.apache.org/jira/browse/HDFS-6264
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Ted Yu
>  Labels: hbase
> Attachments: hdfs-6264-v1.txt
>
>
> FileSystem#createNonRecursive() is deprecated.
> However, there is no DistributedFileSystem#create() implementation which 
> throws exception if parent directory doesn't exist.
> This limits clients' migration away from the deprecated method.
> For HBase, IO fencing relies on the behavior of 
> FileSystem#createNonRecursive().
> Variant of create() method should be added which throws exception if parent 
> directory doesn't exist.
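For context, a hedged sketch of the (deprecated) call that HBase relies on 
today; the path and parameter values are illustrative:

{code:java}
FileSystem fs = FileSystem.get(conf);
// Throws FileNotFoundException if the parent directory does not exist,
// which is the behavior IO fencing depends on.
FSDataOutputStream out = fs.createNonRecursive(
    new Path("/hbase/WALs/fencing-marker"),  // illustrative path
    true,                       // overwrite
    4096,                       // buffer size
    (short) 3,                  // replication
    fs.getDefaultBlockSize(),   // block size
    null);                      // no Progressable
out.close();
{code}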



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6290) File is not closed in OfflineImageViewerPB#run()

2015-08-19 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HDFS-6290:
-
Description: 
{code}
  } else if (processor.equals("XML")) {
new PBImageXmlWriter(conf, out).visit(new RandomAccessFile(inputFile,
"r"));
{code}
The RandomAccessFile instance should be closed before the method returns.

  was:
{code}
  } else if (processor.equals("XML")) {
new PBImageXmlWriter(conf, out).visit(new RandomAccessFile(inputFile,
"r"));
{code}

The RandomAccessFile instance should be closed before the method returns.


> File is not closed in OfflineImageViewerPB#run()
> 
>
> Key: HDFS-6290
> URL: https://issues.apache.org/jira/browse/HDFS-6290
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Reporter: Ted Yu
>Priority: Minor
>
> {code}
>   } else if (processor.equals("XML")) {
> new PBImageXmlWriter(conf, out).visit(new RandomAccessFile(inputFile,
> "r"));
> {code}
> The RandomAccessFile instance should be closed before the method returns.
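A hedged sketch of the fix with try-with-resources (based only on the snippet 
above, not on an actual patch):

{code:java}
  } else if (processor.equals("XML")) {
    // try-with-resources closes the RandomAccessFile even if visit() throws
    try (RandomAccessFile file = new RandomAccessFile(inputFile, "r")) {
      new PBImageXmlWriter(conf, out).visit(file);
    }
  }
{code}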



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8829) DataNode sets SO_RCVBUF explicitly is disabling tcp auto-tuning

2015-08-19 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703617#comment-14703617
 ] 

Colin Patrick McCabe commented on HDFS-8829:


I agree that we might not want to default to auto-tuning.  But we should at 
least make it available.  I think if {{dfs.data.socket.size}} is -1, we should 
use auto-tuning.
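A hedged sketch of that proposal, building on the {{initDataXceiver}} snippet 
quoted in the description below (not committed code):

{code:java}
int size = conf.getInt("dfs.data.socket.size",
    HdfsConstants.DEFAULT_DATA_SOCKET_SIZE);
if (size > 0) {
  // An explicit buffer size disables kernel auto-tuning.
  tcpPeerServer.setReceiveBufferSize(size);
}
// size <= 0 (e.g. -1): leave SO_RCVBUF unset so TCP auto-tuning applies.
{code}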

> DataNode sets SO_RCVBUF explicitly is disabling tcp auto-tuning
> ---
>
> Key: HDFS-8829
> URL: https://issues.apache.org/jira/browse/HDFS-8829
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.3.0, 2.6.0
>Reporter: He Tianyi
>Assignee: kanaka kumar avvaru
>
> {code:java}
>   private void initDataXceiver(Configuration conf) throws IOException {
> // find free port or use privileged port provided
> TcpPeerServer tcpPeerServer;
> if (secureResources != null) {
>   tcpPeerServer = new TcpPeerServer(secureResources);
> } else {
>   tcpPeerServer = new TcpPeerServer(dnConf.socketWriteTimeout,
>   DataNode.getStreamingAddr(conf));
> }
> 
> tcpPeerServer.setReceiveBufferSize(HdfsConstants.DEFAULT_DATA_SOCKET_SIZE);
> {code}
> The last line sets SO_RCVBUF explicitly, thus disabling TCP auto-tuning on 
> some systems.
> Shall we make this behavior configurable?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8862) BlockManager#excessReplicateMap should use a HashMap

2015-08-19 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703610#comment-14703610
 ] 

Colin Patrick McCabe commented on HDFS-8862:


Thanks, [~hitliuyi].

> BlockManager#excessReplicateMap should use a HashMap
> 
>
> Key: HDFS-8862
> URL: https://issues.apache.org/jira/browse/HDFS-8862
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-8862.001.patch
>
>
> Per [~cmccabe]'s comments in HDFS-8792, this JIRA is to discuss improving 
> {{BlockManager#excessReplicateMap}}.
> It's true that a HashMap never shrinks when elements are removed, but a 
> TreeMap entry has to store more references (left, right, parent) than a 
> HashMap entry (a single next reference). Even when removals leave some 
> buckets empty, an empty HashMap bucket is just a {{null}} reference (4 
> bytes), so the two are close on that point. On the other hand, the key of 
> {{excessReplicateMap}} is the datanode uuid, so the number of entries is 
> almost fixed, and HashMap memory usage is better than TreeMap's in this 
> case. Most important is the search/insert/remove performance, where HashMap 
> is clearly better than TreeMap. Since we don't need sorted order, we should 
> use a HashMap instead of a TreeMap.
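In code, the proposal is essentially a one-line swap of the map implementation. 
A sketch, assuming the field's value type in {{BlockManager}} at the time:

{code:java}
// Before: sorted map, three references (left/right/parent) per entry.
// Map<String, LightWeightLinkedSet<Block>> excessReplicateMap = new TreeMap<>();
// After: hashed by datanode uuid; no sorted order is needed.
Map<String, LightWeightLinkedSet<Block>> excessReplicateMap = new HashMap<>();
{code}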



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp

2015-08-19 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703591#comment-14703591
 ] 

Yongjun Zhang commented on HDFS-8828:
-

Hi [~jingzhao],

Thanks a lot for your review and comments. I discussed them with [~yufeigu] and 
he worked out the new revs to address them.

Hi [~yufeigu], thanks for the new rev. Some nits:

* Put the following code into its own method, e.g. 
createInputFileListingWithDiff (see the sketch after this list):
{code}
Path fileListingPath = getFileListingPath();
CopyListing copyListing =
    new SimpleCopyListing(job.getConfiguration(),
        job.getCredentials(), distCpSync);
copyListing.buildListing(fileListingPath, inputOptions);
{code}
so that it parallels the existing method {{createInputFileListing(Job job)}}
* you accidentally changed {{* 
http://www.apache.org/licenses/LICENSE-2.0}}, please revert this change
* In comments, "//xyz" should be "// xyz", notice the space between "//" and 
the text
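
A hedged sketch of that extraction (the method name is the one suggested above; 
everything else is assumed from the quoted snippet, not taken from the patch):

{code:java}
// Sketch only: mirrors createInputFileListing(Job), but builds the listing
// from the snapshot diff via DistCpSync instead of a full directory walk.
private Path createInputFileListingWithDiff(Job job, DistCpSync distCpSync)
    throws IOException {
  Path fileListingPath = getFileListingPath();
  CopyListing copyListing = new SimpleCopyListing(job.getConfiguration(),
      job.getCredentials(), distCpSync);
  copyListing.buildListing(fileListingPath, inputOptions);
  return fileListingPath;
}
{code}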

Please consider addressing them together with any further comments Jing might 
have.

Hi [~jingzhao], it looks good to me once the above nits are addressed. Would 
you mind taking another look, so Yufei can address everything at once if you 
have any more comments?

Thanks,



> Utilize Snapshot diff report to build copy list in distcp
> -
>
> Key: HDFS-8828
> URL: https://issues.apache.org/jira/browse/HDFS-8828
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp, snapshots
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
> HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, 
> HDFS-8828.006.patch, HDFS-8828.007.patch, HDFS-8828.008.patch, 
> HDFS-8828.009.patch, HDFS-8828.010.patch
>
>
> Some users reported a huge time cost to build the file copy list in distcp 
> (30 hours for 1.6M files). We can leverage the snapshot diff report to build 
> a copy list containing only the files/dirs that changed between two snapshots 
> (or a snapshot and a normal dir). This speeds up the process in two ways: 1. 
> less copy-list building time; 2. fewer file-copy MR jobs.
> The HDFS snapshot diff report provides information about file/directory 
> creation, deletion, rename and modification between two snapshots, or between 
> a snapshot and a normal directory. HDFS-7535 synchronizes deletion and 
> rename, then falls back to the default distcp, so it still relies on the 
> default distcp to build a complete list of files under the source dir. This 
> patch puts only created and modified files into the copy list based on the 
> snapshot diff report, minimizing the number of files to copy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8803) Move DfsClientConf to hdfs-client

2015-08-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703575#comment-14703575
 ] 

Hudson commented on HDFS-8803:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8322 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8322/])
HDFS-8803. Move DfsClientConf to hdfs-client. Contributed by Mingliang Liu. 
(wheat9: rev 3aac4758b007a56e3d66998d457b2156effca528)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/shortcircuit/TestShortCircuitCache.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestParallelRead.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HdfsServerConstants.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestParallelShortCircuitLegacyRead.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/util/package-info.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/aop/org/apache/hadoop/hdfs/server/datanode/TestFiDataTransferProtocol.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsConstants.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferProtocol.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/util/ByteArrayManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestParallelShortCircuitReadUnCached.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/aop/org/apache/hadoop/hdfs/server/datanode/TestFiDataTransferProtocol2.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend4.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/aop/org/apache/hadoop/hdfs/TestFiPipelines.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestCachingStrategy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/shortcircuit/TestShortCircuitLocalRead.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestEnhancedByteBufferAccess.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderLocal.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRead.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCreation.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/shortcircuit/DomainSocketFactory.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockTokenWithDFS.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/package-info.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/viewfs/TestViewFsDefaultValue.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/ByteArrayManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferKeepalive.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDisableConnCache.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderFactory.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPread.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend2.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apac

[jira] [Commented] (HDFS-8867) Enable optimized block reports

2015-08-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703574#comment-14703574
 ] 

Hudson commented on HDFS-8867:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8322 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8322/])
HDFS-8867. Enable optimized block reports. Contributed by Daryn Sharp. (jing9: 
rev f61120d964a609ae5eabeb5c4d6c9afe0a15cad8)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/NamespaceInfo.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/TestBlockListAsLongs.java


> Enable optimized block reports
> --
>
> Key: HDFS-8867
> URL: https://issues.apache.org/jira/browse/HDFS-8867
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Rushabh S Shah
>Assignee: Daryn Sharp
> Fix For: 2.7.2
>
> Attachments: HDFS-8867.patch
>
>
> Opening this ticket on behalf of [~daryn]
> HDFS-7435 introduced a more efficiently encoded block report format designed 
> to improve performance and reduce GC load on the NN and DNs. The NN is not 
> advertising this capability to the DNs so old-style reports are still being 
> used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp

2015-08-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703563#comment-14703563
 ] 

Hadoop QA commented on HDFS-8828:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | javac |   0m  6s | The patch appears to cause the 
build to fail. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12751292/HDFS-8828.010.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f61120d |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12044/console |


This message was automatically generated.

> Utilize Snapshot diff report to build copy list in distcp
> -
>
> Key: HDFS-8828
> URL: https://issues.apache.org/jira/browse/HDFS-8828
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp, snapshots
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
> HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, 
> HDFS-8828.006.patch, HDFS-8828.007.patch, HDFS-8828.008.patch, 
> HDFS-8828.009.patch, HDFS-8828.010.patch
>
>
> Some users reported a huge time cost to build the file copy list in distcp 
> (30 hours for 1.6M files). We can leverage the snapshot diff report to build 
> a copy list containing only the files/dirs that changed between two snapshots 
> (or a snapshot and a normal dir). This speeds up the process in two ways: 1. 
> less copy-list building time; 2. fewer file-copy MR jobs.
> The HDFS snapshot diff report provides information about file/directory 
> creation, deletion, rename and modification between two snapshots, or between 
> a snapshot and a normal directory. HDFS-7535 synchronizes deletion and 
> rename, then falls back to the default distcp, so it still relies on the 
> default distcp to build a complete list of files under the source dir. This 
> patch puts only created and modified files into the copy list based on the 
> snapshot diff report, minimizing the number of files to copy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8809) HDFS fsck reports HBase WALs files (under construction) as "CORRUPT" (missing blocks) when HBase is running

2015-08-19 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8809:

Attachment: HDFS-8809.000.patch

There is another issue here, which I think existed before HDFS-8215. When 
"-OPENFORWRITE" is enabled, an under-construction (UC) block is still treated 
as missing/corrupt, since {{countNodes}} only checks the triplets inside the 
BlockInfo, and thus liveReplicas is usually 0 for a UC block.

The fix can be to skip the check if the block is the last one and it is UC. 
Theoretically the penultimate block can also be in the committed state with 0 
reported replicas yet, but maybe we do not need to handle that part here.
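
A minimal sketch of that check (names here are illustrative; see the attached 
patch for the real change):

{code:java}
// Skip the missing/corrupt accounting for an under-construction last block,
// whose replicas are not yet reflected in the triplets.
boolean isLastBlock = (blockIndex == blockCount - 1);
if (isLastBlock && !blockInfo.isComplete()) {
  return;  // still being written; not actually missing
}
{code}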

> HDFS fsck reports HBase WALs files (under construction) as "CORRUPT" (missing 
> blocks) when HBase is running
> ---
>
> Key: HDFS-8809
> URL: https://issues.apache.org/jira/browse/HDFS-8809
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.7.0
> Environment: Hadoop 2.7.1 and HBase 1.1.1, on SUSE11sp3 (other 
> Linuxes not tested, probably not platform-dependent).  This did NOT happen 
> with Hadoop 2.4 and HBase 0.98.
>Reporter: Sudhir Prakash
>Assignee: Jing Zhao
> Attachments: HDFS-8809.000.patch
>
>
> Whenever HBase is running, the "hdfs fsck /"  reports four hbase-related 
> files in the path "hbase/data/WALs/" as CORRUPT. Even after letting the 
> cluster sit idle for a couple hours, it is still in the corrupt state.  If 
> HBase is shut down, the problem goes away.  If HBase is then restarted, the 
> problem recurs.  This was observed with Hadoop 2.7.1 and HBase 1.1.1, and did 
> NOT happen with Hadoop 2.4 and HBase 0.98.
> {code}
> hades1:/var/opt/teradata/packages # su hdfs
> hdfs@hades1:/var/opt/teradata/packages> hdfs fsck /
> Connecting to namenode via 
> http://hades1.labs.teradata.com:50070/fsck?ugi=hdfs&path=%2F
> FSCK started by hdfs (auth:SIMPLE) from /39.0.8.2 for path / at Wed Jun 24 
> 20:40:17 GMT 2015
> ...
> /apps/hbase/data/WALs/hades4.labs.teradata.com,16020,1435168292684/hades4.labs.teradata.com%2C16020%2C1435168292684.default.1435175500556:
>  MISSING 1 blocks of total size 83 B.
> /apps/hbase/data/WALs/hades5.labs.teradata.com,16020,1435168290466/hades5.labs.teradata.com%2C16020%2C1435168290466..meta.1435175562144.meta:
>  MISSING 1 blocks of total size 83 B.
> /apps/hbase/data/WALs/hades5.labs.teradata.com,16020,1435168290466/hades5.labs.teradata.com%2C16020%2C1435168290466.default.1435175498500:
>  MISSING 1 blocks of total size 83 B.
> /apps/hbase/data/WALs/hades6.labs.teradata.com,16020,1435168292373/hades6.labs.teradata.com%2C16020%2C1435168292373.default.1435175500301:
>  MISSING 1 blocks of total size 83 
> B..
> 
> 
> Status:
>  CORRUPT
>  Total size:723977553 B (Total open files size: 332 B)
>  Total dirs:79
>  Total files:   388
>  Total symlinks:0 (Files currently being written: 5)
>  Total blocks (validated):  387 (avg. block size 1870743 B) (Total open 
> file blocks (not validated): 4)
>   
>   UNDER MIN REPL'D BLOCKS:  4 (1.0335917 %)
>   dfs.namenode.replication.min: 1
>   CORRUPT FILES:4
>   MISSING BLOCKS:   4
>   MISSING SIZE: 332 B
>   
>  Minimally replicated blocks:   387 (100.0 %)
>  Over-replicated blocks:0 (0.0 %)
>  Under-replicated blocks:   0 (0.0 %)
>  Mis-replicated blocks: 0 (0.0 %)
>  Default replication factor:3
>  Average block replication: 3.0
>  Corrupt blocks:0
>  Missing replicas:  0 (0.0 %)
>  Number of data-nodes:  3
>  Number of racks:   1
> FSCK ended at Wed Jun 24 20:40:17 GMT 2015 in 7 milliseconds
> The filesystem under path '/' is CORRUPT
> hdfs@hades1:/var/opt/teradata/packages>
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8809) HDFS fsck reports HBase WALs files (under construction) as "CORRUPT" (missing blocks) when HBase is running

2015-08-19 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8809:

Status: Patch Available  (was: Open)

> HDFS fsck reports HBase WALs files (under construction) as "CORRUPT" (missing 
> blocks) when HBase is running
> ---
>
> Key: HDFS-8809
> URL: https://issues.apache.org/jira/browse/HDFS-8809
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.7.0
> Environment: Hadoop 2.7.1 and HBase 1.1.1, on SUSE11sp3 (other 
> Linuxes not tested, probably not platform-dependent).  This did NOT happen 
> with Hadoop 2.4 and HBase 0.98.
>Reporter: Sudhir Prakash
>Assignee: Jing Zhao
> Attachments: HDFS-8809.000.patch
>
>
> Whenever HBase is running, the "hdfs fsck /"  reports four hbase-related 
> files in the path "hbase/data/WALs/" as CORRUPT. Even after letting the 
> cluster sit idle for a couple hours, it is still in the corrupt state.  If 
> HBase is shut down, the problem goes away.  If HBase is then restarted, the 
> problem recurs.  This was observed with Hadoop 2.7.1 and HBase 1.1.1, and did 
> NOT happen with Hadoop 2.4 and HBase 0.98.
> {code}
> hades1:/var/opt/teradata/packages # su hdfs
> hdfs@hades1:/var/opt/teradata/packages> hdfs fsck /
> Connecting to namenode via 
> http://hades1.labs.teradata.com:50070/fsck?ugi=hdfs&path=%2F
> FSCK started by hdfs (auth:SIMPLE) from /39.0.8.2 for path / at Wed Jun 24 
> 20:40:17 GMT 2015
> ...
> /apps/hbase/data/WALs/hades4.labs.teradata.com,16020,1435168292684/hades4.labs.teradata.com%2C16020%2C1435168292684.default.1435175500556:
>  MISSING 1 blocks of total size 83 B.
> /apps/hbase/data/WALs/hades5.labs.teradata.com,16020,1435168290466/hades5.labs.teradata.com%2C16020%2C1435168290466..meta.1435175562144.meta:
>  MISSING 1 blocks of total size 83 B.
> /apps/hbase/data/WALs/hades5.labs.teradata.com,16020,1435168290466/hades5.labs.teradata.com%2C16020%2C1435168290466.default.1435175498500:
>  MISSING 1 blocks of total size 83 B.
> /apps/hbase/data/WALs/hades6.labs.teradata.com,16020,1435168292373/hades6.labs.teradata.com%2C16020%2C1435168292373.default.1435175500301:
>  MISSING 1 blocks of total size 83 
> B..
> 
> 
> Status:
>  CORRUPT
>  Total size:723977553 B (Total open files size: 332 B)
>  Total dirs:79
>  Total files:   388
>  Total symlinks:0 (Files currently being written: 5)
>  Total blocks (validated):  387 (avg. block size 1870743 B) (Total open 
> file blocks (not validated): 4)
>   
>   UNDER MIN REPL'D BLOCKS:  4 (1.0335917 %)
>   dfs.namenode.replication.min: 1
>   CORRUPT FILES:4
>   MISSING BLOCKS:   4
>   MISSING SIZE: 332 B
>   
>  Minimally replicated blocks:   387 (100.0 %)
>  Over-replicated blocks:0 (0.0 %)
>  Under-replicated blocks:   0 (0.0 %)
>  Mis-replicated blocks: 0 (0.0 %)
>  Default replication factor:3
>  Average block replication: 3.0
>  Corrupt blocks:0
>  Missing replicas:  0 (0.0 %)
>  Number of data-nodes:  3
>  Number of racks:   1
> FSCK ended at Wed Jun 24 20:40:17 GMT 2015 in 7 milliseconds
> The filesystem under path '/' is CORRUPT
> hdfs@hades1:/var/opt/teradata/packages>
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8803) Move DfsClientConf to hdfs-client

2015-08-19 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-8803:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

I've committed the patch to trunk and branch-2. Thanks [~liuml07] for the 
contribution.

> Move DfsClientConf to hdfs-client
> -
>
> Key: HDFS-8803
> URL: https://issues.apache.org/jira/browse/HDFS-8803
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-8803.000.patch, HDFS-8803.001.patch, 
> HDFS-8803.002.patch, HDFS-8803.003.patch
>
>
> This jira tracks the effort of moving the {{DfsClientConf}} class into the 
> hdfs-client module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8803) Move DfsClientConf to hdfs-client

2015-08-19 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-8803:
-
Assignee: Mingliang Liu  (was: Haohui Mai)

> Move DfsClientConf to hdfs-client
> -
>
> Key: HDFS-8803
> URL: https://issues.apache.org/jira/browse/HDFS-8803
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Attachments: HDFS-8803.000.patch, HDFS-8803.001.patch, 
> HDFS-8803.002.patch, HDFS-8803.003.patch
>
>
> This jira tracks the effort of moving the {{DfsClientConf}} class into the 
> hdfs-client module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8890) Allow admin to specify which blockpools the balancer should run on

2015-08-19 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated HDFS-8890:
---
Attachment: HDFS-8890-trunk-v2.patch

V2 attached.

> Allow admin to specify which blockpools the balancer should run on
> --
>
> Key: HDFS-8890
> URL: https://issues.apache.org/jira/browse/HDFS-8890
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: HDFS-8890-trunk-v1.patch, HDFS-8890-trunk-v2.patch
>
>
> Currently the balancer runs on all blockpools. Allow an admin to run the 
> balancer on a set of blockpools. This will enable the balancer to skip 
> blockpools that should not be balanced, for example a tmp blockpool that has 
> a large amount of churn.
> An example of the command line interface would be an additional flag that 
> specifies the blockpools by id:
> -blockpools 
> BP-6299761-10.55.116.188-1415904647555,BP-47348528-10.51.120.139-1415904199257
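For illustration, parsing such a flag value could be as simple as the sketch 
below (a hypothetical helper, not from the patch):

{code:java}
import java.util.HashSet;
import java.util.Set;

/** Illustrative only: split a -blockpools value like "BP-1...,BP-2...". */
static Set<String> parseBlockpools(String value) {
  Set<String> pools = new HashSet<>();
  for (String id : value.split(",")) {
    if (!id.trim().isEmpty()) {
      pools.add(id.trim());
    }
  }
  return pools;
}
{code}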



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections

2015-08-19 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703505#comment-14703505
 ] 

Bob Hansen commented on HDFS-8855:
--

Agreed that the RPC cache not working is a bug that should be fixed 
independently.  It can be argued that caching the whole client object is an 
additional optimization that has some value here.

But yes, we should track down why the RPC cache is failing us.

> Webhdfs client leaks active NameNode connections
> 
>
> Key: HDFS-8855
> URL: https://issues.apache.org/jira/browse/HDFS-8855
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
> Environment: HDP 2.2
>Reporter: Bob Hansen
>Assignee: Xiaobing Zhou
> Attachments: HDFS-8855.1.patch, HDFS_8855.prototype.patch
>
>
> The attached script simulates a process opening ~50 files via webhdfs and 
> performing random reads.  Note that there are at most 50 concurrent reads, 
> and all webhdfs sessions are kept open.  Each read is ~64k at a random 
> position.  
> The script periodically (once per second) shells into the NameNode and 
> produces a summary of the socket states.  For my test cluster with 5 nodes, 
> it took ~30 seconds for the NameNode to accumulate ~25000 active connections 
> and fail.
> It appears that each request to the webhdfs client is opening a new 
> connection to the NameNode and keeping it open after the request is complete. 
>  If the process continues to run, eventually (~30-60 seconds), all of the 
> open connections are closed and the NameNode recovers.  
> This smells like SoftReference reaping.  Are we using SoftReferences in the 
> webhdfs client to cache NameNode connections but never re-using them?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8923) Add -source flag to balancer usage message

2015-08-19 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated HDFS-8923:
---
Status: Patch Available  (was: In Progress)

> Add -source flag to balancer usage message
> --
>
> Key: HDFS-8923
> URL: https://issues.apache.org/jira/browse/HDFS-8923
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Trivial
> Attachments: HDFS-8923-trunk-v1.patch
>
>
> HDFS-8826 added a -source flag to the balancer, but the usage message still 
> needs to be updated. See current usage message in trunk:
> {code}
>    private static final String USAGE = "Usage: hdfs balancer"
>        + "\n\t[-policy <policy>]\tthe balancing policy: "
>        + BalancingPolicy.Node.INSTANCE.getName() + " or "
>        + BalancingPolicy.Pool.INSTANCE.getName()
>        + "\n\t[-threshold <threshold>]\tPercentage of disk capacity"
>        + "\n\t[-exclude [-f <hosts-file> | <comma-separated list of hosts>]]"
>        + "\tExcludes the specified datanodes."
>        + "\n\t[-include [-f <hosts-file> | <comma-separated list of hosts>]]"
>        + "\tIncludes only the specified datanodes."
>        + "\n\t[-idleiterations <idleiterations>]"
>        + "\tNumber of consecutive idle iterations (-1 for Infinite) before "
>        + "exit."
>        + "\n\t[-runDuringUpgrade]"
>        + "\tWhether to run the balancer during an ongoing HDFS upgrade."
>        + "This is usually not desired since it will not affect used space "
>        + "on over-utilized machines.";
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections

2015-08-19 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703480#comment-14703480
 ] 

Haohui Mai commented on HDFS-8855:
--

This basically shows that the RPC connection cache is not working, but again 
this is the wrong place to fix it.

We should dig into why the RPC connection cache is not working in this case 
instead of putting a band-aid in WebHDFS.

> Webhdfs client leaks active NameNode connections
> 
>
> Key: HDFS-8855
> URL: https://issues.apache.org/jira/browse/HDFS-8855
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
> Environment: HDP 2.2
>Reporter: Bob Hansen
>Assignee: Xiaobing Zhou
> Attachments: HDFS-8855.1.patch, HDFS_8855.prototype.patch
>
>
> The attached script simulates a process opening ~50 files via webhdfs and 
> performing random reads.  Note that there are at most 50 concurrent reads, 
> and all webhdfs sessions are kept open.  Each read is ~64k at a random 
> position.  
> The script periodically (once per second) shells into the NameNode and 
> produces a summary of the socket states.  For my test cluster with 5 nodes, 
> it took ~30 seconds for the NameNode to accumulate ~25000 active connections 
> and fail.
> It appears that each request to the webhdfs client is opening a new 
> connection to the NameNode and keeping it open after the request is complete. 
>  If the process continues to run, eventually (~30-60 seconds), all of the 
> open connections are closed and the NameNode recovers.  
> This smells like SoftReference reaping.  Are we using SoftReferences in the 
> webhdfs client to cache NameNode connections but never re-using them?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8923) Add -source flag to balancer usage message

2015-08-19 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated HDFS-8923:
---
Attachment: HDFS-8923-trunk-v1.patch

[~szetszwo] [~arpitagarwal] V1 patch attached.

> Add -source flag to balancer usage message
> --
>
> Key: HDFS-8923
> URL: https://issues.apache.org/jira/browse/HDFS-8923
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Trivial
> Attachments: HDFS-8923-trunk-v1.patch
>
>
> HDFS-8826 added a -source flag to the balancer, but the usage message still 
> needs to be updated. See current usage message in trunk:
> {code}
>    private static final String USAGE = "Usage: hdfs balancer"
>        + "\n\t[-policy <policy>]\tthe balancing policy: "
>        + BalancingPolicy.Node.INSTANCE.getName() + " or "
>        + BalancingPolicy.Pool.INSTANCE.getName()
>        + "\n\t[-threshold <threshold>]\tPercentage of disk capacity"
>        + "\n\t[-exclude [-f <hosts-file> | <comma-separated list of hosts>]]"
>        + "\tExcludes the specified datanodes."
>        + "\n\t[-include [-f <hosts-file> | <comma-separated list of hosts>]]"
>        + "\tIncludes only the specified datanodes."
>        + "\n\t[-idleiterations <idleiterations>]"
>        + "\tNumber of consecutive idle iterations (-1 for Infinite) before "
>        + "exit."
>        + "\n\t[-runDuringUpgrade]"
>        + "\tWhether to run the balancer during an ongoing HDFS upgrade."
>        + "This is usually not desired since it will not affect used space "
>        + "on over-utilized machines.";
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8826) Balancer may not move blocks efficiently in some cases

2015-08-19 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703461#comment-14703461
 ] 

Chris Trezzo commented on HDFS-8826:


Woops. Meant HDFS-8923.

> Balancer may not move blocks efficiently in some cases
> --
>
> Key: HDFS-8826
> URL: https://issues.apache.org/jira/browse/HDFS-8826
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Fix For: 2.8.0
>
> Attachments: h8826_20150811.patch, h8826_20150816.patch, 
> h8826_20150818.patch
>
>
> Balancer is inefficient in the following case:
> || Datanode || Utilization || Rack ||
> | D1 | 95% | A |
> | D2 | 30% | B |
> | D3, D4, D5 | 0% | B |
> The average utilization is (95% + 30% + 0% + 0% + 0%) / 5 = 25%, so D2 is 
> within the 10% threshold.  However, Balancer currently will first move blocks 
> from D2 to D3, D4 and D5 since they are under the same rack.  Then, it will 
> move blocks from D1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8826) Balancer may not move blocks efficiently in some cases

2015-08-19 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703456#comment-14703456
 ] 

Chris Trezzo commented on HDFS-8826:


HDFS-8826 filed. Posting patch there.

> Balancer may not move blocks efficiently in some cases
> --
>
> Key: HDFS-8826
> URL: https://issues.apache.org/jira/browse/HDFS-8826
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Fix For: 2.8.0
>
> Attachments: h8826_20150811.patch, h8826_20150816.patch, 
> h8826_20150818.patch
>
>
> Balancer is inefficient in the following case:
> || Datanode || Utilization || Rack ||
> | D1 | 95% | A |
> | D2 | 30% | B |
> | D3, D4, D5 | 0% | B |
> The average utilization is (95% + 30% + 0% + 0% + 0%) / 5 = 25%, so D2 is 
> within the 10% threshold.  However, Balancer currently will first move blocks 
> from D2 to D3, D4 and D5 since they are under the same rack.  Then, it will 
> move blocks from D1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HDFS-8923) Add -source flag to balancer usage message

2015-08-19 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-8923 started by Chris Trezzo.
--
> Add -source flag to balancer usage message
> --
>
> Key: HDFS-8923
> URL: https://issues.apache.org/jira/browse/HDFS-8923
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Trivial
>
> HDFS-8826 added a -source flag to the balancer, but the usage message still 
> needs to be updated. See current usage message in trunk:
> {code}
>    private static final String USAGE = "Usage: hdfs balancer"
>        + "\n\t[-policy <policy>]\tthe balancing policy: "
>        + BalancingPolicy.Node.INSTANCE.getName() + " or "
>        + BalancingPolicy.Pool.INSTANCE.getName()
>        + "\n\t[-threshold <threshold>]\tPercentage of disk capacity"
>        + "\n\t[-exclude [-f <hosts-file> | <comma-separated list of hosts>]]"
>        + "\tExcludes the specified datanodes."
>        + "\n\t[-include [-f <hosts-file> | <comma-separated list of hosts>]]"
>        + "\tIncludes only the specified datanodes."
>        + "\n\t[-idleiterations <idleiterations>]"
>        + "\tNumber of consecutive idle iterations (-1 for Infinite) before "
>        + "exit."
>        + "\n\t[-runDuringUpgrade]"
>        + "\tWhether to run the balancer during an ongoing HDFS upgrade."
>        + "This is usually not desired since it will not affect used space "
>        + "on over-utilized machines.";
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8923) Add -source flag to balancer usage message

2015-08-19 Thread Chris Trezzo (JIRA)
Chris Trezzo created HDFS-8923:
--

 Summary: Add -source flag to balancer usage message
 Key: HDFS-8923
 URL: https://issues.apache.org/jira/browse/HDFS-8923
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Chris Trezzo
Assignee: Chris Trezzo
Priority: Trivial


HDFS-8826 added a -source flag to the balancer, but the usage message still 
needs to be updated. See current usage message in trunk:
{code}
   private static final String USAGE = "Usage: hdfs balancer"
       + "\n\t[-policy <policy>]\tthe balancing policy: "
       + BalancingPolicy.Node.INSTANCE.getName() + " or "
       + BalancingPolicy.Pool.INSTANCE.getName()
       + "\n\t[-threshold <threshold>]\tPercentage of disk capacity"
       + "\n\t[-exclude [-f <hosts-file> | <comma-separated list of hosts>]]"
       + "\tExcludes the specified datanodes."
       + "\n\t[-include [-f <hosts-file> | <comma-separated list of hosts>]]"
       + "\tIncludes only the specified datanodes."
       + "\n\t[-idleiterations <idleiterations>]"
       + "\tNumber of consecutive idle iterations (-1 for Infinite) before "
       + "exit."
       + "\n\t[-runDuringUpgrade]"
       + "\tWhether to run the balancer during an ongoing HDFS upgrade."
       + "This is usually not desired since it will not affect used space "
       + "on over-utilized machines.";
{code}
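
For reference, a hedged sketch of the two lines to add (wording guessed from 
the -exclude/-include entries above; the actual patch may phrase it 
differently):

{code:java}
+ "\n\t[-source [-f <hosts-file> | <comma-separated list of hosts>]]"
+ "\tPicks only the specified datanodes as source nodes."
{code}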



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8803) Move DfsClientConf to hdfs-client

2015-08-19 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703438#comment-14703438
 ] 

Jing Zhao commented on HDFS-8803:
-

The test failure should be unrelated. +1.

> Move DfsClientConf to hdfs-client
> -
>
> Key: HDFS-8803
> URL: https://issues.apache.org/jira/browse/HDFS-8803
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-8803.000.patch, HDFS-8803.001.patch, 
> HDFS-8803.002.patch, HDFS-8803.003.patch
>
>
> This jira tracks the effort of moving the {{DfsClientConf}} class into the 
> hdfs-client module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8867) Enable optimized block reports

2015-08-19 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8867:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.7.2
   Status: Resolved  (was: Patch Available)

Thanks for the fix, Daryn. I've committed this to trunk, branch-2 and 
branch-2.7.

> Enable optimized block reports
> --
>
> Key: HDFS-8867
> URL: https://issues.apache.org/jira/browse/HDFS-8867
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Rushabh S Shah
>Assignee: Daryn Sharp
> Fix For: 2.7.2
>
> Attachments: HDFS-8867.patch
>
>
> Opening this ticket on behalf of [~daryn]
> HDFS-7435 introduced a more efficiently encoded block report format designed 
> to improve performance and reduce GC load on the NN and DNs. The NN is not 
> advertising this capability to the DNs so old-style reports are still being 
> used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8826) Balancer may not move blocks efficiently in some cases

2015-08-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703369#comment-14703369
 ] 

Hudson commented on HDFS-8826:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #289 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/289/])
HDFS-8826. In Balancer, add an option to specify the source node list so that 
balancer only selects blocks to move from those nodes. (szetszwo: rev 
7ecbfd44aa57f5f54c214b7fdedda2500be76f51)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/StringUtils.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/HostsFileReader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java


> Balancer may not move blocks efficiently in some cases
> --
>
> Key: HDFS-8826
> URL: https://issues.apache.org/jira/browse/HDFS-8826
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Fix For: 2.8.0
>
> Attachments: h8826_20150811.patch, h8826_20150816.patch, 
> h8826_20150818.patch
>
>
> Balancer is inefficient in the following case:
> || Datanode || Utilization || Rack ||
> | D1 | 95% | A |
> | D2 | 30% | B |
> | D3, D4, D5 | 0% | B |
> The average utilization is 25%, so D2 is within the 10% threshold.  However, 
> the Balancer will first move blocks from D2 to D3, D4 and D5, since they are 
> under the same rack, and only then move blocks from D1.
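
With the new option, this scenario can be handled by restricting the source
list to the over-utilized node, e.g. {{hdfs balancer -source D1.example.com}}
(the hostname is a placeholder for D1's address).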



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8435) Support CreateFlag in WebHdfs

2015-08-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703371#comment-14703371
 ] 

Hudson commented on HDFS-8435:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #289 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/289/])
HDFS-8435. Support CreateFlag in WebHDFS. Contributed by Jakob Homan (cdouglas: 
rev 30e342a5d32be5efffeb472cce76d4ed43642608)
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/WebHDFS.md
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCreation.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CreateFlag.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/ParameterParser.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/resources/CreateFlagParam.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/WebHdfsHandler.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/resources/CreateParentParam.java


> Support CreateFlag in WebHdfs
> -
>
> Key: HDFS-8435
> URL: https://issues.apache.org/jira/browse/HDFS-8435
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 2.6.0
>Reporter: Vinoth Sathappan
>Assignee: Jakob Homan
> Fix For: 2.8.0
>
> Attachments: HDFS-8435-branch-2.7.001.patch, HDFS-8435.001.patch, 
> HDFS-8435.002.patch, HDFS-8435.003.patch, HDFS-8435.004.patch, 
> HDFS-8435.005.patch
>
>
> The WebHdfsFileSystem implementation doesn't support createNonRecursive. 
> HBase depends on it extensively for proper functioning. Currently, when the 
> region servers are started over WebHDFS, they crash with:
> createNonRecursive unsupported for this filesystem class 
> org.apache.hadoop.hdfs.web.SWebHdfsFileSystem
> at 
> org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1137)
> at 
> org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1112)
> at 
> org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1088)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.init(ProtobufLogWriter.java:85)
> at 
> org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:198)
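
To illustrate the call HBase relies on, here is a minimal sketch of a
non-recursive create (the NameNode address and path are placeholders, and the
deprecated six-argument overload is used for brevity):
{code}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CreateNonRecursiveDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(URI.create("webhdfs://namenode:50070"), conf);
    Path wal = new Path("/hbase/WALs/regionserver-1/wal.00001");
    // Unlike create(), createNonRecursive() must fail when the parent
    // directory is missing instead of creating it implicitly; this is
    // the guarantee HBase's WAL writer depends on.
    try (FSDataOutputStream out = fs.createNonRecursive(
        wal, true /* overwrite */, 4096, (short) 3, 128L * 1024 * 1024, null)) {
      out.writeBytes("wal entry\n");
    }
  }
}
{code}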



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8852) HDFS architecture documentation of version 2.x is outdated about append write support

2015-08-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703370#comment-14703370
 ] 

Hudson commented on HDFS-8852:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #289 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/289/])
HDFS-8852. HDFS architecture documentation of version 2.x is outdated about 
append write support. Contributed by Ajith S. (aajisaka: rev 
fc509f66d814e7a5ed81d5d73b23c400625d573b)
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsDesign.md
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> HDFS architecture documentation of version 2.x is outdated about append write 
> support
> -
>
> Key: HDFS-8852
> URL: https://issues.apache.org/jira/browse/HDFS-8852
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Reporter: Hong Dai Thanh
>Assignee: Ajith S
>  Labels: newbie
> Fix For: 2.7.2
>
> Attachments: HDFS-8852.2.patch, HDFS-8852.patch
>
>
> In the [latest version of the 
> documentation|http://hadoop.apache.org/docs/current2/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#Simple_Coherency_Model],
> and in the documentation for all 2.x releases, it’s mentioned that 
> “A file once created, written, and closed need not be changed.” and “There 
> is a plan to support appending-writes to files in the future.”
>  
> However, as far as I know, HDFS has supported append writes since 0.21, based 
> on [HDFS-265|https://issues.apache.org/jira/browse/HDFS-265] and [the old 
> version of the documentation in 
> 2012|https://web.archive.org/web/20121221171824/http://hadoop.apache.org/docs/hdfs/current/hdfs_design.html#Appending-Writes+and+File+Syncs].
> Various posts on the Internet also suggest that append writes have been 
> available in HDFS and will remain available in the Hadoop 2 branch.
>  
> Can we update the documentation to reflect the current status?
> (Please also review whether the documentation should also be updated for 
> version 0.21 and above, and the version 1.x branch)
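
A quick sketch confirming append works on 2.x with default settings (the path
is arbitrary; the file is created on the first run):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendDemo {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path p = new Path("/tmp/append-demo.txt");
    if (!fs.exists(p)) {
      fs.create(p).close();  // create an empty file on the first run
    }
    // Reopen the existing file and write at the end; supported since
    // 0.21 (HDFS-265) and available throughout the 2.x line.
    try (FSDataOutputStream out = fs.append(p)) {
      out.writeBytes("one more line\n");
    }
  }
}
{code}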



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8908) TestAppendSnapshotTruncate may fail with IOException: Failed to replace a bad datanode

2015-08-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703372#comment-14703372
 ] 

Hudson commented on HDFS-8908:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #289 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/289/])
HDFS-8908. TestAppendSnapshotTruncate may fail with IOException: Failed to 
replace a bad datanode. (Tsz Wo Nicholas Sze via yliu) (yliu: rev 
2da5aaab334d0d6a7dee244cac603aa35c9b0134)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestAppendSnapshotTruncate.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> TestAppendSnapshotTruncate may fail with IOException: Failed to replace a bad 
> datanode
> --
>
> Key: HDFS-8908
> URL: https://issues.apache.org/jira/browse/HDFS-8908
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: h8908_20150817.patch
>
>
> See 
> https://builds.apache.org/job/PreCommit-HDFS-Build/12005/testReport/org.apache.hadoop.hdfs/TestAppendSnapshotTruncate/testAST/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8911) NameNode Metric : Add Editlog counters as a JMX metric

2015-08-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703363#comment-14703363
 ] 

Hudson commented on HDFS-8911:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8321 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8321/])
HDFS-8911. NameNode Metric : Add Editlog counters as a JMX metric. (Contributed 
by Anu Engineer) (arp: rev 9c3571ea607f0953487464844ed0d46fdb3e9f90)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/metrics/FSNamesystemMBean.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSNamesystemMBean.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md


> NameNode Metric : Add Editlog counters as a JMX metric
> --
>
> Key: HDFS-8911
> URL: https://issues.apache.org/jira/browse/HDFS-8911
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: 2.8.0
>
> Attachments: HDFS-8911.001.patch
>
>
> Today, editlog metrics are only written to the NameNode log file. This JIRA 
> proposes to expose those metrics via JMX.
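
As a rough illustration of the mechanism (the file list above shows the real
patch touches FSNamesystemMBean; the bean and attribute names below are made
up for the example), a counter becomes readable by any JMX client once it is
exposed through a registered MBean:
{code}
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class EditLogJmxSketch {
  // JMX standard MBeans require an interface named <ImplClass>MBean.
  public interface EditLogStatsMBean {
    long getTotalSyncCount();
  }

  public static class EditLogStats implements EditLogStatsMBean {
    private volatile long totalSyncCount;
    void onSync() { totalSyncCount++; }
    @Override public long getTotalSyncCount() { return totalSyncCount; }
  }

  public static void main(String[] args) throws Exception {
    MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    ObjectName name =
        new ObjectName("Hadoop:service=NameNode,name=EditLogStats");
    EditLogStats stats = new EditLogStats();
    server.registerMBean(stats, name);
    stats.onSync();
    // The attribute is now visible to jconsole or any other JMX client.
    System.out.println(server.getAttribute(name, "TotalSyncCount"));  // 1
  }
}
{code}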



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8908) TestAppendSnapshotTruncate may fail with IOException: Failed to replace a bad datanode

2015-08-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703362#comment-14703362
 ] 

Hudson commented on HDFS-8908:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8321 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8321/])
HDFS-8908. TestAppendSnapshotTruncate may fail with IOException: Failed to 
replace a bad datanode. (Tsz Wo Nicholas Sze via yliu) (yliu: rev 
2da5aaab334d0d6a7dee244cac603aa35c9b0134)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestAppendSnapshotTruncate.java


> TestAppendSnapshotTruncate may fail with IOException: Failed to replace a bad 
> datanode
> --
>
> Key: HDFS-8908
> URL: https://issues.apache.org/jira/browse/HDFS-8908
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: h8908_20150817.patch
>
>
> See 
> https://builds.apache.org/job/PreCommit-HDFS-Build/12005/testReport/org.apache.hadoop.hdfs/TestAppendSnapshotTruncate/testAST/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8826) Balancer may not move blocks efficiently in some cases

2015-08-19 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703350#comment-14703350
 ] 

Chris Trezzo commented on HDFS-8826:


[~arpitagarwal] [~szetszwo]

The applied patch does not add the -source flag to the balancer's usage 
message. See the current usage message in trunk:
{code}
   private static final String USAGE = "Usage: hdfs balancer"
   + "\n\t[-policy <policy>]\tthe balancing policy: "
   + BalancingPolicy.Node.INSTANCE.getName() + " or "
   + BalancingPolicy.Pool.INSTANCE.getName()
   + "\n\t[-threshold <threshold>]\tPercentage of disk capacity"
   + "\n\t[-exclude [-f <hosts-file> | <comma-separated list of hosts>]]"
   + "\tExcludes the specified datanodes."
   + "\n\t[-include [-f <hosts-file> | <comma-separated list of hosts>]]"
   + "\tIncludes only the specified datanodes."
   + "\n\t[-idleiterations <idleiterations>]"
   + "\tNumber of consecutive idle iterations (-1 for Infinite) before "
   + "exit."
   + "\n\t[-runDuringUpgrade]"
   + "\tWhether to run the balancer during an ongoing HDFS upgrade."
   + "This is usually not desired since it will not affect used space "
   + "on over-utilized machines.";
{code}

Should I file a jira or do you guys just want to post an amendment patch? 
Thanks!

> Balancer may not move blocks efficiently in some cases
> --
>
> Key: HDFS-8826
> URL: https://issues.apache.org/jira/browse/HDFS-8826
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Fix For: 2.8.0
>
> Attachments: h8826_20150811.patch, h8826_20150816.patch, 
> h8826_20150818.patch
>
>
> Balancer is inefficient in the following case:
> || Datanode || Utilization || Rack ||
> | D1 | 95% | A |
> | D2 | 30% | B |
> | D3, D4, D5 | 0% | B |
> The average utilization is 25%, so D2 is within the 10% threshold.  However, 
> the Balancer will first move blocks from D2 to D3, D4 and D5, since they are 
> under the same rack, and only then move blocks from D1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

