[jira] [Commented] (HDFS-13056) Expose file-level composite CRCs in HDFS which are comparable across different instances/layouts

2018-03-24 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412902#comment-16412902
 ] 

genericqa commented on HDFS-13056:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 8 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  5m  
1s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 25m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 36s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
25s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 23m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 23m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 23m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
27s{color} | {color:green} root: The patch generated 0 new + 680 unchanged - 14 
fixed = 680 total (was 694) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
7m 59s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
20s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
26s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 80m 34s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 
15s{color} | {color:green} hadoop-distcp in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}226m 42s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
|   | 

[jira] [Updated] (HDFS-13204) RBF: Optimize name service safe mode icon

2018-03-24 Thread liuhongtong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liuhongtong updated HDFS-13204:
---
Attachment: HDFS-13204.007.patch

> RBF: Optimize name service safe mode icon
> -
>
> Key: HDFS-13204
> URL: https://issues.apache.org/jira/browse/HDFS-13204
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: liuhongtong
>Assignee: liuhongtong
>Priority: Minor
> Attachments: HDFS-13204.001.patch, HDFS-13204.002.patch, 
> HDFS-13204.003.patch, HDFS-13204.004.patch, HDFS-13204.005.patch, 
> HDFS-13204.006.patch, HDFS-13204.007.patch, Routers.png, Subclusters.png, 
> image-2018-02-28-18-33-09-972.png, image-2018-02-28-18-33-47-661.png, 
> image-2018-02-28-18-35-35-708.png, image-2018-03-23-18-06-54-354.png
>
>
> In federation health webpage, the safe mode icons of Subclusters and Routers 
> are inconsistent.
> The safe mode icon of Subclusters may induce users the name service is 
> maintaining.
> !image-2018-02-28-18-33-09-972.png!
> The safe mode icon of Routers:
> !image-2018-02-28-18-33-47-661.png!
> In fact, if the name service is in safe mode, users can't do writing related 
> operations. So I think the safe mode icon in Subclusters should be modified, 
> which may be more reasonable.
> !image-2018-02-28-18-35-35-708.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13204) RBF: Optimize name service safe mode icon

2018-03-24 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412894#comment-16412894
 ] 

genericqa commented on HDFS-13204:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
35m 43s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 3 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 22s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 48m 15s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | HDFS-13204 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12916084/HDFS-13204.006.patch |
| Optional Tests |  asflicense  shadedclient  |
| uname | Linux 0e84677c471f 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 28790b8 |
| maven | version: Apache Maven 3.3.9 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/23661/artifact/out/whitespace-eol.txt
 |
| Max. process+thread count | 315 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/23661/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> RBF: Optimize name service safe mode icon
> -
>
> Key: HDFS-13204
> URL: https://issues.apache.org/jira/browse/HDFS-13204
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: liuhongtong
>Assignee: liuhongtong
>Priority: Minor
> Attachments: HDFS-13204.001.patch, HDFS-13204.002.patch, 
> HDFS-13204.003.patch, HDFS-13204.004.patch, HDFS-13204.005.patch, 
> HDFS-13204.006.patch, Routers.png, Subclusters.png, 
> image-2018-02-28-18-33-09-972.png, image-2018-02-28-18-33-47-661.png, 
> image-2018-02-28-18-35-35-708.png, image-2018-03-23-18-06-54-354.png
>
>
> In federation health webpage, the safe mode icons of Subclusters and Routers 
> are inconsistent.
> The safe mode icon of Subclusters may induce users the name service is 
> maintaining.
> !image-2018-02-28-18-33-09-972.png!
> The safe mode icon of Routers:
> !image-2018-02-28-18-33-47-661.png!
> In fact, if the name service is in safe mode, users can't do writing related 
> operations. So I think the safe mode icon in Subclusters should be modified, 
> which may be more reasonable.
> !image-2018-02-28-18-35-35-708.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13291) RBF: Implement available space based OrderResolver

2018-03-24 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412883#comment-16412883
 ] 

genericqa commented on HDFS-13291:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 48s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 21s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 11m 
44s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 68m 14s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | HDFS-13291 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12916083/HDFS-13291.009.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux 1b246834e9d6 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 28790b8 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/23660/testReport/ |
| Max. process+thread count | 935 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/23660/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> RBF: Implement available space based 

[jira] [Commented] (HDFS-13204) RBF: Optimize name service safe mode icon

2018-03-24 Thread liuhongtong (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412878#comment-16412878
 ] 

liuhongtong commented on HDFS-13204:


[~elgoiri] Thx. I have modified rbf.css.

> RBF: Optimize name service safe mode icon
> -
>
> Key: HDFS-13204
> URL: https://issues.apache.org/jira/browse/HDFS-13204
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: liuhongtong
>Assignee: liuhongtong
>Priority: Minor
> Attachments: HDFS-13204.001.patch, HDFS-13204.002.patch, 
> HDFS-13204.003.patch, HDFS-13204.004.patch, HDFS-13204.005.patch, 
> HDFS-13204.006.patch, Routers.png, Subclusters.png, 
> image-2018-02-28-18-33-09-972.png, image-2018-02-28-18-33-47-661.png, 
> image-2018-02-28-18-35-35-708.png, image-2018-03-23-18-06-54-354.png
>
>
> In federation health webpage, the safe mode icons of Subclusters and Routers 
> are inconsistent.
> The safe mode icon of Subclusters may induce users the name service is 
> maintaining.
> !image-2018-02-28-18-33-09-972.png!
> The safe mode icon of Routers:
> !image-2018-02-28-18-33-47-661.png!
> In fact, if the name service is in safe mode, users can't do writing related 
> operations. So I think the safe mode icon in Subclusters should be modified, 
> which may be more reasonable.
> !image-2018-02-28-18-35-35-708.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13204) RBF: Optimize name service safe mode icon

2018-03-24 Thread liuhongtong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liuhongtong updated HDFS-13204:
---
Attachment: HDFS-13204.006.patch

> RBF: Optimize name service safe mode icon
> -
>
> Key: HDFS-13204
> URL: https://issues.apache.org/jira/browse/HDFS-13204
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: liuhongtong
>Assignee: liuhongtong
>Priority: Minor
> Attachments: HDFS-13204.001.patch, HDFS-13204.002.patch, 
> HDFS-13204.003.patch, HDFS-13204.004.patch, HDFS-13204.005.patch, 
> HDFS-13204.006.patch, Routers.png, Subclusters.png, 
> image-2018-02-28-18-33-09-972.png, image-2018-02-28-18-33-47-661.png, 
> image-2018-02-28-18-35-35-708.png, image-2018-03-23-18-06-54-354.png
>
>
> In federation health webpage, the safe mode icons of Subclusters and Routers 
> are inconsistent.
> The safe mode icon of Subclusters may induce users the name service is 
> maintaining.
> !image-2018-02-28-18-33-09-972.png!
> The safe mode icon of Routers:
> !image-2018-02-28-18-33-47-661.png!
> In fact, if the name service is in safe mode, users can't do writing related 
> operations. So I think the safe mode icon in Subclusters should be modified, 
> which may be more reasonable.
> !image-2018-02-28-18-35-35-708.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13291) RBF: Implement available space based OrderResolver

2018-03-24 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412869#comment-16412869
 ] 

Yiqun Lin commented on HDFS-13291:
--

Fix checktyle warning.

> RBF: Implement available space based OrderResolver
> --
>
> Key: HDFS-13291
> URL: https://issues.apache.org/jira/browse/HDFS-13291
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Major
> Attachments: HDFS-13291.001.patch, HDFS-13291.002.patch, 
> HDFS-13291.003.patch, HDFS-13291.004.patch, HDFS-13291.005.patch, 
> HDFS-13291.006.patch, HDFS-13291.007.patch, HDFS-13291.008.patch, 
> HDFS-13291.009.patch
>
>
> Implement available space based OrderResolver, this type resolver will 
> benefit for balancing the data across subclusters. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13291) RBF: Implement available space based OrderResolver

2018-03-24 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-13291:
-
Attachment: HDFS-13291.009.patch

> RBF: Implement available space based OrderResolver
> --
>
> Key: HDFS-13291
> URL: https://issues.apache.org/jira/browse/HDFS-13291
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Major
> Attachments: HDFS-13291.001.patch, HDFS-13291.002.patch, 
> HDFS-13291.003.patch, HDFS-13291.004.patch, HDFS-13291.005.patch, 
> HDFS-13291.006.patch, HDFS-13291.007.patch, HDFS-13291.008.patch, 
> HDFS-13291.009.patch
>
>
> Implement available space based OrderResolver, this type resolver will 
> benefit for balancing the data across subclusters. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13056) Expose file-level composite CRCs in HDFS which are comparable across different instances/layouts

2018-03-24 Thread Dennis Huo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Huo updated HDFS-13056:
--
Attachment: HDFS-13056.012.patch

> Expose file-level composite CRCs in HDFS which are comparable across 
> different instances/layouts
> 
>
> Key: HDFS-13056
> URL: https://issues.apache.org/jira/browse/HDFS-13056
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, distcp, erasure-coding, federation, hdfs
>Affects Versions: 3.0.0
>Reporter: Dennis Huo
>Assignee: Dennis Huo
>Priority: Major
> Attachments: HDFS-13056-branch-2.8.001.patch, 
> HDFS-13056-branch-2.8.002.patch, HDFS-13056-branch-2.8.003.patch, 
> HDFS-13056-branch-2.8.004.patch, HDFS-13056-branch-2.8.005.patch, 
> HDFS-13056-branch-2.8.poc1.patch, HDFS-13056.001.patch, HDFS-13056.002.patch, 
> HDFS-13056.003.patch, HDFS-13056.003.patch, HDFS-13056.004.patch, 
> HDFS-13056.005.patch, HDFS-13056.006.patch, HDFS-13056.007.patch, 
> HDFS-13056.008.patch, HDFS-13056.009.patch, HDFS-13056.010.patch, 
> HDFS-13056.011.patch, HDFS-13056.012.patch, 
> Reference_only_zhen_PPOC_hadoop2.6.X.diff, hdfs-file-composite-crc32-v1.pdf, 
> hdfs-file-composite-crc32-v2.pdf, hdfs-file-composite-crc32-v3.pdf
>
>
> FileChecksum was first introduced in 
> [https://issues-test.apache.org/jira/browse/HADOOP-3981] and ever since then 
> has remained defined as MD5-of-MD5-of-CRC, where per-512-byte chunk CRCs are 
> already stored as part of datanode metadata, and the MD5 approach is used to 
> compute an aggregate value in a distributed manner, with individual datanodes 
> computing the MD5-of-CRCs per-block in parallel, and the HDFS client 
> computing the second-level MD5.
>  
> A shortcoming of this approach which is often brought up is the fact that 
> this FileChecksum is sensitive to the internal block-size and chunk-size 
> configuration, and thus different HDFS files with different block/chunk 
> settings cannot be compared. More commonly, one might have different HDFS 
> clusters which use different block sizes, in which case any data migration 
> won't be able to use the FileChecksum for distcp's rsync functionality or for 
> verifying end-to-end data integrity (on top of low-level data integrity 
> checks applied at data transfer time).
>  
> This was also revisited in https://issues.apache.org/jira/browse/HDFS-8430 
> during the addition of checksum support for striped erasure-coded files; 
> while there was some discussion of using CRC composability, it still 
> ultimately settled on hierarchical MD5 approach, which also adds the problem 
> that checksums of basic replicated files are not comparable to striped files.
>  
> This feature proposes to add a "COMPOSITE-CRC" FileChecksum type which uses 
> CRC composition to remain completely chunk/block agnostic, and allows 
> comparison between striped vs replicated files, between different HDFS 
> instances, and possible even between HDFS and other external storage systems. 
> This feature can also be added in-place to be compatible with existing block 
> metadata, and doesn't need to change the normal path of chunk verification, 
> so is minimally invasive. This also means even large preexisting HDFS 
> deployments could adopt this feature to retroactively sync data. A detailed 
> design document can be found here: 
> https://storage.googleapis.com/dennishuo/hdfs-file-composite-crc32-v1.pdf



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13300) Ozone: Remove DatanodeID dependency from HDSL and Ozone

2018-03-24 Thread Nanda kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412641#comment-16412641
 ] 

Nanda kumar commented on HDFS-13300:


Thanks for the explanation [~elek], I get your point. Good catch.
I'm working on the patch to fix the issues you mentioned, will upload it soon.

> Ozone:  Remove DatanodeID dependency from HDSL and Ozone
> 
>
> Key: HDFS-13300
> URL: https://issues.apache.org/jira/browse/HDFS-13300
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
> Attachments: HDFS-13300-HDFS-7240.000.patch, 
> HDFS-13300-HDFS-7240.001.patch, HDFS-13300-HDFS-7240.002.patch, 
> HDFS-13300-HDFS-7240.003.patch
>
>
> DatanodeID has been modified to add HDSL/Ozone related information 
> previously. This jira is to remove DatanodeID dependency from HDSL/Ozone to 
> make it truly pluggable without having the need to modify DatanodeID.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13195) DataNode conf page cannot display the current value after reconfig

2018-03-24 Thread maobaolong (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412595#comment-16412595
 ] 

maobaolong commented on HDFS-13195:
---

[~kihwal] [~elgoiri] [~liuhongtong] [~bharatviswa] Thank you for the reviewing, 
[~kihwal] Thank you for help me to commit this patch.

> DataNode conf page  cannot display the current value after reconfig
> ---
>
> Key: HDFS-13195
> URL: https://issues.apache.org/jira/browse/HDFS-13195
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: maobaolong
>Assignee: maobaolong
>Priority: Minor
> Fix For: 2.10.0, 2.9.1, 2.8.4, 2.7.6, 3.0.2, 3.2.0, 3.1.1
>
> Attachments: HDFS-13195-branch-2.7.001.patch, 
> HDFS-13195-branch-2.7.002.patch, HDFS-13195.001.patch, HDFS-13195.002.patch
>
>
> Now the branch-2.7 support dfs.datanode.data.dir reconfig, but after i 
> reconfig this key, the conf page's value is still the old config value.
> The reason is that:
> {code:java}
> public DatanodeHttpServer(final Configuration conf,
>   final DataNode datanode,
>   final ServerSocketChannel externalHttpChannel)
> throws IOException {
> this.conf = conf;
> Configuration confForInfoServer = new Configuration(conf);
> confForInfoServer.setInt(HttpServer2.HTTP_MAX_THREADS, 10);
> HttpServer2.Builder builder = new HttpServer2.Builder()
> .setName("datanode")
> .setConf(confForInfoServer)
> .setACL(new AccessControlList(conf.get(DFS_ADMIN, " ")))
> .hostName(getHostnameForSpnegoPrincipal(confForInfoServer))
> .addEndpoint(URI.create("http://localhost:0;))
> .setFindPort(true);
> this.infoServer = builder.build();
> {code}
> The confForInfoServer is a new configuration instance, while the dfsadmin 
> reconfig the datanode's config, the config result cannot reflect to 
> confForInfoServer, so we should use the datanode's conf.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13056) Expose file-level composite CRCs in HDFS which are comparable across different instances/layouts

2018-03-24 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412536#comment-16412536
 ] 

genericqa commented on HDFS-13056:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 8 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
22s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 27m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 49s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
39s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 26m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 26m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 26m  
8s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 46s{color} | {color:orange} root: The patch generated 6 new + 689 unchanged 
- 4 fixed = 695 total (was 693) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
8m 43s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
36s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m  3s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
33s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}137m 48s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 
32s{color} | {color:green} hadoop-distcp in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
42s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}290m  8s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.fs.shell.TestCopyFromLocal |
|   | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
|   | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|  

[jira] [Updated] (HDFS-13056) Expose file-level composite CRCs in HDFS which are comparable across different instances/layouts

2018-03-24 Thread Dennis Huo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Huo updated HDFS-13056:
--
Attachment: HDFS-13056.011.patch

> Expose file-level composite CRCs in HDFS which are comparable across 
> different instances/layouts
> 
>
> Key: HDFS-13056
> URL: https://issues.apache.org/jira/browse/HDFS-13056
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, distcp, erasure-coding, federation, hdfs
>Affects Versions: 3.0.0
>Reporter: Dennis Huo
>Assignee: Dennis Huo
>Priority: Major
> Attachments: HDFS-13056-branch-2.8.001.patch, 
> HDFS-13056-branch-2.8.002.patch, HDFS-13056-branch-2.8.003.patch, 
> HDFS-13056-branch-2.8.004.patch, HDFS-13056-branch-2.8.005.patch, 
> HDFS-13056-branch-2.8.poc1.patch, HDFS-13056.001.patch, HDFS-13056.002.patch, 
> HDFS-13056.003.patch, HDFS-13056.003.patch, HDFS-13056.004.patch, 
> HDFS-13056.005.patch, HDFS-13056.006.patch, HDFS-13056.007.patch, 
> HDFS-13056.008.patch, HDFS-13056.009.patch, HDFS-13056.010.patch, 
> HDFS-13056.011.patch, Reference_only_zhen_PPOC_hadoop2.6.X.diff, 
> hdfs-file-composite-crc32-v1.pdf, hdfs-file-composite-crc32-v2.pdf, 
> hdfs-file-composite-crc32-v3.pdf
>
>
> FileChecksum was first introduced in 
> [https://issues-test.apache.org/jira/browse/HADOOP-3981] and ever since then 
> has remained defined as MD5-of-MD5-of-CRC, where per-512-byte chunk CRCs are 
> already stored as part of datanode metadata, and the MD5 approach is used to 
> compute an aggregate value in a distributed manner, with individual datanodes 
> computing the MD5-of-CRCs per-block in parallel, and the HDFS client 
> computing the second-level MD5.
>  
> A shortcoming of this approach which is often brought up is the fact that 
> this FileChecksum is sensitive to the internal block-size and chunk-size 
> configuration, and thus different HDFS files with different block/chunk 
> settings cannot be compared. More commonly, one might have different HDFS 
> clusters which use different block sizes, in which case any data migration 
> won't be able to use the FileChecksum for distcp's rsync functionality or for 
> verifying end-to-end data integrity (on top of low-level data integrity 
> checks applied at data transfer time).
>  
> This was also revisited in https://issues.apache.org/jira/browse/HDFS-8430 
> during the addition of checksum support for striped erasure-coded files; 
> while there was some discussion of using CRC composability, it still 
> ultimately settled on hierarchical MD5 approach, which also adds the problem 
> that checksums of basic replicated files are not comparable to striped files.
>  
> This feature proposes to add a "COMPOSITE-CRC" FileChecksum type which uses 
> CRC composition to remain completely chunk/block agnostic, and allows 
> comparison between striped vs replicated files, between different HDFS 
> instances, and possible even between HDFS and other external storage systems. 
> This feature can also be added in-place to be compatible with existing block 
> metadata, and doesn't need to change the normal path of chunk verification, 
> so is minimally invasive. This also means even large preexisting HDFS 
> deployments could adopt this feature to retroactively sync data. A detailed 
> design document can be found here: 
> https://storage.googleapis.com/dennishuo/hdfs-file-composite-crc32-v1.pdf



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13056) Expose file-level composite CRCs in HDFS which are comparable across different instances/layouts

2018-03-24 Thread Dennis Huo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412452#comment-16412452
 ] 

Dennis Huo commented on HDFS-13056:
---

Merged trunk to produce [^HDFS-13056.011.patch]

> Expose file-level composite CRCs in HDFS which are comparable across 
> different instances/layouts
> 
>
> Key: HDFS-13056
> URL: https://issues.apache.org/jira/browse/HDFS-13056
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, distcp, erasure-coding, federation, hdfs
>Affects Versions: 3.0.0
>Reporter: Dennis Huo
>Assignee: Dennis Huo
>Priority: Major
> Attachments: HDFS-13056-branch-2.8.001.patch, 
> HDFS-13056-branch-2.8.002.patch, HDFS-13056-branch-2.8.003.patch, 
> HDFS-13056-branch-2.8.004.patch, HDFS-13056-branch-2.8.005.patch, 
> HDFS-13056-branch-2.8.poc1.patch, HDFS-13056.001.patch, HDFS-13056.002.patch, 
> HDFS-13056.003.patch, HDFS-13056.003.patch, HDFS-13056.004.patch, 
> HDFS-13056.005.patch, HDFS-13056.006.patch, HDFS-13056.007.patch, 
> HDFS-13056.008.patch, HDFS-13056.009.patch, HDFS-13056.010.patch, 
> HDFS-13056.011.patch, Reference_only_zhen_PPOC_hadoop2.6.X.diff, 
> hdfs-file-composite-crc32-v1.pdf, hdfs-file-composite-crc32-v2.pdf, 
> hdfs-file-composite-crc32-v3.pdf
>
>
> FileChecksum was first introduced in 
> [https://issues-test.apache.org/jira/browse/HADOOP-3981] and ever since then 
> has remained defined as MD5-of-MD5-of-CRC, where per-512-byte chunk CRCs are 
> already stored as part of datanode metadata, and the MD5 approach is used to 
> compute an aggregate value in a distributed manner, with individual datanodes 
> computing the MD5-of-CRCs per-block in parallel, and the HDFS client 
> computing the second-level MD5.
>  
> A shortcoming of this approach which is often brought up is the fact that 
> this FileChecksum is sensitive to the internal block-size and chunk-size 
> configuration, and thus different HDFS files with different block/chunk 
> settings cannot be compared. More commonly, one might have different HDFS 
> clusters which use different block sizes, in which case any data migration 
> won't be able to use the FileChecksum for distcp's rsync functionality or for 
> verifying end-to-end data integrity (on top of low-level data integrity 
> checks applied at data transfer time).
>  
> This was also revisited in https://issues.apache.org/jira/browse/HDFS-8430 
> during the addition of checksum support for striped erasure-coded files; 
> while there was some discussion of using CRC composability, it still 
> ultimately settled on hierarchical MD5 approach, which also adds the problem 
> that checksums of basic replicated files are not comparable to striped files.
>  
> This feature proposes to add a "COMPOSITE-CRC" FileChecksum type which uses 
> CRC composition to remain completely chunk/block agnostic, and allows 
> comparison between striped vs replicated files, between different HDFS 
> instances, and possible even between HDFS and other external storage systems. 
> This feature can also be added in-place to be compatible with existing block 
> metadata, and doesn't need to change the normal path of chunk verification, 
> so is minimally invasive. This also means even large preexisting HDFS 
> deployments could adopt this feature to retroactively sync data. A detailed 
> design document can be found here: 
> https://storage.googleapis.com/dennishuo/hdfs-file-composite-crc32-v1.pdf



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13056) Expose file-level composite CRCs in HDFS which are comparable across different instances/layouts

2018-03-24 Thread Dennis Huo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412449#comment-16412449
 ] 

Dennis Huo commented on HDFS-13056:
---

Thanks for taking a look [~ste...@apache.org]! Applied your suggestions in 
[^HDFS-13056.010.patch]:

-Mark getFileChecksumWithCombineMode as LimitedPrivate
 -Add TestCopyMapperCompositeCrc extending TestCopyMapper with differentiation 
of behaviors between the checksum options in terms of what kinds of file 
layouts are supported.
 -Remove String.format from some LOG.debug statements
 -Make ReplicatedFileChecksumComputer raise PathIOExceptions
 -Switch TestCrcUtil and TestCrcComposer to use LambdaTestUtils.intercept 
instead of junit ExpectedException

There are a few places that will require followup work in LambdaTestUtils 
before switching over, namely: 1. Supporting checking more messages in the 
causal chain and/or suppressed exceptions, 2. Making it easy to check for 
multiple different string fragments in the exception text. I have some 
rudimentary parts of that in this followup: 
https://issues.apache.org/jira/browse/HDFS-13256

Re: [~xiaochen] - Removing or marking deprecated sounds good to me; I'll do 
that part in the followup Jira that also tracks adding WebHDFS support. Filed 
https://issues.apache.org/jira/browse/HDFS-13345 to track. Re: distcp, I agree 
there are some significant shortcomings in the existing behaviors; it's worse 
than always trying to overwrite when clusters have different configs, right now 
the copy does all the work and then fails on commit due to the checksum check 
after the copy has been performed, so it's wasted work. We could maybe file 
some followup work in DistCp to improve its behavior; as a user I initially 
expected its natural behavior to take into consideration whether 
FileChecksum#getAlgorithmName returns the same value for both sides before 
deciding if it's okay to compare them. I'd have expected mismatch algorithm 
names to be ignored the same way null FileChecksums are, so that syncing falls 
back to just file sizes. Instead, if the algorithm names differ right now, 
distcp tries to copy and then fails on commit. I guess we can discuss how to 
improve distcp semantics in a followup Jira.

> Expose file-level composite CRCs in HDFS which are comparable across 
> different instances/layouts
> 
>
> Key: HDFS-13056
> URL: https://issues.apache.org/jira/browse/HDFS-13056
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, distcp, erasure-coding, federation, hdfs
>Affects Versions: 3.0.0
>Reporter: Dennis Huo
>Assignee: Dennis Huo
>Priority: Major
> Attachments: HDFS-13056-branch-2.8.001.patch, 
> HDFS-13056-branch-2.8.002.patch, HDFS-13056-branch-2.8.003.patch, 
> HDFS-13056-branch-2.8.004.patch, HDFS-13056-branch-2.8.005.patch, 
> HDFS-13056-branch-2.8.poc1.patch, HDFS-13056.001.patch, HDFS-13056.002.patch, 
> HDFS-13056.003.patch, HDFS-13056.003.patch, HDFS-13056.004.patch, 
> HDFS-13056.005.patch, HDFS-13056.006.patch, HDFS-13056.007.patch, 
> HDFS-13056.008.patch, HDFS-13056.009.patch, HDFS-13056.010.patch, 
> Reference_only_zhen_PPOC_hadoop2.6.X.diff, hdfs-file-composite-crc32-v1.pdf, 
> hdfs-file-composite-crc32-v2.pdf, hdfs-file-composite-crc32-v3.pdf
>
>
> FileChecksum was first introduced in 
> [https://issues-test.apache.org/jira/browse/HADOOP-3981] and ever since then 
> has remained defined as MD5-of-MD5-of-CRC, where per-512-byte chunk CRCs are 
> already stored as part of datanode metadata, and the MD5 approach is used to 
> compute an aggregate value in a distributed manner, with individual datanodes 
> computing the MD5-of-CRCs per-block in parallel, and the HDFS client 
> computing the second-level MD5.
>  
> A shortcoming of this approach which is often brought up is the fact that 
> this FileChecksum is sensitive to the internal block-size and chunk-size 
> configuration, and thus different HDFS files with different block/chunk 
> settings cannot be compared. More commonly, one might have different HDFS 
> clusters which use different block sizes, in which case any data migration 
> won't be able to use the FileChecksum for distcp's rsync functionality or for 
> verifying end-to-end data integrity (on top of low-level data integrity 
> checks applied at data transfer time).
>  
> This was also revisited in https://issues.apache.org/jira/browse/HDFS-8430 
> during the addition of checksum support for striped erasure-coded files; 
> while there was some discussion of using CRC composability, it still 
> ultimately settled on hierarchical MD5 approach, which also adds the problem 
> that checksums of basic replicated files are not comparable to striped files.
>  
> This feature proposes to 

[jira] [Created] (HDFS-13345) Add support for new COMPOSITE_CRC in WebHDFS

2018-03-24 Thread Dennis Huo (JIRA)
Dennis Huo created HDFS-13345:
-

 Summary: Add support for new COMPOSITE_CRC in WebHDFS
 Key: HDFS-13345
 URL: https://issues.apache.org/jira/browse/HDFS-13345
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Dennis Huo


As a followup to https://issues.apache.org/jira/browse/HDFS-13056, plumbing 
through support for new FileChecksum types (in particular, COMPOSITE_CRC) in 
WebHDFS will help remove dependencies on the old DFSClient.getFileChecksum 
method that hard-codes MD5MD5CRC32FileChecksum as its return type. Once this is 
done, this work should also include marking the old getFileChecksum method as 
deprecated or removing it entirely from DFSClient.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13056) Expose file-level composite CRCs in HDFS which are comparable across different instances/layouts

2018-03-24 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412441#comment-16412441
 ] 

genericqa commented on HDFS-13056:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  6s{color} 
| {color:red} HDFS-13056 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-13056 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12916033/HDFS-13056.010.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/23657/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Expose file-level composite CRCs in HDFS which are comparable across 
> different instances/layouts
> 
>
> Key: HDFS-13056
> URL: https://issues.apache.org/jira/browse/HDFS-13056
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, distcp, erasure-coding, federation, hdfs
>Affects Versions: 3.0.0
>Reporter: Dennis Huo
>Assignee: Dennis Huo
>Priority: Major
> Attachments: HDFS-13056-branch-2.8.001.patch, 
> HDFS-13056-branch-2.8.002.patch, HDFS-13056-branch-2.8.003.patch, 
> HDFS-13056-branch-2.8.004.patch, HDFS-13056-branch-2.8.005.patch, 
> HDFS-13056-branch-2.8.poc1.patch, HDFS-13056.001.patch, HDFS-13056.002.patch, 
> HDFS-13056.003.patch, HDFS-13056.003.patch, HDFS-13056.004.patch, 
> HDFS-13056.005.patch, HDFS-13056.006.patch, HDFS-13056.007.patch, 
> HDFS-13056.008.patch, HDFS-13056.009.patch, HDFS-13056.010.patch, 
> Reference_only_zhen_PPOC_hadoop2.6.X.diff, hdfs-file-composite-crc32-v1.pdf, 
> hdfs-file-composite-crc32-v2.pdf, hdfs-file-composite-crc32-v3.pdf
>
>
> FileChecksum was first introduced in 
> [https://issues-test.apache.org/jira/browse/HADOOP-3981] and ever since then 
> has remained defined as MD5-of-MD5-of-CRC, where per-512-byte chunk CRCs are 
> already stored as part of datanode metadata, and the MD5 approach is used to 
> compute an aggregate value in a distributed manner, with individual datanodes 
> computing the MD5-of-CRCs per-block in parallel, and the HDFS client 
> computing the second-level MD5.
>  
> A shortcoming of this approach which is often brought up is the fact that 
> this FileChecksum is sensitive to the internal block-size and chunk-size 
> configuration, and thus different HDFS files with different block/chunk 
> settings cannot be compared. More commonly, one might have different HDFS 
> clusters which use different block sizes, in which case any data migration 
> won't be able to use the FileChecksum for distcp's rsync functionality or for 
> verifying end-to-end data integrity (on top of low-level data integrity 
> checks applied at data transfer time).
>  
> This was also revisited in https://issues.apache.org/jira/browse/HDFS-8430 
> during the addition of checksum support for striped erasure-coded files; 
> while there was some discussion of using CRC composability, it still 
> ultimately settled on hierarchical MD5 approach, which also adds the problem 
> that checksums of basic replicated files are not comparable to striped files.
>  
> This feature proposes to add a "COMPOSITE-CRC" FileChecksum type which uses 
> CRC composition to remain completely chunk/block agnostic, and allows 
> comparison between striped vs replicated files, between different HDFS 
> instances, and possible even between HDFS and other external storage systems. 
> This feature can also be added in-place to be compatible with existing block 
> metadata, and doesn't need to change the normal path of chunk verification, 
> so is minimally invasive. This also means even large preexisting HDFS 
> deployments could adopt this feature to retroactively sync data. A detailed 
> design document can be found here: 
> https://storage.googleapis.com/dennishuo/hdfs-file-composite-crc32-v1.pdf



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org