[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111547#comment-16111547 ] Chris Douglas commented on HDFS-6984: - Thanks, [~andrew.wang]! I'll commit this shortly. I filed HDFS-12250 to clean up some of the deprecation warnings. bq. Is the release note on this JIRA up to date? There are also misc improvements like removing the acl/enc/ec bits from the WebHDFS AclStatus that we could call out. Sure, I'll update it. bq. Based on code inspection it looks like the wire-format changes are compatible, but it'd be good if you could comment on manual testing As we discussed earlier, including these bits in the AclStatus seemed accidental. I can try a few operations with an old client to make sure there are no errors. I don't have access to environments where these APIs would be stressed. > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.004.patch, HDFS-6984.005.patch, > HDFS-6984.006.patch, HDFS-6984.007.patch, HDFS-6984.008.patch, > HDFS-6984.009.patch, HDFS-6984.010.patch, HDFS-6984.011.patch, > HDFS-6984.012.patch, HDFS-6984.013.patch, HDFS-6984.014.patch, > HDFS-6984.015.patch, HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16110033#comment-16110033 ] Andrew Wang commented on HDFS-6984: --- +1 LGTM, thanks for working on this Chris. I applied this to 3.0.0-alpha4 and ran it through our internal integration test suite, and it passed. Great work! Some final checks: * Is the release note on this JIRA up to date? There are also misc improvements like removing the acl/enc/ec bits from the WebHDFS AclStatus that we could call out. * Based on code inspection it looks like the wire-format changes are compatible, but it'd be good if you could comment on manual testing. WebHDFS and HttpFS in particular need to be both forward compatible as well as backwards compatible to help with upgrade, and it'd be real nice to support this for the PB protocol too since this is such a fundamental primitive. > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.004.patch, HDFS-6984.005.patch, > HDFS-6984.006.patch, HDFS-6984.007.patch, HDFS-6984.008.patch, > HDFS-6984.009.patch, HDFS-6984.010.patch, HDFS-6984.011.patch, > HDFS-6984.012.patch, HDFS-6984.013.patch, HDFS-6984.014.patch, > HDFS-6984.015.patch, HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16097205#comment-16097205 ] Hadoop QA commented on HDFS-6984: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 10 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 44s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 15m 6s{color} | {color:red} root in trunk failed. {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 24s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 28s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in trunk has 2 extant Findbugs warnings. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 47s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 22s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 11m 20s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 11m 20s{color} | {color:red} root generated 80 new + 1313 unchanged - 0 fixed = 1393 total (was 1313) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 4s{color} | {color:orange} root: The patch generated 12 new + 824 unchanged - 22 fixed = 836 total (was 846) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 4s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 38s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 51s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 15s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs generated 0 new + 9 unchanged - 1 fixed = 9 total (was 10) {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 41s{color} | {color:green} hadoop-hdfs-httpfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 39s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 34s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 30s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 1s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 37s{color} | {color:green} hadoop-hdfs-httpfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:g
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16090486#comment-16090486 ] Chris Douglas commented on HDFS-6984: - [~andrew.wang], would you mind reviewing this, now that 3.0.0-alpha4 is released? This is blocking HDFS-7878, which the last blocker for HDFS-9806. > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.004.patch, HDFS-6984.005.patch, > HDFS-6984.006.patch, HDFS-6984.007.patch, HDFS-6984.008.patch, > HDFS-6984.009.patch, HDFS-6984.010.patch, HDFS-6984.011.patch, > HDFS-6984.012.patch, HDFS-6984.013.patch, HDFS-6984.014.patch, > HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064056#comment-16064056 ] Hadoop QA commented on HDFS-6984: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m 43s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 9 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 41s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 26s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 10m 38s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 10m 38s{color} | {color:red} root generated 79 new + 823 unchanged - 0 fixed = 902 total (was 823) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 3s{color} | {color:orange} root: The patch generated 12 new + 783 unchanged - 22 fixed = 795 total (was 805) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 4s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 41s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 21s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 21s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 36s{color} | {color:red} hadoop-hdfs-httpfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}167m 4s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | HDFS-6984 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12874572/HDFS-6984.014.patch | | Optional Tests | asflicense xml compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux 508b980bd7f1 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a9d3412 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | javac | https://builds.apache.org/jo
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16030394#comment-16030394 ] Chris Douglas commented on HDFS-6984: - [~andrew.wang] do you have cycles to review this? > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.004.patch, HDFS-6984.005.patch, > HDFS-6984.006.patch, HDFS-6984.007.patch, HDFS-6984.008.patch, > HDFS-6984.009.patch, HDFS-6984.010.patch, HDFS-6984.011.patch, > HDFS-6984.012.patch, HDFS-6984.013.patch, HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16027065#comment-16027065 ] Chris Douglas commented on HDFS-6984: - This is ready for review. None of the unit test failures are related, the checkstyle warnings are for the existing fields/cstr, and the javac warnings come from classes/methods deprecated in this patch. > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.004.patch, HDFS-6984.005.patch, > HDFS-6984.006.patch, HDFS-6984.007.patch, HDFS-6984.008.patch, > HDFS-6984.009.patch, HDFS-6984.010.patch, HDFS-6984.011.patch, > HDFS-6984.012.patch, HDFS-6984.013.patch, HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16026855#comment-16026855 ] Hadoop QA commented on HDFS-6984: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 31s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 9 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 33s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 8s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 7s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 21s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 38s{color} | {color:red} hadoop-common-project/hadoop-common in trunk has 19 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 25s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 13m 40s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 13m 40s{color} | {color:red} root generated 79 new + 787 unchanged - 0 fixed = 866 total (was 787) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 4s{color} | {color:orange} root: The patch generated 12 new + 792 unchanged - 22 fixed = 804 total (was 814) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 4s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 55s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 25s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 65m 47s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 39s{color} | {color:green} hadoop-hdfs-httpfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}161m 12s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestMaintenanceState | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | HDFS-6984 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12870098/HDFS-6984.013.patch | | Optional Tests | asflicense xml compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025822#comment-16025822 ] Hadoop QA commented on HDFS-6984: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 9 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 26s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 22s{color} | {color:red} hadoop-common-project/hadoop-common in trunk has 19 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 0s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 12m 15s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 12m 15s{color} | {color:red} root generated 80 new + 787 unchanged - 0 fixed = 867 total (was 787) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 41s{color} | {color:orange} root: The patch generated 21 new + 789 unchanged - 22 fixed = 810 total (was 811) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 3s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 42s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 19s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 69m 35s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 28s{color} | {color:green} hadoop-hdfs-httpfs in the patch passed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 31s{color} | {color:red} The patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}151m 0s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.TestDFSStripedInputStreamWithRandomECPolicy | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetu
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015189#comment-16015189 ] Chris Douglas commented on HDFS-6984: - The {{TestINodeAttributeProvider}} failure is reproducible, but the other test failures don't appear to be related. The javac warnings are deprecation warnings introduced by the patch. > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.004.patch, HDFS-6984.005.patch, > HDFS-6984.006.patch, HDFS-6984.007.patch, HDFS-6984.008.patch, > HDFS-6984.009.patch, HDFS-6984.010.patch, HDFS-6984.011.patch, > HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015144#comment-16015144 ] Hadoop QA commented on HDFS-6984: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 8 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 29s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 28s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 17s{color} | {color:red} hadoop-common-project/hadoop-common in trunk has 19 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 25s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 11m 56s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 11m 56s{color} | {color:red} root generated 80 new + 787 unchanged - 0 fixed = 867 total (was 787) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 0s{color} | {color:orange} root: The patch generated 19 new + 796 unchanged - 22 fixed = 815 total (was 818) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 3s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 5s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 12s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 24s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 19s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 35s{color} | {color:green} hadoop-hdfs-httpfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 29s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}149m 42s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.security.TestRaceWhenRelogin | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 | | | hadoop.hdfs.server.namenode.TestINodeAttributeProvider | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16002385#comment-16002385 ] Hadoop QA commented on HDFS-6984: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 8 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 31s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 21s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 26s{color} | {color:red} hadoop-common-project/hadoop-common in trunk has 19 extant Findbugs warnings. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 29s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in trunk has 2 extant Findbugs warnings. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 46s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 22s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 15m 18s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 15m 18s{color} | {color:red} root generated 80 new + 787 unchanged - 0 fixed = 867 total (was 787) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 5s{color} | {color:orange} root: The patch generated 23 new + 795 unchanged - 25 fixed = 818 total (was 820) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 4s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 59s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 25s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 46s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 23s{color} | {color:green} hadoop-hdfs-httpfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 37s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}165m 27s{color} | {color:black} {color} | \\ \\ || Reaso
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16002129#comment-16002129 ] Andrew Wang commented on HDFS-6984: --- Thanks again for the continued work here Chris, I want to get this in as badly as you. I spent a little more time looking at the patch, and I'm comfortable with {{HdfsFileStatus extends FileStatus}} once we make sure symlinks are treated opaquely. Symlinks: * I saw three calls to HdfsFileStatus#getSymlink: FSNamesystem#logAuditEvent, NNRpcServer#getLinkTarget, JsonUtil#toJsonMap. I think the latter two should be calling {{getSymlinkBytes}} for opacity. * We should add a warning to HdfsFileStatus#getSymlink about handling symlink targets opaquely, for future coders. The rest: {quote} Here is the relevant line. It's consistent with the documentation- that is the FsPermission[Extension] for that path- but including the flag bits in the AclStatus permission field looks more like an accident of the implementation, not part of its intended API. {quote} Good catch. I see you removed this already in the 009 patch, agree that it looks like an accident. I'm convinced by the rest. Documenting the other pending issues here for completeness: * Any additional checking for symlink target opacity and conversion to a Path. I only checked usages when it's a {{HdfsFileStatus}}, not usages of the base {{FileStatus}} method. * Snapshots potentially not working with EC or encryption? If so, this might be more broadly an xattrs+snapshots bug. * Backwards compatibility by still setting the permission bits for old clients. > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.004.patch, HDFS-6984.005.patch, > HDFS-6984.006.patch, HDFS-6984.007.patch, HDFS-6984.008.patch, > HDFS-6984.009.patch, HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15997143#comment-15997143 ] Chris Douglas commented on HDFS-6984: - Thanks for taking a look, Andrew. bq. Looks like this latest patch has grown some new functionality. That's not intended. It includes [~ste...@apache.org]'s request that the user-facing types be {{Serializable}}, and your request to move the flags out of {{FsPermissionExtension}}, deprecating Writable APIs in/around FileStatus. Versions before v009 tried to avoid a set of flags for features in HdfsFileStatus and FileStatus, but ACLs/encryption rely on the FsPermissionExtension bits, not the PB record. I tried variants that used a placeholder, "remote" stub in the PB field that's sufficient to fill out the FileStatus API, but there are too many required fields in the PB messages to make that efficient or clean. ACLs, encryption, and EC are all handled slightly differently. Instead of adding to client impl complexity, v009 gives up and adds a set of flags. By new functionality, are you referring to anything beyond making HdfsFileStatus a subtype of FileStatus? bq. I'd like to keep setting the bits server-side. There's no guarantee whether 2.9.0 or 3.0.0 will GA first, and if we have the option of avoiding incompatibility, I'd like to take it. Agreed. I hope we eventually remove this, or at least stop adding new flags to FsPermission. bq. Could you give a pointer to this? These bits are new to the JSON code, so it'd be good to catch any bugs quickly. [Here|https://github.com/apache/hadoop/blob/81092b1/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java#L373] is the relevant line. It's consistent with the documentation- that _is_ the FsPermission\[Extension\] for that path- but including the flag bits in the AclStatus permission field looks more like an accident of the implementation, not part of its intended API. bq. I'm worried about making HdfsFileStatus extend FileStatus, particularly the link target. HdfsFileStatus is used on the NameNode, and there is potentially some mangling from turning the link target String into a Path. The target is supposed to be an opaque value within the filesystem, left to be interpreted by the client. For this particular case, shouldn't the caller be using {{HdfsFileStatus::getSymlinkInBytes}} in contexts where it's opaque? I share your concern, but extending FileStatus also required touching code that didn't expect (spurious) IOExceptions. In those and other existing contexts, I didn't find any instances where the String wasn't converted to a Path. Converting the byte[] to a String may also mangle it somewhat. To treat it as opaque for the patch, I'll change the serialization to use the {{getSymlinkInBytes}} conversion, rather than {{Path::toString}}. I'll also go through its uses, and try to find all the cases that don't immediately convert the String to a Path. bq. How bad is it to split this change out? I've split this twice already, to fix concerns over DistCp and for {{Serializable}}. We can change the scope of the JIRA, but (IMHO) these changes are related, and implied by previous review requests. bq. Wasn't clear what we gain by having HdfsFileStatus extend FileStatus The case [~sershe] raised for HDFS-7878 involved passing a handle across process boundaries. If a PathHandle is required to store sufficient state to open the file, then it needs to copy (and serialize) data from HdfsFileStatus that, in most cases, will not be used. We can construct this on-demand, if the HdfsFileStatus is kept intact. There's also the minor inefficiency of creating a new FileStatus instance and copying fields, but 1) that's often incurred anyway because the Located\* types are irreconcilable and 2) in the noise. FileStatus is exposing attributes that (AFAIK) only apply to HDFS (e.g., EC, encryption); the types are 1:1 mappings right now. If HDFS wants to pass additional information, or encode it differently, then it can return a different data type (perhaps with this as a field). As it's used inside the NN, it lazily binds some fields, but I don't know of a use that's inconsistent with the POD type. I wish FileStatus were an interface and not a concrete class, but that ship has sailed. If you're really uncomfortable with the change, then I can try to roll it back. bq. HttpsFileSystem: Please do not use a wildcard import Recently switched to Intellij; not intended. Will fix my settings. > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCab
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15995986#comment-15995986 ] Andrew Wang commented on HDFS-6984: --- Thanks for the rev Chris. Looks like this latest patch has grown some new functionality. bq. For backwards compatibility with 2.x clients, we can either keep setting the permission bits server side or backport the flags to branch-2 for the "2.x version compatible with 3.x clusters" release. Any opinions on this? It's simplest to keep setting the bits server-side, and deleting that code when we remove FsPermissionExtension. I'd like to keep setting the bits server-side. There's no guarantee whether 2.9.0 or 3.0.0 will GA first, and if we have the option of avoiding incompatibility, I'd like to take it. bq. I also had a question about snapshots. I'm not certain whether EC and encryption zones work with them. The current patch doesn't handle ACLs, but I wanted to get a quick check on that before digging through the implementation. Do you have a unit test for this behavior? Bit surprised that this doesn't work. If it's not a regression, we can handle this in a different JIRA. bq. The JSON code tried to preserve the acl/crypt/ec bits of the FsPermission on AclStatus. This seems incorrect, but I can try to find a solution if it is meaningful. Could you give a pointer to this? These bits are new to the JSON code, so it'd be good to catch any bugs quickly. A high-level comment: I'm worried about making HdfsFileStatus extend FileStatus, particularly the link target. HdfsFileStatus is used on the NameNode, and there is potentially some mangling from turning the link target String into a Path. The target is supposed to be an opaque value within the filesystem, left to be interpreted by the client. How bad is it to split this change out? Wasn't clear what we gain by having HdfsFileStatus extend FileStatus. Some nits: * HttpsFileSystem: Please do not use a wildcard import * Double ";" in PBHelperClient: {code} import org.apache.hadoop.hdfs.protocol.FsPermissionExtension;; {code} * Extra import in SnapshotInfo > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.004.patch, HDFS-6984.005.patch, > HDFS-6984.006.patch, HDFS-6984.007.patch, HDFS-6984.008.patch, > HDFS-6984.009.patch, HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15993979#comment-15993979 ] Hadoop QA commented on HDFS-6984: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 37s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 23s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 23s{color} | {color:red} hadoop-common-project/hadoop-common in trunk has 19 extant Findbugs warnings. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 28s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in trunk has 2 extant Findbugs warnings. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 44s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 23s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 17m 9s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 17m 9s{color} | {color:red} root generated 79 new + 788 unchanged - 0 fixed = 867 total (was 788) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 4m 0s{color} | {color:orange} root: The patch generated 21 new + 770 unchanged - 25 fixed = 791 total (was 795) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 3s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 9m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 37s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 19s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 69m 33s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 3m 34s{color} | {color:red} hadoop-hdfs-httpfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 40s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}173m 29s{color} | {color:black} {color} | \\ \\ || Reason || Tes
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992111#comment-15992111 ] Hadoop QA commented on HDFS-6984: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 12m 17s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 24s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 23s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 27s{color} | {color:red} hadoop-common-project/hadoop-common in trunk has 19 extant Findbugs warnings. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 28s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in trunk has 2 extant Findbugs warnings. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 47s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 22s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 15m 2s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 15m 2s{color} | {color:red} root generated 71 new + 788 unchanged - 0 fixed = 859 total (was 788) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 12s{color} | {color:orange} root: The patch generated 21 new + 762 unchanged - 24 fixed = 783 total (was 786) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 4s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 52s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client generated 1 new + 2 unchanged - 0 fixed = 3 total (was 2) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 17s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 39s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 75m 13s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 3m 25s{color} | {color:red} hadoop-hdfs-httpfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 36s{color} | {color:red} The patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}185m 15s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | modul
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15990141#comment-15990141 ] Hadoop QA commented on HDFS-6984: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 34s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 17s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 38s{color} | {color:red} hadoop-common-project/hadoop-common in trunk has 19 extant Findbugs warnings. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 39s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in trunk has 2 extant Findbugs warnings. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 57s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 30s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 15m 8s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 15m 8s{color} | {color:red} root generated 71 new + 788 unchanged - 0 fixed = 859 total (was 788) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 6s{color} | {color:orange} root: The patch generated 14 new + 764 unchanged - 24 fixed = 778 total (was 788) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 37s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client generated 3 new + 2 unchanged - 0 fixed = 5 total (was 2) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 19s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 20s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 46s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 3m 25s{color} | {color:red} hadoop-hdfs-httpfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}161m 11s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | F
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15989778#comment-15989778 ] Hadoop QA commented on HDFS-6984: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 24s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 18m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 22s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 24s{color} | {color:red} hadoop-common-project/hadoop-common in trunk has 19 extant Findbugs warnings. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 29s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in trunk has 2 extant Findbugs warnings. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 46s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 25s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 14m 7s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 14m 7s{color} | {color:red} root generated 68 new + 788 unchanged - 0 fixed = 856 total (was 788) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 4s{color} | {color:orange} root: The patch generated 19 new + 772 unchanged - 24 fixed = 791 total (was 796) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch 1 line(s) with tabs. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 37s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client generated 3 new + 2 unchanged - 0 fixed = 5 total (was 2) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 25s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 6m 54s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 19s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 12s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 3m 24s{color} | {color:red} hadoop-hdfs-httpfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}160m 9s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hado
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15989654#comment-15989654 ] Chris Douglas commented on HDFS-6984: - v006 of the patch does the following: * Moves all the protobuf serialization to library code * Deprecates Writable APIs for both {{FileStatus}} and {{FsPermission}} * Where possible, removes {{FsPermissionExtension}}. ** Too many unit tests rely on this for HDFS, and trying to change those caused even more bloat to the patch. v006 moves its instantiation to a private {{convert}} method. If this is OK, I'll file a followup JIRA to clear up the unit tests. ** I don't know how many downstream applications rely on the {{hasAcl}}, {{isEncrypted}}, or {{isErasureCoded}} methods on {{FsPermission}}, but these are deprecated (rather than removed) in the patch. ** Introduced an intermediate, private {{FlaggedFileStatus}} class to preserve the attributes formerly mixed in with the permission bits. This could have been in {{LocatedFileStatus}}, but downstream clients may check {{instanceof LocatedFileStatus}} and assume the null locations are correct. ** Still need to deprecate {{FsPermission#toExtendedShort}}, will post a followup patch with any other checkstyle/findbugs fixes to v006 * Make {{HdfsFileStatus}} extend {{FileStatus}} ** After HADOOP-13895, the {{Serializable}} API bled further into the API ** {{getSymlink}} annoyingly throws an {{IOException}} if {{!isSymlink()}}. Overriding it in {{HdfsFileStatus}} required changing the return type, and to comply with the contract tests there are some superfluous try/catch statements in e.g., the JSON utils ** The JSON code tried to preserve the acl/crypt/ec bits of the {{FsPermission}} on {{AclStatus}}. This seems incorrect, but I can try to find a solution if it is meaningful. > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.004.patch, HDFS-6984.005.patch, > HDFS-6984.006.patch, HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15929203#comment-15929203 ] Andrew Wang commented on HDFS-6984: --- Hi Chris, replies inline, lots of agreement with your direction: bq. Should we also try to remove Writable from FsPermission? We could deprecate the Writable API instead of removing it from these classes, in case projects/users depend on it downstream... the serialization/conversion can still live in a library, but be called from the deprecated methods. SGTM bq. Since both FsPermission#getAclBit and FsPermission#getEncryptedBit/FileStatus#isEncrypted are user-facing, should these also be part of FSProtos? ...While we're at it, should we also change HdfsFileStatusProto to stop packing the acl/encryption bits among the permission bits? Also sounds great, I'd love a bitfield rather than stuffing these in FsPermission. bq. is there a reason encryption info is included in HdfsFileStatus, but ACLs are not? Would it be inappropriate to add a FileSystem#getAclStatus(FileStatus), in case an implementation returns this information in its response (potentially avoiding the 2-RPC overhead)? We need the FEInfo to read or write a file, and the ACLs are rarely needed by the client since they're enforced server-side. I don't think there are any plans to include ACLs in HdfsFileStatus because of the bloat, but your suggestion makes sense if there is such an FS implementation. Good follow-on. > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.004.patch, HDFS-6984.005.patch, > HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15928873#comment-15928873 ] Chris Douglas commented on HDFS-6984: - Thanks for taking a look, [~andrew.wang]. Sorry, I had forgotten your earlier comment. Should we also try to remove {{Writable}} from {{FsPermission}}? We could deprecate the {{Writable}} API instead of removing it from these classes, in case projects/users depend on it downstream... the serialization/conversion can still live in a library, but be called from the deprecated methods. Since both {{FsPermission#getAclBit}} and {{FsPermission#getEncryptedBit}}/{{FileStatus#isEncrypted}} are user-facing, should these also be part of FSProtos? The payload for {{FileEncryptionInfoProto}} is likely HDFS-specific, or I'd suggest populating these using the presence of the fields. While we're at it, should we also change {{HdfsFileStatusProto}} to stop packing the acl/encryption bits among the permission bits? Kind of a sidebar: is there a reason encryption info is included in HdfsFileStatus, but ACLs are not? Would it be inappropriate to add a {{FileSystem#getAclStatus(FileStatus)}}, in case an implementation returns this information in its response (potentially avoiding the 2-RPC overhead)? I'll add some additional tests. > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.004.patch, HDFS-6984.005.patch, > HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15927021#comment-15927021 ] Andrew Wang commented on HDFS-6984: --- Hi Chris, thanks for revving, a few review comments: I was wondering if you saw my comment way back about the ACL bit / encrypted bit / etc: bq. The takeaways for me are that we should make a separate bitfield for these flags. If we want to preserve cross-serialization, we'd need to also add this field to HdfsFileStatus, and we'd always have to be careful with field numbers. Right now, it looks like if you pass in an HdfsFileStatus with these bits set, they're dropped. It'd be good to unit test these getters. If you can think up a unit test to detect the addition of new bits (e.g. isErasureCoded), that'd also be great. Since a lot of fields are optional in the PB, should we also test with these optional fields unset? I'm wondering if the resulting FileStatus is filled in with reasonable defaults. Generally beefing up test coverage would be good too, since it seems like we lost some of the basic "try writing and reading some different statuses" test from TestFileStatus. > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.004.patch, HDFS-6984.005.patch, > HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15925714#comment-15925714 ] Hadoop QA commented on HDFS-6984: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 18s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 7m 27s{color} | {color:red} root in trunk failed. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 8s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 54s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 32s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 7m 14s{color} | {color:red} root in the patch failed. {color} | | {color:red}-1{color} | {color:red} cc {color} | {color:red} 7m 14s{color} | {color:red} root in the patch failed. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 7m 14s{color} | {color:red} root in the patch failed. {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 8s{color} | {color:orange} root: The patch generated 6 new + 72 unchanged - 29 fixed = 78 total (was 101) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 3s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 30s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 69m 21s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 46s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}132m 51s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | HDFS-6984 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12858834/HDFS-6984.005.patch | | Optional Tests | asflicense xml compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux 1513df8ca62c 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / cc1292e | | Default Java | 1.8.0_121 | | compile | https://builds.apache.org/job/PreCommit-HDFS-Build/18726/artifact/patchprocess/branch-compile-root.txt | | findbugs | v3.0.0
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15925574#comment-15925574 ] Chris Douglas commented on HDFS-6984: - Fixed some checkstyle warnings. Test failures are due to YARN-6336 > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.004.patch, HDFS-6984.005.patch, > HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15925334#comment-15925334 ] Hadoop QA commented on HDFS-6984: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 30s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 35s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 8m 34s{color} | {color:red} root in trunk failed. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 53s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 42s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 7m 49s{color} | {color:red} root in the patch failed. {color} | | {color:red}-1{color} | {color:red} cc {color} | {color:red} 7m 49s{color} | {color:red} root in the patch failed. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 7m 49s{color} | {color:red} root in the patch failed. {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 18s{color} | {color:orange} root: The patch generated 48 new + 72 unchanged - 29 fixed = 120 total (was 101) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 3s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 41s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 53s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 50s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}145m 24s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.checker.TestThrottledAsyncChecker | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | HDFS-6984 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12857384/HDFS-6984.004.patch | | Optional Tests | asflicense xml compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux e7102ca39923 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / fa67a96 | | Default Java | 1.8.0_121 | | compile | https://builds.apache.org/job/PreCommit-HDFS-Build/18
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15849075#comment-15849075 ] Andrew Wang commented on HDFS-6984: --- I think this is a "user-level file format" and covered by: https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/Compatibility.html#File_formats__Metadata {quote} Non-forward-compatible user-file format changes are restricted to major releases. When user-file formats change, new releases are expected to read existing formats, but may write data in formats incompatible with prior releases. Also, the community shall prefer to create a new format that programs must opt in to instead of making incompatible changes to existing formats. {quote} I think that's basically what [~ste...@apache.org] said above. [~chris.douglas] thanks for bearing with, let's move the serde into a library and get this committed. > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15848285#comment-15848285 ] Steve Loughran commented on HDFS-6984: -- What is there in the hadoop compat guidelines about serialize/writable compat. Wire compatibility is covered to the extent of protobuf, but not much else. to be ruthless, I'd propose something like "protobuf-serialized data: compatible across minor versions; may break over major", java serializable: less/none. where java serialization is going to cause problems if versions change is in things like spark checkpointing, which uses java or kryo ser/deser. They'll have to expect failures on point updates. I don't see spark itself stating anything there either > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15847736#comment-15847736 ] Chris Douglas commented on HDFS-6984: - bq. serialization that omits fields for efficiency isn't supportable, and is why I'd prefer a PathHandle. With this constraint, we either expand the API surface- adding functions that accept PathHandle instances, rather than FileStatus- or all fields are required in FileStatus objects. I see your point w.r.t. confused semantics for partial FileStatus objects. In completeness, we could change the return type to {{Optional}} for FileStatus fields, but that's a nonstarter from a compatibility perspective. We can pick this up in HDFS-7878. bq. I don't feel that strongly about removing Writable, but the nowritable patch is simple and to the point Well... sure. Deleting code is direct. :) I don't mind moving the protobuf serialization to a utility library. Would that move this forward? bq. Cross-serialization doesn't have an immediate usecase. HDFS-7878 IMO needs a serializable PathHandle, not a full FileStatus. The goal was not to make cross-serialization a requirement, but to make future efficiency possible. > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15765957#comment-15765957 ] Andrew Wang commented on HDFS-6984: --- I think "the serialization can omit fields" is an abstraction breakage. If we're going to have an open API that takes a FileStatus, the implementation should be allowed to use all the fields expected to be present for a FileStatus of that FileSystem. This means serialization that omits fields for efficiency isn't supportable, and is why I'd prefer a PathHandle. Regarding TOCTOU, this would be addressed by including the PathHandle in the returned FileStatus, right? This discussion is an aside though to the matter at hand, and we should continue on HDFS-7878. I think we already agreed above that as long as we can add new fields to FileStatus, we can satisfy the basic requirements of HDFS-7878. bq. I didn't mean to suggest that HdfsFileStatus should be a public API (with all the restrictions on evolving it) If we don't intend to make HdfsFileStatus public, what's the point of cross-serialization? We also need to qualify the path for an HdfsFileStatus to become a FileStatus, so I don't know how zero-copy it can be anyway. I don't feel *that* strongly about removing Writable, but the nowritable patch is simple and to the point, and I still haven't grasped the benefit of keeping FileStatus Writable, even via PB. We don't think there are many (any?) apps out there using the Writable interface. Cross-serialization doesn't have an immediate usecase. HDFS-7878 IMO needs a serializable PathHandle, not a full FileStatus. > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15755358#comment-15755358 ] Chris Douglas commented on HDFS-6984: - bq. So for the Hive usecase, they would have to pass around a full serialized FileStatus even though open() only needs the PathHandle field? This API seems fine for same-process usage (HDFS-9806?) but inefficient for cross-process. I think UNIX users are also used to the idea of an inode id separate from a file status. Sure, but this wasn't what I was trying to get at. On its efficiency: # The overhead at the NN is significant, since it's often the cluster bottleneck. If we're returning information that's immediately discarded, the client-side inefficiency of not-immediately-discarding is not irrelevant, but it's not significant. For a few hundred {{FileStatus}} records, this overhead is less than 1MB, and it's often available for GC (most {{FileStatus}} objects are short-lived). # For cases where we want to manage thousands or millions of FileStatus instances, this overhead may become significant. But we can compress repeated data, and most collections of {{FileStatus}} objects are mostly repeated data. Further, most of these fields are optional, and if an application wants to transfer only necessary fields (e.g., the Path and PathHandle, omitting everything else), that's fine. # I share your caution w.r.t. the API surface, which is why HDFS-7878 avoids adding lots of new calls accepting {{PathHandle}} objects. Users of {{FileSystem}} almost always intend to refer to the entity returned by the last query, not whatever happens to exist at that path now. In exchange for non-optimal record size, the API gains some convenient symmetry (i.e., get a FileStatus, use a FileStatus) that at least makes it possible to avoid the TOCTOU races that are uncommon, but annoying. The serialization can omit fields. We can use generic container formats that understand PB to compress collections of {{FileStatus}} objects. We can even use a more generic type ({{FileStatus}}) if the specific type follows some PB conventions. Considering the cost of these, the distance from "optimal" seems well within the ambient noise. bq. I still lean toward removing Writable altogether, since it reduces our API surface. Similarly, I'd rather not open up HdfsFileStatus as a public API (without a concrete usecase) since it expands our API surface. Again, I'm mostly ambivalent about {{Writable}}. [~ste...@apache.org]? Preserving the automatic serialization that some user programs may rely on... we can try removing it in 3.x-alpha/beta, and see if anyone complains. If I moved this to a library, would that move this forward? I didn't mean to suggest that {{HdfsFileStatus}} should be a public API (with all the restrictions on evolving it). bq. The cross-serialization is also fragile since we need to be careful not to reuse field numbers across two structures, and I've seen numbering mistakes made before even for normal PB changes. Granted, but a hole in the PB toolchain doesn't mean we shouldn't use the feature. We can add more comments to the .proto. Reading through HDFS-6326, perhaps making {{FsPermission}} a protobuf record would help. > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@ha
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15752895#comment-15752895 ] Andrew Wang commented on HDFS-6984: --- bq. The current proposal in HDFS-7878 offers open(FileStatus) instead of open(PathHandle) to avoid adding new functions to FileSystem for listing, deleting, renaming, etc. against a PathHandle... So for the Hive usecase, they would have to pass around a full serialized FileStatus even though open() only needs the PathHandle field? This API seems fine for same-process usage (HDFS-9806?) but inefficient for cross-process. I think UNIX users are also used to the idea of an inode id separate from a file status. I still lean toward removing Writable altogether, since it reduces our API surface. Similarly, I'd rather not open up {{HdfsFileStatus}} as a public API (without a concrete usecase) since it expands our API surface. The cross-serialization is also fragile since we need to be careful not to reuse field numbers across two structures, and I've seen numbering mistakes made before even for normal PB changes. > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15746989#comment-15746989 ] Chris Douglas commented on HDFS-6984: - bq. This means we need to be able to add new fields to FileStatus, and that PathHandle should be an opaque PB-encoded blob. It doesn't require cross-serialization between FileStatus and HdfsFileStatus, and we don't necessarily need FileStatus to be serializable either. Yes, I agree. To add new fields to {{FileStatus}}, we were changing its serialization in 3.x to PB (since updating {{Writable}} is problematic). If we're going to use PB, then arranging its fields so it _could_ be compatible with {{HdfsFileStatus}} was just anticipating future changes; you're right, cross-serialization is not a requirement. By arranging the type this way, the {{bytes}} field is accessible deserialized into a vanilla {{FileStatus}}, but if one were to deserialize into {{HdfsFileStatus}} then the {{PathHandle}} could have typed, versioned fields without extra steps. It's free to make this easy, so the patch doesn't make it hard. The current proposal in HDFS-7878 offers {{open(FileStatus)}} instead of {{open(PathHandle)}} to avoid adding new functions to {{FileSystem}} for listing, deleting, renaming, etc. against a {{PathHandle}}. It'd be convenient if, as is fairly common, processes could just throw the result of a {{FileSystem}} query to a service/framework using some of the automatic serialization (e.g., {{Serializable}} in Spark and {{Writable}} in MapReduce), but we could move it to a library. Again, I'm pretty ambivalent about removing {{Writable}} from {{FileStatus}}, since we still have it on other data returned by {{FileSystem}} (like {{Path}}), but it is largely vestigial in 3.x. bq, Is HDFS-9806 using the FileSystem interface to access the backing storage system? It sounds like we need to be able to add a new field to FileStatus, but not necessarily cross-serialization (or serialization) of a FileStatus. Yes, that's correct. We just need a consistent place to add "metadata that uniquely identifies a file" so we can handle it consistently across {{FileSystem}} implementations. bq. Regarding cross-serialization, yes we can save some copies by making HdfsFileStatus implement FileStatus and returning an HdfsFileStatus directly in FileSystem, but HdfsFileStatus has even more fields that a user is unlikely to care about. It also seems weird from an API perspective to include extra, inaccessible data in the serialized version of a class. If the record doesn't leave the process or the deserializer knew to use a {{HdfsFileStatus}} then it'd be accessible. Unlike today, a client could use the data we return from the NN if we didn't throw it out. I hear you on the inefficiency of carrying excess data per record, but the NN is a more critical bottleneck, so perhaps we should focus on not returning the unused data at all. Among clients, unless they're caching thousands of {{FileStatus}} objects in memory, the overhead is negligible, and the solution is to build efficient composites (i.e., instead of shaving redundant bytes per record, we can compress repeated data across millions of records). > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15746761#comment-15746761 ] Andrew Wang commented on HDFS-6984: --- IIUC the usecases mentioned are the following: bq. Sergey Shelukhin cited a case in Hive/LLAP where this object needs to be passed across process boundaries (as splits, etc.) If this Hive usecase is the one from [this comment on HDFS-7878|https://issues.apache.org/jira/browse/HDFS-7878?focusedCommentId=15453828&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15453828] then it sounds like they want to be able to get a {{PathHandle}} from a {{FileStatus}}, serialize the {{PathHandle}}, pass it to a worker, and then deserialize the {{PathHandle}} and open a file with it. This means we need to be able to add new fields to FileStatus, and that {{PathHandle}} should be an opaque PB-encoded blob. It doesn't require cross-serialization between FileStatus and HdfsFileStatus, and we don't necessarily need FileStatus to be serializable either. bq. and HDFS-9806 needs a nonce from the FS to detect changes. Neither knows the FileStatus type before it's read, but by using PB the PathHandle data are addressable and the basic fields (path, type, owner, etc.) are available, and that's enough. Is HDFS-9806 using the FileSystem interface to access the backing storage system? It sounds like we need to be able to add a new field to FileStatus, but not necessarily cross-serialization (or serialization) of a FileStatus. Basically, I'm wondering if we should push all FileStatus-related serialization duties into application code. The distcp conversion is an example of the benefits: space efficiency was improved by not including unused fields, and it made it clear that the serialization only needs to be compatible within a single job execution. Regarding cross-serialization, yes we can save some copies by making HdfsFileStatus implement FileStatus and returning an HdfsFileStatus directly in FileSystem, but HdfsFileStatus has even more fields that a user is unlikely to care about. It also seems weird from an API perspective to include extra, inaccessible data in the serialized version of a class. Thoughts? > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15743045#comment-15743045 ] Chris Douglas commented on HDFS-6984: - bq. Can we split the Serializable stuff into a separate change? Moved to HADOOP-13895 > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15738848#comment-15738848 ] Chris Douglas commented on HDFS-6984: - Thanks for reviewing this, Andrew. I tried to explain the cross-serialization use case below. I think we can modify the FsPermission to keep the encryption and ACL bits in the protobuf record, rather than the special encoding it uses now. > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15738840#comment-15738840 ] Chris Douglas commented on HDFS-6984: - bq. Can we split the Serializable stuff into a separate change? Changing booleans into Boolean objects is additional overhead, and combining these two changes makes it harder to review. Reluctantly... this will create multiple, conflicting patches, but OK. The two are somewhat related, in that {{InodeType}} is required in the PB impl, so {{Serializable}} should consider that missing field from the stream (i.e., null) as an error. That said, nothing enforces or relies on consistent semantics between the two serializations. bq. Also, I am not a Java serialization expert, but IIUC maintaining compatibility is difficult, which means maybe people should just use PB anyway. [~ste...@apache.org] requested it for Spark, so if there's a way to configure Spark to use PB then we can avoid it. Even distcp doesn't really have a cross-version compatibility requirement, so this seemed mostly harmless. bq. how about we simply stop implementing Writable altogether in FileStatus? Agreed, I don't think this functionality is often used. Though your point about the ACL bit and the encryption bit highlights why I'd like some PB representation for HDFS-7878, and why the 2.x PB semantics are useful. Consider {{DistributedFileSystem#listStatusInternal}}. Each HdfsFileStatus instance copies its fields to a new FileStatus object by {{HdfsFileStatus#makeQualified}}, those HdfsFileStatus fields having been populated by copying fields from the PB record returned from RPC. Setting aside the inefficiency of creating copies, this also drops information returned by the NN to construct the generic {{FileStatus}}, which (per your summary of HDFS-6326) discards even more information to preserve {{Writable}} compatibility and performance. In HDFS-7878, a {{PathHandle}} (or whatever) needs some FS-specific metadata to be opaquely serialized as {{bytes}} in FileStatus, but resolved by a FileSystem instance. If a receiver deserializes a FileStatus object without knowing the source FS (i.e., as a vanilla FileStatusProto object), a serialized HdfsFileStatus object will populate those fields it understands, and (in PB 2.x) the record will retain unknown fields. The unit test in v003 verifies this. [~sershe] cited a case in Hive/LLAP where this object needs to be passed across process boundaries (as splits, etc.), and HDFS-9806 needs a nonce from the FS to detect changes. Neither knows the FileStatus type before it's read, but by using PB the {{PathHandle}} data are addressable and the basic fields (path, type, owner, etc.) are available, and that's enough. If HDFS were to return HdfsFileStatus wrapping the HdfsFileStatusProto from which it's derived, that would not only be more efficient (avoiding copies), but serializing it and reading it back as a vanilla FileStatus would also work (again, per v003). There's no particular reason that requires implementing the {{Writable}} contract, though. I'm ambivalent about removing it, since HdfsFileStatus could be modified to write HdfsFileStatusProto objects compatible with FileStatus equally well through some utility class. bq. I'm guessing FileStatus was originally made Writable as a shortcut for DistCp Hadoop RPC used to require {{Writable}} types, so it was probably to implement {{listStatus}} and {{getFileStatus}}. > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15736555#comment-15736555 ] Hadoop QA commented on HDFS-6984: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 37s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 10m 4s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 40s{color} | {color:orange} root: The patch generated 21 new + 67 unchanged - 27 fixed = 88 total (was 94) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 11s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 71m 27s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 41s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}131m 57s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.ha.TestHASafeMode | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | HDFS-6984 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12842609/HDFS-6984.nowritable.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 5351181484ef 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 5bd7dec | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/17813/artifact/patchprocess/diff-checkstyle-root.txt | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/17813/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/17813/testReport/ | | modules | C: hadoop-common-project/hadoop-common hadoop-hdfs
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15736258#comment-15736258 ] Andrew Wang commented on HDFS-6984: --- Here's also a more radical idea: how about we simply stop implementing Writable altogether in FileStatus? I did an empirical test by successfully building all of CDH with a patch that does this. This includes apps like HBase, Hive, Spark, Solr, Impala, Oozie, Avro, Parquet, etc. I'm guessing FileStatus was originally made Writable as a shortcut for DistCp. Maybe there are custom apps that use Writable, but the vast majority of Cloudera users interact with a Hadoop cluster via a higher-level framework. > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch, HDFS-6984.nowritable.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15726802#comment-15726802 ] Andrew Wang commented on HDFS-6984: --- Because I was curious, I re-read the history of the ACL bit over on HDFS-6326. My paraphrased notes: * There was a big effort to avoid exposing the ACL bit to callers of FsPermission#toShort and #write and so on for compatibility reasons. * Adding the full ACL to FileStatus was discarded for performance reasons. * Adding a new bitfield is okay performance wise, but was discarded because it breaks Writable compatibility. The takeaways for me are that we should make a separate bitfield for these flags. If we want to preserve cross-serialization, we'd need to also add this field to HdfsFileStatus, and we'd always have to be careful with field numbers. This entire saga has also made me eager to purge any remaining uses of Writable from our API. > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15724789#comment-15724789 ] Andrew Wang commented on HDFS-6984: --- Thanks for revving this Chris. I don't have a ton of background here, but my review: * Can we split the Serializable stuff into a separate change? Changing booleans into Boolean objects is additional overhead, and combining these two changes makes it harder to review. Also, I am not a Java serialization expert, but IIUC maintaining compatibility is difficult, which means maybe people should just use PB anyway. * The serialization loses the extra {{isEncrypted}} and {{FsPermission#getAclBit}} bits since it calls {{toShort}} rather than {{toExtendedShort}}. Seems like we should save these, though the fact we pack these bits into FsPermission is an internal implementation detail. Adding new booleans to FileStatus might break cross-serialization with HdfsFileStatus though. What is the usecase for cross-serialization? * Could use some more extensive unit tests that test the default values and error conditions with the shorts. * Test nit: please also avoid the wildcard import. Regarding Steve's comment on bounds checking, PB by default has a 64MB max message size. We could use that as a reasonable upper-bound on the size of a FileStatus. > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15591402#comment-15591402 ] Steve Loughran commented on HDFS-6984: -- wow, thanks for that extra work! appreciated. I didn't go into the details enough to give it a full review...not familar enough with the code. # I see in FileStatus.java line 300+ the input size is checked for being negative. What if the size came back as 2^31? I think you need an upper bounds check if you really want to defend against malicious endpoints > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, > HDFS-6984.003.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15590249#comment-15590249 ] Hadoop QA commented on HDFS-6984: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 53s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 7m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 6s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 58s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 79m 15s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}139m 50s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.balancer.TestBalancer | | | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Issue | HDFS-6984 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12834247/HDFS-6984.003.patch | | Optional Tests | asflicense xml compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux a100249d6bf2 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / e9c4616 | | Default Java | 1.8.0_101 | | findbugs | v3.0
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507650#comment-15507650 ] Steve Loughran commented on HDFS-6984: -- FWIW, I'd like FileStatus in Hadoop 2 to implement Serializable. Why? makes it trivial to pass across processes in Spark and similar remote-closure APIs > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15504474#comment-15504474 ] Hadoop QA commented on HDFS-6984: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 12m 26s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 7m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 16s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 23s{color} | {color:orange} hadoop-common-project/hadoop-common: The patch generated 27 new + 19 unchanged - 0 fixed = 46 total (was 19) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 29s{color} | {color:red} hadoop-common-project/hadoop-common generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 13s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 50m 44s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-common-project/hadoop-common | | | Class org.apache.hadoop.fs.FileStatusProtos$FileStatusProto defines non-transient non-serializable instance field unknownFields In FileStatusProtos.java:instance field unknownFields In FileStatusProtos.java | | | Useless control flow in org.apache.hadoop.fs.FileStatusProtos$FileStatusProto$Builder.maybeForceBuilderInitialization() At FileStatusProtos.java: At FileStatusProtos.java:[line 1057] | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Issue | HDFS-6984 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12782647/HDFS-6984.002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit xml findbugs checkstyle cc | | uname | Linux 4cf3d029293f 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk /
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15504330#comment-15504330 ] Chris Douglas commented on HDFS-6984: - bq. I actually planned on replacing DistCp's usage of FileStatus serialization with its own custom data type Filed HADOOP-13626 > In Hadoop 3, make FileStatus serialize itself via protobuf > -- > > Key: HDFS-6984 > URL: https://issues.apache.org/jira/browse/HDFS-6984 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha1 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Labels: BB2015-05-TBR > Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch > > > FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this > to serialize it and send it over the wire. But in Hadoop 2 and later, we > have the protobuf {{HdfsFileStatusProto}} which serves to serialize this > information. The protobuf form is preferable, since it allows us to add new > fields in a backwards-compatible way. Another issue is that already a lot of > subclasses of FileStatus don't override the Writable methods of the > superclass, breaking the interface contract that read(status.write) should be > equal to the original status. > In Hadoop 3, we should just make FileStatus serialize itself via protobuf so > that we don't have to deal with these issues. It's probably too late to do > this in Hadoop 2, since user code may be relying on the existing FileStatus > serialization there. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus serialize itself via protobuf
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102888#comment-15102888 ] Hadoop QA commented on HDFS-6984: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 51s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 14s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 28s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 4s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 51s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 12s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:red}-1{color} | {color:red} cc {color} | {color:red} 13m 49s {color} | {color:red} root-jdk1.8.0_66 with JDK v1.8.0_66 generated 1 new issues (was 10, now 10). {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 8m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 24s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 7m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 24s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 16s {color} | {color:red} Patch generated 27 new checkstyle issues in hadoop-common-project/hadoop-common (total was 22, now 49). {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 3s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s {color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 3s {color} | {color:red} hadoop-common-project/hadoop-common introduced 2 new FindBugs issues. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 28s {color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 14s {color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s {color} | {color:green} Patch