[ https://issues.apache.org/jira/browse/HDFS-6326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999977#comment-13999977 ]
Chris Nauroth commented on HDFS-6326: ------------------------------------- [~szetszwo], I still see a compatibility problem with adding the ACL directly to {{FileStatus}}. This is a quote from the ACLs design doc on HDFS-4685: {quote} An alternative implementation would have been to add Acl as a member field of the existing FileStatus, FsPermission or PermissionStatus classes. This is rejected for compatibility reasons. If a new version of the client (postACLs) deserializes an old version of the FileStatus (preACLs), then this could cause bugs. If we assume a null Acl if we encounter EOF during deserialization, then there is a risk of null pointer dereferences in downstream code. {quote} I can't think of a workaround to the problem of brittle {{Writable}} serialization. I explored a few ideas, and they all introduce other bugs as a side effect: * Keep reading all the way through to EOF during deserialization and see if you find a serialized ACL after the existing fields. ** This doesn't work, because {{readFields}} has no way to know if the underlying stream really contains multiple serialized objects. * Attempt deserialization with ACL, but fall back to the old deserialization without ACL if it fails. ** This is likely to fail for similar reasons as above, because subsequent data in the stream could contain data misinterpreted as an ACL. Deserialization wouldn't fail, but it would result in wrong data. This also doesn't address the problem of an old pre-ACLs client attempting to deserialize a new {{FileStatus}} containing an ACL. * Just skip the ACL in {{write}} and {{readFields}}. ** This has the problem that round-tripping a {{FileStatus}} would lose an ACL, and clients instead would expect to maintain the full state of the object. Also, there is the consideration of bandwidth consumption for inclusion of the ACL data in RPC responses. Each {{AclEntry}} is 3 enums + a string, and the number of entries ranges from 2 - 29. The ACL could dominate over the size of all existing data in the {{FileStatus}}, even though most use cases don't consume the ACL. This part could be addressed by something like a "fetchAcl=true" optional parameter on the RPC, passed only for use cases that need it like getfacl, but there are still the compatibility problems. I just thought of one more possibility. We could add whole new RPCs, i.e. {{getListingExtended}} and {{getFileInfoExtended}}, that return a whole new object, {{FileStatusExtended}}, and we could put the ACL in the new object. Existing clients would stick to the old RPCs and old {{FileStatus}}, so there would be no compatibility concerns. Uses cases that care about the ACL, like getfacl and distcp -pa, can change their code to call the new RPCs. Obviously, the downside of this is cloning a bunch of code and expanding our API footprint on {{FileSystem}}. It's the only way I can see doing it in a backwards-compatible way though. If you think this is worth exploring, then we could file a new jira for it. Let me know what you think. > WebHdfs ACL compatibility is broken > ----------------------------------- > > Key: HDFS-6326 > URL: https://issues.apache.org/jira/browse/HDFS-6326 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs > Affects Versions: 3.0.0, 2.4.0 > Reporter: Daryn Sharp > Assignee: Chris Nauroth > Priority: Blocker > Fix For: 3.0.0, 2.4.1 > > Attachments: HDFS-6326-branch-2.4.patch, HDFS-6326.1.patch, > HDFS-6326.2.patch, HDFS-6326.3.patch, HDFS-6326.4.patch, HDFS-6326.5.patch, > HDFS-6326.6.patch, aclfsperm.example > > > 2.4 ACL support is completely incompatible with <2.4 webhdfs servers. The NN > throws an {{IllegalArgumentException}} exception. > {code} > hadoop fs -ls webhdfs://nn/ > Found 21 items > ls: Invalid value for webhdfs parameter "op": No enum constant > org.apache.hadoop.hdfs.web.resources.GetOpParam.Op.GETACLSTATUS > [... 20 more times...] > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)