[ 
https://issues.apache.org/jira/browse/HDFS-6326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999977#comment-13999977
 ] 

Chris Nauroth commented on HDFS-6326:
-------------------------------------

[~szetszwo], I still see a compatibility problem with adding the ACL directly 
to {{FileStatus}}.  This is a quote from the ACLs design doc on HDFS-4685:

{quote}
An alternative implementation would have been to add Acl as a member field of 
the
existing FileStatus, FsPermission or PermissionStatus classes.  This is 
rejected for
compatibility reasons.  If a new version of the client (post-ACLs) deserializes 
an old version of
the FileStatus (pre-ACLs), then this could cause bugs.  If we assume a null Acl 
if we encounter
EOF during deserialization, then there is a risk of null pointer dereferences 
in downstream code.
{quote}

I can't think of a workaround to the problem of brittle {{Writable}} 
serialization.  I explored a few ideas, and they all introduce other bugs as a 
side effect:
* Keep reading all the way through to EOF during deserialization and see if you 
find a serialized ACL after the existing fields.
** This doesn't work, because {{readFields}} has no way to know if the 
underlying stream really contains multiple serialized objects.
* Attempt deserialization with ACL, but fall back to the old deserialization 
without ACL if it fails.
** This is likely to fail for similar reasons as above, because subsequent data 
in the stream could contain data misinterpreted as an ACL.  Deserialization 
wouldn't fail, but it would result in wrong data.  This also doesn't address 
the problem of an old pre-ACLs client attempting to deserialize a new 
{{FileStatus}} containing an ACL.
* Just skip the ACL in {{write}} and {{readFields}}.
** This loses the ACL whenever a {{FileStatus}} is round-tripped through 
serialization, whereas clients would expect serialization to preserve the full 
state of the object.
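To make the brittleness of the first two ideas concrete, here is a minimal sketch using plain {{java.io}} (hypothetical record layout, not actual HDFS code) of why an optional trailing field can't be detected reliably once records are concatenated in one stream:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;

public class OptionalFieldDemo {

    // Old-format record: just a path string, standing in for a pre-ACLs
    // FileStatus with no trailing ACL.
    static void writeOld(DataOutput out, String path) throws IOException {
        out.writeUTF(path);
    }

    // New-format reader: after the known fields, optimistically try to read
    // an ACL entry count, treating EOF as "no ACL present".
    static int readAclCountOrMinusOne(DataInput in) throws IOException {
        in.readUTF();             // the pre-ACLs field
        try {
            return in.readInt();  // hoped-for trailing ACL entry count
        } catch (EOFException e) {
            return -1;            // stream genuinely ended: no ACL
        }
    }

    // Serialize two old-format records back to back, then read the first one
    // with the new-format reader.
    static int demo() throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        writeOld(out, "/a");
        writeOld(out, "/b");      // a second record follows immediately
        DataInput in = new DataInputStream(
                new ByteArrayInputStream(buf.toByteArray()));
        return readAclCountOrMinusOne(in);
    }

    public static void main(String[] args) throws IOException {
        // Instead of -1, the reader silently consumes 4 bytes of the second
        // record and reports a garbage "ACL count".
        System.out.println("misread ACL count = " + demo());
    }
}
```

The EOF heuristic only works when exactly one object is in the stream, which {{readFields}} has no way to guarantee.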

There is also the question of bandwidth consumed by including the ACL data in 
RPC responses.  Each {{AclEntry}} is 3 enums + a string, and the number of 
entries ranges from 2 to 29.  The ACL could dominate the size of all existing 
data in the {{FileStatus}}, even though most use cases never consume the ACL.  
This part could be addressed by something like a "fetchAcl=true" optional 
parameter on the RPC, passed only for use cases that need it like getfacl, but 
the compatibility problems would remain.
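As a back-of-envelope bound on that overhead (assumed per-field wire sizes, purely illustrative, not measured against the actual Hadoop RPC encoding):

```java
public class AclSizeEstimate {
    public static void main(String[] args) {
        // Assumptions: each of the 3 enums is ~1 byte on the wire, and the
        // name string averages ~8 chars plus a 1-byte length prefix.
        int enumBytes = 3;
        int nameBytes = 1 + 8;
        int perEntry = enumBytes + nameBytes;
        int worstCase = 29 * perEntry;   // the 29-entry upper bound
        System.out.println("~" + perEntry + " bytes/entry, ~" + worstCase
                + " bytes for a 29-entry ACL");
    }
}
```

Even under these conservative assumptions, a large ACL adds hundreds of bytes per {{FileStatus}}, which is significant when a listing returns many entries.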

I just thought of one more possibility.  We could add whole new RPCs, i.e. 
{{getListingExtended}} and {{getFileInfoExtended}}, that return a whole new 
object, {{FileStatusExtended}}, and we could put the ACL in the new object.  
Existing clients would stick to the old RPCs and old {{FileStatus}}, so there 
would be no compatibility concerns.  Use cases that care about the ACL, like 
getfacl and distcp -pa, can change their code to call the new RPCs.  Obviously, 
the downside of this is cloning a bunch of code and expanding our API footprint 
on {{FileSystem}}.  It's the only way I can see doing it in a 
backwards-compatible way though.  If you think this is worth exploring, then we 
could file a new jira for it.  Let me know what you think.
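A rough shape of that proposal (only {{getListingExtended}}, {{getFileInfoExtended}}, and {{FileStatusExtended}} come from the paragraph above; every other name here is a hypothetical stand-in, not real Hadoop code):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Sketch only: the key point is composition. The ACL lives in a *new* type
// that only the new RPCs ever return, so the serialized layout of the old
// FileStatus is never touched and old clients are unaffected.
public class FileStatusExtendedSketch {

    // Stand-ins for the real FileStatus and AclEntry classes.
    static class FileStatus {
        final String path;
        FileStatus(String path) { this.path = path; }
    }

    static class AclEntry {
        final String name;
        AclEntry(String name) { this.name = name; }
    }

    // The proposed new return type: wraps the existing object unchanged and
    // carries the ACL alongside it.
    static class FileStatusExtended {
        final FileStatus status;
        final List<AclEntry> acl;
        FileStatusExtended(FileStatus status, List<AclEntry> acl) {
            this.status = status;
            this.acl = acl == null ? Collections.<AclEntry>emptyList() : acl;
        }
    }

    // Hypothetical new RPC; old clients keep calling the existing
    // getFileInfo and never see the new type.
    static FileStatusExtended getFileInfoExtended(String src) {
        return new FileStatusExtended(new FileStatus(src),
                Arrays.asList(new AclEntry("user:alice")));
    }

    public static void main(String[] args) {
        FileStatusExtended fse = getFileInfoExtended("/tmp/f");
        System.out.println(fse.status.path + " acl entries: " + fse.acl.size());
    }
}
```

{{getListingExtended}} would follow the same pattern, returning an array of {{FileStatusExtended}}.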

> WebHdfs ACL compatibility is broken
> -----------------------------------
>
>                 Key: HDFS-6326
>                 URL: https://issues.apache.org/jira/browse/HDFS-6326
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>    Affects Versions: 3.0.0, 2.4.0
>            Reporter: Daryn Sharp
>            Assignee: Chris Nauroth
>            Priority: Blocker
>             Fix For: 3.0.0, 2.4.1
>
>         Attachments: HDFS-6326-branch-2.4.patch, HDFS-6326.1.patch, 
> HDFS-6326.2.patch, HDFS-6326.3.patch, HDFS-6326.4.patch, HDFS-6326.5.patch, 
> HDFS-6326.6.patch, aclfsperm.example
>
>
> 2.4 ACL support is completely incompatible with <2.4 webhdfs servers.  The NN 
> throws an {{IllegalArgumentException}}.
> {code}
> hadoop fs -ls webhdfs://nn/
> Found 21 items
> ls: Invalid value for webhdfs parameter "op": No enum constant 
> org.apache.hadoop.hdfs.web.resources.GetOpParam.Op.GETACLSTATUS
> [... 20 more times...]
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
