[ 
https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-6984:
--------------------------------
    Attachment: HDFS-6984.003.patch

Rebased [~cmccabe]'s patch.
* Changed {{FileStatus}} to be {{Serializable}}, per [~ste...@apache.org]'s 
suggestion. This cascaded to a few other classes; I stopped the cascade at 
{{HdfsBlockLocation}} by changing its final ref to transient. Judging from 
its usage, this is probably correct, since the fields not redundant with 
{{BlockLocation}} are things like tokens, which are internal(?) to DFSClient.
* Changed required fields to {{<path, filetype>}} from {{<path, filetype, 
permission mask, owner, group, mtime>}}. mtime in particular isn't cheap to 
obtain on some systems, and the owner/group/perms may be -lies- placeholders if 
the FS is required to populate them. Whether {{filetype}} should stay required 
is arguable. YARN overwhelmingly favors {{optional}} fields for everything, FWIW.
* Changed field IDs to match {{HdfsFileStatusProto}}. In proto2 at least, this 
cross-serialization works (added a test). In HDFS-7878, {{FileStatus}} can 
leave its {{PathHandle}} as {{bytes}}, provided both declarations occupy the 
same field ID.
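
For the {{Serializable}} change in the first bullet, here is a minimal sketch of what marking a reference transient does under Java serialization. All class names ({{SketchLocation}}, {{SketchInternalBlock}}, {{SketchHdfsLocation}}, {{TransientSketch}}) are hypothetical stand-ins and are not the classes touched by the patch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a serializable base type; not Hadoop code.
class SketchLocation implements Serializable {
    private static final long serialVersionUID = 1L;
    final String path;
    final long length;

    SketchLocation(String path, long length) {
        this.path = path;
        this.length = length;
    }
}

// Hypothetical client-internal state (e.g. a token holder); deliberately NOT Serializable.
class SketchInternalBlock {
    final String token;

    SketchInternalBlock(String token) {
        this.token = token;
    }
}

// Hypothetical subclass that carries the internal reference.
class SketchHdfsLocation extends SketchLocation {
    private static final long serialVersionUID = 1L;
    // transient: Java serialization skips this field, so the cascade stops here.
    private transient SketchInternalBlock block;

    SketchHdfsLocation(String path, long length, SketchInternalBlock block) {
        super(path, length);
        this.block = block;
    }

    SketchInternalBlock getBlock() {
        return block;
    }
}

class TransientSketch {
    // Serialize the value to a byte array and read it back as a new object.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch the non-transient fields survive {{roundTrip}}, while {{getBlock()}} returns null on the copy, so callers cannot rely on the internal reference after deserialization. Without the transient modifier, {{writeObject}} would fail with a {{NotSerializableException}}.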

> In Hadoop 3, make FileStatus serialize itself via protobuf
> ----------------------------------------------------------
>
>                 Key: HDFS-6984
>                 URL: https://issues.apache.org/jira/browse/HDFS-6984
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Colin P. McCabe
>            Assignee: Colin P. McCabe
>              Labels: BB2015-05-TBR
>         Attachments: HDFS-6984.001.patch, HDFS-6984.002.patch, 
> HDFS-6984.003.patch
>
>
> FileStatus was a Writable in Hadoop 2 and earlier.  Originally, we used this 
> to serialize it and send it over the wire.  But in Hadoop 2 and later, we 
> have the protobuf {{HdfsFileStatusProto}} which serves to serialize this 
> information.  The protobuf form is preferable, since it allows us to add new 
> fields in a backwards-compatible way.  Another issue is that already a lot of 
> subclasses of FileStatus don't override the Writable methods of the 
> superclass, breaking the interface contract that read(status.write) should be 
> equal to the original status.
> In Hadoop 3, we should just make FileStatus serialize itself via protobuf so 
> that we don't have to deal with these issues.  It's probably too late to do 
> this in Hadoop 2, since user code may be relying on the existing FileStatus 
> serialization there.
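
The contract problem in the description above can be illustrated with a minimal sketch. The classes {{SketchStatus}}, {{SketchLocatedStatus}} and {{WritableContractSketch}} are hypothetical; they only mimic the Writable pattern of paired {{write}}/{{readFields}} methods and are not Hadoop's {{FileStatus}} hierarchy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical stand-in for a Writable-style base class.
class SketchStatus {
    String path = "";

    void write(DataOutput out) throws IOException {
        out.writeUTF(path);
    }

    void readFields(DataInput in) throws IOException {
        path = in.readUTF();
    }
}

// Hypothetical subclass that adds a field but inherits write/readFields unchanged.
class SketchLocatedStatus extends SketchStatus {
    String host = "";
}

class WritableContractSketch {
    // Write src with its own write(), then populate dst with readFields().
    static void copyVia(SketchStatus src, SketchStatus dst) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            src.write(new DataOutputStream(bytes));
            dst.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Copying a {{SketchLocatedStatus}} this way preserves {{path}} but silently drops {{host}}, because the subclass field never reaches the stream. That is the kind of broken round trip the description refers to.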



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)