[ 
https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123412#comment-14123412
 ] 

Chris Nauroth commented on HDFS-6984:
-------------------------------------

bq. So, it looks like DistCp depends on FileStatus being writable...

Last time I looked at this, I planned to replace DistCp's use of 
{{FileStatus}} serialization with its own custom data type.  I believe DistCp 
doesn't need all of the fields of {{FileStatus}}, so omitting the unneeded 
ones could give a marginal space/performance improvement.
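
As a rough illustration of the custom data type idea, here is a minimal sketch of a compact record that serializes only a few fields.  The class name {{CopyEntry}} and its field set are hypothetical, chosen for the example; this is not DistCp's actual type, and the fields DistCp really needs would have to be checked against its code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical compact record; the field set is an assumption for illustration.
final class CopyEntry {
    final String path;
    final long length;
    final long modificationTime;
    final boolean isDirectory;

    CopyEntry(String path, long length, long modificationTime, boolean isDirectory) {
        this.path = path;
        this.length = length;
        this.modificationTime = modificationTime;
        this.isDirectory = isDirectory;
    }

    // Writes only the fields this record carries, in a fixed order.
    void write(DataOutput out) throws IOException {
        out.writeUTF(path);
        out.writeLong(length);
        out.writeLong(modificationTime);
        out.writeBoolean(isDirectory);
    }

    // Reads the fields back in the same order they were written.
    static CopyEntry read(DataInput in) throws IOException {
        String path = in.readUTF();
        long length = in.readLong();
        long modificationTime = in.readLong();
        boolean isDirectory = in.readBoolean();
        return new CopyEntry(path, length, modificationTime, isDirectory);
    }

    // Convenience helper: serialize to an in-memory byte array.
    byte[] toBytes() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            write(new DataOutputStream(buffer));
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Convenience helper: deserialize from an in-memory byte array.
    static CopyEntry fromBytes(byte[] bytes) {
        try {
            return read(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A record like this round-trips through {{CopyEntry.fromBytes(entry.toBytes())}} and carries nothing beyond the four fields listed, which is where any space saving would come from.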

> In Hadoop 3, make FileStatus no longer a Writable
> -------------------------------------------------
>
>                 Key: HDFS-6984
>                 URL: https://issues.apache.org/jira/browse/HDFS-6984
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-6984.001.patch
>
>
> FileStatus was a Writable in Hadoop 2 and earlier.  Originally, we used this 
> to serialize it and send it over the wire.  But in Hadoop 2 and later, we 
> have the protobuf {{HdfsFileStatusProto}}, which serves to serialize this 
> information.  The protobuf form is preferable, since it allows us to add new 
> fields in a backwards-compatible way.  Another issue is that many 
> subclasses of FileStatus already fail to override the Writable methods of 
> the superclass, breaking the interface contract that read(status.write) 
> should be equal to the original status.
> In Hadoop 3, we should just make FileStatus no longer a Writable so that we 
> don't have to deal with these issues.  It's probably too late to do this in 
> Hadoop 2, since user code may be relying on the ability to use the Writable 
> methods on FileStatus objects there.
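
The broken round-trip contract mentioned in the description can be sketched without any Hadoop dependency.  The classes below ({{BaseStatus}}, {{OwnedStatus}}, {{RoundTrip}}) are hypothetical stand-ins, not Hadoop classes; they only mimic the pattern of a subclass that adds a field but inherits the superclass's write/readFields methods unchanged.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a Writable-style superclass.
class BaseStatus {
    protected String path = "";
    protected long length;

    BaseStatus() {}

    BaseStatus(String path, long length) {
        this.path = path;
        this.length = length;
    }

    void write(DataOutput out) throws IOException {
        out.writeUTF(path);
        out.writeLong(length);
    }

    void readFields(DataInput in) throws IOException {
        path = in.readUTF();
        length = in.readLong();
    }
}

// Hypothetical subclass that adds a field but does NOT override
// write/readFields, so the extra field is never serialized.
class OwnedStatus extends BaseStatus {
    String owner = "";

    OwnedStatus() {}

    OwnedStatus(String path, long length, String owner) {
        super(path, length);
        this.owner = owner;
    }
}

final class RoundTrip {
    // Serializes the status and reads it back into a fresh instance.
    static OwnedStatus copy(OwnedStatus original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            original.write(new DataOutputStream(buffer));
            OwnedStatus result = new OwnedStatus();
            result.readFields(new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray())));
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Here {{RoundTrip.copy(status)}} preserves {{path}} and {{length}} but silently drops {{owner}}, so the copy is not equal to the original.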



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
