[ https://issues.apache.org/jira/browse/HADOOP-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12478116 ]
Doug Cutting commented on HADOOP-1053:
--------------------------------------
Milind, you're right: if we implement some things in the io package as records
then we'll have a circular package structure: these packages would no longer be
well layered. I don't see how that would cost us much, but it is true.

If we want to (1) define some things that are currently in the io package as
records, (2) not duplicate code, and (3) keep things well layered, then we'd
need to restructure things. The io runtime (readVInt, compareBytes, Writable,
etc.) would need to be split into a separate package from the classes that we
might define using the record package, like IntWritable and BytesWritable, so
that the package layering would be ioruntime > record > iostructs.
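
As a rough illustration of that three-layer split, here is a minimal sketch.
Everything in it is hypothetical: nested types stand in for the three packages,
and the names Record, IntRecord, and roundTrip are invented for this example
rather than taken from the Hadoop source. The only point is the dependency
direction: the runtime layer knows nothing about records, and the concrete
struct is defined on top of the record layer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class LayeringSketch {

    // Layer 1, "ioruntime" (hypothetical): low-level contract only.
    // It has no dependency on the record layer.
    interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Layer 2, "record" (hypothetical): depends only on the runtime layer.
    static abstract class Record implements Writable {
        // Serialize this record to a byte array using its own write().
        byte[] toBytes() throws IOException {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            write(new DataOutputStream(buf));
            return buf.toByteArray();
        }
    }

    // Layer 3, "iostructs" (hypothetical): a concrete type defined as a record.
    static final class IntRecord extends Record {
        int value;

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeInt(value);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            value = in.readInt();
        }
    }

    // Serialize an int through the top layer and read it back.
    public static int roundTrip(int v) {
        try {
            IntRecord a = new IntRecord();
            a.value = v;
            IntRecord b = new IntRecord();
            b.readFields(new DataInputStream(new ByteArrayInputStream(a.toBytes())));
            return b.value;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42));
    }
}
```

In a real restructuring each layer would be its own package, and moving the
existing classes between packages is exactly the incompatible change in
question.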

But what would be the point? We could probably decompose nearly every package
into well-layered sub-packages, but that is disruptive, since it is not
back-compatible. Occasionally it is warranted, when packages get too big and
poorly defined and we have other reasons to change public APIs. For example,
I would like to someday re-organize mapred into several sub-packages (e.g.,
client, protocol, tasktracker, jobtracker), rename org.apache.hadoop.dfs to
org.apache.hadoop.fs.hdfs, and make the util package smaller, but we don't
want to rush into such changes.

In summary, I still fail to see an overwhelming argument for making
org.apache.hadoop.record independent of org.apache.hadoop.io. What am I
missing?
> Make Record I/O functionally modular from the rest of Hadoop
> ------------------------------------------------------------
>
> Key: HADOOP-1053
> URL: https://issues.apache.org/jira/browse/HADOOP-1053
> Project: Hadoop
> Issue Type: Improvement
> Components: record
> Affects Versions: 0.11.2
> Environment: All
> Reporter: Milind Bhandarkar
> Assigned To: Milind Bhandarkar
> Fix For: 0.13.0
>
> Attachments: jute-patch.txt
>
>
> This issue has been created to separate one proposal originally included in
> HADOOP-941, for which no consensus could be reached. For earlier discussion
> about the issue, please see HADOOP-941.
> I will summarize the proposal here. We need to provide a way for some users
> to use the record I/O framework outside of Hadoop.