[
https://issues.apache.org/jira/browse/HADOOP-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12478546
]
Milind Bhandarkar commented on HADOOP-1031:
-------------------------------------------
I would like to add another proposed change to record I/O here. Currently,
hadoop.record.RecordReader and RecordWriter act as factories for the various
InputArchive and OutputArchive implementations. In the original design this was
done to keep tight control over the supported serialization formats, but it has
proven counterproductive: for wider adoption of record I/O, users should be
able to plug in their own serialization formats. The proposed changes make that
possible. They are as follows:
1. Eliminate the current record.RecordReader and record.RecordWriter.
2. Rename InputArchive to RecordReader, and OutputArchive to RecordWriter.
3. Rename the various archives accordingly, e.g. BinaryInputArchive ->
BinaryRecordReader, etc.
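A minimal sketch of what the renamed API might look like from a caller's point of view. The class names follow the proposal above (BinaryRecordReader/BinaryRecordWriter); the tiny stand-in implementations and method signatures below are illustrative assumptions, not the actual Hadoop classes:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Stand-in for the renamed BinaryOutputArchive -> BinaryRecordWriter.
// Callers construct it directly over a stream; no factory in between.
class BinaryRecordWriter {
    private final DataOutputStream out;
    BinaryRecordWriter(java.io.OutputStream s) { out = new DataOutputStream(s); }
    void writeInt(int v, String tag) throws IOException { out.writeInt(v); }
    void writeString(String s, String tag) throws IOException { out.writeUTF(s); }
}

// Stand-in for the renamed BinaryInputArchive -> BinaryRecordReader.
class BinaryRecordReader {
    private final DataInputStream in;
    BinaryRecordReader(java.io.InputStream s) { in = new DataInputStream(s); }
    int readInt(String tag) throws IOException { return in.readInt(); }
    String readString(String tag) throws IOException { return in.readUTF(); }
}

public class RenameSketch {
    public static void main(String[] args) throws IOException {
        // Serialize a couple of fields, then read them back.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        BinaryRecordWriter writer = new BinaryRecordWriter(buf);
        writer.writeInt(42, "count");
        writer.writeString("hadoop", "name");

        BinaryRecordReader reader =
            new BinaryRecordReader(new ByteArrayInputStream(buf.toByteArray()));
        System.out.println(reader.readInt("count"));
        System.out.println(reader.readString("name"));
    }
}
```

With the factory gone, a third-party serialization format would only need to supply its own RecordReader/RecordWriter pair with the same surface.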
> Enhancements to Hadoop record I/O - Part 2
> ------------------------------------------
>
> Key: HADOOP-1031
> URL: https://issues.apache.org/jira/browse/HADOOP-1031
> Project: Hadoop
> Issue Type: Improvement
> Components: record
> Affects Versions: 0.11.2
> Environment: All
> Reporter: Milind Bhandarkar
> Assigned To: Milind Bhandarkar
>
> Remaining planned enhancements to Hadoop record I/O:
> 5. Provide a 'swiggable' C binding, so that processing the generated C code
> with swig allows it to be used in scripting languages such as Python and
> Perl.
> 7. Optimize generated write() and readFields() methods, so that they do not
> have to create BinaryOutputArchive or BinaryInputArchive every time these
> methods are called on a record.
> 8. Implement ByteInStream and ByteOutStream for C++ runtime, as they will be
> needed for using Hadoop Record I/O with forthcoming C++ MapReduce framework
> (currently, only FileStreams are provided.)
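Enhancement 7 above could be sketched roughly as follows: cache one archive per output stream and reuse it across write() calls, instead of allocating a fresh one on every call. The classes and caching scheme below are illustrative assumptions, not the actual Hadoop implementation:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Illustrative stand-in for BinaryOutputArchive.
class ArchiveStub {
    private final DataOutput out;
    ArchiveStub(DataOutput out) { this.out = out; }
    void writeInt(int v, String tag) throws IOException { out.writeInt(v); }
}

// A record that caches the archive for the most recently seen stream,
// so repeated write() calls on the same stream reuse one archive.
class CachedRecord {
    private static DataOutput lastOut;
    private static ArchiveStub lastArchive;

    private final int value;
    CachedRecord(int value) { this.value = value; }

    private static ArchiveStub archiveFor(DataOutput out) {
        if (out != lastOut) {        // allocate only when the stream changes
            lastArchive = new ArchiveStub(out);
            lastOut = out;
        }
        return lastArchive;
    }

    void write(DataOutput out) throws IOException {
        archiveFor(out).writeInt(value, "value");
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        // Both writes hit the same cached archive; only one is allocated.
        new CachedRecord(1).write(out);
        new CachedRecord(2).write(out);
        System.out.println(buf.size());   // two 4-byte ints
    }
}
```

A per-thread or per-stream cache avoids the garbage pressure of one archive allocation per serialized record, which is the point of the enhancement.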
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.