MultipleOutputs should use newer Hadoop serialization interface since 0.19
--------------------------------------------------------------------------
Key: HADOOP-5167
URL: https://issues.apache.org/jira/browse/HADOOP-5167
Project: Hadoop Core
Issue Type: Bug
Components: mapred
Affects Versions: 0.19.0
Environment: Environment-independent issue
Reporter: Mikhail Yakshin
We have a system based on Hadoop 0.18 / Cascading 0.8.1, and I'm now trying to
port it to Hadoop 0.19 / Cascading 1.0. The first serious problem I've run into
is that we use MultipleOutputs extensively in jobs that work with sequence
files storing Cascading Tuples.
Since Cascading 0.9, Tuple no longer implements WritableComparable; it relies
on Hadoop's generic serialization framework instead. However, in Hadoop 0.19,
MultipleOutputs still requires the older WritableComparable interface. Thus,
trying to do something like:
{noformat}
MultipleOutputs.addNamedOutput(conf, "output-name",
    MySpecialMultiSplitOutputFormat.class, Tuple.class, Tuple.class);
mos = new MultipleOutputs(conf);
...
mos.getCollector("output-name", reporter).collect(tuple1, tuple2);
{noformat}
yields an error:
{noformat}
java.lang.RuntimeException: java.lang.RuntimeException: class cascading.tuple.Tuple not org.apache.hadoop.io.WritableComparable
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:752)
    at org.apache.hadoop.mapred.lib.MultipleOutputs.getNamedOutputKeyClass(MultipleOutputs.java:252)
    at org.apache.hadoop.mapred.lib.MultipleOutputs$InternalFileOutputFormat.getRecordWriter(MultipleOutputs.java:556)
    at org.apache.hadoop.mapred.lib.MultipleOutputs.getRecordWriter(MultipleOutputs.java:425)
    at org.apache.hadoop.mapred.lib.MultipleOutputs.getCollector(MultipleOutputs.java:511)
    at org.apache.hadoop.mapred.lib.MultipleOutputs.getCollector(MultipleOutputs.java:476)
    at my.namespace.MyReducer.reduce(MyReducer.java:xxx)
{noformat}
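The message has the shape "class X not Y", which suggests that the configured
class is tested against a required interface and rejected when it does not
implement it. Below is a minimal, Hadoop-free sketch of that kind of check. All
names in it (InterfaceCheckSketch, LegacyWritableComparable, LegacyKey,
PlainTuple, getClassAs) are hypothetical stand-ins, not real Hadoop or
Cascading classes, and the check is an assumption about what happens inside
Configuration.getClass, not a copy of it.

```java
// Hypothetical stand-ins only: nothing here is a real Hadoop or Cascading class.
public class InterfaceCheckSketch {

    /** Stand-in for the older interface that the key class must implement. */
    public interface LegacyWritableComparable {}

    /** Stand-in for a key type that still implements the older interface. */
    public static class LegacyKey implements LegacyWritableComparable {}

    /** Stand-in for a type that relies on pluggable serialization instead. */
    public static class PlainTuple {}

    /**
     * Returns theClass typed as a subclass of xface, or throws if theClass
     * does not implement xface. This mimics the assumed check.
     */
    public static <U> Class<? extends U> getClassAs(Class<?> theClass, Class<U> xface) {
        if (!xface.isAssignableFrom(theClass)) {
            throw new RuntimeException(theClass + " not " + xface.getName());
        }
        return theClass.asSubclass(xface);
    }

    public static void main(String[] args) {
        // Accepted: LegacyKey implements the required interface.
        getClassAs(LegacyKey.class, LegacyWritableComparable.class);

        // Rejected: PlainTuple does not, so the check throws.
        try {
            getClassAs(PlainTuple.class, LegacyWritableComparable.class);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the real check works this way, a class is rejected no matter which
serialization is registered for it, because only the interface bound is
examined.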
As I understand it, MultipleOutputs should be ported to use Hadoop's more
generic serialization framework.