Hi,

We have a system based on Hadoop 0.18 / Cascading 0.8.1, and I'm now trying to port it to Hadoop 0.19 / Cascading 1.0. The first serious problem I've run into is that we make extensive use of MultipleOutputs in our jobs, which deal with sequence files that store Cascading's Tuples.

Since Cascading 0.9, Tuples are no longer WritableComparable; they use Hadoop's generic serialization interface and framework instead. However, in Hadoop 0.19, MultipleOutputs still requires the older WritableComparable interface. Thus, trying to do something like:

    MultipleOutputs.addNamedOutput(conf, "output-name", MySpecialMultiSplitOutputFormat.class, Tuple.class, Tuple.class);
    mos = new MultipleOutputs(conf);
    ...
    mos.getCollector("output-name", reporter).collect(tuple1, tuple2);

yields an error:

    java.lang.RuntimeException: java.lang.RuntimeException: class cascading.tuple.Tuple not org.apache.hadoop.io.WritableComparable
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:752)
        at org.apache.hadoop.mapred.lib.MultipleOutputs.getNamedOutputKeyClass(MultipleOutputs.java:252)
        at org.apache.hadoop.mapred.lib.MultipleOutputs$InternalFileOutputFormat.getRecordWriter(MultipleOutputs.java:556)
        at org.apache.hadoop.mapred.lib.MultipleOutputs.getRecordWriter(MultipleOutputs.java:425)
        at org.apache.hadoop.mapred.lib.MultipleOutputs.getCollector(MultipleOutputs.java:511)
        at org.apache.hadoop.mapred.lib.MultipleOutputs.getCollector(MultipleOutputs.java:476)
        at my.namespace.MyReducer.reduce(MyReducer.java:xxx)

Is there any known workaround for this? Is any work in progress to make MultipleOutputs use generic Hadoop serialization?

--
WBR, Mikhail Yakshin