[
https://issues.apache.org/jira/browse/CRUNCH-539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gabriel Reid resolved CRUNCH-539.
---------------------------------
    Resolution: Fixed
      Assignee: Gabriel Reid
 Fix Version/s: 0.13.0
Pushed to master
> Use of TupleWritable.setConf fails in mapper/reducer
> ----------------------------------------------------
>
> Key: CRUNCH-539
> URL: https://issues.apache.org/jira/browse/CRUNCH-539
> Project: Crunch
> Issue Type: Bug
> Affects Versions: 0.12.0
> Reporter: Gabriel Reid
> Assignee: Gabriel Reid
> Fix For: 0.13.0
>
> Attachments: CRUNCH-539.patch
>
>
> In (at least) more recent versions of Hadoop 2, the implicit call to
> TupleWritable.setConf that happens when using TupleWritables fails with a
> ClassNotFoundException for (ironically) the TupleWritable class.
>
> This appears to be due to the way that ObjectInputStream resolves classes in
> its [resolveClass
> method|https://docs.oracle.com/javase/7/docs/api/java/io/ObjectInputStream.html#resolveClass(java.io.ObjectStreamClass)],
> together with the way that the context classloader is set within a Hadoop
> mapper or reducer.
>
> This is similar to PIG-2532.
>
> This can be reproduced in the local job tracker (at least) in Hadoop 2.7.0,
> but it can't be reproduced in Crunch integration tests (due to the
> classloading setup). It appears that this issue is only present in Crunch
> 0.12.
>
> The following code within a simple pipeline will cause this issue to occur:
> {code}
> PTable<String, Integer> yearTemperatures = ... /* Writable-based PTable */
> PTable<String, Integer> maxTemps = yearTemperatures
>     .groupByKey()
>     .combineValues(Aggregators.MAX_INTS())
>     .top(1); // LINE THAT CAUSES THE ERROR
> {code}
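For context, the usual workaround for this kind of resolveClass failure (the same pattern referenced in PIG-2532) is to resolve classes against the thread context classloader before falling back to ObjectInputStream's default lookup. The sketch below is a minimal illustration of that pattern, not the actual Crunch patch; the class name is hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectStreamClass;

// Hypothetical illustration of the context-classloader workaround:
// resolve deserialized classes via the thread context classloader first,
// so classes visible to the mapper/reducer (e.g. TupleWritable) are found
// even when ObjectInputStream's default resolution cannot see them.
class ContextClassLoaderObjectInputStream extends ObjectInputStream {

    ContextClassLoaderObjectInputStream(InputStream in) throws IOException {
        super(in);
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        ClassLoader ctx = Thread.currentThread().getContextClassLoader();
        if (ctx != null) {
            try {
                // Look the class up through the context classloader.
                return Class.forName(desc.getName(), false, ctx);
            } catch (ClassNotFoundException e) {
                // Fall through to the default resolution (which also
                // handles primitive type descriptors).
            }
        }
        return super.resolveClass(desc);
    }
}
```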
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)