[ https://issues.apache.org/jira/browse/CRUNCH-155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Wills updated CRUNCH-155:
------------------------------
Attachment: CRUNCH-155.patch
Ended up getting it done sooner than expected. The key idea is to avoid
triggering a MapReduce job when the code is asked to materialize an
InputCollection or InputTable, and instead to return a reference to the
underlying ReadableSource for that input. [~gabriel.reid], would you take a
pass at this one? All the integration tests pass, but the patch touches a lot
of code.
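
The short-circuit described above can be sketched roughly as follows. This is
not the code in CRUNCH-155.patch: every type below (MaterializeSketch,
PlanNode, DerivedCollection, SketchPipeline, the jobsRun counter) is a
hypothetical stand-in, and the ReadableSource and InputCollection here only
borrow the names of the Crunch classes without matching their real signatures.

```java
import java.util.List;

// Hypothetical, simplified model of the idea; not the actual Crunch classes.
final class MaterializeSketch {

    // Stand-in for a source whose records can be read without running a job.
    interface ReadableSource<T> {
        Iterable<T> read();
    }

    // Stand-in for any collection known to the pipeline.
    interface PlanNode<T> { }

    // A collection that is nothing more than a wrapper around an input source.
    static final class InputCollection<T> implements PlanNode<T> {
        private final ReadableSource<T> source;

        InputCollection(ReadableSource<T> source) {
            this.source = source;
        }

        ReadableSource<T> getSource() {
            return source;
        }
    }

    // A collection that is the result of some processing step.
    static final class DerivedCollection<T> implements PlanNode<T> {
        private final List<T> computed;

        DerivedCollection(List<T> computed) {
            this.computed = computed;
        }
    }

    static final class SketchPipeline {
        // Counts how many (simulated) jobs materialize() had to run.
        int jobsRun = 0;

        <T> Iterable<T> materialize(PlanNode<T> node) {
            if (node instanceof InputCollection) {
                // Pure input: read straight from the source, no job needed.
                return ((InputCollection<T>) node).getSource().read();
            }
            // Anything else still needs a job (simulated here by a counter).
            jobsRun++;
            return ((DerivedCollection<T>) node).computed;
        }
    }
}
```

In this sketch, materializing an InputCollection leaves jobsRun at zero, while
materializing a DerivedCollection increments it; the real patch may structure
the check quite differently.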
> Combining union and groupByKey causes ClassCastException in DoNode during
> planning
> ----------------------------------------------------------------------------------
>
> Key: CRUNCH-155
> URL: https://issues.apache.org/jira/browse/CRUNCH-155
> Project: Crunch
> Issue Type: Bug
> Affects Versions: 0.4.0
> Reporter: Tim van Heugten
> Assignee: Josh Wills
> Priority: Critical
> Attachments: ConcatGroupFn.java, CRUNCH-155.patch,
> FirstLetterKeyFn.java, UnionGbkTest.java, words.txt
>
>
> Create two PCollections.
> Create an output target for one of them.
> Create a table from each PCollection.
> Union the tables.
> Group the union.
> Process the grouped data into a PCollection.
> Materialize and run the pipeline.
> Causes: java.lang.ClassCastException:
> org.apache.crunch.types.writable.WritableType cannot be cast to
> org.apache.crunch.types.PGroupedTableType
> Note: unioning first and creating a table afterwards causes the same CCE.