[
https://issues.apache.org/jira/browse/CRUNCH-670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16936067#comment-16936067
]
David Ortiz edited comment on CRUNCH-670 at 9/23/19 5:38 PM:
-------------------------------------------------------------
[~jwills] here is a patch file combining your update and the updates I had
to make to AvroParquetPathPerKeyOutputFormat and AvroPathPerKeyOutputFormat to
get our stuff running on Spark. [^CRUNCH-670-pt2.patch]
was (Author: dortiz):
[~jwills] Finally got approval to post this from my employer. Here is a patch
file combining your update and the updates I had to make to
AvroParquetPathPerKeyOutputFormat and AvroPathPerKeyOutputFormat to get our
stuff running on Spark. [^CRUNCH-670-pt2.patch]
> Make the AvroPathPerKeyTarget work with the SparkRuntime
> --------------------------------------------------------
>
> Key: CRUNCH-670
> URL: https://issues.apache.org/jira/browse/CRUNCH-670
> Project: Crunch
> Issue Type: Improvement
> Reporter: Josh Wills
> Assignee: Josh Wills
> Priority: Major
> Attachments: CRUNCH-670-pt2.patch, CRUNCH-670.patch
>
>
> There is an issue where the AvroPathPerKeyTarget won't properly copy the
> output of a Spark pipeline from the temp directory to the target directory
> because it assumes it will always get a valid Crunch output index (0, 1, 2,
> ...), whereas the SparkRuntime passes -1 to the Target's output handler
> method (to signal that it's the only output for the job). I _think_ the right
> move is to have AvroPathPerKeyTarget rewrite a -1 index to 0 so as not to
> break any other implementations that depend on the SparkRuntime's behavior.
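
For readers following along, here is a minimal sketch of the index rewrite proposed in the description. It is not taken from either attached patch: the class name OutputIndexSketch, the constant SOLE_OUTPUT, and the method normalizeOutputIndex are all hypothetical names chosen for illustration, and the real change would presumably live inside AvroPathPerKeyTarget's output handling rather than in a standalone helper.

```java
// Hypothetical illustration only; none of these names come from Crunch itself.
final class OutputIndexSketch {

    // Assumption from the issue text: the SparkRuntime passes -1 to mean
    // "this is the only output for the job".
    static final int SOLE_OUTPUT = -1;

    // Map the -1 sentinel to index 0 and leave ordinary indexes (0, 1, 2, ...)
    // untouched, so callers that rely on the sentinel elsewhere are unaffected.
    static int normalizeOutputIndex(int index) {
        return index == SOLE_OUTPUT ? 0 : index;
    }

    public static void main(String[] args) {
        System.out.println(normalizeOutputIndex(-1)); // sentinel is rewritten to 0
        System.out.println(normalizeOutputIndex(2));  // regular index is unchanged
    }
}
```

Doing the rewrite locally in the target, rather than changing what the SparkRuntime passes, matches the reasoning in the description: other implementations that depend on receiving -1 keep working.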
--
This message was sent by Atlassian Jira
(v8.3.4#803005)