[
https://issues.apache.org/jira/browse/CRUNCH-412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Whiting updated CRUNCH-412:
---------------------------------
Attachment: AvroReadSimulator.java
I didn't manage to get very far hacking on Crunch itself, but here's the
(stupid) helper we've been using so far with individual DoFns only. It could
work in an integrated way if it were combined with the SingleUseIterable at
the point where they are combined in the HFunction. It's kinda specific to Avro
SpecificRecords though, so I'm guessing something a lot more intelligent could
delegate to the PType of the collection, but I can't really figure out how that
would work.
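
For anyone reading without the attachment, the general idea of simulating
object re-use can be sketched in plain Java. This is not the attached
AvroReadSimulator.java and uses no Crunch or Avro API: the names
ReuseSimulatorSketch, Event, ReusingIterable, simulate, buggyCollect and
safeCollect are all hypothetical, and Event is only a mutable stand-in for an
Avro SpecificRecord. The iterable hands out one shared instance and overwrites
its state on every call to next(), which is roughly what a reduce-side reader
that re-uses objects does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Supplier;

public class ReuseSimulatorSketch {

  /** Hypothetical mutable record, standing in for an Avro SpecificRecord. */
  static final class Event {
    String name;
    Event() {}
    Event(String name) { this.name = name; }
  }

  /** Hands out a single shared instance whose state is overwritten on each next(). */
  static final class ReusingIterable<T> implements Iterable<T> {
    private final List<T> source;
    private final Supplier<T> factory;       // creates the shared instance
    private final BiConsumer<T, T> copyInto; // copies state: (from, to)

    ReusingIterable(List<T> source, Supplier<T> factory, BiConsumer<T, T> copyInto) {
      this.source = source;
      this.factory = factory;
      this.copyInto = copyInto;
    }

    @Override
    public Iterator<T> iterator() {
      final Iterator<T> it = source.iterator();
      final T shared = factory.get();
      return new Iterator<T>() {
        @Override public boolean hasNext() { return it.hasNext(); }
        @Override public T next() {
          copyInto.accept(it.next(), shared); // overwrite the shared instance
          return shared;                      // always the same reference
        }
      };
    }
  }

  /** Wraps the input so that iteration re-uses one Event instance. */
  static Iterable<Event> simulate(List<Event> input) {
    return new ReusingIterable<>(input, Event::new, (from, to) -> to.name = from.name);
  }

  /** Buggy: holds on to references, which all point at the shared instance. */
  static List<String> buggyCollect(Iterable<Event> events) {
    List<Event> held = new ArrayList<>();
    for (Event e : events) {
      held.add(e);
    }
    List<String> names = new ArrayList<>();
    for (Event e : held) {
      names.add(e.name);
    }
    return names;
  }

  /** Safe: reads the value it needs before the iterator advances. */
  static List<String> safeCollect(Iterable<Event> events) {
    List<String> names = new ArrayList<>();
    for (Event e : events) {
      names.add(e.name);
    }
    return names;
  }

  public static void main(String[] args) {
    List<Event> input = Arrays.asList(new Event("a"), new Event("b"), new Event("c"));
    System.out.println(buggyCollect(simulate(input))); // prints [c, c, c]
    System.out.println(safeCollect(simulate(input)));  // prints [a, b, c]
  }
}
```

Code that passes with an ordinary in-memory list but breaks when fed through
simulate(...) is probably relying on each element being a distinct object.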
> MemPipeline mode for simulating MapReduce quirks
> ------------------------------------------------
>
> Key: CRUNCH-412
> URL: https://issues.apache.org/jira/browse/CRUNCH-412
> Project: Crunch
> Issue Type: Improvement
> Components: Core
> Reporter: Josh Wills
> Assignee: Josh Wills
> Attachments: AvroReadSimulator.java
>
>
> From a discussion on the mailing list, we'd like to have a MemPipeline mode
> that simulates a couple of the quirks of MapReduce/MRPipeline for more
> reliable testing, namely a mode that:
> 1) Uses shuffle code that re-uses reduce-side objects, so we can detect bugs
> caused by object modification, and
> 2) Serializes/deserializes DoFns before running them, in order to test for any
> non-serializable code that sneaks into a pipeline.
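
The second point in the description above, serializing and deserializing a
function before running it, can be approximated with standard Java
serialization. The sketch below is an assumption about what such a check could
look like, not how Crunch implements it: SerializationCheckSketch, roundTrip,
isSerializable, GoodFn and BadFn are hypothetical names, and GoodFn/BadFn are
plain Serializable classes standing in for DoFns.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationCheckSketch {

  /** Serializes obj to bytes and reads it back, returning the copy. */
  static <T extends Serializable> T roundTrip(T obj)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(obj);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      @SuppressWarnings("unchecked")
      T copy = (T) in.readObject();
      return copy;
    }
  }

  /** Hypothetical function whose state is fully serializable. */
  static class GoodFn implements Serializable {
    private static final long serialVersionUID = 1L;
    final String prefix;
    GoodFn(String prefix) { this.prefix = prefix; }
    String apply(String s) { return prefix + s; }
  }

  /** Hypothetical function holding a non-serializable field. */
  static class BadFn implements Serializable {
    private static final long serialVersionUID = 1L;
    final Object helper = new Object(); // java.lang.Object is not Serializable
  }

  /** Returns false if the round trip fails with NotSerializableException. */
  static boolean isSerializable(Serializable fn) {
    try {
      roundTrip(fn);
      return true;
    } catch (NotSerializableException e) {
      return false;
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) throws Exception {
    GoodFn copy = roundTrip(new GoodFn("x-"));
    System.out.println(copy.apply("1"));                // prints x-1
    System.out.println(isSerializable(new BadFn()));    // prints false
  }
}
```

Running the deserialized copy rather than the original instance is what makes
this useful in a test: state that does not survive the round trip shows up
immediately instead of only on a cluster.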