Thanks Josh. I'll open a JIRA.

On 5 Feb 2013, at 17:13, Josh Wills <[email protected]> wrote:
> Sounds useful, no way to do it now, I think.
>
> On Feb 5, 2013 12:00 PM, "Dave Beech" <[email protected]> wrote:
>> Hi all,
>>
>> Something I find myself doing reasonably often in mapreduce is to use
>> the reduce step as nothing more than a means to merge data into larger
>> files. Unless I've missed something in the API, there doesn't appear
>> to be a neat way to do this with Crunch. Here's what I have now:
>>
>> PGroupedTable<MyAvroRecord, Void> grouped =
>>     collection.parallelDo(new MapFn<MyAvroRecord, Pair<MyAvroRecord, Void>>() {
>>       @Override
>>       public Pair<MyAvroRecord, Void> map(MyAvroRecord input) {
>>         return Pair.of(input, null);
>>       }
>>     }, Avros.tableOf(Avros.specifics(MyAvroRecord.class),
>>         Avros.nulls())).groupByKey(4);
>>
>> pipeline.write(grouped,At.avroFile(MyAvroRecord.class));
>>
>> Is there a better way? Or if not, maybe we could have a utility
>> function to do this or similar?
>>
>> Thanks,
>> Dave
