Hi Tom,

I will have to use named-output. About your example DatasetTarget, is it safe to call setOutputFormat() explicitly here? I guess this may conflict with other targets that use the same trick. Is there a general approach we could take to get the OutputCommitter to work?

Hi Chao,
Crunch doesn't call the output committer explicitly itself; it's called by the MR framework as a normal part of running a job. However, in Crunch's MapReduceTarget#configureForMapReduce the output format is typically not set for the named-output case (which is the only case that is executed now, as I discovered in the thread mentioned below), so it defaults to FileOutputFormat, with its semantics. (This is why HBaseTarget calls FileOutputFormat.setOutputPath, which it wouldn't have to do if it set the output format explicitly to HBase's TableOutputFormat.)

Are you setting the HCatOutputFormat in the named-output case? In the Crunch Target I'm writing I've set the OutputFormat explicitly:

https://github.com/tomwhite/kite/blob/CDK-308-dataset-output-format/kite-data/kite-data-crunch/src/main/java/org/kitesdk/data/crunch/DatasetTarget.java#L106

Cheers,
Tom

On Thu, Feb 27, 2014 at 7:54 AM, Gabriel Reid <[email protected]> wrote:
> For reference, here's the link to the previous thread on this:
> http://mail-archives.apache.org/mod_mbox/crunch-dev/201401.mbox/%3cCAF-WD4Sig2n7yMxiZSji8trQy-8wfUy5_7dnKC=dksxmrfs...@mail.gmail.com%3e
>
> On Thu, Feb 27, 2014 at 7:56 AM, Josh Wills <[email protected]> wrote:
>> +tom
>>
>> Didn't Tom have a thing like this a little while ago?
>>
>>
>> On Wed, Feb 26, 2014 at 8:04 PM, Chao Shi <[email protected]> wrote:
>>
>>> Hi crunch devs,
>>>
>>> I'm developing a target wrapper for HCatOutputFormat, which uses a custom
>>> OutputCommitter to get results committed to Hive. It seems its
>>> OutputCommitter is not called at all. Looking into the code, I can't find
>>> where Crunch calls it. Is it really supported?
>>>
>>> Thanks,
>>> Chao
>>>
>>
>>
>>
>> --
>> Director of Data Science
>> Cloudera <http://www.cloudera.com>
>> Twitter: @josh_wills <http://twitter.com/josh_wills>
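
To illustrate the point in Tom's reply that the framework takes the committer from whichever output format the job is configured with, here is a minimal, self-contained sketch. It deliberately does not use the Hadoop or Crunch APIs: every type and method in it (SketchCommitter, SketchOutputFormat, DefaultFileFormat, CustomFormat, CommitterSketch.runJob) is a hypothetical stand-in, and the fallback to a file-based default is an assumption modelled on the behaviour described in the message, not a copy of the real code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a committer; NOT Hadoop's OutputCommitter.
interface SketchCommitter {
    void commitJob(List<String> log);
}

// Hypothetical stand-in for an output format that supplies its own committer.
interface SketchOutputFormat {
    SketchCommitter getCommitter();
}

// Plays the role of the file-based default used when no format is set on the job.
class DefaultFileFormat implements SketchOutputFormat {
    public SketchCommitter getCommitter() {
        return log -> log.add("file committer: promote temporary files");
    }
}

// Plays the role of a format with a custom committer (for example, one that publishes to a metastore).
class CustomFormat implements SketchOutputFormat {
    public SketchCommitter getCommitter() {
        return log -> log.add("custom committer: publish results");
    }
}

class CommitterSketch {
    // Models the framework side: it only asks the job-level format for a committer.
    // Passing null models a target that never set the output format explicitly.
    static List<String> runJob(SketchOutputFormat configured) {
        SketchOutputFormat effective =
            (configured != null) ? configured : new DefaultFileFormat();
        List<String> log = new ArrayList<>();
        effective.getCommitter().commitJob(log);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(runJob(null));               // format left unset
        System.out.println(runJob(new CustomFormat())); // format set explicitly
    }
}
```

In this model the custom committer only runs when the custom format is the one configured on the job, which is the situation the question about HCatOutputFormat in the named-output case is probing.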
