Hi. I want to invoke a MapReduce job from Pig - great, there's an operator for that.
But I'd also like to get some information out of the Pig context and pass it to the MR job - through the Hadoop context, the string args, etc., by any means necessary. Is there a way to do this that doesn't amount to "force some task to use a single reducer, dump the info you want to HDFS in that reducer, then read it back from HDFS in your MR job"?

I guess, conceivably, you don't need to limit the number of reducers (or you could even do it in a mapper) if you don't mind having duplicates in the temp file and telling the MR job to just use the first record it finds. Still, things like this smell like wasteful hacks to me. So... any better ideas?

Cheers,
Nate Segerilnd
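P.S. For concreteness, here's roughly the shape of invocation I mean - a sketch only; `wordcount.jar`, `org.myorg.WordCount`, the paths, and `$my_setting` are all made-up names, and I'm assuming Pig's parameter substitution gets applied inside the backticked command:

```pig
-- Sketch: pass a value from the Pig side into the native MR job's
-- command-line args via parameter substitution.
-- Run as: pig -param my_setting=foo script.pig
A = LOAD 'input' AS (line:chararray);

-- The backticked string is the command line handed to the jar's main(),
-- so $my_setting would arrive in the MR job's String[] args.
B = MAPREDUCE 'wordcount.jar'
      STORE A INTO 'mr_input'
      LOAD 'mr_output' AS (word:chararray, cnt:int)
      `org.myorg.WordCount mr_input mr_output $my_setting`;

STORE B INTO 'result';
```

That works for values I know before the script runs, but it doesn't cover information produced *during* the Pig job - which is where the dump-to-HDFS hack comes in.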