Hi.

I want to invoke a MapReduce job from Pig - great, there's the MAPREDUCE 
operator for that.

But I'd also like to get some information out of the Pig context and pass it 
to the MR job: through the Hadoop context, the string args, etc., by any means 
necessary.
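
For concreteness, here is roughly what I have in mind (just a sketch; the jar, 
class, and paths are made up, and $MY_VALUE comes from Pig's parameter 
substitution):

    -- pass a Pig-side value to the native job as a plain string arg;
    -- the backtick section becomes the args to the jar's main()
    %declare MY_VALUE 'computed-before-the-script-runs'
    A = LOAD 'input' AS (line:chararray);
    B = MAPREDUCE 'mynativejob.jar'
            STORE A INTO '/tmp/mr_input'
            LOAD '/tmp/mr_output' AS (word:chararray, cnt:long)
            `com.example.MyNativeJob /tmp/mr_input /tmp/mr_output $MY_VALUE`;

The catch is that parameter substitution happens before the script executes, 
so this only covers values known up front, not anything computed during the 
run, which is why I'm fishing for other channels.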

Is there a way to do this that doesn't amount to "force some task to use a 
single reducer, dump the info you want to HDFS in that reducer... then read it 
back from HDFS in your MR job"? I suppose you don't strictly need to limit the 
number of reducers (or you could even do it in a mapper) if you don't mind 
having duplicate records in the temp file and telling the MR job to just use 
the first record it finds. Still, things like this smell like wasteful hacks 
to me. So... any better ideas?
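
In Pig Latin terms, the hack I'm describing would look roughly like this 
(again just a sketch, with made-up relation and path names):

    -- collapse to a single reducer and dump the side info to HDFS
    stats = LOAD 'events' AS (id:int, ts:long);
    g = GROUP stats ALL PARALLEL 1;           -- one group, one reducer
    info = FOREACH g GENERATE MAX(stats.ts);  -- the value I want to hand off
    STORE info INTO '/tmp/side_info';         -- lands in a single part file
    -- ...and the MR job would read /tmp/side_info/part-* in its setup()

That extra write-then-read round trip through HDFS is exactly what I'd like 
to avoid.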

Cheers,
Nate Segerilnd
