Couldn't you write a simple wrapper script around your binary, ship both the
wrapper and the binary with the -file option, and use Streaming?
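
For example, a rough, untested sketch (mywrapper.sh, mybinary, and the
input/output paths are placeholders; the streaming jar location depends on
your install):

    #!/bin/sh
    # mywrapper.sh: acts as the streaming mapper. It reads input records
    # on stdin, hands each one to the binary, and lets the binary's
    # stdout become the map output.
    while read line; do
        ./mybinary "$line"
    done

Then submit with both files shipped via -file:

    hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar \
        -input /user/daren/input -output /user/daren/output \
        -mapper mywrapper.sh \
        -file mywrapper.sh -file mybinary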

Or use the distributed cache to copy your binaries to all the compute
nodes.
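
The distributed cache route also fits your "call it from inside the
mapper" requirement directly. Here is an untested sketch against the
0.20/1.x API; the HDFS path /apps/mybinary and the class/file names are
placeholders, and you may need to mark the binary executable in HDFS
before forking it:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;

    public class BinaryCallingJob {

        public static class BinaryMapper
                extends Mapper<LongWritable, Text, Text, Text> {

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                // "./mybinary" is the symlink the distributed cache created
                // in the task's working directory.
                ProcessBuilder pb =
                        new ProcessBuilder("./mybinary", value.toString());
                pb.redirectErrorStream(true);
                Process p = pb.start();

                // Capture the subprocess's stdout and emit it as map output.
                BufferedReader r = new BufferedReader(
                        new InputStreamReader(p.getInputStream()));
                StringBuilder out = new StringBuilder();
                String line;
                while ((line = r.readLine()) != null) {
                    out.append(line).append('\n');
                }
                int rc = p.waitFor();
                if (rc != 0) {
                    throw new IOException("mybinary exited with " + rc);
                }
                context.write(new Text(key.toString()),
                        new Text(out.toString()));
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Ship the binary; the "#mybinary" fragment names the symlink
            // that appears in each task's working directory.
            DistributedCache.addCacheFile(
                    new URI("hdfs:///apps/mybinary#mybinary"), conf);
            DistributedCache.createSymlink(conf);

            Job job = new Job(conf, "binary-calling-job");
            job.setJarByClass(BinaryCallingJob.class);
            job.setMapperClass(BinaryMapper.class);
            // ...set input/output formats and paths as usual...
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }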

On Tue, Jan 10, 2012 at 5:01 PM, Daren Hasenkamp <dhasenk...@berkeley.edu> wrote:

> Hi,
>
> I would like to bundle a binary with a Hadoop job and call it from inside
> the mappers/reducers.
>
> The binary is a C++ program that I do not want to re-implement in Java. I
> want to fork it as a subprocess from inside mappers/reducers and capture
> the output (on stdout).
>
> So, I need to get the binary onto the compute nodes and figure out how to
> call it. Ideally, the binary would be copied to the compute nodes
> alongside the job jar. (I'm not interested in solutions that involve
> copying the binary to the compute nodes by hand).
>
> Note that Streaming is not a solution here--the binary itself is not the
> mapper or reducer; the binary needs to be *called* from the
> mapper/reducer.
>
> Does anyone have experience with this? Any suggestions are much
> appreciated!
>
> -daren
>
>
