Couldn't you write a simple wrapper around your binary, ship the binary with the -file option, and use Streaming?
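For instance, something along these lines (the streaming jar location varies by Hadoop version, and wrapper.sh / mybinary are placeholder names; wrapper.sh would just exec ./mybinary and speak the streaming protocol on stdin/stdout):

    hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar \
        -input /path/to/input \
        -output /path/to/output \
        -mapper wrapper.sh \
        -file wrapper.sh \
        -file mybinary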
Or use the distributed cache to copy your binaries to all the compute nodes (rough sketch below the quoted message).

On Tue, Jan 10, 2012 at 5:01 PM, Daren Hasenkamp <dhasenk...@berkeley.edu> wrote:

> Hi,
>
> I would like to bundle a binary with a hadoop job and call it from inside
> the mappers/reducers.
>
> The binary is a C++ program that I do not want to re-implement in Java. I
> want to fork it as a subprocess from inside mappers/reducers and capture
> the output (on stdout).
>
> So, I need to get the binary onto the compute nodes and figure out how to
> call it. Ideally, the binary would be copied to the compute nodes
> alongside the job jar. (I'm not interested in solutions that involve
> copying the binary to the compute nodes by hand).
>
> Note that Streaming is not a solution here--the binary itself is not the
> mapper or reducer; the binary needs to be *called* from the
> mapper/reducer.
>
> Does anyone have experience with this? Any suggestions are much
> appreciated!
>
> -daren
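To make the distributed cache option concrete, here is a rough, untested sketch against the old (0.20/1.x) API; the HDFS path /apps/mybinary, the symlink name, and the mapper's key/value types are just placeholders for your setup:

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.URI;

    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;

    public class BinaryCallingJob {

        // Driver side: ship an already-uploaded HDFS file to every task node.
        // The "#mybinary" fragment plus createSymlink() makes it show up as
        // ./mybinary in each task's working directory.
        public static void configure(Job job) throws Exception {
            DistributedCache.addCacheFile(new URI("/apps/mybinary#mybinary"),
                                          job.getConfiguration());
            DistributedCache.createSymlink(job.getConfiguration());
        }

        public static class MyMapper
                extends Mapper<LongWritable, Text, Text, Text> {

            @Override
            protected void setup(Context context) {
                // Some versions drop the executable bit on cached files,
                // so set it explicitly before the first fork.
                new File("mybinary").setExecutable(true);
            }

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                // Fork the C++ binary as a subprocess, one call per record.
                Process p = new ProcessBuilder("./mybinary", value.toString())
                        .start();

                // Capture the binary's stdout line by line and emit it.
                BufferedReader out = new BufferedReader(
                        new InputStreamReader(p.getInputStream()));
                String line;
                while ((line = out.readLine()) != null) {
                    context.write(new Text(key.toString()), new Text(line));
                }
                if (p.waitFor() != 0) {
                    throw new IOException("mybinary exited abnormally");
                }
            }
        }
    }

Upload the binary once with something like `hadoop fs -put mybinary /apps/mybinary`, call configure() on the Job before submitting, and the framework handles the per-node copy. One caveat: if the binary writes much to stderr, drain p.getErrorStream() on a separate thread so the subprocess doesn't block on a full pipe.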