Jarod,

On Jul 10, 2011, at 3:13 PM, Donghan (Jarod) Wang wrote:
> Hey Arun,
>
> Thank you for the reply. The way you mentioned requires setting up
> native libraries somewhere on HDFS before starting the job, which
> is what I am trying to avoid. What I want is to bundle the libraries
> within the job JAR; in other words, the libraries are shipped with the
> JAR and need not be pre-installed on the system. Once the job gets
> running, it extracts the lib from the job JAR and System.load()s it. I
> wonder if that is possible.
>

It's possible, but very tedious. Currently (0.20.xxx) the framework unjars the job.jar for you, but that is going away in 0.23 (and even 0.22, I guess). Even then, you'd have to figure out the path manually, load it, etc.

OTOH, using the DistributedCache is supported. Even better, the native .so will be shared across jobs - it is downloaded into the cache only once and re-used. I'd highly recommend that.

hth,
Arun

> Thanks,
> Jarod
>
> On Sat, Jul 9, 2011 at 3:20 PM, Arun C Murthy <acmur...@apache.org> wrote:
>> Jarod,
>>
>> On Jul 9, 2011, at 12:08 PM, Donghan (Jarod) Wang wrote:
>>
>>> Hey all,
>>>
>>> I'm working on a project that uses a native C library. Although I can
>>> use DistributedCache as a way to distribute the C library, I'd like to
>>> use the jar to do the job. What I mean is packing the C library into
>>> the job jar, and writing code in a way that the job can find the
>>> library once it gets submitted. I wonder if this is possible. If so,
>>> how can I obtain the path in the code?
>>
>>
>> Just add it as a cache-file in the distributed cache, enable the
>> symlink, and System.load the filename (of the symlink).
>>
>> More details:
>> http://hadoop.apache.org/common/docs/r0.20.0/mapred_tutorial.html#DistributedCache
>>
>> hth,
>> Arun
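
A minimal sketch of the "bundle it in the job jar" route Jarod describes: pack the .so into the jar as a classpath resource, copy it out to a local temp file at task start, and System.load it. The resource path, library name, and class name below are illustrative, not from the thread, and this is the tedious, unsupported path Arun cautions against.

    // Sketch: load a native library that was packed inside the job jar.
    // "/native/libfoo.so" and the class name are illustrative assumptions.
    import java.io.File;
    import java.io.InputStream;
    import java.nio.file.Files;
    import java.nio.file.StandardCopyOption;

    public class BundledNativeLoader {
      public static void loadFromJar() throws Exception {
        // Read the .so that was packed into the jar as a classpath resource.
        try (InputStream in =
                 BundledNativeLoader.class.getResourceAsStream("/native/libfoo.so")) {
          if (in == null) {
            throw new IllegalStateException("libfoo.so not found inside the job jar");
          }
          // Copy it to a local temp file, since System.load needs a real file path.
          File tmp = File.createTempFile("libfoo", ".so");
          tmp.deleteOnExit();
          Files.copy(in, tmp.toPath(), StandardCopyOption.REPLACE_EXISTING);
          System.load(tmp.getAbsolutePath());
        }
      }
    }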
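
A minimal sketch of the DistributedCache-plus-symlink approach Arun recommends, using the old org.apache.hadoop.filecache API from the 0.20 line that the linked tutorial covers; the HDFS path and symlink name are illustrative, and the .so is assumed to have been copied to HDFS beforehand.

    // Sketch: ship a native .so via the DistributedCache and load it in the task.
    // "hdfs:///libs/libfoo.so" and "libfoo.so" are illustrative assumptions.
    import java.io.File;
    import java.net.URI;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.mapred.JobConf;

    public class CachedNativeLibExample {
      // Driver side: register the cached file and enable symlink creation.
      public static void setup(JobConf conf) throws Exception {
        // The "#libfoo.so" fragment names the symlink created in each task's
        // working directory.
        DistributedCache.addCacheFile(
            new URI("hdfs:///libs/libfoo.so#libfoo.so"), conf);
        DistributedCache.createSymlink(conf);
      }

      // Task side (e.g. in the mapper's configure()): load via the symlink.
      public static void loadNativeLib() {
        System.load(new File("libfoo.so").getAbsolutePath());
      }
    }

Because the symlink lands in the task's working directory, System.load only needs the local file name resolved to an absolute path, and the cached .so is downloaded to each node once and re-used across jobs, as Arun notes.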