Arun,

> It's possible, but very tedious.
>
> Currently (0.20.xxx) unjars the job.jar for you, but that is going away in
> 0.23 (even 0.22 I guess). Even then, you'll have to manually figure the path,
> load it etc.
>
> OTOH, using the DC is supported. Even better, the native .so will be shared
> across jobs - so it's downloaded into the DC only once and re-used. I'd
> highly recommend that.

Thanks for the insights. And you are right, DC is more efficient and effective than the self-extracted jar. I'll push the solution to the system, to which the code will be delivered.

Jarod

On Mon, Jul 11, 2011 at 12:41 AM, Arun C Murthy <a...@hortonworks.com> wrote:
> Jarod,
>
> On Jul 10, 2011, at 3:13 PM, Donghan (Jarod) Wang wrote:
>
>> Hey Arun,
>>
>> Thank you for the reply. The way you mentioned requires setting up
>> native libraries somewhere on the hdfs before starting the job, which
>> is what I am trying to avoid. What I want is bundling the libraries
>> within the job JAR, in other words the libraries are shipped with the
>> JAR and need not be pre-installed on the system. And once the job gets
>> running, it extracts the lib from the job JAR and System.load it. I
>> wonder if it is possible.
>>
>
> It's possible, but very tedious.
>
> Currently (0.20.xxx) unjars the job.jar for you, but that is going away in
> 0.23 (even 0.22 I guess). Even then, you'll have to manually figure the path,
> load it etc.
>
> OTOH, using the DC is supported. Even better, the native .so will be shared
> across jobs - so it's downloaded into the DC only once and re-used. I'd
> highly recommend that.
>
> hth,
> Arun
>
>> Thanks,
>> Jarod
>>
>> On Sat, Jul 9, 2011 at 3:20 PM, Arun C Murthy <acmur...@apache.org> wrote:
>>> Jarod,
>>>
>>> On Jul 9, 2011, at 12:08 PM, Donghan (Jarod) Wang wrote:
>>>
>>>> Hey all,
>>>>
>>>> I'm working on a project that uses a native c library. Although I can
>>>> use DistributedCache as a way to distribute the c library, I'd like to
>>>> use the jar to do the job. What I mean is packing the c library into
>>>> the job jar, and writing code in a way that the job can find the
>>>> library once it gets submitted. I wonder if this is possible. If so
>>>> how can I obtain the path in the code.
>>>
>>>
>>> Just add it as a cache-file in the distributed cache, enable the
>>> symlink and just System.load the filename (of the symlink).
>>>
>>> More details:
>>> http://hadoop.apache.org/common/docs/r0.20.0/mapred_tutorial.html#DistributedCache
>>>
>>> hth,
>>> Arun
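
For reference, the jar-bundling route discussed in this thread (ship the native library inside the job JAR, extract it at runtime, then System.load it) might look roughly like the sketch below. It is only an illustration under assumptions: the class name BundledNativeLib and any resource path passed to it are hypothetical, nothing here is a Hadoop API, and it uses only the JDK. It does not show the DistributedCache route that Arun recommends.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper for loading a native library bundled in a JAR. */
final class BundledNativeLib {
    private BundledNativeLib() {}

    /**
     * Copies the stream into a new temporary directory under the given
     * file name and returns the absolute path of the copy.
     */
    static Path extract(InputStream in, String fileName) {
        try {
            Path dir = Files.createTempDirectory("nativelib");
            Path target = dir.resolve(fileName);
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            // Registered directory-first so the file is removed before it.
            dir.toFile().deleteOnExit();
            target.toFile().deleteOnExit();
            return target.toAbsolutePath();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Looks up the resource on the classpath of the anchor class (for
     * example a class inside the job JAR), extracts it, and loads it.
     */
    static void load(Class<?> anchor, String resource) {
        try (InputStream in = anchor.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalArgumentException(
                        "resource not found on classpath: " + resource);
            }
            String name = resource.substring(resource.lastIndexOf('/') + 1);
            // System.load needs an absolute path, which extract() returns.
            System.load(extract(in, name).toString());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A task could then call something like BundledNativeLib.load(MyMapper.class, "/native/libfoo.so") once during setup, where MyMapper and the resource path are placeholders. Every task extracts its own copy this way, which is part of why the DistributedCache approach, with its copy shared across jobs, is the one recommended above.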