Arun,

> It's possible, but very tedious.
>
> Currently Hadoop (0.20.xxx) unjars the job.jar for you, but that is going
> away in 0.23 (even 0.22 I guess). Even then, you'll have to figure out the
> path manually, load it, etc.
>
> OTOH, using the DC is supported. Even better, the native .so will be shared 
> across jobs - so it's downloaded into the DC only once and re-used. I'd 
> highly recommend that.

Thanks for the insights. You are right, the DC is more efficient and
effective than a self-extracting jar. I'll push for this solution on the
system the code will be delivered to.
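
For reference, a minimal sketch of the DC approach described in the quoted
thread. The class name, the HDFS path and the library name libmynative.so
are made up for illustration. The two Hadoop calls
(DistributedCache.addCacheFile and DistributedCache.createSymlink from the
0.20 API) appear only as comments so the snippet compiles without the Hadoop
jars. The helpers just build the URI whose '#' fragment names the symlink,
and turn that name into the absolute path that System.load requires.

```java
import java.io.File;
import java.net.URI;

// Hypothetical helper; the class, path and library names are illustrative only.
final class NativeLibFromCache {

    // Build the cache-file URI. The '#fragment' is the symlink name that
    // should appear in the task's working directory.
    static URI cacheUriWithSymlink(String hdfsPath, String linkName) {
        return URI.create(hdfsPath + "#" + linkName);
    }

    // System.load needs an absolute path, so resolve the symlink name
    // against the current working directory.
    static String symlinkAbsolutePath(String linkName) {
        return new File(linkName).getAbsolutePath();
    }

    public static void main(String[] args) {
        URI lib = cacheUriWithSymlink(
                "hdfs://namenode:8020/libs/libmynative.so", "libmynative.so");

        // At job submission (Hadoop 0.20 API, needs the Hadoop jars):
        //   DistributedCache.addCacheFile(lib, conf);
        //   DistributedCache.createSymlink(conf);
        System.out.println("cache URI: " + lib);

        // Inside the task, e.g. in the mapper's setup code:
        //   System.load(symlinkAbsolutePath(lib.getFragment()));
        System.out.println("would load: " + symlinkAbsolutePath(lib.getFragment()));
    }
}
```

The library still has to be copied to HDFS once before the first job runs,
which is the trade-off against bundling it in the job jar.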

Jarod


On Mon, Jul 11, 2011 at 12:41 AM, Arun C Murthy <a...@hortonworks.com> wrote:
> Jarod,
>
> On Jul 10, 2011, at 3:13 PM, Donghan (Jarod) Wang wrote:
>
>> Hey Arun,
>>
>> Thank you for the reply. The way you mentioned requires setting up
>> native libraries somewhere on HDFS before starting the job, which
>> is what I am trying to avoid. What I want is to bundle the libraries
>> within the job JAR; in other words, the libraries are shipped with the
>> JAR and need not be pre-installed on the system. Once the job is
>> running, it extracts the lib from the job JAR and System.loads it. I
>> wonder if that is possible.
>>
>
> It's possible, but very tedious.
>
> Currently Hadoop (0.20.xxx) unjars the job.jar for you, but that is going
> away in 0.23 (even 0.22 I guess). Even then, you'll have to figure out the
> path manually, load it, etc.
>
> OTOH, using the DC is supported. Even better, the native .so will be shared 
> across jobs - so it's downloaded into the DC only once and re-used. I'd 
> highly recommend that.
>
> hth,
> Arun
>
>> Thanks,
>> Jarod
>>
>> On Sat, Jul 9, 2011 at 3:20 PM, Arun C Murthy <acmur...@apache.org> wrote:
>>> Jarod,
>>>
>>> On Jul 9, 2011, at 12:08 PM, Donghan (Jarod) Wang wrote:
>>>
>>>> Hey all,
>>>>
>>>> I'm working on a project that uses a native C library. Although I can
>>>> use DistributedCache as a way to distribute the C library, I'd like to
>>>> use the jar to do the job. What I mean is packing the C library into
>>>> the job jar, and writing code in a way that the job can find the
>>>> library once it gets submitted. I wonder if this is possible. If so,
>>>> how can I obtain the path in the code?
>>>
>>>
>>> Just add it as a cache-file in the distributed cache, enable the
>>> symlink and just System.load the filename (of the symlink).
>>>
>>> More details: 
>>> http://hadoop.apache.org/common/docs/r0.20.0/mapred_tutorial.html#DistributedCache
>>>
>>> hth,
>>> Arun
>
>
