Eyal,
Hope this is the one you are looking for:
http://www.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/
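If you go with option 1 from that post (distributed cache), the submission looks roughly like this — the jar names, paths, and class names below are only illustrative, not from your job:

```shell
# Sketch only: -libjars ships jars via the distributed cache and adds them
# to the task classpath; -archives ships an archive that is unpacked on each
# TaskTracker and symlinked into the task working directory under the
# fragment name (here "openCV"). Note -archives alone does NOT touch the
# classpath or java.library.path, so native libs need the -D below.
hadoop jar myjob.jar com.example.MyJob \
  -D mapred.child.java.opts=-Djava.library.path=./openCV \
  -libjars /local/libs/opencv-java.jar \
  -archives hdfs:///user/hadoop/openCV.tar#openCV \
  input output
```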

Regards
Bejoy.K.S


On Sat, Jan 7, 2012 at 12:25 PM, Eyal Golan <egola...@gmail.com> wrote:

> hi,
> can you please point out link to Cloudera's article?
>
> thanks,
>
> Eyal
>
>
> Eyal Golan
> egola...@gmail.com
>
> Visit: http://jvdrums.sourceforge.net/
> LinkedIn: http://www.linkedin.com/in/egolan74
> Skype: egolan74
>
> Save a tree. Please don't print this e-mail unless it's really necessary
>
>
>
> On Tue, Jan 3, 2012 at 5:28 PM, Samir Eljazovic <samir.eljazo...@gmail.com> wrote:
>
>> Hi,
>> yes, I'm trying to get option 1 from Cloudera's article (using the
>> distributed cache) to work. If I specify all the libraries individually when
>> running the job it works, but I'm trying to make it work using only one
>> archive file containing all the native libraries I need. And that seems to
>> be the problem.
>>
>> When I use a tar file, the libraries are extracted but they are not added
>> to the classpath.
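>>
>> For reference, the per-library invocation that does work for me looks
>> roughly like this (the paths and file names here are illustrative):
>>
>> ```shell
>> # Works, but gets messy with many dependencies: every jar and every
>> # native lib has to be listed individually on the command line.
>> hadoop jar myjob.jar com.example.MyJob \
>>   -libjars /local/libs/opencv-java.jar \
>>   -files /local/libs/libopencv_core.so,/local/libs/libopencv_java.so \
>>   input output
>> ```
>>
>> What I want instead is a single openCV.tar shipped with -archives that
>> ends up available to the tasks.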
>>
>> Here's TT log:
>>
>> 2012-01-03 15:04:43,611 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager (Thread-447): Creating openCV.tar in
>> /mnt/var/lib/hadoop/mapred/taskTracker/hadoop/distcache/8087259939901130551_1003999143_605667452/10.190.207.247/mnt/var/lib/hadoop/tmp/mapred/staging/hadoop/.staging/job_201201031358_0008/archives/openCV.tar-work--7133799918421346652 with rwxr-xr-x
>> 2012-01-03 15:04:44,209 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager (Thread-447): Extracting
>> /mnt/var/lib/hadoop/mapred/taskTracker/hadoop/distcache/8087259939901130551_1003999143_605667452/10.190.207.247/mnt/var/lib/hadoop/tmp/mapred/staging/hadoop/.staging/job_201201031358_0008/archives/openCV.tar-work--7133799918421346652/openCV.tar to
>> /mnt/var/lib/hadoop/mapred/taskTracker/hadoop/distcache/8087259939901130551_1003999143_605667452/10.190.207.247/mnt/var/lib/hadoop/tmp/mapred/staging/hadoop/.staging/job_201201031358_0008/archives/openCV.tar-work--7133799918421346652
>> 2012-01-03 15:04:44,363 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager (Thread-447): Cached
>> hdfs://10.190.207.247:9000/mnt/var/lib/hadoop/tmp/mapred/staging/hadoop/.staging/job_201201031358_0008/archives/openCV.tar#openCV.tar as
>> /mnt/var/lib/hadoop/mapred/taskTracker/hadoop/distcache/8087259939901130551_1003999143_605667452/10.190.207.247/mnt/var/lib/hadoop/tmp/mapred/staging/hadoop/.staging/job_201201031358_0008/archives/openCV.tar
>>
>> What should I do to get these libs available to my job?
>>
>> Thanks
>>
>>
>> On 3 January 2012 15:57, Praveen Sripati <praveensrip...@gmail.com> wrote:
>>
>>> Check this article from Cloudera for different options.
>>>
>>>
>>> http://www.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/
>>>
>>> Praveen
>>>
>>> On Tue, Jan 3, 2012 at 7:41 AM, Harsh J <ha...@cloudera.com> wrote:
>>>
>>>> Samir,
>>>>
>>>> I believe HARs won't work there. But you can use a regular tar instead,
>>>> and that should be unpacked properly.
>>>>
>>>> On 03-Jan-2012, at 5:38 AM, Samir Eljazovic wrote:
>>>>
>>>> > Hi,
>>>> > I need to provide a lot of third-party libraries (both Java and native),
>>>> > and doing that using the generic option parser (the -libjars and -files
>>>> > arguments) is a little messy. I was wondering if it is possible to wrap
>>>> > all the libraries into a single har archive and use that when submitting
>>>> > the job?
>>>> >
>>>> > Just to mention that I want to avoid putting all the libraries into the
>>>> > job jar for two reasons:
>>>> > 1. it does not work for native libs
>>>> > 2. it takes time to upload the jar
>>>> >
>>>> > Thanks,
>>>> > Samir
>>>>
>>>>
>>>
>>
>
