[ 
https://issues.apache.org/jira/browse/HADOOP-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy reassigned HADOOP-1660:
-------------------------------------

    Assignee: Arun C Murthy

> add support for native library to DistributedCache
> --------------------------------------------------
>
>                 Key: HADOOP-1660
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1660
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>         Environment: unix (different handling would be required for windows)
>            Reporter: Alejandro Abdelnur
>            Assignee: Arun C Murthy
>
> Currently, if an M/R job depends on a JNI-based component, the dynamic
> library must be available on all the task nodes. This is not possible,
> especially when you have no control over the cluster machines and are just
> using the cluster as a service.
> It should be possible to specify, using the DistributedCache, which native
> libraries a job needs.
> For example, via a new method 'public void addLibrary(Path libraryPath,
> JobConf conf)'.
> The added libraries would make it to the local FS of the task nodes (the same
> way as cached resources), but instead of being part of the classpath they
> would be copied to a lib directory, and that lib directory would be added to
> the LD_LIBRARY_PATH of the task JVM.
> An alternative would be to set the '-Djava.library.path=' task JVM parameter
> to the lib directory above. However, this would break for libraries that
> depend on other libraries: the library being depended on would not be on the
> LD_LIBRARY_PATH, and the OS would fail to find it, because it is the OS
> loader, not the JVM, that loads the dependency.
> For uncached usage of native libraries, a special directory in the JAR could
> be used for native libraries. But I'd argue that the DistributedCache
> enhancement would be enough, and if somebody wants to use a native library
> s/he should use the DistributedCache. Or a JobConf addLibrary method that
> uses the DistributedCache under the hood at submission time.
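
The mechanism described above (copy the localized libraries into a lib
directory, then put that directory on the LD_LIBRARY_PATH of the task JVM)
could be sketched roughly as follows. This is only an illustration built on
plain JDK classes: NativeLibEnvSketch and its methods are hypothetical names,
not part of Hadoop, and the sketch does not show how DistributedCache or the
task runner would actually be changed.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; none of these names exist in Hadoop.
final class NativeLibEnvSketch {

    // Copy each already-localized library file into libDir, creating libDir if needed.
    static void stageLibraries(List<Path> localizedLibs, Path libDir) throws IOException {
        Files.createDirectories(libDir);
        for (Path lib : localizedLibs) {
            Files.copy(lib, libDir.resolve(lib.getFileName()),
                       StandardCopyOption.REPLACE_EXISTING);
        }
    }

    // Put libDir at the front of LD_LIBRARY_PATH in env and return the new value.
    static String prependLibraryPath(Map<String, String> env, Path libDir) {
        String existing = env.get("LD_LIBRARY_PATH");
        String updated = (existing == null || existing.isEmpty())
                ? libDir.toString()
                : libDir.toString() + File.pathSeparator + existing;
        env.put("LD_LIBRARY_PATH", updated);
        return updated;
    }

    // Build (but do not start) a child JVM launch whose environment includes libDir.
    static ProcessBuilder childJvm(Path libDir, List<String> command) {
        ProcessBuilder pb = new ProcessBuilder(command);
        prependLibraryPath(pb.environment(), libDir);
        return pb;
    }
}
```

Setting the variable in the child's environment, rather than only passing
'-Djava.library.path=', is what lets the OS loader resolve libraries that the
JNI library itself depends on.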

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
