On Feb 10, 2009, at 11:06 AM, Mimi Sun wrote:
Hi,
I'm new to Hadoop and I'm wondering what the recommended method is
for using native libraries in mapred jobs.
I've tried the following separately:
1. set LD_LIBRARY_PATH in .bashrc
2. set LD_LIBRARY_PATH and JAVA_LIBRARY_PATH in hadoop-env.sh
3. set -Djava.library.path=... for mapred.child.java.opts
4. change bin/hadoop to include $LD_LIBRARY_PATH in addition to the
path it generates: HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=
$LD_LIBRARY_PATH:$JAVA_LIBRARY_PATH"
5. drop the .so files I need into hadoop/lib/native/...
1-3 didn't work; 4 and 5 did, but they seem like hacks. I also read
that I can do this with DistributedCache, but that seems like extra
work for loading libraries that are already present on each machine.
(I'm using the JNI libs for Berkeley DB.)
It seems there should be a way to configure java.library.path for
mapred jobs. Perhaps bin/hadoop should make use of LD_LIBRARY_PATH?
Thanks,
- Mimi

For what you are trying (i.e., given that the JNI libs are present on
all machines at a constant path), setting -Djava.library.path for the
child task via mapred.child.java.opts should work. What are you seeing?
Arun
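
[Ed.: Arun's suggestion (approach 3 above) amounts to a property like the following in hadoop-site.xml, or the equivalent set programmatically on the JobConf. This is a sketch, not a verified config from the thread: the library path /usr/local/BerkeleyDB/lib is a placeholder for wherever the Berkeley DB JNI libs actually live on the task nodes, and the heap size is the era's default.]

```xml
<!-- Sketch for hadoop-site.xml: JVM options passed to each map/reduce
     child task. The value is a single string, and setting this property
     replaces the default (-Xmx200m), so the heap option must be repeated
     alongside the library path. /usr/local/BerkeleyDB/lib is a
     placeholder path, not one taken from this thread. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx200m -Djava.library.path=/usr/local/BerkeleyDB/lib</value>
</property>
```

One common reason this "doesn't work" is exactly that overriding behavior: if the replacement value is malformed or drops options the tasks relied on, the child JVMs may fail in ways that look like the library path was ignored.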