Hi,

We're using MLlib (1.0.0 release) on a k-means clustering problem.
We want to reduce the number of matrix columns with PCA before sending the
points to the k-means solver.

It works on my Mac in local mode. spark-test-run-assembly-1.0.jar
contains my application code, the com.github.fommil netlib code, and the
netlib-native*.so files (plus the jnilib and dll files).

spark-submit --class test.TestMllibPCA --master local[4] --executor-memory
3g --driver-memory 3g --driver-class-path
/data/user/dump/spark-test-run-assembly-1.0.jar
/data/user/dump/spark-test-run-assembly-1.0.jar
/data/user/dump/user_fav_2014_04_09.csv.head1w 

But if --driver-class-path is removed, these warnings appear:
14/06/05 16:36:20 WARN LAPACK: Failed to load implementation from:
com.github.fommil.netlib.NativeSystemLAPACK
14/06/05 16:36:20 WARN LAPACK: Failed to load implementation from:
com.github.fommil.netlib.NativeRefLAPACK

Setting SPARK_CLASSPATH=/data/user/dump/spark-test-run-assembly-1.0.jar
also solves the problem.

The matrix contains sparse data with 6778 rows and 2487 columns. Computing
PCA takes 10s and 47s respectively (with and without --driver-class-path),
which suggests that the native library works well when it is loaded.

Then I tried it on a Spark standalone cluster (on CentOS), but it
failed again.
After changing the JDK logging level to FINEST, I got these messages:
14/06/05 16:19:15 INFO JniLoader: JNI LIB =
netlib-native_system-linux-x86_64.so
14/06/05 16:19:15 INFO JniLoader: extracting
jar:file:/data/user/dump/spark-test-run-assembly-1.0.jar!/netlib-native_system-linux-x86_64.so
to /tmp/jniloader6648403281987654682netlib-native_system-linux-x86_64.so
14/06/05 16:19:15 WARN LAPACK: Failed to load implementation from:
com.github.fommil.netlib.NativeSystemLAPACK
14/06/05 16:19:15 INFO JniLoader: JNI LIB =
netlib-native_ref-linux-x86_64.so
14/06/05 16:19:15 INFO JniLoader: extracting
jar:file:/data/user/dump/spark-test-run-assembly-1.0.jar!/netlib-native_ref-linux-x86_64.so
to /tmp/jniloader2298588627398263902netlib-native_ref-linux-x86_64.so
14/06/05 16:19:16 WARN LAPACK: Failed to load implementation from:
com.github.fommil.netlib.NativeRefLAPACK
14/06/05 16:19:16 INFO LAPACK: Implementation provided by class
com.github.fommil.netlib.F2jLAPACK

libgfortran, atlas, blas, lapack and arpack are all installed, and all of
the .so files are located under /usr/lib64. spark.executor.extraLibraryPath
is set to /usr/lib64 in conf/spark-defaults.conf, but none of this works. I
also tried adding --jars /data/user/dump/spark-test-run-assembly-1.0.jar,
with no luck.
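
The WARN lines above do not say why the native load failed. One check that
might narrow it down is to extract the .so from the assembly jar and ask the
dynamic linker which of its dependencies it cannot resolve. This is only a
sketch, assuming unzip, ldd and awk are available on the cluster node;
report_missing_deps and /tmp/netlib-check are hypothetical names chosen for
illustration, not part of Spark or netlib-java.

```shell
#!/usr/bin/env bash
# Sketch: filter `ldd` output down to the libraries the linker cannot find.
# report_missing_deps is a hypothetical helper, not a Spark or netlib-java tool.
report_missing_deps() {
  # Reads `ldd` output on stdin; prints the name of each unresolved library.
  awk '/=> not found/ { print $1 }'
}

# Intended use on a cluster node (jar path and .so name taken from the log):
#   unzip -o -j /data/user/dump/spark-test-run-assembly-1.0.jar \
#       'netlib-native_system-linux-x86_64.so' -d /tmp/netlib-check
#   ldd /tmp/netlib-check/netlib-native_system-linux-x86_64.so | report_missing_deps
```

If this prints nothing, every dependency was resolved and the failure
probably lies elsewhere; if it prints library names, those are the ones the
linker could not find on that node.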

What should I try next?

Does the native library need to be visible to both the driver and the
executors? In local mode the problem seems to be a classpath problem, but for
standalone and YARN mode it gets more complex. A detailed document would be
really helpful.

Thanks.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Native-library-can-not-be-loaded-when-using-Mllib-PCA-tp7042.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
