Hi, we're using MLlib (1.0.0 release) on a k-means clustering problem. We want to reduce the number of matrix columns with PCA before sending the points to the k-means solver.
It works on my Mac in local mode. spark-test-run-assembly-1.0.jar contains my application code, the com.github.fommil netlib code, and the netlib-native*.so files (plus the jnilib and dll files):

    spark-submit --class test.TestMllibPCA --master local[4] --executor-memory 3g --driver-memory 3g --driver-class-path /data/user/dump/spark-test-run-assembly-1.0.jar /data/user/dump/spark-test-run-assembly-1.0.jar /data/user/dump/user_fav_2014_04_09.csv.head1w

But if --driver-class-path is removed, these warnings appear:

    14/06/05 16:36:20 WARN LAPACK: Failed to load implementation from: com.github.fommil.netlib.NativeSystemLAPACK
    14/06/05 16:36:20 WARN LAPACK: Failed to load implementation from: com.github.fommil.netlib.NativeRefLAPACK

Setting SPARK_CLASSPATH=/data/user/dump/spark-test-run-assembly-1.0.jar also solves the problem.

The matrix contains sparse data with 6778 rows and 2487 columns. Computing PCA takes 10s with --driver-class-path and 47s without it, which suggests the native library is loaded and used in the first case.

Then I wanted to test it on a Spark standalone cluster (on CentOS), but it failed again.
After changing the JDK logging level to FINEST, I got these messages:

    14/06/05 16:19:15 INFO JniLoader: JNI LIB = netlib-native_system-linux-x86_64.so
    14/06/05 16:19:15 INFO JniLoader: extracting jar:file:/data/user/dump/spark-test-run-assembly-1.0.jar!/netlib-native_system-linux-x86_64.so to /tmp/jniloader6648403281987654682netlib-native_system-linux-x86_64.so
    14/06/05 16:19:15 WARN LAPACK: Failed to load implementation from: com.github.fommil.netlib.NativeSystemLAPACK
    14/06/05 16:19:15 INFO JniLoader: JNI LIB = netlib-native_ref-linux-x86_64.so
    14/06/05 16:19:15 INFO JniLoader: extracting jar:file:/data/user/dump/spark-test-run-assembly-1.0.jar!/netlib-native_ref-linux-x86_64.so to /tmp/jniloader2298588627398263902netlib-native_ref-linux-x86_64.so
    14/06/05 16:19:16 WARN LAPACK: Failed to load implementation from: com.github.fommil.netlib.NativeRefLAPACK
    14/06/05 16:19:16 INFO LAPACK: Implementation provided by class com.github.fommil.netlib.F2jLAPACK

So the .so files are found in the jar and extracted, but loading them fails and netlib falls back to the pure-Java F2jLAPACK implementation.

libgfortran, atlas, blas, lapack and arpack are all installed, and all of their .so files are located under /usr/lib64. spark.executor.extraLibraryPath is set to /usr/lib64 in conf/spark-defaults.conf, but none of this helps. I also tried adding --jars /data/user/dump/spark-test-run-assembly-1.0.jar, with no luck.

What should I try next? Does the native library need to be visible to both the driver and the executors? In local mode this seems to be a classpath problem, but for standalone and YARN mode it gets more complex. Detailed documentation on this would be really helpful.

Thanks.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Native-library-can-not-be-loaded-when-using-Mllib-PCA-tp7042.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
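
Since the logs show the .so files being extracted but then failing to load, one possible next check (not confirmed by anything in the logs above) is whether the bundled native libraries depend on shared libraries that the dynamic linker cannot resolve on the CentOS nodes, for example a specific libgfortran or BLAS/LAPACK soname. Below is a minimal sketch, assuming `ldd`, `unzip`, `awk` and `mktemp` are available on the node. The function names (`filter_missing`, `missing_deps`, `check_jar`) and the script name are hypothetical and not part of Spark or netlib-java.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Spark or netlib-java.

# Read `ldd` output on stdin and print the names of unresolved dependencies.
# ldd marks them as "libfoo.so.N => not found".
filter_missing() {
    awk '/=> not found/ { print $1 }'
}

# Print the unresolved dependencies of one shared library, one per line.
missing_deps() {
    ldd "$1" 2>/dev/null | filter_missing
}

# Extract the netlib-native Linux .so files from an assembly jar into a
# scratch directory and report unresolved dependencies for each of them.
check_jar() {
    jar_path=$1
    work_dir=$(mktemp -d) || return 1
    unzip -q -o "$jar_path" 'netlib-native_*-linux-x86_64.so' -d "$work_dir" \
        || { rm -rf "$work_dir"; return 1; }
    for lib in "$work_dir"/netlib-native_*-linux-x86_64.so; do
        [ -e "$lib" ] || continue
        echo "== $lib"
        missing=$(missing_deps "$lib")
        if [ -n "$missing" ]; then
            echo "$missing" | sed 's/^/  unresolved: /'
        else
            echo "  no unresolved dependencies reported"
        fi
    done
    rm -rf "$work_dir"
}

# Run only when a jar path is given, e.g.:
#   sh check_netlib.sh /data/user/dump/spark-test-run-assembly-1.0.jar
if [ "$#" -gt 0 ]; then
    check_jar "$1"
fi
```

This would need to be run on every machine where a JVM loads the library (the driver host and each worker). If it lists an unresolved name, that dependency is a candidate cause for the "Failed to load implementation" warnings. If it lists nothing, the cause is more likely elsewhere.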