Hi,

Spark 1.0 has been installed in standalone mode, but it can't read any compressed (CMX/Snappy) files or sequence files residing on HDFS. The key message is: "Unable to load native-hadoop library.....". Other related messages are:
Caused by: java.lang.IllegalStateException: Cannot load com.ibm.biginsights.compress.CmxDecompressor without native library!
    at com.ibm.biginsights.compress.CmxDecompressor.<clinit>(CmxDecompressor.java:65)

Here is the key part of core-site.xml:

<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec,com.ibm.biginsights.compress.CmxCodec</value>
</property>

Here is spark-env.sh:

export SPARK_WORKER_CORES=4
export SPARK_WORKER_MEMORY=10g
export SCALA_HOME=/opt/spark/scala-2.11.1
export JAVA_HOME=/opt/spark/jdk1.7.0_55
export SPARK_HOME=/opt/spark/spark-0.9.1-bin-hadoop2
export ADD_JARS=/opt/IHC/lib/compression.jar
export SPARK_CLASSPATH=/opt/IHC/lib/compression.jar
export SPARK_LIBRARY_PATH=/opt/IHC/lib/native/Linux-amd64-64/
export SPARK_MASTER_WEBUI_PORT=1080
export HADOOP_CONF_DIR=/opt/IHC/hadoop-conf

Note: CMX is an IBM-branded, splittable, LZO-based compression codec.

Any help is appreciated.

Thanks,
Nasir
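
P.S. In case it helps with diagnosis, here is a minimal sketch of how the directory named in SPARK_LIBRARY_PATH could be checked on each worker for the native libraries. The function name find_native_lib is hypothetical, and libhadoop.so and libsnappy.so are the usual Hadoop native library file names, assumed here rather than confirmed for this install. The file name of the CMX native library is not shown in this message, so it is left out.

```shell
#!/usr/bin/env bash
# Sketch only: print every directory entry in a colon-separated path list
# that contains the given shared-library file.
# find_native_lib is a hypothetical helper, not part of Spark or Hadoop.
find_native_lib() {
  local path_list="$1" lib_file="$2" dir
  local -a dirs
  # Split the path list on ':' into an array.
  IFS=':' read -r -a dirs <<< "$path_list"
  for dir in "${dirs[@]}"; do
    [ -n "$dir" ] || continue      # skip empty entries
    dir="${dir%/}"                 # drop a trailing slash, if any
    if [ -f "$dir/$lib_file" ]; then
      printf '%s\n' "$dir/$lib_file"
    fi
  done
}

# Falls back to the path from spark-env.sh above when the variable is unset.
# The library file names are assumptions (the usual Hadoop native names).
lib_path="${SPARK_LIBRARY_PATH:-/opt/IHC/lib/native/Linux-amd64-64/}"
find_native_lib "$lib_path" libhadoop.so
find_native_lib "$lib_path" libsnappy.so
```

If nothing is printed for a library, that file is not present in any directory on the list. A match only shows that the file exists, not that the JVM running the Spark worker or executor actually has that directory on its java.library.path.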