Hi,

we are having problems using a custom hadoop lib in a spark image when
running it on a kubernetes cluster, while following the steps of the
documentation. Details in the description below. Has anyone else had
similar problems? Is there something missing in the setup below, or is
this a bug?

Hadoop-free spark on kubernetes: using custom hadoop libraries in a
spark image does not work when following the steps of the
documentation (*) for running SparkPi on a kubernetes cluster.

(*) Usage of hadoop-free build:
https://spark.apache.org/docs/2.4.0/hadoop-provided.html

Steps:

1. Download hadoop-free spark:
   https://archive.apache.org/dist/spark/spark-2.4.0/spark-2.4.0-bin-without-hadoop.tgz
2. Build a spark image without hadoop from it with docker-image-tool.sh.
3. Create a Dockerfile that adds a layer on top of the spark image
   without hadoop, adding a custom hadoop (see Dockerfile and
   conf/spark-env.sh below).
4. Use the custom hadoop spark image to run the spark examples
   (see k8s submit command below).
5. This produces a JNI error (see message below); expected instead is
   the computation of pi.

Dockerfile:

    # Spark base image built and pushed via:
    #   $SPARK_HOME/bin/docker-image-tool.sh -r <some repo> -t sometag build
    #   $SPARK_HOME/bin/docker-image-tool.sh -r <some repo> -t sometag push
    #
    # Build this Dockerfile with:
    #   docker build -t reg../...:1.0.0 .

    # Use spark 2.4.0 without hadoop as the base image
    FROM registry/spark-without-hadoop:2.4.0

    ENV SPARK_HOME /opt/spark

    # Set up the custom hadoop
    COPY libs/hadoop-2.9.2 /opt/hadoop
    ENV HADOOP_HOME /opt/hadoop
    COPY conf/spark-env.sh ${SPARK_HOME}/conf/spark-env.sh

    WORKDIR /opt/spark/work-dir

conf/spark-env.sh:

    #!/usr/bin/env bash

    # echo commands to the terminal output
    set -ex

    # With explicit path to the 'hadoop' binary
    export SPARK_DIST_CLASSPATH=$($HADOOP_HOME/bin/hadoop classpath)
    echo $SPARK_DIST_CLASSPATH

Submit command to kubernetes:

    $SPARK_HOME/bin/spark-submit \
      --master k8s://... \
      --name sparkpiexample-custom-hadoop-original \
      --deploy-mode cluster \
      --class org.apache.spark.examples.SparkPi \
      --conf spark.executor.instances=2 \
      --conf spark.executor.extraJavaOptions="-XX:+UseG1GC -XX:-ResizePLAB" \
      --conf spark.kubernetes.memoryOverheadFactor=0.2 \
      --conf spark.kubernetes.container.image=registry/spark-custom-hadoop-original:1.0.0 \
      --conf spark.kubernetes.container.image.pullSecrets=... \
      --conf spark.kubernetes.container.image.pullPolicy=Always \
      --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
      local:///opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar

Retrieved error message:

    ++ id -u
    + myuid=0
    ++ id -g
    + mygid=0
    + set +e
    ++ getent passwd 0
    + uidentry=root:x:0:0:root:/root:/bin/ash
    + set -e
    + '[' -z root:x:0:0:root:/root:/bin/ash ']'
    + SPARK_K8S_CMD=driver
    + case "$SPARK_K8S_CMD" in
    + shift 1
    + SPARK_CLASSPATH=':/opt/spark/jars/*'
    + env
    + grep SPARK_JAVA_OPT_
    + sort -t_ -k4 -n
    + sed 's/[^=]*=\(.*\)/\1/g'
    + readarray -t SPARK_EXECUTOR_JAVA_OPTS
    + '[' -n '' ']'
    + '[' -n '' ']'
    + PYSPARK_ARGS=
    + '[' -n '' ']'
    + R_ARGS=
    + '[' -n '' ']'
    + '[' '' == 2 ']'
    + '[' '' == 3 ']'
    + case "$SPARK_K8S_CMD" in
    + CMD=("$SPARK_HOME/bin/spark-submit" --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" --deploy-mode client "$@")
    + exec /sbin/tini -s -- /opt/spark/bin/spark-submit --conf spark.driver.bindAddress=100.96.6.123 --deploy-mode client --properties-file /opt/spark/conf/spark.properties --class org.apache.spark.examples.SparkPi spark-internal
    Error: A JNI error has occurred, please check your installation and try again
    Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/Logger
        at java.lang.Class.getDeclaredMethods0(Native Method)
        at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
        at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
        at java.lang.Class.getMethod0(Class.java:3018)
        at java.lang.Class.getMethod(Class.java:1784)
        at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
        at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
    Caused by: java.lang.ClassNotFoundException: org.slf4j.Logger
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 7 more

Regards,

Tobias Sommer
M.Sc. (Uni)
Team eso-IN-Swarm
Software Engineer

e.solutions GmbH
Despag-Str. 4a, 85055 Ingolstadt, Germany
Phone +49-8458-3332-1219
Fax +49-8458-3332-2219
tobias.som...@esolutions.de

Registered Office: Despag-Str. 4a, 85055 Ingolstadt, Germany
e.solutions GmbH Managing Directors Uwe Reder, Dr. Riclef Schmidt-Clausen
Register Court Ingolstadt HRB 5221
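P.S. One detail in the entrypoint trace that stands out to me: SPARK_CLASSPATH
is set to ':/opt/spark/jars/*' only, i.e. nothing from SPARK_DIST_CLASSPATH
appears in it, even though conf/spark-env.sh exports it. Since the hadoop-free
build gets org.slf4j.Logger from hadoop's share/hadoop/common/lib, that would
explain the NoClassDefFoundError. The sketch below is purely illustrative (the
hadoop paths are example values for a hadoop-2.9.2 layout, not taken from a
real container, and this is not Spark's actual entrypoint code); it just
contrasts the classpath we see in the trace with the one the driver JVM would
need:

```shell
#!/bin/sh
# Illustrative sketch only -- not Spark's real entrypoint.

# What the container trace above shows (no hadoop jars):
SPARK_CLASSPATH=':/opt/spark/jars/*'

# Example of what conf/spark-env.sh computes via `hadoop classpath`
# (the real value is much longer):
SPARK_DIST_CLASSPATH='/opt/hadoop/etc/hadoop:/opt/hadoop/share/hadoop/common/lib/*'

# The classpath the driver JVM would need so that org.slf4j.Logger
# (shipped in hadoop's common/lib) is resolvable:
EXPECTED_CLASSPATH="$SPARK_CLASSPATH:$SPARK_DIST_CLASSPATH"
echo "$EXPECTED_CLASSPATH"
```

So my current suspicion is that spark-env.sh is either not sourced by the
container entrypoint, or its SPARK_DIST_CLASSPATH is not appended to
SPARK_CLASSPATH before the driver JVM starts -- happy to be corrected if the
setup above is simply wrong.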