Hi all,

My Spark runs on Mesos. I wrote a Spark Streaming app in Python; the code is on GitHub <https://github.com/adolphlwq/linkerProcessorSample>.
The app depends on "*org.apache.spark:spark-streaming-kafka_2.10:1.6.1*". Spark on Mesos has two important concepts: the Spark framework and the Spark executor. I set my executor to run in a Docker image. The image's Dockerfile <https://github.com/adolphlwq/linkerProcessorSample/blob/master/docker/Dockerfile> is below:

> # refer 'http://spark.apache.org/docs/latest/running-on-mesos.html#spark-properties' on 'spark.mesos.executor.docker.image' section
> FROM ubuntu:14.04
> WORKDIR /linker
> RUN ln -f -s /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
> #download mesos
> RUN echo "deb http://repos.mesosphere.io/ubuntu/ trusty main" > /etc/apt/sources.list.d/mesosphere.list && \
>     apt-key adv --keyserver keyserver.ubuntu.com --recv E56151BF && \
>     apt-get update && \
>     apt-get -y install mesos=0.28.1-2.0.20.ubuntu1404 openjdk-7-jre python-pip git vim curl
> RUN git clone https://github.com/adolphlwq/linkerProcessorSample.git && \
>     pip install -r linkerProcessorSample/docker/requirements.txt
> RUN curl -fL http://archive.apache.org/dist/spark/spark-1.6.0/spark-1.6.0-bin-hadoop2.6.tgz | tar xzf - -C /usr/local && \
>     apt-get clean
> ENV MESOS_NATIVE_JAVA_LIBRARY=/usr/lib/libmesos.so \
>     JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64 \
>     SPARK_HOME=/usr/local/spark-1.6.0-bin-hadoop2.6
> ENV PATH=$JAVA_HOME/bin:$PATH
> WORKDIR $SPARK_HOME

When I use the command below to submit my app:

> dcos spark run --submit-args='--packages org.apache.spark:spark-streaming-kafka_2.10:1.6.1 \
>     spark2cassandra.py zk topic' \
>     -docker-image=adolphlwq/mesos-for-spark-exector-image:1.6.0.beta

the executor Docker container starts successfully, but it does not contain the package *org.apache.spark:spark-streaming-kafka_2.10:1.6.1*. The *stderr* on Mesos is:

> I0713 09:34:52.715551 18124 logging.cpp:188] INFO level logging started!
> I0713 09:34:52.717797 18124 fetcher.cpp:424] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/6097419e-c2d0-4e5f-9a91-e5815de640c4-S4","items":[{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"\/home\/ubuntu\/spark2cassandra.py"}},{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"\/root\/.ivy2\/jars\/org.apache.spark_spark-streaming-kafka_2.10-1.6.0.jar"}},{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"\/root\/.ivy2\/jars\/com.101tec_zkclient-0.3.jar"}},{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"\/root\/.ivy2\/jars\/org.apache.kafka_kafka_2.10-0.8.2.1.jar"}},{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"\/root\/.ivy2\/jars\/org.slf4j_slf4j-api-1.7.10.jar"}},{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"\/root\/.ivy2\/jars\/org.spark-project.spark_unused-1.0.0.jar"}},{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"\/root\/.ivy2\/jars\/net.jpountz.lz4_lz4-1.3.0.jar"}},{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"\/root\/.ivy2\/jars\/log4j_log4j-1.2.17.jar"}},{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"\/root\/.ivy2\/jars\/com.yammer.metrics_metrics-core-2.2.0.jar"}},{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"\/root\/.ivy2\/jars\/org.apache.kafka_kafka-clients-0.8.2.1.jar"}},{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"\/root\/.ivy2\/jars\/org.xerial.snappy_snappy-java-1.1.2.jar"}}],"sandbox_directory":"\/tmp\/mesos\/slaves\/6097419e-c2d0-4e5f-9a91-e5815de640c4-S4\/frameworks\/7399b6f7-5dcd-4a9b-9846-e7948d5ffd11-0024\/executors\/driver-20160713093451-0015\/runs\/84419372-9482-4c58-8f87-4ba528b6885c"}
> I0713 09:34:52.719846 18124 fetcher.cpp:379] Fetching URI '/home/ubuntu/spark2cassandra.py'
> I0713 09:34:52.719866 18124 fetcher.cpp:250] Fetching directly into the sandbox directory
> I0713 09:34:52.719925 18124 fetcher.cpp:187] Fetching URI '/home/ubuntu/spark2cassandra.py'
> I0713 09:34:52.719945 18124 fetcher.cpp:167] Copying resource with command:cp '/home/ubuntu/spark2cassandra.py' '/tmp/mesos/slaves/6097419e-c2d0-4e5f-9a91-e5815de640c4-S4/frameworks/7399b6f7-5dcd-4a9b-9846-e7948d5ffd11-0024/executors/driver-20160713093451-0015/runs/84419372-9482-4c58-8f87-4ba528b6885c/spark2cassandra.py'
> W0713 09:34:52.722587 18124 fetcher.cpp:272] Copying instead of extracting resource from URI with 'extract' flag, because it does not seem to be an archive: /home/ubuntu/spark2cassandra.py
> I0713 09:34:52.724138 18124 fetcher.cpp:456] Fetched '/home/ubuntu/spark2cassandra.py' to '/tmp/mesos/slaves/6097419e-c2d0-4e5f-9a91-e5815de640c4-S4/frameworks/7399b6f7-5dcd-4a9b-9846-e7948d5ffd11-0024/executors/driver-20160713093451-0015/runs/84419372-9482-4c58-8f87-4ba528b6885c/spark2cassandra.py'
> I0713 09:34:52.724148 18124 fetcher.cpp:379] Fetching URI '/root/.ivy2/jars/org.apache.spark_spark-streaming-kafka_2.10-1.6.0.jar'
> I0713 09:34:52.724153 18124 fetcher.cpp:250] Fetching directly into the sandbox directory
> I0713 09:34:52.724162 18124 fetcher.cpp:187] Fetching URI '/root/.ivy2/jars/org.apache.spark_spark-streaming-kafka_2.10-1.6.0.jar'
> I0713 09:34:52.724171 18124 fetcher.cpp:167] Copying resource with command:cp '/root/.ivy2/jars/org.apache.spark_spark-streaming-kafka_2.10-1.6.0.jar' '/tmp/mesos/slaves/6097419e-c2d0-4e5f-9a91-e5815de640c4-S4/frameworks/7399b6f7-5dcd-4a9b-9846-e7948d5ffd11-0024/executors/driver-20160713093451-0015/runs/84419372-9482-4c58-8f87-4ba528b6885c/org.apache.spark_spark-streaming-kafka_2.10-1.6.0.jar'
> cp: cannot stat '/root/.ivy2/jars/org.apache.spark_spark-streaming-kafka_2.10-1.6.0.jar': No such file or directory
> Failed to fetch '/root/.ivy2/jars/org.apache.spark_spark-streaming-kafka_2.10-1.6.0.jar': Failed to copy with command 'cp '/root/.ivy2/jars/org.apache.spark_spark-streaming-kafka_2.10-1.6.0.jar'

Could somebody help me solve this dependency problem? Thanks.
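
In case it helps with diagnosis: the log suggests the fetcher is copying plain local paths (under /root/.ivy2/jars) that may simply not exist on the agent where it runs. Below is a minimal sketch I could run on an agent to check that, assuming the "Fetcher Info" JSON from stderr is pasted in as a string. The function name `missing_fetch_uris` and the `exists` parameter are my own, hypothetical names, not part of Mesos or Spark.

```python
import json
import os


def missing_fetch_uris(fetcher_info_json, exists=os.path.exists):
    """Return local-path URIs from a Mesos 'Fetcher Info' JSON blob
    that are not present on the host where this function runs.

    `exists` is injectable (hypothetical helper parameter) so the check
    can be exercised without touching the real filesystem.
    """
    info = json.loads(fetcher_info_json)  # json handles the "\/" escapes
    missing = []
    for item in info.get("items", []):
        value = item["uri"]["value"]
        # Only plain local paths are checked; anything with a scheme
        # (http://, hdfs://, ...) is skipped, since it is not copied with cp.
        if "://" in value:
            continue
        if not exists(value):
            missing.append(value)
    return missing
```

Calling `missing_fetch_uris(blob)` with the JSON from the stderr above would list every jar path the agent cannot see locally; an empty list would mean the problem lies elsewhere.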
--
Thanks & Best Regards
卢文泉 | Adolph Lu
TEL: +86 15651006559
Linker Networks (http://www.linkernetworks.com/)