Hello,
I'm building a modified version of Spark Streaming (from a branch other
than master) to test with my application, but spark-submit seems to
ignore my newly built Spark Streaming .jar and use an older version.
Here's some context:
I'm on a different branch:
$ git branch
* SPARK-3276
  master
Then I build the Spark Streaming module that I've changed:
✔ ~/code/spark [SPARK-3276 L|✚ 1]
$ mvn --projects streaming/ -DskipTests install
It builds without problems, and when I check my local Maven repository,
I see the newly generated Spark Streaming jars:
$ ls -lh ~/.m2/repository/org/apache/spark/spark-streaming_2.10/1.4.0-SNAPSHOT/
total 3.3M
-rw-rw-r-- 1 emre emre 1.6K Apr 20 10:43 maven-metadata-local.xml
-rw-rw-r-- 1 emre emre 421 Apr 20 10:43 _remote.repositories
-rw-rw-r-- 1 emre emre 1.3M Apr 20 10:42 spark-streaming_2.10-1.4.0-SNAPSHOT.jar
-rw-rw-r-- 1 emre emre 622K Apr 20 10:43 spark-streaming_2.10-1.4.0-SNAPSHOT-javadoc.jar
-rw-rw-r-- 1 emre emre 6.7K Apr 20 10:42 spark-streaming_2.10-1.4.0-SNAPSHOT.pom
-rw-rw-r-- 1 emre emre 181K Apr 20 10:42 spark-streaming_2.10-1.4.0-SNAPSHOT-sources.jar
-rw-rw-r-- 1 emre emre 1.2M Apr 20 10:42 spark-streaming_2.10-1.4.0-SNAPSHOT-tests.jar
-rw-rw-r-- 1 emre emre 82K Apr 20 10:42 spark-streaming_2.10-1.4.0-SNAPSHOT-test-sources.jar
Then I build and run an application (in Java) that uses Spark Streaming. In
that test project's pom.xml I have
...
<properties>
  <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  <hadoop.version>2.4.0</hadoop.version>
  <spark.version>1.4.0-SNAPSHOT</spark.version>
</properties>
...
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming_2.10</artifactId>
  <version>${spark.version}</version>
  <scope>provided</scope>
</dependency>
And then I use
~/code/spark/bin/spark-submit
to submit my application. It starts fine and continues to run on my local
filesystem, but when I check the log messages on the console, I don't see
the changes I have made, and I *did* make changes, e.g. I changed some
logging messages. It is as if the submitted application is using the Spark
Streaming from the master branch rather than the one from
*branch SPARK-3276*.
Any ideas what might be causing this? Is there some form of caching? Or is
spark-submit using a different .jar for streaming? (Where?)
How can I see the effects of the changes I made to Spark Streaming in my
SPARK-3276 branch?
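
One way to narrow this down might be to print, from inside the submitted
application, the location a Spark Streaming class was actually loaded from.
The following is only a minimal sketch: the class name WhichJar and the
helper locationOf are hypothetical names made up for illustration, and the
default class name assumes org.apache.spark.streaming.StreamingContext is on
the classpath at run time. It relies only on the standard JDK calls
Class.getProtectionDomain() and ProtectionDomain.getCodeSource().

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical diagnostic helper (not part of Spark): reports where a class
// was loaded from, so the jar actually used at run time can be identified.
final class WhichJar {

    // Returns the jar or directory URL a class was loaded from, or a note
    // when the JVM reports no code source (e.g. for JDK bootstrap classes).
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        URL url = (src == null) ? null : src.getLocation();
        return (url == null) ? "(no code source reported)" : url.toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Assumption: the Spark Streaming classes are on the classpath when
        // no class name is passed as the first argument.
        String name = args.length > 0
                ? args[0]
                : "org.apache.spark.streaming.StreamingContext";
        System.out.println(name + " loaded from " + locationOf(Class.forName(name)));
    }
}
```

Calling WhichJar.locationOf(...) from the application's own main method,
rather than running WhichJar separately, would reflect the classpath that
spark-submit sets up for the job.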
--
Emre Sevinç