Adding to the advice given by others ... Spark 2.1.0 works with Scala 2.11, so set:
scalaVersion := "2.11.8"
When you see something like:
"org.apache.spark" % "spark-core_2.10" % "1.5.2"
that means that the `spark-core` library is compiled against Scala 2.10,
so you would have to change that to the `_2.11` artifact (with a Spark
version that was built for it).
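In a build.sbt that could look like this (a minimal sketch; the
"provided" scope is a common choice since the Dataproc cluster already
ships Spark):

scalaVersion := "2.11.8"

// %% appends the Scala binary version to the artifact name, so this
// resolves to spark-core_2.11
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.0" % "provided"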
Getting to the Spark web UI when Spark is running on Dataproc is not
that straightforward. Connecting to that web interface is a two-step
process:
1. create an SSH tunnel
2. configure the browser to use a SOCKS proxy to connect
The above steps are described here:
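In gcloud terms the two steps look roughly like this (the cluster name,
zone and local port are placeholders; Dataproc names the master node
<cluster>-m):

# 1. open an SSH tunnel that acts as a SOCKS proxy on local port 1080
gcloud compute ssh my-cluster-m --zone=us-central1-a -- -D 1080 -N

# 2. start a browser that routes its traffic through that proxy
#    (Chrome example; the separate user-data-dir keeps a clean profile)
google-chrome --proxy-server="socks5://localhost:1080" \
    --user-data-dir=/tmp/my-cluster-ui

You can then browse to http://my-cluster-m:8088 for the YARN UI, which
links through to the running Spark application's UI.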
On 13 January 2017 at 13:55, Anahita Talebi wrote:
> Hi,
>
> Thanks for your answer.
>
> I have chosen "Spark" as the "job type". There isn't any option where we can
> choose the version. How can I choose a different version?
There's "Preemptible workers, bucket,
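If you create the cluster from the command line instead, the image
version can be pinned there too; a sketch with placeholder names:

# --image-version picks the Dataproc image, which determines the
# bundled Spark version (see the Dataproc versioning docs for the
# mapping)
gcloud dataproc clusters create my-cluster --image-version=1.1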
Not knowing what the code that handles those arguments looks like, I
would, in the "Arguments" field for submitting a Dataproc job, put:
--trainFile=gs://Anahita/small_train.dat
--testFile=gs://Anahita/small_test.dat
--numFeatures=9947
--numRounds=100
... provided you still keep those files in the gs://Anahita bucket.
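Equivalently, submitting from the command line would look something
like this (the cluster name, region, main class and jar path are
assumptions; only the four application arguments come from above, and
everything after the bare `--` is passed to the application untouched):

gcloud dataproc jobs submit spark \
    --cluster=my-cluster \
    --region=us-central1 \
    --class=MyTrainer \
    --jars=gs://Anahita/my-app.jar \
    -- \
    --trainFile=gs://Anahita/small_train.dat \
    --testFile=gs://Anahita/small_test.dat \
    --numFeatures=9947 \
    --numRounds=100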
You can run a Spark app on Dataproc, which is Google's managed Spark and
Hadoop service:
https://cloud.google.com/dataproc/docs/
Basically, you:
* assemble a jar
* create a cluster
* submit a job to that cluster (with the jar)
* delete a cluster when the job is done
Before all that, one has to set up a Google Cloud project (with billing
enabled) and install the Cloud SDK, i.e. the gcloud command-line tool.
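With all that in place, the whole cycle looks roughly like this (all
names and the zone are placeholders):

sbt assembly   # builds e.g. target/scala-2.11/my-app-assembly-0.1.jar

gcloud dataproc clusters create my-cluster --zone=us-central1-a

gcloud dataproc jobs submit spark \
    --cluster=my-cluster \
    --class=MyMainClass \
    --jars=target/scala-2.11/my-app-assembly-0.1.jar

gcloud dataproc clusters delete my-cluster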