Hi All,

I'm trying to build a Java/Scala client application that will submit Spark
code snippets to a remote Spark cluster for execution. I am using Apache
Toree to achieve this. I downloaded the Apache Toree binary tar from
http://www.apache.org/dyn/closer.lua/incubator/toree/0.1.0-incubating/toree-bin/apache-toree-0.1.0-incubating-binary-release.tar.gz
on the edge node.

After untarring it on the edge node, I could successfully connect and start
the Apache Toree service with Spark on *YARN* through the edge node. *It
is now running as an Apache Toree job on my cluster.*

*PS: I did not install Toree using pip, and I did not install
Jupyter anywhere.*

Now, after this initial setup, I *need to write a Java/Scala client to
connect to this remotely running Apache Toree service* so as to submit and
execute Spark code snippets on the Spark cluster through the Apache
Toree service.


To write a client I'm referring to the following example in the project,
but things aren't falling into place for me:

https://github.com/apache/incubator-toree/blob/master/client/src/test/scala/examples/DocumentationExamples.scala

My question is: if the Apache Toree service is running remotely on one of
the edge nodes, reachable at 10.22.34.10:8042 or
http://example.com:8042, *where do I specify/configure this address in the
client code* so that the client connects to the remote
Apache Toree service and submits the Spark code to Spark for execution?

In a nutshell: how do we establish a connection between a client and the
Apache Toree service, i.e. how does the client know where to submit
the Spark code? I'm unable to find the configuration in the sample,
i.e. *DocumentationExamples.scala* (link above).
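
To make the question concrete: my understanding is that Jupyter-style
kernels are usually addressed through a connection profile (an IP, a
transport, five ports and a signing key) rather than a single host:port.
If Toree follows that convention, I imagine the client side would need
something like the JSON produced by the rough Java sketch below. The class
name, the port numbers and the empty key are placeholders I made up, and the
sketch does not call any Toree API; it only shows the shape of the settings
I expect to have to supply somewhere.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper (not part of Toree): builds a Jupyter-style
// connection profile as a JSON string.
final class ConnectionProfileSketch {

    // Builds the profile for a kernel at `ip`, assigning five consecutive
    // ports starting at `basePort`. The ports are placeholders, not
    // Toree defaults.
    static String toJson(String ip, int basePort, String key) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("ip", ip);
        fields.put("transport", "tcp");
        fields.put("signature_scheme", "hmac-sha256");
        fields.put("key", key);
        fields.put("shell_port", basePort);
        fields.put("iopub_port", basePort + 1);
        fields.put("stdin_port", basePort + 2);
        fields.put("control_port", basePort + 3);
        fields.put("hb_port", basePort + 4);

        return fields.entrySet().stream()
            .map(e -> "  \"" + e.getKey() + "\": " + render(e.getValue()))
            .collect(Collectors.joining(",\n", "{\n", "\n}"));
    }

    // Strings are quoted and integers are emitted bare. No escaping is
    // done, so this only suits simple values like the ones above.
    private static String render(Object value) {
        return (value instanceof String) ? "\"" + value + "\"" : String.valueOf(value);
    }

    public static void main(String[] args) {
        // Edge-node IP taken from my setup; port range and key are made up.
        System.out.println(toJson("10.22.34.10", 40000, ""));
    }
}
```

The idea would be to give the same values to the kernel when it starts and
to the client when it connects. Whether the Toree client accepts a profile
this way, and where it is passed in, is exactly what I'm trying to confirm.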

PS: I'm not using Jupyter anywhere in my use case; I'm simply running the
Toree service on Spark on YARN and writing a client in Java/Scala to
submit/execute Spark code.

*Please correct my steps above if I'm doing things the wrong
way, and let me know if I'm missing any important
configuration for my use case.*

A link to working Java/Scala sample code that connects to the running
Toree service and can submit and execute Spark code would be highly
appreciated.


Thank you!

Anchit
