If you are using Maven to manage your jar dependencies, the jar files are located in the local Maven repository in your home directory. It is usually the ~/.m2/repository directory.
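If you want to confirm which Spark jars Maven has pulled down, here is a minimal sketch (assuming the default repository location; adjust the path if your settings.xml points somewhere else):

from pathlib import Path

# Default local Maven repository; change this if you have
# configured a custom <localRepository> in settings.xml.
repo = Path.home() / ".m2" / "repository"

# List every Spark artifact jar found in the repository.
for jar in sorted(repo.rglob("spark-*.jar")):
    print(jar)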

Hope this helps.

-ND

On 6/23/20 3:21 PM, Anwar AliKhan wrote:
Hi,

I prefer to do most of my projects in Python and for that I use Jupyter.
I have been downloading the compiled version of Spark.

I do not normally like the source code version because the build process makes me nervous.
You know, with lines of stuff scrolling up the screen.
What am I going to do if a build fails? I am a user!

I decided to risk it, and it was only one mvn command to build (45 minutes later).
Everything is great. Success.

I removed all JVMs except JDK 8 for the compilation.

I used JDK 8 so I know which libraries were linked in the build process.
I also used my local version of Maven, not the apt-installed version.

I used JDK 8 because if you go to this Scala site, http://scala-ide.org/download/sdk.html, they say the requirement is JDK 8 for the IDE, even for Scala 2.12.
They don't say JDK 8 or higher, just JDK 8.

So anyway, once in a while I do Spark projects in Scala with Eclipse.

For that I don't use Maven or anything. I prefer to make use of the build path
and external jars. This way I know exactly which libraries I am linking to.

Creating a jar in Eclipse is straightforward for spark-submit.


Anyway, as you can see (below), I am pointing Jupyter at Spark with findspark.init('/opt/spark').
That's OK, everything is fine.

With the compiled version of Spark there is a jars directory, which I have been using in Eclipse.
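For reference, this is the directory I mean (a sketch, assuming the pre-built distribution is unpacked at /opt/spark):

from pathlib import Path

# The pre-built Spark distribution ships its runtime
# libraries in a top-level jars/ directory.
for jar in sorted(Path("/opt/spark/jars").glob("*.jar")):
    print(jar.name)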

With my own compiled-from-source version there is no jars directory.


Where have all the jars gone?

I am not sure how findspark.init('/opt/spark') is locating the libraries, unless it is finding them from Anaconda.


import findspark
findspark.init('/opt/spark')  # must run before importing pyspark

from pyspark.sql import SparkSession

# Start (or reuse) a local Spark session.
spark = SparkSession \
    .builder \
    .appName('Titanic Data') \
    .getOrCreate()
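One way to check where the libraries are actually coming from, run in the same session as above (a sketch; it assumes findspark.init() exports SPARK_HOME, which it did in the versions I have used):

import os
import pyspark

print(spark.version)                      # version of the running session
print(os.environ.get("SPARK_HOME"))       # findspark.init() sets this
print(os.path.dirname(pyspark.__file__))  # if this path is under Anaconda's
                                          # site-packages, the Python bindings
                                          # came from there rather than /opt/spark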
