Re: Interpreter dependency not loading?

2016-03-08 Thread vincent gromakowski
Hi, clearly dependency management should be clarified, because it is not clear which method overrides which one, especially when you have a conflict; the order of libs in the classpath is important... On 9 Mar 2016 at 03:04, "mina lee" wrote: > Hi Chris, > > there are several

Re: Interpreter dependency not loading?

2016-03-08 Thread mina lee
Hi Chris, there are several ways to load dependencies into Zeppelin 0.5.5. Using %dep is one of them. If you want to do it by setting the spark.jars.packages property, the proper way is to edit your SPARK_HOME/conf/spark-defaults.conf and add the line below. (I assume that you set SPARK_HOME in
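
(The line itself is truncated in this snippet; a minimal sketch of what such a spark-defaults.conf entry would look like, assuming the same Maven coordinates Chris lists in his interpreter settings further down:

    # SPARK_HOME/conf/spark-defaults.conf -- comma-separated group:artifact:version coordinates, no quoting
    spark.jars.packages  org.apache.avro:avro:1.8.0,org.joda:joda-convert:1.8.1

On restart, the Spark interpreter resolves these from the local Maven repo and Maven Central and adds them to the driver and executor classpaths.)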

Re: Best practices of maintaining a long running SparkContext

2016-03-08 Thread Zhong Wang
+spark-users We are using Zeppelin (http://zeppelin.incubator.apache.org) as our UI to run Spark jobs. Zeppelin maintains a long-running SparkContext, and we run into a couple of issues: 1. Dynamic resource allocation keeps removing and registering executors, even though no jobs are running. 2.

Re: graphframes: errors adding dependencies with pyspark

2016-03-08 Thread Felix Cheung
Hi - this seems to be an issue with the way the Python code is imported from a jar or from a Spark package. I ran into the same issue. I tried but couldn't find any guideline on how a Spark package should make its Python bindings available. If you would open an issue at graphframes, I could chime in

Interpreter dependency not loading?

2016-03-08 Thread Chris Miller
Hi, I have a strange situation going on. I'm running Zeppelin 0.5.5 and Spark 1.6.0 (on Amazon EMR). I added this property to the interpreter settings (and restarted it): spark.jars.packages: org.apache.avro:avro:1.8.0,org.joda:joda-convert:1.8.1 The avro dependency loads fine and I'm able to
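
(For reference, the %dep route mina mentions above is the other common way to pull in such dependencies on Zeppelin 0.5.5; a short sketch, assuming the same joda-convert coordinate, run in its own paragraph before the Spark context is first used:

    %dep
    // must run before any %spark paragraph; restart the interpreter first if the context is already up
    z.reset()
    z.load("org.joda:joda-convert:1.8.1")

z.load takes the same group:artifact:version coordinates as spark.jars.packages.)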

Re: Best practices of maintaining a long running SparkContext

2016-03-08 Thread Deenar Toraskar
1) You should turn dynamic allocation on (see http://spark.apache.org/docs/latest/configuration.html#dynamic-allocation) to maximise utilisation of your cluster resources. This might be a reason you are seeing cached data disappearing. 2) If other processes cache data and the amount of data cached
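
(For anyone tuning a long-running Zeppelin context along these lines, a rough sketch of the relevant spark-defaults.conf settings; the property names are from the Spark configuration docs, and the values are only illustrative:

    # dynamic allocation needs the external shuffle service so shuffle files outlive removed executors
    spark.dynamicAllocation.enabled                     true
    spark.shuffle.service.enabled                       true
    # keep idle executors that hold cached data around longer before they are reclaimed
    spark.dynamicAllocation.cachedExecutorIdleTimeout   600s

Cached RDD blocks are still lost when an executor is removed, which is why the cachedExecutorIdleTimeout knob matters for the behaviour Zhong describes.)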