I have looked at the SparkInterpreter.java code, and this is indeed the issue. Whenever an instruction uses something like z.input("..."), no Spark transformation can work, because z is shipped to the slaves, where Zeppelin is not installed, as shown by the example I sent. A workaround could be to interpret the variables separately (by defining a map of variables before interpreting).
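On the user side, the failure mode can be reproduced without Spark at all by Java-serializing closures directly, which is essentially what Spark does before shipping them to the slaves. The sketch below is only an illustration: `FakeZeppelinContext` and `Cell` are hypothetical stand-ins for ZeppelinContext and the wrapper object the REPL generates around a cell, not real Zeppelin classes.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for ZeppelinContext: present in the enclosing
// scope but not Serializable, like a class missing on the executors.
class FakeZeppelinContext {
  def input(name: String): String = "Y" // stub for z.input(...)
}

// Hypothetical stand-in for the per-cell wrapper object the REPL
// generates; like the real wrapper, it is not Serializable.
class Cell {
  val z = new FakeZeppelinContext
  val s = z.input("Foo") // a "top-level" val is still a field of the wrapper

  // Referencing the field from the closure captures the whole wrapper,
  // z included -- this mirrors the failing cell 1 in the thread below.
  def badFilter: String => Boolean = line => line.contains(s)

  // Workaround sketch: copy the field into a genuine local first; the
  // returned closure then captures only the String.
  def goodFilter: String => Boolean = {
    val needle = s
    line => line.contains(needle)
  }
}

object CaptureDemo {
  // Java-serializes a closure, as Spark does before shipping it.
  def serializable(x: AnyRef): Boolean =
    try { new ObjectOutputStream(new ByteArrayOutputStream).writeObject(x); true }
    catch { case _: NotSerializableException => false }

  def main(args: Array[String]): Unit = {
    val cell = new Cell
    println(serializable(cell.badFilter))  // drags Cell (and z) along
    println(serializable(cell.goodFilter)) // only the String is captured
  }
}
```

Under this model, a closure that reads the value through a genuine local survives serialization, while one that reaches back to the wrapper field does not, which matches the difference between the two cells reported in the thread.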
Best,

David

On Mon, Aug 24, 2015 at 6:45 PM, David Salinas <david.salinas....@gmail.com> wrote:

> Hi Moon,
>
> I found another way to reproduce the problem:
>
> // cell 1 does not work
> val file = "hdfs://someclusterfile.json"
> val s = z.input("Foo").toString
> val textFile = sc.textFile(file)
> textFile.filter(_.contains(s)).count
> // org.apache.spark.SparkException: Job aborted due to stage failure: Task 41
> // in stage 5.0 failed 4 times, most recent failure: Lost task 41.3 in
> // stage 5.0 (TID 2735, XXX.com): java.lang.NoClassDefFoundError:
> // Lorg/apache/zeppelin/spark/ZeppelinContext;
>
> // cell 2 works
> val file = "hdfs://someclusterfile.json"
> val s = "Y"
> val textFile = sc.textFile(file)
> textFile.filter(_.contains(s)).count
> // res19: Long = 109
>
> This kind of issue also happens often when using variables from other
> cells, and when closures are taken for transformations. Maybe reading
> variables inside the transformation with something like z.get("s") causes
> z to be sent to the slaves because one of its members is used (although I
> sometimes have this issue without using anything from other cells).
>
> Best,
>
> David
>
> On Mon, Aug 24, 2015 at 10:34 AM, David Salinas <david.salinas....@gmail.com> wrote:
>
>> Sorry, I forgot to mention my environment:
>> Mesos 0.17, Spark 1.4.1, Scala 2.10.4, Java 1.8
>>
>> On Mon, Aug 24, 2015 at 10:32 AM, David Salinas <david.salinas....@gmail.com> wrote:
>>
>>> Hi Moon,
>>>
>>> Today I cannot reproduce the bug with an elementary example either, but
>>> it is still impacting all my notebooks. The weird thing is that when
>>> calling a transformation with map, the Zeppelin context is taken into
>>> the closure, which gives these java.lang.NoClassDefFoundError:
>>> Lorg/apache/zeppelin/spark/ZeppelinContext errors (spark-shell runs the
>>> same command without any problem). I will try to find another example
>>> that is more persistent (it is weird that this example was failing
>>> yesterday). Do you have any idea of what could cause the Zeppelin
>>> context to be included in the closure?
>>>
>>> Best,
>>>
>>> David
>>>
>>> On Fri, Aug 21, 2015 at 6:29 PM, moon soo Lee <m...@apache.org> wrote:
>>>
>>>> I have tested your code and cannot reproduce the problem.
>>>>
>>>> Could you share your environment? How did you configure Zeppelin with
>>>> Spark?
>>>>
>>>> Thanks,
>>>> moon
>>>>
>>>> On Fri, Aug 21, 2015 at 2:25 AM, David Salinas <david.salinas....@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have a problem when using Spark closures. This error did not appear
>>>>> with Spark 1.2.1.
>>>>>
>>>>> I have included a reproducible example that fails when the closure is
>>>>> taken (Zeppelin has been built from the head of master with this
>>>>> command: mvn install -DskipTests -Pspark-1.4 -Dspark.version=1.4.1
>>>>> -Dhadoop.version=2.2.0 -Dprotobuf.version=2.5.0). Has anyone ever
>>>>> encountered this problem? All my previous notebooks are broken by
>>>>> this :(
>>>>>
>>>>> ------------------------------
>>>>> val textFile = sc.textFile("hdfs://somefile.txt")
>>>>>
>>>>> val f = (s: String) => s + s
>>>>> textFile.map(f).count
>>>>> // works fine
>>>>> // res145: Long = 407
>>>>>
>>>>> def f(s: String) = {
>>>>>   s + s
>>>>> }
>>>>> textFile.map(f).count
>>>>>
>>>>> // fails ->
>>>>>
>>>>> org.apache.spark.SparkException: Job aborted due to stage failure:
>>>>> Task 566 in stage 87.0 failed 4 times, most recent failure: Lost task
>>>>> 566.3 in stage 87.0 (TID 43396, XXX.com): java.lang.NoClassDefFoundError:
>>>>> Lorg/apache/zeppelin/spark/ZeppelinContext; at
>>>>> java.lang.Class.getDeclaredFields0(Native Method) at
>>>>> java.lang.Class.privateGetDeclaredFields(Class.java:2583) at
>>>>> java.lang.Class.getDeclaredField(Class.java:2068) ...
>>>>> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924) at
>>>>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801) at
>>>>> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) at
>>>>> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000) at
>>>>> ...
>>>>>
>>>>> Best,
>>>>>
>>>>> David
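As a footnote on the `val f` vs `def f` difference in the last example of this thread: a function literal with no free variables captures nothing, while passing a method such as `def f(s: String)` to `map` eta-expands it at the call site into roughly `s => this.f(s)`, so the enclosing instance (in a notebook, the REPL line object that holds `z`) is captured and serialized with it. A minimal Spark-free sketch of that difference; `LineWrapper` is a hypothetical stand-in for the REPL-generated wrapper, not a real class.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for ZeppelinContext: not Serializable, like a
// class that is missing on the executors.
class FakeZeppelinContext

// Hypothetical stand-in for the REPL's per-line wrapper object.
class LineWrapper {
  val z = new FakeZeppelinContext

  val fVal = (s: String) => s + s // function literal: captures nothing

  def fDef(s: String) = s + s     // method: eta-expansion captures `this`

  def liftedVal: String => String = fVal
  def liftedDef: String => String = fDef _ // becomes s => this.fDef(s)
}

object EtaCaptureDemo {
  // Java-serializes a closure, as Spark does before shipping it.
  def serializable(x: AnyRef): Boolean =
    try { new ObjectOutputStream(new ByteArrayOutputStream).writeObject(x); true }
    catch { case _: NotSerializableException => false }

  def main(args: Array[String]): Unit = {
    val w = new LineWrapper
    println(serializable(w.liftedVal)) // no captured state
    println(serializable(w.liftedDef)) // drags LineWrapper (and z) along
  }
}
```

This would explain why the `val` version of the example counts the lines while the `def` version fails with the ZeppelinContext NoClassDefFoundError on the slaves.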