Hi Zeppelin developers,

This issue sounds very serious. Is this specific to David's use case here?

Best Regards,

Jerry

On Mon, Aug 24, 2015 at 1:28 PM, David Salinas <david.salinas....@gmail.com>
wrote:

> I have looked at the SparkInterpreter.java code and this is indeed the
> issue. Whenever one uses an instruction such as z.input("..."), no Spark
> transformation can work, because z is shipped to the slaves, where
> Zeppelin is not installed, as shown by the example I sent.
> A workaround could be to interpret the variables separately (by defining
> a map of variables before interpreting).
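>
> A minimal sketch of what I mean (the wrapper class and all its names are
> hypothetical stand-ins, not what the interpreter actually generates):
>
> class ParagraphWrapper(val z: AnyRef /* ZeppelinContext */, val s: String)
>     extends Serializable {
>   // line.contains(s) really reads this.s, so this lambda captures `this`
>   // and the z field travels with it to the slaves:
>   val capturesWrapper: String => Boolean = line => line.contains(s)
>
>   // Copying the field into a local first makes the lambda capture only
>   // the String, never the wrapper (and never z):
>   val capturesOnlyString: String => Boolean = {
>     val localS = s
>     line => line.contains(localS)
>   }
> }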
>
> Best,
>
> David
>
>
> On Mon, Aug 24, 2015 at 6:45 PM, David Salinas <
> david.salinas....@gmail.com> wrote:
>
>> Hi Moon,
>>
>> I found another way to reproduce the problem:
>>
>> // cell 1 does not work
>> val file = "hdfs://someclusterfile.json"
>> val s = z.input("Foo").toString
>> val textFile = sc.textFile(file)
>> textFile.filter(_.contains(s)).count
>> // org.apache.spark.SparkException: Job aborted due to stage failure:
>> // Task 41 in stage 5.0 failed 4 times, most recent failure: Lost task
>> // 41.3 in stage 5.0 (TID 2735, XXX.com): java.lang.NoClassDefFoundError:
>> // Lorg/apache/zeppelin/spark/ZeppelinContext;
>>
>> // cell 2 works
>> val file = "hdfs://someclusterfile.json"
>> val s = "Y"
>> val textFile = sc.textFile(file)
>> textFile.filter(_.contains(s)).count
>> // res19: Long = 109
>>
>> This kind of issue also happens often when using variables from other
>> cells, and when taking a closure for a transformation. Maybe you are
>> reading variables inside the transformation with something like
>> z.get("s"), which causes z to be sent to the slaves because one of its
>> members is used (although I also sometimes have this issue without
>> using anything from other cells).
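>>
>> For instance (just a sketch reusing the names from my example; "s"
>> stands for whatever key was put from another cell):
>>
>> // Referencing z inside the transformation forces z itself into the
>> // serialized task closure:
>> textFile.filter(line => line.contains(z.get("s").toString)).count
>>
>> // Reading the value on the driver first should, in principle, capture
>> // only the String:
>> val s = z.get("s").toString
>> textFile.filter(line => line.contains(s)).count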
>>
>> Best,
>>
>> David
>>
>>
>> On Mon, Aug 24, 2015 at 10:34 AM, David Salinas <
>> david.salinas....@gmail.com> wrote:
>>
>>> Sorry, I forgot to mention my environment:
>>> Mesos 0.17, Spark 1.4.1, Scala 2.10.4, Java 1.8
>>>
>>> On Mon, Aug 24, 2015 at 10:32 AM, David Salinas <
>>> david.salinas....@gmail.com> wrote:
>>>
>>>> Hi Moon,
>>>>
>>>> Today I cannot reproduce the bug with an elementary example either,
>>>> but it is still impacting all my notebooks. The weird thing is that
>>>> when calling a transformation with map, it takes the Zeppelin context
>>>> into the closure, which gives these java.lang.NoClassDefFoundError:
>>>> Lorg/apache/zeppelin/spark/ZeppelinContext errors (the Spark shell
>>>> runs the same command without any problem). I will try to find another
>>>> example that fails more persistently (it is weird that this example
>>>> was failing yesterday). Do you have any idea of what could cause the
>>>> Zeppelin context to be included in the closure?
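>>>>
>>>> In case it helps, here is a hypothetical way to inspect what a closure
>>>> has captured, via reflection on the driver (the NoClassDefFoundError
>>>> comes from getDeclaredFields0 on the slave, so a field typed
>>>> ZeppelinContext should show up in the same listing here):
>>>>
>>>> // Print the declared fields of a closure object; captured references
>>>> // appear as fields (an enclosing wrapper typically as $outer), and
>>>> // those can be walked the same way.
>>>> def capturedFields(closure: AnyRef): Unit =
>>>>   closure.getClass.getDeclaredFields.foreach { f =>
>>>>     println(f.getName + ": " + f.getType.getName)
>>>>   }
>>>>
>>>> capturedFields((s: String) => s + s)  // a capture-free closure for comparison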
>>>>
>>>> Best,
>>>>
>>>> David
>>>>
>>>>
>>>> On Fri, Aug 21, 2015 at 6:29 PM, moon soo Lee <m...@apache.org> wrote:
>>>>
>>>>> I have tested your code and cannot reproduce the problem.
>>>>>
>>>>> Could you share your environment? How did you configure Zeppelin with
>>>>> Spark?
>>>>>
>>>>> Thanks,
>>>>> moon
>>>>>
>>>>> On Fri, Aug 21, 2015 at 2:25 AM David Salinas <
>>>>> david.salinas....@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have a problem when using a Spark closure. This error did not
>>>>>> appear with Spark 1.2.1.
>>>>>>
>>>>>> I have included a reproducible example of what happens when taking
>>>>>> the closure (Zeppelin was built from the head of master with this
>>>>>> command: mvn install -DskipTests -Pspark-1.4 -Dspark.version=1.4.1
>>>>>> -Dhadoop.version=2.2.0 -Dprotobuf.version=2.5.0). Has anyone ever
>>>>>> encountered this problem? All my previous notebooks are broken by
>>>>>> this :(
>>>>>>
>>>>>> ------------------------------
>>>>>> val textFile = sc.textFile("hdfs://somefile.txt")
>>>>>>
>>>>>> val f = (s: String) => s + s
>>>>>> textFile.map(f).count
>>>>>> // works fine
>>>>>> // res145: Long = 407
>>>>>>
>>>>>>
>>>>>> def f(s: String) = {
>>>>>>   s + s
>>>>>> }
>>>>>> textFile.map(f).count
>>>>>>
>>>>>> //fails ->
>>>>>>
>>>>>> org.apache.spark.SparkException: Job aborted due to stage failure:
>>>>>> Task 566 in stage 87.0 failed 4 times, most recent failure: Lost task 
>>>>>> 566.3
>>>>>> in stage 87.0 (TID 43396, XXX.com): java.lang.NoClassDefFoundError:
>>>>>> Lorg/apache/zeppelin/spark/ZeppelinContext; at
>>>>>> java.lang.Class.getDeclaredFields0(Native Method) at
>>>>>> java.lang.Class.privateGetDeclaredFields(Class.java:2583) at
>>>>>> java.lang.Class.getDeclaredField(Class.java:2068) ...
>>>>>> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
>>>>>> at
>>>>>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>>>>>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) at
>>>>>> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000) 
>>>>>> at
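>>>>>>
>>>>>> // A guess at why `def f` fails while `val f` above works: `def f`
>>>>>> // becomes a method on the interpreter-generated wrapper, so map(f)
>>>>>> // eta-expands to a closure that captures the wrapper (and with it
>>>>>> // the ZeppelinContext field), whereas `val f` is already a
>>>>>> // self-contained function object. Not verified against the
>>>>>> // generated code.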
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> David
>>>>>>
>>>>>
>>>>
>>>
>>
>
