Re: Order of paragraphs vs. different interpreters (spark vs. pyspark)

Hyung Sung Shim Wed, 13 Jul 2016 04:31:26 -0700

Hi.
Maybe https://github.com/apache/zeppelin/pull/1176 is related to what you want.
Please check this PR.


On Wednesday, July 13, 2016, xiufeng liu <[email protected]> wrote:

> You have to change the source code to add dependencies between running
> paragraphs. I think it is a really interesting feature; for example, it could
> be used as an ETL tool. But, unfortunately, there is no configuration option
> for it right now.
>
> /afancy
>
> On Wed, Jul 13, 2016 at 12:27 PM, Ahmed Sobhi <[email protected]> wrote:
>
>> Hello,
>>
>> I have been working on a large Spark Scala notebook. I recently had the
>> requirement to produce graphs/plots out of these data. Python and PySpark
>> seemed like a natural fit but since I've already invested a lot of time and
>> effort into the Scala version, I want to restrict my usage of python to
>> just plotting.
>>
>> I found a good workflow where in the Scala paragraphs I can use
>> *registerTempTable* and in Python I can just use *sqlContext.table* to
>> retrieve that table.
>>
>> The problem now is that if I try to run all paragraphs to get the
>> notebook updated, the Python paragraphs fail because they run
>> before the Scala ones even though they are placed after them.
>>
>> It seems that Zeppelin attempts to run paragraphs concurrently if they
>> run on different interpreters, which might seem fine on the surface.
>> But now that I want to introduce some dependency between spark/pyspark
>> paragraphs, is there any way to do that?
>>
>> --
>> Cheers,
>> Ahmed
>>
>
>
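
Until paragraph dependencies can be configured, one possible stopgap for the workflow described above is to have the Python paragraph wait for the temp table before reading it. The following is only a minimal sketch, not something Zeppelin provides: wait_until is a hypothetical helper, "plot_data" is a made-up table name, and it assumes the pyspark interpreter shares its sqlContext with the Scala one, as the original message implies.

```python
import time


def wait_until(predicate, timeout_s=60.0, poll_s=0.5):
    """Poll predicate() until it returns True or timeout_s seconds elapse.

    Returns True if the predicate succeeded, False if the wait timed out.
    """
    deadline = time.time() + timeout_s
    while True:
        if predicate():
            return True
        if time.time() >= deadline:
            return False
        time.sleep(poll_s)


# Hypothetical use inside a %pyspark paragraph ("plot_data" is an assumed name
# for the table registered from Scala with registerTempTable):
#
#   if wait_until(lambda: "plot_data" in sqlContext.tableNames()):
#       df = sqlContext.table("plot_data")
#   else:
#       raise RuntimeError("plot_data was never registered")
```

Note that a table left over from an earlier run would satisfy the check immediately, so this only helps when the table does not exist yet, for example after an interpreter restart.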
