Nice to meet you.

I have created a pull request: <https://github.com/apache/zeppelin/pull/1176>.

Do you need the feature to run all paragraphs in a note?

I think that feature is needed.

I will implement it.

 

Thank you.

 

From: xiufeng liu <toxiuf...@gmail.com>
Reply-To: <users@zeppelin.apache.org>
Date: Thursday, July 14, 2016, 3:18 AM
To: "users@zeppelin.apache.org" <users@zeppelin.apache.org>
Subject: Re: Order of paragraphs vs. different interpreters (spark vs. pyspark)

 

It is easy to change the code. I did it myself and use it as an ETL tool. It is 
very powerful.

 

Afancy 

On Wednesday, July 13, 2016, Ahmed Sobhi <ahmed.so...@gmail.com> wrote:

I think this PR addresses what I need. Case 2 seems to describe the issue I'm 
having, if I'm reading it correctly.

 

The proposed solution, however, is not that clear to me.

 

Is it that you define workflows, where a workflow is a sequence of (notebook, 
paragraph) pairs to be run in a specific order?

If that's the case, then this definitely solves my problem, but it's really 
cumbersome from a usability point of view. I think a better solution for my use 
case would be an option to run all paragraphs in the order they appear 
in the notebook, regardless of which interpreter they use.

 

On Wed, Jul 13, 2016 at 12:31 PM, Hyung Sung Shim <hss...@nflabs.com> wrote:

Hi.

Maybe https://github.com/apache/zeppelin/pull/1176 is related to what you want.

Please check this PR.


On Wednesday, July 13, 2016, xiufeng liu <toxiuf...@gmail.com> wrote:

 

You have to change the source code to add dependencies between running 
paragraphs. I think it is a really interesting feature; for example, it can be 
used as an ETL tool. But unfortunately, there is no configuration option right 
now.

 

/afancy

 

On Wed, Jul 13, 2016 at 12:27 PM, Ahmed Sobhi <ahmed.so...@gmail.com> wrote:

Hello,

 

I have been working on a large Spark Scala notebook. I recently had the 
requirement to produce graphs/plots out of these data. Python and PySpark 
seemed like a natural fit but since I've already invested a lot of time and 
effort into the Scala version, I want to restrict my usage of python to just 
plotting.

 

I found a good workflow where, in the Scala paragraphs, I can use 
registerTempTable, and in Python I can just use sqlContext.table to retrieve 
that table.
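For reference, the two paragraphs look roughly like this (a minimal sketch; the file path, the table name "events_tbl", and the column "value" are placeholders, and it assumes the %spark and %pyspark interpreters share the same sqlContext, as in a stock Zeppelin + Spark 1.x setup):

```
%spark
// Scala paragraph: load some data and register it as a temp table
val df = sqlContext.read.json("/path/to/events.json")  // placeholder path
df.registerTempTable("events_tbl")

%pyspark
# Python paragraph: read the same temp table back and plot it
import matplotlib.pyplot as plt
pdf = sqlContext.table("events_tbl").toPandas()
pdf["value"].plot(kind="hist")  # placeholder column
```

Run one paragraph at a time, this works fine; the problem below only appears with "run all".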

 

The problem now is that if I try to run all paragraphs to get the notebook 
updated, the Python paragraphs fail because they run before the Scala 
ones, even though they are placed after them.

 

It seems like Zeppelin attempts to run paragraphs concurrently when they 
belong to different interpreters, which might seem fine on the surface. But 
now that I want to introduce a dependency between spark and pyspark 
paragraphs, is there any way to do that?
 

-- 

Cheers,
Ahmed

 



 

-- 

Cheers,
Ahmed

http://bit.ly/ahmed_abtme
