It is easy to change the code. I did it myself and use it as an ETL tool. It is very powerful.
Afancy

On Wednesday, July 13, 2016, Ahmed Sobhi <ahmed.so...@gmail.com> wrote:

> I think this PR addresses what I need. Case 2 seems to describe the issue
> I'm having, if I'm reading it correctly.
>
> The proposed solution, however, is not that clear to me.
>
> Is it that you define workflows, where a workflow is a sequence of
> (notebook, paragraph) pairs to be run in a specific order?
> If that's the case, then this definitely solves my problem, but it's
> really cumbersome from a usability point of view. I think a better
> solution for my use case would be an option to run all paragraphs in the
> order they appear in the notebook, regardless of which interpreter they
> use.
>
> On Wed, Jul 13, 2016 at 12:31 PM, Hyung Sung Shim <hss...@nflabs.com> wrote:
>
>> Hi.
>> Maybe https://github.com/apache/zeppelin/pull/1176 is related to what
>> you want.
>> Please check this PR.
>>
>> On Wednesday, July 13, 2016, xiufeng liu <toxiuf...@gmail.com> wrote:
>>
>>> You have to change the source code to add dependencies between running
>>> paragraphs. I think it is a really interesting feature; for example, it
>>> can be used as an ETL tool. But, unfortunately, there is no
>>> configuration option right now.
>>>
>>> /afancy
>>>
>>> On Wed, Jul 13, 2016 at 12:27 PM, Ahmed Sobhi <ahmed.so...@gmail.com>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> I have been working on a large Spark Scala notebook. I recently had
>>>> the requirement to produce graphs/plots from this data. Python and
>>>> PySpark seemed like a natural fit, but since I've already invested a
>>>> lot of time and effort into the Scala version, I want to restrict my
>>>> use of Python to just plotting.
>>>>
>>>> I found a good workflow where, in the Scala paragraphs, I can use
>>>> registerTempTable, and in Python I can just use sqlContext.table to
>>>> retrieve that table.
>>>>
>>>> The problem now is that if I try to run all paragraphs to get the
>>>> notebook updated, the Python paragraphs fail because they run before
>>>> the Scala ones, even though they are placed after them.
>>>>
>>>> It seems the behavior in Zeppelin is that it attempts to run
>>>> paragraphs concurrently if they are on different interpreters, which
>>>> might seem fine on the surface. But now that I want to introduce a
>>>> dependency between spark/pyspark paragraphs, is there any way to do
>>>> that?
>>>>
>>>> --
>>>> Cheers,
>>>> Ahmed
>>>
>
> --
> Cheers,
> Ahmed
> http://bit.ly/ahmed_abtme <http://about.me/humanzz>
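[Editor's note] The handoff Ahmed describes looks roughly like the following pair of Zeppelin paragraphs. This is only a sketch: the table name "events" and the input path are made up, and the second paragraph fails if it runs before the first one, which is exactly the ordering problem discussed in the thread.

```
%spark
// Scala paragraph: build a DataFrame and publish it for other interpreters.
// (registerTempTable is the Spark 1.x API; in Spark 2.x it is deprecated
// in favor of createOrReplaceTempView.)
val df = sqlContext.read.json("/path/to/events.json")
df.registerTempTable("events")

%pyspark
# Python paragraph: retrieve the same table through the shared SQLContext
# and plot it with pandas.
pdf = sqlContext.table("events").toPandas()
pdf.plot()
```

This works because both interpreters share the same SQLContext, so a temp table registered from Scala is visible from Python.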
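[Editor's note] The scheduling difference the thread is about can be modeled in a few lines of plain Python. This is purely illustrative, not Zeppelin code: the function names and paragraph tuples are made up, and it only shows why per-interpreter queues break a cross-interpreter dependency while a single notebook-order queue preserves it.

```python
from collections import defaultdict

# (interpreter, paragraph) pairs in notebook order: the pyspark paragraph
# depends on the spark paragraph having run first.
paragraphs = [
    ("spark", "register temp table"),
    ("pyspark", "read table and plot"),
]

def grouped_by_interpreter(paragraphs):
    """Roughly Zeppelin's concurrent behavior: each interpreter gets its
    own queue, and the queues run independently of each other, so the
    pyspark paragraph may start before the spark one finishes."""
    queues = defaultdict(list)
    for interp, body in paragraphs:
        queues[interp].append(body)
    return dict(queues)

def in_notebook_order(paragraphs):
    """The behavior the thread asks for: one global queue in notebook
    order, regardless of which interpreter each paragraph uses."""
    return list(paragraphs)
```

With a single global queue the spark paragraph is guaranteed to precede the pyspark one; with per-interpreter queues there is no ordering between the two queues at all.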