You would have to change the source code to add dependencies between running paragraphs. I think it would be a really interesting feature; for example, it could be used to build an ETL tool. Unfortunately, there is no configuration option for it right now.
/afancy

On Wed, Jul 13, 2016 at 12:27 PM, Ahmed Sobhi <ahmed.so...@gmail.com> wrote:
> Hello,
>
> I have been working on a large Spark Scala notebook. I recently had the
> requirement to produce graphs/plots from this data. Python and PySpark
> seemed like a natural fit, but since I've already invested a lot of time
> and effort into the Scala version, I want to restrict my usage of Python
> to just plotting.
>
> I found a good workflow where in the Scala paragraphs I can use
> *registerTempTable* and in Python I can just use *sqlContext.table* to
> retrieve that table.
>
> The problem now is that if I try to run all paragraphs to get the
> notebook updated, the Python paragraphs fail because they run before the
> Scala ones even though they are placed after them.
>
> It seems like the behavior in Zeppelin is that it attempts to run
> paragraphs concurrently if they are running on different interpreters,
> which might seem fine on the surface. But now that I want to introduce
> some dependency between Spark/PySpark paragraphs, is there any way to do
> that?
>
> --
> Cheers,
> Ahmed
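For anyone following the thread, the cross-interpreter workflow Ahmed describes would look roughly like the two notebook paragraphs below. This is only a sketch, not something from his notebook: the file path and table/variable names are made up, and *registerTempTable*/*sqlContext.table* are the Spark 1.x SQLContext API that Zeppelin's Spark and PySpark interpreters shared at the time. The Python paragraph only succeeds if the Scala one has already run, which is exactly the ordering that breaks under "run all paragraphs".

```
%spark
// Scala paragraph: build a DataFrame and register it as a temp table
// so other interpreters sharing the same SparkContext can see it.
// (Path and table name are hypothetical.)
val events = sqlContext.read.json("/data/events.json")
events.registerTempTable("events")
```

```
%pyspark
# Python paragraph: look up the temp table registered by the Scala
# paragraph above, then hand it to pandas/matplotlib for plotting.
# Fails with "Table not found" if it runs before the Scala paragraph.
events = sqlContext.table("events")
events.limit(100).toPandas().plot()
```

Until paragraph dependencies are configurable, the practical workaround is to run the Scala paragraph manually first, or trigger the paragraphs in order via Zeppelin's REST API rather than the "run all" button.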