Zeppelin works great. The other thing we have done in notebooks (like Zeppelin or Databricks) that support multiple types of Spark session is to register Spark SQL temp tables in our Scala code, then use Python as an escape hatch for plotting with seaborn/matplotlib when the built-in plots are insufficient.
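To make the hand-off concrete, here is a minimal sketch of the Python side, assuming the Scala code has already registered a temp table. The helper names `fetch_columns` and `scatter_from_table`, the table name, and the column names are all hypothetical; `session` stands for whatever SQL entry point the notebook exposes to Python (for example `sqlContext` or `spark`, depending on the Spark version). It pulls only a bounded number of rows to the driver, so it is meant for summary-sized tables.

```python
def fetch_columns(session, table, columns, max_rows=10000):
    """Pull a bounded slice of a registered temp table to the driver.

    `session` is assumed to be the notebook's SQLContext or SparkSession;
    both expose .sql(), and collected rows can be indexed by column name.
    """
    query = "SELECT {} FROM {} LIMIT {}".format(
        ", ".join(columns), table, int(max_rows))
    rows = session.sql(query).collect()
    # Transpose the list of rows into one plain Python list per column.
    return {name: [row[name] for row in rows] for name in columns}


def scatter_from_table(session, table, x, y, max_rows=10000):
    """Scatter-plot two columns of a temp table with matplotlib."""
    import matplotlib.pyplot as plt  # lazy import: only needed when plotting

    data = fetch_columns(session, table, [x, y], max_rows)
    fig, ax = plt.subplots()
    ax.scatter(data[x], data[y], s=8)
    ax.set_xlabel(x)
    ax.set_ylabel(y)
    return fig
```

In a Python notebook paragraph this might be called as `scatter_from_table(sqlContext, "summary_stats", "x", "y")`, where `summary_stats` is whatever name the Scala code registered.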
--
Pedro Rodriguez
PhD Student in Large-Scale Machine Learning | CU Boulder
Systems Oriented Data Scientist
UC Berkeley AMPLab Alumni
pedrorodriguez.io | 909-353-4423
github.com/EntilZha | LinkedIn

On July 22, 2016 at 3:04:48 AM, Marco Colombo (ing.marco.colo...@gmail.com) wrote:

Take a look at Zeppelin: http://zeppelin.apache.org

On Thursday, July 21, 2016, Andy Davidson <a...@santacruzintegration.com> wrote:

Hi Pseudo,

Plotting, graphing, data visualization, and report generation are common needs in scientific and enterprise computing.

Can you tell me more about your use case? What is it about the current process/workflow that you think could be improved by pushing plotting (I assume you mean plotting and graphing) into Spark?

In my personal work, all the graphing is done in the driver on summary stats calculated using Spark, so for me using standard Python libs has not been a problem.

Andy

From: pseudo oduesp <pseudo20...@gmail.com>
Date: Thursday, July 21, 2016 at 8:30 AM
To: "user @spark" <user@spark.apache.org>
Subject: spark and plot data

Hi,

I know Spark is an engine for computing on large data sets. I work with PySpark, and it is a wonderful machine. My question: we don't have tools for plotting data, so each time we have to switch back to Python to plot. But when you have a large result, such as a scatter plot or ROC curve, you can't use collect to bring the data back. Does someone have a proposal for plotting?

Thanks

--
Ing. Marco Colombo
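For the ROC case raised in the original question, one way to apply the "summary stats in Spark, plot on the driver" approach is to bucket the scores inside Spark and collect only the per-bucket counts. The sketch below is an assumption-laden illustration, not a method anyone in the thread described: the function names, the column names `score` and `label`, and the assumption that scores lie in [0, 1] with 0/1 labels are all hypothetical. The resulting curve is approximate, with resolution set by the number of buckets.

```python
def bucket_counts(df, score_col="score", label_col="label", n_buckets=100):
    """Aggregate a (score, label) DataFrame into per-bucket counts in Spark.

    Assumes scores lie in [0, 1] and labels are 0/1. Only n_buckets rows
    at most are collected to the driver.
    """
    from pyspark.sql import functions as F  # lazy import: needs a Spark install

    binned = df.select(
        # Clamp so that a score of exactly 1.0 lands in the top bucket.
        F.least(F.floor(F.col(score_col) * n_buckets),
                F.lit(n_buckets - 1)).alias("bucket"),
        F.col(label_col).cast("long").alias("label"))
    rows = (binned.groupBy("bucket")
                  .agg(F.sum("label").alias("pos"),
                       F.count(F.lit(1)).alias("total"))
                  .collect())
    return [(int(r["bucket"]), int(r["pos"]), int(r["total"] - r["pos"]))
            for r in rows]


def roc_points(buckets):
    """Turn (bucket, positives, negatives) counts into ROC points.

    Sweeps the threshold from the highest-score bucket down and returns a
    list of (false_positive_rate, true_positive_rate) pairs.
    """
    total_pos = sum(p for _, p, _ in buckets)
    total_neg = sum(n for _, _, n in buckets)
    if total_pos == 0 or total_neg == 0:
        raise ValueError("need at least one positive and one negative example")
    points = [(0.0, 0.0)]
    tp = fp = 0
    for _, pos, neg in sorted(buckets, reverse=True):
        tp += pos
        fp += neg
        points.append((fp / float(total_neg), tp / float(total_pos)))
    return points
```

The points returned by `roc_points(bucket_counts(predictions))` are small enough to hand to any standard Python plotting library on the driver; `predictions` here is a placeholder for a DataFrame of model scores and labels.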