Hi,

I use Zeppelin as well. In notebook mode you can do analytics much as you would in spark-shell.
You can store your intermediate data in Parquet if you wish and then analyse it however you like. What is your use case here? As I use it, Zeppelin is a web UI to your spark-shell, accessible from anywhere.

HTH

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

On 12 March 2016 at 07:13, trung kien <kient...@gmail.com> wrote:

> Hi all,
>
> I've just viewed some Zeppelin videos. The integration between
> Zeppelin and Spark is really amazing and I want to use it for my
> application.
>
> In my app, I will have a Spark Streaming app to do some basic realtime
> aggregation (intermediate data). Then I want to use Zeppelin to do some
> realtime analytics on the intermediate data.
>
> My question is: what's the most efficient storage engine to store realtime
> intermediate data? Is a Parquet file somewhere suitable?
>
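
For illustration only, here is a minimal sketch of the Parquet suggestion in the reply above: one job appends aggregated rows to a Parquet directory, and a notebook-style step reads them back and queries them. It assumes Spark 2.x or later (for SparkSession) on the classpath. The path, the Agg case class, its fields, and the view name are all hypothetical and are not taken from the thread, and nothing here is confirmed as how either poster set things up.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object ParquetIntermediateSketch {
  // Hypothetical shape of one aggregated record; not from the original thread.
  case class Agg(windowStart: String, metric: String, total: Long)

  def main(args: Array[String]): Unit = {
    // In Zeppelin or spark-shell a session is already provided;
    // it is built here only so the sketch stands alone.
    val spark = SparkSession.builder()
      .appName("parquet-intermediate-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical location; on a cluster this would typically be an HDFS path.
    val path = "/tmp/intermediate_agg"

    // Stand-in for one micro-batch of aggregates produced by the streaming job.
    val batch = Seq(
      Agg("2016-03-12 07:00", "clicks", 10L),
      Agg("2016-03-12 07:00", "views", 4L)
    ).toDS()

    // Append each batch to the same directory.
    // Note: re-running this sketch appends the same rows again.
    batch.write.mode(SaveMode.Append).parquet(path)

    // What a notebook paragraph could then do: load, register a view, query.
    val df = spark.read.parquet(path)
    df.createOrReplaceTempView("intermediate_agg")
    spark.sql(
      "SELECT metric, SUM(total) AS total FROM intermediate_agg GROUP BY metric"
    ).show()

    spark.stop()
  }
}
```

One thing to keep in mind with this approach is that frequent small appends produce many small Parquet files, so it may be worth compacting them periodically if the batches are short.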