Re: Spark 2.0 Structured Streaming: sc.parallelize in foreach sink causes Task not serializable error

2016-09-26 Thread Michael Armbrust
The code in ForeachWriter runs on the executors, which means that you are not allowed to use the SparkContext there. This is probably why you are seeing that exception.
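The mechanism behind this error can be illustrated outside Spark. A Spark task is a closure that gets Java-serialized and shipped to executors; if the closure captures a driver-only object (like SparkContext, which is not serializable), serialization fails. A minimal sketch in plain Scala, where `DriverOnlyContext` is a hypothetical stand-in for SparkContext:

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Stand-in for SparkContext: deliberately NOT Serializable, like the real thing.
class DriverOnlyContext {
  def parallelize(xs: Seq[Int]): Seq[Int] = xs
}

object SerializationDemo {
  // Attempt Java serialization, as Spark does when shipping a task closure.
  def trySerialize(obj: AnyRef): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(obj)
      true
    } catch {
      case _: NotSerializableException => false
    }

  def demo(): (Boolean, Boolean) = {
    val ctx = new DriverOnlyContext
    // Captures ctx -- analogous to calling sc.parallelize inside a ForeachWriter.
    val badTask: () => Seq[Int] = () => ctx.parallelize(Seq(1, 2, 3))
    // Captures only serializable data.
    val goodTask: () => Int = () => Seq(1, 2, 3).sum
    (trySerialize(badTask), trySerialize(goodTask))
  }

  def main(args: Array[String]): Unit =
    println(demo()) // the closure capturing the context fails to serialize
}
```

The fix in Spark is the same as here: keep non-serializable resources out of the closure and create them on the executor side instead.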

Spark 2.0 Structured Streaming: sc.parallelize in foreach sink causes Task not serializable error

2016-09-25 Thread Jianshi
Dear all: I am trying out the newly released Structured Streaming feature in Spark 2.0. I use Structured Streaming to perform windowing by event time, and I can print the result to the console. I would like to write the result to a Cassandra database through the foreach sink option. I am
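The executor-safe way to use the foreach sink is to create the database connection inside the ForeachWriter's lifecycle methods rather than capturing anything from the driver. A hedged sketch (the keyspace, table, column layout, and DataStax Java driver usage below are illustrative assumptions, not code from the thread):

```scala
import org.apache.spark.sql.{ForeachWriter, Row}
import com.datastax.driver.core.{Cluster, Session}

// Sketch only: demo_ks.word_counts and its schema are hypothetical.
class CassandraForeachWriter(contactPoint: String) extends ForeachWriter[Row] {
  // Connection state is created in open(), on the executor --
  // never capture a SparkContext or a driver-side connection in fields.
  @transient private var cluster: Cluster = _
  @transient private var session: Session = _

  override def open(partitionId: Long, version: Long): Boolean = {
    cluster = Cluster.builder().addContactPoint(contactPoint).build()
    session = cluster.connect()
    true // true means: process this partition/version
  }

  override def process(row: Row): Unit = {
    // Assumed row layout: (window_start timestamp, word text, count bigint)
    session.execute(
      "INSERT INTO demo_ks.word_counts (window_start, word, count) VALUES (?, ?, ?)",
      row.getTimestamp(0), row.getString(1), java.lang.Long.valueOf(row.getLong(2)))
  }

  override def close(errorOrNull: Throwable): Unit = {
    if (session != null) session.close()
    if (cluster != null) cluster.close()
  }
}

// Usage sketch:
// windowedCounts.writeStream.foreach(new CassandraForeachWriter("127.0.0.1")).start()
```

Because open/process/close all run on the executor, nothing driver-only ends up in the serialized task closure.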

Re: Spark 2.0+ Structured Streaming

2016-04-28 Thread Tathagata Das
Hello Benjamin, Have you taken a look at the slides of my talk at Strata San Jose? http://www.slideshare.net/databricks/taking-spark-streaming-to-the-next-level-with-datasets-and-dataframes Unfortunately there is no video, as Strata does not upload videos for everyone. I presented the same talk

Spark 2.0+ Structured Streaming

2016-04-28 Thread Benjamin Kim
Can someone explain to me how the new Structured Streaming works in the upcoming Spark 2.0+? I'm a little hazy about how data will be stored and referenced if it can be queried and/or batch processed directly from streams, and whether the data will be append-only or whether there will be some sort of upsert
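For context on the append-versus-upsert part of the question: Structured Streaming models the stream as an unbounded input table and lets each query declare an output mode. A hedged sketch (the `lines` source and word-count query are illustrative; note that in Spark 2.0 itself, append mode was not allowed for aggregating queries):

```scala
// Sketch, assuming `spark` is a SparkSession and a local socket source.
val lines = spark.readStream
  .format("socket").option("host", "localhost").option("port", 9999)
  .load()

val counts = lines.as[String].flatMap(_.split(" ")).groupBy("value").count()

// "complete": the entire updated result table is emitted on every trigger,
// so the sink sees refreshed counts -- closest to "upsert" semantics.
counts.writeStream.outputMode("complete").format("console").start()

// "append": only rows added since the last trigger are emitted; suited to
// queries whose result rows never change once written (no aggregation here).
lines.writeStream.outputMode("append").format("console").start()
```

So the result is not inherently append-only; the output mode determines whether the sink receives only new rows or the full updated result.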