Re: spark 2.0 readStream from a REST API

2016-08-11 Thread Sela, Amit
el Armbrust <mich...@databricks.com> Subject: Re: spark 2.0 readStream from a REST API
Why is writeStream needed to consume the data? When I tried it I got this exception:
INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint

Re: spark 2.0 readStream from a REST API

2016-08-02 Thread Jacek Laskowski
On Tue, Aug 2, 2016 at 10:59 AM, Ayoub Benali wrote:
> Why is writeStream needed to consume the data?
You need to start your structured streaming query, and there's no way to do that without a DataStreamWriter, so writeStream is a must.

Re: spark 2.0 readStream from a REST API

2016-08-02 Thread Ayoub Benali
Why is writeStream needed to consume the data? When I tried it I got this exception:
INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
> org.apache.spark.sql.AnalysisException: Complete output mode not supported
> when there are no streaming aggregations on streaming
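The exception above is raised when "complete" output mode is requested for a query that has no streaming aggregation. A minimal sketch of the usual alternative, "append" mode, follows. It assumes Spark 2.x on the classpath and uses the built-in socket source on localhost:9999 as a stand-in for the custom source discussed in this thread; the object name, host, and port are placeholders, not taken from the original code.

```scala
import org.apache.spark.sql.SparkSession

object AppendModeSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")               // assumption: local run for experimenting
      .appName("append-mode-sketch")
      .getOrCreate()

    // Built-in socket source used as a stand-in for a custom source.
    val lines = spark.readStream
      .format("socket")
      .option("host", "localhost")      // placeholder host
      .option("port", 9999)             // placeholder port
      .load()

    // No aggregation on `lines`, so "append" is used instead of "complete".
    // Nothing is consumed until start() is called on the DataStreamWriter.
    val query = lines.writeStream
      .outputMode("append")
      .format("console")
      .start()

    query.awaitTermination()
  }
}
```

If the query did aggregate (for example a groupBy followed by count), "complete" mode would be accepted.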

Re: spark 2.0 readStream from a REST API

2016-08-02 Thread Ayoub Benali
Hello, here is the code I am trying to run: https://gist.github.com/ayoub-benali/a96163c711b4fce1bdddf16b911475f2
Thanks, Ayoub.
2016-08-01 13:44 GMT+02:00 Jacek Laskowski:
> On Mon, Aug 1, 2016 at 11:01 AM, Ayoub Benali wrote:
> > the

Re: spark 2.0 readStream from a REST API

2016-08-01 Thread Amit Sela
I think you're missing:
val query = wordCounts.writeStream
  .outputMode("complete")
  .format("console")
  .start()
Did it help?
On Mon, Aug 1, 2016 at 2:44 PM Jacek Laskowski wrote:
> On Mon, Aug 1, 2016 at 11:01 AM, Ayoub Benali wrote:
> >

Re: spark 2.0 readStream from a REST API

2016-08-01 Thread Jacek Laskowski
On Mon, Aug 1, 2016 at 11:01 AM, Ayoub Benali wrote:
> the problem now is that when I consume the dataframe for example with count
> I get the stack trace below.
Mind sharing the entire pipeline?
> I followed the implementation of TextSocketSourceProvider to

Re: spark 2.0 readStream from a REST API

2016-08-01 Thread Ayoub Benali
Hello, using the full class name worked, thanks. The problem now is that when I consume the DataFrame, for example with count, I get the stack trace below. I followed the implementation of TextSocketSourceProvider

Re: spark 2.0 readStream from a REST API

2016-07-31 Thread Jacek Laskowski
Hi, See https://github.com/jaceklaskowski/spark-workshop/tree/master/solutions/spark-mf-format. There's a custom format that you can use to get started. Basically, you need to develop the code behind "mysource" format and register it using --packages or --jars or similar when you spark-submit

Re: spark 2.0 readStream from a REST API

2016-07-31 Thread Michael Armbrust
You have to add a file in resource too (example ). Either that or give a full class name. On Sun, Jul 31, 2016 at 9:45 AM, Ayoub Benali
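The two options mentioned here can be sketched as follows. Spark resolves short format names through Java's ServiceLoader, which reads a resource file named META-INF/services/org.apache.spark.sql.sources.DataSourceRegister. The class name com.example.rest.RestSourceProvider, the short name "rest", and the src/main/resources layout are hypothetical, chosen only for illustration.

```scala
// Resource file (assumed sbt/Maven layout):
//   src/main/resources/META-INF/services/org.apache.spark.sql.sources.DataSourceRegister
// Its content is one line with the fully qualified provider class, e.g.:
//   com.example.rest.RestSourceProvider      (hypothetical class)

import org.apache.spark.sql.{DataFrame, SparkSession}

object FormatLookupSketch {
  // Option 1: short name; works only if the resource file above is on the classpath
  // and the provider's shortName() returns "rest".
  def byShortName(spark: SparkSession): DataFrame =
    spark.readStream.format("rest").load()

  // Option 2: fully qualified class name; no resource file needed.
  def byClassName(spark: SparkSession): DataFrame =
    spark.readStream.format("com.example.rest.RestSourceProvider").load()
}
```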

Re: spark 2.0 readStream from a REST API

2016-07-31 Thread Ayoub Benali
Looks like the way to go in spark 2.0 is to implement StreamSourceProvider with DataSourceRegister
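A rough skeleton of such a provider is sketched below, written against the Spark 2.0 signatures of StreamSourceProvider and DataSourceRegister as I understand them. The package, class name, short name "rest", and single-column schema are all hypothetical, and the actual Source (offset tracking and batch fetching from the HTTP endpoint) is deliberately left unimplemented.

```scala
package com.example.rest  // hypothetical package

import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.execution.streaming.Source
import org.apache.spark.sql.sources.{DataSourceRegister, StreamSourceProvider}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

class RestSourceProvider extends StreamSourceProvider with DataSourceRegister {

  // Assumed default schema: one string column holding the raw response body.
  private val defaultSchema = StructType(StructField("value", StringType) :: Nil)

  // Name usable in spark.readStream.format(...) once the provider is registered.
  override def shortName(): String = "rest"

  // Reports the schema of the stream before any data is read.
  override def sourceSchema(
      sqlContext: SQLContext,
      schema: Option[StructType],
      providerName: String,
      parameters: Map[String, String]): (String, StructType) =
    (shortName(), schema.getOrElse(defaultSchema))

  // Placeholder: a real implementation returns a Source that polls the endpoint
  // and exposes offsets and batches to the streaming engine.
  override def createSource(
      sqlContext: SQLContext,
      metadataPath: String,
      schema: Option[StructType],
      providerName: String,
      parameters: Map[String, String]): Source =
    throw new UnsupportedOperationException("Source implementation omitted in this sketch")
}
```

The Source trait lives in an internal package (org.apache.spark.sql.execution.streaming), so its methods may differ between Spark releases; check the version you build against.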

spark 2.0 readStream from a REST API

2016-07-31 Thread Ayoub Benali
Hello, I started playing with the Structured Streaming API in Spark 2.0 and I am looking for a way to create a streaming Dataset/DataFrame from a REST HTTP endpoint, but I am a bit stuck. "readStream" in SparkSession has a json method, but this one expects a path (S3, HDFS, etc.) and I want to