Re: spark 2.0 readStream from a REST API

2016-08-02 Thread Ayoub Benali
8:44 GMT+02:00 Amit Sela <amitsel...@gmail.com>:
> I think you're missing:
>
> val query = wordCounts.writeStream
>   .outputMode("complete")
>   .format("console")
>   .start()
>
> Did it help?
>
> On Mon, Aug 1, 2016 at 2:44 PM
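
For context, Amit's fragment is the tail of the standard Spark 2.0 structured-streaming word count. A minimal self-contained sketch (assuming a socket source on localhost:9999, as in the Spark docs) would be:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder.appName("WordCount").getOrCreate()
    import spark.implicits._

    // Read lines from a socket source (host/port are illustrative)
    val lines = spark.readStream
      .format("socket")
      .option("host", "localhost")
      .option("port", 9999)
      .load()

    // Split lines into words and keep a running count per word
    val wordCounts = lines.as[String]
      .flatMap(_.split(" "))
      .groupBy("value")
      .count()

    // Without .start() the query never actually runs, which is Amit's point
    val query = wordCounts.writeStream
      .outputMode("complete")
      .format("console")
      .start()

    query.awaitTermination()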

Re: spark 2.0 readStream from a REST API

2016-08-02 Thread Ayoub Benali
Hello, here is the code I am trying to run: https://gist.github.com/ayoub-benali/a96163c711b4fce1bdddf16b911475f2 Thanks, Ayoub. 2016-08-01 13:44 GMT+02:00 Jacek Laskowski <ja...@japila.pl>: > On Mon, Aug 1, 2016 at 11:01 AM, Ayoub Benali > <benali.ayoub.i...@g

Re: spark 2.0 readStream from a REST API

2016-08-01 Thread Ayoub Benali
Michael Armbrust <mich...@databricks.com>:
> You have to add a file in resources too (example
> <https://github.com/apache/spark/blob/master/sql/core/src/main/resources/META-INF/services/org.apache.spark.sql.sources.DataSourceRegister>).
> Either that or give a full class name.
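
For reference, the registration Michael describes means shipping a Java service-loader file whose content is the provider's fully qualified class name; a sketch (package and class names here are made up for illustration):

    // File: src/main/resources/META-INF/services/org.apache.spark.sql.sources.DataSourceRegister
    // Content is just the fully qualified provider class name, one per line:
    //   com.example.rest.RestSourceProvider

    import org.apache.spark.sql.sources.DataSourceRegister

    // Implementing DataSourceRegister is what makes format("mysource") resolve
    class RestSourceProvider extends DataSourceRegister {
      override def shortName(): String = "mysource"
      // ... plus the StreamSourceProvider methods (sourceSchema / createSource)
    }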

Re: spark 2.0 readStream from a REST API

2016-07-31 Thread Ayoub Benali
Failed to find data source: mysource. Please find packages at http://spark-packages.org Is there something I need to do in order to "load" the Stream source provider? Thanks, Ayoub 2016-07-31 17:19 GMT+02:00 Jacek Laskowski <ja...@japila.pl>: > On Sun, Jul 31, 2016 at 12:53 PM, Ayoub Benali >
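
The error means Spark could not resolve the short name "mysource" through its data-source lookup. A sketch of the workaround discussed above, passing the fully qualified provider class instead of the short name (class name hypothetical):

    // Naming the provider class directly bypasses the service-loader registry
    val stream = spark.readStream
      .format("com.example.rest.RestSourceProvider")
      .load()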

spark 2.0 readStream from a REST API

2016-07-31 Thread Ayoub Benali
Hello, I started playing with the Structured Streaming API in Spark 2.0 and I am looking for a way to create a streaming Dataset/DataFrame from a REST HTTP endpoint, but I am a bit stuck. "readStream" in SparkSession has a json method, but this one expects a path (s3, hdfs, etc.) and I want to
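
There is no built-in HTTP source in Spark 2.0, so this ends up requiring a custom source. A rough skeleton of the idea (not Ayoub's actual gist; these are internal, unstable 2.0 APIs, and all names and the "url" option are illustrative):

    import org.apache.spark.sql.{DataFrame, SQLContext}
    import org.apache.spark.sql.execution.streaming.{LongOffset, Offset, Source}
    import org.apache.spark.sql.sources.{DataSourceRegister, StreamSourceProvider}
    import org.apache.spark.sql.types.{StringType, StructField, StructType}

    class RestSourceProvider extends StreamSourceProvider with DataSourceRegister {
      private val defaultSchema = StructType(StructField("body", StringType) :: Nil)

      override def shortName(): String = "mysource"

      override def sourceSchema(
          sqlContext: SQLContext, schema: Option[StructType],
          providerName: String, parameters: Map[String, String]): (String, StructType) =
        (shortName(), schema.getOrElse(defaultSchema))

      override def createSource(
          sqlContext: SQLContext, metadataPath: String,
          schema: Option[StructType], providerName: String,
          parameters: Map[String, String]): Source = new Source {
        private var current = 0L

        override def schema: StructType = defaultSchema

        // Naive: report a new offset on every trigger, i.e. poll once per batch
        override def getOffset: Option[Offset] = {
          current += 1
          Some(LongOffset(current))
        }

        override def getBatch(start: Option[Offset], end: Offset): DataFrame = {
          // One GET per batch; a real source would handle errors and pagination
          val body = scala.io.Source.fromURL(parameters("url")).mkString
          import sqlContext.implicits._
          sqlContext.sparkContext.parallelize(Seq(body)).toDF("body")
        }

        override def stop(): Unit = ()
      }
    }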

Re: RDD[Future[T]] = Future[RDD[T]]

2015-07-26 Thread Ayoub Benali
It doesn't work because mapPartitions expects a function f: (Iterator[T]) ⇒ Iterator[U] while .sequence wraps the iterator in a Future. 2015-07-26 22:25 GMT+02:00 Ignacio Blasco <elnopin...@gmail.com>: Maybe using mapPartitions and .sequence inside it? On 26/7/2015 10:22 p.m., Ayoub
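
One way around the type mismatch is to block once per partition instead of returning a Future: launch all the calls, then Await the sequenced result so the function still returns a plain Iterator. A sketch (fetchAsync and the timeout are illustrative; assumes a SparkContext named sc):

    import scala.concurrent.{Await, Future}
    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.duration._

    // Hypothetical asynchronous call made for each element
    def fetchAsync(x: Int): Future[Int] = Future(x * 2)

    val rdd = sc.parallelize(1 to 100)

    val results = rdd.mapPartitions { iter =>
      // Kick off every call in the partition, then block once for the batch,
      // so mapPartitions still gets an Iterator[U] rather than a Future
      val batch = Future.sequence(iter.map(fetchAsync).toList)
      Await.result(batch, 10.minutes).iterator
    }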

Re: SQL JSON array operations

2015-01-15 Thread Ayoub Benali
You could try to use the Hive context, which brings HiveQL; it would allow you to query nested structures using LATERAL VIEW explode... On Jan 15, 2015 4:03 PM, jvuillermet <jeremy.vuiller...@gmail.com> wrote: let's say my json file lines look like this: {"user": "baz", "tags": ["foo", "bar"]}
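
Concretely, with lines of that shape, the exploded query could look like this (a sketch against the Spark 1.x API; the file path and table name are illustrative):

    import org.apache.spark.sql.hive.HiveContext

    val hiveContext = new HiveContext(sc) // assumes an existing SparkContext sc
    hiveContext.jsonFile("users.json").registerTempTable("users")

    // LATERAL VIEW explode turns each tag in the array into its own row
    hiveContext.sql(
      """SELECT user, tag
        |FROM users
        |LATERAL VIEW explode(tags) t AS tag""".stripMargin)
      .collect().foreach(println)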

Re: Parquet compression codecs not applied

2015-01-10 Thread Ayoub Benali
It worked, thanks. This doc page https://spark.apache.org/docs/1.2.0/sql-programming-guide.html recommends using spark.sql.parquet.compression.codec to set the compression codec, and I thought this setting would be forwarded to the Hive context given that HiveContext extends SQLContext, but it was
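
A plausible explanation (an assumption, since the fix itself is truncated out of the snippet): spark.sql.parquet.compression.codec applies when Spark SQL writes Parquet directly, while a table written through Hive's Parquet SerDe honours Hive's own parquet.compression property. A sketch of setting the latter (assumes the hiveContext from the message below):

    // Hive's Parquet writer reads this property, not the Spark SQL one
    hiveContext.setConf("parquet.compression", "GZIP")

    // Or declare it per table
    hiveContext.sql(
      """CREATE TABLE logs_gzip
        |STORED AS PARQUET
        |TBLPROPERTIES ('parquet.compression'='GZIP')
        |AS SELECT * FROM logs""".stripMargin)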

Parquet compression codecs not applied

2015-01-08 Thread Ayoub Benali
Hello, I tried to save a table created via the Hive context as a parquet file, but whatever compression codec (uncompressed, snappy, gzip or lzo) I set via setConf like: setConf("spark.sql.parquet.compression.codec", "gzip") the size of the generated files is always the same, so it seems like
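
For reference, the pattern being described, sketched against the Spark 1.2 API (assumes a SparkContext sc and an existing table logs; names are illustrative):

    import org.apache.spark.sql.hive.HiveContext

    val hiveContext = new HiveContext(sc)
    hiveContext.setConf("spark.sql.parquet.compression.codec", "gzip")

    // Write a Parquet-backed Hive table; despite the setting above, the
    // reported symptom is that the output size never changes with the codec
    hiveContext.sql("CREATE TABLE logs_parquet STORED AS PARQUET AS SELECT * FROM logs")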