Re: How save streaming aggregations on 'Structured Streams' in parquet format ?

2017-06-19 Thread kaniska Mandal
rk("timestamp", "1 hour") > .groupBy(window("timestamp", "30 seconds")) > .agg(...) > > Read more here - https://databricks.com/blog/2017/05/08/event-time- > aggregation-watermarking-apache-sparks-structured-streaming.html > > >

Re: How save streaming aggregations on 'Structured Streams' in parquet format ?

2017-06-19 Thread kaniska Mandal
ta = spark.table("deviceBasicAggSummary" ); deviceBasicAggSummaryData.toDF().write().parquet("/data/summary/devices/"+ new Date().getTime()+"/"); } = So whats the best practice for 'low latency query on distributed data' u

How save streaming aggregations on 'Structured Streams' in parquet format ?

2017-06-19 Thread kaniska Mandal
s/") .option("path", "/data/summary/tags/") .start(); But, parquet doesn't support 'complete' outputMode. So is parquet supported only for batch queries , NOT for streaming queries ? - note that console outputmode working fine ! Any help will be much appreciated. Thanks Kaniska

Re: unable to stream kafka messages

2017-03-27 Thread kaniska Mandal
cribeType, topics) .load() .withColumn("message", from_json(col("value").cast("string"), tweetSchema)) // cast the binary value to a string and parse it as json .select("message.*") // unnest the json .as(Encoders.bean(Tweet.class)) Thanks Kaniska

Re: unable to stream kafka messages

2017-03-27 Thread kaniska Mandal
sults into similar compilation error. Thanks Kaniska On Mon, Mar 27, 2017 at 12:09 PM, Michael Armbrust <mich...@databricks.com> wrote: > Sorry, I don't think that I understand the question. Value is just a > binary blob that we get from kafka and pass to you. If its stored in JSO

Re: unable to stream kafka messages

2017-03-24 Thread kaniska Mandal
*") // unnest the json > .as(Encoders.bean(Tweet.class)) // only required if you want to use > lambda functions on the data using this class > > Here is some more info on working with JSON and other semi-structured > formats > <https://databricks.com/blog/2017/02/23/workin

unable to stream kafka messages

2017-03-24 Thread kaniska
Hi, Currently , encountering the following exception while working with below-mentioned code snippet : > Please suggest the correct approach for reading the stream into a sql > schema. > If I add 'tweetSchema' while reading stream, it errors out with message - > we can not change static schema