Structured Streaming: emitted record count

2017-11-28 Thread aravias
In Structured Streaming, the QueryProgressEvent does not seem to include the final count of records emitted to the destination; I see only the number of input rows. I was trying to use the count (an additional action after persisting the dataset), but I face the below exception when calling persist or

Structured Streaming: multiple sinks

2017-08-11 Thread aravias
ionMs" : { "addBatch" : 2263426, "getBatch" : 12, "getOffset" : 273, "queryPlanning" : 13, "triggerExecution" : 2264288, "walCommit" : 552 }, regards aravias -- View this message in context: http://apache-spark-user-list.

StructuredStreaming: java.util.concurrent.TimeoutException: Cannot fetch record for offset

2017-08-08 Thread aravias
Hi, we have a Structured Streaming app consuming data from Kafka and writing to S3. I keep getting this timeout exception whenever the executors are configured and running with more than one core per executor. If anyone can share any information related to this, it would be great. 17/08/08

Structured Streaming: consumerGroupId

2017-07-06 Thread aravias
Hi, is there a way to get the *consumerGroupId* assigned to a Structured Streaming application when it's consuming from Kafka? regards Aravind -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Structured-Streaming-consumerGroupId-tp28828.html Sent from the

Stream Processing: how to refresh a loaded dataset periodically

2017-06-19 Thread aravias
Hi, we are using Structured Streaming for stream processing. To enrich each message, I have to look up data from Cassandra, and that data in Cassandra is updated periodically (about once a day). I want to look at the option of loading it as a dataset and then register it

Re: StructuredStreaming : StreamingQueryException

2017-06-09 Thread aravias
The bug is related to long checkpoints being truncated when dealing with topics that have a large number of partitions, in my case 120. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/StructuredStreaming-StreamingQueryException-tp28749p28754.html Sent from the

Re: StructuredStreaming : StreamingQueryException

2017-06-09 Thread aravias
This is a bug in Spark version 2.1.0; it seems to be fixed in Spark 2.1.1, as it did not occur when run with that version. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/StructuredStreaming-StreamingQueryException-tp28749p28753.html Sent from the Apache Spark User List mailing

StructuredStreaming : org.apache.spark.sql.streaming.StreamingQueryException

2017-06-06 Thread aravias
Hi, I have one read stream consuming data from a Kafka topic, and based on an attribute value in each incoming message, I have to write data to one of two different locations in S3 (if value1, write to location1; otherwise to location2). At a high level, below is what I have for