Is there a way to use the groupByKey() function in Spark Structured Streaming without aggregates?
I have a scenario like the one below, where we would like to group items by a key without applying any aggregates.

*Sample incoming data:*

I would like to apply groupByKey on the field "device_id", so that I get output like the one below.

I have also tried using collect_list() in the aggregate expression of groupByKey, but that takes more time to process the datasets. Also, since we are aggregating, we can only use the 'Complete' or 'Update' output modes, whereas 'Append' mode looks more suitable for our use case.

I have also looked at the groupByKey(Num_Partitions) and reduceByKey() functions available on the Direct DStream, which give results of the form (String, Iterable[(String, Int)]) without doing any aggregates. Is there something similar available in Structured Streaming?
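
To make the desired result shape concrete, here is a minimal plain-Python sketch (not Spark code) of grouping by key without any aggregate, similar to the (key, Iterable) pairs described above. Only "device_id" comes from the question; the other record fields, the helper name `group_by_key`, and the sample values are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical record shape: (device_id, metric_name, value).
Record = Tuple[str, str, int]


def group_by_key(records: Iterable[Record]) -> Dict[str, List[Tuple[str, int]]]:
    """Group records by device_id, keeping every item and computing no aggregate."""
    groups: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for device_id, name, value in records:
        # Every item is kept as-is under its key; nothing is summed or reduced.
        groups[device_id].append((name, value))
    return dict(groups)


# Made-up sample batch, for illustration only.
batch = [
    ("dev-1", "temp", 21),
    ("dev-2", "temp", 19),
    ("dev-1", "humidity", 40),
]
grouped = group_by_key(batch)
```

Each key maps to the full list of its items in arrival order, which is the shape I would like to get per micro-batch in 'Append' mode.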