Hi,
Does anyone know how I can control the number of reducers for operations
such as groupBy on a DataFrame?
I can set spark.sql.shuffle.partitions in SQL, but I'm not sure how to do it
with the df.groupBy("XX") API.
Thanks,
Mike
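In case it helps: spark.sql.shuffle.partitions applies to DataFrame shuffles as well, not just SQL queries, so setting it on the SQLContext should control the reducer count for groupBy too. A minimal sketch for Spark 1.3 (the input file and column name are made-up placeholders; assumes an existing SparkContext `sc`, e.g. in spark-shell):

```scala
import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)

// spark.sql.shuffle.partitions controls the number of partitions
// (i.e. reducers) used for shuffles triggered by DataFrame
// operations such as groupBy, not only for SQL statements.
sqlContext.setConf("spark.sql.shuffle.partitions", "10")

// Hypothetical input; any DataFrame source works the same way.
val df = sqlContext.jsonFile("events.json")

// This aggregation now shuffles into 10 partitions.
val counts = df.groupBy("XX").count()
```

The setting is read at shuffle time, so it can be changed between queries in the same session.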
Hi Das,
Thanks for your reply; somehow I missed it.
I am using Spark 1.3. The data source is from kafka.
Yeah, I'm not sure why the delay is 0. I'll run against 1.4 and post a screenshot.
Thanks,
Mike
From: Akhil Das <ak...@sigmoidanalytics.com>
Date: Thursday, June 18, 2015 at 6:05 PM
To: M
Hi,
I have a Spark Streaming program that has been running for about 25 hours. When
I check the Streaming tab in the UI, I see that "Waiting batches" is 144 but the
"Scheduling delay" is 0, which confuses me.
If "Waiting batches" is 144, does that mean 144 batches are waiting in the
queue to be processed? If this is