uary 2018 at 00:04
To: Christiaan Ras <christiaan@semmelwise.nl>
Cc: user <user@spark.apache.org>
Subject: Re: [Spark structured streaming] Use of (flat)mapgroupswithstate takes long time
For the stage computing mapGroupsWithState, can you check the following?
- How many tasks are being launched in the reduce stage (that is, the stage
after the shuffle, which computes mapGroupsWithState)?
- How long is each task taking?
- How many cores does the cluster have?
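
To show why these three numbers matter together, here is a rough
back-of-the-envelope sketch. It assumes all tasks take about the same time
and run in waves of at most one task per core; the function name and the
sample numbers are hypothetical, not measurements from this application.
The 200 in the first call reflects the default value of
spark.sql.shuffle.partitions, which determines the task count of the
post-shuffle stage unless it is overridden.

```python
import math


def estimate_stage_seconds(num_tasks: int, task_seconds: float, total_cores: int) -> float:
    """Rough estimate of stage duration, assuming equal-length tasks.

    Tasks are scheduled in waves: at most `total_cores` tasks run at once,
    so the stage needs ceil(num_tasks / total_cores) waves.
    """
    if num_tasks < 0 or total_cores <= 0:
        raise ValueError("num_tasks must be >= 0 and total_cores must be > 0")
    waves = math.ceil(num_tasks / total_cores)
    return waves * task_seconds


# Hypothetical numbers: 200 tasks (default shuffle partitions), 0.5 s each, 4 cores.
print(estimate_stage_seconds(200, 0.5, 4))  # 50 waves of 0.5 s

# Same hypothetical per-task cost with only 8 tasks on 4 cores.
print(estimate_stage_seconds(8, 0.5, 4))    # 2 waves of 0.5 s
```

If the reduce stage turns out to have many more tasks than cores and each
task does little work, the per-task overhead can dominate the batch time,
which is why the answers to the questions above are useful.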
On Thu, Jan 18, 2018 at
Hi,
I recently did some experiments with stateful structured streaming using
flatMapGroupsWithState. The streaming application is quite simple: it
receives data from Kafka, feeds it to the stateful operator
(flatMapGroupsWithState), and writes the output to the console.
During a test with small