Hi all,

We have a Spark Structured Streaming query that uses mapGroupsWithState. After running stably for some time, each micro-batch suddenly starts taking 40 seconds. Suspiciously, it looks like exactly 40 seconds each time. Before this, the batches were taking less than a second.
Looking at the details for a particular task, most partitions are processed really quickly, but a few take exactly 40 seconds.

GC looked fine while the data was being processed quickly, but the full GCs and other GC activity suddenly stop at the same time as the 40-second issue begins.

I took a thread dump from one of the executors while the issue was happening, but I cannot see any resource that the threads are blocked on.

Are we hitting a GC problem, and if so, why is it manifesting in this way? Or is some other resource blocking, and what is it?

Thanks,
Patrick