Could you show a sample of the file names? There are multiple things that
use UUIDs, so it would be good to see what the 100s of directories being
generated every second actually are.
If you are checkpointing every 400s, then there shouldn't be checkpoint
directories written every second. They should be hu
Hi,
we are running a streaming job that processes about 500 events per 20s batch
and uses updateStateByKey to accumulate web sessions (with a 30-minute
lifetime).
The checkpoint interval is set to 20 x the batch interval, that is, 400s.
Cluster size is 8 nodes.
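
For readers following along, here is a minimal sketch of the kind of per-key update logic described above. It is plain Python, not the actual job: the `Session` record, the `update_session` function and the explicit `now` argument are hypothetical names introduced for illustration. Only the 20s batch interval, the 400s checkpoint interval and the 30-minute session lifetime come from the post.

```python
from dataclasses import dataclass
from typing import List, Optional

BATCH_INTERVAL_S = 20                           # from the post
CHECKPOINT_INTERVAL_S = 20 * BATCH_INTERVAL_S   # 20 x batch interval = 400 s
SESSION_TIMEOUT_S = 30 * 60                     # 30-minute session lifetime


@dataclass
class Session:
    """Hypothetical per-key state; the real job's state is not shown in the post."""
    first_seen: float
    last_seen: float
    event_count: int


def update_session(new_event_times: List[float],
                   state: Optional[Session],
                   now: float) -> Optional[Session]:
    """Merge one batch of event timestamps for a key into its session state.

    Returning None signals that the key's state should be dropped, which is
    how an expired session would be removed.
    """
    if new_event_times:
        first, last = min(new_event_times), max(new_event_times)
        if state is None:
            # First events seen for this key: open a new session.
            return Session(first, last, len(new_event_times))
        # Extend the existing session with the new events.
        return Session(state.first_seen,
                       max(state.last_seen, last),
                       state.event_count + len(new_event_times))
    # No new events in this batch: expire the session once it has been idle
    # for the full lifetime, otherwise keep the state unchanged.
    if state is not None and now - state.last_seen >= SESSION_TIMEOUT_S:
        return None
    return state
```

With `now` bound to the batch time, a function of this shape has the `(new_values, last_state)` form that `updateStateByKey` expects, and returning `None` for expired sessions keeps the state from growing without bound. How the real job expires sessions is an assumption here.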
We are having trouble with the amou