Just an update on this, Looking into Spark logs seems that some partitions are not found and recomputed. Gives the impression that those are related with the delayed updatestatebykey calls.
I'm seeing something like: log line 1 - Partition rdd_132_1 not found, computing it .... log line N - Found block rdd_132_1 locally Log line N+1 - Goes into the updatestatebykey X times has many objects with delayed update Log line M - Done Checkpointing RDD 126 to hdfs://.... This happens for Y amount of partitions as many seconds the updatestatebykey call is delayed. Tnks, Rod -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/UpdatestateByKey-assumptions-tp10858p10859.html Sent from the Apache Spark User List mailing list archive at Nabble.com.