hello all,
i am observing a strange result. i have a computation that i run on a
cached RDD in spark-standalone. it typically takes about 4 seconds.

but when other RDDs that are not relevant to the computation at hand are
cached in memory (in the same spark context), the computation takes 40 seconds
or more.

the problem seems to be GC time, which goes from milliseconds to tens of
seconds.

note that my issue is not that memory is full. i have cached about 14G in
RDDs with 66G available across workers for the application. also my
computation did not push any cached RDD out of memory.
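
to make the setup concrete, here is a minimal sketch of the kind of scenario described above. it is not the actual job: the object name `GcRepro`, the `time` helper, the generated data and the sizes are all hypothetical, and it assumes the master url is supplied by spark-submit. it only illustrates timing the same computation on a cached RDD before and after an unrelated RDD is cached in the same context.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// hypothetical reproduction sketch; names, data and sizes are made up
object GcRepro {

  // run a block and print how long it took in seconds
  def time[A](label: String)(body: => A): A = {
    val start = System.nanoTime()
    val result = body
    println(f"$label: ${(System.nanoTime() - start) / 1e9}%.2f s")
    result
  }

  def main(args: Array[String]): Unit = {
    // master is assumed to come from spark-submit
    val sc = new SparkContext(new SparkConf().setAppName("gc-repro"))

    // the RDD the computation actually uses; count() forces it into the cache
    val target = sc.parallelize(1 to 10000000)
      .map(i => (i % 1000, i.toLong))
      .cache()
    target.count()

    // on older spark versions reduceByKey also needs
    // `import org.apache.spark.SparkContext._` for the pair-RDD implicits
    time("computation, only target cached") {
      target.reduceByKey(_ + _).count()
    }

    // an RDD that the computation never touches, cached as deserialized objects
    val unrelated = sc.parallelize(1 to 50000000)
      .map(i => (i, i.toString))
      .cache()
    unrelated.count()

    // same computation again, now with the unrelated RDD sitting in memory
    time("computation, unrelated RDD also cached") {
      target.reduceByKey(_ + _).count()
    }

    sc.stop()
  }
}
```

comparing the two printed timings with the GC time column in the stages page of the web ui would show whether the slowdown lines up with garbage collection.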

any ideas?
