i have graphx queries running inside a service where i collect the results to the driver and do not hold on to any references to the rdds involved in the queries. my assumption was that once those references are gone, spark would remove the cached rdds from memory (note: i did not cache them, graphx did).
yet they hang around... is my understanding of how the ContextCleaner works incorrect? or could it be that graphx internally holds some references to the rdds, preventing garbage collection? maybe even circular references?
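
for anyone who wants to reproduce this, here is a minimal sketch of one way to check what is still cached. it is an illustration, not my actual service code: `CleanerCheck`, `runQuery`, the tiny triangle graph and the 5 second sleep are all made up. it relies on `SparkContext.getPersistentRDDs` to list what spark still considers persisted, and on the fact that the ContextCleaner works off weak references, so nothing gets cleaned until a gc actually runs on the driver.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

object CleanerCheck {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("cleaner-check").setMaster("local[2]")
    val sc = new SparkContext(conf)

    // hypothetical stand-in for a real query: only the collected
    // array escapes, no rdd or graph reference is kept by the caller
    def runQuery(): Array[(Long, Double)] = {
      val edges = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(3L, 1L, 1)))
      val graph = Graph.fromEdges(edges, defaultValue = 0)
      graph.pageRank(tol = 0.01).vertices.collect()
    }

    // print every rdd that spark still tracks as persisted
    def dumpPersistent(label: String): Unit = {
      println(s"--- $label ---")
      sc.getPersistentRDDs.foreach { case (id, rdd) =>
        println(s"rdd $id name=${rdd.name} level=${rdd.getStorageLevel}")
      }
    }

    val result = runQuery()
    println(s"collected ${result.length} vertices")
    dumpPersistent("right after the query")

    // the cleaner only sees an rdd as unreachable after a driver gc;
    // cleanup is asynchronous, so wait a bit (5s is an arbitrary choice)
    System.gc()
    Thread.sleep(5000)
    dumpPersistent("after driver gc")

    // fallback: unpersist whatever is left by hand
    sc.getPersistentRDDs.values.foreach(_.unpersist(blocking = false))
    dumpPersistent("after manual unpersist")

    sc.stop()
  }
}
```

if rdds still show up in the second dump, something reachable on the driver is presumably still referencing them. if they only disappear after the forced gc, then the cleaner is fine and the driver simply was not collecting often enough.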