Hi, I recently gave a talk on the RDD data structure that gives an in-depth understanding of Spark internals. You can watch it on YouTube <https://www.youtube.com/watch?v=WVdyuVwWcBc>. The slides are on SlideShare <http://www.slideshare.net/datamantra/anatomy-of-rdd> and the code is on GitHub <https://github.com/phatak-dev/anatomy-of-rdd>.
Regards, Madhukara Phatak http://datamantra.io/