Thanks Eric. Yes, I am Chinese :-). I will read through the articles, thank you!
bit1...@163.com

From: eric wong
Date: 2015-01-07 10:46
To: bit1...@163.com
CC: user
Subject: Re: Re: I think I am almost lost in the internals of Spark

A good beginning if you are Chinese:
https://github.com/JerryLead/SparkInternals/tree/master/markdown

2015-01-07 10:13 GMT+08:00 bit1...@163.com <bit1...@163.com>:

Thank you, Tobias. I will look into the Spark paper. But it looks like the paper has been moved: when I access http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf, a "Resource not found" page is returned.

bit1...@163.com

From: Tobias Pfeiffer
Date: 2015-01-07 09:24
To: Todd
CC: user
Subject: Re: I think I am almost lost in the internals of Spark

Hi,

On Tue, Jan 6, 2015 at 11:24 PM, Todd <bit1...@163.com> wrote:
I am a bit new to Spark, except that I have tried simple things like word count and the examples given in the Spark SQL programming guide. Now I am investigating the internals of Spark, but I think I am almost lost, because I cannot grasp a whole picture of what Spark does when it executes the word count.

I recommend understanding what an RDD is and how it is processed, using http://spark.apache.org/docs/latest/programming-guide.html#resilient-distributed-datasets-rdds and probably also http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf (once the server is back). Understanding how an RDD is processed is probably most helpful for understanding the whole of Spark.

Tobias

--
王海华
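Since the original question is about what Spark does when it executes word count, here is a minimal plain-Python sketch (deliberately not Spark code, so it runs anywhere) of the three stages an RDD-based word count conceptually goes through: a per-partition map to `(word, 1)` pairs, a shuffle that groups pairs by key, and a reduce that sums the counts (what `reduceByKey` does in Spark). The two-way partitioning and all names here are illustrative assumptions, not Spark API:

```python
# A plain-Python sketch (no Spark required) of the stages behind word count:
# partitioned map, shuffle (group by key), then reduce.
from collections import defaultdict

lines = ["to be or not to be", "to see or not to see"]

# 1. "Map" stage: each partition independently emits (word, 1) pairs.
partitions = [lines[:1], lines[1:]]  # pretend the data is split across two workers
mapped = [[(w, 1) for line in part for w in line.split()] for part in partitions]

# 2. "Shuffle": pairs with the same key are routed to the same reducer.
shuffled = defaultdict(list)
for part in mapped:
    for word, count in part:
        shuffled[word].append(count)

# 3. "Reduce" stage: sum the counts per word (reduceByKey in Spark).
counts = {word: sum(vals) for word, vals in shuffled.items()}
print(counts)  # {'to': 4, 'be': 2, 'or': 2, 'not': 2, 'see': 2}
```

In real Spark the map stage is lazy (nothing runs until an action is called), the shuffle moves data across the network between stages, and the partitions live on different executors; but the data flow is the same shape as this sketch.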