Hi,

Thanks for the links. Is there an English translation for these?

Sachin
On Thu, Jul 21, 2016 at 8:34 AM, Taotao.Li <charles.up...@gmail.com> wrote:

> Hi, Sachin, here are two posts about the basic concepts of Spark:
>
> - spark-questions-concepts
>   <http://litaotao.github.io/spark-questions-concepts?s=gmail>
> - deep-into-spark-exection-model
>   <http://litaotao.github.io/deep-into-spark-exection-model?s=gmail>
>
> And I fully recommend Databricks' post:
> https://databricks.com/blog/2016/06/22/apache-spark-key-terms-explained.html
>
>
> On Thu, Jul 21, 2016 at 1:36 AM, Jean Georges Perrin <j...@jgp.net> wrote:
>
>> Hey,
>>
>> I love it when questions are numbered; it's easier :)
>>
>> 1) Yes (but I am not an expert).
>> 2) You don't control it. One of my processes goes up to 8k tasks, so...
>> 3) Yes; if you have hyper-threading, it doubles. My servers have 12
>> cores, but with HT that makes 24.
>> 4) From my understanding: the slave is the logical computational unit,
>> and the worker is really the one doing the job.
>> 5) Dunno.
>> 6) Dunno.
>>
>> On Jul 20, 2016, at 1:30 PM, Sachin Mittal <sjmit...@gmail.com> wrote:
>>
>> Hi,
>> I was able to build and run my Spark application via spark-submit.
>>
>> I have understood some of the concepts by going through the resources at
>> https://spark.apache.org, but a few doubts still remain. I have a few
>> specific questions and would be glad if someone could shed some light on
>> them.
>>
>> I submitted the application using spark.master local[*], and I have
>> an 8-core PC.
>>
>> - What I understand is that the application is called a job. Since mine
>> had two stages, it got divided into 2 stages, and each stage had a number
>> of tasks which ran in parallel. Is this understanding correct?
>>
>> - I notice that each stage is further divided into 262 tasks. Where does
>> this number 262 come from? Is it configurable? Would increasing this
>> number improve performance?
>>
>> - I also see that the tasks run in parallel in sets of 8. Is this
>> because I have an 8-core PC?
>>
>> - What is the difference or relation between a slave and a worker? When
>> I did spark-submit, did it start 8 slave or worker threads?
>>
>> - I see all worker threads running in one single JVM. Is this because I
>> did not start the slaves separately and connect them to a single master
>> cluster manager? If I had done that, would each worker have run in its
>> own JVM?
>>
>> - What is the relationship between a worker and an executor? Can a
>> worker have more than one executor? If yes, how do we configure that?
>> Do all executors run in the worker JVM as independent threads?
>>
>> I suppose that is all for now. I would appreciate any response. I will
>> add follow-up questions if any.
>>
>> Thanks
>> Sachin
>>
>
> --
> *___________________*
> Quant | Engineer | Boy
> *___________________*
> *blog*: http://litaotao.github.io
> <http://litaotao.github.io?utm_source=spark_mail>
> *github*: www.github.com/litaotao
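On the 262-tasks and sets-of-8 questions in the quoted mail: in Spark, the number of tasks in a stage equals the number of partitions of the data that stage processes (so 262 most likely reflects how the input was partitioned), and under local[*] the number of tasks that can run concurrently equals the number of cores. A minimal plain-Python sketch of that scheduling arithmetic, no Spark required (the slot and task counts are taken from the thread; the "waves" framing is just an illustration):

```python
import math

# With master local[*] on an 8-core PC, Spark gets 8 task slots.
task_slots = 8          # one slot per core under local[*]
tasks_per_stage = 262   # one task per partition of the stage's data

# Tasks run in "waves": at most 8 at a time until all 262 have finished.
waves = math.ceil(tasks_per_stage / task_slots)
print(waves)  # 33 waves of up to 8 concurrent tasks
```

The partition count is configurable: for example, `rdd.repartition(n)` on a specific dataset, or the `spark.default.parallelism` setting for default RDD shuffle parallelism (Spark SQL shuffles use `spark.sql.shuffle.partitions` instead). Whether raising it helps depends on whether the existing tasks are keeping all cores busy.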
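On the worker/executor questions (the last three above): in local[*] mode everything runs inside a single JVM, which matches what was observed; against a standalone cluster, each executor is a separate JVM hosted by a worker process, and one worker can host more than one executor. A hedged sketch of the spark-submit options that control this, assembled here as a Python list so it can be inspected without a cluster (the master URL, memory size, core counts, and my_app.py are all illustrative placeholders, not recommendations):

```python
# Build (but do not run) an illustrative spark-submit command line for a
# standalone cluster. Every value below is a placeholder for illustration.
cmd = [
    "spark-submit",
    "--master", "spark://master-host:7077",  # standalone master instead of local[*]
    "--executor-memory", "2g",               # heap for each executor JVM
    "--executor-cores", "2",                 # concurrent task slots per executor
    "--total-executor-cores", "8",           # cap across the whole app (standalone mode)
    "my_app.py",                             # hypothetical application script
]
print(" ".join(cmd))
```

With these example numbers, Spark could start up to 4 executors (8 total cores / 2 per executor); whether they land on one worker or several depends on the cluster, and each executor is its own JVM rather than a thread inside the worker.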