Hi Xiao,

Thank you very much for the pointers. I looked into that part of the code, and I now understand how the main method is invoked. It is still not clear to me how the code is distributed to the executors. Is it the whole jar or some serialized object? I was expecting to see the part of the code where the closures are serialized and shipped. Maybe I am missing something.

Thanks again,
Arijit

Date: Thu, 8 Oct 2015 10:26:55 -0700
Subject: Re: Understanding code/closure shipment to Spark workers
From: gatorsm...@gmail.com
To: arij...@live.com
CC: dev@spark.apache.org

Hi, Arijit,

The code flow of spark-submit is simple:

Enter the main function of SparkSubmit.scala
--> case SparkSubmitAction.SUBMIT => submit(appArgs)
--> doRunMain() in function submit() in the same file
--> runMain(childArgs,...) in the same file
--> mainMethod.invoke(null, childArgs.toArray) in the same file

The invoke() method is provided by Java reflection; it is what calls the main function of your JAR.

Hopefully this helps you understand the flow.

Thanks,
Xiao Li

2015-10-07 16:47 GMT-07:00 Arijit <arij...@live.com>:

Hi,

I want to understand the code flow starting from the Spark jar that I submit through spark-submit: how does Spark identify and extract the closures, clean and serialize them, and ship them to workers to execute as tasks? Can someone point me to any documentation, or to the relevant source code path, to help me understand this?

Thanks,
Arijit
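
For illustration, the last step of the flow Xiao describes, mainMethod.invoke(null, childArgs.toArray), can be sketched in plain Java. This is a minimal sketch of reflective main invocation, not the SparkSubmit code itself: ReflectiveMainDemo, UserApp, and the argument values are made up for the example, and the class is loaded from the current classpath rather than from a submitted JAR.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class ReflectiveMainDemo {

    // Hypothetical stand-in for the application class packaged in the user's JAR.
    public static class UserApp {
        // Records the arguments so the call can be observed from outside.
        public static String lastArgs = "";

        public static void main(String[] args) {
            lastArgs = String.join(",", args);
            System.out.println("UserApp.main called with: " + lastArgs);
        }
    }

    // Looks up the class by name, finds its static main(String[]) method,
    // and invokes it through reflection.
    static void runMain(String mainClassName, List<String> childArgs) throws Exception {
        Class<?> mainClass = Class.forName(mainClassName);
        Method mainMethod = mainClass.getMethod("main", String[].class);
        // The receiver is null because main is static. The cast to Object stops
        // Java from spreading the array across invoke's varargs parameter.
        mainMethod.invoke(null, (Object) childArgs.toArray(new String[0]));
    }

    public static void main(String[] args) throws Exception {
        List<String> childArgs = new ArrayList<>();
        childArgs.add("input.txt"); // assumed example arguments
        childArgs.add("output");
        runMain(UserApp.class.getName(), childArgs);
    }
}
```

Note that this only shows how the application's main method gets started in the launching JVM. It does not show how closures are serialized or shipped to executors, which is the part Arijit is still asking about.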