One clarification: there *are* Python interpreters running on executors so
that Python UDFs and RDD API code can be executed. Some slightly-outdated
but mostly-correct reference material for this can be found at
https://cwiki.apache.org/confluence/display/SPARK/PySpark+Internals.
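To make the mechanism concrete, here is a minimal pure-Python sketch of the idea behind executor-side Python workers: the driver serializes a Python function (PySpark actually uses cloudpickle, which can serialize functions by value; plain pickle is used here only for illustration) and a worker process deserializes it and applies it to a partition of data. The function name `add_one` and the partition contents are made up for the example.

```python
import pickle

# Driver side: a user-defined Python function is serialized so it can be
# shipped to Python worker processes running alongside the JVM executors.
# (Real PySpark uses cloudpickle, a superset of pickle.)
def add_one(x):
    return x + 1

payload = pickle.dumps(add_one)  # in real PySpark, these bytes go over the wire

# Executor side: a Python worker deserializes the function and applies it
# element-by-element to its partition of the RDD.
func = pickle.loads(payload)
partition = [1, 2, 3]
result = [func(x) for x in partition]
```

This is of course a simplification: the real protocol streams serialized batches between the JVM executor and the Python worker over a socket, but the serialize-ship-apply shape is the same.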
Hi, I'm interested in figuring out how the Python API for Spark works.
I've come to the following conclusions and want to share them with the
community; they could be of use in the PySpark docs, specifically the
"Execution and pipelining" section.
Any sanity checking would be much appreciated.