Hi, I want to use PySpark, but I can't understand how it works. The documentation doesn't provide enough information.
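To make the questions below concrete, here is a minimal sketch of the kind of job I have in mind (everything here is hypothetical: the local SparkContext, the "transform" function, and the made-up file name):

    import math
    from pyspark import SparkContext

    sc = SparkContext("local", "example")

    def transform(x):
        # uses the standard-library math module inside the mapped function
        return math.sqrt(x) + 1.0

    print(sc.parallelize([1, 4, 9, 16]).map(transform).collect())
    # [2.0, 3.0, 4.0, 5.0]

    # And, for question 3, something like:
    # sc.addPyFile("my_cpp_binary")  # a compiled C++ executable, not a .py file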
1) How is Python shipped to the cluster? Do the machines in the cluster need to have Python installed already?

2) What happens when I write some Python code in a "map" function, like "transform" in the sketch above? Is it shipped to the cluster and executed there? How does Spark work out all the dependencies my code needs and ship them over? If I use "math" inside my "map" function, does that mean I ship the "math" module myself, or is the Python "math" module already present on the cluster used?

3) I have compiled C++ code. Can I ship this executable with "addPyFile" and then just run it from Python with the "exec" function? Would that work?

--
Sincerely yours,
Egor Pakhomov
Scala Developer, Yandex