This is done by cloudpickle. They pickle global variables referred within
the func together, and register it to the global imported modules.

On Wed, 20 Jul 2022 at 00:55, Li Jin <ice.xell...@gmail.com> wrote:

> Hi,
>
> I have a question about how does "imports" get send to the python worker.
>
> For example, I have
>
> def foo(x):
>     return np.abs(x)
>
> If I run this code directly, it obviously failed (because np is undefined
> on the driver process):
>
> sc.paralleilize([1, 2, 3]).map(foo).collect()
>
> However, if I add the import statement "import numpy as np" on the driver,
> it works. So somehow driver is sending that "imports" to the worker when
> executing foo on the worker but I cannot seem t o find the code that does
> this - Can someone please send me a pointer?
>
> Thanks,
> Li
>

Reply via email to