Aha I see. Thanks Hyukjin!

On Tue, Jul 19, 2022 at 9:09 PM Hyukjin Kwon <gurwls...@gmail.com> wrote:

> This is done by cloudpickle. They pickle global variables referred within
> the func together, and register it to the global imported modules.
>
> On Wed, 20 Jul 2022 at 00:55, Li Jin <ice.xell...@gmail.com> wrote:
>
>> Hi,
>>
>> I have a question about how does "imports" get send to the python worker.
>>
>> For example, I have
>>
>> def foo(x):
>>     return np.abs(x)
>>
>> If I run this code directly, it obviously failed (because np is undefined
>> on the driver process):
>>
>> sc.paralleilize([1, 2, 3]).map(foo).collect()
>>
>> However, if I add the import statement "import numpy as np" on the
>> driver, it works. So somehow driver is sending that "imports" to the worker
>> when executing foo on the worker but I cannot seem t o find the code that
>> does this - Can someone please send me a pointer?
>>
>> Thanks,
>> Li
>>
>

Reply via email to