Is mylist present on every executor? If not, you have to pass it to them,
and a broadcast variable is the best way to do that. Note that once it is
broadcast, the list is immutable on the executors; if you update it on the
driver, you will have to broadcast it again.
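
Something along these lines might work. It is only a rough sketch: the
names make_myfunc and run_example, the sample data, and the local master
are made up for illustration, and I am assuming myfunc only needs to read
the list. The idea is to hand myfunc a broadcast handle rather than rely
on a module-level global.

```python
# Sketch only: make_myfunc, run_example, and the sample data are
# illustrative names and values, not taken from the original code.

def make_myfunc(metadata_bc):
    """Build the map function around a broadcast handle, not a module global."""
    def myfunc(record):
        # .value is read on the executor and returns the broadcast list.
        return (record, record in metadata_bc.value)
    return myfunc


def run_example():
    # Imported here so make_myfunc can be used without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext("local[2]", "broadcast-sketch")
    try:
        mylist = ["a", "b"]                 # metadata populated on the driver
        metadata_bc = sc.broadcast(mylist)  # shipped once to each executor
        myrdd = sc.parallelize(["a", "c"])
        return myrdd.map(make_myfunc(metadata_bc)).collect()
    finally:
        sc.stop()
```

If mylist changes on the driver, call sc.broadcast again and rebuild the
function with make_myfunc so that later batches pick up the new handle.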

TD

On Wed, Apr 22, 2015 at 9:28 AM, Vadim Bichutskiy <
vadim.bichuts...@gmail.com> wrote:

> I am using Spark Streaming with Python. For each RDD, I call a map, i.e.,
> myrdd.map(myfunc), where myfunc is in a separate Python module. In yet
> another separate Python module I have a global list, mylist, that's
> populated with metadata. I can't get myfunc to see mylist; it's always
> empty. Alternatively, I guess I could pass mylist to map.
>
> Any suggestions?
>
> Thanks,
> Vadim
