Is mylist present on every executor? If not, then you have to pass it along, and a broadcast variable is the best way to do that. But note that once broadcast, it will be immutable on the executors, and if you update the list at the driver, you will have to broadcast it again.
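Something along these lines might work; treat it as a rough sketch rather than a drop-in fix. It assumes `sc` is your existing SparkContext and that myfunc only needs to check membership in mylist. The names `make_myfunc` and `tag_records` are made up for illustration.

```python
def make_myfunc(mylist_bc):
    """Build the map function around a broadcast handle (hypothetical helper)."""
    def myfunc(record):
        # .value is the read-only copy of the list held on the executor
        return (record, record in mylist_bc.value)
    return myfunc


def tag_records(sc, myrdd, mylist):
    """Broadcast mylist from the driver, then map over the RDD (hypothetical helper)."""
    mylist_bc = sc.broadcast(mylist)  # ships one copy of the list to each executor
    return myrdd.map(make_myfunc(mylist_bc))
```

If mylist changes at the driver later, call `sc.broadcast` again and build a fresh function from the new handle; the old handle keeps the old contents.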
TD

On Wed, Apr 22, 2015 at 9:28 AM, Vadim Bichutskiy <vadim.bichuts...@gmail.com> wrote:

> I am using Spark Streaming with Python. For each RDD, I call a map, i.e.,
> myrdd.map(myfunc), myfunc is in a separate Python module. In yet another
> separate Python module I have a global list, i.e. mylist, that's populated
> with metadata. I can't get myfunc to see mylist...it's always empty.
> Alternatively, I guess I could pass mylist to map.
>
> Any suggestions?
>
> Thanks,
> Vadim