On Thu, Feb 19, 2015 at 7:57 AM, jamborta <jambo...@gmail.com> wrote:
> Hi all,
>
> I think I have run into an issue with the lazy evaluation of variables in
> pyspark. I have the following:
>
> functions = [func1, func2, func3]
>
> for counter in range(len(functions)):
>     data = data.map(lambda value: [functions[counter](value)])
You need to create a wrapper for counter:

def mapper(f):
    return lambda v: [f(v)]

for f in functions:
    data = data.map(mapper(f))

> it looks like the counter is evaluated when the RDD is computed, so it
> fills in all three mappers with the last value of it. Is there any way
> to force it to be evaluated at the time? (I am aware that I could
> persist it after each step, which sounds a bit of a waste)
>
> thanks,
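For anyone hitting the same pitfall, here is a minimal self-contained sketch of the
same closure behaviour in plain Python (no Spark needed; the three lambdas stand in
for the original func1/func2/func3, which are hypothetical). It shows why the loop
variable leaks into every mapper and how the wrapper fixes it:

    # Plain lists stand in for the RDD; the closure behaviour is identical.
    functions = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]

    # Broken: each lambda captures the *variable* counter, not its value.
    # By the time the lambdas run, the loop is done and counter == 2,
    # so every stage applies functions[2].
    broken = []
    for counter in range(len(functions)):
        broken.append(lambda value: [functions[counter](value)])

    # Fixed: the wrapper's argument f is bound per call, freezing each function.
    def mapper(f):
        return lambda v: [f(v)]

    fixed = [mapper(f) for f in functions]

    print(broken[0](10))  # [7]  -- applies functions[2] (v - 3), not v + 1
    print(fixed[0](10))   # [11] -- applies functions[0] (v + 1), as intended

An equivalent idiom is to bind the value through a default argument, e.g.
data.map(lambda value, f=functions[counter]: [f(value)]), which freezes f at
lambda-creation time for the same reason the wrapper does.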