Hi all,

I have a process where I do some calculations on each one of the columns of a 
dataframe.
Intrinsically, I iterate over the columns with a for loop. On the other hand, 
the per-column computation is not itself fully distributable.

To speed up the process, I would like to submit a Spark job for each 
column. Any suggestions? I was thinking of plain threads sharing a single 
SparkContext.
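For what it's worth, here is a minimal sketch of that thread-pool idea. To keep it self-contained it uses plain Python lists in place of Spark columns, and `process_column` is a hypothetical stand-in; in a real job each worker thread would instead run DataFrame/RDD actions against the one shared SparkContext (which is thread-safe for submitting jobs):

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-column computation; in a real Spark program this body
# would run DataFrame/RDD operations against the shared SparkContext.
def process_column(name_and_values):
    name, values = name_and_values
    return name, sum(values) / len(values)  # e.g. a column mean

# Stand-in for the dataframe's columns.
columns = {"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]}

# One driver-side thread per column; jobs submitted from different
# threads can then run concurrently on the cluster.
with ThreadPoolExecutor(max_workers=len(columns)) as pool:
    results = dict(pool.map(process_column, columns.items()))

print(results)  # {'a': 2.0, 'b': 5.0}
```

If you go this route, also consider setting `spark.scheduler.mode` to `FAIR` so jobs submitted from the different threads share executors more evenly instead of queueing FIFO.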

Thank you,
Saif
