RE: Is it possible to call a transform + action inside an action?

2014-10-28 Thread Ganelin, Ilya
You cannot nest RDD transformations in Scala Spark. The issue is that when the outer operation is distributed to the cluster and kicks off a new job (the inner query), the inner job no longer has the context for the outer job. The way around this is to either do a join on two RDDs or to
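
A minimal sketch of the join approach, assuming two small keyed RDDs. The object name, the sample data, and the local master setting are hypothetical and chosen only for illustration; they are not from the original thread.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object JoinInsteadOfNesting {
  // Returns each user paired with its city, matched on the shared Int key.
  def joinedNames(sc: SparkContext): Array[(Int, (String, String))] = {
    val users  = sc.parallelize(Seq((1, "alice"), (2, "bob")))   // hypothetical data
    val cities = sc.parallelize(Seq((1, "Paris"), (2, "Oslo")))  // hypothetical data

    // Not allowed: referencing one RDD inside another RDD's closure, e.g.
    //   users.map { case (id, _) => cities.filter(_._1 == id).count() }
    // Instead, express the lookup as a join on the key.
    users.join(cities).collect().sortBy(_._1)
  }

  def main(args: Array[String]): Unit = {
    // local[*] is an assumption so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("join-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try joinedNames(sc).foreach(println)
    finally sc.stop()
  }
}
```

The join runs as a single distributed job, so no RDD is ever referenced from inside a task.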

RE: Is it possible to call a transform + action inside an action?

2014-10-28 Thread kpeng1
Ok cool. So in that case, the only way I can think of to do this is to call the toArray method on those RDDs, which would return Array[String], and store the results as broadcast variables. I have read about broadcast variables, but they are still fuzzy to me. I assume that since broadcast variables are
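
A rough sketch of the broadcast idea described above, assuming the RDD being looked up is small enough to fit in driver memory. It uses collect(), which also returns an Array, in place of toArray. The object name and the sample data are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object BroadcastLookup {
  // Keeps only the lines that contain at least one broadcast keyword.
  def matchingLines(sc: SparkContext): Array[String] = {
    val small = sc.parallelize(Seq("spark", "scala"))                            // hypothetical data
    val lines = sc.parallelize(Seq("spark rocks", "hello world", "scala too"))  // hypothetical data

    // Bring the small RDD to the driver, then ship it once to every executor.
    val keywords = sc.broadcast(small.collect().toSet)

    // Inside the closure, read the plain Scala Set via .value; no RDD is referenced.
    lines
      .filter(line => line.split(" ").exists(keywords.value.contains))
      .collect()
      .sorted
  }

  def main(args: Array[String]): Unit = {
    // local[*] is an assumption so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("broadcast-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try matchingLines(sc).foreach(println)
    finally sc.stop()
  }
}
```

This only makes sense when the collected data is small; for two large RDDs, a join is usually the safer choice.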