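Your list is defined on the driver, whereas the function passed to foreach is evaluated on the executors. Each task appends to its own deserialized copy of the list, so the driver's copy stays empty. You might want to use an accumulator, or bring the sequence of results from each partition back to the driver.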
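A rough sketch of both options, assuming Spark 1.x (the API used in your snippet), String ids, and a table "t" that is already registered. The names ChunkedJoin, collectChunk and accumulateChunk are made up for illustration, and I wrote the join with an explicit JOIN ... ON instead of the comma form. Note that collect() pulls the whole chunk result into driver memory, so it only makes sense if one chunk fits there.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import scala.collection.mutable.ListBuffer

// Hypothetical helper object, not part of Spark.
object ChunkedJoin {
  private val query = "SELECT r.id, t.data FROM r JOIN t ON r.id = t.id"

  // Option 1: collect() ships the joined rows back to the driver,
  // so the ListBuffer is filled on the driver and keeps its contents.
  def collectChunk(sc: SparkContext, sqlContext: SQLContext,
                   ids: Seq[String]): Seq[String] = {
    import sqlContext.implicits._
    // Name the column "id" so that r.id resolves in the query.
    sc.parallelize(ids).toDF("id").registerTempTable("r")
    val v1 = new ListBuffer[String]()
    sqlContext.sql(query).collect().foreach(row => v1 += row.getString(0))
    v1.toList
  }

  // Option 2: a collection accumulator. Tasks add to it on the executors
  // and Spark merges the partial results back on the driver.
  def accumulateChunk(sc: SparkContext, sqlContext: SQLContext,
                      ids: Seq[String]): Seq[String] = {
    import sqlContext.implicits._
    sc.parallelize(ids).toDF("id").registerTempTable("r")
    val acc = sc.accumulableCollection(new ListBuffer[String]())
    sqlContext.sql(query).foreach(row => acc += row.getString(0))
    acc.value.toList
  }
}
```

With that in place, the loop would look like listOfIds.grouped(5000).foreach { chunk => val v1 = ChunkedJoin.collectChunk(sc, sqlContext, chunk); println(v1.size) }. Here I only keep the matched ids. If you also need the blob column, read it from the row in the same place, keeping in mind how much data ends up on the driver.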
On Wed, Dec 9, 2015 at 11:19 AM, Madabhattula Rajesh Kumar <mrajaf...@gmail.com> wrote:

> Hi,
>
> I have a below query. Please help me to solve this
>
> I have a 20000 ids. I want to join these ids to table. This table contains
> some blob data. So i can not join these 2000 ids to this table in one step.
>
> I'm planning to join this table in a chunks. For example, each step I will
> join 5000 ids.
>
> Below code is not working. I'm not able to add result to ListBuffer.
> Result s giving always ZERO
>
> *Code Block :-*
>
> var listOfIds is a ListBuffer with 20000 records
>
> listOfIds.grouped(5000).foreach { x =>
> {
> var v1 = new ListBuffer[String]()
> val r = sc.parallelize(x).toDF()
> r.registerTempTable("r")
> var result = sqlContext.sql("SELECT r.id, t.data from r, t where r.id =
> t.id")
> result.foreach{ y =>
> {
> v1 += y
> }
> }
> println(" SIZE OF V1 === "+ v1.size) ==>
>
> *THIS VALUE PRINTING AS ZERO*
>
> *// Save v1 values to other db*
> }
>
> Please help me on this.
>
> Regards,
> Rajesh
>

--
Regards,
Rishitesh Mishra,
SnappyData . (http://www.snappydata.io/)

https://in.linkedin.com/in/rishiteshmishra