Re: dataframe.foreach VS dataframe.collect().foreach

2016-07-26 Thread Pedro Rodriguez
:) Just realized you didn't get your original question answered though:
scala> import sqlContext.implicits._
import sqlContext.implicits._
scala> case class Person(age: Long, name: String)
defined class Person
scala> val df = Seq(Person(24, "pedro"), Person(22, "fritz")).toDF()
df:

Re: dataframe.foreach VS dataframe.collect().foreach

2016-07-26 Thread Gourav Sengupta
And Pedro has made sense of a world running amok, scared, and in a drunken stupor.
Regards, Gourav
On Tue, Jul 26, 2016 at 2:01 PM, Pedro Rodriguez wrote:
> I am not 100% sure as I haven't tried this out, but there is a huge difference
> between the two. Both foreach and collect

Re: dataframe.foreach VS dataframe.collect().foreach

2016-07-26 Thread Pedro Rodriguez
I am not 100% sure as I haven't tried this out, but there is a huge difference between the two. Both foreach and collect are actions, regardless of whether or not the data frame is empty. Doing a collect will bring all the results back to the driver, possibly causing it to run out of memory. Foreach
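To make the distinction above concrete, here is a minimal sketch, not taken from the thread. It assumes Spark 2.x with a local SparkSession; the object name, column names, sample rows, and the accumulator are all hypothetical and chosen only for illustration.

```scala
import org.apache.spark.sql.{Row, SparkSession}

object ForeachVsCollect {
  def main(args: Array[String]): Unit = {
    // Assumption: a local SparkSession, used only for illustration.
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("foreach-vs-collect")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data, similar to the Person rows shown in this thread.
    val df = Seq((24L, "pedro"), (22L, "fritz")).toDF("age", "name")

    // df.foreach is an action whose function runs on the executors.
    // Side effects happen there, so an accumulator is used to report
    // a result back to the driver.
    val rowsSeen = spark.sparkContext.longAccumulator("rowsSeen")
    df.foreach((row: Row) => rowsSeen.add(1L))
    println(s"rows visited by df.foreach: ${rowsSeen.value}")

    // df.collect() is also an action. It copies every row into an
    // Array[Row] on the driver; the foreach that follows is the ordinary
    // Scala collection method and runs entirely on the driver.
    val names = scala.collection.mutable.ArrayBuffer.empty[String]
    df.collect().foreach(row => names += row.getAs[String]("name"))
    println(s"names gathered on the driver: ${names.mkString(", ")}")

    spark.stop()
  }
}
```

The explicit `(row: Row)` parameter type is a precaution against overload ambiguity between the Scala-function and Java-function variants of `foreach`. For a large data frame the `collect()` path holds every row in driver memory at once, which is the out-of-memory risk described above.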

Re: dataframe.foreach VS dataframe.collect().foreach

2016-07-26 Thread kevin
Thank you, Chanh.
2016-07-26 15:34 GMT+08:00 Chanh Le:
> Hi Ken,
>
> blacklistDF is just a DataFrame.
> Spark is lazy until you call something like collect, take, or write; then it
> will execute the whole process, such as any map or filter you did before the
> collect.
> That means until

Re: dataframe.foreach VS dataframe.collect().foreach

2016-07-26 Thread Chanh Le
Hi Ken, blacklistDF is just a DataFrame. Spark is lazy until you call something like collect, take, or write; then it will execute the whole process, such as any map or filter you did before the collect. That means until you call collect Spark does nothing, so your df would not have any data -> can't call foreach. Call
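The laziness described here can be observed with a small sketch, again not from the thread. It assumes Spark 2.x in local mode; the object name, sample data, and accumulator are hypothetical. Note that, as Pedro's message in this thread says, foreach is itself an action, so it triggers execution in the same way collect does.

```scala
import org.apache.spark.sql.{Row, SparkSession}

object LazyUntilAction {
  def main(args: Array[String]): Unit = {
    // Assumption: a local SparkSession, used only for illustration.
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("lazy-until-action")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data.
    val df = Seq((24L, "pedro"), (22L, "fritz")).toDF("age", "name")

    // Counts how many times the filter function is actually evaluated.
    val evaluated = spark.sparkContext.longAccumulator("evaluated")

    // filter is a transformation: it only extends the query plan.
    val older = df.filter((row: Row) => {
      evaluated.add(1L)
      row.getAs[Long]("age") > 23L
    })
    // No job has run yet, so the filter function has not been called.
    println(s"evaluations before any action: ${evaluated.value}")

    // collect is an action: it runs the plan, including the filter above.
    val result = older.collect()
    println(s"evaluations after collect: ${evaluated.value}")
    println(s"rows returned to the driver: ${result.length}")

    spark.stop()
  }
}
```

Accumulators updated inside transformations can be incremented more than once if a task is retried, so the counter here is only a rough indicator that work happened, not an exact row count.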

dataframe.foreach VS dataframe.collect().foreach

2016-07-26 Thread kevin
Hi all: I don't quite understand the difference between dataframe.foreach and dataframe.collect().foreach. When should I use dataframe.foreach? I use Spark 2.0, and I want to iterate over a dataframe to get one column's value. This works: blacklistDF.collect().foreach { x =>