Fwd: map vs foreach for sending data to external system

2015-07-02 Thread Alexandre Rodrigues
Hi Spark devs, I'm coding a Spark job and at a certain point in execution I need to send some data present in an RDD to an external system. val myRdd = myRdd.foreach { record => sendToWhtv(record) } The thing is that foreach forces materialization of the RDD and it seems to be executed

Re: map vs foreach for sending data to external system

2015-07-02 Thread Alexandre Rodrigues
/latest/programming-guide.html#actions Thanks guys! -- Alexandre Rodrigues On Thu, Jul 2, 2015 at 5:37 PM, Eugen Cepoi cepoi.eu...@gmail.com wrote: *The thing is that foreach forces materialization of the RDD and it seems to be executed on the driver program* What makes you think

Re: map vs foreach for sending data to external system

2015-07-02 Thread Alexandre Rodrigues
:) 2015-07-02 18:59 GMT+02:00 Alexandre Rodrigues alex.jose.rodrig...@gmail.com: Foreach is listed as an action[1]. I guess an *action* just means that it forces materialization of the RDD. I just noticed much faster executions with map although I don't like the map approach. I'll look
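
For readers following the thread, here is a minimal sketch of the difference being discussed, not the poster's actual job. It assumes a local-mode SparkContext, and ExternalSystem.send is a hypothetical stand-in for sendToWhtv that only counts calls. Because map is a lazy transformation, calling it without a following action sends nothing, which may be why it looked much faster; foreach and foreachPartition are actions whose closures run inside the tasks on the executors.

```scala
import java.util.concurrent.atomic.AtomicInteger
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical stand-in for the external system client (not from the thread).
// In local mode the tasks run in the same JVM as the driver, so this counter
// can be inspected afterwards; on a real cluster each executor JVM would hold
// its own copy and the driver would not see the updates.
object ExternalSystem {
  val sent = new AtomicInteger(0)
  def send(record: String): Unit = sent.incrementAndGet()
}

object SendDemo {
  def main(args: Array[String]): Unit = {
    // Assumption: local mode with two threads, purely for illustration.
    val conf = new SparkConf().setAppName("send-demo").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val records = sc.parallelize(Seq("a", "b", "c", "d"), numSlices = 2)

      // map is a transformation: it only records lineage. No task is launched
      // here, so nothing is sent until an action is called on `mapped`.
      val mapped = records.map(r => ExternalSystem.send(r))
      println(s"sent after map: ${ExternalSystem.sent.get}")

      // foreach is an action: it launches tasks and calls send once per record.
      records.foreach(r => ExternalSystem.send(r))
      println(s"sent after foreach: ${ExternalSystem.sent.get}")

      // foreachPartition is also an action; it hands each task an iterator over
      // its partition, so a real client could open one connection per partition
      // instead of one per record.
      records.foreachPartition { partition =>
        partition.foreach(r => ExternalSystem.send(r))
      }
      println(s"sent after foreachPartition: ${ExternalSystem.sent.get}")
    } finally {
      sc.stop()
    }
  }
}
```

If the send has to happen as a side effect, an action such as foreach or foreachPartition states that intent directly; a map only takes effect once some later action forces it, and it may be recomputed if the RDD is evaluated more than once.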