Re: How to Map and Reduce in sparkR

2015-06-25 Thread Eskilson,Aleksander
The simple answer is that SparkR does support map/reduce operations over RDDs through the RDD API, but as of Spark v1.4.0 those functions are private in SparkR. They can still be accessed by prefixing the function name with the package namespace, as in SparkR:::lapply(rdd, func). It was thought

Re: How to Map and Reduce in sparkR

2015-06-25 Thread Shivaram Venkataraman
In addition to Aleksander's point, please let us know which use cases would need an RDD-like API at https://issues.apache.org/jira/browse/SPARK-7264 -- we are hoping to have a version of this API in upcoming releases. Thanks, Shivaram On Thu, Jun 25, 2015 at 6:02 AM, Eskilson,Aleksander

Re: How to Map and Reduce in sparkR

2015-06-25 Thread Wei Zhou
Hi Shivaram/Alek, I understand that it is better to import data into a DataFrame rather than an RDD. If one wants to do a map-like transformation on each row in SparkR, one could use SparkR:::lapply(), but is there a counterpart row operation on DataFrame? The use case I am working on requires

Re: How to Map and Reduce in sparkR

2015-06-25 Thread Wei Zhou
Thanks, Shivaram. For those who, like me, prefer to watch the video version of the talk, you can register for the Spark Summit 2015 live stream free of charge. I personally found the talk extremely helpful. 2015-06-25 15:20 GMT-07:00 Shivaram Venkataraman shiva...@eecs.berkeley.edu : We don't

How to Map and Reduce in sparkR

2015-06-24 Thread Wei Zhou
Does anyone know whether SparkR supports map and reduce operations like the RDD transformations? Thanks in advance. Best, Wei