Re: foreachPartition in Java
Great, I will use mapPartitions instead.

Thanks for the advice,
Yadid

On 11/17/13 8:13 PM, Aaron Davidson wrote:

Also, in general, you can work around shortcomings in the Java API by converting to a Scala RDD (using JavaRDD's rdd() method). The API tends to be much clunkier, though, since you have to jump through some hoops to talk to a Scala API from Java. In this case, JavaRDD's mapPartitions() method will likely be the cleanest solution, as Patrick said.

On Sun, Nov 17, 2013 at 5:03 PM, Patrick Wendell pwend...@gmail.com wrote:

Can you just call mapPartitions and ignore the result?

- Patrick

On Sun, Nov 17, 2013 at 4:45 PM, Yadid Ayzenberg ya...@media.mit.edu wrote:

Hi,

According to the API, foreachPartition() is not yet implemented in Java. Are there any workarounds to get the same functionality? I have a non-serializable DB connection, and instantiating it is pretty expensive, so I prefer to do it on a per-partition basis.

Thanks,
Yadid
foreachPartition in Java
Hi,

According to the API, foreachPartition() is not yet implemented in Java. Are there any workarounds to get the same functionality? I have a non-serializable DB connection, and instantiating it is pretty expensive, so I prefer to do it on a per-partition basis.

Thanks,
Yadid
Re: foreachPartition in Java
Also, in general, you can work around shortcomings in the Java API by converting to a Scala RDD (using JavaRDD's rdd() method). The API tends to be much clunkier, though, since you have to jump through some hoops to talk to a Scala API from Java. In this case, JavaRDD's mapPartitions() method will likely be the cleanest solution, as Patrick said.

On Sun, Nov 17, 2013 at 5:03 PM, Patrick Wendell pwend...@gmail.com wrote:

Can you just call mapPartitions and ignore the result?

- Patrick

On Sun, Nov 17, 2013 at 4:45 PM, Yadid Ayzenberg ya...@media.mit.edu wrote:

Hi,

According to the API, foreachPartition() is not yet implemented in Java. Are there any workarounds to get the same functionality? I have a non-serializable DB connection, and instantiating it is pretty expensive, so I prefer to do it on a per-partition basis.

Thanks,
Yadid
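
For anyone finding this thread later, here is a rough sketch of the pattern being suggested: open the expensive connection once per partition, drain that partition's iterator, and return an empty result that the caller throws away. It is plain Java with no Spark dependency, so it only models the idea. FakeConnection, writePartition and run are hypothetical names made up for the illustration, and the outer loop stands in for whatever Spark does when it hands each partition's iterator to the function you pass to mapPartitions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

class PerPartitionConnectionSketch {

    /** Hypothetical stand-in for an expensive, non-serializable DB connection. */
    static final class FakeConnection implements AutoCloseable {
        static int opened = 0;                       // how many connections were created
        final List<String> written = new ArrayList<>();

        FakeConnection() { opened++; }

        void write(String record) { written.add(record); }

        @Override
        public void close() { /* a real connection would release its resources here */ }
    }

    /**
     * Per-partition body: creates ONE connection inside the function (so it never
     * needs to be serialized), writes every record, and returns an empty result.
     */
    static List<Void> writePartition(Iterator<String> records, List<String> sink) {
        try (FakeConnection conn = new FakeConnection()) {
            while (records.hasNext()) {
                conn.write(records.next());
            }
            sink.addAll(conn.written);               // stands in for a commit/flush
        }
        return Collections.emptyList();              // nothing useful to return
    }

    /** Simulates the framework calling the function once per partition. */
    static int run(List<List<String>> partitions, List<String> sink) {
        FakeConnection.opened = 0;
        for (List<String> partition : partitions) {
            writePartition(partition.iterator(), sink);  // result deliberately ignored
        }
        return FakeConnection.opened;
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"));
        List<String> sink = new ArrayList<>();
        int opened = run(partitions, sink);
        System.out.println("connections opened: " + opened + ", records written: " + sink);
    }
}
```

One caveat when carrying this over to real Spark code: mapPartitions is a lazy transformation, so "ignoring the result" still means running some action on the returned RDD (count(), for instance), otherwise the per-partition function never executes. The exact shape of the function object you pass in depends on the Spark version, so check the JavaRDD documentation for your release.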