Re: Spark Function setup and cleanup

2014-07-26 Thread Yosi Botzer
Thank you, but that doesn't answer my general question. I might need to enrich my records using different data sources (or DBs). So the general use case I need to support is to have some kind of Function that has init() logic for creating a connection to the DB, querying the DB for each record and enrich

Re: Spark Function setup and cleanup

2014-07-26 Thread Sean Owen
Look at mapPartitions. Whereas map turns one value V1 into one value V2, mapPartitions lets you turn one entire Iterator[V1] into one whole Iterator[V2]. The function that does so can perform some initialization at its start, then process all of the values, and clean up at its end. This is how
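
A minimal sketch of that shape in plain Java, with no Spark dependency, so it only illustrates the "iterator in, iterator out" body one might place inside the function handed to mapPartitions. UserStore, PartitionEnricher and the sample rows are hypothetical stand-ins for a real DB or HBase client, not part of any library. The point is that the connection is opened once per partition and closed once, rather than once per record.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a DB/HBase client (assumption, not a real API).
class UserStore implements AutoCloseable {
    static int opened = 0;   // counts how many "connections" were created
    static int closed = 0;   // counts how many were closed
    private final Map<String, String> rows = new HashMap<>();

    UserStore() {
        opened++;
        rows.put("u1", "Alice");
        rows.put("u2", "Bob");
    }

    // Returns the user's name, or "unknown" when the id is not present.
    String lookup(String userId) {
        return rows.getOrDefault(userId, "unknown");
    }

    @Override
    public void close() {
        closed++;
    }
}

class PartitionEnricher {
    // Processes one whole partition: setup once, enrich every record, cleanup once.
    static Iterator<String> enrich(Iterator<String> userIds) {
        List<String> out = new ArrayList<>();
        try (UserStore store = new UserStore()) {       // setup
            while (userIds.hasNext()) {
                String id = userIds.next();
                out.add(id + ":" + store.lookup(id));   // per-record enrichment
            }
        }                                               // cleanup via close()
        return out.iterator();
    }
}
```

Note that this sketch buffers the whole partition in a list before closing the connection, which keeps the cleanup simple but may not suit very large partitions; a lazy iterator that closes the connection once exhausted would avoid the buffering.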

Spark Function setup and cleanup

2014-07-24 Thread Yosi Botzer
Hi, I am using the Java API of Spark. I wanted to know if there is a way to run some code in a manner that is like the setup() and cleanup() methods of Hadoop Map/Reduce. The reason I need it is because I want to read something from the DB according to each record I scan in my Function, and I

Re: Spark Function setup and cleanup

2014-07-24 Thread Yanbo Liang
If you want to connect to a DB in your program, you can use JdbcRDD ( https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/JdbcRDD.scala ). 2014-07-24 18:32 GMT+08:00 Yosi Botzer yosi.bot...@gmail.com: Hi, I am using the Java api of Spark. I wanted to know if there

Re: Spark Function setup and cleanup

2014-07-24 Thread Yosi Botzer
In my case I want to reach HBase. For every record with a userId I want to get some extra information about the user and add it to the result record for further processing. On Thu, Jul 24, 2014 at 9:11 AM, Yanbo Liang yanboha...@gmail.com wrote: If you want to connect to DB in program, you can use

Re: Spark Function setup and cleanup

2014-07-24 Thread Yanbo Liang
You can refer to this topic: http://www.mapr.com/developercentral/code/loading-hbase-tables-spark 2014-07-24 22:32 GMT+08:00 Yosi Botzer yosi.bot...@gmail.com: In my case I want to reach HBase. For every record with userId I want to get some extra information about the user and add it to result