Hi Jerry, it's all clear to me now. I will try something like Apache
DBCP for the connection pool.
Thanks a lot for your help!
2014-07-09 3:08 GMT+02:00 Shao, Saisai saisai.s...@intel.com:
Yes, a static class member would be the Java equivalent, but
you should carefully
Hi Tobias, thanks for your help. I understand that with that code we obtain
a database connection per partition, but I also suspect that with that code
a new database connection is created for each execution of the function
passed as the argument to mapPartitions(). That would be very inefficient.
I think you can maintain a connection pool, or keep the connection as a
long-lived object on the executor side (for example, by lazily creating a
singleton inside an object { } in Scala), so that each task reuses this
connection when it runs instead of creating a new one; that would be good for your
Hi Jerry, thanks for your answer. I'm using Spark Streaming for Java, and I
only have rudimentary knowledge of Scala. How could I recreate in Java
the lazy creation of a singleton object that you propose for Scala? Maybe a
static class member in Java for the connection would be the solution?
Yes, a static class member would be the Java equivalent, but you should
program carefully to prevent resource leaks. A good choice is to use a
third-party DB connection library that supports connection pooling, which
will reduce your programming effort.
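
A minimal sketch of the static-member approach, assuming a hypothetical
FakeConnection class in place of a real database connection and a
hypothetical writePartition() method in place of the function passed to
mapPartitions(). None of these names come from Spark or from this thread;
the sketch uses only the JDK so that it runs on its own. It relies on the
initialization-on-demand holder idiom: the nested class is initialized by
the JVM on first access, which gives lazy, thread-safe creation without
explicit locking.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a real database connection.
final class FakeConnection {
    private static final AtomicInteger CREATED = new AtomicInteger();

    FakeConnection() {
        CREATED.incrementAndGet(); // count how many connections get opened
    }

    void insert(String row) {
        // A real implementation would execute an INSERT statement here.
    }

    static int createdCount() {
        return CREATED.get();
    }
}

// Holds one connection per JVM (that is, per executor), created on first use.
final class ConnectionHolder {
    private ConnectionHolder() {}

    // The JVM initializes Lazy only when get() is first called, and class
    // initialization is thread-safe, so no explicit synchronization is needed.
    private static final class Lazy {
        static final FakeConnection INSTANCE = new FakeConnection();
    }

    static FakeConnection get() {
        return Lazy.INSTANCE;
    }
}

final class LazyConnectionDemo {
    // Hypothetical stand-in for the per-partition function: called once per partition.
    static int writePartition(Iterator<String> rows) {
        FakeConnection conn = ConnectionHolder.get(); // reused, not re-created
        int written = 0;
        while (rows.hasNext()) {
            conn.insert(rows.next());
            written++;
        }
        return written;
    }

    public static void main(String[] args) {
        // Three simulated partitions processed in the same JVM.
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"),
                Arrays.asList("d", "e", "f"));
        int total = 0;
        for (List<String> partition : partitions) {
            total += writePartition(partition.iterator());
        }
        System.out.println("rows written: " + total);
        System.out.println("connections created: " + FakeConnection.createdCount());
    }
}
```

Running main processes three partitions but opens only one connection. The
sketch leaves out two things a real job would need: closing the connection
(for instance from a hook registered with Runtime.getRuntime().addShutdownHook),
and safe sharing between tasks that run concurrently in the same executor,
which is the main reason a pooling library is the easier route.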
Thanks
Jerry
From: Juan
Juan,
I am doing something similar, except that instead of inserting into a SQL
database I issue some RPC calls. I think mapPartitions() may be helpful to
you. You could do something like:
dstream.mapPartitions(iter => {
  val db = new DbConnection()
  // maybe only do the above if !iter.isEmpty
  iter.map(item =>