I understand that with foreachPartition I can create one DB connection per
partition. Is there a way to create one DB connection per executor instead
and share it across all partitions/tasks that run within that executor? One
approach I am considering is a singleton with, say, a getConnection method.
The connection object is not created in the driver; instead, the driver
passes only the DB connection details (host, port, user, password, etc.) to
the singleton object. This singleton object is also passed into
foreachPartition. The getConnection method of the singleton creates the
actual connection object only the first time it is called and returns the
same connection instance on all later invocations. I believe that way each
executor JVM will have one instance of the singleton/connection, and thus
all partitions/tasks running within that executor would share the same
connection. I'd like to validate this approach with the Spark experts. Does
it have any inherent flaw, or is there a better way to create one instance
of an object per executor?
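
To make the idea concrete, here is a minimal sketch of the lazy singleton I
have in mind, written in plain Java with no Spark or JDBC dependency. The
class name ExecutorLocal, the method getOrCreate, and the string key are
hypothetical names chosen for illustration, and the factory stands in for
whatever code opens the real connection. Because the map is a static field,
I would expect one copy per JVM, so each executor would build its own
instance on first use instead of receiving one serialized from the driver.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical per-JVM holder: builds a resource on first request and
// returns the same instance to every later caller in this JVM.
final class ExecutorLocal {
    // Static state lives once per JVM (i.e. once per executor process).
    private static final ConcurrentHashMap<String, Object> INSTANCES =
            new ConcurrentHashMap<>();

    private ExecutorLocal() {}

    // key: identifies the resource, e.g. a string built from host and port.
    // factory: invoked at most once per key, even with concurrent callers,
    // because ConcurrentHashMap.computeIfAbsent is atomic.
    @SuppressWarnings("unchecked")
    static <T> T getOrCreate(String key, Supplier<T> factory) {
        return (T) INSTANCES.computeIfAbsent(key, k -> factory.get());
    }

    public static void main(String[] args) {
        // A plain Object stands in for the DB connection (assumption).
        Object first = ExecutorLocal.getOrCreate("db", Object::new);
        Object second = ExecutorLocal.getOrCreate("db", Object::new);
        System.out.println(first == second);
    }
}
```

Inside foreachPartition, each task would call getOrCreate with the
connection details and a factory that opens the connection. One thing I am
unsure about: tasks in the same executor can run concurrently on separate
threads, so a single shared connection would be used by several threads at
once, and a small connection pool held by the singleton may be the safer
variant.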



