I understand that with foreachPartition I can create one DB connection per partition. Is there a way to create one DB connection per executor and share it across all partitions/tasks that run within that executor?

One approach I am considering is a singleton with, say, a getConnection method. The connection object is not created in the driver; instead, the driver passes only the DB connection details (host, port, user, password, etc.) to the singleton object. The singleton object is passed into foreachPartition as well. Its getConnection method creates the actual connection object only the first time it is called and returns the same connection instance on all later invocations. I believe that this way each executor JVM will have one instance of the singleton/connection, so all partitions/tasks running within that executor would share the same connection.

I'd like to validate this approach with the Spark experts. Does it have any inherent flaw, or is there a better way to create one instance of an object per executor?
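To make the idea concrete, here is a minimal Java sketch of the lazily initialized singleton described above. The names ExecutorConnection, Factory, get and jdbc are hypothetical, made up for illustration; they are not part of Spark. The sketch relies on static state being per JVM, so each executor JVM would open its own connection on first use. It also assumes the connection can safely be shared by tasks that run concurrently as threads in the same executor, which may not hold for every JDBC driver.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper (not a Spark API): holds at most one open
// connection per JVM, created lazily on the first call to get().
final class ExecutorConnection {

    // Hypothetical hook that opens the real connection. It is built on the
    // executor from plain strings, so no Connection is ever serialized.
    interface Factory {
        Connection open() throws SQLException;
    }

    // Static state lives once per JVM, i.e. once per executor process.
    private static Connection shared;

    private ExecutorConnection() {}

    // synchronized: tasks in one executor run as threads and may call
    // this concurrently; only the first caller opens a connection.
    static synchronized Connection get(Factory factory) throws SQLException {
        if (shared == null || shared.isClosed()) {
            shared = factory.open();
        }
        return shared;
    }

    // Convenience factory using standard JDBC; url/user/password are the
    // connection details shipped from the driver as ordinary strings.
    static Factory jdbc(String url, String user, String password) {
        return () -> DriverManager.getConnection(url, user, password);
    }
}

// Possible use inside a job (sketch only; rdd, url, user and password
// are assumed to exist in the calling code):
//
// rdd.foreachPartition(rows -> {
//     Connection conn = ExecutorConnection.get(
//         ExecutorConnection.jdbc(url, user, password));
//     // write rows using conn; do not close it here, other tasks reuse it
// });
```

Only the connection details cross from the driver to the executors in this sketch; the Connection itself is created where it is used. Closing the shared connection when the executor shuts down is left out here.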
-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Create-one-DB-connection-per-executor-tp26588.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.