Out of curiosity I wanted to see what JBoss supported in terms of
clustering and database connection pooling since its implementation should
suffice for your use case. I found:
*Note:* JBoss does not recommend using this feature in a production
environment. It requires accessing a connection pool
http://docs.oracle.com/cd/B10500_01/java.920/a96654/connpoca.htm
The question doesn't seem to be Spark specific, btw
On Apr 2, 2015, at 4:45 AM, Sateesh Kavuri sateesh.kav...@gmail.com wrote:
Right, I am aware of how to use connection pooling with Oracle, but the
specific question is how to use it in the context of Spark job execution
On 2 Apr 2015 17:41, Ted Yu yuzhih...@gmail.com wrote:
Hi,
We have a case that we will have to run concurrent jobs (for the same
algorithm) on different data sets. And these jobs can run in parallel and
each one of them would be fetching the data from the database.
We would like to optimize the database connections by making use of
connection pooling.
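To make the borrow/return mechanics of a pool concrete, here is a minimal toy sketch in Scala. It is not production code: the `SimplePool` class and the string "connections" are stand-ins for illustration, and a real deployment would pool `java.sql.Connection` objects via a library such as HikariCP or DBCP.

```scala
import java.util.concurrent.LinkedBlockingQueue

// Toy sketch of pool mechanics: resources are pre-created once and
// borrowed/returned, so repeated use never re-creates them.
class SimplePool[T](size: Int, factory: () => T) {
  private val available = new LinkedBlockingQueue[T]()
  (1 to size).foreach(_ => available.put(factory())) // pre-create resources

  // Borrow a resource, run f, and always return the resource to the pool.
  def withResource[R](f: T => R): R = {
    val res = available.take() // blocks if the pool is exhausted
    try f(res) finally available.put(res)
  }
}

var created = 0
val pool = new SimplePool[String](2, () => { created += 1; s"conn-$created" })
val results = (1 to 5).map(_ => pool.withResource(identity))
println(created)       // 2 -- only two "connections" were ever created
println(results.toSet) // Set(conn-1, conn-2)
```

The point of the sketch is only that five uses touch just two resources; the same property is what saves the open/close round-trips against the database server.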
How long does each executor keep the connection open for? How many
connections does each executor open?
Are you certain that connection pooling is a performant and suitable
solution? Are you running out of resources on the database server and
cannot tolerate each executor holding a single connection?
But this basically means that the pool is confined to the job (of a single
app) in question, and is not shareable across multiple apps?
The setup we have is a job server (the spark-jobserver) that creates jobs.
Currently, we have each job opening and closing a connection to the
database. What we
Each executor runs for about 5 seconds, during which the DB connection can
potentially stay open. Each executor will have one connection open.
Connection pooling certainly has performance advantages, since it avoids
hitting the DB server with an open/close for every operation. The database
in question is not just used by the
Connection pools aren't serializable, so you generally need to set them up
inside a closure. Doing that for every item is wasteful, so you
typically want to use mapPartitions or foreachPartition:
rdd.mapPartitions { part =>
  setupPool
  part.map { ... }
}
See Design Patterns for using foreachRDD in the Spark Streaming Programming
Guide.
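For completeness, a slightly fuller sketch of the per-partition pattern with foreachPartition. This is a sketch, not a complete program: `rdd` is assumed to already exist, and the JDBC URL is a placeholder.

```scala
import java.sql.{Connection, DriverManager}

// Hypothetical sketch: one connection per partition instead of per record.
// The JDBC URL is a placeholder.
rdd.foreachPartition { partition =>
  // This block runs on the executor, so the connection is never serialized.
  val conn: Connection = DriverManager.getConnection("jdbc:...")
  try {
    partition.foreach { record =>
      // use conn to read/write for each record
    }
  } finally {
    conn.close() // or return it to an executor-local pool instead of closing
  }
}
```

To share connections across tasks on the same executor, one common approach is a lazily initialized singleton pool object: each executor runs in its own JVM, so the singleton is created at most once per executor and reused by every task scheduled there.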