[ 
https://issues.apache.org/jira/browse/BEAM-7230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840666#comment-16840666
 ] 

Brachi Packter commented on BEAM-7230:
--------------------------------------

Hi again.

I checked with the new snapshot.

When I configure my own data source, it works great. Thanks!

I also tested it this way (the default implementation), and I very quickly 
get a "too many connections" error:

 
{code:java}
pipeline.apply(JdbcIO.<KV<Integer, String>>read()
    .withDataSourceProviderFn(JdbcIO.PoolableDataSourceProvider.of(
        JdbcIO.DataSourceConfiguration.create(
            "com.mysql.jdbc.Driver", "jdbc:mysql://hostname:3306/mydb",
            "username", "password")))
{code}
 

With the custom data source I process 50k queries with 1000 connections, and 
with the default data source I process 20k queries with 4000 connections (the 
limit).

Do you think it could be related to other configuration I set, such as 
connection timeout, max pool size, and so on?

I didn't check the code, but with the default implementation (the code above), 
do we still create a data source pool for each DoFn?

If so, we should change this too, so that the pool is statically initialized 
once per JVM.
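
To make "statically initialized per JVM" concrete, here is a minimal sketch 
in plain Java. It is not Beam code: FakePool, SharedPools and SharedPoolsDemo 
are hypothetical names, and FakePool only stands in for a real pooled 
javax.sql.DataSource. The idea is that every caller asking for the same JDBC 
URL gets the same pool instance, however many DoFn instances or threads ask.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a pooled javax.sql.DataSource (not a Beam class). */
final class FakePool {
  final String jdbcUrl;

  FakePool(String jdbcUrl) {
    this.jdbcUrl = jdbcUrl;
  }
}

/** Hypothetical per-JVM registry: one pool per JDBC URL, shared by all callers. */
final class SharedPools {
  // Static state exists once per JVM (per class loader), not once per DoFn.
  private static final Map<String, FakePool> POOLS = new ConcurrentHashMap<>();
  // Counts how many pools were actually built, for demonstration only.
  private static final AtomicInteger CREATED = new AtomicInteger();

  private SharedPools() {}

  static FakePool get(String jdbcUrl) {
    // computeIfAbsent applies the factory at most once per key, even when
    // several threads ask for the same URL at the same time.
    return POOLS.computeIfAbsent(jdbcUrl, url -> {
      CREATED.incrementAndGet();
      return new FakePool(url);
    });
  }

  static int createdCount() {
    return CREATED.get();
  }
}

final class SharedPoolsDemo {
  public static void main(String[] args) throws InterruptedException {
    String url = "jdbc:mysql://hostname:3306/mydb";
    // Eight threads play the role of eight DoFn instances in one worker JVM.
    Thread[] workers = new Thread[8];
    for (int i = 0; i < workers.length; i++) {
      workers[i] = new Thread(() -> SharedPools.get(url));
      workers[i].start();
    }
    for (Thread t : workers) {
      t.join();
    }
    System.out.println("pools created: " + SharedPools.createdCount());
  }
}
```

With something like this, the total number of connections would be bounded by 
(max pool size) x (number of worker JVMs) rather than by the number of DoFn 
instances. Whether the default provider already behaves this way is exactly 
the open question above.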

 

> Using JdbcIO creates huge amount of connections
> -----------------------------------------------
>
>                 Key: BEAM-7230
>                 URL: https://issues.apache.org/jira/browse/BEAM-7230
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>    Affects Versions: 2.11.0
>            Reporter: Brachi Packter
>            Assignee: Ismaël Mejía
>            Priority: Major
>
> I want to write from Dataflow to GCP Cloud SQL. I'm using a connection pool, 
> but I still see a huge number of connections in GCP SQL (4k, while I set the 
> connection pool to 300), and most of them are in the sleep state.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
