My mistake - using @Teardown in this way is a good approach. It may not be
executed sometimes, but like Reuven says it means the process died.

Kenn

On Thu, Jan 17, 2019 at 9:31 AM Jean-Baptiste Onofré <[email protected]>
wrote:

> Hi,
>
> I don't think we have connection leak in normal behavior.
>
> The actual SQL statement is executed in @FinishBundle, where the
> connection is closed.
>
> The processElement adds record to process.
>
> Does it mean that an Exception occurs in the batch addition ?
>
> Regards
> JB
>
> On 17/01/2019 12:41, Alexey Romanenko wrote:
> > Kenn,
> >
> > I’m not sure that we have a connection leak in JdbcIO since new
> > connection is being obtained from an instance of /javax.sql.DataSource/
> > (created in @Setup) and which
> > is /org.apache.commons.dbcp2.BasicDataSource/ by default.
> > /BasicDataSource/ uses connection pool and closes all idle connections
> > in "close()”.
> >
> > In its turn, JdbcIO calls/DataSource.close()/ in @Teardown, so all idle
> > connections should be closed and released there in case of fails.
> > Though, potentially some connections, that has been delegated to client
> > before and were not not properly returned to pool, could be leaked…
> > Anyway, I think it could be a good idea to call "/connection.close()/”
> > (return to connection pool) explicitly in case of any exception happened
> > during bundle processing.
> >
> > Probably JB may provide more details as original author of JdbcIO.
> >
> >> On 14 Jan 2019, at 21:37, Kenneth Knowles <[email protected]
> >> <mailto:[email protected]>> wrote:
> >>
> >> Hi Jonathan,
> >>
> >> JdbcIO.write() just invokes this
> >> DoFn:
> https://github.com/apache/beam/blob/master/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java#L765
> >>
> >> It establishes a connection in @StartBundle and then in @FinishBundle
> >> it commits a batch and closes the connection. If an error happens
> >> in @StartBundle or @ProcessElement there will be a retry with a fresh
> >> instance of the DoFn, which will establish a new connection. It looks
> >> like neither @StartBundle nor @ProcessElement closes the connection,
> >> so I'm guessing that the old connection sticks around because the
> >> worker process was not terminated. So the Beam runner and Dataflow
> >> service are working as intended and this is an issue with JdbcIO,
> >> unless I've made a mistake in my reading or analysis.
> >>
> >> Would you mind reporting these details
> >> to https://issues.apache.org/jira/projects/BEAM/ ?
> >>
> >> Kenn
> >>
> >> On Mon, Jan 14, 2019 at 12:51 AM Jonathan Perron
> >> <[email protected] <mailto:[email protected]>>
> wrote:
> >>
> >>     Hello !
> >>
> >>     My question is maybe mainly GCP-oriented, so I apologize if it is
> >>     not fully related to the Beam community.
> >>
> >>     We have a streaming pipeline running on Dataflow which writes data
> >>     to a PostgreSQL instance hosted on Cloud SQL. This database is
> >>     suffering from connection leak spikes on a regular basis:
> >>
> >>     <ofbkcnmdfbgcoooc.png>
> >>
> >>     The connections are kept alive until the pipeline is
> canceled/drained:
> >>
> >>     <gngklddbhnckgpni.png>
> >>
> >>     We are writing to the database with:
> >>
> >>     - individual DoFn where we open/close the connection using the
> >>     standard JDBC try/catch (SQLException ex)/finally statements;
> >>
> >>     - a Pipeline.apply(JdbcIO.<SessionData>write()) operations.
> >>
> >>     I observed that these spikes happens most of the time after I/O
> >>     errors with the database. Has anyone observed the same situation ?
> >>
> >>     I have several questions/observations, please correct me if I am
> >>     wrong (I am not from the java environment, so some can seem pretty
> >>     trivial) :
> >>
> >>     - Does the apply method handles SQLException or I/O errors ?
> >>
> >>     - Would the use of a connection pool prevents such behaviours ? If
> >>     so, how would one implement it to allow all workers to use it ?
> >>     Could it be implemented with JDBC Connection pooling ?
> >>
> >>     I am worrying about the serialization if one would pass a
> >>     Connection item as an argument of a DoFn.
> >>
> >>     Thank you in advance for your comments and reactions.
> >>
> >>     Best regards,
> >>
> >>     Jonathan
> >>
> >
>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>

Reply via email to