+1 looks good

Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://rmannibucau.metawerx.net/> | Old Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
<https://www.packtpub.com/application-development/java-ee-8-high-performance>


Le jeu. 4 oct. 2018 à 11:59, Jonathan Gallimore <
[email protected]> a écrit :

> I have pushed some commits onto the PR here:
> https://github.com/apache/tomee/pull/174, and would appreciate some
> review.
> One the things I found was that if I hold onto a reference to the proxy for
> the transaction, it doesn't get cleaned up, but then losing the reference
> from the scope at the end of the transaction also didn't clean things up.
> What was actually going on was:
>
> ----
> connection = cf.createConnection();
>
> // forget to close it
>
> connection2 = cf.createConnection(); // different proxy, but same
> ManagedConnectionInfo as the TransactionCachingInterceptor provides the
> same connection
> connection2.close(); // this removed the ManagedConnectionInfo from the
> references field on AutoConnectionTracker, so it was no longer tracked for
> cleanup.
> ----
>
> The transaction then completed, and connection dropped from scope, but
> because the ManagedConnectionInfo associated with it wasn't in the
> references list anymore, it never got cleaned up.
>
> To make matters worse, TransactionCachingInterceptor sees that the
> ManagedConnectionInfo has open handles and refuses to send it back to the
> pool. The effect is the connections leaked and never went back to the pool.
>
> So, my latest changes:
>
> * Keep a proxy for as long as the transaction runs
> * Checks the reference list and cleans up at the end
> * Checks the ManagedConnectionInfo object, and if there are open handles,
> destroys the connection.
> * Additionally, if OpenEJB.connector is set to log at FINEST, it should
> reveal where any unclosed connections are originally opened, to provide the
> developer with a hint on how to fix them.
>
> I'll add some tests this afternoon, but any thoughts would be much
> appreciated.
>
> Jon
>
>
> On Fri, Sep 28, 2018 at 3:18 PM Jonathan Gallimore <
> [email protected]> wrote:
>
> > I was thinking the same - do it through the registry. Let me have a go at
> > hacking on that and I'll post back here. Many thanks for the feedback.
> >
> > Jon
> >
> > On Fri, Sep 28, 2018 at 3:15 PM Romain Manni-Bucau <
> [email protected]>
> > wrote:
> >
> >> that is what I had in mind, throught it was already the case to be
> honest
> >> through the transaction registry - excess of enthusiasm probably ;)
> >>
> >> Side note: dropped G for now, if we find something impacting g-txmgr
> we'll
> >> add it back
> >>
> >> Romain Manni-Bucau
> >> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
> >> <https://rmannibucau.metawerx.net/> | Old Blog
> >> <http://rmannibucau.wordpress.com> | Github <
> >> https://github.com/rmannibucau> |
> >> LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
> >> <
> >>
> https://www.packtpub.com/application-development/java-ee-8-high-performance
> >> >
> >>
> >>
> >> Le ven. 28 sept. 2018 à 16:11, Jonathan Gallimore <
> >> [email protected]> a écrit :
> >>
> >> > I apologize, I CC'd Geronimo in case there was anything relevant on
> the
> >> > Geronimo connector side. Happy to drop the CC if its not relevant.
> >> >
> >> > Thanks for the quick response!
> >> >
> >> > Off the top of my head, I wonder if we could keep a reference to the
> >> > connection proxy in the transaction (if there is one), and remove the
> >> > reference when the transaction is complete?
> >> >
> >> > Jon
> >> >
> >> > On Fri, Sep 28, 2018 at 2:57 PM Romain Manni-Bucau <
> >> [email protected]>
> >> > wrote:
> >> >
> >> > > Hi Jon,
> >> > >
> >> > > I experienced that kind of "surprise" in recent jvm since gc is way
> >> more
> >> > > aggressive - cause lambda are way more memory consuming ;). This
> lead
> >> to
> >> > > this kind of issue where the GC happens before the object should
> >> actually
> >> > > be finalizable.
> >> > > I suspect we need to rework the AutoConnectionTracker to take into
> >> > account
> >> > > these new behaviors and actually take into consideration the
> >> lifecycle of
> >> > > the underlying connection.
> >> > >
> >> > > Maybe I got it wrong but I see the ability to disable the proxying
> as
> >> a
> >> > > quickfix/workaround - which is ok - but it means we need to fix the
> >> > source
> >> > > anyway as a long term solution. Am I understanding it right? If so
> we
> >> > need
> >> > > to ensure to keep the reference until the connection is released at
> >> least
> >> > > and ensure it was not closed in the pool (kind of testOnXXX).
> >> > > Also wonder if there is anything related to geronimo since you cc-ed
> >> it.
> >> > > IIRC this logic is only in TomEE, no?
> >> > >
> >> > > side note: reference queue is supposed thread safe yes.
> >> > >
> >> > > Romain Manni-Bucau
> >> > > @rmannibucau <https://twitter.com/rmannibucau> |  Blog
> >> > > <https://rmannibucau.metawerx.net/> | Old Blog
> >> > > <http://rmannibucau.wordpress.com> | Github <
> >> > > https://github.com/rmannibucau> |
> >> > > LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
> >> > > <
> >> > >
> >> >
> >>
> https://www.packtpub.com/application-development/java-ee-8-high-performance
> >> > > >
> >> > >
> >> > >
> >> > > Le ven. 28 sept. 2018 à 15:49, Jonathan Gallimore <
> >> > > [email protected]> a écrit :
> >> > >
> >> > > > Hi Folks,
> >> > > >
> >> > > > I have been looking into some issues with TomEE using Websphere
> MQ,
> >> > > > specifically issues where XA resources are returning a -7 error
> >> during
> >> > a
> >> > > > transaction commit when the system is under load. The -7 error
> >> > indicates
> >> > > > that the resource is closed.
> >> > > >
> >> > > > The result of this error seems to be resources staying in the
> system
> >> > > > somehow associated with the connection, and subsequently, these
> >> can't
> >> > be
> >> > > > enlisted in transactions (because they are closed).
> >> > > >
> >> > > > The stacktrace is like this, and happens over and over again as
> the
> >> > > server
> >> > > > attempts to process more messages from the queue.
> >> > > >
> >> > > > WARN  Transaction- Unable to enlist XAResource
> >> > > >
> >> org.apache.geronimo.transaction.manager.WrapperNamedXAResource@3dd56e90
> >> > ,
> >> > > > errorCode: -7
> >> > > > javax.transaction.xa.XAException: The method 'xa_start' has failed
> >> with
> >> > > > errorCode '-7' due to the resource being closed.
> >> > > >         at
> >> > com.ibm.mq.jmqi.JmqiXAResource.start(JmqiXAResource.java:946)
> >> > > > ~[com.ibm.mq.jmqi.jar:7.5.0.5 - p750-005-150424]
> >> > > >         at
> >> > com.ibm.mq.connector.xa.XARWrapper.start(XARWrapper.java:581)
> >> > > > ~[com.ibm.mq.connector.jar:7.5.0.5-p750-005-150424]
> >> > > >         at
> >> > > >
> >> > >
> >> >
> >>
> org.apache.geronimo.transaction.manager.WrapperNamedXAResource.start(WrapperNamedXAResource.java:111)
> >> > > > ~[geronimo-transaction-3.1.4.jar:3.1.4]
> >> > > >         at
> >> > > >
> >> > >
> >> >
> >>
> org.apache.geronimo.transaction.manager.TransactionImpl.enlistResource(TransactionImpl.java:209)
> >> > > > [geronimo-transaction-3.1.4.jar:3.1.4]
> >> > > >         at
> >> > > >
> >> > >
> >> >
> >>
> org.apache.geronimo.connector.outbound.TransactionEnlistingInterceptor.getConnection(TransactionEnlistingInterceptor.java:60)
> >> > > > [geronimo-connector-3.1.4.jar:3.1.4]
> >> > > >         at
> >> > > >
> >> > >
> >> >
> >>
> org.apache.geronimo.connector.outbound.TransactionCachingInterceptor.getConnection(TransactionCachingInterceptor.java:101)
> >> > > > [geronimo-connector-3.1.4.jar:3.1.4]
> >> > > >         at
> >> > > >
> >> > >
> >> >
> >>
> org.apache.geronimo.connector.outbound.ConnectionHandleInterceptor.getConnection(ConnectionHandleInterceptor.java:43)
> >> > > > [geronimo-connector-3.1.4.jar:3.1.4]
> >> > > >         at
> >> > > >
> >> > >
> >> >
> >>
> org.apache.geronimo.connector.outbound.TCCLInterceptor.getConnection(TCCLInterceptor.java:39)
> >> > > > [geronimo-connector-3.1.4.jar:3.1.4]
> >> > > >         at
> >> > > >
> >> > >
> >> >
> >>
> org.apache.geronimo.connector.outbound.ConnectionTrackingInterceptor.getConnection(ConnectionTrackingInterceptor.java:66)
> >> > > > [geronimo-connector-3.1.4.jar:3.1.4]
> >> > > >         at
> >> > > >
> >> > >
> >> >
> >>
> org.apache.geronimo.connector.outbound.AbstractConnectionManager.allocateConnection(AbstractConnectionManager.java:81)
> >> > > > [geronimo-connector-3.1.4.jar:3.1.4]
> >> > > >         at
> >> > > >
> >> > >
> >> >
> >>
> com.ibm.mq.connector.outbound.ConnectionFactoryImpl.createManagedJMSConnection(ConnectionFactoryImpl.java:194)
> >> > > > [com.ibm.mq.connector.jar:7.5.0.5-p750-005-150424]
> >> > > >         at
> >> > > >
> >> > >
> >> >
> >>
> com.ibm.mq.connector.outbound.ConnectionFactoryImpl.createConnectionInternal(ConnectionFactoryImpl.java:153)
> >> > > > [com.ibm.mq.connector.jar:7.5.0.5-p750-005-150424]
> >> > > >         at
> >> > > >
> >> > >
> >> >
> >>
> com.ibm.mq.connector.outbound.ConnectionFactoryImpl.createConnection(ConnectionFactoryImpl.java:138)
> >> > > > [com.ibm.mq.connector.jar:7.5.0.5-p750-005-150424]
> >> > > >         at
> >> > > >
> >> > >
> >> >
> >>
> com.ibm.mq.connector.outbound.ConnectionFactoryImpl.createConnection(ConnectionFactoryImpl.java:123)
> >> > > > [com.ibm.mq.connector.jar:7.5.0.5-p750-005-150424]
> >> > > >
> >> > > > The issue itself bears some resemblance to these posts:
> >> > > >
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> http://tomee-openejb.979440.n4.nabble.com/Unable-to-enlist-XAResource-error-td4666552.html
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> http://tomee-openejb.979440.n4.nabble.com/Does-tomEE-issue-double-rollbacks-td4666090.html
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> http://tomee-openejb.979440.n4.nabble.com/errorcode-100-when-using-websphere-MQ-with-tomee-td4666519.html
> >> > > >
> >> > > > com.ibm.mq.jmqi.JmqiXAResource has a isClosed field, and this is
> set
> >> > when
> >> > > > close_internal() is called. I tracked this call using AspectJ, and
> >> it
> >> > > > appeared to happen here:
> >> > > >
> >> > > >
> >> com.ibm.mq.jmqi.JmqiXAResource.close_internal(JmqiXAResource.java:296),
> >> > > > com.ibm.mq.jmqi.JmqiXAResource.close(JmqiXAResource.java:423),
> >> > > > com.ibm.mq.jmqi.JmqiXAResource.close(JmqiXAResource.java:410),
> >> > > >
> >> > >
> >> >
> >>
> com.ibm.msg.client.wmq.internal.WMQXASession.close(WMQXASession.java:163),
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> com.ibm.msg.client.jms.internal.JmsSessionImpl.close(JmsSessionImpl.java:375),
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> com.ibm.msg.client.jms.internal.JmsSessionImpl.close(JmsSessionImpl.java:303),
> >> > > > com.ibm.mq.jms.MQSession.close(MQSession.java:298),
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> com.ibm.mq.connector.outbound.ManagedConnectionImpl.destroy(ManagedConnectionImpl.java:298),
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> org.apache.geronimo.connector.outbound.MCFConnectionInterceptor.returnConnection(MCFConnectionInterceptor.java:67),
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> org.apache.geronimo.connector.outbound.XAResourceInsertionInterceptor.returnConnection(XAResourceInsertionInterceptor.java:47),
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> org.apache.geronimo.connector.outbound.AbstractSinglePoolConnectionInterceptor.internalReturn(AbstractSinglePoolConnectionInterceptor.java:148),
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> org.apache.geronimo.connector.outbound.AbstractSinglePoolConnectionInterceptor.returnConnection(AbstractSinglePoolConnectionInterceptor.java:129),
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> org.apache.geronimo.connector.outbound.TransactionEnlistingInterceptor.returnConnection(TransactionEnlistingInterceptor.java:118),
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> org.apache.geronimo.connector.outbound.TransactionCachingInterceptor.returnConnection(TransactionCachingInterceptor.java:119),
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> org.apache.geronimo.connector.outbound.ConnectionHandleInterceptor.returnConnection(ConnectionHandleInterceptor.java:71),
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> org.apache.geronimo.connector.outbound.TCCLInterceptor.returnConnection(TCCLInterceptor.java:50),
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> org.apache.geronimo.connector.outbound.ConnectionTrackingInterceptor.returnConnection(ConnectionTrackingInterceptor.java:91),
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> org.apache.openejb.resource.AutoConnectionTracker.setEnvironment(AutoConnectionTracker.java:62),
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> org.apache.geronimo.connector.outbound.ConnectionTrackingInterceptor.getConnection(ConnectionTrackingInterceptor.java:65),
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> org.apache.geronimo.connector.outbound.AbstractConnectionManager.allocateConnection(AbstractConnectionManager.java:81),
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> com.ibm.mq.connector.outbound.ConnectionFactoryImpl.createManagedJMSConnection(ConnectionFactoryImpl.java:194),
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> com.ibm.mq.connector.outbound.ConnectionFactoryImpl.createConnectionInternal(ConnectionFactoryImpl.java:153),
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> com.ibm.mq.connector.outbound.ConnectionFactoryImpl.createConnection(ConnectionFactoryImpl.java:138),
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> com.ibm.mq.connector.outbound.ConnectionFactoryImpl.createConnection(ConnectionFactoryImpl.java:123),
> >> > > >
> >> > > > The interesting part here is the
> >> > > > org.apache.openejb.resource.AutoConnectionTracker.setEnvironment()
> >> > call,
> >> > > > which is destroying connections where the proxy for the connection
> >> no
> >> > > > longer has any references. I.e. if you had some code that did
> this:
> >> > > >
> >> > > > ----
> >> > > > Connection conn = connectionFactory.createConnection(); // conn is
> >> > > > actually a proxy to the connection
> >> > > > // do work
> >> > > >
> >> > > > conn = null; // or conn is out of scope
> >> > > > ----
> >> > > >
> >> > > > note that conn.close() is not called. The reference count to the
> >> proxy
> >> > is
> >> > > > now zero, so it counts as abandoned, and the next call to
> >> > > > createConnection() destroys the associated connection. However, if
> >> all
> >> > > this
> >> > > > happens in a transacted method, the JmqiXAResource is still
> enlisted
> >> > with
> >> > > > the transaction. So when the transaction goes to commit, it fails,
> >> > > because
> >> > > > the JmqiXAResource is closed. The resource is still associated
> with
> >> a
> >> > > > ManagedConnectionInfo object through ManagedConnectionInfo.xares
> >> > > > (WrapperNamedXAResource). xaResource (XARWrapper - part of the
> >> > > RAR).theXAR
> >> > > > (JmqiXAResource - part of the RAR). The ManagedConnectionInfo
> object
> >> > > > somehow comes back to the pool after the rollback, and gets used
> >> again,
> >> > > > with the TransactionImpl.enlistResource() call failing as above.
> >> > > >
> >> > > > I suspect that we're seeing this under load as GC will be taking
> >> place
> >> > > > more often and calls to createConnection() will be happening more
> >> > > > frequently.
> >> > > >
> >> > > > The connection proxies themselves don't do much - only calls to
> >> > > finalize()
> >> > > > and clone() are caught. I propose this patch to allow the proxies
> >> to be
> >> > > > disabled (the current behaviour is still the default):
> >> > > > https://github.com/apache/tomee/pull/174 and welcome feedback.
> >> > > >
> >> > > > Running this specific case with proxies disabled has yielded
> >> positive
> >> > > > results. I'll try and extend this and reproduce it in a test case
> as
> >> > > well.
> >> > > > I'm wondering if AutoConnectionTracker's use of ReferenceQueue is
> >> > thread
> >> > > > safe. Do anyone have any thoughts on this?
> >> > > >
> >> > > > Cheers
> >> > > >
> >> > > > Jon
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> >
>

Reply via email to