On Thursday, 11 July 2019 11:00:33 BST Jarosław Fedewicz wrote:
> So far, I tried to minimize a test case, but it seems like it's really
> picky about what environment it's running in. One of those cases where "it
> works on my machine", I suppose. The versions are as follows:
>
> cryptography==2.7
> pyOpenSSL==19.0.0
> asn1crypto==0.24.0
> pyasn1==0.4.5
> pyasn1-modules==0.2.5
> Twisted==19.2.1
>
> The target machine is running Xenial, so openssl 1.0.0g.

That's old... Can you go to 1.0.2s?
I recall that pyOpenSSL may need a newer openssl, though I might be wrong
on this.

> My local machine runs Fedora 30, thus openssl 1.1.1c.
>
> Is there a neat way to list all pyOpenSSL objects in a running Twisted
> program? Or maybe TCPConnection objects, since those might hook to the
> zope.interface machinery?

You can use the gc module to help with this sort of debugging.

    import gc

    gc.collect()
    for obj in gc.get_objects():
        pass  # do something interesting with obj

You could count the number of objects of each type and look for the types
whose counts increase over time.

Barry

>
> On Thu, Jul 11, 2019 at 9:20 AM Glyph <gl...@twistedmatrix.com> wrote:
> > Hi Jarosław!
> >
> > On Jul 1, 2019, at 4:48 PM, Jarosław Fedewicz
> > <jaroslaw.fedew...@gmail.com>
> > wrote:
> >
> > I have written a simple service which takes data from network, massages it
> > until it's useful enough, and sends the results out periodically via HTTP
> > to an API.
> >
> > A reasonable start :-).
> >
> > It all works for a while, then I get an error like this approximately 40
> > minutes into the service's uptime:
> >
> > ResponseNeverReceived: [<twisted.python.failure.Failure
> > OpenSSL.SSL.ZeroReturnError: >]
> >
> > Then a couple more like this:
> >
> > ResponseNeverReceived: [<twisted.python.failure.Failure
> > twisted.internet.error.ConnectionLost: Connection to the other side was
> > lost in a non-clean fashion: Connection lost.>]
> >
> > Then it ends with
> >
> > TimeoutError: User timeout caused connection failure.
> >
> > Then every request results in the same TimeoutError. I don't know if using
> > HTTPS is important in this case.
> >
> > I'm pretty sure the presence of an OpenSSL.SSL error indeed means that
> > HTTPS is important.
> >
> > Restarting the whole service, of course, makes the problem go away for a
> > while.
> > The other side is the Slack API, so I rather assume it's not very much to
> > blame, it can be demonstrated to work rather reliably, all its criticisms
> > notwithstanding.
> >
> > It does seem likely that the clustering of errors you're seeing is a
> > local problem with Twisted.
> >
> > I cannot yet tell if this bug is a function of uptime, or the number of
> > requests made.
> >
> > My personal guess is that it has something to do with the number of the
> > TCP connections; or, specifically, the number of pyOpenSSL 'Connection'
> > objects.
> >
> > I have tried to work around the problem by discarding the agent object,
> > and using an HTTPConnectionPool with persistent=False, but it didn't help
> > at all. I think it made the problem worse because the framework seems to
> > refer to some objects the Agent creates, and the process becomes a CPU
> > hog in a couple of hours (with the TimeoutErrors still happening all the
> > time).
> >
> > I have a slight suspicion that the thing that is leaking between
> > connections here is the pyOpenSSL "Context" object. We recently
> > implemented an optimization which shares the Context object among multiple
> > Connection objects that reference the same host. What version of Twisted
> > are you using, and what version of OpenSSL, pyOpenSSL, and Cryptography?
> >
> > I'm curious whether reversing that optimization would make any
> > difference to your use case.
> >
> > The closest I've got on the internets which describes a similar problem,
> > apart from people complaining on StackOverflow about precisely this
> > happening when they are using Scrapy, is this blog post from almost a
> > decade ago:
> > http://www.chris-wong.net/twisted-web-framework-user-timeout-caused-connection-failure/ .
> >
> > This definitely seems like a bug, if it's occurring in multiple places.
> >
> > There could be a small chance I'm holding it wrong(tm), but maybe there
> > exists a ticket, just worded differently, which could help me get to the
> > bottom of it.
> >
> > I don't think that any open tickets describe your precise issue. So
> > please do open one. And if possible, can you minimize a proof of concept?
> > Some example code would go a long way to helping to isolate this.
> >
> > -glyph
> > _______________________________________________
> > Twisted-Python mailing list
> > Twisted-Python@twistedmatrix.com
> > https://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
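
For what it's worth, the gc suggestion earlier in this message (count the objects of each type and watch which counts grow) could be fleshed out roughly as below. This is only a minimal sketch, assuming Python 3 and the standard gc and collections modules; the helper names count_objects_by_type and growth are made up for illustration and are not part of Twisted or pyOpenSSL.

```python
import gc
from collections import Counter


def count_objects_by_type():
    """Return a Counter mapping 'module.TypeName' to the number of live,
    gc-tracked objects of that type (hypothetical helper)."""
    gc.collect()  # clear unreachable garbage first so only live objects count
    counts = Counter()
    for obj in gc.get_objects():
        t = type(obj)
        counts["%s.%s" % (getattr(t, "__module__", "?"), t.__name__)] += 1
    return counts


def growth(before, after, top=10):
    """Return up to `top` (type name, increase) pairs, largest increase first,
    for types that have more instances in `after` than in `before`."""
    delta = Counter({name: after[name] - before[name] for name in after})
    return [(name, n) for name, n in delta.most_common(top) if n > 0]
```

Taking one snapshot shortly after startup and another once the service has run for a while, then passing both to growth(), should show which types accumulate; names such as OpenSSL.SSL.Connection or OpenSSL.SSL.Context that keep climbing would be the ones to look at. In a Twisted service the snapshots could presumably be scheduled with twisted.internet.task.LoopingCall.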