Hi Jarosław!

> On Jul 1, 2019, at 4:48 PM, Jarosław Fedewicz <jaroslaw.fedew...@gmail.com> 
> wrote:
> 
> I have written a simple service which takes data from network, massages it 
> until it's useful enough, and sends the results out periodically via HTTP to 
> an API.

A reasonable start :-).

> It all works for a while, then I get an error like this approximately 40 
> minutes into the service's uptime:
> 
> ResponseNeverReceived: [<twisted.python.failure.Failure 
> OpenSSL.SSL.ZeroReturnError: >]
> 
> Then a couple more like this:
> 
> ResponseNeverReceived: [<twisted.python.failure.Failure 
> twisted.internet.error.ConnectionLost: Connection to the other side was lost 
> in a non-clean fashion: Connection lost.>]
> 
> Then it ends with
> 
> TimeoutError: User timeout caused connection failure.
> 
> Then every request results in the same TimeoutError. I don't know if using 
> HTTPS is important in this case.

I'm pretty sure the presence of an OpenSSL.SSL error indeed means that HTTPS is 
important.

> Restarting the whole service, of course, makes the problem go away for a while. 
> The other side is the Slack API, so I rather assume it's not very much to 
> blame, it can be demonstrated to work rather reliably, all its criticisms 
> notwithstanding.

It does seem likely that the clustering of errors you're seeing is a local 
problem within Twisted.

> I cannot yet tell if this bug is a function of uptime, or the number of 
> requests made.

My personal guess is that it has something to do with the number of TCP 
connections; or, specifically, the number of pyOpenSSL 'Connection' objects.

> I have tried to work around the problem by discarding the agent object, and 
> using an HTTPConnectionPool with persistent=False, but it didn't help at all. 
> I think it made the problem worse because the framework seems to refer to 
> some objects the Agent creates, and the process becomes a CPU hog in a 
> couple hours (with the TimeoutErrors still happening all the time).

I have a slight suspicion that the thing that is leaking between connections 
here is the pyOpenSSL "Context" object.  We recently implemented an 
optimization which shares the Context object among multiple Connection objects 
that reference the same host.  What version of Twisted are you using, and what 
version of OpenSSL, pyOpenSSL, and Cryptography?
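
If it helps, something along these lines should collect all of those in one go. 
This is only a sketch: it assumes Python 3.8 or later (for importlib.metadata), 
and report_versions is a name made up for illustration, not anything Twisted 
provides.

```python
from importlib import metadata


def report_versions(packages=("Twisted", "pyOpenSSL", "cryptography")):
    """Return a dict mapping each package name to its installed version."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"

    # The OpenSSL library that pyOpenSSL is linked against can differ from
    # the one the standard-library ssl module uses, so ask pyOpenSSL itself.
    try:
        from OpenSSL import SSL
        linked = SSL.SSLeay_version(SSL.SSLEAY_VERSION)
        if isinstance(linked, bytes):
            linked = linked.decode("ascii", "replace")
        versions["OpenSSL (linked)"] = linked
    except (ImportError, AttributeError):
        versions["OpenSSL (linked)"] = "unavailable"
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print("%s: %s" % (name, version))
```

Pasting that output into the ticket would save a round trip.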

I'm curious whether reversing that optimization would make any difference to 
your use-case.

> The closest I've got on the internets which describes a similar problem, 
> apart from people complaining on StackOverflow about precisely this happening 
> when they are using Scrapy, is this blog post from almost a decade ago: 
> http://www.chris-wong.net/twisted-web-framework-user-timeout-caused-connection-failure/

This definitely seems like a bug, if it's occurring in multiple places.

> There could be a small chance I'm holding it wrong(tm), but maybe there 
> exists a ticket, just worded differently, which could help me get to the 
> bottom of it.

I don't think that any open tickets describe your precise issue.  So please do 
open one.  And if possible, can you put together a minimal proof of concept?  
Some example code would go a long way toward helping isolate this.

-glyph