Re: [Twisted-Python] Need some enlightenment on using web client properly, or maybe nudge a bug to get fixed

2019-07-11 Thread Scott, Barry
On Thursday, 11 July 2019 11:00:33 BST Jarosław Fedewicz wrote:
> So far, I tried to minimize a test case, but it seems like it's really
> picky about what environment it's running in. One of those cases where "it
> works on my machine", I suppose. The versions are as follows:
> 
> cryptography==2.7
> pyOpenSSL==19.0.0
> asn1crypto==0.24.0
> pyasn1==0.4.5
> pyasn1-modules==0.2.5
> Twisted==19.2.1
> 
> The target machine is running Xenial, so openssl 1.0.0g.

That's old... Can you go to 1.0.2s?
I recall that pyOpenSSL may need a newer OpenSSL, though I might be wrong on this.

> My local machine runs Fedora 30, thus openssl 1.1.1c.
> 
> Is there a neat way to list all pyOpenSSL objects in a running Twisted
> program? Or maybe TCPConnection objects, since those might hook to the
> zope.interface machinery?

You can use the gc to help with this sort of debugging.

import gc

gc.collect()  # free unreachable objects first so only live ones are listed
for obj in gc.get_objects():
    pass  # do something interesting with obj

You could count the number of each type of obj and look for which ones 
increase over time.
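
A minimal sketch of that counting idea, using only the standard library. The names count_types, growth and Leaky are made up for illustration; Leaky merely stands in for whatever is suspected of leaking (a pyOpenSSL Connection, say).

```python
import gc
from collections import Counter


def count_types():
    """Return a Counter mapping type name to the number of gc-tracked instances."""
    gc.collect()  # collect garbage first so only live objects are counted
    return Counter(
        f"{type(obj).__module__}.{type(obj).__qualname__}"
        for obj in gc.get_objects()
    )


def growth(before, after, limit=10):
    """Return up to `limit` (type name, increase) pairs, largest increase first."""
    delta = after - before  # Counter subtraction keeps only positive differences
    return delta.most_common(limit)


class Leaky:
    """Hypothetical stand-in for the object suspected of leaking."""


before = count_types()
leaked = [Leaky() for _ in range(100)]  # simulate a leak between two snapshots
after = count_types()
for name, increase in growth(before, after):
    print(f"{name}: +{increase}")
```

In a long-running service the two snapshots would be taken some minutes apart, for instance from a twisted.internet.task.LoopingCall, instead of back to back as here.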

Barry



> 
> On Thu, Jul 11, 2019 at 9:20 AM Glyph wrote:
> > Hi Jarosław!
> > 
> > On Jul 1, 2019, at 4:48 PM, Jarosław Fedewicz wrote:
> > 
> > I have written a simple service which takes data from network, massages it
> > until it's useful enough, and sends the results out periodically via HTTP
> > to an API.
> > 
> > 
> > A reasonable start :-).
> > 
> > It all works for a while, then I get an error like this approximately 40
> > minutes into the service's uptime:
> > 
> > ResponseNeverReceived: [<twisted.python.failure.Failure OpenSSL.SSL.ZeroReturnError: >]
> > 
> > 
> > Then a couple more like this:
> > 
> > ResponseNeverReceived: [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
> > 
> > 
> > Then it ends with
> > 
> > TimeoutError: User timeout caused connection failure.
> > 
> > 
> > Then every request results in the same TimeoutError. I don't know if using
> > HTTPS is important in this case.
> > 
> > 
> > I'm pretty sure the presence of an OpenSSL.SSL error indeed means that
> > HTTPS is important.
> > 
> > Restarting the whole service, of course, makes the problem go away for a while.
> > The other side is the Slack API, so I rather assume it's not very much to
> > blame, it can be demonstrated to work rather reliably, all its criticisms
> > notwithstanding.
> > 
> > 
> > It does seem likely that the clustering of errors you're seeing is a
> > local problem with Twisted.
> > 
> > I cannot yet tell if this bug is a function of uptime, or the number of
> > requests made.
> > 
> > 
> > My personal guess is that it has something to do with the number of the
> > TCP connections; or, specifically, the number of pyOpenSSL 'Connection'
> > objects.
> > 
> > I have tried to work around the problem by discarding the agent object,
> > and using an HTTPConnectionPool with persistent=False, but it didn't help
> > at all. I think it made the problem worse because the framework seems to
> > refer to some objects the Agent creates, and the process becomes a CPU
> > hog in a couple of hours (with the TimeoutErrors still happening all the time).
> > 
> > 
> > I have a slight suspicion that the thing that is leaking between
> > connections here is the pyOpenSSL "Context" object.  We recently
> > implemented an optimization which shares the Context object among multiple
> > Connection objects that reference the same host.  What version of Twisted
> > are you using, and what version of OpenSSL, pyOpenSSL, and Cryptography?
> > 
> > I'm curious whether reversing that optimization would make any
> > difference to your use-case.
> > 
> > The closest I've got on the internets which describes a similar problem,
> > apart from people complaining on StackOverflow about precisely this
> > happening when they are using Scrapy, is this blog post from almost a
> > decade ago:
> > http://www.chris-wong.net/twisted-web-framework-user-timeout-caused-connection-failure/ .
> > 
> > 
> > This definitely seems like a bug, if it's occurring in multiple places.
> > 
> > There could be a small chance I'm holding it wrong(tm), but maybe there
> > exists a ticket, just worded differently, which could help me get to the
> > bottom of it.
> > 
> > 
> > I don't think that any open tickets describe your precise issue.  So
> > please do open one.  And if possible, can you minimize a proof of concept?
> > Some example code would go a long way to helping to isolate this.
> > 
> > -glyph
> > ___
> > Twisted-Python mailing list
> > Twisted-Python@twistedmatrix.com
> > https://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python






Re: [Twisted-Python] Need some enlightenment on using web client properly, or maybe nudge a bug to get fixed

2019-07-11 Thread Maarten ter Huurne
On Thursday, 11 July 2019 12:00:33 CEST Jarosław Fedewicz wrote:

> Is there a neat way to list all pyOpenSSL objects in a running Twisted
> program? Or maybe TCPConnection objects, since those might hook to
> the zope.interface machinery?

Not specific to Twisted, but you can get a list of all objects tracked 
by the garbage collector using "gc.get_objects()" and then filter that 
by class.
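
A rough sketch of that filtering, standard library only. The function name live_instances and the Connection class below are placeholders for illustration, not anything from Twisted or pyOpenSSL.

```python
import gc


def live_instances(cls):
    """Return every gc-tracked object whose type is `cls` or a subclass of it."""
    gc.collect()  # drop unreachable objects so only live ones are reported
    # type(obj) is used instead of isinstance() so that proxy objects cannot
    # interfere by faking or failing on __class__.
    return [obj for obj in gc.get_objects() if issubclass(type(obj), cls)]


class Connection:
    """Hypothetical stand-in for the class being hunted."""


held = [Connection() for _ in range(3)]
found = live_instances(Connection)
print(len(found))
for obj in found[:1]:
    # gc.get_referrers() shows which containers still point at the object
    print([type(r).__name__ for r in gc.get_referrers(obj)])
```

With pyOpenSSL installed, the same function could presumably be called as live_instances(OpenSSL.SSL.Connection) from inside the running service; the referrers listing then hints at what is keeping each connection alive.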

Bye,
Maarten





Re: [Twisted-Python] Need some enlightenment on using web client properly, or maybe nudge a bug to get fixed

2019-07-11 Thread Jarosław Fedewicz
So far, I tried to minimize a test case, but it seems like it's really
picky about what environment it's running in. One of those cases where "it
works on my machine", I suppose. The versions are as follows:

cryptography==2.7
pyOpenSSL==19.0.0
asn1crypto==0.24.0
pyasn1==0.4.5
pyasn1-modules==0.2.5
Twisted==19.2.1

The target machine is running Xenial, so openssl 1.0.0g.

My local machine runs Fedora 30, thus openssl 1.1.1c.

Is there a neat way to list all pyOpenSSL objects in a running Twisted
program? Or maybe TCPConnection objects, since those might hook to the
zope.interface machinery?

On Thu, Jul 11, 2019 at 9:20 AM Glyph  wrote:

> Hi Jarosław!
>
> On Jul 1, 2019, at 4:48 PM, Jarosław Fedewicz 
> wrote:
>
> I have written a simple service which takes data from network, massages it
> until it's useful enough, and sends the results out periodically via HTTP
> to an API.
>
>
> A reasonable start :-).
>
> It all works for a while, then I get an error like this approximately 40
> minutes into the service's uptime:
>
> ResponseNeverReceived: [<twisted.python.failure.Failure OpenSSL.SSL.ZeroReturnError: >]
>
>
> Then a couple more like this:
>
> ResponseNeverReceived: [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
>
>
> Then it ends with
>
> TimeoutError: User timeout caused connection failure.
>
>
> Then every request results in the same TimeoutError. I don't know if using
> HTTPS is important in this case.
>
>
> I'm pretty sure the presence of an OpenSSL.SSL error indeed means that
> HTTPS is important.
>
> Restarting the whole service, of course, makes the problem go away for a while.
> The other side is the Slack API, so I rather assume it's not very much to
> blame, it can be demonstrated to work rather reliably, all its criticisms
> notwithstanding.
>
>
> It does seem likely that the clustering of errors you're seeing is a
> local problem with Twisted.
>
> I cannot yet tell if this bug is a function of uptime, or the number of
> requests made.
>
>
> My personal guess is that it has something to do with the number of the
> TCP connections; or, specifically, the number of pyOpenSSL 'Connection'
> objects.
>
> I have tried to work around the problem by discarding the agent object,
> and using an HTTPConnectionPool with persistent=False, but it didn't help
> at all. I think it made the problem worse because the framework seems to
> refer to some objects the Agent creates, and the process becomes a CPU hog
> in a couple of hours (with the TimeoutErrors still happening all the time).
>
>
> I have a slight suspicion that the thing that is leaking between
> connections here is the pyOpenSSL "Context" object.  We recently
> implemented an optimization which shares the Context object among multiple
> Connection objects that reference the same host.  What version of Twisted
> are you using, and what version of OpenSSL, pyOpenSSL, and Cryptography?
>
> I'm curious whether reversing that optimization would make any
> difference to your use-case.
>
> The closest I've got on the internets which describes a similar problem,
> apart from people complaining on StackOverflow about precisely this
> happening when they are using Scrapy, is this blog post from almost a
> decade ago:
> http://www.chris-wong.net/twisted-web-framework-user-timeout-caused-connection-failure/ .
>
>
> This definitely seems like a bug, if it's occurring in multiple places.
>
> There could be a small chance I'm holding it wrong(tm), but maybe there
> exists a ticket, just worded differently, which could help me get to the
> bottom of it.
>
>
> I don't think that any open tickets describe your precise issue.  So
> please do open one.  And if possible, can you minimize a proof of concept?
> Some example code would go a long way to helping to isolate this.
>
> -glyph
>


-- 
Yaroslav Fedevych
IT Philosopher


Re: [Twisted-Python] Need some enlightenment on using web client properly, or maybe nudge a bug to get fixed

2019-07-11 Thread Glyph
Hi Jarosław!

> On Jul 1, 2019, at 4:48 PM, Jarosław Fedewicz  
> wrote:
> 
> I have written a simple service which takes data from network, massages it 
> until it's useful enough, and sends the results out periodically via HTTP to 
> an API.

A reasonable start :-).

> It all works for a while, then I get an error like this approximately 40 
> minutes into the service's uptime:
> 
> ResponseNeverReceived: [<twisted.python.failure.Failure OpenSSL.SSL.ZeroReturnError: >]
> 
> Then a couple more like this:
> 
> ResponseNeverReceived: [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
> 
> Then it ends with
> 
> TimeoutError: User timeout caused connection failure.
> 
> Then every request results in the same TimeoutError. I don't know if using 
> HTTPS is important in this case.

I'm pretty sure the presence of an OpenSSL.SSL error indeed means that HTTPS is 
important.

> Restarting the whole service, of course, makes the problem go away for a while. 
> The other side is the Slack API, so I rather assume it's not very much to 
> blame, it can be demonstrated to work rather reliably, all its criticisms 
> notwithstanding.

It does seem likely that the clustering of errors you're seeing is a local 
problem with Twisted.

> I cannot yet tell if this bug is a function of uptime, or the number of 
> requests made.

My personal guess is that it has something to do with the number of the TCP 
connections; or, specifically, the number of pyOpenSSL 'Connection' objects.

> I have tried to work around the problem by discarding the agent object, and 
> using an HTTPConnectionPool with persistent=False, but it didn't help at all. 
> I think it made the problem worse because the framework seems to refer to 
> some objects the Agent creates, and the process becomes a CPU hog in a 
> couple of hours (with the TimeoutErrors still happening all the time).

I have a slight suspicion that the thing that is leaking between connections 
here is the pyOpenSSL "Context" object.  We recently implemented an 
optimization which shares the Context object among multiple Connection objects 
that reference the same host.  What version of Twisted are you using, and what 
version of OpenSSL, pyOpenSSL, and Cryptography?

I'm curious whether reversing that optimization would make any difference to 
your use-case.

> The closest I've got on the internets which describes a similar problem,
> apart from people complaining on StackOverflow about precisely this happening
> when they are using Scrapy, is this blog post from almost a decade ago:
> http://www.chris-wong.net/twisted-web-framework-user-timeout-caused-connection-failure/ .

This definitely seems like a bug, if it's occurring in multiple places.

> There could be a small chance I'm holding it wrong(tm), but maybe there 
> exists a ticket, just worded differently, which could help me get to the 
> bottom of it.

I don't think that any open tickets describe your precise issue.  So please do 
open one.  And if possible, can you minimize a proof of concept?  Some example 
code would go a long way to helping to isolate this.

-glyph


[Twisted-Python] Need some enlightenment on using web client properly, or maybe nudge a bug to get fixed

2019-07-01 Thread Jarosław Fedewicz
I have written a simple service which takes data from network, massages it
until it's useful enough, and sends the results out periodically via HTTP
to an API.

It all works for a while, then I get an error like this approximately 40
minutes into the service's uptime:

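ResponseNeverReceived: [<twisted.python.failure.Failure OpenSSL.SSL.ZeroReturnError: >]


Then a couple more like this:

ResponseNeverReceived: [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]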


Then it ends with

TimeoutError: User timeout caused connection failure.


Then every request results in the same TimeoutError. I don't know if using
HTTPS is important in this case.

Restarting the whole service, of course, makes the problem go away for a while.
The other side is the Slack API, so I rather assume it's not very much to
blame, it can be demonstrated to work rather reliably, all its criticisms
notwithstanding.

I cannot yet tell if this bug is a function of uptime, or the number of
requests made.

I have tried to work around the problem by discarding the agent object, and
using an HTTPConnectionPool with persistent=False, but it didn't help at
all. I think it made the problem worse because the framework seems to refer
to some objects the Agent creates, and the process becomes a CPU hog in a
couple of hours (with the TimeoutErrors still happening all the time).

The closest I've got on the internets which describes a similar problem,
apart from people complaining on StackOverflow about precisely this
happening when they are using Scrapy, is this blog post from almost a
decade ago:
http://www.chris-wong.net/twisted-web-framework-user-timeout-caused-connection-failure/ .

There could be a small chance I'm holding it wrong(tm), but maybe there
exists a ticket, just worded differently, which could help me get to the
bottom of it.

-- 
Yaroslav Fedevych
IT Philosopher