Re: [Twisted-Python] is it possible to change the isolation level of a psycopg2 connection under enterprise.adbapi ?

2014-11-13 Thread exarkun

On 04:56 pm, twisted-pyt...@2xlp.com wrote:
> there doesn't seem to be a way to access the connection objects within
> the pool ( psycopg2 manages this via
> `connection.set_isolation_level(X)` )


Basically.  There is a trick to work around this: invent your own DB-API 
2.0 wrapper around psycopg2 that is a pass-through, except that it makes 
this `connection.set_isolation_level(X)` call on each new connection 
before it gives the connection back.
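
A minimal sketch of that pass-through wrapper, assuming the wrapped module's connections expose `set_isolation_level` the way psycopg2's do. The class name `IsolationLevelDBAPI` and its constructor arguments are made up for illustration. Since `adbapi.ConnectionPool` looks up its DB-API module by name, in practice something like this would probably have to live in its own importable module.

```python
class IsolationLevelDBAPI(object):
    """Pass-through DB-API 2.0 wrapper (hypothetical name).

    Everything is delegated to the wrapped module except connect(), which
    sets the isolation level on each new connection before returning it.
    """

    def __init__(self, dbapi, isolation_level):
        self._dbapi = dbapi
        self._isolation_level = isolation_level

    def __getattr__(self, name):
        # Reached only for names not defined on the wrapper itself, so
        # apilevel, threadsafety, exception classes, etc. come from the
        # real module.
        return getattr(self._dbapi, name)

    def connect(self, *args, **kwargs):
        connection = self._dbapi.connect(*args, **kwargs)
        # Assumption: a psycopg2-style method on the connection object.
        connection.set_isolation_level(self._isolation_level)
        return connection
```

With psycopg2 this might be built as `IsolationLevelDBAPI(psycopg2, psycopg2.extensions.ISOLATION_LEVEL_SERIALIZABLE)`.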


You might want to look at twextpy's adbapi2. It provides an interface 
slightly more amenable to customizations like this one. Off the top of 
my head, I don't know if it supports psycopg2 (but I know it supports 
postgresql somehow).


Jean-Paul
> the only workaround I can think of seems to be emitting raw sql when I
> first start the transaction - but this doesn't seem right.
>
> am i missing anything?
___
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python




Re: [Twisted-Python] Sending longer messages in AMP

2014-11-13 Thread exarkun

On 02:57 pm, ga...@gromper.net wrote:

> Hi,
>
> We're using AMP and are starting to hit TooLong errors when scaling
> our application. In one respect it's a sign that we should do
> something like paging large requests and responses, but that's a lot
> more work, and comes with its own problems. We also don't need
> particularly large payloads: right now, a limit of ~500kiB would allow
> us to scale as far as we need and beyond.
>
> I've put together a fork of Twisted's AMP implementation that uses
> 32-bit length prefixes everywhere, though it limits the maximum
> message size to 2MiB. Every other aspect of it is the same so it's a
> drop-in replacement, as long as both ends of a connection use it.
> However, there's no negotiation phase so it's completely incompatible
> on the wire. The overhead of a few extra bytes is negligible for our
> use cases, where the networks are all assumed to be low-latency
> high-bandwidth LANs.
>
> Are there any reasons that we shouldn't be doing this? Was there a
> good reason for 16-bit length prefixes that still holds? Should we be
> doing something else?


The short length limit is in place to encourage two things:

 * messages that can be processed in a cooperative-multitasking-friendly 
   way

 * an AMP channel that can reliably be used to multiplex multiple 
   operations

The limit encourages the former by limiting the total amount of data 
it's possible to receive in a single command.  Of course, you can still 
do ridiculously complicated work based on a small bit of data so this 
doesn't guarantee that no matter what you do you'll be safe.  But doing 
even something simple on a ridiculously large amount of data is probably 
guaranteed to take a while.


The limit encourages the latter by putting a limit on the data that 
needs to be transferred to complete any one command (or answer).  Again, 
this isn't a guarantee of safety (you could always have a `for i in 
range(10**10): callRemote(...)` loop and clog up the channel for ages) but 
it pushes things a bit more in that direction.


At ClusterHQ we *also* maintained a fork of AMP with this limit raised. 
Basically, it worked.  It did let us get into the kind of trouble that 
the limit was supposed to try to avoid (in particular it let us send 
around messages that would take longer and longer to be processed - in a 
system where keeping latency down was actually sort of important; 
fortunately we had *worse* problems introducing latency so this in 
particular never bit us too hard ;).

> If I assume that the answers are all no, would someone find this
> protocol useful if we submitted it for inclusion in Twisted itself?


There are better solutions to the problem.  The trouble is that they're 
also more work to implement. ;)  I think Twisted should hold out for the 
better solutions though, not adopt a 
like-AMP-but-with-different-hard-coded-limits solution.


What are the better solutions?  Library support for paging, basically. 
Or, to consider things more generally, library support for streaming. 
The AMP implementation in Twisted (note, not the *protocol*) should be 
extended to make it easy to pass arbitrarily large streams of data 
around - suitably broken into smaller pieces at the box level.


As of right now, the way I'd do that is by introducing a new argument 
type (or two) supporting `IProducer` and `IConsumer`.  Pass in an 
`IProducer` and the library will take the necessary steps to read data 
out of it, chunk it up into <=16kB chunks, and re-assemble them on the 
receiving side (as another `IProducer`).
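
A rough sketch of just the chunk-and-reassemble step, leaving out `IProducer`, flow control and the AMP argument type entirely. The function names are hypothetical, and the chunk size simply mirrors the "<=16kB" figure above; any size that fits under AMP's 16-bit length prefix would do.

```python
CHUNK_SIZE = 16 * 1024  # assumption: mirrors the "<=16kB" figure above


def split_into_chunks(payload, chunk_size=CHUNK_SIZE):
    """Split a bytes payload into pieces of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [payload[i:i + chunk_size]
            for i in range(0, len(payload), chunk_size)]


def reassemble(chunks):
    """Join chunks, in order, back into the original payload."""
    return b"".join(chunks)


payload = b"x" * (500 * 1024)  # roughly the ~500kiB mentioned in this thread
chunks = split_into_chunks(payload)
assert all(len(chunk) <= CHUNK_SIZE for chunk in chunks)
assert reassemble(chunks) == payload
```

Each chunk would travel in its own box, so other commands can be interleaved between them on the same connection.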


There are two reasons I'm not working on this right now (apart from the 
standard reasons of not having time to do so ;):


 1) IProducer / IConsumer aren't amenable to this kind of decoupling. 
You can register a producer with a consumer but you can't register a 
consumer with a producer.  By the time you give the IProducer to AMP, 
it's too late to tell it you want it to send its data into the AMP 
implementation for the necessary handling.  We worked around this in 
twisted.web.client.Agent by introducing a new IProducer-like interface. 
It solves the basic problem but it doesn't go any further to improve the 
usability of the interfaces.


 2) Tubes.  Glyph is working on a replacement for IProducer/IConsumer 
that does go a lot further to improve usability.  With this promise of a 
bright, prosperous future looming, it's hard to get excited about 
implementing for AMP a just-barely-good-enough solution like the one 
used by Agent (in particular, with the knowledge that the tubes solution 
will be API incompatible and we'll most likely want to deprecate the 
IProducer/IConsumer thing).


Jean-Paul

> The code right now is a straight copy of amp.py and test_amp.py with
> changes to 32-bit length prefixes everywhere, but for upstreaming we'd
> probably propose instead to modify the original to have an optional
> negotiation phase, and to make the maximum message size a parameter.
>
> Thanks!
>
> Gavin.


Re: [Twisted-Python] How do I debug this network problem?

2014-11-13 Thread Peter Westlake


>> I've put in the dataReceived, and the answer box does *not* make it
>> into the Protocol. There are two entries in
>> protocol._outstandingRequests: {'2189': , '2188':
>> } and the log output shows 2186, 2187, 218a, 218b, ...
>
> So, wait a second.
>
> You said you're looking at tcpdump, and it's showing you that your
> outstanding requests - in this case, 2188 and 2189 - are in fact being
> answered. But then you say you're looking at the dump from
> dataReceived, and seeing that not only are 2188 and 2189 not being
> answered from that layer, but 218a and 218b *are* being answered?
>
> Simply put: that should be impossible. TCP is an ordered stream. If
> you received answers to 2188 and 2189 on the wire in tcpdump, you
> should see those come back to dataReceived first. What kind of
> transport is this hooked up to? A regular socket? Is there TLS
> involved? Did you run tcpdump for that same session?

No TLS, just TCP, created with
twisted.application.internet.TCPClient(host, port, protocolfactory).

I didn't record this session with tcpdump, but from a previous one I can
say that yes, some Deferreds are left hanging around waiting for a
response while subsequent ones have already received one. There was no
interruption or irregularity in the TCP stream.

tcpdump: 2186, 2187, 2188, 2189, 218a, 218b

dataReceived: 2186, 2187, 218a, 218b

_outstandingRequests: {2188, 2189}
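
The comparison above amounts to a set difference between the tags seen on the wire and the tags seen by dataReceived. A small sketch, with the tag lists copied from this message and a hypothetical helper name:

```python
# Tags seen in each place, copied from the lists above.
on_the_wire = ["2186", "2187", "2188", "2189", "218a", "218b"]
data_received = ["2186", "2187", "218a", "218b"]


def missing_tags(wire, received):
    """Return the tags present on the wire but never seen by dataReceived,
    preserving wire order."""
    seen = set(received)
    return [tag for tag in wire if tag not in seen]


print(missing_tags(on_the_wire, data_received))  # prints ['2188', '2189']
```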

So as you say, TCP must have delivered the data to someone, or at least
believes it has. How much code is there between there and dataReceived? I
imagine most of it is in the kernel?

Peter.


[Twisted-Python] is it possible to change the isolation level of a psycopg2 connection under enterprise.adbapi ?

2014-11-13 Thread Jonathan Vanasco
there doesn't seem to be a way to access the connection objects within the pool 
( psycopg2 manages this via `connection.set_isolation_level(X)` )

the only workaround I can think of seems to be emitting raw sql when I first 
start the transaction - but this doesn't seem right.
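
For reference, the raw-SQL workaround might look roughly like the sketch below. The helper name `run_with_isolation` is hypothetical; the only assumptions are that the transaction object has an `execute()` method (as the one adbapi passes to a `runInteraction` callback does) and that the backend is PostgreSQL, where `SET TRANSACTION` has to be the first statement of the transaction.

```python
# Isolation levels accepted by PostgreSQL's SET TRANSACTION statement.
ISOLATION_LEVELS = frozenset([
    "READ UNCOMMITTED", "READ COMMITTED", "REPEATABLE READ", "SERIALIZABLE",
])


def run_with_isolation(txn, level, work, *args, **kwargs):
    """Set the isolation level, then run work(txn, ...) in the same
    transaction.

    txn is anything with an execute() method.
    """
    # Whitelist the level so it is safe to splice into the statement.
    if level not in ISOLATION_LEVELS:
        raise ValueError("unknown isolation level: %r" % (level,))
    # Must run before any other statement in the transaction.
    txn.execute("SET TRANSACTION ISOLATION LEVEL " + level)
    return work(txn, *args, **kwargs)
```

With a pool this would be called as something like `pool.runInteraction(run_with_isolation, "SERIALIZABLE", my_work)`, where `my_work` is whatever function does the actual queries.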

am i missing anything?


Re: [Twisted-Python] How do I debug this network problem?

2014-11-13 Thread Glyph
> On Nov 13, 2014, at 3:40 PM, Peter Westlake  wrote:
> 
> I'm certainly not averse to understanding the code, and if you had time to 
> describe it, that would be very kind, thank you!

Well if it's not even hitting dataReceived then a more subtle exploration is 
not necessary! ;-)

> I've put in the dataReceived, and the answer box does not make it into the 
> Protocol. There are two entries in protocol._outstandingRequests: {'2189': 
> , '2188': } and the log output shows 2186, 2187, 
> 218a, 218b, ...

So, wait a second.

You said you're looking at tcpdump, and it's showing you that your outstanding 
requests - in this case, 2188 and 2189 - are in fact being answered. But then 
you say you're looking at the dump from dataReceived, and seeing that not only 
are 2188 and 2189 not being answered from that layer, but 218a and 218b are 
being answered?

Simply put: that should be impossible.  TCP is an ordered stream.  If you 
received answers to 2188 and 2189 on the wire in tcpdump, you should see those 
come back to dataReceived first.  What kind of transport is this hooked up to?  
A regular socket?  Is there TLS involved?  Did you run tcpdump for that same 
session?

-glyph



[Twisted-Python] Sending longer messages in AMP

2014-11-13 Thread Gavin Panella
Hi,

We're using AMP and are starting to hit TooLong errors when scaling
our application. In one respect it's a sign that we should do
something like paging large requests and responses, but that's a lot
more work, and comes with its own problems. We also don't need
particularly large payloads: right now, a limit of ~500kiB would allow
us to scale as far as we need and beyond.

I've put together a fork of Twisted's AMP implementation that uses
32-bit length prefixes everywhere, though it limits the maximum
message size to 2MiB. Every other aspect of it is the same so it's a
drop-in replacement, as long as both ends of a connection use it.
However, there's no negotiation phase so it's completely incompatible
on the wire. The overhead of a few extra bytes is negligible for our
use cases, where the networks are all assumed to be low-latency
high-bandwidth LANs.
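
To illustrate the difference in general terms (this is neither Twisted's encoding code nor the fork's, just a sketch with a made-up `encode_value` helper), a 16-bit length prefix cannot describe more than 65535 bytes, while a 32-bit one can:

```python
import struct


def encode_value(value, prefix_format="!H"):
    """Prefix value (bytes) with its length, packed as prefix_format.

    "!H" is an unsigned 16-bit big-endian length, "!I" an unsigned 32-bit one.
    """
    return struct.pack(prefix_format, len(value)) + value


small = b"a" * 65535  # largest length a 16-bit prefix can describe
large = b"a" * 65536  # one byte too many for 16 bits

encode_value(small)  # fits
try:
    encode_value(large)  # the length does not fit in 16 bits
except struct.error:
    print("too long for a 16-bit prefix")
encode_value(large, "!I")  # fits with a 32-bit prefix
```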

Are there any reasons that we shouldn't be doing this? Was there a
good reason for 16-bit length prefixes that still holds? Should we be
doing something else?

If I assume that the answers are all no, would someone find this
protocol useful if we submitted it for inclusion in Twisted itself?
The code right now is a straight copy of amp.py and test_amp.py with
changes to 32-bit length prefixes everywhere, but for upstreaming we'd
probably propose instead to modify the original to have an optional
negotiation phase, and to make the maximum message size a parameter.

Thanks!

Gavin.



Re: [Twisted-Python] How do I debug this network problem?

2014-11-13 Thread Peter Westlake


On Thu, 13 Nov 2014, at 13:17, Glyph wrote:
>
>> On Nov 13, 2014, at 1:15 PM, Peter Westlake
>>  wrote:
>>
>> I've had a look at the code, and got rather lost amongst the
>> interfaces and inheritance and protocols and transports. If someone
>> can help me narrow down the relevant bits of code, I can put in some
>> Python tracing.
>>
>
> I could describe all the interfaces and inheritance and protocols and
> transports, but since you don't want to puzzle out all that code,
> presumably such a description would be overly complex :).
>
> A good place to start would be to figure out if the data is getting to
> Twisted at all, which means instrumenting your Protocol.
>
> If you've done the default thing, and just done class Something(AMP):,
> this means you should override dataReceived, like so:
>
>     from __future__ import print_function
>     from twisted.protocols.amp import AMP
>
>     class MyAMP(AMP, object):
>         def dataReceived(self, data):
>             print("Got some data", repr(data))
>             return super(MyAMP, self).dataReceived(data)
>
> If you're not seeing anything, that will give you an idea of whether
> your kernel is not actually delivering that data to Twisted.
>
> There are, of course, a plethora of other things that could be going
> wrong - maybe your Twisted program is stuck in some blocking function
> elsewhere and the reactor loop isn't running at all, maybe you're
> using some policy wrapper which is buffering incorrectly... but that's
> a good sanity check to start with.
>

I'm certainly not averse to understanding the code, and if you had time
to describe it, that would be very kind, thank you!

I've put in the dataReceived, and the answer box does *not* make it into
the Protocol. There are two entries in protocol._outstandingRequests:
{'2189': , '2188': } and the log output shows
2186, 2187, 218a, 218b, ...

Peter.


Re: [Twisted-Python] How do I debug this network problem?

2014-11-13 Thread Glyph

> On Nov 13, 2014, at 1:15 PM, Peter Westlake  wrote:
> 
> I've had a look at the code, and got rather lost amongst the interfaces
> and inheritance and protocols and transports. If someone can help me
> narrow down the relevant bits of code, I can put in some Python tracing.
> 

I could describe all the interfaces and inheritance and protocols and 
transports, but since you don't want to puzzle out all that code, presumably 
such a description would be overly complex :).

A good place to start would be to figure out if the data is getting to Twisted 
at all, which means instrumenting your Protocol.

If you've done the default thing, and just done class Something(AMP):, this 
means you should override dataReceived, like so:

    from __future__ import print_function
    from twisted.protocols.amp import AMP

    class MyAMP(AMP, object):
        def dataReceived(self, data):
            # Log every chunk of bytes the transport hands to the protocol.
            print("Got some data", repr(data))
            return super(MyAMP, self).dataReceived(data)

If you're not seeing anything, that will give you an idea of whether your 
kernel is not actually delivering that data to Twisted.

There are, of course, a plethora of other things that could be going wrong - 
maybe your Twisted program is stuck in some blocking function elsewhere and the 
reactor loop isn't running at all, maybe you're using some policy wrapper which 
is buffering incorrectly... but that's a good sanity check to start with.

-glyph


[Twisted-Python] How do I debug this network problem?

2014-11-13 Thread Peter Westlake
TL;DR - how do I debug the sequence of events between an AMP answer box
arriving at a NIC, and AMP firing the callRemote Deferred?

I have an application with two processes, on separate machines,
communicating using AMP. One process does a callRemote, which returns a
Deferred, which is never fired. I know from tcpdump that the AMP answer
box arrives safely at the network interface card.

This isn't something which can easily be reproduced. Instead, I want to
ask the specific question: how do I debug the data path from the NIC to
AMP firing its Deferred?

I've had a look at the code, and got rather lost amongst the interfaces
and inheritance and protocols and transports. If someone can help me
narrow down the relevant bits of code, I can put in some Python tracing.

FWIW, this is happening on Debian Squeeze and Wheezy, on VMs hosted on
Xen 6.5. It only happens on some specific machines, and only sometimes.
The same code has run flawlessly for many years elsewhere, though this
same bug did happen there too some years ago. That time, it went away
after most of the software in the system was upgraded. I tried that this
time - Debian Squeeze to Wheezy, with associated kernel, Python and
Twisted versions - but the problem persists. Anyway, I don't want to
make the problem go away without understanding it, for fear that it will
come back a third time.

Peter.
