I bisected the bug to the frontal: the attached script capture the live TCP connections from the frontal to the application server upon status 500 errors and I can confirm that the request body payload is missing from these sessions so, the frontal did not send the bytes to the application server.
Now, it could be that I do not receive the missing bytes on my frontal or the frontal is dropping them. I can't reuse the same script because my frontal's traffic is encrypted. Is there a way to make uwsgi dump somewhere the decryption keys for an ssl session so I could decrypt the associated traffic on the fly and dump the traffic for later inspection when I see a 500 ? Mathieu Le mar. 12 mars 2019 à 11:14, mathieu lacage <mathieu.lac...@alcmeon.com> a écrit : > Hi, > > My production servers are deployed as 4 uwsgi instances: > > front1 -> admin3+admin4 > front2 -> admin3+admin4 > > front1/2 are load balanced at the DNS level and perform ssl termination > with uwsgi and forward requests to admin3/4 by using an http subscription > server. admin3/4 run my python application within a simple uwsgi master+5 > slaves. > > Before you ask: on all 4 servers, > $ uwsgi --version > 2.0.15-debian > > Recently, I became aware of a low-probability bug triggered by multiple > customers across a wide range of access networks that happens when reading > the body of my incoming POST HTTP requests at the application level. > > Here is a typical backtrace: > > Mar 09 08:32:10 admin3 uwsgi[8246]: admin-5 ERROR root timeout > during read(2) on wsgi.input > Traceback (most recent call last): > File > "/usr/local/lib/python2.7/dist-packages/webapp2.py", line 1535, in __call__ > rv = > self.handle_exception(request, response, e) > File > "/usr/local/lib/python2.7/dist-packages/webapp2.py", line 1529, in __call__ > rv = self.router.dispatch(request, > response) > File > "/usr/local/lib/python2.7/dist-packages/webapp2.py", line 1278, in > default_dispatcher > return > route.handler_adapter(request, response) > File > "/usr/local/lib/python2.7/dist-packages/webapp2.py", line 1102, in __call__ > return handler.dispatch() > File "admin2/BaseHandler.py", line > 99, in dispatch > self._do_dispatch(user) > File "admin2/BaseHandler.py", line > 327, in _do_dispatch > > self._begin_transaction(company_id, user) > File "admin2/BaseRequestHandler.py", > line 210, in _begin_transaction > self._data = self.request.body > File > "/usr/lib/python2.7/dist-packages/webob/request.py", line 702, in body > self.make_body_seekable() # we > need this to have content_length > File > "/usr/lib/python2.7/dist-packages/webob/request.py", line 930, in > make_body_seekable > self.copy_body() > File > "/usr/lib/python2.7/dist-packages/webob/request.py", line 954, in copy_body > data = input.read(min(todo, 65535)) > File > "/usr/lib/python2.7/dist-packages/webob/request.py", line 1540, in readinto > data = self.file.read(sz0) > IOError: timeout during read(2) on > wsgi.input > Mar 09 08:32:10 admin3 admin[3875]: [uwsgi-body-read] Timeout reading 2 > bytes. Content-Length: 2 consumed: 0 left: 2 > > Ok, I get this: my application is trying to read these damn 2 bytes but > they are not in the wsgi input file so, uwsgi triggers the 4s (default > value) socket timeout. > > Now, the issue here is that I'd like to figure out if these damn bytes > were actually received on the application server and if not, where they got > lost so I can blame someone else. > > Has anyone any idea on what I could do to debug/trace this further on my > production infrastructure ? (clearly, logging all incoming traffic with > tcpdump is not an option) > > Mathieu > -- > Mathieu Lacage <mathieu.lac...@alcmeon.com> > -- Mathieu Lacage <mathieu.lac...@alcmeon.com>
import argparse import os import pcapy import pypacker.ppcap import pypacker.layer12.ethernet import pypacker.layer3.ip import pypacker.layer4.tcp class Session: def __init__(self): self._packets = [] self._keep = False def append(self, p): self._packets.append(p) @property def length(self): return len(self._packets) def keep(self): self._keep = True def close(self, directory, k): if self._keep: filename = os.path.join(directory, '%s.pcap' % k) print('keep %s' % filename) with pypacker.ppcap.Writer(filename=filename, linktype=pypacker.ppcap.DLT_EN10MB) as pwriter: for p in self._packets: pwriter.write(p) def main(): parser = argparse.ArgumentParser() parser.add_argument('-i', '--interface', required=True) parser.add_argument('-f', '--filter', default=None) parser.add_argument('-o', '--output-directory', default=os.getcwd()) args = parser.parse_args() sessions = {} reader = pcapy.open_live(args.interface, 65536, 0, 1000) if args.filter is not None: reader.setfilter(args.filter) def receive(header, packet): eth = pypacker.layer12.ethernet.Ethernet(packet) ip = eth[pypacker.layer3.ip.IP] tcp = eth[pypacker.layer4.tcp.TCP] a = (ip.src_s, tcp.sport) b = (ip.dst_s, tcp.dport) k = '-'.join(sorted([ip.src_s, ip.dst_s, str(tcp.sport), str(tcp.dport)])) session = sessions.get(k) if session is None: session = Session() sessions[k] = session session.append(packet) if tcp.body_bytes.startswith(b'HTTP/1.1 500'): session.keep() if tcp.flags & 0x0001: # FIN #print('close %s' % k) #print(session.length) session.close(args.output_directory, k) del sessions[k] reader.loop(0, receive) main()
_______________________________________________ uWSGI mailing list uWSGI@lists.unbit.it http://lists.unbit.it/cgi-bin/mailman/listinfo/uwsgi