On Tue, Sep 25, 2012 at 6:38 AM, Jimb Esser <[email protected]> wrote:
> Yes, but, as I understand, this is logging during processing a single stream
> of data, in a single call to http_parser_execute(), specifically inside of a
> for loop looping over exactly the data I logged.  That should be internally
> consistent, at least, right?  If the stream says it has an 8192 byte chunk
> coming, and there's more than that many bytes of data, 8192 bytes later
> there should be another chunk size in the stream (which, there is not, which
> of course is why it's throwing the HPE_INVALID_CHUNK_SIZE error as it reads
> some non-hex characters at the offset where it's expecting the next chunk size).
>
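For reference, the chunked framing at issue can be sketched like this — a minimal illustration in plain JavaScript, not node's actual parser (which is the C http_parser), showing why non-hex bytes where a size line is expected produce exactly the HPE_INVALID_CHUNK_SIZE failure described above:

```javascript
// Minimal sketch of HTTP/1.1 chunked transfer-encoding framing.
// Each chunk is: <hex size>\r\n<size bytes of data>\r\n, terminated
// by a zero-size chunk (0\r\n\r\n). A strict parser rejects non-hex
// bytes at a chunk-size position instead of passing them through.
function parseChunked(buf) {
  const out = [];
  let pos = 0;
  while (true) {
    const lineEnd = buf.indexOf('\r\n', pos);
    if (lineEnd === -1) throw new Error('truncated chunk-size line');
    const sizeLine = buf.slice(pos, lineEnd);
    if (!/^[0-9a-fA-F]+$/.test(sizeLine)) {
      throw new Error('HPE_INVALID_CHUNK_SIZE at offset ' + pos);
    }
    const size = parseInt(sizeLine, 16);
    pos = lineEnd + 2;
    if (size === 0) break;           // terminating zero-size chunk
    out.push(buf.slice(pos, pos + size));
    pos += size + 2;                 // skip the data plus trailing CRLF
  }
  return out.join('');
}
```

If the stream replays earlier body bytes where the next size line should be, the hex check fails immediately — the situation in the dump below.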
> Anyway, I managed to capture it as it went bad with tcpdump (I added -U -B
> 1000000, not sure if that helped or the timing was just different).  Short
> result is: it looks like the data was bad as it came in over TCP!  I'll go
> into detail and post the dumps below, but this quickly leads me to ask
> (after apologizing about thinking this was a node bug): why does this
> generally fail on 0.8 and never on 0.6?
>
> My guess is that on 0.6 an HTTP parse error simply disconnected the socket,
> and all of our AWS APIs simply retry any requests which get a socket
> disconnect (as those happen rather frequently with AWS).  We also
> automatically retry on a bunch of 5xx http response codes, so if any of
> those were being generated, we'd retry on those as well.  I'll add some
> additional logging and run this on node 0.6 again to see if I can verify
> this.
>
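The retry policy described here could be sketched roughly like this — hypothetical names throughout, since the actual AWS client code isn't shown in the thread:

```javascript
// Hypothetical sketch of the retry behaviour described above: retry a
// request when it fails with a socket-level error (as a hard parser
// disconnect would look) or comes back with a 5xx status. `doRequest`
// and the error shapes are assumptions, not the real client.
function requestWithRetry(doRequest, maxAttempts, cb) {
  let attempt = 0;
  function tryOnce() {
    attempt++;
    doRequest(function (err, statusCode, body) {
      const socketError = err && ['ECONNRESET', 'EPIPE'].indexOf(err.code) !== -1;
      const serverError = !err && statusCode >= 500;
      if ((socketError || serverError) && attempt < maxAttempts) {
        return tryOnce();            // a real client would back off here
      }
      cb(err, statusCode, body);
    });
  }
  tryOnce();
}
```

Under that policy, a 0.6-era parse error that surfaced as a plain socket disconnect would be retried silently, which would explain never seeing the failure there.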
> Back to the details of the symptom at hand (in case I misinterpreted some of
> this data, it might be good to get another set of eyes on it).  In this
> case, it didn't actually get an http parser error; it simply handed back
> what looks like bad data (with the HTTP chunk size lines inserted in the
> middle of it), perhaps because the socket got closed or something (though
> if it got closed prematurely, shouldn't that have generated a parsing
> error, since the parser was expecting more bytes? I'm not sure on this,
> exactly).  I've uploaded the parsed/printed TCP dump of the end of the
> relevant stream (starting from before the last good chunk size line) [1], as
> well as the tail of the data that was returned to my app, starting from the
> same point [2].
>
> As far as I can tell, the stream indicated there was an 8kb chunk coming,
> 8kb later it indicates another 8kb chunk is coming, but then 1296 bytes
> later there's another chunk size line ("\r\n2000\r\n").  At the packet
> level, the final packet (starting at seq 230681) appears to be simply
> repeated data from 1304 bytes back in the stream, but then terminated
> unexpectedly (never gets to the 8kb of data the earlier chunk size indicated
> was coming).  No other packets show up in this stream in my dump.
>
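The arithmetic above is internally consistent: "2000" read as hex is 8192, and the stray size line itself is 8 bytes, which is exactly the gap between the 1296 and 1304 offsets. A quick check (my framing of the numbers, not taken from the dump):

```javascript
// Consistency check of the offsets described above.
// "2000" is a hex chunk size; the stray line "\r\n2000\r\n" is 8 bytes.
const chunkSize = parseInt('2000', 16);
console.log(chunkSize);                 // 8192 -- the 8kb chunks the stream announced

const strayLine = '\r\n2000\r\n';
// 1296 bytes of repeated data plus the 8-byte framing line is 1304,
// matching "repeated data from 1304 bytes back in the stream".
console.log(1296 + strayLine.length);   // 1304
```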
> I cannot think of anything on our side (either my code or node's)
> that could cause this, so my conclusion is that something is going wrong
> within Amazon's Dynamo DB (or routing between there and our servers, though
> I've tried both inside and outside of their datacenters with the same
> results).  In retrospect, every bad packet I've gotten has been on DynamoDB
> queries, despite us slamming a couple other Amazon APIs with similar levels
> of traffic during this test, so that also points the finger in that
> direction.  I think I'll try to bring this up on the DynamoDB forums, and
> just make our DDB client more robust when it gets low-level parsing errors
> (although I'm a little afraid of what could happen if it happens to
> accidentally parse correctly because of some of our data happening to match
> up!).
>
> Thank you very much for your help with this.
>
> [1]  http://thesilentb.com/files/tcpdump.txt
> [2]  http://thesilentb.com/files/packet.txt

Glad you got it solved - or the next best thing, know what the culprit is. :-)

The reason you're seeing it with 0.8 and not 0.6 is probably that
http_parser.c became considerably stricter. Error handling at the
JS level hasn't changed significantly.
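To illustrate the kind of strictness involved — this is illustrative only, not the actual http_parser.c change between 0.6 and 0.8 — a strict chunk-size reader accepts hex digits (plus an optional ";extension" per the chunked syntax) and fails fast on anything else:

```javascript
// Illustrative strict chunk-size line reader: hex digits, optionally
// followed by a ";chunk-extension". Anything else is rejected outright
// rather than being tolerated or treated as body data.
function readChunkSize(line) {
  const m = /^([0-9a-fA-F]+)(?:;.*)?$/.exec(line);
  if (!m) throw new Error('HPE_INVALID_CHUNK_SIZE');
  return parseInt(m[1], 16);
}
```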

-- 
Job Board: http://jobs.nodejs.org/
Posting guidelines: 
https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines
You received this message because you are subscribed to the Google
Groups "nodejs" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/nodejs?hl=en
