Re: [nodejs] Coping with invalid http headers

2013-01-09 Thread Marcel Laverdet
"First packet" may not always be the first packet, given Connection: keep-alive. Basically what I'm saying is that there are no guarantees that this will continue working, or work consistently. You seem to have a good understanding of the risks involved, but for me I'd sooner use child_process.spaw

Re: [nodejs] Coping with invalid http headers

2013-01-08 Thread Nathan White
Varnish is great at normalizing http headers. You could make this telco site a "backend" on varnish and continue on parsing in node. sub vcl_fetch { set beresp.http.Content-Length = regsub(beresp.http.Content-Length, "^([0-9]+).*$", "\1"); } On Jan 8, 2013, at 6:34 PM, Matt wrote: > On Tue, Jan 8, 2013 at

Re: [nodejs] Coping with invalid http headers

2013-01-08 Thread Matt
On Tue, Jan 8, 2013 at 5:24 PM, Marcel Laverdet wrote: > By heavy load I'm talking about network traffic, either on your end, their > end, or any hop in between. "In the first packet" is certainly *not* > something I'd recommend anyone to depend on, as that depends on a whole lot > of things. >

Re: [nodejs] Coping with invalid http headers

2013-01-08 Thread Marcel Laverdet
By heavy load I'm talking about network traffic, either on your end, their end, or any hop in between. "In the first packet" is certainly *not* something I'd recommend anyone to depend on, as that depends on a whole lot of things. The monkey patching is gross, but hey it works. The only thing here
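
One way to avoid depending on the headers arriving in a single packet is to buffer chunks until the blank line that ends the header block has been seen, and only then repair the header. The following is a minimal sketch of that buffering logic alone. `createHeaderFixer` is a hypothetical helper name, not part of node's API, and how it would be hooked into node's socket handling is left open. It assumes one response per connection and makes no attempt to handle keep-alive.

```javascript
// Buffers incoming chunks until the full header block ("\r\n\r\n") has arrived,
// then repairs a duplicated Content-Length value such as "1234,1234".
// createHeaderFixer is a hypothetical helper, not part of node's API.
function createHeaderFixer(onData) {
  var pending = [];
  var headersDone = false;
  return function (chunk) {
    if (headersDone) return onData(chunk); // body bytes pass through untouched
    pending.push(chunk);
    var buffered = Buffer.concat(pending);
    // latin1 maps each byte to one character, so string offsets equal byte offsets.
    var text = buffered.toString('latin1');
    var end = text.indexOf('\r\n\r\n');
    if (end === -1) return; // headers still incomplete; wait for more data
    headersDone = true;
    // Keep only the digits before the first comma, matching any capitalization.
    var head = text.slice(0, end).replace(
      /^(content-length:[ \t]*\d+)[ \t]*,.*$/im, '$1'
    );
    onData(Buffer.concat([Buffer.from(head, 'latin1'), buffered.slice(end)]));
  };
}

// Demo: the broken header is deliberately split across two chunks.
var out = [];
var feed = createHeaderFixer(function (b) { out.push(b); });
feed(Buffer.from('HTTP/1.1 200 OK\r\nContent-Le'));
feed(Buffer.from('ngth: 5,5\r\n\r\nhel'));
feed(Buffer.from('lo'));
console.log(JSON.stringify(Buffer.concat(out).toString()));
```

Nothing is forwarded until the header block is complete, so the downstream consumer never sees a half-repaired header.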

Re: [nodejs] Coping with invalid http headers

2013-01-08 Thread Matt
Exactly, it's designed for this one service which always sends the Content-Length capitalized like this, and screwed up with the comma, and in the first packet. If there are other screwy things in the future we can deal with them then. Believe me I know all about parsing headers (I had to write the p

Re: [nodejs] Coping with invalid http headers

2013-01-08 Thread Marcel Laverdet
omg I can't believe you've done this. Obviously this won't work if the server doesn't send "Content-Length" capitalized like you have here, but if you're only designing against one service that's not a huge issue. You should be aware though that this may fail in certain rare circumstances, or unde

Re: [nodejs] Coping with invalid http headers

2013-01-08 Thread Matt
Rather than go into patching anything, I managed to get this to work: r.on('request', function (req) { req.on('socket', function () { var oldOnData = req.socket.ondata; var first_packet = true; req.socket.ondata = function (d, start, end) {

Re: [nodejs] Coping with invalid http headers

2013-01-08 Thread Marcel Laverdet
Apply this patch: https://gist.github.com/4487528 Node shouldn't be barfing on anything a browser can display and should really be more tolerant of these failures. I should submit a PR, but I'm not sure whether this will cause other issues down the road. On Tue, Jan 8, 2013 at 12:42 PM, Matt wrote: > W

Re: [nodejs] Coping with invalid http headers

2013-01-08 Thread Tim Caswell
I mean, use the manual client as a fallback. It's only as good/hard as you need it to be. You could simply look for the first instance of "\r\n\r\n" and assume everything after that is the body. If you needed the headers, just split on "\r\n" and then split on ":" and you'll get most of it. Dep
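
The crude approach described above might look roughly like this. It is a sketch only: `parseRawResponse` is a hypothetical helper name, it works on a string rather than a Buffer (so binary bodies would need different handling), and it ignores chunked encoding, folded headers, and repeated header names.

```javascript
// Crude HTTP response parser: a sketch, not a spec-compliant implementation.
// parseRawResponse is a hypothetical helper name, not part of node's API.
function parseRawResponse(raw) {
  // Everything before the first blank line is the status line plus headers.
  var split = raw.indexOf('\r\n\r\n');
  if (split === -1) {
    throw new Error('header terminator not found');
  }
  var lines = raw.slice(0, split).split('\r\n');
  var statusLine = lines.shift();
  var headers = {};
  lines.forEach(function (line) {
    // Split on the first colon only, since values (dates, URLs) may contain colons.
    var colon = line.indexOf(':');
    if (colon === -1) return; // skip lines that are not "Name: value"
    var name = line.slice(0, colon).trim().toLowerCase();
    headers[name] = line.slice(colon + 1).trim();
  });
  // Everything after the blank line is assumed to be the body.
  return { statusLine: statusLine, headers: headers, body: raw.slice(split + 4) };
}

var res = parseRawResponse(
  'HTTP/1.1 200 OK\r\nContent-Length: 5,5\r\nContent-Type: text/plain\r\n\r\nhello'
);
console.log(res.statusLine, res.headers, res.body);
```

Because the header value is kept as a plain string, a malformed value such as "5,5" is simply returned as-is and the caller decides what to do with it.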

Re: [nodejs] Coping with invalid http headers

2013-01-08 Thread Matt
On Tue, Jan 8, 2013 at 2:26 PM, Tim Caswell wrote: > You can use the TCP client directly and hand-roll the http request. Your > response won't be parsed as http (nor would you want to in the error case), > but you can write a crude parser in js to get the bulk of it. > Yeah that occurred to me

Re: [nodejs] Coping with invalid http headers

2013-01-08 Thread Matt
On Tue, Jan 8, 2013 at 2:22 PM, Ryan Schmidt wrote: > I'll bet you already have, but sending a bug report to whoever's serving > that invalid content so that they can fix it seems like the best and > simplest solution > Yeah - have contacted them but they're a big telco - I very much doubt they'l

Re: [nodejs] Coping with invalid http headers

2013-01-08 Thread Tim Caswell
You can use the TCP client directly and hand-roll the http request. Your response won't be parsed as http (nor would you want to in the error case), but you can write a crude parser in js to get the bulk of it. On Tue, Jan 8, 2013 at 12:42 PM, Matt wrote: > We're doing web scraping using node
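
A hand-rolled request over a plain TCP socket could be sketched as follows, using node's `net` module. `buildRequest` and `fetchRaw` are hypothetical helper names introduced for illustration, and the host in the usage comment is a placeholder. The sketch assumes plain HTTP (no TLS) and uses HTTP/1.0 with `Connection: close` so that the server signals the end of the response by closing the socket.

```javascript
var net = require('net');

// Build a minimal HTTP/1.0 GET request by hand. buildRequest and fetchRaw are
// hypothetical helper names used only for this sketch.
function buildRequest(host, path) {
  return [
    'GET ' + path + ' HTTP/1.0',
    'Host: ' + host,
    'Connection: close',
    '',
    ''
  ].join('\r\n');
}

// Send the request over a plain TCP socket and collect the raw response bytes.
function fetchRaw(host, port, path, callback) {
  var chunks = [];
  var finished = false;
  function done(err, data) {
    if (finished) return; // make sure the callback fires only once
    finished = true;
    callback(err, data);
  }
  var socket = net.connect(port, host, function () {
    socket.write(buildRequest(host, path));
  });
  socket.on('data', function (chunk) { chunks.push(chunk); });
  socket.on('end', function () { done(null, Buffer.concat(chunks)); });
  socket.on('error', done);
}

// Usage (placeholder host, not executed here):
// fetchRaw('example.com', 80, '/', function (err, raw) {
//   if (err) throw err;
//   console.log(raw.toString());
// });
```

The raw Buffer handed to the callback still contains the status line and headers, so it needs to be split from the body by hand before use.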

Re: [nodejs] Coping with invalid http headers

2013-01-08 Thread Ryan Schmidt
On Jan 8, 2013, at 12:42, Matt wrote: > We're doing web scraping using node and coming across an issue that we cannot > fetch a particular URL on a particular web site, because it sends back: > "Content-Length: 1234,1234" I'll bet you already have, but sending a bug report to whoever's servin

[nodejs] Coping with invalid http headers

2013-01-08 Thread Matt
We're doing web scraping using node and coming across an issue that we cannot fetch a particular URL on a particular web site, because it sends back: "Content-Length: 1234,1234" I totally understand that node's http parser doesn't deal with this, and throws an error, but is there any way we can in