Hello All,

I ran across another interesting problem today in handling status codes inside of:

src/main/java/org/apache/mina/filter/codec/http/HttpResponseLineDecodingState.java

This is the mina-2.0 snapshot. I'm building an RSS client that needs to poll thousands of RSS channels, and I'm using conditional GETs so as not to waste bandwidth. I noticed that a particular site was raising a "Bad Status Code" exception whenever the server responded with a 304 (Not Modified). It seems that the server (Apache version ??) was sending back a status code without the Reason Phrase.

I found the BNF notation that describes the Reason Phrase in RFC2616:

http://www.w3.org/Protocols/HTTP/1.1/rfc2616bis/issues/#i94

Which states:

TEXT = <any OCTET except CTLs, but including LWS>
LWS = [CRLF] 1*( SP | HT )
CRLF = CR LF
Reason-Phrase = *<TEXT, excluding CR, LF>

This means that a Reason Phrase can be empty and still be considered valid (i.e., zero or more octets). The state machine expects to see a Reason Phrase, and when it doesn't, it consumes part of the next header (Date) and then throws an exception trying to convert that value to an Integer.

What I did was override the isTerminator() method for the ConsumeToLinearWhitespaceDecodingState, adding a check for a CR. This stops the scanner from pulling in excess bytes. I wasn't sure how the remaining states would handle this, but AFTER_READ_STATUS_CODE returns immediately, as does READ_REASON_PHRASE (since we left a remaining LF byte on the input buffer), and we cleanly move to a final acceptance state. I've run about 500 feeds through this and nothing seems to have broken.
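To show the failure and the fix outside of mina, here is a small standalone sketch. It is not mina code: the class StatusLineSketch, the readToken() helper, and the sample response bytes are all made up for illustration, and readToken() only approximates what a consume-to-whitespace state does. The two predicates mirror the terminator check before and after the patch.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.IntPredicate;

// Hypothetical stand-in for the status-code scanning step; not part of mina.
class StatusLineSketch {

    // Before the patch (assumed): only SP (32) and HT (9) end the token.
    static final IntPredicate LWS_ONLY = b -> b == 32 || b == 9;

    // After the patch: CR (13) also ends the token.
    static final IntPredicate LWS_OR_CR = b -> b == 32 || b == 9 || b == 13;

    // Collects bytes from 'start' until the terminator predicate matches.
    static String readToken(byte[] in, int start, IntPredicate isTerminator) {
        int i = start;
        while (i < in.length && !isTerminator.test(in[i])) {
            i++;
        }
        return new String(in, start, i - start, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // A 304 response whose status line has no Reason Phrase (sample data).
        byte[] response = ("HTTP/1.1 304\r\n"
                + "Date: Fri, 04 Jan 2008 19:28:40 GMT\r\n\r\n")
                .getBytes(StandardCharsets.US_ASCII);
        int start = "HTTP/1.1 ".length();

        // Without the CR check the scanner runs into the Date header.
        String overrun = readToken(response, start, LWS_ONLY);
        try {
            Integer.parseInt(overrun);
        } catch (NumberFormatException e) {
            System.out.println("bad status code: " + overrun.replace("\r\n", "\\r\\n"));
        }

        // With the CR check the token is just the status code.
        String code = readToken(response, start, LWS_OR_CR);
        System.out.println("status code: " + Integer.parseInt(code));
    }
}
```

With only SP and HT as terminators, the token runs past the CR/LF and stops at the space after "Date:", which is why the integer conversion blows up. With CR added, the scan stops right after "304".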
Here's a patch:

--- HttpResponseLineDecodingState.orig.java 2008-01-04 14:29:25.000000000 -0500
+++ HttpResponseLineDecodingState.java 2008-01-04 14:28:40.000000000 -0500
@@ -80,6 +80,10 @@
             }
             return AFTER_READ_STATUS_CODE;
         }
+        @Override
+        protected boolean isTerminator(byte b) {
+            return b == 32 || b == 9 || b == 13;
+        }
     };

     private final DecodingState AFTER_READ_STATUS_CODE = new LinearWhitespaceSkippingState() {

This _should_ be safe since the response line has to be terminated with a CR/LF pair. As long as we leave the LF byte, it's enough to satisfy the state requirements for the trailing states.

You can use this site for testing. If you set the eTag you should get a 304 with no Reason Phrase.

http://www.mattweber.org/feed/

Thanks,
-Eric

P.S. Did I mention how much fun I've been having with mina? Love it!
