2016-08-02 10:17 GMT+02:00 Luca Toscano <toscano.l...@gmail.com>: > > 2016-08-01 21:13 GMT+02:00 Jacob Champion <champio...@gmail.com> >> >> >> As stated above, this is not my first choice -- but I wouldn't oppose it >> if that's what the consensus comes to. >> >> else if (!ap_cstr_casecmp(w, "Last-Modified")) { >>> - apr_time_t parsed_date = apr_date_parse_rfc(l); >>> + apr_time_t parsed_date = apr_date_parse_http(l); >>> >> >> apr_date_parse_http() is not good enough; IIUC, it completely ignores >> timezones, which further corrupts non-GMT Last-Modified stamps. We either >> want strict parsing or actual correction, not something in the middle. > > > You are right, but this is the current behavior, this is why I used > apr_date_parse_http. My idea was to keep what we have at the moment, adding > a clear log that states the inconsistency found by httpd and why it has > been corrected. In this case it would simply mean that a value considered > "in the future" from GMT would be logged and corrected accordingly. This > would still leave out other wrong use cases like a datetime string in the > US/West timezone "in the future" if converted in GMT, but to solve it we'd > need to interpret the Last-Modified header value with the > apr_date_parse_rfc function and afaiu we still need to reach an agreement :) > > So I am really in favor of using apr_date_parse_rfc but I wanted to come > up with a possible to solution to satisfy everybody and help our users with > some clue. > > Example about the wrong use case mentioned by me for everybody (it helped > me a lot so it might also be useful to other people reading): > > =================== > HTTP/1.1 200 OK > Date: Tue, 02 Aug 2016 07:54:39 GMT > Server: Apache/2.5.0-dev (Unix) > Last-Modified: Tue, 02 Aug 2016 00:54:39 GMT > Content-Type: text/html; charset=UTF-8 > > Last-Modified sent: Tue, 02 Aug 2016 00:54:39 -0700 > Now in GMT: Tue, 02 Aug 2016 07:54:39 +0000 > =================== > > As you can see, the FCGI script sets a Last-Modified header value now() in > the "America/Los_Angeles" timezone, but it gets interpreted by httpd as a > GMT datestring "in the past". So for example "now() + 2 hours" in the > "America/Los_Angeles" timezone would still get interpreted as "in the past" > by httpd, avoiding any datetime correction to GMT now(). > > Same example with the code change that we have originally proposed: > > =================== > HTTP/1.1 200 OK > Date: Tue, 02 Aug 2016 08:08:14 GMT > Server: Apache/2.5.0-dev (Unix) > Last-Modified: Tue, 02 Aug 2016 08:08:14 GMT > Content-Type: text/html; charset=UTF-8 > > Last-Modified sent: Tue, 02 Aug 2016 01:08:14 -0700 > Now in GMT: Tue, 02 Aug 2016 08:08:14 +0000 > =================== > > The debate is all around, as far as I have understood, to if httpd should > be able to do the above "interpretation" or not. > > > 2016-08-01 23:13 GMT+02:00 Jacob Champion <champio...@gmail.com>: > >> On 08/01/2016 01:35 PM, William A Rowe Jr wrote: >> >>> So this alternative is not my first choice. Invalid headers should >>> really either be corrected (if the correction is obvious, safe, and >>> helpful), or dropped entirely. Or the entire response should be >>> 500'd, but we run into major compatibility breaks if we choose that >>> route. >>> >>> No, I agree that a 500 is not an option. Drop it, or fix it loudly in >>> the logs, >>> but we can't break existing deployments (which don't accept non-GMT >>> dates today, FWIW). >>> >> >> Based on conversations with Luca, existing deployments do "accept" >> non-GMT dates, silently corrupting the value, due to the use of the >> *_parse_http() function. >> >> I'm happy to react to bad input following a parsable date string, e.g. any >>> non-"GMT" signifier, as entirely bad input worth emitting to the error >>> log. >>> Let's help CGI authors find their bugs instead of generating unexpected >>> results, such as 5-8 hour wrong "Last Modified" dates in the US and >>> now() throughout Europe, owing to the time zone deltas. >>> >> >> Cool. I'm not opposed to loudly dropping non-GMT timestamps (though I >> still prefer fixing them up). > > > Wouldn't this introduce a breaking change? I know that everybody is > supposed to look into the changelog before upgrading httpd but as we often > see this is not the case and we could trigger nasty issues in various > installations imho. Compared to this solution I'd much prefer to go for the > proposed one, invasive but less disruptive :) > > As Jacob said, more opinions are very welcome! > > One general comment: I really like discussions but in this case I feel > that we would have solved the issue in 20 minutes talking in person :) > So maybe a conversation on IRC whenever everybody has a bit of time could > help reaching a final consensus? >
Trying with a new patch: http://apaste.info/xHz I added two separate logs, one for non-GMT header values and the other one for datetime strings considered in the future, restoring apr_date_parse_http instead of apr_date_parse_rfc. The example in my previous email would lead to the first error log, meanwhile a Last-Modified header in the Europe/Paris timezone would lead to both. Hope that this could be a good compromise (modulo changes to wording and/or log format that are very welcome), otherwise I am starting to run out of ideas :) Regards, Luca