2016-08-02 10:17 GMT+02:00 Luca Toscano <toscano.l...@gmail.com>:

>
> 2016-08-01 21:13 GMT+02:00 Jacob Champion <champio...@gmail.com>
>>
>>
>> As stated above, this is not my first choice -- but I wouldn't oppose it
>> if that's what the consensus comes to.
>>
>>          else if (!ap_cstr_casecmp(w, "Last-Modified")) {
>>> -            apr_time_t parsed_date = apr_date_parse_rfc(l);
>>> +            apr_time_t parsed_date = apr_date_parse_http(l);
>>>
>>
>> apr_date_parse_http() is not good enough; IIUC, it completely ignores
>> timezones, which further corrupts non-GMT Last-Modified stamps. We either
>> want strict parsing or actual correction, not something in the middle.
>
>
> You are right, but this is the current behavior, this is why I used
> apr_date_parse_http. My idea was to keep what we have at the moment, adding
> a clear log that states the inconsistency found by httpd and why it has
> been corrected. In this case it would simply mean that a value considered
> "in the future" from GMT would be logged and corrected accordingly. This
> would still leave out other wrong use cases like a datetime string in the
> US/West timezone "in the future" if converted in GMT, but to solve it we'd
> need to interpret the Last-Modified header value with the
> apr_date_parse_rfc function and afaiu we still need to reach an agreement :)
>
> So I am really in favor of using apr_date_parse_rfc but I wanted to come
> up with a possible to solution to satisfy everybody and help our users with
> some clue.
>
> Example about the wrong use case mentioned by me for everybody (it helped
> me a lot so it might also be useful to other people reading):
>
> ===================
> HTTP/1.1 200 OK
> Date: Tue, 02 Aug 2016 07:54:39 GMT
> Server: Apache/2.5.0-dev (Unix)
> Last-Modified: Tue, 02 Aug 2016 00:54:39 GMT
> Content-Type: text/html; charset=UTF-8
>
> Last-Modified sent: Tue, 02 Aug 2016 00:54:39 -0700
> Now in GMT: Tue, 02 Aug 2016 07:54:39 +0000
> ===================
>
> As you can see, the FCGI script sets a Last-Modified header value now() in
> the "America/Los_Angeles" timezone, but it gets interpreted by httpd as a
> GMT datestring "in the past". So for example "now() + 2 hours" in the
> "America/Los_Angeles" timezone would still get interpreted as "in the past"
> by httpd, avoiding any datetime correction to GMT now().
>
> Same example with the code change that we have originally proposed:
>
> ===================
> HTTP/1.1 200 OK
> Date: Tue, 02 Aug 2016 08:08:14 GMT
> Server: Apache/2.5.0-dev (Unix)
> Last-Modified: Tue, 02 Aug 2016 08:08:14 GMT
> Content-Type: text/html; charset=UTF-8
>
> Last-Modified sent: Tue, 02 Aug 2016 01:08:14 -0700
> Now in GMT: Tue, 02 Aug 2016 08:08:14 +0000
> ===================
>
> The debate is all around, as far as I have understood, to if httpd should
> be able to do the above "interpretation" or not.
>
>
> 2016-08-01 23:13 GMT+02:00 Jacob Champion <champio...@gmail.com>:
>
>> On 08/01/2016 01:35 PM, William A Rowe Jr wrote:
>>
>>>     So this alternative is not my first choice. Invalid headers should
>>>     really either be corrected (if the correction is obvious, safe, and
>>>     helpful), or dropped entirely. Or the entire response should be
>>>     500'd, but we run into major compatibility breaks if we choose that
>>>     route.
>>>
>>> No, I agree that a 500 is not an option. Drop it, or fix it loudly in
>>> the logs,
>>> but we can't break existing deployments (which don't accept non-GMT
>>> dates today, FWIW).
>>>
>>
>> Based on conversations with Luca, existing deployments do "accept"
>> non-GMT dates, silently corrupting the value, due to the use of the
>> *_parse_http() function.
>>
>> I'm happy to react to bad input following a parsable date string, e.g. any
>>> non-"GMT" signifier, as entirely bad input worth emitting to the error
>>> log.
>>> Let's help CGI authors find their bugs instead of generating unexpected
>>> results, such as 5-8 hour wrong "Last Modified" dates in the US and
>>> now() throughout Europe, owing to the time zone deltas.
>>>
>>
>> Cool. I'm not opposed to loudly dropping non-GMT timestamps (though I
>> still prefer fixing them up).
>
>
> Wouldn't this introduce a breaking change? I know that everybody is
> supposed to look into the changelog before upgrading httpd but as we often
> see this is not the case and we could trigger nasty issues in various
> installations imho. Compared to this solution I'd much prefer to go for the
> proposed one, invasive but less disruptive :)
>
> As Jacob said, more opinions are very welcome!
>
> One general comment: I really like discussions but in this case I feel
> that we would have solved the issue in 20 minutes talking in person :)
> So maybe a conversation on IRC whenever everybody has a bit of time could
> help reaching a final consensus?
>


Trying with a new patch: http://apaste.info/xHz

I added two separate logs, one for non-GMT header values and the other one
for datetime strings considered in the future, restoring
apr_date_parse_http instead of apr_date_parse_rfc.

The example in my previous email would lead to the first error log,
meanwhile a Last-Modified header in the Europe/Paris timezone would lead to
both.

Hope that this could be a good compromise (modulo changes to wording and/or
log format that are very welcome), otherwise I am starting to run out of
ideas :)

Regards,

Luca

Reply via email to