Hi Yann, 2016-09-10 13:58 GMT+02:00 Yann Ylavic <[email protected]>:
> Hi Luca, > > On Sat, Sep 10, 2016 at 12:07 PM, Luca Toscano <[email protected]> > wrote: > > > > 2016-09-10 1:13 GMT+02:00 Yann Ylavic <[email protected]>: > >> > >> As read it, this proposal (http://apaste.info/FCs) does not enforce > >> GMT date and will treat any other timezone as if it were GMT (per > >> apr_date_parse_http(), which does not look for "GMT" anywhere), right? > > > > Yes, it leaves the current behavior unaltered (except for APR_DATE_BAD). > > So we still may change the Last-Modified sent by a CGI... > Yes, all the drawbacks of the current approach (basically assuming 'GMT' in the datestring returned by a CGI) are still there, but they are logged and users know what httpd is doing. > >> > >> Shouldn't we either ignore (i.e. not forward) non-GMT, or change it to > >> GMT if it's a valid timezone? > >> > >> The only way to do this would be to use apr_date_parse_rfc() in the > first > >> place. > >> If we'd want to ignore non-GMT, compare it to apr_date_parse_http(). > >> And only otherwise or if it matches, let it go through > ap_update_mtime(). > > > > My decision to not use apr_date_parse_rfc came after the long discussion > in > > dev@ about whether or not a Last-Modified header coming from a CGI > backend > > should be considered HTTP input (so assuming GMT IIUC) or not. Since we > > didn't reach a complete agreement on how to proceed I thought to propose > a > > light change as starter (to answer a question raised in users@ a while > back > > about warning admins on httpd modifications of the Last-Modified header > > value) and think about more invasive changes in the future, like > dropping a > > non GMT header or converting it in GMT if on a valid timezone. > > If we really didn't want to modify the (semantic of the )header, we'd > accept other timezones and convert it to GMT. > But dropping or interpreting it as GMT when it's not (i.e. s/<any > timezone>/GMT/ without changing the date) is a modification anyway. > > So if we want to stick to the RFC (and it's very strict > interpretation, which I'm not sure it's that strict), either we ignore > or we error on non-GMT (the latter would be an unacceptable change in > behaviour IMHO). > I wouldn't like both since they could be too invasive for existing web-apps/CGIs that are relying on this "feature" without knowing it. I am in favor of either interpreting a valid non-GMT datestring or leaving the current behavior and log a warning. > >> I don't the difference between this proposal and the current code (but > >> TRACEs), ap_update_mtime() is a noop with APR_DATE_BAD. > >> Did I miss something? > > > > > > This is a very good point, I didn't notice an important part. From what I > > can see ap_update_mtime is indeed not updating anything, but r->mtime is > > already 0 and the ap_set_last_modified picks it up and use it (ending up > > with a Unix epoch datetime string). My check on APR_DATE_BAD avoids the > call > > to ap_set_last_modified, this is why the header is dropped. > > Yes, I missed ap_set_last_modified() was using ~now when given zero. > > > Maybe we also > > need something like the following? > > > > AP_DECLARE(void) ap_set_last_modified(request_rec *r) > > { > > - if (!r->assbackwards) { > > + if (!r->assbackwards && r->mtime > 0) { > > apr_time_t mod_time = ap_rationalize_mtime(r, r->mtime); > > char *datestr = apr_palloc(r->pool, APR_RFC822_DATE_LEN); > > That'd change its behaviour, one may currently use it to set > Last-Modified to now, w/o calling ap_update_mtime(), and it used to > work... > My bad, didn't investigate the solution properly :) > Anyway, my personal preference would be to tolerate a different > timezone (than GMT), or at at worse ignore the header (if not GMT). > > So I think the patch (against 2.4.x) would be something like : > > Index: server/util_script.c > =================================================================== > --- server/util_script.c (revision 1760114) > +++ server/util_script.c (working copy) > @@ -670,11 +670,26 @@ AP_DECLARE(int) ap_scan_script_header_err_core_ex( > } > /* > * If the script gave us a Last-Modified header, we can't just > - * pass it on blindly because of restrictions on future values. > + * pass it on blindly (RFC mandates GMT). Future values are > + * handled by ap_set_last_modified() and changed to now. > */ > else if (!strcasecmp(w, "Last-Modified")) { > - ap_update_mtime(r, apr_date_parse_http(l)); > - ap_set_last_modified(r); > + apr_time_t parsed_date = apr_date_parse_rfc(l); > + if (parsed_date == APR_DATE_BAD) { > + ap_log_rerror(SCRIPT_LOG_MARK, APLOG_DEBUG, 0, r, > + "Ignored Last-Modified header value: '%s'", > + l); > + } > + else if(parsed_date != apr_date_parse_http(l)) { > + ap_log_rerror(SCRIPT_LOG_MARK, APLOG_DEBUG, 0, r, > + "Ignored Last-Modified header value (not " > + "within the GMT timezone, as required): > '%s'", > + l); > + } > The patch looks good but the above bit is what I'd be worried about, since I am afraid that it could affect existing web-app/CGIs needing extra tuning/config after the httpd upgrade. Imagine an old and untouchable CGI returning an 'incorrect' Last-Modified header, worked fine till now but broken after this change. Maybe I am too paranoid but the admins in the community have already a painful life while upgrading systems, I wouldn't cause more headaches to them if possible :) > + else { > + ap_update_mtime(r, parsed_date); > + ap_set_last_modified(r); > + } > } > else if (!strcasecmp(w, "Set-Cookie")) { > apr_table_add(cookie_table, w, l); > _ > > > Or the lenient mode : > > Index: server/util_script.c > =================================================================== > --- server/util_script.c (revision 1760114) > +++ server/util_script.c (working copy) > @@ -670,11 +670,21 @@ AP_DECLARE(int) ap_scan_script_header_err_core_ex( > } > /* > * If the script gave us a Last-Modified header, we can't just > - * pass it on blindly because of restrictions on future values. > + * pass it on blindly (RFC mandates GMT, but be lenient and > + * transform other timezones to GMT). Future values are handled > + * by ap_set_last_modified() and changed to now. > */ > else if (!strcasecmp(w, "Last-Modified")) { > - ap_update_mtime(r, apr_date_parse_http(l)); > - ap_set_last_modified(r); > + apr_time_t parsed_date = apr_date_parse_rfc(l); > + if (parsed_date == APR_DATE_BAD) { > + ap_log_rerror(SCRIPT_LOG_MARK, APLOG_DEBUG, 0, r, > + "Ignored Last-Modified header value: '%s'", > + l); > + } > + else { > + ap_update_mtime(r, parsed_date); > + ap_set_last_modified(r); > + } > } > else if (!strcasecmp(w, "Set-Cookie")) { > apr_table_add(cookie_table, w, l); > I really like this one, but I'd also add a trace log to warn the admin if apr_date_parse_http(l) != apr_date_parse_rfc(l). My last code change proposal was an attempt to satisfy a user request and a dev@ discussion, but probably I failed to find a good compromise. I am personally in favor of using apr_date_parse_rfc instead of the _http counterpart, but I'd also like to see an agreement in this list. My +1 goes to your second solution (plus some trace logging if everybody likes it). William/Jacob: if you want to add more thoughts please do :) Thanks a lot for the review! Luca
