On 07/19/2009 05:26 PM, Nick Kew wrote: > Ruediger Pluem wrote: >> >> On 07/19/2009 01:12 AM, n...@apache.org wrote: > >>> - if (re->nsub < idx || idx >= AP_MAX_REG_MATCH) { >>> - ap_log_rerror(APLOG_MARK, APLOG_WARNING, 0, r, >>> - "regex capture $%" APR_SIZE_T_FMT >>> - " is out of range (last regex was: >>> '%s') in %s", >>> - idx, re->rexp, r->filename); >>> - return NULL; >>> - } >>> - >>> - if (re->match[idx].rm_so < 0 || re->match[idx].rm_eo < 0) { >>> - return NULL; >>> - } >>> + else if (re->match[idx]rm_so == re->match[idx].rm_eo) { >>> + return NULL; >>> + } >>> + else if (re->match[idx].rm_so < 0 || re->match[idx].rm_eo < >>> 0) { >>> + /* I don't think this can happen if have_match is true. >>> + * But let's not risk a regression by dropping this >>> + */ >>> + return NULL; >>> + } >>> + else if (re->nsub < idx || idx >= AP_MAX_REG_MATCH) { >> >> IMHO this check should be the first (as in the old code) as it ensures >> that >> idx is of correct range. So it should be moved before >> re->match[idx].rm_so == re->match[idx].rm_eo) > > The problem with that is that re->nsub isn't the actual number of > matches, it's the maximum number we gave to regexec. So that check > returns true even when idx indexes a non-match.
But if idx for whatever reason is above AP_MAX_REG_MATCH we can segfault again in re->match[idx].rm_so == re->match[idx].rm_eo. So I still suggest to reverse the order of both if statements. This doesn't cost us anything but make things more foolprove. > > I'd say it would be better to set it to the number of actual matches > when we run regexec. But I didn't want to think through the risk > of side-effects in a quick-fix scenario. > > OTOH, bearing in mind the history of that code, it's never been > part of a public API in a stable release, so if we do create > side-effects, we're not at risk of breaking anything we shouldn't. > I guess we could reasonably hack it in trunk? > >>> + ap_log_rerror(APLOG_MARK, APLOG_WARNING, 0, r, >>> + "regex capture $%" APR_SIZE_T_FMT >>> + " is out of range (last regex was: '%s') >>> in %s", >>> + idx, re->rexp, r->filename); >>> + return NULL; >>> + } >>> >>> + else { >>> val = apr_pstrmemdup(ctx->dpool, re->source + >>> re->match[idx].rm_so, >>> re->match[idx].rm_eo - >>> re->match[idx].rm_so); >> >> >> As apr_pstrmemdup does return '\0' instead of NULL when >> re->match[idx].rm_so == re->match[idx].rm_eo >> we change the behaviour by doing the re->match[idx].rm_so == >> re->match[idx].rm_eo check above. > > Fair point. Again, side-effects. Let's reverse that change for 2.2 > (and in trunk if you're unhappy with it). As I see no benefit in returning NULL here I would say that we should reverse this in trunk as well as it keeps 2.2.x and trunk closer together. I have no problem if trunk is different from 2.2.x but it should be for a reason which I fail to see here. Regards Rüdiger