On 07/19/2009 05:26 PM, Nick Kew wrote:
> Ruediger Pluem wrote:
>>
>> On 07/19/2009 01:12 AM, n...@apache.org wrote:
>
>>> -            if (re->nsub < idx || idx >= AP_MAX_REG_MATCH) {
>>> -                ap_log_rerror(APLOG_MARK, APLOG_WARNING, 0, r,
>>> -                              "regex capture $%" APR_SIZE_T_FMT
>>> -                              " is out of range (last regex was:
>>> '%s') in %s",
>>> -                              idx, re->rexp, r->filename);
>>> -                return NULL;
>>> -            }
>>> -
>>> -            if (re->match[idx].rm_so < 0 || re->match[idx].rm_eo < 0) {
>>> -                return NULL;
>>> -            }
>>> +        else if (re->match[idx]rm_so == re->match[idx].rm_eo) {
>>> +            return NULL;
>>> +        }
>>> +        else if (re->match[idx].rm_so < 0 || re->match[idx].rm_eo <
>>> 0) {
>>> +            /* I don't think this can happen if have_match is true.
>>> +             * But let's not risk a regression by dropping this
>>> +             */
>>> +            return NULL;
>>> +        }
>>> +        else if (re->nsub < idx || idx >= AP_MAX_REG_MATCH) {
>>
>> IMHO this check should be the first (as in the old code) as it ensures
>> that
>> idx is of correct range. So it should be moved before
>> re->match[idx].rm_so == re->match[idx].rm_eo)
> 
> The problem with that is that re->nsub isn't the actual number of
> matches, it's the maximum number we gave to regexec.  So that check
> returns true even when idx indexes a non-match.

But if idx for whatever reason is above AP_MAX_REG_MATCH we can segfault
again in re->match[idx].rm_so == re->match[idx].rm_eo. So I still suggest
to reverse the order of both if statements. This doesn't cost us anything
but make things more foolprove.

> 
> I'd say it would be better to set it to the number of actual matches
> when we run regexec.  But I didn't want to think through the risk
> of side-effects in a quick-fix scenario.
> 
> OTOH, bearing in mind the history of that code, it's never been
> part of a public API in a stable release, so if we do create
> side-effects, we're not at risk of breaking anything we shouldn't.
> I guess we could reasonably hack it in trunk?
> 
>>> +            ap_log_rerror(APLOG_MARK, APLOG_WARNING, 0, r,
>>> +                          "regex capture $%" APR_SIZE_T_FMT
>>> +                          " is out of range (last regex was: '%s')
>>> in %s",
>>> +                          idx, re->rexp, r->filename);
>>> +            return NULL;
>>> +        }
>>>  
>>> +        else {
>>>              val = apr_pstrmemdup(ctx->dpool, re->source +
>>> re->match[idx].rm_so,
>>>                                   re->match[idx].rm_eo -
>>> re->match[idx].rm_so);
>>
>>
>> As apr_pstrmemdup does return '\0' instead of NULL when
>> re->match[idx].rm_so == re->match[idx].rm_eo
>> we change the behaviour by doing the re->match[idx].rm_so ==
>> re->match[idx].rm_eo check above.
> 
> Fair point.  Again, side-effects.  Let's reverse that change for 2.2
> (and in trunk if you're unhappy with it).

As I see no benefit in returning NULL here I would say that we should
reverse this in trunk as well as it keeps 2.2.x and trunk closer together.
I have no problem if trunk is different from 2.2.x but it should be for
a reason which I fail to see here.


Regards

Rüdiger

Reply via email to