On Fri, Oct 2, 2015 at 6:53 AM, Bishop Bettini <bis...@php.net> wrote:
> On Fri, Oct 2, 2015 at 4:18 AM, Peter Cowburn <petercowb...@gmail.com>
> wrote:
>
>> a) change all other "invalid" escape sequences to be a parse error [that
>> would mean "\m" would raise a parse error!]
>>
>> b) change \u{} to behave like any other escape sequence, by not raising a
>> parse error and instead keeping the literal characters
>>
>> or c) tell me to keep quiet and accept the oddball behaviour, having quirks
>> is The PHP Way after all.
>>
>
> Well, I think option (a) would break parsed strings containing regex:
>
Oh holy hell.  I was about to point towards A because I agree with
Andrea that our invalid escape handling makes no sense, then you throw
this wrench in the gears.

While I still think that ignoring invalid sequences is bad and a
recipe for disaster (for example, in a given regex string, you have
some "escapes" passed to the engine as-is, while others like
\t\v\f\r\n do get interpreted, which is so inconsistent and entirely
PHP it's practically its own meme), I have to be practical about the
fact that there is a TON of existing regex out there (and no small
amount of "\u1234" sequences in JSON blobs).  A ton of that existing
regex is also needlessly using double-quoted strings where
single-quoted ones would have worked, meaning we can't just bifurcate on
that (even though allowing invalid sequences through in single-quoted
strings makes some sense).
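
For illustration, here is a minimal sketch of the mixed handling described
above. It assumes PHP 7.0 or later (where the \u{...} escape exists); the
variable names are made up for the example, and eval() is used only so the
parse error can be caught without aborting the script.

```php
<?php
// Sketch only: assumes PHP 7.0+; all variable names are hypothetical.

// A recognised escape is converted by PHP itself.
$tabLength = strlen("\t");                 // 1 byte: a real tab character

// An unrecognised escape is kept literally, backslash included.
$unknownLength = strlen("\m");             // 2 bytes: backslash + "m"
$digitKept = ("\d" === '\d');              // true: \d reaches PCRE untouched
$jsonStyleKept = ("\u1234" === '\u1234');  // true: no braces, so left alone

// Both behaviours inside one double-quoted regex: PHP turns \t into a tab,
// while \d and \w are passed through for the regex engine to interpret.
$pattern = "/^\d+\t\w+/";
$matched = preg_match($pattern, "42\tabc"); // 1

// A malformed \u{...} sequence is the odd one out: it is a parse error.
try {
    eval('return "\u{zzz}";');
    $malformed = 'accepted';
} catch (ParseError $e) {
    $malformed = 'parse error';
}

var_dump($tabLength, $unknownLength, $digitKept, $jsonStyleKept, $matched, $malformed);
```

Switching the pattern to single quotes would hand \t to the regex engine as
well, which still matches a tab there, so that particular pattern behaves the
same either way.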

Ugh...   No, that's too big of a change to existing scripts.  Can't do
option A, much as I'd like.

> Option (b) sounds reasonable, but there's probably A Solid Reason it was
> implemented that way
>
AIUI, the "solid reason" was that it's dangerous to fail silently
where you have high confidence that something is wrong.  Again, I
believe in it, but the arguments against option A illustrate why it
might not be practical.  I hate to say this, but in the interest of
consistency (were 7.0 not in its final stage) I'd vote for B.

> which if so leaves (c.ii): accepting the odd-ball behavior....
>
Given that 7.0 is in its final stage, changing this behaviour is
probably a non-starter at this point.  C seems the most sane^W
pragmatic.  It's not the first inconsistency PHP's picked up, it won't
be the last.

-Sara

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php