On 10/2/15 1:04 PM, Sara Golemon wrote:
On Fri, Oct 2, 2015 at 4:18 AM, Peter Cowburn <petercowb...@gmail.com>
wrote:

a) change all other "invalid" escape sequences to be a parse error [that
would mean "\m" would raise a parse error!]

b) change \u{} to behave like any other escape sequence, by not raising a
parse error and instead keeping the literal characters

or c) tell me to keep quiet and accept the oddball behaviour, having quirks
is The PHP Way after all.
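
For readers following along, here is a minimal sketch of the behaviour the three options refer to, assuming PHP 7.0 or later. The variable names are illustrative only, and the malformed escape is wrapped in eval() because it would otherwise stop the whole script from compiling.

```php
<?php
// Unrecognized escapes are kept as-is: backslash and letter both survive.
$unknown = "\m";
var_dump($unknown === '\m');          // bool(true)

// Recognized escapes are interpreted: "\t" is a single tab character.
$tab = "\t";
var_dump(strlen($tab));               // int(1)

// "\u" without a brace is also left alone (as in JSON-style "\u1234").
$jsonish = "\u1234";
var_dump($jsonish === '\u1234');      // bool(true)

// A well-formed \u{} escape yields the UTF-8 encoding of the code point.
$delta = "\u{394}";
var_dump($delta === "\xCE\x94");      // bool(true)

// A malformed \u{} escape is a compile-time error, so it is evaluated
// separately here to keep the rest of the script runnable.
$rejected = false;
try {
    eval('return "\u{zzzz}";');
} catch (ParseError $e) {
    $rejected = true;
}
var_dump($rejected);                  // bool(true)
```

Option (a) would make the first and third cases fail like the last one; option (b) would make the last case behave like the first.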


Well, I think option (a) would break parsed strings containing regex:

Oh holy hell.  I was about to point towards A because I agree with
Andrea that our invalid escape handling makes no sense, then you throw
this wrench in the gears.

While I still think that ignoring invalid sequences is bad and a
recipe for disaster (for example, in a given regex string, you have
some "escapes" passed to the engine as-is, while others like
\t\v\f\r\n do get interpolated, which is so inconsistent and entirely
php it's practically its own meme), I have to be practical about the
fact that there is a TON of existing regex out there (and no small
amount of "\u1234" sequences in JSON blobs).  A ton of that existing
regex is also needlessly using double-quoted strings where
single-quoted ones would have worked, meaning we can't just bifurcate on
that (even though allowing invalid sequences through in single-quoted
strings makes some sense).

Ugh...   No, that's too big of a change to existing scripts.  Can't do
option A, much as I'd like.
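
A short sketch of the regex concern, assuming the standard PCRE functions are available. The patterns are made up for illustration and are not taken from any real codebase.

```php
<?php
// "\d" is not a PHP escape, so the backslash reaches PCRE untouched and
// the double-quoted pattern is identical to the single-quoted one.
$digits = "/\d+/";
var_dump($digits === '/\d+/');              // bool(true)
var_dump(preg_match($digits, "abc123"));    // int(1)

// "\n" IS a PHP escape, so PHP converts it before PCRE ever sees it:
// the double-quoted pattern contains a real newline character.
$interpreted = "/a\nb/";
$passedThrough = '/a\nb/';
var_dump(strlen($interpreted));             // int(5)
var_dump(strlen($passedThrough));           // int(6)

// Both still match, because PCRE understands \n on its own.
var_dump(preg_match($interpreted, "a\nb"));   // int(1)
var_dump(preg_match($passedThrough, "a\nb")); // int(1)
```

Under option (a), the first pattern would become a parse error, which is the breakage being weighed here.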

Option (b) sounds reasonable, but there's probably A Solid Reason it was
implemented that way

AIUI, the "solid reason" was because it's dangerous to fail silently
where you have high confidence that something is wrong.  Again, I
believe in it, but the arguments against option A illustrate why it
might not be practical.  I hate to say this, but in the interest of
consistency (were 7.0 not in its final stage) I'd vote for B.

which if so leaves (c.ii): accepting the odd-ball behavior....

Given that 7.0 is in its final stage, changing this behaviour is
probably a non-starter at this point.  C seems the most sane^W
pragmatic.  It's not the first inconsistency PHP's picked up, it won't
be the last.

I agree with Sara all the way except the opinion that it's too late to fix this bug with option B, which I think is the right one.

I simply don't know whether it is too late, so I suggest Peter enter a bug report and see what happens. If it's too late for 7.0.0, do it in 7.0.1, which is OK because people will expect instability with 7.0.0. \u{394}semver > 1 is sufficient warning, I think.

Tom

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
