On 10/2/15 1:04 PM, Sara Golemon wrote:
On Fri, Oct 2, 2015 at 4:18 AM, Peter Cowburn <petercowb...@gmail.com>
wrote:
a) change all other "invalid" escape sequences to be a parse error [that
would mean "\m" would raise a parse error!]
b) change \u{} to behave like any other escape sequence, by not raising a
parse error and instead keeping the literal characters
or c) tell me to keep quiet and accept the oddball behaviour, having quirks
is The PHP Way after all.
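
For reference, the behaviour these options are about can be sketched roughly as below. This is a minimal illustration, assuming PHP 7.0 or later; the variable names are made up for the example and do not come from the thread, and eval() is used only so that the deliberately malformed escape does not stop the whole file from compiling.

```php
<?php
// Sketch only: assumes PHP 7.0+; all variable names are illustrative.

// Recognised escapes are interpolated: "\t" is a single tab byte.
$tab = "\t";

// Unrecognised escapes are kept literally, backslash included.
$unknown = "\m";

// This is why regex written in double quotes works today:
// \d, \s and \w reach PCRE untouched.
$pattern = "/^\d+\s\w+/";
$matched = preg_match($pattern, "42 apples");

// "\u" without a brace (as in JSON blobs) is also left alone.
$json = "\u1234";

// A valid code point in braces becomes UTF-8 (U+0394 is 0xCE 0x94).
$delta = "\u{394}";

// A malformed braced escape is the odd one out: it is a parse error.
$parseError = false;
try {
    eval('return "\u{zzz}";');
} catch (ParseError $e) {
    $parseError = true;
}

var_dump(strlen($tab), strlen($unknown), $matched, strlen($json), $delta === "\xCE\x94", $parseError);
```

Under option (a), the "\m" and "/^\d+\s\w+/" literals above would stop compiling; under option (b), the malformed escape inside eval() would be kept as literal characters instead of throwing.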
Well, I think option (a) would break parsed strings containing regex:
Oh holy hell. I was about to point towards A because I agree with
Andrea that our invalid escape handling makes no sense, then you throw
this wrench in the gears.
While I still think that ignoring invalid sequences is bad and a
recipe for disaster (for example, in a given regex string, you have
some "escapes" passed to the engine as-is, while others like
\t\v\f\r\n do get interpolated, which is so inconsistent and entirely
php it's practically its own meme), I have to be practical about the
fact that there is a TON of existing regex out there (and no small
amount of "\u1234" sequences in JSON blobs). A ton of that existing
regex is also needlessly using double-quoted strings where
single-quoted ones would have worked, meaning we can't just bifurcate on
that (even though allowing invalid sequences through in single-quoted
strings makes some sense).
Ugh... No, that's too big of a change to existing scripts. Can't do
option A, much as I'd like.
Option (b) sounds reasonable, but there's probably A Solid Reason it was
implemented that way
AIUI, the "solid reason" was because it's dangerous to fail silently
where you have high confidence that something is wrong. Again, I
believe in it, but the arguments against option A illustrate why it
might not be practical. I hate to say this, but in the interest of
consistency (were 7.0 not in its final stage) I'd vote for B.
which if so leaves (c.ii): accepting the odd-ball behavior....
Given that 7.0 is in its final stage, changing this behaviour is
probably a non-starter at this point. C seems the most sane^W
pragmatic. It's not the first inconsistency PHP's picked up, it won't
be the last.
I agree with Sara all the way except the opinion that it's too late to
fix this bug with option B, which I think is the right one.
I simply don't know whether it is too late, so I suggest Peter enter a
bug report and see what happens. If it's too late for 7.0.0, do it in
7.0.1, which is OK because people will expect instability with 7.0.0.
\u{394}semver > 1 is sufficient warning, I think.
Tom
--
PHP Internals - PHP Runtime Development Mailing List