On Fri, Oct 2, 2015 at 6:53 AM, Bishop Bettini <bis...@php.net> wrote:

> On Fri, Oct 2, 2015 at 4:18 AM, Peter Cowburn <petercowb...@gmail.com>
> wrote:
>
>> a) change all other "invalid" escape sequences to be a parse error [that
>> would mean "\m" would raise a parse error!]
>>
>> b) change \u{} to behave like any other escape sequence, by not raising a
>> parse error and instead keeping the literal characters
>>
>> or c) tell me to keep quiet and accept the oddball behaviour, having quirks
>> is The PHP Way after all.
>>
>
> Well, I think option (a) would break parsed strings containing regex:
>

Oh holy hell. I was about to point towards A because I agree with Andrea that our invalid escape handling makes no sense, then you throw this wrench in the gears.
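For anyone following along, here is a rough sketch of the behaviour being argued over, assuming PHP 7.0 or later. The variable names and the raisesParseError() helper are made up for illustration, and the eval() wrapper is only there so the malformed literal does not stop the whole file from compiling.

```php
<?php
// Assumes PHP 7.0 or later; all names below are illustrative only.

// Recognised escapes are interpolated: "\t" becomes a single tab byte.
$tab = "\t";

// Unrecognised escapes pass through untouched: "\m" stays backslash + "m".
$unknown = "\m";

// This mix is what keeps double-quoted regex working today: \d and \w reach
// PCRE as-is, while \t has already been turned into a literal tab by PHP.
$pattern = "/\d+\t\w+/";
$matched = preg_match($pattern, "42\tabc");

// "\u" without a brace is also left alone, so JSON-style "\u1234" survives.
$jsonStyle = "\u1234";

// A well-formed "\u{...}" is converted to UTF-8 (U+00E9 encodes as two bytes).
$codepoint = "\u{e9}";

// Hypothetical helper: reports whether compiling $code raises a ParseError.
// The callers use single quotes so the backslashes reach eval() unchanged.
function raisesParseError(string $code): bool
{
    try {
        eval($code);
        return false;
    } catch (ParseError $e) {
        return true;
    }
}

// A malformed "\u{...}" is the odd one out: it refuses to compile.
$badUnicode = raisesParseError('return "\u{zzz}";');

// Any other unknown escape compiles without complaint.
$badOther = raisesParseError('return "\m";');

var_dump(strlen($tab), strlen($unknown), $matched, strlen($jsonStyle),
         strlen($codepoint), $badUnicode, $badOther);
```

Run as-is, the lengths come out as 1, 2, 6 and 2, the regex matches, and only the "\u{zzz}" literal trips the parser. Option (a) would make the "\m" and "\d" cases fail the same way, which is exactly the regex breakage Bishop is pointing at.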
While I still think that ignoring invalid sequences is bad and a recipe for disaster (for example, in a given regex string, you have some "escapes" passed to the engine as-is, while others like \t\v\f\r\n do get interpolated, which is so inconsistent and entirely php it's practically its own meme), I have to be practical about the fact that there is a TON of existing regex out there (and no small amount of "\u1234" sequences in JSON blobs). A ton of that existing regex is also needlessly using double-quoted strings where single-quoted ones would have worked, meaning we can't just bifurcate on that (even though allowing invalid sequences through in single-quoted strings makes some sense).

Ugh... No, that's too big of a change to existing scripts. Can't do option A, much as I'd like.

> Option (b) sounds reasonable, but there's probably A Solid Reason it was
> implemented that way
>

AIUI, the "solid reason" was that it's dangerous to fail silently where you have high confidence that something is wrong. Again, I believe in it, but the arguments against option A illustrate why it might not be practical. I hate to say this, but in the interest of consistency (were 7.0 not in its final stage) I'd vote for B.

> which if so leaves (c.ii): accepting the odd-ball behavior....
>

Given that 7.0 is in its final stage, changing this behaviour is probably a non-starter at this point. C seems the most sane^W pragmatic. It's not the first inconsistency PHP's picked up, it won't be the last.

-Sara

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php