Title: [221160] trunk
Revision: 221160
Author: msab...@apple.com
Date: 2017-08-24 14:14:43 -0700 (Thu, 24 Aug 2017)

Log Message

Add support for RegExp "dotAll" flag
https://bugs.webkit.org/show_bug.cgi?id=175924

Reviewed by Keith Miller.

JSTests:

Updated tests for new dotAll ('s' flag) changes.

* es6/Proxy_internal_get_calls_RegExp.prototype.flags.js:
* stress/static-getter-in-names.js:

Source/JavaScriptCore:

The dotAll RegExp flag, 's', changes . to match any character, including line terminators.
Added the "dotAll" identifier as well as the RegExp.prototype.dotAll getter.
Added a new "any character" CharacterClass that is used to match . terms in a RegExp with
the dotAll flag.  In the YARR pattern and parsing code, renamed NewlineClassID, which was only
used for '.' processing, to DotClassID.  Which builtin character class DotClassID resolves
to when generating the pattern depends on the dotAll flag.
This NewlineClassID to DotClassID refactoring includes atomBuiltInCharacterClass() in
the PatternParser class of the WebCore content extensions code.
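
As a rough illustration of the user-visible behavior described above, here is a small sketch in plain JavaScript. It assumes an engine that supports the 's' flag; the sample string and expected results are taken from the new layout tests, and the variable names are only for illustration.

```javascript
// Sample input with line terminators on both sides of the "X".
const text = "aa\nX\ncc";

// Without 's', '.' cannot cross "\n", so the .* terms match nothing.
const withoutFlag = text.match(/.*X.*/)[0];
// With 's', '.' also matches "\n", so the match covers the whole string.
const withFlag = text.match(/.*X.*/s)[0];

console.log(JSON.stringify(withoutFlag)); // "X"
console.log(JSON.stringify(withFlag));    // "aa\nX\ncc"

// The new getter reports whether the flag was given.
console.log(/a/.dotAll, /a/s.dotAll);     // false true
// 's' appears between 'm' and 'u' in the flags string.
console.log(/a/gimsuy.flags);             // gimsuy
```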

As an optimization, the Yarr JIT doesn't perform match checks against the builtin
"any character" CharacterClass; it merely reads the character.  There is another optimization
in our DotStar enclosure processing, where for a non-capturing regular expression of the form
.*<expression>.*, with an optional leading ^ and/or trailing $, we match the contained
expression and then look for the extents of the surrounding .*'s.  When used with the
dotAll flag, that processing always ends up at the beginning and the end of the string.
Therefore, for dotAll patterns, we short-circuit the search for the beginning and end of the
line or string.
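
The DotStar enclosure short circuit can be sketched roughly as follows. This is a simplified model in JavaScript, not the Yarr code: dotStarExtents and its parameters are hypothetical names, and the handling of the optional ^ and $ anchors is left out.

```javascript
// Hypothetical helper: given the span matched by the inner expression,
// widen it to cover the surrounding .* terms.
function dotStarExtents(input, startOffset, innerBegin, innerEnd, dotAll) {
    if (dotAll) {
        // '.' matches everything, so the leading .* reaches back to where
        // the search started and the trailing .* runs to the end of input.
        return [startOffset, input.length];
    }
    const isLineTerminator = (ch) =>
        ch === "\n" || ch === "\r" || ch === "\u2028" || ch === "\u2029";
    // Scan backwards to the start of the current line.
    let begin = innerBegin;
    while (begin > startOffset && !isLineTerminator(input[begin - 1]))
        begin--;
    // Scan forwards to the end of the current line.
    let end = innerEnd;
    while (end < input.length && !isLineTerminator(input[end]))
        end++;
    return [begin, end];
}

const input = "aa\nX\ncc";
const innerBegin = input.indexOf("X");
const [b1, e1] = dotStarExtents(input, 0, innerBegin, innerBegin + 1, false);
const [b2, e2] = dotStarExtents(input, 0, innerBegin, innerBegin + 1, true);
console.log(JSON.stringify(input.slice(b1, e1))); // "X"
console.log(JSON.stringify(input.slice(b2, e2))); // "aa\nX\ncc"
```

Both results agree with what the layout tests expect for /.*X.*/ and /.*X.*/s on the same input.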

* bytecode/BytecodeDumper.cpp:
(JSC::regexpToSourceString):
* runtime/CommonIdentifiers.h:
* runtime/RegExp.cpp:
(JSC::regExpFlags):
(JSC::RegExpFunctionalTestCollector::outputOneTest):
* runtime/RegExp.h:
* runtime/RegExpKey.h:
* runtime/RegExpPrototype.cpp:
(JSC::RegExpPrototype::finishCreation):
(JSC::flagsString):
(JSC::regExpProtoGetterDotAll):
* yarr/YarrInterpreter.cpp:
(JSC::Yarr::Interpreter::matchDotStarEnclosure):
* yarr/YarrInterpreter.h:
(JSC::Yarr::BytecodePattern::dotAll const):
* yarr/YarrJIT.cpp:
(JSC::Yarr::YarrGenerator::optimizeAlternative):
(JSC::Yarr::YarrGenerator::generateCharacterClassOnce):
(JSC::Yarr::YarrGenerator::generateCharacterClassFixed):
(JSC::Yarr::YarrGenerator::generateCharacterClassGreedy):
(JSC::Yarr::YarrGenerator::backtrackCharacterClassNonGreedy):
(JSC::Yarr::YarrGenerator::generateDotStarEnclosure):
* yarr/YarrParser.h:
(JSC::Yarr::Parser::parseTokens):
* yarr/YarrPattern.cpp:
(JSC::Yarr::YarrPatternConstructor::atomBuiltInCharacterClass):
(JSC::Yarr::YarrPatternConstructor::atomCharacterClassBuiltIn):
(JSC::Yarr::YarrPatternConstructor::optimizeDotStarWrappedExpressions):
(JSC::Yarr::YarrPattern::YarrPattern):
(JSC::Yarr::PatternTerm::dump):
(JSC::Yarr::anycharCreate):
* yarr/YarrPattern.h:
(JSC::Yarr::YarrPattern::reset):
(JSC::Yarr::YarrPattern::anyCharacterClass):
(JSC::Yarr::YarrPattern::dotAll const):

Source/WebCore:

Changed due to refactoring NewlineClassID to DotClassID.

No new tests. No change in behavior.

* contentextensions/URLFilterParser.cpp:
(WebCore::ContentExtensions::PatternParser::atomBuiltInCharacterClass):

LayoutTests:

* js/regexp-dotall-expected.txt: Added.
* js/regexp-dotall.html: Added.
* js/script-tests/Object-getOwnPropertyNames.js:
* js/script-tests/regexp-dotall.js: Added.
New tests.

* js/Object-getOwnPropertyNames-expected.txt:
Updated tests for new dotAll ('s' flag) changes.

Diff

Modified: trunk/JSTests/ChangeLog (221159 => 221160)


--- trunk/JSTests/ChangeLog	2017-08-24 21:03:05 UTC (rev 221159)
+++ trunk/JSTests/ChangeLog	2017-08-24 21:14:43 UTC (rev 221160)
@@ -1,3 +1,15 @@
+2017-08-24  Michael Saboff  <msab...@apple.com>
+
+        Add support for RegExp "dotAll" flag
+        https://bugs.webkit.org/show_bug.cgi?id=175924
+
+        Reviewed by Keith Miller.
+
+        Updated tests for new dotAll ('s' flag) changes.
+
+        * es6/Proxy_internal_get_calls_RegExp.prototype.flags.js:
+        * stress/static-getter-in-names.js:
+
 2017-08-24  Mark Lam  <mark....@apple.com>
 
         Land regression test for https://bugs.webkit.org/show_bug.cgi?id=164081.

Modified: trunk/JSTests/es6/Proxy_internal_get_calls_RegExp.prototype.flags.js (221159 => 221160)


--- trunk/JSTests/es6/Proxy_internal_get_calls_RegExp.prototype.flags.js	2017-08-24 21:03:05 UTC (rev 221159)
+++ trunk/JSTests/es6/Proxy_internal_get_calls_RegExp.prototype.flags.js	2017-08-24 21:14:43 UTC (rev 221160)
@@ -4,7 +4,7 @@
 var get = [];
 var p = new Proxy({}, { get: function(o, k) { get.push(k); return o[k]; }});
 Object.getOwnPropertyDescriptor(RegExp.prototype, 'flags').get.call(p);
-return get + '' === "global,ignoreCase,multiline,unicode,sticky";
+return get + '' === "global,ignoreCase,multiline,dotAll,unicode,sticky";
       
 }
 

Modified: trunk/JSTests/stress/static-getter-in-names.js (221159 => 221160)


--- trunk/JSTests/stress/static-getter-in-names.js	2017-08-24 21:03:05 UTC (rev 221159)
+++ trunk/JSTests/stress/static-getter-in-names.js	2017-08-24 21:14:43 UTC (rev 221160)
@@ -3,5 +3,5 @@
         throw new Error('bad value: ' + actual);
 }
 
-shouldBe(JSON.stringify(Object.getOwnPropertyNames(RegExp.prototype).sort()), '["compile","constructor","exec","flags","global","ignoreCase","multiline","source","sticky","test","toString","unicode"]');
+shouldBe(JSON.stringify(Object.getOwnPropertyNames(RegExp.prototype).sort()), '["compile","constructor","dotAll","exec","flags","global","ignoreCase","multiline","source","sticky","test","toString","unicode"]');
 shouldBe(JSON.stringify(Object.getOwnPropertyNames(/Cocoa/).sort()), '["lastIndex"]');

Modified: trunk/LayoutTests/ChangeLog (221159 => 221160)


--- trunk/LayoutTests/ChangeLog	2017-08-24 21:03:05 UTC (rev 221159)
+++ trunk/LayoutTests/ChangeLog	2017-08-24 21:14:43 UTC (rev 221160)
@@ -1,3 +1,19 @@
+2017-08-24  Michael Saboff  <msab...@apple.com>
+
+        Add support for RegExp "dotAll" flag
+        https://bugs.webkit.org/show_bug.cgi?id=175924
+
+        Reviewed by Keith Miller.
+
+        * js/regexp-dotall-expected.txt: Added.
+        * js/regexp-dotall.html: Added.
+        * js/script-tests/Object-getOwnPropertyNames.js:
+        * js/script-tests/regexp-dotall.js: Added.
+        New tests.
+
+        * js/Object-getOwnPropertyNames-expected.txt:
+        Updated tests for new dotAll ('s' flag) changes.
+
 2017-08-24  Kirill Ovchinnikov  <kirill.ovch...@gmail.com>
 
         HTMLTrackElement behavior violates the standard

Modified: trunk/LayoutTests/js/Object-getOwnPropertyNames-expected.txt (221159 => 221160)


--- trunk/LayoutTests/js/Object-getOwnPropertyNames-expected.txt	2017-08-24 21:03:05 UTC (rev 221159)
+++ trunk/LayoutTests/js/Object-getOwnPropertyNames-expected.txt	2017-08-24 21:14:43 UTC (rev 221160)
@@ -57,7 +57,7 @@
 PASS getSortedOwnPropertyNames(Date) is ['UTC', 'length', 'name', 'now', 'parse', 'prototype']
 PASS getSortedOwnPropertyNames(Date.prototype) is ['constructor', 'getDate', 'getDay', 'getFullYear', 'getHours', 'getMilliseconds', 'getMinutes', 'getMonth', 'getSeconds', 'getTime', 'getTimezoneOffset', 'getUTCDate', 'getUTCDay', 'getUTCFullYear', 'getUTCHours', 'getUTCMilliseconds', 'getUTCMinutes', 'getUTCMonth', 'getUTCSeconds', 'getYear', 'setDate', 'setFullYear', 'setHours', 'setMilliseconds', 'setMinutes', 'setMonth', 'setSeconds', 'setTime', 'setUTCDate', 'setUTCFullYear', 'setUTCHours', 'setUTCMilliseconds', 'setUTCMinutes', 'setUTCMonth', 'setUTCSeconds', 'setYear', 'toDateString', 'toGMTString', 'toISOString', 'toJSON', 'toLocaleDateString', 'toLocaleString', 'toLocaleTimeString', 'toString', 'toTimeString', 'toUTCString', 'valueOf']
 PASS getSortedOwnPropertyNames(RegExp) is ['$&', "$'", '$*', '$+', '$1', '$2', '$3', '$4', '$5', '$6', '$7', '$8', '$9', '$_', '$`', 'input', 'lastMatch', 'lastParen', 'leftContext', 'length', 'multiline', 'name', 'prototype', 'rightContext']
-PASS getSortedOwnPropertyNames(RegExp.prototype) is ['compile', 'constructor', 'exec', 'flags', 'global', 'ignoreCase', 'multiline', 'source', 'sticky', 'test', 'toString', 'unicode']
+PASS getSortedOwnPropertyNames(RegExp.prototype) is ['compile', 'constructor', 'dotAll', 'exec', 'flags', 'global', 'ignoreCase', 'multiline', 'source', 'sticky', 'test', 'toString', 'unicode']
 PASS getSortedOwnPropertyNames(Error) is ['length', 'name', 'prototype', 'stackTraceLimit']
 PASS getSortedOwnPropertyNames(Error.prototype) is ['constructor', 'message', 'name', 'toString']
 PASS getSortedOwnPropertyNames(Math) is ['E','LN10','LN2','LOG10E','LOG2E','PI','SQRT1_2','SQRT2','abs','acos','acosh','asin','asinh','atan','atan2','atanh','cbrt','ceil','clz32','cos','cosh','exp','expm1','floor','fround','hypot','imul','log','log10','log1p','log2','max','min','pow','random','round','sign','sin','sinh','sqrt','tan','tanh','trunc']

Added: trunk/LayoutTests/js/regexp-dotall-expected.txt (0 => 221160)


--- trunk/LayoutTests/js/regexp-dotall-expected.txt	                        (rev 0)
+++ trunk/LayoutTests/js/regexp-dotall-expected.txt	2017-08-24 21:14:43 UTC (rev 221160)
@@ -0,0 +1,79 @@
+Test for processing of RegExp dotAll flag
+
+On success, you will see a series of "PASS" messages, followed by "TEST COMPLETE".
+
+
+PASS "aaXcc".match(/.X./)[0].length is 3
+PASS "aaXcc".match(/.X./s)[0].length is 3
+PASS "aa\nXcc".match(/.X./) is null
+PASS "aa\nXcc".match(/.X./m) is null
+PASS "aa\nX\ncc".match(/.X./s)[0] is "\nX\n"
+PASS "aa\nX\ncc".match(/.X./ms)[0] is "\nX\n"
+PASS "aa\nXcc".match(/.*X/)[0] is "X"
+PASS "aa\nXcc".match(/.*X/m)[0] is "X"
+PASS "aa\nXcc".match(/.*X/s)[0] is "aa\nX"
+PASS "aa\nXcc".match(/.*X/sm)[0] is "aa\nX"
+PASS "aaX\ncc".match(/X.*/)[0] is "X"
+PASS "aaX\ncc".match(/X.*/m)[0] is "X"
+PASS "aaX\ncc".match(/X.*/s)[0] is "X\ncc"
+PASS "aaX\ncc".match(/X.*/sm)[0] is "X\ncc"
+PASS "aa\nX\ncc".match(/.*X.*/)[0] is "X"
+PASS "aa\nX\ncc".match(/.*X.*/m)[0] is "X"
+PASS "aa\nX\ncc".match(/.*X.*/s)[0] is "aa\nX\ncc"
+PASS "aa\nX\ncc".match(/.*X.*/sm)[0] is "aa\nX\ncc"
+PASS "aa\nXcc".match(/.*^X/) is null
+PASS "aa\nXcc".match(/.*^X/m)[0] is "X"
+PASS "aa\nXcc".match(/.*^X/s) is null
+PASS "aa\nXcc".match(/.*^X/sm)[0] is "aa\nX"
+PASS "aaX\ncc".match(/X$.*/) is null
+PASS "aaX\ncc".match(/X$.*/m)[0] is "X"
+PASS "aaX\ncc".match(/X$.*/s) is null
+PASS "aaX\ncc".match(/X$.*/sm)[0] is "X\ncc"
+PASS "aa\nX\ncc".match(/.*^X$.*/) is null
+PASS "aa\nX\ncc".match(/.*^X$.*/m)[0] is "X"
+PASS "aa\nX\ncc".match(/.*^X$.*/s) is null
+PASS "aa\nX\ncc".match(/.*^X$.*/sm)[0] is "aa\nX\ncc"
+PASS "aa\nXcc".match(/^.*X/) is null
+PASS "aa\nXcc".match(/^.*X/m)[0] is "X"
+PASS "aa\nXcc".match(/^.*X/s)[0] is "aa\nX"
+PASS "aa\nXcc".match(/^.*X/sm)[0] is "aa\nX"
+PASS "aaX\ncc".match(/X.*$/) is null
+PASS "aaX\ncc".match(/X.*$/m)[0] is "X"
+PASS "aaX\ncc".match(/X.*$/s)[0] is "X\ncc"
+PASS "aaX\ncc".match(/X.*$/sm)[0] is "X\ncc"
+PASS "aa\nX\ncc".match(/^.*X.*$/) is null
+PASS "aa\nX\ncc".match(/^.*X.*$/m)[0] is "X"
+PASS "aa\nX\ncc".match(/^.*X.*$/s)[0] is "aa\nX\ncc"
+PASS "aa\nX\ncc".match(/^.*X.*$/sm)[0] is "aa\nX\ncc"
+PASS "a\na\nX\nc\nc\n".match(/^.*X.*$/) is null
+PASS "a\na\nX\nc\nc\n".match(/^.*X.*$/m)[0] is "X"
+PASS "a\na\nX\nc\nc\n".match(/^.*X.*$/s)[0] is "a\na\nX\nc\nc\n"
+PASS "a\na\nX\nc\nc\n".match(/^.*X.*$/sm)[0] is "a\na\nX\nc\nc\n"
+PASS "a\na\nX\nc\nc\n".match(/^.*X.*$/) is null
+PASS "a\na\nX\nc\nc\n".match(/^.*X.*$/m)[0] is "X"
+PASS "a\na\nX\nc\nc\n".match(/^.*X.*$/s)[0] is "a\na\nX\nc\nc\n"
+PASS "a\na\nX\nc\nc\n".match(/^.*X.*$/sm)[0] is "a\na\nX\nc\nc\n"
+PASS "\n\n\nX".match(/.{1}X/sm)[0] is "\nX"
+PASS "\n\n\nX".match(/.{1,2}X/sm)[0] is "\n\nX"
+PASS "\n\n\nX".match(/.{1,3}X/sm)[0] is "\n\n\nX"
+PASS "\n\n\nX".match(/.{1,4}X/sm)[0] is "\n\n\nX"
+PASS "\n\n\nX".match(/.{1,2}?X/sm)[0] is "\n\nX"
+PASS "\n\n\nX".match(/.{1,3}?X/sm)[0] is "\n\n\nX"
+PASS "\n\n\nX".match(/.{1,4}?X/sm)[0] is "\n\n\nX"
+PASS "X\n\n\nY".match(/X.{1}/sm)[0] is "X\n"
+PASS "X\n\n\nY".match(/X.{1,2}/sm)[0] is "X\n\n"
+PASS "X\n\n\nY".match(/X.{1,3}/sm)[0] is "X\n\n\n"
+PASS "X\n\n\nY".match(/X.{1,4}/sm)[0] is "X\n\n\nY"
+PASS "X\n\n\nY".match(/X.{1,2}?/sm)[0] is "X\n"
+PASS "X\n\n\nY".match(/X.{1,3}?/sm)[0] is "X\n"
+PASS "X\n\n\nY".match(/X.{1,4}?/sm)[0] is "X\n"
+PASS "The\nquick\nbrown\nfox\njumped.".match(/.*brown.*/)[0] is "brown"
+PASS "The\nquick\nbrown\nfox\njumped.".match(/.*brown.*/s)[0] is "The\nquick\nbrown\nfox\njumped."
+PASS "The\nquick\nbrown\nfox\njumped.".match(/The.quick.brown.fox.jumped./) is null
+PASS "The\nquick\nbrown\nfox\njumped.".match(/The.quick.brown.fox.jumped./s)[0] is "The\nquick\nbrown\nfox\njumped."
+PASS /a/.dotAll is false
+PASS /a/s.dotAll is true
+PASS successfullyParsed is true
+
+TEST COMPLETE
+

Added: trunk/LayoutTests/js/regexp-dotall.html (0 => 221160)


--- trunk/LayoutTests/js/regexp-dotall.html	                        (rev 0)
+++ trunk/LayoutTests/js/regexp-dotall.html	2017-08-24 21:14:43 UTC (rev 221160)
@@ -0,0 +1,10 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+<head>
+<script src=""
+</head>
+<body>
+<script src=""
+<script src=""
+</body>
+</html>

Modified: trunk/LayoutTests/js/script-tests/Object-getOwnPropertyNames.js (221159 => 221160)


--- trunk/LayoutTests/js/script-tests/Object-getOwnPropertyNames.js	2017-08-24 21:03:05 UTC (rev 221159)
+++ trunk/LayoutTests/js/script-tests/Object-getOwnPropertyNames.js	2017-08-24 21:14:43 UTC (rev 221160)
@@ -66,7 +66,7 @@
     "Date": "['UTC', 'length', 'name', 'now', 'parse', 'prototype']",
     "Date.prototype": "['constructor', 'getDate', 'getDay', 'getFullYear', 'getHours', 'getMilliseconds', 'getMinutes', 'getMonth', 'getSeconds', 'getTime', 'getTimezoneOffset', 'getUTCDate', 'getUTCDay', 'getUTCFullYear', 'getUTCHours', 'getUTCMilliseconds', 'getUTCMinutes', 'getUTCMonth', 'getUTCSeconds', 'getYear', 'setDate', 'setFullYear', 'setHours', 'setMilliseconds', 'setMinutes', 'setMonth', 'setSeconds', 'setTime', 'setUTCDate', 'setUTCFullYear', 'setUTCHours', 'setUTCMilliseconds', 'setUTCMinutes', 'setUTCMonth', 'setUTCSeconds', 'setYear', 'toDateString', 'toGMTString', 'toISOString', 'toJSON', 'toLocaleDateString', 'toLocaleString', 'toLocaleTimeString', 'toString', 'toTimeString', 'toUTCString', 'valueOf']",
     "RegExp": "['$&', \"$'\", '$*', '$+', '$1', '$2', '$3', '$4', '$5', '$6', '$7', '$8', '$9', '$_', '$`', 'input', 'lastMatch', 'lastParen', 'leftContext', 'length', 'multiline', 'name', 'prototype', 'rightContext']",
-    "RegExp.prototype": "['compile', 'constructor', 'exec', 'flags', 'global', 'ignoreCase', 'multiline', 'source', 'sticky', 'test', 'toString', 'unicode']",
+    "RegExp.prototype": "['compile', 'constructor', 'dotAll', 'exec', 'flags', 'global', 'ignoreCase', 'multiline', 'source', 'sticky', 'test', 'toString', 'unicode']",
     "Error": "['length', 'name', 'prototype', 'stackTraceLimit']",
     "Error.prototype": "['constructor', 'message', 'name', 'toString']",
     "Math": "['E','LN10','LN2','LOG10E','LOG2E','PI','SQRT1_2','SQRT2','abs','acos','acosh','asin','asinh','atan','atan2','atanh','cbrt','ceil','clz32','cos','cosh','exp','expm1','floor','fround','hypot','imul','log','log10','log1p','log2','max','min','pow','random','round','sign','sin','sinh','sqrt','tan','tanh','trunc']",

Added: trunk/LayoutTests/js/script-tests/regexp-dotall.js (0 => 221160)


--- trunk/LayoutTests/js/script-tests/regexp-dotall.js	                        (rev 0)
+++ trunk/LayoutTests/js/script-tests/regexp-dotall.js	2017-08-24 21:14:43 UTC (rev 221160)
@@ -0,0 +1,77 @@
+description(
+'Test for processing of RegExp dotAll flag'
+);
+
+// Check dotAll matching operation
+shouldBe('"aaXcc".match(/.X./)[0].length', '3');
+shouldBe('"aaXcc".match(/.X./s)[0].length', '3');
+shouldBeNull('"aa\\nXcc".match(/.X./)');
+shouldBeNull('"aa\\nXcc".match(/.X./m)');
+shouldBe('"aa\\nX\\ncc".match(/.X./s)[0]', '"\\nX\\n"');
+shouldBe('"aa\\nX\\ncc".match(/.X./ms)[0]', '"\\nX\\n"');
+shouldBe('"aa\\nXcc".match(/.*X/)[0]', '"X"');
+shouldBe('"aa\\nXcc".match(/.*X/m)[0]', '"X"');
+shouldBe('"aa\\nXcc".match(/.*X/s)[0]', '"aa\\nX"');
+shouldBe('"aa\\nXcc".match(/.*X/sm)[0]', '"aa\\nX"');
+shouldBe('"aaX\\ncc".match(/X.*/)[0]', '"X"');
+shouldBe('"aaX\\ncc".match(/X.*/m)[0]', '"X"');
+shouldBe('"aaX\\ncc".match(/X.*/s)[0]', '"X\\ncc"');
+shouldBe('"aaX\\ncc".match(/X.*/sm)[0]', '"X\\ncc"');
+shouldBe('"aa\\nX\\ncc".match(/.*X.*/)[0]', '"X"');
+shouldBe('"aa\\nX\\ncc".match(/.*X.*/m)[0]', '"X"');
+shouldBe('"aa\\nX\\ncc".match(/.*X.*/s)[0]', '"aa\\nX\\ncc"');
+shouldBe('"aa\\nX\\ncc".match(/.*X.*/sm)[0]', '"aa\\nX\\ncc"');
+shouldBeNull('"aa\\nXcc".match(/.*^X/)');
+shouldBe('"aa\\nXcc".match(/.*^X/m)[0]', '"X"');
+shouldBeNull('"aa\\nXcc".match(/.*^X/s)', '"aa\\nX"');
+shouldBe('"aa\\nXcc".match(/.*^X/sm)[0]', '"aa\\nX"');
+shouldBeNull('"aaX\\ncc".match(/X$.*/)');
+shouldBe('"aaX\\ncc".match(/X$.*/m)[0]', '"X"');
+shouldBeNull('"aaX\\ncc".match(/X$.*/s)');
+shouldBe('"aaX\\ncc".match(/X$.*/sm)[0]', '"X\\ncc"');
+shouldBeNull('"aa\\nX\\ncc".match(/.*^X$.*/)');
+shouldBe('"aa\\nX\\ncc".match(/.*^X$.*/m)[0]', '"X"');
+shouldBeNull('"aa\\nX\\ncc".match(/.*^X$.*/s)');
+shouldBe('"aa\\nX\\ncc".match(/.*^X$.*/sm)[0]', '"aa\\nX\\ncc"');
+shouldBeNull('"aa\\nXcc".match(/^.*X/)');
+shouldBe('"aa\\nXcc".match(/^.*X/m)[0]', '"X"');
+shouldBe('"aa\\nXcc".match(/^.*X/s)[0]', '"aa\\nX"');
+shouldBe('"aa\\nXcc".match(/^.*X/sm)[0]', '"aa\\nX"');
+shouldBeNull('"aaX\\ncc".match(/X.*$/)');
+shouldBe('"aaX\\ncc".match(/X.*$/m)[0]', '"X"');
+shouldBe('"aaX\\ncc".match(/X.*$/s)[0]', '"X\\ncc"');
+shouldBe('"aaX\\ncc".match(/X.*$/sm)[0]', '"X\\ncc"');
+shouldBeNull('"aa\\nX\\ncc".match(/^.*X.*$/)');
+shouldBe('"aa\\nX\\ncc".match(/^.*X.*$/m)[0]', '"X"');
+shouldBe('"aa\\nX\\ncc".match(/^.*X.*$/s)[0]', '"aa\\nX\\ncc"');
+shouldBe('"aa\\nX\\ncc".match(/^.*X.*$/sm)[0]', '"aa\\nX\\ncc"');
+shouldBeNull('"a\\na\\nX\\nc\\nc\\n".match(/^.*X.*$/)');
+shouldBe('"a\\na\\nX\\nc\\nc\\n".match(/^.*X.*$/m)[0]', '"X"');
+shouldBe('"a\\na\\nX\\nc\\nc\\n".match(/^.*X.*$/s)[0]', '"a\\na\\nX\\nc\\nc\\n"');
+shouldBe('"a\\na\\nX\\nc\\nc\\n".match(/^.*X.*$/sm)[0]', '"a\\na\\nX\\nc\\nc\\n"');
+shouldBeNull('"a\\na\\nX\\nc\\nc\\n".match(/^.*X.*$/)');
+shouldBe('"a\\na\\nX\\nc\\nc\\n".match(/^.*X.*$/m)[0]', '"X"');
+shouldBe('"a\\na\\nX\\nc\\nc\\n".match(/^.*X.*$/s)[0]', '"a\\na\\nX\\nc\\nc\\n"');
+shouldBe('"a\\na\\nX\\nc\\nc\\n".match(/^.*X.*$/sm)[0]', '"a\\na\\nX\\nc\\nc\\n"');
+shouldBe('"\\n\\n\\nX".match(/.{1}X/sm)[0]', '"\\nX"');
+shouldBe('"\\n\\n\\nX".match(/.{1,2}X/sm)[0]', '"\\n\\nX"');
+shouldBe('"\\n\\n\\nX".match(/.{1,3}X/sm)[0]', '"\\n\\n\\nX"');
+shouldBe('"\\n\\n\\nX".match(/.{1,4}X/sm)[0]', '"\\n\\n\\nX"');
+shouldBe('"\\n\\n\\nX".match(/.{1,2}?X/sm)[0]', '"\\n\\nX"');
+shouldBe('"\\n\\n\\nX".match(/.{1,3}?X/sm)[0]', '"\\n\\n\\nX"');
+shouldBe('"\\n\\n\\nX".match(/.{1,4}?X/sm)[0]', '"\\n\\n\\nX"');
+shouldBe('"X\\n\\n\\nY".match(/X.{1}/sm)[0]', '"X\\n"');
+shouldBe('"X\\n\\n\\nY".match(/X.{1,2}/sm)[0]', '"X\\n\\n"');
+shouldBe('"X\\n\\n\\nY".match(/X.{1,3}/sm)[0]', '"X\\n\\n\\n"');
+shouldBe('"X\\n\\n\\nY".match(/X.{1,4}/sm)[0]', '"X\\n\\n\\nY"');
+shouldBe('"X\\n\\n\\nY".match(/X.{1,2}?/sm)[0]', '"X\\n"');
+shouldBe('"X\\n\\n\\nY".match(/X.{1,3}?/sm)[0]', '"X\\n"');
+shouldBe('"X\\n\\n\\nY".match(/X.{1,4}?/sm)[0]', '"X\\n"');
+shouldBe('"The\\nquick\\nbrown\\nfox\\njumped.".match(/.*brown.*/)[0]', '"brown"');
+shouldBe('"The\\nquick\\nbrown\\nfox\\njumped.".match(/.*brown.*/s)[0]', '"The\\nquick\\nbrown\\nfox\\njumped."');
+shouldBeNull('"The\\nquick\\nbrown\\nfox\\njumped.".match(/The.quick.brown.fox.jumped./)');
+shouldBe('"The\\nquick\\nbrown\\nfox\\njumped.".match(/The.quick.brown.fox.jumped./s)[0]', '"The\\nquick\\nbrown\\nfox\\njumped."');
+
+// Check that the dotAll flag getter works as expected
+shouldBeFalse('/a/.dotAll');
+shouldBeTrue('/a/s.dotAll');

Modified: trunk/Source/JavaScriptCore/ChangeLog (221159 => 221160)


--- trunk/Source/JavaScriptCore/ChangeLog	2017-08-24 21:03:05 UTC (rev 221159)
+++ trunk/Source/JavaScriptCore/ChangeLog	2017-08-24 21:14:43 UTC (rev 221160)
@@ -1,3 +1,65 @@
+2017-08-24  Michael Saboff  <msab...@apple.com>
+
+        Add support for RegExp "dotAll" flag
+        https://bugs.webkit.org/show_bug.cgi?id=175924
+
+        Reviewed by Keith Miller.
+
+        The dotAll RegExp flag, 's', changes . to match any character including line terminators.
+        Added a the "dotAll" identifier as well as RegExp.prototype.dotAll getter.
+        Added a new any character CharacterClass that is used to match . terms in a dotAll flags
+        RegExp.  In the YARR pattern and parsing code, changed the NewlineClassID, which was only
+        used for '.' processing, to DotClassID.  The selection of which builtin character class
+        that DotClassID resolves to when generating the pattern is conditional on the dotAll flag.
+        This NewlineClassID to DotClassID refactoring includes the atomBuiltInCharacterClass() in
+        the WebCore content extensions code in the PatternParser class.
+
+        As an optimization, the Yarr JIT actually doesn't perform match checks against the builtin
+        any character CharacterClass, it merely reads the character.  There is another optimization
+        in our DotStart enclosure processing where a non-capturing regular expression in the form
+        of .*<expression>.*, with options beginning ^ and/or trailing $, match the contained
+        expression and then look for the extents of the surrounding .*'s.  When used with the
+        dotAll flag, that processing alwys results with the beinning of the string and the end
+        of the string.  Therefore we short circuit the finding the beginning and end of the line
+        or string with dotAll patterns.
+
+        * bytecode/BytecodeDumper.cpp:
+        (JSC::regexpToSourceString):
+        * runtime/CommonIdentifiers.h:
+        * runtime/RegExp.cpp:
+        (JSC::regExpFlags):
+        (JSC::RegExpFunctionalTestCollector::outputOneTest):
+        * runtime/RegExp.h:
+        * runtime/RegExpKey.h:
+        * runtime/RegExpPrototype.cpp:
+        (JSC::RegExpPrototype::finishCreation):
+        (JSC::flagsString):
+        (JSC::regExpProtoGetterDotAll):
+        * yarr/YarrInterpreter.cpp:
+        (JSC::Yarr::Interpreter::matchDotStarEnclosure):
+        * yarr/YarrInterpreter.h:
+        (JSC::Yarr::BytecodePattern::dotAll const):
+        * yarr/YarrJIT.cpp:
+        (JSC::Yarr::YarrGenerator::optimizeAlternative):
+        (JSC::Yarr::YarrGenerator::generateCharacterClassOnce):
+        (JSC::Yarr::YarrGenerator::generateCharacterClassFixed):
+        (JSC::Yarr::YarrGenerator::generateCharacterClassGreedy):
+        (JSC::Yarr::YarrGenerator::backtrackCharacterClassNonGreedy):
+        (JSC::Yarr::YarrGenerator::generateDotStarEnclosure):
+        * yarr/YarrParser.h:
+        (JSC::Yarr::Parser::parseTokens):
+        * yarr/YarrPattern.cpp:
+        (JSC::Yarr::YarrPatternConstructor::atomBuiltInCharacterClass):
+        (JSC::Yarr::YarrPatternConstructor::atomCharacterClassBuiltIn):
+        (JSC::Yarr::YarrPatternConstructor::optimizeDotStarWrappedExpressions):
+        (JSC::Yarr::YarrPattern::YarrPattern):
+        (JSC::Yarr::PatternTerm::dump):
+        (JSC::Yarr::anycharCreate):
+        * yarr/YarrPattern.h:
+        (JSC::Yarr::YarrPattern::reset):
+        (JSC::Yarr::YarrPattern::anyCharacterClass):
+        (JSC::Yarr::YarrPattern::dotAll const):
+
 2017-08-23  Filip Pizlo  <fpi...@apple.com>
 
         Reduce Gigacage sizes

Modified: trunk/Source/JavaScriptCore/bytecode/BytecodeDumper.cpp (221159 => 221160)


--- trunk/Source/JavaScriptCore/bytecode/BytecodeDumper.cpp	2017-08-24 21:03:05 UTC (rev 221159)
+++ trunk/Source/JavaScriptCore/bytecode/BytecodeDumper.cpp	2017-08-24 21:14:43 UTC (rev 221160)
@@ -252,7 +252,7 @@
 
 static CString regexpToSourceString(RegExp* regExp)
 {
-    char postfix[5] = { '/', 0, 0, 0, 0 };
+    char postfix[7] = { '/', 0, 0, 0, 0, 0, 0 };
     int index = 1;
     if (regExp->global())
         postfix[index++] = 'g';
@@ -260,10 +260,12 @@
         postfix[index++] = 'i';
     if (regExp->multiline())
         postfix[index] = 'm';
+    if (regExp->dotAll())
+        postfix[index++] = 's';
+    if (regExp->unicode())
+        postfix[index++] = 'u';
     if (regExp->sticky())
         postfix[index++] = 'y';
-    if (regExp->unicode())
-        postfix[index++] = 'u';
 
     return toCString("/", regExp->pattern().impl(), postfix);
 }

Modified: trunk/Source/JavaScriptCore/runtime/CommonIdentifiers.h (221159 => 221160)


--- trunk/Source/JavaScriptCore/runtime/CommonIdentifiers.h	2017-08-24 21:03:05 UTC (rev 221159)
+++ trunk/Source/JavaScriptCore/runtime/CommonIdentifiers.h	2017-08-24 21:14:43 UTC (rev 221160)
@@ -221,6 +221,7 @@
     macro(displayName) \
     macro(document) \
     macro(done) \
+    macro(dotAll) \
     macro(enumerable) \
     macro(era) \
     macro(eval) \

Modified: trunk/Source/JavaScriptCore/runtime/RegExp.cpp (221159 => 221160)


--- trunk/Source/JavaScriptCore/runtime/RegExp.cpp	2017-08-24 21:03:05 UTC (rev 221159)
+++ trunk/Source/JavaScriptCore/runtime/RegExp.cpp	2017-08-24 21:14:43 UTC (rev 221160)
@@ -59,6 +59,12 @@
             flags = static_cast<RegExpFlags>(flags | FlagMultiline);
             break;
 
+        case 's':
+            if (flags & FlagDotAll)
+                return InvalidFlags;
+            flags = static_cast<RegExpFlags>(flags | FlagDotAll);
+            break;
+            
         case 'u':
             if (flags & FlagUnicode)
                 return InvalidFlags;
@@ -104,10 +110,12 @@
             fputc('i', m_file);
         if (regExp->multiline())
             fputc('m', m_file);
+        if (regExp->dotAll())
+            fputc('s', m_file);
+        if (regExp->unicode())
+            fputc('u', m_file);
         if (regExp->sticky())
             fputc('y', m_file);
-        if (regExp->unicode())
-            fputc('u', m_file);
         fprintf(m_file, "\n");
     }
 

Modified: trunk/Source/JavaScriptCore/runtime/RegExp.h (221159 => 221160)


--- trunk/Source/JavaScriptCore/runtime/RegExp.h	2017-08-24 21:03:05 UTC (rev 221159)
+++ trunk/Source/JavaScriptCore/runtime/RegExp.h	2017-08-24 21:14:43 UTC (rev 221160)
@@ -56,6 +56,7 @@
     bool sticky() const { return m_flags & FlagSticky; }
     bool globalOrSticky() const { return global() || sticky(); }
     bool unicode() const { return m_flags & FlagUnicode; }
+    bool dotAll() const { return m_flags & FlagDotAll; }
 
     const String& pattern() const { return m_patternString; }
 

Modified: trunk/Source/JavaScriptCore/runtime/RegExpKey.h (221159 => 221160)


--- trunk/Source/JavaScriptCore/runtime/RegExpKey.h	2017-08-24 21:03:05 UTC (rev 221159)
+++ trunk/Source/JavaScriptCore/runtime/RegExpKey.h	2017-08-24 21:14:43 UTC (rev 221160)
@@ -39,7 +39,8 @@
     FlagMultiline = 4,
     FlagSticky = 8,
     FlagUnicode = 16,
-    InvalidFlags = 32,
+    FlagDotAll = 32,
+    InvalidFlags = 64,
     DeletedValueFlags = -1
 };
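
The flag bits in this enum and the 's' case added to regExpFlags() can be modeled with a short JavaScript sketch. It is only an approximation of the C++ logic: parseFlags and the Flag table are hypothetical names, the values for Multiline through Invalid follow the enum in this hunk, and Global = 1 and IgnoreCase = 2 are assumed because the hunk does not show them.

```javascript
// Bit values; Global and IgnoreCase are assumptions, the rest mirror the enum.
const Flag = {
    Global: 1, IgnoreCase: 2, Multiline: 4, Sticky: 8,
    Unicode: 16, DotAll: 32, Invalid: 64,
};

// Flag character to bit mapping.
const flagForChar = new Map([
    ["g", Flag.Global], ["i", Flag.IgnoreCase], ["m", Flag.Multiline],
    ["s", Flag.DotAll], ["u", Flag.Unicode], ["y", Flag.Sticky],
]);

// Returns the combined bits, or Flag.Invalid for an unknown or repeated flag.
function parseFlags(str) {
    let flags = 0;
    for (const ch of str) {
        const bit = flagForChar.get(ch);
        if (bit === undefined || (flags & bit))
            return Flag.Invalid;
        flags |= bit;
    }
    return flags;
}

console.log(parseFlags("s"));    // 32
console.log(parseFlags("gims")); // 39
console.log(parseFlags("ss"));   // 64
```

Moving InvalidFlags from 32 to 64 is what frees bit 32 for FlagDotAll without colliding with the error value.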
 

Modified: trunk/Source/JavaScriptCore/runtime/RegExpPrototype.cpp (221159 => 221160)


--- trunk/Source/JavaScriptCore/runtime/RegExpPrototype.cpp	2017-08-24 21:03:05 UTC (rev 221159)
+++ trunk/Source/JavaScriptCore/runtime/RegExpPrototype.cpp	2017-08-24 21:14:43 UTC (rev 221160)
@@ -50,6 +50,7 @@
 static EncodedJSValue JSC_HOST_CALL regExpProtoGetterGlobal(ExecState*);
 static EncodedJSValue JSC_HOST_CALL regExpProtoGetterIgnoreCase(ExecState*);
 static EncodedJSValue JSC_HOST_CALL regExpProtoGetterMultiline(ExecState*);
+static EncodedJSValue JSC_HOST_CALL regExpProtoGetterDotAll(ExecState*);
 static EncodedJSValue JSC_HOST_CALL regExpProtoGetterSticky(ExecState*);
 static EncodedJSValue JSC_HOST_CALL regExpProtoGetterUnicode(ExecState*);
 static EncodedJSValue JSC_HOST_CALL regExpProtoGetterSource(ExecState*);
@@ -70,6 +71,7 @@
     JSC_NATIVE_INTRINSIC_FUNCTION_WITHOUT_TRANSITION(vm.propertyNames->exec, regExpProtoFuncExec, DontEnum, 1, RegExpExecIntrinsic);
     JSC_NATIVE_FUNCTION_WITHOUT_TRANSITION(vm.propertyNames->toString, regExpProtoFuncToString, DontEnum, 0);
     JSC_NATIVE_GETTER(vm.propertyNames->global, regExpProtoGetterGlobal, DontEnum | Accessor);
+    JSC_NATIVE_GETTER(vm.propertyNames->dotAll, regExpProtoGetterDotAll, DontEnum | Accessor);
     JSC_NATIVE_GETTER(vm.propertyNames->ignoreCase, regExpProtoGetterIgnoreCase, DontEnum | Accessor);
     JSC_NATIVE_GETTER(vm.propertyNames->multiline, regExpProtoGetterMultiline, DontEnum | Accessor);
     JSC_NATIVE_GETTER(vm.propertyNames->sticky, regExpProtoGetterSticky, DontEnum | Accessor);
@@ -210,6 +212,8 @@
     RETURN_IF_EXCEPTION(scope, string);
     JSValue multilineValue = regexp->get(exec, vm.propertyNames->multiline);
     RETURN_IF_EXCEPTION(scope, string);
+    JSValue dotAllValue = regexp->get(exec, vm.propertyNames->dotAll);
+    RETURN_IF_EXCEPTION(scope, string);
     JSValue unicodeValue = regexp->get(exec, vm.propertyNames->unicode);
     RETURN_IF_EXCEPTION(scope, string);
     JSValue stickyValue = regexp->get(exec, vm.propertyNames->sticky);
@@ -222,6 +226,8 @@
         string[index++] = 'i';
     if (multilineValue.toBoolean(exec))
         string[index++] = 'm';
+    if (dotAllValue.toBoolean(exec))
+        string[index++] = 's';
     if (unicodeValue.toBoolean(exec))
         string[index++] = 'u';
     if (stickyValue.toBoolean(exec))
@@ -306,6 +312,21 @@
     return JSValue::encode(jsBoolean(asRegExpObject(thisValue)->regExp()->multiline()));
 }
 
+EncodedJSValue JSC_HOST_CALL regExpProtoGetterDotAll(ExecState* exec)
+{
+    VM& vm = exec->vm();
+    auto scope = DECLARE_THROW_SCOPE(vm);
+    
+    JSValue thisValue = exec->thisValue();
+    if (UNLIKELY(!thisValue.inherits(vm, RegExpObject::info()))) {
+        if (thisValue.inherits(vm, RegExpPrototype::info()))
+            return JSValue::encode(jsUndefined());
+        return throwVMTypeError(exec, scope, ASCIILiteral("The RegExp.prototype.dotAll getter can only be called on a RegExp object"));
+    }
+    
+    return JSValue::encode(jsBoolean(asRegExpObject(thisValue)->regExp()->dotAll()));
+}
+    
 EncodedJSValue JSC_HOST_CALL regExpProtoGetterSticky(ExecState* exec)
 {
     VM& vm = exec->vm();

Modified: trunk/Source/JavaScriptCore/yarr/YarrInterpreter.cpp (221159 => 221160)


--- trunk/Source/JavaScriptCore/yarr/YarrInterpreter.cpp	2017-08-24 21:03:05 UTC (rev 221159)
+++ trunk/Source/JavaScriptCore/yarr/YarrInterpreter.cpp	2017-08-24 21:14:43 UTC (rev 221160)
@@ -1120,6 +1120,13 @@
     bool matchDotStarEnclosure(ByteTerm& term, DisjunctionContext* context)
     {
         UNUSED_PARAM(term);
+
+        if (pattern->dotAll()) {
+            context->matchBegin = startOffset;
+            context->matchEnd = input.end();
+            return true;
+        }
+
         unsigned matchBegin = context->matchBegin;
 
         if (matchBegin > startOffset) {

Modified: trunk/Source/JavaScriptCore/yarr/YarrInterpreter.h (221159 => 221160)


--- trunk/Source/JavaScriptCore/yarr/YarrInterpreter.h	2017-08-24 21:03:05 UTC (rev 221159)
+++ trunk/Source/JavaScriptCore/yarr/YarrInterpreter.h	2017-08-24 21:14:43 UTC (rev 221160)
@@ -371,6 +371,7 @@
     bool multiline() const { return m_flags & FlagMultiline; }
     bool sticky() const { return m_flags & FlagSticky; }
     bool unicode() const { return m_flags & FlagUnicode; }
+    bool dotAll() const { return m_flags & FlagDotAll; }
 
     std::unique_ptr<ByteDisjunction> m_body;
     RegExpFlags m_flags;

Modified: trunk/Source/JavaScriptCore/yarr/YarrJIT.cpp (221159 => 221160)


--- trunk/Source/JavaScriptCore/yarr/YarrJIT.cpp	2017-08-24 21:03:05 UTC (rev 221159)
+++ trunk/Source/JavaScriptCore/yarr/YarrJIT.cpp	2017-08-24 21:14:43 UTC (rev 221160)
@@ -1169,15 +1169,18 @@
 
         JumpList matchDest;
         readCharacter(m_checkedOffset - term->inputPosition, character);
-        matchCharacterClass(character, matchDest, term->characterClass);
+        // If we are matching the "any character" builtin class we only need to read the
+        // character and don't need to match as it will always succeed.
+        if (term->invert() || term->characterClass != m_pattern.anyCharacterClass()) {
+            matchCharacterClass(character, matchDest, term->characterClass);
 
-        if (term->invert())
-            op.m_jumps.append(matchDest);
-        else {
-            op.m_jumps.append(jump());
-            matchDest.link(this);
+            if (term->invert())
+                op.m_jumps.append(matchDest);
+            else {
+                op.m_jumps.append(jump());
+                matchDest.link(this);
+            }
         }
-
 #ifdef JIT_UNICODE_EXPRESSIONS
         if (m_decodeSurrogatePairs) {
             Jump isBMPChar = branch32(LessThan, character, supplementaryPlanesBase);
@@ -1215,13 +1218,17 @@
         Label loop(this);
         JumpList matchDest;
         readCharacter(m_checkedOffset - term->inputPosition - term->quantityMaxCount, character, countRegister);
-        matchCharacterClass(character, matchDest, term->characterClass);
+        // If we are matching the "any character" builtin class we only need to read the
+        // character and don't need to match as it will always succeed.
+        if (term->invert() || term->characterClass != m_pattern.anyCharacterClass()) {
+            matchCharacterClass(character, matchDest, term->characterClass);
 
-        if (term->invert())
-            op.m_jumps.append(matchDest);
-        else {
-            op.m_jumps.append(jump());
-            matchDest.link(this);
+            if (term->invert())
+                op.m_jumps.append(matchDest);
+            else {
+                op.m_jumps.append(jump());
+                matchDest.link(this);
+            }
         }
 
         add32(TrustedImm32(1), countRegister);
@@ -1263,8 +1270,12 @@
         } else {
             JumpList matchDest;
             readCharacter(m_checkedOffset - term->inputPosition, character);
-            matchCharacterClass(character, matchDest, term->characterClass);
-            failures.append(jump());
+            // If we are matching the "any character" builtin class we only need to read the
+            // character and don't need to match as it will always succeed.
+            if (term->characterClass != m_pattern.anyCharacterClass()) {
+                matchCharacterClass(character, matchDest, term->characterClass);
+                failures.append(jump());
+            }
             matchDest.link(this);
         }
 
@@ -1365,13 +1376,17 @@
 
         JumpList matchDest;
         readCharacter(m_checkedOffset - term->inputPosition, character);
-        matchCharacterClass(character, matchDest, term->characterClass);
+        // If we are matching the "any character" builtin class we only need to read the
+        // character and don't need to match as it will always succeed.
+        if (term->invert() || term->characterClass != m_pattern.anyCharacterClass()) {
+            matchCharacterClass(character, matchDest, term->characterClass);
 
-        if (term->invert())
-            nonGreedyFailures.append(matchDest);
-        else {
-            nonGreedyFailures.append(jump());
-            matchDest.link(this);
+            if (term->invert())
+                nonGreedyFailures.append(matchDest);
+            else {
+                nonGreedyFailures.append(jump());
+                matchDest.link(this);
+            }
         }
 
         add32(TrustedImm32(1), index);
@@ -1407,6 +1422,13 @@
         JumpList saveStartIndex;
         JumpList foundEndingNewLine;
 
+        if (m_pattern.dotAll()) {
+            move(TrustedImm32(0), matchPos);
+            setMatchStart(matchPos);
+            move(length, index);
+            return;
+        }
+
         ASSERT(!m_pattern.m_body->m_hasFixedSize);
         getMatchStart(matchPos);
 

Modified: trunk/Source/JavaScriptCore/yarr/YarrParser.h (221159 => 221160)


--- trunk/Source/JavaScriptCore/yarr/YarrParser.h	2017-08-24 21:03:05 UTC (rev 221159)
+++ trunk/Source/JavaScriptCore/yarr/YarrParser.h	2017-08-24 21:14:43 UTC (rev 221160)
@@ -36,7 +36,7 @@
     DigitClassID,
     SpaceClassID,
     WordClassID,
-    NewlineClassID,
+    DotClassID,
 };
 
 // The Parser class should not be used directly - only via the Yarr::parse() method.
@@ -694,7 +694,7 @@
 
             case '.':
                 consume();
-                m_delegate.atomBuiltInCharacterClass(NewlineClassID, true);
+                m_delegate.atomBuiltInCharacterClass(DotClassID, false);
                 lastTokenWasAnAtom = true;
                 break;
 

Modified: trunk/Source/JavaScriptCore/yarr/YarrPattern.cpp (221159 => 221160)


--- trunk/Source/JavaScriptCore/yarr/YarrPattern.cpp	2017-08-24 21:03:05 UTC (rev 221159)
+++ trunk/Source/JavaScriptCore/yarr/YarrPattern.cpp	2017-08-24 21:14:43 UTC (rev 221160)
@@ -373,8 +373,12 @@
             else
                 m_alternative->m_terms.append(PatternTerm(m_pattern.wordcharCharacterClass(), invert));
             break;
-        case NewlineClassID:
-            m_alternative->m_terms.append(PatternTerm(m_pattern.newlineCharacterClass(), invert));
+        case DotClassID:
+            ASSERT(!invert);
+            if (m_pattern.dotAll())
+                m_alternative->m_terms.append(PatternTerm(m_pattern.anyCharacterClass(), false));
+            else
+                m_alternative->m_terms.append(PatternTerm(m_pattern.newlineCharacterClass(), true));
             break;
         }
     }
@@ -396,7 +400,7 @@
 
     void atomCharacterClassBuiltIn(BuiltInCharacterClassID classID, bool invert)
     {
-        ASSERT(classID != NewlineClassID);
+        ASSERT(classID != DotClassID);
 
         switch (classID) {
         case DigitClassID:
@@ -849,6 +853,7 @@
         if (alternatives.size() != 1)
             return;
 
+        CharacterClass* dotCharacterClass = m_pattern.dotAll() ? m_pattern.anyCharacterClass() : m_pattern.newlineCharacterClass();
         PatternAlternative* alternative = alternatives[0].get();
         Vector<PatternTerm>& terms = alternative->m_terms;
         if (terms.size() >= 3) {
@@ -863,7 +868,10 @@
             }
             
             PatternTerm& firstNonAnchorTerm = terms[termIndex];
-            if ((firstNonAnchorTerm.type != PatternTerm::TypeCharacterClass) || (firstNonAnchorTerm.characterClass != m_pattern.newlineCharacterClass()) || !((firstNonAnchorTerm.quantityType == QuantifierGreedy) || (firstNonAnchorTerm.quantityType == QuantifierNonGreedy)))
+            if ((firstNonAnchorTerm.type != PatternTerm::TypeCharacterClass)
+                || (firstNonAnchorTerm.characterClass != dotCharacterClass)
+                || !((firstNonAnchorTerm.quantityType == QuantifierGreedy)
+                    || (firstNonAnchorTerm.quantityType == QuantifierNonGreedy)))
                 return;
             
             firstExpressionTerm = termIndex + 1;
@@ -875,7 +883,9 @@
             }
             
             PatternTerm& lastNonAnchorTerm = terms[termIndex];
-            if ((lastNonAnchorTerm.type != PatternTerm::TypeCharacterClass) || (lastNonAnchorTerm.characterClass != m_pattern.newlineCharacterClass()) || (lastNonAnchorTerm.quantityType != QuantifierGreedy))
+            if ((lastNonAnchorTerm.type != PatternTerm::TypeCharacterClass)
+                || (lastNonAnchorTerm.characterClass != dotCharacterClass)
+                || (lastNonAnchorTerm.quantityType != QuantifierGreedy))
                 return;
 
             size_t endIndex = termIndex;
@@ -994,6 +1004,7 @@
     , m_flags(flags)
     , m_numSubpatterns(0)
     , m_maxBackReference(0)
+    , anycharCached(0)
     , newlineCached(0)
     , digitsCached(0)
     , spacesCached(0)
@@ -1089,7 +1100,9 @@
         break;
     case TypeCharacterClass:
         out.print("character class ");
-        if (characterClass == thisPattern->newlineCharacterClass())
+        if (characterClass == thisPattern->anyCharacterClass())
+            out.print("<any character>");
+        else if (characterClass == thisPattern->newlineCharacterClass())
             out.print("<newline>");
         else if (characterClass == thisPattern->digitsCharacterClass())
             out.print("<digits>");
@@ -1284,4 +1297,13 @@
     m_body->dump(out, this);
 }
 
+std::unique_ptr<CharacterClass> anycharCreate()
+{
+    auto characterClass = std::make_unique<CharacterClass>();
+    characterClass->m_ranges.append(CharacterRange(0x00, 0x7f));
+    characterClass->m_rangesUnicode.append(CharacterRange(0x0080, 0x10ffff));
+    characterClass->m_hasNonBMPCharacters = true;
+    return characterClass;
+}
+
 } }
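
Editorial note on the YarrPattern.cpp changes above: '.' now resolves to one of two builtin classes depending on the dotAll flag, either the non-inverted "any character" class or the inverted newline class. The sketch below models that selection in standalone C++ under stated assumptions. SimpleCharacterClass, CodePointRange, makeAnyCharacterClass, makeLineTerminatorClass, and dotMatches are hypothetical names, the real CharacterClass keeps separate ASCII and Unicode range lists that are merged here into one vector, and the line terminator set is the ECMAScript one (U+000A, U+000D, U+2028, U+2029).

```cpp
#include <cassert>
#include <vector>

// Hypothetical simplified model of a character class: a list of inclusive ranges.
struct CodePointRange {
    char32_t begin;
    char32_t end; // inclusive
};

struct SimpleCharacterClass {
    std::vector<CodePointRange> ranges;

    void add(char32_t begin, char32_t end)
    {
        assert(begin <= end);
        ranges.push_back({ begin, end });
    }

    bool contains(char32_t c) const
    {
        for (const CodePointRange& range : ranges) {
            if (c >= range.begin && c <= range.end)
                return true;
        }
        return false;
    }
};

// Mirrors the ranges used by anycharCreate() in the diff: 0x00-0x7f and 0x80-0x10ffff.
SimpleCharacterClass makeAnyCharacterClass()
{
    SimpleCharacterClass result;
    result.add(0x00, 0x7f);
    result.add(0x80, 0x10ffff);
    return result;
}

// Assumption: the ECMAScript line terminators.
SimpleCharacterClass makeLineTerminatorClass()
{
    SimpleCharacterClass result;
    result.add(0x0a, 0x0a);
    result.add(0x0d, 0x0d);
    result.add(0x2028, 0x2029);
    return result;
}

// '.' with dotAll: any-character class, not inverted.
// '.' without dotAll: line terminator class, inverted.
bool dotMatches(char32_t c, bool dotAll)
{
    if (dotAll)
        return makeAnyCharacterClass().contains(c);
    return !makeLineTerminatorClass().contains(c);
}
```

Because the any-character class contains every code point, a membership test against it can never fail, which is why the JIT hunks earlier in this diff skip matchCharacterClass() for non-inverted terms using that class.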

Modified: trunk/Source/JavaScriptCore/yarr/YarrPattern.h (221159 => 221160)


--- trunk/Source/JavaScriptCore/yarr/YarrPattern.h	2017-08-24 21:03:05 UTC (rev 221159)
+++ trunk/Source/JavaScriptCore/yarr/YarrPattern.h	2017-08-24 21:14:43 UTC (rev 221160)
@@ -305,6 +305,8 @@
 // (please to be calling newlineCharacterClass() et al on your
 // friendly neighborhood YarrPattern instance to get nicely
 // cached copies).
+
+std::unique_ptr<CharacterClass> anycharCreate();
 std::unique_ptr<CharacterClass> newlineCreate();
 std::unique_ptr<CharacterClass> digitsCreate();
 std::unique_ptr<CharacterClass> spacesCreate();
@@ -363,6 +365,7 @@
         m_hasCopiedParenSubexpressions = false;
         m_saveInitialStartValue = false;
 
+        anycharCached = 0;
         newlineCached = 0;
         digitsCached = 0;
         spacesCached = 0;
@@ -387,6 +390,14 @@
         return m_containsUnsignedLengthPattern;
     }
 
+    CharacterClass* anyCharacterClass()
+    {
+        if (!anycharCached) {
+            m_userCharacterClasses.append(anycharCreate());
+            anycharCached = m_userCharacterClasses.last().get();
+        }
+        return anycharCached;
+    }
     CharacterClass* newlineCharacterClass()
     {
         if (!newlineCached) {
@@ -468,6 +479,7 @@
     bool multiline() const { return m_flags & FlagMultiline; }
     bool sticky() const { return m_flags & FlagSticky; }
     bool unicode() const { return m_flags & FlagUnicode; }
+    bool dotAll() const { return m_flags & FlagDotAll; }
 
     bool m_containsBackreferences : 1;
     bool m_containsBOL : 1;
@@ -485,6 +497,7 @@
 private:
     const char* compile(const String& patternString, void* stackLimit);
 
+    CharacterClass* anycharCached;
     CharacterClass* newlineCached;
     CharacterClass* digitsCached;
     CharacterClass* spacesCached;

Modified: trunk/Source/WebCore/ChangeLog (221159 => 221160)


--- trunk/Source/WebCore/ChangeLog	2017-08-24 21:03:05 UTC (rev 221159)
+++ trunk/Source/WebCore/ChangeLog	2017-08-24 21:14:43 UTC (rev 221160)
@@ -1,3 +1,17 @@
+2017-08-24  Michael Saboff  <msab...@apple.com>
+
+        Add support for RegExp "dotAll" flag
+        https://bugs.webkit.org/show_bug.cgi?id=175924
+
+        Reviewed by Keith Miller.
+
+        Changed due to refactoring NewlineClassID to DotClassID.
+
+        No new tests. No change in behavior.
+
+        * contentextensions/URLFilterParser.cpp:
+        (WebCore::ContentExtensions::PatternParser::atomBuiltInCharacterClass):
+
 2017-08-24  Ryan Haddad  <ryanhad...@apple.com>
 
         Unreviewed, revert part of r221152 to fix internal builds.

Modified: trunk/Source/WebCore/contentextensions/URLFilterParser.cpp (221159 => 221160)


--- trunk/Source/WebCore/contentextensions/URLFilterParser.cpp	2017-08-24 21:03:05 UTC (rev 221159)
+++ trunk/Source/WebCore/contentextensions/URLFilterParser.cpp	2017-08-24 21:14:43 UTC (rev 221160)
@@ -101,7 +101,7 @@
         sinkFloatingTermIfNecessary();
         ASSERT(!m_floatingTerm.isValid());
 
-        if (builtInCharacterClassID == JSC::Yarr::NewlineClassID && inverted)
+        if (builtInCharacterClassID == JSC::Yarr::DotClassID && !inverted)
             m_floatingTerm = Term(Term::UniversalTransition);
         else
             fail(URLFilterParser::UnsupportedCharacterClass);