Title: [107400] trunk
Revision
107400
Author
msab...@apple.com
Date
2012-02-10 07:31:10 -0800 (Fri, 10 Feb 2012)

Log Message

Yarr assert with regexp where alternative in *-quantified group matches empty
https://bugs.webkit.org/show_bug.cgi?id=67752        

Reviewed by Gavin Barraclough.

Source/_javascript_Core: 

Added backtracking for the prior alternative if it matched
but didn't consume any input characters.

* yarr/YarrJIT.cpp:
(YarrOp): New jump.
(JSC::Yarr::YarrGenerator::generate): Emit conditional jump
when an alternative matches and no input was consumed.  Moved the
zero length match check for a set of alternatives to the alternative
code from the parentheses cases to the alternative end cases.
Converted the existing zero length checks in the parentheses cases
to runtime assertion checks.
(JSC::Yarr::YarrGenerator::backtrack): Link new jump to backtrack
to prior term.

LayoutTests: 

Added new tests for alternatives that match without consuming 
any input characters.

* fast/js/regexp-zero-length-alternatives-expected.txt: Added.
* fast/js/regexp-zero-length-alternatives.html: Added.
* fast/js/script-tests/regexp-zero-length-alternatives.js: Added.

Modified Paths

Added Paths

Diff

Modified: trunk/LayoutTests/ChangeLog (107399 => 107400)


--- trunk/LayoutTests/ChangeLog	2012-02-10 13:15:12 UTC (rev 107399)
+++ trunk/LayoutTests/ChangeLog	2012-02-10 15:31:10 UTC (rev 107400)
@@ -1,3 +1,17 @@
+2012-02-10  Michael Saboff  <msab...@apple.com>
+
+        Yarr assert with regexp where alternative in *-quantified group matches empty
+        https://bugs.webkit.org/show_bug.cgi?id=67752        
+
+        Reviewed by Gavin Barraclough.
+
+        Added new tests for alternatives that match without consuming 
+        any input characters.
+
+        * fast/js/regexp-zero-length-alternatives-expected.txt: Added.
+        * fast/js/regexp-zero-length-alternatives.html: Added.
+        * fast/js/script-tests/regexp-zero-length-alternatives.js: Added.
+
 2012-02-10  Pavel Feldman  <pfeld...@google.com>
 
         Web Inspector: implement undo for setOuterHTML via undo-ing nested primitive commands.

Added: trunk/LayoutTests/fast/js/regexp-zero-length-alternatives-expected.txt (0 => 107400)


--- trunk/LayoutTests/fast/js/regexp-zero-length-alternatives-expected.txt	                        (rev 0)
+++ trunk/LayoutTests/fast/js/regexp-zero-length-alternatives-expected.txt	2012-02-10 15:31:10 UTC (rev 107400)
@@ -0,0 +1,146 @@
+Test regular _expression_ processing with alternatives that match consuming no characters
+
+On success, you will see a series of "PASS" messages, followed by "TEST COMPLETE".
+
+
+PASS emptyStr.match(re1) is [""]
+PASS s1.match(re1) is [""]
+PASS s2.match(re1) is ["aaaa"]
+PASS s3.match(re1) is ["aa"]
+PASS emptyStr.match(re2) is [""]
+PASS s1.match(re2) is [""]
+PASS s2.match(re2) is ["aaaa"]
+PASS s3.match(re2) is ["aa"]
+PASS emptyStr.match(re3) is [""]
+PASS s1.match(re3) is [""]
+PASS s2.match(re3) is ["aaaa"]
+PASS s3.match(re3) is ["aa"]
+PASS emptyStr.match(re4) is ["", undefined]
+PASS s1.match(re4) is ["", undefined]
+PASS s2.match(re4) is ["aaaa", "a"]
+PASS s3.match(re4) is ["aa", "a"]
+PASS emptyStr.match(re5) is ["", undefined]
+PASS s1.match(re5) is ["", undefined]
+PASS s2.match(re5) is ["aaaa", "a"]
+PASS s3.match(re5) is ["aa", "a"]
+PASS emptyStr.match(re6) is ["", undefined]
+PASS s1.match(re6) is ["", undefined]
+PASS s2.match(re6) is ["aaaa", "a"]
+PASS s3.match(re6) is ["aa", "a"]
+PASS emptyStr.match(re7) is [""]
+PASS s1.match(re7) is [""]
+PASS s2.match(re7) is ["aaa"]
+PASS s3.match(re7) is ["aa"]
+PASS emptyStr.match(re8) is [""]
+PASS s1.match(re8) is [""]
+PASS s2.match(re8) is ["aaaa"]
+PASS s3.match(re8) is ["aa"]
+PASS emptyStr.match(re9) is [""]
+PASS s1.match(re9) is [""]
+PASS s2.match(re9) is ["aaaa"]
+PASS s3.match(re9) is ["aa"]
+PASS emptyStr.match(re10) is [""]
+PASS s1.match(re10) is [""]
+PASS s2.match(re10) is [""]
+PASS s3.match(re10) is [""]
+PASS emptyStr.match(re11) is [""]
+PASS s1.match(re11) is [""]
+PASS s2.match(re11) is [""]
+PASS s3.match(re11) is [""]
+PASS emptyStr.match(re12) is [""]
+PASS s1.match(re12) is [""]
+PASS s2.match(re12) is [""]
+PASS s3.match(re12) is [""]
+PASS emptyStr.match(re13) is ["", undefined]
+PASS s1.match(re13) is ["", undefined]
+PASS s2.match(re13) is ["", undefined]
+PASS s3.match(re13) is ["", undefined]
+PASS emptyStr.match(re14) is ["", undefined]
+PASS s1.match(re14) is ["", undefined]
+PASS s2.match(re14) is ["", undefined]
+PASS s3.match(re14) is ["", undefined]
+PASS emptyStr.match(re15) is ["", undefined]
+PASS s1.match(re15) is ["", undefined]
+PASS s2.match(re15) is ["", undefined]
+PASS s3.match(re15) is ["", undefined]
+PASS emptyStr.match(re16) is [""]
+PASS s1.match(re16) is [""]
+PASS s2.match(re16) is ["a"]
+PASS s3.match(re16) is ["a"]
+PASS emptyStr.match(re17) is [""]
+PASS s1.match(re17) is [""]
+PASS s2.match(re17) is ["a"]
+PASS s3.match(re17) is ["a"]
+PASS emptyStr.match(re18) is [""]
+PASS s1.match(re18) is [""]
+PASS s2.match(re18) is ["a"]
+PASS s3.match(re18) is ["a"]
+PASS emptyStr.match(re19) is ["", undefined]
+PASS s1.match(re19) is ["", undefined]
+PASS s2.match(re19) is ["a", "a"]
+PASS s3.match(re19) is ["a", "a"]
+PASS emptyStr.match(re20) is ["", undefined]
+PASS s1.match(re20) is ["", undefined]
+PASS s2.match(re20) is ["a", "a"]
+PASS s3.match(re20) is ["a", "a"]
+PASS emptyStr.match(re21) is ["", undefined]
+PASS s1.match(re21) is ["", undefined]
+PASS s2.match(re21) is ["a", "a"]
+PASS s3.match(re21) is ["a", "a"]
+PASS emptyStr.match(re22) is [""]
+PASS s1.match(re22) is [""]
+PASS s2.match(re22) is [""]
+PASS s3.match(re22) is [""]
+PASS emptyStr.match(re23) is [""]
+PASS s1.match(re23) is [""]
+PASS s2.match(re23) is [""]
+PASS s3.match(re23) is [""]
+PASS emptyStr.match(re24) is [""]
+PASS s1.match(re24) is [""]
+PASS s2.match(re24) is [""]
+PASS s3.match(re24) is [""]
+PASS emptyStr.match(re25) is ["", undefined]
+PASS s1.match(re25) is ["", undefined]
+PASS s2.match(re25) is ["", undefined]
+PASS s3.match(re25) is ["", undefined]
+PASS emptyStr.match(re26) is ["", undefined]
+PASS s1.match(re26) is ["", undefined]
+PASS s2.match(re26) is ["", undefined]
+PASS s3.match(re26) is ["", undefined]
+PASS emptyStr.match(re27) is ["", undefined]
+PASS s1.match(re27) is ["", undefined]
+PASS s2.match(re27) is ["", undefined]
+PASS s3.match(re27) is ["", undefined]
+PASS emptyStr.match(re28) is null
+PASS s1.match(re28) is ["x"]
+PASS s2.match(re28) is null
+PASS s3.match(re28) is ["aax"]
+PASS emptyStr.match(re29) is null
+PASS s1.match(re29) is ["x"]
+PASS s2.match(re29) is null
+PASS s3.match(re29) is ["aax"]
+PASS emptyStr.match(re30) is null
+PASS s1.match(re30) is ["x"]
+PASS s2.match(re30) is null
+PASS s3.match(re30) is ["aax"]
+PASS emptyStr.match(re31) is [""]
+PASS s1.match(re31) is [""]
+PASS s3.match(re31) is ["aa"]
+PASS s4.match(re31) is ["abab"]
+PASS emptyStr.match(re32) is [""]
+PASS s1.match(re32) is [""]
+PASS s2.match(re32) is ["aaaa"]
+PASS s4.match(re32) is ["abab"]
+PASS s5.match(re32) is ["ab"]
+PASS s6.match(re32) is [""]
+PASS emptyStr.match(re33) is [""]
+PASS s1.match(re33) is [""]
+PASS s7.match(re33) is ["g0"]
+PASS emptyStr.match(re34) is [""]
+PASS s1.match(re34) is [""]
+PASS s2.match(re34) is [""]
+PASS s3.match(re34) is [""]
+PASS successfullyParsed is true
+
+TEST COMPLETE
+

Added: trunk/LayoutTests/fast/js/regexp-zero-length-alternatives.html (0 => 107400)


--- trunk/LayoutTests/fast/js/regexp-zero-length-alternatives.html	                        (rev 0)
+++ trunk/LayoutTests/fast/js/regexp-zero-length-alternatives.html	2012-02-10 15:31:10 UTC (rev 107400)
@@ -0,0 +1,10 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+<head>
+<script src=""
+</head>
+<body>
+<script src=""
+<script src=""
+</body>
+</html>

Added: trunk/LayoutTests/fast/js/script-tests/regexp-zero-length-alternatives.js (0 => 107400)


--- trunk/LayoutTests/fast/js/script-tests/regexp-zero-length-alternatives.js	                        (rev 0)
+++ trunk/LayoutTests/fast/js/script-tests/regexp-zero-length-alternatives.js	2012-02-10 15:31:10 UTC (rev 107400)
@@ -0,0 +1,252 @@
+description(
+'Test regular _expression_ processing with alternatives that match consuming no characters'
+);
+
+var emptyStr = "";
+var s1 = "xxxx";
+var s2 = "aaaa";
+var s3 = "aax";
+var s4 = "abab";
+var s5 = "ab";
+var s6 = "xabx";
+var s7 = "g0";
+
+// Non-capturing empty first alternative greedy '*'
+var re1 = new RegExp(/(?:|a|z)*/);
+shouldBe('emptyStr.match(re1)', '[""]');
+shouldBe('s1.match(re1)', '[""]');
+shouldBe('s2.match(re1)', '["aaaa"]');
+shouldBe('s3.match(re1)', '["aa"]');
+
+// Non-capturing empty middle alternative greedy '*'
+var re2 = new RegExp(/(?:a||z)*/);
+shouldBe('emptyStr.match(re2)', '[""]');
+shouldBe('s1.match(re2)', '[""]');
+shouldBe('s2.match(re2)', '["aaaa"]');
+shouldBe('s3.match(re2)', '["aa"]');
+
+// Non-capturing empty last alternative greedy '*'
+var re3 = new RegExp(/(?:a|z|)*/);
+shouldBe('emptyStr.match(re3)', '[""]');
+shouldBe('s1.match(re3)', '[""]');
+shouldBe('s2.match(re3)', '["aaaa"]');
+shouldBe('s3.match(re3)', '["aa"]');
+
+// Capturing empty first alternative greedy '*'
+var re4 = new RegExp(/(|a|z)*/);
+shouldBe('emptyStr.match(re4)', '["", undefined]');
+shouldBe('s1.match(re4)', '["", undefined]');
+shouldBe('s2.match(re4)', '["aaaa", "a"]');
+shouldBe('s3.match(re4)', '["aa", "a"]');
+
+// Capturing empty middle alternative greedy '*'
+var re5 = new RegExp(/(a||z)*/);
+shouldBe('emptyStr.match(re5)', '["", undefined]');
+shouldBe('s1.match(re5)', '["", undefined]');
+shouldBe('s2.match(re5)', '["aaaa", "a"]');
+shouldBe('s3.match(re5)', '["aa", "a"]');
+
+// Capturing empty last alternative greedy '*'
+var re6 = new RegExp(/(a|z|)*/);
+shouldBe('emptyStr.match(re6)', '["", undefined]');
+shouldBe('s1.match(re6)', '["", undefined]');
+shouldBe('s2.match(re6)', '["aaaa", "a"]');
+shouldBe('s3.match(re6)', '["aa", "a"]');
+
+// Non-capturing empty first alternative fixed-count
+var re7 = new RegExp(/(?:|a|z){2,5}/);
+shouldBe('emptyStr.match(re7)', '[""]');
+shouldBe('s1.match(re7)', '[""]');
+shouldBe('s2.match(re7)', '["aaa"]');
+shouldBe('s3.match(re7)', '["aa"]');
+
+// Non-capturing empty middle alternative fixed-count
+var re8 = new RegExp(/(?:a||z){2,5}/);
+shouldBe('emptyStr.match(re8)', '[""]');
+shouldBe('s1.match(re8)', '[""]');
+shouldBe('s2.match(re8)', '["aaaa"]');
+shouldBe('s3.match(re8)', '["aa"]');
+
+// Non-capturing empty last alternative fixed-count
+var re9 = new RegExp(/(?:a|z|){2,5}/);
+shouldBe('emptyStr.match(re9)', '[""]');
+shouldBe('s1.match(re9)', '[""]');
+shouldBe('s2.match(re9)', '["aaaa"]');
+shouldBe('s3.match(re9)', '["aa"]');
+
+// Non-capturing empty first alternative non-greedy '*'
+var re10 = new RegExp(/(?:|a|z)*?/);
+shouldBe('emptyStr.match(re10)', '[""]');
+shouldBe('s1.match(re10)', '[""]');
+shouldBe('s2.match(re10)', '[""]');
+shouldBe('s3.match(re10)', '[""]');
+
+// Non-capturing empty middle alternative non-greedy '*'
+var re11 = new RegExp(/(?:a||z)*?/);
+shouldBe('emptyStr.match(re11)', '[""]');
+shouldBe('s1.match(re11)', '[""]');
+shouldBe('s2.match(re11)', '[""]');
+shouldBe('s3.match(re11)', '[""]');
+
+// Non-capturing empty last alternative non-greedy '*'
+var re12 = new RegExp(/(?:a|z|)*?/);
+shouldBe('emptyStr.match(re12)', '[""]');
+shouldBe('s1.match(re12)', '[""]');
+shouldBe('s2.match(re12)', '[""]');
+shouldBe('s3.match(re12)', '[""]');
+
+// Capturing empty first alternative non-greedy '*'
+var re13 = new RegExp(/(|a|z)*?/);
+shouldBe('emptyStr.match(re13)', '["", undefined]');
+shouldBe('s1.match(re13)', '["", undefined]');
+shouldBe('s2.match(re13)', '["", undefined]');
+shouldBe('s3.match(re13)', '["", undefined]');
+
+// Capturing empty middle alternative non-greedy '*'
+var re14 = new RegExp(/(a||z)*?/);
+shouldBe('emptyStr.match(re14)', '["", undefined]');
+shouldBe('s1.match(re14)', '["", undefined]');
+shouldBe('s2.match(re14)', '["", undefined]');
+shouldBe('s3.match(re14)', '["", undefined]');
+
+// Capturing empty last alternative non-greedy '*'
+var re15 = new RegExp(/(a|z|)*?/);
+shouldBe('emptyStr.match(re15)', '["", undefined]');
+shouldBe('s1.match(re15)', '["", undefined]');
+shouldBe('s2.match(re15)', '["", undefined]');
+shouldBe('s3.match(re15)', '["", undefined]');
+
+// Non-capturing empty first alternative greedy '?'
+var re16 = new RegExp(/(?:|a|z)?/);
+shouldBe('emptyStr.match(re16)', '[""]');
+shouldBe('s1.match(re16)', '[""]');
+shouldBe('s2.match(re16)', '["a"]');
+shouldBe('s3.match(re16)', '["a"]');
+
+// Non-capturing empty middle alternative greedy '?'
+var re17 = new RegExp(/(?:a||z)?/);
+shouldBe('emptyStr.match(re17)', '[""]');
+shouldBe('s1.match(re17)', '[""]');
+shouldBe('s2.match(re17)', '["a"]');
+shouldBe('s3.match(re17)', '["a"]');
+
+// Non-capturing empty last alternative greedy '?'
+var re18 = new RegExp(/(?:a|z|)?/);
+shouldBe('emptyStr.match(re18)', '[""]');
+shouldBe('s1.match(re18)', '[""]');
+shouldBe('s2.match(re18)', '["a"]');
+shouldBe('s3.match(re18)', '["a"]');
+
+// Capturing empty first alternative greedy '?'
+var re19 = new RegExp(/(|a|z)?/);
+shouldBe('emptyStr.match(re19)', '["", undefined]');
+shouldBe('s1.match(re19)', '["", undefined]');
+shouldBe('s2.match(re19)', '["a", "a"]');
+shouldBe('s3.match(re19)', '["a", "a"]');
+
+// Capturing empty middle alternative greedy '?'
+var re20 = new RegExp(/(a||z)?/);
+shouldBe('emptyStr.match(re20)', '["", undefined]');
+shouldBe('s1.match(re20)', '["", undefined]');
+shouldBe('s2.match(re20)', '["a", "a"]');
+shouldBe('s3.match(re20)', '["a", "a"]');
+
+// Capturing empty last alternative greedy '?'
+var re21 = new RegExp(/(a|z|)?/);
+shouldBe('emptyStr.match(re21)', '["", undefined]');
+shouldBe('s1.match(re21)', '["", undefined]');
+shouldBe('s2.match(re21)', '["a", "a"]');
+shouldBe('s3.match(re21)', '["a", "a"]');
+
+// Non-capturing empty first alternative non-greedy '?'
+var re22 = new RegExp(/(?:|a|z)??/);
+shouldBe('emptyStr.match(re22)', '[""]');
+shouldBe('s1.match(re22)', '[""]');
+shouldBe('s2.match(re22)', '[""]');
+shouldBe('s3.match(re22)', '[""]');
+
+// Non-capturing empty middle alternative non-greedy '?'
+var re23 = new RegExp(/(?:a||z)??/);
+shouldBe('emptyStr.match(re23)', '[""]');
+shouldBe('s1.match(re23)', '[""]');
+shouldBe('s2.match(re23)', '[""]');
+shouldBe('s3.match(re23)', '[""]');
+
+// Non-capturing empty last alternative non-greedy '?'
+var re24 = new RegExp(/(?:a|z|)??/);
+shouldBe('emptyStr.match(re24)', '[""]');
+shouldBe('s1.match(re24)', '[""]');
+shouldBe('s2.match(re24)', '[""]');
+shouldBe('s3.match(re24)', '[""]');
+
+// Capturing empty first alternative non-greedy '?'
+var re25 = new RegExp(/(|a|z)??/);
+shouldBe('emptyStr.match(re25)', '["", undefined]');
+shouldBe('s1.match(re25)', '["", undefined]');
+shouldBe('s2.match(re25)', '["", undefined]');
+shouldBe('s3.match(re25)', '["", undefined]');
+
+// Capturing empty middle alternative non-greedy '?'
+var re26 = new RegExp(/(a||z)??/);
+shouldBe('emptyStr.match(re26)', '["", undefined]');
+shouldBe('s1.match(re26)', '["", undefined]');
+shouldBe('s2.match(re26)', '["", undefined]');
+shouldBe('s3.match(re26)', '["", undefined]');
+
+// Capturing empty last alternative non-greedy '?'
+var re27 = new RegExp(/(a|z|)??/);
+shouldBe('emptyStr.match(re27)', '["", undefined]');
+shouldBe('s1.match(re27)', '["", undefined]');
+shouldBe('s2.match(re27)', '["", undefined]');
+shouldBe('s3.match(re27)', '["", undefined]');
+
+// Non-capturing empty first alternative greedy '*' non-terminal
+var re28 = new RegExp(/(?:|a|z)*x/);
+shouldBe('emptyStr.match(re28)', 'null');
+shouldBe('s1.match(re28)', '["x"]');
+shouldBe('s2.match(re28)', 'null');
+shouldBe('s3.match(re28)', '["aax"]');
+
+// Non-capturing empty middle alternative greedy '*' non-terminal
+var re29 = new RegExp(/(?:a||z)*x/);
+shouldBe('emptyStr.match(re29)', 'null');
+shouldBe('s1.match(re29)', '["x"]');
+shouldBe('s2.match(re29)', 'null');
+shouldBe('s3.match(re29)', '["aax"]');
+
+// Non-capturing empty last alternative greedy '*' non-terminal
+var re30 = new RegExp(/(?:a|z|)*x/);
+shouldBe('emptyStr.match(re30)', 'null');
+shouldBe('s1.match(re30)', '["x"]');
+shouldBe('s2.match(re30)', 'null');
+shouldBe('s3.match(re30)', '["aax"]');
+
+// Non-capturing two possibly empty alternatives greedy '*'
+var re31 = new RegExp(/(?:a*|b*)*/);
+shouldBe('emptyStr.match(re31)', '[""]');
+shouldBe('s1.match(re31)', '[""]');
+shouldBe('s3.match(re31)', '["aa"]');
+shouldBe('s4.match(re31)', '["abab"]');
+
+// Non-capturing two possibly empty non-greedy alternatives non-greedy '*'
+var re32 = new RegExp(/(?:a*?|b*?)*/);
+shouldBe('emptyStr.match(re32)', '[""]');
+shouldBe('s1.match(re32)', '[""]');
+shouldBe('s2.match(re32)', '["aaaa"]');
+shouldBe('s4.match(re32)', '["abab"]');
+shouldBe('s5.match(re32)', '["ab"]');
+shouldBe('s6.match(re32)', '[""]');
+
+// Three possibly empty alternatives with greedy +
+var re33 = new RegExp(/(?:(?:(?!))|g?|0*\*?)+/);
+shouldBe('emptyStr.match(re33)', '[""]');
+shouldBe('s1.match(re33)', '[""]');
+shouldBe('s7.match(re33)', '["g0"]');
+
+// first alternative zero length fixed count
+var re34 = new RegExp(/(?:|a)/);
+shouldBe('emptyStr.match(re34)', '[""]');
+shouldBe('s1.match(re34)', '[""]');
+shouldBe('s2.match(re34)', '[""]');
+shouldBe('s3.match(re34)', '[""]');
+

Modified: trunk/Source/_javascript_Core/ChangeLog (107399 => 107400)


--- trunk/Source/_javascript_Core/ChangeLog	2012-02-10 13:15:12 UTC (rev 107399)
+++ trunk/Source/_javascript_Core/ChangeLog	2012-02-10 15:31:10 UTC (rev 107400)
@@ -1,3 +1,24 @@
+2012-02-10  Michael Saboff  <msab...@apple.com>
+
+        Yarr assert with regexp where alternative in *-quantified group matches empty
+        https://bugs.webkit.org/show_bug.cgi?id=67752        
+
+        Reviewed by Gavin Barraclough.
+
+        Added backtracking for the prior alternative if it matched
+        but didn't consume any input characters.
+
+        * yarr/YarrJIT.cpp:
+        (YarrOp): New jump.
+        (JSC::Yarr::YarrGenerator::generate): Emit conditional jump
+        when an alternative matches and no input was consumed.  Moved the
+        zero length match check for a set of alternatives to the alternative
+        code from the parentheses cases to the alternative end cases.
+        Converted the existing zero length checks in the parentheses cases
+        to runtime assertion checks.
+        (JSC::Yarr::YarrGenerator::backtrack): Link new jump to backtrack
+        to prior term.
+
 2012-02-10  Roland Takacs  <takacs.rol...@stud.u-szeged.hu>
 
         [Qt] GC should be parallel on Qt platform

Modified: trunk/Source/_javascript_Core/yarr/YarrJIT.cpp (107399 => 107400)


--- trunk/Source/_javascript_Core/yarr/YarrJIT.cpp	2012-02-10 13:15:12 UTC (rev 107399)
+++ trunk/Source/_javascript_Core/yarr/YarrJIT.cpp	2012-02-10 15:31:10 UTC (rev 107400)
@@ -379,6 +379,10 @@
         Label m_reentry;
         JumpList m_jumps;
 
+        // Used for backtracking when the prior alternative did not consume any
+        // characters but matched.
+        Jump m_zeroLengthMatch;
+
         // This flag is used to null out the second pattern character, when
         // two are fused to match a pair together.
         bool m_isDeadCode;
@@ -1385,7 +1389,7 @@
                     op.m_checkAdjust -= disjunction->m_minimumSize;
                 if (op.m_checkAdjust)
                     op.m_jumps.append(jumpIfNoAvailableInput(op.m_checkAdjust));
- 
+
                 m_checked += op.m_checkAdjust;
                 break;
             }
@@ -1404,6 +1408,12 @@
                     op.m_returnAddress = storeToFrameWithPatch(alternativeFrameLocation);
                 }
 
+                if (term->quantityType != QuantifierFixedCount && !m_ops[op.m_previousOp].m_alternative->m_minimumSize) {
+                    // If the previous alternative matched without consuming characters then
+                    // backtrack to try to match while consumming some input.
+                    op.m_zeroLengthMatch = branch32(Equal, index, Address(stackPointerRegister, term->frameLocation * sizeof(void*)));
+                }
+
                 // If we reach here then the last alternative has matched - jump to the
                 // End node, to skip over any further alternatives.
                 //
@@ -1448,6 +1458,12 @@
                     op.m_returnAddress = storeToFrameWithPatch(alternativeFrameLocation);
                 }
 
+                if (term->quantityType != QuantifierFixedCount && !m_ops[op.m_previousOp].m_alternative->m_minimumSize) {
+                    // If the previous alternative matched without consuming characters then
+                    // backtrack to try to match while consumming some input.
+                    op.m_zeroLengthMatch = branch32(Equal, index, Address(stackPointerRegister, term->frameLocation * sizeof(void*)));
+                }
+
                 // If this set of alternatives contains more than one alternative,
                 // then the Next nodes will have planted jumps to the End, and added
                 // them to this node's m_jumps list.
@@ -1514,14 +1530,19 @@
             }
             case OpParenthesesSubpatternOnceEnd: {
                 PatternTerm* term = op.m_term;
-                unsigned parenthesesFrameLocation = term->frameLocation;
                 const RegisterID indexTemporary = regT0;
                 ASSERT(term->quantityCount == 1);
 
-                // For Greedy/NonGreedy quantified parentheses, we must reject zero length
-                // matches. If the minimum size is know to be non-zero we need not check.
-                if (term->quantityType != QuantifierFixedCount && !term->parentheses.disjunction->m_minimumSize)
-                    op.m_jumps.append(branch32(Equal, index, Address(stackPointerRegister, parenthesesFrameLocation * sizeof(void*))));
+#ifndef NDEBUG
+                // Runtime ASSERT to make sure that the nested alternative handled the
+                // "no input consumed" check.
+                if (term->quantityType != QuantifierFixedCount && !term->parentheses.disjunction->m_minimumSize) {
+                    Jump pastBreakpoint;
+                    pastBreakpoint = branch32(NotEqual, index, Address(stackPointerRegister, term->frameLocation * sizeof(void*)));
+                    breakpoint();
+                    pastBreakpoint.link(this);
+                }
+#endif
 
                 // If the parenthese are capturing, store the ending index value to the
                 // captures array, offsetting as necessary.
@@ -1568,15 +1589,21 @@
                 break;
             }
             case OpParenthesesSubpatternTerminalEnd: {
+                YarrOp& beginOp = m_ops[op.m_previousOp];
+#ifndef NDEBUG
                 PatternTerm* term = op.m_term;
 
-                // Check for zero length matches - if the match is non-zero, then we
-                // can accept it & loop back up to the head of the subpattern.
-                YarrOp& beginOp = m_ops[op.m_previousOp];
-                branch32(NotEqual, index, Address(stackPointerRegister, term->frameLocation * sizeof(void*)), beginOp.m_reentry);
+                // Runtime ASSERT to make sure that the nested alternative handled the
+                // "no input consumed" check.
+                Jump pastBreakpoint;
+                pastBreakpoint = branch32(NotEqual, index, Address(stackPointerRegister, term->frameLocation * sizeof(void*)));
+                breakpoint();
+                pastBreakpoint.link(this);
+#endif
 
-                // Reject the match - backtrack back into the subpattern.
-                op.m_jumps.append(jump());
+                // We know that the match is non-zero, we can accept it  and
+                // loop back up to the head of the subpattern.
+                jump(beginOp.m_reentry);
 
                 // This is the entry point to jump to when we stop matching - we will
                 // do so once the subpattern cannot match any more.
@@ -1928,7 +1955,7 @@
                         // An alternative that is not the last should jump to its successor.
                         jump(nextOp.m_reentry);
                     } else if (!isBegin) {
-                        // The last of more than one alternatives must jump back to the begnning.
+                        // The last of more than one alternatives must jump back to the beginning.
                         nextOp.m_jumps.append(jump());
                     } else {
                         // A single alternative on its own can fall through.
@@ -1940,12 +1967,16 @@
                         // An alternative that is not the last should jump to its successor.
                         m_backtrackingState.linkTo(nextOp.m_reentry, this);
                     } else if (!isBegin) {
-                        // The last of more than one alternatives must jump back to the begnning.
+                        // The last of more than one alternatives must jump back to the beginning.
                         m_backtrackingState.takeBacktracksToJumpList(nextOp.m_jumps, this);
                     }
                     // In the case of a single alternative on its own do nothing - it can fall through.
                 }
 
+                // If there is a backtrack jump from a zero length match link it here.
+                if (op.m_zeroLengthMatch.isSet())
+                    m_backtrackingState.append(op.m_zeroLengthMatch);
+
                 // At this point we've handled the backtracking back into this node.
                 // Now link any backtracks that need to jump to here.
 
@@ -1978,6 +2009,10 @@
             case OpNestedAlternativeEnd: {
                 PatternTerm* term = op.m_term;
 
+                // If there is a backtrack jump from a zero length match link it here.
+                if (op.m_zeroLengthMatch.isSet())
+                    m_backtrackingState.append(op.m_zeroLengthMatch);
+
                 // If we backtrack into the end of a simple subpattern do nothing;
                 // just continue through into the last alternative. If we backtrack
                 // into the end of a non-simple set of alterntives we need to jump
_______________________________________________
webkit-changes mailing list
webkit-changes@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-changes

Reply via email to