Title: [201412] trunk
Revision
201412
Author
benja...@webkit.org
Date
2016-05-25 20:19:06 -0700 (Wed, 25 May 2016)

Log Message

[JSC] RegExp with deeply nested subexpressions overflow the stack in Yarr
https://bugs.webkit.org/show_bug.cgi?id=158011
rdar://problem/25946592

Reviewed by Saam Barati.

Source/_javascript_Core:

When generating the meta-data required for compilation,
Yarr uses a recursive function over the various _expression_ in the pattern.

If you have many nested expressions, you can run out of stack
and crash the WebProcess.
This patch changes that into a soft failure. The _expression_ is just
considered invalid.

* runtime/RegExp.cpp:
(JSC::RegExp::finishCreation):
(JSC::RegExp::compile):
(JSC::RegExp::compileMatchOnly):
* yarr/YarrPattern.cpp:
(JSC::Yarr::YarrPatternConstructor::YarrPatternConstructor):
(JSC::Yarr::YarrPatternConstructor::setupOffsets):
(JSC::Yarr::YarrPatternConstructor::isSafeToRecurse):
(JSC::Yarr::YarrPattern::compile):
(JSC::Yarr::YarrPattern::YarrPattern):
(JSC::Yarr::YarrPatternConstructor::setupAlternativeOffsets): Deleted.
(JSC::Yarr::YarrPatternConstructor::setupDisjunctionOffsets): Deleted.
* yarr/YarrPattern.h:

LayoutTests:

* js/script-tests/stack-overflow-arrity-catch.js:
With the new failure, this test can fail on allocating
the RegExp for a valid reason.

The new _expression_ should not have this issue.
* js/script-tests/stack-overflow-regexp.js: Added.
(shouldThrow.recursiveCall):
(shouldThrow):
(recursiveCall):
* js/stack-overflow-regexp-expected.txt: Added.
* js/stack-overflow-regexp.html: Added.

Modified Paths

Added Paths

Diff

Modified: trunk/LayoutTests/ChangeLog (201411 => 201412)


--- trunk/LayoutTests/ChangeLog	2016-05-26 00:08:48 UTC (rev 201411)
+++ trunk/LayoutTests/ChangeLog	2016-05-26 03:19:06 UTC (rev 201412)
@@ -1,3 +1,23 @@
+2016-05-25  Benjamin Poulain  <benja...@webkit.org>
+
+        [JSC] RegExp with deeply nested subexpressions overflow the stack in Yarr
+        https://bugs.webkit.org/show_bug.cgi?id=158011
+        rdar://problem/25946592
+
+        Reviewed by Saam Barati.
+
+        * js/script-tests/stack-overflow-arrity-catch.js:
+        With the new failure, this test can fail on allocating
+        the RegExp for a valid reason.
+
+        The new _expression_ should not have this issue.
+        * js/script-tests/stack-overflow-regexp.js: Added.
+        (shouldThrow.recursiveCall):
+        (shouldThrow):
+        (recursiveCall):
+        * js/stack-overflow-regexp-expected.txt: Added.
+        * js/stack-overflow-regexp.html: Added.
+
 2016-05-25  Ryan Haddad  <ryanhad...@apple.com>
 
         Marking imported/blink/http/tests/plugins/get-url-notify-on-removal.html as a flaky timeout

Modified: trunk/LayoutTests/js/script-tests/stack-overflow-arrity-catch.js (201411 => 201412)


--- trunk/LayoutTests/js/script-tests/stack-overflow-arrity-catch.js	2016-05-26 00:08:48 UTC (rev 201411)
+++ trunk/LayoutTests/js/script-tests/stack-overflow-arrity-catch.js	2016-05-26 03:19:06 UTC (rev 201412)
@@ -17,7 +17,7 @@
         // Should get here because of stack overflow,
         // now cause a stack overflow exception due to arrity processing
         try {
-            var dummy = new RegExp('a|b|c');
+            var dummy = new Float64Array(128);
         } catch(err) {
             gotWrongCatch1 = true;
         }

Added: trunk/LayoutTests/js/script-tests/stack-overflow-regexp.js (0 => 201412)


--- trunk/LayoutTests/js/script-tests/stack-overflow-regexp.js	                        (rev 0)
+++ trunk/LayoutTests/js/script-tests/stack-overflow-regexp.js	2016-05-26 03:19:06 UTC (rev 201412)
@@ -0,0 +1,29 @@
+description('Test that we do not overflow the stack while handling regular expressions');
+
+// Base case.
+shouldThrow('new RegExp(Array(50000).join("(") + "a" + Array(50000).join(")"))', '"SyntaxError: Invalid regular _expression_: too many nested disjunctions"');
+
+{ // Verify that a deep JS stack does not help avoiding the error.
+    function recursiveCall(depth) {
+        if (!(depth % 10)) {
+            debug("Creating RegExp at depth " + depth);
+            shouldThrow('new RegExp(Array(50000).join("(") + "a" + Array(50000).join(")"))', '"SyntaxError: Invalid regular _expression_: too many nested disjunctions"');
+        }
+        if (depth < 100) {
+            recursiveCall(depth + 1);
+        }
+    }
+    recursiveCall(0);
+}
+
+{ // Have the deepest nested subpattern surrounded by other expressions.
+    var _expression_ = "";
+    for (let i = 0; i < 50000; ++i) {
+        _expression_ += "((a)(";
+    }
+    _expression_ += "b";
+    for (let i = 0; i < 50000; ++i) {
+        _expression_ += ")(c))";
+    }
+    shouldThrow('new RegExp(_expression_)', '"SyntaxError: Invalid regular _expression_: too many nested disjunctions"');
+}

Added: trunk/LayoutTests/js/stack-overflow-regexp-expected.txt (0 => 201412)


--- trunk/LayoutTests/js/stack-overflow-regexp-expected.txt	                        (rev 0)
+++ trunk/LayoutTests/js/stack-overflow-regexp-expected.txt	2016-05-26 03:19:06 UTC (rev 201412)
@@ -0,0 +1,33 @@
+Test that we do not overflow the stack while handling regular expressions
+
+On success, you will see a series of "PASS" messages, followed by "TEST COMPLETE".
+
+
+PASS new RegExp(Array(50000).join("(") + "a" + Array(50000).join(")")) threw exception SyntaxError: Invalid regular _expression_: too many nested disjunctions.
+Creating RegExp at depth 0
+PASS new RegExp(Array(50000).join("(") + "a" + Array(50000).join(")")) threw exception SyntaxError: Invalid regular _expression_: too many nested disjunctions.
+Creating RegExp at depth 10
+PASS new RegExp(Array(50000).join("(") + "a" + Array(50000).join(")")) threw exception SyntaxError: Invalid regular _expression_: too many nested disjunctions.
+Creating RegExp at depth 20
+PASS new RegExp(Array(50000).join("(") + "a" + Array(50000).join(")")) threw exception SyntaxError: Invalid regular _expression_: too many nested disjunctions.
+Creating RegExp at depth 30
+PASS new RegExp(Array(50000).join("(") + "a" + Array(50000).join(")")) threw exception SyntaxError: Invalid regular _expression_: too many nested disjunctions.
+Creating RegExp at depth 40
+PASS new RegExp(Array(50000).join("(") + "a" + Array(50000).join(")")) threw exception SyntaxError: Invalid regular _expression_: too many nested disjunctions.
+Creating RegExp at depth 50
+PASS new RegExp(Array(50000).join("(") + "a" + Array(50000).join(")")) threw exception SyntaxError: Invalid regular _expression_: too many nested disjunctions.
+Creating RegExp at depth 60
+PASS new RegExp(Array(50000).join("(") + "a" + Array(50000).join(")")) threw exception SyntaxError: Invalid regular _expression_: too many nested disjunctions.
+Creating RegExp at depth 70
+PASS new RegExp(Array(50000).join("(") + "a" + Array(50000).join(")")) threw exception SyntaxError: Invalid regular _expression_: too many nested disjunctions.
+Creating RegExp at depth 80
+PASS new RegExp(Array(50000).join("(") + "a" + Array(50000).join(")")) threw exception SyntaxError: Invalid regular _expression_: too many nested disjunctions.
+Creating RegExp at depth 90
+PASS new RegExp(Array(50000).join("(") + "a" + Array(50000).join(")")) threw exception SyntaxError: Invalid regular _expression_: too many nested disjunctions.
+Creating RegExp at depth 100
+PASS new RegExp(Array(50000).join("(") + "a" + Array(50000).join(")")) threw exception SyntaxError: Invalid regular _expression_: too many nested disjunctions.
+PASS new RegExp(_expression_) threw exception SyntaxError: Invalid regular _expression_: too many nested disjunctions.
+PASS successfullyParsed is true
+
+TEST COMPLETE
+

Added: trunk/LayoutTests/js/stack-overflow-regexp.html (0 => 201412)


--- trunk/LayoutTests/js/stack-overflow-regexp.html	                        (rev 0)
+++ trunk/LayoutTests/js/stack-overflow-regexp.html	2016-05-26 03:19:06 UTC (rev 201412)
@@ -0,0 +1,10 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+<head>
+<script src=""
+</head>
+<body>
+<script src="" "></script>
+<script src=""
+</body>
+</html>

Modified: trunk/Source/_javascript_Core/ChangeLog (201411 => 201412)


--- trunk/Source/_javascript_Core/ChangeLog	2016-05-26 00:08:48 UTC (rev 201411)
+++ trunk/Source/_javascript_Core/ChangeLog	2016-05-26 03:19:06 UTC (rev 201412)
@@ -1,3 +1,33 @@
+2016-05-25  Benjamin Poulain  <benja...@webkit.org>
+
+        [JSC] RegExp with deeply nested subexpressions overflow the stack in Yarr
+        https://bugs.webkit.org/show_bug.cgi?id=158011
+        rdar://problem/25946592
+
+        Reviewed by Saam Barati.
+
+        When generating the meta-data required for compilation,
+        Yarr uses a recursive function over the various _expression_ in the pattern.
+
+        If you have many nested expressions, you can run out of stack
+        and crash the WebProcess.
+        This patch changes that into a soft failure. The _expression_ is just
+        considered invalid.
+
+        * runtime/RegExp.cpp:
+        (JSC::RegExp::finishCreation):
+        (JSC::RegExp::compile):
+        (JSC::RegExp::compileMatchOnly):
+        * yarr/YarrPattern.cpp:
+        (JSC::Yarr::YarrPatternConstructor::YarrPatternConstructor):
+        (JSC::Yarr::YarrPatternConstructor::setupOffsets):
+        (JSC::Yarr::YarrPatternConstructor::isSafeToRecurse):
+        (JSC::Yarr::YarrPattern::compile):
+        (JSC::Yarr::YarrPattern::YarrPattern):
+        (JSC::Yarr::YarrPatternConstructor::setupAlternativeOffsets): Deleted.
+        (JSC::Yarr::YarrPatternConstructor::setupDisjunctionOffsets): Deleted.
+        * yarr/YarrPattern.h:
+
 2016-05-25  Alex Christensen  <achristen...@webkit.org>
 
         Fix Win64 build after r201335

Modified: trunk/Source/_javascript_Core/runtime/RegExp.cpp (201411 => 201412)


--- trunk/Source/_javascript_Core/runtime/RegExp.cpp	2016-05-26 00:08:48 UTC (rev 201411)
+++ trunk/Source/_javascript_Core/runtime/RegExp.cpp	2016-05-26 03:19:06 UTC (rev 201412)
@@ -222,7 +222,7 @@
 void RegExp::finishCreation(VM& vm)
 {
     Base::finishCreation(vm);
-    Yarr::YarrPattern pattern(m_patternString, m_flags, &m_constructionError);
+    Yarr::YarrPattern pattern(m_patternString, m_flags, &m_constructionError, vm.stackLimit());
     if (m_constructionError)
         m_state = ParseError;
     else
@@ -264,7 +264,7 @@
 {
     ConcurrentJITLocker locker(m_lock);
     
-    Yarr::YarrPattern pattern(m_patternString, m_flags, &m_constructionError);
+    Yarr::YarrPattern pattern(m_patternString, m_flags, &m_constructionError, vm->stackLimit());
     if (m_constructionError) {
         RELEASE_ASSERT_NOT_REACHED();
 #if COMPILER_QUIRK(CONSIDERS_UNREACHABLE_CODE)
@@ -317,7 +317,7 @@
 {
     ConcurrentJITLocker locker(m_lock);
     
-    Yarr::YarrPattern pattern(m_patternString, m_flags, &m_constructionError);
+    Yarr::YarrPattern pattern(m_patternString, m_flags, &m_constructionError, vm->stackLimit());
     if (m_constructionError) {
         RELEASE_ASSERT_NOT_REACHED();
 #if COMPILER_QUIRK(CONSIDERS_UNREACHABLE_CODE)

Modified: trunk/Source/_javascript_Core/yarr/YarrPattern.cpp (201411 => 201412)


--- trunk/Source/_javascript_Core/yarr/YarrPattern.cpp	2016-05-26 00:08:48 UTC (rev 201411)
+++ trunk/Source/_javascript_Core/yarr/YarrPattern.cpp	2016-05-26 03:19:06 UTC (rev 201412)
@@ -31,6 +31,7 @@
 #include "YarrCanonicalize.h"
 #include "YarrParser.h"
 #include <wtf/Vector.h>
+#include <wtf/WTFThreadData.h>
 
 using namespace WTF;
 
@@ -274,9 +275,10 @@
 
 class YarrPatternConstructor {
 public:
-    YarrPatternConstructor(YarrPattern& pattern)
+    YarrPatternConstructor(YarrPattern& pattern, void* stackLimit)
         : m_pattern(pattern)
         , m_characterClassConstructor(pattern.ignoreCase(), pattern.unicode() ? CanonicalMode::Unicode : CanonicalMode::UCS2)
+        , m_stackLimit(stackLimit)
         , m_invertParentheticalAssertion(false)
     {
         auto body = std::make_unique<PatternDisjunction>();
@@ -579,8 +581,11 @@
         m_alternative = m_alternative->m_parent->addNewAlternative();
     }
 
-    unsigned setupAlternativeOffsets(PatternAlternative* alternative, unsigned currentCallFrameSize, unsigned initialInputPosition)
+    bool setupAlternativeOffsets(PatternAlternative* alternative, unsigned currentCallFrameSize, unsigned initialInputPosition, unsigned& newCallFrameSize) WARN_UNUSED_RETURN
     {
+        if (!isSafeToRecurse())
+            return false;
+
         alternative->m_hasFixedSize = true;
         Checked<unsigned> currentInputPosition = initialInputPosition;
 
@@ -637,18 +642,22 @@
                 if (term.quantityCount == 1 && !term.parentheses.isCopy) {
                     if (term.quantityType != QuantifierFixedCount)
                         currentCallFrameSize += YarrStackSpaceForBackTrackInfoParenthesesOnce;
-                    currentCallFrameSize = setupDisjunctionOffsets(term.parentheses.disjunction, currentCallFrameSize, currentInputPosition.unsafeGet());
+                    if (!setupDisjunctionOffsets(term.parentheses.disjunction, currentCallFrameSize, currentInputPosition.unsafeGet(), currentCallFrameSize))
+                        return false;
                     // If quantity is fixed, then pre-check its minimum size.
                     if (term.quantityType == QuantifierFixedCount)
                         currentInputPosition += term.parentheses.disjunction->m_minimumSize;
                     term.inputPosition = currentInputPosition.unsafeGet();
                 } else if (term.parentheses.isTerminal) {
                     currentCallFrameSize += YarrStackSpaceForBackTrackInfoParenthesesTerminal;
-                    currentCallFrameSize = setupDisjunctionOffsets(term.parentheses.disjunction, currentCallFrameSize, currentInputPosition.unsafeGet());
+                    if (!setupDisjunctionOffsets(term.parentheses.disjunction, currentCallFrameSize, currentInputPosition.unsafeGet(), currentCallFrameSize))
+                        return false;
                     term.inputPosition = currentInputPosition.unsafeGet();
                 } else {
                     term.inputPosition = currentInputPosition.unsafeGet();
-                    setupDisjunctionOffsets(term.parentheses.disjunction, 0, currentInputPosition.unsafeGet());
+                    unsigned ignoredCallFrameSize;
+                    if (!setupDisjunctionOffsets(term.parentheses.disjunction, 0, currentInputPosition.unsafeGet(), ignoredCallFrameSize))
+                        return false;
                     currentCallFrameSize += YarrStackSpaceForBackTrackInfoParentheses;
                 }
                 // Fixed count of 1 could be accepted, if they have a fixed size *AND* if all alternatives are of the same length.
@@ -658,7 +667,8 @@
             case PatternTerm::TypeParentheticalAssertion:
                 term.inputPosition = currentInputPosition.unsafeGet();
                 term.frameLocation = currentCallFrameSize;
-                currentCallFrameSize = setupDisjunctionOffsets(term.parentheses.disjunction, currentCallFrameSize + YarrStackSpaceForBackTrackInfoParentheticalAssertion, currentInputPosition.unsafeGet());
+                if (!setupDisjunctionOffsets(term.parentheses.disjunction, currentCallFrameSize + YarrStackSpaceForBackTrackInfoParentheticalAssertion, currentInputPosition.unsafeGet(), currentCallFrameSize))
+                    return false;
                 break;
 
             case PatternTerm::TypeDotStarEnclosure:
@@ -669,11 +679,15 @@
         }
 
         alternative->m_minimumSize = (currentInputPosition - initialInputPosition).unsafeGet();
-        return currentCallFrameSize;
+        newCallFrameSize = currentCallFrameSize;
+        return true;
     }
 
-    unsigned setupDisjunctionOffsets(PatternDisjunction* disjunction, unsigned initialCallFrameSize, unsigned initialInputPosition)
+    bool setupDisjunctionOffsets(PatternDisjunction* disjunction, unsigned initialCallFrameSize, unsigned initialInputPosition, unsigned& callFrameSize) WARN_UNUSED_RETURN
     {
+        if (!isSafeToRecurse())
+            return false;
+
         if ((disjunction != m_pattern.m_body) && (disjunction->m_alternatives.size() > 1))
             initialCallFrameSize += YarrStackSpaceForBackTrackInfoAlternative;
 
@@ -683,7 +697,9 @@
 
         for (unsigned alt = 0; alt < disjunction->m_alternatives.size(); ++alt) {
             PatternAlternative* alternative = disjunction->m_alternatives[alt].get();
-            unsigned currentAlternativeCallFrameSize = setupAlternativeOffsets(alternative, initialCallFrameSize, initialInputPosition);
+            unsigned currentAlternativeCallFrameSize;
+            if (!setupAlternativeOffsets(alternative, initialCallFrameSize, initialInputPosition, currentAlternativeCallFrameSize))
+                return false;
             minimumInputSize = std::min(minimumInputSize, alternative->m_minimumSize);
             maximumCallFrameSize = std::max(maximumCallFrameSize, currentAlternativeCallFrameSize);
             hasFixedSize &= alternative->m_hasFixedSize;
@@ -697,12 +713,17 @@
         disjunction->m_hasFixedSize = hasFixedSize;
         disjunction->m_minimumSize = minimumInputSize;
         disjunction->m_callFrameSize = maximumCallFrameSize;
-        return maximumCallFrameSize;
+        callFrameSize = maximumCallFrameSize;
+        return true;
     }
 
-    void setupOffsets()
+    const char* setupOffsets()
     {
-        setupDisjunctionOffsets(m_pattern.m_body, 0, 0);
+        // FIXME: Yarr should not use the stack to handle subpatterns (rdar://problem/26436314).
+        unsigned ignoredCallFrameSize;
+        if (!setupDisjunctionOffsets(m_pattern.m_body, 0, 0, ignoredCallFrameSize))
+            return REGEXP_ERROR_PREFIX "too many nested disjunctions";
+        return nullptr;
     }
 
     // This optimization identifies sets of parentheses that we will never need to backtrack.
@@ -842,16 +863,27 @@
     }
 
 private:
+    bool isSafeToRecurse() const
+    {
+        if (!m_stackLimit)
+            return true;
+        ASSERT(wtfThreadData().stack().isGrowingDownward());
+        int8_t* curr = reinterpret_cast<int8_t*>(&curr);
+        int8_t* limit = reinterpret_cast<int8_t*>(m_stackLimit);
+        return curr >= limit;
+    }
+
     YarrPattern& m_pattern;
     PatternAlternative* m_alternative;
     CharacterClassConstructor m_characterClassConstructor;
+    void* m_stackLimit;
     bool m_invertCharacterClass;
     bool m_invertParentheticalAssertion;
 };
 
-const char* YarrPattern::compile(const String& patternString)
+const char* YarrPattern::compile(const String& patternString, void* stackLimit)
 {
-    YarrPatternConstructor constructor(*this);
+    YarrPatternConstructor constructor(*this, stackLimit);
 
     if (const char* error = parse(constructor, patternString, unicode()))
         return error;
@@ -877,12 +909,13 @@
     constructor.optimizeDotStarWrappedExpressions();
     constructor.optimizeBOL();
         
-    constructor.setupOffsets();
+    if (const char* error = constructor.setupOffsets())
+        return error;
 
-    return 0;
+    return nullptr;
 }
 
-YarrPattern::YarrPattern(const String& pattern, RegExpFlags flags, const char** error)
+YarrPattern::YarrPattern(const String& pattern, RegExpFlags flags, const char** error, void* stackLimit)
     : m_containsBackreferences(false)
     , m_containsBOL(false)
     , m_containsUnsignedLengthPattern(false)
@@ -899,7 +932,7 @@
     , nonwordcharCached(0)
     , nonwordUnicodeIgnoreCasecharCached(0)
 {
-    *error = compile(pattern);
+    *error = compile(pattern, stackLimit);
 }
 
 } }

Modified: trunk/Source/_javascript_Core/yarr/YarrPattern.h (201411 => 201412)


--- trunk/Source/_javascript_Core/yarr/YarrPattern.h	2016-05-26 00:08:48 UTC (rev 201411)
+++ trunk/Source/_javascript_Core/yarr/YarrPattern.h	2016-05-26 03:19:06 UTC (rev 201412)
@@ -304,7 +304,7 @@
 
 
 struct YarrPattern {
-    JS_EXPORT_PRIVATE YarrPattern(const String& pattern, RegExpFlags flags, const char** error);
+    JS_EXPORT_PRIVATE YarrPattern(const String& pattern, RegExpFlags, const char** error, void* stackLimit = nullptr);
 
     void reset()
     {
@@ -428,7 +428,7 @@
     Vector<std::unique_ptr<CharacterClass>> m_userCharacterClasses;
 
 private:
-    const char* compile(const String& patternString);
+    const char* compile(const String& patternString, void* stackLimit);
 
     CharacterClass* newlineCached;
     CharacterClass* digitsCached;
_______________________________________________
webkit-changes mailing list
webkit-changes@lists.webkit.org
https://lists.webkit.org/mailman/listinfo/webkit-changes

Reply via email to