[MediaWiki-commits] [Gerrit] mediawiki/core[REL1_27]: SECURITY: Handle -{}- syntax in attributes safely

2017-11-14 Thread Reedy (Code Review)
Reedy has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/391383 )

Change subject: SECURITY: Handle -{}- syntax in attributes safely
..

SECURITY: Handle -{}- syntax in attributes safely

Previously, if one had an attribute with the contents
"-{}-foo-{}-", foo would get replaced by language converter as if
it wasn't in an attribute. This lead to an XSS attack.

This breaks doing manual conversions in url href's (or any
other attribute that goes through an escaping method
other than Sanitizer's). e.g. http://{sr-el:foo';sr-ec:bar}.com
won't work anymore. See also T87332

Bug: T119158
Change-Id: Idbc45cac12c309b0ccb4adeff6474fa527b48edb
---
M RELEASE-NOTES-1.27
M languages/LanguageConverter.php
M tests/parser/parserTests.txt
3 files changed, 40 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/mediawiki/core 
refs/changes/83/391383/1

diff --git a/RELEASE-NOTES-1.27 b/RELEASE-NOTES-1.27
index 79b8b98..38859b1 100644
--- a/RELEASE-NOTES-1.27
+++ b/RELEASE-NOTES-1.27
@@ -28,6 +28,7 @@
 * (T180237) SECURITY: Protect vendor folder with .htaccess.
 * (T180231) SECURITY: Remove PHPUnit file with known RCE if exists in 
update.php.
 * (T124404) SECURITY: XSS in langconverter when regex hits 
pcre.backtrack_limit.
+* (T119158) SECURITY: Handle -{}- syntax in attributes safely.
 
 == MediaWiki 1.27.3 ==
 Due to a packaging error, the wrong version of the SyntaxHighlight extension 
was
diff --git a/languages/LanguageConverter.php b/languages/LanguageConverter.php
index 4200978..87eb80d 100644
--- a/languages/LanguageConverter.php
+++ b/languages/LanguageConverter.php
@@ -377,9 +377,12 @@
$scriptfix = 
']*+>[^<]*+(?:(?:(?!<\/script>).)[^<]*+)*+<\/script>|';
// disable conversion of  tags
$prefix = 
']*+>[^<]*+(?:(?:(?!<\/pre>).)[^<]*+)*+<\/pre>|';
+   // The "|.*+)" at the end, is in case we missed some part of 
html syntax,
+   // we will fail securely (hopefully) by matching the rest of 
the string.
+   $htmlFullTag = 
'<(?:[^>=]*+(?>[^>=]*+=\s*+(?:"[^"]*"|\'[^\']*\'|[^\'">\s]*+))*+[^>=]*+>|.*+)|';
 
-   $reg = '/' . $codefix . $scriptfix . $prefix .
-   '<[^>]++>|&[a-zA-Z#][a-z0-9]++;' . $marker . $htmlfix . 
'|\004$/s';
+   $reg = '/' . $codefix . $scriptfix . $prefix . $htmlFullTag .
+   '&[a-zA-Z#][a-z0-9]++;' . $marker . $htmlfix . 
'|\004$/s';
$startPos = 0;
$sourceBlob = '';
$literalBlob = '';
@@ -660,29 +663,41 @@
$out = '';
$length = strlen( $text );
$shouldConvert = !$this->guessVariant( $text, $variant );
+   $continue = 1;
 
-   while ( $startPos < $length ) {
-   $pos = strpos( $text, '-{', $startPos );
+   $noScript = '.*?<\/script>(*SKIP)(*FAIL)';
+   $noStyle = '.*?<\/style>(*SKIP)(*FAIL)';
+   $noHtml = 
'<(?:[^>=]*+(?>[^>=]*+=\s*+(?:"[^"]*"|\'[^\']*\'|[^\'">\s]*+))*+[^>=]*+>|.*+)(*SKIP)(*FAIL)';
+   while ( $startPos < $length && $continue ) {
+   $continue = preg_match(
+   // Only match -{ outside of html.
+   "/$noScript|$noStyle|$noHtml|-\{/",
+   $text,
+   $m,
+   PREG_OFFSET_CAPTURE,
+   $startPos
+   );
 
-   if ( $pos === false ) {
+   if ( !$continue ) {
// No more markup, append final segment
$fragment = substr( $text, $startPos );
$out .= $shouldConvert ? $this->autoConvert( 
$fragment, $variant ) : $fragment;
return $out;
}
 
-   // Markup found
+   // Offset of the match of the regex pattern.
+   $pos = $m[0][1];
+
// Append initial segment
$fragment = substr( $text, $startPos, $pos - $startPos 
);
$out .= $shouldConvert ? $this->autoConvert( $fragment, 
$variant ) : $fragment;
-
-   // Advance position
+   // -{ marker found, not in attribute
+   // Advance position up to -{ marker.
$startPos = $pos;
-
// Do recursive conversion
+   // Note: This passes $startPos by reference, and 
advances it.
$out .= $this->recursiveConvertRule( $text, $variant, 
$startPos, $depth + 1 );
}
-
return $out;
}
 

[MediaWiki-commits] [Gerrit] mediawiki/core[REL1_27]: SECURITY: Handle -{}- syntax in attributes safely

2017-11-14 Thread Reedy (Code Review)
Reedy has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/391383 )

Change subject: SECURITY: Handle -{}- syntax in attributes safely
..


SECURITY: Handle -{}- syntax in attributes safely

Previously, if one had an attribute with the contents
"-{}-foo-{}-", foo would get replaced by language converter as if
it wasn't in an attribute. This lead to an XSS attack.

This breaks doing manual conversions in url href's (or any
other attribute that goes through an escaping method
other than Sanitizer's). e.g. http://{sr-el:foo';sr-ec:bar}.com
won't work anymore. See also T87332

Bug: T119158
Change-Id: Idbc45cac12c309b0ccb4adeff6474fa527b48edb
---
M RELEASE-NOTES-1.27
M languages/LanguageConverter.php
M tests/parser/parserTests.txt
3 files changed, 40 insertions(+), 10 deletions(-)



diff --git a/RELEASE-NOTES-1.27 b/RELEASE-NOTES-1.27
index 79b8b98..38859b1 100644
--- a/RELEASE-NOTES-1.27
+++ b/RELEASE-NOTES-1.27
@@ -28,6 +28,7 @@
 * (T180237) SECURITY: Protect vendor folder with .htaccess.
 * (T180231) SECURITY: Remove PHPUnit file with known RCE if exists in 
update.php.
 * (T124404) SECURITY: XSS in langconverter when regex hits 
pcre.backtrack_limit.
+* (T119158) SECURITY: Handle -{}- syntax in attributes safely.
 
 == MediaWiki 1.27.3 ==
 Due to a packaging error, the wrong version of the SyntaxHighlight extension 
was
diff --git a/languages/LanguageConverter.php b/languages/LanguageConverter.php
index 4200978..87eb80d 100644
--- a/languages/LanguageConverter.php
+++ b/languages/LanguageConverter.php
@@ -377,9 +377,12 @@
$scriptfix = 
']*+>[^<]*+(?:(?:(?!<\/script>).)[^<]*+)*+<\/script>|';
// disable conversion of  tags
$prefix = 
']*+>[^<]*+(?:(?:(?!<\/pre>).)[^<]*+)*+<\/pre>|';
+   // The "|.*+)" at the end, is in case we missed some part of 
html syntax,
+   // we will fail securely (hopefully) by matching the rest of 
the string.
+   $htmlFullTag = 
'<(?:[^>=]*+(?>[^>=]*+=\s*+(?:"[^"]*"|\'[^\']*\'|[^\'">\s]*+))*+[^>=]*+>|.*+)|';
 
-   $reg = '/' . $codefix . $scriptfix . $prefix .
-   '<[^>]++>|&[a-zA-Z#][a-z0-9]++;' . $marker . $htmlfix . 
'|\004$/s';
+   $reg = '/' . $codefix . $scriptfix . $prefix . $htmlFullTag .
+   '&[a-zA-Z#][a-z0-9]++;' . $marker . $htmlfix . 
'|\004$/s';
$startPos = 0;
$sourceBlob = '';
$literalBlob = '';
@@ -660,29 +663,41 @@
$out = '';
$length = strlen( $text );
$shouldConvert = !$this->guessVariant( $text, $variant );
+   $continue = 1;
 
-   while ( $startPos < $length ) {
-   $pos = strpos( $text, '-{', $startPos );
+   $noScript = '.*?<\/script>(*SKIP)(*FAIL)';
+   $noStyle = '.*?<\/style>(*SKIP)(*FAIL)';
+   $noHtml = 
'<(?:[^>=]*+(?>[^>=]*+=\s*+(?:"[^"]*"|\'[^\']*\'|[^\'">\s]*+))*+[^>=]*+>|.*+)(*SKIP)(*FAIL)';
+   while ( $startPos < $length && $continue ) {
+   $continue = preg_match(
+   // Only match -{ outside of html.
+   "/$noScript|$noStyle|$noHtml|-\{/",
+   $text,
+   $m,
+   PREG_OFFSET_CAPTURE,
+   $startPos
+   );
 
-   if ( $pos === false ) {
+   if ( !$continue ) {
// No more markup, append final segment
$fragment = substr( $text, $startPos );
$out .= $shouldConvert ? $this->autoConvert( 
$fragment, $variant ) : $fragment;
return $out;
}
 
-   // Markup found
+   // Offset of the match of the regex pattern.
+   $pos = $m[0][1];
+
// Append initial segment
$fragment = substr( $text, $startPos, $pos - $startPos 
);
$out .= $shouldConvert ? $this->autoConvert( $fragment, 
$variant ) : $fragment;
-
-   // Advance position
+   // -{ marker found, not in attribute
+   // Advance position up to -{ marker.
$startPos = $pos;
-
// Do recursive conversion
+   // Note: This passes $startPos by reference, and 
advances it.
$out .= $this->recursiveConvertRule( $text, $variant, 
$startPos, $depth + 1 );
}
-
return $out;
}
 
diff --git a/tests/parser/parserTests.txt b/tests/parser/parserTests.txt