[MediaWiki-commits] [Gerrit] [Cargo] #cargo_query Fix issues with quotes and other parsing - change (mediawiki...Cargo)

2015-12-13 Thread Ed Hoo (Code Review)
Ed Hoo has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/258753

Change subject: [Cargo] #cargo_query Fix issues with quotes and other parsing
..

[Cargo] #cargo_query Fix issues with quotes and other parsing

Apologies, the description of the changes will be provided as a separate 
comment due to commit issues with git

Change-Id: I56e9bee6237e1bfd66889bf9a63b2b3216ebd2d3
---
M CargoSQLQuery.php
M CargoUtils.php
2 files changed, 158 insertions(+), 118 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/mediawiki/extensions/Cargo refs/changes/53/258753/1

diff --git a/CargoSQLQuery.php b/CargoSQLQuery.php
index defc8ba..e1eaf17 100644
--- a/CargoSQLQuery.php
+++ b/CargoSQLQuery.php
@@ -95,19 +95,18 @@
'/\-\-/' => '--',
'/#/' => '#',
);
-   // HTML-decode the string - this is necessary if the query
-   // contains a call to {{PAGENAME}} and the page name has any
-   // special characters, because {{PAGENAME]] unfortunately
-   // HTML-encodes the value, which leads to a '#' in the string.
-   $decodedWhereStr = html_entity_decode( $whereStr );
foreach ( $whereStrRegexps as $regexp => $displayString ) {
-   if ( preg_match( $regexp, $decodedWhereStr ) ) {
+   if ( preg_match( $regexp, $whereStr ) ) {
throw new MWException( "Error in \"where\" parameter: the string \"$displayString\" cannot be used within #cargo_query." );
}
}
-   $simplifiedWhereStr = str_replace( array( '\"', "\'" ), '', $whereStr );
-   $simplifiedWhereStr = preg_replace( '/"[^"]*"/', '', $simplifiedWhereStr );
-   $simplifiedWhereStr = preg_replace( "/'[^']*'/", '', $simplifiedWhereStr );
+   $noQuotesPattern = '/([\'"]).*?\1/';
+   $noQuotesFieldsStr = preg_replace( $noQuotesPattern, '', $fieldsStr );
+   $noQuotesWhereStr = preg_replace( $noQuotesPattern, '', $whereStr );
+   $noQuotesJoinOnStr = preg_replace( $noQuotesPattern, '', $joinOnStr );
+   $noQuotesGroupByStr = preg_replace( $noQuotesPattern, '', $groupByStr );
+   $noQuotesHavingStr = preg_replace( $noQuotesPattern, '', $havingStr );
+   $noQuotesOrderByStr = preg_replace( $noQuotesPattern, '', $orderByStr );
 
$regexps = array(
'/\bselect\b/i' => 'SELECT',
@@ -123,22 +122,22 @@
);
foreach ( $regexps as $regexp => $displayString ) {
if ( preg_match( $regexp, $tablesStr ) ||
-   preg_match( $regexp, $fieldsStr ) ||
-   preg_match( $regexp, $simplifiedWhereStr ) ||
-   preg_match( $regexp, $joinOnStr ) ||
-   preg_match( $regexp, $groupByStr ) ||
-   preg_match( $regexp, $havingStr ) ||
-   preg_match( $regexp, $orderByStr ) ||
+   preg_match( $regexp, $noQuotesFieldsStr ) ||
+   preg_match( $regexp, $noQuotesWhereStr ) ||
+   preg_match( $regexp, $noQuotesJoinOnStr ) ||
+   preg_match( $regexp, $noQuotesGroupByStr ) ||
+   preg_match( $regexp, $noQuotesHavingStr ) ||
+   preg_match( $regexp, $noQuotesOrderByStr ) ||
preg_match( $regexp, $limitStr ) ) {
throw new MWException( "Error: the string \"$displayString\" cannot be used within #cargo_query." );
}
}
 
-   self::getAndValidateSQLFunctions( $simplifiedWhereStr );
-   self::getAndValidateSQLFunctions( $joinOnStr );
-   self::getAndValidateSQLFunctions( $groupByStr );
-   self::getAndValidateSQLFunctions( $havingStr );
-   self::getAndValidateSQLFunctions( $orderByStr );
+   self::getAndValidateSQLFunctions( $noQuotesWhereStr );
+   self::getAndValidateSQLFunctions( $noQuotesJoinOnStr );
+   self::getAndValidateSQLFunctions( $noQuotesGroupByStr );
+   self::getAndValidateSQLFunctions( $noQuotesHavingStr );
+   self::getAndValidateSQLFunctions( $noQuotesOrderByStr );
self::getAndValidateSQLFunctions( $limitStr );
}
 
@@ -340,7 +339,7 @@
global $wgCargoAllowedSQLFunctions;
 
$sqlFunctionMatches = array();
-   $sqlFunctionRegex = '/(\b|\W)(\w*?)\s?\(/';
+   $sqlFunctionRegex = '/(\b|\W)(\w*?)\s*\(/';
preg_match_all( $s

[MediaWiki-commits] [Gerrit] [Cargo] #cargo_query Fix issues with quotes and other parsing - change (mediawiki...Cargo)

2015-12-20 Thread Umherirrender (Code Review)
Umherirrender has submitted this change and it was merged.

Change subject: [Cargo] #cargo_query Fix issues with quotes and other parsing
..


[Cargo] #cargo_query Fix issues with quotes and other parsing

CargoUtils.php
==============

(1) Fix smartSplit so that parentheses, separators and "the other quote"
(a single quote in a double-quoted string, or a double quote in a
single-quoted string) inside a quoted string are treated as literal
characters rather than as syntax.

(2) Add 3 functions to CargoUtils that build regular expressions for
matching table names and field names, allowing '$' as part of the
identifier.
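
The behaviour described in (1) can be sketched as follows. This is a
minimal illustration, not the smartSplit code from the patch: the
function name quoteAwareSplit is hypothetical, the delimiter is assumed
to be a single character, and backslash-escaped quotes are not handled.

```php
<?php
/**
 * Hypothetical sketch, not Cargo's smartSplit().
 * Splits $str on a one-character $delimiter, ignoring delimiters that
 * appear inside quoted strings or inside parentheses.
 */
function quoteAwareSplit( $delimiter, $str ) {
	$parts = array();
	$current = '';
	$openQuote = null; // quote character that opened the current string, if any
	$depth = 0;        // parenthesis nesting depth outside quoted strings
	$len = strlen( $str );
	for ( $i = 0; $i < $len; $i++ ) {
		$c = $str[$i];
		if ( $openQuote !== null ) {
			// Inside a quoted string only the matching quote is special;
			// parentheses, delimiters and the other quote are literal.
			if ( $c === $openQuote ) {
				$openQuote = null;
			}
		} elseif ( $c === '"' || $c === "'" ) {
			$openQuote = $c;
		} elseif ( $c === '(' ) {
			$depth++;
		} elseif ( $c === ')' ) {
			$depth = max( 0, $depth - 1 );
		} elseif ( $c === $delimiter && $depth === 0 ) {
			// Top-level delimiter: close the current part.
			$parts[] = trim( $current );
			$current = '';
			continue;
		}
		$current .= $c;
	}
	if ( trim( $current ) !== '' ) {
		$parts[] = trim( $current );
	}
	return $parts;
}

$parts = quoteAwareSplit( ',', "a, 'b,(c\"', CONCAT(d, e), f" );
// $parts has four elements: a | 'b,(c"' | CONCAT(d, e) | f
```

Tracking which quote character opened the string is what keeps the
state from getting out of sync when the other quote shows up inside it.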


CargoSQLQuery.php
=================

(3) Change the quote elimination logic in validateValues so that "the
other quote" inside a quoted string is treated as a literal character.
Earlier, double quotes were converted into single quotes, so if a double
quote appeared inside a single-quoted string (or the other way around),
things would get out of sync.

(4) Eliminate quoted strings before all checks within validateValues,
for every parameter except $limitStr, i.e. $fieldsStr, $whereStr,
$joinOnStr, $groupByStr, $havingStr and $orderByStr.

(5) Fix sqlFunctionRegex within getAndValidateSQLFunctions to allow
multiple whitespace characters ('\s*' instead of '\s?') between the
function name and the parenthesis.

(6) Change setDescriptionsForFields, handleVirtualFields and
addTablePrefixes to use more resilient regexes when matching
identifiers.

(7) Bug T120583: Change the HOLDS check so that, when the field that
allows multiple values is "book", it matches the field "book" but not
the field "bookworm". Make the match case-insensitive, since table and
column names are usually not case-sensitive.

(8) Change the decomposition of table / field from explode('.') to a
regex, adding resilience.

(9) Make the string literal identification in setDescriptionsForFields
non-greedy.

(10) Add support for double-quoted literal strings in
setDescriptionsForFields.

(11) Fix an issue where quoted strings were being scanned for function
calls, throwing an exception when an open parenthesis followed a word
inside a quoted string.
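
Points (3), (5), (7) and (11) can be illustrated with a short sketch.
The two regexes in the first two helpers are the ones visible in the
diff for this change ($noQuotesPattern in patch set 1 and
$sqlFunctionRegex). The helper names, and the HOLDS pattern in the
third helper, are assumptions made for illustration and are not taken
from the Cargo code.

```php
<?php
// Illustrative helpers only; the function names are hypothetical.

// Remove quoted strings. The pattern matches an opening quote, a
// non-greedy run of characters, then the same quote again (\1), so the
// other kind of quote inside the string is skipped over.
// Backslash-escaped quotes are not handled here.
function stripQuotedStrings( $str ) {
	return preg_replace( '/([\'"]).*?\1/', '', $str );
}

// Collect the words that directly precede "(", allowing any amount of
// whitespace in between (\s* rather than \s?).
function findFunctionNames( $str ) {
	$matches = array();
	preg_match_all( '/(\b|\W)(\w*?)\s*\(/', $str, $matches );
	$names = array();
	foreach ( $matches[2] as $name ) {
		// A bare "(" with no word before it yields an empty capture.
		if ( $name !== '' ) {
			$names[] = strtoupper( $name );
		}
	}
	return $names;
}

// Assumed pattern for the HOLDS check: whole-word and case-insensitive,
// so "book" matches "Book holds" but not "bookworm HOLDS".
function fieldIsUsedWithHolds( $whereStr, $fieldName ) {
	$pattern = '/\b' . preg_quote( $fieldName, '/' ) . '\s+HOLDS\b/i';
	return preg_match( $pattern, $whereStr ) === 1;
}

$where = "title = 'it\"s (odd)' AND UPPER  (name) = \"O'Neil (x)\"";
$functionNames = findFunctionNames( stripQuotedStrings( $where ) );
// $functionNames is array( 'UPPER' )
```

Without the stripping step, the words in front of the parentheses
inside the two quoted strings would also be reported as function names,
which is the failure described in (11).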

Change-Id: I56e9bee6237e1bfd66889bf9a63b2b3216ebd2d3
---
M CargoSQLQuery.php
M CargoUtils.php
2 files changed, 198 insertions(+), 112 deletions(-)

Approvals:
  Umherirrender: Verified; Looks good to me, approved
  Yaron Koren: Checked; Looks good to me, approved



diff --git a/CargoSQLQuery.php b/CargoSQLQuery.php
index b406a2f..69c30ea 100644
--- a/CargoSQLQuery.php
+++ b/CargoSQLQuery.php
@@ -95,6 +95,7 @@
'/\-\-/' => '--',
'/#/' => '#',
);
+
// HTML-decode the string - this is necessary if the query
// contains a call to {{PAGENAME}} and the page name has any
// special characters, because {{PAGENAME]] unfortunately
@@ -105,9 +106,12 @@
throw new MWException( "Error in \"where\" parameter: the string \"$displayString\" cannot be used within #cargo_query." );
}
}
-   $simplifiedWhereStr = str_replace( array( '\"', "\'" ), '', $whereStr );
-   $simplifiedWhereStr = preg_replace( '/"[^"]*"/', '', $simplifiedWhereStr );
-   $simplifiedWhereStr = preg_replace( "/'[^']*'/", '', $simplifiedWhereStr );
+   $noQuotesFieldsStr = CargoUtils::removeQuotedStrings( $fieldsStr );
+   $noQuotesWhereStr = CargoUtils::removeQuotedStrings( $whereStr );
+   $noQuotesJoinOnStr = CargoUtils::removeQuotedStrings( $joinOnStr );
+   $noQuotesGroupByStr = CargoUtils::removeQuotedStrings( $groupByStr );
+   $noQuotesHavingStr = CargoUtils::removeQuotedStrings( $havingStr );
+   $noQuotesOrderByStr = CargoUtils::removeQuotedStrings( $orderByStr );
 
$regexps = array(
'/\bselect\b/i' => 'SELECT',
@@ -123,22 +127,22 @@
);
foreach ( $regexps as $regexp => $displayString ) {
if ( preg_match( $regexp, $tablesStr ) ||
-   preg_match( $regexp, $fieldsStr ) ||
-   preg_match( $regexp, $simplifiedWhereStr ) ||
-   preg_match( $regexp, $joinOnStr ) ||
-   preg_match( $regexp, $groupByStr ) ||
-   preg_match( $regexp, $havingStr ) ||
-   preg_match( $regexp, $orderByStr ) ||
+   preg_match( $regexp, $noQuotesFieldsStr ) ||
+   preg_match( $regexp, $noQuotesWhereStr ) ||
+   preg_match( $regexp, $noQuotesJoinOnStr ) ||
+   preg_match( $regexp, $noQuotesGroupByStr ) ||
+   preg_match( $regexp, $noQuotesHavingStr ) ||
+   p