jenkins-bot has submitted this change and it was merged.

Change subject: 1.0 updates
......................................................................


1.0 updates

Everything works with Elasticsearch 0.90 and 1.0.  Dropping support for
0.90 will remove some of the weird constructs and might speed some things
up so we'll do that once we've updated WMF's cluster.

This commit also makes some changes to make running the browser tests on
vagrant simpler.  They are bundled together because that is how I tested
1.0 and untangling them is more trouble then it is worth because we want
both of them.

Bug: 61879

Change-Id: I1dc24be8350b3a9d7ec9aa4fc251368c7c4b7da2
---
M .gitignore
M includes/AnalysisConfigBuilder.php
M includes/ElasticsearchIntermediary.php
M includes/MappingConfigBuilder.php
M includes/Result.php
M includes/ResultsType.php
M includes/Searcher.php
M maintenance/updateOneSearchIndexConfig.php
M tests/browser/features/step_definitions/page_steps.rb
M tests/browser/features/step_definitions/search_steps.rb
M tests/browser/features/support/pages/article_page.rb
M tests/browser/features/update_weight.feature
12 files changed, 267 insertions(+), 93 deletions(-)

Approvals:
  Manybubbles: Looks good to me, approved
  jenkins-bot: Verified



diff --git a/.gitignore b/.gitignore
index 1dcf7a5..7c72671 100644
--- a/.gitignore
+++ b/.gitignore
@@ -2,6 +2,7 @@
 *~
 *.kate-swp
 .*.swp
+.indexed
 tests/browser/cucumber_failures.log
 tests/browser/.bundle
 tests/browser/screenshots
diff --git a/includes/AnalysisConfigBuilder.php 
b/includes/AnalysisConfigBuilder.php
index 8a0698c..f56144c 100644
--- a/includes/AnalysisConfigBuilder.php
+++ b/includes/AnalysisConfigBuilder.php
@@ -118,7 +118,7 @@
                                ),
                                'aggressive_splitting' => array(
                                        'type' => 'word_delimiter',
-                                       'stem_english_possessive' => 'false', 
// No need
+                                       'stem_english_possessive' => false, // 
No need
                                ),
                                'prefix_ngram_filter' => array(
                                        'type' => 'edgeNGram',
diff --git a/includes/ElasticsearchIntermediary.php 
b/includes/ElasticsearchIntermediary.php
index 8802521..e7b883c 100644
--- a/includes/ElasticsearchIntermediary.php
+++ b/includes/ElasticsearchIntermediary.php
@@ -115,6 +115,28 @@
        }
 
        /**
+        * Unwrap a result that we expect to be a single value.
+        * @param mixed $data from Elastica result
+        * @return mixed the single result
+        */
+       public static function singleValue( $result, $name ) {
+               $data = $result->__get( $name );
+               if ( $data === null ) {
+                       return null;
+               }
+               // Elasticsearch 0.90 returns single results as, well, single 
results
+               if ( !is_array( $data ) ) {
+                       return $data;
+               }
+               // Elasticsearch 1.0 returns them as single node arrays
+               $count = count( $data );
+               if ( $count !== 1 ) {
+                       wfLogWarning( "Expected a single result for $name but 
got $count." );
+               }
+               return $data[ 0 ];
+       }
+
+       /**
         * Does this status represent an Elasticsearch parse error?
         * @param $status Status to check
         * @return boolean is this a parse error?
diff --git a/includes/MappingConfigBuilder.php 
b/includes/MappingConfigBuilder.php
index cc37dcd..cc02143 100644
--- a/includes/MappingConfigBuilder.php
+++ b/includes/MappingConfigBuilder.php
@@ -80,11 +80,11 @@
 
                $config = array(
                        'dynamic' => false,
+                       '_all' => array( 'enabled' => false ),
                        'properties' => array(
                                'timestamp' => array(
                                        'type' => 'date',
                                        'format' => 'dateOptionalTime',
-                                       'include_in_all' => false,
                                ),
                                'namespace' => $this->buildLongField(),
                                'namespace_text' => $this->buildKeywordField(),
@@ -93,9 +93,8 @@
                                        $this->buildStringField( 'text', 
$textExtraAnalyzers ),
                                        array( 'fields' => array( 'word_count' 
=> array(
                                                'type' => 'token_count',
-                                               'store' => 'yes',
+                                               'store' => true,
                                                'analyzer' => 'plain',
-                                               'include_in_all' => false,
                                        ) ) )
                                ),
                                'file_text' => $this->buildStringField( 
'file_text', $textExtraAnalyzers ),
@@ -128,26 +127,24 @@
         * @return array definition of the field
         */
        public function buildStringField( $name, $extra = array(), $enableNorms 
= true ) {
-               $norms = array( 'enabled' => $enableNorms );
+               $norms = array( 'norms' => array( 'enabled' => $enableNorms ) );
+               // multi_field is dead in 1.0 so we do this which actually 
looks less gnarly.
                $field = array(
-                       'type' => 'multi_field',
+                       'type' => 'string',
+                       'analyzer' => 'text',
+                       'term_vector' => 'with_positions_offsets',
                        'fields' => array(
-                               $name => array(
-                                       'type' => 'string',
-                                       'analyzer' => 'text',
-                                       'term_vector' => 
'with_positions_offsets',
-                                       'include_in_all' => false,
-                                       'norms' => $norms,
-                               ),
                                'plain' => array(
                                        'type' => 'string',
                                        'analyzer' => 'plain',
                                        'term_vector' => 
'with_positions_offsets',
-                                       'include_in_all' => false,
-                                       'norms' => $norms,
                                ),
                        )
                );
+               if ( !$enableNorms ) {
+                       $field = array_merge( $field, $norms );
+                       $field[ 'fields' ][ 'plain' ] = array_merge( $field[ 
'fields' ][ 'plain' ], $norms );
+               }
                foreach ( $extra as $extraField ) {
                        if ( isset( $extraField[ 'analyzer' ] ) ) {
                                $extraName = $extraField[ 'analyzer' ];
@@ -156,9 +153,11 @@
                        }
                        $field[ 'fields' ][ $extraName ] = array_merge( array(
                                'type' => 'string',
-                               'include_in_all' => false,
-                               'norms' => $norms,
                        ), $extraField );
+                       if ( !$enableNorms ) {
+                               $field[ 'fields' ][ $extraName ] = array_merge(
+                                       $field[ 'fields' ][ $extraName ], 
$norms );
+                       }
                }
                return $field;
        }
@@ -171,7 +170,6 @@
                return array(
                        'type' => 'string',
                        'analyzer' => 'lowercase_keyword',
-                       'include_in_all' => false,
                        'norms' => array( 'enabled' => false ),  // Omit the 
length norm because there is only even one token
                        'index_options' => 'docs', // Omit the frequency and 
position information because neither are useful
                );
@@ -185,7 +183,6 @@
                return array(
                        'type' => 'string',
                        'analyzer' => 'keyword',
-                       'include_in_all' => false,
                        'norms' => array( 'enabled' => false ),  // Omit the 
length norm because there is only even one token
                        'index_options' => 'docs', // Omit the frequency and 
position information because neither are useful
                );
@@ -199,7 +196,6 @@
        public function buildLongField( $index = true ) {
                $config = array(
                        'type' => 'long',
-                       'include_in_all' => false,
                );
                if ( !$index ) {
                        $config[ 'index' ] = 'no';
diff --git a/includes/Result.php b/includes/Result.php
index f8f9227..c0f682c 100644
--- a/includes/Result.php
+++ b/includes/Result.php
@@ -46,7 +46,8 @@
                global $wgCirrusSearchShowScore;
 
                $this->maybeSetInterwiki( $result, $interwikis );
-               $this->mTitle = Title::makeTitle( $result->namespace, 
$result->title, '', $this->interwiki );
+               $this->mTitle = Title::makeTitle( 
ElasticsearchIntermediary::singleValue( $result, 'namespace' ),
+                       ElasticsearchIntermediary::singleValue( $result, 
'title' ), '', $this->interwiki );
                if ( $this->getTitle()->getNamespace() == NS_FILE ) {
                        $this->mImage = wfFindFile( $this->mTitle );
                }
@@ -54,8 +55,9 @@
                $data = $result->getData();
                // Not all results requested a word count. Just pretend we have 
none if so
                $this->wordCount = isset( $data['text.word_count'] ) ? 
$data['text.word_count'] : 0;
-               $this->byteSize = $result->text_bytes;
-               $this->timestamp = new MWTimestamp( $result->timestamp );
+               $this->byteSize = ElasticsearchIntermediary::singleValue( 
$result, 'text_bytes' );
+               $this->timestamp = ElasticsearchIntermediary::singleValue( 
$result, 'timestamp' );
+               $this->timestamp = new MWTimestamp( $this->timestamp );
                $highlights = $result->getHighlights();
                // TODO remove when Elasticsearch issue 3757 is fixed
                $highlights = $this->swapInPlainHighlighting( $highlights, 
'redirect.title' );
@@ -72,7 +74,24 @@
                }
                if ( !isset( $highlights[ 'title' ] ) && isset( $highlights[ 
'redirect.title' ] ) ) {
                        // Make sure to find the redirect title before escaping 
because escaping breaks it....
-                       $this->redirectTitle = $this->findRedirectTitle( 
$highlights[ 'redirect.title' ][ 0 ], $result->redirect );
+
+                       // This odd juggling is the second half of the script 
fields hack to get redirect loaded.
+                       // It'll go away when we switch to source filtering.
+                       $redirects = $result->redirect;
+                       wfDebugLog( 'CirrusSearch', "ASDFASDF  " . implode( ' 
', explode( "\n", var_export( $redirects, true ) ) ) );
+                       if ( $redirects !== null ) {
+                               // I not null it'll always be an array.
+                               // In Elasticsearch 0.90 it'll be an array of 
arrays which is what we need.
+                               // In Elasticsearch 1.0 it'll be an array of 
arrays of arrays where the outer most array
+                               // has only a single element which is exactly 
what would come back from 0.90.
+                               if ( count( $redirects ) !== 0 && 
!array_key_exists( 'title', $redirects[ 0 ] ) ) {
+                                       wfDebugLog( 'CirrusSearch', "1.0");
+                                       // Since the first entry doesn't have a 
title we assume we're in 1.0
+                                       $redirects = $redirects[ 0 ];
+                                       wfDebugLog( 'CirrusSearch', "ASDFASDF  
" . implode( ' ', explode( "\n", var_export( $redirects, true ) ) ) );
+                               }
+                       }
+                       $this->redirectTitle = $this->findRedirectTitle( 
$highlights[ 'redirect.title' ][ 0 ], $redirects );
                        $this->redirectSnipppet = $this->escapeHighlightedText( 
$highlights[ 'redirect.title' ][ 0 ] );
                } else {
                        $this->redirectSnipppet = '';
diff --git a/includes/ResultsType.php b/includes/ResultsType.php
index 4cea699..01486bd 100644
--- a/includes/ResultsType.php
+++ b/includes/ResultsType.php
@@ -23,6 +23,7 @@
  */
 interface ResultsType {
        function getFields();
+       function getScriptFields();
        function getHighlightingConfiguration();
        function transformElasticsearchResult( $suggestPrefixes, 
$suggestSuffixes,
                $result, $searchContainedSyntax );
@@ -45,6 +46,10 @@
 
        public function getFields() {
                return array( 'namespace', 'title' );
+       }
+
+       public function getScriptFields() {
+               return null;
        }
 
        public function getHighlightingConfiguration() {
@@ -75,7 +80,8 @@
                        $result, $searchContainedSyntax ) {
                $results = array();
                foreach( $result->getResults() as $r ) {
-                       $title = Title::makeTitle( $r->namespace, $r->title );
+                       $title = Title::makeTitle( 
ElasticsearchIntermediary::singleValue( $r, 'namespace' ),
+                               ElasticsearchIntermediary::singleValue( $r, 
'title' ) );
                        $highlights = $r->getHighlights();
 
                        // Now we have to use the highlights to figure out 
whether it was the title or the redirect
@@ -92,7 +98,7 @@
                                // Instead of getting the redirect's real 
namespace we're going to just use the namespace
                                // of the title.  This is not great but OK 
given that we can't find cross namespace
                                // redirects properly any way.
-                               $title = Title::makeTitle( $r->namespace, 
$redirectTitle );
+                               $title = Title::makeTitle( 
ElasticsearchIntermediary::singleValue( $r, 'namespace' ), $redirectTitle );
                        } else if ( !isset( $highlights[ 
"title.$this->matchedAnalyzer" ] ) ) {
                                // We're not really sure where the match came 
from so lets just pretend it was the title?
                                wfDebugLog( 'CirrusSearch', "Title search 
result type hit a match but we can't " .
@@ -109,8 +115,16 @@
 
 class FullTextResultsType implements ResultsType {
        public function getFields() {
-               return array( 'id', 'title', 'namespace', 'redirect', 
'timestamp',
-                       'text_bytes', 'text.word_count' );
+               return array( 'id', 'title', 'namespace',
+                       // 'redirect', was moved to a script field....
+                       'timestamp', 'text_bytes', 'text.word_count' );
+       }
+
+       public function getScriptFields() {
+               // Works for both 1.0 and 0.90.
+               // TODO remove this in favor of source filtering when we remove 
support for 0.90.
+               // see 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/1.x/search-request-source-filtering.html
+               return array( 'redirect' => new \Elastica\Script( '_source[ 
"redirect" ]' ) );
        }
 
        /**
@@ -200,4 +214,8 @@
        public function getFields() {
                return array( 'namespace', 'namespace_text', 'title' );
        }
+
+       public function getScriptFields() {
+               return null;
+       }
 }
diff --git a/includes/Searcher.php b/includes/Searcher.php
index 95f6b3f..97cfe5f 100644
--- a/includes/Searcher.php
+++ b/includes/Searcher.php
@@ -384,10 +384,15 @@
                                                $searchContainedSyntax = true;
                                                return '';
                                        case 'intitle':
-                                               $filterDestination[] = new 
\Elastica\Filter\Query( new \Elastica\Query\Field( 'title',
+                                               $query = new 
\Elastica\Query\QueryString(
                                                        
$searcher->fixupWholeQueryString(
                                                                
$searcher->fixupQueryStringPart( $value )
-                                                       ) ) );
+                                                       ) );
+                                               $query->setFields( array( 
'title' ) );
+                                               $query->setDefaultOperator( 
'AND' );
+                                               
$query->setAllowLeadingWildcard( false );
+                                               $query->setFuzzyPrefixLength( 2 
);
+                                               $filterDestination[] = new 
\Elastica\Filter\Query( $query );
                                                $searchContainedSyntax = true;
                                                return $keepText ? "$value " : 
'';
                                        default:
@@ -683,6 +688,10 @@
 
                $query = new Elastica\Query();
                $query->setFields( $this->resultsType->getFields() );
+               $scriptFields = $this->resultsType->getScriptFields();
+               if ( $scriptFields ) {
+                       $query->setScriptFields( $scriptFields );
+               }
 
                $extraIndexes = array();
                if ( $this->namespaces ) {
@@ -862,6 +871,11 @@
                                        array(
                                                'field' => $field,
                                                'suggest_mode' => 'always', // 
Forces us to generate lots of phrases to try.
+                                               // If a term appears in more 
then half the docs then don't try to correct it.  This really
+                                               // shouldn't kick in much 
because we're not looking for misspellings.  We're looking for phrases
+                                               // that can be might off.  Like 
"noble prize" ->  "nobel prize".  In any case, the default was
+                                               // 0.01 which way too 
frequently decided not to correct some terms.
+                                               'max_term_freq' => 0.5,
                                        ),
                                ),
                                'highlight' => array(
diff --git a/maintenance/updateOneSearchIndexConfig.php 
b/maintenance/updateOneSearchIndexConfig.php
index 0d34258..3bad9fa 100644
--- a/maintenance/updateOneSearchIndexConfig.php
+++ b/maintenance/updateOneSearchIndexConfig.php
@@ -88,10 +88,20 @@
        private $phraseUseText;
 
        /**
+        * @var bool print config as it is being checked
+        */
+       private $printDebugCheckConfig;
+
+       /**
         * @var float how much can the reindexed copy of an index is allowed to 
deviate from the current
         * copy without triggering a reindex failure
         */
        private $reindexAcceptableCountDeviation;
+
+       /**
+        * @var bool is this Elasticsearch 0.90.x?
+        */
+       private $elasticsearch90;
 
        public function __construct() {
                parent::__construct();
@@ -141,6 +151,8 @@
                        'rather than failing the whole reindex process.  
Defaults to 5.', false, true );
                $maintenance->addOption( 'baseName', 'What basename to use for 
all indexes, ' .
                        'defaults to wiki id', false, true );
+               $maintenance->addOption( 'debugCheckConfig', 'Print the 
configuration as it is checked ' .
+                       'to help debug unexepcted configuration missmatches.' );
        }
 
        public function execute() {
@@ -164,6 +176,7 @@
                        $this->getOption( 'reindexAcceptableCountDeviation', 
'5%' ) );
                $this->reindexChunkSize = $this->getOption( 'reindexChunkSize', 
100 );
                $this->reindexRetryAttempts = $this->getOption( 
'reindexRetryAttempts', 5 );
+               $this->printDebugCheckConfig = $this->getOption( 
'debugCheckConfig', false );
                $this->langCode = $wgLanguageCode;
                $this->aggressiveSplitting = 
$wgCirrusSearchUseAggressiveSplitting;
                $this->prefixSearchStartsWithAny = 
$wgCirrusSearchPrefixSearchStartsWithAnyWord;
@@ -175,6 +188,8 @@
                                $this->error( 'indexType option must be one of 
' .
                                        implode( ', ', $indexTypes ), 1 );
                        }
+
+                       $this->fetchElasticsearchVersion();
 
                        if ( $this->getOption( 'forceOpen', false ) ) {
                                $this->getIndex()->open();
@@ -201,6 +216,15 @@
                                "Message: $message\n" .
                                "Trace:\n" . $trace, 1 );
                }
+       }
+
+       private function fetchElasticsearchVersion() {
+               $this->output( $this->indent . 'Fetching Elasticsearch 
version...' );
+               $result = Connection::getClient()->request( '' );
+               $result = $result->getData();
+               $result = $result[ 'version' ][ 'number' ];
+               $this->elasticsearch90 = strpos( $result, '0.90.' ) !== false;
+               $this->output( ( $this->elasticsearch90 ? '' : 'after ' ) . 
"0.90\n" );
        }
 
        private function updateVersions() {
@@ -230,9 +254,8 @@
 
        private function validateIndexSettings() {
                $this->output( $this->indent . "\tValidating number of 
shards..." );
-               $settingsObject = $this->getIndex()->getSettings();
-               $settings = $settingsObject->get();
-               $actualShardCount = $settings[ 'index.number_of_shards' ];
+               $settings = $this->getSettings();
+               $actualShardCount = $settings[ 'index' ][ 'number_of_shards' ];
                if ( $actualShardCount == $this->getShardCount() ) {
                        $this->output( "ok\n" );
                } else {
@@ -245,23 +268,22 @@
                }
 
                $this->output( $this->indent . "\tValidating number of 
replicas..." );
-               $actualReplicaCount = $settings[ 'index.number_of_replicas' ];
+               $actualReplicaCount = $settings[ 'index' ][ 
'number_of_replicas' ];
                if ( $actualReplicaCount == $this->getReplicaCount() ) {
                        $this->output( "ok\n" );
                } else {
                        $this->output( "is $actualReplicaCount but should be " 
. $this->getReplicaCount() . '...' );
-                       $settingsObject->setNumberOfReplicas( 
$this->getReplicaCount() );
+                       $this->getIndex()->getSettings()->setNumberOfReplicas( 
$this->getReplicaCount() );
                        $this->output( "corrected\n" );
                }
        }
 
        private function validateAnalyzers() {
                $this->output( $this->indent . "Validating analyzers..." );
-               $settingsObject = $this->getIndex()->getSettings();
-               $settings = $settingsObject->get();
+               $settings = $this->getSettings();
                $analysisConfig = new AnalysisConfigBuilder( $this->langCode, 
$this->aggressiveSplitting );
                $requiredAnalyzers = $analysisConfig->buildConfig();
-               if ( $this->vaActualMatchRequired( 'index.analysis', $settings, 
$requiredAnalyzers ) ) {
+               if ( $this->checkConfig( $settings[ 'index' ][ 'analysis' ], 
$requiredAnalyzers ) ) {
                        $this->output( "ok\n" );
                } else {
                        $this->output( "different..." );
@@ -282,42 +304,37 @@
        }
 
        /**
-        * @param $prefix
-        * @param $settings
-        * @param $required array
-        * @return bool
+        * Load the settings array.  You can't use this to set the settings, 
use $this->getIndex()->getSettings() for that.
+        * @return array of settings always in Elasticsearch 1.0 format
         */
-       private function vaActualMatchRequired( $prefix, $settings, $required ) 
{
-               foreach( $required as $key => $value ) {
-                       $settingsKey = $prefix . '.' . $key;
-                       if ( is_array( $value ) ) {
-                               if ( !$this->vaActualMatchRequired( 
$settingsKey, $settings, $value ) ) {
-                                       return false;
-                               }
-                               continue;
-                       }
-                       // Note that I really mean !=, not !==.  Coercion is 
cool here.
-                       if ( !array_key_exists( $settingsKey, $settings ) || 
$settings[ $settingsKey ] != $value ) {
-                               return false;
-                       }
+       private function getSettings() {
+               $settings = $this->getIndex()->getSettings()->get();
+               if ( !$this->elasticsearch90 ) { // 1.0 compat
+                       // Looks like we're already in the new format already
+                       return $settings;
                }
-               return true;
+               $to = array();
+               foreach ( $settings as $key => $value ) {
+                       $current = &$to;
+                       foreach ( explode( '.', $key ) as $segment ) {
+                               $current = &$current[ $segment ];
+                       }
+                       $current = $value;
+               }
+               return $to;
        }
 
        private function validateMapping() {
-               $this->output( $this->indent . "Validating mappings...\n" );
-               $actualMappings = $this->getIndex()->getMapping();
-               $actualMappings = $actualMappings[ $this->getIndex()->getName() 
];
-
-               $this->output( $this->indent . "\tValidating mapping for page 
type..." );
+               $this->output( $this->indent . "Validating mappings..." );
                $requiredPageMappings = new MappingConfigBuilder(
                        $this->prefixSearchStartsWithAny, $this->phraseUseText 
);
                $requiredPageMappings = $requiredPageMappings->buildConfig();
-               if ( array_key_exists( 'page', $actualMappings ) &&
-                               $this->vmActualMatchRequired( $actualMappings[ 
'page' ], $requiredPageMappings ) ) {
-                       $this->output( "ok\n" );
-               } else {
-                       $this->output( "different..." );
+
+               if ( $this->elasticsearch90 ) {
+                       $requiredPageMappings = 
$this->transformMappingToElasticsearch90( $requiredPageMappings );
+               }
+
+               if ( !$this->checkMapping( $requiredPageMappings ) ) {
                        // TODO Conflict resolution here might leave old 
portions of mappings
                        $action = new \Elastica\Type\Mapping( 
$this->getPageType() );
                        foreach ( $requiredPageMappings as $key => $value ) {
@@ -334,37 +351,123 @@
                }
        }
 
+       private function transformMappingToElasticsearch90( $mapping ) {
+               foreach ( $mapping[ 'properties' ] as $name => $field ) {
+                       if ( array_key_exists( 'fields', $field ) ) {
+                               // Move the main field configuration to a 
subfield
+                               $field[ 'fields' ][ $name ] = $field;
+                               unset( $field[ 'fields' ][ $name ][ 'fields' ] 
);
+                               // Now clear everything from the "main" field 
that we've moved
+                               $mapping[ 'properties' ][ $name ] = array(
+                                       'type' => 'multi_field',
+                                       'fields' => $field[ 'fields' ],
+                               );
+                       }
+                       if ( array_key_exists( 'properties', $field ) ) {
+                               $mapping[ 'properties' ][ $name ] = 
$this->transformMappingToElasticsearch90(
+                                       $field );
+                       }
+               }
+               return $mapping;
+       }
+
+       /**
+        * Check that the mapping returned from Elasticsearch is as we want it.
+        * @param array $requiredPageMappings the mappings we want
+        * @return bool is the mapping good enough for us?
+        */
+       private function checkMapping( $requiredPageMappings ) {
+               $indexName = $this->getIndex()->getName();
+               $actualMappings = $this->getIndex()->getMapping();
+               if ( !array_key_exists( $indexName, $actualMappings ) ) {
+                       $this->output( "nonexistent...");
+                       return false;
+               }
+               $actualMappings = $actualMappings[ $indexName ];
+               if ( !$this->elasticsearch90 ) { // 1.0 compat
+                       $actualMappings = $actualMappings[ 'mappings' ];
+               }
+
+               $this->output( "\n" . $this->indent . "\tValidating mapping for 
page type..." );
+               if ( array_key_exists( 'page', $actualMappings ) &&
+                               $this->checkConfig( $actualMappings[ 'page' ], 
$requiredPageMappings ) ) {
+                       $this->output( "ok\n" );
+                       return true;
+               } else {
+                       $this->output( "different..." );
+                       return false;
+               }
+       }
+
        /**
         * @param $actual
         * @param $required array
         * @return bool
         */
-       private function vmActualMatchRequired( $actual, $required ) {
+       private function checkConfig( $actual, $required, $indent = null ) {
+               if ( $indent === null ) {
+                       $indent = $this->indent . "\t\t";
+               }
                foreach( $required as $key => $value ) {
+                       $this->debugCheckConfig( "\n$indent$key: " );
                        if ( !array_key_exists( $key, $actual ) ) {
+                               $this->debugCheckConfig( "not found..." );
+                               if ( $key === '_all' ) {
+                                       // The _all field never comes back so 
we just have to assume it
+                                       // is set correctly.
+                                       $this->debugCheckConfig( "was the all 
field so skipping..." );
+                                       continue;
+                               }
                                return false;
                        }
                        if ( is_array( $value ) ) {
+                               $this->debugCheckConfig( "descend..." );
                                if ( !is_array( $actual[ $key ] ) ) {
+                                       $this->debugCheckConfig( "other not 
array..." );
                                        return false;
                                }
-                               if ( !$this->vmActualMatchRequired( $actual[ 
$key ], $value ) ) {
+                               if ( !$this->checkConfig( $actual[ $key ], 
$value, $indent . "\t" ) ) {
                                        return false;
                                }
                                continue;
                        }
 
-                       if ( $actual[ $key ] === 'false' ) {
-                               $actual[ $key ] = false;
-                       }
+                       $actual[ $key ] = $this->normalizeConfigValue( $actual[ 
$key ] );
+                       $value = $this->normalizeConfigValue( $value );
+                       $this->debugCheckConfig( $actual[ $key ] . " ?? 
$value..." );
                        // Note that I really mean !=, not !==.  Coercion is 
cool here.
+                       // print $actual[ $key ] . "  $value\n";
                        if ( $actual[ $key ] != $value ) {
+                               $this->debugCheckConfig( 'different...' );
                                return false;
                        }
                }
                return true;
        }
 
+       /**
+        * Normalize a config value for comparison.  Elasticsearch will accept 
all kinds
+        * of config values but it tends to through back 'true' for true and 
'false' for
+        * false so we normalize everything.  Sometimes, oddly, it'll through 
back false
+        * for false....
+        * @param mixed $value config value
+        * @return mixes value normalized
+        */
+       private function normalizeConfigValue( $value ) {
+               if ( $value === true ) {
+                       return 'true';
+               } else if ( $value === false ) {
+                       return 'false';
+               }
+               return $value;
+       }
+
+       private function debugCheckConfig( $string ) {
+               if ( $this->printDebugCheckConfig ) {
+                       $this->output( $string );
+               }
+       }
+
        private function validateAlias() {
                $this->output( $this->indent . "Validating aliases...\n" );
                // Since validate the specific alias first as that can cause 
reindexing
diff --git a/tests/browser/features/step_definitions/page_steps.rb 
b/tests/browser/features/step_definitions/page_steps.rb
index 2a31d61..ff3cc64 100644
--- a/tests/browser/features/step_definitions/page_steps.rb
+++ b/tests/browser/features/step_definitions/page_steps.rb
@@ -80,20 +80,16 @@
   md5 = Digest::MD5.hexdigest(File.read(contents))
   md5_string = "md5: #{md5}"
   visit(ArticlePage, using_params: {page_name: title}) do |page|
-    if page.file_history? && page.file_last_comment? && 
page.file_last_comment.include?(md5_string)
+    if page.file_history? &&
+        page.file_last_comment? &&
+        page.file_last_comment.include?(md5_string)
       return
     end
-    if !(page.upload_new_version? || page.upload?)
+    if !page.upload?
       step "I am logged in"
-      visit(ArticlePage, using_params: {page_name: title})
     end
-    if page.upload?
-      # New file, upload it
-      page.upload
-    else
-      # Existing file, update it
-      page.upload_new_version
-    end
+    # If this doesn't exist then file uploading is probably disabled
+    page.upload
   end
   on(UploadFilePage) do |page|
     page.description = description + "\n" + md5_string
diff --git a/tests/browser/features/step_definitions/search_steps.rb 
b/tests/browser/features/step_definitions/search_steps.rb
index b121259..2362e77 100644
--- a/tests/browser/features/step_definitions/search_steps.rb
+++ b/tests/browser/features/step_definitions/search_steps.rb
@@ -37,10 +37,16 @@
       if page.simple_search_button? then
         page.simple_search_button
       else
-        #Since there isn't a simple search button on this page we're going to 
have
-        #to use the "containing..." drop down....
-        on(SearchPage).search_special_element.when_present.should exist
-        on(SearchPage).search_special_element.click
+        #Since there isn't a search button on this page we're going to have
+        #to use the "containing..." drop down....  We can't even click the
+        #search_button because that is a "go" search and we need one that won't
+        #drop us directly to the page on a perfect match
+
+        #I have no idea why, but just clicking on the element isn't good enough
+        #so we deploy this hack copied from mediawiki.searchSuggest.js
+        page.execute_script("$( '\#searchInput' ).closest( 'form' )
+          .append( $( '<input type=\"hidden\" name=\"fulltext\" value=\"1\"/>' 
) );")
+        page.search_button
       end
     end
   end
diff --git a/tests/browser/features/support/pages/article_page.rb 
b/tests/browser/features/support/pages/article_page.rb
index ddee828..5d26e9c 100644
--- a/tests/browser/features/support/pages/article_page.rb
+++ b/tests/browser/features/support/pages/article_page.rb
@@ -5,9 +5,8 @@
 
   h1(:title, id: "firstHeading")
   table(:file_history, :class => "filehistory")
-  cell(:file_last_comment){ table_element(:class => "filehistory")[1][5] }
-  link(:upload, text: "upload it")
-  link(:upload_new_version, text: "Upload a new version of this file")
+  cell(:file_last_comment){ |page| 
page.file_history_element[1][page.file_history_element[1].columns - 1] }
+  link(:upload, text: "Upload file")
   link(:create_link, text: "Create")
   link(:create_source_link, text: "Create source")
 end
diff --git a/tests/browser/features/update_weight.feature 
b/tests/browser/features/update_weight.feature
index ea4a0f4..5054e47 100644
--- a/tests/browser/features/update_weight.feature
+++ b/tests/browser/features/update_weight.feature
@@ -12,7 +12,7 @@
     Then within 10 seconds searching for WeightedLink%{epoch} yields 
WeightedLink%{epoch} 2 as the first result
     When a page named WeightedLink%{epoch} 1/1 exists with contents 
[[WeightedLink%{epoch} 1]]
     And a page named WeightedLink%{epoch} 1/2 exists with contents 
[[WeightedLink%{epoch} 1]]
-    Then within 10 seconds searching for WeightedLink%{epoch} yields 
WeightedLink%{epoch} 1 as the first result
+    Then within 20 seconds searching for WeightedLink%{epoch} yields 
WeightedLink%{epoch} 1 as the first result
 
   Scenario: Pages weights are updated when links are removed from them
     Given a page named WeightedLinkRemoveUpdate%{epoch} 1/1 exists with 
contents [[WeightedLinkRemoveUpdate%{epoch} 1]]
@@ -23,7 +23,7 @@
     Then within 10 seconds searching for WeightedLinkRemoveUpdate%{epoch} 
yields WeightedLinkRemoveUpdate%{epoch} 1 as the first result
     When a page named WeightedLinkRemoveUpdate%{epoch} 1/1 exists with 
contents [[Junk]]
     And a page named WeightedLinkRemoveUpdate%{epoch} 1/2 exists with contents 
[[Junk]]
-    Then within 10 seconds searching for WeightedLinkRemoveUpdate%{epoch} 
yields WeightedLinkRemoveUpdate%{epoch} 2 as the first result
+    Then within 20 seconds searching for WeightedLinkRemoveUpdate%{epoch} 
yields WeightedLinkRemoveUpdate%{epoch} 2 as the first result
 
   Scenario: Pages weights are updated when new pages link to their redirects
     Given a page named WeightedLinkRdir%{epoch} 1/Redirect exists with 
contents #REDIRECT [[WeightedLinkRdir%{epoch} 1]]
@@ -34,7 +34,7 @@
     Then within 10 seconds searching for WeightedLinkRdir%{epoch} yields 
WeightedLinkRdir%{epoch} 2 as the first result
     When a page named WeightedLinkRdir%{epoch} 1/1 exists with contents 
[[WeightedLinkRdir%{epoch} 1/Redirect]]
     And a page named WeightedLinkRdir%{epoch} 1/2 exists with contents 
[[WeightedLinkRdir%{epoch} 1/Redirect]]
-    Then within 10 seconds searching for WeightedLinkRdir%{epoch} yields 
WeightedLinkRdir%{epoch} 1 as the first result
+    Then within 20 seconds searching for WeightedLinkRdir%{epoch} yields 
WeightedLinkRdir%{epoch} 1 as the first result
 
   Scenario: Pages weights are updated when links are removed from their 
redirects
     Given a page named WLRURdir%{epoch} 1/1 exists with contents 
[[WLRURdir%{epoch} 1/Redirect]]
@@ -47,7 +47,7 @@
     Then within 10 seconds searching for WLRURdir%{epoch} yields 
WLRURdir%{epoch} 1 as the first result
     When a page named WLRURdir%{epoch} 1/1 exists with contents [[Junk]]
     And a page named WLRURdir%{epoch} 1/2 exists with contents [[Junk]]
-    Then within 10 seconds searching for WLRURdir%{epoch} yields 
WLRURdir%{epoch} 2 as the first result
+    Then within 20 seconds searching for WLRURdir%{epoch} yields 
WLRURdir%{epoch} 2 as the first result
 
   Scenario: Redirects to redirects don't count in the score
     Given a page named WLDoubleRdir%{epoch} 1/Redirect exists with contents 
#REDIRECT [[WLDoubleRdir%{epoch} 1]]
@@ -59,4 +59,4 @@
     And a page named WLDoubleRdir%{epoch} 2/1 exists with contents 
[[WLDoubleRdir%{epoch} 2/Redirect]]
     And a page named WLDoubleRdir%{epoch} 2/2 exists with contents 
[[WLDoubleRdir%{epoch} 2/Redirect]]
     And a page named WLDoubleRdir%{epoch} 2 exists
-    When within 10 seconds searching for WLDoubleRdir%{epoch} yields 
WLDoubleRdir%{epoch} 2 as the first result
+    When within 20 seconds searching for WLDoubleRdir%{epoch} yields 
WLDoubleRdir%{epoch} 2 as the first result

-- 
To view, visit https://gerrit.wikimedia.org/r/115215
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I1dc24be8350b3a9d7ec9aa4fc251368c7c4b7da2
Gerrit-PatchSet: 8
Gerrit-Project: mediawiki/extensions/CirrusSearch
Gerrit-Branch: master
Gerrit-Owner: Manybubbles <never...@wikimedia.org>
Gerrit-Reviewer: Chad <ch...@wikimedia.org>
Gerrit-Reviewer: Hashar <has...@free.fr>
Gerrit-Reviewer: Manybubbles <never...@wikimedia.org>
Gerrit-Reviewer: jenkins-bot <>

_______________________________________________
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits

Reply via email to