Gehel has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/344044 )

Change subject: Update mwgrep for elasticsearch 5.x
......................................................................


Update mwgrep for elasticsearch 5.x

* The 'filtered' query has been removed in 5.x; use a plain bool query
  with a filter clause instead.
* We don't care about result order, so sort by _doc, which effectively
  skips the sorting step.
* Update cluster settings to allow querying > 1k shards. This limit
  exists to protect the cluster from bad queries, but querying every
  shard is our expected behaviour. It's unfortunate that it has to be
  global instead of specified in the query. This has already been
  applied to the transient cluster settings; the setting here ensures
  it is picked back up on restart.
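
For illustration, the two changes described above can be sketched in
Python. This is a minimal sketch, not the mwgrep source: the function
names build_search_body and build_transient_settings and the example
term filter are hypothetical, and the second function only builds the
request body one would send to the cluster settings endpoint
(PUT /_cluster/settings), it does not send anything.

```python
import json


def build_search_body(filters, max_results):
    """Build a 5.x search body that replaces the removed 'filtered' query."""
    return {
        'size': max_results,
        '_source': ['namespace', 'title'],
        # Sorting by _doc (index order) avoids the scoring/sorting work.
        'sort': ['_doc'],
        # The filter clauses now sit directly under a bool query.
        'query': {'bool': {'filter': filters}},
        'stats': ['mwgrep'],
    }


def build_transient_settings(limit):
    """Build a cluster settings body raising the shard count limit.

    Transient settings do not survive a full cluster restart, which is
    why the same value is also written to the node configuration.
    """
    return {'transient': {'action.search.shard_count.limit': limit}}


if __name__ == '__main__':
    # Hypothetical filter, for illustration only.
    example_filters = [{'term': {'namespace': 8}}]
    print(json.dumps(build_search_body(example_filters, 100), indent=2))
    print(json.dumps(build_transient_settings(5000)))
```

Keeping the filters in filter context means no relevance scores are
computed, which fits a tool that only needs the matching titles.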

Bug: T161055
Change-Id: I58ee998b5edd7914dd44a9acc0f15e92e7987b66
---
M hieradata/role/common/elasticsearch/cirrus.yaml
M modules/elasticsearch/manifests/init.pp
M modules/elasticsearch/templates/elasticsearch_5.yml.erb
M modules/scap/files/mwgrep
4 files changed, 13 insertions(+), 1 deletion(-)

Approvals:
  jenkins-bot: Verified
  DCausse: Looks good to me, but someone else must approve
  Gehel: Looks good to me, approved



diff --git a/hieradata/role/common/elasticsearch/cirrus.yaml b/hieradata/role/common/elasticsearch/cirrus.yaml
index 6392fbb..4343e25 100644
--- a/hieradata/role/common/elasticsearch/cirrus.yaml
+++ b/hieradata/role/common/elasticsearch/cirrus.yaml
@@ -25,3 +25,7 @@
 # once all elasticsearch clusters are upgraded to version 5, we will move
 # elasticsearch to our main repo and remove this configuration
 apt::use_experimental: true
+
+# mwgrep queries one copy of each shard in the cluster, which is currently just
+# over 3k shards. For it to work we need to increase the limit from default 1k
+elasticsearch::search_shard_count_limit: 5000
diff --git a/modules/elasticsearch/manifests/init.pp b/modules/elasticsearch/manifests/init.pp
index dd7540a..1b4b606 100644
--- a/modules/elasticsearch/manifests/init.pp
+++ b/modules/elasticsearch/manifests/init.pp
@@ -100,6 +100,7 @@
     $gc_log = true,
     $java_package = 'openjdk-8-jdk',
     $version = 5,
+    $search_shard_count_limit = 1000,
 ) {
 
     # Check arguments
diff --git a/modules/elasticsearch/templates/elasticsearch_5.yml.erb b/modules/elasticsearch/templates/elasticsearch_5.yml.erb
index 87aea61..5e77a83 100644
--- a/modules/elasticsearch/templates/elasticsearch_5.yml.erb
+++ b/modules/elasticsearch/templates/elasticsearch_5.yml.erb
@@ -360,6 +360,12 @@
 action.destructive_requires_name: true
 
 ##
+# Allow up to <%= @search_shard_count_limit %> shards to be queried at a time. The default
+# 1k is too low to allow mwgrep to operate.
+##
+action.search.shard_count.limit: <%= @search_shard_count_limit %>
+
+##
 # Enable the disk space aware shard allocator
 ##
 cluster.routing.allocation.disk.threshold_enabled: true
diff --git a/modules/scap/files/mwgrep b/modules/scap/files/mwgrep
index 0b4ce44..886c680 100755
--- a/modules/scap/files/mwgrep
+++ b/modules/scap/files/mwgrep
@@ -117,7 +117,8 @@
 search = {
     'size': args.max_results,
     '_source': ['namespace', 'title'],
-    'query': {'filtered': {'filter': {'bool': {'must': filters}}}},
+    'sort': ['_doc'],
+    'query': {'bool': {'filter':  filters}},
     'stats': ['mwgrep'],
 }
 

-- 
To view, visit https://gerrit.wikimedia.org/r/344044
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I58ee998b5edd7914dd44a9acc0f15e92e7987b66
Gerrit-PatchSet: 4
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: EBernhardson <[email protected]>
Gerrit-Reviewer: DCausse <[email protected]>
Gerrit-Reviewer: EBernhardson <[email protected]>
Gerrit-Reviewer: Gehel <[email protected]>
Gerrit-Reviewer: Giuseppe Lavagetto <[email protected]>
Gerrit-Reviewer: Krinkle <[email protected]>
Gerrit-Reviewer: jenkins-bot <>

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
