On Fri, Nov 2, 2018 at 3:51 AM Hogan (US), Michael C <
michael.c.hog...@boeing.com> wrote:

> Can anyone point me to a starting point for learning about how to tune
> CirrusSearch (or examples)? I found the CirrusSearchScoreBuilder page [1],
> which implies it is possible to modify how search results are ranked. But,
> the documentation page hasn't been created yet. Thank you!
>

Hi,

There are many ways to tune the ranking of search results.
The hook you mention is designed for extensions that want to tune
everything related to the search query itself. I strongly discourage using
it: it is highly experimental and will be removed in the future.

To understand how Cirrus scores documents, I suggest starting with this
documentation [2].
You can then tune the retrieval query by defining a profile in the
$wgCirrusSearchFullTextQueryBuilderProfiles config array.
For example:
$wgCirrusSearchFullTextQueryBuilderProfiles = [
    'my_custom_profile' => [
        'builder_class' => \CirrusSearch\Query\FullTextSimpleMatchQueryBuilder::class,
        'settings' => [
            'default_min_should_match' => '1',
            'default_query_type' => 'most_fields',
            'default_stem_weight' => 3.0,
            // Per-field weights: either a plain boost or an array of options.
            'fields' => [
                'title' => 0.3,
                // Fields that share an 'in_dismax' name are grouped together.
                'redirect.title' => [
                    'boost' => 0.27,
                    'in_dismax' => 'redirects_or_shingles',
                ],
                'suggest' => [
                    'is_plain' => true,
                    'boost' => 0.20,
                    'in_dismax' => 'redirects_or_shingles',
                ],
                'category' => 0.05,
                'heading' => 0.05,
                'text' => [
                    'boost' => 0.6,
                    'in_dismax' => 'text_and_opening_text',
                ],
                'opening_text' => [
                    'boost' => 0.5,
                    'in_dismax' => 'text_and_opening_text',
                ],
                'auxiliary_text' => 0.05,
                'file_text' => 0.5,
            ],
            'phrase_rescore_fields' => [
                'all' => 0.06,
                'all.plain' => 0.1,
            ],
        ],
    ],
];

Then activate it by default, using the name of the profile defined above:
$wgCirrusSearchFullTextQueryBuilderProfile = "my_custom_profile";

Please see [3] for more documentation on the various settings.

Tuning the query-independent signals (the rescoring part in the doc) works
similarly: you declare a profile and activate it by default.
The config var to add a new profile is $wgCirrusSearchRescoreProfiles, and
you can add more by following these examples [4].
The config var to change the default rescore profile is
$wgCirrusSearchRescoreProfile.
Rescore profiles internally use "rescore function chains", which can be
tuned as well using $wgCirrusSearchRescoreFunctionChains [5].
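
As a rough illustration of the three rescore config vars, here is a minimal
sketch of a custom function chain and a rescore profile that uses it. The
names my_chain and my_rescore_profile are made up, and the keys and function
types are modelled from memory on the bundled files [4] and [5], so please
check them against those files before relying on this:

```php
// Sketch only: 'my_chain' and 'my_rescore_profile' are hypothetical names.
// Keys and 'type' values are assumed to match the bundled profiles in [4] and [5].

// A function chain lists the query-independent signals to combine.
$wgCirrusSearchRescoreFunctionChains = [
    'my_chain' => [
        'functions' => [
            [ 'type' => 'boostlinks' ],
            [ 'type' => 'recency' ],
            [ 'type' => 'templates' ],
            [ 'type' => 'namespaces' ],
        ],
    ],
];

// A rescore profile applies one or more rescore steps to the top results.
$wgCirrusSearchRescoreProfiles = [
    'my_rescore_profile' => [
        'supported_namespaces' => 'all',
        'rescore' => [
            [
                'window' => 8192,
                'query_weight' => 1.0,
                'rescore_query_weight' => 1.0,
                'score_mode' => 'multiply',
                'type' => 'function_score',
                'function_chain' => 'my_chain',
            ],
        ],
    ],
];

// Make the custom rescore profile the default.
$wgCirrusSearchRescoreProfile = "my_rescore_profile";
```

Starting from a copy of one of the bundled profiles and changing one value at
a time is probably the safest way to see what each setting does.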

I'm sorry if this is a bit dense, and for the lack of comprehensive
documentation. I suggest having a look at the Elasticsearch documentation
as well, since many concepts here are related to Elasticsearch features
(dismax, rescoring, function score, ...).
We also have some integration with the LTR (learning to rank) plugin [6].

Please let me know if you have specific questions or a specific problem, so
that I can point you in a specific direction instead of having you digest
all of this.

Thank you.

[2] https://www.mediawiki.org/wiki/Extension:CirrusSearch/Scoring
[3]
https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/CirrusSearch/+/master/profiles/FullTextQueryBuilderProfiles.config.php#39
[4]
https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/CirrusSearch/+/master/profiles/RescoreProfiles.config.php
[5]
https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/CirrusSearch/+/master/profiles/RescoreFunctionChains.config.php
[6] https://github.com/o19s/elasticsearch-learning-to-rank
