tl;dr: Search continues to expand its functionality by displaying more
information on the search results page.

Ever started searching for something on Wikipedia and wondered—*really*, is
that all that there is? Does it feel like you’re somehow playing hide and
seek with all the knowledge that’s out there? And...wouldn’t it be great to
see articles or categories that are similar to your search query and maybe
some related images or links to other languages in which to read that
article? Or, maybe you just want to read and contribute to projects other
than Wikipedia but need a jump start with a few short summaries from sister
projects.

The Discovery Search team has been testing out some really cool new
features that will enable some fun and fascinating clicking—down the rabbit
hole of Wikipedia.[1] But first, let’s recap what we’ve been doing recently.

We've been doing tons of work creating, updating, and finessing the search
back end to enhance search queries. A lot of complex changes have gone in,
including: adding ASCII folding and stemming, detecting when a visitor
might be typing in a language that is different from that of the Wikipedia
they are on, switching from tf-idf to BM25, dropping trailing question
marks, and updating to Elasticsearch version 5.[2][3][4][5][6][7] Whew!
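
For anyone curious what two of those changes mean in practice, here is a
minimal Python sketch of ASCII folding and of dropping trailing question
marks. It is only an illustration: the function names are made up for this
example, and the real work happens inside the search back end's analysis
chain, which handles many more cases than Unicode decomposition does (for
example, letters such as "ø" that have no decomposed form).

```python
import unicodedata


def ascii_fold(text: str) -> str:
    """Strip combining accents so that 'café' and 'cafe' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def drop_trailing_question_marks(query: str) -> str:
    """Remove question marks and whitespace from the end of a query."""
    stripped = query.rstrip("? \t\n")
    # Keep the original if the query was nothing but question marks.
    return stripped if stripped else query


def normalize_query(query: str) -> str:
    """Hypothetical helper: apply both steps to a raw query string."""
    return ascii_fold(drop_trailing_question_marks(query))


print(normalize_query("Où est le café ??"))  # prints "Ou est le cafe"
```

The point of both steps is the same: a query typed as a question, or
without accents, should still match the article text it is most likely
aiming for.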

We have much more planned in the coming months—machine learning with
‘learning to rank’, investigating and deploying new language analyzers,
and, after exhaustive analysis, removing quotes within queries by
default.[8][9][10][11] We’ll also be working closely with the new
Structured Data team in their brand new work on Commons.[12][13]

We also want to improve the part that our readers and editors interface
with: the search results page! We started brainstorming during the late
summer of 2016 on what we could do to make search results better—to easily
find interesting, relevant content and to create a more intuitive viewing
experience.[14] We designed and refined numerous ideas on how to improve
the search results page and received lots of good feedback from the
community.[15]

Empowered by the feedback, we began testing, starting with a display of
results from the Wikimedia sister projects next to the regular search
results.[16] The idea for this test was to enable discovery into other
projects—projects that our visitors might not have known about—by
displaying interesting results in small snippets. The sidebar display of
the sister projects borrows from a similar feature in use on the Italian,
Catalan and French Wikipedias. We've run two A/B tests on the sister
project search results, each with detailed analysis, and, after a few final
touches to the code, we will release the new functionality into production
on all Wikipedias near the end of April 2017.

Our next A/B test will add additional information and related results
for each search query. This will take the form of an ‘explore similar’
link: when someone interacts with it, an expanded display will appear
with related pages, categories and links to the article in other
languages, all of which might lead to further knowledge discovery.[17] We
know that not every search query will return exactly what folks were
looking for, but we feel that adding links to similar or related
information would be helpful and, possibly, super interesting!

We also plan on doing a few more A/B tests in the coming year:
* Test a new display that will show the pronunciation of a word with its
definition and part of speech—all from existing data in Wiktionary.
Initially this will be in English only.
* Test placing a small image (from the article) next to each search result
that is displayed on the page.
* Test an additional feature using a new autocompletion metadata display in
the search box that is located on the top right of most pages in Wikipedia,
similar to what happens on the Wikipedia.org portal.[18]

For the more technically minded, there is a way to test out these new
features in your own browser. Displaying the sister project search results
requires a bit of URL manipulation; for the explore similar and Wiktionary
widgets, you can modify your common.js file to test an early version of
the features. Detailed information is available on MediaWiki.org.[19]

Once the testing, analysis and feedback cycle is done for each new feature,
we’d like to gradually roll them out to production on all Wikipedias
throughout the rest of the year. We’re really hoping that these
enhancements will make search more useful and make our readers and editors
more productive.

Cheers from the Discovery Search team!

[1] https://xkcd.com/214/
[2] https://www.mediawiki.org/wiki/User:TJones_(WMF)/Notes/Re-Ordering_Stemming_and_Ascii-Folding_on_English_Wikipedia
[3] https://blog.wikimedia.org/2016/07/27/wikipedia-language-search/
[4] https://en.wikipedia.org/wiki/Tf%E2%80%93idf
[5] https://en.wikipedia.org/wiki/Okapi_BM25
[6] https://www.mediawiki.org/wiki/User:TJones_(WMF)/Notes/Dropping_Final_Question_Marks_in_the_Top_10_Wikipedias
[7] https://phabricator.wikimedia.org/T154501
[8] https://en.wikipedia.org/wiki/Learning_to_rank
[9] https://phabricator.wikimedia.org/T154511
[10] https://commons.wikimedia.org/wiki/File:From_Zero_to_Hero_-_Anticipating_Zero_Results_From_Query_Features,_Ignoring_Content.pdf
[11] https://www.mediawiki.org/wiki/User:TJones_(WMF)/Notes/Quotes_and_Questions
[12] https://commons.wikimedia.org/wiki/Commons:Structured_data
[13] https://blog.wikimedia.org/2017/01/09/sloan-foundation-structured-data/
[14] https://www.mediawiki.org/wiki/Cross-wiki_Search_Result_Improvements
[15] https://www.mediawiki.org/wiki/Talk:Cross-wiki_Search_Result_Improvements
[16] https://www.mediawiki.org/wiki/Cross-wiki_Search_Result_Improvements/Testing#A.2FB_test:_Add_cross-wiki_search_results_in_a_right_hand_sidebar
[17] https://www.mediawiki.org/wiki/Cross-wiki_Search_Result_Improvements/Testing#A.2FB_test:_Add_.27explore_similar.27_pages_and_categories_for_search_results
[18] https://www.wikipedia.org/
[19] https://www.mediawiki.org/wiki/Cross-wiki_Search_Result_Improvements/self-guided_testing


--
deb tankersley
irc: debt
Product Manager, Discovery
Wikimedia Foundation
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
