[Wikimedia-search] Enable or disable full text search query rewriting by default for API clients.

2015-07-30 Thread Erik Bernhardson
We have a new feature for web requests that rewrites zero result queries into a new search that might have results. I've started porting this same feature over to API clients so it has a larger effect on our zero results rate, but code review has turned up some indecision on if this should be enab

Re: [Wikimedia-search] Enable or disable full text search query rewriting by default for API clients.

2015-07-30 Thread Erik Bernhardson
On Thu, Jul 30, 2015 at 3:29 PM, Adam Baso wrote: > Probably a good idea. Is it opt-in or opt-out for the API consumer? > > -Adam > > That's the main question :) To copy the dialog from gerrit: > *Anomie wrote:* > Should the ability to set this flag be exposed to the API somehow? > > Or, to avoi

Re: [Wikimedia-search] Enable or disable full text search query rewriting by default for API clients.

2015-07-30 Thread Erik Bernhardson
On Thu, Jul 30, 2015 at 4:51 PM, Legoktm wrote: > On 07/30/2015 02:06 PM, Erik Bernhardson wrote: > > results rate, but code review has turned up some indecision on if this > > should be enabled or disabled by default in the API. Either way the > > feature will be toggle

[Wikimedia-search] Completion suggestion API demo

2015-08-25 Thread Erik Bernhardson
We have been working on a replacement autocompletion API that is more forgiving than a strict prefix search. The scoring algorithms have a long way to go, but we have the first run-through of building the completion index for enwiki, so I thought I would share: Here are a couple examples, feel fre

Re: [Wikimedia-search] Completion suggestion API demo

2015-08-26 Thread Erik Bernhardson
I ran some zero result rate tests against this API today; it is a huge reduction in the zero result rate over the existing prefix search: from 32% to 19% (on a 1% sample of prefix searches for an entire day). On Wed, Aug 26, 2015 at 12:34 PM, Stas Malyshev wrote: > Hi! > > > I uploaded a small H

Re: [Wikimedia-search] Measuring user user satisfaction while reducing it at the same time?

2015-08-26 Thread Erik Bernhardson
On Wed, Aug 26, 2015 at 3:58 PM, Chad Horohoe wrote: > Aren't we uncached here anyway? Special pages and all. > > -Chad > Actually, the events we are recording here measure the user's interaction with the pages they found. The current idea is to add a query parameter to all search results (only for

Re: [Wikimedia-search] Measuring user user satisfaction while reducing it at the same time?

2015-08-26 Thread Erik Bernhardson
; > https://wikitech.wikimedia.org/wiki/Provenance > > You'd want to look at the *current* VCL in templates/varnish in the > operations repo to see how it's presently done for *wprov*. > > -Adam > > > > On Wed, Aug 26, 2015 at 4:00 PM, Erik Bernhardson < > ebern

Re: [Wikimedia-search] Measuring user user satisfaction while reducing it at the same time?

2015-08-27 Thread Erik Bernhardson
(cross-posting to ops@ as requested. This is in regards to an EventLogging schema[1] applied in JavaScript to track user behavior on pages they find via internal search to measure the quality of the search results[2][3].) The referrer would be nice to use, but we are trying to track more than jus

Re: [Wikimedia-search] Completion suggestion API demo

2015-08-27 Thread Erik Bernhardson
https://phabricator.wikimedia.org/T109729 > > More rigorous testing must be done before we can consider replacing > prefixsearch with the suggestion API. > > Thanks! > > Dan > > On 25 August 2015 at 15:38, Erik Bernhardson > wrote: > >> We have been working on a replacement

[Wikimedia-search] Asynchronously calling elasticsearch

2015-09-08 Thread Erik Bernhardson
The PHP engine used in prod by the WMF, HHVM, has built-in support for shared (non-preemptive) concurrency via async/await keywords[1][2]. Over the weekend I spent some time converting the Elastica client library we use to work asynchronously, which would essentially let us continue on performing o
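The non-preemptive async/await model described in this message can be illustrated with a minimal sketch. It uses Python's asyncio rather than HHVM, and every name in it (`fake_search`, `search_many`, `run_demo`, the wiki names) is hypothetical. The simulated delay stands in for a network round trip; none of this reflects the actual Elastica or CirrusSearch code.

```python
import asyncio


async def fake_search(wiki, query, delay=0.05):
    """Hypothetical stand-in for a network call to a search backend."""
    # Awaiting yields control to the event loop, so other coroutines
    # can make progress while this one "waits on the network".
    await asyncio.sleep(delay)
    return {"wiki": wiki, "query": query, "hits": []}


async def search_many(wikis, query):
    # Start every request before waiting on any of them.
    tasks = [asyncio.create_task(fake_search(w, query)) for w in wikis]
    # Unrelated work could run here while the requests are in flight.
    # gather() returns results in the same order as the tasks passed in.
    return await asyncio.gather(*tasks)


def run_demo():
    # Assumed example inputs; three simulated searches run concurrently.
    return asyncio.run(search_many(["enwiki", "dewiki", "frwiki"], "example"))
```

Because the coroutines only give up control at `await` points, the requests overlap in time without threads, so the total wait is roughly one delay rather than the sum of all of them.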

Re: [Wikimedia-search] Asynchronously calling elasticsearch

2015-09-09 Thread Erik Bernhardson
done by >> elastic. This would definitely help. >> >> Le 08/09/2015 21:01, Erik Bernhardson a écrit : >> >> The php engine used in prod by the wmf, hhvm, has built in support for >> shared (non-preemptive) concurrency via async/await keywords[1][2]. Over >> the

Re: [Wikimedia-search] Smoothing in dashboard(s)

2015-09-10 Thread Erik Bernhardson
This is awesome. The weekly median for zero results rate change is much easier to comprehend. Thanks for doing this. On Thu, Sep 10, 2015 at 9:15 AM, Trey Jones wrote: > Smth! It looks great! > > Quick question—what's the period of the moving average? Is it a week, or > more? (A week mak

[Wikimedia-search] Page rank

2015-09-21 Thread Erik Bernhardson
Late last week, while looking over our existing scoring methods, I was thinking that while counting incoming links is nice, a couple guys dominated search with (among other things) a better way to judge the quality of incoming links, aka PageRank. PageRank takes a very simple input, it just needs a
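For readers unfamiliar with the idea, here is a minimal sketch of textbook PageRank computed by power iteration. It assumes the link graph is supplied as a dictionary mapping each page to the pages it links to. The function name, the damping factor of 0.85, and the iteration count are illustrative assumptions, not anything taken from this thread or from CirrusSearch.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank over a dict of page -> outgoing links."""
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start from a uniform distribution.
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives the "random jump" share first.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank
```

Unlike a plain incoming-link count, a link here is worth more when it comes from a page that itself ranks highly and has few outgoing links.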

Re: [Wikimedia-search] Asynchronously calling elasticsearch

2015-09-21 Thread Erik Bernhardson
Just to follow up here, I've updated the `async` branch of my Elastica fork. It now completely passes the test suite, so it might be ready for further CirrusSearch testing. On Wed, Sep 9, 2015 at 12:23 PM, Erik Bernhardson < ebernhard...@wikimedia.org> wrote: > This would allow the

Re: [Wikimedia-search] Asynchronously calling elasticsearch

2015-09-21 Thread Erik Bernhardson
On Mon, Sep 21, 2015 at 8:51 AM, Trey Jones wrote: > That's very cool! Have you stress-tested it at all? Like, what happens if > you search 10 wikipedias at once? (Because you know I want to search 10 > wikis at once. ) > No issues with a dozen requests at once in the application layer. Our clus