https://bugzilla.wikimedia.org/show_bug.cgi?id=43652

--- Comment #3 from Nik Everett <neverett+bugzi...@wikimedia.org> ---
I wonder if this is something that can replace some of the more uncommon
customizations that lsearchd did to improve recall.  It might not, because
this is really an expert tool and those uncommon customizations (dash handling
and stuff) affect everyone.

In any case, I think it might be useful to lean on search to cut down the list
of pages that must be checked.  Lucene search and Elasticsearch both seem well
optimized for a "first pass" you'd use to identify candidates that might match
the regex.  I suppose it wouldn't always be the right thing to do, but it might
be nice.
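
To make the "first pass" idea concrete, here is a rough sketch in Python.  It
is not how Lucene or Elasticsearch work internally: the tiny in-memory word
index below is only a stand-in for a real search backend, and the names
build_index, regex_search, PAGES and required_words are made up for this
illustration.  It also assumes the caller can name words that any matching
page must contain, which is not true of every regex.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercased word to the set of page titles containing it.

    Hypothetical stand-in for a real search index.
    """
    index = defaultdict(set)
    for title, text in pages.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(title)
    return index


def regex_search(pages, index, required_words, pattern):
    """First pass: intersect index hits. Second pass: regex on candidates only."""
    candidates = set(pages)
    for word in required_words:
        candidates &= index.get(word.lower(), set())
    regex = re.compile(pattern)
    return sorted(t for t in candidates if regex.search(pages[t]))


# Made-up sample wikitext, purely for illustration.
PAGES = {
    "A": "The {{cite web|url=http://example.org}} template is used here.",
    "B": "Nothing about templates, just plain prose.",
    "C": "Another {{cite book|title=Foo}} reference.",
}
INDEX = build_index(PAGES)

# Only pages containing the word "cite" are handed to the regex.
RESULT = regex_search(PAGES, INDEX, ["cite"], r"\{\{cite (web|book)\|")
```

With no required words the candidate set is every page, so this degrades to a
full scan, which is the case where the first pass would not help.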

I like implementing this in labs because it could be a real performance drain
on the production infrastructure if done there.  OTOH, if we put the wikitext
in Elasticsearch we could have it run the regexes pretty easily.  The only
trouble would be making sure the regexes don't cause a performance problem and
I'm not sure that is possible.
