On Friday I released version 0.0.2 of an Elasticsearch plugin to perform
accelerated regular expression search against source documents.  This
version has stability and speed improvements for complex queries.  It:
1.  Prevents the compilation step from consuming excessive memory.  It now
throws an exception when asked to compile a regex that is too big.
2.  Prevents complex regular expressions from performing hundreds of term
queries against the trigrams.  A new parameter limits the number of term
queries attempted when prefiltering the documents, which prevents memory
exhaustion.
3.  Speeds up some of the internals of the compilation step by several
orders of magnitude for complex queries.
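
To illustrate the idea behind items 1 and 2, here is a minimal Python
sketch of a trigram prefilter with both limits.  It is not the plugin's
code: the names (search, max_term_queries, max_pattern_length), the
in-memory index, and the use of pattern length as a stand-in for compiled
size are all assumptions made for illustration, and the caller supplies the
literal that every match must contain rather than having it extracted from
the regex.

```python
import re


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(docs):
    """Map each trigram to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for gram in trigrams(text):
            index.setdefault(gram, set()).add(doc_id)
    return index


def search(docs, index, pattern, required_literal,
           max_term_queries=4, max_pattern_length=1000):
    """Prefilter with a capped number of trigram lookups, then run the regex.

    Hypothetical limits: max_pattern_length is a crude stand-in for
    "the regex is too big to compile"; max_term_queries caps how many
    trigram lookups are used to narrow the candidate set.
    """
    if len(pattern) > max_pattern_length:
        raise ValueError("regex too big to compile")

    # Use at most max_term_queries trigrams. Dropping trigrams only makes
    # the prefilter less selective; it cannot exclude a real match.
    grams = sorted(trigrams(required_literal))[:max_term_queries]

    candidates = set(docs)
    for gram in grams:
        candidates &= index.get(gram, set())

    # Confirm each surviving candidate with the real regex.
    compiled = re.compile(pattern)
    return sorted(d for d in candidates if compiled.search(docs[d]))


docs = {
    1: "see [[File:Cat.jpg]] here",
    2: "no links at all",
    3: "uses [[Template:Foo]] only",
}
index = build_index(docs)
matches = search(docs, index, r"\[\[File:[^\]]*\]\]", "[[File:")
```

Capping the lookups trades selectivity for bounded cost: fewer trigrams
means more candidate documents reach the regex, but the result is unchanged.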

If you are brave enough to have used version 0.0.1 of this plugin you
should certainly upgrade to version 0.0.2.

As always you can try it on our beta site:
* Find links to files or templates
<http://simple.wikipedia.beta.wmflabs.org/w/index.php?title=Special%3ASearch&profile=default&search=insource%3A%2F\[\[%28file%3A|template%3A%29[^\]]*\]\]%2F&fulltext=Search>
* Find links within 20 characters of each other
<http://simple.wikipedia.beta.wmflabs.org/w/index.php?title=Special%3ASearch&profile=default&search=insource%3A%2F\[\[[^\]]*\]\].{0%2C20}\[\[[^\]]*\]\]%2F&fulltext=Search>
* Prove that complex queries don't eat all of memory
<http://simple.wikipedia.beta.wmflabs.org/wiki/Special:Search?search=insource%3A%2F\[\[%28Datei|File|Bild|Image%29%3A[^]]*alt%3D[^]|}]{50%2C200}%2F&go=Search>
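
If you want to try your own regex, the links above all follow the same
shape: the regex is wrapped as insource:/.../ and passed in the search
parameter.  A small Python sketch for building such a link follows; the
function name insource_search_url is made up for illustration, and the
parameters are simply copied from the first two links.

```python
from urllib.parse import urlencode

# Beta site search endpoint, taken from the links above.
BASE = "http://simple.wikipedia.beta.wmflabs.org/w/index.php"


def insource_search_url(regex):
    """Build a search URL that runs regex as an insource:/.../ query."""
    params = {
        "title": "Special:Search",
        "profile": "default",
        "search": "insource:/" + regex + "/",
        "fulltext": "Search",
    }
    # urlencode percent-encodes every reserved character in the regex.
    return BASE + "?" + urlencode(params)


url = insource_search_url(r"\[\[(file:|template:)[^\]]*\]\]")
```

This encodes more characters than the hand-written links do (brackets and
backslashes become percent escapes), which should decode to the same query.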

Nik
