[ 
https://issues.apache.org/jira/browse/SOLR-2462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer updated SOLR-2462:
-----------------------------

    Attachment: SOLR-2462.patch

This patch caps the list at 1000 possibilities.  When the cap is reached, 
the list is sorted by rank and reduced to the top 100.  From then on, only 
collations with a rank equal to or better than the 100th are added.  This 
process repeats until the possibilities are exhausted or 50 ms have elapsed, 
at which point it stops.
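
For readers without the patch at hand, the pruning scheme described above can 
be sketched roughly as follows.  This is an illustration only, not the code in 
SOLR-2462.patch: the class BoundedCollationCollector, the Candidate type, and 
the collect method are hypothetical names, and the sketch assumes that a lower 
rank value means a better collation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch of the capped, time-limited collection; not the patch itself. */
final class BoundedCollationCollector {

    /** A candidate collation. Assumption: a lower rank is better. */
    static final class Candidate {
        final String text;
        final int rank;

        Candidate(String text, int rank) {
            this.text = text;
            this.rank = rank;
        }
    }

    static final int MAX_HELD = 1000;                  // cap that triggers a prune
    static final int KEEP_AFTER_PRUNE = 100;           // survivors of each prune
    static final long TIME_BUDGET_NANOS = 50_000_000L; // 50 ms

    /** Collects with the limits quoted in the message above. */
    static List<Candidate> collect(Iterator<Candidate> source) {
        return collect(source, MAX_HELD, KEEP_AFTER_PRUNE, TIME_BUDGET_NANOS);
    }

    static List<Candidate> collect(Iterator<Candidate> source,
                                   int maxHeld, int keep, long budgetNanos) {
        if (keep < 1 || keep >= maxHeld) {
            throw new IllegalArgumentException("need 1 <= keep < maxHeld");
        }
        List<Candidate> held = new ArrayList<>();
        int cutoff = Integer.MAX_VALUE; // worst rank still accepted
        long start = System.nanoTime();

        // Stop when the source is exhausted or the time budget is spent.
        while (source.hasNext() && System.nanoTime() - start < budgetNanos) {
            Candidate c = source.next();
            if (c.rank > cutoff) {
                continue; // worse than the current 100th entry, so skip it
            }
            held.add(c);
            if (held.size() >= maxHeld) {
                // Cap reached: sort by rank, keep only the best `keep` entries,
                // and tighten the cutoff to the rank of the last survivor.
                held.sort(Comparator.comparingInt((Candidate x) -> x.rank));
                held.subList(keep, held.size()).clear();
                cutoff = held.get(keep - 1).rank;
            }
        }
        held.sort(Comparator.comparingInt((Candidate x) -> x.rank));
        return held;
    }
}
```

With this shape the list never holds more than 1000 entries at once, so memory 
stays bounded no matter how many combinations the source iterator could 
produce.  The result can hold more than 100 entries if the source runs out 
between two prunes.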

I also added a "maxTimeAllowed" setting of 50ms to the collation test queries 
as an additional performance safeguard.

> Using spellcheck.collate can result in extremely high memory usage
> ------------------------------------------------------------------
>
>                 Key: SOLR-2462
>                 URL: https://issues.apache.org/jira/browse/SOLR-2462
>             Project: Solr
>          Issue Type: Bug
>          Components: spellchecker
>    Affects Versions: 3.1, 4.0
>            Reporter: James Dyer
>            Priority: Critical
>         Attachments: SOLR-2462.patch
>
>
> When "spellcheck.collate" is used, class SpellPossibilityIterator creates a 
> ranked list of *every* possible correction combination.  When several 
> corrections are returned per term and several words are misspelled, the 
> existing algorithm uses a huge amount of memory.
> This bug was introduced with SOLR-2010.  However, it is triggered any time 
> "spellcheck.collate" is used.  It is not necessary to use any features that 
> were added with SOLR-2010.
> We had been in production with Solr for 1 1/2 days when this bug started 
> taking our Solr servers down with "infinite" GC loops.  It was pretty easy 
> for this to happen, as occasionally a user will accidentally paste the URL 
> into the Search box on our app.  This URL results in a search with ~12 
> misspelled words.  We have "spellcheck.count" set to 15.
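
To give a sense of scale for the numbers quoted above, the following small 
sketch computes an upper bound on the number of combinations.  It assumes that 
each of the 12 misspelled words returns the full 15 suggestions and that the 
combinations form a simple Cartesian product; the real count depends on how 
many suggestions each term actually gets.  The class and method names are 
hypothetical.

```java
import java.math.BigInteger;

/** Hypothetical helper: upper bound on correction combinations. */
final class CollationCombinations {

    /** Returns suggestionsPerTerm raised to the power misspelledTerms. */
    static BigInteger upperBound(int suggestionsPerTerm, int misspelledTerms) {
        return BigInteger.valueOf(suggestionsPerTerm).pow(misspelledTerms);
    }

    public static void main(String[] args) {
        // spellcheck.count = 15, ~12 misspelled words
        System.out.println(upperBound(15, 12)); // prints 129746337890625
    }
}
```

Under those assumptions the bound is 15^12, roughly 1.3 x 10^14 combinations, 
far more than can be held in a ranked in-memory list.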

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

