[ https://issues.apache.org/jira/browse/LUCENE-4556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-4556:
----------------------------------

    Fix Version/s:     (was: 4.3)
                       4.4

> FuzzyTermsEnum creates tons of objects
> --------------------------------------
>
>                 Key: LUCENE-4556
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4556
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search, modules/spellchecker
>    Affects Versions: 4.0
>            Reporter: Simon Willnauer
>            Assignee: Simon Willnauer
>            Priority: Critical
>             Fix For: 4.4
>
>         Attachments: LUCENE-4556.patch, LUCENE-4556.patch
>
>
> I ran into this problem in production using the DirectSpellChecker. The
> number of objects created by the spellchecker shoots through the roof very
> quickly. We ran about 130 queries and ended up with more than 2M
> transitions/states. We spent 50% of the time in GC just because of
> transitions. Other parts of the system behave just fine here.
>
> I talked briefly with Robert and gave a POC a shot, providing a
> LevenshteinAutomaton#toRunAutomaton(prefix, n) method to optimize this case
> and build an array-based structure converted into UTF-8 directly instead of
> going through the object-based APIs. This involved quite a few changes, but
> they are all package-private at this point. I have a patch that still has a
> fair set of nocommits, but it shows that it's possible and IMO worth the
> trouble to make this really usable in production. All tests pass with the
> patch; it's a start.
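
For readers unfamiliar with the distinction the issue draws, here is a minimal sketch of what an "array-based structure" over UTF-8 bytes can look like, as opposed to allocating one object per state and per transition. It is not taken from the attached patch and is not Lucene's API: the class name FlatByteAutomaton and its methods are hypothetical, and for brevity it accepts only one exact term rather than everything within an edit distance.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/**
 * Hypothetical illustration only (not Lucene code): a DFA whose transitions
 * live in a single int[] instead of per-state and per-transition objects.
 */
final class FlatByteAutomaton {
    private static final int ALPHABET = 256; // one column per possible byte value
    private static final int DEAD = -1;      // marks a missing transition

    // table[state * ALPHABET + byteValue] holds the next state, or DEAD.
    private final int[] table;
    // accept[state] is true when the state is accepting.
    private final boolean[] accept;

    private FlatByteAutomaton(int[] table, boolean[] accept) {
        this.table = table;
        this.accept = accept;
    }

    /** Builds a DFA that accepts exactly the UTF-8 encoding of the given term. */
    static FlatByteAutomaton forExactTerm(String term) {
        byte[] bytes = term.getBytes(StandardCharsets.UTF_8);
        int numStates = bytes.length + 1;
        int[] table = new int[numStates * ALPHABET];
        Arrays.fill(table, DEAD);
        for (int i = 0; i < bytes.length; i++) {
            // State i moves to state i + 1 on the i-th byte of the term.
            table[i * ALPHABET + (bytes[i] & 0xFF)] = i + 1;
        }
        boolean[] accept = new boolean[numStates];
        accept[bytes.length] = true;
        return new FlatByteAutomaton(table, accept);
    }

    /** Runs the automaton over raw bytes; no objects are allocated per step. */
    boolean run(byte[] input) {
        int state = 0;
        for (byte b : input) {
            state = table[state * ALPHABET + (b & 0xFF)];
            if (state == DEAD) {
                return false;
            }
        }
        return accept[state];
    }
}
```

Building such a table costs two array allocations per automaton regardless of how many transitions it has, which is the kind of saving the issue is after. A real Levenshtein automaton would fill the table from the edit-distance construction instead of a single term.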