[ https://issues.apache.org/jira/browse/SOLR-8096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907515#comment-14907515 ]
Mike Murphy edited comment on SOLR-8096 at 9/25/15 3:38 AM:
------------------------------------------------------------

The people who did this are elasticsearch employees. That is one way to deal with Solr's faster faceting!
This smells like the VW pollution scandal for lucene/solr/elasticsearch, except perhaps no consequences for those who pulled it off?
Why are elasticsearch employees allowed to do this?


was (Author: mmurphy3141):
The people who did this are elasticsearch employees. That is one way to deal with Solr's faster faceting!
This smells like the VW pollution scandal for lucene/solr/elasticsearch, except perhaps no consequences for those who pulled it off?

> Major faceting performance regressions
> --------------------------------------
>
>                 Key: SOLR-8096
>                 URL: https://issues.apache.org/jira/browse/SOLR-8096
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 5.0, 5.1, 5.2, 5.3, Trunk
>            Reporter: Yonik Seeley
>            Priority: Critical
>
> Use of the highly optimized faceting that Solr had for multi-valued fields
> over relatively static indexes was *secretly removed* as part of LUCENE-5666,
> causing severe performance regressions.
> Here are some quick benchmarks to gauge the damage, on a 5M document index,
> with each field having between 0 and 5 values per document. *Higher numbers
> represent worse 5x performance*.
> Solr 5.4_dev faceting time as a percent of Solr 4.10.3 faceting time
> ||...................................|| Percent of index being faceted
> ||num_unique_values|| 10% || 50% || 90% ||
> |10 | 351.17% | 1587.08% | 3057.28% |
> |100 | 158.10% | 203.61% | 1421.93% |
> |1000 | 143.78% | 168.01% | 1325.87% |
> |10000 | 137.98% | 175.31% | 1233.97% |
> |100000 | 142.98% | 159.42% | 1252.45% |
> |1000000 | 255.15% | 165.17% | 1236.75% |
> For example, a field with 1000 unique values in the whole index, faceting
> with 5x took 143% of the 4x time, when ~10% of the docs in the index were
> faceted.
> One user who brought the performance problem to our attention:
> http://markmail.org/message/ekmqh4ocbkwxv3we
> "faceting is unusable slow since upgrade to 5.3.0" (from 4.10.3)
> The disabling of the UnInvertedField algorithm was previously discovered in
> SOLR-7190, but we didn't know just how bad the problem was at that time.