Alice, re grouping: try Solr 4.8's new "collapse" query parser with the "expand" SearchComponent. The Solr Reference Guide has the docs. It's usually a faster equivalent to group=true.
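For reference, a minimal sketch of what the collapse/expand equivalent of Alice's group=true request could look like. This only assembles request parameters (it does not talk to a live Solr instance); the field and query values are copied from the query URL further down the thread:

```python
from urllib.parse import urlencode

# group=true / group.ngroups=true version (slow at ~1M hits)
grouped = {
    "q": "default_search:parameter",
    "defType": "edismax",
    "fl": "item_catalog",
    "rows": 50,
    "group": "true",
    "group.field": "item_group_id",
    "group.ngroups": "true",
}

# collapse/expand version: the collapse qparser keeps one document per
# item_group_id group in the main result set; expand=true returns the
# collapsed members in a separate "expanded" section of the response.
collapsed = {
    "q": "default_search:parameter",
    "defType": "edismax",
    "fl": "item_catalog",
    "rows": 50,
    "fq": "{!collapse field=item_group_id}",
    "expand": "true",
}

print(urlencode(collapsed))
```

Note that with collapse the main result set already contains one document per group, so its numFound is the number of distinct groups — there is no separate group.ngroups cost.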
Do you care to comment further on NewEgg's apparent switch from Endeca to Solr? (confirm true/false and rationale)

~ David Smiley
Freelance Apache Lucene/Solr Search Consultant/Developer
http://www.linkedin.com/in/davidwsmiley

On Tue, May 27, 2014 at 4:17 AM, Alice.H.Yang (mis.cnsh04.Newegg) 41493 <alice.h.y...@newegg.com> wrote:

> Hi, Toke
>
> 1. I set the 3 fields with hundreds of values to use fc and the rest to
> use enum; performance improved about 2x compared with no parameter. When I
> then added facet.threads=20, performance improved about 4x compared with
> no parameter. I also tried putting all 9 facet fields into one copyField,
> and tested that performance improved about 2.5x compared with no
> parameter. So it has improved a lot under your advice, thanks a lot.
>
> 2. Now I have another performance issue: grouping performance. The data
> set is the same as in the facet performance scenario. When the keyword
> search hits about one million documents, QTime is about 600 ms (this is
> not the first query, so the caches are warm).
>
> Query URL:
>
> select?fl=item_catalog&q=default_search:paramter&defType=edismax&rows=50&group=true&group.field=item_group_id&group.ngroups=true&group.sort=stock4sort%20desc,final_price%20asc,is_selleritem%20asc&sort=score%20desc,default_sort%20desc
>
> It needs a QTime of about 600 ms.
>
> This query has two notable parameters:
> 1. fl with one field
> 2. group=true, group.ngroups=true
>
> If I set group=false, the QTime is only 1 ms. But I need both group and
> group.ngroups. How can I improve the grouping performance under this
> requirement? Do you have any advice for me? I'm looking forward to your
> reply.
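The per-field fc/enum split Alice describes in point 1 is expressed with Solr's `f.<fieldname>.<param>` override syntax. A minimal sketch of assembling such a request follows; it only builds the query string, and the field names are hypothetical stand-ins, not NewEgg's actual schema:

```python
from urllib.parse import urlencode

# Hypothetical fields: three high-cardinality fields keep the default
# fc method, the low-cardinality ones are overridden to enum.
high_card = ["brand", "category", "seller"]   # hundreds of values each
low_card = ["color", "size", "condition"]     # 3-15 values each

params = [
    ("q", "default_search:parameter"),
    ("facet", "true"),
    ("facet.method", "fc"),  # default method for all facet fields
]
for f in high_card + low_card:
    params.append(("facet.field", f))
for f in low_card:
    # per-field override: f.<fieldname>.facet.method
    params.append((f"f.{f}.facet.method", "enum"))

query_string = urlencode(params)
print(query_string)
```

A list of tuples (rather than a dict) is used so the repeated facet.field parameter survives encoding.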
>
> Best Regards,
> Alice Yang
> +86-021-51530666*41493
> Floor 19, KaiKai Plaza, 888 Wanhandu Rd, Shanghai (200042)
>
> -----Original Message-----
> From: Toke Eskildsen [mailto:t...@statsbiblioteket.dk]
> Sent: May 24, 2014 15:17
> To: solr-user@lucene.apache.org
> Subject: RE: (Issue) How improve solr facet performance
>
> Alice.H.Yang (mis.cnsh04.Newegg) 41493 [alice.h.y...@newegg.com] wrote:
> > 1. I'm sorry, I made a mistake: the total number of documents is 32
> > million, not 320 million.
> > 2. The system memory is large for the Solr index. The OS has 256 GB in
> > total, and I set the Solr Tomcat HEAPSIZE="-Xms25G -Xmx100G"
>
> 100G is a very high number. What special requirements dictate such a
> large heap size?
>
> > Reply: 9 fields I facet on.
>
> Solr treats each facet separately, and with facet.method=fc and 10M hits
> this means that it will iterate 9*10M = 90M document IDs and update the
> counters for those.
>
> > Reply: 3 facet fields have about one hundred unique values each; the
> > other 6 facet fields have between 3 and 15 unique values.
>
> So very low cardinality. This is confirmed by your low response time of
> 6 ms for 2925 hits.
>
> > And we tested this scenario: if a facet field has few unique values, we
> > add facet.method=enum; there is a small performance improvement.
>
> That is a shame: enum is normally the simple answer to a setup like
> yours. Have you tried fine-tuning your fc/enum selection, so that the 3
> fields with hundreds of values use fc and the rest use enum? That might
> halve your response time.
>
> Since the number of unique facet values is so low, I do not think that
> DocValues can help you here. Besides the fine-grained fc/enum selection
> above, you could try collapsing all 9 facet fields into a single field.
> The idea behind this is that for facet.method=fc, performing faceting on
> a field with (for example) 300 unique values takes practically the same
> amount of time as faceting on a field with 1000 unique values: faceting
> on a single, slightly larger field is much faster than faceting on 9
> smaller fields. After faceting with facet.limit=-1 on the single
> super-facet-field, you must map the returned values back to their
> original fields:
>
> If you have the facet fields
>
> field0: 34
> field1: 187
> field2: 78432
> field3: 3
> ...
>
> then collapse them by OR-ing in a field-specific mask that is bigger
> than the maximum value in any field, and put it all into a single field:
>
> fieldAll: 0xA0000000 | 34
> fieldAll: 0xA1000000 | 187
> fieldAll: 0xA2000000 | 78432
> fieldAll: 0xA3000000 | 3
> ...
>
> Perform the facet request on fieldAll with facet.limit=-1 and split the
> resulting counts with
>
> for (entry : facetResultAll) {
>     switch (0xFF000000 & entry.value) {
>         case 0xA0000000:
>             // strip the field mask before adding the count to field0
>             field0.add(entry.value & ~0xFF000000, entry.count);
>             break;
>         case 0xA1000000:
>             field1.add(entry.value & ~0xFF000000, entry.count);
>             break;
>         ...
>     }
> }
>
> Regards,
> Toke Eskildsen, State and University Library, Denmark
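Toke's masking scheme can be sketched end-to-end in a few lines. This is purely an illustration of the encode/decode round trip with made-up counts, not Solr code — the client-side splitting he describes in pseudocode, plus the index-time encoding it assumes:

```python
# High byte identifies the source field; low bytes hold the original value.
FIELD_MASK = 0xFF000000
MASKS = {"field0": 0xA0000000, "field1": 0xA1000000,
         "field2": 0xA2000000, "field3": 0xA3000000}

def encode(field, value):
    """Collapse (field, value) into a single fieldAll value at index time."""
    return MASKS[field] | value

def decode(facet_result_all):
    """Split fieldAll facet counts back into per-field counts."""
    mask_to_field = {m: f for f, m in MASKS.items()}
    split = {f: {} for f in MASKS}
    for encoded_value, count in facet_result_all.items():
        field = mask_to_field[encoded_value & FIELD_MASK]
        # strip the mask to recover the original field value
        split[field][encoded_value & ~FIELD_MASK] = count
    return split

# Simulated facet response on fieldAll (encoded value -> count):
facet_all = {encode("field0", 34): 12,
             encode("field1", 187): 7,
             encode("field2", 78432): 3}
print(decode(facet_all))
```

The scheme works as long as every original value fits in the low 24 bits; with hundreds of unique values per field, as in this thread, that leaves ample headroom.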