Hi all,

Does anybody have some advice for me on Solr group performance? I have no idea how to improve it.
To David Smiley: I am not responsible for Endeca, so it's a pity, but I have no comment on Endeca.

Best Regards,
Alice Yang
+86-021-51530666*41493
Floor 19, KaiKai Plaza, 888, Wanhandu Rd, Shanghai (200042)

-----Original Message-----
From: david.w.smi...@gmail.com [mailto:david.w.smi...@gmail.com]
Sent: 27 May 2014 21:29
To: solr-user@lucene.apache.org
Subject: Re: Re: (Issue) How improve solr facet performance

Alice,

RE grouping, try Solr 4.8's new "collapse" qparser w/ the "expand" SearchComponent. The ref guide has the docs. It's usually a faster equivalent approach to group=true.

Do you care to comment further on NewEgg's apparent switch from Endeca to Solr? (confirm true/false and rationale)

~ David Smiley
Freelance Apache Lucene/Solr Search Consultant/Developer
http://www.linkedin.com/in/davidwsmiley

On Tue, May 27, 2014 at 4:17 AM, Alice.H.Yang (mis.cnsh04.Newegg) 41493 <alice.h.y...@newegg.com> wrote:

> Hi, Toke
>
> 1. I set the 3 fields with hundreds of values to use fc and the rest to
> use enum; performance improved about 2 times compared with no parameter.
> Then I added facet.method=20, and performance improved about 4 times
> compared with no parameter.
> I also tried putting the 9 facet fields into one copyField; in my tests
> performance improved about 2.5 times compared with no parameter.
> So it has improved a lot under your advice, thanks a lot.
>
> 2. Now I have another performance issue: group performance.
> The data set is the same as in the facet performance scenario.
> When the keyword search hits about one million documents, the QTime is
> about 600 ms. (This is not the first query; the result is already in cache.)
>
> Query URL:
>
> select?fl=item_catalog&q=default_search:parameter&defType=edismax&rows=50&group=true&group.field=item_group_id&group.ngroups=true&group.sort=stock4sort%20desc,final_price%20asc,is_selleritem%20asc&sort=score%20desc,default_sort%20desc
>
> It needs a QTime of about 600 ms.
>
> This query has two notable parameters:
> 1. fl with one field
> 2.
group=true, group.ngroups=true
>
> If I set group=false, the QTime is only 1 ms.
> But I need both group and group.ngroups. How can I improve the group
> performance under this requirement? Do you have some advice for me? I'm
> looking forward to your reply.
>
> Best Regards,
> Alice Yang
> +86-021-51530666*41493
> Floor 19, KaiKai Plaza, 888, Wanhandu Rd, Shanghai (200042)
>
>
> -----Original Message-----
> From: Toke Eskildsen [mailto:t...@statsbiblioteket.dk]
> Sent: 24 May 2014 15:17
> To: solr-user@lucene.apache.org
> Subject: RE: (Issue) How improve solr facet performance
>
> Alice.H.Yang (mis.cnsh04.Newegg) 41493 [alice.h.y...@newegg.com] wrote:
> > 1. I'm sorry, I made a mistake: the total number of documents is
> > 32 million, not 320 million.
> > 2. The system memory is large for the Solr index; the OS has 256 GB in
> > total, and I set the Solr Tomcat HEAPSIZE="-Xms25G -Xmx100G".
>
> 100G is a very high number. What special requirements dictate such a
> large heap size?
>
> > Reply: 9 fields I facet on.
>
> Solr treats each facet separately, and with facet.method=fc and 10M
> hits this means that it will iterate 9*10M = 90M document IDs and
> update the counters for those.
>
> > Reply: 3 facet fields have one hundred unique values; the other 6 facet
> > fields' unique values are between 3 and 15.
>
> So very low cardinality. This is confirmed by your low response time
> of 6 ms for 2925 hits.
>
> > And we tested this scenario: if a facet field has few unique values
> > we add facet.method=enum; there is a little performance improvement.
>
> That is a shame: enum is normally the simple answer to a setup like yours.
> Have you tried fine-tuning your fc/enum selection, so that the 3
> fields with hundreds of values use fc and the rest use enum? That
> might halve your response time.
>
> Since the number of unique facets is so low, I do not think that
> DocValues can help you here.
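A sketch of the per-field fc/enum split Toke suggests above, using Solr's `f.<field>.facet.method` per-field override. The field names here are hypothetical stand-ins for the 3 high-cardinality and 6 low-cardinality fields in the thread:

```text
facet=true
&facet.field=big_field_0&f.big_field_0.facet.method=fc
&facet.field=big_field_1&f.big_field_1.facet.method=fc
&facet.field=big_field_2&f.big_field_2.facet.method=fc
&facet.field=small_field_0&f.small_field_0.facet.method=enum
&facet.field=small_field_1&f.small_field_1.facet.method=enum
```

Appending these parameters to the select request lets each field use the faceting method best suited to its cardinality, rather than one global facet.method for all nine fields.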
> Besides the fine-grained fc/enum selection above, you could try
> collapsing all 9 facet fields into a single field. The idea behind this
> is that for facet.method=fc, performing faceting on a field with (for
> example) 300 unique values takes practically the same amount of time as
> faceting on a field with 1000 unique values: faceting on a single,
> slightly larger field is much faster than faceting on 9 smaller fields.
> After faceting with facet.limit=-1 on the single super-facet-field,
> you must match the returned values back to their original fields.
>
> If you have the facet fields
>
> field0: 34
> field1: 187
> field2: 78432
> field3: 3
> ...
>
> then collapse them by or-ing in a field-specific mask that is bigger
> than the max value in any field, and put it all into a single field:
>
> fieldAll: 0xA0000000 | 34
> fieldAll: 0xA1000000 | 187
> fieldAll: 0xA2000000 | 78432
> fieldAll: 0xA3000000 | 3
> ...
>
> Perform the facet request on fieldAll with facet.limit=-1 and split
> the resulting counts with
>
> for (entry : facetResultAll) {
>   switch (0xFF000000 & entry.value) {
>     case 0xA0000000:
>       field0.add(entry.value, entry.count);
>       break;
>     case 0xA1000000:
>       field1.add(entry.value, entry.count);
>       break;
>     ...
>   }
> }
>
>
> Regards,
> Toke Eskildsen, State and University Library, Denmark
>
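The mask-and-split scheme in Toke's mail can be sketched as plain Java. This is only an illustrative sketch of the packing arithmetic, not Solr API code; the 24-bit shift and the 0xA0 base are taken from the example masks in his mail, and the class and method names are invented for the demo:

```java
// Sketch of the "super-facet-field" packing: each value is tagged with a
// field-specific mask (0xA0.., 0xA1.., ...) so one facet request on a
// single field can later be split back into per-field counts.
public class SuperFacetField {

    // Low 24 bits hold the value; the byte above it identifies the field.
    static final int FIELD_SHIFT = 24;
    static final long VALUE_MASK = (1L << FIELD_SHIFT) - 1;
    static final long FIELD_BASE = 0xA0L;

    // Pack (fieldIndex, value) into one long, e.g. field2 -> 0xA2000000 | value.
    static long encode(int fieldIndex, long value) {
        return ((FIELD_BASE + fieldIndex) << FIELD_SHIFT) | value;
    }

    // Recover which original field a packed facet value came from.
    static int fieldOf(long packed) {
        return (int) ((packed >>> FIELD_SHIFT) - FIELD_BASE);
    }

    // Recover the original value (the counterpart of the switch in the mail).
    static long valueOf(long packed) {
        return packed & VALUE_MASK;
    }

    public static void main(String[] args) {
        long packed = encode(2, 78432L);      // field2: 78432, as in the mail
        System.out.println(fieldOf(packed));  // prints 2
        System.out.println(valueOf(packed));  // prints 78432
    }
}
```

Using long (rather than int) keeps the masked values positive; each original value must stay below 2^24 for the split to be unambiguous, which easily holds for the cardinalities discussed in this thread.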