Thanks Erick, I will check this as soon as I can. In the meantime, here is a sample query and how the field looks in our index. It looks good to me (at least, that is what shows up in our other, older indexes as well).
http://10.0.5.227:8201/solr/Current/select?q=*:*&fl=ad_job_type_id&fq=ad_job_type_id:[*%20TO%20*]&facet=on&facet.field=ad_job_type_id&rows=1

<result name="response" numFound="12204004" start="0" maxScore="1.0">
  <doc>
    <arr name="ad_job_type_id">
      <str>4 5 1</str>
    </arr>
  </doc>
</result>

> -----Original Message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: April-09-14 2:21 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Were changes made to facetting on multivalued fields recently?
>
> That is...um...very strange. It looks to me like you have somehow indexed a
> bunch of new values. I'm guessing here, but it's suspicious that you have a
> value "4,1" should that have been indexed as "4" and "1" as separate tokens?
>
> So here's what I'd do
> 1> take a look at the solr/admin/schema browser output for that field
> in the two versions. I suspect you'll see 7 values in 4.6 and a bazillion in
> 4.7.1.
> 2> if <1> is true, take a look at the admin/analysis page for the
> field in question and see some sample index-time inputs, especially for the
> theoretical "4,1" entries. I suspect that 4.6 will break these up into two
> tokens and 4.7.1 won't.
> 3> if <2> is true, take a very careful look at the index-time analysis
> chains in the two versions, I bet they're different and that accounts for your
> observations.
> 4> try 1-3, discover I'm totally off base and paste the schema.xml
> definitions for the field in question in both 4.6 and 4.7.1 to this thread and
> we can take a look.
>
> This should not have changed between 4.6 and 4.7.1, at least not
> intentionally.
>
> Best,
> Erick
>
> On Wed, Apr 9, 2014 at 11:04 AM, Jean-Sebastien Vachon
> <jean-sebastien.vac...@wantedanalytics.com> wrote:
> > Hi All,
> >
> > We just discovered that the response from Solr (4.7.1) when faceting on
> > one of our multi-valued fields has changed considerably.
> >
> > In the past (4.6.1 and prior versions as well) we used to have
> > something like this: (there are 7 possible values for this attribute)
> >
> > <lst name="facet_counts">
> >   <lst name="facet_queries"/>
> >   <lst name="facet_fields">
> >     <lst name="ad_job_type_id">
> >       <int name="1">11454652</int>
> >       <int name="4">11387070</int>
> >       <int name="5">2095603</int>
> >       <int name="3">809992</int>
> >       <int name="2">567244</int>
> >       <int name="6">139389</int>
> >       <int name="7">4120</int>
> >     </lst>
> >   </lst>
> >   <lst name="facet_dates"/>
> > </lst>
> >
> > And now with 4.7.1 we are getting this:
> >
> > <lst name="facet_counts">
> >   <lst name="facet_queries"/>
> >   <lst name="facet_fields">
> >     <lst name="ad_job_type_id">
> >       <int name="1">10954552</int>
> >       <int name="4">10884418</int>
> >       <int name="5">2000530</int>
> >       <int name="3">784491</int>
> >       <int name="2">535935</int>
> >       <int name="4,1">134826</int>
> >       <int name="5,1">11770</int>
> > ... there are too many values to list them all ...
> >
> > I checked the Change log for 4.7.1 and only saw an optimization made
> > for https://issues.apache.org/jira/browse/SOLR-5512
> >
> > Is there any new configuration directive that we should be aware of?
> >
> > Thanks
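
For anyone comparing the two facet responses by hand, here is a minimal Python sketch that flags facet values outside the seven expected ones. It is only an illustration: the XML is an excerpt of the 4.7.1 output quoted above, trimmed to the values that were actually listed and closed so it parses on its own, and the function name `unexpected_facet_values` is made up for this example.

```python
import xml.etree.ElementTree as ET

# Excerpt of the 4.7.1 facet output quoted in this thread. The original
# listing was truncated; only the values that were shown are included here,
# and the closing tags were added so the excerpt is well-formed.
FACET_XML_471 = """
<lst name="facet_counts">
  <lst name="facet_queries"/>
  <lst name="facet_fields">
    <lst name="ad_job_type_id">
      <int name="1">10954552</int>
      <int name="4">10884418</int>
      <int name="5">2000530</int>
      <int name="3">784491</int>
      <int name="2">535935</int>
      <int name="4,1">134826</int>
      <int name="5,1">11770</int>
    </lst>
  </lst>
</lst>
"""

# The original post says the attribute has 7 possible values (1 through 7).
EXPECTED_VALUES = {str(n) for n in range(1, 8)}


def unexpected_facet_values(facet_xml, field):
    """Return (value, count) pairs for facet values not in EXPECTED_VALUES."""
    root = ET.fromstring(facet_xml)
    found = []
    for lst in root.iter("lst"):
        if lst.get("name") != field:
            continue
        for entry in lst.findall("int"):
            value = entry.get("name")
            if value not in EXPECTED_VALUES:
                found.append((value, int(entry.text)))
    return found


if __name__ == "__main__":
    for value, count in unexpected_facet_values(FACET_XML_471, "ad_job_type_id"):
        print(f"unexpected facet value {value!r}: {count} documents")
```

Running the same check against the 4.6.1 response should report nothing, which would be consistent with the guess that the field is being tokenized differently at index time in 4.7.1.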