bq: The SQL query contains a Replace statement that does this

Well, I suspect that's where the issue is. The facet values being reported include:

<int name="4,1">134826</int>

which indicates that the incoming text to Solr still has the commas. Solr is seeing the value with the commas intact.
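
If the commas cannot be removed upstream, the substitution can be done inside Solr with a char filter. A minimal sketch, assuming the text_ws field type from the schema quoted further down in this thread; the pattern (a literal comma) and the replacement (a single space) are my assumptions about the substitution the SQL Replace was meant to perform:

```xml
<!-- Sketch only: the text_ws type from the quoted schema, plus a char filter. -->
<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Assumption: turn every comma into a space before tokenizing,
         so "4,1" reaches the tokenizer as "4 1". -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="," replacement=" "/>
    <!-- Char filters run first; the tokenizer then splits on whitespace. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
```

Because this type has a single analyzer, the filter would apply at both index and query time. It changes only the indexed terms (which is what faceting counts), not the stored value, and existing documents would need to be reindexed before the facet values change.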

You can cure this by using PatternReplaceCharFilterFactory and doing the substitution at index time if you want to. That doesn't explain why the behavior has changed, though. My guess is that it has nothing to do with Solr and that something about your SQL statement is different.

Best,
Erick

On Thu, Apr 10, 2014 at 9:33 AM, Jean-Sebastien Vachon <jean-sebastien.vac...@wantedanalytics.com> wrote:
> The SQL query contains a Replace statement that does this
>
>> -----Original Message-----
>> From: Shawn Heisey [mailto:s...@elyograg.org]
>> Sent: April-10-14 11:30 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Were changes made to facetting on multivalued fields recently?
>>
>> On 4/10/2014 9:14 AM, Jean-Sebastien Vachon wrote:
>> > Here are the field definitions for both our old and new index... as you can
>> > see, they are identical. We've been using this chain and field type starting
>> > with Solr 1.4 and never had any problem. As for the documents, both indexes
>> > are using the same data source. They could be slightly out of sync from time
>> > to time but we tend to index them on a daily basis. Both indexes are also
>> > using the same code (indexing through SolrJ) to index their content.
>> >
>> > The source is a column in MySQL that contains entries such as "4,1"
>> > that get stored in a multivalued field after replacing commas by
>> > spaces
>> >
>> > OLD (4.6.1):
>> > <fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
>> >   <analyzer>
>> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>> >   </analyzer>
>> > </fieldType>
>> >
>> > <field name="ad_job_type_id" type="text_ws" indexed="true"
>> >   stored="true" required="false" multiValued="true" />
>>
>> Just so you know, there's nothing here that would require the field to be
>> multivalued. WhitespaceTokenizerFactory does not create multiple field
>> values, it creates multiple terms.
>> If you are actually inserting multiple values for the field in SolrJ,
>> then you would need a multivalued field.
>>
>> What is replacing the commas with spaces? I don't see anything here that
>> would do that. It sounds like that part of your indexing is not working.
>>
>> Thanks,
>> Shawn