bq: The SQL query contains a Replace statement that does this

Well, I suspect that's where the issue is. The facet values being
reported include:
<int name="4,1">134826</int>
which indicates that the text reaching Solr still contains the commas,
so the Replace apparently isn't being applied.

If you want to, you can work around this with
PatternReplaceCharFilterFactory, which does the substitution at index
time.
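
As a rough sketch only, the field type might look something like the
following. It assumes you start from the text_ws definition quoted
below; the name "text_ws_commas" is made up for illustration, so adapt
it to your schema.

```xml
<!-- Hypothetical variant of the quoted text_ws type; the name is an assumption. -->
<fieldType name="text_ws_commas" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer>
    <!-- Turn each comma into a space before tokenizing, so "4,1" becomes "4 1" -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="," replacement=" "/>
    <!-- Split on whitespace, giving the separate terms "4" and "1" -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
```

The char filter only changes the indexed terms, which is what faceting
counts; the stored value would still show the commas. You'd need to
reindex for the change to take effect.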

That doesn't explain why the behavior changed, though. My guess is
that it has nothing to do with Solr and that something about your SQL
statement is different.

Best,
Erick

On Thu, Apr 10, 2014 at 9:33 AM, Jean-Sebastien Vachon
<jean-sebastien.vac...@wantedanalytics.com> wrote:
> The SQL query contains a Replace statement that does this
>
>> -----Original Message-----
>> From: Shawn Heisey [mailto:s...@elyograg.org]
>> Sent: April-10-14 11:30 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Were changes made to facetting on multivalued fields recently?
>>
>> On 4/10/2014 9:14 AM, Jean-Sebastien Vachon wrote:
>> > Here are the field definitions for both our old and new index... as you can
>> > see, they are identical. We've been using this chain and field type since
>> > Solr 1.4 and never had any problem. As for the documents, both indexes are
>> > using the same data source. They could be slightly out of sync from time to
>> > time but we tend to index them on a daily basis. Both indexes are also using
>> > the same code (indexing through SolrJ) to index their content.
>> >
>> > The source is a column in MySQL that contains entries such as "4,1"
>> > that get stored in a multivalued field after replacing commas with
>> > spaces
>> >
>> > OLD (4.6.1):
>> >    <fieldType name="text_ws" class="solr.TextField"
>> > positionIncrementGap="100">
>> >       <analyzer>
>> >         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>> >       </analyzer>
>> >     </fieldType>
>> >
>> >     <field name="ad_job_type_id" type="text_ws" indexed="true"
>> > stored="true" required="false" multiValued="true" />
>>
>> Just so you know, there's nothing here that would require the field to be
>> multivalued.  WhitespaceTokenizerFactory does not create multiple field
>> values, it creates multiple terms.  If you are actually inserting multiple
>> values for the field in SolrJ, then you would need a multivalued field.
>>
>> What is replacing the commas with spaces?  I don't see anything here that
>> would do that.  It sounds like that part of your indexing is not working.
>>
>> Thanks,
>> Shawn