Thanks Erick.

I believe the issue is in Solr. The character “à” is getting stored in Solr as 
“Ã ”. Notice the space after Ã.

I'm using SolrJ to ingest the documents into Solr, so could one of those be 
the culprit?
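
For reference, "Ã" followed by a space-like character is the pattern you get when the UTF-8 bytes of "à" (0xC3 0xA0) are decoded as ISO-8859-1 or windows-1252: 0xC3 becomes "Ã" and 0xA0 becomes a non-breaking space. The sketch below only reproduces that symptom in plain Java. The class and method names are made up for illustration, and it does not show where in the file-reading, SolrJ, or display path the wrong decoding actually happens.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Solr or SolrJ.
class MojibakeDemo {

    // Encode the text as UTF-8, then decode those bytes with the wrong
    // charset (ISO-8859-1). This is the usual cause of "Ã"-style garbling.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverse the mistake: recover the original bytes via ISO-8859-1,
    // then decode them as the UTF-8 they really were.
    static String repair(String garbled) {
        byte[] original = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(original, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u00E0" is "à"; escapes avoid depending on the source file encoding.
        String garbled = misdecode("\u00E0");
        System.out.printf("length=%d first=U+%04X second=U+%04X%n",
                garbled.length(), (int) garbled.charAt(0), (int) garbled.charAt(1));
        System.out.println("round trip ok: " + repair(garbled).equals("\u00E0"));
    }
}
```

If the text file is being read with the platform default charset before it is handed to SolrJ, decoding it explicitly as UTF-8 would be one thing to check, assuming the CRM export really is UTF-8.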


-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Wednesday, July 08, 2015 1:36 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Encoding Issue?

Attachments are pretty aggressively stripped by the e-mail server, so there's 
nothing to see. You'll have to paste it somewhere else and provide a link.

Usually, though, this is a character set issue: the browser is using a 
different charset than Solr, so it's really the same character, just 
displayed differently.
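
One way to tell a display problem from a storage problem is to look at the code points of the value rather than its glyphs. A rough sketch, assuming you can get the field value back as a Java String (for example from a query response); the class and method names here are hypothetical:

```java
import java.util.stream.Collectors;

// Hypothetical helper; not part of Solr or SolrJ.
class CodePointDump {

    // Render each code point of the string as U+XXXX, separated by spaces,
    // so the value can be inspected independently of any font or charset.
    static String dump(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(dump("\u00E0"));        // a correctly stored "à"
        System.out.println(dump("\u00C3\u00A0"));  // the garbled two-character form
    }
}
```

A single U+00E0 would mean the value is fine and only the display is off; U+00C3 followed by U+00A0 would mean the text was already mis-decoded before or during indexing.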

Shot in the dark though.

Erick

On Wed, Jul 8, 2015 at 10:49 AM, Tarala, Magesh <mtar...@bh.com> wrote:

>  I’m ingesting a .TXT file with HTML content into Solr. The content 
> has the following character highlighted below:
>
> The file we get from CRM (also attached):
>
> [image: cid:image001.png@01D0B972.75BE23F0]
>
>
>
>
>
> After ingesting into solr, I see a different character. This is query 
> response from solr management console.
>
>
>
> [image: cid:image003.png@01D0B972.D1AED290]
>
>
>
>
>
> Anybody know how I can prevent this from happening?
>
>
>
> Thanks!
>
