Re: UTF-8 support during indexing content

2012-02-01 Thread Chris Hostetter
: Subject: UTF-8 support during indexing content : References: <8ce9f966c6f6769-19a0-9e...@webmail-m069.sysops.aol.com> : <1326447127.1952.10.camel@snape> : <8ceade0f7e0ecec-189c-c...@webmail-m069.sysops.aol.com> : <1328105200.2033.33.camel@snape> : In-Reply-To: <132

RE: UTF-8 support during indexing content

2012-02-01 Thread Van Tassell, Kristian
-----Original Message----- From: Travis Low [mailto:t...@4centurion.com] Sent: Wednesday, February 01, 2012 8:27 AM To: solr-user@lucene.apache.org Subject: Re: UTF-8 support during indexing content Are you sure the input document is in UTF-8? That looks like classic ISO-8859-1-treated-as-UTF-8. How d

Re: UTF-8 support during indexing content

2012-02-01 Thread Travis Low
Are you sure the input document is in UTF-8? That looks like classic ISO-8859-1-treated-as-UTF-8. How did you confirm the document contains the right quote marks immediately prior to uploading? If you just visually inspected it, then use whatever tool you viewed it in to see what the character s
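One way to check the bytes rather than trusting a visual inspection is sketched below. This is only an illustration in Python, not something from the thread: the helper name `diagnose` and the sample string are hypothetical, and the sample assumes the curly quotes are U+201C and U+201D. It tries a strict UTF-8 decode of the raw bytes, then shows how correct UTF-8 turns into garbage when it is read as a single-byte encoding.

```python
def diagnose(raw: bytes) -> str:
    """Classify raw bytes as 'ascii', 'utf-8', or 'not-utf-8' (hypothetical helper)."""
    try:
        text = raw.decode("utf-8")  # strict decode: raises on invalid sequences
    except UnicodeDecodeError:
        return "not-utf-8"
    return "ascii" if text.isascii() else "utf-8"


# Assumed sample text with curly quotes (U+201C and U+201D).
sample = "the word \u201cstemming\u201d in quotes"

utf8_bytes = sample.encode("utf-8")      # each curly quote becomes 3 bytes
cp1252_bytes = sample.encode("cp1252")   # each curly quote becomes 1 byte

print(diagnose(utf8_bytes))
print(diagnose(cp1252_bytes))

# Dump the bytes so the quote characters can be inspected directly.
print(utf8_bytes.hex(" "))

# Reading UTF-8 bytes as a single-byte encoding produces mojibake.
# Latin-1 is used here because it maps every byte value, so the decode never fails.
mojibake = utf8_bytes.decode("latin-1")
print(repr(mojibake))

# The damage is reversible as long as the bytes were only mis-decoded once.
repaired = mojibake.encode("latin-1").decode("utf-8")
print(repaired == sample)
```

If the file you are about to upload does not pass the strict decode, the problem is in the input; if it does, the bytes are probably being re-decoded with the wrong charset somewhere between the client and the index.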

UTF-8 support during indexing content

2012-02-01 Thread Van Tassell, Kristian
Hello everyone, I have a question that I imagine has been asked many times before, so I apologize for the repeat. I have a basic text field with the following text: the word ”stemming” in quotes Uploading the data yields no errors; however, when it is indexed, the text looks like this: