Hi Chris,
Getting unicode to pass cleanly through to the browser can be tricky. A couple of questions:
What do you see instead of the incorrect character?
What happens when you tell your browser to display UTF-8? (In IE, it's in View->Encoding->Unicode) Does your character display properly?
Are your pages tagged with charset=utf-8? If not, the browser will default to non-unicode.
Are you using any view technology? Velocity, for example, has configuration settings for UTF-8 that control both how its templates are read from disk and how it encodes final output.
Hope this helps. If not, let us know more about how you're getting from the index to the page.
Thanks,
Fred
At 09:40 PM 10/11/2004, you wrote:
I was unaware of luke... a very nice tool. It would appear that the characters are correct in the index, but somewhere between me receiving the fields on the search and displaying them, something is lost... do i need to do any utf-8 encoding/decoding when extracting the fields?
On Mon, 11 Oct 2004 21:12:13 -0400, Erik Hatcher <[EMAIL PROTECTED]> wrote: > Chris - I suspect something else in your application is getting in the > way. Try to simplify and eliminate the servlet, or use a tool like > Luke to see what is truly in the index and what truly is being > returned. Lucene indexes what you tell it (perhaps your analyzer is > manipulating things?), and returns what is stored exactly, so I doubt > Lucene is the culprit. > > Erik > > > > > On Oct 11, 2004, at 8:50 PM, Chris Fraschetti wrote: > > > The dataset that I index is pretty dynamic and flexible, and I started > > to notice a incorrectly displayed character on some of my results... > > some debugging showed that it was a the Unicode character for single > > quote which is 8217 decimal. As far as I know, everything is fine > > before I index, but when retrieving the content, I receive a character > > that cannot be displayed on the java servlet I use to display them. > > How can I make lucene be vary general and accept and return all > > encoded/non-encoded chars are they were in their original state? > > > > > > -- > > ___________________________________________________ > > Chris Fraschetti > > e [EMAIL PROTECTED] > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: [EMAIL PROTECTED] > > For additional commands, e-mail: [EMAIL PROTECTED] > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > >
-- ___________________________________________________ Chris Fraschetti, Student CompSci System Admin University of San Francisco e [EMAIL PROTECTED] | http://meteora.cs.usfca.edu
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]