Re: [CODE4LIB] indexing & searching chinese text using solr

2018-08-14 Thread Kyle Banerjee
Hi Eric, If you're pretty sure you indexed the characters properly and are getting garbage no matter what you do, my first thought is that this is a localization issue. Can you cat/grep/sed/vi/whatever these characters in a terminal window? If not, that is at least part of your problem. Running
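
The terminal check suggested above can be complemented by a quick look at the bytes themselves. The following is only a rough sketch and not something from the thread: the function name `looks_like_utf8_chinese` is hypothetical, and it assumes "Chinese" can be approximated by the basic CJK Unified Ideographs block (U+4E00 to U+9FFF).

```python
def looks_like_utf8_chinese(raw: bytes) -> bool:
    """Return True if raw decodes as UTF-8 and contains a CJK ideograph.

    Hypothetical helper for illustration; it only tests the basic
    CJK Unified Ideographs block (U+4E00 to U+9FFF).
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # The bytes are not valid UTF-8 (for example, GBK-encoded text).
        return False
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)


if __name__ == "__main__":
    import sys

    # Usage: python check_encoding.py file1.txt file2.txt ...
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            print(path, looks_like_utf8_chinese(handle.read()))
```

If a file fails this check, the problem lies in the file's encoding rather than in the terminal's locale or the index configuration.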

Re: [CODE4LIB] indexing & searching chinese text using solr

2018-08-14 Thread Erik Hatcher
Eric - How did you index the files? These are, I presume (based on the "body" mention below), HTML files? Can you send along a file (direct to me is fine) and how you indexed it, and I'll take a look. Erik > On Aug 14, 2018, at 10:49 AM, Eric Lease Morgan wrote: > > How

[CODE4LIB] indexing & searching chinese text using solr

2018-08-14 Thread Eric Lease Morgan
How do I go about indexing & searching Chinese text using Solr? I have a pile o' simplified Chinese text encoded in UTF-8. Taking hints from some Solr documentation [1], I have configured my index thusly: key