You must know what language each text is in, and use an appropriate
analyzer. Some people do this by using a separate field (text_eng,
text_spa, text_jpn). Other people put some extra information at the
beginning of the field, and then make an analyzer that peeks in order to
dispatch to the
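
The two approaches mentioned above (one field per language, or a marker at the start of the field that the analyzer peeks at) can be sketched roughly as follows. This is only an illustration in plain Python, not Lucene code: the tokenizer functions, the ANALYZERS table, and the "[jpn]" bracket marker format are all hypothetical stand-ins, since the message does not say what form the extra information takes.

```python
import re

# Hypothetical stand-ins for real per-language analyzers.
def tokenize_whitespace(text):
    """Lowercase and split on whitespace (stand-in for a European-language analyzer)."""
    return text.lower().split()

def tokenize_chars(text):
    """One token per non-space character (crude stand-in for a CJK analyzer)."""
    return [ch for ch in text if not ch.isspace()]

# Assumed language codes, mirroring the text_eng / text_spa / text_jpn field names.
ANALYZERS = {
    "eng": tokenize_whitespace,
    "spa": tokenize_whitespace,
    "jpn": tokenize_chars,
}

# Approach 1: a separate field per language.
def index_separate_fields(text, lang):
    """Return a {field_name: tokens} mapping using the analyzer for `lang`."""
    return {"text_" + lang: ANALYZERS[lang](text)}

# Approach 2: a marker at the start of the field value (assumed format: "[jpn]...").
MARKER = re.compile(r"^\[([a-z]{3})\]")

def analyze_with_marker(value, default_lang="eng"):
    """Peek at the leading marker and dispatch to the matching analyzer."""
    match = MARKER.match(value)
    if match and match.group(1) in ANALYZERS:
        lang, body = match.group(1), value[match.end():]
    else:
        # No recognized marker: fall back to the default and keep the text whole.
        lang, body = default_lang, value
    return lang, ANALYZERS[lang](body)
```

Either way the language has to be known when the text is indexed; the sketch only differs in where that knowledge is stored (the field name versus the field value).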
Just curious, what are some of the things that people do to properly
tokenize the queries with mixed language collections? What do you do
with mixed language queries?
On 4/6/2014 4:51 AM, Benson Margulies wrote:
You must know what language each text is in, and use an appropriate
analyzer.
On Sun, Apr 6, 2014 at 10:30 AM, Herb Roitblat herb.roitb...@orcatec.com wrote:
Just curious, what are some of the things that people do to properly
tokenize the queries with mixed language collections? What do you do with
mixed language queries?
You can either force the user to tell you the
Thanks.
These are familiar. Any other approaches that people use? I guess I'm
hoping ...
On 4/6/2014 7:37 AM, Benson Margulies wrote:
On Sun, Apr 6, 2014 at 10:30 AM, Herb Roitblat herb.roitb...@orcatec.com wrote:
Just curious, what are some of the things that people do to properly
tokenize
For the Japanese/English/French/German/Dutch/Russian/Spanish/Portuguese
dictionary with lots of searchable metadata that I am developing for
Android, I'm using a multi-field index that uses human input (a single
string) and I have to
USE 1: guess/associate each term/range to one (or more)
On 04/06/2014 04:37 PM, Benson Margulies wrote:
On Sun, Apr 6, 2014 at 10:30 AM, Herb Roitblat herb.roitb...@orcatec.com wrote:
Just curious, what are some of the things that people do to properly
tokenize the queries with mixed language collections? What do you do with
mixed language queries?