I'm having some problems with keywords that contain chars other than a-z0-9...

If I have a keyword like "Det Naturvidenskabelige Fakultet" or a name like "Jan Agermose", 
I cannot get any hits on it. I already know I need to lowercase the keywords myself, 
since the query string is lowercased by Lucene, but even then I get no hits. 

"Det Naturvidenskabelige Fakultet" - hits = 0
Det* - hits!
Det Naturvidenskabelige Fakultet - hits = 0

I can understand the last one - but shouldn't the first one return hits? If not, 
keywords seem to be limited to values composed of [a-z0-9]+ ? 

For now I do a string replace of [^a-z0-9]+ with "" on the keywords (removing all those 
chars). I would think this gives the query parser some problems, except in my special 
case, where the user is not really free to compose queries on their own, so I can do 
the same string replace on the input :-D But I would like the power user to be able to 
input real queries, and that leaves me with the problem of parsing queries: I need to do 
the string replace only within double quotes... This should be Lucene's problem, not mine :-D
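
To make the "replace only within double quotes" idea concrete, here is a rough sketch in 
plain Java. It makes no Lucene calls; the class and method names are made up for 
illustration. It assumes the keywords were indexed with the same lowercase-and-strip 
rule, and it does not handle escaped quotes inside a phrase.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: pre-processes a user query string
// before it is handed to the query parser.
public class QuotedKeywordNormalizer {
    // Matches one double-quoted phrase; group 1 is the text between the quotes.
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    // Assumed index-time rule: lowercase, then drop everything outside [a-z0-9].
    public static String normalizeKeyword(String keyword) {
        return keyword.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]+", "");
    }

    // Rewrites only the quoted phrases; the rest of the query (operators,
    // wildcards, unquoted terms) is copied through unchanged.
    public static String normalizeQuoted(String query) {
        Matcher m = QUOTED.matcher(query);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = "\"" + normalizeKeyword(m.group(1)) + "\"";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints: "detnaturvidenskabeligefakultet" AND Det*
        System.out.println(normalizeQuoted("\"Det Naturvidenskabelige Fakultet\" AND Det*"));
    }
}
```

The unquoted part of the query is left alone on purpose, so a power user can still use 
operators and wildcards there.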

Am I missing something?

Jan Agermose
