2011/3/31 Robert Muir <rcm...@gmail.com>:
> On Thu, Mar 31, 2011 at 9:51 AM, Patrick ALLAERT
> <patrick.alla...@gmail.com> wrote:
>> Hello,
>>
>> Facing a Solr issue, I have been told that queries with a term like:
>> Kiinteistösih*
>> will not match the Finnish word "Kiinteistösihteeri" and that it's a
>> known limitation of Lucene.
>> Instead, using the word directly, without wildcard, works.
>>
>> Do you confirm this a known limitation/bug?
>> If so do you have any registered issue about that?
>
> this isn't the case, there's no unicode limitation here.
>
> more likely, your analyzer is configured to lowercase text, so in the
> index Kiinteistösihteeri is really kiinteistösihteeri
> in other words, try kiinteistösih* and see how that works.

Following your suggestion, I tested with:
kiinteistösih*
but it doesn't return the intended result.

I have found the reason: the ISOLatin1AccentFilterFactory filter is
present in both the "index" and "query" analyzers, so the indexed term
is actually kiinteistosihteeri. Searching with:
kiinteistosih*
did the trick.

One question remains: why do I have to lowercase terms containing a
wildcard and do the ISO Latin-1 accent conversion myself when I have:

<analyzer type="query">
  ...
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.ISOLatin1AccentFilterFactory"/>
  ...

for the corresponding fieldType? I would have guessed that it would do
this for me.

Your reply helped me a lot in understanding what's going on.
Thank you very much for your participation!

Patrick
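
For reference, the manual workaround described above (lowercasing and
removing accents before sending a wildcard query) could be sketched on
the client side roughly as follows. This is only an illustration: the
function name normalize_wildcard_term is hypothetical, and stripping
Unicode combining marks only approximates what
ISOLatin1AccentFilterFactory does (the real filter also rewrites some
characters, such as ligatures, that this sketch leaves untouched). It
assumes that wildcard terms reach the index without going through the
query analyzer, which is consistent with the behaviour observed above.

```python
import unicodedata


def normalize_wildcard_term(term: str) -> str:
    """Lowercase a term and strip accents; '*' and '?' pass through unchanged.

    Hypothetical helper: an approximation of LowerCaseFilterFactory followed
    by ISOLatin1AccentFilterFactory, not a reimplementation of either.
    """
    # Lowercase first, then decompose accented characters,
    # e.g. "ö" becomes "o" followed by a combining diaeresis.
    decomposed = unicodedata.normalize("NFD", term.lower())
    # Drop the combining marks, keeping the base characters and wildcards.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    print(normalize_wildcard_term("Kiinteistösih*"))  # prints: kiinteistosih*
```

The normalized string can then be used as the wildcard term in the
query in place of the original user input.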