Re: Ascii folding

2023-11-12 Thread Dawid Weiss
Thanks Robert, Uwe - all this is enlightening. I didn't know about those things you mentioned.

Dawid

On Sat, Nov 11, 2023 at 2:02 PM Uwe Schindler wrote:
> Hi Dawid,
>
> the ASCII folding filter is meant to remove accents. You would like to
> have searching for visually simila

Re: Ascii folding

2023-11-11 Thread Uwe Schindler
Hi Dawid,

the ASCII folding filter is meant to remove accents. What you want is to search for visually similar characters. These are two different things.

Actually Robert also has some config options, what I generally use for Western European searches where some documents may contain
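The distinction drawn here can be sketched with Python's standard library alone. This is not Lucene's ASCIIFoldingFilter: `strip_accents` is a hypothetical helper that only approximates accent removal through NFD decomposition, and the sample strings are made up for illustration. It shows that removing accents does nothing for a character that merely looks like a Latin letter.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Hypothetical helper: decompose to NFD, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


accented = "caf\u00e9"   # ends in LATIN SMALL LETTER E WITH ACUTE
lookalike = "th\u0435"   # ends in CYRILLIC SMALL LETTER IE (U+0435)

print(strip_accents(accented))             # cafe
print(strip_accents(lookalike) == "the")   # False
```

The Cyrillic letter has no decomposition to a Latin base letter, so accent stripping leaves it untouched; matching it against "the" would need a separate mapping of visually similar characters.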

Re: Ascii folding

2023-11-10 Thread Robert Muir
nt on input).
> >
> > Dawid
> >
> > On Fri, Nov 10, 2023 at 6:58 PM Chris Hostetter
> > wrote:
> >>
> >>
> >> : Here's the unicode letter after "th":
> >> : https://www.fileformat.info/info/unicode/char/0435/index.htm
> >> :

Re: Ascii folding

2023-11-10 Thread Robert Muir
the unicode letter after "th":
>> : https://www.fileformat.info/info/unicode/char/0435/index.htm
>> :
>> : To my surprise, I couldn't find it in the ascii folding filter:
>> :
>> :
>> https://github.com/apache/lucene/blob/main/lucene/analysis/common

Re: Ascii folding

2023-11-10 Thread Dawid Weiss
:
>
> : Here's the unicode letter after "th":
> : https://www.fileformat.info/info/unicode/char/0435/index.htm
> :
> : To my surprise, I couldn't find it in the ascii folding filter:
> :
> :
> https://github.com/apache/lucene/blob/main/lucene/analysis/common/s

Re: Ascii folding

2023-11-10 Thread Chris Hostetter
: Here's the unicode letter after "th":
: https://www.fileformat.info/info/unicode/char/0435/index.htm
:
: To my surprise, I couldn't find it in the ascii folding filter:
:
: https://github.com/apache/lucene/blob/main/lucene/analysis/common/src/java/org/apache/lucene/analysis/mis

Re: Ascii folding

2023-11-10 Thread Steve Rowe
h means
>
> thе and the
>
> are two different things.
>
> Here's the unicode letter after "th":
> https://www.fileformat.info/info/unicode/char/0435/index.htm
>
> To my surprise, I couldn't find it in the ascii folding filter:
>
> https://github.com/

Ascii folding

2023-11-10 Thread Dawid Weiss
ormat.info/info/unicode/char/0435/index.htm

To my surprise, I couldn't find it in the ascii folding filter:

https://github.com/apache/lucene/blob/main/lucene/analysis/common/src/java/org/apache/lucene/analysis/miscellaneous/ASCIIFoldingFilter.java

Does anybody remember whether the omission of Cyrillic
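As a quick illustration of the character in question (a standard-library Python sketch, not something taken from the thread; the variable names are arbitrary), the snippet below prints the code point and Unicode name of the last letter in each spelling. The two words render alike but are different strings.

```python
import unicodedata

latin = "the"        # all three letters are Latin
mixed = "th\u0435"   # last letter is U+0435

for label, word in (("latin", latin), ("mixed", mixed)):
    last = word[-1]
    # Print the code point and the official Unicode character name.
    print(label, f"U+{ord(last):04X}", unicodedata.name(last))

print(latin == mixed)   # False
```

This prints `LATIN SMALL LETTER E` (U+0065) for the first word and `CYRILLIC SMALL LETTER IE` (U+0435) for the second, which is why an exact-match search treats them as different terms.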