https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=9729

--- Comment #7 from David Cook <dc...@prosentient.com.au> ---
After some experimenting, it seems that YAZ ICU tokenizes on the "+" character,
without any normalization, when the "line" tokenize rule is used:

echo -n "C++" | yaz-icu -c chain.xml
1 1 'c+' 'C+'
2 1 '+' '+'

echo -n "C#" | yaz-icu -c chain.xml
1 1 'c#' 'C#'

echo -n ".NET" | yaz-icu -c chain.xml
1 1 '.net' '.NET'

I wonder if that's a bug in YAZ, because it doesn't split on all
punctuation/symbols. For example:

echo -n 'C--' | yaz-icu -c chain.xml
1 1 'c--' 'C--'

echo -n 'C???' | yaz-icu -c chain.xml
1 1 'c???' 'C???'
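
For anyone who wants to repeat the checks above in one pass, here is a rough
sketch. The chain.xml written below is only an assumption (a "line" tokenize
rule followed by a lowercase casemap); it is not necessarily the file that
produced the output above, so results may differ.

```shell
#!/bin/sh
# Hypothetical chain.xml: "line" tokenization followed by lowercasing.
# This is an assumed configuration, not the one used for the output above.
cat > chain.xml <<'EOF'
<icu_chain locale="en">
  <tokenize rule="l"/>
  <casemap rule="l"/>
</icu_chain>
EOF

# Feed each test string to yaz-icu without a trailing newline.
if command -v yaz-icu >/dev/null 2>&1; then
  for s in 'C++' 'C#' '.NET' 'C--' 'C???'; do
    printf '== %s ==\n' "$s"
    printf '%s' "$s" | yaz-icu -c chain.xml
  done
else
  echo "yaz-icu not found; install the YAZ tools first" >&2
fi
```

printf '%s' is used instead of echo -n because echo -n is not portable across
shells.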

-- 
You are receiving this mail because:
You are watching all bug changes.
_______________________________________________
Koha-bugs mailing list
Koha-bugs@lists.koha-community.org
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
