On Sun, Sep 1, 2013 at 4:34 AM, Karen Coyle <[email protected]> wrote:
> Thanks, Tom. It seems to me that we have a general problem with accents
> that needs solving -- I have no idea how accents are handled in
> searching or merging, but I know that this has come up before:
>
> https://github.com/internetarchive/openlibrary/issues/11

There's a duplicate issue here too:
https://github.com/internetarchive/openlibrary/issues/178

> Presumably SOLR is able to deal with this?

Yes, SOLR can handle this. It just needs to be told to do it. I think the default search should be both case and diacritic insensitive as well as using Unicode normalization (i.e. making sure that é as a single character and as a sequence of e plus a combining accent are treated the same).

A phonetic search index might be a nice addition as well, to deal with Chaikovsky vs Tsaichovsky, but that's lower priority than getting the basics to work correctly.

Tom
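
P.S. To make the matching behaviour described above concrete, here is a minimal Python sketch of case folding, diacritic stripping and Unicode normalization. It only illustrates the intended result. It is not how Open Library or SOLR implements search, and the function name `fold` is made up for the example. In SOLR the equivalent would be configured in the field's analyzer chain rather than written in application code.

```python
import unicodedata


def fold(text: str) -> str:
    """Build a case- and diacritic-insensitive search key (illustrative only)."""
    # Decompose first, so a precomposed "é" and "e" + U+0301 become the same sequence.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks (the accents) and keep the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() is a stronger lower() intended for caseless comparison.
    return stripped.casefold()


single_char = "caf\u00e9"   # é as one precomposed character
sequence = "cafe\u0301"     # e followed by a combining acute accent
print(fold(single_char) == fold(sequence) == fold("CAFE"))
```

Indexing and querying on the folded key rather than the raw string is what lets all three spellings find the same record. A phonetic index would be a separate step on top of this.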
