On Sun, Sep 1, 2013 at 4:34 AM, Karen Coyle <[email protected]> wrote:

> Thanks, Tom. It seems to me that we have a general problem with accents
> that needs solving -- I have no idea how accents are handled in
> searching or merging, but I know that this has come up before:
>
> https://github.com/internetarchive/openlibrary/issues/11


There's a duplicate issue here too:

https://github.com/internetarchive/openlibrary/issues/178

> Presumably SOLR is able to deal with this?

Yes, SOLR can handle this. It just needs to be told to do it. I think
the default search should be both case- and diacritic-insensitive and
should also apply Unicode normalization (i.e. making sure that é as a
single precomposed character and é as a decomposed sequence of e plus a
combining accent are treated the same).
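
To make the matching rule concrete, here is a rough Python sketch of the
kind of folding I mean. It is not how SOLR or Open Library implements it;
the function name fold_for_search is made up for illustration, and a real
index would do the equivalent inside its analyzer chain.

```python
import unicodedata


def fold_for_search(text: str) -> str:
    """Hypothetical helper: build a case- and diacritic-insensitive key."""
    # NFKD splits precomposed characters into base letter + combining marks,
    # so both encodings of an accented letter end up in the same form.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop every character with a nonzero canonical combining class
    # (the accents), keeping only the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() is the stdlib's caseless-matching form of lower().
    return stripped.casefold()


precomposed = "caf\u00e9"   # é as a single code point
sequence = "cafe\u0301"     # e followed by COMBINING ACUTE ACCENT

# Both encodings, and the unaccented upper-case form, share one key.
print(fold_for_search(precomposed) == fold_for_search(sequence))
print(fold_for_search("CAFE") == fold_for_search(precomposed))
```

If the same folding is applied at index time and at query time, a search
for either encoding, with or without the accent, finds the same records.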

A phonetic search index might be a nice addition as well, to deal with
variant spellings such as Chaikovsky vs Tsaichovsky, but that's lower
priority than getting the basics to work correctly.

Tom
_______________________________________________
Ol-discuss mailing list - [email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
Archives: http://www.mail-archive.com/[email protected]/
To unsubscribe from this mailing list, send email to 
[email protected]
