[ https://issues.apache.org/jira/browse/LUCENE-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653049#action_12653049 ]
Steven Rowe commented on LUCENE-1390:
-------------------------------------

bq. What is the likelihood that a forced upgrade to this class would lose words in an older index without a reindex?

The problem would be words that contain characters that were not folded by ISOLatin1AccentFilter, but are folded by ASCIIFoldingFilter, and that are used in documents *and* in queries. Individual implementors would have to make that determination, but it's not outside the realm of possibility.

If ISOLatin1AccentFilter were deprecated for 2.9, and advertised as targeted for removal in 3.0, assuming there will be a significant gap in time between the 2.9 and 3.0 releases, that would give users time to complain about its pending demise, and the plan to remove it could be revisited based on that feedback.

> add ISOLatinAccentFilter and deprecate ISOLatin1AccentFilter
> ------------------------------------------------------------
>
>                 Key: LUCENE-1390
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1390
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>         Environment: any
>            Reporter: Andi Vajda
>            Assignee: Mark Miller
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: ASCIIFoldingFilter.patch, ASCIIFoldingFilter.patch,
> ISOLatinAccentFilter.java
>
>
> The ISOLatin1AccentFilter is removing accents from accented characters in the
> ISO Latin 1 character set.
> It does what it does and there is no bug with it.
> It would be nicer, though, if there was a more comprehensive version of this
> code that included not just ISO-Latin-1 (ISO-8859-1) but the entire Latin 1
> and Latin Extended A Unicode blocks.
> See: http://en.wikipedia.org/wiki/Latin-1_Supplement_unicode_block
> See: http://en.wikipedia.org/wiki/Latin_Extended-A_unicode_block
> That way, all languages using roman characters are covered.
> A new class, ISOLatinAccentFilter is attached. It is intended to supersede
> ISOLatin1AccentFilter which should get deprecated.
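
The mismatch described in the comment above can be illustrated with a small, self-contained sketch. It does not use Lucene and does not reproduce either filter's actual mapping tables: `foldLatin1Only` and `foldWider` are hypothetical stand-ins, built on `java.text.Normalizer`, for "folds only Latin-1 Supplement characters" and "folds a wider range" respectively. The point is only that a term indexed with the narrower folding and queried with the wider one may no longer compare equal.

```java
import java.text.Normalizer;

// Hypothetical illustration only; not the Lucene filters themselves.
class FoldingMismatchSketch {

    // Decompose to NFD and drop combining marks, e.g. e-acute becomes plain e.
    static String stripMarks(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{M}", "");
    }

    // Assumed stand-in for the older behaviour: only characters in the
    // Latin-1 Supplement letter range (U+00C0 to U+00FF) are folded.
    static String foldLatin1Only(String term) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (c >= 0x00C0 && c <= 0x00FF) {
                out.append(stripMarks(String.valueOf(c)));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Assumed stand-in for the newer behaviour: fold every decomposable character.
    static String foldWider(String term) {
        return stripMarks(term);
    }

    // True when a term indexed with the old folding still equals the same
    // term folded with the new folding at query time.
    static boolean indexedAndQueryAgree(String term) {
        return foldLatin1Only(term).equals(foldWider(term));
    }

    public static void main(String[] args) {
        // "cafe" with e-acute (U+00E9, Latin-1 Supplement): both foldings agree.
        System.out.println(indexedAndQueryAgree("caf\u00E9"));
        // "caj" with c-caron (U+010D, Latin Extended-A): only the wider folding changes it.
        System.out.println(indexedAndQueryAgree("\u010Daj"));
    }
}
```

Under these assumptions, an index built with the narrower folding would still hold the unfolded c-caron term, while a query analyzed with the wider folding would look for the folded form, which is the situation where a reindex would be needed.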