[ 
https://issues.apache.org/jira/browse/LUCENE-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653049#action_12653049
 ] 

Steven Rowe commented on LUCENE-1390:
-------------------------------------

bq. What is the likelihood that a forced upgrade to this class would lose words 
in an older index without a reindex? 

The problem would be words that contain characters that were not folded by 
ISOLatin1AccentFilter, but are folded by ASCIIFoldingFilter, and that are used 
in documents *and* in queries.  Individual implementors would have to make that 
determination, but it's not outside the realm of possibility.
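
To illustrate the mismatch described above, here is a minimal, self-contained sketch. It does not use the real Lucene classes: the two lookup tables are tiny hypothetical stand-ins for the narrower (ISOLatin1AccentFilter-like) and broader (ASCIIFoldingFilter-like) folding behaviour, and the assumption that U+0142 and U+017A are handled only by the broader table is made purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class FoldingMismatchSketch {
    // Hypothetical stand-in for the older, narrower folding table.
    static final Map<Character, String> OLD_TABLE = new HashMap<>();
    // Hypothetical stand-in for the newer, broader table (a superset of the old one).
    static final Map<Character, String> NEW_TABLE = new HashMap<>();

    static {
        OLD_TABLE.put('\u00e9', "e"); // e with acute
        OLD_TABLE.put('\u00f3', "o"); // o with acute
        NEW_TABLE.putAll(OLD_TABLE);
        // Assumed for illustration: folded only by the broader table.
        NEW_TABLE.put('\u0142', "l"); // l with stroke
        NEW_TABLE.put('\u017a', "z"); // z with acute
    }

    // Replace each character that has a table entry; pass all others through.
    static String fold(String term, Map<Character, String> table) {
        StringBuilder sb = new StringBuilder();
        for (char c : term.toCharArray()) {
            String replacement = table.get(c);
            sb.append(replacement != null ? replacement : String.valueOf(c));
        }
        return sb.toString();
    }

    // True if a term indexed with the old table is still found by a query
    // folded with the new table (exact term match).
    static boolean stillMatches(String word) {
        String indexedTerm = fold(word, OLD_TABLE); // what sits in the old index
        String queryTerm = fold(word, NEW_TABLE);   // what the upgraded query produces
        return indexedTerm.equals(queryTerm);
    }

    public static void main(String[] args) {
        // Only characters both tables fold: old index and new query agree.
        System.out.println(stillMatches("caf\u00e9"));
        // Contains characters only the new table folds: the query term no
        // longer equals the indexed term, so the word is missed until a reindex.
        System.out.println(stillMatches("\u0142\u00f3d\u017a"));
    }
}
```

Only words containing characters in the difference between the two tables are affected, which is why the impact depends on each implementor's documents and queries.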

If ISOLatin1AccentFilter were deprecated for 2.9, and advertised as targeted 
for removal in 3.0, assuming there will be a significant gap in time between 
the 2.9 and 3.0 releases, that would give users time to complain about its 
pending demise, and the plan to remove it could be revisited based on that 
feedback.

> add ISOLatinAccentFilter and deprecate ISOLatin1AccentFilter
> ------------------------------------------------------------
>
>                 Key: LUCENE-1390
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1390
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>         Environment: any
>            Reporter: Andi Vajda
>            Assignee: Mark Miller
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: ASCIIFoldingFilter.patch, ASCIIFoldingFilter.patch, 
> ISOLatinAccentFilter.java
>
>
> The ISOLatin1AccentFilter is removing accents from accented characters in the 
> ISO Latin 1 character set.
> It does what it does and there is no bug with it.
> It would be nicer, though, if there were a more comprehensive version of this 
> code that included not just ISO-Latin-1 (ISO-8859-1) but the entire Latin 1 
> and Latin Extended A Unicode blocks.
> See: http://en.wikipedia.org/wiki/Latin-1_Supplement_unicode_block
> See: http://en.wikipedia.org/wiki/Latin_Extended-A_unicode_block
> That way, all languages using roman characters are covered.
> A new class, ISOLatinAccentFilter, is attached. It is intended to supersede 
> ISOLatin1AccentFilter, which should be deprecated.
