[jira] [Commented] (LUCENE-3663) Add a phone number normalization TokenFilter

Uwe Schindler (Commented) (JIRA) Wed, 21 Dec 2011 03:39:55 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-3663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174024#comment-13174024
 ]


Uwe Schindler commented on LUCENE-3663:
---------------------------------------

One more thing: since you want to filter out tokens, you should not subclass 
TokenFilter directly but instead subclass 
org.apache.lucene.analysis.util.FilteringTokenFilter and do the work in the 
match() method. You are free to modify the token there, too. Using this base 
class would correctly handle position increments, which your comments list 
as a TODO.
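
To illustrate the pattern, here is a minimal, self-contained sketch. It does 
not use Lucene: Token, FilteringSketch and PhoneSketchFilter are hypothetical 
stand-ins written for this example, and the hook is named match() only to 
follow the wording above. The digit-stripping rule and the 7-digit threshold 
are assumptions as well; country-code guessing is left out. The point is that 
the base class owns the skip loop and the position-increment bookkeeping, so 
the subclass only decides whether to keep a token and may rewrite it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class PhoneFilterSketch {

    /** Hypothetical stand-in for a token: term text plus position increment. */
    static final class Token {
        String term;
        int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    /** Hypothetical stand-in for a filtering base class (not the Lucene class). */
    static abstract class FilteringSketch {
        private final Iterator<Token> input;
        protected Token current;

        FilteringSketch(Iterator<Token> input) {
            this.input = input;
        }

        /** Return true to keep the current token; the token may be modified here. */
        protected abstract boolean match();

        /** Returns the next kept token, or null when the input is exhausted. */
        final Token next() {
            int skipped = 0;
            while (input.hasNext()) {
                current = input.next();
                if (match()) {
                    // Add the increments of dropped tokens so positions stay correct.
                    current.positionIncrement += skipped;
                    return current;
                }
                skipped += current.positionIncrement;
            }
            return null;
        }
    }

    /** Keeps tokens that look like phone numbers and reduces them to digits. */
    static final class PhoneSketchFilter extends FilteringSketch {
        private static final int MIN_DIGITS = 7; // assumed threshold, for illustration

        PhoneSketchFilter(Iterator<Token> input) {
            super(input);
        }

        @Override
        protected boolean match() {
            String digits = current.term.replaceAll("[^0-9]", "");
            if (digits.length() < MIN_DIGITS) {
                return false; // not a phone number: drop the token
            }
            current.term = digits; // normalize in place
            return true;
        }
    }

    /** Runs the filter and renders each kept token as "term/positionIncrement". */
    static List<String> run(List<Token> tokens) {
        PhoneSketchFilter filter = new PhoneSketchFilter(tokens.iterator());
        List<String> out = new ArrayList<>();
        for (Token t = filter.next(); t != null; t = filter.next()) {
            out.add(t.term + "/" + t.positionIncrement);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(
                new Token("call", 1),
                new Token("(555) 123-4567", 1),
                new Token("ext", 1),
                new Token("555.987.6543", 1))));
        // prints [5551234567/2, 5559876543/2]
    }
}
```

Each surviving token carries an increment of 2 because one token before it 
was dropped. This is the bookkeeping a hand-written TokenFilter would 
otherwise have to do itself.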
                
> Add a phone number normalization TokenFilter
> --------------------------------------------
>
>                 Key: LUCENE-3663
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3663
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: modules/analysis
>            Reporter: Santiago M. Mola
>            Priority: Minor
>         Attachments: PhoneFilter.java
>
>
> Phone numbers can be found in the wild in an infinite variety of formats 
> (e.g. with spaces, parentheses, dashes, with or without a country code, with 
> letters substituted for digits). Some Lucene applications could therefore 
> benefit from phone normalization with a TokenFilter that takes a phone number 
> in any format and outputs it in a standard format, using a default country to 
> guess the country code if it is not present.


        

