[
https://issues.apache.org/jira/browse/LUCENE-3663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174022#comment-13174022
]
Uwe Schindler edited comment on LUCENE-3663 at 12/21/11 11:33 AM:
------------------------------------------------------------------
This looks strange and creates useless objects:
{code:java}
final char[] buffer = termAtt.buffer();
final int length = termAtt.length();
CharBuffer cb = CharBuffer.wrap(buffer, 0, length);
try {
  PhoneNumber pn = pnu.parse(cb.toString(), defaultCountry);
{code}
should be:
{code:java}
try {
  PhoneNumber pn = pnu.parse(termAtt.toString(), defaultCountry);
{code}
Ideally, PhoneNumberUtil would take CharSequence (so you could directly pass
termAtt without toString()), but unfortunately Google's lib is too stupid to
use a more generic Java type.
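To illustrate the point without pulling in Lucene or libphonenumber, here is a minimal, self-contained sketch. It assumes a plain char[] plus a valid length as a stand-in for termAtt's buffer; the class TermTextDemo and the method digitsOnly are hypothetical names invented for this example, not part of either library. It shows that wrapping the buffer in a CharBuffer yields the same String as building it directly, and what a CharSequence-accepting method could look like.

```java
import java.nio.CharBuffer;

public class TermTextDemo {

    // The pattern from the patch: wrap the buffer, then call toString().
    // This allocates a CharBuffer only to throw it away again.
    static String viaCharBuffer(char[] buffer, int length) {
        return CharBuffer.wrap(buffer, 0, length).toString();
    }

    // Building the String directly from the valid part of the buffer.
    // No intermediate wrapper object is needed.
    static String direct(char[] buffer, int length) {
        return new String(buffer, 0, length);
    }

    // Hypothetical stand-in for a parser that accepts CharSequence.
    // It keeps ASCII digits only and never needs a String copy of the input.
    static String digitsOnly(CharSequence text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= '0' && c <= '9') {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Over-allocated buffer, as a term buffer typically is.
        char[] buffer = new char[16];
        String term = "(555) 010-9999";
        term.getChars(0, term.length(), buffer, 0);
        int length = term.length();

        // Both approaches produce equal Strings.
        System.out.println(viaCharBuffer(buffer, length).equals(direct(buffer, length)));

        // Any CharSequence can be passed as-is, with no toString() call.
        System.out.println(digitsOnly(CharBuffer.wrap(buffer, 0, length)));
    }
}
```

With an API shaped like digitsOnly, the caller could hand over any CharSequence directly; with a String-only API, one toString() call is the cheapest option available.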
Otherwise the patch looks fine, but it adds another external reference. You should
make all fields final; they will never change!
> Add a phone number normalization TokenFilter
> --------------------------------------------
>
> Key: LUCENE-3663
> URL: https://issues.apache.org/jira/browse/LUCENE-3663
> Project: Lucene - Java
> Issue Type: New Feature
> Components: modules/analysis
> Reporter: Santiago M. Mola
> Priority: Minor
> Attachments: PhoneFilter.java
>
>
> Phone numbers can be found in the wild in an infinite variety of formats
> (e.g. with spaces, parentheses, dashes, with or without country code, with
> letters substituted for digits). Some Lucene applications can therefore benefit
> from phone normalization with a TokenFilter that takes a phone number in any
> format and outputs it in a standard format, using a default country to guess
> the country code if it is not present.