[ https://issues.apache.org/jira/browse/SOLR-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ryan McKinley updated SOLR-813:
-------------------------------

    Attachment: SOLR-813.patch

Here is an update that addresses three concerns:

1. Position increments -- this keeps the tokens in sync with the input.
2. The previous version would stop processing after a number. That is, "aaa 1234 bbb" would not process "bbb".
3. Token types -- this changes the type to "DoubleMetaphone" rather than "ALPHANUM".

Here is the key part:

{code:java}
boolean isPhonetic = false;
String v = new String(t.termBuffer(), 0, t.termLength());

String primaryPhoneticValue = encoder.doubleMetaphone(v);
if (primaryPhoneticValue.length() > 0) {
  Token token = (Token) t.clone();
  if( inject ) {
    token.setPositionIncrement( 0 );
  }
  token.setType( TOKEN_TYPE );
  token.setTermBuffer(primaryPhoneticValue);
  remainingTokens.addLast(token);
  isPhonetic = true;
}

String alternatePhoneticValue = encoder.doubleMetaphone(v, true);
if (alternatePhoneticValue.length() > 0
    && !primaryPhoneticValue.equals(alternatePhoneticValue)) {
  Token token = (Token) t.clone();
  token.setPositionIncrement( 0 );
  token.setType( TOKEN_TYPE );
  token.setTermBuffer(alternatePhoneticValue);
  remainingTokens.addLast(token);
  isPhonetic = true;
}

// If we did not add something, then go to the next one...
if( !isPhonetic ) {
  t = next(in);
  t.setPositionIncrement( t.getPositionIncrement()+1 );
  return t;
}
{code}

> Add new DoubleMetaphone Filter and Factory
> ------------------------------------------
>
>                 Key: SOLR-813
>                 URL: https://issues.apache.org/jira/browse/SOLR-813
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 1.3
>            Reporter: Todd Feak
>            Priority: Minor
>         Attachments: SOLR-813.patch, SOLR-813.patch
>
>
> The existing PhoneticFilter allows for use of the DoubleMetaphone encoder.
> However, it doesn't expose the maxCodeLength() setting, and it ignores the
> alternate encodings that the encoder provides for some words. This new filter
> is not as generic as the PhoneticFilter, but allows more detailed control
> over the DoubleMetaphone encoder.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
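
For readers following the position-increment discussion in the comment above, here is a minimal, self-contained sketch of the idea. It is not the patch itself: the `Tok` class, `fakeEncode`, `process`, and `positions` are hypothetical names invented for illustration, and `fakeEncode` is a stand-in that merely uppercases letters (it is NOT Double Metaphone). Only the primary code is modelled, and keeping the original token in inject mode is an assumption. The sketch shows two things: an injected phonetic token gets a position increment of 0 so it shares a position with its source word, and when a word yields no code (such as "1234") the next emitted token's increment is bumped so later tokens keep their original positions.

```java
import java.util.ArrayList;
import java.util.List;

final class PhoneticInjectSketch {

    /** Simplified token: text, type, and increment relative to the previous token. */
    static final class Tok {
        final String text;
        final String type;
        final int posInc;

        Tok(String text, String type, int posInc) {
            this.text = text;
            this.type = type;
            this.posInc = posInc;
        }
    }

    /** Hypothetical stand-in encoder (NOT Double Metaphone): keeps letters, uppercased. */
    static String fakeEncode(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (Character.isLetter(c)) {
                sb.append(Character.toUpperCase(c));
            }
        }
        return sb.toString(); // empty for input such as "1234"
    }

    /** Emits tokens for each word; assumes every input word has increment 1. */
    static List<Tok> process(List<String> words, boolean inject) {
        List<Tok> out = new ArrayList<>();
        int pendingInc = 1; // increment owed to the next emitted token
        for (String w : words) {
            String code = fakeEncode(w);
            if (inject) {
                // Assumption: keep the original token, stack the code on top of it.
                out.add(new Tok(w, "ALPHANUM", 1));
                if (!code.isEmpty()) {
                    out.add(new Tok(code, "DoubleMetaphone", 0));
                }
            } else if (!code.isEmpty()) {
                // The code replaces the original token and inherits its position.
                out.add(new Tok(code, "DoubleMetaphone", pendingInc));
                pendingInc = 1;
            } else {
                // Nothing emitted: carry the gap forward instead of stopping.
                pendingInc++;
            }
        }
        return out;
    }

    /** Converts increments to absolute positions; the first token lands at 0. */
    static List<Integer> positions(List<Tok> toks) {
        List<Integer> result = new ArrayList<>();
        int pos = -1;
        for (Tok t : toks) {
            pos += t.posInc;
            result.add(pos);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = new ArrayList<>();
        input.add("aaa");
        input.add("1234");
        input.add("bbb");
        System.out.println(positions(process(input, false)));
        System.out.println(positions(process(input, true)));
    }
}
```

With the input "aaa 1234 bbb" and `inject` set to false, the token for "bbb" is still emitted after the number and keeps position 2, which is the "in sync with the input" property described in the comment.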