[jira] Updated: (LUCENE-1150) The token types of the standard tokenizer is not accessible
[ https://issues.apache.org/jira/browse/LUCENE-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1150:
---
Fix Version/s: 2.3.2

Backported fix to 2.3.2.

The token types of the standard tokenizer is not accessible
---
Key: LUCENE-1150
URL: https://issues.apache.org/jira/browse/LUCENE-1150
Project: Lucene - Java
Issue Type: Bug
Components: Analysis
Affects Versions: 2.3
Reporter: Nicolas Lalevée
Assignee: Michael McCandless
Fix For: 2.3.2, 2.4
Attachments: LUCENE-1150.patch, LUCENE-1150.take2.patch

Because StandardTokenizerImpl is not public, these token types are not accessible:

{code:java}
public static final int ALPHANUM   = 0;
public static final int APOSTROPHE = 1;
public static final int ACRONYM    = 2;
public static final int COMPANY    = 3;
public static final int EMAIL      = 4;
public static final int HOST       = 5;
public static final int NUM        = 6;
public static final int CJ         = 7;

/**
 * @deprecated this solves a bug where HOSTs that end with '.' are identified
 *             as ACRONYMs. It is deprecated and will be removed in the next
 *             release.
 */
public static final int ACRONYM_DEP = 8;

// Type labels, indexed by the int constants above.
public static final String[] TOKEN_TYPES = new String[] {
    "<ALPHANUM>",
    "<APOSTROPHE>",
    "<ACRONYM>",
    "<COMPANY>",
    "<EMAIL>",
    "<HOST>",
    "<NUM>",
    "<CJ>",
    "<ACRONYM_DEP>"
};
{code}

So no custom TokenFilter can be based on the token type. In fact, even the StandardFilter cannot be written outside the org.apache.lucene.analysis.standard package.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
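To illustrate what the report is asking for, here is a minimal sketch of a filter that keeps or drops tokens by type. It does not use Lucene classes: TypedToken and TokenTypeFilterSketch are hypothetical stand-ins, and the bracketed label strings such as "<EMAIL>" are assumed to match what the standard tokenizer assigns. With the constants hidden in a non-public class, code outside the package would have to hard-code such labels itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a token: its text plus the type label
// assigned by the tokenizer. Not a Lucene class.
final class TypedToken {
    final String text;
    final String type;

    TypedToken(String text, String type) {
        this.text = text;
        this.type = type;
    }
}

final class TokenTypeFilterSketch {
    // Assumed type labels; a custom filter needs these to be publicly
    // reachable instead of copied by hand.
    static final String ALPHANUM = "<ALPHANUM>";
    static final String EMAIL = "<EMAIL>";
    static final String HOST = "<HOST>";

    // Returns the tokens whose type is not one of the unwanted types,
    // preserving their original order.
    static List<TypedToken> dropTypes(List<TypedToken> tokens, String... unwanted) {
        List<String> unwantedTypes = Arrays.asList(unwanted);
        List<TypedToken> kept = new ArrayList<>();
        for (TypedToken token : tokens) {
            if (!unwantedTypes.contains(token.type)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<TypedToken> tokens = Arrays.asList(
                new TypedToken("lucene", ALPHANUM),
                new TypedToken("dev@example.org", EMAIL),
                new TypedToken("example.org", HOST));
        // Drop e-mail addresses and host names, keep everything else.
        for (TypedToken token : dropTypes(tokens, EMAIL, HOST)) {
            System.out.println(token.text + " " + token.type);
        }
    }
}
```

In a real filter the same comparison would be made against the type carried by each token in the stream, which is why the labels need to be reachable from outside the package.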
[jira] Updated: (LUCENE-1150) The token types of the standard tokenizer is not accessible
[ https://issues.apache.org/jira/browse/LUCENE-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1150:
---
Attachment: LUCENE-1150.take2.patch

New patch attached, that also exposes the token types for WikipediaTokenizer.

I'll commit in a day or two.

The token types of the standard tokenizer is not accessible
---
Key: LUCENE-1150
URL: https://issues.apache.org/jira/browse/LUCENE-1150
Project: Lucene - Java
Issue Type: Bug
Components: Analysis
Affects Versions: 2.3
Reporter: Nicolas Lalevée
Assignee: Michael McCandless
Attachments: LUCENE-1150.patch, LUCENE-1150.take2.patch
[jira] Updated: (LUCENE-1150) The token types of the standard tokenizer is not accessible
[ https://issues.apache.org/jira/browse/LUCENE-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1150:
---
Attachment: LUCENE-1150.patch

Attached patch fixing this. I just added a new Constants.java that has static constants defined, and added a compile-time testcase to assert that these constants remain publicly accessible.

I will commit in a day or two.

The token types of the standard tokenizer is not accessible
---
Key: LUCENE-1150
URL: https://issues.apache.org/jira/browse/LUCENE-1150
Project: Lucene - Java
Issue Type: Bug
Components: Analysis
Affects Versions: 2.3
Reporter: Nicolas Lalevée
Assignee: Michael McCandless
Attachments: LUCENE-1150.patch
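For readers who do not have the patch at hand, the approach described in this update (a constants holder plus a compile-time test) could look roughly like the sketch below. This is not the contents of LUCENE-1150.patch: the class names TokenTypeConstants and ConstantsAccessCheck are hypothetical, the label strings are assumed, and everything is kept in one file for brevity. In a real project both classes would be public, and the check would live in a different package, so that it stops compiling if the constants ever lose their public visibility.

```java
// Hypothetical stand-in for a public constants holder; the names and
// layout used by the actual patch may differ.
final class TokenTypeConstants {
    public static final int ALPHANUM = 0;
    public static final int APOSTROPHE = 1;
    public static final int ACRONYM = 2;
    public static final int COMPANY = 3;
    public static final int EMAIL = 4;
    public static final int HOST = 5;
    public static final int NUM = 6;
    public static final int CJ = 7;
    // Marked deprecated in the listing quoted in the issue.
    public static final int ACRONYM_DEP = 8;

    // Assumed type labels, indexed by the int constants above.
    public static final String[] TOKEN_TYPES = {
        "<ALPHANUM>", "<APOSTROPHE>", "<ACRONYM>", "<COMPANY>", "<EMAIL>",
        "<HOST>", "<NUM>", "<CJ>", "<ACRONYM_DEP>"
    };

    private TokenTypeConstants() {
        // Constants holder, not meant to be instantiated.
    }
}

// Compile-time check: merely referencing every constant is the test.
// If a constant is removed or hidden, this class no longer compiles.
final class ConstantsAccessCheck {
    // Returns how many constants were referenced.
    static int countReferencedConstants() {
        int[] ids = {
            TokenTypeConstants.ALPHANUM, TokenTypeConstants.APOSTROPHE,
            TokenTypeConstants.ACRONYM, TokenTypeConstants.COMPANY,
            TokenTypeConstants.EMAIL, TokenTypeConstants.HOST,
            TokenTypeConstants.NUM, TokenTypeConstants.CJ,
            TokenTypeConstants.ACRONYM_DEP
        };
        return ids.length;
    }

    public static void main(String[] args) {
        int count = countReferencedConstants();
        // Each id is expected to index one label in TOKEN_TYPES.
        if (count != TokenTypeConstants.TOKEN_TYPES.length) {
            throw new AssertionError("constants and labels are out of sync");
        }
        System.out.println("referenced " + count + " token type constants");
    }
}
```

The runtime comparison in main is only an extra sanity check; the part that matches the description above is that the file has to compile.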