[jira] Updated: (LUCENE-1150) The token types of the standard tokenizer is not accessible

2008-04-09 Thread Michael McCandless (JIRA)

 [ https://issues.apache.org/jira/browse/LUCENE-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1150:
---

Fix Version/s: 2.3.2

Backported fix to 2.3.2.

 The token types of the standard tokenizer is not accessible
 ---

 Key: LUCENE-1150
 URL: https://issues.apache.org/jira/browse/LUCENE-1150
 Project: Lucene - Java
  Issue Type: Bug
  Components: Analysis
Affects Versions: 2.3
Reporter: Nicolas Lalevée
Assignee: Michael McCandless
 Fix For: 2.3.2, 2.4

 Attachments: LUCENE-1150.patch, LUCENE-1150.take2.patch


 Because StandardTokenizerImpl is not public, these token types are not 
 accessible:
 {code:java}
 public static final int ALPHANUM    = 0;
 public static final int APOSTROPHE  = 1;
 public static final int ACRONYM     = 2;
 public static final int COMPANY     = 3;
 public static final int EMAIL       = 4;
 public static final int HOST        = 5;
 public static final int NUM         = 6;
 public static final int CJ          = 7;
 /**
  * @deprecated this solves a bug where HOSTs that end with '.' are identified
  * as ACRONYMs. It is deprecated and will be removed in the next
  * release.
  */
 public static final int ACRONYM_DEP = 8;
 // Type labels, indexed by the int constants above.
 public static final String [] TOKEN_TYPES = new String [] {
     "<ALPHANUM>",
     "<APOSTROPHE>",
     "<ACRONYM>",
     "<COMPANY>",
     "<EMAIL>",
     "<HOST>",
     "<NUM>",
     "<CJ>",
     "<ACRONYM_DEP>"
 };
 {code}
 So no custom TokenFilter can be based on the token type. In fact, even the 
 StandardFilter cannot be written outside the 
 org.apache.lucene.analysis.standard package.
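
 To illustrate what the reporter is asking for, here is a minimal, 
 self-contained sketch of a filter that selects tokens by their type. It does 
 not use the Lucene API: TypedToken and TypeWhitelistFilter are hypothetical 
 stand-ins, and the type labels are assumed to be the bracketed strings from 
 TOKEN_TYPES. A real filter would presumably extend TokenFilter and read the 
 type from each Token instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a Lucene token: term text plus a type label. */
final class TypedToken {
    final String text;
    final String type;

    TypedToken(String text, String type) {
        this.text = text;
        this.type = type;
    }
}

/** Hypothetical filter that keeps only tokens whose type is in an allowed set. */
final class TypeWhitelistFilter {
    // Assumed type labels, mirroring entries of TOKEN_TYPES.
    static final String ALPHANUM = "<ALPHANUM>";
    static final String EMAIL = "<EMAIL>";
    static final String HOST = "<HOST>";

    private final List<String> allowed;

    TypeWhitelistFilter(String... allowedTypes) {
        this.allowed = Arrays.asList(allowedTypes);
    }

    /** Returns the tokens of the input whose type is allowed, in input order. */
    List<TypedToken> filter(List<TypedToken> input) {
        List<TypedToken> kept = new ArrayList<>();
        for (TypedToken token : input) {
            if (allowed.contains(token.type)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<TypedToken> tokens = Arrays.asList(
            new TypedToken("lucene", ALPHANUM),
            new TypedToken("dev@example.org", EMAIL),
            new TypedToken("example.org", HOST));
        // Keep only e-mail addresses and host names.
        for (TypedToken t : new TypeWhitelistFilter(EMAIL, HOST).filter(tokens)) {
            System.out.println(t.text + " " + t.type);
        }
    }
}
```

 The point of the sketch is that the filter has to name the type labels. If 
 the constants live in a non-public class, code outside the package can only 
 hard-code copies of them, as the stand-in does here.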

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



[jira] Updated: (LUCENE-1150) The token types of the standard tokenizer is not accessible

2008-01-25 Thread Michael McCandless (JIRA)

 [ https://issues.apache.org/jira/browse/LUCENE-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1150:
---

Attachment: LUCENE-1150.take2.patch

New patch attached, that also exposes the token types for WikipediaTokenizer.  
I'll commit in a day or two.

 The token types of the standard tokenizer is not accessible
 ---

 Key: LUCENE-1150
 URL: https://issues.apache.org/jira/browse/LUCENE-1150
 Project: Lucene - Java
  Issue Type: Bug
  Components: Analysis
Affects Versions: 2.3
Reporter: Nicolas Lalevée
Assignee: Michael McCandless
 Attachments: LUCENE-1150.patch, LUCENE-1150.take2.patch





[jira] Updated: (LUCENE-1150) The token types of the standard tokenizer is not accessible

2008-01-25 Thread Michael McCandless (JIRA)

 [ https://issues.apache.org/jira/browse/LUCENE-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1150:
---

Attachment: LUCENE-1150.patch

Attached patch fixing this.  I just added a new Constants.java that has static 
constants defined, and added a compile-time testcase to assert that these 
constants remain publicly accessible.

I will commit in a day or two.
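
For readers without the patch at hand, the idea of guarding the constants with 
a test can be sketched roughly as follows. This is not the attached patch: the 
names AccessibilityCheck and TokenTypeConstants are hypothetical, and the 
reflective check is an addition for illustration. In a real compile-time test 
the referencing class would sit in a different package, so that reducing the 
visibility of a constant breaks the build.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

final class AccessibilityCheck {

    /** Hypothetical public holder for the token type ids listed in the issue. */
    public static final class TokenTypeConstants {
        public static final int ALPHANUM = 0;
        public static final int APOSTROPHE = 1;
        public static final int ACRONYM = 2;
        public static final int COMPANY = 3;
        public static final int EMAIL = 4;
        public static final int HOST = 5;
        public static final int NUM = 6;
        public static final int CJ = 7;
        public static final int ACRONYM_DEP = 8;

        private TokenTypeConstants() {}
    }

    /** Compile-time part: this stops compiling if a constant is removed or renamed. */
    static int[] referencedIds() {
        return new int[] {
            TokenTypeConstants.ALPHANUM, TokenTypeConstants.APOSTROPHE,
            TokenTypeConstants.ACRONYM, TokenTypeConstants.COMPANY,
            TokenTypeConstants.EMAIL, TokenTypeConstants.HOST,
            TokenTypeConstants.NUM, TokenTypeConstants.CJ,
            TokenTypeConstants.ACRONYM_DEP
        };
    }

    /** Runtime part: true if the holder and all its declared fields are public static final. */
    static boolean allPublicConstants(Class<?> holder) {
        if (!Modifier.isPublic(holder.getModifiers())) {
            return false;
        }
        for (Field field : holder.getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue; // skip compiler-generated fields
            }
            int m = field.getModifiers();
            if (!(Modifier.isPublic(m) && Modifier.isStatic(m) && Modifier.isFinal(m))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(referencedIds().length + " constants referenced");
        System.out.println("all public: " + allPublicConstants(TokenTypeConstants.class));
    }
}
```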

 The token types of the standard tokenizer is not accessible
 ---

 Key: LUCENE-1150
 URL: https://issues.apache.org/jira/browse/LUCENE-1150
 Project: Lucene - Java
  Issue Type: Bug
  Components: Analysis
Affects Versions: 2.3
Reporter: Nicolas Lalevée
Assignee: Michael McCandless
 Attachments: LUCENE-1150.patch

