[jira] Updated: (LUCENE-1150) The token types of the standard tokenizer is not accessible

2008-04-09 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-1150:
---

Fix Version/s: 2.3.2

Backported fix to 2.3.2.

> The token types of the standard tokenizer is not accessible
> ---
>
> Key: LUCENE-1150
> URL: https://issues.apache.org/jira/browse/LUCENE-1150
> Project: Lucene - Java
> Issue Type: Bug
> Components: Analysis
> Affects Versions: 2.3
> Reporter: Nicolas Lalevée
> Assignee: Michael McCandless
> Fix For: 2.3.2, 2.4
>
> Attachments: LUCENE-1150.patch, LUCENE-1150.take2.patch
>
>
> Because StandardTokenizerImpl is not public, these token types are not
> accessible:
> {code:java}
> public static final int ALPHANUM    = 0;
> public static final int APOSTROPHE  = 1;
> public static final int ACRONYM     = 2;
> public static final int COMPANY     = 3;
> public static final int EMAIL       = 4;
> public static final int HOST        = 5;
> public static final int NUM         = 6;
> public static final int CJ          = 7;
> /**
>  * @deprecated this solves a bug where HOSTs that end with '.' are identified
>  * as ACRONYMs. It is deprecated and will be removed in the next
>  * release.
>  */
> public static final int ACRONYM_DEP = 8;
> public static final String [] TOKEN_TYPES = new String [] {
>     "<ALPHANUM>",
>     "<APOSTROPHE>",
>     "<ACRONYM>",
>     "<COMPANY>",
>     "<EMAIL>",
>     "<HOST>",
>     "<NUM>",
>     "<CJ>",
>     "<ACRONYM_DEP>"
> };
> {code}
> So no custom TokenFilter can be based on the token type. In fact, even the
> StandardFilter cannot be written outside the
> org.apache.lucene.analysis.standard package.
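
To illustrate what the report is asking for, here is a minimal sketch of a
filter that decides per token based on its type. It does not use Lucene:
TypedToken, TokenTypes and TypeBasedFilter are hypothetical stand-ins, and the
token carries the numeric type directly as a simplification. Only the numeric
values are taken from the constants quoted in the issue.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an analyzed token: its text plus a numeric type.
// This is NOT Lucene's Token class.
final class TypedToken {
    final String text;
    final int type;

    TypedToken(String text, int type) {
        this.text = text;
        this.type = type;
    }
}

// Hypothetical holder for the type constants; the values are copied from the
// constants quoted in the issue description.
final class TokenTypes {
    static final int ALPHANUM = 0;
    static final int EMAIL = 4;
    static final int HOST = 5;

    private TokenTypes() {}
}

final class TypeBasedFilter {
    // Returns the tokens whose type differs from rejectedType, in input order.
    static List<TypedToken> dropType(List<TypedToken> tokens, int rejectedType) {
        List<TypedToken> kept = new ArrayList<>();
        for (TypedToken token : tokens) {
            if (token.type != rejectedType) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

Code like dropType can only be written by callers that can see the type
constants, which is why their visibility matters to anyone writing a filter
outside the tokenizer's own package.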

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



[jira] Updated: (LUCENE-1150) The token types of the standard tokenizer is not accessible

2008-01-25 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-1150:
---

Attachment: LUCENE-1150.take2.patch

New patch attached; it also exposes the token types for WikipediaTokenizer.
I'll commit in a day or two.

> The token types of the standard tokenizer is not accessible
> ---
>
> Key: LUCENE-1150
> URL: https://issues.apache.org/jira/browse/LUCENE-1150
> Project: Lucene - Java
> Issue Type: Bug
> Components: Analysis
> Affects Versions: 2.3
> Reporter: Nicolas Lalevée
> Assignee: Michael McCandless
> Attachments: LUCENE-1150.patch, LUCENE-1150.take2.patch
>
>




[jira] Updated: (LUCENE-1150) The token types of the standard tokenizer is not accessible

2008-01-25 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-1150:
---

Attachment: LUCENE-1150.patch

Attached a patch that fixes this. I added a new Constants.java that defines
the token types as static constants, and added a compile-time test case to
assert that these constants remain publicly accessible.
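
A rough sketch of that idea, not the contents of the attached patch: the class
names TokenTypeConstants and TokenTypeAccessCheck are hypothetical, and the
values are copied from the constants quoted in the issue. The check is "compile
time" in the sense that it references each constant by name, so it stops
compiling if one is removed or renamed. To also catch a loss of visibility, the
checking class would have to live in a different package from the constants,
which a single-file sketch cannot show.

```java
// Hypothetical constants holder. In a real code base this would be a public
// class in its own file so that other packages can reference it.
final class TokenTypeConstants {
    public static final int ALPHANUM = 0;
    public static final int APOSTROPHE = 1;
    public static final int ACRONYM = 2;
    public static final int COMPANY = 3;
    public static final int EMAIL = 4;
    public static final int HOST = 5;
    public static final int NUM = 6;
    public static final int CJ = 7;
    public static final int ACRONYM_DEP = 8;

    private TokenTypeConstants() {}
}

// Hypothetical check: each constant is referenced by name, so this class fails
// to compile if any of them disappears or is renamed.
final class TokenTypeAccessCheck {
    static int[] referencedConstants() {
        return new int[] {
            TokenTypeConstants.ALPHANUM,
            TokenTypeConstants.APOSTROPHE,
            TokenTypeConstants.ACRONYM,
            TokenTypeConstants.COMPANY,
            TokenTypeConstants.EMAIL,
            TokenTypeConstants.HOST,
            TokenTypeConstants.NUM,
            TokenTypeConstants.CJ,
            TokenTypeConstants.ACRONYM_DEP
        };
    }
}
```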

I will commit in a day or two.

> The token types of the standard tokenizer is not accessible
> ---
>
> Key: LUCENE-1150
> URL: https://issues.apache.org/jira/browse/LUCENE-1150
> Project: Lucene - Java
> Issue Type: Bug
> Components: Analysis
> Affects Versions: 2.3
> Reporter: Nicolas Lalevée
> Assignee: Michael McCandless
> Attachments: LUCENE-1150.patch
>
>
