[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-09-01 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920375#comment-16920375
 ] 

ASF subversion and git services commented on LUCENE-8778:
-

Commit 3e1c472dec4f9c3bac7483fffa7569e985ee03ac in lucene-solr's branch 
refs/heads/branch_8x from Tomoko Uchida
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=3e1c472 ]

LUCENE-8778: Fix (uncapitalize) SPI names.


> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Minor
> Fix For: master (9.0)
>
> Attachments: LUCENE-8778-koreanNumber.patch, 
> LUCENE-8778-migrate-note.patch, ListAnalysisComponents.java, 
> SPINamesGenerator.java, Screenshot from 2019-04-26 02-17-48.png, Screenshot 
> from 2019-05-25 23-25-24.png, TestSPINames.java
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-09-01 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920374#comment-16920374
 ] 

ASF subversion and git services commented on LUCENE-8778:
-

Commit 8c124337914643a89c278fa7f9c56a78f2a6270e in lucene-solr's branch 
refs/heads/master from Tomoko Uchida
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=8c12433 ]

LUCENE-8778: Fix (uncapitalize) SPI names.


> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Minor
> Fix For: master (9.0)
>
> Attachments: LUCENE-8778-koreanNumber.patch, 
> LUCENE-8778-migrate-note.patch, ListAnalysisComponents.java, 
> SPINamesGenerator.java, Screenshot from 2019-04-26 02-17-48.png, Screenshot 
> from 2019-05-25 23-25-24.png, TestSPINames.java
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-08-04 Thread Tomoko Uchida (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899643#comment-16899643
 ] 

Tomoko Uchida commented on LUCENE-8778:
---

Here is an additional patch for updating MIGRATE.txt: 
[^LUCENE-8778-migrate-note.patch]

> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Minor
> Fix For: master (9.0)
>
> Attachments: LUCENE-8778-koreanNumber.patch, 
> LUCENE-8778-migrate-note.patch, ListAnalysisComponents.java, 
> SPINamesGenerator.java, Screenshot from 2019-04-26 02-17-48.png, Screenshot 
> from 2019-05-25 23-25-24.png, TestSPINames.java
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-08-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899642#comment-16899642
 ] 

ASF subversion and git services commented on LUCENE-8778:
-

Commit 09993c6cf0aea840a3f8d73b7596e94aa5569a54 in lucene-solr's branch 
refs/heads/master from Tomoko Uchida
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=09993c6 ]

LUCENE-8778: Add a migration note to MIGRATE.txt


> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Minor
> Fix For: master (9.0)
>
> Attachments: LUCENE-8778-koreanNumber.patch, 
> ListAnalysisComponents.java, SPINamesGenerator.java, Screenshot from 
> 2019-04-26 02-17-48.png, Screenshot from 2019-05-25 23-25-24.png, 
> TestSPINames.java
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-07-16 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16885969#comment-16885969
 ] 

ASF subversion and git services commented on LUCENE-8778:
-

Commit b5e8dc3af4227401233289fdf7433be9d7440ca1 in lucene-solr's branch 
refs/heads/branch_8x from Tomoko Uchida
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b5e8dc3 ]

LUCENE-8911: Backport LUCENE-8778 (improved analysis SPI name handling) to 8.x 
(#782)

This also keeps old names for backwards compatibility on 8.x


> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Minor
> Fix For: master (9.0)
>
> Attachments: LUCENE-8778-koreanNumber.patch, 
> ListAnalysisComponents.java, SPINamesGenerator.java, Screenshot from 
> 2019-04-26 02-17-48.png, Screenshot from 2019-05-25 23-25-24.png, 
> TestSPINames.java
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-07-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883473#comment-16883473
 ] 

ASF subversion and git services commented on LUCENE-8778:
-

Commit 6d79cc9e443b957baf72abcc7757c5edfb7ff1c1 in lucene-solr's branch 
refs/heads/jira/SOLR-13565 from Tomoko Uchida
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6d79cc9 ]

LUCENE-8907: Move change logs for LUCENE-8778 and following issues to the 9.0.0 
updates section.


> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Minor
> Fix For: master (9.0)
>
> Attachments: LUCENE-8778-koreanNumber.patch, 
> ListAnalysisComponents.java, SPINamesGenerator.java, Screenshot from 
> 2019-04-26 02-17-48.png, Screenshot from 2019-05-25 23-25-24.png, 
> TestSPINames.java
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-07-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883044#comment-16883044
 ] 

ASF subversion and git services commented on LUCENE-8778:
-

Commit 6d79cc9e443b957baf72abcc7757c5edfb7ff1c1 in lucene-solr's branch 
refs/heads/master from Tomoko Uchida
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6d79cc9 ]

LUCENE-8907: Move change logs for LUCENE-8778 and following issues to the 9.0.0 
updates section.


> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Minor
> Fix For: master (9.0)
>
> Attachments: LUCENE-8778-koreanNumber.patch, 
> ListAnalysisComponents.java, SPINamesGenerator.java, Screenshot from 
> 2019-04-26 02-17-48.png, Screenshot from 2019-05-25 23-25-24.png, 
> TestSPINames.java
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-07-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883028#comment-16883028
 ] 

ASF subversion and git services commented on LUCENE-8778:
-

Commit 824a196d780a5d5694a579969c2aaf45554555c8 in lucene-solr's branch 
refs/heads/branch_8_2 from Tomoko Uchida
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=824a196d ]

LUCENE-8907: Revert LUCENE-8778 and succeeding commits.


> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Minor
> Fix For: master (9.0)
>
> Attachments: LUCENE-8778-koreanNumber.patch, 
> ListAnalysisComponents.java, SPINamesGenerator.java, Screenshot from 
> 2019-04-26 02-17-48.png, Screenshot from 2019-05-25 23-25-24.png, 
> TestSPINames.java
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-07-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883017#comment-16883017
 ] 

ASF subversion and git services commented on LUCENE-8778:
-

Commit 59c7eb92cfcc766ee6866fe14ec609b09d41fcf6 in lucene-solr's branch 
refs/heads/branch_8x from Tomoko Uchida
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=59c7eb9 ]

LUCENE-8907: Revert LUCENE-8778 and succeeding commits.


> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Minor
> Fix For: master (9.0)
>
> Attachments: LUCENE-8778-koreanNumber.patch, 
> ListAnalysisComponents.java, SPINamesGenerator.java, Screenshot from 
> 2019-04-26 02-17-48.png, Screenshot from 2019-05-25 23-25-24.png, 
> TestSPINames.java
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-06-22 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16870301#comment-16870301
 ] 

ASF subversion and git services commented on LUCENE-8778:
-

Commit d1678a3a68685acd1432dfadcd5f49fa817273ad in lucene-solr's branch 
refs/heads/branch_8x from Tomoko Uchida
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d1678a3 ]

LUCENE-8778: Don't use Java 11 APIs on 8x branch.


> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Minor
> Fix For: master (9.0)
>
> Attachments: LUCENE-8778-koreanNumber.patch, 
> ListAnalysisComponents.java, SPINamesGenerator.java, Screenshot from 
> 2019-04-26 02-17-48.png, Screenshot from 2019-05-25 23-25-24.png, 
> TestSPINames.java
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-06-22 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16870231#comment-16870231
 ] 

ASF subversion and git services commented on LUCENE-8778:
-

Commit 559abd8f287b6841cef0a78898590f68c4d8823d in lucene-solr's branch 
refs/heads/master from Tomoko Uchida
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=559abd8 ]

LUCENE-8778: Update the changelog because this was backported to 8.x branch.


> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Minor
> Fix For: master (9.0)
>
> Attachments: LUCENE-8778-koreanNumber.patch, 
> ListAnalysisComponents.java, SPINamesGenerator.java, Screenshot from 
> 2019-04-26 02-17-48.png, Screenshot from 2019-05-25 23-25-24.png, 
> TestSPINames.java
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-06-22 Thread Tomoko Uchida (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16870228#comment-16870228
 ] 

Tomoko Uchida commented on LUCENE-8778:
---

I also pushed same changes to 8.x branch, because it would be annoying if new 
tokenizer/charfilter/tokefilter is added and backported to 8.x branch.

> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Minor
> Fix For: master (9.0)
>
> Attachments: LUCENE-8778-koreanNumber.patch, 
> ListAnalysisComponents.java, SPINamesGenerator.java, Screenshot from 
> 2019-04-26 02-17-48.png, Screenshot from 2019-05-25 23-25-24.png, 
> TestSPINames.java
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-06-22 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16870226#comment-16870226
 ] 

ASF subversion and git services commented on LUCENE-8778:
-

Commit 12e3451fb80e151b16f51493761b9dfc580f8fa0 in lucene-solr's branch 
refs/heads/branch_8x from Tomoko Uchida
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=12e3451 ]

LUCENE-8778: Define analyzer SPI names as static final fields and document the 
names in all analysis components. This also changes SPI loader to detect 
service names via the static NAME fields instead of class names.


> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Minor
> Fix For: master (9.0)
>
> Attachments: LUCENE-8778-koreanNumber.patch, 
> ListAnalysisComponents.java, SPINamesGenerator.java, Screenshot from 
> 2019-04-26 02-17-48.png, Screenshot from 2019-05-25 23-25-24.png, 
> TestSPINames.java
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-06-22 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16870227#comment-16870227
 ] 

ASF subversion and git services commented on LUCENE-8778:
-

Commit 0cc1753e76eb61000c1d3513448719e5f7923e48 in lucene-solr's branch 
refs/heads/branch_8x from Tomoko Uchida
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=0cc1753 ]

LUCENE-8778: Add SPI name and documentation for the KoreanNumberFilterFactory


> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Minor
> Fix For: master (9.0)
>
> Attachments: LUCENE-8778-koreanNumber.patch, 
> ListAnalysisComponents.java, SPINamesGenerator.java, Screenshot from 
> 2019-04-26 02-17-48.png, Screenshot from 2019-05-25 23-25-24.png, 
> TestSPINames.java
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-06-22 Thread Tomoko Uchida (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16870224#comment-16870224
 ] 

Tomoko Uchida commented on LUCENE-8778:
---

I was not aware that a new token filter {{KoreanNumberFilter}} was added in 
LUCENE-8812. Here is the patch  [^LUCENE-8778-koreanNumber.patch] .

> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Minor
> Fix For: master (9.0)
>
> Attachments: LUCENE-8778-koreanNumber.patch, 
> ListAnalysisComponents.java, SPINamesGenerator.java, Screenshot from 
> 2019-04-26 02-17-48.png, Screenshot from 2019-05-25 23-25-24.png, 
> TestSPINames.java
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-06-22 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16870223#comment-16870223
 ] 

ASF subversion and git services commented on LUCENE-8778:
-

Commit 2d4dea370a926b6e424068d6ca4981e608214e5f in lucene-solr's branch 
refs/heads/master from Tomoko Uchida
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2d4dea3 ]

LUCENE-8778: Add SPI name and documentation for the KoreanNumberFilterFactory


> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Minor
> Fix For: master (9.0)
>
> Attachments: ListAnalysisComponents.java, SPINamesGenerator.java, 
> Screenshot from 2019-04-26 02-17-48.png, Screenshot from 2019-05-25 
> 23-25-24.png, TestSPINames.java
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-06-21 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16870008#comment-16870008
 ] 

ASF subversion and git services commented on LUCENE-8778:
-

Commit 98c85a0e1a611d3a0337483ab87183bfeccec33b in lucene-solr's branch 
refs/heads/master from Tomoko Uchida
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=98c85a0 ]

LUCENE-8778: Define analyzer SPI names as static final fields and document the 
names in all analysis components. This also changes SPI loader to detect 
service names via the static NAME fields instead of class names.


> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Minor
> Attachments: ListAnalysisComponents.java, SPINamesGenerator.java, 
> Screenshot from 2019-04-26 02-17-48.png, Screenshot from 2019-05-25 
> 23-25-24.png, TestSPINames.java
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-05-25 Thread Tomoko Uchida (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16848292#comment-16848292
 ] 

Tomoko Uchida commented on LUCENE-8778:
---

Sorry, here is the diff between the master and my branch of 
{{AnalysisSPILoader}}: 
https://github.com/apache/lucene-solr/pull/654/files#diff-66189248b1160a3e940e33d09e7d82a8

This is the critical part that needs to be reviewed. I think I can validate all 
other parts with unit tests and regression tests I attached here, before 
pushing the (squash-merged) commit to the Apache gitbox .

> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: ListAnalysisComponents.java, SPINamesGenerator.java, 
> Screenshot from 2019-04-26 02-17-48.png, Screenshot from 2019-05-25 
> 23-25-24.png, TestSPINames.java
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-05-25 Thread Tomoko Uchida (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16848225#comment-16848225
 ] 

Tomoko Uchida commented on LUCENE-8778:
---

I updated the pull request.
 * Service lookup is performed on the case-insensitive map keys (as before). 
Preserve original names in the auxiliary Set for reference. Also add a check to 
make sure that the size of the lookup map and the original name set.
 * Restrict characters that can be used in the SPI names: only allow alphabets, 
digits, and underscores. (The last one is added for possible future uses.)
 * Document about case-insensitive lookup in each Javadoc tag (I took a 
screenshot). It's a bit redundant but at least they are not likely to be 
overlooked.

!Screenshot from 2019-05-25 23-25-24.png!

I would like to delay allowing "multiple names" or "aliases", because I don't 
want to implement a feature this could never be used. If Elasticsearch team or 
someone else is interested in using the analysis service loader, I think the 
modification is easy and we can work together then.

Can you please review the last changes in the service loader class? Here are 
the diff: 
[bf6fc2b|https://github.com/apache/lucene-solr/pull/654/commits/bf6fc2b4cc3db2848e2f79cfbb1fa917a834cf06],
 
[dab1f5a|https://github.com/apache/lucene-solr/pull/654/commits/dab1f5a9a8cd36ead1272ee99ef51200600a3b3b]

> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: ListAnalysisComponents.java, SPINamesGenerator.java, 
> Screenshot from 2019-04-26 02-17-48.png, Screenshot from 2019-05-25 
> 23-25-24.png, TestSPINames.java
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-05-25 Thread Tomoko Uchida (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16848181#comment-16848181
 ] 

Tomoko Uchida commented on LUCENE-8778:
---

Thanks for the comment, I will rethink the implementation of the SPI loader.

About the compatibility with Elasticsearch: this was also in my mind, however, 
I am not sure that we should introduce complexity for allowing multiple names 
or aliases (camel casing for backward compatibility and snake casing for 
Elasticsearch style). I imagine it will require time and effort to migrate 
their whole analysis framework, so I cannot think that they are interested in 
the move just for using Lucene's factories...

> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: ListAnalysisComponents.java, SPINamesGenerator.java, 
> Screenshot from 2019-04-26 02-17-48.png, TestSPINames.java
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-05-25 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16848161#comment-16848161
 ] 

Uwe Schindler commented on LUCENE-8778:
---

Instead of slowing down the case-insensitive lookup, I'd just handle a Set with 
the original names (for reference), but do the lookup on the lowercased map. 
You just have to be sure that you don't generate duplicates.

I would like to have the documented names preserving their original case, 
deduplicate those and lowercase them for lookup map. Also check that size of 
map and size of set are identical. In addition document that the lookup is 
case-insensitive (which it always was).

Another idea I had was to allow "multiple names" for same component (to allow 
compatibility with Elasticsearch). But I am not sure if this is worth. If 
Elasticsearch moves to "our" names, they should keep a legacy mapping 
internally.

> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: ListAnalysisComponents.java, SPINamesGenerator.java, 
> Screenshot from 2019-04-26 02-17-48.png, TestSPINames.java
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-05-25 Thread Tomoko Uchida (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16848148#comment-16848148
 ] 

Tomoko Uchida commented on LUCENE-8778:
---

Or, we may be able to document about the case-insensitive lookup (in 
somewhere), while allowing to use upper case characters for readability of the 
names.

> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: ListAnalysisComponents.java, SPINamesGenerator.java, 
> Screenshot from 2019-04-26 02-17-48.png, TestSPINames.java
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-05-25 Thread Tomoko Uchida (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16848132#comment-16848132
 ] 

Tomoko Uchida commented on LUCENE-8778:
---

bq. Now documented SPI names are camel cased, so it would be better that we 
preserve original names as is.

In that case, for example "htmlStrip" and "htmlstrip" can be registered as 
different services, of course this is problematic... :-/

I cannot come up with good ideas for now, should we go back to the lowercased 
names (the same as the current SPI loader generates)?

 

> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: ListAnalysisComponents.java, SPINamesGenerator.java, 
> Screenshot from 2019-04-26 02-17-48.png, TestSPINames.java
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-05-24 Thread Tomoko Uchida (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16847486#comment-16847486
 ] 

Tomoko Uchida commented on LUCENE-8778:
---

Hi [~thetaphi],

I did a regression test and fixed incorrect SPI names (they had been mistakenly 
copypateted in previous commits).
 # List SPI names and their class names of all analysis components with master 
branch. ([^ListAnalysisComponents.java])
 # Make sure that all components can be looked up by (old) SPI names with my 
branch (pull request). ([^TestSPINames.java])

Also I modified {{AnalysisSPILoader}} to preserve service names' letter casing. 
Now documented SPI names are camel cased, so it would be better that we 
preserve original names as is. Instead of lowercasing when registering the 
names, we can perform case-insensitive lookup. Because the service map is 
small, I guess the performance degredation will not matter much in this case 
(I'm not quite sure, but there might be better ways?). 
([diff|https://github.com/apache/lucene-solr/pull/654/commits/fc903379b0a53b690adf1c1ca5843b92444895ec])

This branch passed ant test & precommit.

> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: ListAnalysisComponents.java, SPINamesGenerator.java, 
> Screenshot from 2019-04-26 02-17-48.png, TestSPINames.java
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-05-22 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16846110#comment-16846110
 ] 

Uwe Schindler commented on LUCENE-8778:
---

Hi Tomoko, from my persepctive this looks fine.

> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: SPINamesGenerator.java, Screenshot from 2019-04-26 
> 02-17-48.png
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-05-22 Thread Tomoko Uchida (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16846037#comment-16846037
 ] 

Tomoko Uchida commented on LUCENE-8778:
---

Update: static SPI {{NAME}} fields and javadoc tags are added to all analysis 
components. I generated (new) SPI names by [^SPINamesGenerator.java] and 
copypasted to the each factory's source code.
 The branch (pull request) passed {{TestAnalysisSPILoader}}. I noticed Solr has 
a few custom filters and added NAME field for them, but TestAnalysisSPILoader 
does not concern them...

> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: SPINamesGenerator.java, Screenshot from 2019-04-26 
> 02-17-48.png
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-05-03 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832463#comment-16832463
 ] 

Uwe Schindler commented on LUCENE-8778:
---

Comments are now there. I fogot to click on "Start Review" (the mobile phone 
screens are too small to see everything!).

> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: Screenshot from 2019-04-26 02-17-48.png
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-04-30 Thread Tomoko Uchida (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830829#comment-16830829
 ] 

Tomoko Uchida commented on LUCENE-8778:
---

Hi Uwe,

seems your comments are not visible to me. Not sure, but review comments on a 
pull request should be added in the "Conversation" tab like this (and mails 
sent to the dev-list automatically)?

[https://github.com/apache/lucene-solr/pull/638#discussion_r279698470]

I found this help page. Let me know if there is anything I (reviewee side) 
should do.

[https://help.github.com/en/articles/reviewing-proposed-changes-in-a-pull-request]

 

 

> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: Screenshot from 2019-04-26 02-17-48.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-04-30 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830689#comment-16830689
 ] 

Uwe Schindler commented on LUCENE-8778:
---

I added some review comments. Looks fine to me, I just had to check the name 
lowercasing, but that's also done on lookup.

> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: Screenshot from 2019-04-26 02-17-48.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-04-30 Thread Tomoko Uchida (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830626#comment-16830626
 ] 

Tomoko Uchida commented on LUCENE-8778:
---

Hi Uwe,

I updated the pull request:
 * Now {{AnalysisSPILoader#reload()}} method looks up the names on the new 
"NAME" fields.
 * Remove unnecessary source code check (added in the previous commit).

Still tokenizers and token filters do not have the "NAME" fields, so this does 
not pass {{TestAnalysisSPILoader}}.

> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: Screenshot from 2019-04-26 02-17-48.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  * Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  * Officially document the SPI names in Javadocs.
>  * Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> and,
>  * Lookup SPI names on the new {{NAME}} fields. Instead deriving those from 
> class names.
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-04-26 Thread Tomoko Uchida (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826897#comment-16826897
 ] 

Tomoko Uchida commented on LUCENE-8778:
---

Thanks,

first I thought we can change the SPI loader in the next step, in other words, 
another issue. I will look into the lookup algorithm and change it (or give it 
a try).

 

> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: Screenshot from 2019-04-26 02-17-48.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  - Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  - Officially document the SPI names in Javadocs.
>  - Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-04-26 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826852#comment-16826852
 ] 

Uwe Schindler commented on LUCENE-8778:
---

Hi Tomoko,

the first step looks fine. But now we have 2 variants of the names: One is 
generated by the SPI lookup mechanism and one is listed as static final field.

To make it complete, we have to change {{AnalysisSPILoader}} (see 
https://github.com/apache/lucene-solr/blob/master/lucene/analysis/common/src/java/org/apache/lucene/analysis/util/AnalysisSPILoader.java)
 to no longer guess the name from the class name but instead do a reflective 
lookup on the new "NAME" field to get the name. We can then also remove the 
"suffixes" from the whole algorithm. I can assist with that. This allows us to 
also make the names customizeable.

This also makes the checks for the "NAME" static final field in our source code 
checker obsolete, as the SPI mechanism looks fo existence of the NAME field 
anyways.

> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: Screenshot from 2019-04-26 02-17-48.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  - Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  - Officially document the SPI names in Javadocs.
>  - Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8778) Define analyzer SPI names as static final fields and document the names in Javadocs

2019-04-25 Thread Tomoko Uchida (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826254#comment-16826254
 ] 

Tomoko Uchida commented on LUCENE-8778:
---

To begin with, I only changed char filters for design review.

[https://github.com/apache/lucene-solr/pull/654]

 

The custom Javadoc tag generates HTML like this.

!Screenshot from 2019-04-26 02-17-48.png!

 

 

> Define analyzer SPI names as static final fields and document the names in 
> Javadocs
> ---
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: Screenshot from 2019-04-26 02-17-48.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token 
> filter)  has a SPI name but currently this is not  documented anywhere.
> The goals of this issue:
>  - Define SPI names as static final field for each analysis component so that 
> users can get the component by name (via {{NAME}} static field.) This also 
> provides compile time safety.
>  - Officially document the SPI names in Javadocs.
>  - Add proper source validation rules to ant {{validate-source-patterns}} 
> target so that we can make sure that all analysis components have correct 
> field definitions and documentation
> (Just for quick reference) we now have:
>  * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
>  * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
>  * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org