[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112552#comment-13112552 ]

ofer fort commented on LUCENE-152:

Sorry for the reopen, but why is the constructor of KStemmer not public?

> [PATCH] KStem for Lucene
>
> Key: LUCENE-152
> URL: https://issues.apache.org/jira/browse/LUCENE-152
> Project: Lucene - Java
> Issue Type: Improvement
> Components: modules/analysis
> Environment: Operating System: other
> Platform: Other
> Reporter: Otis Gospodnetic
> Assignee: Robert Muir
> Priority: Minor
> Fix For: 3.3, 4.0
>
> Attachments: LUCENE-152.patch, LUCENE-152_alt.patch,
> LUCENE-152_optimization.patch, LUCENE-152_optimization.patch,
> kstemTestData.zip, lucid_kstem.tgz
>
> September 10th 2003 contribution from "Sergio Guzman-Lara"
>
> Original email:
> Hi all,
> I have ported the kstem stemmer to Java and incorporated it into
> Lucene. You can get the source code (Kstem.jar) from the following website:
> http://ciir.cs.umass.edu/downloads/
> Just click on "KStem Java Implementation" (you will need to register
> your e-mail, for free of course, with the CIIR --Center for Intelligent
> Information Retrieval, UMass -- and get an access code).
>
> Content of Kstem.jar:
> java/org/apache/lucene/analysis/KStemData1.java
> java/org/apache/lucene/analysis/KStemData2.java
> java/org/apache/lucene/analysis/KStemData3.java
> java/org/apache/lucene/analysis/KStemData4.java
> java/org/apache/lucene/analysis/KStemData5.java
> java/org/apache/lucene/analysis/KStemData6.java
> java/org/apache/lucene/analysis/KStemData7.java
> java/org/apache/lucene/analysis/KStemData8.java
> java/org/apache/lucene/analysis/KStemFilter.java
> java/org/apache/lucene/analysis/KStemmer.java
>
> KStemData1.java, ..., KStemData8.java   Contain several lists of words used by Kstem
> KStemmer.java                           Implements the Kstem algorithm
> KStemFilter.java                        Extends TokenFilter applying Kstem
>
> To compile
> unjar the file Kstem.jar to Lucene's "src" directory, and compile it
> there.
>
> What is Kstem?
> A stemmer designed by Bob Krovetz (for more information see
> http://ciir.cs.umass.edu/pubfiles/ir-35.pdf).
>
> Copyright issues
> This is open source. The actual license agreement is included at the
> top of every source file.
>
> Any comments/questions/suggestions are welcome,
> Sergio Guzman-Lara
> Senior Research Fellow
> CIIR UMass

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047171#comment-13047171 ]

Yonik Seeley commented on LUCENE-152:

bq. I think it's the same as the patch I uploaded

D'oh! I hate that the "All" tab in JIRA isn't selected by default (and hence one doesn't see stuff like file uploads ;-)
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047168#comment-13047168 ]

Robert Muir commented on LUCENE-152:

It looks good... I think it's the same as the patch I uploaded (_alt.patch)... only I used the .append syntactic sugar.
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042769#comment-13042769 ]

Ryan McKinley commented on LUCENE-152:

wow, closing a ticket from 2003! Thanks Robert, Yonik, etc
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042756#comment-13042756 ]

Robert Muir commented on LUCENE-152:

Thanks for reviewing, Ryan... I found some @author tags while doing another scan; I'll nuke those before committing.
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042755#comment-13042755 ]

Robert Muir commented on LUCENE-152:

No, nothing will be lost... and actually, since 'false' is passed here for ignoreCase, the constant does nothing... it just looks weird.
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042753#comment-13042753 ]

Ryan McKinley commented on LUCENE-152:

ok -- just making sure there is nothing lost with Version.LUCENE_40

+1 to commit
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042749#comment-13042749 ]

Robert Muir commented on LUCENE-152:

Personally I don't think we should do this: if you look through the analyzers you will find other similar code.

Plus, there's no need to support the 'broken' pre-3.1 behavior here, since this thing isn't planned to be released until 3.3.
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042737#comment-13042737 ]

Ryan McKinley commented on LUCENE-152:

great. Another thing that jumps out is

{code}
CharArrayMap d = new CharArrayMap(Version.LUCENE_31, 1000, false);
{code}

Looks like we need to refactor:

{code}
private static final CharArrayMap dict_ht = initializeDictHash();
{code}

so it can be passed the Lucene Version. I'm not sure we need it to be static either...

I can take a look at that if you are not already on it
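For anyone following along, here is a rough sketch of the two shapes being weighed in this comment: a dictionary built once in a static initializer versus one built per instance with a version argument. Everything in it is a stand-in made up for illustration (DictionarySketch, MatchVersion, the three-word list, a plain HashMap instead of CharArrayMap); it is not the KStemmer code. It also models the point made in the reply that, when ignoreCase is false, the version argument has nothing to act on.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Illustrative stand-in only; none of this is the KStemmer source. */
final class DictionarySketch {

    /** Hypothetical substitute for Lucene's Version constants. */
    enum MatchVersion { V31, V40 }

    /** Hypothetical word list; the real lists live in KStemData1..8. */
    private static final String[] WORDS = {"aide", "Apple", "stem"};

    // Shape 1: built once at class-load time and shared by every instance,
    // like the "private static final ... dict_ht" line quoted in the comment.
    private static final Map<String, Boolean> SHARED = build(MatchVersion.V31, false);

    // Shape 2: built per instance, so the caller chooses the version.
    private final Map<String, Boolean> dict;

    DictionarySketch(MatchVersion version, boolean ignoreCase) {
        this.dict = build(version, ignoreCase);
    }

    static Map<String, Boolean> build(MatchVersion version, boolean ignoreCase) {
        Map<String, Boolean> d = new HashMap<>(1000);
        for (String w : WORDS) {
            // In this sketch the version is consulted only on the case-folding
            // path, so with ignoreCase == false it cannot change the result.
            d.put(ignoreCase ? fold(w, version) : w, Boolean.TRUE);
        }
        return d;
    }

    /** Placeholder folding; version-specific differences are not modelled. */
    private static String fold(String word, MatchVersion version) {
        return word.toLowerCase(Locale.ROOT);
    }

    static Map<String, Boolean> shared() {
        return SHARED;
    }

    Map<String, Boolean> dict() {
        return dict;
    }

    public static void main(String[] args) {
        Map<String, Boolean> a = build(MatchVersion.V31, false);
        Map<String, Boolean> b = build(MatchVersion.V40, false);
        System.out.println("same contents regardless of version: " + a.equals(b));
    }
}
```

The per-instance shape costs a rebuild of the whole dictionary for every stemmer, which is one reason a shared static map can be the more attractive option when the version makes no difference.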
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042734#comment-13042734 ]

Robert Muir commented on LUCENE-152:

It's also worth mentioning: if the class is just used for appending (not sure if it is), we might be able to append to the CharTermAttribute directly instead, since it already implements Appendable.
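A minimal sketch of the "write to an Appendable" idea from this comment, using only the JDK. The method takes java.lang.Appendable, so the caller can hand it any implementation; a StringBuilder stands in here for CharTermAttribute, which is not on the classpath of this example. The class name, method names and the toy plural rule are all made up for illustration and have nothing to do with the real KStem rules.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

/** Illustrative only: shows output going straight to an Appendable. */
final class AppendableSketch {

    /**
     * Appends a toy "stem" of word to out: a single trailing 's' is dropped
     * from words longer than three characters unless the word ends in "ss".
     * This rule is a placeholder, not KStem.
     */
    static void writeStem(CharSequence word, Appendable out) throws IOException {
        int end = word.length();
        if (end > 3 && word.charAt(end - 1) == 's' && word.charAt(end - 2) != 's') {
            end--;
        }
        // Appendable.append(CharSequence, int, int) copies the sub-range
        // without creating an intermediate String or buffer class.
        out.append(word, 0, end);
    }

    /** Convenience wrapper; the StringBuilder is the stand-in target. */
    static String stem(CharSequence word) {
        StringBuilder term = new StringBuilder();
        try {
            writeStem(word, term);
        } catch (IOException e) {
            // StringBuilder never throws here, but the interface declares it.
            throw new UncheckedIOException(e);
        }
        return term.toString();
    }

    public static void main(String[] args) {
        System.out.println(stem("stemmers"));
    }
}
```

Whether this is enough depends on the caveat in the comment itself: it only works if the intermediate buffer class is used purely for appending and never for in-place edits.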
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042732#comment-13042732 ]

Robert Muir commented on LUCENE-152:

Ryan: maybe, I thought of this too myself looking at the patch. Then again there are probably other kinds of refactoring improvements we could make... honestly I didn't dig deep enough into this one to see if it can be solved just by 'add Appendable interface to CharsRef', or even to think about whether that's the right thing to do.

I don't think we should move it out of the analysis package for now (maybe I shouldn't have put it in util even in the patch) unless there's something else that actually wants to use it: I think this would be premature.
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042730#comment-13042730 ]

Ryan McKinley commented on LUCENE-152:

+1 Looks good!

Should OpenStringBuilder and CharsRef be combined? If not, is OpenStringBuilder useful outside of analysis? Should it be in org.apache.lucene.util?
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042278#comment-13042278 ]

Robert Muir commented on LUCENE-152:

We can zip it anyway; the existing stemmer tests use zipped files for this exact purpose.

Zipped, all the test data is about 500KB. Our snowball test data currently in src/test is 3.1MB zipped, so I think 500KB is OK.
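For readers following the point about zipped test data, here is a minimal sketch of how a test might read word/stem pairs out of a zip archive using only java.util.zip. It is not Lucene's actual test helper: the class name ZippedStemData, the entry name kstem_examples.txt, the tab-separated layout, and the placeholder pairs are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of Lucene.
public class ZippedStemData {

  /** Reads "word<TAB>expectedStem" lines from one named entry of a zip stream. */
  public static Map<String, String> load(InputStream zipped, String entryName) throws IOException {
    Map<String, String> pairs = new LinkedHashMap<>();
    try (ZipInputStream zip = new ZipInputStream(zipped)) {
      ZipEntry entry;
      while ((entry = zip.getNextEntry()) != null) {
        if (!entry.getName().equals(entryName)) {
          continue; // skip other files in the archive
        }
        // ZipInputStream reports end-of-stream at the end of the current entry.
        BufferedReader reader =
            new BufferedReader(new InputStreamReader(zip, StandardCharsets.UTF_8));
        String line;
        while ((line = reader.readLine()) != null) {
          String[] cols = line.split("\t");
          if (cols.length == 2) {
            pairs.put(cols[0], cols[1]);
          }
        }
        break; // the requested entry has been read
      }
    }
    return pairs;
  }

  /** Builds a tiny in-memory zip so the sketch runs without any file on disk. */
  public static byte[] sampleZip() throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
      zip.putNextEntry(new ZipEntry("kstem_examples.txt"));
      // Placeholder pairs only; these are not real KStem outputs.
      zip.write("word1\tstem1\nword2\tstem2\n".getBytes(StandardCharsets.UTF_8));
      zip.closeEntry();
    }
    return bytes.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    Map<String, String> pairs =
        load(new ByteArrayInputStream(sampleZip()), "kstem_examples.txt");
    // A real test would run each word through the stemmer and compare the results.
    pairs.forEach((word, stem) -> System.out.println(word + " -> " + stem));
  }
}
```

In an actual test the input stream would presumably come from a zipped resource next to the test class rather than from an in-memory array, and each loaded word would be passed through the stemmer before comparing against the expected stem.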
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042275#comment-13042275 ]

Ryan McKinley commented on LUCENE-152:

I'm fine with the 1.2MB history_of_the_united_states.txt in the tests
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042273#comment-13042273 ]

Robert Muir commented on LUCENE-152:

{quote}
This is not a patch, but simply a tarball of Lucid's version. Not sure what we want to do with some of the stuff (like the biggish test files)
{quote}

Personally, I don't think biggish test files are a problem; we already have these for the snowball stemmers, for example.
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042262#comment-13042262 ]

Robert Zotter commented on LUCENE-152:

+1
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041328#comment-13041328 ]

Jan Høydahl commented on LUCENE-152:

+1
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035647#comment-13035647 ]

Yonik Seeley commented on LUCENE-152:

heh - I had heard enough times that the license wouldn't permit it that I never looked into it myself.
http://markmail.org/message/zlett7y3dj76xa2f

Anyway, I did a bunch of optimizations for Lucid's version way back when. It makes sense for those to be contributed back here... I'll see what I can do (but it might be delayed a week by everyone being busy at Lucene Revolution).
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035616#comment-13035616 ]

Michael McCandless commented on LUCENE-152:

I think that's right.
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035519#comment-13035519 ]

Steven Rowe commented on LUCENE-152:

bq. Code is fine too afaik: http://www.apache.org/legal/3party.html

My interpretation of this is that we can directly include the KStem source code in Lucene/Solr's source tree, and then modify it at will, since its license (BSD style) is in Category A (authorized licenses).

Thoughts?
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034478#comment-13034478 ]

Mark Miller commented on LUCENE-152:

bq. More specifically: compile time dependencies on compiled BSD libraries are fine, but actually incorporating and releasing code that is under a BSD license is something we aren't supposed to do (last time I checked)

Code is fine too afaik: http://www.apache.org/legal/3party.html
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034456#comment-13034456 ]

Mark Miller commented on LUCENE-152:

To extract a bit for clarity:

{quote}
This form is not for new projects. This is for projects and PMCs that have already been created and are receiving a code donation into an existing codebase. Any code that was developed outside of the ASF SVN repository and our public mailing lists must be processed like this, even if the external developer is already an ASF committer.
{quote}
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034454#comment-13034454 ]

Mark Miller commented on LUCENE-152:

bq. Uh... that may be a stretch.

It's what the incubator seems to recommend, and the side we have erred on in the past.

http://incubator.apache.org/ip-clearance/index.html

If it was developed outside of Apache, we don't really know its IP history, and that's something we want to take seriously.
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034450#comment-13034450 ]

Hoss Man commented on LUCENE-152:

bq. even if its Apache 2 licensed code.

Uh... that may be a stretch.

More specifically: compile time dependencies on compiled BSD libraries are fine, but actually incorporating and *releasing* code that is under a BSD license is something we aren't supposed to do (last time I checked)
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034439#comment-13034439 ]
Mark Miller commented on LUCENE-152:
The general rule is that if it's a fair amount of code, and it was developed outside of the Apache system, we want a software grant, even if it's Apache 2 licensed code.
[jira] [Commented] (LUCENE-152) [PATCH] KStem for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034368#comment-13034368 ]
Steven Rowe commented on LUCENE-152:
If the original sources are BSD licensed, is a software grant required to incorporate the sources into the Lucene/Solr source tree?