[ https://issues.apache.org/jira/browse/SOLR-379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597185#action_12597185 ]
Otis Gospodnetic commented on SOLR-379:
---------------------------------------

It would be great to have this available in Solr. Because of KStem's incompatible license, I don't know how we can handle this. Incompatible license really just means we cannot distribute the KStem code (and cannot have it in the Lucene/Solr svn repository). Usually when incompatible licensing is a problem we say "modify the build script to download the needed library on demand if it's not present locally". This is what some of the Lucene contrib components do, for example.

However, looking at your ZIP file I see:

-rw-r--r--   2836 15-Oct-2007 17:16:46 src/java/org/apache/solr/analysis/KStemFilterFactory.java
-rw-r--r--  42222 15-Oct-2007 16:28:08 src/java/org/apache/lucene/analysis/KStemmer.java
-rw-r--r--   4501 15-Oct-2007 17:08:38 src/java/org/apache/lucene/analysis/KStemFilter.java
-rw-r--r--  34259 15-Oct-2007 16:28:24 src/java/org/apache/lucene/analysis/KStemData8.java
-rw-r--r--  39918 15-Oct-2007 16:28:28 src/java/org/apache/lucene/analysis/KStemData7.java
-rw-r--r--  41412 15-Oct-2007 16:28:34 src/java/org/apache/lucene/analysis/KStemData6.java
-rw-r--r--  40457 15-Oct-2007 16:28:40 src/java/org/apache/lucene/analysis/KStemData5.java
-rw-r--r--  40823 15-Oct-2007 16:28:44 src/java/org/apache/lucene/analysis/KStemData4.java
-rw-r--r--  39808 15-Oct-2007 16:28:50 src/java/org/apache/lucene/analysis/KStemData3.java
-rw-r--r--  42696 15-Oct-2007 16:29:00 src/java/org/apache/lucene/analysis/KStemData2.java
-rw-r--r--  40020 15-Oct-2007 16:29:14 src/java/org/apache/lucene/analysis/KStemData1.java

But this is really just a duplicate of what's in http://ciir.cs.umass.edu/downloads/files/KStem.jar, plus the Solr-specific KStemFilterFactory.java. So, could we simply download KStem.jar on demand? And is KStemFilterFactory.java really copyright CIIR?
If we can change that to ASL, then we can include it in the repo, and with a modified build that downloads KStem.jar before compiling, this class would compile.

> KStem Token Filter
> ------------------
>
>                 Key: SOLR-379
>                 URL: https://issues.apache.org/jira/browse/SOLR-379
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>            Reporter: Pieter Berkel
>            Priority: Minor
>         Attachments: KStemSolr.zip
>
>
> A Lucene / Solr implementation of the KStem stemmer. Full credit goes to
> Harry Wagner for adapting the Lucene version found here:
> http://ciir.cs.umass.edu/cgi-bin/downloads/downloads.cgi
> Background discussion to this stemmer (including licensing issues) can be
> found in this thread:
> http://www.nabble.com/Embedded-about-50--faster-for-indexing-tf4325720.html#a12376295
> I've made some minor changes to KStemFilterFactory so that it compiles
> cleanly against trunk:
> 1) removed some unnecessary imports
> 2) changed the init() method parameters introduced by SOLR-215
> 3) moved KStemFilterFactory into package org.apache.solr.analysis
> Once compiled and included in your Solr war (or as a jar in your lib
> directory), the KStem filter can be used in your schema very easily:
> <analyzer type="index">
>   <tokenizer class="solr.StandardTokenizerFactory"/>
>   <filter class="solr.StopFilterFactory" ignoreCase="true"
>           words="stopwords.txt"/>
>   <filter class="solr.StandardFilterFactory"/>
>   <filter class="solr.LowerCaseFilterFactory"/>
>   <filter class="solr.KStemFilterFactory" cacheSize="20000"/>
>   <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> </analyzer>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.