Re: How to use polish stemmer - Stempel - in schema.xml?

2010-11-25 Thread Jakub Godawa
In the end I chose hunspell-solr as the Polish language interpreter. It
understands Polish and is much easier to install. But look out! It does
not work with the current nightly build; it works well with Solr 1.4.1!

It just works well, and hey! I got Ukrainian out of the box too. I am
thinking of replacing the SnowballPorterFilters for all required
languages with *.aff and *.dic support.

Thanks for the help everyone!
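
The appeal of the *.aff and *.dic approach is that a stem is only accepted if it is a known word. The following is a toy sketch of that idea, not Hunspell's actual algorithm: the word list and suffix list are hypothetical stand-ins for real pl_PL.dic and pl_PL.aff data, and the class name is made up for illustration.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Toy illustration of dictionary-backed stemming. The word list and the
// suffix list are hypothetical stand-ins, not real pl_PL.dic / pl_PL.aff data.
class ToyDictionaryStemmer {
    private final Set<String> baseForms;  // plays the role of the .dic file
    private final List<String> suffixes;  // plays the role of the .aff rules

    ToyDictionaryStemmer(Set<String> baseForms, List<String> suffixes) {
        this.baseForms = baseForms;
        this.suffixes = suffixes;
    }

    // Returns a base form only if the word itself, or the word with one
    // known suffix stripped, is present in the dictionary.
    Optional<String> stem(String word) {
        if (baseForms.contains(word)) {
            return Optional.of(word);
        }
        for (String suffix : suffixes) {
            if (word.length() > suffix.length() && word.endsWith(suffix)) {
                String candidate = word.substring(0, word.length() - suffix.length());
                if (baseForms.contains(candidate)) {
                    return Optional.of(candidate);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        ToyDictionaryStemmer stemmer =
            new ToyDictionaryStemmer(Set.of("bilet"), List.of("y", "em", "u"));
        System.out.println(stemmer.stem("bilety"));
        System.out.println(stemmer.stem("bilet"));
    }
}
```

Because every candidate is checked against the word list, a form such as "bilety" can only reduce to a real base form such as "bilet", never to an invented one.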


Re: How to use polish stemmer - Stempel - in schema.xml?

2010-11-24 Thread Jakub Godawa
Yes, from the current nightly release setting up Stempel is quite easy.

All I did was:

svn co https://svn.apache.org/repos/asf/lucene/dev/trunk ./lucene-solr

cd lucene-solr/solr
ant example

cp ./contrib/analysis-extras/lucene-libs/lucene-analyzers-stempel-4.0-SNAPSHOT.jar ./lib
cp ./contrib/analysis-extras/build/apache-solr-analysis-extras-4.0-SNAPSHOT.jar ./lib

In solrconfig.xml:

<lib path="../../lib/apache-solr-analysis-extras-4.0-SNAPSHOT.jar" />
<lib path="../../lib/lucene-analyzers-stempel-4.0-SNAPSHOT.jar" />

In schema.xml:

<!-- Polish -->
<fieldType name="text_pl" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" />

    <filter class="solr.StempelPolishStemFilterFactory"
            language="Polish" />
  </analyzer>
</fieldType>

The end.

Anyway, I don't know whether it is the Polish stemmer or a badly
configured fieldType, but the results are just wrong.

Example:

index for type text_pl: bilety
query for type text_pl: bilet

Index Analyzer

org.apache.solr.analysis.StempelPolishStemFilterFactory
{language=Polish, luceneMatchVersion=LUCENE_24}

term position      1
term text          bilić
term type          word
source start,end   0,6
payload

Query Analyzer

org.apache.solr.analysis.StempelPolishStemFilterFactory
{language=Polish, luceneMatchVersion=LUCENE_24}

term position      1
term text          binąć
term type          word
source start,end   0,5
payload


But I expected both results to be "bilet", which is the base form.

Any clues on how to make it work properly for Polish? Maybe someone has
good experience with hunspell-solr and Polish dictionaries?

Thanks for letting me know!

Cheers,
Jakub Godawa.





Re: How to use polish stemmer - Stempel - in schema.xml?

2010-11-24 Thread Jakub Godawa
On Wed, 2010-11-24 at 19:00 +0100, Jakub Godawa wrote:
 Yes, from the current nightly release setting up Stempel is quite easy.
Thanks to Robert Muir :)



Re: How to use polish stemmer - Stempel - in schema.xml?

2010-11-15 Thread Jakub Godawa
I tried to reach the authors twice, but with no luck. I've seen some
posts where people were finally able to launch it (without much pain).
I don't know. If any pro would be so nice as to try running Stempel on
his/her machine and paste me a verbose step-by-step solution, I
would really appreciate it.

Cheers,
Jakub Godawa.


Re: How to use polish stemmer - Stempel - in schema.xml?

2010-11-15 Thread Robert Muir
https://issues.apache.org/jira/browse/SOLR-2237


Re: How to use polish stemmer - Stempel - in schema.xml?

2010-11-12 Thread Lance Norskog
I think you have to compile all of the stempel source including your
filter factory into one jar at the same time. Everybody does this; I
don't know how different Java versions make class file binaries.

On Thu, Nov 11, 2010 at 3:06 AM, Jakub Godawa jakub.god...@gmail.com wrote:
 Hi! Sorry for such a break, but I was moving house... anyway:

 1. I took the
 ~/apache-solr/src/java/org/apache/solr/analysis/StandardFilterFactory.java
 file, saved a copy as StempelFilterFactory.java, and modified it in Vim
 like this:

 package org.getopt.solr.analysis;

 import org.apache.lucene.analysis.TokenStream;
 import org.apache.lucene.analysis.standard.StandardFilter;

 public class StempelTokenFilterFactory extends BaseTokenFilterFactory {
  public StempelFilter create(TokenStream input) {
    return new StempelFilter(input);
  }
 }

 2. Then I put the file into the extracted stempel-1.0.jar tree, under
 ./org/getopt/solr/analysis/
 3. Then I created a class from it:
 jar -cf StempelTokenFilterFactory.class StempelFilterFactory.java
 4. Then I created a new stempel-1.0.jar archive:
 jar -cf stempel-1.0.jar -C ./stempel-1.0/ .
 5. Then in schema.xml I put:

    <fieldType name="text_pl" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="org.getopt.solr.analysis.StempelTokenFilterFactory" />
      </analyzer>
    </fieldType>

 6. I started the Solr server and I received the following error:

 2010-11-11 11:50:56 org.apache.solr.common.SolrException log
 SEVERE: java.lang.ClassFormatError: Incompatible magic value 1347093252 in class file org/getopt/solr/analysis/StempelTokenFilterFactory
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
 ...

 Question: What is wrong? :) I use jar (fastjar) 0.98 to create jars.
 I googled that error, but none of the answers gave me an idea of what
 is wrong in my .java file.

 Please help, as I believe I am close to the end of that subject.

 Cheers,
 Jakub Godawa.
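
The magic value in the error above can be decoded directly. A valid class file starts with the bytes 0xCAFEBABE, while 1347093252 is 0x504B0304, the "PK" signature that begins a ZIP or JAR archive. That suggests the file named StempelTokenFilterFactory.class is really an archive: in step 3 the jar tool packed the .java source instead of compiling it, which is the job of a compiler such as javac. A minimal sketch of the decoding follows; the class and method names are made up for illustration.

```java
// Decodes the "magic value" reported by a ClassFormatError.
class MagicValue {
    static final int CLASS_FILE_MAGIC = 0xCAFEBABE; // first four bytes of every valid .class file
    static final int ZIP_LOCAL_HEADER = 0x504B0304; // "PK\3\4", start of a ZIP/JAR local file header

    // Formats the value as eight hex digits so it can be compared by eye.
    static String toHex(int magic) {
        return String.format("0x%08X", magic);
    }

    // Names the file format that the four leading bytes belong to.
    static String describe(int magic) {
        if (magic == CLASS_FILE_MAGIC) {
            return "Java class file";
        }
        if (magic == ZIP_LOCAL_HEADER) {
            return "ZIP/JAR archive";
        }
        return "unknown format";
    }

    public static void main(String[] args) {
        int reported = 1347093252; // value taken from the stack trace above
        System.out.println(toHex(reported) + " -> " + describe(reported));
    }
}
```

Running it prints `0x504B0304 -> ZIP/JAR archive`.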

 2010/11/3 Lance Norskog goks...@gmail.com:
 Here's the problem: Solr is a little dumb about these Filter classes,
 and so you have to make a Factory object for the Stempel Filter.

 There are a lot of other FilterFactory classes. You would have to just
 copy one and change the names to Stempel and it might actually work.

 This will take some Solr programming- perhaps the author can help you?
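
The filter/factory split described above can be sketched with stand-in types. Nothing here is the real Lucene or Solr API: SketchTokenStream, SketchStemFilter and SketchStemFilterFactory are hypothetical names, and the "stemmer" merely drops a trailing "y". In Solr 1.4.x the real factory would extend BaseTokenFilterFactory, as mentioned elsewhere in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Demo entry point. All types in this file are stand-ins, not Lucene/Solr classes.
class FactorySketchDemo {
    // Hypothetical stemmer used only for the demo: drops one trailing "y".
    static String dropTrailingY(String word) {
        return word.endsWith("y") ? word.substring(0, word.length() - 1) : word;
    }

    public static void main(String[] args) {
        SketchStemFilterFactory factory =
            new SketchStemFilterFactory(FactorySketchDemo::dropTrailingY);
        SketchTokenStream source = () -> List.of("bilety", "lotnicze");
        System.out.println(factory.create(source).tokens());
    }
}

// Stand-in for a token stream: just a list of tokens.
interface SketchTokenStream {
    List<String> tokens();
}

// The filter wraps another stream and transforms each token.
class SketchStemFilter implements SketchTokenStream {
    private final SketchTokenStream input;
    private final UnaryOperator<String> stemmer;

    SketchStemFilter(SketchTokenStream input, UnaryOperator<String> stemmer) {
        this.input = input;
        this.stemmer = stemmer;
    }

    @Override
    public List<String> tokens() {
        List<String> result = new ArrayList<>();
        for (String token : input.tokens()) {
            result.add(stemmer.apply(token));
        }
        return result;
    }
}

// The factory is the class a schema would name; it builds a filter per stream.
class SketchStemFilterFactory {
    private final UnaryOperator<String> stemmer;

    SketchStemFilterFactory(UnaryOperator<String> stemmer) {
        this.stemmer = stemmer;
    }

    SketchTokenStream create(SketchTokenStream input) {
        return new SketchStemFilter(input, stemmer);
    }
}
```

The point of the indirection is that the schema only names the factory; the factory is then asked to wrap each incoming token stream in a fresh filter.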

 On Tue, Nov 2, 2010 at 7:08 AM, Jakub Godawa jakub.god...@gmail.com wrote:
 Sorry, I am not a Java programmer at all. I would appreciate more
 verbose (or step-by-step) help.

 2010/11/2 Bernd Fehling bernd.fehl...@uni-bielefeld.de:

 So you call org.getopt.solr.analysis.StempelTokenFilterFactory.
 In this case I would assume a file StempelTokenFilterFactory.class
 in your directory org/getopt/solr/analysis/.

 And a class which extends the BaseTokenFilterFactory, right?
 ...
 public class StempelTokenFilterFactory extends BaseTokenFilterFactory 
 implements ResourceLoaderAware {
 ...
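
The relationship between the class name used in schema.xml and the file expected inside the jar is mechanical: dots become slashes and ".class" is appended. A minimal sketch, with a made-up helper name:

```java
// Maps a fully qualified class name to the path of its entry inside a jar.
class ClassEntryPath {
    static String toEntryPath(String className) {
        // Package separators become directory separators.
        return className.replace('.', '/') + ".class";
    }

    public static void main(String[] args) {
        System.out.println(
            toEntryPath("org.getopt.solr.analysis.StempelTokenFilterFactory"));
    }
}
```

For the factory named in this thread the result is org/getopt/solr/analysis/StempelTokenFilterFactory.class, which is the file that has to exist, in compiled form, inside the jar.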



 On 02.11.2010 14:20, Jakub Godawa wrote:
 This is what stempel-1.0.jar consists of after jar -xf:

 jgod...@ubuntu:~/apache-solr-1.4.1/ifaq/lib$ ls -R org/
 org/:
 egothor  getopt

 org/egothor:
 stemmer

 org/egothor/stemmer:
 Cell.class     Diff.class    Gener.class  MultiTrie2.class
 Optimizer2.class  Reduce.class        Row.class    TestAll.class
 TestLoad.class  Trie$StrEnum.class
 Compile.class  DiffIt.class  Lift.class   MultiTrie.class
 Optimizer.class   Reduce$Remap.class  Stock.class  Test.class
 Trie.class

 org/getopt:
 stempel

 org/getopt/stempel:
 Benchmark.class  lucene  Stemmer.class

 org/getopt/stempel/lucene:
 StempelAnalyzer.class  StempelFilter.class
 jgod...@ubuntu:~/apache-solr-1.4.1/ifaq/lib$ ls -R META-INF/
 META-INF/:
 MANIFEST.MF
 jgod...@ubuntu:~/apache-solr-1.4.1/ifaq/lib$ ls -R res
 res:
 tables

 res/tables:
 readme.txt  stemmer_1000.out  stemmer_100.out  stemmer_2000.out
 stemmer_200.out  stemmer_500.out  stemmer_700.out

 2010/11/2 Bernd Fehling bernd.fehl...@uni-bielefeld.de:
 Hi Jakub,

 if you unzip your stempel-1.0.jar do you have the
 required directory structure and file in there?
 org/getopt/stempel/lucene/StempelFilter.class

 Regards,
 Bernd
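
The same check can be done without unpacking the archive, since a jar is a ZIP file. The sketch below uses java.util.zip from the JDK; the class name and the writeDemoJar helper are made up for illustration, and the demo archive it writes is a stand-in for the real stempel-1.0.jar.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Checks whether a jar (a ZIP archive) contains a given entry.
class JarEntryCheck {
    static boolean containsEntry(Path jar, String entryName) {
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            return zip.getEntry(entryName) != null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo helper: writes a small archive with empty entries under the given names.
    static Path writeDemoJar(String... entryNames) {
        try {
            Path jar = Files.createTempFile("demo", ".jar");
            try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(jar))) {
                for (String name : entryNames) {
                    out.putNextEntry(new ZipEntry(name));
                    out.closeEntry();
                }
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String wanted = "org/getopt/stempel/lucene/StempelFilter.class";
        // Pass a real jar path as the first argument, or fall back to the demo archive.
        Path jar = args.length > 0 ? Paths.get(args[0]) : writeDemoJar(wanted);
        System.out.println(wanted + " present: " + containsEntry(jar, wanted));
    }
}
```

Passing the path of a real jar as the first argument checks that file instead of the demo archive.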

 On 02.11.2010 13:54, Jakub Godawa wrote:
 Erick, I've put the jar files like that before. I also added the
 directive and put the file in instanceDir/lib

 The remaining problem is that even though the files are loaded:
 2010-11-02 13:20:48 org.apache.solr.core.SolrResourceLoader replaceClassLoader
 INFO: Adding 'file:/home/jgodawa/apache-solr-1.4.1/ifaq/lib/stempel-1.0.jar' to classloader

 I am not able to use the FilterFactory... maybe I am attempting it in
 a wrong way?

 Cheers,
 Jakub Godawa.

 2010/11/2 Erick Erickson erickerick...@gmail.com:
 The polish stemmer jar file needs to be findable by Solr, if you copy
 it to 

Re: How to use polish stemmer - Stempel - in schema.xml?

2010-11-12 Thread Jakub Godawa
Am I not doing that in step 4? I am compiling the whole folder
that was extracted earlier, now with the new class file added.


Re: How to use polish stemmer - Stempel - in schema.xml?

2010-11-12 Thread Lance Norskog
I don't know if the Stempel jar includes the Java source. At this point
I think you should ask the author of Stempel to make a Solr front-end
for it. It would be very simple for him.


Jakub Godawa wrote:

Am I not doing it in the point no 4? I am compiling all the folder
that was extracted before, but now with that new class file.

2010/11/12 Lance Norskoggoks...@gmail.com:
   

I think you have to compile all of the stempel source including your
filter factory into one jar at the same time. Everybody does this; I
don't know how different Java versions make class file binaries.


Re: How to use polish stemmer - Stempel - in schema.xml?

2010-11-11 Thread Jakub Godawa
Hi! Sorry for such a break, but I was moving house... anyway:

1. I took the
~/apache-solr/src/java/org/apache/solr/analysis/StandardFilterFactory.java
file and modified it in Vim (saved as StempelFilterFactory.java) like
this:

package org.getopt.solr.analysis;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardFilter;

public class StempelTokenFilterFactory extends BaseTokenFilterFactory {
  public StempelFilter create(TokenStream input) {
    return new StempelFilter(input);
  }
}

2. Then I put the file into the extracted stempel-1.0.jar, in
./org/getopt/solr/analysis/
3. Then I created a class from it:
jar -cf StempelTokenFilterFactory.class StempelFilterFactory.java
4. Then I created a new stempel-1.0.jar archive:
jar -cf stempel-1.0.jar -C ./stempel-1.0/ .
5. Then in schema.xml I put:

<fieldType name="text_pl" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="org.getopt.solr.analysis.StempelTokenFilterFactory" />
  </analyzer>
</fieldType>

6. I started the Solr server and received the following error:

2010-11-11 11:50:56 org.apache.solr.common.SolrException log
SEVERE: java.lang.ClassFormatError: Incompatible magic value
1347093252 in class file
org/getopt/solr/analysis/StempelTokenFilterFactory
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
at 
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
...

Question: What is wrong? :) I use jar (fastjar) 0.98 to create jars.
I googled that error, but no answer gave me an idea of what is wrong
in my .java file.

Please help, as I believe I am close to the end of that subject.

Cheers,
Jakub Godawa.
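
A note on the error in step 6: the "magic value" is the first four bytes of the file that the class loader was asked to define. The small Java sketch below decodes the number from the log. The class and method names are invented for illustration. It shows that 1347093252 is 0x504B0304, the bytes "PK\3\4" that open a ZIP (and therefore jar) archive, whereas a compiled class file starts with 0xCAFEBABE. That is consistent with step 3, where jar -cf wrote an archive named StempelTokenFilterFactory.class instead of compiling the source.

```java
public class MagicValueCheck {
    // Every compiled Java class file starts with these four bytes.
    static final int CLASS_MAGIC = 0xCAFEBABE;

    // Classifies a four-byte file signature given as a big-endian int.
    static String describeMagic(int magic) {
        int b0 = (magic >>> 24) & 0xFF;
        int b1 = (magic >>> 16) & 0xFF;
        int b2 = (magic >>> 8) & 0xFF;
        int b3 = magic & 0xFF;
        if (magic == CLASS_MAGIC) {
            return "java class file";
        }
        // ZIP local file header signature: 'P', 'K', 0x03, 0x04.
        if (b0 == 'P' && b1 == 'K' && b2 == 3 && b3 == 4) {
            return "zip/jar archive";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        int reported = 1347093252; // value from the ClassFormatError in the log
        System.out.println(Integer.toHexString(reported)); // prints 504b0304
        System.out.println(describeMagic(reported));       // prints zip/jar archive
    }
}
```

Producing a real class file would presumably mean running javac on the source with the Solr and Lucene jars on the classpath, then packaging the resulting .class file with jar.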

2010/11/3 Lance Norskog goks...@gmail.com:
 Here's the problem: Solr is a little dumb about these Filter classes,
 and so you have to make a Factory object for the Stempel Filter.

 There are a lot of other FilterFactory classes. You would have to just
 copy one and change the names to Stempel and it might actually work.

 This will take some Solr programming- perhaps the author can help you?


Re: How to use polish stemmer - Stempel - in schema.xml?

2010-11-02 Thread Jakub Godawa
Thank you Bernd! I couldn't make it run though. Here is my problem:

1. There is a file ~/apache-solr-1.4.1/ifaq/lib/stempel-1.0.jar
2. In ~/apache-solr-1.4.1/ifaq/solr/conf/solrconfig.xml there is a
directive: <lib path="../lib/stempel-1.0.jar" />
3. In ~/apache-solr-1.4.1/ifaq/solr/conf/schema.xml there is fieldType:

(...)
  <!-- Polish -->
  <fieldType name="text_pl" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="org.getopt.stempel.lucene.StempelFilter" />
      <!-- <filter class="org.getopt.solr.analysis.StempelTokenFilterFactory"
           protected="protwords.txt" /> -->
    </analyzer>
  </fieldType>
(...)

4. The jar file is loaded, but I got an error:
SEVERE: Could not start SOLR. Check solr/home property
java.lang.NoClassDefFoundError: org/apache/lucene/analysis/TokenFilter
  at java.lang.ClassLoader.defineClass1(Native Method)
  at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
  at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
(...)

5. Using the other class gave me this one:
SEVERE: org.apache.solr.common.SolrException: Error loading class
'org.getopt.solr.analysis.StempelTokenFilterFactory'
  at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:375)
  at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:390)
(...)

The question is: how do I make <fieldType /> and <filter /> work with Stempel? :)

Cheers,
Jakub Godawa.
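
The two errors in points 4 and 5 look like different failures. In the first, a class was found but something it depends on (org.apache.lucene.analysis.TokenFilter) was not visible to the loader that defined it. In the second, the named class itself was most likely not found at all. A rough way to tell these cases apart outside Solr is to probe class names against a class loader. The sketch below uses only the standard Class.forName call; the class name ClassProbe is made up, and which loader Solr actually uses for plugins is not something this sketch knows.

```java
public class ClassProbe {
    // Returns true if the loader can find and link the named class.
    // initialize=false avoids running static initializers during the probe.
    static boolean isLoadable(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // ClassNotFoundException: the class is on no visible path.
            // LinkageError (e.g. NoClassDefFoundError): a dependency is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = ClassProbe.class.getClassLoader();
        String[] names = {
            "java.lang.String",
            "org.apache.lucene.analysis.TokenFilter",
            "org.getopt.stempel.lucene.StempelFilter",
            "org.getopt.solr.analysis.StempelTokenFilterFactory"
        };
        for (String name : names) {
            System.out.println(name + " -> " + isLoadable(name, loader));
        }
    }
}
```

Run with the Lucene and Stempel jars on the classpath, this would show which of the names from the schema are actually present.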

2010/10/29 Bernd Fehling bernd.fehl...@uni-bielefeld.de:
 Hi Jakub,

 I have ported the KStemmer for use in most recent Solr trunk version.
 My stemmer is located in the lib directory of Solr 
 solr/lib/KStemmer-2.00.jar
 because it belongs to Solr.

 Write it as FilterFactory and use it as Filter like:
 <filter class="de.ubbielefeld.solr.analysis.KStemFilterFactory"
         protected="protwords.txt" />

 This is how my fieldType looks like:

 <fieldType name="text_kstem" class="solr.TextField" positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.WhitespaceTokenizerFactory" />
     <filter class="solr.StopFilterFactory" ignoreCase="true"
             words="stopwords.txt" enablePositionIncrements="false" />
     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
             generateNumberParts="1" catenateWords="1" catenateNumbers="1"
             catenateAll="0" splitOnCaseChange="1" />
     <filter class="solr.LowerCaseFilterFactory" />
     <filter class="de.ubbielefeld.solr.analysis.KStemFilterFactory"
             protected="protwords.txt" />
     <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.WhitespaceTokenizerFactory" />
     <filter class="solr.StopFilterFactory" ignoreCase="true"
             words="stopwords.txt" />
     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
             generateNumberParts="1" catenateWords="0" catenateNumbers="0"
             catenateAll="0" splitOnCaseChange="1" />
     <filter class="solr.LowerCaseFilterFactory" />
     <filter class="de.ubbielefeld.solr.analysis.KStemFilterFactory"
             protected="protwords.txt" />
     <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
   </analyzer>
 </fieldType>

 Regards,
 Bernd



 Am 28.10.2010 14:56, schrieb Jakub Godawa:
 Hi!
 There is a Polish stemmer, http://www.getopt.org/stempel/, and I have
 problems connecting it with Solr 1.4.1.
 Questions:

 1. Where EXACTLY do I put the stempel-1.0.jar file?
 2. How do I register the file, so I can build a fieldType like:

 <fieldType name="text_pl" class="solr.TextField">
   <analyzer class="org.getopt.solr.analysis.StempelTokenFilterFactory"/>
 </fieldType>

 3. Is that the right approach to make it work?

 Thanks for verbose explanation,
 Jakub.



Re: How to use polish stemmer - Stempel - in schema.xml?

2010-11-02 Thread Jakub Godawa
Erick, I had already placed the jar files like that. I also added the
directive and put the file in instanceDir/lib.

The problem remains even though the files are loaded:
2010-11-02 13:20:48 org.apache.solr.core.SolrResourceLoader replaceClassLoader
INFO: Adding 'file:/home/jgodawa/apache-solr-1.4.1/ifaq/lib/stempel-1.0.jar'
to classloader

I am still not able to use the FilterFactory... maybe I am going about it
the wrong way?

Cheers,
Jakub Godawa.
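
For what it is worth, the log line above already tells us where the lib directive pointed. The sketch below is plain java.nio path arithmetic, not Solr code, and the class name LibPathSketch is invented. It resolves the relative path ../lib/stempel-1.0.jar from the earlier solrconfig.xml against two candidate base directories. Only the ifaq/solr base reproduces the path in the log, which suggests (as an inference from this one log line, not from Solr documentation) that the directive is resolved against the instance directory rather than against conf/.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class LibPathSketch {
    // Resolves a relative lib path against a base directory and removes "..".
    static Path resolveLib(String baseDir, String libPath) {
        return Paths.get(baseDir).resolve(libPath).normalize();
    }

    public static void main(String[] args) {
        String lib = "../lib/stempel-1.0.jar";
        // Candidate 1: the directory that contains conf/ (assumed instance dir).
        System.out.println(resolveLib("/home/jgodawa/apache-solr-1.4.1/ifaq/solr", lib));
        // Candidate 2: the conf/ directory that holds solrconfig.xml.
        System.out.println(resolveLib("/home/jgodawa/apache-solr-1.4.1/ifaq/solr/conf", lib));
    }
}
```

The first candidate yields .../ifaq/lib/stempel-1.0.jar, matching the "Adding ... to classloader" message; the second yields .../ifaq/solr/lib/stempel-1.0.jar, which does not.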

2010/11/2 Erick Erickson erickerick...@gmail.com:
 The Polish stemmer jar file needs to be findable by Solr. If you copy
 it to solr_home/lib and restart Solr, you should be set.

 Alternatively, you can add another lib directive to the solrconfig.xml
 file (there are several examples in that file already).

 I'm a little confused about not being able to find TokenFilter. Is that
 still a problem?

 HTH
 Erick


Re: How to use polish stemmer - Stempel - in schema.xml?

2010-11-02 Thread Bernd Fehling
Hi Jakub,

if you unzip your stempel-1.0.jar do you have the
required directory structure and file in there?
org/getopt/stempel/lucene/StempelFilter.class

Regards,
Bernd
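
The check asked for above can also be done without unpacking the jar. The following is a minimal sketch using java.util.zip from the JDK. The class JarEntryCheck and its helper methods are hypothetical names, and the demo archive it builds is a throwaway stand-in, not the real stempel-1.0.jar. The point is the mapping from a fully qualified class name, as written in a filter element, to the entry path that must exist inside the jar.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class JarEntryCheck {
    // Maps a fully qualified class name to the entry path a jar must contain.
    static String entryNameFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Returns true if the archive has an entry for the given class name.
    static boolean jarContainsClass(File jar, String className) {
        try (ZipFile zip = new ZipFile(jar)) {
            return zip.getEntry(entryNameFor(className)) != null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a temporary archive with empty entries, for demonstration only.
    static File writeDemoJar(String... entryNames) {
        try {
            File jar = File.createTempFile("demo", ".jar");
            jar.deleteOnExit();
            try (ZipOutputStream out = new ZipOutputStream(new FileOutputStream(jar))) {
                for (String name : entryNames) {
                    out.putNextEntry(new ZipEntry(name));
                    out.closeEntry();
                }
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File jar = writeDemoJar("org/getopt/stempel/lucene/StempelFilter.class");
        // prints true
        System.out.println(jarContainsClass(jar, "org.getopt.stempel.lucene.StempelFilter"));
        // prints false
        System.out.println(jarContainsClass(jar, "org.getopt.solr.analysis.StempelTokenFilterFactory"));
    }
}
```

Pointing jarContainsClass at the real jar file instead of the demo archive would answer the question directly.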


Re: How to use polish stemmer - Stempel - in schema.xml?

2010-11-02 Thread Jakub Godawa
This is what stempel-1.0.jar consists of after jar -xf:

jgod...@ubuntu:~/apache-solr-1.4.1/ifaq/lib$ ls -R org/
org/:
egothor  getopt

org/egothor:
stemmer

org/egothor/stemmer:
Cell.class  Diff.class  Gener.class  MultiTrie2.class
Optimizer2.class  Reduce.class  Row.class  TestAll.class
TestLoad.class  Trie$StrEnum.class
Compile.class  DiffIt.class  Lift.class  MultiTrie.class
Optimizer.class  Reduce$Remap.class  Stock.class  Test.class
Trie.class

org/getopt:
stempel

org/getopt/stempel:
Benchmark.class  lucene  Stemmer.class

org/getopt/stempel/lucene:
StempelAnalyzer.class  StempelFilter.class
jgod...@ubuntu:~/apache-solr-1.4.1/ifaq/lib$ ls -R META-INF/
META-INF/:
MANIFEST.MF
jgod...@ubuntu:~/apache-solr-1.4.1/ifaq/lib$ ls -R res
res:
tables

res/tables:
readme.txt  stemmer_1000.out  stemmer_100.out  stemmer_2000.out
stemmer_200.out  stemmer_500.out  stemmer_700.out


Re: How to use polish stemmer - Stempel - in schema.xml?

2010-11-02 Thread Bernd Fehling

So you call org.getopt.solr.analysis.StempelTokenFilterFactory.
In this case I would assume a file StempelTokenFilterFactory.class
in your directory org/getopt/solr/analysis/.

And a class which extends the BaseTokenFilterFactory, right?
...
public class StempelTokenFilterFactory extends BaseTokenFilterFactory 
implements ResourceLoaderAware {
...
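
To make the factory idea concrete for a non-Java reader, here is a self-contained sketch of the pattern only. It deliberately does not use the Solr or Lucene classes: TokenStream and TokenFilterFactory below are simplified stand-ins, and PlaceholderStemFilter just cuts terms to five characters in place of real stemming. In a real plugin the factory would extend BaseTokenFilterFactory and its create method would return the Stempel filter, whose constructor arguments I have not checked.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class FactorySketch {
    // Stand-in for a token stream: simply an iterator over terms.
    interface TokenStream extends Iterator<String> {}

    // Stand-in for the factory contract: wrap an input stream in a filter.
    interface TokenFilterFactory {
        TokenStream create(TokenStream input);
    }

    // Builds a token stream from fixed terms (plays the role of a tokenizer).
    static TokenStream streamOf(String... terms) {
        final Iterator<String> it = Arrays.asList(terms).iterator();
        return new TokenStream() {
            public boolean hasNext() { return it.hasNext(); }
            public String next() { return it.next(); }
        };
    }

    // Hypothetical filter: truncates terms to five characters.
    // This is a placeholder, NOT the Stempel algorithm.
    static final class PlaceholderStemFilter implements TokenStream {
        private final TokenStream input;
        PlaceholderStemFilter(TokenStream input) { this.input = input; }
        public boolean hasNext() { return input.hasNext(); }
        public String next() {
            String term = input.next();
            return term.length() > 5 ? term.substring(0, 5) : term;
        }
    }

    // The factory is the class a schema would name; it only creates the filter.
    static final class PlaceholderStemFilterFactory implements TokenFilterFactory {
        public TokenStream create(TokenStream input) {
            return new PlaceholderStemFilter(input);
        }
    }

    // Collects all terms from a stream into a list.
    static List<String> drain(TokenStream stream) {
        List<String> out = new ArrayList<String>();
        while (stream.hasNext()) {
            out.add(stream.next());
        }
        return out;
    }

    public static void main(String[] args) {
        TokenFilterFactory factory = new PlaceholderStemFilterFactory();
        // prints [bilet, bilet]
        System.out.println(drain(factory.create(streamOf("bilety", "bilet"))));
    }
}
```

The schema names the factory rather than the filter because the framework needs an object it can configure once and then ask for a fresh filter per token stream.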




Re: How to use polish stemmer - Stempel - in schema.xml?

2010-11-02 Thread Jakub Godawa
Sorry, I am not a Java programmer at all. I would appreciate more
verbose (or step-by-step) help.

2010/11/2 Bernd Fehling bernd.fehl...@uni-bielefeld.de:

 So you call org.getopt.solr.analysis.StempelTokenFilterFactory.
 In this case I would assume a file StempelTokenFilterFactory.class
 in your directory org/getopt/solr/analysis/.

 And a class which extends the BaseTokenFilterFactory, right?
 ...
 public class StempelTokenFilterFactory extends BaseTokenFilterFactory 
 implements ResourceLoaderAware {
 ...



 On 02.11.2010 14:20, Jakub Godawa wrote:
 This is what stempel-1.0.jar consists of after jar -xf:

 jgod...@ubuntu:~/apache-solr-1.4.1/ifaq/lib$ ls -R org/
 org/:
 egothor  getopt

 org/egothor:
 stemmer

 org/egothor/stemmer:
 Cell.class     Diff.class    Gener.class  MultiTrie2.class
 Optimizer2.class  Reduce.class        Row.class    TestAll.class
 TestLoad.class  Trie$StrEnum.class
 Compile.class  DiffIt.class  Lift.class   MultiTrie.class
 Optimizer.class   Reduce$Remap.class  Stock.class  Test.class
 Trie.class

 org/getopt:
 stempel

 org/getopt/stempel:
 Benchmark.class  lucene  Stemmer.class

 org/getopt/stempel/lucene:
 StempelAnalyzer.class  StempelFilter.class
 jgod...@ubuntu:~/apache-solr-1.4.1/ifaq/lib$ ls -R META-INF/
 META-INF/:
 MANIFEST.MF
 jgod...@ubuntu:~/apache-solr-1.4.1/ifaq/lib$ ls -R res
 res:
 tables

 res/tables:
 readme.txt  stemmer_1000.out  stemmer_100.out  stemmer_2000.out
 stemmer_200.out  stemmer_500.out  stemmer_700.out
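
The listing above shows org/getopt/stempel/lucene/StempelFilter.class but nothing under org/getopt/solr/analysis/, which would be consistent with the "Error loading class" message for StempelTokenFilterFactory. As a rough sketch (not from the thread), the same check can be done from Java with the standard java.util.jar API instead of unpacking the jar. The class name JarEntryCheck and the throwaway demo jar are made up for illustration; pass the path to a real jar as the first argument to check that one instead.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

/** Hypothetical helper: checks whether a jar contains the .class file for a class name. */
final class JarEntryCheck {

    /** Turns "a.b.C" into the entry path "a/b/C.class" used inside a jar. */
    static String entryNameFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns true if the jar at jarPath has an entry for className. */
    static boolean containsClass(Path jarPath, String className) {
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            return jar.getJarEntry(entryNameFor(className)) != null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a tiny temporary jar with the given (empty) entries; used only for the demo. */
    static Path writeDemoJar(String... entryNames) {
        try {
            Path jarPath = Files.createTempFile("demo", ".jar");
            try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jarPath))) {
                for (String name : entryNames) {
                    out.putNextEntry(new JarEntry(name));
                    out.closeEntry();
                }
            }
            return jarPath;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Use the jar given on the command line, or fall back to a demo jar
        // that mimics the relevant part of the listing above.
        Path jar = args.length > 0
                ? Paths.get(args[0])
                : writeDemoJar("org/getopt/stempel/lucene/StempelFilter.class");
        String[] classNames = {
            "org.getopt.stempel.lucene.StempelFilter",
            "org.getopt.solr.analysis.StempelTokenFilterFactory"
        };
        for (String className : classNames) {
            System.out.println(className + " present: " + containsClass(jar, className));
        }
    }
}
```

If the second class is reported as missing from the real jar, no classpath setting can make Solr load it; the class has to be written and packaged first.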

 2010/11/2 Bernd Fehling bernd.fehl...@uni-bielefeld.de:
 Hi Jakub,

 if you unzip your stempel-1.0.jar do you have the
 required directory structure and file in there?
 org/getopt/stempel/lucene/StempelFilter.class

 Regards,
 Bernd

 On 02.11.2010 13:54, Jakub Godawa wrote:
 Erick, I've put the jar files like that before. I also added the
 directive and put the file in instanceDir/lib.

 The problem remains even though the file is loaded:
 2010-11-02 13:20:48 org.apache.solr.core.SolrResourceLoader replaceClassLoader
 INFO: Adding 'file:/home/jgodawa/apache-solr-1.4.1/ifaq/lib/stempel-1.0.jar' to classloader

 I am still not able to use the FilterFactory... maybe I am going about it
 the wrong way?

 Cheers,
 Jakub Godawa.

 2010/11/2 Erick Erickson erickerick...@gmail.com:
 The Polish stemmer jar file needs to be findable by Solr; if you copy
 it to solr_home/lib and restart Solr you should be set.

 Alternatively, you can add another lib directive to the solrconfig.xml
 file (there are several examples in that file already).

 I'm a little confused about not being able to find TokenFilter; is that
 still a problem?

 HTH
 Erick

 On Tue, Nov 2, 2010 at 8:07 AM, Jakub Godawa jakub.god...@gmail.com 
 wrote:

 Thank you Bernd! I couldn't make it run though. Here is my problem:

 1. There is a file ~/apache-solr-1.4.1/ifaq/lib/stempel-1.0.jar
 2. In ~/apache-solr-1.4.1/ifaq/solr/conf/solrconfig.xml there is a
 directive: <lib path="../lib/stempel-1.0.jar" />
 3. In ~/apache-solr-1.4.1/ifaq/solr/conf/schema.xml there is fieldType:

 (...)
  <!-- Polish -->
  <fieldType name="text_pl" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="org.getopt.stempel.lucene.StempelFilter" />
      <!-- <filter class="org.getopt.solr.analysis.StempelTokenFilterFactory"
                   protected="protwords.txt" /> -->
    </analyzer>
  </fieldType>
 (...)

 4. The jar file is loaded, but I got an error:
 SEVERE: Could not start SOLR. Check solr/home property
 java.lang.NoClassDefFoundError: org/apache/lucene/analysis/TokenFilter
      at java.lang.ClassLoader.defineClass1(Native Method)
      at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
      at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
 (...)

 5. A different class gave me this one:
 SEVERE: org.apache.solr.common.SolrException: Error loading class 'org.getopt.solr.analysis.StempelTokenFilterFactory'
      at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:375)
      at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:390)
 (...)

 Question is: How to make <fieldType /> and <filter /> work with that
 Stempel? :)

 Cheers,
 Jakub Godawa.


Re: How to use polish stemmer - Stempel - in schema.xml?

2010-11-02 Thread Lance Norskog
Here's the problem: Solr is a little dumb about these Filter classes,
and so you have to make a Factory object for the Stempel Filter.

There are a lot of other FilterFactory classes. You would have to just
copy one and change the names to Stempel and it might actually work.

This will take some Solr programming; perhaps the author can help you?
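
To make the Filter versus Factory point concrete, here is a minimal self-contained sketch of the pattern. Everything in it is a stand-in invented for illustration: TokenSource plays the role of Lucene's TokenStream, ToyStemFilter plays the role of StempelFilter (its rule only drops a trailing "y" and is not Polish stemming), and TokenFilterFactorySketch plays the role of the Solr factory type. A real Solr 1.4.1 factory would presumably extend BaseTokenFilterFactory, as in Bernd's snippet, and return a StempelFilter from create(); the filter's constructor arguments would need to be checked against the jar.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Demo driver; declared first so that `java FactoryDemo.java` picks it up. */
final class FactoryDemo {
    /** Pulls every token out of a source into a list. */
    static List<String> drain(TokenSource source) {
        List<String> out = new ArrayList<>();
        for (String token = source.next(); token != null; token = source.next()) {
            out.add(token);
        }
        return out;
    }

    public static void main(String[] args) {
        // The configuration would name the factory; the factory builds the filter.
        TokenFilterFactorySketch factory = new ToyStemFilterFactory();
        TokenSource stemmed = factory.create(new ListTokenSource("bilety", "bilet"));
        System.out.println(drain(stemmed));
    }
}

/** Stand-in for a token stream: returns tokens one at a time, null at the end. */
interface TokenSource {
    String next();
}

/** Simple source backed by a fixed list of tokens. */
final class ListTokenSource implements TokenSource {
    private final Iterator<String> tokens;

    ListTokenSource(String... tokens) {
        this.tokens = Arrays.asList(tokens).iterator();
    }

    public String next() {
        return tokens.hasNext() ? tokens.next() : null;
    }
}

/** Stand-in for a stemming filter: it needs an input, so it has no no-arg constructor. */
final class ToyStemFilter implements TokenSource {
    private final TokenSource input;

    ToyStemFilter(TokenSource input) {
        this.input = input;
    }

    public String next() {
        String token = input.next();
        // Toy rule only: drop a trailing "y". Real stemming is table-driven.
        if (token != null && token.length() > 1 && token.endsWith("y")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }
}

/** Stand-in for the factory contract that wraps an input in a filter. */
interface TokenFilterFactorySketch {
    TokenSource create(TokenSource input);
}

/** The factory: constructible without arguments, builds the filter on demand. */
final class ToyStemFilterFactory implements TokenFilterFactorySketch {
    public TokenSource create(TokenSource input) {
        return new ToyStemFilter(input);
    }
}
```

Running main prints [bilet, bilet]. The point of the extra class is that the factory can be created from a bare class name, while the filter itself can only be built once there is an input stream to wrap.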


How to use polish stemmer - Stempel - in schema.xml?

2010-10-28 Thread Jakub Godawa
Hi!
There is a Polish stemmer, http://www.getopt.org/stempel/, and I have
problems connecting it with Solr 1.4.1.
Questions:

1. Where EXACTLY do I put the stempel-1.0.jar file?
2. How do I register the file, so I can build a fieldType like:

<fieldType name="text_pl" class="solr.TextField">
  <analyzer class="org.getopt.solr.analysis.StempelTokenFilterFactory"/>
</fieldType>

3. Is that the right approach to make it work?

Thanks for a verbose explanation,
Jakub.


Re: How to use polish stemmer - Stempel - in schema.xml?

2010-10-28 Thread Bernd Fehling
Hi Jakub,

I have ported the KStemmer for use in the most recent Solr trunk version.
My stemmer is located in the lib directory of Solr, solr/lib/KStemmer-2.00.jar,
because it belongs to Solr.

Write it as a FilterFactory and use it as a filter like this:
<filter class="de.ubbielefeld.solr.analysis.KStemFilterFactory" protected="protwords.txt" />

This is what my fieldType looks like:

<fieldType name="text_kstem" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="false" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1"
            catenateAll="0" splitOnCaseChange="1" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="de.ubbielefeld.solr.analysis.KStemFilterFactory"
            protected="protwords.txt" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="0" catenateNumbers="0"
            catenateAll="0" splitOnCaseChange="1" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="de.ubbielefeld.solr.analysis.KStemFilterFactory"
            protected="protwords.txt" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
  </analyzer>
</fieldType>

Regards,
Bernd


