Hi. Did you get this working?
I think this is a problem in many Germanic languages. Using "brute force
dictionary splitting" of compound words will generate too many words.
There are some notes about using myspell dictionaries in the tsearch
project here:
http://www.sai.msu.su/~megera/wiki/Tsearch_V2
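
To make the "too many words" point concrete, here is a rough sketch of brute-force dictionary splitting. It is not Solr's code: the function name `decompound` and the tiny word list are made up for illustration, and the parameter names merely mirror the filter attributes quoted further down in this thread (minWordSize, minSubwordSize, maxSubwordSize, onlyLongestMatch).

```python
def decompound(word, dictionary, min_word_size=5, min_subword_size=2,
               max_subword_size=15, only_longest_match=False):
    """Return every dictionary entry found as a substring of `word`.

    Illustrative sketch only; `dictionary` is a set of lowercase words.
    """
    if len(word) < min_word_size:
        return []
    lowered = word.lower()
    found = []
    # Try every start position and every allowed subword length.
    for start in range(len(lowered) - min_subword_size + 1):
        longest = None
        for size in range(min_subword_size, max_subword_size + 1):
            if start + size > len(lowered):
                break
            candidate = lowered[start:start + size]
            if candidate in dictionary:
                if only_longest_match:
                    # Keep only the longest hit for this start position.
                    if longest is None or len(candidate) > len(longest):
                        longest = candidate
                else:
                    found.append(candidate)
        if longest is not None:
            found.append(longest)
    return found


# Hypothetical mini dictionary; "spar" and "gel" stand in for the short
# entries a real word list would contain.
words = {"spargel", "creme", "suppe", "spar", "gel"}
all_hits = decompound("Spargelcremesuppe", words)
longest_hits = decompound("Spargelcremesuppe", words, only_longest_match=True)
```

With this word list, `all_hits` also contains "spar" and "gel", not just the three intended parts, and even `longest_hits` still keeps "gel" because it starts at a different position than "spargel". Each extra subword becomes one more term a query can match on, which is one plausible reason for the flood of results.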
solr-user@lucene.apache.org
Sent: Friday, February 6, 2009 6:23:51 PM
Subject: Need help with DictionaryCompoundWordTokenFilterFactory
Hi,
Now I ran into another problem by using the
solr.DictionaryCompoundWordTokenFilterFactory :-(
If I search for the German word "Spargelcremesuppe" whi
Sounds like you need some work on the analysis part. I would start by
using the Solr Admin Analysis tool and play around with your settings
for that TokenFilter. Sounds to me like you might want a different
approach to compound words. I'm not a German expert, so can't offer
too much the
Hi,
Now I ran into another problem by using the
solr.DictionaryCompoundWordTokenFilterFactory :-(
If I search for the German word "Spargelcremesuppe", which contains
"Spargel", "Creme" and "Suppe", Solr will find way too many results.
It's because Solr finds EVERY entry with either one of the three
Hi Ralf,
On 10/14/2008 at 9:35 AM, Kraus, Ralf | pixelhouse GmbH wrote:
> Steven A Rowe schrieb:
> > Oops, variable-name != attribute-name.
> >
> > Thanks Hoss.
> >
> > Steve
> So
>
> "dictFile" or "dictionary" ???
Sorry, didn't mean to muddy the water.
Hoss is correct. I misread the s
Steven A Rowe schrieb:
Oops, variable-name != attribute-name.
Thanks Hoss.
Steve
So
"dictFile" or "dictionary" ???
Greets -Ralf-
Oops, variable-name != attribute-name.
Thanks Hoss.
Steve
On 10/14/2008 at 1:12 AM, Chris Hostetter wrote:
>
> > Try using the name of the file without a path - I believe
> the conf/ directory is in the search path used by Solr when
> loading resources, i.e.:
> >
> >dictFile="de_DR.xml"
>
Chris Hostetter schrieb:
: :dictFile="de_DR.xml"
:
: according to the code the param name is "dictionary" not dictFile.
PS: the dictionary file shouldn't be an XML file, it should look just
like a stopwords file (one word per line)
-Hoss
thx !
It finally runs perfectly!
Gree
: :dictFile="de_DR.xml"
:
: according to the code the param name is "dictionary" not dictFile.
PS: the dictionary file shouldn't be an XML file, it should look just
like a stopwords file (one word per line)
-Hoss
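
As a quick illustration of the "one word per line" format mentioned above, here is a small sketch that turns such a file's text into a set of words. The helper name `parse_word_list` is hypothetical, and skipping `#` comment lines and lowercasing entries are assumptions of this sketch, not confirmed behavior of Solr's loader.

```python
def parse_word_list(text):
    """Parse stopwords-style text (one word per line) into a set of words."""
    words = set()
    for line in text.splitlines():
        entry = line.strip()
        # Assumption: blank lines and '#' comment lines are ignored.
        if not entry or entry.startswith("#"):
            continue
        # Assumption: entries are compared case-insensitively.
        words.add(entry.lower())
    return words


sample = "Spargel\nCreme\n\n# soups\nSuppe\n"
parsed = parse_word_list(sample)
```

An XML file fed through a line-based reader like this would yield tags and markup as "words", which fits the advice to use a plain word list instead.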
: Try using the name of the file without a path - I believe the conf/ directory
is in the search path used by Solr when loading resources, i.e.:
:
:dictFile="de_DR.xml"
according to the code the param name is "dictionary" not dictFile.
I'll add a better error message.
-Hoss
Hi Ralf,
On 10/13/2008 at 5:45 AM, Kraus, Ralf | pixelhouse GmbH wrote:
> but solr can't find the dictionary file :-(
Try using the name of the file without a path - I believe the conf/ directory
is in the search path used by Solr when loading resources, i.e.:
dictFile="de_DR.xml"
As an al
Thx a lot !
I downloaded a dictionary called "de_DR.xml" and put it into my "conf"
directory...
Then I changed my schema.xml to :
class="solr.DictionaryCompoundWordTokenFilterFactory"
dictFile="./conf/de_DR.xml"
minWordSize="5"
minSubwordSize="2"
maxSubwordSize="15"
onlyLongestMatch="true"
bu
Hi Ralf,
On 10/10/2008 at 10:57 AM, Kraus, Ralf | pixelhouse GmbH wrote:
> I am trying to solve the typical German "Donaudampfschiff"-
> problem by using the DictionaryCompoundWordTokenFilter ...
> Anyone can show me how to configure my schema.xml to use the
> DictionaryCompoundWordTokenFilterFact
Hi,
I am trying to solve the typical German "Donaudampfschiff" problem by
using the DictionaryCompoundWordTokenFilter ...
Can anyone show me how to configure my schema.xml to use the
DictionaryCompoundWordTokenFilterFactory?
Greets -Ralf-