Subject: Re: copyField at search time / multi-language support
To: [hidden email]
Cc: Andy [hidden email]
Date: Tuesday, March 29, 2011, 1:29 AM
https
This may not be all that helpful, but have you looked at edismax?
https://issues.apache.org/jira/browse/SOLR-1553
It allows the full Solr query syntax while preserving the goodness of
dismax.
This is standard equipment on 3.1, which is being released even as we
speak, and I also know it's being
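For what it's worth, here is a minimal sketch of what an edismax request across several language fields might look like, built in Python. The defType and qf parameters are standard Solr request parameters; mytext_de and mytext_cjk are the field names from the question, while mytext_en, the "/select" path and the sample query are assumptions made purely for illustration.

```python
from urllib.parse import urlencode


def build_edismax_query(user_query, fields):
    """Return a URL-encoded Solr query string that searches all given fields."""
    params = {
        "defType": "edismax",    # select the extended dismax query parser
        "q": user_query,         # the user's query text
        "qf": " ".join(fields),  # query fields, separated by spaces
    }
    return urlencode(params)


# mytext_en is an assumed field name; the other two come from the question.
LANGUAGE_FIELDS = ["mytext_de", "mytext_cjk", "mytext_en"]

query_string = build_edismax_query("Haus", LANGUAGE_FIELDS)
print("/select?" + query_string)
```

Each field listed in qf is analyzed with its own field type, so one user query can be matched against every language-specific field without a catch-all copyField.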
Hi,
Here's my problem: I'm indexing a corpus with text in a variety of
languages. I'm planning to detect these at index time and send the
text to one of several suitably configured fields (e.g. mytext_de for
German, mytext_cjk for Chinese/Japanese/Korean, etc.).
At search time I want to search all of
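The index-time routing described above could be sketched roughly as follows. This is only an illustration: guess_language is a crude, hypothetical stand-in (it looks at character ranges and German-specific letters) and not a real language detector, and mytext_en is an assumed fallback field. Only mytext_de and mytext_cjk come from the message.

```python
def guess_language(text):
    """Toy stand-in for a real detector: returns 'cjk', 'de' or 'en'."""
    for ch in text:
        cp = ord(ch)
        if (0x4E00 <= cp <= 0x9FFF          # CJK Unified Ideographs
                or 0x3040 <= cp <= 0x30FF   # Hiragana and Katakana
                or 0xAC00 <= cp <= 0xD7AF): # Hangul syllables
            return "cjk"
    if any(ch in "äöüßÄÖÜ" for ch in text):
        return "de"
    return "en"  # assumed default when nothing else matches


def to_solr_document(doc_id, text, detect=guess_language):
    """Put the text into the field named after the detected language."""
    lang = detect(text)
    return {"id": doc_id, "mytext_" + lang: text}


print(to_solr_document("1", "Straße nach Köln"))
print(to_solr_document("2", "東京タワー"))
```

Any proper detector can be passed in through the detect argument, as long as it returns a suffix for which a field exists in the schema.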
On Mon, Mar 28, 2011 at 2:15 PM, Tom Mortimer t...@flax.co.uk wrote:
Hi,
Here's my problem: I'm indexing a corpus with text in a variety of
languages. I'm planning to detect these at index time and send the
text to one of several suitably configured fields (e.g. mytext_de for
German, mytext_cjk for
Tom,
Could you share the method you use to perform language detection? Any open
source tools that do that?
Thanks.
--- On Mon, 3/28/11, Tom Mortimer t...@flax.co.uk wrote:
From: Tom Mortimer t...@flax.co.uk
Subject: copyField at search time / multi-language support
To: solr-user@lucene.apache.org
Date: Monday, March 28, 2011, 4:45 AM
Hi,
Here's my problem: I'm indexing a corpus with text in a
variety of
languages. I'm planning to detect these at index time and
send the
text to one of a suitably-configured
Thanks Markus.
Do you know if this patch is good enough for production use? Thanks.
Andy
--- On Tue, 3/29/11, Markus Jelsma markus.jel...@openindex.io wrote:
From: Markus Jelsma markus.jel...@openindex.io
Subject: Re: copyField at search time / multi-language support
To: solr-user@lucene.apache.org
Cc: Andy angelf...@yahoo.com
Date: Tuesday, March 29, 2011, 1:29 AM
https://issues.apache.org/jira/browse/SOLR-1979
Tom,
Could you
View this message in context:
http://lucene.472066.n3.nabble.com/Help-on-Multi-language-support-tp2636054p2636054.html
Sent from the Solr - User mailing list archive at Nabble.com.
This is the solr schema:
--
View this message in context:
http://lucene.472066.n3.nabble.com/Help-on-Multi-language-support-tp2636054p2636065.html
Sent from the Solr - User mailing list archive at Nabble.com.
right, but we should not encourage users to significantly degrade
overall relevance for all movies due to a few movies and a band (very
special cases, as I said).
In English, by not using stopwords, it doesn't really degrade
relevance that much, so it's a reasonable decision to make. This is not
Isn't the conclusion here that some stopword and stemming free
matching should be the best match if ever and to then gently degrade
to weaker forms of matching?
paul
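One way the "best match first, then degrade gently" idea might be expressed is with per-field boosts in a dismax request, sketched here in Python. The field names title_exact and title_stemmed and the boost value are hypothetical, and the sketch assumes the schema indexes the same text twice: once without stopword removal or stemming, and once with both.

```python
from urllib.parse import urlencode


def build_tiered_query(user_query, exact_field, stemmed_field, exact_boost=4.0):
    """Build a dismax query string that boosts the unstemmed field over the stemmed one."""
    params = {
        "defType": "dismax",
        "q": user_query,
        # "field^boost" syntax: matches in the exact field weigh more.
        "qf": f"{exact_field}^{exact_boost} {stemmed_field}",
    }
    return urlencode(params)


tiered = build_tiered_query("the the", "title_exact", "title_stemmed")
print(tiered)
```

With such a setup, documents matching in the exact field would tend to rank above those that only match after stemming and stopword removal, while the weaker matches are still returned.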
On 13 Jan 2010, at 07:08, Walter Underwood wrote:
There is a band named The The. And a producer named Don Was. For a
Robert Muir: Thank you for the pointer to that paper!
On Wed, Jan 13, 2010 at 6:29 AM, Paul Libbrecht p...@activemath.org wrote:
Isn't the conclusion here that some stopword and stemming free matching
should be the best match if ever and to then gently degrade to weaker forms
of matching?
There are a lot of projects that don't use stopwords any more. You
might consider dropping them altogether.
On Mon, Jan 11, 2010 at 2:25 PM, Don Werve d...@madwombat.com wrote:
This is the way I've implemented multilingual search as well.
2010/1/11 Markus Jelsma mar...@buyways.nl
Hello,
I don't think this is something to consider across the board for all
languages. The same grammatical units that are part of a word in one
language (and removed by stemmers) are independent morphemes in others
(and should be stopwords)
so please take this advice on a case-by-case basis for each
sorry, i forgot to include this 2009 paper comparing what stopwords do
across 3 languages:
http://doc.rero.ch/lm.php?url=1000,43,4,20091218142456-GY/Dolamic_Ljiljana_-_When_Stopword_Lists_Make_the_Difference_20091218.pdf
in my opinion, if stopwords annoy your users for very special cases
like
There is a band named The The. And a producer named Don Was. For a list of
all-stopword movie titles at Netflix, see this post:
http://wunderwood.org/most_casual_observer/2007/05/invisible_titles.html
My favorite is To Be and To Have (Être et Avoir), which is all stopwords in
two languages.
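To illustrate why such titles become invisible, here is a small Python sketch of stopword removal at analysis time. The stopword sets are tiny illustrative samples chosen for this example, not the lists shipped with Solr or used at Netflix.

```python
import re

# Illustrative stopword samples only; real lists are much longer.
STOPWORDS = {
    "en": {"the", "to", "be", "and", "have", "was"},
    "fr": {"être", "et", "avoir"},
}


def remove_stopwords(title, lang):
    """Lowercase, split into word tokens, and drop the stopwords for lang."""
    tokens = re.findall(r"\w+", title.lower())
    return [t for t in tokens if t not in STOPWORDS[lang]]


for title, lang in [("The The", "en"),
                    ("To Be and To Have", "en"),
                    ("Être et Avoir", "fr"),
                    ("Don Was", "en")]:
    print(repr(title), "->", remove_stopwords(title, lang))
```

With these sample lists the first three titles are left with no tokens at all, so nothing remains in the index to match, and "Don Was" loses half of its name.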
Hi Solr users.
I'm trying to set up a site with Solr search integrated, and I use the
Solr Java API to feed the index with search documents. At the moment I
have only activated search on the English portion of the site. I'm
interested in using as many features of Solr as possible. Synonyms,
Hello,
We have implemented language specific search in Solr using language
specific fields and field types. For instance, an en_text field type can
use an English stemmer, and a list of stopwords and synonyms. However, we
did not use language-specific stopwords; instead we used one list shared by both
This is the way I've implemented multilingual search as well.
2010/1/11 Markus Jelsma mar...@buyways.nl
Hello,
We have implemented language specific search in Solr using language
specific fields and field types. For instance, an en_text field type can
use an English stemmer, and list of
On Apr 9, 2009, at 7:09 AM, revas wrote:
Hi,
To reframe my earlier question
Some languages have only an analyzer but no stemmer from Snowball/Porter;
does the analyzer then take care of stemming as well?
Some languages only have the stemmer from Snowball but no analyzer.
Some have both.
Hi,
To reframe my earlier question
Some languages have only an analyzer but no stemmer from Snowball/Porter;
does the analyzer then take care of stemming as well?
Some languages only have the stemmer from Snowball but no analyzer.
Some have both.
Can we say then that Solr supports all the