that is really strange. so basic stopwords such as "a" and "the" are not
eliminated from the index?
On Tue, Nov 27, 2012 at 11:16 PM, 曹霖 cao...@babytree-inc.com wrote:
just no stopwords are considered in that case
2012/11/28 Joe Zhang smartag...@gmail.com
On Nov 28, 2012, at 12:33 AM, Joe Zhang smartag...@gmail.com wrote:
that is really strange. so basic stopwords such as "a" and "the" are not
eliminated from the index?
There is no list of basic stopwords anywhere. If you want stop words, you
have to put them in the file yourself.
Eliminating stopwords is generally a bad idea. It means you cannot search for
"vitamin a".
Back in the 1970's, search engines eliminated stopwords so they could work on
16-bit machines. That isn't a problem any more.
wunder
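wunder's point above — that Solr applies no built-in stopword list and you must maintain the file yourself — can be sketched as a field type wired to a hand-maintained words file. This is a sketch only; the type name "text_stop" and the file name "stopwords.txt" are placeholders, not anything from the thread:

```xml
<!-- Sketch: "text_stop" and stopwords.txt are placeholder names. -->
<fieldType name="text_stop" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- words= points at a file you create and populate yourself;
         Solr removes nothing unless the file lists it. -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

If the referenced file is empty, no stopwords are removed at all, which matches the behaviour described above.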
words with umlauts in the text box for
indexing and queries.
Lance
- Original Message -
| From: Daniel Brügge daniel.brue...@googlemail.com
| To: solr-user@lucene.apache.org
| Sent: Wednesday, November 7, 2012 8:45:45 AM
| Subject: SolrCloud, Zookeeper and Stopwords with Umlaute or other special characters
|
| Hi,
|
| I am running a SolrCloud cluster with the 4.0.0 version. I have a
| stopwords
| file
| which
On Wed, Nov 7, 2012 at 11:45 AM, Daniel Brügge
daniel.brue...@googlemail.com wrote:
Hi,
I am running a SolrCloud cluster with the 4.0.0 version. I have a stopwords
file
which is in the correct encoding.
What makes you think that?
Note: "Because I can read it" is not the correct answer.
issue, which
somehow destroys my file. Will check.
On Thu, Nov 8, 2012 at 12:12 PM, Robert Muir rcm...@gmail.com wrote:
On Wed, Nov 7, 2012 at 11:45 AM, Daniel Brügge
daniel.brue...@googlemail.com wrote:
Hi,
I am running a SolrCloud cluster with the 4.0.0 version. I have a
stopwords
file
which is in the correct encoding.
What makes you think that?
Note: "Because I can read it" is not the correct answer.
Ensure any of your
Hi,
I am running a SolrCloud cluster with the 4.0.0 version. I have a stopwords
file
which is in the correct encoding. It contains German Umlaute like e.g. 'ü'.
I am
also running a standalone Zookeeper which contains this stopwords file. In
my schema
I am using the stopwords file in the standard
still learning about this, but by importing it twice, I think I remove the
need to ever store the unnecessary fulltext document in its original form
within Solr
--
View this message in context:
http://lucene.472066.n3.nabble.com/Taking-a-full-text-then-truncate-and-duplicate-with-stopwords-tp4008269p4008580.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hi James,
In order to do the copyfield
technique, I need to store the original full text document
within Solr, like
this:
field name="truncated_description" indexed="false" stored="false"
field name="keyword_description" indexed="true" stored="true"
No, that's not
To: solr-user@lucene.apache.org
Subject: Re: Taking a full text, then truncate and duplicate with stopwords
Ok, I’ve been doing a bit more research. In order to do the copyfield
technique, I need to store the original full text document within Solr, like
this:
field name="truncated_description" indexed="false" stored
View this message in context:
http://lucene.472066.n3.nabble.com/Taking-a-full-text-then-truncate-and-duplicate-with-stopwords-tp4008269p4008615.html
Sent from the Solr - User mailing list archive at Nabble.com.
-- Jack Krupansky
-Original Message-
From: Spadez
Sent: Tuesday, September 18, 2012 10:33 AM
To: solr-user@lucene.apache.org
Subject: Re: Taking a full text, then truncate and duplicate with stopwords
Ok, thank you for the reply. I have one more question then I think
everything
is cleared up. If I
stopwords to remove common words):*
How should I be doing this? Purely with index analyzers?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Taking-a-full-text-then-truncate-and-duplicate-with-stopwords-tp4008269.html
Sent from the Solr - User mailing list archive
Form (using stopwords to remove common words):*
Are you going to use this keyword form for searching or displaying purposes?
Purely for searching.
The truncated form is just to show to the user as a preview, and the keyword
form is for the keyword searching.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Taking-a-full-text-then-truncate-and-duplicate-with-stopwords-tp4008269p4008295.html
Sent from the Solr - User mailing list archive at Nabble.com.
it first into truncated_description and then again into
keyword_description.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Taking-a-full-text-then-truncate-and-duplicate-with-stopwords-tp4008269p4008327.html
Sent from the Solr - User mailing list archive at Nabble.com.
--- On Mon, 9/17/12, Spadez james_will...@hotmail.com wrote:
From: Spadez james_will...@hotmail.com
Subject: Re: Taking a full text, then truncate and duplicate with stopwords
To: solr-user@lucene.apache.org
Date: Monday, September 17, 2012, 5:32 PM
In an attempt to answer my own
question
bench rings man engages
hands-free speaker function begins talk Everyone else
--
View this message in context:
http://lucene.472066.n3.nabble.com/Taking-a-full-text-then-truncate-and-duplicate-with-stopwords-tp4008269p4008358.html
Sent from the Solr - User mailing list archive at Nabble.com.
The trouble is, I want the truncated description to still
have the keywords.
copyField copies raw text, it has nothing to do with analysis.
copyField source="keyword_description" dest="truncated_description"
maxChars="3000"/
--
View this message in context:
http://lucene.472066.n3.nabble.com/Taking-a-full-text-then-truncate-and-duplicate-with-stopwords-tp4008269p4008372.html
Sent from the Solr - User mailing list archive at Nabble.com.
--- On Mon, 9/17/12, Spadez james_will...@hotmail.com wrote:
From: Spadez james_will...@hotmail.com
Subject: Re: Taking a full text, then truncate and duplicate with stopwords
To: solr-user@lucene.apache.org
Date: Monday, September 17, 2012, 7:10 PM
Maybe I don't understand, but if you
The trouble is, I want the truncated description to still
have the keywords.
copyField copies raw text, it has nothing to do with analysis.
into keyword_document which uses stopwords to remove words like
"and", "it", "this". Now I only have 3000 words for example.
Then if I do a copy command to move it into truncate_document then even though
I can reduce it down to say 100 words, it is lacking words like "and", "it"
and "this" because it has been copied from the
keyword_document.
That's not true. The copy operation is performed before analysis (stopword
removal), so both fields get the same source value even if they analyze and
index the value in
different ways.
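The behaviour described above — copyField duplicating the raw value before either field's analysis chain runs — can be sketched in schema terms. The field and type names here are hypothetical stand-ins for the thread's keyword_description/truncated_description idea, not a quoted config:

```xml
<!-- Sketch with placeholder type names: both fields receive the same
     raw incoming text; each field's own analyzer then decides what is
     actually indexed. -->
<field name="keyword_description"   type="text_with_stopword_removal" indexed="true" stored="false"/>
<field name="truncated_description" type="text_plain"                 indexed="true" stored="true"/>
<!-- copyField duplicates the unanalyzed source value (optionally capped
     by maxChars) BEFORE any stopword removal happens. -->
<copyField source="keyword_description" dest="truncated_description" maxChars="3000"/>
```

So the truncated field still contains "and", "it", "this" even though the keyword field's index does not.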
-- Jack Krupansky
-Original Message-
From: Spadez
Sent: Monday, September 17, 2012 12:29 PM
To: solr-user@lucene.apache.org
Subject: Re: Taking a full text, then truncate and duplicate with stopwords
I'm really confused here. I have a document which is say 4000
-- Jack Krupansky
-Original Message-
From: Spadez
Sent: Monday, September 17, 2012 12:29 PM
To: solr-user@.apache
Subject: Re: Taking a full text, then truncate and duplicate with
stopwords
I'm really
To: solr-user@lucene.apache.org
Subject: Re: Taking a full text, then truncate and duplicate with stopwords
Ah, ok this is news to me and makes a lot more sense. If I can just run this
back past you to make sure I understand. If I move my fulltext document from
my SQL database
Two things:
1) did you re-index after you got your stopwords file set up? And I'd
blow away the index directory before re-indexing.
2) If you _store_ your field, the stopwords will be in your results
lists, but _not_ in your index. As a secondary
check, try going into your admin/schema browser
Look at the index with the Schema Browser in the Solr UI. This pulls
the terms for each field.
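The stored-vs-indexed distinction made above can be sketched as a schema fragment. The names "text_stopped", "stopwords.txt", and "content" are placeholders for illustration, not the poster's actual config:

```xml
<!-- Sketch: stopwords are removed from the *indexed terms* by the
     analyzer, but the *stored value* is the verbatim original text,
     stopwords included, and that is what search results return. -->
<fieldType name="text_stopped" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
</fieldType>
<field name="content" type="text_stopped" indexed="true" stored="true"/>
```

This is why seeing stopwords in result lists proves nothing about the index; the Schema Browser's term view is the right check.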
On Sun, Jul 15, 2012 at 8:38 PM, Giovanni Gherdovich
g.gherdov...@gmail.com wrote:
Hi all,
are stopwords from the stopwords.txt config file
supposed to be indexed?
I would say
Hi Giovanni,
you have entered the stopwords into the stopword.txt file, right? But in the
definition of the field type you are referencing stopwords_FR.txt...
best regards,
Michael
, but...
they were all there.
Michael:
Hi Giovanni,
you have entered the stopwords into the stopword.txt file, right? But in the
definition of the field type you are referencing stopwords_FR.txt...
good catch Michael, but that's not the problem.
In my message I referred to stopwords.txt, but actually my
Hi all,
are stopwords from the stopwords.txt config file
supposed to be indexed?
I would say no, but this is the situation I am
observing on my Solr instance:
* I have a bunch of stopwords in stopwords.txt
* my fields are of fieldType text from the example schema.xml,
i.e. I have
: I am using a solr.StopFilterFactory in a query filter for a text_general
: field (here: content). It works fine, when I query the field for the
: stopword, then I am getting no results.
...
: doing a facet.field=content call to get the words which are
: used in the text. What I am trying to achieve is to also filter the
: stopwords from the facet_fields, but it's not working. It would only work
: if the stopwords are also used during the indexing of the text_general
: field, right?
: The problem here is that it's too much data to re-index every time I add a
: new stopword.
: My current
rather than the value of q. There's no tricks, I think.
koji
--
Apache Solr Query Log Visualizer
http://soleami.com/
Field definitions:
content_text (no stopwords, only synonyms in index)
content_hl (stopwords, synonyms in index and query, and only field in hl.fl)
Searching is done
O. Klein wrote
Hmm, now the synonyms aren't highlighted anymore.
OK back to basics (I'm using trunk and FVH).
What is the way to go about if I want to search on a field without
stopwords, but still want to highlight the stopwords? (and still highlight
synonyms and stemmed words)?
I
There's no tricks, I think.
When using hl.q=content_hl:(spell Check) I now get highlighting including
stopwords,
but when using hl.q=content_hl:(SC) where SC is a synonym I get no
highlighting.
Can you verify if synonyms work when using hl.q?
OK I got it working by using hl.q=content_hl
flexibility.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Highlighting-stopwords-tp3681901p3744054.html
Sent from the Solr - User mailing list archive at Nabble.com.
to explain a bit more how hl.q is supposed to work and
with some examples?
Thanx.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Highlighting-stopwords-tp3681901p3740114.html
Sent from the Solr - User mailing list archive at Nabble.com.
I got it fixed now I think.
I thought that if you used it like hl.q=spell Checker it would use the
query analysis of the field that was being highlighted as default. But in my
case it needs to be hl.q=content_hl:(spell Checker) for it to work. The
behaviour I got by default made no sense whatsoever.
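Pulling the working recipe from this exchange together as request parameters: the field names (content_text, content_hl) are the ones given in the thread, but the surrounding parameter values (qf, defType, the FVH switch) are illustrative guesses, not a quoted request:

```
q=spell Checker
defType=dismax
qf=content_text
hl=true
hl.fl=content_hl
hl.useFastVectorHighlighter=true
hl.q=content_hl:(spell Checker)
```

The key detail from the thread: hl.q must qualify the terms with the highlight field (content_hl:(...)), because the highlighter does not re-analyze the bare query against that field's analyzer by default.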
Hmm, now the synonyms aren't highlighted anymore.
OK back to basics (I'm using trunk and FVH).
What is the way to go about if I want to search on a field without
stopwords, but still want to highlight the stopwords? (and still highlight
synonyms and stemmed words)?
--
View this message
someone confirm whether this is a bug?
Thank you.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Highlighting-stopwords-tp3681901p3734892.html
Sent from the Solr - User mailing list archive at Nabble.com.
(12/02/11 21:19), O. Klein wrote:
Koji Sekiguchi wrote
(12/01/24 9:31), O. Klein wrote:
Let's say I search for spellcheck solr on a website that only contains
info about Solr, so solr was added to the stopwords.txt. The query that
will be parsed then (dismax) will not contain the term solr.
So fragments won't contain highlights of the term solr. So
Ah, I never used the hl.q
That did the trick. Thanx!
--
View this message in context:
http://lucene.472066.n3.nabble.com/Highlighting-stopwords-tp3681901p3684245.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hi,
I am using solr-3.4. My part of the schema looks like :
<fieldType name="text" class="solr.TextField"
positionIncrementGap="100" autoGeneratePhraseQueries="true">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory"
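The schema paste above is cut off at the StopFilterFactory line. For context, a complete definition along these lines might look like the sketch below; every attribute value past the truncation point is a guess, not the poster's config:

```xml
<!-- Sketch of how the truncated definition might continue. -->
<fieldType name="text" class="solr.TextField"
           positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```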
I'm using trunk and FVH and even though I filter stopwords when searching, I
would like to highlight stopwords in fragments. Using a different field
without the stopwords filter did not have the desired effect.
Is there a way to do this?
--
View this message in context:
http://lucene.472066.n3
(12/01/23 23:14), O. Klein wrote:
I'm using trunk and FVH and even though I filter stopwords when searching, I
would like to highlight stopwords in fragments. Using a different field
without the stopwords filter did not have the desired effect.
Please provide more info. In particular, how your
It's a bit of a privacy through obscurity measure, unfortunately. The
problem is that American courts do a lousy job of removing social
security numbers from cases that I put on my site. I do anonymization
before sending the cases to Solr, but if you're clever (and the
stopwords weren't
I've got them configured at index and query time, so sounds like I'm all set.
I'm doing anonymization of social security numbers, converting them to
xxx-xx-. I don't *think* users can find a way of identifying these docs
if the stopwords-based block works.
Thank you both for the confirmation.
Mike
On Sun 08
I have a unique use case where I have words in my corpus that users
shouldn't ever be allowed to search for. My theory is that if I add
these to the stopwords list, that should do the trick.
I'm using the edismax parser and it seems to be working in my dev
environment. Is there any risk
On Sun, Jan 8, 2012 at 3:33 PM, Michael Lissner
mliss...@michaeljaylissner.com wrote:
I have a unique use case where I have words in my corpus that users
shouldn't ever be allowed to search for. My theory is that if I add these
to the stopwords list, that should do the trick.
That should do
On Mon, Jan 9, 2012 at 5:03 AM, Michael Lissner
mliss...@michaeljaylissner.com wrote:
I have a unique use case where I have words in my corpus that users
shouldn't ever be allowed to search for. My theory is that if I add these to
the stopwords list, that should do the trick.
Yes, that should
intentions of using both of them is - first I want
to use phrase queries so I used CommonGramsFilterFactory. Secondly, I don't
want those stopwords in my index, so I have used StopFilterFactory to remove
them.
The commongrams filter turns each found occurrence of a word in the file
into two tokens
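The combination being discussed — CommonGrams for phrase queries plus StopFilter to drop bare stopwords — can be sketched as an analyzer chain. This is one possible wiring under my own assumptions (placeholder names, single shared stopwords file), not the poster's schema; the important detail is ordering, since CommonGramsFilterFactory must see the stopwords before StopFilterFactory removes them:

```xml
<!-- Sketch: CommonGrams pairs each stopword with its neighbour
     (e.g. "the file" -> "the_file"), then StopFilter removes the
     leftover bare stopword tokens from the index. -->
<fieldType name="text_commongrams" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.CommonGramsQueryFilterFactory" words="stopwords.txt" ignoreCase="true"/>
  </analyzer>
</fieldType>
```

Note CommonGramsQueryFilterFactory on the query side, which emits the gram forms so phrase queries match what was indexed.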
On 9/23/2011 1:45 AM, Pranav Prakash wrote:
Maybe I am wrong. But my intentions of using both of them is - first I
want to use phrase queries so I used CommonGramsFilterFactory. Secondly,
I don't want those stopwords in my index, so I have used
StopFilterFactory to remove them.
CommonGrams
Hi List,
I included StopFilterFactory and I can see it taking action in the Analyzer
Interface. However, when I go to Schema Analyzer, I see those stop words in
the top 10 terms. Is this normal?
<fieldType name="text_commongrams" class="solr.TextField">
<analyzer>
<charFilter
On 9/22/2011 3:54 AM, Pranav Prakash wrote:
Hi List,
I included StopFilterFactory and I can see it taking action in the Analyzer
Interface. However, when I go to Schema Analyzer, I see those stop words in
the top 10 terms. Is this normal?
<fieldType name="text_commongrams" class="solr.TextField"
Hi,
I have an autocomplete fieldType that works really well, but because
the KeywordTokenizerFactory (if I understand correctly) is emitting a
single token, the stopword filter will not detect any stopwords.
Anyone know of a way to strip out stopwords when using
KeywordTokenizerFactory? I did try the reg-exp replace filter, but I'm
not sure I want to add a bunch of reg-exps for replacing every
stopword.
Thanks,
Matt
Here's the fieldType definition
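For the record, the reg-exp approach mentioned above can at least be collapsed into a single filter instead of one per stopword, using an alternation pattern. This is an illustrative sketch (the type name, stopword set, and pattern are mine, and it remains the clunky workaround the poster describes, since KeywordTokenizerFactory emits the whole input as one token):

```xml
<!-- Illustrative only: strips a small, hard-coded set of stopwords
     (as whole words) out of the single token that
     KeywordTokenizerFactory emits. -->
<fieldType name="autocomplete" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="\b(?:a|an|and|of|the)\b\s*" replacement="" replace="all"/>
  </analyzer>
</fieldType>
```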
Hi Erik. Yes something like what you describe would do the trick. I
did find this:
http://lucene.472066.n3.nabble.com/Concatenate-multiple-tokens-into-one-td1879611.html
I might try the pattern replace filter with stopwords, even though
that feels kinda clunky.
Matt
On Wed, Jun 8, 2011 at 11
opinion) is not
to give up on stop words -- if you want to use stop words, by all means
use stop words. BUT! You must use them in all the fields of your qf ...
even fields where you think "why in god's name would I need stopwords on
this field, those terms will never exist in this field!" ... you may
A thread with this same subject from 2008/2009 is here:
http://search-lucene.com/m/jkBgXnSsla
We're seeing customers being bitten by this bug now and then, and normally my
workaround is to simply not use stopwords at all.
However, is there an actual fix in the 3.1 eDisMax parser which solves
: stopwords not working in multicore setup
I have some questions about your config:
Is the stopwords-de.txt in the same directory as the schema.xml?
Is the title field of type text?
Do you have the same problem with German stopwords without Umlaut (ü,ö,ä),
like the word denn?
A problem can be that the stopwords-de.txt is not saved as UTF
Ahh, thank you for the hints Martin... German stopwords without Umlaut work
correctly.
So I'm trying to figure out where the UTF-8 chars are getting messed up.
Using the Solr admin web UI, I did a search for title:für and the xml (or
json) output in the browser shows the query with the proper
Hello,
I'm running a Solr server with 5 cores. Three are for English content and
two are for German content. The default stopwords setup works fine for the
English cores, but the German stopwords aren't working.
The German stopwords file is stopwords-de.txt and resides in the same
directory
Hello everyone,
I am developing a multilingual index so there is a need for different
language support. I need some answers to the following questions:
1. Which steps should I follow in order to get (download) all the
stopwords-synonyms files for several languages?
2. Is there any site
On Friday 18 March 2011 17:09:35 abiratsis wrote:
Hello everyone,
I am developing a multilingual index so there is a need for different
language support. I need some answers to the following questions:
1. Which steps should I follow in order to get (download) all the
stopwords-synonyms
OK thanx Markus, is clear enough now
--
View this message in context:
http://lucene.472066.n3.nabble.com/How-to-get-stopwords-and-synonyms-files-for-several-lanuages-tp2698494p2698566.html
Sent from the Solr - User mailing list archive at Nabble.com.
depend
on what you're indexing -- you mean that I probably need to implement a
mechanism for handling synonyms, right? If yes, do you have any suggestions
on how to implement this?
Thanx,
Alex
--
View this message in context:
http://lucene.472066.n3.nabble.com/How-to-get-stopwords-and-synonyms-files
Hello all,
I have gotten my DataImportHandler to index my data from my MySQL database. I
was looking at the schema tool and noticing that stopwords in different
languages are being indexed as terms. The 6 languages we have are English,
French, Spanish, Chinese, German and Italian.
Right now I
Greg,
You need to get stopword lists for your 6 languages. Then you need to create
new field types just like that 'text' type, one for each language. Point them
to the appropriate stopwords files and instead of English specify each one of
your languages. You can either index each language
I reply to myself because I found the mistake. The Italian stopwords file
that I found on the Apache site contains, on the same line as each stopword,
a shell-style comment; the stopwords parser is probably basic and doesn't
accept comments on the same line as stopwords. I dropped them
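A note on the problem found above: the stopword lists distributed with Snowball (which the Apache-hosted files resemble) mark comments with `|` on the same line as the word. If I recall correctly, later Solr releases let StopFilterFactory parse that format directly via a `format` attribute, so stripping the comments by hand is only needed on older versions. The file name here is illustrative:

```xml
<!-- Assumes a Solr version whose StopFilterFactory supports
     format="snowball", which tolerates "word | comment" lines. -->
<filter class="solr.StopFilterFactory" ignoreCase="true"
        words="italian_stop.txt" format="snowball"/>
```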
I'm using Lucid Imagination installation kit for SOLR (the last one with SOLR
1.4).
I would like to use stopwords, and I installed in
LucidWorks/lucidworks/solr/conf/stopwords.txt the Italian version of the
file.
Moreover the field where I want to clean stopwords is declared in schema.xml
I'm not sure but it seems to me that subqueries query(...) [
http://wiki.apache.org/solr/FunctionQuery#query ] with only stopwords are
evaluated for all documents.
Example:
q={!func}myFunction(query(field:the))&fq=field:(helloworld)
Since "the" is a stopword for field "field", query(field:the
- Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message
From: Rodrigo Rezende rcreze...@gmail.com
To: solr-user solr-user@lucene.apache.org
Sent: Thu, October 7, 2010 4:03:57 PM
Subject: subquery with stopwords
I'm not sure but it seems to me that subqueries
Let's suppose we have a regular search field body_t, and an internal
boolean flag flag_t not exposed to the user.
I'd like
body_t:foo AND flag_t:true
to be an intersection, but if foo is a stopword I get all documents
for which flag_t is true, as if the first clause was dropped, or if
On Mon, Sep 13, 2010 at 3:27 PM, Xavier Noria f...@hashref.com wrote:
Let's suppose we have a regular search field body_t, and an internal
boolean flag flag_t not exposed to the user.
I'd like
body_t:foo AND flag_t:true
this is solr right? why don't you use a filter query for your unexposed
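The suggestion above, as request parameters (field names taken from the thread): keep the user's terms in q and move the internal flag into fq, since a filter query is intersected with the main query as a document filter and is unaffected by stopword analysis emptying the q clause:

```
q=body_t:foo
fq=flag_t:true
```

If "foo" is a stopword, q may match nothing, but it can no longer degenerate into "all documents where flag_t is true".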
On Mon, Sep 13, 2010 at 4:29 PM, Simon Willnauer
simon.willna...@googlemail.com wrote:
On Mon, Sep 13, 2010 at 3:27 PM, Xavier Noria f...@hashref.com wrote:
Let's suppose we have a regular search field body_t, and an internal
boolean flag flag_t not exposed to the user.
I'd like
filter the EdgeNGramTokenFilter
field? Otherwise I would run into the same problems again, won't I?
Or if stopword filtering is ok on this field: Do you filter the
stopwords before or after EdgeNGram tokenizing?
Thanks,
Gert
dismax...
q=word1 word2 word3 qf=text text_prefix mm=100% tie=0
Ok, I will think about this. But I wonder if this will be more efficient
than just not filtering stopwords? (But I have to study the EdgeNGram thing
first. AFAIK it indexes all WORDS as WORDS, WORD, WOR, WO. So the index will
be blown up, too?)
What I do not understand in your idea: why I
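For reference, the text_prefix field implied by the qf=text text_prefix suggestion above might be built with an edge n-gram filter roughly as sketched here. The type name comes from the thread; the gram-size values and the rest of the chain are my own illustrative choices:

```xml
<!-- Sketch: prefix field for the dismax qf=text text_prefix idea.
     Each indexed token WORDS also yields W, WO, WOR, WORD -- exactly
     the index growth the poster is worried about. Query side emits
     the plain token so a typed prefix matches the stored grams. -->
<fieldType name="text_prefix" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```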
Hello,
I am having some problems with solr 1.4. I am indexing and querying data
using the following fieldType:
<fieldType name="text_de_de" class="solr.TextField"
positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter
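The definition above is truncated at the first filter. A plausible completion for a German field type of this shape is sketched below; everything after the tokenizer line is a guess on my part (file name, filters, and the matching query analyzer are not from the message):

```xml
<!-- Sketch of how a text_de_de definition like the one above might
     continue; all filter choices here are illustrative. -->
<fieldType name="text_de_de" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords-de.txt"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords-de.txt"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German"/>
  </analyzer>
</fieldType>
```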
Hmmm, I don't really see the problem here. I'll have to use English
examples...
Searching on the* (assuming the is a stopword) will search on
(them OR theory OR thespian) assuming those three words are in
your index. It will NOT search on the. So I think you're OK, or are
you seeing anomalous
: Searching on the* (assuming the is a stopword) will search on
: (them OR theory OR thespian) assuming those three words are in
: your index. It will NOT search on the. So I think you're OK, or are
: you seeing anomalous results?
i think the missing pieces to the puzzle here are:
1) wildcard
I was reading Scaling Lucene and Solr
(http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr/)
and I came across the section StopWords.
In there it mentioned that it's not recommended to remove stop words at index
time. Why is this the case? Don't all the extraneous stopwords bloat the
index and lead to less relevant results? Can someone
11:13 AM
To: solr-user@lucene.apache.org
Subject: Re: Stopwords
That discussion cites a paper via a URL:
http://doc.rero.ch/lm.php?url#16;00,43,4,20091218142456-GY/Dolamic_Ljiljana__When_Stopword_Lists_Make_the_Difference_20091218.pdf
Unfortunately when I go to this URL I get:
L'accès à ce
On Mar 16, 2010, at 9:51 PM, blargy wrote:
I was reading Scaling Lucene and Solr
(http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr/)
and I came across the section StopWords.
In there it mentioned that it's not recommended to remove stop