Try setting the tiebreaker above 1.0; this will increase the score for dismax
matches in fields other than the best field. But this may lead to strange side
effects.
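For reference, the dismax combination scores each document as the best field's score plus the tiebreaker times the remaining fields' scores; a minimal sketch of that arithmetic (the field scores below are made up):

```python
def dismax_score(field_scores, tie):
    """Combine per-field scores the dismax way: the best field counts fully,
    every other field is scaled by the tiebreaker."""
    best = max(field_scores)
    return best + tie * (sum(field_scores) - best)

# With tie in [0, 1] the best field dominates; pushing tie above 1.0 makes
# the non-best fields count for more than a plain sum would.
scores = [2.0, 1.0, 0.5]
print(dismax_score(scores, 0.0))  # 2.0  -- pure max
print(dismax_score(scores, 1.0))  # 3.5  -- plain sum
print(dismax_score(scores, 1.5))  # 4.25
```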
-Original Message-
From: ext davidbrai [mailto:davidb...@gmail.com]
Sent: Donnerstag, 9. Dezember 2010 09:55
To:
You just can't set it to unlimited. What you could do is ignore the positions
and put in a filter that sets the position increment for all but the first
token to 0 (meaning the field length will be just 1, with all tokens stacked
on the first position).
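The position-stacking trick can be sketched without the Lucene API (a plain-Python model of position increments; in a real TokenFilter you would set the PositionIncrementAttribute instead):

```python
def stack_positions(increments):
    """Keep the first token's position increment, force the rest to 0,
    so every token lands on the same position."""
    return [inc if i == 0 else 0 for i, inc in enumerate(increments)]

def field_length(increments):
    """Positional length of the field = sum of the position increments."""
    return sum(increments)

# Normally each token advances the position by 1:
normal = [1, 1, 1, 1]
stacked = stack_positions(normal)
print(field_length(normal))   # 4
print(field_length(stacked))  # 1 -- all tokens stacked on the first position
```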
You could also break per page, so you put each page on a
I don't know about upload limitations, but there are certainly some in the
default settings; this could explain the limit of 20 MB. Which upload
mechanism do you use on the Solr side? I guess this is not a Lucene problem
but rather the HTTP layer of Solr.
If you manage to stream your PDF and
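If the 20 MB ceiling does come from Solr's HTTP layer, the multipart upload limit in solrconfig.xml is the usual place to look; a sketch (the 2 GB value is illustrative, check the defaults shipped with your Solr version):

```xml
<!-- solrconfig.xml -->
<requestDispatcher>
  <!-- multipartUploadLimitInKB caps the size of uploaded files;
       2048000 KB is roughly 2 GB (illustrative value) -->
  <requestParsers enableRemoteStreaming="true"
                  multipartUploadLimitInKB="2048000" />
</requestDispatcher>
```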
You could also put a short representation of the date (I suggest days since
01.01.2010) as a payload and calculate the boost with the payload function of
the similarity.
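The "days since 01.01.2010" encoding, and a recency boost derived from it, can be sketched like this (the boost formula is made up for illustration; Solr's payload support only hands you the raw number back):

```python
from datetime import date

EPOCH = date(2010, 1, 1)  # compact reference date suggested above

def date_payload(d):
    """Encode a date as 'days since 2010-01-01' -- small enough for a payload."""
    return (d - EPOCH).days

def recency_boost(payload_days, today_days, half_life=30.0):
    """Hypothetical boost: newer documents score higher (illustrative
    formula, not something Solr computes out of the box)."""
    age = today_days - payload_days
    return half_life / (half_life + max(age, 0))

doc = date_payload(date(2010, 11, 29))
now = date_payload(date(2010, 12, 9))
print(doc, now)  # 332 342
print(recency_boost(doc, now))  # 0.75
```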
-Original Message-
From: ext Jason Brown [mailto:jason.br...@sjp.co.uk]
Sent: Montag, 29. November 2010 17:28
To:
We had the same problem for our fields, and we wrote a Tokenizer using the
icu4j library, breaking tokens at script changes and handling them according
to the script and the configured BreakIterators.
This works out very well, as we also add the script information to the token,
so later filter
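A toy version of the script-change break (the one-function script classifier is a crude stand-in for ICU's real script lookup, which icu4j provides properly):

```python
def script_of(ch):
    """Very rough script classifier by code-point range -- an illustrative
    stand-in for ICU's UScript data."""
    o = ord(ch)
    if 0x0400 <= o <= 0x04FF:
        return "Cyrillic"
    if 0x3040 <= o <= 0x30FF:
        return "Kana"
    if 0x4E00 <= o <= 0x9FFF:
        return "Han"
    return "Latin"

def split_on_script_change(text):
    """Break the input wherever the script changes; each run could then be
    handed to a per-script BreakIterator."""
    tokens, current = [], ""
    for ch in text:
        if current and script_of(ch) != script_of(current[-1]):
            tokens.append(current)
            current = ""
        current += ch
    if current:
        tokens.append(current)
    return tokens

print(split_on_script_change("abcПривет漢字"))  # ['abc', 'Привет', '漢字']
```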
Sorry for the double post. Can someone point me to where the original query
given to the DisMaxHandler/QParser is split?
Jan
-Original Message-
From: Kurella Jan (Nokia-MS/Berlin)
Sent: Montag, 22. November 2010 14:49
To: solr-user@lucene.apache.org
Subject:
Hi,
Using the SearchHandler with the defType="dismax" option enables the
DisMaxQParserPlugin. From investigating, it seems it just tokenizes by
whitespace.
But looking at the code, I could not find the place where this behavior is
enforced. I only found that for each field
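What the observed behavior implies, restated as a sketch (the clause structure below is illustrative, not the actual Query objects dismax builds):

```python
def dismax_clauses(user_query, fields):
    """The raw query string is split on whitespace first; each chunk then
    becomes one disjunction over all the queried fields."""
    return [
        {"term": chunk, "fields": list(fields)}
        for chunk in user_query.split()
    ]

clauses = dismax_clauses("solr in action", ["title", "body"])
print(len(clauses))  # 3 -- one disjunction per whitespace-separated chunk
```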
Hi,
I’m trying to find a solution to search only in a given language.
At index time the language is known for each string to be tokenized, so I
would like to write a filter that prefixes each token according to its
language.
First question: what is the best way to pass the language argument to the
filter?
I’m
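The prefixing idea itself is simple; a sketch (the "lang|token" separator is an arbitrary choice):

```python
def prefix_tokens(tokens, lang):
    """Prefix every token with its language code so terms from different
    languages never collide in the index."""
    return [f"{lang}|{tok}" for tok in tokens]

print(prefix_tokens(["hello", "world"], "en"))  # ['en|hello', 'en|world']
```

Note that the query side must apply the same prefix with the same language code, or nothing will match.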
Hi,
yes, this is one of the four options I am going to evaluate. Why your
suggestion might be problematic:
We have ca. 12 language-sensitive fields and support ca. 200 distinct
languages = 2400 fields.
A multifield/dismax query spanning 2400 fields might become problematic.
We will go for this