Hi,
Is there any documentation that says whether scores obtained from
TopDocs.scoreDocs[i].score
are comparable across queries? I am having this problem myself, so I would
really appreciate it if anyone has some pointers on this.
At [1], it seems like they are not. Is there any solution to enable this
co
Unfortunately, not yet. There have been discussions about this,
including this issue for "column-stride fields":
https://issues.apache.org/jira/browse/LUCENE-1231
But no real progress on it lately...
Mike
Diego Cassinera wrote:
Hello All
I'm writing an application to move full te
Well, MultiSearcher is just a Searcher, so you have available
all of the search methods on Searcher. One of which is:
search
public TopFieldDocs search(Query query, Filter filter, int n, Sort sort)
Well, this is what I am doing:
queryString="year:[2003 TO 2005]"
[CODE]
Query pquery = null;
Hits hits = null;
Analyzer analyzer = null;
analyzer = new SnowballAnalyzer("English");
try {
pquery = MultiFieldQueryParser.parse(new String[] {queryString,
queryString}, new S
Are you using one of the search methods that includes sorting? If
not, then do. If you are, then you need to tell us exactly what you
are doing and exactly what you reckon is going wrong.
--
Ian.
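As a rough illustration of why an explicit sort is needed: a range restriction only selects documents, it does not order them. The plain-Java sketch below involves no Lucene at all; YearRangeSortSketch, Doc and titlesInRangeSortedByYear are made-up names. It filters by an inclusive year range and then sorts as a separate step, assuming years are stored as four-digit strings, in which case lexicographic and chronological order coincide. With Lucene itself, the equivalent is to pass a Sort to one of the search methods rather than sorting by hand.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class YearRangeSortSketch {

    /** Hypothetical stand-in for an indexed document with a stored year field. */
    static final class Doc {
        final String title;
        final String year; // assumption: four-digit year kept as text

        Doc(String title, String year) {
            this.title = title;
            this.year = year;
        }
    }

    /**
     * Step 1: keep only documents whose year lies in [from, to] (inclusive).
     * Step 2: sort the survivors by year; without this step they stay in
     * whatever order they arrived in.
     */
    static List<String> titlesInRangeSortedByYear(List<Doc> docs, String from, String to) {
        List<Doc> matches = new ArrayList<>();
        for (Doc d : docs) {
            if (d.year.compareTo(from) >= 0 && d.year.compareTo(to) <= 0) {
                matches.add(d);
            }
        }
        // String comparison is lexicographic; List.sort is stable, so
        // documents with the same year keep their relative order.
        matches.sort(Comparator.comparing((Doc d) -> d.year));

        List<String> titles = new ArrayList<>();
        for (Doc d : matches) {
            titles.add(d.title);
        }
        return titles;
    }
}
```

Calling titlesInRangeSortedByYear(docs, "2003", "2005") returns the titles of the matching documents ordered by year, regardless of the order in which the documents were supplied.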
On Wed, Nov 19, 2008 at 6:23 PM, Ariel <[EMAIL PROTECTED]> wrote:
> it is supposed lucene make a
Tim,
On Wednesday 19 November 2008 02:32:40, Tim Sturge wrote:
...
> >>
> >> This is less than 2x slower than the dedicated bitset and more
> >> than 50x faster than the range boolean query.
> >>
> >> Mike, Paul, I'm happy to contribute this (ugly but working) code
> >> if there is interest. Let
Lucene is supposed to sort lexicographically, but this is not happening.
Could you tell me what I'm doing wrong?
I hope you can help me.
Regards
On Wed, Nov 19, 2008 at 11:56 AM, Ariel <[EMAIL PROTECTED]> wrote:
> Thanks, that was very helpful, but I have a question when I make the
> searche
It's more than possible, it's probable. Cache thrashing would definitely be
my first guess; with so many copies of the exact same data you're not only
missing out on significant gains with the L2 cache, you're also taking a
major hit with every cache miss (which probably happens every context
swit
Please ignore this question.
I've noticed it was answered in
another thread just before
I posted my question.
Answer: use TopDocs.scoreDocs[i].score
T. "Kuro" Kurosaka, Basis Technology
San Francisco, California, U.S.A.
-
I have a couple quick questions...it might just be because I haven't looked
at this in a week now (got pulled away onto some other stuff that had to
take priority).
In the searching phase, I would run the search across all page documents,
and then for each of those pages, do a search with
PayloadS
Thanks, that was very helpful, but I have a question: when I run the
searches, the results are not sorted according to the range. For example,
with year:[2003 TO 2008], the first page shows 2003 documents, the
second 2005 documents, and the third 2004 documents. I don't see any
sort crit
On Wednesday 19 November 2008 03:39:01, [EMAIL PROTECTED] wrote:
...
>
> Our design is roughly as follows: we have some pre-query filters,
> queries typically involving around 25 clauses, and some
> post-processing of hits. We collect counts and filter post query
> using a hit collector, which use
Hello,
Is there any way to obtain a raw hit score?
I understand the deprecated Hits.getScore()
returns normalized scores, relative to each
query. Is TopDocs.scoreDocs[i].score
also normalized, or raw?
I'd like to compare confidence levels
of hits among different queries.
Thanks.
T. "Kuro" K
Hello All
I'm writing an application to move full text search out of my RDBMS. Today
the app hits the DB twice: 1) to do the search itself, and 2) to format
the output of the search results. In my plan I'm moving everything to
Lucene documents that contain fields where I will be doing the s
Hi - sounds like you need a range query.
http://lucene.apache.org/java/2_3_2/queryparsersyntax.html#Range%20Searches
--
Ian.
On Wed, Nov 19, 2008 at 4:02 PM, Ariel <[EMAIL PROTECTED]> wrote:
> Hi everybody:
>
> I need to make search with lucene 2.3.2, taking in account the dates,
> previously
Hi everybody:
I need to search with Lucene 2.3.2, taking dates into account. When I
built the index I created a date field in which I stored the year the
document was created. At search time I would like to
retrieve documents that were created before a given year or aft
excitingComm2 wrote:
Hi everybody,
as far as I know the lucene score is an arbitrary number between 0.0 and
1.0.
Is this correct, that the scores in my resultset are always normalised to
this spread or is it possible to get higher scores?
Regards,
John W.
Hits is the class that did the norma
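For context on that normalisation: as far as I recall, the deprecated Hits class only rescaled scores when the top raw score exceeded 1.0, dividing every score by that maximum, and otherwise passed them through unchanged. The sketch below reproduces that behaviour in plain Java on an array of raw scores. HitsStyleNormalizer and normalize are made-up names and this is not Lucene code, so treat it as an assumption-laden illustration rather than the library's implementation.

```java
final class HitsStyleNormalizer {

    /**
     * Rescales raw scores the way the deprecated Hits class is understood
     * to have done: only if the best raw score is above 1.0 are all scores
     * divided by it. Scores at or below 1.0 are returned unchanged, which
     * is why normalised scores from different queries are not comparable.
     */
    static float[] normalize(float[] rawScores) {
        float max = 0f;
        for (float s : rawScores) {
            max = Math.max(max, s);
        }
        // Scale factor is 1/max when max > 1.0, otherwise 1 (no change).
        float scale = max > 1.0f ? 1.0f / max : 1.0f;

        float[] normalized = new float[rawScores.length];
        for (int i = 0; i < rawScores.length; i++) {
            normalized[i] = rawScores[i] * scale;
        }
        return normalized;
    }
}
```

The point of the sketch is that the scale factor depends on each query's own top hit, so a normalised 1.0 from one query says nothing about how it compares with a 1.0 from another.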
Hi everybody,
as far as I know the Lucene score is an arbitrary number between 0.0 and
1.0.
Is it correct that the scores in my result set are always normalised to
this range, or is it possible to get higher scores?
Regards,
John W.
--
View this message in context:
http://www.nabble.com/Spre
Hi Karl,
The reset() problem is not a big issue; I can adapt our TokenStreams.
As for serialization: since we need to share very small indexes (200 docs
max) in a cluster, we need to serialize something.
I was planning to use Java serialization, with maybe some compression
on the resulting
I'm going to have to punt on what Hibernate does/doesn't do since I have no
experience there.
But in general analyzers are very important. StandardAnalyzer, for instance,
tries to recognize e-mail addresses. So it'll create some very interesting
tokens, some that are unexpected unless you really k
Hi David,
thanks for the report! I suppose you mean IndexWriter vs
InstantiatedIndexWriter? These discrepancies are definitely considered
problems. I've created a new issue in the tracker:
http://issues.apache.org/jira/browse/LUCENE-1462
Why are you trying to serialize the InstantiatedIn
Thanks for the quick answer!
I haven't specified the analyzer, so it should be the StandardAnalyzer. I
forgot to mention that I'm using Lucene via Hibernate Search, where I can
easily define the fields in the Hibernate POJO classes. But as far as I
know this shouldn't change things that much bec
Can you describe the queries in more detail? Can you narrow down
exactly which operations / types of queries are substantially slower?
Also, I'm assuming both of you are NOT on Windows? NIOFSDirectory has
poor performance on Windows due to this bug in Sun's JVM:
http://bugs.sun.com/
Hi,
Here are some differences I noticed between InstantiatedIndex and
RAMDirectory:
- RAMDirectory seems to do a reset on TokenStreams the first time, which
makes it possible to initialise some objects before starting streaming;
InstantiatedIndex does not.
- I can serialize a RAMDirectory but I cannot
[EMAIL PROTECTED] wrote:
> On an index of around 20 gigs I've been seeing a performance drop of
> around 35% after upgrading to 2.4 (measured on ~1 requests
> identical requests, executed in parallel against a threaded lucene /
> apache setup, after a roughly 1 query warmup). The principal
If you don't have a lot of entries for each invoice you can duplicate the
invoice for each entry - you'll have some field duplications (and bigger
index size) between the different invoices but it'll be easy to find exactly
what you want.
If you have too many different values, I built a solution s