I'm using this for searching (an excerpt, not the full code):
IndexSearcher indexSearcher = new IndexSearcher("index");
final QueryParser queryParser = new QueryParser("Line", myAnalyzer);
queryParser.setAllowLeadingWildcard(true);
final Query query = queryParser.parse(searchText);
final BitSet
As far as I know, the time needed to create an index grows non-linearly with
the size of the documents. For example, if the index size is 1 GB, indexing
takes 2 hours; if the index size is 10 GB, it may take 30 hours. Can anyone
tell me the reason? Any tips will be appreciated.
Hi,
Can I boost different fields in MultiFieldQueryParser with different
factors? Also, what is the maximum boost factor value I can assign to a
field?
Thanks a ton! Ed
In my experience, the main issue to be concerned about with tons of
fields is norms. You'll likely have to turn them off for most of the
fields unless you have plenty of RAM to burn. They are stored in byte
arrays of size maxDoc for each field (i.e. non-sparse). Other than that, I
don't think the
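A rough back-of-the-envelope sketch of the norms cost described above, assuming one byte per document for every field that keeps norms (the "byte arrays of size maxdoc" mentioned). The class name, method name, and the document and field counts are illustrative assumptions, not figures from this thread or part of Lucene.

```java
final class NormsEstimate {
    /**
     * Estimated heap used by norms, in bytes: one byte per document
     * for each field that keeps norms (assumption stated in the lead-in).
     */
    static long normsBytes(long maxDoc, long fieldsWithNorms) {
        return maxDoc * fieldsWithNorms;
    }

    public static void main(String[] args) {
        long maxDoc = 1_000_000L; // assumed number of documents
        long fields = 10_000L;    // assumed number of fields with norms enabled
        long bytes = normsBytes(maxDoc, fields);
        System.out.println((bytes / (1024L * 1024L)) + " MB of norms");
    }
}
```

If most of those fields do not need length normalization or index-time boosts, omitting norms on them (for example with Field.setOmitNorms(true) in Lucene 2.x, as far as I recall) should bring the fieldsWithNorms factor down accordingly.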
Thanks Mark,
I got the latest Contrib bits for Highlighter.net (Jan 28/2008, Version
2.3.2), but it looks similar to the older 2.0.0.
There is only a QueryScorer.
Any ideas? (Really important to me :)
Ian
On Sat, Feb 14, 2009 at 11:56 PM, Mark Miller wrote:
> Sorry, I wasn't specific enough. I
Mark Miller wrote:
Michael McCandless wrote:
Mark Miller wrote:
So HitCollector#collect(int doc, float score) is not called in a special
(default) order and must order the docs itself by score if one needs the
hits sorted by relevance?
Presumably there is no score ordering to the h
On 15 Feb 2009, at 16:27, Joel Halbert wrote:
Is there any practical limit on the number of fields that can be
maintained on an index?
My index looks something like this, 1 million documents. For each group
of 1000 documents I might have 10 indexed fields. This would mean in
total about 1 f
Michael McCandless wrote:
Mark Miller wrote:
So HitCollector#collect(int doc, float score) is not called in a special
(default) order and must order the docs itself by score if one needs the
hits sorted by relevance?
Presumably there is no score ordering to the hit IDs Lucene delivers
Mark Miller wrote:
So HitCollector#collect(int doc, float score) is not called in a special
(default) order and must order the docs itself by score if one needs the
hits sorted by relevance?
Presumably there is no score ordering to the hit IDs Lucene delivers to
a HitCollector? i.e.
So HitCollector#collect(int doc, float score) is not called in a special
(default) order and must order the docs itself by score if one needs the
hits sorted by relevance?
Presumably there is no score ordering to the hit IDs Lucene delivers to
a HitCollector? i.e. they are delivered in th
> The HitCollector used will determine how things are ordered. In 2.4, the
> TopDocCollector will order by relevancy and the TopFieldDocCollector can
> order by relevancy, index order, or by field. Lucene delivers the hit IDs
> to the HitCollector and it can order as it pleases.
So
Presumably there is no score ordering to the hit IDs Lucene delivers to
a HitCollector? i.e. they are delivered in the order they are found and
the score is neither ascending nor descending, i.e. the next score could be
higher or lower than the previous one?
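To make the point concrete, here is a minimal sketch of a collector that receives (doc, score) pairs in whatever order they are found and keeps only the N best by score, using a min-heap. TopNByScore and Hit are hypothetical names for illustration; the class only mirrors the shape of HitCollector#collect(int doc, float score) and does not extend or reproduce Lucene's own TopDocCollector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

final class TopNByScore {
    /** One collected hit: a doc id and its score. */
    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    private final int n;
    // Min-heap on score: the weakest hit kept so far sits at the head.
    private final PriorityQueue<Hit> heap;

    TopNByScore(int n) {
        this.n = n;
        this.heap = new PriorityQueue<>(Math.max(1, n),
                (a, b) -> Float.compare(a.score, b.score));
    }

    /** Same shape as HitCollector#collect; scores may arrive in any order. */
    void collect(int doc, float score) {
        if (n <= 0) {
            return;
        }
        if (heap.size() < n) {
            heap.add(new Hit(doc, score));
        } else if (score > heap.peek().score) {
            heap.poll();                    // drop the current weakest hit
            heap.add(new Hit(doc, score));
        }
    }

    /** Doc ids of the retained hits, best score first. */
    List<Integer> topDocs() {
        List<Hit> hits = new ArrayList<>(heap);
        hits.sort((a, b) -> Float.compare(b.score, a.score));
        List<Integer> docs = new ArrayList<>();
        for (Hit h : hits) {
            docs.add(h.doc);
        }
        return docs;
    }

    public static void main(String[] args) {
        TopNByScore collector = new TopNByScore(2);
        collector.collect(0, 0.2f);
        collector.collect(1, 0.9f);
        collector.collect(2, 0.5f);
        collector.collect(3, 0.7f);
        System.out.println(collector.topDocs());
    }
}
```

The collector never assumes the incoming scores are sorted, so it should behave the same whether hits arrive in doc-id order or any other order.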
-Original Message-
From: Mark Miller
spr...@gmx.eu wrote:
Hi,
in what order does search(Query query, HitCollector results) return the
results? By relevance?
Thank you.
-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-
Hi,
in what order does search(Query query, HitCollector results) return the
results? By relevance?
Thank you.
On Sun, Feb 15, 2009 at 10:50 AM, Joel Halbert wrote:
> When constructing a query, using a series of terms e.g.
>
> Term1=X, Term2=Y etc...
>
> does it make sense, as in SQL, to place the most restrictive term query
> first?
>
> i.e. if I know that the query will be mainly constrained by the valu
When constructing a query, using a series of terms e.g.
Term1=X, Term2=Y etc...
does it make sense, as in SQL, to place the most restrictive term query
first?
i.e. if I know that the query will be mainly constrained by the value of
Term1, does having this as the first in the query make the exec
Hi,
Is there any practical limit on the number of fields that can be
maintained on an index?
My index looks something like this, 1 million documents. For each group
of 1000 documents I might have 10 indexed fields. This would mean in
total about 1 fields. Am I going to run into any issues her
Meanwhile the choice between SortedVIntList and OpenBitSet
has been removed from the trunk (development version),
that now uses OpenBitSet only:
https://issues.apache.org/jira/browse/LUCENE-1296
In case there is preference to have SortedVIntList used in the
next Lucene version (i.e. in cases when
I think you would need to
1) collect all the matching IDs for Field2=x
2) loop through Field1, for each Term's doc, collect the term if the term
doc is in the matching IDs from step 1.
This should be the fastest approach, pretty similar to what you suggested.
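The two steps above can be sketched roughly as follows, using plain in-memory collections as a stand-in for the index. The names UniqueValuesSketch, field1Postings and field2Matches are hypothetical; in real Lucene 2.4 code the postings map would presumably be replaced by walking the Field1 terms with IndexReader.terms(...) and termDocs(...), and the BitSet would be filled by a collector for Field2=x.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

final class UniqueValuesSketch {
    /**
     * @param field1Postings each Field1 term mapped to the doc ids containing it
     *                       (stand-in for enumerating terms and their docs)
     * @param field2Matches  step 1: doc ids that match Field2=x
     * @return step 2: every Field1 term with at least one doc in field2Matches
     */
    static SortedSet<String> uniqueValues(Map<String, int[]> field1Postings,
                                          BitSet field2Matches) {
        SortedSet<String> result = new TreeSet<>();
        for (Map.Entry<String, int[]> entry : field1Postings.entrySet()) {
            for (int doc : entry.getValue()) {
                if (field2Matches.get(doc)) {
                    result.add(entry.getKey());
                    break; // one matching doc is enough for this term
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, int[]> postings = new HashMap<>();
        postings.put("red", new int[] {0, 3});
        postings.put("green", new int[] {1});
        postings.put("blue", new int[] {2, 4});

        BitSet matches = new BitSet();
        matches.set(1); // docs 1 and 2 are assumed to match Field2=x
        matches.set(2);

        System.out.println(uniqueValues(postings, matches));
    }
}
```

Stopping at the first matching doc per term keeps the inner loop short when a term is common within the matching subset.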
--
Chris Lu
Hi,
I'm looking for an optimal solution for extracting unique field values.
The rub is that I want to be able to perform this for a unique subset of
documents...as per the example:
I have an index with Field1 and Field2.
I want "all unique values of Field1 where Field2=X".
Other than actually p