Hi Uwe, thanks for your suggestions. I have tried a couple of things
with no luck yet:
Sorry, I just noticed that you are using TermFilter, not TermsFilter.
TermFilter does not support random access (using bits()). Because of
this, the filtered docs cannot be passed down using acceptDocs.
TermsFilter made no difference, still no acceptDocs passed to the
filter.
In addition, the SHOULD clause forces the ConstantScoreQuery to try
all documents, because there is nothing else that could drive the
query.
As an experiment I tried MUST, this didn't help either.
An alternative approach would be (in Lucene 4.10 or 5.0) to add the
term to the BooleanQuery as ConstantScoreQuery(TermQuery) with
boost=0. In that case it can drive the query and does not affect
scoring. In later Lucene versions you may use the new
BooleanQuery.Occur type "FILTER" which can add any query as filter.
Filters will be deprecated once this is ready.
This is interesting and I will try it when I get a chance.
My goal is to slowly transform a particular field from StringField to
BinaryDocValues so that during the transition a doc may hold the value
either in the old location or the new. Therefore a query must be able
to say
oldField:"foo" OR newField:"foo"
where oldField is a StringField and newField is a BinaryDocValues.
Why do you want to do this?
Good question! In our architecture we build indexes by pulling data
from several sources and it is _expensive_. Increasingly we are asked
to change one or two fields, which currently requires a full re-index
of the doc. When I attended the Dublin Lucene conference I spoke to
Shai Erera about this problem and he pointed me at DocValues, which
allow you to update fields without incurring the full doc reindex
cost. That is the appeal for us.
As I said before, we want to transform docs only as they are updated,
where transformation involves dropping the old TextField and creating
a new BinaryDocValuesField containing the same value. Hence the need
for the query to be able to search 'old OR new'.
If you want to query like this on the field, it is a bad idea to use
DocValues.
Why is it a bad idea?
If you want to use DocValues
in addition for something else, you should place both in your index:
the indexed term for queries/filters and the docvalues for e.g.
sorting / whatever... You can use the same field name for both.
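To make the two remarks above concrete, here is a toy model in plain
Java. It is not Lucene's API and not how Lucene is implemented;
MixedModeLookup and all of its methods are made up for illustration.
It only assumes the general picture that an indexed term maps a value
to the docs containing it, while doc values are stored per document,
so matching on them means checking each doc in turn.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Toy model only: none of these names are Lucene classes. */
class MixedModeLookup {
    // "Indexed term" side: value -> doc ids in insertion order.
    private final Map<String, List<Integer>> postings = new HashMap<>();
    // "DocValues" side: one slot per doc id, null when the doc has no value.
    private final List<String> column = new ArrayList<>();
    // Number of per-document checks performed by the last search.
    int docsVisited;

    /** Adds a doc; either value may be null, mimicking the old/new transition. */
    int addDoc(String indexedValue, String columnValue) {
        int docId = column.size();
        column.add(columnValue);
        if (indexedValue != null) {
            postings.computeIfAbsent(indexedValue, k -> new ArrayList<>()).add(docId);
        }
        return docId;
    }

    /** Term lookup: jumps straight to the matching docs. */
    List<Integer> searchIndexed(String value) {
        List<Integer> hits = new ArrayList<>(postings.getOrDefault(value, new ArrayList<>()));
        docsVisited = hits.size();
        return hits;
    }

    /** Column lookup: nothing maps a value to its docs, so every doc is checked. */
    List<Integer> searchColumn(String value) {
        List<Integer> hits = new ArrayList<>();
        docsVisited = 0;
        for (int docId = 0; docId < column.size(); docId++) {
            docsVisited++;
            if (value.equals(column.get(docId))) {
                hits.add(docId);
            }
        }
        return hits;
    }

    /** The mixed-mode query: oldField:value OR newField:value. */
    List<Integer> searchEither(String value) {
        TreeSet<Integer> merged = new TreeSet<>(searchIndexed(value));
        int visitedByTermLookup = docsVisited;
        merged.addAll(searchColumn(value));
        docsVisited += visitedByTermLookup;
        return new ArrayList<>(merged);
    }
}
```

For example, with four docs of which one holds "foo" as an indexed
term and one holds it only in the column, searchEither("foo") checks
one posting plus all four column slots. If every doc carried the
indexed term as well, searchIndexed alone would answer the query and
the column could be kept for sorting.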
Sorry I don't really follow this ... If it helps I can provide the
source of my experimental code so far.
I must add that a full reindex all in one go is currently not an
option, so the solution must support this mixed mode.
Uwe
Kind regards
- Chris