Hi Uwe, thanks for your suggestions. I have tried a couple of things with no luck yet:

Sorry,
I just noticed, you are using TermFilter not TermsFilter: This one
does not support random access (using bits()). Because of this the
filtered docs cannot be passed down using acceptDocs.

Switching to TermsFilter made no difference: still no acceptDocs are passed to the filter.

In addition, the SHOULD clause means that the ConstantScoreQuery has to
try all documents, because there is nothing else that could drive the
query.

As an experiment I tried MUST instead; this didn't help either.

An alternative approach would be (in Lucene 4.10 or 5.0) to add the
TermFilter as ConstantScoreFilter(TermQuery) with boost=0 to the
BooleanQuery. In that case it can drive the query and does not affect
scoring. In later Lucene versions you may use the new
BooleanQuery.Occur type "FILTER" which can add any query as filter.
Filters will be deprecated once this is ready.

This is interesting and I will try it when I get a chance.

My goal is to slowly transform a particular field from StringField to BinaryDocValues, so that during the transition a doc may hold the value either in the old location or in the new one. Therefore a query must be able to say:
    oldField:"foo" OR newField:"foo"
where oldField is a StringField and newField is a BinaryDocValues field.
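
To make the mixed-mode requirement concrete, here is a toy model in plain Java. It is only a sketch of the semantics I am after: it is not Lucene code, it says nothing about how Lucene evaluates such a query, and every class and method name in it is hypothetical. The old field is modelled as a value-to-docs map and the new field as a doc-to-value column.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: NOT Lucene code; all names here are hypothetical.
class MixedModeSketch {
    // "oldField": value -> ids of the docs that still hold the value there.
    private final Map<String, Set<Integer>> oldField = new HashMap<>();
    // "newField": doc id -> value, one entry per transformed doc.
    private final Map<Integer, String> newField = new HashMap<>();

    // A doc that has not been transformed yet keeps its value in the old field.
    void addOldStyle(int docId, String value) {
        oldField.computeIfAbsent(value, v -> new TreeSet<>()).add(docId);
    }

    // A transformed doc holds its value only in the new per-document column.
    void addNewStyle(int docId, String value) {
        newField.put(docId, value);
    }

    // Models: oldField:value OR newField:value
    Set<Integer> search(String value) {
        Set<Integer> hits = new TreeSet<>();
        // Old side: one map lookup yields the matching docs.
        hits.addAll(oldField.getOrDefault(value, Collections.<Integer>emptySet()));
        // New side: the column maps doc -> value, so this sketch checks each entry.
        for (Map.Entry<Integer, String> entry : newField.entrySet()) {
            if (entry.getValue().equals(value)) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        MixedModeSketch index = new MixedModeSketch();
        index.addOldStyle(1, "foo");   // not yet transformed
        index.addOldStyle(2, "bar");   // not yet transformed
        index.addNewStyle(3, "foo");   // already transformed
        index.addNewStyle(4, "baz");   // already transformed
        System.out.println(index.search("foo")); // prints [1, 3]
    }
}
```

Docs 1 and 3 both match "foo" even though one holds the value in the old location and the other in the new one, which is the behaviour the real query needs during the transition.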

Why do you want to do this?

Good question! In our architecture we build indexes by pulling data from several sources, and it is _expensive_. Increasingly we are asked to change just one or two fields, which currently requires a full re-index of the doc. When I attended the Dublin Lucene conference I spoke to Shai Erera about this problem, and he pointed me at DocValues, which allow you to update fields without incurring the full doc re-index cost. That is the appeal for us. As I said before, we want to transform docs only as they are updated, where transformation involves dropping the old StringField and creating a new BinaryDocValuesField containing the same value. Hence the need for the query to be able to search 'old OR new'.

If you want to query like this on the field, it is a bad idea to use DocValues.

Why is it a bad idea?

If you want to use DocValues
in addition for something else, you should place both in your index:
the indexed term for queries/filters and the docvalues for e.g.
sorting / whatever... You can use the same field name for both.

Sorry, I don't really follow this. If it helps, I can provide the source of my experimental code so far.

I must add that a full reindex all in one go is currently not an option, so the
solution must support this mixed mode.

Uwe

Kind regards

- Chris

