Hi,

we are indexing different types of documents in one Lucene index. They have most fields in common, but we need to filter on some types for certain queries. We use numeric values (1-4) to identify the document type. When querying these documents, we see that performance degrades the more documents of a type are in the index.
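One way to check an observation like this is to run every query variant through the same small timing harness. The following is only a minimal sketch under assumptions, not the test code referred to in this message: the class name `QueryTimer`, the method `medianNanos`, and the warm-up and run counts are hypothetical, and the workload in `main` is a stand-in for an actual search call. For serious measurements a dedicated benchmarking tool is likely the safer choice.

```java
import java.util.Arrays;
import java.util.function.IntSupplier;

// Hypothetical helper for illustration; not part of Lucene.
final class QueryTimer {

    /**
     * Runs the task `warmup` times untimed so the JIT can settle, then
     * `runs` times timed, and returns the median elapsed time in
     * nanoseconds (the upper median when `runs` is even).
     */
    static long medianNanos(IntSupplier task, int warmup, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        int sink = 0; // accumulate results so the work is not optimized away
        for (int i = 0; i < warmup; i++) {
            sink += task.getAsInt();
        }
        long[] elapsed = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            sink += task.getAsInt();
            elapsed[i] = System.nanoTime() - start;
        }
        if (sink == Integer.MIN_VALUE) {
            System.out.println(sink); // keeps `sink` observable
        }
        Arrays.sort(elapsed);
        return elapsed[runs / 2];
    }

    public static void main(String[] args) {
        // Stand-in workload. In a real test this would be the search
        // itself, e.g. () -> searcher.count(query), once per query variant.
        IntSupplier dummy = () -> {
            int s = 0;
            for (int i = 0; i < 1_000_000; i++) {
                s += i % 7;
            }
            return s;
        };
        System.out.println("median ns: " + medianNanos(dummy, 20, 50));
    }
}
```

Reporting the median rather than the mean keeps a single slow run (a GC pause, for example) from dominating the result, and using the same warm-up for each variant makes the comparison fairer.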
Using a simple test that indexes 10 million documents, I see the following when filtering on everything but 100,000 documents:

* When issuing the query alone, the new PointRangeQuery (IntPoint.newExactQuery) is a lot faster than the term and legacy numeric queries (in my case around 2x the speed of the others).
* When issuing a bool query that contains a term query selecting 5 documents together with a MUST clause that selects on the numeric value, the points are 5x slower than legacy numeric (LegacyNumericRangeQuery.newIntRange) and terms (TermQuery).
* When doing the same thing with SHOULD instead of MUST for the additional term query, the PointRangeQuery is the fastest as well.

I suspect this is related to the discussion in https://issues.apache.org/jira/browse/LUCENE-7254

Of course there could be something wrong with the way I am measuring the performance; I'd be happy to share the code. But what I read in the ticket above seems to hint that points are not suited for every use case. Is it recommended to use StringField in a case like this instead?

Regards
Florian

--
Florian Hopf
Freelance Software Developer
http://blog.florian-hopf.de