[ https://issues.apache.org/jira/browse/LUCENE-6352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386461#comment-14386461 ]
Adrien Grand commented on LUCENE-6352:
--------------------------------------

Thanks Martijn! I had a look at the patch and it looks very clean, I like it.

{code}
Query rewrittenFromQuery = fromQuery.rewrite(indexReader); (JoinUtil.java)
{code}

I think you should rather call searcher.rewrite(fromQuery) here, which will take care of rewriting until rewrite returns 'this'.

{code}
final float[][] blocks = new float[Integer.MAX_VALUE / arraySize][];
{code}

Instead of allocating based on Integer.MAX_VALUE, maybe it should use the number of unique values? i.e. '(int) (((long) valueCount + arraySize - 1) / arraySize)'?

{code}
return new ComplexExplanation(true, score, "Score based on join value " + joinValue.utf8ToString());
{code}

I don't think it is safe to convert to a string, as we have no idea whether the value represents a UTF-8 string.

In BaseGlobalOrdinalScorer, you are caching the current doc ID; maybe we should not? When I worked on approximations, caching the current doc ID proved to be quite error-prone, and it was often better to just call approximation.docID() when the current doc ID was needed.

Another thing I'm wondering about is the equals/hashCode impl of this global ordinal query: since the documents that match depend on what happens in other segments, this query cannot be cached per segment. So maybe it should include the current IndexReader in its equals/hashCode comparison in order to work correctly with query caches? In the read-only case, this would still allow the query to be cached, since the current reader never changes, while in the read/write case the query is unlikely to be cached, given that the query cache will notice that it does not get reused.
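For illustration, the block-count expression suggested above can be sketched in plain Java. This is only a minimal sketch, not code from the patch: the class name BlockCountSketch and the method blockCount are hypothetical, and only the names valueCount and arraySize come from the comment. It shows why the cast to long matters: the intermediate sum can exceed Integer.MAX_VALUE before the division brings it back into int range.

```java
public final class BlockCountSketch {

    // Hypothetical helper: number of fixed-size blocks needed to hold
    // valueCount entries, i.e. ceil(valueCount / arraySize).
    // The sum is computed as a long so that valueCount + arraySize - 1
    // cannot overflow when valueCount is close to Integer.MAX_VALUE.
    static int blockCount(int valueCount, int arraySize) {
        if (valueCount < 0 || arraySize <= 0) {
            throw new IllegalArgumentException("valueCount must be >= 0 and arraySize > 0");
        }
        return (int) (((long) valueCount + arraySize - 1) / arraySize);
    }

    public static void main(String[] args) {
        int arraySize = 4096; // assumed block size, chosen only for this example
        System.out.println(blockCount(0, arraySize));
        System.out.println(blockCount(1, arraySize));
        System.out.println(blockCount(4097, arraySize));
        System.out.println(blockCount(Integer.MAX_VALUE, arraySize));
    }
}
```

With this sizing the outer array grows with the number of unique values rather than always being sized for the worst case.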
> Add global ordinal based query time join
> -----------------------------------------
>
>                 Key: LUCENE-6352
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6352
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Martijn van Groningen
>         Attachments: LUCENE-6352.patch, LUCENE-6352.patch, LUCENE-6352.patch
>
>
> Global ordinal based query time join as an alternative to the current query
> time join. The implementation is faster for subsequent joins between reopens,
> but requires an OrdinalMap to be built.
> This join has certain restrictions and requirements:
> * A document can only refer to one other document (but can be referred to by
> one or more documents).
> * A type field must exist on all documents and each document must be
> categorized to a type. This is to distinguish between the "from" and "to" side.
> * There must be a single sorted doc values field used by both the "from" and
> "to" documents. By encoding the join into a single doc values field it is
> trivial to build an ordinals map from it.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org