Aklakan commented on code in PR #2405:
URL: https://github.com/apache/jena/pull/2405#discussion_r1703989751
##########
jena-arq/src/main/java/org/apache/jena/sparql/engine/join/AbstractIterHashJoin.java:
##########
@@ -42,47 +42,52 @@ public abstract class AbstractIterHashJoin extends QueryIter2 {
protected long s_countResults = 0 ; // Overall result size.
protected long s_trailerResults = 0 ; // Results from the trailer iterator.
// See also stats in the probe table.
-
+
protected final JoinKey joinKey ;
- protected final HashProbeTable hashTable ;
+ protected final MultiHashProbeTable hashTable ;
private QueryIterator iterStream ;
private Binding rowStream = null ;
private Iterator<Binding> iterCurrent ;
- private boolean yielded ; // Flag to note when current probe causes a result.
+ private boolean yielded ; // Flag to note when current probe causes a result.
// Hanlde any "post join" additions.
private Iterator<Binding> iterTail = null ;
-
+
enum Phase { INIT, HASH , STREAM, TRAILER, DONE }
Phase state = Phase.INIT ;
-
+
private Binding slot = null ;
- protected AbstractIterHashJoin(JoinKey joinKey, QueryIterator probeIter, QueryIterator streamIter, ExecutionContext execCxt) {
+ protected AbstractIterHashJoin(JoinKey initialJoinKey, QueryIterator probeIter, QueryIterator streamIter, ExecutionContext execCxt) {
super(probeIter, streamIter, execCxt) ;
-
- if ( joinKey == null ) {
+
+ if ( initialJoinKey == null ) {
+ // This block computes an initial join key from the common variables of each iterator's first binding.
+
QueryIterPeek pProbe = QueryIterPeek.create(probeIter, execCxt) ;
QueryIterPeek pStream = QueryIterPeek.create(streamIter, execCxt) ;
-
+
Binding bLeft = pProbe.peek() ;
Binding bRight = pStream.peek() ;
-
+
List<Var> varsLeft = Iter.toList(bLeft.vars()) ;
List<Var> varsRight = Iter.toList(bRight.vars()) ;
- joinKey = JoinKey.createVarKey(varsLeft, varsRight) ;
+ // joinKey = JoinKey.createVarKey(varsLeft, varsRight) ;
+ initialJoinKey = JoinKey.create(varsLeft, varsRight) ;
+
probeIter = pProbe ;
streamIter = pStream ;
}
-
- this.joinKey = joinKey ;
+
+ JoinKey maxJoinKey = null;
Review Comment:
The max join key was meant to prevent the creation of `JoinIndex` instances
for variables other than the given ones. For example, when it is set to [x, y],
an attempt to create a `JoinIndex` for [x, y, z] would still only create one
for [x, y] and thus omit z. Most likely such a feature is not needed, so it is
disabled by setting it to null.
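To illustrate the restriction described above, here is a minimal sketch. It
does not use the actual Jena classes: variables are plain strings, and
`MaxJoinKeySketch` and `effectiveKey` are hypothetical names invented for this
example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the Jena JoinKey / JoinIndex API.
public class MaxJoinKeySketch {

    /**
     * Returns the variables an index would actually be built on.
     * A null maxKey means "no restriction" (the setting used in the PR).
     */
    public static List<String> effectiveKey(List<String> requested, List<String> maxKey) {
        if (maxKey == null) {
            return List.copyOf(requested);
        }
        List<String> result = new ArrayList<>(requested);
        result.retainAll(maxKey); // drop variables outside the max join key
        return result;
    }

    public static void main(String[] args) {
        // With a max key of [x, y], a request for [x, y, z] omits z.
        System.out.println(effectiveKey(List.of("x", "y", "z"), List.of("x", "y"))); // [x, y]
        // With no max key, the request is used unchanged.
        System.out.println(effectiveKey(List.of("x", "y", "z"), null));             // [x, y, z]
    }
}
```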
`initialJoinKey` improves on the original `joinKey`: originally, only the
first variable common to the peeked bindings was used for indexing; now all
common variables are used.
The `initialJoinKey` is used to store the probe bindings directly in an
indexed structure (rather than, e.g., a list), but the `MultiHashProbeTable`
can create further `JoinIndex` instances depending on the lookup requests.
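
As a rough sketch of why indexing on all common variables helps, the example
below compares a key built from only the first common variable with one built
from all of them. It is an assumption-laden illustration: bindings are modelled
as `Map<String, String>`, and `InitialJoinKeySketch`, `allCommonVars`,
`firstCommonVar` and `index` are hypothetical names, not Jena API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; bindings are plain maps from variable name to value.
public class InitialJoinKeySketch {

    /** All variables present in both bindings, in the order of the left one. */
    public static List<String> allCommonVars(List<String> left, List<String> right) {
        List<String> common = new ArrayList<>(left);
        common.retainAll(right);
        return common;
    }

    /** Only the first common variable (the older behaviour described above). */
    public static List<String> firstCommonVar(List<String> left, List<String> right) {
        List<String> common = allCommonVars(left, right);
        return common.isEmpty() ? common : common.subList(0, 1);
    }

    /** Groups rows into buckets keyed by their values for the key variables. */
    public static Map<List<String>, List<Map<String, String>>> index(
            List<Map<String, String>> rows, List<String> key) {
        Map<List<String>, List<Map<String, String>>> table = new HashMap<>();
        for (Map<String, String> row : rows) {
            List<String> values = new ArrayList<>();
            for (String var : key) {
                values.add(row.get(var));
            }
            table.computeIfAbsent(values, k -> new ArrayList<>()).add(row);
        }
        return table;
    }

    public static void main(String[] args) {
        List<Map<String, String>> probeRows = List.of(
                Map.of("x", "1", "y", "a"),
                Map.of("x", "1", "y", "b"));
        List<String> left = List.of("x", "y");
        List<String> right = List.of("x", "y", "z");

        // Key [x]: both rows share one bucket, so a lookup must still filter on y.
        System.out.println(index(probeRows, firstCommonVar(left, right)).size()); // 1
        // Key [x, y]: each row gets its own bucket.
        System.out.println(index(probeRows, allCommonVars(left, right)).size());  // 2
    }
}
```

A table that can build further indexes on demand could, under the same
assumptions, call something like `index` again with a different key when a
lookup request arrives for variables the initial key does not cover.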
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.