Following up on my message from last week... After doing a bit of profiling, it turns out that the query execution time was dominated by lookups into the TDB node table to translate between a NodeId (a wrapper around a long) and a Node (an RDF node). These lookups happen as a result of calling Algebra.compatible() to test for compatibility between bindings on the LHS and RHS of a join (or, in our case, a MINUS, which has approximately the same logic).
To give an idea of the scale of the problem caused by the naive MINUS implementation, which loops over the RHS for each row of the LHS: on a single profiling run of an ASK query, I found that QueryIterMinus advanced the LHS iterator 9600 times looking for a row that was not matched on the RHS. On average, each pass over the RHS processed 4800 rows before finding a matching binding and rejecting the solution, for a total of about 46 million binding compatibility tests. With two variables in common between the LHS and RHS, each test costs 4 node table lookups, for a grand total of roughly 184 million node table lookups in this query. Thankfully, 99.99% of the lookups hit the in-memory cache (nothing was ever evicted), but there is still no way that many lookups are going to perform well. The query in question took ~5 minutes to complete.

The good news in all this is that some small changes to the query logic that reduce the number of compatibility tests yield a tremendous performance improvement. I tested the same query with an initial QueryIterMinus implementation that uses a fairly straightforward HashSet-based index pre-computed on the RHS bindings, projected to the common variables as determined by static analysis (a rough sketch is appended after my signature). Queries that had previously timed out after 15 minutes gave sub-second response times with a warm node table cache using this new implementation.

Beyond this simple change, there are more potential optimizations for situations where multiple iterations over a result set are inevitable (or where a server under heavy load suffers lots of node table cache misses and has to go to disk frequently). These optimizations depend on the fact that, for a given node table, two NodeIds are equal if and only if their mapped Nodes are also (syntactically) equal. Or, stated slightly differently: it is impossible for two Nodes that compare as equal via Object.equals() to map to different NodeId values. I believe this to be the case based on a cursory look at the node table source, but it should be verified by somebody more familiar with the code.

Once the equivalence between NodeId equality and Node equality is established (at least for NodeIds from the same transaction in the same node table), you can take advantage of it by detecting that the two bindings being tested for compatibility are backed by NodeIds from the same table. In that situation you can simply compare NodeId values instead of looking the nodes up in the table. Not only does this eliminate the node table lookups, it also simplifies the equality tests, which become simple integer comparisons instead of string comparisons (this idea is also sketched below). I think such a strategy could give a significant performance improvement for queries that make heavy use of join operations.

Regards,
Alex
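For concreteness, here is a rough sketch of the indexed MINUS idea, with the naive nested loop alongside for contrast. It uses plain Java collections as stand-ins for Jena's Binding/QueryIterator machinery (a binding is modelled as a map from variable name to node value), the class and method names are mine rather than Jena's, and it assumes the common variables are bound in every row, so compatibility reduces to equality of the projected bindings:

import java.util.*;

final class MinusSketch {

    // A binding is modelled as a map from variable name to node value.
    // project() keeps only the variables shared by both sides.
    static Map<String, String> project(Map<String, String> row, Set<String> vars) {
        Map<String, String> p = new HashMap<>();
        for (String v : vars)
            if (row.containsKey(v))
                p.put(v, row.get(v));
        return p;
    }

    // The current shape: re-scan the whole RHS for every LHS row,
    // O(|LHS| * |RHS|) compatibility tests (~46 million in the query above).
    static List<Map<String, String>> naiveMinus(List<Map<String, String>> lhs,
                                                List<Map<String, String>> rhs,
                                                Set<String> commonVars) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> left : lhs) {
            boolean matched = false;
            for (Map<String, String> right : rhs) {
                if (project(left, commonVars).equals(project(right, commonVars))) {
                    matched = true;
                    break;
                }
            }
            if (!matched)
                out.add(left);
        }
        return out;
    }

    // The indexed shape: one pass over the RHS to pre-compute a HashSet of
    // projections, then a single hash probe per LHS row, O(|LHS| + |RHS|).
    static List<Map<String, String>> indexedMinus(List<Map<String, String>> lhs,
                                                  List<Map<String, String>> rhs,
                                                  Set<String> commonVars) {
        Set<Map<String, String>> index = new HashSet<>();
        for (Map<String, String> right : rhs)
            index.add(project(right, commonVars));

        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> left : lhs)
            if (!index.contains(project(left, commonVars)))
                out.add(left);
        return out;
    }
}

With the index, the roughly 46 million compatibility tests above collapse into one HashSet insertion per RHS row plus one hash probe per LHS row, which is consistent with the sub-second timings I saw.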
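And a rough sketch of the NodeId fast path. The types are again simplified stand-ins rather than the actual TDB classes (the node table is reduced to a single lookup method, and a NodeId to a bare long), and the whole thing rests on the as-yet-unverified equivalence assumption above:

final class CompatibleSketch {

    // Stand-in for a TDB node table; the only operation needed here is the
    // (potentially expensive) NodeId -> Node translation.
    interface NodeTable {
        String lookup(long nodeId);
    }

    // A bound value tagged with the node table its NodeId came from.
    static final class BoundValue {
        final long nodeId;
        final NodeTable table;
        BoundValue(long nodeId, NodeTable table) {
            this.nodeId = nodeId;
            this.table = table;
        }
    }

    // Assumes: within one node table, NodeId equality <=> Node equality.
    static boolean compatible(BoundValue a, BoundValue b) {
        if (a.table == b.table) {
            // Same node table: a plain long comparison, no lookup at all.
            return a.nodeId == b.nodeId;
        }
        // Different tables (or unknown provenance): fall back to materializing
        // both nodes and comparing them, as happens today.
        return a.table.lookup(a.nodeId).equals(b.table.lookup(b.nodeId));
    }
}

Note that the fallback preserves the current behaviour whenever the same-table precondition cannot be established, so the fast path is purely an optimization.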