On 31/01/12 14:50, Ozga, Rafal wrote:
Fair enough, the single comparison is indeed in O(size(algebra_expression))
but then doing it naively I would have to compare my tree with all the
already existing ones in the cache. And that's not really an option I think.
Unless there is some way of ordering the algebras so I could compare mine
with like logarithm or so of the number of already existing algebras in the
cache ?
But I guess there isn't any obvious way of doing so, so it seems that the
option for me is to implement an OpVisitor and traverse the algebra tree
modifying the subsequent nodes one by one, isn't ?
Speaking of NodeIsomorphismMap. How do you make the bgp (triple ?s ?p ?o)
isomorphic with the infinite number of patterns bgp (triple ?s' ?p' ?o')
that are different only by variable names ?
You can do better than that :-)
Approach 1:
Canonicalize patterns so the variables are allocated ?a1 ?a2 as
encounter in the walk of the algebra expression. This canonical pattern
can be used in a normal hash table. It's better to execute the
canonical form - makes rewriting the results easier.
Approach 2:
A special hash function where the hash function is var name insensitive
(position sensitive would be good but just missing vars will work) which
will bucket the incoming. Test within the bucket is still as you
describe but (hopefully!) much smaller set of queries. Just the way
predicate URIs are used is probably fairly distinguishing.
If you produce either isomorphic-by-var-bNode or the canonical form
code, please consider contributing it to Jena.
Andy