I was just reading that EXT3 uses an HTREE because it's more efficient than a BTREE; jdbm has an HTREE implementation.
Mark
Edson Tirelli wrote:
Mark,
My feeling is that once someone is using an "ordering" constraint
(>, >=, <, <=), it means the attribute will have many possible
values (not a small discrete set of possible values). So the issue is
not really the overhead of a tree vs. a hashmap, but whether the
overhead of a tree pays off against the overhead of not indexing at
all in cases like that. My feeling is that it does pay off and is
worth at least a try.
I'm saying this because what I think you are referring to when you
say a hashmap+training-data approach (or hints, as suggested by
Michael) will not address the general case.
I mean, when you have a rule like:
when
A( $var : attr )
B( attr2 > $var )
then
Using a hashmap will lead to a bucket-count explosion that may
cause the hashmap to perform worse than the tree. Also, realize that
in the above case a naive approach will add a single B fact to
multiple buckets, since it may match more than one A with
different "attr" values.
So that's why I think a tree is better for the general case of
"ordering" constraints. Specific cases may be handled in a different way.
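To make the tree idea concrete, here is a minimal sketch (not the actual Drools index code; class and fact names are hypothetical) using java.util.TreeMap, whose tailMap view returns every entry with a key strictly greater than the join value, which is exactly the B( attr2 > $var ) lookup:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.TreeMap;

public class RangeIndexSketch {

    // Index B facts by attr2 in a sorted tree (a red-black tree in the JDK),
    // one bucket per distinct key, so each fact is stored exactly once.
    static void addFact(TreeMap<Integer, List<String>> index, int attr2, String fact) {
        index.computeIfAbsent(attr2, k -> new ArrayList<>()).add(fact);
    }

    public static void main(String[] args) {
        TreeMap<Integer, List<String>> index = new TreeMap<>();
        addFact(index, 10, "B1");
        addFact(index, 20, "B2");
        addFact(index, 30, "B3");

        // For A( $var : attr ) with attr == 15, find all B with attr2 > 15
        // in O(log n) + output size, without duplicating facts across buckets.
        int var = 15;
        Collection<List<String>> matches = index.tailMap(var, false).values();
        System.out.println(matches); // [[B2], [B3]]
    }
}
```

The point is that the tree gives one bucket per distinct value and answers any range query over them, whereas a hashmap would need a bucket per (fact, matching-$var) pair, which is the explosion described above.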
I know htrees only by name. Any specific feature you think may be
useful for us?
[]s
Edson
Mark Proctor wrote:
I wonder if the overhead of a btree (btw, have you seen htrees?)
compared to a hashmap is worthwhile. I expect it is, as join ordering
can make a big difference.
Mark
Edson Tirelli wrote:
Mark,
The approach I was looking into at that time was not feasible because
we used to keep the order of asserted facts/tuples on a per-node basis. With
the core changes you made in 3.1, we can implement range ordering by
replacing the current hashmap index with a tree index. No need for
training data, in my understanding.
Maybe we will need a composed-index approach for some cases,
but the general solution idea is simple.
[]s
Edson
Mark Proctor wrote:
Actually, I was just thinking about some stuff Edson has done.
With solvers we know the available data and ranges, right? We can
use this to order indexes. I know this was something Edson looked
into, but without training data we couldn't make it worthwhile,
and the same goes for custom indexing. So we can start to incorporate
those to get faster joins for known data sets.
Mark
Geoffrey De Smet wrote:
The more I learn from JCHS (or prolog for that matter),
the more I am starting to think that this is a different way of
solving.
1) JCHS/prolog looks like (or is) declarative solving.
2) Taseree is actually more hybrid; the general idea behind it is:
- Drools (declarative programming) is very easy for evaluation
but very difficult for solving.
- Local/tabu search (procedural programming) is easy for solving
but difficult for evaluation.
Both have their advantages and disadvantages; for example,
local search is generally faster but doesn't recognize the optimal
solution.
To me it seems they are both interesting to implement,
and there must be some common ground too.
Should we hold a conference call about it this weekend?
It would be a good idea to compare JCHS and Taseree on a couple of
problems, like the tt problem:
http://mat.gsia.cmu.edu/TOURN/
---------------------------------------------------------------------
To unsubscribe from this list please visit:
http://xircles.codehaus.org/manage_email