Thanks for the clarification David,
Jacques
From: "David E Jones" <david.jo...@hotwaxmedia.com>
On Jan 16, 2009, at 4:39 AM, madppiper wrote:
Hey everybody,
I just wanted to get a discussion going on nested sets. In the past
few
weeks, I had to work alot with the ofbiz category structure and I
really got
the feeling that the way the productcategories are setup, it takes
an awful
lot of time to run through the categories, slowly querying from
level to
level and work your way down to the data you need. I got the feeling
that we
could really improve the data structure behind that by introducing
nested
trees. I have done a similar thing for a non-ofbiz-based application
and the
query results are really fast that way (the queries themselves are
rather
simple to put together to).
In case you need some additional information on this topic:
http://dev.mysql.com/tech-resources/articles/hierarchical-data.html
(just
scroll down to nested sets).
Yes, this is a classic approach, and a pretty well documented one. The
author sited in that mysql article (Joe Celko) has a number of great
books on data structures, especially for relational databases, and
worked at a school near Salt Lake City, Utah not far from where I used
to live there. I had the pleasure of meeting him and Terry Halpin
there, a real treat since they've both done some great work, and
continue to do good work and publish places like Intelligent
Enterprise and the BPM Forum.
It does have various limitations though (too simplistic) and is not
capable of functionally equivalent to the category rollup model in
OFBiz. Some reasons:
1. not a tree: categories are allowed to have multiple parents (be in
multiple categories) as is thus a graph and not a tree and this model
does not work with graphs, only with trees
2. the category rollups are effective dated so a category being a sub-
category of another can change at any time without a database update,
just the passing of time
3. categories are sorted by a sequence field on the rollup, and that
sequence can be different for different parent categories
This pattern also requires a large number of updates to change
anything in the category rollup. To maintain data consistency it is
necessary to lock the entire category table, which is bad in large
organizations with many product marketers or others dealing with
categories.
The main reason this isn't an issue in OFBiz is that all category
lookups are cached in public facing code.
On that note, this could be used in OFBiz but only as an alternative
form of caching. In other words, we could have a separate table that
models a top-down view of the categories with duplicate records as
needed when a category is in multiple places in the tree (ie use a key
for that entity that other than the productCategoryId).
The real question is, would that help performance at all? The only way
to tell is to try to implement and see how it compares to the current
caching. My guess based on similar efforts is that the current caching
approach will still be far faster so I've never even bothered with
implementing this.
On a different note, I also noticed that the macro within
sidedeepcategory.ftl is awfully complicated, possibly much more
complicated
than it has to be and I'd be interested in approving the very same. I
couldn't get my head around the contentWrapper class, but I think that
calling the .getRelated function, could do the trick also and
possibly be a
simpler solution...
As always, if your changes don't break existing functionality then
your contributions will be welcome.
-David