Thanks for the clarification David,

Jacques

From: "David E Jones" <david.jo...@hotwaxmedia.com>

On Jan 16, 2009, at 4:39 AM, madppiper wrote:


Hey everybody,

I just wanted to get a discussion going on nested sets. In the past few weeks, I had to work a lot with the OFBiz category structure, and I really got the feeling that, the way the product categories are set up, it takes an awful lot of time to run through the categories, slowly querying from level to level and working your way down to the data you need. I got the feeling that we could really improve the data structure behind that by introducing nested sets. I have done a similar thing for a non-OFBiz-based application and the query results are really fast that way (the queries themselves are rather simple to put together too).

In case you need some additional information on this topic:
http://dev.mysql.com/tech-resources/articles/hierarchical-data.html (just
scroll down to nested sets).
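
For readers unfamiliar with the model, here is a minimal sketch of nested sets in Python. The category names and the helper functions `number_nested_set` and `descendants` are hypothetical and only illustrate the idea; nothing here is OFBiz code. Each node gets a left and right number from a depth-first walk, and a whole subtree can then be found with a single range comparison instead of a level-by-level walk.

```python
# Hypothetical category tree (parent -> ordered children); not OFBiz data.
TREE = {
    "ROOT": ["ELECTRONICS", "BOOKS"],
    "ELECTRONICS": ["PHONES", "LAPTOPS"],
    "BOOKS": [],
    "PHONES": [],
    "LAPTOPS": [],
}


def number_nested_set(tree, root):
    """Assign (lft, rgt) bounds to every node with one depth-first walk."""
    bounds = {}
    counter = 1

    def visit(node):
        nonlocal counter
        lft = counter          # number handed out on the way down
        counter += 1
        for child in tree[node]:
            visit(child)
        bounds[node] = (lft, counter)  # number handed out on the way back up
        counter += 1

    visit(root)
    return bounds


def descendants(bounds, node):
    """All nodes whose bounds lie strictly inside the bounds of `node`."""
    lft, rgt = bounds[node]
    inside = [n for n, (l, r) in bounds.items() if lft < l and r < rgt]
    return sorted(inside, key=lambda n: bounds[n][0])


BOUNDS = number_nested_set(TREE, "ROOT")
print(BOUNDS["ELECTRONICS"])               # (2, 7)
print(descendants(BOUNDS, "ELECTRONICS"))  # ['PHONES', 'LAPTOPS']
```

In SQL the same lookup would be a single range condition on the two columns, which is where the speed of the approach comes from.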

Yes, this is a classic approach, and a pretty well documented one. The author cited in that MySQL article (Joe Celko) has a number of great books on data structures, especially for relational databases, and worked at a school near Salt Lake City, Utah, not far from where I used to live. I had the pleasure of meeting him and Terry Halpin there, a real treat since they've both done some great work, and continue to do good work and publish in places like Intelligent Enterprise and the BPM Forum.

It does have various limitations though (too simplistic) and is not functionally equivalent to the category rollup model in OFBiz. Some reasons:

1. not a tree: categories are allowed to have multiple parents (be in multiple categories), so the structure is a graph and not a tree, and this model does not work with graphs, only with trees

2. the category rollups are effective dated, so a category being a sub-category of another can change at any time without a database update, just the passing of time

3. categories are sorted by a sequence field on the rollup, and that sequence can be different for different parent categories
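
The three points above can be sketched together in Python. This is an illustration only: the `Rollup` record, its field names, and the sample data are assumptions loosely modeled on the rollup described here, not the actual OFBiz entity definition. The same category sits under two parents, one of those links expires on its own, and the ordering differs per parent.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Rollup:
    """One parent/child link; field names are illustrative assumptions."""
    category_id: str
    parent_id: str
    from_date: datetime
    thru_date: Optional[datetime]  # None means the link never expires
    sequence_num: int              # ordering within this parent only


START = datetime(2009, 1, 1)
ROLLUPS = [
    Rollup("PHONES", "ELECTRONICS", START, None, 1),
    Rollup("LAPTOPS", "ELECTRONICS", START, None, 2),
    Rollup("BOOKS", "GIFTS", START, None, 1),
    # PHONES has a second parent, a different sequence there, and an end date.
    Rollup("PHONES", "GIFTS", START, datetime(2009, 2, 1), 2),
]


def children_at(rollups, parent_id, moment):
    """Children of `parent_id` that are active at `moment`, in sequence order."""
    active = [
        r for r in rollups
        if r.parent_id == parent_id
        and r.from_date <= moment
        and (r.thru_date is None or moment < r.thru_date)
    ]
    return [r.category_id for r in sorted(active, key=lambda r: r.sequence_num)]


print(children_at(ROLLUPS, "GIFTS", datetime(2009, 1, 16)))  # ['BOOKS', 'PHONES']
print(children_at(ROLLUPS, "GIFTS", datetime(2009, 3, 1)))   # ['BOOKS']
```

No row changed between the two calls; only the moment passed in did. A single pair of left/right numbers per category cannot express any of these three properties.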

This pattern also requires a large number of updates to change anything in the category rollup. To maintain data consistency it is necessary to lock the entire category table, which is bad in large organizations with many product marketers or others dealing with categories.
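
To make the update cost concrete, here is a small Python sketch of inserting one leaf into a nested set. The bounds and the `insert_leaf` helper are hypothetical, chosen only for illustration. Every existing number at or after the insertion point has to shift by two, so a single new category can touch most of the table.

```python
# Hypothetical nested-set bounds (lft, rgt) for a five-node tree.
BOUNDS = {
    "ROOT": (1, 10),
    "ELECTRONICS": (2, 7),
    "PHONES": (3, 4),
    "LAPTOPS": (5, 6),
    "BOOKS": (8, 9),
}


def insert_leaf(bounds, parent, new_node):
    """Insert `new_node` as the last child of `parent`.

    Returns the new bounds and the number of existing rows that changed.
    """
    _, parent_rgt = bounds[parent]
    updated = {}
    touched = 0
    for node, (lft, rgt) in bounds.items():
        # Open a gap of two numbers at the parent's old right bound.
        new_lft = lft + 2 if lft >= parent_rgt else lft
        new_rgt = rgt + 2 if rgt >= parent_rgt else rgt
        if (new_lft, new_rgt) != (lft, rgt):
            touched += 1
        updated[node] = (new_lft, new_rgt)
    updated[new_node] = (parent_rgt, parent_rgt + 1)
    return updated, touched


new_bounds, touched = insert_leaf(BOUNDS, "PHONES", "TABLETS")
print(touched)                # 5: every existing row was rewritten
print(new_bounds["TABLETS"])  # (4, 5)
```

In a database these shifts have to happen atomically, which is why concurrent editors end up serialized behind a table-wide lock.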

The main reason this isn't an issue in OFBiz is that all category lookups are cached in public-facing code.

On that note, this could be used in OFBiz but only as an alternative form of caching. In other words, we could have a separate table that models a top-down view of the categories, with duplicate records as needed when a category is in multiple places in the tree (i.e. use a key for that entity other than the productCategoryId).
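
One way such a top-down table might be built is sketched below in Python. Everything here is an assumption made for illustration: the sample graph, the `flatten` helper, and the choice of a path string as the key are not taken from OFBiz. A category with two parents simply yields two rows, each with its own key.

```python
# Hypothetical category graph (parent -> children); PHONES has two parents.
GRAPH = {
    "ROOT": ["ELECTRONICS", "GIFTS"],
    "ELECTRONICS": ["PHONES"],
    "GIFTS": ["PHONES"],
    "PHONES": [],
}


def flatten(graph, root):
    """Expand an acyclic graph into (path_key, category_id) rows, top down."""
    rows = []

    def visit(node, path):
        node_path = path + (node,)
        # The path, not the category id, identifies the row.
        rows.append(("/".join(node_path), node))
        for child in graph[node]:
            visit(child, node_path)

    visit(root, ())
    return rows


for key, category_id in flatten(GRAPH, "ROOT"):
    print(key, category_id)
```

The sketch assumes the graph has no cycles and ignores effective dates, so a real version would have to be rebuilt or invalidated as rollups start and expire, much like the existing cache.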

The real question is, would that help performance at all? The only way to tell is to try implementing it and see how it compares to the current caching. My guess, based on similar efforts, is that the current caching approach will still be far faster, so I've never even bothered with implementing this.

On a different note, I also noticed that the macro within
sidedeepcategory.ftl is awfully complicated, possibly much more complicated
than it has to be, and I'd be interested in improving it. I
couldn't get my head around the contentWrapper class, but I think that
calling the .getRelated function could also do the trick and possibly be a
simpler solution...

As always, if your changes don't break existing functionality then your contributions will be welcome.

-David

