I'd like to introduce a package I've been working on for a little while now: SQLAlchemy-ORM-tree, “an implementation for SQLAlchemy- based applications of the nested-sets/modified-pre-order-tree- traversal technique for storing hierarchical data in a relational database.” It's gone through a couple of semi-public releases already, but now for the first time the API is complete and semi-stable, and I would like to get some feedback from developers on this list who have used or would like to use hierarchical data in an RDBMS through SQLAlchemy. Some example code to show it in action:
from sqlalchemy_tree import TreeManager node_table = Table('node', metadata, Column('id', Integer, primary_key=True), Column('parent_id', Integer, ForeignKey('node.id'))) class Node(object): tree = TreeManager(node_table) mapper(Node, node_table, properties={ 'parent': relationship(Node, backref = backref('children'), remote_side = node_table.c.id), }) Node.tree.register() root = Node() session.add(root) session.flush() child = Node() child.parent = root session.commit() >>> child.tree.is_leaf_node True >>> child.tree.query_ancestors().one() == root True >>> child2 = Node() >>> Node.tree.insert(child2, root, Node.tree.POSITION_LAST_CHILD) >>> session.flush() >>> root.tree.query_leaf_nodes() \ .order_by(Node.tree) \ .all() == [child, child2] True >>> child2.parent = None >>> session.flush() >>> child2.tree.is_root_node True >>> The PyPI page: http://pypi.python.org/pypi/SQLAlchemy-ORM-tree and github (pull requests accepted): https://github.com/rokusigma/sqlalchemy-orm-tree Some things to note about the project as it is now: I am operating under the principles of “1) make it work; 2) make it beautiful; then finally 3) make it fast”. It works, and to prove it I've written over 300 unit tests in 3.4kLOC covering the expected behavior of all API entry points. Now I'm soliciting the community's input in making it “beautiful”--making sure the API is both functional and elegant, refactoring areas of the code base that need it, making algorithmic changes, and generally ensuring that the SQLAlchemy-ORM-tree package is useful to anyone and everyone working with hierarchical data. I'll then concern myself with low-level modifications for speed. Some ugly warts I'm aware of (and reasons it's not yet 1.0): * Making any changes to the structure of the tree requires flushing before the various tree properties are updated (tree_id, left, right, depth, etc.), as these updates are handled by a mapper before_flush event handler and session (before/after)_(insert/update/delete) event handlers. More subtlety, the introspective APIs `is_leaf_node`, `is_descendant_of()`, etc. may possibly return incorrect results until the session is flushed as they rely upon these tree property values. I'm considering having `insert()` make changes to the database directly and immediately without waiting for a flush event. * There are no bulk insert, update, or delete operations. Within the mapper event handlers each update, insert, and delete is handled in turn, with all session objects updated accordingly at each stage. I could use some input as to how best to structure a bulk-load/bulk- update API from the user's perspective, as I lack experience in this area. * The tree manager needs access to the table at initialization time... which in the declarative style means that the tree manager has to be monkey-patched into the class object after it is defined. Suggestions for improving this would be appreciated. * The API for both inserting nodes AND updating the location of existing nodes is `insert()`, the analogy being inserting into a specific position in a list. It's confusing however as one uses it for updates as well as insertions, and it also has nothing to do with the SQL INSERT statement. `update()` or `move()` wouldn't be any better for the same or similar reasons. Nomenclature suggestions would be much appreciated. * I just today learned about Tropashko's nested interval tree encoding using continued fractions. I admit that I don't (yet) fully understand the underlying maths, but appears that switching to continued fractions over integers for the left/right fields could lead to significant algorithmic gains in performance. Thankfully this could be a transparent switch with no external API changes. Happy hacking, Mark -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.