Well, I finally had the time to try this out, and after a little fiddling, I cut my time in half. I still think there must be further optimization options (what David suggested probably won't work for our purposes, but I may look into it), as when I only load the first level of data, which is 6K, it takes 0.15 seconds, but loading the complete hieararchy, which is just 2 more 2k, it takes 2 seconds (note that this was a slightly different data set than what I used before, where it was taking 9 seconds).
But it's nice to actually see some results, and given that these measurements are from my laptop, which does all disk IO really slowly, I think it's probably all the improvement I need for the moment. Thanks! Alec On Sep 5, 9:55 pm, Michael Bayer <mike...@zzzcomputing.com> wrote: > On Sep 5, 2010, at 7:41 PM,AlecMunrowrote: > > > > > > > > > > > Hi list, > > I've got a tree-type of data, representing somewhat of a method call > > history. When this is loaded, in an instance where it represents about > > 60 entries total, up to 2 deep, representing 14k of data, it takes > > about 9 seconds (up from ~0). Now, my laptop HDD where I am testing > > now is pretty terrible, but there's probably also some kind of > > optimization I could do. > > When I retrieve the data, I convert it to JSON, with a call like this > > (where json_test_case_method is a recursive function I defined): > > > "methods":[json_test_case_method(method) for method in > > test_case.methods]} > > > The hierarchical relationship is defined thusly: > > > TestCaseMethod.children = relation(TestCaseMethod, cascade="all", > > backref=backref("parent", > > > remote_side=[TestCaseMethod.id]), > > order_by=TestCaseMethod.id) > > > I've done zero SQLAlchemy optimization in my career thus far, and very > > little ORM optimization (a bit with Hibernate several years ago), so > > any advice is appreciated. :) > > You want to use "subqueryload()" or alternatively "joinedload()". The > former is probably best as it will emit one query per level of hierarchy, for > the whole load, fetching only as many rows as there are nodes. joinedload() > can load the whole thing in one query but it will see many duplicate parent > rows go by in result sets, which most DBAPIs pull over the wire > unconditionally even if skipped. > > That said, even if you were emitting a distinct query for every node (which > would be the case if your tree were one root node 60 levels deep), 9 seconds > sounds very slow. On my macbookpro, I just ran a test that emits 60 distinct > ORM queries for unloaded objects, each with a string field 1000 characters in > length, and it takes .03 seconds. So you might want to poke around at your > network or some other logic that's taking way too much time in there > (profiling would be a good start - here's a big post I wrote about it > -http://stackoverflow.com/questions/1171166/how-can-i-profile-a-sqlalc...). > > > > > > > > > > > Thanks, > >Alec > > > -- > > You received this message because you are subscribed to the Google Groups > > "sqlalchemy" group. > > To post to this group, send email to sqlalch...@googlegroups.com. > > To unsubscribe from this group, send email to > > sqlalchemy+unsubscr...@googlegroups.com. > > For more options, visit this group > > athttp://groups.google.com/group/sqlalchemy?hl=en. -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To post to this group, send email to sqlalch...@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.