On 15 July 2016 at 18:46, Mike Bayer <mike...@zzzcomputing.com> wrote:
> Here's the problem that cant be solved: > > 1. fetch rows 0-1000 > > 2. start loading Foo objects: > > Foo(1) -> eagerload Foo.related with 400 members > Foo(2) -> eagerload Foo.related with 500 members > Foo(3) -> eagerload Foo.related with the first 100 of 250 members > > 3. yield. That is, stop reading more rows. Send out Foo(1), Foo(2), > Foo(3). The application now can go nuts with these. It can read them, > write them, iterate through .related. > > Right there, we're broken. Foo(3) is out in the world with less than half > of its ".related" collection - they are still sitting on the cursor! > > Ok, I think I see what you're getting at but I don't think that a problem here. It looks like you have a query which joins all the relationships in one go. That's not what I'm doing, I'm relying on the fact that I can interleave the queries. On SQL level it would look like so: DECLARE my_cursor CURSOR FOR SELECT * FROM bigtable; FETCH 1000 FROM my_cursor; SELECT * FROM related_table WHERE bigtable_id IN (1,2,3,4,5,....) SELECT * FROM related_table2 WHERE bigtable_id IN (1,2,3,4,5,....) SELECT * FROM related_table JOIN second_order_relationship USING (pkey) WHERE bigtable_id IN (1,2,3,4,5,....) FETCH 1000 FROM my_cursor; SELECT * FROM related_table WHERE bigtable_id IN (11,12,13,14,15,....) SELECT * FROM related_table2 WHERE bigtable_id IN (11,12,13,14,15,....) SELECT * FROM related_table JOIN second_order_relationship USING (pkey) WHERE bigtable_id IN (11,12,13,14,15,....) <... repeat ...> (this may be a postgresql specific feature, not sure). Yes, this may mean that some of the related objects may be fetched multiple times, but this is not a problem for me (the whole thing is read-only anyway). What I need however is for the secondary queries to populate the relationships in the original BigTable objects. Vastly simpler is to use the recommended window functions to do pagination > of any scale with no chance of complex failures. > > A bit of googling suggests you are referring to this: https://bitbucket.org/zzzeek/sqlalchemy/wiki/UsageRecipes/WindowedRangeQuery which indeed looks very neat and much easier than what I was thinking. It doesn't say explicitly, but it looks like it will work transparently with eager loading. It basically does the above, but skips the cursor and replaces it with queries on ranges of the primary key (which is shorter and probably more efficient to boot). Thanks for the tip! -- Martijn van Oosterhout <klep...@gmail.com> http://svana.org/kleptog/ -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at https://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/d/optout.