On Sun, 2006-07-23 at 22:09 +1000, Malcolm Tredinnick wrote:
> Whilst your suggestion is slightly faster for smaller cases, the problem
> is that it doesn't scale quite as well: there is an upper limit for the
> number of joins a database can do, for example.

For list-wise scaling, yes. I've once tried to join 25 tables with
MySQL, with pathetic result. However, I've managed to first join 12
tables and the rest in second run, but that's an ugly story.

Malcolm, my apologies for not applying indexes to the reln_class_student
table. It was the sequential scans that were killing the performance.

> So take those numbers (particularly 35 times faster) with a small grain
> of salt: if performance is critical for somebody's use case,
> benchmarking with realistic data on production systems is the real test
> case. Absolute times are dependent on a lot of things, including usage
> patterns.

Definitely. One could think that there are only four parameters
affecting the performance: sizes of the three tables and the length of
the list. It would be fairly easy to build a benchmark script that would
test the queries against the combinations of those four parameters. But
it's not that simple. The crucial thing is the structure of the
class-student connection, and that's not something you can express as a
single number and test combinations of. You could have students equally
spread over all the classes. Or you could have just 2% of popular
classes that concentrate 90% of students, etc.

However, I'd like to bring your attention to one thing. Multi-join
solution gives the planner opportunity to adjust the strategy of table
joining for _each_ join. With one-join solution the planner has to
decide only one strategy for all the rows it's going to process.

In terms of the diversity of cases and usage patterns, the multi-join
query has more potential of being _adaptive_.

That's only theory for now, I'd love to see some real production
database tests. How are Google guys doing it, by the way? They have the
“productionest” database ever, don't they?

> Not knocking your solutionas an idea, though: it's quite cute and
> definitely shows thinking outside the box.

Thanks.

If you fancy a new puzzle, try a query that finds a class which was
attended by all students from one list, and which wasn't attended by any
of the students from the second list.

Best regards,
Maciej

-- 
Maciej Bliziński <[EMAIL PROTECTED]>
http://automatthias.wordpress.com


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-users
-~----------~----~----~----~------~----~------~--~---

Reply via email to