Daniel Fetchinson wrote:
[off but interesting topic]
<cut>

What would be the corresponding database layout that would scale and I
could get the total number of legs in the zoo or total number of
animals in the zoo without join(s)?

Cheers,
Daniel

[/off but interesting topic]


That all comes down to the keywords, efficiency, robustness and performance and you can only pick one of them. So which two can you sacrifice? The good news is that it is only theoretical, if you have your requirements (i.e. the query with the right results has to return within an acceptable time for the user not to get frustrated) who cares if it is not as fast as it theoretically could be? Especially if this means you can sacrifice the theoretically performance for easier deployment and maintenance.

To get back to the layout question, if I need to have a count of the legs at any given time, I would like to see the business process that requires this operation. Using the business process I could probably narrow down the scope quite a bit like, it is not necessary to have a precise count at time *now* but it is good enough to have an exact count which is no more then 4 seconds old but the query does need to return within 50 ms.

One solution could be to build a central data warehouse where all info is stored in. Then have satellite db's that does ETL syncing with the central one. The layout of the satellites contain an optimised table version for the queries you want to throw at it.

If you need to scale it, that is a question of adding another satellite.
Efficiency is right out of the window because you store in essence just multiple copies of the same data just in another order. However it is robust (by having multiple copies) and has a predictable performance figure.

hth
--
mph
--
http://mail.python.org/mailman/listinfo/python-list

Reply via email to