Ah, a solution just came to me. If I have 10 categories and denormalize the user_id into each category table, I can reverse the queries and issue only 10 SELECTs, each filtered by user_id and joined with the master table on the pkey. I'd end up with the same number of rows selected, but only 10 queries issued. The denormalized user_id shouldn't pose any problems, and I can still cascade updates and deletes via the master table, which in turn cascades from the users table.
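Roughly what I have in mind, as a sketch only; Master, CategoryA, CategoryB and the column names are placeholders for the real schema:

from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Master(Base):
    __tablename__ = "master"
    id = Column(Integer, primary_key=True)         # sequenced pkey
    user_id = Column(Integer, nullable=False)      # FK to users (cascaded)
    category = Column(String(20), nullable=False)  # discriminator

class CategoryA(Base):
    __tablename__ = "category_a"
    id = Column(Integer, ForeignKey("master.id", ondelete="CASCADE"),
                primary_key=True)
    user_id = Column(Integer, nullable=False)      # denormalized copy
    a_field = Column(String(100))                  # category-specific

class CategoryB(Base):
    __tablename__ = "category_b"
    id = Column(Integer, ForeignKey("master.id", ondelete="CASCADE"),
                primary_key=True)
    user_id = Column(Integer, nullable=False)      # denormalized copy
    b_field = Column(Integer)

CATEGORY_CLASSES = [CategoryA, CategoryB]  # ten of these in practice

def load_all_for_user(session, user_id):
    """One SELECT per category table instead of one per master row."""
    pairs = []
    for cls in CATEGORY_CLASSES:
        pairs.extend(
            session.query(Master, cls)
                   .join(cls, cls.id == Master.id)
                   .filter(cls.user_id == user_id)
                   .all()
        )
    return pairs

With 10 categories that's 10 joined SELECTs per user, no matter how many rows come back.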

Thoughts?


.oO V Oo.


On 10/02/2011 04:12 PM, Vlad K. wrote:

I have an impedance mismatch problem, namely a model split across X categories, each with its own set of fields plus a set of fields common to all. I solved that with:

1. "master" table, contains the sequenced pkey and common fields, as well as "category" discriminator 2. individual category tables with fields specific to that category, and foreign key pkey to the master table.

I've been reading about Joined Table Inheritance and this seems like the right way to do it; however, I'm not quite sure I can use SQLAlchemy's polymorphic patterns. Here's why.

When I create a new model of a certain category I of course know which category it is, so that's not the problem. But when the system queries a model by ID, it does not know in advance which category the model belongs to. So whatever I do, I'm forced to issue two queries: first against the master table, then against the proper category table. AFAIK this is the only way to do it, unless there's some SQL trick that can join a table name dynamically based on a column value (and I don't mean triggers)?
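Concretely, the two-query lookup looks like this (continuing the placeholder classes from the sketch above):

# discriminator value -> category class; the contents are hypothetical
CATEGORY_MAP = {"a": CategoryA}

def load_by_id(session, model_id):
    master = session.query(Master).get(model_id)      # query #1: master row
    if master is None:
        return None
    detail_cls = CATEGORY_MAP[master.category]        # pick table by value
    detail = session.query(detail_cls).get(model_id)  # query #2: detail row
    return master, detail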

Anyway, the problem I'm having is when I query the whole set. Each master-category "pair" belongs to a user, and I have to query ALL the models that belong to that user. So I end up selecting everything from the master table that belongs to the user, and then, for each row, loading the corresponding category model from the appropriate category table. This means that with 10k rows I'd have to issue 10k SELECTs, on top of the master SELECT that pulls in all 10k master table rows (and from what I read, psycopg2 can't yield rows as they come; it has to buffer them all).
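In code, with the same placeholders as above (session and user_id assumed from context):

# one SELECT for the master rows, then one more SELECT per row
masters = session.query(Master).filter(Master.user_id == user_id).all()
pairs = []
for m in masters:  # 10k rows -> 10k additional SELECTs
    detail_cls = CATEGORY_MAP[m.category]
    pairs.append((m, session.query(detail_cls).get(m.id)))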

Thankfully, these kinds of queries are not done on user demand (real-time web experience) but by an automated dispatch system nightly (dump to XML). Still, I'm concerned that I'm doing something wrong if I have to issue 10k SELECTs for a single user, with only a few hundred users in the system!

Am I missing some design pattern here? Should I simply denormalize into one big table?


The backend is PostgreSQL 9.0.



Thanks!



