Hello Tim,
Pardon the questions, but I haven't had the need to use denormalization
yet, so:
Tim Wintle wrote:
> /* Table A */
> CREATE TABLE TableA (
> project_id BIGINT NOT NULL,
> cost INT,
> date DATETIME,
> PRIMARY KEY (project_id, date)
> );
> /* Table projects */
> CREATE TABLE projects (
> client_id BIGINT NOT NULL,
> project_id BIGINT NOT NULL,
> INDEX(client_id)
> );
> .... now the index on TableA has been optimised for queries against date
> ranges on specific project ids which should more or less be sequential
> (under a load of other assumptions) - but that reduces the efficiency of
> the query under a join with the table "projects".
> If you denormalise the table, and update the first index to be on
> (client_id, project_id, date) it can end up running far more quickly -
IOW you basically merged the tables as follows?
CREATE TABLE projects (
client_id BIGINT NOT NULL,
project_id BIGINT NOT NULL,
cost INT,
date DATETIME,
INDEX(client_id, project_id, date)
);
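Just to check that I understand what that index buys you, I imagine the
kind of query it is meant to serve looks roughly like the pair below (the
client id and the dates are made up, and this is only my guess at the
shape of your queries):

/* with the original two tables (projects being just the
   client_id -> project_id mapping) this needs a join: */
SELECT SUM(a.cost)
FROM TableA a
JOIN projects p ON p.project_id = a.project_id
WHERE p.client_id = 42
  AND a.date >= '2010-01-01' AND a.date < '2010-02-01';

/* with the merged projects table above there is no join at all; the
   client_id prefix of the (client_id, project_id, date) index narrows
   the scan to one client's rows: */
SELECT SUM(cost)
FROM projects
WHERE client_id = 42
  AND date >= '2010-01-01' AND date < '2010-02-01';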
From what you write further on in the mail, I conclude that you have not
eliminated the first table, just made the projects table look like I
wrote above, right? (And used stored procedures to make sure that the
client_id and project_id data stays consistent between the two tables?)
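Purely to make sure I'm picturing the right thing, I imagine such a
procedure might look roughly like this (the name, the parameters and the
plain MySQL procedure syntax are all my guesses, not anything you have
described):

DELIMITER //
CREATE PROCEDURE add_cost(IN p_client_id BIGINT,
                          IN p_project_id BIGINT,
                          IN p_cost INT,
                          IN p_date DATETIME)
BEGIN
    /* write the normalized row ... */
    INSERT INTO TableA (project_id, cost, date)
    VALUES (p_project_id, p_cost, p_date);
    /* ... and the denormalized copy, so both tables stay consistent */
    INSERT INTO projects (client_id, project_id, cost, date)
    VALUES (p_client_id, p_project_id, p_cost, p_date);
END//
DELIMITER ;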
Have you had some other joins on denormalized keys? E.g. how does a join
of a hypothetical TableB with projects on projects.client_id behave with
such big tables? (Because I assume you obviously can't denormalize
absolutely everything, so this implies the need to do some joins on
denormalized columns like client_id.)
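For example, something along these lines, where TableB and its columns
are entirely hypothetical and only meant to illustrate what I mean by a
join on a denormalized column:

/* hypothetical per-client table */
CREATE TABLE TableB (
  client_id BIGINT NOT NULL,
  amount INT,
  INDEX(client_id)
);

/* a join on the denormalized client_id column */
SELECT p.project_id, p.cost, b.amount
FROM projects p
JOIN TableB b ON b.client_id = p.client_id;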
> assuming you can access the first mapping anyway -
I'm not clear on what you mean here.
> so you're still
> storing the first table, with stored procedures to ensure you still have
> correct data in all tables.
Regards,
mk