Re: efficient query for: "it's your birthday today"

Jigal van Hemert Fri, 19 Nov 2004 00:31:54 -0800

From: "Brent Baisley"
> You don't have to replicate the data, and you shouldn't. Since you will
> be searching on parts of the data, you may want to store it in parts.
> Then you can create indexes to "merge" the data for searching.


I'll be searching both on parts of the date as well as on the entire date.
Queries like "col > CURDATE() - INTERVAL 2 DAY" are also used; I don't see
how that is going to be fast if you store the date parts in seperate
columns.

> Now, if your table is only 10K records, that's pretty small and you
> probably won't see any performance difference using indexes since the
> table may be in cache. If the table is not going to get a lot bigger
> (i.e. 50K+ records), it may be easier just to make sure you have plenty
> of RAM in the machine.

The test machine currently has only 10K records, but we're designing the
rest of the application for 500K+ records...
That's why I wanted to have faster queries for this...

The actual lay-out of the database is a bit more complicated than the
queries may have indicated.
Since we don't know yet the actual data that is going to be stored we
decided that the database would best be storing each type of data in a
table. So we have tables for int, bigint, float, string, text and datetime
data (more tables can be added without a problem).

CREATE TABLE `param_datetime` (
  `id` int(11) NOT NULL default '0',
  `name` varchar(32) NOT NULL default '',
  `value` datetime default NULL,
  `match` datetime default NULL,
  `match2` datetime default NULL,
  `weight` tinyint(4) default NULL,
  KEY `id-name-value-match-match2` (`id`,`name`,`value`,`match`,`match2`)
) TYPE=InnoDB COMMENT='integer parameters';

`id` links to the id to which the data belongs
`name` is the name of the field
`value` contains the actual data
`match` and `match2` are used for storing targets, values to look for, etc.
`weight` is used for calculating the sort order

We've tested models for flat tables (a column for each parameter) and these
parameter tables, both as MyISAM and InnoDB. For high concurrency situations
the InnoDB tables outperformed MyISAM easily for larger tables. Parameter
tables were easier to expand compared to 'flat' tables: adding a column to a
500K+ record table is not fast.
Indexing is also more efficient with parameter tables in our case, since any
combination of properties may be used for searching. We have one limitation
however: you cannot search on more that 31 properties at the same time, due
to the limitation of the number of JOINs in MySQL.

We're currently running queries with up to 27 JOINs, but since they all use
an index, performance has been very constant. Clever caching of results is
also helping a lot to keep the load of the database server down to a
minimum.

I'll start to experiment with adding the date parts to the param_datetime
table and adding some indexes for these parts. Maybe this will speed this up
considerably...

Regards, Jigal.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Re: efficient query for: "it's your birthday today"

Reply via email to