Mike, you're right - sorry.
I've been reading High Performance MySQL today and picked up some
great tips that will help a lot. I think the fundamental challenge
now is that the table contains a lot of timestamps, and querying
against these involves multiple range conditions, which makes indexing hard.
The "actions" table has the following columns (of relevance to the
example):
status_id
assignee_id
company_id
created_at
assigned_at
opened_at
updated_at
verified_at
due_at
solved_at
closed_at
Queries could be:
"Show all actions which are assigned to Tom, were created in
October and solved in November"
"Show all open actions which were opened before August, do not have
an assignee and were verified last week"
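In SQL terms, and assuming the table is called `actions` with the columns listed above (the literal IDs and date ranges below are made up for illustration), those two queries might look like:

```sql
-- "Assigned to Tom, created in October, solved in November"
SELECT *
  FROM actions
 WHERE assignee_id = 42                                    -- Tom (hypothetical id)
   AND created_at >= '2009-10-01' AND created_at < '2009-11-01'
   AND solved_at  >= '2009-11-01' AND solved_at  < '2009-12-01';

-- "Open, opened before August, no assignee, verified last week"
SELECT *
  FROM actions
 WHERE status_id = 1                                       -- 'open' (hypothetical id)
   AND opened_at < '2009-08-01'
   AND assignee_id IS NULL
   AND verified_at >= '2009-07-06' AND verified_at < '2009-07-13';
```

Each query mixes equality predicates with ranges on two independent date columns, and a B-tree index can only exploit one range per index - which is exactly why this is hard to index.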
Queries like these, which combine easily indexable fields (status_id,
assignee_id, company_id) with range conditions on several different
date columns, are what's difficult. The table holds about 2,500,000
records and grows at a daily rate of about 50,000 records (and that
number is growing). Once an action has been closed, it gets status
"closed" and is no longer of interest; about 70% of the records in the
table have status "closed".
I think what I'm looking for now is some way to encode the different
date values into a single column which can be indexed, with the value
calculated and updated by a background job. This will cost some
precision, but I hope that's acceptable. Otherwise I'm back to
considering alternative index/query mechanisms.
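One minimal sketch of that encoding idea: reduce each timestamp to its year/month so that two range predicates collapse into a single equality on one indexed column. The column name, the packing scheme, and the IDs here are all assumptions, not a tested design:

```sql
-- Pack "month created" and "month solved" into one integer, e.g.
-- YEAR(created_at)*10000 + MONTH(created_at)*100 + MONTH(solved_at),
-- maintained by the background job.
ALTER TABLE actions ADD COLUMN created_solved_key INT;
CREATE INDEX idx_actions_csk ON actions (assignee_id, created_solved_key);

UPDATE actions
   SET created_solved_key = YEAR(created_at) * 10000
                          + MONTH(created_at) * 100
                          + MONTH(solved_at)
 WHERE solved_at IS NOT NULL;

-- "Assigned to Tom, created in October 2008, solved in November"
-- then becomes a single equality lookup:
SELECT *
  FROM actions
 WHERE assignee_id = 42            -- hypothetical id
   AND created_solved_key = 20081011;
```

The precision cost is that you can only ask month-granularity questions through this column, and each supported combination of date columns needs its own encoded key.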
Does my problem make a little more sense now? Thanks.
Morten
Let's say I would like to see all actions that were created in October
and solved in November.
On Jul 12, 2009, at 3:54 PM, mos wrote:
Morten,
Perhaps you could also add how many rows are in the table, how
many rows are added each day, what are the column types, and what do
the search queries look like?
Mike
At 11:39 AM 7/12/2009, Morten wrote:
Hi,
I'm working on a table with about 12 columns against which arbitrary
queries must perform really well. Currently there are a lot of indexes
on the table, but I'm hitting some problems - and adding more indexes
seems like a slippery slope (there are ~15 multi-column indexes
already, and I'd like that number reduced).
So I'm looking for a way out and I'm currently considering:
* Building a memory table on top of the existing table
* Indexing the data with Sphinx and throwing the queries at Sphinx instead
* Using a different "in-memory-DB" like Tokyo Cabinet for the queries
* Building a series of "reporting tables" which each handle a subset
of the supported queries
All of the solutions would keep the current table as the authoritative
copy, and a couple of minutes' lag is acceptable.
I'm tempted to go for the memory table and update that depending on
which rows have been updated in the parent table since last update.
Eliminating duplicates could be a challenge, unless I build a new
table for each update and then "rename" the tables - but that's
costly in terms of memory.
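For what it's worth, the rebuild-and-rename variant can at least be made atomic, since MySQL's RENAME TABLE swaps several tables in one operation. A sketch, with illustrative table names and a hypothetical status id:

```sql
-- Rebuild a fresh copy off-line (only non-closed rows are of interest):
CREATE TABLE actions_cache_new LIKE actions_cache;
INSERT INTO actions_cache_new
SELECT * FROM actions WHERE status_id <> 4;   -- 4 = 'closed' (hypothetical)

-- Swap the new table into place atomically, then drop the old copy:
RENAME TABLE actions_cache     TO actions_cache_old,
             actions_cache_new TO actions_cache;
DROP TABLE actions_cache_old;
```

Readers never see a half-built table this way, though, as noted, both copies do exist side by side during the rebuild, which is what costs the memory.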
What do people usually do in this situation? Any other solutions to
consider?
Thanks,
Morten
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=mo...@fastmail.fm