Mike, you're right - sorry.

I've been reading High Performance MySQL today and picked up some great tips that will help a lot. I think the fundamental challenge now is that the table contains a lot of timestamps, and querying against them involves multiple range conditions, which makes indexing hard.

The "actions" table has the following columns (of relevance to the example):

  status_id
  assignee_id
  company_id
  created_at
  assigned_at
  opened_at
  updated_at
  verified_at
  due_at
  solved_at
  closed_at

Queries could be:

"Show all actions which are assigned to Tom, were created in October and solved in November"
"Show all open actions which were opened before August, do not have an assignee and were verified last week"

Queries like these, which combine easily indexable fields (status_id, assignee_id, company_id) with multiple conditions on different ranges, are what's difficult. The table holds about 2,500,000 records and grows by about 50,000 records a day (and that rate is increasing). Once an action has been closed it gets status "closed" and is no longer of interest; about 70% of the records in the table have status "closed".
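The indexing difficulty can be sketched like this (illustrative, not from the thread): a composite B-tree index such as (created_at, solved_at) can seek on the first range, but inside that range every entry still has to be examined for the second condition.

```python
# Sketch: why two range predicates defeat one composite B-tree index.
# With an index on (created_at, solved_at), the tree can seek to the
# edges of the created_at range, but within it every entry must still
# be checked against the solved_at range.
from bisect import bisect_left, bisect_right

# (created_day, solved_day) pairs, kept sorted as the index would be
index = sorted([(1, 40), (2, 10), (2, 45), (3, 50), (9, 41)])

# Seek phase: created_day in [1, 3] -- one binary search at each end
lo = bisect_left(index, (1,))
hi = bisect_right(index, (3, float("inf")))

# Filter phase: solved_day in [40, 50] checks every row in between
hits = [row for row in index[lo:hi] if 40 <= row[1] <= 50]
```

Here the seek narrows the scan to four entries but still has to touch all of them to keep three; with millions of rows and wide first ranges, that filter phase is what hurts.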

I think what I'm looking for now is some way to encode the different date values into a single indexable column, whose value gets calculated and updated by a background job. This will cost some precision, but I hope that's acceptable. Otherwise I'm back to considering alternative index/query mechanisms.
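One way such an encoding could look (a hypothetical sketch, not from the thread — the function names and month granularity are assumptions): bucket each timestamp to YYYYMM and pack the buckets into one integer, so "created in October and solved in November" becomes a single equality lookup on the encoded column.

```python
# Sketch: pack coarse date buckets into one indexable integer.
# Granularity is the month (YYYYMM), so precision is lost by design;
# a background job would recompute this column whenever a row changes.
from datetime import datetime

def month_bucket(ts):
    """YYYYMM bucket, or 0 when the timestamp is not set yet."""
    return ts.year * 100 + ts.month if ts else 0

def encode_action_dates(created_at, solved_at):
    # created bucket in the high digits, solved bucket in the low
    # digits; one equality match then answers "created in month X
    # and solved in month Y".
    return month_bucket(created_at) * 1_000_000 + month_bucket(solved_at)

code = encode_action_dates(datetime(2009, 10, 5), datetime(2009, 11, 2))
# A not-yet-solved action encodes with 0 in the low digits, so
# "created in October, not solved" is also a single equality lookup.
```

More timestamps could be packed the same way at the cost of a wider integer, and week-level buckets would trade index size for precision.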

Does my problem make a little more sense now? Thanks.

Morten





On Jul 12, 2009, at 3:54 PM, mos wrote:

Morten,
Perhaps you could also add how many rows are in the table, how many rows are added each day, what are the column types, and what do the search queries look like?

Mike

At 11:39 AM 7/12/2009, Morten wrote:

Hi,

I'm working on a table that has about 12 columns against which
arbitrary queries must perform really well. Currently there are a lot
of indexes on the table, but I'm hitting some problems - and adding
more indexes seems a slippery slope (there are ~15 multi-column
indexes, I'd like that reduced).

So I'm looking for a way out and I'm currently considering:

* Building a memory table on top of the existing table
* Sphinx indexing and then throw the queries against Sphinx instead
* Using a different "in-memory-DB" like Tokyo Cabinet for the queries
* Building a series of "reporting tables" which each handle a subset
of the supported queries

All of these solutions would keep the current table as the source of truth,
and a couple of minutes of lag is acceptable.

I'm tempted to go for the memory table and update it based on which
rows in the parent table have changed since the last refresh.
Eliminating duplicates could be a challenge, unless I build a new
table for each update and then "rename" the tables - but that's costly
in terms of memory.

What do people usually do in this situation? Any other solutions to
consider?

Thanks,

Morten



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql?unsub=mo...@fastmail.fm

