On Thu, May 15, 2014 at 11:01 AM, Johan De Meersman <vegiv...@tuxera.be> wrote:
> ----- Original Message -----
>> From: "Larry Martell" <larry.mart...@gmail.com>
>> Subject: Re: Performance boost by splitting up large table?
>>
>> This table is queried based on requests from the users. There are 10
>> different lookup columns they can specify, and they can provide any or
>
> That makes it rather more of a bother, as MySQL can't (yet) skip columns in 
> an index, as far as I'm aware. Someone please correct me if I'm wrong here.
>
>
>> all of these. Currently each one of the columns has an index on it.
>
> I'm a bit fuzzy on multiple-index queries, but I think support isn't all too 
> sharp.
>
>
>> Would it be beneficial to create an index with all 10? Rarely are all
>> 10 specified in the query. Typically it's 3 or 4. Would it be
>> worthwhile to see which are much commonly specified and create an
>> index with just those? Or would it be better to put the commonly
>> selected columns on the index with the lookup columns?
>
>
> You may want to grab a day or week's worth of queries (either general_log 
> with all the overhead and disk space that entails, or tcpdump) and pump that 
> through pt-query-digest (Percona Toolkit) to see which combinations of fields 
> are most often used, and add the necessary covering indices to help those 
> queries.

We have a log going back over 2 years with over 200,000 queries, and
I've written python scripts to give the frequency of the column pulled
for the report and the column used for filtering. Not surprising, a
very small number of each are used most of the time, and a huge number
used just 1 or 2 times. I'll be digging into that next week.

> A few points to keep in mind during analysis:
>  * order of fields in the where clause is largely irrelevant (although 
> most-selective-first is preferred)
>  * not all the fields in the index must be queried; but you MUST query a full 
> prefix set
>    -> ie, for index (a, b, c, d) you can query (a, b) but for (a, c) only (a) 
> will be used
>  * every index constitutes a (small) performance penalty upon table updates, 
> so don't go too wild either :-)
>
> Also helpful:
>  * plenty of memory, at least enough to keep the working set in memory
>  * faster disks (SSD, or FusionIO things) helps bundles for data that doesn't 
> fit in memory

The db host is running with 250GB memory, db size is 470GB, spinning
2.5” 15k rpm drives, and 40 cores.

>  * if you notice a tendency for multiple users (say, a dozen) to run 
> identical queries (and that means /bitwise/ identical query text, down to the 
> last space), the query cache might help. Don't make it too big, though, a 
> couple of meg should suffice.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql

Reply via email to