On Thu, May 15, 2014 at 11:01 AM, Johan De Meersman <vegiv...@tuxera.be> wrote: > ----- Original Message ----- >> From: "Larry Martell" <larry.mart...@gmail.com> >> Subject: Re: Performance boost by splitting up large table? >> >> This table is queried based on requests from the users. There are 10 >> different lookup columns they can specify, and they can provide any or > > That makes it rather more of a bother, as MySQL can't (yet) skip columns in > an index, as far as I'm aware. Someone please correct me if I'm wrong here. > > >> all of these. Currently each one of the columns has an index on it. > > I'm a bit fuzzy on multiple-index queries, but I think support isn't all too > sharp. > > >> Would it be beneficial to create an index with all 10? Rarely are all >> 10 specified in the query. Typically it's 3 or 4. Would it be >> worthwhile to see which are much commonly specified and create an >> index with just those? Or would it be better to put the commonly >> selected columns on the index with the lookup columns? > > > You may want to grab a day or week's worth of queries (either general_log > with all the overhead and disk space that entails, or tcpdump) and pump that > through pt-query-digest (Percona Toolkit) to see which combinations of fields > are most often used, and add the necessary covering indices to help those > queries.
We have a log going back over 2 years with over 200,000 queries, and I've written python scripts to give the frequency of the column pulled for the report and the column used for filtering. Not surprising, a very small number of each are used most of the time, and a huge number used just 1 or 2 times. I'll be digging into that next week. > A few points to keep in mind during analysis: > * order of fields in the where clause is largely irrelevant (although > most-selective-first is preferred) > * not all the fields in the index must be queried; but you MUST query a full > prefix set > -> ie, for index (a, b, c, d) you can query (a, b) but for (a, c) only (a) > will be used > * every index constitutes a (small) performance penalty upon table updates, > so don't go too wild either :-) > > Also helpful: > * plenty of memory, at least enough to keep the working set in memory > * faster disks (SSD, or FusionIO things) helps bundles for data that doesn't > fit in memory The db host is running with 250GB memory, db size is 470GB, spinning 2.5” 15k rpm drives, and 40 cores. > * if you notice a tendency for multiple users (say, a dozen) to run > identical queries (and that means /bitwise/ identical query text, down to the > last space), the query cache might help. Don't make it too big, though, a > couple of meg should suffice. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql