On Fri, Jul 31, 2009 at 8:14 PM, Milan Andric <mand...@gmail.com> wrote:
Hello,

I'm serving a burly Drupal install and at some points throughout the
day the mysql threads go way up and iowait peaks.  I'm not sure which
is causing which but during this time the server is unresponsive.  I
would like to determine if there is a poorly optimized query causing
this.

Drupal is a disaster :-)

I'm running a dozen drupals, and we've spent the last eight months heavily
optimizing every aspect. It's reasonably good now, but I'd still like to
move away from it.

Drupal wants to remain compatible with multiple databases, notably both
MySQL and Postgres, and for v7 also Oracle. This means that they can't
optimize their DB layer and use the fancy tricks of a single DB, but must do
things in ways that are compatible with all of them. Postgres doesn't have
autoincrement ? No autoincrement. Mysql doesn't have sequences ? No
sequences.

Some points:
 - Drupal uses it's own internal unique ID system, not autoincrement, for a
number of things. This means that every insert that uses this locks all
other inserts that use this.
 - Drupal has the bothersome tendency to do full table locks around some
inserts. This means that even if you move to InnoDB, you'll still get full
table locks. We patched this out according to info found on drupal.org and
other sites.
 - Drupal's innate caching mechanisms are DB bound. Fun and dandy, but
rather irritating if as soon as a busy page expires, a hundred threads all
try to update the same table at the same time. We implemented Memcached.
Patches, at least partial ones, are on drupal.org, but I know we did more
and better in-house.
 - Drupal can cache fine for anonymous users, but once you're logged in,
it's nearly impossible to cache. We worked around this by basically making
'static' cacheable pages, and pulling any dynamic content in through AJAX.
 - Get rid of Drupal's views (in drupal, that is, not wat is called a view
in a database). They generate horribly inefficient queries.
 - Full text search modules use MySQL's full text indices, but that's not
the best way of doing things. We implemented Sphinx search, there's an
existing drupal module for integration.

There's lots more that was done, but I can't provide all that info because
a) it's company internal, and b) because I'm not a developer and thus don't
know half of it :-)

Suffice it to say, I don't like drupal for high-traffic interactive sites.
Get away from it if you can.




  I'm logging slow queries but is there a way to see when the
> slow queries take place also?  I'd like to know what queries are being
> processed during this window of poor response time, usually around
> noon local time.


If you want more than just the slow queries, you'll have to enable the full
log. Be aware that this has noticeable impact on performance, and will spam
your disks. Log on different spindles if able, and monitor your disk usage
carefully - mysql stops functioning if it can't write logs.




-- 
Celsius is based on water temperature.
Fahrenheit is based on alcohol temperature.
Ergo, Fahrenheit is better than Celsius. QED.

Reply via email to