Re: MyISAM table size vs actual data, and performance

2013-02-22 Thread Johan De Meersman
- Original Message -
 From: Rick James rja...@yahoo-inc.com

Hey Rick,

Thanks for your thoughts.

 * Smells like some huge LONGTEXTs were INSERTed, then DELETEd.
  Perhaps just a single one of nearly 500M.

I considered that, too; but I can see the on-disk size grow over a period of a 
few months - it's not a sudden bump.


 * Yes, there is an impact on full table scans -- it has to step over
 the empty spots.  Or maybe not -- one big cow chip of 500MB would be
 easy to leap over.

Hmm, that is a point. I keep expecting a full table scan to still use the PK, 
but this isn't InnoDB. Still, it's peculiar that it suddenly becomes slow 
- I would expect the odds to tip in favour of an index scan as table_free 
grows, not the other way around.


 * OPTIMIZE TABLE is the primary way to recover the space.  It _may_
 be that space on the _end_ is automatically recovered.  If so, you
 might see the .MYD shrink even when OPTIMIZE is not run.

Yes, that's what I do, of course; but the free space should really be 
reallocated to updates - especially because of the longtext, which means it's 
allowed to fragment. Are there any tools available to analyze MyISAM datafiles? 
It'd be interesting to see how the free space is really distributed.
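In the meantime, the SHOW TABLE STATUS numbers already give a rough picture of the fragmentation. A quick back-of-the-envelope sketch in Python, using the figures from the status dump further down the thread:

```python
# Rough fragmentation estimate from SHOW TABLE STATUS output.
# Figures taken from the status dump quoted later in this thread.
data_length = 493_277_732    # bytes allocated in the .MYD
data_free = 492_332_716      # bytes sitting in deleted/unused blocks
rows = 1188
avg_row_length = 795

live_bytes = data_length - data_free
frag_ratio = data_free / data_length

print(f"live data: {live_bytes} bytes (~{rows * avg_row_length} expected)")
print(f"free space: {frag_ratio:.1%} of the .MYD is reusable holes")
```

So on that instance well over 99% of the datafile is dead space, which matches the ~30MB-after-OPTIMIZE observation below.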


 * LONGTEXT is almost never useful.  Do you really think there are
 thingies that big?  Consider changing it to MEDIUMTEXT -- that would
 truncate any biggies to 16MB.
 
 * Smells like a key-value (EAV) schema design.  Such is destined to
 fail when trying to scale.  Yeah, you are probably stuck with
 Drupal.  Here are my comments and recommendations on EAV:
  http://mysql.rjweb.org/doc.php/eav

Yeps, Drupal. I could probably truncate that field, yes; but that's just 
another workaround, not a fix.
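For anyone following along in the archive: the TEXT-type ceilings come straight from the size of the length prefix (1 to 4 bytes), so MEDIUMTEXT tops out just under 16MB as Rick says. A quick check:

```python
# Maximum payload of each MySQL TEXT type: the stored length prefix is
# 1, 2, 3 or 4 bytes, so the capacity is 2**(8*n) - 1 bytes.
capacities = {name: 2 ** (8 * n) - 1
              for name, n in [("TINYTEXT", 1), ("TEXT", 2),
                              ("MEDIUMTEXT", 3), ("LONGTEXT", 4)]}
for name, cap in capacities.items():
    print(f"{name:>10}: {cap:>13,} bytes")
```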

It's indeed a key/value scheme - every single page load that gets through the 
caches will select-star that entire table. It's crap, but I have precious 
little influence on the CMS decision-making.
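For the curious, the access pattern is roughly this - a toy sketch only, with sqlite3 standing in for MySQL (real Drupal stores PHP-serialized blobs in `value`):

```python
import sqlite3

# Toy model of Drupal's variable table; sqlite3 stands in for MySQL here.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE variable (name TEXT PRIMARY KEY, value TEXT)")
con.executemany("INSERT INTO variable VALUES (?, ?)",
                [("site_name", "example"), ("maintenance_mode", "0")])

# What the CMS does on a cache miss: pull the whole table into one dict.
variables = dict(con.execute("SELECT name, value FROM variable"))
print(variables)
```

With a few rows that's harmless; with a fragmented half-gigabyte .MYD behind it, that select-star is exactly the query that falls over.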


 * Please try to find a way in your Email client to display STATUS
 without losing the spacing.

Heh, sorry. I've always preferred the wide layout, and tend to forget \G for 
mails.


 * When you switched to InnoDB, I hope you had innodb_file_per_table
 turned on.  That way, you can actually recoup the space when doing
 ALTER.  Otherwise, you will be stuck with a bloated ibdata1 file
 that you cannot easily shrink.

Default on all instances, of course.
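For archive readers: the setting in question is a single my.cnf switch, and it only affects tables created (or rebuilt via ALTER/OPTIMIZE) after it is enabled - something like:

```ini
# my.cnf - only tables (re)created after this change get their own .ibd file
[mysqld]
innodb_file_per_table = 1
```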


 * In InnoDB, the LONGTEXT will usually be stored separately, thereby
 making a full table scan relatively efficient.

For now, they seem to be behaving. We'll see.


Thanks,
Johan


-- 
Unhappiness is discouraged and will be corrected with kitten pictures.

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql



RE: MyISAM table size vs actual data, and performance

2013-02-21 Thread Rick James
* Smells like some huge LONGTEXTs were INSERTed, then DELETEd.  Perhaps just a 
single one of nearly 500M. 

* Yes, there is an impact on full table scans -- it has to step over the empty 
spots.  Or maybe not -- one big cow chip of 500MB would be easy to leap over.

* OPTIMIZE TABLE is the primary way to recover the space.  It _may_ be that 
space on the _end_ is automatically recovered.  If so, you might see the .MYD 
shrink even when OPTIMIZE is not run.

* LONGTEXT is almost never useful.  Do you really think there are thingies that 
big?  Consider changing it to MEDIUMTEXT -- that would truncate any biggies to 
16MB.

* Smells like a key-value (EAV) schema design.  Such is destined to fail when 
trying to scale.  Yeah, you are probably stuck with Drupal.  Here are my 
comments and recommendations on EAV:  http://mysql.rjweb.org/doc.php/eav

* Please try to find a way in your Email client to display STATUS without 
losing the spacing.

* When you switched to InnoDB, I hope you had innodb_file_per_table turned on.  
That way, you can actually recoup the space when doing ALTER.  Otherwise, you 
will be stuck with a bloated ibdata1 file that you cannot easily shrink.

* In InnoDB, the LONGTEXT will usually be stored separately, thereby making a 
full table scan relatively efficient.

 -Original Message-
 From: Johan De Meersman [mailto:vegiv...@tuxera.be]
 Sent: Friday, February 15, 2013 4:21 AM
 To: mysql.
 Subject: MyISAM table size vs actual data, and performance
 
 
 
 Hey list,
 
 I've got another peculiar thing going on :-) Let me give you a quick
 summary of the situation first: we host a number of Drupal sites, each
 site and its DB on separate VMs for reasons that are not important to
 this scenario. MySQL is 5.0.51a-24+lenny4-log (Debian); I don't have
 the exact Drupal version here but it's likely to be a 5.x branch.
 
 The easy thing to say would of course be upgrade your versions, but
 that's not an option right now. I don't really care if that means I
 have no actual *fix* for the problem - I know how to work around it.
 I'm just looking for a cause, ideally maybe even a specific known bug.
 Strangely enough, I'm seeing this on three distinct installs; but
 others with the same versions and setup (but different sites) seem to
 not exhibit the issue.
 
 So, what I'm seeing is this: Drupal's variable table keeps growing,
 but there does not seem to be more data. I understand how record
 allocation and free space in datafiles works, but this is well beyond
 the normal behaviour.
 
 
 http://www.tuxera.be/filestore/heciexohhohj/df-year.png
 
 As you can see here (the lime green line of /data), growth occurs
 gradually (and the issue happened in September, as well), until it
 seems to reach a certain point. At some point, however, performance on
 that table (notably select * - it's a drupal thing) pretty much
 instantly plummets, and the query takes around half a minute to run -
 whereas now, after reclaiming the free space, it takes 0.03 seconds.
 
 I don't have the exact numbers as I wasn't on-site yesterday evening,
 but since the disk is 5GB, the reclaimed space yesterday must have been
 around 850MB - for a table that is now 30MB. No records were deleted
 from the table, the workaround is as simple as OPTIMIZE TABLE
 variable - simply rebuild the table. The logs make no mention of a
 crashed table, so it's very unlikely that this is a borked index. Even
 if it were, I wouldn't expect a scan of 30MB in 1202 rows to take half
 a minute, on a table that is accessed so often that its relevant
 blocks are bound to be in the filesystem cache.
 
 The table's structure is fairly simple, too:
 
 
 
 CREATE TABLE `variable` (
 `name` varchar(128) NOT NULL DEFAULT '', `value` longtext NOT NULL,
 PRIMARY KEY (`name`)
 ) ENGINE=MyISAM DEFAULT CHARSET=utf8
 
 
 
 
 I currently have another system that's also growing that table, here's
 a bit of session:
 
 
 mysql> show table status like 'variable'\G
 *************************** 1. row ***************************
            Name: variable
          Engine: MyISAM
         Version: 10
      Row_format: Dynamic
            Rows: 1188
  Avg_row_length: 795
     Data_length: 493277732
 Max_data_length: 281474976710655
    Index_length: 41984
       Data_free: 492332716
  Auto_increment: NULL
     Create_time: 2011-12-13 16:18:53
     Update_time: 2013-02-15 12:35:18
      Check_time: 2012-10-17 15:45:11
       Collation: utf8_general_ci
        Checksum: NULL