Hello Johan,

 Just an update. Using the "load index into cache" statement for the
200 million row indexed "source" table, my correlated update statement
ran in 1 hour, 45 minutes to update 144 million rows.   A 50% increase
in performance!

Thank you very much,

-Hank



On Fri, Sep 23, 2011 at 6:56 AM, Johan De Meersman <vegiv...@tuxera.be> wrote:
>
>
> ----- Original Message -----
>> From: "Hank" <hes...@gmail.com>
>>
>> (please read my ORIGINAL post with all this information).
>
> Welcome on the Internet, where everyone will tell you everything you need to 
> know, except for what you want to know :-)
>
>> I am trying to find a logical or reasonable explanation WHY this would be the
>> case, despite the fact that the documentation states otherwise (see: Right 
>> here:
>> http://dev.mysql.com/doc/refman/5.5/en/lock-tables-restrictions.html)
>
> I believe you're misinterpreting that, as is the author from the blog you 
> originally referenced.
>
> What it says, is "If you are going to run many operations". You are updating 
> many rows, but you are only doing ONE operation: a single update statement.
>
> While this explains why you're not seeing benefit, I have to admit that I'm 
> at a loss, too, as to why you are experiencing an actual slowdown - the 
> update statement will lock the tables, too, so it shouldn't really make any 
> difference at all.
>
>> But if seeing some SQL will make you happy, here is just one example:
>>
>> UPDATE dest d straight_join source s set d.seq=s.seq WHERE d.key=s.key;
>
> See, this is why people ask to see your queries. You never mentioned you were 
> doing a join in the update :-)
>
> I'll ignore the join condition in the where clause, as it makes little 
> difference in this case; but I do note that you're using a straight join. Is 
> the optimizer really reading the tables in the wrong order, or is it just 
> something you picked up off a blog without realising the impact?
>
>>  Source is indexed by key+seq (key is primary key, but seq is
>>  included as a covering index).
>
> Good practice, that should prevent source from being read from disk, if your 
> index is fully in the cache. Do you have an index on dest.key, too? That 
> might help performance as well if it fits in memory, too, because you'll only 
> need disk access for flushing writes, then.
>
>> This query takes about 3.5 hours when I don't use LOCK TABLES, and over 4 
>> hours when I do use LOCK TABLES.
>
> Most peculiar. Is the difference in performance consistent in repeated 
> executions?
>
>> And before testing each run, I do restart the server so there is no
>> query caching and I also use FLUSH TABLES between each test run.
>
> That's good, as it will give you the worst-case scenario. However, since such 
> an update is going to wreck your index cache anyway, you may just as well 
> preload the appropriate indices into it beforehand, if the cache is sized big 
> enough to hold them. That might give a minor performance boost, too, as the 
> server won't have to go to disk every so often to fetch index blocks.
> See http://dev.mysql.com/doc/refman/5.0/en/index-preloading.html for that.
>
>> | key_buffer_size          | 402653184 |
>
> 400MB of key buffer, that's not bad at all. Do a /show table status/ in your 
> database, and sum the index sizes. If your key buffer is larger than this 
> (and why not scale it for growth a bit?) all your indices will fit, which 
> will save on disk access for index lookups *and* for index-covered queries.
>
>
>
> --
> Bier met grenadyn
> Is als mosterd by den wyn
> Sy die't drinkt, is eene kwezel
> Hy die't drinkt, is ras een ezel
>

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql?unsub=arch...@jab.org

Reply via email to