Hello Johan, Just an update. Using the "load index into cache" statement for the 200 million row indexed "source" table, my correlated update statement ran in 1 hour, 45 minutes to update 144 million rows. A 50% increase in performance!
Thank you very much, -Hank On Fri, Sep 23, 2011 at 6:56 AM, Johan De Meersman <vegiv...@tuxera.be> wrote: > > > ----- Original Message ----- >> From: "Hank" <hes...@gmail.com> >> >> (please read my ORIGINAL post with all this information). > > Welcome on the Internet, where everyone will tell you everything you need to > know, except for what you want to know :-) > >> I am trying to find a logical or reasonable explanation WHY this would be the >> case, despite the fact that the documentation states otherwise (see: Right >> here: >> http://dev.mysql.com/doc/refman/5.5/en/lock-tables-restrictions.html) > > I believe you're misinterpreting that, as is the author from the blog you > originally referenced. > > What it says, is "If you are going to run many operations". You are updating > many rows, but you are only doing ONE operation: a single update statement. > > While this explains why you're not seeing benefit, I have to admit that I'm > at a loss, too, as to why you are experiencing an actual slowdown - the > update statement will lock the tables, too, so it shouldn't really make any > difference at all. > >> But if seeing some SQL will make you happy, here is just one example: >> >> UPDATE dest d straight_join source s set d.seq=s.seq WHERE d.key=s.key; > > See, this is why people ask to see your queries. You never mentioned you were > doing a join in the update :-) > > I'll ignore the join condition in the where clause, as it makes little > difference in this case; but I do note that you're using a straight join. Is > the optimizer really reading the tables in the wrong order, or is it just > something you picked up off a blog without realising the impact? > >>  Source is indexed by key+seq (key is primary key, but seq is >>  included as a covering index). > > Good practice, that should prevent source from being read from disk, if your > index is fully in the cache. Do you have an index on dest.key, too? That > might help performance as well if it fits in memory, too, because you'll only > need disk access for flushing writes, then. > >> This query takes about 3.5 hours when I don't use LOCK TABLES, and over 4 >> hours when I do use LOCK TABLES. > > Most peculiar. Is the difference in performance consistent in repeated > executions? > >> And before testing each run, I do restart the server so there is no >> query caching and I also use FLUSH TABLES between each test run. > > That's good, as it will give you the worst-case scenario. However, since such > an update is going to wreck your index cache anyway, you may just as well > preload the appropriate indices into it beforehand, if the cache is sized big > enough to hold them. That might give a minor performance boost, too, as the > server won't have to go to disk every so often to fetch index blocks. > See http://dev.mysql.com/doc/refman/5.0/en/index-preloading.html for that. > >> | key_buffer_size      | 402653184 | > > 400MB of key buffer, that's not bad at all. Do a /show table status/ in your > database, and sum the index sizes. If your key buffer is larger than this > (and why not scale it for growth a bit?) all your indices will fit, which > will save on disk access for index lookups *and* for index-covered queries. > > > > -- > Bier met grenadyn > Is als mosterd by den wyn > Sy die't drinkt, is eene kwezel > Hy die't drinkt, is ras een ezel > -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql?unsub=arch...@jab.org