Re: Schemas for storing and reporting on hashtags
Dehua, thanks. You are correct. Perhaps I was just over-thinking it. -Hank

On Fri, Oct 19, 2012 at 9:48 PM, Dehua Yang <meflyingf...@gmail.com> wrote:
> Hi Hank,
>
> I can only think of a table like this:
>
>     CREATE TABLE xyz (
>         hashtag VARCHAR(...) NOT NULL,
>         comment_id ... NOT NULL,
>         user_id BIGINT UNSIGNED NOT NULL,
>         PRIMARY KEY (hashtag, comment_id),
>         INDEX (comment_id, hashtag),
>         INDEX idx_user_id (user_id)
>     ) ENGINE = InnoDB;
>
> When a user wants to check all his comments:
>
>     SELECT * FROM xyz WHERE user_id = x;
>
> Actually, I think your case is a classic one. I hope you can tell me how
> you would resolve this problem.
>
> On Sat, Oct 20, 2012 at 3:26 AM, Hank <hes...@gmail.com> wrote:
>> It's actually much more complicated than a simple many-to-many
>> relationship. For instance, users may enter many hashtags in many
>> comments, and a user might want to see all the hashtags they've used,
>> and then find all the comments with those hashtags.
>>
>> I'm not trying to re-create the specification or build it here in
>> discussion. I'm looking to see if anyone has done it already, so I can
>> review their fleshed-out design. I've been a database architect for 20+
>> years, so I know what I'm doing. I'm not asking for people to show me
>> what to do. Like I said, I could sit down and design it myself pretty
>> quickly, but I would like to see what other people have *actually done*
>> to solve the problem before. -Hank
>>
>> On Fri, Oct 19, 2012 at 2:42 PM, Rick James <rja...@yahoo-inc.com> wrote:
>>> Many-to-many? That is, can a comment have many different hashtags? And
>>> a hashtag can be associated with many comments?
>>>
>>> Best practice for many-to-many:
>>>
>>>     CREATE TABLE xyz (
>>>         hashtag VARCHAR(...) NOT NULL,
>>>         comment_id ... NOT NULL,
>>>         PRIMARY KEY (hashtag, comment_id),
>>>         INDEX (comment_id, hashtag)
>>>     ) ENGINE = InnoDB;
>>>
>>> One might want to normalize the hashtags, but it does not seem
>>> warranted in this situation.
>>> -----Original Message-----
>>> From: Hank [mailto:hes...@gmail.com]
>>> Sent: Friday, October 19, 2012 8:58 AM
>>> To: MySql
>>> Subject: Schemas for storing and reporting on hashtags
>>>
>>> Are there any established best practices or schemas for incorporating
>>> twitter-like hashtags into a database? Let's say I have a blog with a
>>> commenting system, and I want to allow people to add hashtags to the
>>> comments. I could certainly create one on my own (it's not that
>>> difficult), but I'd like to see what other people have done in terms of
>>> storage and features. I'm also looking for a solid basic
>>> implementation, not something overly complex.
>>>
>>> Thanks, -Hank

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql
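Rick's junction-table design can be exercised end to end against any SQL engine. Below is a sketch using Python's bundled sqlite3 standing in for MySQL/InnoDB; the table name, column names, and sample data are illustrative, not from the thread:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE comment_hashtag (
    hashtag    TEXT    NOT NULL,
    comment_id INTEGER NOT NULL,
    user_id    INTEGER NOT NULL,
    PRIMARY KEY (hashtag, comment_id)
);
-- the two secondary indexes from Rick's and Dehua's replies
CREATE INDEX idx_comment ON comment_hashtag (comment_id, hashtag);
CREATE INDEX idx_user    ON comment_hashtag (user_id);
""")

rows = [("mysql", 1, 42), ("schema", 1, 42), ("mysql", 2, 7)]
con.executemany("INSERT INTO comment_hashtag VALUES (?, ?, ?)", rows)

# All hashtags a given user has ever used...
tags = [r[0] for r in con.execute(
    "SELECT DISTINCT hashtag FROM comment_hashtag"
    " WHERE user_id = ? ORDER BY hashtag", (42,))]

# ...and all comments carrying any of those hashtags (Hank's use case).
comments = [r[0] for r in con.execute(
    "SELECT DISTINCT comment_id FROM comment_hashtag"
    " WHERE hashtag IN (%s) ORDER BY comment_id" % ",".join("?" * len(tags)),
    tags)]

print(tags)      # ['mysql', 'schema']
print(comments)  # [1, 2]
```

Both query shapes are satisfied by the indexes alone, which is the point of the (hashtag, comment_id) / (comment_id, hashtag) pair.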
Schemas for storing and reporting on hashtags
Are there any established best practices or schemas for incorporating twitter-like hashtags into a database? Let's say I have a blog with a commenting system, and I want to allow people to add hashtags to the comments. I could certainly create one on my own (it's not that difficult), but I'd like to see what other people have done in terms of storage and features. I'm also looking for a solid basic implementation, not something overly complex.

Thanks, -Hank
Re: Schemas for storing and reporting on hashtags
It's actually much more complicated than a simple many-to-many relationship. For instance, users may enter many hashtags in many comments, and a user might want to see all the hashtags they've used, and then find all the comments with those hashtags.

I'm not trying to re-create the specification or build it here in discussion. I'm looking to see if anyone has done it already, so I can review their fleshed-out design. I've been a database architect for 20+ years, so I know what I'm doing. I'm not asking for people to show me what to do. Like I said, I could sit down and design it myself pretty quickly, but I would like to see what other people have *actually done* to solve the problem before. -Hank

On Fri, Oct 19, 2012 at 2:42 PM, Rick James <rja...@yahoo-inc.com> wrote:
> Many-to-many? That is, can a comment have many different hashtags? And a
> hashtag can be associated with many comments?
>
> Best practice for many-to-many:
>
>     CREATE TABLE xyz (
>         hashtag VARCHAR(...) NOT NULL,
>         comment_id ... NOT NULL,
>         PRIMARY KEY (hashtag, comment_id),
>         INDEX (comment_id, hashtag)
>     ) ENGINE = InnoDB;
>
> One might want to normalize the hashtags, but it does not seem warranted
> in this situation.
Re: GA download reverted back to 5.5.24?
> Check the manual:
> http://dev.mysql.com/doc/refman/5.5/en/news-5-5-25.html
>
> Shawn Green

Thank you, but that warning note was not there when I first posted this message here. I'm not sure when the warning note appeared, but I'd guess it was within the last 36 hours. All I could see was that 5.5.25 had mysteriously disappeared, to be replaced by 5.5.24. -Hank
GA download reverted back to 5.5.24?
I am in the process of reporting a new MySQL bug in 5.5.25 (it doesn't exist in 5.5.24); see: http://bugs.mysql.com/bug.php?id=65740

And I just noticed that on the mysql.com website, the GA downloads have just been reverted back to 5.5.24. Is there a blog or update site that might explain why they retracted 5.5.25?

Thanks, -Hank
preg_replace in update statement
I have a simple problem: I have a varchar field in the database, and I want to remove all text between WordA and WordB, including WordA and WordB, leaving all text before WordA and after WordB intact. Is this possible with just SQL? I know I can write a PHP program to do it, but it's not important enough to spend that much time on. I'd like one SQL statement to do it.

Thanks! -Hank
Re: preg_replace in update statement
They are regular words. I was hoping someone would already know how to do it. I was trying to avoid rolling my own solution using the string functions; it gets really messy, really quickly. -Hank

On Thu, Mar 8, 2012 at 8:18 PM, Michael Dykman <mdyk...@gmail.com> wrote:
> If your words need to be regular expressions per se, with meta
> characters, etc., you are pretty much out of luck. If you only need to
> match some literal strings, a method can surely be derived through the
> heavy-handed use of MySQL's string function library.
> http://dev.mysql.com/doc/refman/5.0/en/string-functions.html
>
> - michael dykman
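For literal words, one workable single-statement approach (pre-8.0 MySQL has no REGEXP_REPLACE) is `UPDATE t SET col = CONCAT(SUBSTRING_INDEX(col, 'WordA', 1), SUBSTRING_INDEX(col, 'WordB', -1)) WHERE col LIKE '%WordA%WordB%'` — everything before the first WordA plus everything after the last WordB; the WHERE guard matters because SUBSTRING_INDEX returns the whole string when the delimiter is absent. Here is a Python sketch mirroring that expression's semantics, so edge cases can be sanity-checked before running the UPDATE (function name is mine, not from the thread):

```python
def strip_between(text: str, a: str, b: str) -> str:
    """Remove everything from the first `a` through the last `b`, inclusive,
    mirroring MySQL's CONCAT(SUBSTRING_INDEX(col, a, 1),
                             SUBSTRING_INDEX(col, b, -1)).
    Rows lacking either marker are returned unchanged (the LIKE guard)."""
    if a not in text or b not in text:
        return text
    before = text.split(a, 1)[0]    # SUBSTRING_INDEX(col, a, 1)
    after = text.rsplit(b, 1)[1]    # SUBSTRING_INDEX(col, b, -1)
    return before + after

print(strip_between("keep WordA drop this WordB keep too", "WordA", "WordB"))
# -> "keep  keep too"
```

Note this removes from the *first* WordA to the *last* WordB; if multiple WordA..WordB spans per row must be handled independently, the string-function approach does get messy, as Michael warns.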
Error on 'flush tables' command
I'm getting an occasional error from one of my slaves running community MySQL 5.5.16:

    'Got an error writing communication packets' on query. Query: 'flush tables'

This halts replication on the slave until I issue a SET GLOBAL sql_slave_skip_counter=1 command on the slave. There are a few FEDERATED tables on the slave; is that what would cause a communication packet error? If not, what else could cause this on a FLUSH TABLES command?

Thanks. -Hank
Re: Quantity of time from difference of two Datetime values?
On Fri, Sep 30, 2011 at 11:08 PM, Jan Steinman <j...@bytesmiths.com> wrote:
> Okay, I've reviewed the online man page for date and time functions, and
> I've played with several likely candidates, and I am still having trouble
> subtracting two arbitrary Datetimes to get something that is useful. A
> simple subtraction yields the least useful thing possible: a modulo-100
> difference of modulo-60 quantities. Other functions yield the proper
> answer, but not for a quantity of time that rolls over midnight, etc.
>
> Surely there are tons of payroll apps that subtract the punch-out from
> the punch-in to come up with a quantity of time? What is YOUR favourite
> way of coming up with a quantity of time as the difference between two
> arbitrary Datetimes? Did I overlook something simple? Do I need to
> convert the two to scalar integers first?

See the UNIX_TIMESTAMP() function. It converts a date/time to a scalar integer: the number of seconds since the Unix epoch.
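Concretely, `UNIX_TIMESTAMP(punch_out) - UNIX_TIMESTAMP(punch_in)` gives elapsed seconds and handles midnight rollover; `TIMESTAMPDIFF(SECOND, punch_in, punch_out)` gives the same result directly. A quick Python check of the rollover case (sample times are mine, not from the thread):

```python
from datetime import datetime

punch_in = datetime(2011, 9, 30, 22, 15, 0)   # 10:15 pm
punch_out = datetime(2011, 10, 1, 6, 45, 0)   # 6:45 am the next day

# Subtracting datetimes yields a timedelta, which rolls over midnight
# correctly -- the same quantity UNIX_TIMESTAMP(out) - UNIX_TIMESTAMP(in)
# returns as an integer number of seconds.
seconds = int((punch_out - punch_in).total_seconds())

hours, rem = divmod(seconds, 3600)
minutes = rem // 60
print(f"{hours}:{minutes:02d}")  # -> 8:30
```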
Re: mysql listed as attach page by google?
I've been watching this page to see when my anti-virus tool was updated to scan for this specific virus/threat:
http://www.virustotal.com/file-scan/report.html?id=d761babcb55d21b467dd698169c921995bf58eac5e9912596693fee52c8690a1-1317175019

I use AVG. -Hank

On Wed, Sep 28, 2011 at 8:45 AM, Rozeboom, Kay [DAS] <kay.rozeb...@iowa.gov> wrote:
> Does anyone know if this has been fixed yet?
>
> -----Original Message-----
> From: Jigal van Hemert [mailto:ji...@xs4all.nl]
> Sent: Monday, September 26, 2011 2:02 PM
> To: mysql@lists.mysql.com
> Subject: Re: mysql listed as attach page by google?
>
> Hi,
>
> On 26-9-2011 20:30, Michael Albert wrote:
>> I don't suppose I am the first to notice this, but most of the pages on
>> dev.mysql.com have been listed by google as attack pages, e.g.
>> http://dev.mysql.com/downloads/. Has there been a problem, or is google
>> being overzealous?
>
> I fear Google is right.
> http://www.net-security.org/malware_news.php?id=1853
>
> --
> Kind regards / met vriendelijke groet,
> Jigal van Hemert.
Re: Update table on lost connection
Check out the GET_LOCK() and RELEASE_LOCK() user-level lock functions in MySQL. A lock taken with GET_LOCK() is released automatically when the session terminates, which covers the lost-connection case. -Hank

On Wed, Sep 28, 2011 at 9:15 AM, Alex Schaft <al...@quicksoftware.co.za> wrote:
> Hi,
>
> We're busy moving legacy apps from FoxPro tables to MySQL. User logins
> were tracked via a record in a table which the app then locked,
> preventing multiple logins for the same user code. I want to simulate
> this via a locked column in a MySQL table, but would need the field to be
> cleared if the server loses the connection to the client. How would I do
> this, or is there an alternative?
>
> Thanks, Alex
Re: Slower performance with LOCK TABLES
Hello Johan,

Thanks for your comprehensive reply. I'll try to answer each of your questions below. -Hank

>> But if seeing some SQL will make you happy, here is just one example:
>>
>>     UPDATE dest d STRAIGHT_JOIN source s SET d.seq = s.seq WHERE d.key = s.key;

> See, this is why people ask to see your queries. You never mentioned you
> were doing a join in the update :-) I'll ignore the join condition in the
> where clause, as it makes little difference in this case; but I do note
> that you're using a straight join. Is the optimizer really reading the
> tables in the wrong order, or is it just something you picked up off a
> blog without realising the impact?

Yes, I'm using a straight join intentionally. I have 144 million unindexed rows in dest. I want MySQL to start with those rows sequentially, then join each to the matching record in source (244 million rows) using its index. If I don't do that, MySQL tries to use the indexed table first, causing full table scans on dest. With STRAIGHT_JOIN, it reads them in the proper order. During experimentation with different joins, a regular join would run for days; a straight join runs in 3-4 hours.

>> Source is indexed by key+seq (key is the primary key, but seq is
>> included as a covering index).

> Good practice; that should prevent source from being read from disk, if
> your index is fully in the cache.

With 244 million records in the source table, I'm not sure that would fit in the cache.

> Do you have an index on dest.key, too? That might help performance as
> well if it fits in memory, too, because you'll only need disk access for
> flushing writes, then.

I do not have an index on dest.key, also intentionally, for two reasons. First, updating 144 million records in place is slow enough, but updating 144 million records AND the index on that field would absolutely kill the performance of the update. Once the update is complete, I re-create the index with a sort using myisamchk. Second, the starting value of dest.key for all 144 million records is 0, so an index on that field wouldn't really help, I think.

>> This query takes about 3.5 hours when I don't use LOCK TABLES, and over
>> 4 hours when I do use LOCK TABLES.

> Most peculiar. Is the difference in performance consistent in repeated
> executions?

Yes, I've run these tests about a dozen times now, and while not exactly scientific, the result is that LOCK TABLES always produces longer running times. Not just for this query, but for other full-table update/select/delete/insert queries. Not more than twice as long, but easily a 10% to 25% increase.

>> And before testing each run, I do restart the server so there is no
>> query caching, and I also use FLUSH TABLES between each test run.

> That's good, as it will give you the worst-case scenario. However, since
> such an update is going to wreck your index cache anyway, you may just as
> well preload the appropriate indices into it beforehand, if the cache is
> sized big enough to hold them. That might give a minor performance boost,
> too, as the server won't have to go to disk every so often to fetch index
> blocks. See http://dev.mysql.com/doc/refman/5.0/en/index-preloading.html
> for that.

I'll look into that, but the index is huge.

Ok, I just preloaded the source index using LOAD INDEX INTO CACHE source IGNORE LEAVES; it took two minutes, 15 seconds to preload the index. I then ran two small tests on smaller tables using the same update statement, and they both yielded a massive increase in update speed. I'll have to rebuild the large dest table again to try it on the biggest UPDATE, but wow, if this is any indication, this was a great suggestion. I'll report back on the results later today. Thank you!

>> | key_buffer_size | 402653184 |

> 400MB of key buffer, that's not bad at all. Do a SHOW TABLE STATUS in
> your database, and sum the index sizes. If your key buffer is larger than
> this (and why not scale it for growth a bit?) all your indices will fit,
> which will save on disk access for index lookups *and* for index-covered
> queries.

The index length for source is 5,889,037,312.

Thanks again for your assistance. -Hank
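The correlated-update pattern under discussion (copy seq from an indexed source table into an unindexed dest on a shared key) can be sketched with Python's sqlite3. MySQL's multi-table `UPDATE dest d STRAIGHT_JOIN source s SET d.seq = s.seq WHERE d.key = s.key` is written here as a correlated subquery, since sqlite3 serves only as a stand-in and the column name `k` replaces the reserved-sounding `key`; the tiny tables are illustrative:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE source (k INTEGER PRIMARY KEY, seq INTEGER NOT NULL);
CREATE TABLE dest   (k INTEGER NOT NULL, seq INTEGER NOT NULL DEFAULT 0);
""")
con.executemany("INSERT INTO source VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)])
con.executemany("INSERT INTO dest   VALUES (?, ?)", [(2, 0), (3, 0), (4, 0)])

# Correlated update: dest is scanned row by row, each row probing source's
# primary-key index -- the read order Hank forces in MySQL via STRAIGHT_JOIN.
# The EXISTS guard leaves dest rows with no source match untouched.
con.execute("""
UPDATE dest
SET    seq = (SELECT s.seq FROM source s WHERE s.k = dest.k)
WHERE  EXISTS (SELECT 1 FROM source s WHERE s.k = dest.k)
""")

result = list(con.execute("SELECT k, seq FROM dest ORDER BY k"))
print(result)  # -> [(2, 20), (3, 30), (4, 0)]
```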
Re: Slower performance with LOCK TABLES
Hello Johan,

Just an update. Using the LOAD INDEX INTO CACHE statement for the 200 million row indexed source table, my correlated update statement ran in 1 hour, 45 minutes to update 144 million rows. A 50% improvement in running time!

Thank you very much, -Hank

On Fri, Sep 23, 2011 at 6:56 AM, Johan De Meersman <vegiv...@tuxera.be> wrote:
> ----- Original Message -----
>> From: Hank <hes...@gmail.com>
>> (please read my ORIGINAL post with all this information).
>
> Welcome to the Internet, where everyone will tell you everything you need
> to know, except for what you want to know :-)
>
>> I am trying to find a logical or reasonable explanation WHY this would
>> be the case, despite the fact that the documentation states otherwise
>> (see:
>> http://dev.mysql.com/doc/refman/5.5/en/lock-tables-restrictions.html)
>
> I believe you're misinterpreting that, as is the author of the blog you
> originally referenced. What it says is "If you are going to run many
> operations". You are updating many rows, but you are only doing ONE
> operation: a single update statement. While this explains why you're not
> seeing benefit, I have to admit that I'm at a loss, too, as to why you
> are experiencing an actual slowdown - the update statement will lock the
> tables, too, so it shouldn't really make any difference at all.
>
> [snip]
>
> --
> Bier met grenadyn
> Is als mosterd by den wyn
> Sy die't drinkt, is eene kwezel
> Hy die't drinkt, is ras een ezel
Re: Slower performance with LOCK TABLES
Thanks for your reply. I failed to mention that these are MyISAM tables, so there are no transactions. And like I said, this is not a production box, nor is there any application running, so there's no contention for the tables being locked. I'm trying to update a database design on two tables with 200 million records each, so anything I can do to increase the performance of these long-running queries will shorten the migration running time.

What I was referring to is the documentation's statement that when using LOCK TABLES, MySQL does not update the key cache until the lock is released, versus updating the key cache on each insert/update/delete when not using LOCK TABLES. See: http://tuxradar.com/practicalphp/18/2/22

In my testing, I'm seeing a slowdown when I use LOCK TABLES versus running the same queries without it. I'm just trying to find a reason why that might be the case. -Hank

On Thu, Sep 22, 2011 at 12:42 AM, Antony T Curtis <antonycur...@verizon.net> wrote:
> LOCK TABLES...WRITE is very likely to reduce performance if you are using
> a transactional storage engine, such as InnoDB/XtraDB or PBXT. The reason
> is that only one connection is holding the write lock and no other
> concurrent operation may occur on the table.
>
> LOCK TABLES is only really useful for non-transactional tables and maybe
> a few specialized operations where it has its advantages, but for 99.9%
> of cases it should not be used.
>
> What does increase performance is the proper use of transactions with
> appropriate use of SELECT...FOR UPDATE and SELECT...LOCK IN SHARE MODE.
>
> Regards, Antony.
>
> On 21 Sep 2011, at 20:34, Hank wrote:
>> According to everything I've read, using LOCK TABLES...WRITE for
>> updates, inserts and deletes should improve the performance of the MySQL
>> server, but I think I've been seeing the opposite effect.
>>
>> I've been doing quite a bit of testing on a 64-bit install of CentOS 5.5
>> installed as a guest OS on a VMware ESXi 4.0 hypervisor on a Dell R610.
>> There are no other VMs on this box, and there are no other users or
>> threads running on the OS. Just me. I'm using this box strictly for
>> testing of large database migration scripts.
>>
>> It seems like when I execute some of these long-running statements
>> without locking the tables, the code runs quite a bit faster than when I
>> do lock the tables. And before testing each run, I do restart the server
>> so there is no query caching, and I also use FLUSH TABLES between each
>> test run.
>>
>> All I'm asking is this: can anyone think of a scenario, on a single-user
>> box and mysql instance, where locking tables would cause these DML
>> statements to slow down compared to not locking the tables?
>>
>> Thanks, -Hank
Re: Slower performance with LOCK TABLES
That is what I'm doing: a correlated update on 200 million records, in one UPDATE statement.

Also, I'm not asking for a tutorial on when not to use LOCK TABLES. I'm trying to figure out why, despite what the documentation says, using LOCK TABLES hinders performance for large update statements on MyISAM tables, when it is supposed to increase performance on exactly the type of queries I am performing. If you can't help answer *that* question, please stop lecturing me on the reasons not to use LOCK TABLES. Thanks. -Hank

On Thu, Sep 22, 2011 at 10:19 AM, Antony T Curtis <antonycur...@verizon.net> wrote:
> Even for MyISAM tables, LOCK TABLES is not usually the best solution for
> increasing performance. When there is little to no contention, LOCK
> TABLES doesn't offer much value.
>
> MyISAM works best when you can get more work done in a statement: instead
> of executing a bunch of insert statements, combine them into a single
> multi-row insert statement, as an example.
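Antony's multi-row insert advice, sketched with Python's sqlite3 standing in for MyISAM (row counts and names are illustrative): one statement carrying many rows replaces many single-row statements, cutting per-statement overhead.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (id INTEGER, val TEXT)")
rows = [(i, "v%d" % i) for i in range(400)]

# Row-at-a-time: one INSERT statement per row (the pattern Antony advises
# against -- each statement pays parse/lock overhead).
for r in rows:
    con.execute("INSERT INTO t VALUES (?, ?)", r)
con.execute("DELETE FROM t")

# A single multi-row INSERT, i.e. MySQL's
#   INSERT INTO t VALUES (...), (...), ... ;
placeholders = ",".join(["(?, ?)"] * len(rows))
params = [x for r in rows for x in r]
con.execute("INSERT INTO t VALUES " + placeholders, params)

count = con.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(count)  # -> 400
```

In real MySQL use, the multi-row batch size would also be bounded by max_allowed_packet.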
Re: Slower performance with LOCK TABLES
Actually, that would be orders of magnitude slower. I'm using MyISAM tables, so there's no commit.

On Thu, Sep 22, 2011 at 1:48 PM, Ananda Kumar <anan...@gmail.com> wrote:
> Hi,
> Why don't you use a stored proc to update rows, where you commit every 1K
> or 10K rows? This will be much faster than your individual update
> statement.
> regards, anandkl
Re: Slower performance with LOCK TABLES
Sorry, but you do not understand my original issue or question. -Hank On Thu, Sep 22, 2011 at 2:10 PM, Ananda Kumar anan...@gmail.com wrote: No, Use a cursor(select column_name to be used in where condition of update stmt), loop through it for each update. regards anandkl On Thu, Sep 22, 2011 at 11:36 PM, Hank hes...@gmail.com wrote: Actually, that would be orders of magnitude slower. I'm using MYISAM tables, so there's no commit. On Thu, Sep 22, 2011 at 1:48 PM, Ananda Kumar anan...@gmail.com wrote: Hi, Why dont u use a stored proc to update rows ,where u commit for every 1k or 10k rows. This will be much faster than ur individual update stmt. regards anandkl On Thu, Sep 22, 2011 at 8:24 PM, Hank hes...@gmail.com wrote: That is what I'm doing. I'm doing a correlated update on 200 million records. One UPDATE statement. Also, I'm not asking for a tutorial when not to use LOCK TABLES. I'm trying to figure out why, despite what the documentation says, using LOCK TABLES hinders performance for large update statements on MYISAM tables when it is supposed to increase performance on exactly the type of queries I am performing. If you can't help answer *that* question, please stop lecturing me on the reasons not to use LOCK TABLES. Thanks. -Hank On Thu, Sep 22, 2011 at 10:19 AM, Antony T Curtis antonycur...@verizon.netwrote: Even for MyISAM tables, LOCK TABLES is not usually the best solution for increasing performance. When there is little to no contention, LOCK TABLES doesn't offer much value. MyISAM works best when you can get more work done in a statement: Instead of executing a bunch of insert statements, combine them into a single multi-row insert statement, as an example. On 22 Sep 2011, at 06:13, Hank wrote: Thanks for your reply. I failed to mention that these are MYISAM tables, so no transactions. And like I said, this is not a production box nor is there any application running, so there's no contention for the tables being locked. 
I'm trying to update a database design on two tables with 200 million records each, so anything I can do to increase the performance of these long-running queries will shorten the migration running time. What I was referring to is the statement in the documentation that when using LOCK TABLES, MySQL does not update the key cache until the lock is released, whereas without LOCK TABLES it updates the key cache on each insert/update/delete. See: http://tuxradar.com/practicalphp/18/2/22 In my testing, I'm seeing a slowdown when I use LOCK TABLES versus running the same queries without it. I'm just trying to find a reason why that might be the case. -Hank

On Thu, Sep 22, 2011 at 12:42 AM, Antony T Curtis antonycur...@verizon.net wrote: LOCK TABLES...WRITE is very likely to reduce performance if you are using a transactional storage engine, such as InnoDB/XtraDB or PBXT. The reason is that only one connection is holding the write lock and no other concurrent operation may occur on the table. LOCK TABLES is only really useful for non-transactional tables, and maybe a few specialized operations where it has its advantages, but for 99.9% of cases it should not be used. What does increase performance is the proper use of transactions, with appropriate use of SELECT...FOR UPDATE and SELECT...LOCK IN SHARE MODE. Regards, Antony.

On 21 Sep 2011, at 20:34, Hank wrote: According to everything I've read, using LOCK TABLES...WRITE for updates, inserts and deletes should improve performance of the mysql server, but I think I've been seeing the opposite effect. I've been doing quite a bit of testing on a 64-bit install of CentOS 5.5 installed as a guest OS on a VMware ESXi 4.0 hypervisor on a Dell R610. There are no other VMs on this box, and there are no other users or threads running on the OS. Just me. I'm using this box strictly for testing of large database migration scripts.
It seems like when I execute some of these long-running statements without locking the tables, the code runs quite a bit faster than when I do lock the tables. And before testing each run, I restart the server so there is no query caching, and I also use FLUSH TABLES between each test run. All I'm asking is this: can anyone think of a scenario, on a single-user box and mysql instance, in which locking tables would cause these DML statements to slow down compared to not locking the tables? Thanks, -Hank
Re: Slower performance with LOCK TABLES
Like I said, the problem is not just one particular SQL statement. It is several dozen statements operating on tables with several hundred million records. The problem is that I am finding that when I use LOCK TABLES, these queries run slower (please read my ORIGINAL post with all this information). I am trying to find a logical or reasonable explanation WHY this would be the case, despite the fact that the documentation states otherwise (see: http://dev.mysql.com/doc/refman/5.5/en/lock-tables-restrictions.html ). But if seeing some SQL will make you happy, here is just one example:

UPDATE dest d straight_join source s set d.seq=s.seq WHERE d.key=s.key;

for 140 million records in dest and 220 million records in source. Source is indexed by key+seq (key is the primary key, and seq is included to make it a covering index). There is no index on dest.seq -- that index is built once the update is complete. This query takes about 3.5 hours when I don't use LOCK TABLES, and over 4 hours when I do use LOCK TABLES. -Hank

On Thu, Sep 22, 2011 at 2:18 PM, Ananda Kumar anan...@gmail.com wrote: Maybe if you can let the audience know a snippet of your SQL, someone can help you.

On Thu, Sep 22, 2011 at 11:43 PM, Hank hes...@gmail.com wrote: Sorry, but you do not understand my original issue or question. -Hank
Re: Slower performance with LOCK TABLES
On Thu, Sep 22, 2011 at 3:11 PM, Hassan Schroeder hassan.schroe...@gmail.com wrote: On Thu, Sep 22, 2011 at 11:51 AM, Hank hes...@gmail.com wrote: Like I said, the problem is not just one particular SQL statement. It is several dozen statements operating on tables with several hundred million records. The problem is that I am finding that when I use LOCK TABLES, these queries run slower (please read my ORIGINAL post with all this information). Wandering out of my area of expertise here :-) but have you done any key cache tuning, or are you running with the defaults? mysql> show variables like 'key_%'; Also, what is the exact LOCK TABLE statement you're using?

No, I haven't done any key cache tuning, as that's out of my area of expertise as well! I have 8GB of memory on this box, and I can go up to 12GB. Here are the variables:

| key_buffer_size          | 402653184 |
| key_cache_age_threshold  | 300       |
| key_cache_block_size     | 1024      |
| key_cache_division_limit | 100       |

The lock statement is quite simple: LOCK TABLE dest d write, source s read; thanks.
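As a point of reference for the key cache tuning Hassan raises, the cache can be inspected and resized at runtime without a restart. A rough sketch follows; the 1GB figure is purely illustrative (the current value above is 402653184 bytes = 384MB, and the box has 8GB of RAM), and `source` is the table name from this thread:

```sql
-- Inspect key cache effectiveness (MyISAM):
-- a high Key_reads / Key_read_requests ratio means the cache is too small.
SHOW GLOBAL STATUS LIKE 'Key_read%';

-- Enlarge the key cache at runtime; value is in bytes.
SET GLOBAL key_buffer_size = 1024 * 1024 * 1024;

-- Optionally preload the source table's indexes before the big UPDATE.
LOAD INDEX INTO CACHE source;
```

Whether a larger cache helps here depends on whether the index of `source` fits in the cache during the correlated update; this is a sketch of the knobs, not a recommendation for this box.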
Slower performance with LOCK TABLES
According to everything I've read, using LOCK TABLES...WRITE for updates, inserts and deletes should improve performance of the mysql server, but I think I've been seeing the opposite effect. I've been doing quite a bit of testing on a 64-bit install of CentOS 5.5 installed as a guest OS on a VMware ESXi 4.0 hypervisor on a Dell R610. There are no other VMs on this box, and there are no other users or threads running on the OS. Just me. I'm using this box strictly for testing of large database migration scripts. It seems like when I execute some of these long-running statements without locking the tables, the code runs quite a bit faster than when I do lock the tables. And before testing each run, I restart the server so there is no query caching, and I also use FLUSH TABLES between each test run. All I'm asking is this: can anyone think of a scenario, on a single-user box and mysql instance, in which locking tables would cause these DML statements to slow down compared to not locking the tables? Thanks, -Hank
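For concreteness, the two runs being compared look roughly like this. The UPDATE and the lock statement are taken from Hank's follow-ups later in this thread (tables `dest` and `source`); the harness itself is illustrative:

```sql
-- Run A: plain correlated update, no table locks
-- (server restarted beforehand, as described above)
FLUSH TABLES;
UPDATE dest d STRAIGHT_JOIN source s
   SET d.seq = s.seq
 WHERE d.`key` = s.`key`;   -- `key` backquoted: it is a reserved word

-- Run B: the same update inside LOCK TABLES
-- (again after a restart and FLUSH TABLES)
LOCK TABLES dest AS d WRITE, source AS s READ;
UPDATE dest d STRAIGHT_JOIN source s
   SET d.seq = s.seq
 WHERE d.`key` = s.`key`;
UNLOCK TABLES;
```

Note that once a table is locked under an alias, all statements inside the lock must refer to it by that alias, which is why the lock and the update both use `d` and `s`.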
Re: myisamchk error (duplicate key records)
On Mon, Sep 19, 2011 at 7:19 AM, Johan De Meersman vegiv...@tuxera.be wrote: ----- Original Message ----- From: Hank hes...@gmail.com While running a -rq on a large table, I got the following error: myisamchk: warning: Duplicate key for record at 54381140 against record at 54380810 How do I find which records are duplicated (without doing the typical self-join or HAVING count(*) > 1 query)? This table has 144 million rows, so that's not really feasible.

Given that the error is a duplicate *key*, select key from table group by key having count(key) > 1 is an index-covered query, and should thus be perfectly feasible :-) What I'm not so sure about is whether the duplicate key will show up correctly in the index - as that index may be marked corrupt - and so, if it falls back to a full tablescan, it's indeed going to take a long time. If it does, however, there's no other option anyway: the only way to do it fast is an index, and that index is untrustworthy.

Exactly - I can't create an index on the table until I remove the duplicate records. I suppose I could create a non-unique index on the key fields and try that, but then I'd have to (1) create that index, and then (2) do the full table scan query. Either way, it's going to take a tremendous amount of time. Alternatively, it would be most helpful if the tools that find and output the offending record block numbers also provided a quick way to actually print out those offending rows, so I could track down how they got in there in the first place. -Hank
Re: myisamchk error (duplicate key records)
Exactly - I can't create an index on the table until I remove the duplicate records. I was under the impression you were seeing this during a myisamchk run - which indicates you should *already* have a key on that field. Or am I interpreting that wrong? I'm trying to rebuild an index after disabling all keys using myisamchk and adding all 144 million records, so there is no current index on the table. I suppose I could create a non-unique index on the key fields and try that, but then I'd have to (1) create that index, and then (2) do the full table scan query. No, if you create the index, you will not do a full tablescan afterwards. That's the whole point of an index :-) But in order to create the index, mysql has to do a full table scan AND a sort, which for 144 million records is going to take a very long time. So a full table scan without an index (i.e., ...HAVING count(*) > 1) will actually take less time. So like I said, it would be intuitive and helpful if the tool finding the dup records actually provided enough info to view, find, and fix them, so I don't have to index, sort, or scan the table a second time.
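Spelled out, the duplicate check Johan describes looks something like this. `key_col` and `tbl` stand in for the actual column and table, which the thread never names; with an index on `key_col` the first query is index-covered, and without one it is the single full-scan-plus-group pass Hank is resigned to:

```sql
-- Each duplicated key value and its occurrence count.
SELECT key_col, COUNT(*) AS dups
  FROM tbl
 GROUP BY key_col
HAVING COUNT(*) > 1;

-- Then pull the full offending rows for inspection:
SELECT t.*
  FROM tbl AS t
  JOIN (SELECT key_col
          FROM tbl
         GROUP BY key_col
        HAVING COUNT(*) > 1) AS d USING (key_col);
```

This only lists the duplicates; deciding which copy of each pair to delete still has to be done by hand or by some tiebreak column.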
Re: Quotes around INSERT and SELECT statements' arguments from the mysql CLI and PHP
What ugly style - if the value is not numeric and you throw it at the database, you are one of the many open to SQL injection, because if invalid values get that far, you have done no sanitizing before this point and are doing none here.

It's a matter of opinion. I never said the data wasn't sanitized (it is). But sometimes calculated values or bugs in PHP code end up with a null variable field. I was just suggesting the choice between two errors -- one a syntax error, which will generate a hard failure of the query and likely of whatever page, or a soft logical error, which won't. In either case, I have error trapping to catch both types of errors and alert me to them. I prefer the errors to be logical ones and not syntax errors.

$sql = "INSERT into table VALUES (" . (int)$id . ",'" . mysql_real_escape_string($val) . "')";

or, using an abstraction layer (a simple self-written class):

$sql = "INSERT into table VALUES (" . (int)$id . ",'" . $db->escape_string($val) . "')";

I think what you posted is ugly style, which makes the actual SQL in PHP code much harder to read and debug. The data validation should take place elsewhere, long before it gets to constructing the SQL statement.
Re: Quotes around INSERT and SELECT statements' arguments from the mysql CLI and PHP
Best of both worlds:

$username = $_POST['username'];
// do some stuff with username here
$M = array(); // Array of things to be inserted into MySQL
$M['username'] = mysql_real_escape_string($username); // Everything that goes into $M is escaped
$query = "INSERT INTO table (username) VALUES ('{$M['username']}')";

I'm not sure I'm seeing why, in particular, you are using an array here?
Re: Quotes around INSERT and SELECT statements' arguments from the mysql CLI and PHP
I want to be sure that all variables in the query are escaped. I don't trust myself or anyone else to do this to every variable right before the query: $someVar = mysql_real_escape_string($someVar);

But you're doing exactly that right before the query anyway with: $M['username'] = mysql_real_escape_string($username); You're just complicating things with the addition of an unneeded array. It seems much simpler and less cluttered to just do: $someVar = mysql_real_escape_string($someVar); before your insert. All you are doing is changing $someVar to $M[...] and then using $M[...] in the query. I really don't see the difference or benefit of using your array here. Both methods are doing exactly the same thing, except one is more convoluted. Now on the other hand, if you have several elements in the array $M to be inserted, and have a loop like this to escape them all at once:

foreach ($M as $key => $val) $M[$key] = mysql_real_escape_string($val);

then your method starts to make more sense. -Hank
myisamchk error (duplicate key records)
While running a -rq on a large table, I got the following error: myisamchk: warning: Duplicate key for record at 54381140 against record at 54380810 How do I find which records are duplicated (without doing the typical self-join or HAVING count(*) > 1 query)? This table has 144 million rows, so that's not really feasible. myisamchk --block-search # looked promising, but I can't find any documentation on how to use it properly. I tried myisamchk -b 54381140 table-name, but it really doesn't do anything. I posted this identical question here six years ago, and I have the same problem again. I still can't find ANY documentation on --block-search or how to use it. Has anything changed in six years? Thanks.
Re: Quotes around INSERT and SELECT statements' arguments from the mysql CLI and PHP
On Sun, Sep 18, 2011 at 12:28 PM, Dotan Cohen dotanco...@gmail.com wrote: On Sun, Sep 18, 2011 at 17:44, Brandon Phelps bphe...@gls.com wrote: Personally I don't use any quotes for the numeric types, and single quotes for everything else. Ie: Thanks, Brandon. I understand then that quote type is a matter of taste. I always use double quotes in PHP and I've only recently started putting ticks around table and column names. I'll stick to your convention of no quotes around numerics and single quotes around everything else.

I agree with Brandon's suggestions. I would just add that when using numeric types in PHP statements where you have a variable replacement, for instance:

$sql = "INSERT into table VALUES ('$id','$val')";

where $id is a numeric variable in PHP and a numeric field in the table, I'll include the $id in single quotes in the PHP statement, so even if the value of $id is null, alpha, or invalid (not numeric), it does not generate a mysql syntax error. Otherwise, without the single quotes, the statement would be: INSERT into table VALUES (,''); which would cause a syntax error. If you include the single quotes, it becomes: INSERT into table VALUES ('',''); which won't cause a syntax error, but might cause some logic errors in the database. The choice is yours.
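A sketch of the two failure modes being weighed, with an assumed table layout; the exact behavior of the quoted form depends on the server's sql_mode:

```sql
CREATE TABLE t (id INT, val VARCHAR(10));

-- $id empty and unquoted: the statement collapses to
--   INSERT INTO t VALUES (,'');
-- which is a hard syntax error and kills the whole query.

-- $id empty but quoted: the statement still parses.
INSERT INTO t VALUES ('','');
-- Under the permissive default sql_mode of the era, '' is coerced to 0
-- for the INT column with a warning -- a soft logic error in the data.
-- Under STRICT_ALL_TABLES / STRICT_TRANS_TABLES the insert is rejected.
```

Which failure is preferable is exactly the matter of opinion discussed above: a hard error surfaces bugs immediately, while the quoted form degrades gracefully at the cost of silently storing a 0.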
Re: Facebook Trapped In MySQL a 'Fate Worse Than Death'
Given the choice between doing it right the first time, or having the second largest site on the internet, I'll take the latter and deal with the problems of not doing it right the first time. -Hank On Tue, Jul 12, 2011 at 10:45 AM, Jerry Schwartz je...@gii.co.jp wrote: Let this be a lesson to all of those designers who say "That will never happen."
Re: Found serious replication data-corruption bug in 5.5.8/5.5.11 (auto-increment in primary key)
Sveta Smirnova at MySQL just confirmed this bug in 5.5.13: http://bugs.mysql.com/45670

On Wed, Jun 15, 2011 at 5:38 PM, Claudio Nanni claudio.na...@gmail.com wrote: No worries! I think I would have figured that out! I'll feedback you tomorrow. Thanks again Claudio

2011/6/15 Hank hes...@gmail.com Oops... big typo in above steps... add the following line: replicate-ignore-table=db.log to the SLAVE my.cnf, and restart the SLAVE server. The master does not need to be restarted or changed. Just the SLAVE. Sorry about that. -Hank Eskin

On Wed, Jun 15, 2011 at 5:19 PM, Claudio Nanni claudio.na...@gmail.com wrote: Great investigation Hank, congratulations. I will try this tomorrow morning (11:20pm now) and let you know if I can reproduce it on my environments. Thanks! Claudio

2011/6/15 Hank hes...@gmail.com Two additional notes: 1. Using the replicate-wild-ignore-table option in my.cnf produces the same results. 2. If the my.cnf replicate-ignore-table=db.log setting on the master is removed and mysql restarted so db.log is no longer ignored in replication, this bug goes away and correct results are reported on the slave. -Hank Eskin

On Wed, Jun 15, 2011 at 4:38 PM, Hank hes...@gmail.com wrote: This is a follow-up to my previous post. I have been narrowing down what is causing this bug. It is a timing issue in which auto-increment primary key values from a replication-ignored table leak over into a non-ignored table when inserts occur immediately after rows have been inserted into the ignored table. Basically, data from the ignored table is corrupting a non-ignored table on the slave upon immediate inserts.

Here is how to repeat. On the master, issue:

use db;
drop table test;
CREATE TABLE test (id int NOT NULL, cnt int unsigned NOT NULL AUTO_INCREMENT, PRIMARY KEY (id,cnt) ) ENGINE=MyISAM;
drop table log;
CREATE TABLE log (id int unsigned NOT NULL AUTO_INCREMENT, log varchar(20), PRIMARY KEY (id) ) ENGINE=MyISAM AUTO_INCREMENT = 4;

Make sure those two tables are created on the slave through regular replication. On the slave: desc test; desc log; Once replicated, on the master, add the following line to the [mysqld] section of my.cnf, and then restart mysql: replicate-ignore-table=db.log

Then on the master, issue the following statements as a copy/paste of all of them at once. It's critical that the statements are executed in immediate succession (no delays):

insert into log values (null,'info1');
insert into log values (null,'info2');
insert into log values (null,'info3');
insert into log values (null,'info4');
insert into test values (1,null);
insert into log values (null,'info5');
insert into test values (1,null);
insert into log values (null,'info6');
insert into test values (2,null);
insert into log values (null,'info7');
insert into test values (2,null);
insert into log values (null,'info8');

Here are the results from the master (all correct):

master> select * from log;
+-------+-------+
| id    | log   |
+-------+-------+
|     4 | info1 |
|     5 | info2 |
|     6 | info3 |
|     7 | info4 |
|     8 | info5 |
|     9 | info6 |
| 44450 | info7 |
| 44451 | info8 |
+-------+-------+

master> select * from test;
+----+-----+
| id | cnt |
+----+-----+
|  1 |   1 |
|  1 |   2 |
|  2 |   1 |
|  2 |   2 |
+----+-----+

Here are the results from the slave:

slave> select * from log;
Empty set (0.00 sec)   <-- as expected, since it is ignored

slave> select * from test;
+----+-------+
| id | cnt   |
+----+-------+
|  1 |     7 |  <-- should be 1, but has values from log on the master
|  1 |     8 |  <-- should be 2
|  2 |     9 |  <-- should be 1
|  2 | 44450 |  <-- should be 2
+----+-------+

If there is the slightest delay between the inserts into log and test, the replication happens correctly. Thoughts? -Hank Eskin

-- Claudio
Found serious replication data-corruption bug in 5.5.8/5.5.11 (auto-increment in primary key)
This is a follow-up to my previous post. I have been narrowing down what is causing this bug. It is a timing issue in which auto-increment primary key values from a replication-ignored table leak over into a non-ignored table when inserts occur immediately after rows have been inserted into the ignored table. Basically, data from the ignored table is corrupting a non-ignored table on the slave upon immediate inserts.

Here is how to repeat. On the master, issue:

use db;
drop table test;
CREATE TABLE test (id int NOT NULL, cnt int unsigned NOT NULL AUTO_INCREMENT, PRIMARY KEY (id,cnt) ) ENGINE=MyISAM;
drop table log;
CREATE TABLE log (id int unsigned NOT NULL AUTO_INCREMENT, log varchar(20), PRIMARY KEY (id) ) ENGINE=MyISAM AUTO_INCREMENT = 4;

Make sure those two tables are created on the slave through regular replication. On the slave: desc test; desc log; Once replicated, on the master, add the following line to the [mysqld] section of my.cnf, and then restart mysql: replicate-ignore-table=db.log

Then on the master, issue the following statements as a copy/paste of all of them at once. It's critical that the statements are executed in immediate succession (no delays):

insert into log values (null,'info1');
insert into log values (null,'info2');
insert into log values (null,'info3');
insert into log values (null,'info4');
insert into test values (1,null);
insert into log values (null,'info5');
insert into test values (1,null);
insert into log values (null,'info6');
insert into test values (2,null);
insert into log values (null,'info7');
insert into test values (2,null);
insert into log values (null,'info8');

Here are the results from the master (all correct):

master> select * from log;
+-------+-------+
| id    | log   |
+-------+-------+
|     4 | info1 |
|     5 | info2 |
|     6 | info3 |
|     7 | info4 |
|     8 | info5 |
|     9 | info6 |
| 44450 | info7 |
| 44451 | info8 |
+-------+-------+

master> select * from test;
+----+-----+
| id | cnt |
+----+-----+
|  1 |   1 |
|  1 |   2 |
|  2 |   1 |
|  2 |   2 |
+----+-----+

Here are the results from the slave:

slave> select * from log;
Empty set (0.00 sec)   <-- as expected, since it is ignored

slave> select * from test;
+----+-------+
| id | cnt   |
+----+-------+
|  1 |     7 |  <-- should be 1, but has values from log on the master
|  1 |     8 |  <-- should be 2
|  2 |     9 |  <-- should be 1
|  2 | 44450 |  <-- should be 2
+----+-------+

If there is the slightest delay between the inserts into log and test, the replication happens correctly. Thoughts? -Hank Eskin
Re: Found serious replication data-corruption bug in 5.5.8/5.5.11 (auto-increment in primary key)
Two additional notes: 1. Using the replicate-wild-ignore-table option in my.cnf produces the same results. 2. If the my.cnf replicate-ignore-table=db.log setting on the master is removed and mysql restarted so db.log is no longer ignored in replication, this bug goes away and correct results are reported on the slave. -Hank Eskin

On Wed, Jun 15, 2011 at 4:38 PM, Hank hes...@gmail.com wrote: This is a follow-up to my previous post. I have been narrowing down what is causing this bug. It is a timing issue in which auto-increment primary key values from a replication-ignored table leak over into a non-ignored table when inserts occur immediately after rows have been inserted into the ignored table. Basically, data from the ignored table is corrupting a non-ignored table on the slave upon immediate inserts.
Re: Found serious replication data-corruption bug in 5.5.8/5.5.11 (auto-increment in primary key)
Oops... big typo in above steps... add the following line: replicate-ignore-table=db.log to the SLAVE my.cnf, and restart the SLAVE server. The master does not need to be restarted or changed. Just the SLAVE. Sorry about that. -Hank Eskin

On Wed, Jun 15, 2011 at 5:19 PM, Claudio Nanni claudio.na...@gmail.com wrote: Great investigation Hank, congratulations. I will try this tomorrow morning (11:20pm now) and let you know if I can reproduce it on my environments. Thanks! Claudio

2011/6/15 Hank hes...@gmail.com Two additional notes: 1. Using the replicate-wild-ignore-table option in my.cnf produces the same results. 2. If the my.cnf replicate-ignore-table=db.log setting on the master is removed and mysql restarted so db.log is no longer ignored in replication, this bug goes away and correct results are reported on the slave. -Hank Eskin

-- Claudio
Re: Found a possible replication bug in 5.5.8/5.5.11 (auto-increment in primary key)
That is the slave relay log dump I posted (and mis-labeled). Thanks. -Hank

On Tue, Jun 14, 2011 at 2:34 AM, Claudio Nanni claudio.na...@gmail.com wrote: You should also have a look at the slave relay log. But in any case sounds like a bug. Claudio On Jun 14, 2011 5:18 AM, Hank hes...@gmail.com wrote: Both my master and slave bin logs look OK (I think)..

master bin log:

/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
SET TIMESTAMP=1308012505/*!*/;
SET @@session.pseudo_thread_id=9/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=0/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C latin1 *//*!*/;
SET @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
BEGIN
/*!*/;
use test/*!*/;
SET TIMESTAMP=1308012505/*!*/;
insert into test values (1,null)
/*!*/;
SET TIMESTAMP=1308012505/*!*/;
COMMIT
/*!*/;

slave relay log:

/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
SET TIMESTAMP=1308012505/*!*/;
SET @@session.pseudo_thread_id=9/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=0/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C latin1 *//*!*/;
SET @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
BEGIN
/*!*/;
use test/*!*/;
SET TIMESTAMP=1308012505/*!*/;
insert into test values (1,null)
/*!*/;
SET TIMESTAMP=1308012505/*!*/;
COMMIT
/*!*/;
-Hank

On Mon, Jun 13, 2011 at 10:38 PM, Hank hes...@gmail.com wrote: Yes, it's basic out-of-the-box mysql replication. This appears to be an instance of this bug: http://bugs.mysql.com/bug.php?id=45670 But that bug report was closed two years ago. I have no idea if it's the server sending bad data or the slaves. I think it's the slaves, because on the slave error, it clearly is getting this statement: insert into test values (1,null) to replicate, but when it is executed, the null is converted into a random number. But it's happening on all of my slaves, a mix of 32- and 64-bit 5.5.8 and 5.5.11 boxes. -Hank

On Mon, Jun 13, 2011 at 10:33 PM, Claudio Nanni claudio.na...@gmail.com wrote: Hank, I can't reproduce it right now, but it really seems a bug. Just a shot in the dark: are you sure you have statement-based and not mixed replication? I don't even know if that would affect it, just an idea. Claudio

On Jun 14, 2011 3:07 AM, Hank hes...@gmail.com wrote: Hello All, I have a 64-bit 5.5.8 master, and this bug appears on both 5.5.11 and 5.5.8, 32- and 64-bit, slaves (statement-based replication). I'm finding that an auto-increment field (part of a compound primary key) updates correctly using null to insert the next value on the master... but when this statement is replicated on the slaves, instead of inserting the next value of the auto-increment field, it inserts 65535 for 'smallint' definitions of 'cnt', and seemingly random high numbers around 469422 for definitions of 'int' or 'bigint'.
Easy to repeat:

master: CREATE TABLE test ( id int NOT NULL, cnt int unsigned NOT NULL AUTO_INCREMENT, PRIMARY KEY (id,cnt) ) ENGINE=MyISAM;

master: desc test;
+-------+---------+------+-----+---------+----------------+
| Field | Type    | Null | Key | Default | Extra          |
+-------+---------+------+-----+---------+----------------+
| id    | int(11) | NO   | PRI | NULL    |                |
| cnt   | int(11) | NO   | PRI | NULL    | auto_increment |
+-------+---------+------+-----+---------+----------------+

master: insert into test values (1,null);
master: select * from test;
+----+-----+
| id | cnt |
+----+-----+
|  1 |   1 |  <--- looks good!
+----+-----+

slave: desc test;
+-------+---------+------+-----+---------+----------------+
| Field | Type    | Null | Key | Default | Extra          |
+-------+---------+------+-----+---------+----------------+
| id    | int(11) | NO   | PRI | NULL    |                |
| cnt   | int(11) | NO   | PRI | NULL    | auto_increment |
+-------+---------+------+-----+---------+----------------+

slave: select * from test;
+----+--------+
| id | cnt    |
+----+--------+
|  1 | 469422 |  <--- should be 1
+----+--------+
Re: Found a possible replication bug in 5.5.8/5.5.11 (auto-increment in primary key)
The slave is receiving null in the statement-based insert, not an out-of-range number from the master. I've been doing more research all day on this bug and have a bit more information about what's causing it. I plan to write it up tomorrow and post it. Basically, everything works perfectly until I add a replication-ignore-table=xxx statement in my.cnf, where xxx is a different table with a unique INT auto-increment id as its single primary key. And then the values being inserted into the test table (above, not ignored) reflect the last insert id of the replication-*ignored* table on the slave. Yeah, pretty strange, I know. But totally repeatable. -Hank

2011/6/14 Halász Sándor h...@tbbs.net: 2011/06/13 22:38 -0400, Hank But that bug report was closed two years ago. I have no idea if it's the server sending bad data or the slaves. I think it's the slaves, because on the slave error, it clearly is getting this statement: insert into test values (1,null) to replicate, but when it is executed, the null is converted into a random number. But it's happening on all of my slaves, a mix of 32- and 64-bit 5.5.8 and 5.5.11 boxes.

If the master were sending random big numbers, and replication on the slave handled out-of-bound numbers in the usual way when not allowed to fail, then 65535 would be an expected value for an unsigned 16-bit number. Of course, if this were true, the slave would be getting not that statement but insert into test values (1,469422). -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=hes...@gmail.com
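[Editor's note] Halász's clamping theory is at least numerically consistent with both symptoms reported in this thread: 65535 is exactly the SMALLINT UNSIGNED ceiling, while 469422 fits untouched in an INT UNSIGNED. A small arithmetic sketch (the helper and values are illustrative, not MySQL internals):

```python
# Upper bounds that MySQL clamps to when an out-of-range value is stored
# under a non-strict sql_mode (sql_mode=0, as shown in the binlog dumps).
SMALLINT_UNSIGNED_MAX = 2**16 - 1   # 65535
INT_UNSIGNED_MAX = 2**32 - 1        # 4294967295

def clamp_unsigned(value, col_max):
    """Illustrative: non-strict MySQL stores the nearest bound for
    out-of-range values instead of raising an error."""
    return max(0, min(value, col_max))

# A large "leaked" counter (469422, as observed on the slave) survives in an
# unsigned INT column, but collapses to 65535 in a SMALLINT column -- the two
# symptoms Hank reports for the int and smallint variants of the test table.
print(clamp_unsigned(469422, INT_UNSIGNED_MAX))       # 469422
print(clamp_unsigned(469422, SMALLINT_UNSIGNED_MAX))  # 65535
```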
Found a possible replication bug in 5.5.8/5.5.11 (auto-increment in primary key)
these results? -Hank
Re: Found a possible replication bug in 5.5.8/5.5.11 (auto-increment in primary key)
Yes, it's basic out-of-the-box mysql replication. This appears to be an instance of this bug: http://bugs.mysql.com/bug.php?id=45670 But that bug report was closed two years ago. I have no idea if it's the server sending bad data or the slaves. I think it's the slaves, because on the slave error, it clearly is getting this statement: insert into test values (1,null) to replicate, but when it is executed, the null is converted into a random number. But it's happening on all of my slaves, a mix of 32- and 64-bit 5.5.8 and 5.5.11 boxes. http://bugs.mysql.com/bug.php?id=45670 -Hank

On Mon, Jun 13, 2011 at 10:33 PM, Claudio Nanni claudio.na...@gmail.com wrote: Hank, I can't reproduce it right now, but it really seems a bug. Just a shot in the dark: are you sure you have statement-based and not mixed replication? I don't even know if that would affect it, just an idea. Claudio

On Jun 14, 2011 3:07 AM, Hank hes...@gmail.com wrote: Hello All, I have a 64-bit 5.5.8 master, and this bug appears on both 5.5.11 and 5.5.8, 32- and 64-bit, slaves (statement-based replication). I'm finding that an auto-increment field (part of a compound primary key) updates correctly using null to insert the next value on the master.. but when this statement is replicated on the slaves, instead of inserting the next value of the auto-increment field, it inserts 65535 for 'smallint' definitions of 'cnt' and seemingly high random numbers around 469422 for definitions of 'int' or 'bigint'.

Easy to repeat:

master: CREATE TABLE test ( id int NOT NULL, cnt int unsigned NOT NULL AUTO_INCREMENT, PRIMARY KEY (id,cnt) ) ENGINE=MyISAM;

master: desc test;
+-------+---------+------+-----+---------+----------------+
| Field | Type    | Null | Key | Default | Extra          |
+-------+---------+------+-----+---------+----------------+
| id    | int(11) | NO   | PRI | NULL    |                |
| cnt   | int(11) | NO   | PRI | NULL    | auto_increment |
+-------+---------+------+-----+---------+----------------+

master: insert into test values (1,null);
master: select * from test;
+----+-----+
| id | cnt |
+----+-----+
|  1 |   1 |  <--- looks good!
+----+-----+

slave: desc test;
+-------+---------+------+-----+---------+----------------+
| Field | Type    | Null | Key | Default | Extra          |
+-------+---------+------+-----+---------+----------------+
| id    | int(11) | NO   | PRI | NULL    |                |
| cnt   | int(11) | NO   | PRI | NULL    | auto_increment |
+-------+---------+------+-----+---------+----------------+

slave: select * from test;
+----+--------+
| id | cnt    |
+----+--------+
|  1 | 469422 |  <--- should be 1
+----+--------+

But the problem continues...

master: insert into test values (1,null);
master: select * from test;
+----+-----+
| id | cnt |
+----+-----+
|  1 |   1 |  <--- correct!
|  1 |   2 |  <--- correct!
+----+-----+

slave: select * from test;
+----+--------+
| id | cnt    |
+----+--------+
|  1 | 469422 |  <--- should be 1
|  1 | 470673 |  <--- should be 2
+----+--------+

Now if I repeat the entire scenario using smallint for the 'cnt' field, here are the results:

master: CREATE TABLE test ( id int NOT NULL, cnt smallint unsigned NOT NULL AUTO_INCREMENT, PRIMARY KEY (id,cnt) ) ENGINE=MyISAM;

master: desc test;
+-------+----------------------+------+-----+---------+----------------+
| Field | Type                 | Null | Key | Default | Extra          |
+-------+----------------------+------+-----+---------+----------------+
| id    | int(11)              | NO   | PRI | NULL    |                |
| cnt   | smallint(5) unsigned | NO   | PRI | NULL    | auto_increment |
+-------+----------------------+------+-----+---------+----------------+

master: insert into test values (1,null);
master: select * from test;
+----+-----+
| id | cnt |
+----+-----+
|  1 |   1 |  <--- correct!
+----+-----+

slave: select * from test;
+----+-------+
| id | cnt   |
+----+-------+
|  1 | 65535 |  <--- should be 1
+----+-------+

but this is different:

master: insert into test values (1,null);
master: select * from test;
+----+-----+
| id | cnt |
+----+-----+
|  1 |   1 |  <--- correct!
|  1 |   2 |  <--- correct!
+----+-----+

slave: select * from test;
+----+-------+
| id | cnt   |
+----+-------+
|  1 | 65535 |  <--- should be 1; missing the second record, too
+----+-------+

slave: show slave status;
Error 'Duplicate entry '1-65535' for key 'PRIMARY'' on query. Default database: 'test'. Query: 'insert into test values (1,null)' ..
at which point I have to restart the slave due to the error:

slave: SET GLOBAL sql_slave_skip_counter=1; slave start;
slave: select * from test;
+----+-------+
| id | cnt   |
+----+-------+
|  1 | 65535 |  <--- should be 1; still missing the second record, too (of course)
+----+-------+

Now if I manually replicate the statements just on the slave - it works perfectly:

slave: truncate table test;
slave: insert into test values (1,null
Re: Found a possible replication bug in 5.5.8/5.5.11 (auto-increment in primary key)
Both my master and slave bin logs look OK (I think)..

master bin log:

/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
SET TIMESTAMP=1308012505/*!*/;
SET @@session.pseudo_thread_id=9/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=0/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C latin1 *//*!*/;
SET @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
BEGIN
/*!*/;
use test/*!*/;
SET TIMESTAMP=1308012505/*!*/;
insert into test values (1,null)
/*!*/;
SET TIMESTAMP=1308012505/*!*/;
COMMIT
/*!*/;

slave bin log:

/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
SET TIMESTAMP=1308012505/*!*/;
SET @@session.pseudo_thread_id=9/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=0/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C latin1 *//*!*/;
SET @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
BEGIN
/*!*/;
use test/*!*/;
SET TIMESTAMP=1308012505/*!*/;
insert into test values (1,null)
/*!*/;
SET TIMESTAMP=1308012505/*!*/;
COMMIT
/*!*/;

-Hank

On Mon, Jun 13, 2011 at 10:38 PM, Hank hes...@gmail.com wrote: Yes, it's basic out-of-the-box mysql replication. This appears to be an instance of this bug: http://bugs.mysql.com/bug.php?id=45670 But that bug report was closed two years ago.
I have no idea if it's the server sending bad data or the slaves. I think it's the slaves, because on the slave error, it clearly is getting this statement: insert into test values (1,null) to replicate, but when it is executed, the null is converted into a random number. But it's happening on all of my slaves, a mix of 32 and 64 bit 5.5.8 and 5.5.11 boxes. http://bugs.mysql.com/bug.php?id=45670 -Hank

On Mon, Jun 13, 2011 at 10:33 PM, Claudio Nanni claudio.na...@gmail.com wrote: [snip]

On Jun 14, 2011 3:07 AM, Hank hes...@gmail.com wrote: [snip]
Using @ variables with LIKE,CONCAT
This used to work fine in Mysql 4.3, but no longer works in 5.5.8:

set @txt='needle';
select * from table where field like CONCAT('%',@txt,'%');
-- returns the null set.

If I substitute like this:
select * from table where field like '%needle%';
it works perfectly (and as it did in 4.x). How can I get this to work in 5.5.x? Thanks, -Hank
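[Editor's note] The 5.5-specific regression can't be reproduced here, but the underlying pattern-building approach is easy to sanity-check on any SQL engine. A stand-in sketch on SQLite (table, column, and data all invented for the demo); in MySQL the pattern string would be built with CONCAT or, more robustly, bound as a single parameter from the application:

```python
import sqlite3

# SQLite stand-in for the MySQL question above: concatenate the wildcards
# around a bound parameter instead of splicing a session variable into LIKE.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (field TEXT)")
conn.executemany("INSERT INTO t VALUES (?)", [("a needle here",), ("hay",)])

needle = "needle"
rows = conn.execute(
    "SELECT field FROM t WHERE field LIKE '%' || ? || '%'", (needle,)
).fetchall()
print(rows)  # [('a needle here',)]
```

Binding the whole pattern (`'%' + needle + '%'`) as one parameter works the same way and sidesteps any server-side variable handling entirely.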
Highest version of MySQL available for 2.4 kernel (RH/Cent OS 3.5)?
What is the highest version of MySQL available for a 2.4 kernel (Redhat/Cent OS 3.5)? And where can I find it to download? Thanks, -Hank
Moving data between two servers with SQL
Hello, I have a background process that runs as a combination of PHP and MySQL. The end result is records in a table on server #1 (but not the entire table; just a small subset of the table needs to move). What's the easiest/cleanest way of moving those records to an identical table on another server? In Oracle, we used to be able to set up connection profiles and move data between servers with SQL, but I'm guessing that's not easy to do with MySQL. I'd prefer not to use mysql command-line client commands to save the data as an OS file and then import that into the other server using another mysql command-line client command. I'd like to find something cleaner than that. I'm using 5.5.8. thanks, -Hank
Re: Moving data between two servers with SQL
Wow, that might just work! I've seen Federated tables mentioned before, but I never knew that's what they were for.. thanks. Can I have the host (remote) table on a MySQL 4.1.x server, and the federated table on a 5.5.8 server? -Hank

On Tue, Jan 4, 2011 at 4:15 PM, Shawn Green (MySQL) shawn.l.gr...@oracle.com wrote: On 1/4/2011 15:53, Hank wrote: Hello, I have a background process that runs as a combination of PHP and MySQL. The end result is records in a table on server #1 (but not the entire table; just a small subset of the table needs to move). What's the easiest/cleanest way of moving those records to an identical table on another server? [snip] I'm using 5.5.8. thanks, -Hank

Have you looked at the FEDERATED storage engine? http://dev.mysql.com/doc/refman/5.5/en/federated-storage-engine.html -- Shawn Green MySQL Principal Technical Support Engineer Oracle USA, Inc. Office: Blountville, TN
Re: Moving data between two servers with SQL
Also, can I do this: insert into federated_table select * from local_table? -Hank On Tue, Jan 4, 2011 at 4:15 PM, Shawn Green (MySQL) shawn.l.gr...@oracle.com wrote: On 1/4/2011 15:53, Hank wrote: Hello, I have a background process that runs as a combination of PHPMySQL. The end results are records in a table on server #1 (but not the entire table, just a small subset of the table needs to move). What's the easiest/cleanest way of moving those records to an identical table on another server? In Oracle, we used to be able to set up connection profiles and move data between servers with SQL, but I'm guessing that's not easy to do with MySQL. I'd prefer not to use mysql command line client commands to save the data as an OS file and then import that into the other server using another mysql command line client command. I'd like to find something cleaner than that. I'm using 5.5.8. thanks, -Hank Have you looked at the FEDERATED storage engine? http://dev.mysql.com/doc/refman/5.5/en/federated-storage-engine.html -- Shawn Green MySQL Principal Technical Support Engineer Oracle USA, Inc. Office: Blountville, TN
Re: Moving data between two servers with SQL
Nevermind -- it's working absolutely perfectly between 5.5.8 and 4.1.x. Thanks again for the push. -Hank On Tue, Jan 4, 2011 at 5:14 PM, Hank hes...@gmail.com wrote: Also, can I do this: insert into federated_table select * from local_table? -Hank On Tue, Jan 4, 2011 at 4:15 PM, Shawn Green (MySQL) shawn.l.gr...@oracle.com wrote: On 1/4/2011 15:53, Hank wrote: Hello, I have a background process that runs as a combination of PHPMySQL. The end results are records in a table on server #1 (but not the entire table, just a small subset of the table needs to move). What's the easiest/cleanest way of moving those records to an identical table on another server? In Oracle, we used to be able to set up connection profiles and move data between servers with SQL, but I'm guessing that's not easy to do with MySQL. I'd prefer not to use mysql command line client commands to save the data as an OS file and then import that into the other server using another mysql command line client command. I'd like to find something cleaner than that. I'm using 5.5.8. thanks, -Hank Have you looked at the FEDERATED storage engine? http://dev.mysql.com/doc/refman/5.5/en/federated-storage-engine.html -- Shawn Green MySQL Principal Technical Support Engineer Oracle USA, Inc. Office: Blountville, TN
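[Editor's note] For readers landing on this thread: a FEDERATED table is declared locally with a column list matching the remote table, plus a CONNECTION string pointing at the remote server. A minimal sketch of the kind of setup discussed above (table name, columns, host, and credentials are all placeholders, not Hank's actual schema):

```sql
-- On the local 5.5.8 server: a FEDERATED "window" onto the remote table.
-- Format: mysql://user[:password]@host[:port]/db_name/tbl_name
CREATE TABLE remote_results (
  id   INT NOT NULL,
  data VARCHAR(255),
  PRIMARY KEY (id)
) ENGINE=FEDERATED
  CONNECTION='mysql://app_user:secret@remote-host:3306/mydb/results';

-- Moving the subset then becomes a single SQL statement, as in the thread:
INSERT INTO remote_results
SELECT id, data FROM local_results WHERE processed = 1;
```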
Re: odd problem with select as statement
Here's my 5-second guess: 4E5664736... is being interpreted as a number in scientific notation, i.e. 4*10^5664736, and the parser doesn't like that as a field name. -Hank

On Mon, Dec 20, 2010 at 12:43 PM, Ramsey, Robert L robert-ram...@uiowa.edu wrote: I am having the hardest time getting a select as statement right. Here is the full query:

select
SUM(IF(image='EE0840D00E2ED8A317E5FA9899C48C19',1,0)) as EE0840D00E2ED8A317E5FA9899C48C19,
SUM(IF(image='235C7987796D5B7CEBF56FBDA2BF7815',1,0)) as 235C7987796D5B7CEBF56FBDA2BF7815,
SUM(IF(image='96DC0562ED6E6F7FE789A18E09BC5889',1,0)) as 96DC0562ED6E6F7FE789A18E09BC5889,
SUM(IF(image='D8B0EA710D2EF408391132F451AE724A',1,0)) as D8B0EA710D2EF408391132F451AE724A,
SUM(IF(image='018C4DB7229D7D2BEB040D241739B784',1,0)) as 018C4DB7229D7D2BEB040D241739B784,
SUM(IF(image='98DE1FCA50AC9CE6E0FEA25BAB0177FE',1,0)) as 98DE1FCA50AC9CE6E0FEA25BAB0177FE,
SUM(IF(image='4E5664736F400E8B482EA7AA67853D13',1,0)) as 4E5664736F400E8B482EA7AA67853D13,  -- offending line
SUM(IF(image='FEB810A43A1B275605BD6B69F444700C',1,0)) as FEB810A43A1B275605BD6B69F444700C
from dsrssfeed

If I remove that one line, the query works fine. If I do:
select SUM(IF(image='4E5664736F400E8B482EA7AA67853D13',1,0)) as 4E from dsrssfeed;
it works. But these fail:
select SUM(IF(image='4E5664736F400E8B482EA7AA67853D13',1,0)) as 4E5664736F400E8B482EA7AA67853D13 from dsrssfeed;
select SUM(IF(image='4E5664736F400E8B482EA7AA67853D13',1,0)) as 4E5 from dsrssfeed;

It can't be field-name length, since even 4E5 fails; and the field name can start with a number, since 4E succeeds. Any ideas? The goal is to see which arbitrary images have information associated with them. The table has two fields: image is a UID that is the primary key, and caption, a varchar(255) that has information about the image. Images are added and deleted from the table as they are changed on a web page. The UID is generated by a third-party program that I have to interface with and have no control over.
An array of image UIDs is sent to the php script and the script needs to determine which UIDs are present in the table. Rather than make N number of individual queries as I iterate through the array, I iterate through the array and build the query on the fly to make one query. Then I iterate through the array again and check the value in the field. 1 means the UID has an entry, 0 means it doesn't. I thought doing 1 mysql call would be more efficient than lots of calls as I iterate through the array. But since there will probably never be more than 100 images in the table at any one time, it may not make any difference. But now I'm just curious as to why this is happening. Thanks, Bob
Re: odd problem with select as statement
i.e. just try this:

mysql> select 4E5664736F400E8B482EA7AA67853D13;
ERROR 1367 (22007): Illegal double '4E5664736' value found during parsing

-Hank

On Mon, Dec 20, 2010 at 12:50 PM, Hank hes...@gmail.com wrote: Here's my 5 second guess.. 4E5664736... is being interpreted as a number in scientific notation .. i.e. 4*10^5664736 and the parser doesn't like that as a field name. -Hank

On Mon, Dec 20, 2010 at 12:43 PM, Ramsey, Robert L robert-ram...@uiowa.edu wrote: [snip]
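[Editor's note] The failing aliases are exactly the ones a numeric lexer can consume as a float literal: `4E5` is 4x10^5, so the tokenizer commits to a number before ever considering an identifier, while `4E` stops short (no digits after the exponent marker). Python's float parser makes a convenient stand-in for this lexing rule (an analogy, not MySQL's actual lexer):

```python
def looks_like_float_literal(token):
    """True if a numeric parser can consume the token as a float literal."""
    try:
        float(token)
        return True
    except ValueError:
        return False

# '4E5' is a valid float (400000.0), so it cannot be a bare identifier.
print(looks_like_float_literal("4E5"))  # True
# '4E' has no exponent digits, so it falls through to being an identifier.
print(looks_like_float_literal("4E"))   # False
# The full UID fails float parsing only after the lexer has committed to a
# number -- hence MySQL's "Illegal double ... found during parsing".
print(looks_like_float_literal("4E5664736F400E8B482EA7AA67853D13"))  # False
```

In MySQL, backtick-quoting the alias (`` as `4E5664736F400E8B482EA7AA67853D13` ``) should sidestep the issue, since quoted identifiers are never lexed as numbers.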
A better REPAIR TABLE for myisam tables (or for upgrading tables)
I've posted a similar post in the past -- but there I was mucking around with blank index files and frm files to fool myisamchk into repairing a table. But now I think I've come across a much better and more efficient way to do a REPAIR TABLE in order to upgrade my database tables from MySQL 4.1.x to 5.5.8. All this comes from the fact that REPAIR TABLE does not rebuild the table indexes the way myisamchk does, which is very unfortunate. Sure, REPAIR TABLE works great for small tables, but if you have any tables of larger size (millions of records or more, with multiple indexes), REPAIR TABLE can take hours or days to do a simple repair/upgrade. And in most cases, applications just can't be down for that long during an upgrade cycle (not everyone runs a huge shop with multiple dev/test/upgrade/production servers). So here is what I have done, and propose this as a better REPAIR TABLE for MyISAM tables (in pseudo code):

1. Retrieve the original CREATE TABLE DDL with show create table SOURCE
2. Modify the DDL to change the table name to a new target table, let's call it TARGET
3. Execute the new DDL to create an empty TARGET table
4. Run 'myisamchk -r --keys-used=0 TARGET' (to disable all index keys on the new table)
5. flush tables; lock table SOURCE read, TARGET write;
6. insert into TARGET select * from SOURCE;
7. flush tables; unlock tables;
8. 'myisamchk -prqn TARGET' (repair to re-enable all keys, do not modify the MYD file, use sorting, in parallel)
9. rename tables to replace SOURCE with TARGET

I've written a PHP script to do exactly this, and it works beautifully. My source tables are MySQL 4.1.x tables, and the target tables are now fully 5.5-compliant (verified with mysqlcheck --check-upgrade). The best part is the speed: for a table with 50 million short rows it ran in 7 minutes, and for a table with 30 million rows, 4 minutes. I'm now running it on a table with over 200 million rows, and I expect it to take an hour or so...
but in all cases, doing a REPAIR TABLE on any of these large tables would take days to complete. So why can't the REPAIR TABLE command do something like this in the background for large MYISAM tables? -Hank
Re: A better REPAIR TABLE for myisam tables (or for upgrading tables)
Sorry... One small correction to my above post: 'FLUSH TABLES' should be issued between steps 8 and 9. My 200+ million record table completed in 71 minutes. -Hank
Mysql 5.5.x -- when is it going to go GA?
Mysql 5.5 -- when is it going to go GA? And when it does, which version will it be? 5.5.8 or 5.5.6rc? Thanks, -Hank
Re: Purposely Corrupting a table
It's easy to corrupt the MyISAM index (MYI) file... I do something like this in Linux -- assuming your table is not tiny, and mysql isn't running or you have a lock on the table:

dd if=table.MYI of=table2.MYI bs=2048 count=100

then copy table2.MYI over table.MYI, then flush tables, and then unlock. Your table will be unreadable until you rebuild the index with REPAIR TABLE or myisamchk. The MYD file will remain intact. If your MYI file is smaller than 200k, just reduce the count=#. -Hank

On Tue, Oct 19, 2010 at 7:53 PM, Steve Staples sstap...@mnsi.net wrote: Ok, been googling all morning, and keep getting the same post (on multiple different sites). Is there a way where I can corrupt a table purposely? I've tried playing with the .MYD file, and yeah, it marks it deleted under the check routine, but the table is still readable/writable, just doesn't have any info when selecting it... is there another way to corrupt the table, where you can't even select from it, or the response back from a select is an error? -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
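[Editor's note] The dd recipe works because the copy stops after bs x count bytes, so copying the short file back over the original discards everything past that point. A scaled-down, self-contained sketch of the same idea (a throwaway random file stands in for a real .MYI; run in a scratch directory):

```shell
cd "$(mktemp -d)"

# A 10,000-byte stand-in for an index file.
head -c 10000 /dev/urandom > table.MYI

# Keep only the first 4 KB (bs=2048 x count=2), mirroring the recipe above.
dd if=table.MYI of=table2.MYI bs=2048 count=2 2>/dev/null
mv table2.MYI table.MYI

wc -c < table.MYI   # 4096 -- everything past the first two blocks is gone
```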
Re: Constructing query to display item count based on increments of time
Here's what I came up with:

select concat(left(DATE_FORMAT(start_time, '%Y-%m-%d %H:%i'),15),'0') as time, count(*) from table group by time

-Hank

How would one go about constructing a query that counts items within an increment or span of time, let's say increments of 10 minutes? Imagine a simple table where each row has a timestamp, and the query should return the count of items occurring within the timespan of a defined period. Say,

09:00: 14
09:10: 31
09:20: 25
09:30: 0
09:40: 12

etc. I have been able to get collections of item occurrence based on month and day by using GROUP BY together with a DATE_FORMAT(start_time, '%d %m %Y'), e.g. I can, however, not seem to find the solution to grouping based on the minute increments in my example above. Any suggestions? -- Kind regards, Pascual Strømsnæs
Re: Constructing query to display item count based on increments of time
Ok, I can see that. Here's a different approach that gets it down to two function calls and some math.. and the DATE_FORMAT call might not even be needed, depending on the actual application.

select DATE_FORMAT(start_time, '%Y-%m-%d %H:') as dhour, 10*(minute(start_time) DIV 10) as dtime, count(*) from table group by dhour, dtime;

-Hank

On Wed, Oct 6, 2010 at 4:22 PM, Johan De Meersman vegiv...@tuxera.be wrote: Two people already who suggested a text-based approach vs. my numeric approach. Analysing, my method takes a single function call per record (to_unixtime); Travis' takes 4 (concat, date_format, truncate, minute) and Hank's 3 (concat, left, date_format). Someone feel like benchmarking? :-D

On Wed, Oct 6, 2010 at 5:44 PM, Hank hes...@gmail.com wrote: Here's what I came up with: select concat(left(DATE_FORMAT(start_time, '%Y-%m-%d %H:%i'),15),'0') as time, count(*) from table group by time -Hank [snip]
-- Kind regards, Pascual Strømsnæs

-- Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel
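[Editor's note] The arithmetic variant of the 10-minute bucketing discussed above is easy to demonstrate on any SQL engine. A stand-in sketch on SQLite (strftime and printf in place of MySQL's DATE_FORMAT; table, column, and data invented for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (start_time TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?)",
    [("2010-10-06 09:04:00",), ("2010-10-06 09:07:30",),
     ("2010-10-06 09:12:00",), ("2010-10-06 09:41:59",)],
)

# Truncate each timestamp to its 10-minute bucket:
# (minute / 10) * 10 keeps only the tens digit of the minute.
rows = conn.execute(
    """
    SELECT strftime('%Y-%m-%d %H:', start_time)
           || printf('%02d', (strftime('%M', start_time) / 10) * 10) AS bucket,
           COUNT(*)
    FROM events
    GROUP BY bucket
    ORDER BY bucket
    """
).fetchall()
print(rows)
# [('2010-10-06 09:00', 2), ('2010-10-06 09:10', 1), ('2010-10-06 09:40', 1)]
```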
Re: Not to show until a certain date
On Fri, Oct 1, 2010 at 2:34 PM, Joerg Bruehe joerg.bru...@oracle.com wrote: Hi! Hank wrote: On Wed, Sep 29, 2010 at 8:33 AM, Patrice Olivier-Wilson b...@biz-comm.com wrote: On 9/28/10 8:33 PM, Chris W wrote:

SELECT * FROM announcements WHERE announcements_expiredate > CURDATE() AND announcements_postdate <= CURDATE() ORDER BY announcements_expiredate ASC

Or how about something like this:

SELECT * FROM announcements WHERE CURDATE() between announcements_postdate and announcements_expiredate ORDER BY announcements_expiredate ASC

The syntax is correct, but I don't think this statement will be optimized as well as the other proposal: BETWEEN is intended for column BETWEEN const1 AND const2, whereas your statement is const BETWEEN column1 AND column2.

But that only really matters if there are indexes on the column1 and column2 fields. And for the optimizer, wouldn't it make sense to map BETWEEN into two comparison statements (column > const1 and column <= const2) or (const > column1 and const <= column2), where in both scenarios the optimizer may be able to use indexes on the fields? It's exactly the same as the other proposal: CURDATE() > announcements_postdate and CURDATE() <= announcements_expiredate, which still uses two different fields for the comparisons... so wouldn't both scenarios end up in the exact same place? -Hank
Re: Not to show until a certain date
I would argue that it's not a logical error. Typically, in my experience when something expires, the date of expiry is inclusive, and any date after expiry is not. Take any discount, coupon, or promotion -- if it expires on December 31, it's still available on that day. If an article expires on December 31, I would interpret that as still being valid on December 31, and not valid on January 1. Another example is Credit Cards expire on the last day of the month of expiry, but they are still valid on that date. But assuming for a moment that for this application, the date of expiry is not inclusive (i.e. that expirydate actually means date of deletion), one could still do: curdate() between postdate and date_sub(expiredate, INTERVAL 1 day) And to reply to Joerg Bruehe, I have used this method of BETWEEN many times over the years without any performance or optimization issues... but sure, that doesn't mean they will never exist for other applications, but it has worked well for me. Best, -Hank On Fri, Oct 1, 2010 at 4:50 PM, BMBasal bmb37...@gmail.com wrote: Your suggestion seems more elegant. However, you missed the mathematical meaning of BETWEEN in SQL: it is inclusive of both lower and upper bounds. In the case raised by Patrice Olivier-Wilson, when an announcement expires on announcements_expiredate, it should not show on that date, and thereafter. But using BETWEEN, it will show on announcements_expiredate, thus a logical error. -Original Message- From: Hank [mailto:hes...@gmail.com] Sent: Friday, October 01, 2010 4:10 PM To: Joerg Bruehe Cc: mysql@lists.mysql.com Subject: Re: Not to show until a certain date On Fri, Oct 1, 2010 at 2:34 PM, Joerg Bruehe joerg.bru...@oracle.com wrote: Hi! 
Hank wrote: On Wed, Sep 29, 2010 at 8:33 AM, Patrice Olivier-Wilson b...@biz-comm.com wrote: On 9/28/10 8:33 PM, Chris W wrote: SELECT * FROM announcements WHERE announcements_expiredate > CURDATE() AND announcements_postdate <= CURDATE() ORDER BY announcements_expiredate ASC Or how about something like this: SELECT * FROM announcements WHERE CURDATE() BETWEEN announcements_postdate AND announcements_expiredate ORDER BY announcements_expiredate ASC The syntax is correct, but I don't think this statement will be optimized as well as the other proposal: BETWEEN is intended for "column BETWEEN const1 AND const2", whereas your statement is "const BETWEEN column1 AND column2". But that only really matters if there are indexes on the column1 and column2 fields. And for the optimizer, wouldn't it make sense to map BETWEEN into two comparison statements, (column >= const1 AND column <= const2) or (const >= column1 AND const <= column2), where in both scenarios the optimizer may be able to use indexes on the fields? It's exactly the same as the other proposal: CURDATE() >= announcements_postdate AND CURDATE() <= announcements_expiredate, which still is using two different fields for the comparisons... so wouldn't both scenarios end up in the exact same place? -Hank -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql?unsub=li...@cs.albany.edu -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql?unsub=hes...@gmail.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Re: Not to show until a certain date
On Wed, Sep 29, 2010 at 8:33 AM, Patrice Olivier-Wilson b...@biz-comm.com wrote: On 9/28/10 8:33 PM, Chris W wrote: SELECT * FROM announcements WHERE announcements_expiredate > CURDATE() AND announcements_postdate <= CURDATE() ORDER BY announcements_expiredate ASC Or how about something like this: SELECT * FROM announcements WHERE CURDATE() BETWEEN announcements_postdate AND announcements_expiredate ORDER BY announcements_expiredate ASC -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
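As a minimal sketch of the equivalence Hank asserts (using SQLite as a stand-in for MySQL, with ISO date strings so lexicographic comparison matches date order; the sample rows are illustrative, not from the thread), the BETWEEN form and the pair of inclusive comparisons return identical rows:

```python
import sqlite3

# SQLite stand-in: CURDATE() BETWEEN postdate AND expiredate is just the
# inclusive pair of comparisons. Table/column names follow the thread;
# the data is made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE announcements (
    announcements_postdate TEXT,
    announcements_expiredate TEXT)""")
rows = [("2010-09-01", "2010-09-30"),   # already expired
        ("2010-09-15", "2010-10-15"),   # currently live
        ("2010-10-10", "2010-10-20")]   # not yet posted
conn.executemany("INSERT INTO announcements VALUES (?,?)", rows)

today = "2010-10-01"
between = conn.execute(
    "SELECT * FROM announcements WHERE ? BETWEEN announcements_postdate "
    "AND announcements_expiredate", (today,)).fetchall()
pair = conn.execute(
    "SELECT * FROM announcements WHERE ? >= announcements_postdate "
    "AND ? <= announcements_expiredate", (today, today)).fetchall()
print(between == pair)   # True: the two forms select the same rows
```

Note that both forms are inclusive of the expiry date, which is exactly the point BMBasal raises later in the thread.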
Re: Migrating my mindset from MyISAM to InnoDB
2. Don't stare at the screen. Start it, script the process have it email your phone when it's done. Do something else in the mean time. I don't literally stare at the screen -- of course I script it and do other things.. but when I have a resource limited environment, it sure would be nice to have *some idea* of the progress of the rebuild. By staring at the blank screen, I really meant to say that there is absolutely no feedback at all during the process, to get even any idea of how far it has completed and how far it has to go. From my initial tests at rebuilding a 5.6 million record table (4.75 hours), trying to rebuild a 200 million record table would take more than 7 days. And I have two of those tables to rebuild. I can accomplish the same myISAM rebuild in two hours. Unfortunately, no. MySQL threads should really make periodic updates to their status so you can see the progress of long-running queries in the show processlist output. http://bugs.mysql.com/bug.php?id=26182 included a patch that adds progress updates to select statements, so it should be possible to do the same for ALTER TABLEs as well. Wow, that sure would be nice... even with some extended information like myisamchk output. That would be an awesome feature to add to 5.5. Expect to see anywhere from a 1.5x to a 3x increase in size when converting from myisam to innodb, depending on your field types and indexes. It's the penalty you pay for supporting transactions and concurrent read/write access, and for switching to an index-organized table. Now that you put it that way, I'm thinking of just sticking with myisam. I can't spend two weeks upgrading the two 200 million row tables. Thanks for all your comments. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Migrating my mindset from MyISAM to InnoDB
Primarily due to many positive posts I've seen about MySQL 5.5 and advances in InnoDB, I'm seriously considering converting all my MyISAM databases to InnoDB. I don't need many of the InnoDB features, but if I'm going to upgrade from 4.1.14 to 5.5, I might as well bite the bullet, since that seems to be the direction of MySQL/Oracle. I've been using MySQL 4.1.14 for years in my production environment, including one master and several slaves for reports and long-running queries. Every 6 to 12 months the master MYI index files grow fairly large, so I take the production database offline and run myisamchk -r on the index files to rebuild them and shrink them back down again. I usually get a 20% to 30% space saving and improved performance after the rebuilds. This has worked very well for me for, well, almost 10 years now! And when I say large, my two main tables have about 200 million rows, and the myisamchk can take between 60-160 minutes to complete. I very much like how verbose myisamchk is in detailing which index it is currently rebuilding, and the progress in terms of records re-indexed. SO, my questions are these: 1. With InnoDB, do the indexes ever need to be rebuilt to reduce index size and improve performance, like I get with MyISAM? 2. If so, are there any tools like myisamchk to monitor the InnoDB index rebuild process, other than issuing a repair table... and staring indefinitely at a blank screen until it finishes hours later? 3. I've been testing the rebuild process during upgrading, using alter table table_name engine=innodb to convert my tables from 4.1.14 to 5.5.6, and I'm seeing a 130% increase (more than double) in the raw disk space required for the new InnoDB tables compared to their old MyISAM counterparts. (I am using single-file-per-table.) Is this normal? If not, how can I adjust the space requirements for these tables so they don't take up so much additional space? 
I'm sure I'll have more questions later, but many thanks for your comments and thoughts. -Hank -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Re: numbering the result set rows for insertion into another table
On Mon, Sep 20, 2010 at 7:36 AM, Shawn Green (MySQL) shawn.l.gr...@oracle.com wrote: Hello Hank, On 9/18/2010 9:35 PM, Hank wrote: I have the following pseudo code running on mysql 4.x:

set @cnt:=0;
insert ignore into dest_table
select t1.field1, t1.field2, t1.field3, t2.field1, t1.field3, t2.ts, @cnt:=@cnt+1
from table1 as t1 left join table2 as t2 using (field1, field2)
order by t2.ts;

This works perfectly to sequentially number the result set rows inserted into dest_table in order of t2.ts (a timestamp field). In my upgrade to mysql 5.1.14-community, the numbers returned by @cnt are not in order... they trend upward from 0 to the number of records inserted, but they're far from in order... so somehow mysql is inserting the rows in some strange order. How can I fix this so it works in both mysql 4.x and 5.x? I am not sure you can fix this to work properly in a single statement for 5.1.14. The order of operations appears out of sequence to what you need. When executing an SQL statement, there are several stages to the processing:

1) gather rows and filter on matches (FROM ... and JOIN ...)
2) filter the results of 1 (WHERE)
3) apply any GROUP BY
4) filter the results of 3 (HAVING)
5) sort the results (ORDER BY)
6) window the results (LIMIT)

It appears that computation of your @cnt variable is performed BEFORE the ORDER BY and not after the ORDER BY. This is completely in line with how the SQL Standard says a query should operate. What if you wanted to ORDER BY on the @cnt column and we did not compute it until after that stage of processing? That would break standards compatibility. To make this work the way you want, you need to create a temporary table with the results of your query sorted the way you want them. Then, query that temporary table and add your column of sequential numbers to the first results. There may possibly be a saving grace for you, though. 5.1.14 was a very early release in the 5.1 series. 
It is possible that someone else noticed the same problem and a later version may be operating as you want. We are currently releasing 5.1.50, which contains 34 rounds of bugfixes above and beyond your current 5.1.14. I suggest you upgrade and try again. Even if this does not fix the behavior to act as you want, the upgrade will at least remove your exposure to hundreds of identified bugs. -- Shawn Green MySQL Principal Technical Support Engineer Oracle USA, Inc. Office: Blountville, TN Hello Shawn, Many thanks for your detailed reply. This is a test/dev box which I do plan to upgrade to the newest mysql version (5.1.x or maybe 5.5.x) in a couple of weeks. But I found a solution to my problem... I'm not setting the @cnt value in the insert...select statement, but I added a second statement right after it to do this, which works as I intended:

set @cnt:=0;
update dest_table set hc... = @cnt:=@cnt+1 where <clause> order by ts;

This works for both mysql 4.x and 5.1.15. -Hank -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
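Hank's two-step fix (insert first, number afterwards in ts order) can be sketched in standard SQL with a window function instead of a user variable, which sidesteps the evaluation-order problem Shawn describes. A minimal sketch, using SQLite (3.25+ for window functions) as a stand-in; the table and column names here are illustrative, not the poster's real schema:

```python
import sqlite3

# Second-pass numbering: rows are inserted first, then numbered in ts
# order. ROW_NUMBER() OVER (ORDER BY ts) plays the role of @cnt:=@cnt+1
# but is defined to run after the sort.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dest_table (val TEXT, ts TEXT, cnt INTEGER)")
conn.executemany("INSERT INTO dest_table (val, ts) VALUES (?,?)",
                 [("c", "2010-03"), ("a", "2010-01"), ("b", "2010-02")])

conn.execute("""
    UPDATE dest_table SET cnt = (
        SELECT rn FROM (SELECT rowid AS rid,
                               ROW_NUMBER() OVER (ORDER BY ts) AS rn
                        FROM dest_table)
        WHERE rid = dest_table.rowid)""")
result = conn.execute(
    "SELECT val, cnt FROM dest_table ORDER BY cnt").fetchall()
print(result)   # rows numbered 1..3 in ts order
```

MySQL itself gained window functions only in 8.0, so for the 4.x/5.1 servers in the thread the UPDATE ... ORDER BY approach Hank used remains the practical answer.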
numbering the result set rows for insertion into another table
I have the following pseudo code running on mysql 4.x:

set @cnt:=0;
insert ignore into dest_table
select t1.field1, t1.field2, t1.field3, t2.field1, t1.field3, t2.ts, @cnt:=@cnt+1
from table1 as t1 left join table2 as t2 using (field1, field2)
order by t2.ts;

This works perfectly to sequentially number the result set rows inserted into dest_table in order of t2.ts (a timestamp field). In my upgrade to mysql 5.1.14-community, the numbers returned by @cnt are not in order... they trend upward from 0 to the number of records inserted, but they're far from in order... so somehow mysql is inserting the rows in some strange order. How can I fix this so it works in both mysql 4.x and 5.x? Many thanks. -Hank -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Re: question about VIEWS in 5.1.x
On 02/09/2010 8:30 p, Hank wrote: Simple question about views: Hank, Have you tried running away from the problem :-) by doing... CREATE PROCEDURE `combo`(theid INT) BEGIN (SELECT * FROM table1 WHERE id = theid) UNION (SELECT * FROM table2 WHERE id = theid); END$$ then calling it using call combo(value); Wow - thanks. This works perfectly. I'm assuming I can use call combo(value) in PHP and it returns the result set as if it were a proper table? -Hank -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
question about VIEWS in 5.1.x
Simple question about views: I have a view such as:

create view combo as select * from table1 union select * from table2;

where table1 and table2 are very large and identical, and have a non-unique key on field id. When I do a:

select * from combo where id='value';

the system seems to be doing a table scan of one or both tables. I can't even do an:

explain select * from combo where field='value';

the system seems to hang on the explain. SHOW PROCESSLIST says the explain is Sending data. Issuing either one of the view components with the where clause returns results in a fraction of a second (pretty much a full indexed lookup). I know when I used to use Oracle, the where clause would be applied to all parts of the view, but in this case, I can't even figure out what MySQL is trying to do. (I've also tried UNION ALL with the same results.) Any suggestions on how to query both tables using the indexes and the view at the same time? That was my intention. -Hank -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
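The stored-procedure workaround suggested in the reply amounts to pushing the WHERE into each branch of the UNION yourself, so each half does an indexed lookup instead of the view materializing both tables first. A minimal sketch of that pattern, using SQLite as a stand-in (table and column names are illustrative):

```python
import sqlite3

# Instead of filtering a UNION view, filter each branch and union the
# (already small) results -- the effect of the CREATE PROCEDURE `combo`
# workaround from the thread.
conn = sqlite3.connect(":memory:")
for t in ("table1", "table2"):
    conn.execute(f"CREATE TABLE {t} (id INTEGER, val TEXT)")
    conn.execute(f"CREATE INDEX idx_{t}_id ON {t} (id)")
conn.executemany("INSERT INTO table1 VALUES (?,?)", [(1, "t1-a"), (2, "t1-b")])
conn.executemany("INSERT INTO table2 VALUES (?,?)", [(1, "t2-a"), (3, "t2-c")])

def combo(theid):
    # Each branch is an indexed point lookup; no scan of the whole union.
    return conn.execute(
        "SELECT * FROM table1 WHERE id = ? "
        "UNION SELECT * FROM table2 WHERE id = ?", (theid, theid)).fetchall()

print(combo(1))   # matching rows from both tables
```

From PHP, calling the procedure (or issuing this parameterized UNION directly) returns an ordinary result set, as Hank confirms.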
Realistic settings for [myisamchk] in my.cnf
I'm looking to optimize the myisamchk settings for some table rebuilds I need to do. I'm running CentOS 5 and MySQL 5.1 in a VMWare VM with 4 vCPUs and 4GB of memory. All the examples I can find online look like they are several years old, and just copied from someone else's config. I think with 4GB of memory, the settings can be better than this example: [myisamchk] key_buffer = 256M sort_buffer_size = 256M read_buffer = 2M write_buffer = 2M Any suggestions? Thanks, -Hank -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Re: Queue / FIFO in MySQL?
On Mon, Sep 7, 2009 at 8:18 PM, Allen Fowler allen.fow...@yahoo.com wrote: Hello, I need to create a system where records are generated by a producer process and processed by several worker processes. I was thinking about something like:

Producer:
1) Producer INSERTs new records with state = new, worker = null
2) Producer sleeps and loops back to step #1

Worker(s):
1) Worker UPDATEs all records with worker = pid, state = working where state == new
2) Worker SELECTs all records where worker = pid, state = working
3) For each record that is done, worker updates record with state = done
4) Worker loops back to step #1

Note: In this scheme the worker winds up with all new records generated since the last worker claimed any. Not sure how else to guarantee atomicity. I would prefer only n records per request. Ideas? (probably using python...) Thank you, :) Assuming you are using MYISAM tables, all you really need to do is (a) use a LOCK TABLE before the first UPDATE statement and UNLOCK TABLES after, and (b) put a LIMIT clause on the UPDATE statement. Other than that, what you outlined is exactly what I do for a very similar process, although right now I only have one worker process, but if I wanted to add more, it's already built to handle that. -Hank -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
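The claim-n-records idea can be sketched as follows. This is a SQLite stand-in, not the MyISAM LOCK TABLE + UPDATE ... LIMIT recipe from the reply: SQLite has no UPDATE ... LIMIT by default, so the same effect uses an id IN (subquery with LIMIT) inside one write transaction; the jobs schema and pids are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("""CREATE TABLE jobs (
    id INTEGER PRIMARY KEY, state TEXT DEFAULT 'new', worker INTEGER)""")
conn.executemany("INSERT INTO jobs (id) VALUES (?)",
                 [(i,) for i in range(10)])

def claim(pid, n):
    """Atomically mark at most n 'new' jobs as owned by this worker."""
    conn.execute("BEGIN IMMEDIATE")   # one writer at a time
    conn.execute("""UPDATE jobs SET state='working', worker=?
                    WHERE id IN (SELECT id FROM jobs
                                 WHERE state='new'
                                 ORDER BY id LIMIT ?)""", (pid, n))
    rows = conn.execute("""SELECT id FROM jobs
                           WHERE worker=? AND state='working'
                           ORDER BY id""", (pid,)).fetchall()
    conn.execute("COMMIT")
    return [r[0] for r in rows]

print(claim(42, 3))   # first worker takes a batch of three
print(claim(43, 3))   # second worker takes the next three
```

The transaction plays the role of LOCK TABLE: no two workers can claim the same rows, and each request takes at most n.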
Re: Speeding up a pretty simple correlated update query
Hello All, I'm reposting this since I didn't get much response the last time, so I'm hoping to reach out again. My correlated update query (see below) was running for 9 days before I killed it. Here is my original question: I have a legacy application which was written using a compound primary key of an item number (non unique) along with a category ID. The combination of the item number and category ID makes the records unique. I am in the process of replacing the compound (VARCHAR) keys with a unique integer key in these tables. So I have created an item_seq table and assigned a unique sequence number to each compound key -- it looks like this (all tables are myisam tables, and mysql version 5.0):

desc item_seq;
+----------+------------------+------+-----+---------+----------------+
| Field    | Type             | Null | Key | Default | Extra          |
+----------+------------------+------+-----+---------+----------------+
| seq      | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| itemid   | char(11)         | NO   | MUL |         |                |
| category | char(4)          | NO   |     |         |                |
+----------+------------------+------+-----+---------+----------------+

I also have my main transactional table with about 180,000,000 rows -- it looks like this:

desc item_trans;
+----------+------------------+------+-----+---------+-------+
| Field    | Type             | Null | Key | Default | Extra |
+----------+------------------+------+-----+---------+-------+
| seq      | int(10) unsigned | NO   | MUL |         |       |
| itemid   | char(11)         | NO   | PRI |         |       |
| category | char(4)          | NO   | PRI |         |       |
| transid  | int(10)          | NO   | PRI |         |       |
+----------+------------------+------+-----+---------+-------+

Currently the seq field is null for the entire table. So of course, I want to update the main transaction table with the new sequence number. So I've disabled all the keys on the item_trans table -- since I am updating every row, it wouldn't (shouldn't) be using the index anyway. Here is my correlated update query:

update item_trans i, item_seq is
set i.seq=is.seq
where is.itemid=i.itemid and is.category=i.category;

If I run an explain on the select version of the update, this is what I get:

+----+-------------+------------+--------+---------------+--------+---------+-------------------------------------------+-----------+-------+
| id | select_type | table      | type   | possible_keys | key    | key_len | ref                                       | rows      | Extra |
+----+-------------+------------+--------+---------------+--------+---------+-------------------------------------------+-----------+-------+
|  1 | SIMPLE      | item_trans | ALL    | PRIMARY       | NULL   | NULL    | NULL                                      | 178948797 |       |
|  1 | SIMPLE      | item_seq   | eq_ref | itemid        | itemid | 20      | g.item_trans.itemid,g.item_trans.category |         1 |       |
+----+-------------+------------+--------+---------------+--------+---------+-------------------------------------------+-----------+-------+

... which is exactly what I would expect it to do. Update every record of the item_trans table, and do a full index lookup on the item_seq table. SO... I've been running this query to update item_trans, and it's been running for 5 days now. I've also tried running this with the primary key index on the item_trans table (but not the seq index), and that ran slower in my initial tests. Are there any faster ways to update 180 million records with a correlated update query? And I'm fairly certain that trying to do this in PHP one record at a time would take much longer than a SQL solution. Thanks, -Hank
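In miniature, the correlated update looks like this. A sketch using SQLite as a stand-in with tiny illustrative data (SQLite requires the subquery form; MySQL's multi-table UPDATE join in the thread does the same thing):

```python
import sqlite3

# Copy the surrogate key from item_seq into every matching item_trans row.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE item_seq
                (seq INTEGER PRIMARY KEY, itemid TEXT, category TEXT)""")
conn.execute("""CREATE TABLE item_trans
                (seq INTEGER, itemid TEXT, category TEXT, transid INTEGER)""")
conn.executemany("INSERT INTO item_seq VALUES (?,?,?)",
                 [(1, "A", "x"), (2, "B", "y")])
conn.executemany(
    "INSERT INTO item_trans (itemid, category, transid) VALUES (?,?,?)",
    [("A", "x", 100), ("A", "x", 101), ("B", "y", 102)])

conn.execute("""UPDATE item_trans SET seq =
                  (SELECT s.seq FROM item_seq s
                   WHERE s.itemid = item_trans.itemid
                     AND s.category = item_trans.category)""")
rows = conn.execute(
    "SELECT seq, transid FROM item_trans ORDER BY transid").fetchall()
print(rows)   # every transaction row now carries its integer seq
```

At 180M rows the cost is dominated by the per-row index lookup on the compound key, which is why the thread's replies focus on avoiding that lookup entirely.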
Re: Speeding up a pretty simple correlated update query
On Sun, Sep 6, 2009 at 6:01 PM, mos mo...@fastmail.fm wrote: So I've disabled all the keys on the item_trans table -- since I am updating every row, it wouldn't (shouldn't) be using the index anyway. You can't disable unique indexes or primary keys. They are always active. You can only deactivate non-unique indexes. Here are a couple of suggestions. For now, drop the index on item_seq.seq and desc.seq. Are you sure you have a compound index on item_seq.itemid and item_seq.category? Do a Show create table item_seq and also Show create table desc to see what you've got. Make sure your my.cnf file has key_buffer_size=500M, equal to about 30% of your available memory. You can always reduce it later. Of course there is another way of doing it, if you are willing to have the tail wag the dog. You may kick yourself for not discovering it yourself. :)

set @num:=0;
set @last:='';
create table new_item_trans
select IF(concat(itemid,category)=@last, @num, @num:=@num+1) Seq,
       itemid, category, transid, ... ,
       @last:=concat(itemid,category) as TMPLast
from item_trans
order by concat(itemid,category);

Now you can use the Alter statement to add your indexes and get rid of the TMPLast column. To build the item_seq table you would now use:

create table item_seq
select seq, itemid, category
from new_item_trans
group by seq, itemid, category;

And of course build your indexes on seq and rename the new_item_trans. I guarantee you this last solution will not take 9 days to complete! :-) Mike Hi Mike, Thanks for your reply. First, in my tests, I've created the target table (item_trans) as a copy of the source table with no indexes at all (not even a primary key). Once I get the item_seq field populated, I'll go back and re-create the indexes in batch using myisamchk (I've posted about this recently). 
Second, I like your second creative solution (I never would have come up with that), but in order for it to work, mysql would have to sort 180 million records before creating the table, or retrieve them out of the table via the concatenated index, both of which I think will take a long time... but I'll certainly give it a shot tomorrow and let you know how it goes. Thanks again. -Hank
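The logic of Mike's "tail wags the dog" trick, stripped of MySQL user variables, is: scan the rows ordered by the compound key and bump the sequence only when the key changes (what DENSE_RANK() does in modern SQL). A small sketch in Python over a SQLite stand-in, with illustrative data:

```python
import sqlite3

# Ordered scan + counter that increments on key change: one pass, no
# per-row correlated lookup.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item_trans (itemid TEXT, category TEXT, transid INTEGER)")
conn.executemany("INSERT INTO item_trans VALUES (?,?,?)",
                 [("B", "y", 102), ("A", "x", 100), ("A", "x", 101)])

new_rows, seq, last = [], 0, None
for itemid, category, transid in conn.execute(
        "SELECT itemid, category, transid FROM item_trans "
        "ORDER BY itemid, category, transid"):
    key = (itemid, category)
    if key != last:              # new compound key -> next sequence number
        seq, last = seq + 1, key
    new_rows.append((seq, itemid, category, transid))
print(new_rows)   # same seq for rows sharing an (itemid, category)
```

This is why Mike's version should beat the 9-day correlated update: it pays for one big sort instead of 180 million index lookups.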
Re: upgrading from 4.1 to 5.0 trick
On Fri, Aug 28, 2009 at 9:18 AM, Shawn Green shawn.gr...@sun.com wrote: Hank wrote: Hello All, I'm in the process of upgrading my database from 4.1 to 5.0 on CentOS. I've been testing the mysqlcheck --check-upgrade --auto-repair command, and on one of my MYISAM tables, it's taking forever to upgrade the table. It has about 114 million rows, and I'm guessing it needs to be upgraded due to the VARCHAR columns. Anyway, it's been running for a day and a half, and I finally had to kill it. So will this old trick still work? I've done this many times on 4.1 with great success: In mysql 5.0, I create two new empty tables, one identical to the original and one identical but with no indexes. I name these tables with _ion and _ioff suffixes. I then do an insert into table_ioff select * from source, which inserts just the original data into the new table, but doesn't have to rebuild any indexes. I then flush the tables. Then in the file system, I swap the table_ion.frm and table_ion.MYI files with the table_ioff ones. Flush tables again. I then just use myisamchk -r to repair the index file. It runs in about an hour. Can I do this same thing to upgrade the tables, instead of using mysqlcheck, which seems to be rebuilding the table row-by-row, instead of sorting (which myisamchk does)? thanks. -Hank Hello Hank, Your technique will work within the following narrow limits of operation: * This will only work for MyISAM tables. * myisamchk is dangerous to run against any table that is in active use, as it operates at the file level and has caused corruptions with live tables. Whenever possible, either stop the server or prevent access from MySQL to that table with a FLUSH TABLES WITH READ LOCK before using myisamchk. http://dev.mysql.com/doc/refman/5.0/en/flush.html Alternatively, you should be able to match or improve this import-then-index process if you use an ALTER TABLE ... DISABLE KEYS command before the import followed by an ALTER TABLE ... 
ENABLE KEYS command after the import, or if you use LOAD DATA INFILE ... . Also, if you can import all of the data to an empty table in a single batch (statement), the indexes will be computed only once using the batch-index algorithm (it's a sort, not a merge) and that will also save processing time. http://dev.mysql.com/doc/refman/5.0/en/alter-table.html http://dev.mysql.com/doc/refman/5.0/en/load-data.html http://dev.mysql.com/doc/refman/5.0/en/insert-speed.html The overall problem is still that the on-disk structure of the 5.0 tables has changed, and that you still need to perform some kind of dump-restore or rebuild of the data as part of the conversion. Warmest regards, Shawn Green, MySQL Senior Support Engineer Sun Microsystems, Inc. Office: Blountville, TN Hello Shawn, Thanks for your reply. Yes, I have all of your conditions covered. 1. They are myisam tables. 2. This is not a production system, so other people aren't accessing the tables. 3. And your last comment about dump/restore is taken care of (in my original note) since I am creating a new table (without indexes) in mysql 5.0, and then inserting all the data from the old table into the new one. Then I'm swapping the MYI/frm files, and then rebuilding the new table. I've tested this several times now, and it works like a charm. Finally, I don't like to use the ALTER TABLE DISABLE/ENABLE statements, since they operate in silent mode -- I have no idea what it's doing, or how long to expect the process to take. It would be very nice if those commands had some built-in progress meter or feedback/callback method.
Speeding up a pretty simple correlated update query
Hello All, I have a legacy application which was written using a compound primary key of an item number (non unique) along with a category ID. The combination of the item number and category ID makes the records unique. I am in the process of replacing the compound (VARCHAR) keys with a unique integer key in these tables. So I have created an item_seq table and assigned a unique sequence number to each compound key -- it looks like this (all tables are myisam tables, and mysql version 5.0):

desc item_seq;
+----------+------------------+------+-----+---------+----------------+
| Field    | Type             | Null | Key | Default | Extra          |
+----------+------------------+------+-----+---------+----------------+
| seq      | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| itemid   | char(11)         | NO   | MUL |         |                |
| category | char(4)          | NO   |     |         |                |
+----------+------------------+------+-----+---------+----------------+

I also have my main transactional table with about 180,000,000 rows -- it looks like this:

desc item_trans;
+----------+------------------+------+-----+---------+-------+
| Field    | Type             | Null | Key | Default | Extra |
+----------+------------------+------+-----+---------+-------+
| seq      | int(10) unsigned | NO   | MUL |         |       |
| itemid   | char(11)         | NO   | PRI |         |       |
| category | char(4)          | NO   | PRI |         |       |
| transid  | int(10)          | NO   | PRI |         |       |
+----------+------------------+------+-----+---------+-------+

Currently the seq field is null for the entire table. So of course, I want to update the main transaction table with the new sequence number. So I've disabled all the keys on the item_trans table -- since I am updating every row, it wouldn't (shouldn't) be using the index anyway. Here is my correlated update query:

update item_trans i, item_seq is
set i.seq=is.seq
where is.itemid=i.itemid and is.category=i.category;

If I run an explain on the select version of the update, this is what I get:

+----+-------------+------------+--------+---------------+--------+---------+-------------------------------------------+-----------+-------+
| id | select_type | table      | type   | possible_keys | key    | key_len | ref                                       | rows      | Extra |
+----+-------------+------------+--------+---------------+--------+---------+-------------------------------------------+-----------+-------+
|  1 | SIMPLE      | item_trans | ALL    | PRIMARY       | NULL   | NULL    | NULL                                      | 178948797 |       |
|  1 | SIMPLE      | item_seq   | eq_ref | itemid        | itemid | 20      | g.item_trans.itemid,g.item_trans.category |         1 |       |
+----+-------------+------------+--------+---------------+--------+---------+-------------------------------------------+-----------+-------+

... which is exactly what I would expect it to do. Update every record of the item_trans table, and do a full index lookup on the item_seq table. SO... I've been running this query to update item_trans, and it's been running for 5 days now. I've also tried running this with the primary key index on the item_trans table (but not the seq index), and that ran slower in my initial tests. Are there any faster ways to update 180 million records with a correlated update query? And I'm fairly certain that trying to do this in PHP one record at a time would take much longer than a SQL solution. Thanks, -Hank
Re: Speeding up a pretty simple correlated update query
Hello Gavin, That's what I did with the first one-to-one table to create the unique SEQ field mapping to each item/category combination. The problem is on the TRANSACTION table, where there are multiple instances of each item/category. If I just put a auto_increment primary key on that table, I'd get a unique TRANSACTION ID, which is not what I want. I want to populate the transaction table with the new integer seq key created in the first table. I guess I should have stated that my overall objective here is to eventually drop the VARCHAR itemid and category id fields from the transaction table, leaving only the new item sequence id (plus transid) as the primary key. There are many tables throughout the schema that do this, and I would be replacing them all. It's just that this is the largest table, and the correlated update is taking a long time, and I'm looking for a better solution (if one exists). thanks. -Hank On Wed, Sep 2, 2009 at 7:50 PM, Gavin Towey gto...@ffn.com wrote: Do you know that if you create seq column on the original table as an auto_increment primary key, it will fill in the numbers automatically? There's no need to create the values on another table and update with a join. Regards, Gavin Towey -Original Message- From: Hank [mailto:hes...@gmail.com] Sent: Wednesday, September 02, 2009 4:35 PM To: mysql@lists.mysql.com Subject: Speeding up a pretty simple correlated update query Hello All, I have a legacy application which was written using a compound primary key of an item number (non unique) along with a category ID. The combination of the item number and category ID make the records unique. I am in the process of replacing the compound (VARCHAR) keys with an unique integer key in these tables. 
So I have created an item_seq table and assigned a unique sequence number to each compound key -- it looks like this (all tables are myisam tables, and mysql version 5.0):

desc item_seq;
+----------+------------------+------+-----+---------+----------------+
| Field    | Type             | Null | Key | Default | Extra          |
+----------+------------------+------+-----+---------+----------------+
| seq      | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| itemid   | char(11)         | NO   | MUL |         |                |
| category | char(4)          | NO   |     |         |                |
+----------+------------------+------+-----+---------+----------------+

I also have my main transactional table with about 180,000,000 rows -- it looks like this:

desc item_trans;
+----------+------------------+------+-----+---------+-------+
| Field    | Type             | Null | Key | Default | Extra |
+----------+------------------+------+-----+---------+-------+
| seq      | int(10) unsigned | NO   | MUL |         |       |
| itemid   | char(11)         | NO   | PRI |         |       |
| category | char(4)          | NO   | PRI |         |       |
| transid  | int(10)          | NO   | PRI |         |       |
+----------+------------------+------+-----+---------+-------+

Currently the seq field is null for the entire table. So of course, I want to update the main transaction table with the new sequence number. So I've disabled all the keys on the item_trans table -- since I am updating every row, it wouldn't (shouldn't) be using the index anyway. Here is my correlated update query:

update item_trans i, item_seq is
set i.seq=is.seq
where is.itemid=i.itemid and is.category=i.category;

If I run an explain on the select version of the update, this is what I get:

+----+-------------+------------+--------+---------------+--------+---------+-------------------------------------------+-----------+-------+
| id | select_type | table      | type   | possible_keys | key    | key_len | ref                                       | rows      | Extra |
+----+-------------+------------+--------+---------------+--------+---------+-------------------------------------------+-----------+-------+
|  1 | SIMPLE      | item_trans | ALL    | PRIMARY       | NULL   | NULL    | NULL                                      | 178948797 |       |
|  1 | SIMPLE      | item_seq   | eq_ref | itemid        | itemid | 20      | g.item_trans.itemid,g.item_trans.category |         1 |       |
+----+-------------+------------+--------+---------------+--------+---------+-------------------------------------------+-----------+-------+

... which is exactly what I would expect it to do. Update every record of the item_trans table, and do a full index lookup on the item_seq table. SO... I've been running this query to update item_trans, and it's been running for 5 days now. I've also tried running this with the primary key index on the item_trans table (but not the seq index), and that ran slower in my initial
upgrading from 4.1 to 5.0 trick
Hello All, I'm in the process of upgrading my database from 4.1 to 5.0 on CentOS. I've been testing the mysqlcheck --check-upgrade --auto-repair command, and on one of my MYISAM tables, it's taking forever to upgrade the table. It has about 114 million rows, and I'm guessing it needs to be upgraded due to the VARCHAR columns. Anyway, it's been running for a day and a half, and I finally had to kill it. So will this old trick still work? I've done this many times on 4.1 with great success: In mysql 5.0, I create two new empty tables, one identical to the original and one identical but with no indexes. I name these tables with _ion and _ioff suffixes. I then do an insert into table_ioff select * from source, which inserts just the original data into the new table, but doesn't have to rebuild any indexes. I then flush the tables. Then in the file system, I swap the table_ion.frm and table_ion.MYI files with the table_ioff ones. Flush tables again. I then just use myisamchk -r to repair the index file. It runs in about an hour. Can I do this same thing to upgrade the tables, instead of using mysqlcheck, which seems to be rebuilding the table row-by-row, instead of sorting (which myisamchk does)? thanks. -Hank
Using Flush Hosts in cron
I have the blocked connection problem between three of my servers, all behind two firewalls and on one switch. Occasionally the mysql servers start blocking the hosts. What's the downside to running a Flush hosts once per minute on these mysql servers? The only hosts that are connecting are one of three or four hosts behind the same firewall. thanks. -Hank
Re: Mysql and Textarea
On 5/22/07, sam rumaizan [EMAIL PROTECTED] wrote: I'm just a php beginner. So please be patient with my stupid questions. What am I missing in this code that is causing the function to fail to update the textarea? The Update button erases the old information from the mysql database and replaces it with nothing. Basically it can't read what is inside the Textarea box. Why?? Read the highlighted code. Your problem is in this line: mysql_real_escape_string($_REQUEST['Assign_Engineer'][' .$id . ']['Job_Title']); you want something more like (may or may not work): mysql_real_escape_string($_REQUEST['Assign_Engineer'][$id]['Job_Title']); Since this is a PHP problem, and you can't figure it out, I'd suggest moving your request to a PHP list. -Hank
duplicating a replicated slave environment
Hello All, I have a 4.1.14 mysql database master and slave set up. For this slave #1, I have the IO thread running constantly, and the SQL thread running once a day to apply all pending updates from the master (then I shut it off). So for most of the day, this database is static (except for the accumulating relay logs). I have a new machine to be another slave of the same master (slave #2). I cannot shut down or lock the master in order to copy the master database to slave #2 (it is 44GB total, and would take over an hour to copy). I have copied the (static) database from Slave #1 to Slave #2. How can I now configure Slave #2 to process the pending relay logs and bring it up to date? Obviously I would need to copy (and rename?) the relay logs, but what about the master.info and relay-log.info files? Or in other words, can I use the show slave status information on Slave #1 to set up Slave #2 in the CHANGE MASTER TO command? Thanks. -Hank
Re: How to look for balanced parenthesis?
I used to use UltraEdit, but then switched to EditPlus because it can edit remote files almost transparently. (Opening a file FTP's it down, you edit the local copy, saving FTP's it back.) FYI - UltraEdit can do this - it uses SFTP over SSH to edit remote files. Works like a charm. -Hank
Re: TIMESTAMP field not automatically updating last_updated field
Are the other fields in the update statement actually changing the data? I don't know for sure, but if the data on disk is the same as the update statement, mysql won't actually update the record, and therefore might not update the last_updated field also. Just a thought. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: count(*) send a wrong value
My guess would be that your PHP code is not written correctly. For instance, if you have a query in PHP: $sql = "select * from my_table where cid=123"; ...and are using the PHP function mysql_num_rows() to count the results, and then for your next test you just change the query to: $sql = "select count(*) from my_table where cid=123"; and still use mysql_num_rows() to get the result, that is your error. You'll need to use mysql_result() or some other fetch function to get the results of the query. That's my guess. -Hank
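The mistake described above is easy to reproduce. A minimal sketch in Python using sqlite3 as a stand-in for the PHP mysql_* API (the table and values are made up): a COUNT(*) query always returns exactly one row, so counting the rows of its result set is always wrong; you must fetch the value inside that row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (cid INTEGER)")
conn.executemany("INSERT INTO my_table VALUES (?)", [(123,), (123,), (456,)])

# Counting fetched rows of a plain SELECT works like mysql_num_rows():
rows = conn.execute("SELECT * FROM my_table WHERE cid = 123").fetchall()
print(len(rows))  # 2 matching rows

# A COUNT(*) query yields exactly ONE row; counting its rows
# (the mistake described above) reports 1, not the real count.
result = conn.execute("SELECT COUNT(*) FROM my_table WHERE cid = 123").fetchall()
print(len(result))   # 1  -- the wrong number to report
print(result[0][0])  # 2  -- the value you actually want
```

The same distinction holds in PHP: mysql_num_rows() counts result rows, while mysql_result() (or a fetch call) reads the scalar the COUNT(*) produced.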
Re: I can't find the missing rows in a table--
Don't you want the queries to be outer joins and not left joins?
Re: MySQL query question
since I'm not sure how users would ever be directly associated with teams - I would have expected players to be associated with teams - so forgive me if this doesn't resemble very much what you're doing: Think corporate projects, not sports. Here's my take on the original query: you don't actually need to use the teams table in the query, as long as you have DISTINCT in the SELECT: SELECT DISTINCT username FROM users u, users_teams ut, projects_teams pt, projects p WHERE p.project_id = '1' AND pt.project_id = p.project_id AND ut.team_id = pt.team_id AND u.user_id = ut.user_id Also, just a style comment: I would find it confusing to use just 'id' as the key in the projects, teams, and users tables, and user_id, team_id, and project_id in the associative tables. The field names should be consistent throughout, so when reading queries it's obvious which id one is talking about. On 12/24/05, Josh Mellicker [EMAIL PROTECTED] wrote: I have several tables, all with many-to-many joining tables: users users_teams teams teams_projects projects --- So, with projects.id = 1, I want to get all the usernames of people on teams assigned to that project. SELECT DISTINCT username FROM users, users_teams, teams, projects_teams, projects WHERE projects.id = '1' AND projects_teams.project_id = projects.id AND teams.id = projects_teams.team_id AND users_teams.user_id = users.id gives me ALL the users who are on any team... even teams not assigned to that project. What gives? My brain hurts. Thanks for any help. -- -Hank
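The bug in the original query is the missing link between users_teams and projects_teams: without it, every user on any team cross-joins against the project row. A small sqlite3 sketch of the before/after (table names from the thread; the data is made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER, username TEXT);
CREATE TABLE users_teams (user_id INTEGER, team_id INTEGER);
CREATE TABLE projects_teams (project_id INTEGER, team_id INTEGER);
INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
INSERT INTO users_teams VALUES (1, 10), (2, 20);  -- alice on team 10, bob on team 20
INSERT INTO projects_teams VALUES (1, 10);         -- only team 10 is on project 1
""")

# Missing the users_teams <-> projects_teams link: every user comes back.
broken = conn.execute("""
    SELECT DISTINCT username FROM users u, users_teams ut, projects_teams pt
    WHERE pt.project_id = 1 AND u.user_id = ut.user_id
""").fetchall()

# With the ut.team_id = pt.team_id condition, only project-1 team members match.
fixed = conn.execute("""
    SELECT DISTINCT username FROM users u, users_teams ut, projects_teams pt
    WHERE pt.project_id = 1 AND ut.team_id = pt.team_id AND u.user_id = ut.user_id
""").fetchall()

print(broken)  # both users
print(fixed)   # only alice
```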
Re: Do I need to tweak my server variables for this SELECT statement?
Based on the size (and number of columns) of your result set, I'm not sure any amount of RAM would allow 360,000 records to be stored and sorted entirely in memory. Beyond some point, mysql just decides to use a temp table. That's the limit of my tweaking skills, so someone else is going to have to help out here on the way to maximize the memory or minimize the time for the internal sorting of the result set, if it's possible. -Hank
Re: Do I need to tweak my server variables for this SELECT statement?
Now that I know what's causing the slow queries, what can I do to fix it? The only thing I can suggest is breaking up the query into two parts - the first part to retrieve just the product codes and salesrank, and sort and limit that; then save it in a temp table or use application code to retrieve and print the rest of the product info. Sorting 300,000+ records in that huge result set is going to take some time (although it shouldn't take 10 minutes). -Hank
Re: Do I need to tweak my server variables for this SELECT statement?
I don't think the problem is going to be solved with the my.cnf file. Here's what I would try: 1- run and time the original query without the ORDER BY or LIMIT clauses 2- run and time the following breakdown queries, to see if the indexes are at least working correctly: -- test catprod SELECT pn_pricecompare_catprod.category, pn_pricecompare_catprod.asin FROM pn_pricecompare_catprod WHERE pn_pricecompare_catprod.category = '283155' -- test product SELECT pn_pricecompare_product.title, pn_pricecompare_product.prod_id, pn_pricecompare_product.image_small, pn_pricecompare_product.brand, pn_pricecompare_product.manufacturer, pn_pricecompare_product.mpn, pn_pricecompare_product.model, pn_pricecompare_product.artist, pn_pricecompare_product.author, pn_pricecompare_product.binding, pn_pricecompare_product.label, pn_pricecompare_product.audiencerating, pn_pricecompare_product.studio, pn_pricecompare_product.releasedate, pn_pricecompare_product.numberofpages, pn_pricecompare_product.pubdate, pn_pricecompare_product.publisher, pn_pricecompare_product.searchindex, pn_pricecompare_product.lowest_price, pn_pricecompare_product.num_merchants FROM pn_pricecompare_product WHERE pn_pricecompare_product.asin IN (some test asins) ORDER BY pn_pricecompare_product.salesrank ASC LIMIT 0,10 -- test just getting a count of the join result SELECT count(*) FROM pn_pricecompare_catprod, pn_pricecompare_product WHERE pn_pricecompare_catprod.category = '283155' AND pn_pricecompare_catprod.asin = pn_pricecompare_product.asin The results of those queries should shed some light on where the problem is being introduced. Also, table descriptions of both tables would be helpful in locating the problem. -Hank
Re: Do I need to tweak my server variables for this SELECT statement?
Grant, You can just do a desc pn_pricecompare_catprod and desc pn_pricecompare_product and post the results. The CREATE TABLE statements would be OK, but the describes are better. To flush the query cache, I think you can do a flush tables. -Hank
Re: Do I need to tweak my server variables for this SELECT statement?
Ok, so the next step would be to try the original query with just the LIMIT clause, and then with just the ORDER BY (but not both). The results of the select count(*) query would be helpful to know just how many records mysql is trying to sort and limit. And do you really need a separate index key on `salesrank` on the product table? -Hank
Re: Relocation of database store
You can also create a sym-link for the separate databases/directories, and leave my.cnf as-is. I've been doing that since 3.23, and it's never caused me any problems.
Re: Distance between Zip codes
Don't forget that you're not calculating driving distance, but great-circle distance, which is roughly a straight line over short distances. If your radius is greater than, say, 50 miles, people might complain that the actual driving distance is much greater than the straight-line distance you provided. -- -Hank
Re: How to avoid redundancy between PK and indices ?
I understand what you're saying. The problem is that if mysql attempted to do a query like you suggest: Select Count(*) From myTable Where a=1 And b=1 And c=1 And d=1 And e=1 And f=1; It can only use one index for the query, and hopefully the optimizer will pick the one of the six indexes with the fewest keys to scan. But even if it could virtualize the PK that way, it could still cause a scan of millions of records while limiting the table scan to one of the six non-unique keys. In other words, it would/could take a lot of time to see if a record is unique upon inserting new records - not something you'd be happy with performance-wise, I'm sure. Therefore, a true, concatenated key that enforces uniqueness and can operate immediately upon inserts is really necessary, regardless of what other indexes are on the columns. The type of query you're suggesting can be done with bitmapped indexes (Oracle has them), where the index values are stored as bitmaps, and you can combine them so Oracle uses multiple indexes in one query to quickly pare down the records to scan. Bitmapped indexes work very well when the cardinality of keys is less than 10,000 (number of unique key values). In a nutshell, think of a field for sex/gender and a table of 1 million records. A bitmapped index of that field would be only 125,000 bytes long (1 million bits, one bit per record), and to find all the 'M' records, just map the on bits to the record numbers in the data table. For fields with more possible values (say, state of residence - 50 values), each record's value could be represented by 6 bits. Pretty simple concept, but great performance gains can be had over regular btree indexes. I think this is what you're getting at. When I asked the MySQL AB folks at the first conference in Orlando a couple of years ago about adding bitmapped index support in MySQL, they didn't really know what I was talking about. The developer I spoke to thought I was suggesting creating indexes on bitmapped images.
No, not exactly. I hope they know what it is now, though, and have (or already have) considered adding support for it in MySQL. -Hank On 10/5/05, C.R. Vegelin [EMAIL PROTECTED] wrote: Hi Hank, You are quite right. I need separate non-unique indices on a, b, c, d, e and f to avoid table scans. And when each combi must be unique I need a Primary Key (a,b,c,d,e,f). And only Key a (a) seems to be redundant with the primary key ... Suppose there would be a PK (a,b,c,d,e,f) defined, without a separate PK index. And let's assume some rows like: columns: a b c d e f row1 has: 1 1 1 1 1 1 row2 has: 1 1 1 1 1 2 row3 has: 1 1 1 1 1 3 etc. Then checking on a unique PK could be done by MySQL internally with: Select Count(*) From myTable Where a=1 And b=1 And c=1 And d=1 And e=1 And f=1; to avoid a duplicate primary key for row1, by using / joining the separate index tables. With this Select query, MySQL could / should make use of the 6 existing separate indices. Uniqueness can be fully guaranteed with these 6 non-unique indices in this case. In other words, a separate PK index is fully redundant in this case, right ? In addition, it would save space without the longer concatenated key of a+b+c+d+e+f. Thanks, Cor - Original Message - From: Hank [EMAIL PROTECTED] To: C.R. Vegelin [EMAIL PROTECTED] Cc: mysql@lists.mysql.com Sent: Wednesday, October 05, 2005 5:57 PM Subject: Re: How to avoid redundancy between PK and indices ? It depends.. if this is your create table statement: CREATE TABLE foo ( a smallint NOT NULL, b smallint NOT NULL, c smallint NOT NULL, d smallint NOT NULL, e smallint NOT NULL, f smallint NOT NULL, PRIMARY KEY (a,b,c,d,e,f) ); Then only one unique index is being created, on the concatenated key of a+b+c+d+e+f. Queries on any fields other than A will cause a full table scan.
On the other hand, if your create table is: CREATE TABLE foo ( a smallint NOT NULL, b smallint NOT NULL, c smallint NOT NULL, d smallint NOT NULL, e smallint NOT NULL, f smallint NOT NULL, PRIMARY KEY (a,b,c,d,e,f), KEY a (a), KEY b (b), KEY c (c), KEY d (d), KEY e (e), KEY f (f) ); This will create the primary key, plus six additional indexes, each of which is queryable. But in this case, the KEY a (a) non-unique index is redundant with the primary key, so to do what you want - a unique index on a+b+c+d+e+f PLUS the ability to independently search the b c d e and f fields - here is the create table you'll need to use: CREATE TABLE foo ( a smallint NOT NULL, b smallint NOT NULL, c smallint NOT NULL, d smallint NOT NULL, e smallint NOT NULL, f smallint NOT NULL, PRIMARY KEY (a,b,c,d,e,f), KEY b (b), KEY c (c), KEY d (d), KEY e (e), KEY f (f) ); -- -Hank
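The bitmapped-index idea raised in this thread is easy to demonstrate in miniature: one bitmap per distinct column value, one bit per row, and combining two indexes is just a bitwise AND. A toy Python sketch of the concept (not MySQL functionality; the class and data are made up for illustration):

```python
from collections import defaultdict

class BitmapIndex:
    """Toy bitmap index: one integer-as-bitset per distinct column value."""
    def __init__(self, values):
        self.bitmaps = defaultdict(int)
        for row, value in enumerate(values):
            self.bitmaps[value] |= 1 << row  # set bit `row` in this value's bitmap

    def rows_equal(self, value):
        """Yield the row numbers whose column equals `value`."""
        bm, row = self.bitmaps[value], 0
        while bm:
            if bm & 1:
                yield row
            bm >>= 1
            row += 1

# Two low-cardinality columns over five rows.
sex   = BitmapIndex(["M", "F", "M", "F", "M"])
state = BitmapIndex(["NY", "NY", "CA", "CA", "NY"])

# ANDing the two bitmaps answers "sex = 'M' AND state = 'NY'"
# without scanning the table -- the multi-index combination described above.
combined = sex.bitmaps["M"] & state.bitmaps["NY"]
matches = [r for r in range(5) if combined >> r & 1]
print(matches)  # rows 0 and 4
```

At one bit per row, a million-row column costs 125,000 bytes per distinct value, which is why the technique only pays off at low cardinality.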
Re: How to avoid redundancy between PK and indices ?
It depends.. if this is your create table statement: CREATE TABLE foo ( a smallint NOT NULL, b smallint NOT NULL, c smallint NOT NULL, d smallint NOT NULL, e smallint NOT NULL, f smallint NOT NULL, PRIMARY KEY (a,b,c,d,e,f) ); Then only one unique index is being created, on the concatenated key of a+b+c+d+e+f. Queries on any fields other than A will cause a full table scan. On the other hand, if your create table is: CREATE TABLE foo ( a smallint NOT NULL, b smallint NOT NULL, c smallint NOT NULL, d smallint NOT NULL, e smallint NOT NULL, f smallint NOT NULL, PRIMARY KEY (a,b,c,d,e,f), KEY a (a), KEY b (b), KEY c (c), KEY d (d), KEY e (e), KEY f (f) ); This will create the primary key, plus six additional indexes, each of which is queryable. But in this case, the KEY a (a) non-unique index is redundant with the primary key, so to do what you want - a unique index on a+b+c+d+e+f PLUS the ability to independently search the b c d e and f fields - here is the create table you'll need to use: CREATE TABLE foo ( a smallint NOT NULL, b smallint NOT NULL, c smallint NOT NULL, d smallint NOT NULL, e smallint NOT NULL, f smallint NOT NULL, PRIMARY KEY (a,b,c,d,e,f), KEY b (b), KEY c (c), KEY d (d), KEY e (e), KEY f (f) ); -- -Hank
Re: Limiting DISTINCT To One Column
To check for more than one channel-per-record in the table: select channel, count(*) as cnt from ChannelStatus group by channel having cnt > 1 should return zero records if you have no dups. On 9/29/05, Hal Vaughan [EMAIL PROTECTED] wrote: I have a query like this: SELECT DISTINCT Channel, ChannelType, Source FROM ChannelStatus; Each channel is supposedly listed in this table only 1 time, but just in case, what I really want to do is make sure that no channels are duplicated. Is there some way to make the keyword DISTINCT apply to Channel only? Sort of a shortcut to (example in pseudocode, although it'd be in Perl): SELECT DISTINCT Channel FROM ChannelStatus; FOR EACH Channel SELECT Channel, ChannelType FROM Source WHERE Channel = 'channel' ENDLOOP Thanks! Hal -- -Hank
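The GROUP BY ... HAVING check above can be verified with a quick sqlite3 sketch (the sample channel rows are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ChannelStatus (Channel TEXT, ChannelType TEXT, Source TEXT)")
conn.executemany("INSERT INTO ChannelStatus VALUES (?, ?, ?)",
                 [("news", "tv", "a"), ("news", "tv", "b"), ("sports", "tv", "a")])

# Channels appearing more than once -- returns zero rows when there are no dups.
dups = conn.execute("""
    SELECT Channel, COUNT(*) AS cnt FROM ChannelStatus
    GROUP BY Channel HAVING cnt > 1
""").fetchall()
print(dups)  # [('news', 2)]
```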
Re: Circular Replication
The long story short is we use the fact that MySQL has the ability to run the SQL thread and the IO thread of replication separately, and control them individually. I'm fairly green with replication, but I have a simple cron job that starts a PHP program that issues a slave start, watches for the time behind master to reach zero seconds, then issues a slave stop. This repeats every 10 minutes (it takes about one minute to update 10 minutes of master data), so my slave is at most (worst case) 10 minutes behind the master. This could be done every two hours or even once per day. I'll be setting up a second master to do this same thing once per day to act as my daily backup. Once the daily backup completes replication, I can flush tables and back up the database tables to the backup device for long-term backups. What are the differences between doing this and turning the SQL and IO threads on separately? Just IMO, that seems like a lot of manipulation that's not really necessary, but it's possible I'm missing something.
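The catch-up-then-stop loop described above could be sketched like this in Python. This is only an outline of the control logic: the run_sql and get_seconds_behind callables are hypothetical stand-ins for issuing the statements and reading Seconds_Behind_Master from SHOW SLAVE STATUS against a real server, and here they are stubbed out so the flow can be demonstrated.

```python
import time

def catch_up_and_stop(run_sql, get_seconds_behind, poll_interval=0.0):
    """Start the SQL thread, wait until the slave has caught up
    (seconds behind master reaches 0), then stop the thread again."""
    run_sql("SLAVE START SQL_THREAD")
    while get_seconds_behind() != 0:
        time.sleep(poll_interval)
    run_sql("SLAVE STOP SQL_THREAD")

# Stubbed demo: pretend the slave drains its backlog over three polls.
issued = []
backlog = iter([120, 45, 0])
catch_up_and_stop(issued.append, lambda: next(backlog))
print(issued)  # ['SLAVE START SQL_THREAD', 'SLAVE STOP SQL_THREAD']
```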
Re: Circular Replication
I'll be setting up a second master to do this same thing once per day to act as my daily backup. Oops... I meant to say second slave. -Hank
myisamchk error
While running a -rq on a large table, I got the following error: myisamchk: warning: Duplicate key for record at 54381140 against record at 54380810 How do I find which record is duplicated (without doing the typical self-join query)? This table has 70 million rows, so that's not really feasible. myisamchk --block-search # looked promising, but I can't find any documentation on how to use it properly. thanks. -- -Hank
Re: looking for a pure startup opportunity..
i've read way too many articles about the 3 kids/guys/etc... who managed to get $10 million in funding for essentially a basic idea, but they had/have traffic/eyeballs!!! Welcome to 1999. Blind reliance on Traffic and Eyeballs as a business plan was what (in part) caused the great run-ups in valuations that preceded the Internet crash. What counts, and what draws the venture capital, is a well thought out business plan based on traditional and well-proven business models (i.e. selling stuff with a healthy margin, offering in-demand services, etc). And even if you could get 10 million in funding, you think you and your partner get to walk away with 5 million each? That's not how it works. They invest that money into the COMPANY, and expect YOU to turn 10 million into at least 50 million. They also OWN YOU during that time - it's no cakewalk - just ask any VC-funded startup what they think of having VC investors. Getting funding is only the first step in a very long road to eventually selling out, IF you're able to grow the company 5x or more. so, if you're looking at your shrinking retirement going to iraq That statement makes no sense whatsoever. but with the right combination of web development skills (perl/php/c/apache/linux/mysql/html/css) we can do some damage. there are numerous opportunities for the right 3-5 person team! For a serious business, it takes a lot more than just raw development skills... like marketing, sales, management, finance, etc. I've read too many stories of smart guys who created amazing cutting-edge technology (i.e. Apple), only to lose it all because they had no idea how to correctly market or sell it. Anyway, this has little to do with mysql.
classic outer join problem
I have a table of school class assignments with the following fields - the first four fields are the primary key: Year (int) Term (enum: Spring, Fall, Winter) ClassID (int) SectionID (int) Attachment (varchar 225) The attachment field is a pointer to a file in the OS of an uploaded file, in a format like this: /uploads/2003/Fall/330/1/conversions.doc When an old class item is imported into a new class item for a new year/term/class/section (new record), the attachment field is copied over, and the OS file stays where it is... the attachment field value is then a pointer to the OS file for a previous year/term (i.e. the OS file is not copied to the new location in the filesystem structure). I've been trying to construct a self-joining query to list all the attachments which are NOT referenced by some future year/term/class/section. The desired query result is a list of files I can DELETE - i.e. files not imported or being pointed to by any other class_item record in a different year/term. Keep in mind that pointers to files in the same Year/Term (but different class/section) are NOT to be deleted. The system currently has MySQL version 4.0.1, so I can't use subqueries (i.e. NOT IN (...)). Any suggestions would be greatly appreciated. thanks. -Hank
Re: Unique Index on multiple columns that can contain NULL in MySQL
I think you should review the very recent thread "why NOT NULL in PRIMARY key??" which might shed some light on your particular issue. In a nutshell, NULL != NULL, so the database engine cannot detect the duplicate rows, which is the expected behavior. -Hank On 5/4/05, Dennis Fogg [EMAIL PROTECTED] wrote: I'm getting lots of duplicate rows even though I have a unique index defined over multiple columns.
Re: zip code search within x miles
Talk about overcomplicating things... here's the above query simplified. I cannot figure out why they were self-joining the table three times: SELECT b.zip_code, b.state, (3956 * (2 * ASIN(SQRT( POWER(SIN(((a.lat-b.lat)*0.017453293)/2),2) + COS(a.lat*0.017453293) * COS(b.lat*0.017453293) * POWER(SIN(((a.lng-b.lng)*0.017453293)/2),2) )))) AS distance FROM zips a, zips b WHERE a.zip_code = '90210' GROUP BY distance HAVING distance <= 5; -Hank
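The expression in that query is the haversine great-circle formula, with 3956 as the Earth's radius in miles and 0.017453293 as the degrees-to-radians factor. A quick Python check of the same arithmetic (the sample coordinates are made up):

```python
import math

def haversine_miles(lat1, lng1, lat2, lng2):
    """Great-circle distance in miles, mirroring the SQL expression above
    (3956 = Earth radius in miles, 0.017453293 ~= pi/180)."""
    rad = 0.017453293
    a = (math.sin(((lat1 - lat2) * rad) / 2) ** 2
         + math.cos(lat1 * rad) * math.cos(lat2 * rad)
         * math.sin(((lng1 - lng2) * rad) / 2) ** 2)
    return 3956 * 2 * math.asin(math.sqrt(a))

# One degree of longitude at the equator is about 69 miles.
print(round(haversine_miles(0, 0, 0, 1), 2))
# Identical points are zero miles apart.
print(haversine_miles(34.09, -118.41, 34.09, -118.41))
```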
Re: zip code search within x miles
On 4/19/05, Keith Ivey [EMAIL PROTECTED] wrote: Also, the index on zip_code, latitude, and longitude doesn't make sense. Yeah - I didn't even notice the indexes in the table def (I used my own existing zip code table). That table def and query were obviously created by someone pretty green with SQL. -Hank
Re: zip code search within x miles
No, those indexes were intentional. If you read the section of the manual on optimizing queries, you will encounter a page that mentions what are known as covering indexes. The advantage to a covering index is that if your data is numeric and in the index, Except that the zip code field is not (and should not be) numeric, so the qualification test fails.
Re: zip code search within x miles
Let's say you've got, oh, 2000 records to search through. You're gonna end up doing a JOIN with: 2,000 X 65,000 == 130,000,000 tuples (records/results). 130 MILLION tuples is *way* too many for your basic $20/month site. I'd say take some easy shortcuts first... like limiting the join to the zip code table by the target state (which of course is indexed), then adding the one, two or at most three neighboring states if you're near a border (or just limit the join to all the neighboring states in one shot). One, two, or three short-running queries all in SQL is a whole lot better than adding un-normalized and redundant fields to the source data table and populating them with a cron job and triggers. Talk about taking the long way around. And if anyone is looking for a datafile with worldwide cities (about 2.7 million records) and their lat/longs (not zips, though), here's a place to download it for free: http://www.maxmind.com/app/worldcities For $50, you can get the addition of the population of each city. -Hank
Re: zip code search within x miles
Applying this same thing to apply to the 80k estimated US zipcodes currently Just for the record, there are about 43,000 distinct US zip codes... and 56,000 zip codes if you double count the zips with multiple city names (when zip codes cross city limits). -Hank
Re: problem with adding timestamp as a column
I'd suggest not using the keyword timestamp as a column name; use ts or tstamp or something like that instead. To update the records to the current timestamp: update try set tstamp=null; should do it. I don't know why the default isn't working, though. On Mon, 28 Feb 2005 10:05:32 -0500, Zhe Wang [EMAIL PROTECTED] wrote: Hi, there, I have MySQL 4.1.10. I need to add a timestamp column to an existing table. I am having a problem setting the newly added column to the current time. This is what I did: CREATE TABLE try (id INTEGER); INSERT INTO try VALUES(1), (2), (3); ALTER TABLE try ADD timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP; SELECT * FROM try; | id | timestamp | | 1 | 0000-00-00 00:00:00 | | 2 | 0000-00-00 00:00:00 | | 3 | 0000-00-00 00:00:00 | I've read the on-line manual regarding the change in timestamp, and still couldn't figure out why 0's got inserted instead of a meaningful current time stamp. I would greatly appreciate it if someone can let me know what the correct way is. Unfortunately I cannot recreate the table. Thank you very much! Regards, Zhe -- -Hank
fsockopen causing connection errors
Hello, First, I tried searching the mysql list archives, but the archive search function on mysql.com does not seem to be working. Second, I'm using the PHP fsockopen() function on port 3306 to test if a database is up and responsive. If the socket opens, I'm assuming the database is up, and I then call fclose() to close the socket. The problem is that if I call this several times in a row quickly, the database senses a connection error problem and refuses to connect until I issue a flush hosts command. Is there a more proper way to use fsockopen()/fclose() to simply ping the database? I do NOT want to use mysql_connect in this case - I just want to ping 3306 without causing MySQL to count them as broken connections and disconnect the client host. Thanks, -Hank
PHP/MySQL/Apache interaction problem
I really need some help/direction, so I'll try to explain this as best I can. I have a PHP script that involves three very long-running queries (2-3 minutes each). The users of my website kick this off from their webpage, so it runs as nobody from inside Apache. In order to prevent a user from running the same script twice, I do a GET_LOCK call at the beginning of the script, and return a message if it is already running. Everything worked perfectly for the last few years with this method, until last week when my web server was upgraded to a 2.4 kernel, Apache 2 and PHP 4.3.3 (from a 2.2 kernel, Apache 1.3 and PHP 4.1.2). The database server (mysql 3.23.41) is running on a different machine and has not changed. Under the new system, the PHP script runs, and fires off the first of the three queries. Exactly 2 minutes into the script running (while the first query is still running), Apache times out and returns a blank page (expected), and I lose the GET_LOCK lock, but the PHP script continues to run, and in fact the other two queries also run and complete under the same mysql connection ID. So I'm not losing the mysql connection, and the PHP script continues to run, so why am I losing the lock? I've run all the queries from the command line, and I can't replicate the problem. Where can I look to find where/why mysql is giving up the lock, when the PHP script continues to run under the same connection ID? (I can understand the problem if the mysql connection dies and the script auto-re-connects, but that isn't happening.) Thanks, -Hank