Re: Schemas for storing and reporting on hashtags

2012-10-25 Thread Hank
Dehua,

Thanks. You are correct.  Perhaps I was just over-thinking it.

-Hank


On Fri, Oct 19, 2012 at 9:48 PM, Dehua Yang meflyingf...@gmail.com wrote:
 Hi Hank

 I can think of a table like this:

 CREATE TABLE xyz (
 hashtag VARCHAR(...) NOT NULL,
 comment_id ... NOT NULL,
 user_id bigint unsigned not null,
 PRIMARY KEY (hashtag, comment_id),
 INDEX(comment_id, hashtag),
 index idx_user_id(user_id)
 ) ENGINE = InnoDB;


  If one user wants to check all his comments:
 select * from xyz where user_id=x;

 Actually, I think your case is very classic.

 I hope you can tell me how you would resolve this problem.



 On Sat, Oct 20, 2012 at 3:26 AM, Hank hes...@gmail.com wrote:

 It's actually much more complicated than a simple many-to-many
 relationship.  For instance, users may enter many hashtags in many
 comments, and a user might want to see all the hashtags they've used,
 and then find all the comments with those hashtags.
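
 As a minimal sketch, using the xyz table with the user_id column from
 Dehua's reply above, those two lookups might be:

 -- all hashtags a given user has used:
 SELECT DISTINCT hashtag FROM xyz WHERE user_id = 42;

 -- all comments carrying any of those hashtags:
 SELECT DISTINCT x2.comment_id
 FROM xyz x1
 JOIN xyz x2 USING (hashtag)
 WHERE x1.user_id = 42;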

 I'm not trying to re-create the specification or build it here in
 discussion.  I'm looking to see if anyone has done it already, and I
 can review their fleshed-out design.   I've been a database architect
 for 20+ years, so I know what I'm doing.  I'm not asking for people to
 show me what to do.  Like I said, I could sit down and design it
 myself pretty quickly, but I would like to see what other people have
 *actually done* to solve the problem before.

 -Hank


 On Fri, Oct 19, 2012 at 2:42 PM, Rick James rja...@yahoo-inc.com wrote:
  Many-to-many?  That is, can a comment have many different hashtags?  And
  a hashtag can be associated with many comments?
 
  Best practice for many-to-many:
  CREATE TABLE xyz (
  hashtag VARCHAR(...) NOT NULL,
  comment_id ... NOT NULL,
  PRIMARY KEY (hashtag, comment_id),
  INDEX(comment_id, hashtag)
  ) ENGINE = InnoDB;
 
  One might want to normalize the hashtags, but it does not seem
  warranted in this situation.
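
  As a sketch, these are the two lookups that pair of indexes serves
  (values hypothetical):

  -- all comments tagged with a given hashtag (uses the PRIMARY KEY):
  SELECT comment_id FROM xyz WHERE hashtag = 'mysql';

  -- all hashtags on a given comment (uses the secondary index):
  SELECT hashtag FROM xyz WHERE comment_id = 12345;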
 
  -Original Message-
  From: Hank [mailto:hes...@gmail.com]
  Sent: Friday, October 19, 2012 8:58 AM
  To: MySql
  Subject: Schemas for storing and reporting on hashtags
 
  Are there any established best practices or schemas for incorporating
  twitter-like hashtags into a database?
 
  Let's say I have a blog with a commenting system, and I want to allow
  people to add hashtags to the comments.
 
  I could certainly create one on my own (it's not that difficult), but
  I'd like to see what other people have done in terms of storage and
  features.
 
  I'm also looking for a solid basic implementation, not something overly
  complex.
 
  Thanks,
 
  -Hank
 
 
 





 --
 Gtalk : meflyingf...@gmail.com
 Skype name : meflyingfish
 Twitter: http://twitter.com/whitepoplar





Schemas for storing and reporting on hashtags

2012-10-19 Thread Hank
Are there any established best practices or schemas for
incorporating twitter-like hashtags into a database?

Let's say I have a blog with a commenting system, and I want to allow
people to add hashtags to the comments.

I could certainly create one on my own (it's not that difficult), but
I'd like to see what other people have done in terms of storage and
features.

I'm also looking for a solid basic implementation, not something
overly complex.

Thanks,

-Hank





Re: Schemas for storing and reporting on hashtags

2012-10-19 Thread Hank
It's actually much more complicated than a simple many-to-many
relationship.  For instance, users may enter many hashtags in many
comments, and a user might want to see all the hashtags they've used,
and then find all the comments with those hashtags.

I'm not trying to re-create the specification or build it here in
discussion.  I'm looking to see if anyone has done it already, and I
can review their fleshed-out design.   I've been a database architect
for 20+ years, so I know what I'm doing.  I'm not asking for people to
show me what to do.  Like I said, I could sit down and design it
myself pretty quickly, but I would like to see what other people have
*actually done* to solve the problem before.

-Hank


On Fri, Oct 19, 2012 at 2:42 PM, Rick James rja...@yahoo-inc.com wrote:
 Many-to-many?  That is, can a comment have many different hashtags?  And a 
 hashtag can be associated with many comments?

 Best practice for many-to-many:
 CREATE TABLE xyz (
 hashtag VARCHAR(...) NOT NULL,
 comment_id ... NOT NULL,
 PRIMARY KEY (hashtag, comment_id),
 INDEX(comment_id, hashtag)
 ) ENGINE = InnoDB;

 One might want to normalize the hashtags, but it does not seem warranted in 
 this situation.

 -Original Message-
 From: Hank [mailto:hes...@gmail.com]
 Sent: Friday, October 19, 2012 8:58 AM
 To: MySql
 Subject: Schemas for storing and reporting on hashtags

 Are there any established best practices or schemas for incorporating
 twitter-like hashtags into a database?

 Let's say I have a blog with a commenting system, and I want to allow
 people to add hashtags to the comments.

 I could certainly create one on my own (it's not that difficult), but
 I'd like to see what other people have done in terms of storage and
 features.

 I'm also looking for a solid basic implementation, not something overly
 complex.

 Thanks,

 -Hank







Re: GA download reverted back to 5.5.24?

2012-06-30 Thread Hank
 Check the manual:
 http://dev.mysql.com/doc/refman/5.5/en/news-5-5-25.html
 Shawn Green

Thank you, but that warning note was not there when I first posted
this message here.  I'm not sure when the warning note appeared,
but I'd guess it was added within the last 36 hours. All I could see
was that 5.5.25 had mysteriously disappeared, replaced by 5.5.24.

-Hank




GA download reverted back to 5.5.24?

2012-06-28 Thread Hank
I am in the process of reporting a new MySQL bug in 5.5.25 (doesn't
exist in 5.5.24) - see: http://bugs.mysql.com/bug.php?id=65740

And I just noticed that at the mysql.com website, the GA downloads
have just been reverted back to 5.5.24.

Is there a blog or update site that might explain why they retracted 5.5.25?

thanks,

-Hank




preg_replace in update statement

2012-03-08 Thread Hank
I have a simple problem:

I have a varchar field in the database, and I want to remove all text
between WordA and WordB, including WordA and WordB, leaving all text
before WordA and after WordB intact.

Possible with just SQL?  I know I can write a PHP program to do it,
but it's not that important to spend that much time on.  I'd like one
SQL statement to do it.

Thanks!

-Hank




Re: preg_replace in update statement

2012-03-08 Thread Hank
They are regular words.  I was hoping someone would already know how
to do it. I was trying to avoid rolling my own solution using the
string functions. It gets really messy, really quick.
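
For the record, a sketch of what the one-statement version might look
like, assuming the markers are literal strings and a hypothetical table
t with column col:

-- keep the text before the first WordA, drop everything through the
-- end of WordB, keep the rest; the WHERE clause guards that both
-- markers are present and in order
UPDATE t
SET col = CONCAT(
      SUBSTRING_INDEX(col, 'WordA', 1),
      SUBSTRING(col, LOCATE('WordB', col) + CHAR_LENGTH('WordB')))
WHERE LOCATE('WordA', col) > 0
  AND LOCATE('WordB', col) > LOCATE('WordA', col);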


-Hank



On Thu, Mar 8, 2012 at 8:18 PM, Michael Dykman mdyk...@gmail.com wrote:
 If your words need to be regular expressions per se, with meta
 characters, etc., you are pretty much out of luck.  If you only need
 to match some literal strings, a method can surely be derived through
 the heavy-handed use of mysql's string function library.

 http://dev.mysql.com/doc/refman/5.0/en/string-functions.html

  - michael dykman

 On Thu, Mar 8, 2012 at 4:11 PM, Hank hes...@gmail.com wrote:
 I have a simple problem:

 I have a varchar field in the database, and I want to remove all text
 between WordA and WordB, including WordA and WordB, leaving all text
 before WordA and after WordB intact.

 Possible with just SQL?  I know I can write a PHP program to do it,
 but it's not that important to spend that much time on.  I'd like one
 SQL statement to do it.

 Thanks!

 -Hank





 --
  - michael dykman
  - mdyk...@gmail.com

  May the Source be with you.




Error on 'flush tables' command

2011-11-18 Thread Hank
I'm getting an occasional error from one of my slaves running
community mysql 5.5.16:

'Got an error writing communication packets' on query.  Query: 'flush tables'

Which halts replication on this slave until I issue a 'SET GLOBAL
sql_slave_skip_counter=1' command on the slave.

There are a few FEDERATED tables on the slave... is that something that
would cause a communication packet error?

If not, what else could cause this on a flush tables command?

Thanks.

-Hank




Re: Quantity of time from difference of two Datetime values?

2011-09-30 Thread Hank
On Fri, Sep 30, 2011 at 11:08 PM, Jan Steinman j...@bytesmiths.com wrote:
 Okay, I've reviewed the online man page for date and time functions, and I've 
 played with several likely candidates, and I am still having trouble 
 subtracting two arbitrary Datetimes to get something that is useful. A simple 
 subtraction yields the least useful thing possible: a modulo-100 difference 
 of modulo-60 quantities. Other functions yield the proper answer, but not for 
 a quantity of time that rolls over midnight, etc.

 Surely, there are tons of payroll apps that subtract the punch-out from the 
 punch-in to come up with a quantity of time?

 What is YOUR favourite way of coming up with a quantity of time as the 
 difference between two arbitrary Datetimes? Did I overlook something simple? 
 Do I need to convert the two to scalar integers first?


See the unix_timestamp() function.  It converts a date/time to a scalar
integer: the number of seconds since the Unix epoch.
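
A sketch of the payroll case, assuming hypothetical punch_in and
punch_out DATETIME columns in a timecards table:

SELECT UNIX_TIMESTAMP(punch_out) - UNIX_TIMESTAMP(punch_in) AS seconds_worked,
       SEC_TO_TIME(UNIX_TIMESTAMP(punch_out) - UNIX_TIMESTAMP(punch_in)) AS hh_mm_ss
FROM timecards;

Since the subtraction happens on absolute seconds, quantities that roll
over midnight come out correctly, and SEC_TO_TIME() turns the scalar
difference back into a readable quantity of time.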




Re: mysql listed as attack page by google?

2011-09-28 Thread Hank
I've been watching this page to see when my anti-virus tool was
updated to scan for this specific virus/threat:

http://www.virustotal.com/file-scan/report.html?id=d761babcb55d21b467dd698169c921995bf58eac5e9912596693fee52c8690a1-1317175019

I use AVG

-Hank



On Wed, Sep 28, 2011 at 8:45 AM, Rozeboom, Kay [DAS]
kay.rozeb...@iowa.gov wrote:
 Does anyone know if this has been fixed yet?


 -Original Message-
 From: Jigal van Hemert [mailto:ji...@xs4all.nl]
 Sent: Monday, September 26, 2011 2:02 PM
 To: mysql@lists.mysql.com
 Subject: Re: mysql listed as attack page by google?

 Hi,

 On 26-9-2011 20:30, Michael Albert wrote:
 I don't suppose I am the first to notice this, but most of
 the pages on dev.mysql.com have been listed by google
 as attack pages, e.g. http://dev.mysql.com/downloads/.
 Has there been a problem, or is google being overzealous?

 I fear Google is right.

 http://www.net-security.org/malware_news.php?id=1853

 --
 Kind regards / met vriendelijke groet,

 Jigal van Hemert.









Re: Update table on lost connection

2011-09-28 Thread Hank
Check out the GET_LOCK and RELEASE_LOCK advisory locking functions in MySQL.
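
A minimal sketch, assuming a hypothetical per-user lock name:

-- at login: returns 1 if the lock was acquired, 0 if another session
-- already holds it (a timeout of 0 means don't wait)
SELECT GET_LOCK(CONCAT('login_', 'jsmith'), 0);

-- at logout:
SELECT RELEASE_LOCK(CONCAT('login_', 'jsmith'));

The useful property here is that the server releases the lock
automatically if the client connection dies, so no stale locked flag
is left behind.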

-Hank



On Wed, Sep 28, 2011 at 9:15 AM, Alex Schaft al...@quicksoftware.co.za wrote:
 Hi,

 We're busy moving legacy apps from foxpro tables to mysql. User logins were
 tracked via a record in a table which the app then locked, preventing
 multiple logins for the same user code.

 I want to simulate this via a locked column in a mysql table, but would
 need the field to be cleared if the server loses the connection to the
 client. How would I do this, or is there an alternative?

 Thanks,
 Alex









Re: Slower performance with LOCK TABLES

2011-09-23 Thread Hank
Hello Johan,
 Thanks for your comprehensive reply. I'll try to answer each of your
questions below.
-Hank

  But if seeing some SQL will make you happy, here is just one example:
 
  UPDATE dest d straight_join source s set d.seq=s.seq WHERE d.key=s.key;

 See, this is why people ask to see your queries. You never mentioned you were 
 doing a join in the update :-)

 I'll ignore the join condition in the where clause, as it makes little 
 difference in this case; but I do note that
 you're using a straight join. Is the optimizer really reading the tables in 
 the wrong order, or is it just
 something you picked up off a blog without realising the impact?

Yes, I'm using a straight join intentionally.  I have 144 million
unindexed rows in dest. I want MySQL to start with those rows
sequentially, then join each to the matching record in source using
its index (244 million rows).  If I don't do that, MySQL tries to use
the indexed table first, causing full table scans on dest.  So with
straight_join, it joins in the proper order.  During experimentation
with different joins, a regular join would run for days.  A straight
join runs for 3-4 hours.

   Source is indexed by key+seq (key is primary key, but seq is  included as 
  a covering index).

 Good practice, that should prevent source from being read from disk, if your 
 index is fully in the cache.

With 244 million records in the source table, I'm not sure that
would fit in the cache.

 Do you have an index on dest.key, too? That might help performance as well if 
 it fits in memory, too,
 because you'll only need disk access for flushing writes, then.

I do not have any index on dest, also intentionally, for two
reasons. First, updating 144 million records in place is slow enough,
but trying to update 144 million records AND an index on the updated
field would absolutely kill the performance of the update. Once the
update is complete, I re-create the index with a sort using myisamchk.
Second, the starting value of dest.seq for all 144 million records
is 0, so an index on that field wouldn't really help, I think.

  This query takes about 3.5 hours when I don't use LOCK TABLES, and over 4 
  hours when I do
  use LOCK TABLES.

 Most peculiar. Is the difference in performance consistent in repeated 
 executions?

Yes, I've done these tests about a dozen times now, and while not
exactly scientific, the result is that LOCK TABLES always produces
longer running times. Not just for this query, but for other full-table
update/select/delete/insert queries as well.  Never more than twice as
long, but easily a 10% to 25% increase.

  And before testing each run, I do restart the server so there is no
  query caching and I also use FLUSH TABLES between each test run.

 That's good, as it will give you the worst-case scenario. However, since such 
 an update is going to wreck
your index cache anyway, you may just as well preload the appropriate indices 
into it beforehand, if the
cache is sized big enough to hold them. That might give a minor performance 
boost, too, as the server won't have to go to disk every so often to fetch 
index blocks.
 See http://dev.mysql.com/doc/refman/5.0/en/index-preloading.html for that.

I'll look into that, but the index is huge.  OK, I just preloaded the
source index using LOAD INDEX INTO CACHE source IGNORE LEAVES; it
took two minutes and 15 seconds to pre-load the index. I then ran two
small tests on smaller tables using the same update statement, and
they both yielded a massive increase in update speed. I'll have to
rebuild the large dest table again to try it on the biggest UPDATE,
but wow, if this is any indication, this was a great suggestion. I'll
report back on the results later today. Thank you!

  | key_buffer_size          | 402653184 |

 400MB of key buffer, that's not bad at all. Do a /show table status/ in your 
 database, and sum the index sizes. If your key buffer is larger than this 
 (and why not scale it for growth a bit?) all your indices will fit, which 
 will save on disk access for index lookups *and* for index-covered queries.

The index length for source is 5,889,037,312.
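
For reference, a sketch of summing those index sizes in one query rather
than eyeballing the show table status output (schema name hypothetical):

SELECT SUM(index_length) AS total_index_bytes
FROM information_schema.TABLES
WHERE table_schema = 'mydb'
  AND engine = 'MyISAM';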

Thanks again for your assistance.

-Hank




Re: Slower performance with LOCK TABLES

2011-09-23 Thread Hank
Hello Johan,

 Just an update: using the LOAD INDEX INTO CACHE statement for the
200 million row indexed source table, my correlated update statement
ran in 1 hour 45 minutes to update 144 million rows, half the
previous running time!

Thank you very much,

-Hank



On Fri, Sep 23, 2011 at 6:56 AM, Johan De Meersman vegiv...@tuxera.be wrote:


 - Original Message -
 From: Hank hes...@gmail.com

 (please read my ORIGINAL post with all this information).

 Welcome to the Internet, where everyone will tell you everything you need to
 know, except for what you want to know :-)

 I am trying to find a logical or reasonable explanation WHY this would be the
 case, despite the fact that the documentation states otherwise (see:
 http://dev.mysql.com/doc/refman/5.5/en/lock-tables-restrictions.html)

 I believe you're misinterpreting that, as is the author from the blog you 
 originally referenced.

 What it says is "if you are going to run many operations". You are updating
 many rows, but you are only doing ONE operation: a single update statement.

 While this explains why you're not seeing benefit, I have to admit that I'm 
 at a loss, too, as to why you are experiencing an actual slowdown - the 
 update statement will lock the tables, too, so it shouldn't really make any 
 difference at all.

 But if seeing some SQL will make you happy, here is just one example:

 UPDATE dest d straight_join source s set d.seq=s.seq WHERE d.key=s.key;

 See, this is why people ask to see your queries. You never mentioned you were 
 doing a join in the update :-)

 I'll ignore the join condition in the where clause, as it makes little 
 difference in this case; but I do note that you're using a straight join. Is 
 the optimizer really reading the tables in the wrong order, or is it just 
 something you picked up off a blog without realising the impact?

  Source is indexed by key+seq (key is primary key, but seq is
  included as a covering index).

 Good practice, that should prevent source from being read from disk, if your 
 index is fully in the cache. Do you have an index on dest.key, too? That 
 might help performance as well if it fits in memory, too, because you'll only 
 need disk access for flushing writes, then.

 This query takes about 3.5 hours when I don't use LOCK TABLES, and over 4 
 hours when I do use LOCK TABLES.

 Most peculiar. Is the difference in performance consistent in repeated 
 executions?

 And before testing each run, I do restart the server so there is no
 query caching and I also use FLUSH TABLES between each test run.

 That's good, as it will give you the worst-case scenario. However, since such 
 an update is going to wreck your index cache anyway, you may just as well 
 preload the appropriate indices into it beforehand, if the cache is sized big 
 enough to hold them. That might give a minor performance boost, too, as the 
 server won't have to go to disk every so often to fetch index blocks.
 See http://dev.mysql.com/doc/refman/5.0/en/index-preloading.html for that.

 | key_buffer_size          | 402653184 |

 400MB of key buffer, that's not bad at all. Do a /show table status/ in your 
 database, and sum the index sizes. If your key buffer is larger than this 
 (and why not scale it for growth a bit?) all your indices will fit, which 
 will save on disk access for index lookups *and* for index-covered queries.



 --
 Bier met grenadyn
 Is als mosterd by den wyn
 Sy die't drinkt, is eene kwezel
 Hy die't drinkt, is ras een ezel





Re: Slower performance with LOCK TABLES

2011-09-22 Thread Hank
Thanks for your reply.  I failed to mention that these are MYISAM tables, so
no transactions.  And like I said, this is not a production box nor is there
any application running, so there's no contention for the tables being
locked.  I'm trying to update a database design on two tables with 200
million records each, so anything I can do to increase the performance of
these long running queries will shorten the migration running time.

What I was referring to is the documentation's note that when using LOCK
TABLES, mysql does not update the key cache until the lock is released,
whereas without LOCK TABLES it does update the key cache on each
insert/update/delete.

see: http://tuxradar.com/practicalphp/18/2/22

In my testing, I'm seeing a slow down when I use LOCK TABLES versus running
the same queries without it.  I'm just trying to find a reason why that
might be the case.
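
Pieced together from the statements quoted elsewhere in this thread, the
pattern under test looks roughly like this:

LOCK TABLES dest d WRITE, source s READ;
UPDATE dest d straight_join source s SET d.seq=s.seq WHERE d.key=s.key;
UNLOCK TABLES;

(A table locked under an alias must be referred to by that alias, which
the UPDATE here does.)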

-Hank


On Thu, Sep 22, 2011 at 12:42 AM, Antony T Curtis
antonycur...@verizon.netwrote:

 LOCK TABLES...WRITE is very likely to reduce performance if you are using a
 transactional storage engine, such as InnoDB/XtraDB or PBXT. The reason is
 that only one connection is holding the write lock and no other concurrent
 operation may occur on the table.

 LOCK TABLES is only really useful for non-transactional tables and maybe a
 few specialized operations where it has its advantages but for 99.9% of
 cases, it should not be used.

 What does increase performance is the proper use of transactions with
 appropriate use of SELECT...FOR UPDATE and SELECT...LOCK IN SHARE MODE.

 Regards,

 Antony.



 On 21 Sep 2011, at 20:34, Hank wrote:

  According to everything I've read, using LOCK TABLES...WRITE for updates,
 inserts and deletes should improve performance of mysql server, but I
 think
 I've been seeing the opposite effect.

 I've been doing quite a bit of testing on a 64bit install of CentOS 5.5
 installed as a guest OS on a VMWare ESXi 4.0 hypervisor on a Dell R610.
 There are no other VMs on this box, and there are no other users or
 threads
 running on the OS. Just me.  I'm using this box strictly for testing of
 large database migration scripts.

 It seems like when I execute some of these long running statements without
 locking the tables, the code runs quite a bit faster than when I do lock
 the
 tables.  And before testing each run, I do restart the server so there is
 no
 query caching and I also use FLUSH TABLES between each test run.

 All I'm asking is this:  Can anyone think of a scenario, on a single
 user-box and mysql instance, where locking tables would cause these DML
 statements to slow down compared to not locking the tables?

 Thanks,

 -Hank





Re: Slower performance with LOCK TABLES

2011-09-22 Thread Hank
That is what I'm doing. I'm doing a correlated update on 200 million
records. One UPDATE statement.

Also, I'm not asking for a tutorial on when not to use LOCK TABLES.  I'm trying
to figure out why, despite what the documentation says, using LOCK TABLES
hinders performance for large update statements on MYISAM tables when it is
supposed to increase performance on exactly the type of queries I am
performing.

If you can't help answer *that* question, please stop lecturing me on the
reasons not to use LOCK TABLES. Thanks.

-Hank


On Thu, Sep 22, 2011 at 10:19 AM, Antony T Curtis
antonycur...@verizon.netwrote:

 Even for MyISAM tables, LOCK TABLES is not usually the best solution for
 increasing performance. When there is little to no contention, LOCK TABLES
 doesn't offer much value.

 MyISAM works best when you can get more work done in a statement: Instead
 of executing a bunch of insert statements, combine them into a single
 multi-row insert statement, as an example.


 On 22 Sep 2011, at 06:13, Hank wrote:

 Thanks for your reply.  I failed to mention that these are MYISAM tables,
 so no transactions.  And like I said, this is not a production box nor is
 there any application running, so there's no contention for the tables being
 locked.  I'm trying to update a database design on two tables with 200
 million records each, so anything I can do to increase the performance of
 these long running queries will shorten the migration running time.

 What I was referring to was that in the documentation,  that when using
 LOCK TABLES, mysql does not update the key cache until the lock is released,
 versus when not using LOCK TABLES it does update the key cache on each
 insert/update/delete.

 see: http://tuxradar.com/practicalphp/18/2/22

 In my testing, I'm seeing a slow down when I use LOCK TABLES versus running
 the same queries without it.  I'm just trying to find a reason why that
 might be the case.

 -Hank


 On Thu, Sep 22, 2011 at 12:42 AM, Antony T Curtis 
 antonycur...@verizon.net wrote:

 LOCK TABLES...WRITE is very likely to reduce performance if you are using
 a transactional storage engine, such as InnoDB/XtraDB or PBXT. The reason is
 that only one connection is holding the write lock and no other concurrent
 operation may occur on the table.

 LOCK TABLES is only really useful for non-transactional tables and maybe a
 few specialized operations where it has its advantages but for 99.9% of
 cases, it should not be used.

 What does increase performance is the proper use of transactions with
 appropriate use of SELECT...FOR UPDATE and SELECT...LOCK IN SHARE MODE.

 Regards,

 Antony.



 On 21 Sep 2011, at 20:34, Hank wrote:

  According to everything I've read, using LOCK TABLES...WRITE for updates,
 inserts and deletes should improve performance of mysql server, but I
 think
 I've been seeing the opposite effect.

 I've been doing quite a bit of testing on a 64bit install of CentOS 5.5
 installed as a guest OS on a VMWare ESXi 4.0 hypervisor on a Dell R610.
 There are no other VMs on this box, and there are no other users or
 threads
 running on the OS. Just me.  I'm using this box strictly for testing of
 large database migration scripts.

 It seems like when I execute some of these long running statements
 without
 locking the tables, the code runs quite a bit faster than when I do lock
 the
 tables.  And before testing each run, I do restart the server so there is
 no
 query caching and I also use FLUSH TABLES between each test run.

 All I'm asking is this:  Can anyone think of a scenario, on a single
 user-box and mysql instance, where locking tables would cause these DML
 statements to slow down compared to not locking the tables?

 Thanks,

 -Hank







Re: Slower performance with LOCK TABLES

2011-09-22 Thread Hank
Actually, that would be orders of magnitude slower.

  I'm using MYISAM tables, so there's no commit.




On Thu, Sep 22, 2011 at 1:48 PM, Ananda Kumar anan...@gmail.com wrote:

 Hi,
 Why don't you use a stored proc to update rows, where you commit every 1k or
 10k rows?
 This will be much faster than your individual update statements.

 regards
 anandkl

 On Thu, Sep 22, 2011 at 8:24 PM, Hank hes...@gmail.com wrote:

 That is what I'm doing. I'm doing a correlated update on 200 million
 records. One UPDATE statement.

 Also, I'm not asking for a tutorial on when not to use LOCK TABLES.  I'm
 trying
 to figure out why, despite what the documentation says, using LOCK TABLES
 hinders performance for large update statements on MYISAM tables when it
 is
 supposed to increase performance on exactly the type of queries I am
 performing.

 If you can't help answer *that* question, please stop lecturing me on the
 reasons not to use LOCK TABLES. Thanks.

 -Hank


 On Thu, Sep 22, 2011 at 10:19 AM, Antony T Curtis
 antonycur...@verizon.netwrote:

  Even for MyISAM tables, LOCK TABLES is not usually the best solution for
  increasing performance. When there is little to no contention, LOCK
 TABLES
  doesn't offer much value.
 
  MyISAM works best when you can get more work done in a statement:
 Instead
  of executing a bunch of insert statements, combine them into a single
  multi-row insert statement, as an example.
 
 
  On 22 Sep 2011, at 06:13, Hank wrote:
 
  Thanks for your reply.  I failed to mention that these are MYISAM
 tables,
  so no transactions.  And like I said, this is not a production box nor
 is
  there any application running, so there's no contention for the tables
 being
  locked.  I'm trying to update a database design on two tables with 200
  million records each, so anything I can do to increase the performance
 of
  these long running queries will shorten the migration running time.
 
  What I was referring to was that in the documentation,  that when using
  LOCK TABLES, mysql does not update the key cache until the lock is
 released,
  versus when not using LOCK TABLES it does update the key cache on each
  insert/update/delete.
 
  see: http://tuxradar.com/practicalphp/18/2/22
 
  In my testing, I'm seeing a slow down when I use LOCK TABLES versus
 running
  the same queries without it.  I'm just trying to find a reason why that
  might be the case.
 
  -Hank
 
 
  On Thu, Sep 22, 2011 at 12:42 AM, Antony T Curtis 
  antonycur...@verizon.net wrote:
 
  LOCK TABLES...WRITE is very likely to reduce performance if you are
 using
  a transactional storage engine, such as InnoDB/XtraDB or PBXT. The
 reason is
  that only one connection is holding the write lock and no other
 concurrent
  operation may occur on the table.
 
  LOCK TABLES is only really useful for non-transactional tables and
 maybe a
  few specialized operations where it has its advantages but for 99.9% of
  cases, it should not be used.
 
  What does increase performance is the proper use of transactions with
  appropriate use of SELECT...FOR UPDATE and SELECT...LOCK IN SHARE MODE.
 
  Regards,
 
  Antony.
 
 
 
  On 21 Sep 2011, at 20:34, Hank wrote:
 
   According to everything I've read, using LOCK TABLES...WRITE for
 updates,
  inserts and deletes should improve performance of mysql server, but I
  think
  I've been seeing the opposite effect.
 
  I've been doing quite a bit of testing on a 64bit install of CentOS
 5.5
  installed as a guest OS on a VMWare ESXi 4.0 hypervisor on a Dell
 R610.
  There are no other VMs on this box, and there are no other users or
  threads
  running on the OS. Just me.  I'm using this box strictly for testing
 of
  large database migration scripts.
 
  It seems like when I execute some of these long running statements
  without
  locking the tables, the code runs quite a bit faster than when I do
 lock
  the
  tables.  And before testing each run, I do restart the server so there
 is
  no
  query caching and I also use FLUSH TABLES between each test run.
 
  All I'm asking is this:  Can anyone think of a scenario, on a single
  user-box and mysql instance, where locking tables would cause these DML
  statements to slow down compared to not locking the tables?
 
  Thanks,
 
  -Hank
 
 
 
 
 





Re: Slower performance with LOCK TABLES

2011-09-22 Thread Hank
Sorry, but you do not understand my original issue or question.

-Hank


On Thu, Sep 22, 2011 at 2:10 PM, Ananda Kumar anan...@gmail.com wrote:

 No,
 Use a cursor (select the column to be used in the where condition of the
 update stmt), and loop through it, one update per row.

 regards
 anandkl


 On Thu, Sep 22, 2011 at 11:36 PM, Hank hes...@gmail.com wrote:


 Actually, that would be orders of magnitude slower.

   I'm using MYISAM tables, so there's no commit.




 On Thu, Sep 22, 2011 at 1:48 PM, Ananda Kumar anan...@gmail.com wrote:

 Hi,
 Why don't you use a stored proc to update rows, where you commit every 1k
 or 10k rows?
 This will be much faster than your individual update statements.

 regards
 anandkl

 On Thu, Sep 22, 2011 at 8:24 PM, Hank hes...@gmail.com wrote:

 That is what I'm doing. I'm doing a correlated update on 200 million
 records. One UPDATE statement.

 Also, I'm not asking for a tutorial on when not to use LOCK TABLES.  I'm
 trying
 to figure out why, despite what the documentation says, using LOCK
 TABLES
 hinders performance for large update statements on MYISAM tables when it
 is
 supposed to increase performance on exactly the type of queries I am
 performing.

 If you can't help answer *that* question, please stop lecturing me on
 the
 reasons not to use LOCK TABLES. Thanks.

 -Hank


 On Thu, Sep 22, 2011 at 10:19 AM, Antony T Curtis
 antonycur...@verizon.netwrote:

  Even for MyISAM tables, LOCK TABLES is not usually the best solution
 for
  increasing performance. When there is little to no contention, LOCK
 TABLES
  doesn't offer much value.
 
  MyISAM works best when you can get more work done in a statement:
 Instead
  of executing a bunch of insert statements, combine them into a single
  multi-row insert statement, as an example.
 
 
  On 22 Sep 2011, at 06:13, Hank wrote:
 
  Thanks for your reply.  I failed to mention that these are MYISAM
 tables,
  so no transactions.  And like I said, this is not a production box nor
 is
  there any application running, so there's no contention for the tables
 being
  locked.  I'm trying to update a database design on two tables with 200
  million records each, so anything I can do to increase the performance
 of
  these long running queries will shorten the migration running time.
 
  What I was referring to was that in the documentation,  that when
 using
  LOCK TABLES, mysql does not update the key cache until the lock is
 released,
  versus when not using LOCK TABLES it does update the key cache on each
  insert/update/delete.
 
  see: http://tuxradar.com/practicalphp/18/2/22
 
  In my testing, I'm seeing a slow down when I use LOCK TABLES versus
 running
  the same queries without it.  I'm just trying to find a reason why
 that
  might be the case.
 
  -Hank
 
 
  On Thu, Sep 22, 2011 at 12:42 AM, Antony T Curtis 
  antonycur...@verizon.net wrote:
 
  LOCK TABLES...WRITE is very likely to reduce performance if you are
 using
  a transactional storage engine, such as InnoDB/XtraDB or PBXT. The
 reason is
  that only one connection is holding the write lock and no other
 concurrent
  operation may occur on the table.
 
  LOCK TABLES is only really useful for non-transactional tables and
 maybe a
  few specialized operations where it has its advantages but for 99.9%
 of
  cases, it should not be used.
 
  What does increase performance is the proper use of transactions with
  appropriate use of SELECT...FOR UPDATE and SELECT...LOCK IN SHARE
 MODE.
 
  Regards,
 
  Antony.
 
 
 
  On 21 Sep 2011, at 20:34, Hank wrote:
 
   According to everything I've read, using LOCK TABLES...WRITE for
 updates,
  inserts and deletes should improve performance of mysql server, but
 I
  think
  I've been seeing the opposite effect.
 
  I've been doing quite a bit of testing on a 64bit install of CentOS
 5.5
  installed as a guest OS on a VMWare ESXi 4.0 hypervisor on a Dell
 R610.
  There are no other VMs on this box, and there are no other users or
  threads
  running on the OS. Just me.  I'm using this box strictly for testing
 of
  large database migration scripts.
 
  It seems like when I execute some of these long running statements
  without
  locking the tables, the code runs quite a bit faster than when I do
 lock
  the
  tables.  And before testing each run, I do restart the server so
 there is
  no
  query caching and I also use FLUSH TABLES between each test run.
 
  All I'm asking is this:  Can anyone think of a scenario, on a single
  user-box and mysql instance, where locking tables would cause these DML
  statements to slow down compared to not locking the tables?
 
  Thanks,
 
  -Hank
 
 
 
 
 







Re: Slower performance with LOCK TABLES

2011-09-22 Thread Hank
Like I said, the problem is not just one particular SQL statement. It is
several dozen statements operating on tables with several hundred million
records.  The problem is that I am finding that when I use LOCK TABLES,
these queries run slower (please read my ORIGINAL post with all this
information).  I am trying to find a logical or reasonable explanation WHY
this would be the case, despite the fact that the documentation states
otherwise (see:
http://dev.mysql.com/doc/refman/5.5/en/lock-tables-restrictions.html )

But if seeing some SQL will make you happy, here is just one example:

UPDATE dest d straight_join source s set d.seq=s.seq WHERE d.key=s.key;

for 140 million records in dest and  220 million records in source.
 Source is indexed by key+seq (key is primary key, but seq is included as a
covering index). There is no index on dest.seq -- that index is built once
the update is complete.  This query takes about 3.5 hours when I don't use
LOCK TABLES, and over 4 hours when I do use LOCK TABLES.

-Hank


On Thu, Sep 22, 2011 at 2:18 PM, Ananda Kumar anan...@gmail.com wrote:

 Maybe if you can let the audience know a snippet of your SQL, someone can help you.


 On Thu, Sep 22, 2011 at 11:43 PM, Hank hes...@gmail.com wrote:


 Sorry, but you do not understand my original issue or question.

 -Hank



 On Thu, Sep 22, 2011 at 2:10 PM, Ananda Kumar anan...@gmail.com wrote:

 No,
 Use a cursor (select the column to be used in the where condition of the
 update stmt), and loop through it, one update per row.

 regards
 anandkl


 On Thu, Sep 22, 2011 at 11:36 PM, Hank hes...@gmail.com wrote:


 Actually, that would be orders of magnitude slower.

   I'm using MYISAM tables, so there's no commit.




 On Thu, Sep 22, 2011 at 1:48 PM, Ananda Kumar anan...@gmail.comwrote:

 Hi,
 Why don't you use a stored proc to update rows, where you commit every
 1k or 10k rows?
 This will be much faster than your individual update statements.

 regards
 anandkl

 On Thu, Sep 22, 2011 at 8:24 PM, Hank hes...@gmail.com wrote:

 That is what I'm doing. I'm doing a correlated update on 200 million
 records. One UPDATE statement.

 Also, I'm not asking for a tutorial on when not to use LOCK TABLES.  I'm
 trying
 to figure out why, despite what the documentation says, using LOCK
 TABLES
 hinders performance for large update statements on MYISAM tables when
 it is
 supposed to increase performance on exactly the type of queries I am
 performing.

 If you can't help answer *that* question, please stop lecturing me on
 the
 reasons not to use LOCK TABLES. Thanks.

 -Hank


 On Thu, Sep 22, 2011 at 10:19 AM, Antony T Curtis
 antonycur...@verizon.netwrote:

  Even for MyISAM tables, LOCK TABLES is not usually the best solution
 for
  increasing performance. When there is little to no contention, LOCK
 TABLES
  doesn't offer much value.
 
  MyISAM works best when you can get more work done in a statement:
 Instead
  of executing a bunch of insert statements, combine them into a
 single
  multi-row insert statement, as an example.
 
 
  On 22 Sep 2011, at 06:13, Hank wrote:
 
  Thanks for your reply.  I failed to mention that these are MYISAM
 tables,
  so no transactions.  And like I said, this is not a production box
 nor is
  there any application running, so there's no contention for the
 tables being
  locked.  I'm trying to update a database design on two tables with
 200
  million records each, so anything I can do to increase the
 performance of
  these long running queries will shorten the migration running time.
 
  What I was referring to was that in the documentation,  that when
 using
  LOCK TABLES, mysql does not update the key cache until the lock is
 released,
  versus when not using LOCK TABLES it does update the key cache on
 each
  insert/update/delete.
 
  see: http://tuxradar.com/practicalphp/18/2/22
 
  In my testing, I'm seeing a slow down when I use LOCK TABLES versus
 running
  the same queries without it.  I'm just trying to find a reason why
 that
  might be the case.
 
  -Hank
 
 
  On Thu, Sep 22, 2011 at 12:42 AM, Antony T Curtis 
  antonycur...@verizon.net wrote:
 
  LOCK TABLES...WRITE is very likely to reduce performance if you are
 using
  a transactional storage engine, such as InnoDB/XtraDB or PBXT. The
 reason is
  that only one connection is holding the write lock and no other
 concurrent
  operation may occur on the table.
 
  LOCK TABLES is only really useful for non-transactional tables and
 maybe a
  few specialized operations where it has its advantages but for
 99.9% of
  cases, it should not be used.
 
  What does increase performance is the proper use of transactions
 with
  appropriate use of SELECT...FOR UPDATE and SELECT...LOCK IN SHARE
 MODE.
 
  Regards,
 
  Antony.
 
 
 
  On 21 Sep 2011, at 20:34, Hank wrote:
 
   According to everything I've read, using LOCK TABLES...WRITE for
 updates,
  inserts and deletes should improve performance of mysql server,
 but I
  think
  I've been seeing the opposite effect

Re: Slower performance with LOCK TABLES

2011-09-22 Thread Hank
On Thu, Sep 22, 2011 at 3:11 PM, Hassan Schroeder 
hassan.schroe...@gmail.com wrote:

 On Thu, Sep 22, 2011 at 11:51 AM, Hank hes...@gmail.com wrote:
  Like I said, the problem is not just one particular SQL statement. It is
  several dozen statements operating on tables with several hundred million
  records.  The problem is that I am finding that when I use LOCK TABLES,
  these queries run slower (please read my ORIGINAL post with all this
  information).

 Wandering out of my area of expertise here :-) but have you done any
 key cache tuning or are you running with the defaults?

 mysql> show variables like 'key_%';

 Also, what is the exact LOCK TABLE statement you're using?


No, I haven't done any key cache tuning, as that's out of my area of
expertise as well! I have 8GB of memory on this box, and I can go up to
12GB.

Here are the variables:

| key_buffer_size  | 402653184 |
| key_cache_age_threshold  | 300   |
| key_cache_block_size | 1024  |
| key_cache_division_limit | 100   |

The lock statement is quite simple:

LOCK TABLE dest d write, source s read;

thanks.


Slower performance with LOCK TABLES

2011-09-21 Thread Hank
According to everything I've read, using LOCK TABLES...WRITE for updates,
inserts and deletes should improve performance of mysql server, but I think
I've been seeing the opposite effect.

I've been doing quite a bit of testing on a 64bit install of CentOS 5.5
installed as a guest OS on a VMWare ESXi 4.0 hypervisor on a Dell R610.
 There are no other VMs on this box, and there are no other users or threads
running on the OS. Just me.  I'm using this box strictly for testing of
large database migration scripts.

It seems like when I execute some of these long running statements without
locking the tables, the code runs quite a bit faster than when I do lock the
tables.  And before testing each run, I do restart the server so there is no
query caching and I also use FLUSH TABLES between each test run.

All I'm asking is this:  Can anyone think of a scenario, on a single
user-box and mysql instance, where locking tables would cause these DML
statements to slow down compared to not locking the tables?

Thanks,

-Hank


Re: myisamchk error (duplicate key records)

2011-09-19 Thread Hank
On Mon, Sep 19, 2011 at 7:19 AM, Johan De Meersman vegiv...@tuxera.bewrote:

 - Original Message -
  From: Hank hes...@gmail.com
 
  While running a -rq on a large table, I got the following error:
 
  myisamchk: warning: Duplicate key for record at   54381140 against
  record at   54380810
 
  How do I find which records are duplicated (without doing the typical
  self-join or HAVING count(*) > 1 query)?  This table has 144 million
  rows, so that's not really feasible.

 Given that the error is a duplicate *key*, select key from table group
 by key having count(key) > 1 is an index-covered query, and should thus
 be perfectly feasible :-)

 What I'm not so sure about is whether the duplicate key will show up
 correctly in the index - as that index may be marked corrupt - and so, if it
 falls back to a full tablescan, it's indeed going to take a long time. If it
 does, however, there's no other option anyway: the only way to do it fast is
 an index, and that index is untrustworthy.


Exactly - I can't create an index on the table until I remove the duplicate
records.  I suppose I could create a non-unique index on the key fields, and
try that, but then I'd have to (1) create that index, and then (2) do the
full table scan query.  Either way, it's going to take a tremendous amount
of time to do that.

Alternatively, it would be most helpful if the tools that find and
output the offending record block #s also provided a quick way to actually
print out those offending rows, so I could track down how they got in there
in the first place.

-Hank


Re: myisamchk error (duplicate key records)

2011-09-19 Thread Hank

  Exactly - I can't create an index on the table until I remove the
  duplicate records.

 I was under the impression you were seeing this during a myisamchk run -
 which indicates you should *already* have a key on that field. Or am I
 interpreting that wrong?


I'm trying to rebuild an index after disabling all keys using myisamchk and
adding all 144 million records, so there is no current index on the table.

 I suppose I could create a non-unique index on the key
  fields, and try that, but then I'd have to (1) create that index, and
 then (2) do
  the full table scan query.

 No, if you create the index, you will not do a full tablescan afterwards.
 That's the whole point of an index :-)


But in order to create the index, mysql has to do a full table scan AND a
sort, which for 144 million records is going to take a very long time.  So
a full table scan without an index (i.e. ...HAVING count(*) > 1)
will actually take less time.

So like I said, it would be intuitive and helpful if the tool finding the
dup records actually provided enough info to view, find, and fix them so I
don't have to index, sort, or table scan the table a second time.


Re: Quotes around INSERT and SELECT statements' arguments from the mysql CLI and PHP

2011-09-19 Thread Hank

 What ugly style - if the value is not numeric and you throw it at the
 database, you are one of the many with a SQL injection, because if invalid
 values make it that far you did no sanitizing before, and you are not
 doing it here.


It's a matter of opinion.  I never said the data wasn't sanitized (it is).
But sometimes calculated values or bugs in PHP code end up with a null
variable.  I was just pointing out the choice between two kinds of errors:
a syntax error, which will generate a hard failure of the query and likely
of whatever page runs it, or a soft logical error, which won't. In either
case, I have error trapping to catch both types of errors and alert me to
them. I prefer the errors to be logical ones and not syntax errors.


 $sql = "INSERT into table VALUES (" . (int)$id . ",'" .
mysql_real_escape_string($val) . "')";
 or using an abstraction layer (a simple self-written class):
 $sql = "INSERT into table VALUES (" . (int)$id . ",'" .
$db->escape_string($val) . "')";

I think what you posted is ugly style: it makes the actual SQL in the PHP
code much harder to read and debug.  The data validation should take place
elsewhere, long before it gets to constructing the SQL statement.


Re: Quotes around INSERT and SELECT statements' arguments from the mysql CLI and PHP

2011-09-19 Thread Hank
Best of both worlds:
 $username = $_POST['username'];
 // do some stuff with username here
 $M = array();  // Array of things to be inserted into MySQL
 $M['username'] = mysql_real_escape_string($username); // Everything that
 goes into $M is escaped
 $query = "INSERT INTO table (username) VALUES ('{$M['username']}')";


I'm not sure I see why, in particular, you are using an array here?


Re: Quotes around INSERT and SELECT statements' arguments from the mysql CLI and PHP

2011-09-19 Thread Hank


 I want to be sure that all variables in the query are escaped. I don't
 trust myself or anyone else to do this to every variable right before
 the query:
 $someVar=mysql_real_escape_string($someVar);


But you're doing exactly that right before the query anyway with:

$M['username'] = mysql_real_escape_string($username);

You're just complicating things with the addition of an unneeded array.  It
seems much simpler and less cluttered to just do:
  $someVar=mysql_real_escape_string($someVar);
before your insert.  All you are doing is changing $someVar to $M[...]
and then using $M[...] in the query.  I really don't see the difference or
benefit of using your array here.  Both methods are doing exactly the same
thing, except one is more convoluted.

Now on the other hand, if you have several elements in the array $M to be
inserted, and have a function like this to escape them all at once:

foreach ($M as &$val)  $val = mysql_real_escape_string($val);

then your method starts to make more sense.

-Hank


myisamchk error (duplicate key records)

2011-09-18 Thread Hank
While running a -rq on a large table, I got the following error:

myisamchk: warning: Duplicate key for record at   54381140 against
record at   54380810

How do I find which records are duplicated (without doing the typical
self-join or HAVING count(*) > 1 query)?  This table has 144 million rows, so
that's not really feasible.

myisamchk --block-search # looked promising, but I can't find any
documentation on how to use it properly.

I tried myisamchk -b 54381140 table-name but it really doesn't do
anything.

I posted this identical question here six years ago, and I have the same
problem again.  I still can't find ANY documentation on --block-search or
how to use it.  Has anything changed in six years?

Thanks.


Re: Quotes around INSERT and SELECT statements' arguments from the mysql CLI and PHP

2011-09-18 Thread Hank
On Sun, Sep 18, 2011 at 12:28 PM, Dotan Cohen dotanco...@gmail.com wrote:

 On Sun, Sep 18, 2011 at 17:44, Brandon Phelps bphe...@gls.com wrote:
  Personally I don't use any quotes for the numeric types, and single
 quotes
  for everything else.  I.e.:
 

 Thanks, Brandon. I understand then that quote type is a matter of
 taste. I always use double quotes in PHP and I've only recently
 started putting ticks around table and column names. I'll stick to
 your convention of no quotes around numerics and single quotes around
 everything else.


I agree with Brandon's suggestions; I would just add that when using numeric
types in PHP statements where you have a variable replacement, for instance:

$sql = "INSERT into table VALUES ('$id','$val')";

where $id is a numeric variable in PHP and a numeric field in the table,
I'll include the $id in single quotes in the PHP statement, so even if the
value of $id is null, alpha, or invalid (not numeric) it does not generate a
mysql syntax error. Otherwise, without the single quotes, the statement
would be:

INSERT into table VALUES (,'');

 which would cause a syntax error.  If you include the single quotes, it
becomes:

INSERT into table VALUES ('','')

which won't cause a syntax error, but might cause some logic errors in the
database.  The choice is yours.


Re: Facebook Trapped In MySQL a 'Fate Worse Than Death'

2011-07-12 Thread Hank
Given the choice between doing it right the first time, or having the second
largest site on the internet, I'll take the latter, and deal with the
problems of not doing it right the first time.


-Hank


On Tue, Jul 12, 2011 at 10:45 AM, Jerry Schwartz je...@gii.co.jp wrote:

 Let this be a lesson to all of those designers who say "That will never
 happen."




Re: Found serious replication data-corruption bug in 5.5.8/5.5.11 (auto-increment in primary key)

2011-06-16 Thread Hank
 Sveta Smirnova at MySQL just confirmed this bug in 5.5.13:
http://bugs.mysql.com/45670


On Wed, Jun 15, 2011 at 5:38 PM, Claudio Nanni claudio.na...@gmail.comwrote:

 No worries!

 I think I would have figured that out!

 I'll feedback you tomorrow.

 Thanks again

 Claudio

 2011/6/15 Hank hes...@gmail.com

 Oops... big typo in above steps... add the following line:

  replicate-ignore-table=db.log

 to the SLAVE my.cnf, and restart the SLAVE server.

 The master does not need to be restarted or changed. Just the SLAVE.

 Sorry about that.

 -Hank Eskin



 On Wed, Jun 15, 2011 at 5:19 PM, Claudio Nanni claudio.na...@gmail.com
 wrote:

  Great investigation Hank,
  congratulations.
 
 I will try this tomorrow morning (11:20pm now) and let you know if I can
  reproduce it on my environments.
 
  Thanks!
 
  Claudio
 
 
  2011/6/15 Hank hes...@gmail.com
 
  Two additional notes:
 
  1.  Using the replicate-wild-ignore-table option in my.cnf produces
 the
  same results.
 
  2.  If the my.cnf replicate-ignore-table=db.log setting on the master
   is
  removed and mysql restarted so db.log is no longer ignored in
  replication,
  this bug goes away and correct results are reported on the slave.
 
  -Hank Eskin
 
 
  On Wed, Jun 15, 2011 at 4:38 PM, Hank hes...@gmail.com wrote:
 
  
   This is a follow-up to my previous post.  I have been narrowing down
  what
   is causing this bug.  It is a timing issue of a replication ignored
  table
   with an auto-increment primary key values leaking over into a
  non-ignored
   table with inserts immediately after the ignore table has had rows
  inserted.
  
   Basically, data from the ignored table is corrupting a non-ignored
 table
  on
   the slave upon immediate inserts.
  
   Here is how to repeat:
  
   On a master issue:
  
   use db;
   drop table test;
   CREATE TABLE test (id int NOT NULL,   cnt int unsigned  NOT NULL
AUTO_INCREMENT,  PRIMARY KEY (id,cnt) ) ENGINE=MyISAM ;
   drop table log;
   CREATE TABLE log (id int unsigned NOT NULL AUTO_INCREMENT, log
  varchar(20),
PRIMARY KEY (id) ) ENGINE=MyISAM AUTO_INCREMENT = 4;
  
   make sure those two tables are created on the slave through regular
   replication.
  
   on slave
  
   desc test;
   desc log;
  
   Once replicated, on the master, add the following line to the
 [mysqld]
   section of my.cnf, and then restart mysql.
  
replicate-ignore-table=db.log
  
   Then on the master, issue the following statements as a copy/paste of
 all
  of
   them at once.
   It's critical that the statements are executed in immediate
 succession
  (no
   delays)
  
    insert into log values (null,'info1');
    insert into log values (null,'info2');
    insert into log values (null,'info3');
    insert into log values (null,'info4');
    insert into test values (1,null);
    insert into log values (null,'info5');
    insert into test values (1,null);
    insert into log values (null,'info6');
    insert into test values (2,null);
    insert into log values (null,'info7');
    insert into test values (2,null);
    insert into log values (null,'info8');
  
   Here are the results from the master (all correct):
  
    master> select * from log;
   +---+---+
   | id| log   |
   +---+---+
   | 4 | info1 |
   | 5 | info2 |
   | 6 | info3 |
   | 7 | info4 |
   | 8 | info5 |
   | 9 | info6 |
   | 44450 | info7 |
   | 44451 | info8 |
   +---+---+
    master> select * from test;
   ++-+
   | id | cnt |
   ++-+
   |  1 |   1 |
   |  1 |   2 |
   |  2 |   1 |
   |  2 |   2 |
   ++-+
   Here are the results from the slave:
  
    slave> select * from log;
  
   Empty set (0.00 sec)  --- as expected, since it is ignored
  
    slave> select * from test;
   ++---+
   | id | cnt   |
   ++---+
   |  1 | 7 |   -- should be 1, but has values from log on the
  master
   |  1 | 8 |   -- should be 2
   |  2 | 9 |   -- should be 1
   |  2 | 44450 |   -- should be 2
   ++---+
  
    If there is the slightest delay between the inserts into "log" and
    "test", the replication happens correctly.
  
   Thoughts?
  
   -Hank Eskin
  
 
 
 
 
  --
  Claudio
 




 --
 Claudio



Found serious replication data-corruption bug in 5.5.8/5.5.11 (auto-increment in primary key)

2011-06-15 Thread Hank
This is a follow-up to my previous post.  I have been narrowing down what is
causing this bug.  It is a timing issue in which auto-increment primary key
values from a replication-ignored table leak over into a non-ignored table
when inserts follow immediately after rows have been inserted into the
ignored table.

Basically, data from the ignored table is corrupting a non-ignored table on
the slave upon immediate inserts.

Here is how to repeat:

On a master issue:

use db;
drop table test;
CREATE TABLE test (id int NOT NULL,   cnt int unsigned  NOT NULL
 AUTO_INCREMENT,  PRIMARY KEY (id,cnt) ) ENGINE=MyISAM ;
drop table log;
CREATE TABLE log (id int unsigned NOT NULL AUTO_INCREMENT, log varchar(20),
 PRIMARY KEY (id) ) ENGINE=MyISAM AUTO_INCREMENT = 4;

make sure those two tables are created on the slave through regular
replication.

on slave

desc test;
desc log;

Once replicated, on the master, add the following line to the [mysqld]
section of my.cnf, and then restart mysql.

 replicate-ignore-table=db.log

Then on the master, issue the following statements as a copy/paste of all of
them at once.  It's critical that the statements are executed in immediate
succession (no delays):

insert into log values (null,'info1');
insert into log values (null,'info2');
insert into log values (null,'info3');
insert into log values (null,'info4');
insert into test values (1,null);
insert into log values (null,'info5');
insert into test values (1,null);
insert into log values (null,'info6');
insert into test values (2,null);
insert into log values (null,'info7');
insert into test values (2,null);
insert into log values (null,'info8');

Here are the results from the master (all correct):

master> select * from log;
+-------+-------+
| id    | log   |
+-------+-------+
|     4 | info1 |
|     5 | info2 |
|     6 | info3 |
|     7 | info4 |
|     8 | info5 |
|     9 | info6 |
| 44450 | info7 |
| 44451 | info8 |
+-------+-------+
master> select * from test;
+----+-----+
| id | cnt |
+----+-----+
|  1 |   1 |
|  1 |   2 |
|  2 |   1 |
|  2 |   2 |
+----+-----+
Here are the results from the slave:

slave> select * from log;

Empty set (0.00 sec)   <-- as expected, since it is ignored

slave> select * from test;
+----+-------+
| id | cnt   |
+----+-------+
|  1 |     7 |   <-- should be 1, but has values from log on the master
|  1 |     8 |   <-- should be 2
|  2 |     9 |   <-- should be 1
|  2 | 44450 |   <-- should be 2
+----+-------+

If there is the slightest delay between the inserts into "log" and "test",
the replication happens correctly.
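
One way to see exactly what statement the slave received is to list the
relay log events on the slave (a sketch; SHOW RELAYLOG EVENTS exists as
of MySQL 5.5, and the file name below is a placeholder -- take the real
one from SHOW SLAVE STATUS):

slave> SHOW RELAYLOG EVENTS IN 'slave-relay-bin.000002' LIMIT 20;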

Thoughts?

-Hank Eskin


Re: Found serious replication data-corruption bug in 5.5.8/5.5.11 (auto-increment in primary key)

2011-06-15 Thread Hank
Two additional notes:

1.  Using the replicate-wild-ignore-table option in my.cnf produces the
same results.

2.  If the my.cnf replicate-ignore-table=db.log setting on the master  is
removed and mysql restarted so db.log is no longer ignored in replication,
this bug goes away and correct results are reported on the slave.

-Hank Eskin


On Wed, Jun 15, 2011 at 4:38 PM, Hank hes...@gmail.com wrote:


 This is a follow-up to my previous post.  I have been narrowing down what
 is causing this bug.  It is a timing issue in which auto-increment primary
 key values from a replication-ignored table leak over into a non-ignored
 table when inserts follow immediately after rows have been inserted into
 the ignored table.

 Basically, data from the ignored table is corrupting a non-ignored table on
 the slave upon immediate inserts.

 Here is how to repeat:

 On a master issue:

 use db;
 drop table test;
 CREATE TABLE test (id int NOT NULL,   cnt int unsigned  NOT NULL
  AUTO_INCREMENT,  PRIMARY KEY (id,cnt) ) ENGINE=MyISAM ;
 drop table log;
 CREATE TABLE log (id int unsigned NOT NULL AUTO_INCREMENT, log varchar(20),
  PRIMARY KEY (id) ) ENGINE=MyISAM AUTO_INCREMENT = 4;

 make sure those two tables are created on the slave through regular
 replication.

 on slave

 desc test;
 desc log;

 Once replicated, on the master, add the following line to the [mysqld]
 section of my.cnf, and then restart mysql.

  replicate-ignore-table=db.log

 Then on the master, issue the following statements as a copy/paste of all
 of them at once.  It's critical that the statements are executed in
 immediate succession (no delays):

 insert into log values (null,'info1');
 insert into log values (null,'info2');
 insert into log values (null,'info3');
 insert into log values (null,'info4');
 insert into test values (1,null);
 insert into log values (null,'info5');
 insert into test values (1,null);
 insert into log values (null,'info6');
 insert into test values (2,null);
 insert into log values (null,'info7');
 insert into test values (2,null);
 insert into log values (null,'info8');

 Here are the results from the master (all correct):

 master> select * from log;
 +-------+-------+
 | id    | log   |
 +-------+-------+
 |     4 | info1 |
 |     5 | info2 |
 |     6 | info3 |
 |     7 | info4 |
 |     8 | info5 |
 |     9 | info6 |
 | 44450 | info7 |
 | 44451 | info8 |
 +-------+-------+
 master> select * from test;
 +----+-----+
 | id | cnt |
 +----+-----+
 |  1 |   1 |
 |  1 |   2 |
 |  2 |   1 |
 |  2 |   2 |
 +----+-----+
 Here are the results from the slave:

 slave> select * from log;

 Empty set (0.00 sec)   <-- as expected, since it is ignored

 slave> select * from test;
 +----+-------+
 | id | cnt   |
 +----+-------+
 |  1 |     7 |   <-- should be 1, but has values from log on the master
 |  1 |     8 |   <-- should be 2
 |  2 |     9 |   <-- should be 1
 |  2 | 44450 |   <-- should be 2
 +----+-------+

 If there is the slightest delay between the inserts into "log" and "test",
 the replication happens correctly.

 Thoughts?

 -Hank Eskin



Re: Found serious replication data-corruption bug in 5.5.8/5.5.11 (auto-increment in primary key)

2011-06-15 Thread Hank
Oops... big typo in above steps... add the following line:

  replicate-ignore-table=db.log

to the SLAVE my.cnf, and restart the SLAVE server.

The master does not need to be restarted or changed. Just the SLAVE.

Sorry about that.

-Hank Eskin



On Wed, Jun 15, 2011 at 5:19 PM, Claudio Nanni claudio.na...@gmail.com wrote:

 Great investigation Hank,
 congratulations.

 I will try this tomorrow morning (11:20pm now) and let you know if I can
 reproduce it on my environments.

 Thanks!

 Claudio


 2011/6/15 Hank hes...@gmail.com

 Two additional notes:

 1.  Using the replicate-wild-ignore-table option in my.cnf produces the
 same results.

 2.  If the my.cnf replicate-ignore-table=db.log setting on the master
  is
 removed and mysql restarted so db.log is no longer ignored in
 replication,
 this bug goes away and correct results are reported on the slave.

 -Hank Eskin


 On Wed, Jun 15, 2011 at 4:38 PM, Hank hes...@gmail.com wrote:

 
   This is a follow-up to my previous post.  I have been narrowing down
   what is causing this bug.  It is a timing issue in which auto-increment
   primary key values from a replication-ignored table leak over into a
   non-ignored table when inserts follow immediately after rows have been
   inserted into the ignored table.
  
   Basically, data from the ignored table is corrupting a non-ignored
   table on the slave upon immediate inserts.
 
  Here is how to repeat:
 
  On a master issue:
 
  use db;
  drop table test;
  CREATE TABLE test (id int NOT NULL,   cnt int unsigned  NOT NULL
   AUTO_INCREMENT,  PRIMARY KEY (id,cnt) ) ENGINE=MyISAM ;
  drop table log;
  CREATE TABLE log (id int unsigned NOT NULL AUTO_INCREMENT, log
 varchar(20),
   PRIMARY KEY (id) ) ENGINE=MyISAM AUTO_INCREMENT = 4;
 
  make sure those two tables are created on the slave through regular
  replication.
 
  on slave
 
  desc test;
  desc log;
 
  Once replicated, on the master, add the following line to the [mysqld]
  section of my.cnf, and then restart mysql.
 
   replicate-ignore-table=db.log
 
   Then on the master, issue the following statements as a copy/paste of
   all of them at once.  It's critical that the statements are executed
   in immediate succession (no delays):
 
   insert into log values (null,'info1');
   insert into log values (null,'info2');
   insert into log values (null,'info3');
   insert into log values (null,'info4');
   insert into test values (1,null);
   insert into log values (null,'info5');
   insert into test values (1,null);
   insert into log values (null,'info6');
   insert into test values (2,null);
   insert into log values (null,'info7');
   insert into test values (2,null);
   insert into log values (null,'info8');
 
  Here are the results from the master (all correct):
 
   master> select * from log;
   +-------+-------+
   | id    | log   |
   +-------+-------+
   |     4 | info1 |
   |     5 | info2 |
   |     6 | info3 |
   |     7 | info4 |
   |     8 | info5 |
   |     9 | info6 |
   | 44450 | info7 |
   | 44451 | info8 |
   +-------+-------+
   master> select * from test;
   +----+-----+
   | id | cnt |
   +----+-----+
   |  1 |   1 |
   |  1 |   2 |
   |  2 |   1 |
   |  2 |   2 |
   +----+-----+
  Here are the results from the slave:
 
   slave> select * from log;
  
   Empty set (0.00 sec)   <-- as expected, since it is ignored
 
   slave> select * from test;
   +----+-------+
   | id | cnt   |
   +----+-------+
   |  1 |     7 |   <-- should be 1, but has values from log on the master
   |  1 |     8 |   <-- should be 2
   |  2 |     9 |   <-- should be 1
   |  2 | 44450 |   <-- should be 2
   +----+-------+
 
   If there is the slightest delay between the inserts into "log" and
   "test", the replication happens correctly.
 
  Thoughts?
 
  -Hank Eskin
 




 --
 Claudio



Re: Found a possible replication bug in 5.5.8/5.5.11 (auto-increment in primary key)

2011-06-14 Thread Hank
That is the slave relay log dump I posted (and mis-labeled). Thanks.

-Hank

On Tue, Jun 14, 2011 at 2:34 AM, Claudio Nanni claudio.na...@gmail.com wrote:

 You should also have a look at the slave relay log.

 But in any case sounds like a bug.

 Claudio
 On Jun 14, 2011 5:18 AM, Hank hes...@gmail.com wrote:
  Both my master and slave bin logs look OK (I think)..
 
  master bin log:
 
  /*!40019 SET @@session.max_insert_delayed_threads=0*/;
  /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
  DELIMITER /*!*/;
  SET TIMESTAMP=1308012505/*!*/;
  SET @@session.pseudo_thread_id=9/*!*/;
  SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0,
  @@session.unique_checks=1, @@session.autocommit=1/*!*/;
  SET @@session.sql_mode=0/*!*/;
  SET @@session.auto_increment_increment=1,
  @@session.auto_increment_offset=1/*!*/;
  /*!\C latin1 *//*!*/;
  SET
 
 @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=8/*!*/;
  SET @@session.lc_time_names=0/*!*/;
  SET @@session.collation_database=DEFAULT/*!*/;
  BEGIN
  /*!*/;
  use test/*!*/;
  SET TIMESTAMP=1308012505/*!*/;
  insert into test values (1,null)
  /*!*/;
  SET TIMESTAMP=1308012505/*!*/;
  COMMIT
  /*!*/;
 
 
  slave relay log:
 
  /*!40019 SET @@session.max_insert_delayed_threads=0*/;
  /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
  DELIMITER /*!*/;
  SET TIMESTAMP=1308012505/*!*/;
  SET @@session.pseudo_thread_id=9/*!*/;
  SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0,
  @@session.unique_checks=1, @@session.autocommit=1/*!*/;
  SET @@session.sql_mode=0/*!*/;
  SET @@session.auto_increment_increment=1,
  @@session.auto_increment_offset=1/*!*/;
  /*!\C latin1 *//*!*/;
  SET
 
 @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=8/*!*/;
  SET @@session.lc_time_names=0/*!*/;
  SET @@session.collation_database=DEFAULT/*!*/;
  BEGIN
  /*!*/;
  use test/*!*/;
  SET TIMESTAMP=1308012505/*!*/;
  insert into test values (1,null)
  /*!*/;
  SET TIMESTAMP=1308012505/*!*/;
  COMMIT
  /*!*/;
 
 
  -Hank
 
 
  On Mon, Jun 13, 2011 at 10:38 PM, Hank hes...@gmail.com wrote:
 
 
  Yes, it's basic out-of-the-box mysql replication.
 
  This appears to be an instance of this bug:
  http://bugs.mysql.com/bug.php?id=45670
 
  But that bug report was closed two years ago. I have no idea if it's the
  server sending bad data or the slaves. I think it's the slaves, because
 on
  the slave error, it clearly is getting this statement: insert into test
  values (1,null) to replicate, but when it is executed, the null is
  converted into a random number. But it's happening on all of my slaves,
 a
  mix of 32 and 64 bit 5.5.8 and 5.5.11 boxes.
  http://bugs.mysql.com/bug.php?id=45670
  -Hank
 
 
 
  On Mon, Jun 13, 2011 at 10:33 PM, Claudio Nanni 
  claudio.na...@gmail.com wrote:
 
  Hank,
 
  I can't reproduce it right now,
  But it really seems a bug.
  Just a shot in the dark, Are you sure you have statement based and not
  mixed replication?
  I don't even know if that would affect , just an idea.
 
  Claudio
  On Jun 14, 2011 3:07 AM, Hank hes...@gmail.com wrote:
   Hello All,
  
   I have a 64bit, 5.5.8 master, and this bug appears on both 5.5.11 and
  5.5.8
   32 and 64-bit slaves (statement based replication).
  
    I'm finding an auto-increment field (part of a compound primary key)
    updates correctly using null to insert the next value on the master...
    but when this statement is replicated on the slaves, instead of
    inserting the next value of the auto-increment field, it inserts 65535
    for 'smallint' definitions of 'cnt' and seemingly high random numbers
    around 469422 for definitions of 'int' or 'bigint'.
  
   Easy to repeat:
  
    master: CREATE TABLE test ( id int NOT NULL, cnt int unsigned NOT NULL
    AUTO_INCREMENT, PRIMARY KEY (id,cnt) ) ENGINE=MyISAM ;
    master: desc test;
    +-------+---------+------+-----+---------+----------------+
    | Field | Type    | Null | Key | Default | Extra          |
    +-------+---------+------+-----+---------+----------------+
    | id    | int(11) | NO   | PRI | NULL    |                |
    | cnt   | int(11) | NO   | PRI | NULL    | auto_increment |
    +-------+---------+------+-----+---------+----------------+
  
    master: insert into test values (1,null);
    master: select * from test;
    +----+-----+
    | id | cnt |
    +----+-----+
    |  1 |   1 |   <-- looks good!
    +----+-----+
  
    slave: desc test;
    +-------+---------+------+-----+---------+----------------+
    | Field | Type    | Null | Key | Default | Extra          |
    +-------+---------+------+-----+---------+----------------+
    | id    | int(11) | NO   | PRI | NULL    |                |
    | cnt   | int(11) | NO   | PRI | NULL    | auto_increment |
    +-------+---------+------+-----+---------+----------------+
  
    slave: select * from test;
    +----+--------+
    | id | cnt    |
    +----+--------+
    |  1 | 469422 |   <-- should be 1

Re: Found a possible replication bug in 5.5.8/5.5.11 (auto-increment in primary key)

2011-06-14 Thread Hank
The slave is receiving "null" in the statement-based insert, not an
out-of-range number from the master.

I've been doing more research all day on this bug and have a bit more
information as to what's causing it.  I plan to write it up tomorrow and
post it.

Basically, everything works perfectly until I add a
replicate-ignore-table=xxx statement in my.cnf, where xxx is a
different table with a unique id INT auto-increment as the single primary
key.  And then the values being inserted into the test table (above, not
ignored) represent the last-insert-id of the replication *ignored* table on
the slave.

Yeah, pretty strange, I know.  But totally repeatable.

-Hank


2011/6/14 Halász Sándor h...@tbbs.net

  2011/06/13 22:38 -0400, Hank 
 But that bug report was closed two years ago.  I have no idea if it's the
 server sending bad data or the slaves. I think it's the slaves, because on
 the slave error, it clearly is getting this statement:  insert into test
 values (1,null) to replicate, but when it is executed, the null is
 converted into a random number.  But it's happening on all of my slaves, a
 mix of 32 and 64 bit 5.5.8 and 5.5.11 boxes.
 

 If the master were sending random big numbers, and replication on the slave
 in the usual way handled out-of-bound numbers when not allowed to fail, then
 65535 would be an expected value for an unsigned 16-bit number. Of course, if
 this were true, the slave would be getting not that statement but "insert
 into test values (1,469422)".


 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:http://lists.mysql.com/mysql?unsub=hes...@gmail.com




Found a possible replication bug in 5.5.8/5.5.11 (auto-increment in primary key)

2011-06-13 Thread Hank
 these results?

-Hank


Re: Found a possible replication bug in 5.5.8/5.5.11 (auto-increment in primary key)

2011-06-13 Thread Hank
Yes, it's basic out-of-the-box mysql replication.

This appears to be an instance of this bug:
http://bugs.mysql.com/bug.php?id=45670

But that bug report was closed two years ago.  I have no idea if it's the
server sending bad data or the slaves. I think it's the slaves, because on
the slave error, it clearly is getting this statement:  insert into test
values (1,null) to replicate, but when it is executed, the null is
converted into a random number.  But it's happening on all of my slaves, a
mix of 32 and 64 bit 5.5.8 and 5.5.11 boxes.
http://bugs.mysql.com/bug.php?id=45670
-Hank


On Mon, Jun 13, 2011 at 10:33 PM, Claudio Nanni claudio.na...@gmail.com wrote:

 Hank,

 I can't reproduce it right now,
 But it really seems a bug.
 Just a shot in the dark, Are you sure you have statement based and not
 mixed replication?
 I don't even know if that would affect , just an idea.

 Claudio
  On Jun 14, 2011 3:07 AM, Hank hes...@gmail.com wrote:
  Hello All,
 
  I have a 64bit, 5.5.8 master, and this bug appears on both 5.5.11 and
 5.5.8
  32 and 64-bit slaves (statement based replication).
 
  I'm finding an auto-increment field (part of a compound primary key)
  updates correctly using null to insert the next value on the master...
  but when this statement is replicated on the slaves, instead of inserting
  the next value of the auto-increment field, it inserts 65535 for
  'smallint' definitions of 'cnt' and seemingly high random numbers around
  469422 for definitions of 'int' or 'bigint'.
 
  Easy to repeat:
 
  master: CREATE TABLE test ( id int NOT NULL, cnt int unsigned NOT NULL
  AUTO_INCREMENT, PRIMARY KEY (id,cnt) ) ENGINE=MyISAM ;
  master: desc test;
  +-------+---------+------+-----+---------+----------------+
  | Field | Type    | Null | Key | Default | Extra          |
  +-------+---------+------+-----+---------+----------------+
  | id    | int(11) | NO   | PRI | NULL    |                |
  | cnt   | int(11) | NO   | PRI | NULL    | auto_increment |
  +-------+---------+------+-----+---------+----------------+
 
  master: insert into test values (1,null);
  master: select * from test;
  +----+-----+
  | id | cnt |
  +----+-----+
  |  1 |   1 |   <-- looks good!
  +----+-----+
 
  slave: desc test;
  +-------+---------+------+-----+---------+----------------+
  | Field | Type    | Null | Key | Default | Extra          |
  +-------+---------+------+-----+---------+----------------+
  | id    | int(11) | NO   | PRI | NULL    |                |
  | cnt   | int(11) | NO   | PRI | NULL    | auto_increment |
  +-------+---------+------+-----+---------+----------------+
 
  slave: select * from test;
  +----+--------+
  | id | cnt    |
  +----+--------+
  |  1 | 469422 |   <-- should be 1
  +----+--------+
 
  But the problem continues...
 
  master: insert into test values (1,null);
  master: select * from test;
  +----+-----+
  | id | cnt |
  +----+-----+
  |  1 |   1 |   <-- correct !
  |  1 |   2 |   <-- correct !
  +----+-----+
 
  slave> select * from test;
  +----+--------+
  | id | cnt    |
  +----+--------+
  |  1 | 469422 |   <-- should be 1
  |  1 | 470673 |   <-- should be 2
  +----+--------+
 
  Now if I repeat the entire scenario using smallint for the 'cnt' field,
  here are the results:
 
  master> CREATE TABLE test (id int NOT NULL, cnt smallint unsigned NOT NULL
  AUTO_INCREMENT, PRIMARY KEY (id,cnt) ) ENGINE=MyISAM ;
  master> desc test;
  +-------+----------------------+------+-----+---------+----------------+
  | Field | Type                 | Null | Key | Default | Extra          |
  +-------+----------------------+------+-----+---------+----------------+
  | id    | int(11)              | NO   | PRI | NULL    |                |
  | cnt   | smallint(5) unsigned | NO   | PRI | NULL    | auto_increment |
  +-------+----------------------+------+-----+---------+----------------+
  master: insert into test values (1,null);
  master: select * from test;
  +----+-----+
  | id | cnt |
  +----+-----+
  |  1 |   1 |   <-- correct!
  +----+-----+
 
  slave> select * from test;
  +----+-------+
  | id | cnt   |
  +----+-------+
  |  1 | 65535 |   <-- should be 1
  +----+-------+
 
  but this is different:
 
  master: insert into test values (1,null);
  master: select * from test;
  +----+-----+
  | id | cnt |
  +----+-----+
  |  1 |   1 |   <-- correct!
  |  1 |   2 |   <-- correct!
  +----+-----+
 
  slave> select * from test;
  +----+-------+
  | id | cnt   |
  +----+-------+
  |  1 | 65535 |   <-- should be 1, missing second record, too
  +----+-------+
  slave> show slave status;
 
   Error 'Duplicate entry '1-65535' for key 'PRIMARY'' on query.  Default
  database: 'test'. Query: 'insert into test values (1,null)'
 
  .. at which point I have to restart the slave due to the error:
 
  slave> SET GLOBAL sql_slave_skip_counter=1; slave start;
  slave> select * from test;
  +----+-------+
  | id | cnt   |
  +----+-------+
  |  1 | 65535 |   <-- should be 1, still missing second record, too (of course)
  +----+-------+
 
 
  Now if I manually replicate the statements just on the slave - it works
  perfectly:
 
  slave: truncate table test;
  slave: insert into test values (1,null

Re: Found a possible replication bug in 5.5.8/5.5.11 (auto-increment in primary key)

2011-06-13 Thread Hank
Both my master and slave bin logs look OK (I think)..

master bin log:

/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
SET TIMESTAMP=1308012505/*!*/;
SET @@session.pseudo_thread_id=9/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0,
@@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=0/*!*/;
SET @@session.auto_increment_increment=1,
@@session.auto_increment_offset=1/*!*/;
/*!\C latin1 *//*!*/;
SET
@@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
BEGIN
/*!*/;
use test/*!*/;
SET TIMESTAMP=1308012505/*!*/;
insert into test values (1,null)
/*!*/;
SET TIMESTAMP=1308012505/*!*/;
COMMIT
/*!*/;


slave bin log:

/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
SET TIMESTAMP=1308012505/*!*/;
SET @@session.pseudo_thread_id=9/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0,
@@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=0/*!*/;
SET @@session.auto_increment_increment=1,
@@session.auto_increment_offset=1/*!*/;
/*!\C latin1 *//*!*/;
SET
@@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
BEGIN
/*!*/;
use test/*!*/;
SET TIMESTAMP=1308012505/*!*/;
insert into test values (1,null)
/*!*/;
SET TIMESTAMP=1308012505/*!*/;
COMMIT
/*!*/;


-Hank


On Mon, Jun 13, 2011 at 10:38 PM, Hank hes...@gmail.com wrote:


 Yes, it's basic out-of-the-box mysql replication.

 This appears to be an instance of this bug:
 http://bugs.mysql.com/bug.php?id=45670

 But that bug report was closed two years ago.  I have no idea if it's the
 server sending bad data or the slaves. I think it's the slaves, because on
 the slave error, it clearly is getting this statement:  insert into test
 values (1,null) to replicate, but when it is executed, the null is
 converted into a random number.  But it's happening on all of my slaves, a
 mix of 32 and 64 bit 5.5.8 and 5.5.11 boxes.
  http://bugs.mysql.com/bug.php?id=45670
 -Hank



 On Mon, Jun 13, 2011 at 10:33 PM, Claudio Nanni 
  claudio.na...@gmail.com wrote:

 Hank,

 I can't reproduce it right now,
 But it really seems a bug.
 Just a shot in the dark, Are you sure you have statement based and not
 mixed replication?
 I don't even know if that would affect , just an idea.

 Claudio
  On Jun 14, 2011 3:07 AM, Hank hes...@gmail.com wrote:
  Hello All,
 
  I have a 64bit, 5.5.8 master, and this bug appears on both 5.5.11 and
 5.5.8
  32 and 64-bit slaves (statement based replication).
 
   I'm finding an auto-increment field (part of a compound primary key)
   updates correctly using null to insert the next value on the master...
   but when this statement is replicated on the slaves, instead of
   inserting the next value of the auto-increment field, it inserts 65535
   for 'smallint' definitions of 'cnt' and seemingly high random numbers
   around 469422 for definitions of 'int' or 'bigint'.
 
  Easy to repeat:
 
  master: CREATE TABLE test ( id int NOT NULL, cnt int unsigned NOT NULL
  AUTO_INCREMENT, PRIMARY KEY (id,cnt) ) ENGINE=MyISAM ;
   master: desc test;
   +-------+---------+------+-----+---------+----------------+
   | Field | Type    | Null | Key | Default | Extra          |
   +-------+---------+------+-----+---------+----------------+
   | id    | int(11) | NO   | PRI | NULL    |                |
   | cnt   | int(11) | NO   | PRI | NULL    | auto_increment |
   +-------+---------+------+-----+---------+----------------+
 
   master: insert into test values (1,null);
   master: select * from test;
   +----+-----+
   | id | cnt |
   +----+-----+
   |  1 |   1 |   <-- looks good!
   +----+-----+
 
   slave: desc test;
   +-------+---------+------+-----+---------+----------------+
   | Field | Type    | Null | Key | Default | Extra          |
   +-------+---------+------+-----+---------+----------------+
   | id    | int(11) | NO   | PRI | NULL    |                |
   | cnt   | int(11) | NO   | PRI | NULL    | auto_increment |
   +-------+---------+------+-----+---------+----------------+
 
   slave: select * from test;
   +----+--------+
   | id | cnt    |
   +----+--------+
   |  1 | 469422 |   <-- should be 1
   +----+--------+
 
  But the problem continues...
 
   master: insert into test values (1,null);
   master: select * from test;
   +----+-----+
   | id | cnt |
   +----+-----+
   |  1 |   1 |   <-- correct !
   |  1 |   2 |   <-- correct !
   +----+-----+
 
   slave> select * from test;
   +----+--------+
   | id | cnt    |
   +----+--------+
   |  1 | 469422 |   <-- should be 1
   |  1 | 470673 |   <-- should be 2
   +----+--------+
 
   Now if I repeat the entire scenario using smallint for the 'cnt' field,
   here are the results:
 
  master CREATE TABLE test (id int

Using @ variables with LIKE,CONCAT

2011-05-11 Thread Hank
This used to work fine in Mysql 4.3, but no longer works in 5.5.8:

set @txt='needle';
select * from table where field  like CONCAT('%',@txt,'%');

--returns the null set.  If I substitute like this:

select * from table where field  like '%needle%';

it works perfectly (and as it did in 4.x).

How can I get this to work in 5.5.x?
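
A couple of hedged things to try (assuming the regression is a type or
collation mismatch on the user variable, which is only a guess):

-- Confirm what the expression actually evaluates to:
select CONCAT('%',@txt,'%');

-- Force the variable to a plain string before the comparison:
select * from table where field like CONCAT('%', CAST(@txt AS CHAR), '%');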

Thanks,

-Hank


Highest version of MySQL available for 2.4 kernel (RH/Cent OS 3.5)?

2011-04-06 Thread Hank
What is the highest version of MySQL available for a 2.4 kernel (Redhat/Cent
OS 3.5)?

And where can I find it to download?

Thanks,

-Hank


Moving data between two servers with SQL

2011-01-04 Thread Hank
Hello,

   I have a background process that runs as a combination of PHP & MySQL.  The
end results are records in a table on server #1 (but not the entire table,
just a small subset of the table needs to move).  What's the
easiest/cleanest way of moving those records to an identical table on
another server?  In Oracle, we used to be able to set up connection profiles
and move data between servers with SQL, but I'm guessing that's not easy to
do with MySQL.  I'd prefer not to use mysql command line client commands to
save the data as an OS file and then import that into the other server using
another mysql command line client command. I'd like to find something
cleaner than that.

I'm using 5.5.8.

thanks,

-Hank


Re: Moving data between two servers with SQL

2011-01-04 Thread Hank
Wow, that might just work!   I've seen Federated tables mentioned before,
but I never knew that's what they are for... thanks.

 Can I have a host (remote) table on a MySQL 4.1.x server, and the federated
table on a 5.5.8 server?

-Hank


On Tue, Jan 4, 2011 at 4:15 PM, Shawn Green (MySQL) 
shawn.l.gr...@oracle.com wrote:

 On 1/4/2011 15:53, Hank wrote:

 Hello,

    I have a background process that runs as a combination of PHP & MySQL.
  The
 end results are records in a table on server #1 (but not the entire table,
 just a small subset of the table needs to move).  What's the
 easiest/cleanest way of moving those records to an identical table on
 another server?  In Oracle, we used to be able to set up connection
 profiles
 and move data between servers with SQL, but I'm guessing that's not easy
 to
 do with MySQL.  I'd prefer not to use mysql command line client commands
 to
 save the data as an OS file and then import that into the other server
 using
 another mysql command line client command. I'd like to find something
 cleaner than that.

 I'm using 5.5.8.

 thanks,

 -Hank

  Have you looked at the FEDERATED storage engine?
 http://dev.mysql.com/doc/refman/5.5/en/federated-storage-engine.html

 --
 Shawn Green
 MySQL Principal Technical Support Engineer
 Oracle USA, Inc.
 Office: Blountville, TN



Re: Moving data between two servers with SQL

2011-01-04 Thread Hank
Also, can I do this:

insert into federated_table select * from local_table?
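
For reference, a minimal sketch of the setup under discussion (the column
list, host, port, credentials, and the remote table name are placeholder
assumptions; the local definition must match the remote table exactly):

-- On the 5.5.8 server, pointing at the identical table on the other server:
CREATE TABLE federated_table (
  id INT NOT NULL,
  payload VARCHAR(255),
  PRIMARY KEY (id)
) ENGINE=FEDERATED
CONNECTION='mysql://fed_user:fed_pass@remote-host:3306/db/remote_table';

-- After which rows move between servers in plain SQL:
insert into federated_table select * from local_table;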

-Hank


On Tue, Jan 4, 2011 at 4:15 PM, Shawn Green (MySQL) 
shawn.l.gr...@oracle.com wrote:

 On 1/4/2011 15:53, Hank wrote:

 Hello,

    I have a background process that runs as a combination of PHP & MySQL.
  The
 end results are records in a table on server #1 (but not the entire table,
 just a small subset of the table needs to move).  What's the
 easiest/cleanest way of moving those records to an identical table on
 another server?  In Oracle, we used to be able to set up connection
 profiles
 and move data between servers with SQL, but I'm guessing that's not easy
 to
 do with MySQL.  I'd prefer not to use mysql command line client commands
 to
 save the data as an OS file and then import that into the other server
 using
 another mysql command line client command. I'd like to find something
 cleaner than that.

 I'm using 5.5.8.

 thanks,

 -Hank

  Have you looked at the FEDERATED storage engine?
 http://dev.mysql.com/doc/refman/5.5/en/federated-storage-engine.html

 --
 Shawn Green
 MySQL Principal Technical Support Engineer
 Oracle USA, Inc.
 Office: Blountville, TN



Re: Moving data between two servers with SQL

2011-01-04 Thread Hank
Nevermind -- it's working absolutely perfectly between 5.5.8 and 4.1.x.
 Thanks again for the push.

-Hank


On Tue, Jan 4, 2011 at 5:14 PM, Hank hes...@gmail.com wrote:


 Also, can I do this:

 insert into federated_table select * from local_table?

 -Hank


 On Tue, Jan 4, 2011 at 4:15 PM, Shawn Green (MySQL) 
 shawn.l.gr...@oracle.com wrote:

 On 1/4/2011 15:53, Hank wrote:

 Hello,

    I have a background process that runs as a combination of PHP & MySQL.
  The
 end results are records in a table on server #1 (but not the entire
 table,
 just a small subset of the table needs to move).  What's the
 easiest/cleanest way of moving those records to an identical table on
 another server?  In Oracle, we used to be able to set up connection
 profiles
 and move data between servers with SQL, but I'm guessing that's not easy
 to
 do with MySQL.  I'd prefer not to use mysql command line client commands
 to
 save the data as an OS file and then import that into the other server
 using
 another mysql command line client command. I'd like to find something
 cleaner than that.

 I'm using 5.5.8.

 thanks,

 -Hank

  Have you looked at the FEDERATED storage engine?
 http://dev.mysql.com/doc/refman/5.5/en/federated-storage-engine.html

 --
 Shawn Green
 MySQL Principal Technical Support Engineer
 Oracle USA, Inc.
 Office: Blountville, TN





Re: odd problem with select as statement

2010-12-20 Thread Hank
Here's my 5 second guess..

4E5664736... is being interpreted as a number in scientific notation,
i.e. 4*10^5664736, and the parser doesn't like that as a field name.

-Hank


On Mon, Dec 20, 2010 at 12:43 PM, Ramsey, Robert L
robert-ram...@uiowa.eduwrote:

 I am having the hardest time getting a select as statement right.  Here is
 the full query:

 select
 SUM(IF(image='EE0840D00E2ED8A317E5FA9899C48C19',1,0)) as
 EE0840D00E2ED8A317E5FA9899C48C19,
 SUM(IF(image='235C7987796D5B7CEBF56FBDA2BF7815',1,0)) as
 235C7987796D5B7CEBF56FBDA2BF7815,
 SUM(IF(image='96DC0562ED6E6F7FE789A18E09BC5889',1,0)) as
 96DC0562ED6E6F7FE789A18E09BC5889,
 SUM(IF(image='D8B0EA710D2EF408391132F451AE724A',1,0)) as
 D8B0EA710D2EF408391132F451AE724A,
 SUM(IF(image='018C4DB7229D7D2BEB040D241739B784',1,0)) as
 018C4DB7229D7D2BEB040D241739B784,
 SUM(IF(image='98DE1FCA50AC9CE6E0FEA25BAB0177FE',1,0)) as
 98DE1FCA50AC9CE6E0FEA25BAB0177FE,
 SUM(IF(image='4E5664736F400E8B482EA7AA67853D13',1,0)) as
 4E5664736F400E8B482EA7AA67853D13,   --offending line
 SUM(IF(image='FEB810A43A1B275605BD6B69F444700C',1,0)) as
 FEB810A43A1B275605BD6B69F444700C
 from dsrssfeed

 If I remove that one line, the query works fine.  If I do:

 select  SUM(IF(image='4E5664736F400E8B482EA7AA67853D13',1,0)) as 4E from
 dsrssfeed ;

 it works.  But these fail:

 select  SUM(IF(image='4E5664736F400E8B482EA7AA67853D13',1,0)) as
 4E5664736F400E8B482EA7AA67853D13 from dsrssfeed ;
 select  SUM(IF(image='4E5664736F400E8B482EA7AA67853D13',1,0)) as 4E5 from
 dsrssfeed ;

 It can't be field name length, since even 4E5 fails, the field name can
 start with a number since 4E succeeds.

 Any ideas?

 The goal is to see what arbitrary images have information associated with
 them.  The table has two fields:  image is a UID that is the primary key,
 and caption which is a varchar(255) that has information about the image.
  Images are added and deleted from the table as they are changed on a web
 page.  The UID is generated by a third party program that I have to
 interface with and have no control over.  An array of image UIDs is sent to
 the php script and the script needs to determine which UIDs are present in
 the table.  Rather than make N number of individual queries as I iterate
 through the array, I iterate through the array and build the query on the
 fly to make one query.  Then I iterate through the array again and check the
 value in the field.  1 means the UID has an entry, 0 means it doesn't.  I
 thought doing 1 mysql call would be more efficient than lots of calls as I
 iterate through the array.  But since there will probably never be more than
 100 images in the table at any one time, it may not make any difference.
  But now I'm just curious as to why this is happening.

 Thanks,

 Bob







Re: odd problem with select as statement

2010-12-20 Thread Hank
i.e.  just try this:

mysql> select 4E5664736F400E8B482EA7AA67853D13;
ERROR 1367 (22007): Illegal double '4E5664736' value found during parsing
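
One likely workaround: backtick-quote the alias so the parser treats it
as an identifier rather than the start of a numeric literal (a sketch
against the table from the original post):

select SUM(IF(image='4E5664736F400E8B482EA7AA67853D13',1,0))
  as `4E5664736F400E8B482EA7AA67853D13`
from dsrssfeed;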

-Hank


On Mon, Dec 20, 2010 at 12:50 PM, Hank hes...@gmail.com wrote:


 Here's my 5 second guess..

  4E5664736... is being interpreted as a number in scientific notation,
  i.e. 4*10^5664736, and the parser doesn't like that as a field name.

 -Hank


 On Mon, Dec 20, 2010 at 12:43 PM, Ramsey, Robert L 
 robert-ram...@uiowa.edu wrote:

 I am having the hardest time getting a select as statement right.  Here is
 the full query:

 select
 SUM(IF(image='EE0840D00E2ED8A317E5FA9899C48C19',1,0)) as
 EE0840D00E2ED8A317E5FA9899C48C19,
 SUM(IF(image='235C7987796D5B7CEBF56FBDA2BF7815',1,0)) as
 235C7987796D5B7CEBF56FBDA2BF7815,
 SUM(IF(image='96DC0562ED6E6F7FE789A18E09BC5889',1,0)) as
 96DC0562ED6E6F7FE789A18E09BC5889,
 SUM(IF(image='D8B0EA710D2EF408391132F451AE724A',1,0)) as
 D8B0EA710D2EF408391132F451AE724A,
 SUM(IF(image='018C4DB7229D7D2BEB040D241739B784',1,0)) as
 018C4DB7229D7D2BEB040D241739B784,
 SUM(IF(image='98DE1FCA50AC9CE6E0FEA25BAB0177FE',1,0)) as
 98DE1FCA50AC9CE6E0FEA25BAB0177FE,
 SUM(IF(image='4E5664736F400E8B482EA7AA67853D13',1,0)) as
 4E5664736F400E8B482EA7AA67853D13,   --offending line
 SUM(IF(image='FEB810A43A1B275605BD6B69F444700C',1,0)) as
 FEB810A43A1B275605BD6B69F444700C
 from dsrssfeed

 If I remove that one line, the query works fine.  If I do:

 select  SUM(IF(image='4E5664736F400E8B482EA7AA67853D13',1,0)) as 4E from
 dsrssfeed ;

 it works.  But these fail:

 select  SUM(IF(image='4E5664736F400E8B482EA7AA67853D13',1,0)) as
 4E5664736F400E8B482EA7AA67853D13 from dsrssfeed ;
 select  SUM(IF(image='4E5664736F400E8B482EA7AA67853D13',1,0)) as 4E5 from
 dsrssfeed ;

 It can't be field name length, since even 4E5 fails, the field name can
 start with a number since 4E succeeds.

 Any ideas?

 The goal is to see what arbitrary images have information associated with
 them.  The table has two fields:  image is a UID that is the primary key,
 and caption which is a varchar(255) that has information about the image.
  Images are added and deleted from the table as they are changed on a web
 page.  The UID is generated by a third party program that I have to
 interface with and have no control over.  An array of image UIDs is sent to
 the php script and the script needs to determine which UIDs are present in
 the table.  Rather than make N number of individual queries as I iterate
 through the array, I iterate through the array and build the query on the
 fly to make one query.  Then I iterate through the array again and check the
 value in the field.  1 means the UID has an entry, 0 means it doesn't.  I
 thought doing 1 mysql call would be more efficient than lots of calls as I
 iterate through the array.  But since there will probably never be more than
 100 images in the table at any one time, it may not make any difference.
  But now I'm just curious as to why this is happening.

 Thanks,

 Bob








A better REPAIR TABLE for myisam tables (or for upgrading tables)

2010-12-17 Thread Hank
I've posted a similar post in the past -- but there I was mucking around
with blank index files and frm files to fool myisamchk into repairing a
table.

 But now I think I've come across a much better and more efficient way to do
a REPAIR TABLE in order to upgrade my database tables from Mysql 4.1.x to
5.5.8.

All this comes from the fact that REPAIR TABLE does not rebuild the table
indexes like myisamchk does, which is very unfortunate.  Sure, REPAIR TABLE
works great for small tables, but if you have any tables of larger size
(millions of records or more, with multiple indexes), REPAIR TABLE can take
hours or days to do a simple repair/upgrade.  And in most cases,
applications just can't be down for that long during an upgrade cycle (not
everyone runs a huge shop with multiple dev/test/upgrade/production
servers).

So here is what I have done, and propose this as a better REPAIR TABLE for
MYISAM tables (in pseudo code):

1.  Retrieve the original CREATE TABLE DDL with show create table SOURCE
2.  Modify DDL to change the table name to a new target table, let's call it
TARGET
3.  Execute new DDL to create empty TARGET table
4.  Run 'myisamchk -r --keys-used=0 TARGET'  (to disable all index keys on
new table)
5.  flush tables; lock table SOURCE read, TARGET write;
6.  insert into TARGET select * From SOURCE;
7.  flush tables; unlock tables;
8.  'myisamchk -prqn TARGET'  (repair to re-enable all keys, do not modify
MYD table, use sorting, in parallel)
9.  rename tables to replace SOURCE with TARGET
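
As a sketch, the same steps spelled out for a table SOURCE in schema db
(the data-directory path is an example assumption; the myisamchk lines
run in the shell, not the mysql client):

CREATE TABLE db.TARGET LIKE db.SOURCE;           -- stands in for steps 1-3
-- shell: myisamchk -r --keys-used=0 /var/lib/mysql/db/TARGET    (step 4)
FLUSH TABLES; LOCK TABLE db.SOURCE READ, db.TARGET WRITE;        -- step 5
INSERT INTO db.TARGET SELECT * FROM db.SOURCE;                   -- step 6
FLUSH TABLES; UNLOCK TABLES;                                     -- step 7
-- shell: myisamchk -prqn /var/lib/mysql/db/TARGET               (step 8)
FLUSH TABLES;                      -- per the correction in the follow-up
RENAME TABLE db.SOURCE TO db.SOURCE_old, db.TARGET TO db.SOURCE; -- step 9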

I've written a PHP script to do exactly this, and it works beautifully.  My
source tables are mysql 4.1.x tables, and the target tables are now fully
5.5 compliant  (verified with mysqlcheck --check-upgrade).

The best part is that for tables with 50 million short rows, it ran in 7
minutes, and a table with 30 million rows, it ran in 4 minutes.

I'm now running it on a table with over 200 million rows, and I expect it to
take an hour or so... but in all cases, doing a REPAIR TABLE on any of these
large tables would take days to complete.

So why can't the REPAIR TABLE command do something like this in the
background for large MYISAM tables?

-Hank


Re: A better REPAIR TABLE for myisam tables (or for upgrading tables)

2010-12-17 Thread Hank
Sorry...

One small correction to my above post..

'FLUSH TABLES' should be issued between steps 8 and 9.

My 200+ million record table completed in 71 minutes.

-Hank




Mysql 5.5.x -- when is it going to go GA?

2010-12-14 Thread Hank
Mysql 5.5 -- when is it going to go GA?

And when it does, which version will it be?  5.5.8 or 5.5.6rc?

Thanks,

-Hank


Re: Purposely Corrupting a table

2010-10-19 Thread Hank
It's easy to corrupt the MYISAM index (MYI) file... I do something
like this in linux -- assuming your table is not tiny, and mysql isn't
running or you have a lock on the table:

dd if=table.MYI of=table2.MYI bs=2048 count=100

then copy table2.MYI over table.MYI and then flush tables and then unlock.

Your table will be unreadable until you rebuild the index with REPAIR
TABLE or myisamchk.  The MYD file will remain intact.

If your MYI file is smaller than 200k, then just reduce the count=#.

-Hank



 On Tue, Oct 19, 2010 at 7:53 PM, Steve Staples sstap...@mnsi.net wrote:

 Ok, been googling all morning, and keep getting the same post (on
 multiple different sites).

 Is there a way, where I corrupt a table purposely?   I've tried playing
 with the .MYD file, and yeah, it marks it deleted under the check
 routine,  but the table is still readable/writable, just doesn't have
 any info when selecting it...

 is there another way to corrupt the table, where you can't even select
 from it, or the response back from a select is an error?




--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Constructing query to display item count based on increments of time

2010-10-06 Thread Hank
Here's what I came up with:

 select concat(left(DATE_FORMAT(start_time, '%Y-%m-%d %h:%i'),15),'0')
 as time, count(*) from table group by time

-Hank


 How would one go about to construct a query that counts items within an
 increment or span of time, let's say increments of 10 minutes?
 Imagine a simple table where each row has a timestamp, and the query
 should return the count of items occurring within the timespan of a defined
 period.

 Say,

 09:00: 14
 09:10: 31
 09:20: 25
 09:30:  0
 09:40: 12

 etc.

 I have been able to get collections of item occurrence based on month and
 day by using GROUP BY together with a DATE_FORMAT( start_time, %d %m %Y )
 eg.
 I can however not seem to be able to find the solution to grouping based
 on the minute increments in my example above.

 Any suggestions?

 --
 Kind regards


 Pascual Strømsnæs

 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe: http://lists.mysql.com/mysql?unsub=travis_...@hotmail.com



 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:    http://lists.mysql.com/mysql?unsub=hes...@gmail.com



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Constructing query to display item count based on increments of time

2010-10-06 Thread Hank
Ok, I can see that.  Here's a different approach that gets it down to
two function calls and some math.. and the DATE_FORMAT call might not
even be needed depending on the actual application.

select
   DATE_FORMAT(start_time, '%Y-%m-%d %h: ') as dhour,
   10*(minute(start_time) DIV 10) as dtime, count(*)
from table
group by dhour,dtime;
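
And for completeness, Johan's single-call numeric approach spelled out
(a sketch; assumes start_time is a DATETIME/TIMESTAMP, with 600 seconds
per 10-minute bucket):

select
   from_unixtime(unix_timestamp(start_time) DIV 600 * 600) as bucket,
   count(*)
from table
group by bucket;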

-Hank



On Wed, Oct 6, 2010 at 4:22 PM, Johan De Meersman vegiv...@tuxera.be wrote:
 Two people already who suggested a text-based approach vs. my numeric
 approach.

 Analysing, my method takes a single function call per record (to_unixtime);
 Travis' takes 4 (concat, date_format, truncate, minute) and Hank's 3
 (concat, left, date_format).

 Someone feel like benchmarking ? :-D



 On Wed, Oct 6, 2010 at 5:44 PM, Hank hes...@gmail.com wrote:

 Here's what I came up with:

   select concat(left(DATE_FORMAT(start_time, '%Y-%m-%d %h:%i'),15),'0')
  as time, count(*) from table group by time

 -Hank

 
  How would one go about to construct a query that counts items within an
  increment or span of time, let's say increments of 10 minutes?
  Imagine a simple table where each row has a timestamp, and the query
  should return the count of items occurring within the timespan of a
  defined
  period.
 
  Say,
 
  09:00: 14
  09:10: 31
  09:20: 25
  09:30:  0
  09:40: 12
 
  etc.
 
  I have been able to get collections of item occurrence based on month
  and
  day by using GROUP BY together with a DATE_FORMAT( start_time, %d %m
  %Y )
  eg.
  I can however not seem to be able to find the solution to grouping
  based
  on the minute increments in my example above.
 
  Any suggestions?
 
  --
  Kind regards
 
 
  Pascual Strømsnæs
 
  --
  MySQL General Mailing List
  For list archives: http://lists.mysql.com/mysql
  To unsubscribe:
  http://lists.mysql.com/mysql?unsub=travis_...@hotmail.com
 
 
 
  --
  MySQL General Mailing List
  For list archives: http://lists.mysql.com/mysql
  To unsubscribe:    http://lists.mysql.com/mysql?unsub=hes...@gmail.com
 
 

 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:    http://lists.mysql.com/mysql?unsub=vegiv...@tuxera.be




 --
 Bier met grenadyn
 Is als mosterd by den wyn
 Sy die't drinkt, is eene kwezel
 Hy die't drinkt, is ras een ezel


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Not to show until a certain date

2010-10-01 Thread Hank
On Fri, Oct 1, 2010 at 2:34 PM, Joerg Bruehe joerg.bru...@oracle.com wrote:
 Hi!


 Hank wrote:
 On Wed, Sep 29, 2010 at 8:33 AM, Patrice Olivier-Wilson
 b...@biz-comm.com wrote:
 On 9/28/10 8:33 PM, Chris W wrote:

 SELECT *
 FROM announcements
  WHERE announcements_expiredate > CURDATE()
  AND announcements_postdate <= CURDATE()
 ORDER BY announcements_expiredate ASC

 Or how about something like this:

 SELECT *
 FROM announcements
 WHERE  CURDATE() between announcements_postdate and announcements_expiredate
 ORDER BY announcements_expiredate ASC

 The syntax is correct, but I don't think this statement will be
 optimized as well as the other proposal:
 BETWEEN is intended for column BETWEEN const1 AND const2,
 whereas your statement is const BETWEEN column1 AND column2.


But that only really matters if there are indexes on the column1 and
column2 fields.

And for the optimizer, wouldn't it make sense to map BETWEEN into two
comparison statements (column > const1 and column <= const2) or
(const > column1 and const <= column2) where in both scenarios the
optimizer may be able to use indexes on the fields?  It's exactly the
same as the other proposal:

CURDATE() > announcements_postdate and CURDATE() <=
announcements_expiredate, which still is using two different fields
for the comparisons... so wouldn't both scenarios end up in the exact
same place?
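
Either way, EXPLAIN on a given table should settle which form the
optimizer handles better (a sketch, assuming indexes exist on both date
columns):

EXPLAIN SELECT * FROM announcements
WHERE CURDATE() BETWEEN announcements_postdate AND announcements_expiredate;

EXPLAIN SELECT * FROM announcements
WHERE announcements_postdate <= CURDATE()
  AND announcements_expiredate > CURDATE();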

-Hank

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Not to show until a certain date

2010-10-01 Thread Hank
I would argue that it's not a logical error.  Typically, in my
experience when something expires, the date of expiry is inclusive,
and any date after expiry is not.  Take any discount, coupon, or
promotion -- if it expires on December 31, it's still available on
that day.  If an article expires on December 31, I would interpret
that as still being valid on December 31, and not valid on January 1.
Another example is Credit Cards expire on the last day of the month of
expiry, but they are still valid on that date.

But assuming for a moment that for this application, the date of
expiry is not inclusive (i.e. that expirydate actually means date of
deletion), one could still do:

curdate() between postdate and date_sub(expiredate, INTERVAL 1 day)

And to reply to Joerg Bruehe, I have used this method of BETWEEN many
times over the years without any performance or optimization issues...
but sure, that doesn't mean they will never exist for other
applications, but it has worked well for me.

Best,

-Hank


On Fri, Oct 1, 2010 at 4:50 PM, BMBasal bmb37...@gmail.com wrote:
 Your suggestion seems more elegant. However, you missed the mathematical
 meaning of BETWEEN in SQL:
 it is inclusive of both lower and upper bounds.

 In the case raised by Patrice Olivier-Wilson, when an announcement expires
 on announcements_expiredate, it should not show on that date, and
 thereafter.

 But using BETWEEN, it will show on announcements_expiredate, thus a logical
 error.

 -Original Message-
 From: Hank [mailto:hes...@gmail.com]
 Sent: Friday, October 01, 2010 4:10 PM
 To: Joerg Bruehe
 Cc: mysql@lists.mysql.com
 Subject: Re: Not to show until a certain date

 On Fri, Oct 1, 2010 at 2:34 PM, Joerg Bruehe joerg.bru...@oracle.com
 wrote:
 Hi!


 Hank wrote:
 On Wed, Sep 29, 2010 at 8:33 AM, Patrice Olivier-Wilson
 b...@biz-comm.com wrote:
 On 9/28/10 8:33 PM, Chris W wrote:

 SELECT *
 FROM announcements
  WHERE announcements_expiredate > CURDATE()
  AND announcements_postdate <= CURDATE()
 ORDER BY announcements_expiredate ASC

 Or how about something like this:

 SELECT *
 FROM announcements
 WHERE  CURDATE() between announcements_postdate and
 announcements_expiredate
 ORDER BY announcements_expiredate ASC

 The syntax is correct, but I don't think this statement will be
 optimized as well as the other proposal:
 BETWEEN is intended for column BETWEEN const1 AND const2,
 whereas your statement is const BETWEEN column1 AND column2.


 But that only really matters if there are indexes on the column1 and
 column2 fields.

 And for the optimizer, wouldn't it make sense to map BETWEEN into two
 comparison statements (column > const1 and column <= const2) or
 (const > column1 and const <= column2) where in both scenarios the
 optimizer may be able to use indexes on the fields?  It's exactly the
 same as the other proposal:

 CURDATE() > announcements_postdate and CURDATE() <=
 announcements_expiredate, which still is using two different fields
 for the comparisons... so wouldn't both scenarios end up in the exact
 same place?

 -Hank

 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:    http://lists.mysql.com/mysql?unsub=li...@cs.albany.edu


 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:    http://lists.mysql.com/mysql?unsub=hes...@gmail.com



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Not to show until a certain date

2010-09-29 Thread Hank
On Wed, Sep 29, 2010 at 8:33 AM, Patrice Olivier-Wilson
b...@biz-comm.com wrote:
 On 9/28/10 8:33 PM, Chris W wrote:


 SELECT *
 FROM announcements
WHERE announcements_expiredate > CURDATE()
AND announcements_postdate <= CURDATE()
 ORDER BY announcements_expiredate ASC

Or how about something like this:

SELECT *
FROM announcements
WHERE  CURDATE() between announcements_postdate and announcements_expiredate
ORDER BY announcements_expiredate ASC

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Migrating my mindset from MyISAM to InnoDB

2010-09-29 Thread Hank
 2.  Don't stare at the screen.  Start it, script the process  have it email 
 your phone when it's done.  Do something else in the mean time.

I don't literally stare at the screen -- of course I script it and do
other things.. but when I have a resource limited environment, it sure
would be nice to have *some idea* of the progress of the rebuild.  By
staring at the blank screen, I really meant to say that there is
absolutely no feedback at all during the process, to get even any idea
of how far it has completed and how far it has to go.

From my initial tests at rebuilding a 5.6 million record table (4.75
hours), trying to rebuild a 200 million record table would take more
than 7 days. And I have two of those tables to rebuild.  I can
accomplish the same myISAM rebuild in two hours.

Unfortunately, no.  MySQL threads should really make periodic updates to
their status so you can see the progress of long-running queries in the
show processlist output.  http://bugs.mysql.com/bug.php?id=26182 included
a patch that adds progress updates to select statements, so it should be
possible to do the same for ALTER TABLEs as well.

Wow, that sure would be nice... even with some extended information
like myisamchk output. That would be an awesome feature to add to 5.5.

Expect to see anywhere from a 1.5x to a 3x increase in size when converting
from myisam to innodb, depending on your field types and indexes.  It's the
penalty you pay for supporting transactions and concurrent read/write
access, and for switching to an index-organized table.

Now that you put it that way, I'm thinking of just sticking with
myisam.  I can't spend two weeks upgrading the two 200 million row
tables.

Thanks for all your comments.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Migrating my mindset from MyISAM to InnoDB

2010-09-28 Thread Hank
Primarily due to many positive posts I've seen about MySQL 5.5 and
advances in InnoDB, I'm seriously considering converting all my MyISAM
databases to InnoDB.   I don't need many of the InnoDB features, but
if I'm going to upgrade from 4.1.14 to 5.5, I might as well bite the
bullet since that seems to be the direction of MySQL/Oracle.

I've been using MySQL 4.1.14 for years in my production environment,
including one master and several slaves for report and long running
queries.

Every 6 to 12 months the master MYI index files grow fairly large, so
I take the production database offline, and run myisamchk -r on the
index files to rebuild them and shrink them back down again. I usually
get a 20% to 30% space saving and improved performance after the
rebuilds. This has worked very well for me for, well, almost 10 years
now!

And when I say "large", my two main tables have about 200 million rows,
and the myisamchk can take between 60-160 minutes to complete.

I very much like how verbose myisamchk is in detailing which index it
is currently rebuilding, and the progress in terms of records
re-indexed.

SO, my questions are this:

1. With InnoDB, do the indexes ever need to be rebuilt to reduce index
size and improve performance like I get with MyISAM?
2. If so, are there any tools like myisamchk to monitor the InnoDB
index rebuild process, other than issuing a repair table... and
staring indefinitely at a blank screen until it finishes hours later?
3.  I've been testing the rebuild process during upgrading using
alter table table_name engine=innodb to convert my tables from
4.1.14 to 5.5.6, and I'm seeing a 130% increase (more than double) in
the raw disk space required for the new InnoDB tables compared to
their old MyISAM counterparts. (I am using single-file-per-table).  Is
this normal?  If not, how can I adjust the space requirements for
these tables so they don't take up so much additional space?

I'm sure I'll have more questions later, but many thanks for your
comments and thoughts.

-Hank

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: numbering the result set rows for insertion into another table

2010-09-20 Thread Hank
On Mon, Sep 20, 2010 at 7:36 AM, Shawn Green (MySQL)
shawn.l.gr...@oracle.com wrote:
 Hello Hank,

 On 9/18/2010 9:35 PM, Hank wrote:

 I have the following pseudo code running on mysql 4.x:

 set @cnt:=0;
 insert ignore into dest_table
       select t1.field1,  t1.field2,  t1.field3,  t2.field1,
 t1.field3, t2.ts, @cnt:=@cnt+1
       from table1 as t1 left join table2 as t2 using (field1, field2)
       order by t2.ts;

 This works perfectly to sequentially number the result set rows
 inserted into dest_table in order of t2.ts (a timestamp field).

 In my upgrade to mysql 5.1.14-community, the numbers returned by @cnt
 are not in order... they trend upward from 0 to the number of records
 inserted, but they're far from in order... so somehow mysql is
 inserting the rows in some strange order.

 How can I fix this so it works in both mysql 4.x and 5.x?


 I am not sure you can fix this to work properly in a single statement for
 5.1.14. The order of operations appears out of sequence to what you need.

 When executing an SQL statement, there are several stages to the processing.
 1)gather rows and filter on matches (FROM ... and JOIN ...)
 2)filter the results of 1 (WHERE)
 3)apply any GROUP BY
 4)filter the results of 3 (HAVING)
 5)sort the results (ORDER BY)
 6)window the results (LIMIT)

 It appears that computation of your @cnt variable is performed BEFORE the
 ORDER BY and not after the ORDER BY.  This is completely in line with how
 the SQL Standard says a query should operate.  What if you wanted to ORDER
 BY on the @cnt column and we did not compute it until after that stage of
 processing? That would break standards compatibility. To make this work the
 way you want, you need to create a temporary table with the results of your
 query sorted the way you want them. Then, query that temporary table and add
 your column of sequential numbers to the first results.
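
 A sketch of that temporary-table approach, reusing the names from the
 original query (hedged: it relies on a full scan of the MyISAM temporary
 table returning rows in insertion order, which is an assumption):

 CREATE TEMPORARY TABLE sorted_rows
   SELECT t1.field1, t1.field2, t1.field3, t2.field1 AS t2_field1, t2.ts
   FROM table1 AS t1 LEFT JOIN table2 AS t2 USING (field1, field2)
   ORDER BY t2.ts;

 SET @cnt := 0;
 INSERT IGNORE INTO dest_table
   SELECT field1, field2, field3, t2_field1, field3, ts, @cnt := @cnt + 1
   FROM sorted_rows;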


 There may possibly be a saving grace for you, though. 5.1.14 was a very
 early release in the 5.1 series. It is possible that someone else noticed
 the same problem and a later version may be operating as you want.  We are
 currently releasing 5.1.50 which contains 34 rounds of bugfixes above and
 beyond your current 5.1.14. I suggest you upgrade and try again. Even if
 this does not fix the behavior to act as you want, the upgrade will at least
 remove your exposure to hundreds of identified bugs.

 --
 Shawn Green
 MySQL Principal Technical Support Engineer
 Oracle USA, Inc.
 Office: Blountville, TN


Hello Shawn,

  Many thanks for your detailed reply.  This is a test/dev box which I
do plan to upgrade to the newest mysql version (5.1.x or maybe 5.5.x)
in a couple of weeks.

But I found a solution to my problem... I'm not setting the @cnt value
in the insert...select statement, but I added a second statement
right after it to do this, which works as I intended:

set @cnt:=0;
update dest_table set hcnt=@cnt:=@cnt+1 where clause order by ts;

This works for both mysql 4.x and 5.1.15.
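
For the archives, Shawn's temporary-table suggestion would look roughly like
this (a sketch using the pseudo field names from my original query):

set @cnt:=0;
create temporary table sorted_rows
  select t1.field1, t1.field2, t1.field3,
         t2.field1 as t2_field1, t1.field3 as field3b, t2.ts
  from table1 as t1 left join table2 as t2 using (field1, field2)
  order by t2.ts;
insert ignore into dest_table
  select field1, field2, field3, t2_field1, field3b, ts, @cnt:=@cnt+1
  from sorted_rows;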

-Hank

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



numbering the result set rows for insertion into another table

2010-09-18 Thread Hank
I have the following pseudo code running on mysql 4.x:

set @cnt:=0;
insert ignore into dest_table
  select t1.field1,  t1.field2,  t1.field3,  t2.field1,
t1.field3, t2.ts, @cnt:=@cnt+1
  from table1 as t1 left join table2 as t2 using (field1, field2)
  order by t2.ts;

This works perfectly to sequentially number the result set rows
inserted into dest_table in order of t2.ts (a timestamp field).

In my upgrade to mysql 5.1.14-community, the numbers returned by @cnt
are not in order... they trend upward from 0 to the number of records
inserted, but they're far from in order... so somehow mysql is
inserting the rows in some strange order.

How can I fix this so it works in both mysql 4.x and 5.x?

Many thanks.

-Hank

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: question about VIEWS in 5.1.x

2010-09-03 Thread Hank
On 02/09/2010 8:30 p.m., Hank wrote:

 Simple question about views:


 Hank,
 Have you tried running away from the problem :-) by doing...

 DELIMITER $$
 CREATE PROCEDURE `combo`(theid INT)
 BEGIN
   (SELECT * FROM table1 WHERE id = theid)
   UNION
   (SELECT * FROM table2 WHERE id = theid);
 END$$
 DELIMITER ;

 then calling it using

 call combo(value);


Wow - thanks. This works perfectly.   I'm assuming I can use call
combo(value) in PHP and it returns the result set as if it were a
proper table?

-Hank

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



question about VIEWS in 5.1.x

2010-09-02 Thread Hank
Simple question about views:

I have a view such as:

 create view combo as
 select * from table1
 union
 select * from table2;

Where table1 and table2 are very large and identical and have a
non-unique key on field id.

when I do a:

select * from combo where id='value'  ;

the system seems to be doing a table scan of one or both tables. I
can't even do an:

explain select * from combo where field='value' ;

the system seems to hang on the explain.  SHOW PROCESSLIST says the
explain is "Sending data".

Issuing either one of the view components with the where clause
returns results in a fraction of a second (pretty much a full indexed
lookup)

I know when I used to use Oracle, the where clause would be applied to
all parts of the view, but in this case, I can't even figure out what
MySQL is trying to do.

(I've also tried UNION ALL with the same results).

Any suggestions on how to query both tables using the indexes and the
view at the same time?  That was my intention.

-Hank

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Realistic settings for [myisamchk] in my.cnf

2010-01-05 Thread Hank
I'm looking to optimize the myisamchk settings for some table rebuilds
I need to do.

I'm running CentOS 5 and MySQL 5.1 in a VMWare VM with 4 vCPUs and 4GB
of memory.

All the examples I can find online look like they are several years
old, and just copied from someone else's config.  I think with 4GB of
memory, the settings can be better than this example:

[myisamchk]
key_buffer = 256M
sort_buffer_size = 256M
read_buffer = 2M
write_buffer = 2M
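
With 4GB of RAM, I was picturing something more like the following (a guess
to react to, not a tested recommendation):

[myisamchk]
key_buffer = 1024M
sort_buffer_size = 1024M
read_buffer = 8M
write_buffer = 8M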

Any suggestions?  Thanks,

-Hank


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Queue / FIFO in MySQL?

2009-09-07 Thread Hank
On Mon, Sep 7, 2009 at 8:18 PM, Allen Fowlerallen.fow...@yahoo.com wrote:
 Hello,

 I need to create a system where records are generated by a producer process 
 and processed by several worker processes.

 I was thinking about something like:

 Producer:
 1) Producer INSERTs new records with state = new && worker = null
 2) Producer sleeps and loops back to step #1

 Worker(s):
 1) Worker UPDATEs all records with worker = pid && state = working
 where state == new
 2) Worker SELECTs all records where worker = pid && state = working
 3) For each record that is done, worker updates record with state = done
 4) Worker loops back to step #1

 Note: In this scheme the worker winds up with all new records generated 
 since the last worker claimed any. Not sure how else to guarantee atomicity. 
 I would prefer only n records per request. Ideas?
 (probably using python...)

 Thank you,
 :)


Assuming you are using MyISAM tables, all you really need to do is (a)
use a LOCK TABLE before the first UPDATE statement and UNLOCK TABLES
after, and (b) put a LIMIT clause on the UPDATE statement.  Other than
that, what you outlined is exactly what I do for a very similar
process, although right now I only have one worker process, but if I
wanted to add more, it's already built to handle that.
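
A rough sketch of what I mean (the table and column names are made up to
match the pseudo code above, and 42 stands in for the worker's pid):

LOCK TABLES queue WRITE;
UPDATE queue SET worker = 42, state = 'working'
  WHERE state = 'new' LIMIT 10;        -- claim at most n rows atomically
UNLOCK TABLES;

SELECT * FROM queue WHERE worker = 42 AND state = 'working';
-- ... process each record, then:
UPDATE queue SET state = 'done' WHERE worker = 42 AND state = 'working';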

-Hank

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Speeding up a pretty simple correlated update query

2009-09-06 Thread Hank
Hello All,
 I'm reposting this since I didn't get much response the last time, so I'm
hoping to reach out again.  My correlated update query (see below) was
running for 9 days before I killed it.   Here is my original question:

  I have a legacy application which was written using a compound primary key
of an item number (non unique) along with a category ID. The combination of
the item number and category ID make the records unique.

  I am in the process of replacing the compound (VARCHAR) keys with an
unique integer key in these tables.

So I have created an item_seq table and assigned a unique sequence number to
each compound key -- it looks like this (all tables are myisam tables, and
mysql version 5.0)

desc item_seq;
+----------+------------------+------+-----+---------+----------------+
| Field    | Type             | Null | Key | Default | Extra          |
+----------+------------------+------+-----+---------+----------------+
| seq      | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| itemid   | char(11)         | NO   | MUL |         |                |
| category | char(4)          | NO   |     |         |                |
+----------+------------------+------+-----+---------+----------------+

I also have my main transactional table with about 180,000,000 rows -- it
looks like this:

desc item_trans;
+----------+------------------+------+-----+---------+-------+
| Field    | Type             | Null | Key | Default | Extra |
+----------+------------------+------+-----+---------+-------+
| seq      | int(10) unsigned | NO   | MUL |         |       |
| itemid   | char(11)         | NO   | PRI |         |       |
| category | char(4)          | NO   | PRI |         |       |
| transid  | int(10)          | NO   | PRI |         |       |
+----------+------------------+------+-----+---------+-------+

Currently the seq field is null for the entire table.  So of course, I
want to update the main transaction table with the new sequence number.

So I've disabled all the keys on the item_trans table -- since I am
updating every row, it wouldn't (shouldn't) be using the index anyway.

Here is my correlated update query:

 update item_trans i, item_seq s
 set i.seq=s.seq
 where s.itemid=i.itemid and s.category=i.category;

  If I run an explain on the select version of the update, this is what I
get:

+----+-------------+------------+--------+---------------+--------+---------+-------------------------------------------+-----------+-------+
| id | select_type | table      | type   | possible_keys | key    | key_len | ref                                       | rows      | Extra |
+----+-------------+------------+--------+---------------+--------+---------+-------------------------------------------+-----------+-------+
|  1 | SIMPLE      | item_trans | ALL    | PRIMARY       | NULL   | NULL    | NULL                                      | 178948797 |       |
|  1 | SIMPLE      | item_seq   | eq_ref | itemid        | itemid | 20      | g.item_trans.itemid,g.item_trans.category |         1 |       |
+----+-------------+------------+--------+---------------+--------+---------+-------------------------------------------+-----------+-------+

... which is exactly what I would expect it to do.  Update every record of
the item_trans table, and do a full index lookup on the items_seq table.

SO... I've been running this query to update item_trans, and it's been
running for 5 days now.

I've also tried running this with the primary key index on the item_trans
table (but not the seq index), and that ran slower in my initial tests.

Are there any faster ways to update 180 million records with a correlated
update query?  And I'm fairly certain that trying to do this in PHP
one-record at a time would take much longer than a SQL solution.

Thanks,

-Hank


Re: Speeding up a pretty simple correlated update query

2009-09-06 Thread Hank
On Sun, Sep 6, 2009 at 6:01 PM, mos mo...@fastmail.fm wrote:


  So I've disabled all the keys on the item_trans table -- since I am
 updating every row, it wouldn't (shouldn't) be using the index anyway.


 You can't disable unique indexes or primary keys. They are always active.
 You can only deactivate non-unique indexes.

 Here are a couple of suggestions.

 For now drop the index on item_seq.seq and desc.seq.
 Are you sure you have a compound index on item_seq.itemid and
 item_seq.category??

 do a Show create table item_seq and also Show create table desc to see
 what you've got.

 Make sure your my.cnf file has
 key_buffer_size=500M

 equal to about 30% of your available memory. You can always reduce it
 later.

 Of course there is another way of doing it, if you are willing to have the
 tail wag the dog. You may kick yourself for not discovering it yourself. :)

 set @num:=0;
 set @last:='';
 create table new_item_trans
   select IF(concat(itemid,category)=@last, @num, @num:=@num+1) Seq,
          itemid, category, transid, ...,
          @last:=concat(itemid,category) as TMPLast
   from item_trans
   order by concat(itemid,category);

 Now you can use the Alter statement to add your indexes and get rid of the
 TMPLast column.

 To build the  item_seq table you would now use:

 create table item_seq select seq, itemid, category from new_item_trans
 group by seq, itemid, category;
 And of course build your indexes on seq and rename the new_item_trans.

 I guarantee you this last solution will not take 9 days to complete! :-)

 Mike


Hi Mike,

 Thanks for your reply.  First, in my tests, I've created the target table
(item_trans) as a copy of the source table with no indexes at all (not even a
primary key). Once I get the item_seq field populated, I'll go back and
re-create the indexes in batch using myisamchk (I've posted about this
recently).

 Second, I like your second creative solution (I never would have come up
with that), but in order for it to work, mysql would have to sort 180
million records before creating the table or retrieve them out of the table
via the concatenated index, both of which I think will take a long time...
but I'll certainly give it a shot tomorrow and let you know how it goes.
Thanks again.

-Hank


Re: upgrading from 4.1 to 5.0 trick

2009-09-02 Thread Hank
On Fri, Aug 28, 2009 at 9:18 AM, Shawn Green shawn.gr...@sun.com wrote:

 Hank wrote:

 Hello All,
  I'm in the process of upgrading my database from 4.1 to 5.0 on CentOS.
  I've been testing the mysqlcheck --check-upgrade   --auto-repair
 command,
 and on one of my MYISAM tables, it's taking forever to upgrade the table.
  It has about 114 million rows, and I'm guessing it needs to be upgraded
 due
 to the VARCHAR columns. Anyway, it's been running for a day and a half,
 and
 I finally had to kill it.

 So will this old trick still work?  I've done this many times on 4.1
 with
 great success:

 In mysql 5.0 - I create two new empty tables, one identical to the
 original
 and one identical but with no indexes.  I name these tables with _ion
 and
 _ioff suffixes.

  I then do an insert into table_ioff select * from source which inserts
 just
 the original data into the new table, but doesn't have to rebuild any
 indexes.  I then flush the tables.

 Then in the file system, I swap the table_ion.frm and table_ion.MYI
 files with the table_ioff ones.  Flush tables again.

  I then just use myisamchk -r to repair the index file.  It runs in about
 an
 hour.

 Can I do this same thing to upgrade the tables, instead of using
 mysqlcheck, which seems to be rebuilding the table row-by-row, instead of
 sorting (which myisamchk does).

 thanks.

 -Hank


 Hello Hank,

 Your technique will work within the following narrow limits of operation:

 * This will only work for MyISAM tables.

 * myisamchk is dangerous to run against any table that is in active use as
 it operates at the file level and has caused corruptions with live tables.
  Whenever possible either stop the server or prevent access from MySQL to
 that table with a FLUSH TABLES WITH READ LOCK before using myisamchk.

 http://dev.mysql.com/doc/refman/5.0/en/flush.html

 Alternatively, you should be able to match or improve this import then
 index process if you use an ALTER TABLE ... DISABLE KEYS command before
 the import followed by an ALTER TABLE ... ENABLE KEYS command after the
 import or if you use LOAD DATA INFILE ... . Also if you can import all of
 the data to an empty table in a single batch (statement), the indexes will
 be computed only once using the batch-index algorithm (it's a sort, not a
 merge) and that will also save processing time.

 http://dev.mysql.com/doc/refman/5.0/en/alter-table.html
 http://dev.mysql.com/doc/refman/5.0/en/load-data.html

 http://dev.mysql.com/doc/refman/5.0/en/insert-speed.html
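
 In other words, something along these lines (a sketch; the table and file
 names are placeholders):

 ALTER TABLE big_table DISABLE KEYS;
 LOAD DATA INFILE '/tmp/big_table.txt' INTO TABLE big_table;
 ALTER TABLE big_table ENABLE KEYS;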

 The overall problem is still that the on-disk structure of the 5.0 tables
 has changed and that you still need to perform some kind of dump-restore or
 rebuild of the data as part of the conversion.
 Warmest regards,
 Shawn Green, MySQL Senior Support Engineer
 Sun Microsystems, Inc.
 Office: Blountville, TN


 Hello Shawn,

 Thanks for your reply.   Yes, I have all of your conditions covered.
1. They are myisam tables
2. This is not a production system, so other people aren't accessing the
tables.
3. And your last comment about dump/restore is taken care of (in my original
note) since I am creating a new table (without indexes) in mysql 5.0, and
then inserting all the data from the old table into the new one.  Then I'm
swapping the MYI/frm files, and then rebuilding the new table.

I've tested this several times now, and it works like a charm.

Finally, I don't like to use the ALTER TABLE DISABLE/ENABLE statements,
since they operate in silent mode -- I have no idea what it's doing, or how
long to expect the process to take.  It would be very nice if those commands
had some built-in progress meter or feedback/callback method.


Speeding up a pretty simple correlated update query

2009-09-02 Thread Hank
Hello All,
  I have a legacy application which was written using a compound primary key
of an item number (non unique) along with a category ID. The combination of
the item number and category ID make the records unique.

  I am in the process of replacing the compound (VARCHAR) keys with an
unique integer key in these tables.

So I have created an item_seq table and assigned a unique sequence number to
each compound key -- it looks like this (all tables are myisam tables, and
mysql version 5.0)

desc item_seq;
+----------+------------------+------+-----+---------+----------------+
| Field    | Type             | Null | Key | Default | Extra          |
+----------+------------------+------+-----+---------+----------------+
| seq      | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| itemid   | char(11)         | NO   | MUL |         |                |
| category | char(4)          | NO   |     |         |                |
+----------+------------------+------+-----+---------+----------------+

I also have my main transactional table with about 180,000,000 rows -- it
looks like this:

desc item_trans;
+----------+------------------+------+-----+---------+-------+
| Field    | Type             | Null | Key | Default | Extra |
+----------+------------------+------+-----+---------+-------+
| seq      | int(10) unsigned | NO   | MUL |         |       |
| itemid   | char(11)         | NO   | PRI |         |       |
| category | char(4)          | NO   | PRI |         |       |
| transid  | int(10)          | NO   | PRI |         |       |
+----------+------------------+------+-----+---------+-------+

Currently the seq field is null for the entire table.  So of course, I
want to update the main transaction table with the new sequence number.

So I've disabled all the keys on the item_trans table -- since I am
updating every row, it wouldn't (shouldn't) be using the index anyway.

Here is my correlated update query:

 update item_trans i, item_seq s
 set i.seq=s.seq
 where s.itemid=i.itemid and s.category=i.category;

  If I run an explain on the select version of the update, this is what I
get:

+----+-------------+------------+--------+---------------+--------+---------+-------------------------------------------+-----------+-------+
| id | select_type | table      | type   | possible_keys | key    | key_len | ref                                       | rows      | Extra |
+----+-------------+------------+--------+---------------+--------+---------+-------------------------------------------+-----------+-------+
|  1 | SIMPLE      | item_trans | ALL    | PRIMARY       | NULL   | NULL    | NULL                                      | 178948797 |       |
|  1 | SIMPLE      | item_seq   | eq_ref | itemid        | itemid | 20      | g.item_trans.itemid,g.item_trans.category |         1 |       |
+----+-------------+------------+--------+---------------+--------+---------+-------------------------------------------+-----------+-------+

... which is exactly what I would expect it to do.  Update every record of
the item_trans table, and do a full index lookup on the items_seq table.

SO... I've been running this query to update item_trans, and it's been
running for 5 days now.

I've also tried running this with the primary key index on the item_trans
table (but not the seq index), and that ran slower in my initial tests.

Are there any faster ways to update 180 million records with a correlated
update query?  And I'm fairly certain that trying to do this in PHP
one-record at a time would take much longer than a SQL solution.

Thanks,

-Hank


Re: Speeding up a pretty simple correlated update query

2009-09-02 Thread Hank
Hello Gavin,
 That's what I did with the first one-to-one table to create the unique SEQ
field mapping to each item/category combination.  The problem is on the
TRANSACTION table, where there are multiple instances of each item/category.
 If I just put an auto_increment primary key on that table, I'd get a unique
TRANSACTION ID, which is not what I want.  I want to populate the
transaction table with the new integer seq key created in the first table.

I guess I should have stated that my overall objective here is to eventually
drop the VARCHAR itemid and category id fields from the transaction table,
leaving only the new item sequence id (plus transid) as the primary key.
There are many tables throughout the schema that do this, and I would be
replacing them all.  It's just that this is the largest table, and the
correlated update is taking a long time, and I'm looking for a better
solution (if one exists).  thanks.

-Hank


On Wed, Sep 2, 2009 at 7:50 PM, Gavin Towey gto...@ffn.com wrote:

 Do you know that if you create seq column on the original table as an
 auto_increment primary key, it will fill in the numbers automatically?
  There's no need to create the values on another table and update with a
 join.

 Regards,
 Gavin Towey

 -Original Message-
 From: Hank [mailto:hes...@gmail.com]
 Sent: Wednesday, September 02, 2009 4:35 PM
 To: mysql@lists.mysql.com
 Subject: Speeding up a pretty simple correlated update query

 Hello All,
  I have a legacy application which was written using a compound primary key
 of an item number (non unique) along with a category ID. The combination of
 the item number and category ID make the records unique.

  I am in the process of replacing the compound (VARCHAR) keys with an
 unique integer key in these tables.

 So I have created an item_seq table and assigned a unique sequence number
 to
 each compound key -- it looks like this (all tables are myisam tables, and
 mysql version 5.0)

 desc item_seq;
 +----------+------------------+------+-----+---------+----------------+
 | Field    | Type             | Null | Key | Default | Extra          |
 +----------+------------------+------+-----+---------+----------------+
 | seq      | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
 | itemid   | char(11)         | NO   | MUL |         |                |
 | category | char(4)          | NO   |     |         |                |
 +----------+------------------+------+-----+---------+----------------+

 I also have my main transactional table with about 180,000,000 rows -- it
 looks like this:

 desc item_trans;

 +----------+------------------+------+-----+---------+-------+
 | Field    | Type             | Null | Key | Default | Extra |
 +----------+------------------+------+-----+---------+-------+
 | seq      | int(10) unsigned | NO   | MUL |         |       |
 | itemid   | char(11)         | NO   | PRI |         |       |
 | category | char(4)          | NO   | PRI |         |       |
 | transid  | int(10)          | NO   | PRI |         |       |
 +----------+------------------+------+-----+---------+-------+

 Currently the seq field is null for the entire table.  So of course, I
 want to update the main transaction table with the new sequence number.

 So I've disabled all the keys on the item_trans table -- since I am
 updating every row, it wouldn't (shouldn't) be using the index anyway.

 Here is my correlated update query:

  update item_trans i, item_seq s
  set i.seq=s.seq
  where s.itemid=i.itemid and s.category=i.category;

  If I run an explain on the select version of the update, this is what I
 get:


 +----+-------------+------------+--------+---------------+--------+---------+-------------------------------------------+-----------+-------+
 | id | select_type | table      | type   | possible_keys | key    | key_len | ref                                       | rows      | Extra |
 +----+-------------+------------+--------+---------------+--------+---------+-------------------------------------------+-----------+-------+
 |  1 | SIMPLE      | item_trans | ALL    | PRIMARY       | NULL   | NULL    | NULL                                      | 178948797 |       |
 |  1 | SIMPLE      | item_seq   | eq_ref | itemid        | itemid | 20      | g.item_trans.itemid,g.item_trans.category |         1 |       |
 +----+-------------+------------+--------+---------------+--------+---------+-------------------------------------------+-----------+-------+

 ... which is exactly what I would expect it to do.  Update every record of
 the item_trans table, and do a full index lookup on the items_seq table.

 SO... I've been running this query to update item_trans, and it's been
 running for 5 days now.

 I've also tried running this with the primary key index on the item_trans
 table (but not the seq index), and that ran slower in my initial

upgrading from 4.1 to 5.0 trick

2009-08-26 Thread Hank
Hello All,
  I'm in the process of upgrading my database from 4.1 to 5.0 on CentOS.
 I've been testing the mysqlcheck --check-upgrade   --auto-repair command,
and on one of my MYISAM tables, it's taking forever to upgrade the table.
 It has about 114 million rows, and I'm guessing it needs to be upgraded due
to the VARCHAR columns. Anyway, it's been running for a day and a half, and
I finally had to kill it.

So will this old trick still work?  I've done this many times on 4.1 with
great success:

In mysql 5.0 - I create two new empty tables, one identical to the original
and one identical but with no indexes.  I name these tables with _ion and
_ioff suffixes.

I then do an insert into table_ioff select * from source which inserts just
the original data into the new table, but doesn't have to rebuild any
indexes.  I then flush the tables.

Then in the file system, I swap the table_ion.frm and table_ion.MYI
files with the table_ioff ones.  Flush tables again.

 I then just use myisamchk -r to repair the index file.  It runs in about an
hour.

Can I do this same thing to upgrade the tables, instead of using
mysqlcheck, which seems to be rebuilding the table row-by-row, instead of
sorting (which myisamchk does).

thanks.

-Hank


Using Flush Hosts in cron

2008-09-09 Thread Hank
I have the blocked connection problem between three of my servers, all
behind two firewalls and on one switch.

Occasionally the mysql servers start blocking the hosts.

What's the downside to running a Flush hosts once per minute on these
mysql servers?

The only hosts that are connecting are one of three or four hosts behind the
same firewall.
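
If it helps, the once-per-minute flush would just be a crontab one-liner
along these lines (host and credentials are placeholders):

* * * * * mysqladmin -h dbhost -u admin -psecret flush-hosts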

thanks.

-Hank




Re: Mysql and Textarea

2007-05-22 Thread Hank

On 5/22/07, sam rumaizan [EMAIL PROTECTED] wrote:


  I'm just a php beginner.
So please be patient with my stupid questions.

What am I missing in this code that causes the function to fail to update the
textarea?

The Update button erases the old information from the mysql database and
replaces it with nothing. Basically it can't read what is inside the Textarea
box. Why??

Read the highlighted code.



Your problem is in this line:

mysql_real_escape_string($_REQUEST['Assign_Engineer'][' .$id .
']['Job_Title']);


you want something more like (may or may not work):

mysql_real_escape_string($_REQUEST['Assign_Engineer'][$id]['Job_Title']);

Since this is a PHP problem, and you can't figure it out, I'd suggest moving
your request to a PHP list.

-Hank


duplicating a replicated slave environment

2007-05-08 Thread Hank

Hello All,

I have a 4.1.14 mysql database master and slave set up.

For this slave #1, I have the IO thread running constantly, and the SQL
thread running once a day to apply all pending updates from the master
(then I shut it off).  So for most of the day, this database is static
(except for the accumulating relay logs).

I have a new machine to be another slave of the same master (slave #2).

I can not shut down or lock the master in order to copy the master database
to the slave #2 (it is 44GB total, and would take over an hour to copy).

I have copied the (static) database from Slave #1 to Slave #2.  How can I
now configure Slave #2 to process the pending relay-logs and bring it up to
date?

Obviously I would need to copy (and rename?) the relay logs, but what about
the master.info and relay-log.info files?

Or in other words, can I use the show slave status information on Slave #1
to set up Slave #2 in the CHANGE MASTER TO command?
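
In other words, something like this on Slave #2, filling in the values
reported by Slave #1 (a sketch; host, user, password, and the file/position
placeholders come from Slave #1's SHOW SLAVE STATUS):

CHANGE MASTER TO
  MASTER_HOST='master.example.com',
  MASTER_USER='repl',
  MASTER_PASSWORD='secret',
  MASTER_LOG_FILE='<Relay_Master_Log_File from Slave #1>',
  MASTER_LOG_POS=<Exec_Master_Log_Pos from Slave #1>;
START SLAVE;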

Thanks.

-Hank


Re: How to look for balanced parenthesis?

2006-07-10 Thread Hank

I used to use UltraEdit, but then switched to EditPlus because it can edit
remote files almost transparently. (Opening a file FTP's it down, you edit
local copy, Saving FTP's it back.)




FYI - UltraEdit can do this - It uses SFTP over SSH to edit remote files.
Works like a charm.

-Hank


Re: TIMESTAMP field not automatically updating last_updated field

2006-03-31 Thread Hank
Are the other fields in the update statement actually changing the
data? I don't know for sure, but if the data on disk is the same as
the update statement, mysql won't actually update the record, and
therefore might not update the last_updated field also.  Just a
thought.
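
A quick way to see it (a sketch; the table and columns are hypothetical):

UPDATE t SET col = 'same value';  -- identical to what's on disk:
                                  -- 0 rows changed, last_updated untouched
UPDATE t SET col = 'new value';   -- data actually changes, so the
                                  -- TIMESTAMP column advances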

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: count(*) send a wrong value

2006-01-23 Thread Hank
My guess would that your PHP code is not written correctly.

For instance, if you have a query in PHP:

$sql=select * from my_table where cid=123;

...and are using the PHP function mysql_numrows() to count the
results, and then for your next test... you're just changing the query
to:

$sql=select count(*) from my_table where cid=123

and still using the mysql_numrows()  to get the result, that is your
error.  You'll need to use mysql_result() or some other fetch function
to get the results of the query.

That's my guess.

-Hank

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: I can't find the missing rows in a table--

2006-01-01 Thread Hank
Don't you want the queries to be outer join and not left join?

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: MySQL query question

2005-12-24 Thread Hank
since I'm not sure how users would
 ever be directly associated with teams - I would have expected to find
 players to be associated with teams - so forgive me if this doesn't
 resemble very much what you're doing:

Think corporate projects, not sports.

Here's my take on the original query... you don't actually need to use
the teams table in the query, as long as you have DISTINCT in the
Select:

SELECT DISTINCT username
FROM users u, users_teams ut, projects_teams pt , projects p
WHERE p.project_id = '1'
AND pt.project_id = p.project_id
AND ut.team_id = pt.team_id
AND u.user_id = ut.user_id

Also, just a style comment, I would find it confusing just to use id
as the key in the projects, team, and user tables... and user_id,
team_id, and project_id in the associative tables... the field
names should be consistent throughout, so when reading queries, it's
obvious which id one is talking about.





On 12/24/05, Josh Mellicker [EMAIL PROTECTED] wrote:
 I have several tables, all with many-to-many joining tables.

 users

 users_teams

 teams

 teams_projects

 projects


 ---

 So, with a projects.id = 1, I want to get all the usernames of people
 on teams assigned to that project.

 SELECT DISTINCT username
 FROM users, users_teams, teams, projects_teams, projects
 WHERE projects.id = '1'
 AND projects_teams.project_id = projects.id
 AND teams.id = projects_teams.team_id
 AND users_teams.user_id = users.id

 gives me ALL the users who are on any team... even teams not assigned
 to that project.

 What gives? My brain hurts. Thanks for any help.

 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]




--

-Hank

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Do I need to tweak my server variables for this SELECT statement?

2005-12-23 Thread Hank
Based on the size (and # of columns) of your result set, I'm not sure
any amount of RAM would allow 360,000 records to be stored and sorted
entirely in memory.  At some point, mysql just decides to use a
temp table.

That's the limit of my tweaking skills, so someone else is going to
have to help out here on the way to maximize the memory or minimize
the time for the internal sorting of the result set, if it's possible.

-Hank

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Do I need to tweak my server variables for this SELECT statement?

2005-12-22 Thread Hank
 Now that I know what's causing the slow queries, what can I do to fix it?

The only thing I can suggest is breaking up the query into two parts:
the first part retrieves just the product codes and salesrank, and
sorts and limits that... save it in a temp table or use application code to
retrieve and print the rest of the product info.  Sorting 300,000+
records in that huge result set is going to take some time (although
it shouldn't take 10 minutes).
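
Roughly like this (a sketch built from the table and column names in your
query):

CREATE TEMPORARY TABLE top10
  SELECT p.asin, p.salesrank
  FROM pn_pricecompare_catprod c, pn_pricecompare_product p
  WHERE c.category = '283155'
    AND p.asin = c.asin
  ORDER BY p.salesrank ASC
  LIMIT 0,10;

SELECT p.*
FROM pn_pricecompare_product p, top10 t
WHERE p.asin = t.asin
ORDER BY t.salesrank;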

-Hank

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Do I need to tweak my server variables for this SELECT statement?

2005-12-21 Thread Hank
I don't think the problem is going to be solved with the my.cnf file.

Here's what I would try:

1- run and time the original query without the ORDER BY or LIMIT clauses

2- run and time the following breakdown queries, to see if the
indexes are at least working correctly:

-- test catprod
SELECT
   pn_pricecompare_catprod.category,
   pn_pricecompare_catprod.asin
FROM pn_pricecompare_catprod
WHERE
   pn_pricecompare_catprod.category =  '283155'

-- test product
SELECT
   pn_pricecompare_product.title,
   pn_pricecompare_product.prod_id,
   pn_pricecompare_product.image_small,
   pn_pricecompare_product.brand,
   pn_pricecompare_product.manufacturer,
   pn_pricecompare_product.mpn,
   pn_pricecompare_product.model,
   pn_pricecompare_product.artist,
   pn_pricecompare_product.author,
   pn_pricecompare_product.binding,
   pn_pricecompare_product.label,
   pn_pricecompare_product.audiencerating,
   pn_pricecompare_product.studio,
   pn_pricecompare_product.releasedate,
   pn_pricecompare_product.numberofpages,
   pn_pricecompare_product.pubdate,
   pn_pricecompare_product.publisher,
   pn_pricecompare_product.searchindex,
   pn_pricecompare_product.lowest_price,
   pn_pricecompare_product.num_merchants
FROM pn_pricecompare_product
WHERE
   pn_pricecompare_product.asin IN  (some test asins)
ORDER BY pn_pricecompare_product.salesrank ASC
LIMIT 0,10

-- test just getting a count of the join result
SELECT count(*)
FROM
   pn_pricecompare_catprod,
   pn_pricecompare_product
WHERE
   pn_pricecompare_catprod.category =  '283155' AND
   pn_pricecompare_catprod.asin =  pn_pricecompare_product.asin

The results of those queries should shed some light on where the
problem is being introduced.

Also, table descriptions of both tables would be helpful in locating
the problem.

-Hank

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Do I need to tweak my server variables for this SELECT statement?

2005-12-21 Thread Hank
Grant,

  You can just do a desc pn_pricecompare_catprod and desc
pn_pricecompare_product and post the results.  The CREATE TABLE
statements would be OK, but the describes are better.

To flush the query cache, I think you can do a flush tables.

-Hank

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Do I need to tweak my server variables for this SELECT statement?

2005-12-21 Thread Hank
Ok, so the next step would be to try the original query with just the
LIMIT clause, and then just the ORDER BY (but not both).

The results of select count(*) query would be helpful to know just
how many records mysql is trying to sort and limit.

And do you really need a separate index key on `salesrank` on the
product table?

-Hank

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Relocation of database store

2005-10-10 Thread Hank
You can also create a sym-link for the separate databases/directories,
and leave my.cnf as-is.   I've been doing that since 3.23, and it's
never caused me any problems.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Distance between Zip codes

2005-10-09 Thread Hank
Don't forget that you're not calculating driving distance, but
great-circle distance, which is roughly a straight line over short
distances.  If your radius is greater than, say, 50 miles, people might
complain that the actual driving distance is much greater than the
straight-line distance you provided.

--

-Hank


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: How to avoid redundancy between PK and indices ?

2005-10-06 Thread Hank
I understand what you're saying.

The problem is that if mysql attempted to do a query like you suggest:

Select Count(*) From myTable Where a=1 And b=1 And c=1 And d=1 And e=1 And f=1;

It can only use one index for the query, and hopefully, the optimizer
will pick one of the six indexes with the fewest keys to scan.  But
even if it could virtualize the PK that way, it could still cause a
scan of millions of records while limiting the table scan to one of
the six non-unique keys.  In other words, it would/could take alot of
time to see if a record is unique upon inserting new records - not
something you'd be happy with performance wise, I'm sure.  Therefore,
a true, concatenated key that enforces uniqueness and can operate
immediately upon inserts is really necessary, regardless of what other
indexes are on the columns.

The type of query you're suggesting can be done with bitmapped indexes
(Oracle has them), where the indexes values are stored as bitmaps, and
you can combine them so Oracle uses multiple indexes in one query to
quickly pare down the records to scan.  Bitmapped indexes work very
well when the cardinality of keys is less than 10,000 (number of
unique key values).  In a nutshell, think of a field for sex/gender
and a table of 1 million records. A bitmapped index of that field
would only be 125,000 bytes long (1 million bits) (one bit=one
record), and to find all the M records, just map the on bits to
the record number in the datatable. For fields with larger possible
values (say, state of residence - 50 values), each location would be
represented by 6 bits. Pretty simple concept, but great performance
gains can be had over regular btree indexes.  I think this is what
you're getting at.

When I asked the MySQL AB folks at the first conference in Orlando a
couple of years ago about adding bitmapped index support in MySQL,
they didn't really know what I was talking about. The developer I
spoke to thought I was suggesting creating indexes on bitmapped
images. No, not exactly.  I hope they know what it is now, though,
and have since considered (or already added) support for it in MySQL.

-Hank

On 10/5/05, C.R. Vegelin [EMAIL PROTECTED] wrote:
 Hi Hank,
 You are quite right.
 I need separate non-unique indices on a, b, c, d, e and f to avoid table
 scans.
 And when each combi must be unique I need a Primary Key (a,b,c,d,e,f).
 And only Key a (a) seems to be redundant with the primary key ...
 Suppose there would be a PK (a,b,c,d,e,f) defined, without a separate PK
 index.
 And let's assume some rows like:
  columns:a   b   c   d   e   f
  row1 has:  1  1   1   1   1   1
  row2 has:  1  1   1   1   1   2
  row3 has:  1  1   1   1   1   3
  etc.
 Then checking on unique PK could be done by MySQL internally with:
  Select Count(*) From myTable Where a=1 And b=1 And c=1 And d=1 And e=1 And
  f=1;
 to avoid a duplicate primary key for row1, by using / joining the separate
 index tables.
 With this Select query, MySQL could / should make use of the 6 existing
 separate indices.
 Uniqueness can be fully guaranteed with these 6 non-unique indices in this
 case.
 In other words, a separate PK index is fully redundant in this case, right ?
 In addition, it would save space without the longer concatenate key of
 a+b+c+d+e+f.
 Thanks, Cor

 - Original Message -
 From: Hank [EMAIL PROTECTED]
 To: C.R. Vegelin [EMAIL PROTECTED]
 Cc: mysql@lists.mysql.com
 Sent: Wednesday, October 05, 2005 5:57 PM
 Subject: Re: How to avoid redundancy between PK and indices ?


 It depends.. if this is your create table statement:

 CREATE TABLE foo (
  a smallint NOT NULL,
  b smallint NOT NULL,
  c smallint NOT NULL,
  d smallint NOT NULL,
  e smallint NOT NULL,
  f smallint NOT NULL,
  PRIMARY KEY  (a,b,c,d,e,f)
 );

 Then only one unique index is being created on the concatenated key of
 a+b+c+d+e+f.  Queries on any fields other than A will cause a full
 table scan.

 On the other hand, if your create table is:

 CREATE TABLE foo (
  a smallint NOT NULL,
  b smallint NOT NULL,
  c smallint NOT NULL,
  d smallint NOT NULL,
  e smallint NOT NULL,
  f smallint NOT NULL,
  PRIMARY KEY  (a,b,c,d,e,f),
  KEY a (a),
  KEY b (b),
  KEY c (c),
  KEY d (d),
  KEY e (e),
  KEY f (f)
 );

 This will create the primary key, plus six additional indexes, each of
 which is queryable. But in this case, the KEY a (a) non-unique index
  is redundant with the primary key, so to do what you want - a unique
  index on a+b+c+d+e+f PLUS the ability to independently search the b c d
 e and f fields, here is the create table you'll need to use:

 CREATE TABLE foo (
  a smallint NOT NULL,
  b smallint NOT NULL,
  c smallint NOT NULL,
  d smallint NOT NULL,
  e smallint NOT NULL,
  f smallint NOT NULL,
  PRIMARY KEY  (a,b,c,d,e,f),
  KEY b (b),
  KEY c (c),
  KEY d (d),
  KEY e (e),
  KEY f (f)
 );


 --

 -Hank

 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:http://lists.mysql.com/[EMAIL

Re: How to avoid redundancy between PK and indices ?

2005-10-05 Thread Hank
It depends.. if this is your create table statement:

CREATE TABLE foo (
  a smallint NOT NULL,
  b smallint NOT NULL,
  c smallint NOT NULL,
  d smallint NOT NULL,
  e smallint NOT NULL,
  f smallint NOT NULL,
  PRIMARY KEY  (a,b,c,d,e,f)
);

Then only one unique index is being created on the concatenated key of
a+b+c+d+e+f.  Queries on any fields other than A will cause a full
table scan.

On the other hand, if your create table is:

CREATE TABLE foo (
  a smallint NOT NULL,
  b smallint NOT NULL,
  c smallint NOT NULL,
  d smallint NOT NULL,
  e smallint NOT NULL,
  f smallint NOT NULL,
  PRIMARY KEY  (a,b,c,d,e,f),
  KEY a (a),
  KEY b (b),
  KEY c (c),
  KEY d (d),
  KEY e (e),
  KEY f (f)
);

This will create the primary key, plus six additional indexes, each of
which is queryable. But in this case, the KEY a (a) non-unique index
is redundant with the primary key, so to do what you want - a unique
index on a+b+c+d+e+f PLUS the ability to independently search the b c d
e and f fields, here is the create table you'll need to use:

CREATE TABLE foo (
  a smallint NOT NULL,
  b smallint NOT NULL,
  c smallint NOT NULL,
  d smallint NOT NULL,
  e smallint NOT NULL,
  f smallint NOT NULL,
  PRIMARY KEY  (a,b,c,d,e,f),
  KEY b (b),
  KEY c (c),
  KEY d (d),
  KEY e (e),
  KEY f (f)
);


--

-Hank

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Limiting DISTINCT To One Column

2005-09-29 Thread Hank
To check for more than one channel-per-record in the table:

select channel, count(*) as cnt from ChannelStatus group by channel having cnt > 1;

should return zero records if you have no dups.


On 9/29/05, Hal Vaughan [EMAIL PROTECTED] wrote:
 I have a query like this:

 SELECT DISTINCT Channel, ChannelType, Source FROM ChannelStatus;

 Each channel is supposedly listed in this table only 1 time, but just in case,
 what I really want to do is make sure that no channels are duplicated.  Is
 there some way to make the keyword DISTINCT apply to Channel only?  Sort of
 a shortcut to (example in pseudocode, although it'd be in Perl):

 SELECT DISTINCT Channel FROM ChannelStatus;
 FOR EACH Channel
SELECT Channel, ChannelType FROM Source WHERE Channel = 'channel'
 ENDLOOP

 Thanks!

 Hal

 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]




--

-Hank

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Circular Replication

2005-09-25 Thread Hank
 The long story short is we use the fact that MySQL has the ability to
 run the SQL thread and the IO thread of replication separately, and
 control them individually.

I'm fairly green with replication, but I have a simple cron job that
starts a PHP program that issues a slave start, watches for the
time behind master to be zero seconds, then issues a slave stop.
This repeats every 10 minutes (it takes about one minute to update 10
minutes of master data), so my slave is at most (worst case) 10
minutes behind the master.  This could be done every two hours or even
once per day.  I'll be setting up a second master to do this same
thing once per day to act as my daily backup.  Once the daily backup
completes replication, I can flush tables and backup the database
tables to the backup device for long term backups.
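
The PHP program itself boils down to three statements (a sketch; the
polling loop sits around the middle one):

START SLAVE;
SHOW SLAVE STATUS;   -- poll until Seconds_Behind_Master reaches 0
STOP SLAVE;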

What are the differences between doing this and turning the SQL and IO
threads on separately? Just IMO, that seems like a lot of manipulation
that's not really necessary, but it's possible I'm missing something.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Circular Replication

2005-09-25 Thread Hank
 I'll be setting up a second master to do this same
 thing once per day to act as my daily backup.

Oops...I meant to say second slave.

-Hank

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



myisamchk error

2005-07-31 Thread Hank
While running a -rq on a large table, I got the following error:

myisamchk: warning: Duplicate key for record at   54381140 against
record at   54380810


How do I find which record is duplicated (without doing the typical
self-join query)?  This table has 70 million rows, so that's not
really feasible.

myisamchk --block-search # looked promising, but I can't find any
documentation on how to use it properly.

thanks.

-- 

-Hank

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: looking for a pure startup opportunity..

2005-07-03 Thread Hank
 i've read way too many articles about the 3 kids/guys/etc... who managed to
 get $10 million in funding for essentially a basic idea, but they had/have
 traffic/eyeballs!!!

Welcome to 1999.  Blind reliance on Traffic and Eyeballs  as a
business plan was what (in part) caused the great runups in valuations
that preceded the Internet crash.

What counts, and what draws the venture capital, is a well-thought-out
business plan based on traditional and well-proven business models
(i.e. selling stuff with a healthy margin, offering in-demand
services, etc).

And even if you could get 10 million in funding, you think you and
your partner get to walk away with 5 million each?  That's not how it
works.  They invest that money into the COMPANY, and expect YOU to
turn 10 million into at least 50 million.  They also OWN YOU during
that time - it's no cakewalk - just ask any VC funded startup what
they think of having VC investors.  Getting funding is only the first
step in a very long road to eventually selling out, IF you're able to
grow the company 5x or more.

 so, if you're looking at your shrinking retirement going to iraq

That statement makes no sense whatsoever.

 but with the right combination of web development skills
 (perl/php/c/apache/linux/mysql/html/css) we can do some damage. there are
 numerous opportunities for the right 3-5 person team!

For a serious business, it takes a lot more than just raw development
skills... like marketing, sales, management, finance, etc.   I've read
too many stories of smart guys who created amazing cutting-edge
technology (i.e. Apple), only to lose it all because they had no idea
how to correctly market or sell it.

Anyway, this has little to do with mysql.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



classic outer join problem

2005-05-24 Thread Hank
I have a table of school class assignments with the following fields -
the first four fields are the primary key:

Year (int) 
Term (enum, Spring, Fall,Winter)
ClassID (int)
SectionID (int)
Attachment (varchar(225))

The attachment field is a pointer to an uploaded file in the OS
filesystem, in a format like this:
/uploads/2003/Fall/330/1/conversions.doc

When an old class item is imported into a new class item for a new
year/term/class/section (new record), the attachment field is copied
over, and the OS file stays where it is.. the attachment field value
is then a pointer the to the OS file for a previous year/term (i.e.
the OS file is not copied to the new location filesystem structure).

I've been trying to construct a self-joining query to list all the
attachments which are NOT referenced by some future
year/term/class/section.   The desired query result is a list of files
I can DELETE - i.e. files not imported or being pointed to by any
other class_item record in a different year/term.

Keep in mind that pointer to files in the same Year/Term (but
different class/section) are NOT to be deleted.

The system currently has MySQL version 4.0.1, so I can't use
subqueries (i.e. NOT IN (...)).
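
I assume the answer is some variation of the LEFT JOIN / IS NULL anti-join,
something like this (class_items is a placeholder for the real table name):

SELECT DISTINCT a.Attachment
FROM class_items a
LEFT JOIN class_items b
       ON b.Attachment = a.Attachment
      AND NOT (b.Year = a.Year AND b.Term = a.Term)
WHERE b.Attachment IS NULL;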

Any suggestions would be greatly appreciated. thanks.
-Hank

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Unique Index on multiple columns that can contain NULL in MySQL

2005-05-04 Thread Hank
I think you should review the very recent thread why NOT NULL in
PRIMARY key?? which might shed some light on your particular issue.

In a nutshell, NULL!=NULL, so the database engine can not detect the
duplicate rows, as is expected.
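
A two-line demonstration (hypothetical table):

CREATE TABLE t (a INT, b INT, UNIQUE KEY ab (a, b));
INSERT INTO t VALUES (1, NULL);
INSERT INTO t VALUES (1, NULL);   -- no duplicate-key error, since NULL != NULL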

-Hank

On 5/4/05, Dennis Fogg [EMAIL PROTECTED] wrote:
 I'm getting lots of duplicate rows even though I have a
 unique index defined over multiple columns.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: zip code search within x miles

2005-04-19 Thread Hank
Talk about over-complicating things... here's the above query, simplified.

I cannot figure out why they were self-joining the table three times:

SELECT b.zip_code, b.state,
       (3956 * (2 * ASIN(SQRT(
           POWER(SIN(((a.lat-b.lat)*0.017453293)/2),2) +
           COS(a.lat*0.017453293) *
           COS(b.lat*0.017453293) *
           POWER(SIN(((a.lng-b.lng)*0.017453293)/2),2))))) AS distance
FROM zips a, zips b
WHERE a.zip_code = '90210'
GROUP BY distance
HAVING distance <= 5;


-Hank

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: zip code search within x miles

2005-04-19 Thread Hank
On 4/19/05, Keith Ivey [EMAIL PROTECTED] wrote:
 Also, the index on zip_code, latitude, and longitude doesn't
 make sense.  

Yeah - I didn't even notice the indexes in the table def (I used my
own existing zip code table).  That table def and query were obviously
created by someone pretty green with SQL.

-Hank

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: zip code search within x miles

2005-04-19 Thread Hank
 No, those indexes were intentional.  If you read the section of the manual
 on optimizing queries, you will encounter a page that mentions what are
 known as covering indexes.  The advantage to a covering index is that if
 your data is numeric and in the index, 

Except that the zip code field is not (and should not be) numeric, so
the qualification test fails.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: zip code search within x miles

2005-04-18 Thread Hank
 Let's say you've got, oh, 2000 records to search through.
 You're gonna end up doing a JOIN with:
 2,000 X 65,000 == 130,000,000 tuples (records/results).
 130 MILLION tuples is *way* too many for your basic $20/month site.

I'd say take some easy shortcuts first... like limit the join to the
zip code table by the target state (which of course is indexed), then
add the one, two, or at most three neighboring states if you're near a
border (or just limit the join to all the neighboring states in one
shot). One, two, or three short-running queries all in SQL is a
whole lot better than adding un-normalized and redundant fields to the
source data table and populating it with a cron job and triggers. Talk
about taking the long way around.

And if anyone is looking for a datafile with worldwide cities (about
2.7 million records) and their lat/longs (not zips, though), here's a
place to download it for free:

http://www.maxmind.com/app/worldcities

For $50, you can get the addition of population of each city.

-Hank


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: zip code search within x miles

2005-04-18 Thread Hank
 Applying this same thing to the 80k estimated US zipcodes
 currently 

Just for the record, there are about 43,000 distinct US zip codes...
and 56,000 zip codes if you double count the zips with multiple city
names (when zip codes cross city limits).

-Hank


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: problem with adding timestamp as a column

2005-02-28 Thread Hank
I'd suggest not using the keyword timestamp as a column name; use
ts or tstamp or something like that instead.

To update the records to the current timestamp:

update try set tstamp=null;

should do it.  I don't know why the default isn't working, though.




On Mon, 28 Feb 2005 10:05:32 -0500, Zhe Wang [EMAIL PROTECTED] wrote:
 Hi, there,
 
  I have MySQL 4.1.10. I need to add a timestamp column to an existing
 table. I am having a problem of setting the newly added column to be the
 current time.
 
  This is what I did:
 
 CREATE TABLE try (id INTEGER); INSERT INTO try VALUES(1), (2), (3);
 
 ALTER TABLE try ADD timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP;
 
 SELECT * FROM try;
 
 +------+---------------------+
 | id   | timestamp           |
 +------+---------------------+
 |    1 | 0000-00-00 00:00:00 |
 |    2 | 0000-00-00 00:00:00 |
 |    3 | 0000-00-00 00:00:00 |
 +------+---------------------+
 
   I've read the on-line manual regarding the change in timestamp, and still
  couldn't figure out why 0's got inserted instead of a meaningful current
  timestamp. I would greatly appreciate it if someone could let me know what
  the correct way is. Unfortunately I cannot recreate the table.
 
   Thank you very much!
 
 Regards,
 Zhe
 
 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
 
 


-- 

-Hank

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



fsockopen causing connection errors

2004-11-15 Thread Henry Hank

Hello,

  First, I tried searching the mysql list archives, but the archive search
function on mysql.com does not seem to be working.

Second, I'm using the PHP fsockopen() function on port 3306 to test if a
database is up and responsive. If the socket opens, I'm assuming the database
is up, and I then call fclose() to close the socket.
 
  The problem is that if I call this several times in a row quickly, the
database senses a connection error problem and refuses to connect until I
issue a flush hosts command. 

  Is there a more proper way to use fsockopen()/fclose() to simply ping the
database?  I do NOT want to use mysql_connect in this case - I just want to
ping 3306 without causing MySQL to count them as broken connections and
disconnect the client host. 

Thanks,

-Hank





-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



PHP/MySQL/Apache interaction problem

2004-01-05 Thread Henry Hank

I really need some help/direction, so I'll try to explain this as best I can.  

I have a PHP script that involves three very long running queries (2-3 minutes
each).  The users of my website kick this off from their webpage, so it runs as
nobody from inside Apache.  In order to prevent a user from running the same
script twice, I do a GET_LOCK call at the beginning of the script, and return
a message if it is already running. 
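
For reference, the locking wrapper is essentially this (the lock name is a
placeholder):

SELECT GET_LOCK('my_script_lock', 0);   -- 1 = acquired, 0 = already running
-- ... the three long-running queries go here ...
SELECT RELEASE_LOCK('my_script_lock');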

Everything worked perfectly for the last few years with this method, until last
week when my web server was upgraded to a 2.4 kernel, Apache 2 and PHP 4.3.3
(from a 2.2 kernel, Apache 1.3 and PHP 4.1.2).  The database server (mysql
3.23.41) is running on a different machine has not changed.

Under the new system, the PHP script runs, and fires off the first of the three
queries.  Exactly 2 minutes into the script running (while the first query is
still running), Apache times out and returns a blank page (expected), and I
lose the GET_LOCK lock, but the PHP script continues to run, and in fact the
other two queries also run and complete under the same mysql connection ID.  So
I'm not losing the mysql connection, and the PHP script continues to run, so
why am I losing the lock? 

I've run all the queries from the command line, and I can't replicate the
problem.  Where can I look to find where/why mysql is giving up the lock, when
the PHP script continues to run under the same connection ID?  (I can
understand the problem if the mysql connection dies and the script
auto-re-connects, but that isn't happening.).

Thanks,

-Hank






-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


