Re: [Maria-developers] Implementing implicit primary key in mysql server
Hi all, I implement a demo patch, based on 5.5.18. *1. CREATE TABLE* root@localhost : plx 11:54:46 create table test_no_pk (col1 varchar(32)); Query OK, 0 rows affected (0.01 sec) root@localhost : plx 11:55:05 desc test_no_pk; +--+-+--+-+-++ | Field| Type| Null | Key | Default | Extra | +--+-+--+-+-++ | col1 | varchar(32) | YES | | NULL|| | __row_id | bigint(20) | NO | PRI | NULL| auto_increment | +--+-+--+-+-++ 2 rows in set (0.01 sec) if users has not defined a PK, I will add it automatically. *2. ALTER TABLE* root@localhost : plx 11:55:10 alter table test_no_pk add id int, add primary key(id); Query OK, 0 rows affected (0.00 sec) Records: 0 Duplicates: 0 Warnings: 0 root@localhost : plx 11:57:02 desc test_no_pk; +---+-+--+-+-+---+ | Field | Type| Null | Key | Default | Extra | +---+-+--+-+-+---+ | col1 | varchar(32) | YES | | NULL| | | id| int(11) | NO | PRI | 0 | | +---+-+--+-+-+---+ 2 rows in set (0.01 sec) When users add a PK, I will remove implicit PK automatically. root@localhost : plx 11:57:07 alter table test_no_pk drop primary key; Query OK, 0 rows affected (0.00 sec) Records: 0 Duplicates: 0 Warnings: 0 root@localhost : plx 11:57:42 desc test_no_pk; +--+-+--+-+-++ | Field| Type| Null | Key | Default | Extra | +--+-+--+-+-++ | col1 | varchar(32) | YES | | NULL|| | id | int(11) | NO | | 0 || | __row_id | bigint(20) | NO | PRI | NULL| auto_increment | +--+-+--+-+-++ 3 rows in set (0.00 sec) When users dropped PK, I will add it automatically. *3. INSERT VALUES* root@localhost : plx 11:59:22 insert into test_no_pk values('abc',2); ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY' root@localhost : plx 11:59:23 insert into test_no_pk values('abc',4); ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY' it will report duplicate, *Sergei, can you help me to find why?* *4. SELECT ** root@localhost : plx 12:07:23 select * from test_no_pk; +--++ | col1 | id | +--++ | abc | 6 | +--++ 1 row in set (0.00 sec) root@localhost : plx 12:07:30 select __row_id from test_no_pk; +--+ | __row_id | +--+ |1 | +--+ 1 row in set (0.00 sec) When users run SELECT *, row_id will be filter. *5. SHOW CREATE* root@localhost : plx 12:07:35 show create table test_no_pk\G *** 1. row *** Table: test_no_pk Create Table: CREATE TABLE `test_no_pk` ( `col1` varchar(32) DEFAULT NULL, `id` int(11) NOT NULL DEFAULT '0' ) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8 1 row in set (0.00 sec) row_id will be hidden. Thanks, Lixun On Wed, Jun 26, 2013 at 10:59 AM, Lixun Peng pengli...@gmail.com wrote: Hi Jeremy, Thank you for your suggestion. I also want to just add the PK field for custom automatically, but some of our customs can't accept it. Because they are using SELECT * FROM table or INSERT INTO table VALUES(...) , if I add a visible PK for them, SELECT * will show this value, then their applications will report errors. So I have to set this field as an implicit filed. Thanks, Lixun On Sat, Jun 22, 2013 at 4:36 AM, Jeremy Cole jeremyc...@google.comwrote: Lixun, I've thought about this a bit and I'm not sure this will be very simple to do (or rather it's more complicated than it might seem). While I think it is not that hard to expose the __id field to replication internals, I think in order for this to really work it would need to be exposed to other tools, such as mysqldump. It is also unclear how to decide when it is safe to use this __id field (how to determine if it is in sync between master and slave). As an alternate suggestion, what about ignoring the current implicit PK behavior, and instead automatically adding a field using auto_increment when the user doesn't provide a PK: __id BIGINT UNSIGNED NOT NULL auto_increment, PRIMARY KEY(__id) Regards, Jeremy On Wed, Jun 19, 2013 at 2:40 AM, Lixun Peng pengli...@gmail.com wrote: Hi Sergei, You are right, let users add primary key is best. But I can't let users who don't want to create primary key can't use our MySQL service. Amazon RDS also allow users to create the tables without primary key, just change binlog_format to MIXED to solve replication problem. I think this feature is very useful for any MySQL cloud service providers, all of them will face this
Re: [Maria-developers] Implementing implicit primary key in mysql server
it makes hardly sense to add a primary key not used in select statements this way and leads only in a lot of problems and wasted space / performance it is common practice to remove keys before large bulk inserts and add the key *after* inserting the data which would not work with the expected benefit with your patch Am 05.07.2013 06:08, schrieb Lixun Peng: Hi all, I implement a demo patch, based on 5.5.18. *1. CREATE TABLE* root@localhost : plx 11:54:46 create table test_no_pk (col1 varchar(32)); Query OK, 0 rows affected (0.01 sec) root@localhost : plx 11:55:05 desc test_no_pk; +--+-+--+-+-++ | Field| Type| Null | Key | Default | Extra | +--+-+--+-+-++ | col1 | varchar(32) | YES | | NULL|| | __row_id | bigint(20) | NO | PRI | NULL| auto_increment | +--+-+--+-+-++ 2 rows in set (0.01 sec) if users has not defined a PK, I will add it automatically. *2. ALTER TABLE* root@localhost : plx 11:55:10 alter table test_no_pk add id int, add primary key(id); Query OK, 0 rows affected (0.00 sec) Records: 0 Duplicates: 0 Warnings: 0 root@localhost : plx 11:57:02 desc test_no_pk; +---+-+--+-+-+---+ | Field | Type| Null | Key | Default | Extra | +---+-+--+-+-+---+ | col1 | varchar(32) | YES | | NULL| | | id| int(11) | NO | PRI | 0 | | +---+-+--+-+-+---+ 2 rows in set (0.01 sec) When users add a PK, I will remove implicit PK automatically. root@localhost : plx 11:57:07 alter table test_no_pk drop primary key; Query OK, 0 rows affected (0.00 sec) Records: 0 Duplicates: 0 Warnings: 0 root@localhost : plx 11:57:42 desc test_no_pk; +--+-+--+-+-++ | Field| Type| Null | Key | Default | Extra | +--+-+--+-+-++ | col1 | varchar(32) | YES | | NULL|| | id | int(11) | NO | | 0 || | __row_id | bigint(20) | NO | PRI | NULL| auto_increment | +--+-+--+-+-++ 3 rows in set (0.00 sec) When users dropped PK, I will add it automatically. *3. INSERT VALUES* root@localhost : plx 11:59:22 insert into test_no_pk values('abc',2); ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY' root@localhost : plx 11:59:23 insert into test_no_pk values('abc',4); ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY' it will report duplicate, *Sergei, can you help me to find why?* *4. SELECT ** root@localhost : plx 12:07:23 select * from test_no_pk; +--++ | col1 | id | +--++ | abc | 6 | +--++ 1 row in set (0.00 sec) root@localhost : plx 12:07:30 select __row_id from test_no_pk; +--+ | __row_id | +--+ |1 | +--+ 1 row in set (0.00 sec) When users run SELECT *, row_id will be filter. *5. SHOW CREATE* root@localhost : plx 12:07:35 show create table test_no_pk\G *** 1. row *** Table: test_no_pk Create Table: CREATE TABLE `test_no_pk` ( `col1` varchar(32) DEFAULT NULL, `id` int(11) NOT NULL DEFAULT '0' ) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8 1 row in set (0.00 sec) row_id will be hidden. Thanks, Lixun On Wed, Jun 26, 2013 at 10:59 AM, Lixun Peng pengli...@gmail.com mailto:pengli...@gmail.com wrote: Hi Jeremy, Thank you for your suggestion. I also want to just add the PK field for custom automatically, but some of our customs can't accept it. Because they are using SELECT * FROM table or INSERT INTO table VALUES(...) , if I add a visible PK for them, SELECT * will show this value, then their applications will report errors. So I have to set this field as an implicit filed. Thanks, Lixun On Sat, Jun 22, 2013 at 4:36 AM, Jeremy Cole jeremyc...@google.com mailto:jeremyc...@google.com wrote: Lixun, I've thought about this a bit and I'm not sure this will be very simple to do (or rather it's more complicated than it might seem). While I think it is not that hard to expose the __id field to replication internals, I think in order for this to really work it would need to be exposed to other tools, such as mysqldump. It is also unclear how to decide when it is safe to use this __id field (how to determine if it is in sync between master and slave). As an alternate suggestion, what about ignoring the current implicit PK behavior, and instead
Re: [Maria-developers] Implementing implicit primary key in mysql server
Yes, for normal user is un-meaningful. The case is our MySQL Cloud Service, so many users are using our MySQL db for CMS or other programs. And it usually has no primary key or any unique keys, so it makes me headache. I think you know, if binlog_format=ROW, and tables have no any unique keys, what will happen. Now we just change binlog_forma=MIXED to avoid this problem. But our middleware need ROW format, so it still makes me headache. Because our users don't want to add PK by themselves, they don't know how to modify their application, they just download it and install in their web server. So I want to add a implicit Primary Key for each tables that have no unique keys. Then we can use ROW binlog format, so many problems will be solved. On Fri, Jul 5, 2013 at 12:15 PM, Reindl Harald h.rei...@thelounge.netwrote: it makes hardly sense to add a primary key not used in select statements this way and leads only in a lot of problems and wasted space / performance it is common practice to remove keys before large bulk inserts and add the key *after* inserting the data which would not work with the expected benefit with your patch Am 05.07.2013 06:08, schrieb Lixun Peng: Hi all, I implement a demo patch, based on 5.5.18. *1. CREATE TABLE* root@localhost : plx 11:54:46 create table test_no_pk (col1 varchar(32)); Query OK, 0 rows affected (0.01 sec) root@localhost : plx 11:55:05 desc test_no_pk; +--+-+--+-+-++ | Field| Type| Null | Key | Default | Extra | +--+-+--+-+-++ | col1 | varchar(32) | YES | | NULL|| | __row_id | bigint(20) | NO | PRI | NULL| auto_increment | +--+-+--+-+-++ 2 rows in set (0.01 sec) if users has not defined a PK, I will add it automatically. *2. ALTER TABLE* root@localhost : plx 11:55:10 alter table test_no_pk add id int, add primary key(id); Query OK, 0 rows affected (0.00 sec) Records: 0 Duplicates: 0 Warnings: 0 root@localhost : plx 11:57:02 desc test_no_pk; +---+-+--+-+-+---+ | Field | Type| Null | Key | Default | Extra | +---+-+--+-+-+---+ | col1 | varchar(32) | YES | | NULL| | | id| int(11) | NO | PRI | 0 | | +---+-+--+-+-+---+ 2 rows in set (0.01 sec) When users add a PK, I will remove implicit PK automatically. root@localhost : plx 11:57:07 alter table test_no_pk drop primary key; Query OK, 0 rows affected (0.00 sec) Records: 0 Duplicates: 0 Warnings: 0 root@localhost : plx 11:57:42 desc test_no_pk; +--+-+--+-+-++ | Field| Type| Null | Key | Default | Extra | +--+-+--+-+-++ | col1 | varchar(32) | YES | | NULL|| | id | int(11) | NO | | 0 || | __row_id | bigint(20) | NO | PRI | NULL| auto_increment | +--+-+--+-+-++ 3 rows in set (0.00 sec) When users dropped PK, I will add it automatically. *3. INSERT VALUES* root@localhost : plx 11:59:22 insert into test_no_pk values('abc',2); ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY' root@localhost : plx 11:59:23 insert into test_no_pk values('abc',4); ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY' it will report duplicate, *Sergei, can you help me to find why?* *4. SELECT ** root@localhost : plx 12:07:23 select * from test_no_pk; +--++ | col1 | id | +--++ | abc | 6 | +--++ 1 row in set (0.00 sec) root@localhost : plx 12:07:30 select __row_id from test_no_pk; +--+ | __row_id | +--+ |1 | +--+ 1 row in set (0.00 sec) When users run SELECT *, row_id will be filter. *5. SHOW CREATE* root@localhost : plx 12:07:35 show create table test_no_pk\G *** 1. row *** Table: test_no_pk Create Table: CREATE TABLE `test_no_pk` ( `col1` varchar(32) DEFAULT NULL, `id` int(11) NOT NULL DEFAULT '0' ) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8 1 row in set (0.00 sec) row_id will be hidden. Thanks, Lixun On Wed, Jun 26, 2013 at 10:59 AM, Lixun Peng pengli...@gmail.commailto: pengli...@gmail.com wrote: Hi Jeremy, Thank you for your suggestion. I also want to just add the PK field for custom automatically, but some of our customs can't accept it. Because they are using SELECT * FROM table or INSERT INTO table VALUES(...) , if I add a visible PK for them,
Re: [Maria-developers] Implementing implicit primary key in mysql server
Am 05.07.2013 06:32, schrieb Lixun Peng: Yes, for normal user is un-meaningful it is not only un-meaningful it is harmful for people knwoing what they are doing by massive overhead with no benefit - hence after a bulk insert your implicit key has to be removed while as example the intented unique key on a varchar added The case is our MySQL Cloud Service, so many users are using our MySQL db for CMS or other programs. And it usually has no primary key or any unique keys, so it makes me headache. i doubt you can solve social problems with tech I think you know, if binlog_format=ROW, and tables have no any unique keys, what will happen. Now we just change binlog_forma=MIXED to avoid this problem. But our middleware need ROW format, so it still makes me headache. Because our users don't want to add PK by themselves, they don't know how to modify their application, they just download it and install in their web server. So I want to add a implicit Primary Key for each tables that have no unique keys. Then we can use ROW binlog format, so many problems will be solved. corner cases - this at least needs to be enabled via my.cnf and must not affect users with well database designs and shoot them in the leg due bulk inserts On Fri, Jul 5, 2013 at 12:15 PM, Reindl Harald h.rei...@thelounge.net mailto:h.rei...@thelounge.net wrote: it makes hardly sense to add a primary key not used in select statements this way and leads only in a lot of problems and wasted space / performance it is common practice to remove keys before large bulk inserts and add the key *after* inserting the data which would not work with the expected benefit with your patch Am 05.07.2013 06:08, schrieb Lixun Peng: Hi all, I implement a demo patch, based on 5.5.18. *1. CREATE TABLE* root@localhost : plx 11:54:46 create table test_no_pk (col1 varchar(32)); Query OK, 0 rows affected (0.01 sec) root@localhost : plx 11:55:05 desc test_no_pk; +--+-+--+-+-++ | Field| Type| Null | Key | Default | Extra | +--+-+--+-+-++ | col1 | varchar(32) | YES | | NULL|| | __row_id | bigint(20) | NO | PRI | NULL| auto_increment | +--+-+--+-+-++ 2 rows in set (0.01 sec) if users has not defined a PK, I will add it automatically. *2. ALTER TABLE* root@localhost : plx 11:55:10 alter table test_no_pk add id int, add primary key(id); Query OK, 0 rows affected (0.00 sec) Records: 0 Duplicates: 0 Warnings: 0 root@localhost : plx 11:57:02 desc test_no_pk; +---+-+--+-+-+---+ | Field | Type| Null | Key | Default | Extra | +---+-+--+-+-+---+ | col1 | varchar(32) | YES | | NULL| | | id| int(11) | NO | PRI | 0 | | +---+-+--+-+-+---+ 2 rows in set (0.01 sec) When users add a PK, I will remove implicit PK automatically. root@localhost : plx 11:57:07 alter table test_no_pk drop primary key; Query OK, 0 rows affected (0.00 sec) Records: 0 Duplicates: 0 Warnings: 0 root@localhost : plx 11:57:42 desc test_no_pk; +--+-+--+-+-++ | Field| Type| Null | Key | Default | Extra | +--+-+--+-+-++ | col1 | varchar(32) | YES | | NULL|| | id | int(11) | NO | | 0 || | __row_id | bigint(20) | NO | PRI | NULL| auto_increment | +--+-+--+-+-++ 3 rows in set (0.00 sec) When users dropped PK, I will add it automatically. *3. INSERT VALUES* root@localhost : plx 11:59:22 insert into test_no_pk values('abc',2); ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY' root@localhost : plx 11:59:23 insert into test_no_pk values('abc',4); ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY' it will report duplicate, *Sergei, can you help me to find why?* *4. SELECT ** root@localhost : plx 12:07:23 select * from test_no_pk; +--++ | col1 | id | +--++ | abc | 6 | +--++ 1 row in set (0.00 sec) root@localhost : plx 12:07:30 select __row_id from test_no_pk; +--+ | __row_id | +--+ |1 | +--+ 1 row in set (0.00 sec)
Re: [Maria-developers] Implementing implicit primary key in mysql server
Hi Jeremy, Thank you for your suggestion. I also want to just add the PK field for custom automatically, but some of our customs can't accept it. Because they are using SELECT * FROM table or INSERT INTO table VALUES(...) , if I add a visible PK for them, SELECT * will show this value, then their applications will report errors. So I have to set this field as an implicit filed. Thanks, Lixun On Sat, Jun 22, 2013 at 4:36 AM, Jeremy Cole jeremyc...@google.com wrote: Lixun, I've thought about this a bit and I'm not sure this will be very simple to do (or rather it's more complicated than it might seem). While I think it is not that hard to expose the __id field to replication internals, I think in order for this to really work it would need to be exposed to other tools, such as mysqldump. It is also unclear how to decide when it is safe to use this __id field (how to determine if it is in sync between master and slave). As an alternate suggestion, what about ignoring the current implicit PK behavior, and instead automatically adding a field using auto_increment when the user doesn't provide a PK: __id BIGINT UNSIGNED NOT NULL auto_increment, PRIMARY KEY(__id) Regards, Jeremy On Wed, Jun 19, 2013 at 2:40 AM, Lixun Peng pengli...@gmail.com wrote: Hi Sergei, You are right, let users add primary key is best. But I can't let users who don't want to create primary key can't use our MySQL service. Amazon RDS also allow users to create the tables without primary key, just change binlog_format to MIXED to solve replication problem. I think this feature is very useful for any MySQL cloud service providers, all of them will face this problem in one day. I will try to do some analysis/research in this feature implement first, I will need your help :-) I will notice any new updates in this email. Thanks, Lixun On Wed, Jun 19, 2013 at 5:14 PM, Sergei Golubchik s...@mariadb.orgwrote: Hi, Lixun! On Jun 18, Lixun Peng wrote: Hi, As we know, InnoDB has implicit primary key if a table hasn't defined a primary key. However MySQL server doesn't know this primary key, so this primary key will not apear in binlog. When we are using ROW format binlog for replication, if a table has no any indexes, that's a disaster. If a table without indexes do a DML (UPDATE/DELETE), of course it will run a long time in master, but in slave, it still need a long time. It will cause serious slave replication delay. ... I think mysql server can do the same thing as InnoDB do, if user doesn't define the primary key, mysql can add the primary key automatically. How do you think? Well, that's doable. A much easier solution would be to require a user to create a primary key. It's a one-line change: - Table_flags ha_table_flags() const { return cached_table_flags; } + Table_flags ha_table_flags() const { return cached_table_flags | HA_REQUIRE_PRIMARY_KEY; } But what you suggest is possible too, I believe. Regards, Sergei -- Senior Database Engineer @ Taobao.com Mobile Phone: +86 18658156856 (Hangzhou) Gtalk: penglixun(at)gmail.com Twitter: http://www.twitter.com/plinux Blog: http://www.penglixun.com ___ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp -- Senior Database Engineer @ Taobao.com Mobile Phone: +86 18658156856 (Hangzhou) Gtalk: penglixun(at)gmail.com Twitter: http://www.twitter.com/plinux Blog: http://www.penglixun.com ___ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
[Maria-developers] Implementing implicit primary key in mysql server
Hi, As we know, InnoDB has implicit primary key if a table hasn't defined a primary key. However MySQL server doesn't know this primary key, so this primary key will not apear in binlog. When we are using ROW format binlog for replication, if a table has no any indexes, that's a disaster. If a table without indexes do a DML (UPDATE/DELETE), of course it will run a long time in master, but in slave, it still need a long time. It will cause serious slave replication delay. For example: binlog_format=ROW; CREATE TABLE table1 ( col1 varchar(32), col2 int ); table1 hasn't any indexes, if table1 has 100 million rows, I do a query UPDATE table1 SET col1='xxx' WHERE col2=100, and this query will effect 1 million rows. Then it has 1 million UPDATE events in binlog. After slave received these events, each events should scan all rows!!! It will cause serious replication delay! I saw:https://kb.askmonty.org/en/row-based-replication-with-no-primary-key/ But if no any indexes, it still a problem. I think mysql server can do the same thing as InnoDB do, if user doesn't define the primary key, mysql can add the primary key automatically. For example: If user create a table like table1, we can do ALTER TABLE table1 ADD __id bigint auto_increment, ADD PRIMARY KEY(__id), so in fact, the table1 structure is CREATE TABLE table1 ( col1 varchar(32), col2 int, __id bigint auto_increment primary key ); But when user do SELECT * FROM table xxx or INSERT INTO table VALUES(xx), we should ignore implicit primary key, __id will not apear in the result set. How do you think? Thanks, Lixun -- Senior Database Engineer @ Taobao.com Mobile Phone: +86 18658156856 (Hangzhou) Gtalk: penglixun(at)gmail.com Twitter: http://www.twitter.com/plinux Blog: http://www.penglixun.com ___ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp