Re: Unique ID's across multiple databases
I may have missed what you are trying to do here. NoSQL is really a bad name; it should be renamed NoREL instead. NoSQL implementations are not used just because of the limitations of traditional RDBMSs under sheer traffic volume; they are also used because they scale horizontally very well. When I was reading through all these emails, it seemed you would have 200+ nodes here with the SAME data. Maybe that's not what you are trying to do? I also did not know the software for maintaining the data was already in place.

JW

On Tue, Sep 14, 2010 at 12:57 AM, Johan De Meersman vegiv...@tuxera.be wrote:

> On Mon, Sep 13, 2010 at 8:59 PM, Johnny Withers joh...@pixelated.net wrote:
>> This sounds like a good job for a 'NoSQL' system. Maybe?
>
> I can't help but blink at that. How exactly is NoSQL going to fix issues that are related to topology, not inherent SQL limitations? Which particular incarnation of NoSQL are you thinking of?
>
> Also, I suspect rewriting all 200 clients' services is not going to be an option :-)
>
> --
> Bier met grenadyn
> Is als mosterd by den wyn
> Sy die't drinkt, is eene kwezel
> Hy die't drinkt, is ras een ezel

--
Johnny Withers
601.209.4985
joh...@pixelated.net
Re: Unique ID's across multiple databases
On Sun, Sep 12, 2010 at 9:45 PM, Kiss Dániel n...@dinagon.com wrote:

> The offset + increment thingy is good if you know in advance that you'll have a limited number of servers. But if you have no idea whether you will have 2, 20, or 200 servers in your array in the future, you just can't pick an optimal

What benefit do you think you will reap from that many masters? Don't forget that every write still has to be done on every server, so you're not actually distributing that load; for reads, you only need simple slaves.

--
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel
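For readers skimming the archive, the offset/increment trick under discussion can be sketched as follows. This is my own illustration in plain Python rather than MySQL configuration; `id_sequence` is a hypothetical helper, not anything from the thread:

```python
# Sketch of how auto_increment_increment / auto_increment_offset keep
# per-server ID sequences disjoint. With increment N and offsets 1..N,
# server k hands out k, k+N, k+2N, ... -- collision-free, but N must be
# chosen up front, which is exactly the limitation Kiss points out.

def id_sequence(offset, increment, count):
    """IDs a server with these auto-increment settings would generate."""
    return [offset + i * increment for i in range(count)]

increment = 5  # must be >= the maximum number of servers, forever
servers = {k: id_sequence(k, increment, 4) for k in range(1, 4)}
all_ids = [i for seq in servers.values() for i in seq]
assert len(all_ids) == len(set(all_ids))  # no collisions across servers
print(servers)
```

Adding a sixth server later would require raising the increment on every existing server, which is why the approach doesn't scale to an unknown future server count.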
Re: Unique ID's across multiple databases
This is actually more for failover scenarios where databases are spread across multiple locations with unreliable internet connections, but you want to keep every single location working even when it is cut off from the other databases. The primary purpose is not load distribution.
Re: Unique ID's across multiple databases
Hmm, that's a very interesting scenario, indeed. One bad connection will break the chain, though, so in effect you'll be multiplying the disconnection rate... I think you'd be better off with a star topology, but MySQL unfortunately only allows ring types. This is gonna require some good thinking on your part :-)
Re: Unique ID's across multiple databases
I could be way off here, but how about letting your unique ID be a calculated column: the server's MAC address concatenated with an auto-increment ID column? I hope this helps...

~~Fish~~
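Fish's suggestion can be sketched roughly like this (my own illustration, not code from the thread; `next_id` and the in-memory counter are stand-ins for an AUTO_INCREMENT column):

```python
# Prefix a local auto-increment counter with the host's MAC address to
# get IDs that are unique across servers without any coordination.
# Caveat: uuid.getnode() returns a random 48-bit number (with the
# multicast bit set) when no MAC can be determined, and MACs are not
# guaranteed unique across VMs or cloned machines.
import itertools
import uuid

mac = uuid.getnode()           # 48-bit MAC address of this host
counter = itertools.count(1)   # stand-in for an AUTO_INCREMENT column

def next_id():
    # 48 bits of MAC + 32 bits of counter -> one 80-bit integer
    return (mac << 32) | next(counter)

a, b = next_id(), next_id()
assert b - a == 1              # locally still incremental
assert a >> 32 == mac          # globally prefixed by the server
```

Note that an 80-bit key is already larger than a BIGINT, which foreshadows the size objections raised later in the thread.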
Re: Unique ID's across multiple databases
I had some coffee and realized that using a UUID might actually be something to look at. There have been quite a few discussions about using a UUID as a unique ID, and it does have some gotchas. Just Google: mysql uuid

Have a great day,
~~Fish~~
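For concreteness, here is a minimal look at the UUID option (illustration only, not from the thread):

```python
# UUIDs are unique without any per-server configuration, but each key
# is 16 bytes -- one of the gotchas the linked discussions cover, since
# a 4-byte INT auto_increment does the same job on a single server.
import uuid

ids = [uuid.uuid4() for _ in range(5)]
assert len(set(ids)) == 5          # unique with zero coordination
assert len(ids[0].bytes) == 16     # 16-byte keys vs. a 4-byte INT
print(ids[0])
```

Random (version 4) UUIDs also arrive in no particular order, so inserts scatter across the primary key index instead of appending to its end.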
Re: Unique ID's across multiple databases
Hell, yeah. :)

Actually, the ID system I described works quite well according to my tests. I feel very comfortable with it, both from the primary-key-size and the dynamically-increasable-database-number points of view. What I actually don't like about it is the concatenated unique ID (ID + SID) pairs. Using two fields for primary and foreign keys is not the most convenient, to say the least. :)

I am just wondering if anyone has a better idea that fulfills the requirements (small index size, a dynamically increasable number of databases in the array, incremental-like IDs that are optimal for the MySQL indexing engine) and avoids this silly drawback. :)
Re: Unique ID's across multiple databases
Well, thanks, but I'm afraid using UUIDs (even with hex compression) is kind of suicide when it comes to performance. This is a good summary of the issues:
http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/

So, some kind of auto_increment or sequencing must be the optimal solution here.
RE: Unique ID's across multiple databases
-----Original Message-----
From: Kiss Dániel [mailto:n...@dinagon.com]
Sent: Sunday, September 12, 2010 1:47 PM
To: mysql@lists.mysql.com; replicat...@lists.mysql.com
Subject: Unique ID's across multiple databases

Hi,

I'm designing a master-to-master replication architecture. I wonder what the best way is to make sure both databases generate unique row IDs, so there won't be ID conflicts when replicating in both directions.

I have read forum posts about the pros and cons of using UUIDs, and also about setting the *auto-increment-increment* and *auto-increment-offset* system variables. I'm not a fan of UUIDs for this purpose. They are too big, slow to generate, and they don't 100% guarantee global uniqueness anyway. On the other hand, the *auto-increment-offset* trick limits the number of possible databases in the system in advance, which is not so nice.

[JS] UUIDs are likely enough to be unique that you don't have to worry about it. I can't speak to the speed.

Regards,

Jerry Schwartz
Global Information Incorporated
195 Farmington Ave.
Farmington, CT 06032
860.674.8796 / FAX: 860.674.8341
E-mail: je...@gii.co.jp
Web site: www.the-infoshop.com

So, I'm thinking about a composite unique ID system, where the value of the *server-id* system variable is always part of the ID. The server-id has to be set uniquely anyway for any replication architecture in MySQL.
A sample would look like this:

CREATE TABLE SampleParents (
  ID INT UNSIGNED NOT NULL AUTO_INCREMENT,
  SID SMALLINT UNSIGNED NOT NULL,
  SampleData VARCHAR(50) NOT NULL,
  PRIMARY KEY (ID, SID)
) ENGINE=InnoDB;

CREATE TABLE SampleChildren (
  ID INT UNSIGNED NOT NULL AUTO_INCREMENT,
  SID SMALLINT UNSIGNED NOT NULL,
  SampleParentID INT UNSIGNED NOT NULL,
  SampleParentSID SMALLINT UNSIGNED NOT NULL,
  SampleData VARCHAR(50) NOT NULL,
  PRIMARY KEY (ID, SID),
  KEY (SampleParentID, SampleParentSID),
  CONSTRAINT FOREIGN KEY (SampleParentID, SampleParentSID) REFERENCES SampleParents (ID, SID)
) ENGINE=InnoDB;

Here SID is always filled with the value of the @@server_id global variable on the server where the record is created. As I see it, this has the following advantages:

- The ID+SID pair guarantees primary key uniqueness within the replication array.
- Auto_increment is a magnitude faster than generating UUIDs.
- SID adds only 2 bytes in this case to the size of the primary key. It can even be 1 byte if I'm sure I'll never exceed 255 servers. Either way, it is still far smaller than the 16 bytes of a UUID field, even compared to BIGINTs.
- It keeps the efficiency of primary key indexing, because the key is still very close to a strictly incremental value.

You can see the obvious disadvantage:

- Primary and foreign keys are always double fields. This is not so convenient when you are joining tables and adding WHERE clauses to your queries. It might even negatively affect the evaluation speed of join conditions, although I have no idea yet how much. (My gut feeling is that it's not a big issue, thanks to MySQL's good query optimizer.)

My question is: does anyone have a better idea how to approach this problem?

Thanks,

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=arch...@jab.org
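For what it's worth, the (ID, SID) scheme above can be simulated in a few lines. This is my own sketch in Python; `Server`, `insert`, and the in-memory counter are stand-ins for a MySQL instance, its AUTO_INCREMENT column, and @@server_id, not real MySQL code:

```python
# Each server pairs its local auto-increment counter with its own
# server_id, so composite keys never collide between servers and each
# server's keys remain nearly incremental (good for InnoDB indexing).
import itertools

class Server:
    def __init__(self, server_id):
        self.sid = server_id                  # stand-in for @@server_id
        self._auto_inc = itertools.count(1)   # stand-in for AUTO_INCREMENT

    def insert(self):
        return (next(self._auto_inc), self.sid)  # composite (ID, SID)

a, b = Server(1), Server(2)
rows = [a.insert(), a.insert(), b.insert()]
assert rows == [(1, 1), (2, 1), (1, 2)]
assert len(set(rows)) == len(rows)  # unique across the whole array
```

A new server only needs a fresh, unique server_id, which is exactly why the scheme scales to an unknown number of future servers where the offset/increment trick does not.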
RE: Unique ID's across multiple databases
-----Original Message-----
From: vegiv...@gmail.com [mailto:vegiv...@gmail.com] On Behalf Of Johan De Meersman
Sent: Monday, September 13, 2010 7:27 AM
To: Kiss Dániel
Cc: Max Schubert; mysql@lists.mysql.com; replicat...@lists.mysql.com
Subject: Re: Unique ID's across multiple databases

[JS] It sounds like you are trying to protect against a regional disaster. This is precisely the type of scenario for which NAS or FibreChannel is used: you let the storage medium take care of replication. Typically you'd only need two units, perhaps on opposite sides of the country, using FibreChannel over IP.

I've been out of this market (sales/support side) for many years, so I don't know what the current technology costs, but if you can afford it, that is the way to go. It will make your life much simpler.

Regards,

Jerry Schwartz
RE: Unique ID's across multiple databases
-----Original Message-----
From: Kiss Dániel [mailto:n...@dinagon.com]
Sent: Monday, September 13, 2010 11:49 AM
To: Jerry Schwartz
Cc: Johan De Meersman; Max Schubert; mysql@lists.mysql.com; replicat...@lists.mysql.com
Subject: Re: Unique ID's across multiple databases

> Well, not exactly. I do not own all the databases. Some of them are located at customers' sites, and some are in my data warehouse. So neither NAS nor Fibre Channel is a solution in this case.

[JS] Then you have a mess on your hands. Are you going to be mirroring these databases separately for each customer? I wish you well.

Regards,

Jerry Schwartz
Re: Unique ID's across multiple databases
This sounds like a good job for a 'NoSQL' system. Maybe?

JW
Re: Unique ID's across multiple databases
Well, that would be the plan, yes. :-) Anyway, I'll get over the problem sooner or later. :-)
RE: Unique ID's across multiple databases
From: Kiss Dániel [mailto:n...@dinagon.com] Sent: Monday, September 13, 2010 3:17 PM To: Jerry Schwartz Cc: Johan De Meersman; Max Schubert; mysql@lists.mysql.com; replicat...@lists.mysql.com Subject: Re: Unique ID's across multiple databases Well, that would be the plan, yes. :-) Anyway, I'll get over the problem sooner or later. :-) [JS] I guess I don't grasp your topology at all. If you are going to be replicating each database separately, who cares whether or not the IDs are unique across databases? Were you thinking of commingling the data from all of these individual databases into one big database? That doesn't make sense to me from a fail-over standpoint. Depending upon the industry and their level of paranoia (by inclination or by regulation), you might have to have a separate remote SYSTEM (not database) per customer. Regards, Jerry Schwartz Global Information Incorporated 195 Farmington Ave. Farmington, CT 06032 860.674.8796 / FAX: 860.674.8341 E-mail: je...@gii.co.jp Web site: www.the-infoshop.com On Mon, Sep 13, 2010 at 8:46 PM, Jerry Schwartz je...@gii.co.jp wrote: -Original Message- From: Kiss Dániel [mailto:n...@dinagon.com] Sent: Monday, September 13, 2010 11:49 AM To: Jerry Schwartz Cc: Johan De Meersman; Max Schubert; mysql@lists.mysql.com; replicat...@lists.mysql.com Subject: Re: Unique ID's across multiple databases Well, not exactly. I do not own all the databases. Some of them are placed at customers, some of them are at my data warehouse. So, neither NAS or Fibre Channel is a solution in this case. [JS] Then you have a mess on your hands. Are you going to be mirroring these databases separately for each customer? I wish you well. Regards, Jerry Schwartz Global Information Incorporated 195 Farmington Ave. 
Farmington, CT 06032 860.674.8796 / FAX: 860.674.8341 E-mail: je...@gii.co.jp Web site: www.the-infoshop.com On Mon, Sep 13, 2010 at 4:30 PM, Jerry Schwartz je...@gii.co.jp wrote: -Original Message- From: vegiv...@gmail.com [mailto:vegiv...@gmail.com] On Behalf Of Johan De Meersman Sent: Monday, September 13, 2010 7:27 AM To: Kiss Dániel Cc: Max Schubert; mysql@lists.mysql.com; replicat...@lists.mysql.com Subject: Re: Unique ID's across multiple databases Hmm, that's a very interesting scenario, indeed. One bad connection will break the chain, though, so in effect you'll be multiplying the disconnecting rate... I think you'd be better of with a star topology, but MySQL unfortunately only allows ring-types. This is gonna require some good thinking on your part :-) [JS] It sounds like you are trying to protect against a regional disaster. This is precisely the type of scenario for which NAS or FibreChannel is used. You let the storage medium take care of replication. Typically you'd only need two units, perhaps on opposite sides of the country, using FibreChannel over IP. I've been out of this market (sales/support side) for many years, so I don't know what the current technology costs, but if you can afford it that is the way to go. It will make your life much simpler. Regards, Jerry Schwartz Global Information Incorporated 195 Farmington Ave. Farmington, CT 06032 860.674.8796 / FAX: 860.674.8341 E-mail: je...@gii.co.jp Web site: www.the-infoshop.com On Mon, Sep 13, 2010 at 12:28 PM, Kiss Dániel n...@dinagon.com wrote: This is actually more for failover scenarios where databases are spread in multiple locations with unreliable internet connections. But you want to keep every single location working even when they are cut off from the other databases. The primary purpose is not load distribution. 
On Mon, Sep 13, 2010 at 12:03 PM, Johan De Meersman vegiv...@tuxera.be wrote:

On Sun, Sep 12, 2010 at 9:45 PM, Kiss Dániel n...@dinagon.com wrote: offset + increment thingy is good if you know in advance that you'll have a limited number of servers. But if you have no idea that you will have 2, 20, or 200 servers in your array in the future, you just can't pick an optimal

What benefit do you think you will reap from that many masters? Don't forget that every write still has to be done on every server, so you're not actually distributing that load; while for reads you only need simple slaves.

-- Bier met grenadyn Is als mosterd by den wyn Sy die't drinkt, is eene kwezel Hy die't drinkt, is ras een ezel

-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql?unsub=arch...@jab.org
RE: Unique ID's across multiple databases
-Original Message- From: Kiss Dániel [mailto:n...@dinagon.com] Sent: Monday, September 13, 2010 5:59 AM

Well, thanks, but I'm afraid using UUID's (even with hex compression) is kind of a suicide when it comes to performance. This is a good summary of the issues: http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/

Is this UUID issue unique to MySQL, or are there other RDBMS's that handle it better (Postgres, Oracle, SQL Server, etc.)? I too have a need for a unique identifier that will mesh with other databases periodically, so that a user in one local DB/server will get migrated to a master DB, which in turn will sync up with remote sites so that all sites will have all users in them each night (for example). Having a mapping of UUID to local ID seems one way, but I feel there is a lot of room for collisions and integrity issues that way, no?

There are some solutions at the bottom of that blog post. Are those not good then? They seem interesting to me.
RE: Unique ID's across multiple databases
On Mon, September 13, 2010 15:37, Daevid Vincent wrote:

-Original Message- From: Kiss Dániel [mailto:n...@dinagon.com] Sent: Monday, September 13, 2010 5:59 AM

Well, thanks, but I'm afraid using UUID's (even with hex compression) is kind of a suicide when it comes to performance. This is a good summary of the issues: http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/

Is this UUID issue unique to MySQL, or are there other RDBMS's that handle it better (Postgres, Oracle, SQL Server, etc.)? I too have a need for a unique identifier that will mesh with other databases periodically, so that a user in one local DB/server will get migrated to a master DB, which in turn will sync up with remote sites so that all sites will have all users in them each night (for example). Having a mapping of UUID to local ID seems one way, but I feel there is a lot of room for collisions and integrity issues that way, no? There are some solutions at the bottom of that blog post. Are those not good then? They seem interesting to me.

Why does it have to be one field? Two fields: ServerID and the SequenceID. Across servers the pair would be unique, and within a given server the SequenceID is the equivalent of a manual auto-increment field. Set it via MAX(SequenceID)+1 where ServerID is the local server ID. Have an index set for the combined fields, as well as the SequenceID field perhaps.

-- William R. Mussatto Systems Engineer http://www.csz.com 909-920-9154
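The two-field (ServerID, SequenceID) idea above can be sketched as follows. This is only an illustration, not MySQL-tested code: it uses SQLite as a stand-in engine, and the hard-coded SERVER_ID constant replaces MySQL's @@server_id (both are assumptions for the sketch).

```python
import sqlite3

# Hypothetical constant standing in for MySQL's @@server_id.
SERVER_ID = 3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Users (
        ServerID   INTEGER NOT NULL,
        SequenceID INTEGER NOT NULL,
        Name       TEXT    NOT NULL,
        PRIMARY KEY (ServerID, SequenceID)
    )
""")

def insert_user(name):
    # MAX(SequenceID)+1 scoped to the local server, done inside a
    # transaction so two inserts on this node cannot get the same ID.
    with conn:
        (next_seq,) = conn.execute(
            "SELECT COALESCE(MAX(SequenceID), 0) + 1 FROM Users "
            "WHERE ServerID = ?", (SERVER_ID,)).fetchone()
        conn.execute("INSERT INTO Users VALUES (?, ?, ?)",
                     (SERVER_ID, next_seq, name))
        return (SERVER_ID, next_seq)

print(insert_user("alice"))  # (3, 1)
print(insert_user("bob"))    # (3, 2)
```

Rows created on a different server would carry a different ServerID, so the pair stays unique across the whole array without any coordination between nodes.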
Re: Unique ID's across multiple databases
On Mon, Sep 13, 2010 at 8:59 PM, Johnny Withers joh...@pixelated.net wrote: This sounds like a good job for a 'NoSQL' system. Maybe?

I can't help but blink at that. How exactly is NoSQL going to fix issues that are related to topology, not inherent SQL limitations? Which particular incarnation of NoSQL are you thinking of? Also, I suspect rewriting all 200 clients' services is not going to be an option :-)

-- Bier met grenadyn Is als mosterd by den wyn Sy die't drinkt, is eene kwezel Hy die't drinkt, is ras een ezel
Unique ID's across multiple databases
Hi, I'm designing a master-to-master replication architecture. I wonder what the best way is to make sure both databases generate unique row ID's, so there won't be ID conflicts when replicating in both directions. I read on forums about the pros and cons of using UUID's, and also about setting the *auto-increment-increment* and *auto-increment-offset* system variables. I'm not a fan of UUID's for this purpose. They are too big, slow to generate, and they don't 100% guarantee global uniqueness anyway. On the other hand, the *auto-increment-offset* trick limits the number of possible databases in the system in advance, which is not so nice.

So, I'm thinking about a composite unique ID system, where the value of the *server-id* system variable is always a part of the ID. The server-id has to be set uniquely anyway for any replication architecture in MySQL. A sample would look like this:

CREATE TABLE SampleParents (
    ID INT UNSIGNED NOT NULL AUTO_INCREMENT,
    SID SMALLINT UNSIGNED NOT NULL,
    SampleData VARCHAR(50) NOT NULL,
    PRIMARY KEY (ID, SID)
) ENGINE=InnoDB;

CREATE TABLE SampleChildren (
    ID INT UNSIGNED NOT NULL AUTO_INCREMENT,
    SID SMALLINT UNSIGNED NOT NULL,
    SampleParentID INT UNSIGNED NOT NULL,
    SampleParentSID SMALLINT UNSIGNED NOT NULL,
    SampleData VARCHAR(50) NOT NULL,
    PRIMARY KEY (ID, SID),
    KEY (SampleParentID, SampleParentSID),
    CONSTRAINT FOREIGN KEY (SampleParentID, SampleParentSID)
        REFERENCES SampleParents (ID, SID)
) ENGINE=InnoDB;

Where SID is always filled with the value of the @@server_id global variable on the server where the record is created. As I see it, this has the following advantages:

- The ID+SID pair guarantees primary key uniqueness within the replication array.
- Auto_increment is a magnitude faster than generating UUID's.
- SID adds only 2 bytes in this case to the size of the primary key item. It can be even 1 byte if I'm sure I'll never exceed a maximum of 255 servers. But anyhow, it is still way smaller than the 16 bytes of a UUID field, even if using BIGINT's.
- Keeps the efficiency of the primary key indexing, because the key is still very close to a strictly incremental value.

Well, you can see what the obvious disadvantage is:

- Primary and foreign keys are always double fields. This is not so convenient when you are joining tables and adding the WHERE clauses to your queries. It might even negatively affect the evaluation speed of join conditions, although I have no idea yet how much. (My gut feeling is that it's not a big issue, due to the good query optimizer of MySQL.)

My question is: does anyone have any better idea how to approach this problem? Thanks,
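The composite-key schema above can be exercised end to end in a few lines. This sketch substitutes SQLite for MySQL/InnoDB (an assumption; AUTO_INCREMENT and @@server_id are dropped, and IDs are supplied explicitly), but the double-field join condition it demonstrates is exactly the one the disadvantage section worries about.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SampleParents (
        ID  INTEGER NOT NULL,
        SID INTEGER NOT NULL,
        SampleData TEXT NOT NULL,
        PRIMARY KEY (ID, SID)
    );
    CREATE TABLE SampleChildren (
        ID  INTEGER NOT NULL,
        SID INTEGER NOT NULL,
        SampleParentID  INTEGER NOT NULL,
        SampleParentSID INTEGER NOT NULL,
        SampleData TEXT NOT NULL,
        PRIMARY KEY (ID, SID),
        FOREIGN KEY (SampleParentID, SampleParentSID)
            REFERENCES SampleParents (ID, SID)
    );
    -- The same auto-increment value (1) generated on two different
    -- servers stays unique because SID disambiguates the pair.
    INSERT INTO SampleParents VALUES (1, 1, 'from server 1');
    INSERT INTO SampleParents VALUES (1, 2, 'from server 2');
    INSERT INTO SampleChildren VALUES (1, 1, 1, 2, 'child row');
""")

# Every join must carry both halves of the key, as noted above.
rows = conn.execute("""
    SELECT p.SampleData
    FROM SampleChildren c
    JOIN SampleParents p
      ON p.ID = c.SampleParentID AND p.SID = c.SampleParentSID
""").fetchall()
print(rows)  # [('from server 2',)]
```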
Re: Unique ID's across multiple databases
On 12 Sep 2010, at 19:47, Kiss Dániel wrote: - SID adds only 2 bytes in this case to the size of the primary key item. It can be even 1 byte if I'm sure I'll never exceed a maximum of 255 servers. But anyhow, it is still way smaller than the 16 bytes of a UUID field, even if using BIGINT's.

You can maintain your own sequence tables a la postgres if you use transactions to ensure atomicity, though that doesn't help across databases (I suspect the same is true in postgres). FWIW, my auto_increment_offset value is usually the same as my server ID. auto_increment_increment also reduces the number of usable values for integer keys; e.g. if it's 10, you've reduced your key space by a factor of 10 too, meaning you may need to use BIGINTs if you potentially have more than 400M records in any given table.

One small observation that I've seen work is to use binary fields for UUIDs and always wrap access to them in HEX/UNHEX functions. You halve your storage requirements for the same amount of data that way.

Marcus
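The binary-UUID observation above can be illustrated with Python's standard uuid module; treating bytes.hex()/bytes.fromhex() as stand-ins for MySQL's HEX()/UNHEX() is an assumption of this sketch, but the storage arithmetic is the same either way.

```python
import uuid

u = uuid.uuid4()

binary_form = u.bytes          # 16 bytes: what would go in a BINARY(16) column
hex_form = binary_form.hex()   # 32 hex chars: what HEX() would show on the way out

# The binary form is less than half the 36-char canonical text form.
assert len(binary_form) == 16
assert len(str(u)) == 36

# Round-trip back, as UNHEX() would do on the way in: no information lost.
assert uuid.UUID(bytes=bytes.fromhex(hex_form)) == u
```

The performance caveat from the linked blog post still applies, though: compact or not, random UUIDs make poor clustered-index keys because inserts land all over the index.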
Re: Unique ID's across multiple databases
Server offset + increment works really well, is simple, and well documented and reliable - not sure why you would want to re-invent something that works so well :).
Re: Unique ID's across multiple databases
You may be right. I'm not arguing that offset + increment is working. I'm just wondering if that's the optimal solution when you do not know how many servers you will have in your array in the future.

In my view, the offset + increment thingy is good if you know in advance that you'll have a limited number of servers. But if you have no idea whether you will have 2, 20, or 200 servers in your array in the future, you just can't pick an optimal increment value. It just doesn't scale well enough for me. If you go with BIGINT ID's, you may have a big enough interval to be generous and pick a big increment value, allowing for 200 or even 2000 servers to make sure you cover worst-case scenarios. I'm just not sure if it's worth it to use up 8 bytes for a primary key when in general 4/5/6 is more than enough. Any thoughts on this?

On Sun, Sep 12, 2010 at 9:32 PM, Max Schubert m...@webwizarddesign.com wrote: Server offset + increment works really well, is simple, and well documented and reliable - not sure why you would want to re-invent something that works so well :).
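The trade-off being debated here can be simulated without a server at all. A minimal sketch of how auto_increment_offset/auto_increment_increment partition the ID space (the generator below is an illustrative stand-in, not how MySQL implements it internally): each server hands out offset, offset+increment, offset+2*increment, and so on, which is why the sequences never collide but the usable key space shrinks by a factor of the increment.

```python
def id_stream(offset, increment):
    """Yield the IDs a server with this offset/increment pair would generate."""
    n = offset
    while True:
        yield n
        n += increment

# Provisioned for at most 10 servers (increment=10), as in Marcus's example.
server1 = id_stream(1, 10)
server2 = id_stream(2, 10)

a = [next(server1) for _ in range(3)]
b = [next(server2) for _ in range(3)]
print(a)  # [1, 11, 21]
print(b)  # [2, 12, 22]

# The two servers' sequences are disjoint by construction...
assert not set(a) & set(b)
# ...but only 1 in 10 integers is usable per server, which is the
# key-space cost driving the BIGINT question above.
```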