I came up with a solution: using special "md5" to deal with conflicts
steps to insert a url
1. insert into table(md5,url) values('md5 of url', url)
2. if get a duplicate entry for primary key
2.1 select md5,url from table where md5 like 'xxxx%' for update
2.2 if url really exists, don't need insert anything
2.3 generate a special md5 for this url starts with xxxx,maybe
xxxx00001 and insert it
2.4 commit transaction
On Wed, Nov 6, 2013 at 10:56 AM, Li Li <[email protected]> wrote:
> I prefer your solution in that it's something like Optimistic Locking.
> but the problem is that if I define md5 as unique key and there exists
> 2 different urls with the same md5. I can't insert the second url
> anymore
>
> On Tue, Nov 5, 2013 at 11:55 PM, Dan Nelson <[email protected]> wrote:
>> In the last episode (Nov 05), Li Li said:
>>> I want to create a table with a long varchar column, maybe it's the url.
>>> according to dns spec, the url's max length is fixed. but I have
>>> to deal with url having long params such as
>>> a.html?q=aaaaaaaaaaaaaaaaaaaaaaaaaaaa&fl=bbbbbbbbbbbbbbbbbbbbbbbb....
>>> I want the url is unique when inserting it.
>>> I googled and found
>>> http://stackoverflow.com/questions/6800866/how-to-store-urls-in-mysql
>>> this post suggests use md5 of url. But in theory, there will be
>>> conflict that two different urls will have the same md5(even it's
>>> probablitiy is very small). I want to a absolute correct solution.
>>> one method i can come up with is using select ... for update
>>> 1. begin transaction
>>> 2. select url from tb where md5='xxxx' for update
>>> 3. if the url is not exist, insert into this url; else do nothing
>>
>> It might be more efficient to optimize for the common case here. The
>> assumption is that an md5 (or sha1 or sha2) hash collision is extremely
>> unlikely, so you could just insert your new row, and if you get a "duplicate
>> entry for primary key" error, then you can "select url from tb where
>> md5='xxxx' ", and compare the retreived url with the one you want to insert.
>>
>> --
>> Dan Nelson
>> [email protected]
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql