[algogeeks] Re: Judging whether a URL exists among millions, insert if not

2008-08-22 Thread [EMAIL PROTECTED]

I think you'd better set a unique constraint on the URL column and create an index on it. All the methods above have to search the table at least once anyway, so adding the constraint and letting the DB handle it is the most efficient.
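A minimal sketch of the unique-constraint approach, using Python's built-in sqlite3 module as a stand-in for whatever database the original poster uses. The table name `urls`, the column name `url`, and the helper `insert_if_absent` are assumptions made for illustration; they do not come from the thread.

```python
import sqlite3


def make_db():
    # In-memory database for the sketch; UNIQUE makes SQLite build an
    # index on the column and reject duplicate values.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE urls (url TEXT NOT NULL UNIQUE)")
    return conn


def insert_if_absent(conn, url):
    """Try to insert url. Return True if it was new, False if it existed."""
    try:
        # "with conn" commits on success and rolls back on an exception.
        with conn:
            conn.execute("INSERT INTO urls (url) VALUES (?)", (url,))
        return True
    except sqlite3.IntegrityError:
        # The unique constraint fired: the URL is already in the table.
        return False


if __name__ == "__main__":
    db = make_db()
    print(insert_if_absent(db, "http://example.com/a"))
    print(insert_if_absent(db, "http://example.com/a"))
```

The existence check and the insert happen in one statement here, so there is no window between "look it up" and "insert it" in which another writer could add the same URL.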

[algogeeks] Re: Judging whether a URL exists among millions, insert if not

2008-08-21 Thread Abdul Habra

Two things: 1. select count(*) from table where HASH_CODE=hc and select count(HASH_CODE) from table where HASH_CODE=hc are equivalent. 2. Hash code uniqueness is not guaranteed. Say your hash code is a 32-bit signed integer: you could have at most 2^32 distinct hash codes (roughly 4 billion). On
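To make the collision point concrete, here is a small sketch. It assumes the "hashCode" in this thread means a Java-style 32-bit string hash (the thread does not say so explicitly); `java_style_hash` reimplements that formula for ASCII input, and `expected_colliding_pairs` is a hypothetical helper that applies the usual birthday-style estimate for a uniformly distributed hash.

```python
def java_style_hash(s):
    """32-bit signed hash: h = 31*h + code of each character (ASCII input assumed)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF  # keep only the low 32 bits
    # Reinterpret the unsigned 32-bit value as a signed integer.
    return h - 2**32 if h >= 2**31 else h


def expected_colliding_pairs(n, bits=32):
    """Rough expected number of colliding pairs among n uniformly hashed items."""
    return n * (n - 1) / 2 / 2**bits


if __name__ == "__main__":
    # Two different URLs with the same hash: "Aa" and "BB" both hash to 2112,
    # and a shared prefix preserves the collision.
    a = "http://example.com/Aa"
    b = "http://example.com/BB"
    print(java_style_hash(a) == java_style_hash(b))
    print(expected_colliding_pairs(1_000_000))
```

So even with only a few million URLs, some pairs are likely to share a hash code, which is why a lookup on the hash alone cannot prove that a URL is already present.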

[algogeeks] Re: Judging whether a URL exists among millions, insert if not

2008-08-21 Thread Fred

On Aug 21, 12:38 pm, Ashish Chugh [EMAIL PROTECTED] wrote: A few more suggestions: instead of select count(*) from table where HASH_CODE=hc and URL='urlToFind', using select count(HASH_CODE) from table where HASH_CODE=hc is better, since HASH_CODE is unique. You can cache all hash codes or

[algogeeks] Re: Judging whether a URL exists among millions, insert if not

2008-08-20 Thread Ashish Chugh

Instead of MD5, I think hashCode will suffice. It would also be unique for each URL and would take fewer bytes. Regards, /Ashish On Wed, Aug 20, 2008 at 2:12 PM, Fred [EMAIL PROTECTED] wrote: Hi, all: I've got the following problem: there are millions of URLs in the database, and

[algogeeks] Re: Judging whether a URL exists among millions, insert if not

2008-08-20 Thread Abdul Habra

I agree with Ashish. Use hashCode. Here is my suggestion: add a new column to your DB table; let's call it HASH_CODE. Whenever you add a URL row, populate HASH_CODE with the hash code of the URL. When you want to search for the existence of a URL: select count(*) from table where HASH_CODE=hc
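The HASH_CODE column idea can be sketched roughly as follows, again with sqlite3 as a stand-in database. Everything named here is an assumption for illustration: the table `urls`, the index `idx_urls_hash`, the helpers `url_exists` and `add_url`, and the use of `zlib.crc32` as a convenient 32-bit hash in place of hashCode. The lookup compares the URL as well as the hash, as in the fuller query quoted elsewhere in the thread, because two URLs can share a hash code.

```python
import sqlite3
import zlib


def url_hash(url):
    # Stand-in 32-bit hash (CRC-32); any cheap integer hash would do here.
    return zlib.crc32(url.encode("utf-8"))


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE urls (hash_code INTEGER NOT NULL, url TEXT NOT NULL)"
    )
    # A small integer index narrows the search to a handful of rows.
    conn.execute("CREATE INDEX idx_urls_hash ON urls (hash_code)")
    return conn


def url_exists(conn, url):
    # The hash finds candidate rows; the URL comparison rules out collisions.
    row = conn.execute(
        "SELECT COUNT(*) FROM urls WHERE hash_code = ? AND url = ?",
        (url_hash(url), url),
    ).fetchone()
    return row[0] > 0


def add_url(conn, url):
    """Insert url with its hash unless it is already present."""
    if url_exists(conn, url):
        return False
    with conn:
        conn.execute(
            "INSERT INTO urls (hash_code, url) VALUES (?, ?)",
            (url_hash(url), url),
        )
    return True
```

Note that the check and the insert are two separate statements here, so with concurrent writers a duplicate could still slip in between them unless the database also enforces uniqueness.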
