Re: [algogeeks] Re: storing URL's

2012-05-20 Thread atul anand
question is more of like asking which data structure is suitable for implementing DNS server like functionality ? On Sat, May 19, 2012 at 10:46 PM, Gene gene.ress...@gmail.com wrote: This question has no answer. Every good student of computer science will know that you choose a data structure

Re: [algogeeks] Re: storing URL's

2012-05-19 Thread Ramindar Singh
@Rahul rope data structure wont be a good idea... Performance Operation Rope String(Array) index O(log n) O(1) split O(log n) O(1) concatenate O(log n) O(n) insert O(log n) O(n) delete O(log n) O(n) report O(log n) O(1) build O(n) O(n) the question says data structure used should store

[algogeeks] Re: storing URL's

2012-05-19 Thread Gene
This question has no answer. Every good student of computer science will know that you choose a data structure based on the _operations_ that must be performed on it: insert, lookup and what flavors of lookup, delete, etc.. So if an interviewer uses this question, he or she is probably trying to

[algogeeks] Re: storing URL's

2012-05-19 Thread Varun
That's agreed Gene. Answer depends on context. On Saturday, 19 May 2012 22:46:06 UTC+5:30, Gene wrote: This question has no answer. Every good student of computer science will know that you choose a data structure based on the _operations_ that must be performed on it: insert, lookup and

Re: [algogeeks] Re: storing URL's

2012-05-18 Thread Ashish Goel
Tiger Tree Hash Best Regards Ashish Goel Think positive and find fuel in failure +919985813081 +919966006652 On Thu, May 17, 2012 at 11:16 PM, Prem Krishna Chettri hprem...@gmail.comwrote: For the cases where Storing the Value is the only Concern and (Not the Retrieval efficiency), I would

Re: [algogeeks] Re: storing URL's

2012-05-18 Thread rahul ranjan
rope data structure can be gud in such cases.. hashing may not be too efficient as many url wud almost be same as mentioned by prakash. trie is another option but i believe overhead in trie will be more... correct me if i am wrong. On Fri, May 18, 2012 at 11:33 AM, Ashish Goel

Re: [algogeeks] Re: storing URL's

2012-05-17 Thread Prakash D
We can still improve this trie idea.. say we have urls like www.google.com www.goodbye.com www.google.com/transliterate www.goodstrain.com/good we can subdivide everything under www.goo I mean we can store each character as a node in a trie and call it like a URL dictionary On Wed, May 16,

Re: [algogeeks] Re: storing URL's

2012-05-17 Thread Prem Krishna Chettri
For the cases where Storing the Value is the only Concern and (Not the Retrieval efficiency), I would Suggest Something called DFA Subset minimization.. Google for it ... and after the final subset as said U can use something called DAWG for the most Most Optimal solution.. On Thu, May 17, 2012

[algogeeks] Re: storing URL's

2012-05-16 Thread omega9
On May 16, 10:33 am, atul anand atul.87fri...@gmail.com wrote: @amit : here is the reason :- each url sayhttp://www.geeksforgeeks.org you will hash following

[algogeeks] Re: storing URL's

2012-05-15 Thread Varun
should be a tree based on domain in url and directory mentioned in url. On Tuesday, 15 May 2012 21:20:55 UTC+5:30, atul007 wrote: Given a file which contain millions of URL's. which data structure would you use for storing these URL's . data structure used should store and fetch data in

Re: [algogeeks] Re: storing URL's

2012-05-15 Thread atul anand
i was thinking about using TRIE or patricia tree. hashing is another but it wont work if URLs are in millions is there any better data structure ? On Tue, May 15, 2012 at 11:37 PM, Varun tewari.va...@gmail.com wrote: should be a tree based on domain in url and directory mentioned in url. On

Re: [algogeeks] Re: storing URL's

2012-05-15 Thread Amit Mittal
Why hashing won;t work for millions of URL. If you hash each URL in to a distinct 32 bit integer, you can map 2^32 URL which is around 4 billion. it should work. On Wed, May 16, 2012 at 10:42 AM, atul anand atul.87fri...@gmail.comwrote: i was thinking about using TRIE or patricia tree. hashing

Re: [algogeeks] Re: storing URL's

2012-05-15 Thread atul anand
@amit : here is the reason :- each url say http://www.geeksforgeeks.org you will hash following urls http://www.geeksforgeeks.org http://www.geeksforgeeks.org/archives http://www.geeksforgeeks.org/archives/19248 http://www.geeksforgeeks.org/archives/