Re: [algogeeks] top search queries

2011-01-31 Thread snehal jain
please give your ideas for this On Fri, Jan 21, 2011 at 2:28 PM, snehal jain learner@gmail.com wrote: Magy! receives a huge amount of queries everyday. They don’t want to store all of them, since many of the queries are unique (submitted just one time by just one user). Anyway, for

Re: [algogeeks] top search queries

2011-01-31 Thread Srikar
First Approach, 0) the queries can be considered as an infinite stream. maintain a global count of the number of queries coming from the stream (used for calc the frequency %). 1) maintain a min-heap (has just top 100 queries by frequency) + hash table (where we store frequency for each word in

Re: [algogeeks] top search queries

2011-01-31 Thread juver++
@above in your second approach, in the worst case you need to traverse the heap in O(n) time. -- You received this message because you are subscribed to the Google Groups Algorithm Geeks group. To post to this group, send email to algogeeks@googlegroups.com. To unsubscribe from this group,

[algogeeks] top search queries

2011-01-21 Thread snehal jain
Magy! receives a huge amount of queries everyday. They don’t want to store all of them, since many of the queries are unique (submitted just one time by just one user). Anyway, for caching reason they want to understand what are the queries whose frequency exceed s=0.1%. How do you solve the