Re: [algogeeks] Re: top search queries

2011-02-04 Thread Algoose chase
Going by the hint given in the problem

Divide the stream into windows and assume that we have enough space to store
all the queries from the stream window in a hash table along frequency
count. Also maintain a global Hash table that will contain the frequency
counts of top n queries seen so far. Make sure that n100 .Once the window
is filled with enough queries, merge them into the global hash table. If a
query is already available in the global hash table update the frequency
count.Again this solution cannot be accurate due to limited memory and it
allows for some error. larger the value of n lesser the error proneness.

On Thu, Feb 3, 2011 at 9:51 PM, Srikar srikar2...@gmail.com wrote:

 @algoose I see what you are saying. what do you propose? checking out your
 link now...



 On Thu, Feb 3, 2011 at 11:44 AM, Algoose chase harishp...@gmail.comwrote:

 @Srikar

 In your first approach you cant simply ignore the queries that are not
 present in the heap because you have a stream of queries and you never know
 if the query that you are about to ignore is going be received frequently or
 not in future. Your approach is like find the top 100 queries from the
 stream and keep updating the frequencies of only those queries using heap
 and hash table. If you have to process some 1,00,000 , with a space for only
 100 elements you cant find the frequencies correctly.

 this is a nice article related to this :
 http://www.americanscientist.org/issues/pub/the-britney-spears-problem



 On Tue, Feb 1, 2011 at 8:09 PM, sankalp srivastava 
 richi.sankalp1...@gmail.com wrote:

 @guy above juver++
 The solution , i don't think can get better than this , because you
 need to store the querries anyway (at least for the output )

 --
 You received this message because you are subscribed to the Google Groups
 Algorithm Geeks group.
 To post to this group, send email to algogeeks@googlegroups.com.
 To unsubscribe from this group, send email to
 algogeeks+unsubscr...@googlegroups.comalgogeeks%2bunsubscr...@googlegroups.com
 .
 For more options, visit this group at
 http://groups.google.com/group/algogeeks?hl=en.


  --
 You received this message because you are subscribed to the Google Groups
 Algorithm Geeks group.
 To post to this group, send email to algogeeks@googlegroups.com.
 To unsubscribe from this group, send email to
 algogeeks+unsubscr...@googlegroups.comalgogeeks%2bunsubscr...@googlegroups.com
 .
 For more options, visit this group at
 http://groups.google.com/group/algogeeks?hl=en.


  --
 You received this message because you are subscribed to the Google Groups
 Algorithm Geeks group.
 To post to this group, send email to algogeeks@googlegroups.com.
 To unsubscribe from this group, send email to
 algogeeks+unsubscr...@googlegroups.comalgogeeks%2bunsubscr...@googlegroups.com
 .
 For more options, visit this group at
 http://groups.google.com/group/algogeeks?hl=en.


-- 
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com.
To unsubscribe from this group, send email to 
algogeeks+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/algogeeks?hl=en.



Re: [algogeeks] Re: top search queries

2011-02-03 Thread Algoose chase
@Srikar

In your first approach you cant simply ignore the queries that are not
present in the heap because you have a stream of queries and you never know
if the query that you are about to ignore is going be received frequently or
not in future. Your approach is like find the top 100 queries from the
stream and keep updating the frequencies of only those queries using heap
and hash table. If you have to process some 1,00,000 , with a space for only
100 elements you cant find the frequencies correctly.

this is a nice article related to this :
http://www.americanscientist.org/issues/pub/the-britney-spears-problem


On Tue, Feb 1, 2011 at 8:09 PM, sankalp srivastava 
richi.sankalp1...@gmail.com wrote:

 @guy above juver++
 The solution , i don't think can get better than this , because you
 need to store the querries anyway (at least for the output )

 --
 You received this message because you are subscribed to the Google Groups
 Algorithm Geeks group.
 To post to this group, send email to algogeeks@googlegroups.com.
 To unsubscribe from this group, send email to
 algogeeks+unsubscr...@googlegroups.comalgogeeks%2bunsubscr...@googlegroups.com
 .
 For more options, visit this group at
 http://groups.google.com/group/algogeeks?hl=en.



-- 
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com.
To unsubscribe from this group, send email to 
algogeeks+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/algogeeks?hl=en.



Re: [algogeeks] Re: top search queries

2011-02-03 Thread Srikar
@algoose I see what you are saying. what do you propose? checking out your
link now...



On Thu, Feb 3, 2011 at 11:44 AM, Algoose chase harishp...@gmail.com wrote:

 @Srikar

 In your first approach you cant simply ignore the queries that are not
 present in the heap because you have a stream of queries and you never know
 if the query that you are about to ignore is going be received frequently or
 not in future. Your approach is like find the top 100 queries from the
 stream and keep updating the frequencies of only those queries using heap
 and hash table. If you have to process some 1,00,000 , with a space for only
 100 elements you cant find the frequencies correctly.

 this is a nice article related to this :
 http://www.americanscientist.org/issues/pub/the-britney-spears-problem



 On Tue, Feb 1, 2011 at 8:09 PM, sankalp srivastava 
 richi.sankalp1...@gmail.com wrote:

 @guy above juver++
 The solution , i don't think can get better than this , because you
 need to store the querries anyway (at least for the output )

 --
 You received this message because you are subscribed to the Google Groups
 Algorithm Geeks group.
 To post to this group, send email to algogeeks@googlegroups.com.
 To unsubscribe from this group, send email to
 algogeeks+unsubscr...@googlegroups.comalgogeeks%2bunsubscr...@googlegroups.com
 .
 For more options, visit this group at
 http://groups.google.com/group/algogeeks?hl=en.


  --
 You received this message because you are subscribed to the Google Groups
 Algorithm Geeks group.
 To post to this group, send email to algogeeks@googlegroups.com.
 To unsubscribe from this group, send email to
 algogeeks+unsubscr...@googlegroups.comalgogeeks%2bunsubscr...@googlegroups.com
 .
 For more options, visit this group at
 http://groups.google.com/group/algogeeks?hl=en.


-- 
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com.
To unsubscribe from this group, send email to 
algogeeks+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/algogeeks?hl=en.



[algogeeks] Re: top search queries

2011-02-02 Thread sankalp srivastava
@guy above juver++
The solution , i don't think can get better than this , because you
need to store the querries anyway (at least for the output )

-- 
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com.
To unsubscribe from this group, send email to 
algogeeks+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/algogeeks?hl=en.