I think we have to use reservoir sampling here.
On Dec 17 2011, 10:20 am, Ankur Garg ankurga...@gmail.com wrote:
suggest algo to find k most frequently occuring numbers from a file of very
large size containing numbers.
--
You received this message because you are subscribed to the Google
Nice Soln Lucifer,
i had problem of tracking kth value when coming across two siblings, each
sibling has many childs
so i think a bottom up approach would be better for finding number of
elements(say* y*) x
finally at root we have y,
if y=k then kth element is x
else kth elemnt is x
Surender
On
How about using a Tournament tree . As the size of the size of huge we
heap is better option to opt for , which is also been implemented in
Tournament tree , and when the same element come more than once just
increment the count for the Frequency of the Internal Node . The
leaves will contain
@Lucifer : if for the same question , we consider a Max heap instead of Min
heap then complexity of the same algo will be O(N) ... right???
On Fri, Dec 16, 2011 at 8:50 PM, Lucifer sourabhd2...@gmail.com wrote:
@my previous post..
While explaining the run-time I have made an editing
@atul..
Complexity would be 2(n-k+1) = O(2*(n-k))
Basically the condition based on which the traversal is performed will
change. I have modified some part of the original post to show that:
Instead of having the initial count as K have it as N-K+1 .. when
taking a max-heap..
To solve the
@my previous post..
While explaining the run-time I have made an editing mistake...
instead of 'N' nodes it should be 'K' nodes..
i.e.
Hence, for any given bin- tree having 'K' nodes, the number of null
nodes is 'K+1'.
These null nodes are nothing but the nodes where the check nodeValue
X failed
@sunny why we look at all k number which are greater then x , correct ?
Lets think in this way
we wants to check if kth smallest element in heap thats =x isn't it ? so
if root of mean heap is greater then x then none other elements will less
then x so we terminate .
else our algorithm will
@All
I don't think the algo given above is entirely correct.. Or may be i
didn't it properly...
So basically say a preorder traversal of the heap is done based on
whether the current root value X. As the algo says that at any point
if k0 and we hit a node value which is =X , then we are done.
@Lucifer : yes even i found flaw in the above algo when i gave it a second
thought but didnt get time to post it.
bcoz min heap has property that the parent node is less than its both
child(subtree to be more precise ) but it does not confirm that left child
is always smaller than right child of
oops...
wanted to write the same but yeah its meaning turns out to be totally
different :(
anyways very well explained by Lucifier
@shashank
i think now u will be able to get why there will be only 2k comparisons in
the worst case
On Thu, Dec 15, 2011 at 10:51 PM, atul anand
i am Considering given heap as min heap
Sol - because heap has property that root will has lower value than all
the elements in its left sub tree and right sub tree
so in main we will call a function passing root and value k and x
if at any time root is greater than x and k 0 that means rest of
@sunny : this will work :)
On Wed, Dec 14, 2011 at 4:08 PM, sunny agrawal sunny816.i...@gmail.comwrote:
i am Considering given heap as min heap
Sol - because heap has property that root will has lower value than all
the elements in its left sub tree and right sub tree
so in main we will
@sunny it will 3k not 2k ? u forgot to count the root element so over all
time complexity will O(3K)=O(K)
correct me if am wrong ?
Thanks
Shashank Mani
CSE, BIT Mesra
--
You received this message because you are subscribed to the Google Groups
Algorithm Geeks group.
To view this discussion
more clarification , why number of comparisons are 3k , because we are
looking for only those nodes which are less then x and each nodea can max 2
childs , so tottal 3k comparisons . so it will O(3K) .
Thanks
Shashank
CSE , BIT Mesra
--
You received this message because you are subscribed
@shashank
nope only 2k comparisions
k numbers which are greater than x and k which are less than x
dont think in terms of root and child
On Thu, Dec 15, 2011 at 12:57 PM, WgpShashank shashank7andr...@gmail.comwrote:
more clarification , why number of comparisons are 3k , because we are
looking
@Atul: The initial heap is given. You have to maintain the heap
property as k elements are removed, which is O(k log n). This does not
satisfy the original request for an algorithm that is O(k) in the
worst-case, independent of the size of the heap.
Dave
On Dec 13, 10:31 pm, atul anand
16 matches
Mail list logo