Terry wrote: > This problem is from the book programming pearls . Can anyone explain > me the soln. How does he do it. > > Given a file containing 4,300,000,000 integers, how > can you find one that appears at least twice? (Explain, first, > why this must happen.) > > > > Binary search finds an element that occurs at least twice by > recursively > searching the subinterval that contains more than half of the integers. > > My original solution did not guarantee that the number of integers is > halved > in each iteration, so the worst-case run time of its log2 N passes was > proportional to N log N. Jim Saxe of Carnegie-Mellon University reduced > > that to linear time by observing that the search can avoid carrying too > > many duplicates. When his search knows that a duplicate must be in a > current > range of M integers, it will only store M + I integers on its current > work > tape; if more integers would have gone on the tape, his program merely > discards them. Although his method frequently ignores input variables, > its strategy is conservative enough to ensure that it finds at least > one > duplicate.
You seem to already have the solution. I'll attempt to rephrase. I'll assume the integers are 32-bit; otherwise the problem is ill-posed. A repeat must happen via the pigeonhole principle. Make a scan through the integers. If the leading bit is 1, put it in file 1.txt; if it is 0, put it in file 0.txt. Keep a count of the number of integers in 1.txt and 0.txt. As soon as one of the counts equals 2^31 + 1, stop. Now, recur on the file with 2^31 + 1 elements, but examine the second bit and stop when one of the counts reaches 2^30 + 1. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Algorithm Geeks" group. To post to this group, send email to algogeeks@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/algogeeks -~----------~----~----~----~------~----~------~--~---