[algogeeks] Re: Interview question: can this be done?

2006-02-16 Thread Prateek

 4), which will vary the value of k for many times. 

I think to cover up this problem..
1. we can store the starting and ending numbers for every K in another
file (with file name of every set) and then sort the file names
according to the starting values for every K set,

2. hence creating an index based on the starting values of every set..

3. so to check if a number is in a set or not we will simply have to
perform a binary search on this index

4. and then for every matching set scan that particular set's file to
see if the number is present or not.

Thus the time for checking multiple instances will be reduced to some
extent..

Regards,
Prateek.


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~--~~~~--~~--~--~---



[algogeeks] Re: Interview question: can this be done?

2006-02-15 Thread beelzebub

I like Terry's idea.
Let's say the 5000 numbers are: {1,2,...,5000}

For every 200 numbers you choose, create a 5000 bit string .. which
corresponds to 625 bytes
which is infact less than the 800 bytes you would require to store the
200 numbers as ints.
You don't store the 200 numbers explicitly, only these 5000bit
bitstrings.

To analyze a particular subset, read in its bitstring. To check k,
calculate it's offset in the bitstring
and check if its 0 or 1. This is constant time operation.

I actually liked Kevin's original idea of using prime numbers and
coming up with a single hash value. But yeah, there are practical
limitations with that. You don't really need them all to be prime, u
just need them to be co-prime.



[algogeeks] Re: Interview question: can this be done?

2006-02-15 Thread Kevin

I get what Terry means now. But it still uses 625/800 = 78% of the
naive method in terms of diskspace (or memory, whatever), so I think
the save is not big enough (the job interview is RD targeted, which I
assume they want to hear one with large saving).

Prateek's idea is to reduce the time of whole set read in (suppose they
are saved to disk since there are many of them). It should work if the
give k does not show up frequently. For each k, on average, I think we
only need to read in 200/5000 = 4% percent of the whole sets, on 96%
times we only need to check the begining and ending index. But I am
kind of worry for the step 4), which will vary the value of k for many
times.

beelzebub mentioned co-prime, interesting. I will think a little bit
more of it.


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Algorithm Geeks group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~--~~~~--~~--~--~---



[algogeeks] Re: Interview question: can this be done?

2006-02-14 Thread Kevin

I didn't fully get what you mean, but sounds not memory efficient: if
we need to store the 200 integers per set, and don't forget they say it
could be a lot of sets (even have to write to disk because memory does
not fit).



[algogeeks] Re: Interview question: can this be done?

2006-02-14 Thread Prateek

I think a better alternative could be to choose EVEN 5000 numbers
(taking mod of 2 of any number out of these can help to check whether
it can be in the set or not) and then make out set of 200 from these
5000 even numbers..

the set of 200 nos can be written on the disk in a sorted manner so
that a binary search can be applied to them.. we can take out the first
and the last number from the file and check whether its there or not..
and if its in range then apply binary search to find out the number.