On Wed, Oct 24, 2001 at 04:15:08PM +0100, Lucy McWilliam wrote:
> 
> If anyone can think of a better (i.e. quicker) way to do it - even if it
> means having to relearn C - I'd love to know.  Code snippets below.  Bonus
> points for the subject line ;-)
> 

If you have enough swap and RAM, you could read the file in once and
convert it to a 2D array of score[fly,fly] stored in a very long
string and accessed via substr and unpack.
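
A rough sketch of what I mean by that, assuming a row-major layout and
single-precision floats. The names set_score and get_score, and the
tiny $N, are made up for illustration and are not from your code:

```perl
use strict;
use warnings;

my $N = 4;                              # number of flies (tiny, for illustration)
my $W = length pack('f', 0);            # bytes per packed score (native float)
my $scores = "\0" x ($W * $N * $N);     # one long string holding the whole table

# Store the score for the pair ($i, $j) in its slot in the string.
# Assumption: row-major layout, so the slot starts at $W * ($i * $N + $j).
sub set_score {
    my ($i, $j, $s) = @_;
    substr($scores, $W * ($i * $N + $j), $W) = pack('f', $s);
}

# Fetch the score for the pair ($i, $j) from the same slot.
sub get_score {
    my ($i, $j) = @_;
    return unpack('f', substr($scores, $W * ($i * $N + $j), $W));
}

set_score(1, 2, 0.5);
print get_score(1, 2), "\n";
```

The string never grows after it is created, so each store overwrites
exactly $W bytes in place.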

Assuming your packed scores take 4 bytes (long int or single precision
float) you would need 4*$N*$N bytes of virtual memory to hold this
lot, where there are $N flies.  Because of the way your file is
ordered, I think you can build this string with one pass through the
file without undue swapping if you have enough RAM to have $N virtual
memory pages in RAM at once.

Assuming a VM page size of 4k, this would take up 400M of swap and 40M
of RAM for 10000 flies.
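
The arithmetic behind those figures, as a quick check. The 4096-byte
page size and the variable names are assumptions for the example:

```perl
use strict;
use warnings;

my $N               = 10_000;   # flies
my $bytes_per_score = 4;        # long int or single-precision float
my $page_size       = 4096;     # assumed VM page size (4k)

my $swap = $bytes_per_score * $N * $N;   # the whole score[fly,fly] table
my $ram  = $N * $page_size;              # one resident page per fly

print "swap: $swap bytes\n";
print "RAM:  $ram bytes\n";
```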

Now, one fly at a time, do the sort to convert score[other fly] into
ranking[other fly].  Assuming fewer than 65536 flies, the rankings will
take less space than the scores, since each fits in 2 bytes (an
unsigned 16-bit integer).

This leaves you with the numbers you want packed into 2*$N*$N bytes.
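
Something along these lines, one row at a time. This is only a sketch:
the demo scores are invented, and I am assuming that a higher score
means a better (lower-numbered) rank and that rank 0 can stand for the
fly itself. It builds a second string rather than overwriting the
scores in place:

```perl
use strict;
use warnings;

my $N = 4;
my $W = length pack('f', 0);            # bytes per packed score

# Invented demo scores, packed one row per fly.
my @demo = ([0, 3, 1, 2], [3, 0, 5, 4], [1, 5, 0, 6], [2, 4, 6, 0]);
my $scores = join '', map { pack("f$N", @$_) } @demo;

my $ranks = '';
for my $i (0 .. $N - 1) {
    # Pull out this fly's row of scores against every fly.
    my @row = unpack("f$N", substr($scores, $W * $N * $i, $W * $N));

    # Other flies, best score first (assumption: higher score is better).
    my @order = sort { $row[$b] <=> $row[$a] } grep { $_ != $i } 0 .. $N - 1;

    # rank[other fly] = position in that ordering; 0 is kept for the fly itself.
    my @rank = (0) x $N;
    @rank[@order] = 1 .. scalar @order;

    $ranks .= pack("S$N", @rank);       # 'S' is an unsigned 16-bit value
}

print join(' ', unpack("S$N", substr($ranks, 0, 2 * $N))), "\n";
```

Only one unpacked row is held as a Perl array at any time, so the
per-element overhead of Perl arrays never applies to the whole table.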

If you don't have enough RAM to keep the whole results string paged in
then you'll need to be a bit careful about the order in which you
extract the answers to write them out to file.  You can pull them out
in pairs in the same order as the original file with $N VM pages of
RAM.
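
For the extraction, a sketch under a big assumption: that the original
file lists each pair once, as (i, j) with i < j, all of fly 0's pairs
first, then fly 1's, and so on. I don't know that this is your actual
ordering. rank_of and the demo data are invented for illustration:

```perl
use strict;
use warnings;

my $N = 3;
# Invented demo rankings, 3 rows of 3 unsigned 16-bit values.
my $ranks = pack('S*', 0, 1, 2,  2, 0, 1,  1, 2, 0);

# Look up ranking[$i][$j] in the packed string (row-major, 2 bytes each).
sub rank_of {
    my ($i, $j) = @_;
    return unpack('S', substr($ranks, 2 * ($i * $N + $j), 2));
}

my @lines;
for my $i (0 .. $N - 2) {
    for my $j ($i + 1 .. $N - 1) {
        # rank_of($i, $j) walks along row $i; rank_of($j, $i) touches a
        # different row for each $j, hence up to $N pages in use at once.
        push @lines, join("\t", $i, $j, rank_of($i, $j), rank_of($j, $i));
    }
}
print "$_\n" for @lines;
```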

Nick

--
($O=   #\\\\\\\\\\\\\\\\\\\\\\\\\\\\\  /////////////////////////////#{O$}xb|
q|HHHNIiHIHIHNNHHHHI{HHHiiHHHHHiiI|^#\/#(|}OM:-#+(iI$:-+!:- >i (!>:=#!i +-|b
q|-+ !i#=:<i) !< -:i+-:$I!)+#-:WO{|)#/\#v|I!!HHHHH!!HHH}IHHHHNNHIHIH!INHHH|b
|qx{$O}#/////////////////////////////  \\\\\\\\\\\\\\\\\\\\\\\\\\\\\#   =O$)
