I think some optimizations will always be possible, but you won't get any
dramatic improvement.
What I would do is something like this: first, make sure that all the
data in the file is sorted. Then create a sequence array of all the
required numbers; in your example that is all the numbers from 1..10.

#!/usr/bin/perl
# Assumes the DATA lines are sorted ascending and that every id
# in the file appears somewhere in @full_sequence.
my @full_sequence = (1..10);
my @missing = ();
my $i = 0;
while (<DATA>) {
    chomp;
    # Collect every expected number that is absent from the file.
    while ($full_sequence[$i] != $_) {
        push @missing, $full_sequence[$i++];
    }
    $i++;
}
# Whatever is left in the sequence after the file ends is missing too.
push @missing, @full_sequence[$i .. $#full_sequence];
print "MISSING ARE @missing\n";
exit 0;
__END__
1
2
3
4
5
9
10

Just try that out; I hope it will be better. In fact, your requirement
is very data-specific, so it will hardly make any difference whether you
code it in Perl or in C. (For the full 70..79 million range, a
constant-memory variant is sketched at the end of this message.)

Bye
Ram

On Thu, 2004-05-13 at 04:09, Larry Wissink wrote:
> I have a problem that I thought would be perfect for Perl, except
> that I seem to be using all my system resources to run it. Of course
> this probably means I'm doing it the wrong way...
>
> The problem:
>
> We have a backup server that is missing records from the production
> server for a particular table. We know that it should have sequential
> records and that it is missing some records. We want to get a sense
> of the number of records missing. We know the problem started around
> the beginning of March at id 70,000,000 (rounded for convenience).
> Currently we are at 79,000,000. So, I dumped to a file all the ids
> between 70,000,000 and 79,000,000 (commas inserted here for
> readability). I need to figure out which numbers are missing. The way
> that seemed easiest to me was to create two arrays: one with every
> number between 70 and 79 million, the other with every number in our
> dump file. Then compare them as illustrated in the Perl Cookbook,
> using a hash.
>
> The simple script I came up with works fine with a test file of just
> 10 records.
>
> But when I try to scale that to 9 million records, it doesn't work.
> This is probably because it is trying to do something like what db
> people call a Cartesian join (every record against every record).
>
> So, does anybody have a suggestion for a better way to do it in Perl?
>
> I'll probably end up doing it in SQL if I can't come up with a Perl
> solution. (Create a second table, like the first array, with every
> number between 70 and 79 million, and join the two tables.)
>
> Larry
>
> [EMAIL PROTECTED]
>
> script:
>
> use strict;
> use warnings;
>
> my %seen;
> my @list = ();
> my @missing;
> my @ids = ();
> my $lis;
> my $item;
>
> foreach $lis (1 .. 10) {   # sample list of 10
>     push(@ids, $lis);
> }
>
> open(DATA, "< ms_ids_test.txt") or die "Couldn't open data file: $!\n";
> # create file like below
>
> while (<DATA>) {
>     chomp;
>     push(@list, $_);
> }
>
> # Mark every id we actually have, using a hash slice (Perl Cookbook).
> @seen{@list} = ();
>
> foreach $item (@ids) {
>     push(@missing, $item) unless exists $seen{$item};
> }
>
> print scalar(@missing);
>
> # sample file (without the pounds)
> #1
> #2
> #3
> #4
> #5
> #9
> #10
> # note missing 6,7,8
> # result is 3
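For the 9-million-id case in the quoted message, the same single-pass
idea can be pushed further so that nothing large is held in memory at
all: stream through the sorted dump, track only the next id you expect,
and count the gaps as they go by. This is a minimal sketch, not a tested
solution; the filename ms_ids.txt is a placeholder, the
70,000,000..79,000,000 bounds are taken from Larry's description, and
the dump is assumed to be sorted ascending with no duplicates.

#!/usr/bin/perl
use strict;
use warnings;

# Sketch: count missing ids in a sorted dump without building arrays.
# Assumptions: one id per line, sorted ascending, no duplicates, and
# the range of interest is 70_000_000 .. 79_000_000.
my $first    = 70_000_000;
my $last     = 79_000_000;
my $expected = $first;        # next id we expect to see
my $missing  = 0;

open my $fh, '<', 'ms_ids.txt' or die "Couldn't open data file: $!\n";
while (my $id = <$fh>) {
    chomp $id;
    if ($id > $expected) {
        # Every id from $expected up to $id - 1 is absent.
        $missing += $id - $expected;
    }
    $expected = $id + 1;
}
close $fh;

# Ids missing off the end of the dump, if any.
$missing += $last - $expected + 1 if $expected <= $last;

print "$missing ids missing\n";

Because only a couple of scalars are carried across iterations, memory
use stays constant no matter how many ids the file holds, which
sidesteps the resource problem Larry describes; replacing the counter
with a print inside the if block would list the actual gaps instead.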