On Wednesday 15 December 2004 15:34, Benjamin Jeeves wrote:
> Hi all

Hi Benjamin,

>
> I'm writting a program in perl to md5sum about 500,000 files these files
> are text files and have different files size the biggest being about 500KB.
> my code is below
[..code snipped..]
>
> The thing is that it is taking about 3 to 4 hours to complete would these
> be about right on this number of files?

It's hard to estimate how long it should take. I regularly md5sum about
25,000 files, and that takes something like 2 minutes. You have 20 times
as many files, so scaling linearly I'd think that it should take around
40 minutes to sum yours.

> This there any way to speed things up? if so any example would be good or a
> point in the right way too? If I do not md5sum the files it print to the
> screen in about 2 mins?

Off the top of my head, I don't know what could be improved in your
code. Maybe it would help not to instantiate a new MD5 object on each
pass through your for loop? I'm attaching the code I'm using; maybe you
want to run a test to see how long it needs to scan your files.

HTH,
Philipp

# =========================================================================
# SCAN_DIRECTORY
# =========================================================================
# scan all files of a directory and write the result (filenames + MD5)
# into a specified file
# param $directory_name that should be scanned
# param $filename into which to write
# param (optional) $regexp that should be applied to filter out files

use strict;
use warnings;
use Cwd qw(getcwd);
use Digest::MD5;
use Fcntl qw(O_CREAT O_RDWR O_TRUNC);
use File::Find;
use FileHandle;

my $digest;
my $out_file;
my $directory;
my $filter;
my $base_file;

# callback procedure, called by File::Find for every directory entry
sub process {
    return if (! -f $File::Find::name);
    if ($filter) {
        return if ($File::Find::name =~ /$filter/);
    }

    # TODO jar support
    #if ($File::Find::name =~ /\.jar$/) {
    #    $base_file = $File::Find::name;
    #}

    open (my $fh, '<', $File::Find::name)
        or die "cannot open file $File::Find::name : $!";
    binmode($fh);
    $digest -> addfile($fh);
    # path relative to the scanned directory
    my $file_name = substr($File::Find::name, length($directory)+1);
    # hexdigest also resets $digest, so the one object serves every file
    print $out_file $file_name . ";" . $digest -> hexdigest . "\n";
    close ($fh);
}

# note: $directory and $scan_file should be absolute paths, because of
# the chdir below
sub scanDirectory {
    $digest = Digest::MD5 -> new;
    ($directory, my $scan_file, $filter) = @_;
    my $base_dir = getcwd();
    chdir($directory) or die "could not change to $directory : $!";
    $out_file = FileHandle -> new;
    # O_TRUNC so that an older, longer result file leaves no stale lines
    sysopen($out_file, $scan_file, O_CREAT | O_RDWR | O_TRUNC)
        or die "could not open file $scan_file : $!";
    find ( \&process, $directory);
    close($out_file) or die "could not close file $scan_file : $!";
    chdir($base_dir);
}

sub testScanDirectory {
    scanDirectory('d:/temp/tools', 'd:/tools.txt');
    scanDirectory('d:/temp/tools', 'd:/tools_filtered.txt', '\.exe$');
}
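
The suggestion about not creating a new MD5 object for every file can be sketched on its own, roughly as follows. This is only an illustration under assumptions: md5_files is a made-up helper name, not something from Benjamin's snipped code, and whether it actually saves noticeable time on 500,000 files has not been measured here. It relies on the documented Digest::MD5 behaviour that hexdigest returns the sum and resets the object.

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical helper (name is an assumption): returns a hash reference
# mapping each file name in @$files to its hex MD5 sum.
sub md5_files {
    my ($files) = @_;
    my $md5 = Digest::MD5->new;    # created once, outside the loop
    my %sums;
    for my $file (@$files) {
        open(my $fh, '<', $file) or die "cannot open file $file : $!";
        binmode($fh);
        $md5->addfile($fh);
        # hexdigest hands back the sum and resets $md5 for the next file
        $sums{$file} = $md5->hexdigest;
        close($fh);
    }
    return \%sums;
}

# Usage: perl script.pl file1 file2 ...
my $sums = md5_files(\@ARGV);
print "$_;$sums->{$_}\n" for sort keys %$sums;
```

Because the object is reset after every hexdigest call, no state leaks from one file into the next, so the sums are the same as with a fresh object per file.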