On Friday 30 November 2007 22:47, [EMAIL PROTECTED] wrote:

Sorry for taking so long to reply but I was mulling it over while 
eating cold pizza and watching NCAAF.  You don't specify the exact file 
names so I will assume that they match the pattern /\A\d+\z/ which is 
what you use when you copy them.


> Sorry to plop so much shabby code up here but I think I've stared at
> this a little too long and am now incapable of catching my error.
>
> This started out as just a helper script to help solve this problem:
>
> Needing to cp files with number names from one directory to another.
> Sometimes there might be some numbered files already in the target
> directory.   So the problem was really two fold.
>
> 1)  Find the right files to copy by specific content.
>
>     I'm doing this job with grep -rl, it seems to be the fastest by
>     a hefty margin, to compile a list of the right files.
>
> 2) Once the list is compiled, copy the files over to a new directory
>    making sure not to clobber possible exisiting files.
>
> I'll probably try to turn the perl script into something more general
> that does the whole job eventually if this chore looks like it might
> recur enough.
>
> So (2) is all this script was meant to do at this point.
>
> The script does the job and appears to run error free.  But after the
> work is done I attempt to report 2 things.  The number of files
> copied and the number of files now in the Target.
>
> The second count is consistently wrong by a wide wide margin and I
> am too blind to see why.
>
> Here is a sample run (code encluded at the end) running the script
> the way it was intended to be used (script.pl TargetDir SourceList):
>
>  ./cpNoClobber.pl test3 SourceList
>
>   <16> Files copied to:
>  
> /home/reader/News/agent/nntp/news.gmane.org/gmane/comp/lang/perl/test
>3
>
>   <17> files now in:
>  
> /home/reader/News/agent/nntp/news.gmane.org/gmane/comp/lang/perl/test
>3
>
> Now lets look at the results of `ls' on the target:
>
>   ls test3/[0-9]*|wc -l
>   336
>
> The script gets numbering right with no clobbers I think, but the
> reporting is way wrong.
>
> =====* Code *=====     =====* Code *=====     =====* Code *=====
>
>   ./cpNoClobber.pl
>
> #!/usr/local/bin/perl -w

You were asking why 'use warnings' instead of 'perl -w':

perldoc perllexwarn

might help explain better.  If not then ask here further.  And you 
should also have strict enabled:

use strict;


> if(!$ARGV[1]){
>    usage();
>    print "Too few arguments.. exiting\n";
>    exit;
> }
>
> if($ARGV[2]){
>    usage();
>    print "Too many arguments... exiting\n";
>    exit;
> }

You are doing boolean tests on the contents of array elements when you 
should really be testing the array itself:

if ( @ARGV < 2 ) { # Too few arguments

if ( @ARGV > 2 ) { # Too many arguments

The difference is that "0" is a valid file/directory name but it is 
false in boolean context in Perl.


In my opinion (and you may ignore this if you please) you have a 
repeating pattern (usage; print; exit) that you should encapsulate in 
one place, for example:

sub usage {
    print <<EOM, @_;
usage message

EOM
    exit;
    }

@ARGV < 2 and usage( "Too few arguments.. exiting\n" );
@ARGV > 2 and usage( "Too many arguments.. exiting\n" );


> use Cwd;
> my $Cwd = getcwd;
>
> use File::Copy;
>
> my ($HiNum, $Copied, $TrgDir, @DirContent, $Cnt, $Line);
> $TrgDir = shift;
> $HiNum  = 0;
> $Copied = 0;
>
> opendir(DIR,$TrgDir) or die "can't opendir $TrgDir: $!";
>
> ## Find out if target directory has any numbered files in it.
> ## If it does get the highest number in there
> @DirContent = grep { /^[0-9]/ } readdir(DIR);
> if ($DirContent[0]){

Again, you are doing boolean tests on the contents of array elements 
when you should really be testing the array itself.  What if 
$DirContent[0] contained the file name "0"?

if ( @DirContent ) {

But the test is superfluous because foreach will not loop over an empty 
array.


>    for(@DirContent){
>      if($_ > $HiNum){
>         $HiNum = $_;
>      }
>    }
> }
>
> ## If we already have numbered files, Start our counter with
> ## the highest one
> if($HiNum){
>    $Cnt = $HiNum;
> }

The test is superfluous and therefore the $HiNum variable is 
superfluous.  Just do:

for ( @DirContent ) {
    $Cnt = $_ if $Cnt < $_;
    }

[  Why is it superfluous you ask?

  Both variables start out with the value 'undef'.

  At the @DirContent loop $HiNum will either contain a number or remain
  at 'undef'.

  At the if test $Cnt will either recieve the number from $HiNum or
  remain at 'undef'.

  At the end both variables will either have the same number or both
  will still be 'undef'.

  You are using two variables where only one is needed.
]


> while(<>){
>    chomp;
>    $Cnt++ ;
>    ## The first if clause here is unnecessary now... to be removed
>    if($TrgDir){
>       ## Add the correct path if our line doesn't start with '/'
>      if(!/^\//){
>          $Line = $Cwd ."/". $_;
>       }
>       if($TrgDir !~ /^\//){
>          $TrgDir = $Cwd."/".$TrgDir;
>       }

Assuming a unix-like system then prepending $Cwd is redundant, $_ and 
$TrgDir should work as is.


>       ## Do what we came here for: copy the target files into target
>       ## directory with fnames that won't clobber any existing
> filenames copy($Line,$TrgDir. "/". $Cnt) or die "can't copy $Line to
> <$TrgDir>: $!"; $Copied++;
>    }
> }
>
> ## Report how many files were copied
> print " <$Copied> Files copied to:\n $TrgDir\n\n";
> print " ".CountTrgDir()."\n";
> close(DIR);

To close a directory handle you need to use closedir() instead.


> sub CountTrgDir {
>   my $DirCnt = 0;
>   ## Recount the TargetDir
>   @DirCnt = grep { /^[0-9]+/ } readdir(DIR);

/^[0-9]+/ and /^[0-9]/ will both match the same files but /^[0-9]/ is 
more efficient as it only has to match one character.


> ## Just in case there was something wrong with my technique using
> ## the array to get a count.. also tried the formulation commented
> out ## below to see the actual count as it happened... still wrong #
> @DirCnt = grep { /^[0-9]+/&& print "Counting <". $DirCnt++.">\n" }
> readdir(DIR);
> return "<".(@DirCnt + 1)."> files now in:\n $TrgDir\n";

If @DirCnt contains 4 file names then @DirCnt in scalar context will be 
equal to 4 and you are adding 1 to that and saying that you have 5 
files.  Just use @DirCnt itself for the actual count of files:

return '<' . @DirCnt . "> files now in:\n $TrgDir\n";


> }
>
> sub usage {
>  print <<EOM;
>
> Purpose: To copy files from a list on disk to the directory of
> choice. Numbering them with nonClobber names as we go (even if some
> numbered files are already in the directory).
> Usage: \`$myscript TARGETDIRECTORY SOURCEFILE'
>         (DIRECTORY= name of target directory)
> NOTE: SOURCEFILE is likely to have been compiled like this:
>       grep -rl 'REGEX'  SOMEDIRECTORY >SOURCEFILE (where
> SOMEDIRECTORY is a directory full of email or posts.  We are looking
> to skim off those with REGEX placing the filenames in SOURCEFILE, and
> copy them to another directory using \`$myscript ', so as to leave
> mail or news directory unharmed.
>
> EOM
> }


If you want to incorporate the grep into the perl program then this may 
work (UNTESTED):

#!/usr/bin/perl
use warnings;
use strict;

use File::Find;

( my $myscript = $0 ) =~ s!\A.*/!!;

@ARGV == 3 or die "usage: $myscript 'REGEX' SOMEDIRECTORY 
TARGETDIRECTORY\n";

my $regex = qr/$ARGV[0]/;

my ( $SrcDir, $TrgDir ) = @ARGV[ 1, 2 ];

my ( $OldCnt, $Num );
{   opendir my $dh, $TrgDir or die "Cannot open '$TrgDir' $!";
    while ( my $file = readdir $dh ) {
        next unless -f "$TrgDir/$file" and $file =~ /\A\d+\z/;
        $Num = $file if $Num < $file;
        $OldCnt++;
        }
    }

my $CopyCnt;
find sub {
    return unless -f and /\A\d+\z/;
    open my $fh, '<:raw', $_
        or die "Cannot open '$_' $!";
    my $size = -s $fh;
    $size == read $fh, my $data, $size
        or die "Cannot read '$_' $!";
    return unless $data =~ $regex;

    $Num++;
    open my $out, '>:raw', "$TrgDir/$Num"
        or die "Cannot open '$TrgDir/$Num' $!";
    print $out $data
        or die "Cannot print to '$TrgDir/$Num' $!";
    $CopyCnt++;
    }, $SrcDir;

my $NewCnt;
{   opendir my $dh, $TrgDir or die "Cannot open '$TrgDir' $!";
    while ( my $file = readdir $dh ) {
        next unless -f "$TrgDir/$file" and $file =~ /\A\d+\z/;
        $NewCnt++;
        }
    }

## Report how many files were copied here
## Left as an exercise for the OP   :-)


John
-- 
use Perl;
program
fulfillment

-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/


Reply via email to