HOWTO: File Renaming and Directory Recursion

Morbus Iff Thu, 01 Apr 2004 16:06:35 -0800

Earlier this morning, a friend of mine asked me for a script that would
"given a list of files, replace underscores with spaces", to take this:


  Artist_Name-Track_Name.mp3

and rename it to this:

  Artist Name-Track Name.mp3

The script was mindlessly simple, and I felt it would be a good HOWTO
for the Perl beginners crowd, not only to show some good coding
practices, but also to counteract the ... "controversial" HOWTO that had
been posted a week or so ago. Certainly, if you find this as misguided
as his, complain onlist with better examples, or offlist with anger.

The first bit of code I wrote him was below. Save for
the additional explanatory comments, it's nearly exact.

 #!/usr/bin/perl

 # start all your scripts with these two lines.
 # they are the best teacher you will ever find for
 # writing perl code. they make you smarter, and
 # you'll last longer in bed; no drugs necessary.
 #
 use warnings;
 use strict;

 # he needed the script to read every file in a directory
 # and rename them based on whether they had underscores
 # in the name. to make the script as 'immediately runnable'
 # as possible, I assumed he would be placing the script
 # in the directory full of files, and running it there.
 # as such, we'll be opening the current working directory.
 # if anything goes wrong, we stop processing with the error.
 # ALWAYS CHECK FOR SUCCESS BEFORE CONTINUING.
 #
 opendir(DIR, ".") or die $!;

 # next, we load all the files in that directory into
 # an array. at this point, we could also have used the
 # grep() function to filter out entries that weren't
 # relevant, but for readability I chose a more configurable
 # approach (see below).
 #
 my @files = readdir(DIR);
 closedir(DIR); # directory handles need closedir(), not close().

 # now, we need to loop through all the directory entries
 # stored in @files. foreach puts each one into $_ in turn,
 # which can be remembered as "the thing we want to work with".
 # we could just as easily have used an explicit variable name,
 # and in larger scripts, you usually want to.
 #
 foreach (@files) {

    # so, the filename is now stored in $_, and since $_ is
    # assumed for a number of Perl's functions, we don't
    # have to explicitly mention it in the following filters.
    # these filters serve one purpose: make sure we're working
    # ONLY with files we should be. these are sanity checks:
    # we're ensuring that we're not operating on anything we
    # shouldn't. this is good practice: ALWAYS CHECK YOUR SANITY.
    # rule out everything you don't want, and focus on everything
    # you do. first, we skip directories with the -d check.
    # the syntax I'm using below is far more readable than a bunch
    # of if/else statements: we're not increasing our indents, and
    # we don't have to worry about a zillion open/closing brackets.
    # it also reads more like English.
    #
    next if -d;

    # if we're still here, we've got a file. we'll automatically
    # skip files that begin with a "." as they're usually considered
    # "special", and renaming them can be a bad thing.
    #
    next if /^\./;

    # and finally, if the file doesn't have any underscores in
    # it, we can skip it immediately. again, this is for safety:
    # the rest of our code could operate on the file and rename
    # it with the same filename, but why waste that processing
    # power? it's just dirty. it's how bad things happen.
    #
    next unless /_/;

    # at this point, we're assuming that this is a file
    # we're supposed to be working on. so, we copy the file
    # name, do our "underscore for space" conversion, and
    # issue a rename() from the original name to the new.
    # again, if something goes wrong with the rename, we
    # die immediately. this is probably overly cautious.
    #
    my $new_name = $_;
    $new_name    =~ s/_/ /g;
    rename($_, $new_name) or die $!;
 }

And that's the script. For readability, no comments:

 #!/usr/bin/perl
 use warnings;
 use strict;

 opendir(DIR, ".") or die $!;
 my @files = readdir(DIR);
 closedir(DIR);

 foreach (@files) {
    next if -d;
    next if /^\./;
    next unless /_/;

    my $new_name = $_;
    $new_name    =~ s/_/ /g;
    rename($_, $new_name) or die $!;
 }
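
The grep() alternative mentioned in the comments might look something
like the sketch below. The wrapper name rename_underscores() and its
directory argument are my own inventions for illustration (the original
script always works on "."), and joining paths with "/" assumes a
platform where that works.

```perl
use warnings;
use strict;

# hypothetical wrapper, not part of the original script: rename
# the files in $dir and return how many were renamed.
sub rename_underscores {
    my ($dir) = @_;

    opendir(my $dh, $dir) or die "can't open $dir: $!";

    # grep() does the job of the three "next" filters in one pass:
    # no dotfiles, must contain an underscore, must not be a directory.
    my @files = grep {
        !/^\./ && /_/ && !-d "$dir/$_"
    } readdir($dh);
    closedir($dh);

    foreach my $old_name (@files) {
        (my $new_name = $old_name) =~ s/_/ /g;
        rename("$dir/$old_name", "$dir/$new_name") or die $!;
    }

    return scalar @files;
}
```

Calling rename_underscores(".") should behave like the script above.
The filters are more compact this way, but they're also harder to
comment one at a time, which is why I went with "next".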

It worked fine for him, and we moved on. A few hours later, he
asked for a recursive version, and whether that would be "hard
to do". While I wasn't around to help him out, the weak solution
was easy: just move the script into each new directory and run
it again. But, there are two other solutions to this new request:
the bad one, and the good one.

The bad one is to assume the first script is perfect: it's not.
It works only on the current directory, and it assumes that no
recursion is necessary. A bad approach to the recursive problem
is to start modifying the above script to support it by hand:
people think "hey, I got this working, recursion must be
simple as pie, right?!". Usually, they'll end up with something
like (pseudo non-working code follows):

 my DIRECTORIES = "start_directory"

 foreach DIRECTORY (DIRECTORIES) {
     get list of ENTRIES in DIRECTORY

     foreach ENTRY (ENTRIES) {
         if ENTRY is a DIRECTORY, add to DIRECTORIES
     }

     finished DIRECTORY; remove it from DIRECTORIES
 }

And you know what? This approach *can* work, but you're reinventing
the wheel: this is such a common problem ("how do I recurse through
directories") that it has been mentioned in a zillion FAQs. But no
one reads FAQs, and no one reads HOWTOs, so we're gonna be blowing
gas for the rest of our lives.
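
To give a feel for how much bookkeeping the by-hand approach takes,
here is one way the pseudocode could be fleshed out. It's only a
sketch: list_files_by_hand() is a made-up name, it just lists files
rather than renaming them, skipping symlinked directories is my own
choice, and joining paths with "/" assumes a platform where that works.

```perl
use warnings;
use strict;

# hypothetical helper: walk $start by hand with a to-do list of
# directories, and return the path of every non-directory found.
sub list_files_by_hand {
    my ($start) = @_;
    my @directories = ($start);
    my @found;

    # shift() takes each directory off the to-do list as we go;
    # push() adds the subdirectories we discover along the way.
    while (defined(my $dir = shift @directories)) {
        opendir(my $dh, $dir) or die "can't open $dir: $!";
        my @entries = readdir($dh);
        closedir($dh);

        foreach my $entry (@entries) {
            # "." and ".." point back at ourselves and our parent:
            # follow them and we loop forever.
            next if $entry eq "." or $entry eq "..";

            my $path = "$dir/$entry";

            # a symlink to a directory could lead us in circles
            # too, so this sketch simply skips those.
            next if -l $path and -d $path;

            if (-d $path) { push @directories, $path }
            else          { push @found, $path }
        }
    }

    return @found;
}
```

That's already two traps (the dot entries and symlinks) that the
pseudocode never mentions, and there are surely more.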

The proper solution to recursing directories is File::Find. It's
included with every distribution of Perl, is quick and easy to use,
and allows code that looks nearly exactly like our first example.
It's also far more platform-agnostic than you'd ever expect your
code to need to be. The revised script:

 #!/usr/bin/perl
 use warnings;
 use strict;

 use File::Find;

 # we no longer have to read directories
 # ourselves: File::Find takes care of that
 # for us - we just define a subroutine for
 # what we want to do with what's been found.
 #
 find(\&underscores, ".");

 # and here is that subroutine. it's nearly exactly
 # the same as our previous code, with two differences.
 # first, we're in a subroutine and not a loop, so we
 # "return" instead of "next". second, File::Find moves
 # into each directory before calling us, so $_ is the
 # bare filename and rename() works without a path. relying
 # on that is a quick hack because I knew this wouldn't be
 # production-code: a more proper solution would be to stay
 # where we are in the directory structure, and give full
 # paths to our rename(). this would require the help of
 # another module, File::Spec. find out more with
 # "perldoc File::Spec". it's handy.
 #
 sub underscores {
    return if -d $_;
    return if /^\./;
    return unless /_/;

    my $new_name = $_;
    $new_name    =~ s/_/ /g;
    rename($_, $new_name) or die $!;
 }
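
The "more proper" solution mentioned in the comments (stay put, and
hand full paths to rename()) might look roughly like this. Treat it as
a sketch under assumptions: rename_tree() is a name I made up, and I'm
leaning on File::Find's no_chdir option plus File::Basename to get at
the bare filename.

```perl
use warnings;
use strict;

use File::Find;
use File::Basename qw(basename);
use File::Spec;

# hypothetical wrapper: rename everything under $top without
# ever leaving the directory we started in.
sub rename_tree {
    my ($top) = @_;

    # no_chdir tells File::Find to stay where it is; the full
    # path of each entry is available in $File::Find::name.
    find({ no_chdir => 1, wanted => sub {
        my $old_path = $File::Find::name;
        my $name     = basename($old_path);

        # the same three filters, applied to the bare filename.
        return if -d $old_path;
        return if $name =~ /^\./;
        return unless $name =~ /_/;

        # build the new full path portably with File::Spec.
        (my $new_name = $name) =~ s/_/ /g;
        my $new_path = File::Spec->catfile($File::Find::dir, $new_name);
        rename($old_path, $new_path) or die "$old_path: $!";
    } }, $top);

    return;
}
```

Calling rename_tree(".") covers the same ground as the script above.
Like that script, it only renames files: directories with underscores
in their names are left alone.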

One of the best traits you can learn as a Perl programmer is
mastering the use of the core modules, as well as how to find what
you need on CPAN: a good metric ton of your code will be far cleaner,
far easier to understand, and far more maintainable (and FAR more
documented too!). Likewise, you'll get far more done, and with
fewer "doh!" bugs. Try to understand that a good number of the problems
you'll face in programming have been solved for you: it's just a
matter of taking the time to find the answer instead of coding your
own "solution" that really isn't.

Yep.

-- 
Morbus Iff ( shower your women, i'm coming )
Technical: http://www.oreillynet.com/pub/au/779
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus
