Earlier this morning, a friend of mine asked me for a script that
would, "given a list of files, replace underscores with spaces", to
take this:

Artist_Name-Track_Name.mp3

and rename it to this:

Artist Name-Track Name.mp3

The script was mindlessly simple, and I felt it would make a good HOWTO
for the perl beginners crowd, both to show some good code practices and
to counteract the ... "controversial" HOWTO that had been posted a week
or so ago. Certainly, if you find this as misguided as his, complain
onlist with better examples, or offlist with anger.

The first bit of code I wrote him is below. Save for the additional
explanatory comments, it's nearly exact.

#!/usr/bin/perl

# start all your scripts with these two lines.
# they are the best teacher you will ever find for
# writing perl code. they make you smarter, and
# you'll last longer in bed; no drugs necessary.
#
use warnings;
use strict;

# he needed the script to read every file in a directory
# and rename them based on whether they had underscores
# in the name. to make the script as 'immediately runnable'
# as possible, I assumed he would be placing the script
# in the directory full of files, and running it there.
# as such, we'll be opening the current working directory.
# if anything goes wrong, we stop processing with the error.
# ALWAYS CHECK FOR SUCCESS BEFORE CONTINUING.
#
opendir(DIR, ".") or die $!;

# next, we load all the files in that directory into
# an array. at this point, we could also have used the
# grep() function to filter out entries that weren't
# relevant, but for readability I chose a more configurable
# approach (see below). a directory handle is closed with
# closedir(), not close().
#
my @files = readdir(DIR);
closedir(DIR);

# now, we need to loop through all the directory files,
# stored in @files. we're going to use $_ here, which
# can be remembered as "the thing we want to work with".
# foreach sets $_ to each entry in turn. we could just as
# easily have used an explicit variable name, and in
# larger scripts, you usually want to.
#
foreach (@files) {

   # so, the filename is now stored in $_, and since $_ is
   # assumed for a number of Perl's functions, we don't
   # have to explicitly mention it in the following filters.
   # these filters serve one purpose: make sure we're working
   # ONLY with files we should be. these are sanity checks:
   # we're ensuring that we're not operating on anything we
   # shouldn't. this is good practice: ALWAYS CHECK YOUR SANITY.
   # rule out everything you don't want, and focus on everything
   # you do. first, we skip directories with the -d check.
   # the syntax I'm using below is far more readable than a bunch
   # of if/else statements: we're not increasing our indents, and
   # we don't have to worry about a zillion open/closing brackets.
   # it also reads more like English.
   #
   next if -d;

   # if we're still here, we've got a file. we'll automatically
   # skip files that begin with a "." as they're usually considered
   # "special", and renaming them can be a bad thing.
   #
   next if /^\./;

   # and finally, if the file doesn't have any underscores in
   # it, we can skip it immediately. again, this is for safety:
   # the rest of our code could operate on the file and rename
   # it with the same filename, but why waste that processing
   # power? it's just dirty. it's how bad things happen.
   #
   next unless /_/;

   # at this point, we're assuming that this is a file
   # we're supposed to be working on. so, we copy the file
   # name, do our "underscore for space" conversion, and
   # issue a rename() from the original name to the new.
   # again, if something goes wrong with the rename, we
   # die immediately. this is probably overly cautious.
   #
   my $new_name = $_;
   $new_name =~ s/_/ /g;
   rename($_, $new_name) or die $!;
}

And that's the script.
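As an aside, the comments above mention that grep() could have done the
filtering in one go. Here's a rough sketch of what that might look
like. It's not what I sent him: the wanted_entries() name is made up
for this example, I've used a lexical directory handle, and I've turned
it into a dry run that only prints what it would rename.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# hypothetical helper, not part of the original script: given
# entry names from the current directory, keep only plain,
# non-hidden files that actually contain an underscore.
sub wanted_entries {
    return grep { !(-d $_) && !/^\./ && /_/ } @_;
}

opendir(my $dh, ".") or die $!;
my @files = wanted_entries(readdir($dh));
closedir($dh);

# dry run: show each rename instead of performing it.
foreach my $old_name (@files) {
    (my $new_name = $old_name) =~ s/_/ /g;
    print "$old_name -> $new_name\n";
}
```

It's the same three checks, just squeezed into one expression. Swap the
print for a rename(...) or die $! once the output looks right.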
For readability, no comments:

#!/usr/bin/perl
use warnings;
use strict;

opendir(DIR, ".") or die $!;
my @files = readdir(DIR);
closedir(DIR);

foreach (@files) {
   next if -d;
   next if /^\./;
   next unless /_/;
   my $new_name = $_;
   $new_name =~ s/_/ /g;
   rename($_, $new_name) or die $!;
}

It worked fine for him, and we moved on. A few hours later, he asked
for a recursive version, and whether that would be "hard to do". While
I wasn't around to help him out, the weak solution was easy: just move
the script into each new directory and run it again.

But, there are two other solutions to this new request: the bad one,
and the good one. The bad one is to assume the first script is perfect:
it's not. It only works on the current directory, and it was written on
the assumption that no recursion is necessary.

A bad approach to the recursive problem is to start modifying the above
script to manually support it: people think "hey, I got this working,
recursion must be simple as pie, right?!". Usually, they'll end up with
something like (pseudo non-working code follows):

my DIRECTORIES = "start_directory"
foreach DIRECTORY (DIRECTORIES) {
   get list of ENTRIES in DIRECTORY
   foreach ENTRY (ENTRIES) {
      if ENTRY is a DIRECTORY, add to DIRECTORIES
   }
   finished DIRECTORY; remove it from DIRECTORIES
}

And you know what? This approach *can* work, but you're reinventing the
wheel: this is such a common problem ("how do I recurse through
directories") that it has been mentioned in a zillion FAQs. But no one
reads FAQs, and no one reads HOWTOs, so we're gonna be blowing gas for
the rest of our lives.

The proper solution to recursing directories is File::Find. It's
included with every distribution of Perl, is quick and easy to use, and
allows code that looks nearly exactly like our first example. It's also
far more platform-agnostic than you'd ever expect your code to need to
be.
The revised script:

#!/usr/bin/perl
use warnings;
use strict;
use File::Find;

# we no longer have to read directories
# ourselves: File::Find takes care of that
# for us - we just define a subroutine for
# what we want to do with what's been found.
#
find(\&underscores, ".");

# and here is that subroutine. it's nearly exactly
# the same as our previous code, with two differences.
# we're inside a subroutine now, not a loop, so we
# return instead of using next. and by default,
# File::Find has already moved us into the directory
# that contains the file, with just the file's name
# in $_, so a plain rename() works. a more proper
# solution would be to stay where we are in the directory
# structure (File::Find's no_chdir option), and give
# full paths to our rename(). this would require the help
# of another module, File::Spec. find out more with
# "perldoc File::Spec". it's handy.
#
sub underscores {
   return if -d $_;
   return if /^\./;
   return unless /_/;
   my $new_name = $_;
   $new_name =~ s/_/ /g;
   rename($_, $new_name) or die $!;
}

One of the best traits you can learn as a Perl programmer is mastering
the use of the core modules, as well as how to find what you need on
CPAN: a good metric ton of your code will look far cleaner, and be far
easier to understand and far more maintainable (and FAR more documented
too!). Likewise, you'll get far more done, and with fewer "doh!" bugs.

Try to understand that a good number of the problems you'll face in
programming have been solved for you: it's just a matter of taking the
time to find the answer instead of coding your own "solution" that
really isn't.

Yep.

--
Morbus Iff ( shower your women, i'm coming )
Technical: http://www.oreillynet.com/pub/au/779
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus
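
P.S. The comments in the revised script mention a "more proper"
approach: stay put and hand full paths to rename(). Here's a rough
sketch of how that might look, assuming File::Find's no_chdir option
and File::Spec's splitpath()/catpath(). The underscores_no_chdir() and
rename_tree() names are made up for this example, and it only does
anything if you pass it a starting directory on the command line.

```perl
#!/usr/bin/perl
use warnings;
use strict;
use File::Find;
use File::Spec;

# hypothetical name. with no_chdir set, File::Find doesn't move
# into each directory, and $File::Find::name holds the path to
# the entry, starting from the directory we gave to find().
sub underscores_no_chdir {
    my $old_path = $File::Find::name;
    return if -d $old_path;

    # split the path so only the file name itself is rewritten,
    # never the directories leading up to it.
    my ($volume, $dirs, $file) = File::Spec->splitpath($old_path);
    return if $file =~ /^\./;
    return unless $file =~ /_/;

    (my $new_file = $file) =~ s/_/ /g;
    my $new_path = File::Spec->catpath($volume, $dirs, $new_file);
    rename($old_path, $new_path) or die "$old_path: $!";
}

# hypothetical wrapper, so the starting directory is a parameter.
sub rename_tree {
    my ($start_dir) = @_;
    find({ wanted => \&underscores_no_chdir, no_chdir => 1 }, $start_dir);
}

# only run when a directory is given, e.g.: perl script.pl .
rename_tree($ARGV[0]) if @ARGV;
```

Splitting the path first matters: running s/_/ /g over the whole path
would also rewrite any underscores in the directory names, and the
rename() would then point at a directory that doesn't exist.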