Hi folks, happy new year to everyone. :) John, you're right, of course. :) Filenames in nested directories could well overlap, and using $File::Find::name would be safer. I didn't think of that as a big problem, though, as the original script (with 'opendir') ignored nested folders altogether.
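
For anyone who wants to see the overlap for themselves, here is a minimal sketch. It is not the script from this thread: the temporary tree, the file name same.txt and the two hashes (%by_basename, %by_fullname) are made up for the demo, and only the file size is stored to keep it short.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);
use File::Spec::Functions qw(catdir catfile);

# Build a throwaway tree: two subdirectories, each holding a file
# called same.txt (all names here are hypothetical, for the demo only).
my $root = tempdir(CLEANUP => 1);
for my $sub (qw(a b)) {
    my $dir = catdir($root, $sub);
    mkdir $dir or die "mkdir $dir: $!";
    open my $fh, '>', catfile($dir, 'same.txt') or die "open: $!";
    print {$fh} "content of $sub\n";
    close $fh or die "close: $!";
}

my (%by_basename, %by_fullname);
find(sub {
    return unless -f $_;                        # skip directories
    my $size = -s _;                            # reuse the stat done by -f
    $by_basename{$_}                = [$size];  # second same.txt overwrites the first
    $by_fullname{$File::Find::name} = [$size];  # one entry per file
}, $root);

printf "keys by \$_: %d, keys by \$File::Find::name: %d\n",
    scalar(keys %by_basename), scalar(keys %by_fullname);
# prints: keys by $_: 1, keys by $File::Find::name: 2
```

With $_ as the key, one of the two files silently disappears from the hash; with $File::Find::name both survive.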
Jonathan, no, you don't have to store the filenames in an array: a file's complete pathname can't be repeated. It'll be sufficient just to change this line: $filedata{$_} = [$filesize, $filemd5] to $filedata{$File::Find::name} = [$filesize, $filemd5] (and replace the catfile call in the writing block as well, since the %filedata keys will now be the full pathnames themselves).

On sorting: cmp and <=> are not the same. The former compares strings, the latter compares numbers. So, for example, 'abc' cmp 'def' gives you -1, but 'abc' <=> 'def' gives 0 (plus warnings about non-numeric arguments, if warnings are enabled). It's nice to know the difference, but... do you really need to sort the output in your script? What output? :) It makes no difference in what order your .md5 files are created, right? And you don't need to print the list of files processed (I did that in my test script, which is the only reason the ordering was ever mentioned).

As for $_, the problem John mentioned is logical, not 'perlical': inside the sub you pass to File::Find, $_ holds the name of the file being processed without its directory. Files in different directories can share that name (but not the full name, with the path attached), so it may cause a bit of confusion when they DO have the same names. :)

Generally speaking, $_ is there for comfort and speed (yes, thinking of it as the word 'it' is right :)). Of course, you can assign to a named variable instead, but it'll make your scripts bigger (sometimes _much_ bigger) without adding much clarity, in my opinion. But that, again, is a matter of taste. As for me, I use $_ almost every time I process shallow collections (hashes or arrays, doesn't matter). When a two-level (or more complex) data structure is processed, the outer loop usually needs a named temporary variable, but even then the inner layers can be iterated with $_ easily.

-- iD
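
A small illustration of the cmp versus <=> point above, as a sketch only. The sample values and variable names are invented for the demo and have nothing to do with the .md5 script itself.

```perl
use strict;
use warnings;

my @values = qw(10 9 100);   # hypothetical sample data

# cmp compares as strings, character by character, so '10' sorts before '9'.
my @as_strings = sort { $a cmp $b } @values;
# <=> compares as numbers.
my @as_numbers = sort { $a <=> $b } @values;

print "cmp: @as_strings\n";   # cmp: 10 100 9
print "<=>: @as_numbers\n";   # <=>: 9 10 100

my ($x, $y) = ('abc', 'def');
my $string_result = $x cmp $y;      # -1, because 'abc' sorts before 'def'
my $numeric_result = do {
    no warnings 'numeric';          # silence "isn't numeric" for the demo
    $x <=> $y;                      # both strings numify to 0, so the result is 0
};
print "$string_result $numeric_result\n";   # -1 0
```

Without the no warnings line, the last comparison would still return 0 but would also complain that the arguments aren't numeric.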