Sharan Basappa on Monday, 18 December 2006 08:09:
> On 12/17/06, D. Bolliger <[EMAIL PROTECTED]> wrote:
> > Sharan Basappa on Sunday, 17 December 2006 16:22:
> >
> > Hello
> >
> > > While going through some of the old text files, which until now I
> > > thought were in good shape, I found that some of the files were
> > > corrupt and contained garbage data.
> >
> > We need to know more: what garbage? Where in the file? Are the files
> > corrupt because they were stored on, for example, old floppies?
> >
> > > I would like to know if there is a simple way to find this out
> > > using a script, since I have 100s of such files and it is difficult
> > > for me to go through all these files ..
> >
> > What's garbage and what's not depends on the format of the file
> > content, its intended usage...
> >
> > Your task may be easy or nearly impossible to solve automatically.
> > If there's a way to exactly separate garbage from non-garbage, and
> > express this by means of a script language, it may be easy.

Hello Sharan Basappa (please don't top post)

> actually these look like invalid ASCII files to me (the files seem to
> contain binary content).
> Typically this happens when I transfer files from one machine to
> another using my USB key.
> But this is not the case with all files. So this is the reason I wanted
> to know if there is a way to recursively go through all files and
> report if a file does not seem to be a valid ASCII file ..

Maybe the following script [tested] can be a start; you'd have to adapt
it to your needs.

Dani

#!/usr/bin/perl
# usage: this_script filename1 [filename2 ...]

use strict;
use warnings;

# ***Adjust to your needs***, see perldoc perlre
#
# (invalid defined as "not in the set of valid chars")
#
my $invalid=qr/[^0-9a-zA-Z_!?.;,\s"'()-]/;

my @invalids=(); # contains filenames

for my $fn (@ARGV) {
  # report which file failed, not only the system error
  open my $fh, '<', $fn or die "Can't open '$fn': $!";
  while (<$fh>) {
    if (/($invalid)/) {
      warn "'$fn' seems to have a first invalid char '$1' on line $.\n";
      push @invalids, $fn;
      last; # one hit per file is enough
    }
  }
  close $fh or die "Can't close '$fn': $!";
}

warn "\nfiles with invalid chars:\n", (join "\n", @invalids), "\n";

__END__
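
The question also asked about going through all files recursively and about "valid ASCII", which the script above does not cover: it only checks the filenames given on the command line. The following is a rough sketch of one possible variation, not a definitive solution. It uses the core File::Find module to walk directories. It assumes that "valid" means printable ASCII (0x20 to 0x7E) plus tab, LF and CR; that definition, and the sub names first_invalid_line and scan_tree, are made up for this illustration and would need adjusting.

```perl
#!/usr/bin/perl
# usage: this_script dir1 [dir2 ...]

use strict;
use warnings;
use File::Find;

# Assumption: anything outside printable ASCII, tab, LF and CR is "invalid".
my $invalid = qr/[^\x09\x0A\x0D\x20-\x7E]/;

# Returns the line number of the first line containing an invalid byte,
# or 0 if the file contains none. (hypothetical helper name)
sub first_invalid_line {
    my ($fn) = @_;
    # ':raw' reads plain bytes, without any encoding or newline translation
    open my $fh, '<:raw', $fn or die "Can't open '$fn': $!";
    my $line = 0;
    # a lexical variable is used so the caller's $_ is left alone
    while (my $text = <$fh>) {
        if ($text =~ $invalid) {
            $line = $.;
            last;
        }
    }
    close $fh or die "Can't close '$fn': $!";
    return $line;
}

# Walks the given directories recursively and returns the sorted names of
# all plain files that contain an invalid byte. (hypothetical helper name)
sub scan_tree {
    my @dirs = @_;
    my @invalids;
    find(
        {
            no_chdir => 1,    # stay in the start directory while walking
            wanted   => sub {
                my $path = $File::Find::name;
                return unless -f $path;    # skip directories etc.
                push @invalids, $path if first_invalid_line($path);
            },
        },
        @dirs
    );
    my @sorted = sort @invalids;
    return @sorted;
}

if (@ARGV) {
    print "$_\n" for scan_tree(@ARGV);
}
else {
    warn "usage: $0 dir1 [dir2 ...]\n";
}
```

Called as `this_script ~/textfiles`, it would print one suspicious filename per line. Note that a binary file without newlines is read as one very long "line", and that an unreadable file aborts the whole run; both may or may not be acceptable for your files.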