On Thu, 6 Jul 2006, Roman Daszczyszak wrote:

> Hello all,
>
> I am using a script to parse a CSV file with approximately 65,000
> records. Some of these records contain characters such as é, ì, etc.
> I can read and write lines containing these characters via a file
> handle, however when I try and parse the line using the module
> Class::CSV, it fails and returns the error:
> Failed to parse line: <line it failed on>
>
> My question, rather broadly, is how can I successfully handle these
> characters? If I manually edit the file (via vim) and replace all the
> 'é' characters with 'e', and the like, it works fine. However, I
> would prefer the script actually handle the characters, or at least
> have a automated way to strip them out, without having to enumerate
> each character (i.e. =~ s/é/e/g;).
>
> Does anyone have a suggestion for how I can handle this, or even where
> I can look to solve this issue? Is there another possibility for
> where the error is occurring that I am not seeing?
>
> Any advice you may lend would be greatly appreciated.
This may not solve your problem, but here is a script I used recently to
find every character with a code above 127 (that is, outside the ASCII
range) and strip it out, writing the cleaned lines to a new file. You
will probably get the idea.

======================================================
#!/usr/bin/perl
use strict;
use warnings;

open (PHCSV, ">textdb.csv.new") or die "can't open textdb.csv.new: $!\n";
open (FH, "textdb.csv") or die "can't open textdb.csv: $!\n";
while (<FH>) {
    chomp;                            # the newline is written back below
    foreach my $n (split //, $_) {    # now check each character
        my $val = ord($n);
        # keep plain ASCII, silently drop everything above 127
        print PHCSV $n if $val <= 127;
        # uncomment to report each offending character instead:
        #print "$.\t$n\t$val\n" if ($val > 127);
    }
    print PHCSV "\n";
}
close (FH);
close (PHCSV) or die "can't close textdb.csv.new: $!\n";
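
If you would rather turn é into e than throw the character away, one
possibility is to decompose each accented letter into its base letter
plus a combining mark and then delete the marks. What follows is only a
rough sketch using the core Encode and Unicode::Normalize modules.
to_ascii is a hypothetical helper name of my own, and the 'iso-8859-1'
encoding is an assumption about your file (your post does not say how it
is encoded), so adjust it to match your data.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);
use Unicode::Normalize qw(NFD);

# Hypothetical helper (not part of Class::CSV): fold accented letters
# to their base letters, then drop anything still outside ASCII.
# $encoding is an assumption about the input; change it to fit the file.
sub to_ascii {
    my ($bytes, $encoding) = @_;
    my $text = decode($encoding, $bytes);  # raw bytes -> Perl characters
    $text = NFD($text);                    # "é" becomes "e" + combining acute
    $text =~ s/\p{Mn}//g;                  # remove the combining marks
    $text =~ s/[^\x00-\x7F]//g;            # drop whatever has no ASCII base
    return $text;
}

# "caf\xE9" is "café" encoded as ISO-8859-1.
print to_ascii("caf\xE9,na\xEFve,\xECl\n", 'iso-8859-1');
```

You could run each line through something like this before handing it
to the CSV parser. Characters with no canonical decomposition (ß, for
example) are simply dropped by the last substitution, so check the
output if your data contains any of those.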