On Thu, 6 Jul 2006, Roman Daszczyszak wrote:

> Hello all,
> 
> I am using a script to parse a CSV file with approximately 65,000
> records.  Some of these records contain characters such as é, ì, etc.
> I can read and write lines containing these characters via a file
> handle, however when I try and parse the line using the module
> Class::CSV, it fails and returns the error:
> Failed to parse line: <line it failed on>
> 
> My question, rather broadly, is how can I successfully handle these
> characters?  If I manually edit the file (via vim) and replace all the
> 'é' characters with 'e', and the like, it works fine.  However, I
> would prefer the script actually handle the characters, or at least
> have an automated way to strip them out, without having to enumerate
> each character (i.e. =~ s/é/e/g;).
> 
> Does anyone have a suggestion for how I can handle this, or even where
> I can look to solve this issue?  Is there another possibility for
> where the error is occurring that I am not seeing?
> 
> Any advice you may lend would be greatly appreciated.

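This may not solve your problem, but here is a script I used recently to
find every character with a code above 127 (that is, outside the ASCII
range). As written, it copies textdb.csv to textdb.csv.new and leaves
those characters out; the commented-out print line reports them instead.

You will probably get the idea.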

======================================================
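#!/usr/bin/perl

use strict;
use warnings;

# Copy textdb.csv to textdb.csv.new, dropping every character above 127.
open(my $in,  '<', 'textdb.csv')     or die "can't open textdb.csv: $!\n";
open(my $out, '>', 'textdb.csv.new') or die "can't open textdb.csv.new: $!\n";

while (my $line = <$in>) {
    chomp $line;    # the newline is written back explicitly below

    foreach my $char (split //, $line) {    # now check each character
        my $val = ord($char);
        print {$out} $char if $val <= 127;
        # To list the offending characters instead, uncomment:
        # print "$.\t$char\t$val\n" if $val > 127;
    }
    print {$out} "\n";
}

close($in);
close($out) or die "can't close textdb.csv.new: $!\n";
======================================================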
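
If you would rather not loop over every character, the same idea can
probably be written more compactly. The sketch below is only an
illustration, not something I have run against your data: the helper
names strip_non_ascii and fold_accents are made up for this example,
and fold_accents assumes the line has already been decoded into Perl
characters (for instance by opening the file with an :encoding(...)
layer that matches however your CSV was actually saved, which I don't
know). The first helper deletes everything above 127 with tr///; the
second uses the core Unicode::Normalize module to split 'é' into 'e'
plus a combining accent and then removes the accent, so you don't have
to enumerate each character.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD);    # core module

# Hypothetical helper: delete every character outside the ASCII range.
sub strip_non_ascii {
    my ($text) = @_;
    # c = complement the listed range, d = delete what matches
    (my $copy = $text) =~ tr/\x00-\x7F//cd;
    return $copy;
}

# Hypothetical helper: turn accented letters into their base letters.
# Assumes $text holds decoded characters, not raw bytes from the file.
sub fold_accents {
    my ($text) = @_;
    my $decomposed = NFD($text);    # "é" becomes "e" + combining accent
    $decomposed =~ s/\p{Mn}//g;     # drop the combining marks
    return $decomposed;
}

# \x{e9} is 'é'; written as an escape so the script itself stays ASCII.
print strip_non_ascii("caf\x{e9}"), "\n";    # prints "caf"
print fold_accents("caf\x{e9}"), "\n";       # prints "cafe"
```

Letters that have no accent-free decomposition (for example 'ø' or
'ß') pass through fold_accents unchanged, so you may still want to run
strip_non_ascii on the result afterwards.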




