I have a tool developed in Perl that processes a file using regexes and then pumps the data into a Berkeley DB file. Part of the logic is shown below as an example.

    my @LMUData = ();    # Initiate the variables

    # Main Logic
    open (LMUFILE, $InputFile) or die "Could not open $InputFile ";
    open (OPFILE, "> $OutputFile") or die "Could not open $OutputFile ";
    open (LOGFILE, "> $LogFile") or die "Could not open $LogFile ";

    while (<LMUFILE>) {
        @LMUData = &quotewords('\t', 0, $_);

        # do some pattern matches
        if ($LMUData[10]=~m/<.*>|Ctrl\+|^\-|[\*]|Alt\+|[\(]/) {
            print LOGFILE "$LMUData[10]\t$LMUData[11]\n";
        }
        elsif (!$LMUData[10]) {
            print LOGFILE "$LMUData[10]\t$LMUData[11]\n";
        }
        else {
            print OPFILE "$LMUData[10]\t$LMUData[11]\n";
        }
    }

    close(LOGFILE);
    close(OPFILE);
    close(LMUFILE);

I'm using the utf8 pragma, and the input file is also UTF-8. The pattern matching does not work, and in some cases I get this error:

    Use of uninitialized value in concatenation (.) or string at C:\paksa3\DataHandler.pl line 102, <LMUFILE> line 155

How do I do pattern matching for Japanese/Chinese characters?
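
For reference, here is a minimal sketch of one common way to approach this, not a confirmed fix for the script above. It assumes the usual cause applies: the file is read as raw bytes, so the regex never sees real characters. Opening the handle with an `:encoding(UTF-8)` layer decodes the input, after which Unicode script properties such as `\p{Han}`, `\p{Hiragana}` and `\p{Katakana}` can match. The helper `has_cjk`, the sample data and the two-column layout are hypothetical and only for illustration. The `defined` check addresses the warning, which is typically raised when a line has fewer fields than expected and an undefined field is printed.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper (not part of the original script): returns 1 if $text
# contains at least one Han, Hiragana, or Katakana character, otherwise 0.
sub has_cjk {
    my ($text) = @_;
    return 0 unless defined $text;   # a short line leaves the field undefined
    return $text =~ /[\p{Han}\p{Hiragana}\p{Katakana}]/ ? 1 : 0;
}

# Stand-in for the input file: two tab-separated lines encoded as UTF-8 bytes.
# "\x{65E5}\x{672C}\x{8A9E}" is the three-character word for "Japanese".
my $bytes = encode('UTF-8', "id1\t\x{65E5}\x{672C}\x{8A9E}\nid2\tCtrl+C\n");

# The :encoding(UTF-8) layer decodes bytes into characters on read, which is
# what lets the \p{...} properties match. A real file would be opened the same
# way, with the file name in place of the scalar reference.
open(my $in, '<:encoding(UTF-8)', \$bytes) or die "Could not open input: $!";

my @cjk_lines;
while (my $line = <$in>) {
    chomp $line;
    my @fields = split /\t/, $line, -1;   # -1 keeps trailing empty fields
    push @cjk_lines, $fields[0] if has_cjk($fields[1]);
}
close($in);

print "Lines with CJK text: @cjk_lines\n";
```

Output handles that receive such text would need a matching layer as well (for example `'>:encoding(UTF-8)'`), otherwise Perl warns about wide characters when printing.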