Hey there,
  My database population script is getting close to being done, and I'm at the point 
of trying out all of my different input files.  One set of them contains Kanji characters, 
which means the files are encoded as UCS-2 Unicode.  If necessary, I can parse them 
manually, but I would prefer to use the CSV driver.  I've noticed that the CSV driver 
gives me much better stability than parsing the files myself, and I would really 
prefer to stick with it...
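
  In case it helps to see where I'm headed, here's roughly what I'm hoping to end up 
with.  This is only a sketch, assuming a DBD::CSV/DBD::File new enough to support the 
f_encoding attribute, and I haven't confirmed that it copes with UTF-16 data; the table 
name is just for illustration...

use DBI;

# Sketch only: point the CSV driver at the current directory and ask it
# to decode the files as UTF-16 (the BOM picks the byte order).
my $dbh = DBI->connect("dbi:CSV:", undef, undef, {
    f_dir      => ".",
    f_encoding => "UTF-16",
    RaiseError => 1,
}) or die $DBI::errstr;

# Map the attached data file to a table name.
$dbh->{csv_tables}{consum} = { f_file => "consum.dat" };

my $sth = $dbh->prepare("SELECT * FROM consum");
$sth->execute;
while (my $row = $sth->fetchrow_arrayref) {
    print join("|", @$row), "\n";
}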

  I have attached one of my input files.  It is a UCS-2 file created with Windows 
Notepad and contains Unicode Kanji characters...

  Also, here is a quick (and admittedly non-robust) sub for detecting Unicode files 
by their byte-order mark....

sub CheckUnicode {
  my ($file) = @_;
  # Look for the little-endian byte-order mark (0xFF 0xFE) that
  # Notepad writes at the start of "Unicode" (UCS-2LE) files.
  open(my $fh, '<', $file) or die "Can't open $file: $!";
  binmode $fh;
  my $buf;
  read($fh, $buf, 2);
  close $fh;
  if (defined $buf && length($buf) == 2
      && ord(substr($buf, 0, 1)) == 255
      && ord(substr($buf, 1, 1)) == 254) {
    return "UNICODE";
  }
  return "ASCII";
}
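
  For what it's worth, something like this is how it could be wired up to pick a 
PerlIO layer for reading.  It only catches the little-endian FF FE BOM that Notepad 
writes; a big-endian FE FF file would be missed...

my $file  = "consum.dat";
my $layer = CheckUnicode($file) eq "UNICODE" ? ":encoding(UTF-16)" : "";
open(my $in, "<$layer", $file) or die "Can't open $file: $!";
while (my $line = <$in>) {
    # $line is decoded text here; the :encoding(UTF-16) layer reads
    # the BOM and sorts out the byte order for us.
    print $line;
}
close $in;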

  I appreciate any advice that is given on this subject...

Thank you,
amonotod


-- 
    `\|||/         amonotod@    | subject line: 
      (@@)         charter.net  | no perl, no read...
  ooO_(_)_Ooo________________________________
  _____|_____|_____|_____|_____|_____|_____|_____|

Attachment: consum.dat
Description: Binary data
