Nadim <[EMAIL PROTECTED]> writes:
>Hi, I guess the following question has already been answered somewhere but I
>couldn't find it on the archive. A search feature for the archive would be
>great. After reading 20-30 mails I started to get tired of it. Has anyone
>seen a mail-archive crawler which can do a search. (That sounds almost
>like a fun script to write)
>
>More seriously. I am using 5.6.3 on windows from activestate. I do the
>following.
I don't think you are. As far as I am aware there is only perl5.6.1;
there isn't a .3 subversion yet.
>>>>>>>
>my $ole_object = ..... ;
>my $unicode_string = $ole_object->GetUnicodeString() ;
OLE objects are a Win32 thing. You would be better off asking on one of the
Win32-aware ActiveState lists. We would at least need to know how you created
$ole_object so we can look up the code that gets the string.
>
>print length($unicode_string), "\n" ;
># prints 17, which is the length of the unicode string
Cool - but are you sure you got the real string?
>
>use byte () ;
>print byte::length($unicode_string), "\n" ;
># prints 17, wow, the string is japanese I expect 34
The byte:: hackery is _very_ confusing to all concerned.
It returns the length the string happens to have in perl's internal encoding.
That may be either iso-8859-1 or UTF-8. If the original "japanese" happened
to be all iso-8859-1 even though it used to be 2-bytes/char it will be
held (normally) by perl as 1-byte per-char.
You will also get 1-byte/char if (as I suspect is happening here)
->GetUnicodeString has converted things it does not understand to '?'.
>
>print $unicode_string ;
># prints ??????????????? on the console
Hmm - as perl5.6 does not have "smart" Unicode IO (perl5.8 does), this suggests
that the string is actually '?' x 17 - i.e. you got "junk" back from the OLE call.
>
>print FILE $unicode_string ;
># prints ??????????????? in the file
><<<<<<
Likewise.
>
>What the script is to do is:
>1/ get a unicode string from an Ole object
Contact an OLE expert and find out if that works.
>2/ read a unicode string from a file
For perl5.6 the file has to be in UTF-8 and you need to do some hackery
(which was so horrible I can't recall it).
For perl5.8 this is easy - it was a major goal of perl5.8.
>3/ compare both strings and act upon the comparison
Once you have two Unicode strings this is easy.
>
>if the string I get from ole _is_ unicode (and it seems so)
What leads you to that conclusion?
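One way to check, as a minimal sketch rather than a definitive recipe: dump the
code points of the string and compare its character length with its byte
length. describe_string is a hypothetical helper, and both sample strings are
made up for illustration (they stand in for whatever the OLE call returned).
Note that the pragma is spelled "bytes", not "byte". If every code point comes
out as U+003F, the string really is a run of '?' and the original characters
were lost before perl ever saw them.

```perl
use strict;
use bytes ();   # loaded without importing, so plain length() still counts characters

# Hypothetical helper: report what a string really contains.
sub describe_string {
    my ($s) = @_;
    my @codes = map { ord($_) } split //, $s;          # one entry per character
    return {
        chars  => length($s),                           # length in characters
        bytes  => bytes::length($s),                    # length in perl's internal encoding
        all_qm => (@codes && !grep { $_ != 0x3F } @codes) ? 1 : 0,
        codes  => join(' ', map { sprintf 'U+%04X', $_ } @codes),
    };
}

# Illustrative inputs only (assumptions, not data from the original post).
my $junk = '?' x 17;                                          # what a lossy conversion yields
my $real = join '', map { chr($_) } 0x65E5, 0x672C, 0x8A9E;   # three Japanese characters

for my $s ($junk, $real) {
    my $d = describe_string($s);
    print "chars=$d->{chars} bytes=$d->{bytes} all_question_marks=$d->{all_qm}\n";
    print "  $d->{codes}\n";
}
```

For the '?' x 17 sample this reports 17 characters and 17 bytes. For the
three-character sample it reports 3 characters and 9 bytes, because perl holds
those characters internally as UTF-8, not as 2-bytes/char.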
>how can I
>flatten it to binary? I tried with unpack without success.
>
>Nadim.
-- 
Nick Ing-Simmons
http://www.ni-s.u-net.com/