Here's a patch that I think will work (I did some minimal testing and it
worked out OK):
RCS file: /cvsroot/inline-java/Inline-Java/Java/Protocol.pm,v
retrieving revision 1.33
diff -r1.33 Protocol.pm
413c413
< return join(".", unpack("C*", $s)) ;
---
> return join(".", unpack("U*", $s)) ;
420c420
< return pack("C*", split(/\./, $s)) ;
---
> return pack("U*", split(/\./, $s)) ;
and
RCS file: /cvsroot/inline-java/Inline-Java/Java/sources/InlineJavaProtocol.java,v
retrieving revision 1.2
diff -r1.2 InlineJavaProtocol.java
614,615c614,615
< byte b[] = {(byte)Integer.parseInt(ss)} ;
< sb.append(new String(b)) ;
---
> char c = (char)Integer.parseInt(ss) ;
> sb.append(new String(new char [] {c})) ;
623c623,624
< byte b[] = s.getBytes() ;
---
> char c[] = new char[s.length()] ;
> s.getChars(0, c.length, c, 0) ;
625c626
< for (int i = 0 ; i < b.length ; i++){
---
> for (int i = 0 ; i < c.length ; i++){
629c630,631
< sb.append(String.valueOf(b[i])) ;
---
> sb.append((int)c[i]) ;
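
As a rough illustration of what the Java side does once the patch is applied,
here is a standalone sketch. The class name DottedCodec and the method names
encode/decode are made up for this example and are not the names used in
InlineJavaProtocol.java; it also uses StringBuilder for brevity. Each char is
written as its decimal value and the values are joined with dots, so no
byte-to-character conversion is involved.

```java
import java.util.StringTokenizer;

// Hypothetical standalone class; not part of Inline::Java.
public class DottedCodec {
    // Write each UTF-16 char of s as its decimal value, separated by dots.
    static String encode(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append('.');
            }
            sb.append((int) s.charAt(i));
        }
        return sb.toString();
    }

    // Rebuild the string by turning each decimal token back into one char.
    static String decode(String s) {
        StringBuilder sb = new StringBuilder();
        StringTokenizer st = new StringTokenizer(s, ".");
        while (st.hasMoreTokens()) {
            sb.append((char) Integer.parseInt(st.nextToken()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "caf\u00e9";
        String wire = encode(original);
        System.out.println(wire);                           // 99.97.102.233
        System.out.println(decode(wire).equals(original));  // true
    }
}
```

Note that a Java char holds a single UTF-16 code unit, so characters outside
the Basic Multilingual Plane would still need separate handling in a scheme
like this one.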
Let me know how it turns out.
Patrick
---------------------
Patrick LeBoutillier
Laval, Quebec, Canada
----- Original Message -----
From: "Patrick LeBoutillier" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Thursday, June 05, 2003 8:28 AM
Subject: Re: Inline::Java and utf8
> Dave,
>
> It's possible that there is a problem here. Inline::Java uses a very simple
> (and somewhat inefficient) encoding to pass the data between Perl and Java.
>
> Here is the corresponding code:
>
> sub encode {
>     my $s = shift ;
>
>     return join(".", unpack("C*", $s)) ;
> }
>
> and
>
> String Decode(String s){
>     StringTokenizer st = new StringTokenizer(s, ".") ;
>     StringBuffer sb = new StringBuffer() ;
>     while (st.hasMoreTokens()){
>         String ss = st.nextToken() ;
>         byte b[] = {(byte)Integer.parseInt(ss)} ;
>         sb.append(new String(b)) ;
>     }
>     return sb.toString() ;
> }
>
>
> It breaks up the string byte by byte and reconstructs it on the other side.
> This probably doesn't work with multibyte characters, since it likely creates
> a character for each byte.
>
> If you have time to check this out and send me a patch, that would be great,
> but I don't currently have the time to investigate this myself. I have no
> problem reviewing the encoding completely; I did it like this to make sure I
> could implement the protocol line by line. Maybe escaping only the \n's would
> have been sufficient.
>
> Anyways comments/suggestions are welcome.
>
>
> ---------------------
> Patrick LeBoutillier
> Laval, Quebec, Canada
> ----- Original Message -----
> From: "Dave LaMacchia" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Wednesday, June 04, 2003 9:19 PM
> Subject: Inline::Java and utf8
>
>
> >
> > I'm working on some code that uses Inline::Java to parse user input in
> > order to make calls to a corba interface in front of an oracle
> > database.
> >
> > I found when I fetch utf8 data from the database, all is well
> > (assuming I've set my locale -- this is on Solaris 2.8 -- to
> > en_US.UTF-8). When I go the other way, however, passing data from
> > perl to Java via Inline, I get data corruption in the non-ASCII
> > characters.
> >
> > I thought that I might have to convert the strings to UCS2, since
> > that's what Java uses internally, but this results in java errors due
> > to embedded null characters.
> >
> > Has anyone run into this problem before? Any suggestions how to get
> > around it? I'm using perl 5.8 so I shouldn't have to insert a use
> > utf8 pragma. Note also that I've confirmed the data is correct in the
> > perl code before the embedded Java is called.
> >
> > Thanks!
> >
> > --dave
> >
>