Re: ObjectRepresentation and String encoding

2008-09-24 Thread Hannes Ebner
Hi Jerome,

Jerome Louvel wrote:
 After looking again at your issue with Thierry, we concluded that
 there is no bug, just some classic character-encoding confusion
 :-).

OK, that's what I meant in my last message when I wrote that it seems to
be an encoding issue which might not even be related to Restlet.

 Now, with your Restlet approach, the object is serialized using Java 
 serialization (binary scheme). When it gets deserialized, it restores
 strings in the JVM as UTF-16 (Java's internal encoding for strings).
 When you print those strings to your console, there is an issue
 because the console expects another encoding (ISO-8859-1) and has no
 way to automatically convert your UTF string.

I thought that Java would convert it to the system's default encoding
when writing, e.g., to the console.

Thanks for looking into this and your explanation!

Best regards,
Hannes


Re: ObjectRepresentation and String encoding

2008-09-23 Thread Jerome Louvel

Hi Hannes,

After looking again at your issue with Thierry, we concluded that there 
is no bug, just some classic character-encoding confusion :-).


What happens is probably this: when you use your plain socket, your 
objects are serialized to XML, which contains metadata about the 
encoding used by the client. When the server receives and deserializes 
the XML, it can decode it to the local character encoding when printing 
its content.
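
As a rough illustration of the "metadata about the encoding" point, here
is a minimal sketch. It assumes java.beans.XMLEncoder as the XML
serializer, which is only a stand-in: the thread does not say which
Object-to-XML library the old service uses. The class and method names
are hypothetical.

```java
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: XMLEncoder is an assumed stand-in for the
// unnamed Object-to-XML serializer mentioned in the thread.
final class XmlEncodingSketch {

    // Serializes a value to XML bytes. XMLEncoder uses UTF-8 by default
    // and writes an XML declaration that names that charset.
    static byte[] toXml(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        XMLEncoder encoder = new XMLEncoder(buffer);
        encoder.writeObject(value);
        encoder.close();
        return buffer.toByteArray();
    }

    // Reads the first object back from XML bytes.
    static Object fromXml(byte[] xml) {
        XMLDecoder decoder = new XMLDecoder(new ByteArrayInputStream(xml));
        try {
            return decoder.readObject();
        } finally {
            decoder.close();
        }
    }

    public static void main(String[] args) {
        byte[] xml = toXml("Une cha\u00eene de caract\u00e8res.");
        // The printed document starts with a declaration naming the charset.
        System.out.println(new String(xml, StandardCharsets.UTF_8));
        System.out.println(fromXml(xml));
    }
}
```

Because the charset travels inside the document, the receiving side does
not have to guess how the bytes were encoded.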


Now, with your Restlet approach, the object is serialized using Java 
serialization (binary scheme). When it gets deserialized, it restores 
strings in the JVM as UTF-16 (Java's internal encoding for strings). 
When you print those strings to your console, there is an issue because 
the console expects another encoding (ISO-8859-1) and has no way to 
automatically convert your UTF-16 string.
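
To check that the serialization step itself does not damage the text, a
round trip can be tested in isolation. This is a minimal sketch using
only java.io; the class and method names are hypothetical, and it does
not involve Restlet's ObjectRepresentation at all.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical sketch: serializes a String with plain Java serialization
// and reads it back, independently of the platform's default encoding.
final class SerializationRoundTrip {

    static String roundTrip(String value) {
        try {
            // Write the string in Java's binary serialization format.
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(buffer);
            out.writeObject(value);
            out.close();

            // Read it back into a new String instance.
            ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()));
            try {
                return (String) in.readObject();
            } finally {
                in.close();
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String original = "Une cha\u00eene de caract\u00e8res.";
        // Compares the characters, not how a console renders them.
        System.out.println(original.equals(roundTrip(original)));
    }
}
```

If the comparison holds on both machines, the characters survive the
transfer and only the final display step is in question.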


One solution is to change the encoding of your console to UTF-16. The 
other is to convert your string to your local encoding before printing. 
You could use the java.io.OutputStreamWriter for this purpose, wrapping 
the System.out stream and passing ISO-8859-1 as the encoding.
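
A minimal sketch of the second option, assuming the console really
expects ISO-8859-1. The helper name writerFor is hypothetical, and
StandardCharsets requires Java 7 or later (on Java 5 the charset name
"ISO-8859-1" could be passed as a string instead).

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: print text through a writer with an explicit
// charset instead of relying on the platform default.
final class ConsoleEncodingSketch {

    // Wraps a byte stream in a writer that encodes characters with the
    // given charset; println calls flush automatically.
    static PrintWriter writerFor(OutputStream out, Charset charset) {
        return new PrintWriter(new OutputStreamWriter(out, charset), true);
    }

    public static void main(String[] args) {
        // Assumption: the terminal displaying System.out uses ISO-8859-1.
        PrintWriter console = writerFor(System.out, StandardCharsets.ISO_8859_1);
        console.println("Une cha\u00eene de caract\u00e8res.");
        console.flush();
    }
}
```

The same helper can target any other charset if the console turns out to
use something different.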


Some text editors are smart enough and can detect the encoding of a text 
file. That's probably why Gedit works for you. I hope this clarifies the 
issue. We are closing issue #525 now.


Best regards,
Jérôme Louvel
--
Restlet ~ Founder and Lead developer ~ http://www.restlet.org

Noelios Technologies ~ Co-founder ~ http://www.noelios.com



Re: ObjectRepresentation and String encoding

2008-09-15 Thread Hannes Ebner
Hi Thierry,

Thierry Boileau wrote:
 I had a look at the issue and I don't see what's wrong. I was able to
 send a serialized object from a client using UTF-8 to a server using
 ISO-8859-1 without encoding issues.
 Could you send us a reproducible test case, and also send us the trace
 of the following code on both client and server side?

I planned to write a reproducible test case, but I didn't get that far.
I tried your small application first (the one attached to the bug
report) and ran it directly on the server which uses ISO-8859-1.

The server sent some data to itself and printed it on the console, so I
guess there is no change in the encoding involved.

The result that I got on the console was:

Une cha�ne de caract�res.

I could also reproduce it on another server with ISO-8859-1, but not on
a third one which was configured for UTF-8.

With UTF-8 I got the correct string:

Une chaîne de caractères.

I also tried to pipe the console output into a file, which I transferred
to my development machine (which uses UTF-8). A simple cat of this file
on the console showed question marks like above, but when I opened the
same file on the same machine with a graphical editor (gedit), the
special characters showed up correctly. I'm confused now; it seems to be
an encoding issue which might not even be related to Restlet.
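
One reading consistent with this observation (an assumption, not
something confirmed in the thread) is that the file holds ISO-8859-1
bytes, and that a UTF-8 terminal cannot decode them while gedit detects
the encoding. The effect can be reproduced in a few lines; the class and
method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: shows what happens when ISO-8859-1 bytes are
// interpreted as UTF-8, as a UTF-8 terminal would do with such a file.
final class MojibakeSketch {

    // Encodes the text as ISO-8859-1, then decodes those bytes as UTF-8.
    // Bytes that are not valid UTF-8 become the replacement character U+FFFD.
    static String decodeLatin1BytesAsUtf8(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String garbled = decodeLatin1BytesAsUtf8("Une cha\u00eene de caract\u00e8res.");
        // Each accented letter has been replaced by U+FFFD.
        System.out.println(garbled.indexOf('\uFFFD') >= 0);
    }
}
```

Decoding the same bytes as ISO-8859-1 returns the original text, which
would match what gedit displayed.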

The questions are now: how do I solve it, and why does it work to
transfer the special characters with the old version of my service,
which does not use Restlet (it uses Object-to-XML serialization over a
socket)?

I also attached the requested system properties to this mail: one file
with the settings from an ISO-8859-1 server, the other one with UTF-8. I
don't know whether it helps you to find something; I couldn't see
anything strange.

Perhaps you or somebody else on the list has some experience with
problems related to character encodings. I'm out of ideas right now.

Best regards,
Hannes

[Attachment: system properties from the UTF-8 machine]
java.runtime.name=Java(TM) 2 Runtime Environment, Standard Edition
sun.boot.library.path=/usr/lib/j2sdk1.5-sun/jre/lib/i386
java.vm.version=1.5.0_11-b03
java.vm.vendor=Sun Microsystems Inc.
java.vendor.url=http://java.sun.com/
path.separator=:
java.vm.name=Java HotSpot(TM) Client VM
file.encoding.pkg=sun.io
sun.java.launcher=SUN_STANDARD
user.country=US
sun.os.patch.level=unknown
java.vm.specification.name=Java Virtual Machine Specification
user.dir=/mnt/user/home/ebner/test-case
java.runtime.version=1.5.0_11-b03
java.awt.graphicsenv=sun.awt.X11GraphicsEnvironment
java.endorsed.dirs=/usr/lib/j2sdk1.5-sun/jre/lib/endorsed
os.arch=i386
java.io.tmpdir=/tmp
line.separator=

java.vm.specification.vendor=Sun Microsystems Inc.
os.name=Linux
sun.jnu.encoding=UTF-8
java.library.path=/usr/lib/j2sdk1.5-sun/jre/lib/i386/client:/usr/lib/j2sdk1.5-sun/jre/lib/i386:/usr/lib/j2sdk1.5-sun/jre/../lib/i386
java.specification.name=Java Platform API Specification
java.class.version=49.0
sun.management.compiler=HotSpot Client Compiler
os.version=2.6.18-6-686
user.home=/home/ebner
user.timezone=
java.awt.printerjob=sun.print.PSPrinterJob
file.encoding=UTF-8
java.specification.version=1.5
java.class.path=test.jar
user.name=ebner
java.vm.specification.version=1.0
java.home=/usr/lib/j2sdk1.5-sun/jre
sun.arch.data.model=32
user.language=en
java.specification.vendor=Sun Microsystems Inc.
java.vm.info=mixed mode, sharing
java.version=1.5.0_11
java.ext.dirs=/usr/lib/j2sdk1.5-sun/jre/lib/ext
sun.boot.class.path=/usr/lib/j2sdk1.5-sun/jre/lib/rt.jar:/usr/lib/j2sdk1.5-sun/jre/lib/i18n.jar:/usr/lib/j2sdk1.5-sun/jre/lib/sunrsasign.jar:/usr/lib/j2sdk1.5-sun/jre/lib/jsse.jar:/usr/lib/j2sdk1.5-sun/jre/lib/jce.jar:/usr/lib/j2sdk1.5-sun/jre/lib/charsets.jar:/usr/lib/j2sdk1.5-sun/jre/classes
java.vendor=Sun Microsystems Inc.
file.separator=/
java.vendor.url.bug=http://java.sun.com/cgi-bin/bugreport.cgi
sun.io.unicode.encoding=UnicodeLittle
sun.cpu.endian=little
sun.cpu.isalist=

[Attachment: system properties from the ISO-8859-1 server]
java.runtime.name=Java(TM) 2 Runtime Environment, Standard Edition
sun.boot.library.path=/usr/lib/jvm/java-1.5.0-sun-1.5.0.14/jre/lib/i386
java.vm.version=1.5.0_14-b03
java.vm.vendor=Sun Microsystems Inc.
java.vendor.url=http://java.sun.com/
path.separator=:
java.vm.name=Java HotSpot(TM) Server VM
file.encoding.pkg=sun.io
sun.java.launcher=SUN_STANDARD
user.country=US
sun.os.patch.level=unknown
java.vm.specification.name=Java Virtual Machine Specification
user.dir=/var/collaborilla-rest
java.runtime.version=1.5.0_14-b03
java.awt.graphicsenv=sun.awt.X11GraphicsEnvironment
java.endorsed.dirs=/usr/lib/jvm/java-1.5.0-sun-1.5.0.14/jre/lib/endorsed
os.arch=i386
java.io.tmpdir=/tmp
line.separator=

java.vm.specification.vendor=Sun Microsystems Inc.
os.name=Linux
sun.jnu.encoding=ISO-8859-1
java.library.path=/usr/lib/jvm/java-1.5.0-sun-1.5.0.14/jre/lib/i386/server:/usr/lib/jvm/java-1.5.0-sun-1.5.0.14/jre/lib/i386:/usr/lib/jvm/java-1.5.0-sun-1.5.0.14/jre/../lib/i386
java.specification.name=Java Platform API Specification
java.class.version=49.0

Re: ObjectRepresentation and String encoding

2008-09-11 Thread Thierry Boileau

Hi Hannes,

I had a look at the issue and I don't see what's wrong. I was able to
send a serialized object from a client using UTF-8 to a server using
ISO-8859-1 without encoding issues.
Could you send us a reproducible test case, and also send us the trace
of the following code on both client and server side?

   // requires: import java.util.Map;
   // prints every system property as key=value
   for (Map.Entry<Object, Object> entry :
           System.getProperties().entrySet()) {
       System.out.print(entry.getKey());
       System.out.print("=");
       System.out.println(entry.getValue());
   }
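
Since the full trace is long, a narrower variant that keeps only the
encoding-related entries may be easier to compare between machines. This
is a minimal sketch; the class and method names are hypothetical, and
stringPropertyNames requires Java 6 or later.

```java
import java.nio.charset.Charset;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical sketch: narrows the system properties down to the entries
// whose name mentions "encoding", sorted by name.
final class EncodingProperties {

    static Map<String, String> encodingRelated(Properties props) {
        Map<String, String> result = new TreeMap<String, String>();
        for (String name : props.stringPropertyNames()) {
            if (name.contains("encoding")) {
                result.put(name, props.getProperty(name));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> entry
                : encodingRelated(System.getProperties()).entrySet()) {
            System.out.println(entry.getKey() + "=" + entry.getValue());
        }
        // The charset the JVM uses when no charset is given explicitly.
        System.out.println("default charset=" + Charset.defaultCharset());
    }
}
```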

Best regards,
Thierry Boileau
--
Restlet ~ Core developer ~ http://www.restlet.org
Noelios Technologies ~ Co-founder ~ http://www.noelios.com



Re: ObjectRepresentation and String encoding

2008-09-11 Thread Hannes Ebner
Hi Thierry,

Thierry Boileau wrote:
 Could you send us a reproducible test case, and also send us the trace
 of the following code on both client and server side?

I will try to reproduce it with a small test case and get back to you.

Best regards,
Hannes


RE: ObjectRepresentation and String encoding

2008-07-04 Thread Jerome Louvel

Hi Hannes,

It looks like a bug, but after looking at the code I don't see what we are doing 
wrong, as we have no control over the encoding for Object serialization.

Anyway, I've entered a bug report:

Encoding issue with ObjectRepresentation
http://restlet.tigris.org/issues/show_bug.cgi?id=525

If you could attach a reproducible test case (client+server code), that would 
help us fix it more quickly. Also, could you add a comment to the report 
indicating which client and server connectors you are using?

Best regards,
Jerome


-----Original Message-----
From: Hannes Ebner [mailto:[EMAIL PROTECTED]]
Sent: Monday, 30 June 2008 12:37
To: discuss@restlet.tigris.org
Subject: Re: ObjectRepresentation and String encoding

Hi Stephan,

Stephan Koops wrote:
 you could explicitly set the character encoding of a representation.
 Perhaps you have to set ISO-8859-1 on the representation? Use
 Representation.setCharacterSet(...)

I don't think that this works with serialized objects. I tried to set
the character set, but it didn't show up in the HTTP header on the other
side.

I tried to serialize the very same object myself (without Restlets
involved), and sent it directly via a TCP Socket, and it worked. Could
this be a bug somewhere in Restlet's ObjectRepresentation?

Best regards,
Hannes



Re: ObjectRepresentation and String encoding

2008-07-04 Thread Hannes Ebner
Hi Jerome,

Jerome Louvel wrote:
 It looks like a bug, but after looking at the code I don't see what we
 are doing wrong, as we have no control over the encoding for Object
 serialization.
 
 Anyway, I've entered a bug report:

great, thanks!

 If you could attach a reproducible test case (client+server code),
 that would help us fix it more quickly. Also, could you add a comment
 to the report indicating which client and server connectors you are
 using?

Yes, I will do this in the next few days.

Best regards,
Hannes


Re: ObjectRepresentation and String encoding

2008-06-30 Thread Hannes Ebner
Hi Stephan,

Stephan Koops wrote:
 you could explicitly set the character encoding of a representation.
 Perhaps you have to set ISO-8859-1 on the representation? Use
 Representation.setCharacterSet(...)

I don't think that this works with serialized objects. I tried to set
the character set, but it didn't show up in the HTTP header on the other
side.

I tried to serialize the very same object myself (without Restlets
involved), and sent it directly via a TCP Socket, and it worked. Could
this be a bug somewhere in Restlet's ObjectRepresentation?

Best regards,
Hannes