On Wed, Nov 17, 2010 at 5:31 PM, Raymond Feng <[email protected]> wrote:
> Hi,
> Java Strings are Unicode internally. The tricky part is creating Strings
> from byte[] and vice versa (sometimes through streaming APIs). We need to
> make sure we use an explicit encoding such as UTF-8 instead of the default
> one, which is platform dependent.
> Thanks,
> Raymond
> ________________________________________________________________
> Raymond Feng
> [email protected]
> Apache Tuscany PMC member and committer: tuscany.apache.org
> Co-author of Tuscany SCA In Action book: www.tuscanyinaction.com
> Personal Web Site: www.enjoyjava.com
> ________________________________________________________________
> On Nov 17, 2010, at 7:54 AM, Simon Laws wrote:
>
> Does anyone know whether any of the bindings/databindings support
> multi-byte character sets, or whether there are any Tuscany tests for that?
>
> Simon
>
> --
> Apache Tuscany committer: tuscany.apache.org
> Co-author of a book about Tuscany and SCA: tuscanyinaction.com
>
>
Right, there is some questionable code in places. For example:

    public class String2OMElement extends BaseTransformer<String, OMElement>
            implements PullTransformer<String, OMElement> {

        @SuppressWarnings("unchecked")
        public OMElement transform(String source, TransformationContext context) {
            try {
                StAXOMBuilder builder =
                    new StAXOMBuilder(new ByteArrayInputStream(source.getBytes()));
                OMElement element = builder.getDocumentElement();
                AxiomHelper.adjustElementName(context, element);
                return element;
            } catch (Exception e) {
                throw new TransformationException(e);
            }
        }
    }

The transform method calls source.getBytes() with no encoding, so the
bytes come out in the platform default charset. I'm assuming that we
don't test with various encodings to find issues like this, but I wanted
to check.
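
To make the concern concrete, here is a rough sketch using only the JDK's
StAX API rather than Axiom, so it is not the Tuscany code itself. The class
and method names (EncodingSketch, textViaReader, textViaBytes) are made up
for illustration, and StandardCharsets assumes Java 7 or later. It compares
parsing a String through a Reader, which never converts to bytes, with
parsing through a byte[] in a given charset:

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration class; not part of Tuscany or Axiom.
final class EncodingSketch {

    // Parse XML straight from the String's characters. No byte conversion
    // happens, so no charset is involved at all.
    public static String textViaReader(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            return collectText(reader);
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encode the String with an explicit charset and tell the parser which
    // charset was used. Characters the charset cannot represent are replaced
    // during getBytes(charset), before the parser ever sees them.
    public static String textViaBytes(String xml, Charset charset) {
        try {
            byte[] bytes = xml.getBytes(charset);
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new ByteArrayInputStream(bytes), charset.name());
            return collectText(reader);
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    // Concatenate every CHARACTERS event in the document.
    private static String collectText(XMLStreamReader reader) throws XMLStreamException {
        StringBuilder text = new StringBuilder();
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    text.append(reader.getText());
                }
            }
        } finally {
            reader.close();
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Three CJK characters, written as escapes so the source file's own
        // encoding does not matter.
        String xml = "<a>\u65e5\u672c\u8a9e</a>";
        System.out.println(textViaReader(xml).equals("\u65e5\u672c\u8a9e"));
        System.out.println(textViaBytes(xml, StandardCharsets.UTF_8).equals("\u65e5\u672c\u8a9e"));
        // ISO-8859-1 cannot represent these characters, so they are lost.
        System.out.println(textViaBytes(xml, StandardCharsets.ISO_8859_1));
    }
}
```

A test along these lines, run with the JVM's default charset set to
something other than UTF-8, would probably be enough to show whether the
no-argument getBytes() call in the transformer loses data.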
Simon
--
Apache Tuscany committer: tuscany.apache.org
Co-author of a book about Tuscany and SCA: tuscanyinaction.com