Re: Multi-byte character support?

Simon Laws Wed, 17 Nov 2010 09:59:18 -0800

On Wed, Nov 17, 2010 at 5:31 PM, Raymond Feng <[email protected]> wrote:
> Hi,
> Java Strings are Unicode. The tricky part is when we create Strings from
> byte[] and vice versa (sometimes through streaming APIs). We need to make
> sure we use an explicit encoding such as UTF-8 instead of the default one,
> which is platform dependent.
> Thanks,
> Raymond
> ________________________________________________________________
> Raymond Feng
> [email protected]
> Apache Tuscany PMC member and committer: tuscany.apache.org
> Co-author of Tuscany SCA In Action book: www.tuscanyinaction.com
> Personal Web Site: www.enjoyjava.com
> ________________________________________________________________
> On Nov 17, 2010, at 7:54 AM, Simon Laws wrote:
>
> Anyone know if there is any support for or any Tuscany tests for
> multi-byte character set support in any of the bindings/databindings?
>
> Simon
>
> --
> Apache Tuscany committer: tuscany.apache.org
> Co-author of a book about Tuscany and SCA: tuscanyinaction.com
>
>
Right, there is some questionable code in some places. E.g.


public class String2OMElement extends BaseTransformer<String, OMElement> implements
    PullTransformer<String, OMElement> {

    @SuppressWarnings("unchecked")
    public OMElement transform(String source, TransformationContext context) {
        try {
            StAXOMBuilder builder = new StAXOMBuilder(new ByteArrayInputStream(source.getBytes()));
            OMElement element = builder.getDocumentElement();
            AxiomHelper.adjustElementName(context, element);
            return element;
        } catch (Exception e) {
            throw new TransformationException(e);
        }
    }

This calls source.getBytes() with no encoding, so the bytes come out in
the platform default charset. I'm assuming that we don't test with
various encodings to find issues like this, but I wanted to check.
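
To make the concern concrete, here is a minimal sketch using only JDK
classes. EncodingSketch, roundTrip and rootText are hypothetical names for
illustration and are not Tuscany code. It assumes a JDK that provides
java.nio.charset.StandardCharsets (Java 7 or later); on older JDKs
Charset.forName("UTF-8") should serve the same purpose. It shows that an
explicit UTF-8 encoding survives a round trip and parses cleanly through
StAX, while a charset that cannot represent the characters loses them.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class, not part of Tuscany.
class EncodingSketch {

    // Encode and decode with the same explicit charset.
    static String roundTrip(String source, Charset charset) {
        return new String(source.getBytes(charset), charset);
    }

    // Parse XML from explicitly UTF-8 encoded bytes and return the text
    // content of the first element. With no XML declaration, the parser
    // assumes UTF-8, which matches the bytes produced here.
    static String rootText(String xml) {
        try {
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new ByteArrayInputStream(bytes));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    return reader.getElementText();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Five hiragana characters, written as escapes so the source file
        // encoding does not matter.
        String text = "\u3053\u3093\u306b\u3061\u306f";

        // UTF-8 can represent the text, so the round trip is lossless.
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text));

        // US-ASCII cannot, so the round trip does not give the text back.
        System.out.println(roundTrip(text, StandardCharsets.US_ASCII).equals(text));

        // Parsing from explicit UTF-8 bytes recovers the original text.
        System.out.println(rootText("<greeting>" + text + "</greeting>").equals(text));
    }
}
```

A test along these lines, run with the JVM's file.encoding set to something
other than UTF-8, would probably be enough to flush out places like the
transformer above.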

Simon


-- 
Apache Tuscany committer: tuscany.apache.org
Co-author of a book about Tuscany and SCA: tuscanyinaction.com
