Vladimir Strigun wrote:
> Hi all!
>
> I'm happy to announce one more contribution to harmony on behalf of
> Intel. Provided implementation of charset encoders/decoders is
> intended to replace the ICU-based charsets encoding/decoding
> operations. The code was developed in clean-room environment inside
> Intel and I'd like you to play with it and include to current Harmony
> tree.
>
> The package could be found there:
> HARMONY-3593
>
> The algorithms for charsets encoding/decoding differs from that of
> ICU, all charsets are generated from current Harmony or any other
> implementation of Java and could be properly integrated into current
> nio_char module. The archive contains source files for 6 charsets:
> GB18030, US-ASCII, ISO-8859-1, UTF-8, UTF-16, UTF-16BE, UTF-16LE;
> implementation of CharsetProvider; generator for other Charsets and
> native part. I've tested the package with more than 90 charsets, and
> all benchmarks and tests passed with new bundle. Additionally I have
> significant boost for Dacapo.antlr and Dacapo.xalan benchmarks with
> current Harmony tree on DRLVM and IBM VM. On DRLVM I have 2.5x boost
> for antlr and ~5-8x for xalan.
>
> The main advantages of the package are the following:
> - Code for every charset is generated by CharsetGenerator, thus, if
> some modification would be necessary we need just correct generator
> and re-generate all sources.
> - We use 2 different encoders and decoders for java and direct
> buffers. Since most applications use java heap buffers, unlike
> existing implementation it doesn't produce lots of native calls to
> perform encoding/decoding operations on the java buffers thus
> significantly improving performance. This is the main reason why we
> have such a significant boost for Dacapo.

wow, this is huge! Is there any significant change in the speed of
string creation too?

> - Charset tables for encoding/decoding are stored in appropriate classes.
>
> Since the package contains implementation for 6 charsets only,
> documentations how to generate and build additional charsets you could
> find in README file from contributed package.
>
> Please do not hesitate to contact me for more details.

Thanks for the awesome contribution.

--
Stefano.
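
For readers following the thread, the heap-versus-direct split described in the quoted announcement can be sketched roughly as below. This is only an illustration, not code from the HARMONY-3593 archive: the class name `Latin1EncodeSketch` and its methods are hypothetical, ISO-8859-1 is picked because its mapping needs no table, and the fallback loop merely stands in for whatever path the contribution uses for direct buffers.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;

// Hypothetical sketch: chooses an encoding loop based on how the buffers are backed.
public class Latin1EncodeSketch {

    /** Encodes chars as ISO-8859-1 and returns the number of chars consumed. */
    public static int encode(CharBuffer in, ByteBuffer out) {
        // Heap buffers expose their backing arrays, so the whole loop can
        // run over plain Java arrays with no per-element buffer calls.
        if (in.hasArray() && out.hasArray()) {
            return encodeHeap(in, out);
        }
        // Direct (or read-only) buffers have no accessible array.
        return encodeGeneric(in, out);
    }

    /** Fast path: works directly on the backing char[] and byte[]. */
    public static int encodeHeap(CharBuffer in, ByteBuffer out) {
        char[] src = in.array();
        byte[] dst = out.array();
        int sp = in.arrayOffset() + in.position();
        int dp = out.arrayOffset() + out.position();
        int n = Math.min(in.remaining(), out.remaining());
        for (int i = 0; i < n; i++) {
            char c = src[sp + i];
            // Chars above U+00FF are unmappable in ISO-8859-1; substitute '?'.
            dst[dp + i] = (byte) (c <= 0xFF ? c : '?');
        }
        in.position(in.position() + n);
        out.position(out.position() + n);
        return n;
    }

    /** Fallback path: goes through the buffer API one element at a time. */
    public static int encodeGeneric(CharBuffer in, ByteBuffer out) {
        int n = 0;
        while (in.hasRemaining() && out.hasRemaining()) {
            char c = in.get();
            out.put((byte) (c <= 0xFF ? c : '?'));
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(16);
        ByteBuffer direct = ByteBuffer.allocateDirect(16);
        System.out.println(encode(CharBuffer.wrap("harmony".toCharArray()), heap));
        System.out.println(encode(CharBuffer.wrap("harmony".toCharArray()), direct));
    }
}
```

The dispatch in `encode` is the point of the sketch: when both buffers are array-backed, the work stays inside one Java loop, which is the case the announcement says most applications hit.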
