Thank you Martin, Sherman and Alan for your valuable input. I have done some initial analysis of ICU4J. There are some compatibility issues between the ICU4J charsets and the JDK charsets, but I am more concerned about performance, as the JDK optimizations do not exist in that implementation. I think we need to work with the ICU4J community to resolve those issues before we remove those charsets from the JDK.

The primary reason we are interested in contributing the charsets to OpenJDK is so that Java users of all locales get a seamless experience when they move between OpenJDK and other implementations. I agree that reducing the number of charsets is good from a footprint and maintenance perspective. I believe the maintenance effort for the charsets is usually low, as we hardly make any changes to the charsets once they are developed. Also, the charsets are usually independent of each other and hence will not affect Java users unless they are used. As more members of my team would like to actively participate in the OpenJDK community, I hope maintenance of any issues reported on the IBM charsets will not be a problem going forward.

As we discussed before, the footprint issue can be avoided if we enable the IBM charsets on an as-needed basis with a build flag. As you advised, we can enable the IBM charsets by default only on the AIX platform, and users can enable them on other platforms as needed.

If all of you agree, we can start working on moving all IBM charsets from jdk.charsets to a different module, jdk.ibm.charsets, and enabling them by default only on the AIX platform. We can consider removing them from the JDK in the future if the community finds them to be an overhead or not adding value.

Please advise.

Thank you,
Nasser Ebrahim

From: Alan Bateman <alan.bate...@oracle.com>
To: Xueming Shen <xueming.s...@oracle.com>, Nasser Ebrahim <enas...@in.ibm.com>
Cc: core-libs-dev@openjdk.java.net
Date: 07/19/2018 03:44 PM
Subject: Re: Adding new IBM extended charsets

On 19/07/2018 08:27, Xueming Shen wrote:
> Hi Nasser,
>
> From OpenJDK's perspective it would be preferred to direct developers
> to use the charset implementation provided by IBM, or a reliable third
> party that has the appropriate knowledge, experience and resources to
> support/maintain those charsets, such as the icu4j charset project.
> I have been pulling the data from that huge icu-charset-data file and
> implementing/maintaining them based on my best knowledge, but I'm sure
> engineers from IBM or the icu project can probably do a much better
> job of implementing/maintaining/updating those charsets going forward.
>
> As a first step we can separate those IBM charsets from jdk.charsets
> into a separate package somewhere and configure them to be built into
> java.base and jdk.charsets, for the AIX platform only. Then we can
> further discuss the best way to handle/distribute those charsets that
> are not needed for the java.base module (for vm startup). As I said,
> it would be ideal if we can remove them from the openjdk
> repo/binaries completely and direct the developer/user to use the
> icu4j charset provider for those encodings, when needed. But given the
> possible compatibility concern, we might want to phase this work out
> gradually in the next major release.

I agree, and in terms of phasing I don't think it would be too disruptive if the EBCDIC charsets were dropped from jdk.charsets in JDK 12, at least on the mainstream platforms. As we've established in this thread, the ICU4J project does seem to publish its charset provider to Maven, so there are alternatives for applications that really need these charsets.

Nasser - do you do any testing with the ICU4J charsets? I quickly tried 62.1 and it seems to work fine on the class path. I didn't check for any compatibility differences or compare the performance, but maybe you have. It's a bit awkward to test this provider as an automatic module due to the unusual naming of these JAR files. They may not have looked at modules yet, but the ability to link the icu4j.charsets module into a run-time image seems like something that people may want to do in the future.

-Alan
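
For anyone who wants to check for the compatibility differences mentioned above, here is a minimal sketch of one way to compare two implementations of the same single-byte code page. It is not taken from the thread: the class name CharsetCompat and the method firstMismatch are hypothetical, and it only uses the standard java.nio.charset API. It decodes each of the 256 byte values with both charsets and reports the first one that differs. In a real comparison the second charset would come from the other provider (for example the ICU4J one on the class path); how that instance is obtained is left out here, so the example simply compares the JDK charset with itself as a placeholder.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetCompat {
    /**
     * Decodes every single-byte value with both charsets and returns the
     * first byte value (0..255) whose decoded text differs, or -1 if the
     * two charsets agree on all 256 values. Only meaningful for
     * single-byte charsets such as the EBCDIC code pages.
     */
    static int firstMismatch(Charset a, Charset b) {
        for (int i = 0; i < 256; i++) {
            byte[] one = { (byte) i };
            // new String(bytes, charset) substitutes U+FFFD for unmappable input
            if (!new String(one, a).equals(new String(one, b))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // "IBM037" is only a default; it may not be present in every runtime.
        String name = args.length > 0 ? args[0] : "IBM037";
        if (!Charset.isSupported(name)) {
            System.out.println(name + " is not available in this runtime");
            return;
        }
        Charset jdk = Charset.forName(name);
        // Placeholder (assumption): substitute the charset instance obtained
        // from the other provider here to get a meaningful comparison.
        Charset other = jdk;
        int diff = firstMismatch(jdk, other);
        System.out.println(diff < 0
                ? "no decoding differences found for " + name
                : "first differing byte: 0x" + Integer.toHexString(diff));

        // Sanity check with two charsets that are known to differ above 0x7F.
        System.out.println(firstMismatch(StandardCharsets.ISO_8859_1,
                                         StandardCharsets.US_ASCII));
    }
}
```

This only covers decoding of single bytes; encoding, multi-byte code pages and performance would each need their own checks.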