Looks good to me, although still hoping for more review from others. + if (bytes != 0) {
Style: use NULL. +static jstring newStringJava(JNIEnv *env, const char *str) { I expected two versions, one that took a length and one that did not, as you did with newString8859_1. --- TIL that jbyte is signed char, not plain char, and so is more consistent with java "byte", hence less error prone, at least in theory. --- Can we test different native environments by using -Dsun.jnu.encoding=... ? On Tue, Jun 13, 2017 at 7:03 AM, Claes Redestad <claes.redes...@oracle.com> wrote: > <adding build-dev; see http://mail.openjdk.java.net/p > ipermail/core-libs-dev/2017-June/048233.html for missing context> > > Updated webrevs: > http://cr.openjdk.java.net/~redestad/8181147/jdk.05/ > http://cr.openjdk.java.net/~redestad/8181147/top.04/ > > In this version there are various cleanups based on Martin's > comments, improved test coverage and some small changes to > ensure the updated test builds and runs on all our supported > platforms. > > If someone with access to any of the platforms not supported > by Oracle could take this for a spin and ensure linking the newly > added test against libjava works as expected (or suggest how > to make it work if it doesn't) then that'd be much appreciated. > > Thanks! > > /Claes > > On 06/13/2017 02:01 AM, Martin Buchholz wrote: > >> I'm hoping Xueming will review as well. >> + char asciiCheck; >> + for (asciiCheck = 0, p = str; *p != '\0'; p++) { >> + asciiCheck |= *p; >> + } >> + len = (p - str); >> Probably conversion from ptrdiff_t to int needs a cast. >> >> Someday we might need to worry about input string longer than 2^31, but >> that's a pre-existing problem. >> >> The signed-ness of char being implementation-defined makes me nervous. >> Maybe do all the bit operations using explicit unsigned char, then the >> final test becomes simply >> >> asciiCheck <= 0x80 >> >> >> On Mon, Jun 12, 2017 at 4:36 PM, Claes Redestad < >> claes.redes...@oracle.com <mailto:claes.redes...@oracle.com>> wrote: >> >> Hi Martin, >> >> thanks for reviewing! >> >> On 2017-06-12 22:14, Martin Buchholz wrote: >> >> +/* Optimized for char set UTF-8 */ >> >> "charset" is a (poor misnomer!) jargon term, not two words. >> >> >> I got that from the existing use of "char set" in this file, but >> will fix it in all places. >> >> >> --- >> >> + for (b = str[len]; b != '\0'; len++, b = str[len]) { >> + if (isAscii && b & 0x80) { >> + isAscii = JNI_FALSE; >> + } >> + } >> >> I would write this more like >> >> const signed char *p; >> int isAscii; >> >> for (isAscii = 0, p = (const signed char *) str; *p != '\0'; >> p++) isAscii &= (*p >= 0); >> >> >> Did you mean for isAscii to be initialized to 1 (true) and then be >> cleared to 0 (false) when *p >= 0 is false? >> >> >> Then length is (p - str) >> >> >> How about something like this to hoist the second comparison from >> the loop: >> >> int len; >> char asciiCheck; >> for (asciiCheck = 0, p = str; *p != '\0'; p++) { >> asciiCheck |= *p; >> } >> len = (p - str); >> >> if (asciiCheck & 0x80) { >> // ascii fast-path >> return newSizedString8859_1(env, str, len); >> } >> >> ... >> >> --- >> >> + jbyteArray hab = NULL; >> >> I'm having a hard time decoding the name "hab" >> >> >> Not sure, but my guess is "heap allocated byteArray". >> >> >> --- >> >> The code below is not UTF-8 specific. Can it be refactored? >> >> + hab = (*env)->NewByteArray(env, len); >> + if (hab != 0) { >> + jclass strClazz = JNU_ClassString(env); >> + CHECK_NULL_RETURN(strClazz, 0); >> + (*env)->SetByteArrayRegion(env, hab, 0, len, (jbyte >> *)str); >> + result = (*env)->NewObject(env, strClazz, >> + String_init_ID, hab, >> jnuEncoding); >> + (*env)->DeleteLocalRef(env, hab); >> + return result; >> + } >> >> >> Yes, probably. The excessive copy-paste here is due to a crash >> issue I ran into when >> Charset.isSupported was called without proper initialization. It >> has since been >> resolved, but seems I forgot to revisit this part. >> >> I also realized I haven't added a test for this method. I'll look >> into doing the >> refactoring and adding some sanity testing tomorrow. >> >> >> --- >> >> We probably want to use unicode escapes in out java sources to >> keep all source files strictly ASCII. >> >> >> You mean in the test? Sure. >> >> Refreshed the jdk webrev: >> http://cr.openjdk.java.net/~redestad/8181147/jdk.05/ >> <http://cr.openjdk.java.net/%7Eredestad/8181147/jdk.05/> - more to >> come. >> >> Thanks! >> >> /Claes >> >> >> >