On 30.03.2017 20:24, dean.l...@oracle.com wrote: > I would like to go with the webrev.2 version, with asserts removed. All the > reviewers have said they are OK with that version, and it allows more > complete testing than the minimal version.
Okay, I'm fine with that! Best regards, Tobias > On 3/23/17 12:03 PM, dean.l...@oracle.com wrote: >> >> On 3/23/17 11:25 AM, dean.l...@oracle.com wrote: >> >>> On 3/22/17 1:49 PM, Vladimir Ivanov wrote: >>> >>>>> Also, it looks like the changes I made to ASB.appendChars(char[] s, int >>>>> off, int end) are not needed. >>>> >>>> Agree. >>>> >>>>>> Vladimir, don't you need to replace checkIndex with checkOffset in >>>>>> indexOf and lastIndexOf, so that we allow count == length? >>>> >>>> Yes, my bad. Good catch. Updated webrev in place. >>>> >>>> FTR I haven't done any extensive testing of the minimized fix. >>>> >>>> If we agree to proceed with it, the regression test should be updated as >>>> well. I think the viable solution would be to construct broken SBs (using >>>> reflection) and invoke affected methods on them. >>>> >>> >>> We can construct broken SBs using the Helper class that gets patched into >>> java.lang. I'll work on that. >>> >> >> Nevermind. I forgot that some problems can only happen when the SB is >> changed half-way through the method. For example, >> in append(), we can't force an overflow unless we change the SB after >> ensureCapacityInternal() is called. I could do something like: >> >> 760 public AbstractStringBuilder append(int i) { >> 761 int count = this.count; >> 762 int spaceNeeded = count + Integer.stringSize(i); >> 763 ensureCapacityInternal(spaceNeeded); >> 764 if (isLatin1()) { >> >>>>>> Helper.fuzzValue(this); >> 765 Integer.getChars(i, spaceNeeded, value); >> 766 } else { >> 767 byte[] val = this.value; >> >>>>>> Helper.fuzzValue(this); >> 768 checkBoundsBeginEnd(count, spaceNeeded, val.length >> 1); >> 769 Integer.getCharsUTF16(i, spaceNeeded, val); >> 770 } >> 771 this.count = spaceNeeded; >> 772 return this; >> 773 } >> >> where the default Helper.fuzzValue() is an empty method, but the test would >> patch in its own version of Helper that changes the ASB field values. I >> like this less than refactoring the checks to StringUTF16. >> >> dl >> >>> dl >>> >>>> Best regards, >>>> Vladimir Ivanov >>>> >>>>>> On 3/22/17 8:35 AM, Vladimir Ivanov wrote: >>>>>>>>>> So are we convinced that the proposed changes will never lead to a >>>>>>>>>> crash due to a missing or incorrect bounds check, due to a racy >>>>>>>>>> use of >>>>>>>>>> an unsynchronized ASB instance e.g. StringBuilder? >>>>>>>>> >>>>>>>>> If only we had a static analysis tool that could tell us if the >>>>>>>>> code is >>>>>>>>> safe. Because we don't, in my initial changeset, we always take a >>>>>>>>> snapshot of the ASB fields by passing those field values to >>>>>>>>> StringUTF16 >>>>>>>>> before doing checks on them. And I wrote a test to make sure that >>>>>>>>> those >>>>>>>>> StringUTF16 interfaces are catching all the underflows and overflows I >>>>>>>>> could imagine, and I added verification code to detect when a check >>>>>>>>> was >>>>>>>>> missed. >>>>>>>>> >>>>>>>>> However, all the reviewers have requested to minimize the amount of >>>>>>>>> changes. In Vladimir's version, if there is a missing check >>>>>>>>> somewhere, >>>>>>>>> then yes it could lead to a crash. >>>>>>> >>>>>>> I'd like to point out that asserts and verification code are disabled >>>>>>> by default. They are invaluable during problem diagnosis, but don't >>>>>>> help at all from defence-in-depth perspective. >>>>>>> >>>>>>> But I agree that it's easier to reason about and test the initial >>>>>>> version of the fix. >>>>>>> >>>>>>>> I wonder if the reviewers have fully realized the potential impact >>>>>>>> here? >>>>>>>> This has exposed a flaw in the way intrinsics are used from core >>>>>>>> classes. >>>>>>> >>>>>>> FTR here are the checks I omitted in the minimized version (modulo >>>>>>> separation of indexOf/lastIndexOf for trusted/non-trusted callers): >>>>>>> >>>>>>> http://cr.openjdk.java.net/~vlivanov/dlong/8158168/redundant_checks/ >>>>>>> >>>>>>> Other than that, the difference is mainly about undoing refactorings >>>>>>> and removing verification logic (asserts + checks in the JVM). >>>>>>> >>>>>>> There are still unsafe accesses which are considered safe in both >>>>>>> versions (see StringUTF16.Trusted usages in the initial version [1]). >>>>>>> >>>>>>> We used to provide safe wrappers for unsafe intrinsics which makes it >>>>>>> much easier to reason about code correctness. I'd like to see compact >>>>>>> string code refactored that way and IMO the initial version by Dean >>>>>>> is a big step in the right direction. >>>>>>> >>>>>>> I still prefer to see a point fix in 9 and major refactoring >>>>>>> happening in 10, but I'll leave the decision on how to proceed with >>>>>>> the fix to core-libs folks. After finishing the exercise minimizing >>>>>>> the fix, I'm much more comfortable with the initial fix [1] (though >>>>>>> there are changes I consider excessive). >>>>>>> >>>>>>> Best regards, >>>>>>> Vladimir Ivanov >>>>>>> >>>>>>> [1] http://cr.openjdk.java.net/~dlong/8158168/webrev.0 >>>>>>> >>>>>>>>>>>> Some clarifications: >>>>>>>>>>>> >>>>>>>>>>>> ============ >>>>>>>>>>>> src/java.base/share/classes/java/lang/String.java: >>>>>>>>>>>> >>>>>>>>>>>> The bounds check is needed only in String.nonSyncContentEquals >>>>>>>>>>>> when it >>>>>>>>>>>> extracts info from AbstractStringBuilder. I don't see how out of >>>>>>>>>>>> bounds access can happen in String.contentEquals: >>>>>>>>>>>> if (n != length()) { >>>>>>>>>>>> return false; >>>>>>>>>>>> } >>>>>>>>>>>> ... >>>>>>>>>>>> for (int i = 0; i < n; i++) { >>>>>>>>>>>> if (StringUTF16.getChar(val, i) != cs.charAt(i)) { >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> OK. >>>>>>>>>>> >>>>>>>>>>>> ============ >>>>>>>>>>>> src/java.base/share/classes/java/lang/StringConcatHelper.java: >>>>>>>>>>>> >>>>>>>>>>>> I think bounds checks in StringConcatHelper.prepend() are skipped >>>>>>>>>>>> intentionally, since java.lang.invoke.StringConcatFactory >>>>>>>>>>>> constructs >>>>>>>>>>>> method handle chains which already contain bounds checks: array >>>>>>>>>>>> length >>>>>>>>>>>> is precomputed based on argument values and all accesses are >>>>>>>>>>>> guaranteed to be in bounds. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> This is calling the trusted version of getChars() with no bounds >>>>>>>>>>> checks. It was a little more obvious when I had the Trusted inner >>>>>>>>>>> class. >>>>>>>>>>> >>>>>>>>>>>> ============ >>>>>>>>>>>> src/java.base/share/classes/java/lang/StringUTF16.java: >>>>>>>>>>>> >>>>>>>>>>>> + static void putChar(byte[] val, int index, int c) { >>>>>>>>>>>> + assert index >= 0 && index < length(val) : "Trusted caller >>>>>>>>>>>> missed bounds check"; >>>>>>>>>>>> >>>>>>>>>>>> Unfortunately, asserts can affect inlining decisions (since they >>>>>>>>>>>> increase bytecode size). In order to minimize possible performance >>>>>>>>>>>> impact, I suggest to remove them from the fix targeting 9. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Sure. >>>>>>>>>>> >>>>>>>>>>>> ============ >>>>>>>>>>>> private static int indexOfSupplementary(byte[] value, int >>>>>>>>>>>> ch, int >>>>>>>>>>>> fromIndex, int max) { >>>>>>>>>>>> if (Character.isValidCodePoint(ch)) { >>>>>>>>>>>> final char hi = Character.highSurrogate(ch); >>>>>>>>>>>> final char lo = Character.lowSurrogate(ch); >>>>>>>>>>>> + checkBoundsBeginEnd(fromIndex, max, value); >>>>>>>>>>>> >>>>>>>>>>>> The check is redundant here. fromIndex & max are always inbounds by >>>>>>>>>>>> construction: >>>>>>>>>>>> >>>>>>>>>>>> public static int indexOf(byte[] value, int ch, int >>>>>>>>>>>> fromIndex) { >>>>>>>>>>>> int max = value.length >> 1; >>>>>>>>>>>> if (fromIndex < 0) { >>>>>>>>>>>> fromIndex = 0; >>>>>>>>>>>> } else if (fromIndex >= max) { >>>>>>>>>>>> // Note: fromIndex might be near -1>>>1. >>>>>>>>>>>> return -1; >>>>>>>>>>>> } >>>>>>>>>>>> ... >>>>>>>>>>>> return indexOfSupplementary(value, ch, fromIndex, max); >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> OK. >>>>>>>>>>> >>>>>>>>>>>> ============ >>>>>>>>>>>> I moved bounds checks from StringUTF16.lastIndexOf/indexOf to >>>>>>>>>>>> ABS.indexOf/lastIndexOf. I think it's enough to do range check on >>>>>>>>>>>> ABS.value & ABS.count. After that, all accesses should be >>>>>>>>>>>> inbounds by >>>>>>>>>>>> construction (in String.indexOf/lastIndexOf): >>>>>>>>>>>> >>>>>>>>>>>> jdk/src/java.base/share/classes/java/lang/StringUTF16.java: >>>>>>>>>>>> static int lastIndexOf(byte[] src, byte srcCoder, int srcCount, >>>>>>>>>>>> String tgtStr, int fromIndex) { >>>>>>>>>>>> >>>>>>>>>>>> int rightIndex = srcCount - tgtCount; >>>>>>>>>>>> if (fromIndex > rightIndex) { >>>>>>>>>>>> fromIndex = rightIndex; >>>>>>>>>>>> } >>>>>>>>>>>> if (fromIndex < 0) { >>>>>>>>>>>> return -1; >>>>>>>>>>>> } >>>>>>>>>>>> >>>>>>>>>>>> jdk/src/java.base/share/classes/java/lang/StringUTF16.java: >>>>>>>>>>>> public static int lastIndexOf(byte[] src, int srcCount, >>>>>>>>>>>> byte[] tgt, int tgtCount, int >>>>>>>>>>>> fromIndex) { >>>>>>>>>>>> int min = tgtCount - 1; >>>>>>>>>>>> int i = min + fromIndex; >>>>>>>>>>>> int strLastIndex = tgtCount - 1; >>>>>>>>>>>> char strLastChar = getChar(tgt, strLastIndex); >>>>>>>>>>>> >>>>>>>>>>>> startSearchForLastChar: >>>>>>>>>>>> while (true) { >>>>>>>>>>>> while (i >= min && getChar(src, i) != strLastChar) { >>>>>>>>>>>> >>>>>>>>>>>> There are 2 places: >>>>>>>>>>>> * getChar(tgt, strLastIndex) => getChar(tgt, tgtCount-1) - >>>>>>>>>>>> inbound >>>>>>>>>>>> >>>>>>>>>>>> * getChar(src, i); i in [ min; min+fromIndex ] >>>>>>>>>>>> min = tgtCount - 1 >>>>>>>>>>>> rightIndex = srcCount - tgtCount >>>>>>>>>>>> fromIndex <= rightIndex >>>>>>>>>>>> >>>>>>>>>>>> 0 <= min + fromIndex <= min + rightIndex == (tgtCount >>>>>>>>>>>> - 1) >>>>>>>>>>>> + (srcCount - tgtCount) == srcCount - 1 >>>>>>>>>>>> >>>>>>>>>>>> Hence, should be covered by the check on count & value: >>>>>>>>>>>> public int lastIndexOf(String str, int fromIndex) { >>>>>>>>>>>> + byte[] value = this.value; >>>>>>>>>>>> + int count = this.count; >>>>>>>>>>>> + byte coder = this.coder; >>>>>>>>>>>> + checkIndex(count, value.length >> coder); >>>>>>>>>>>> return String.lastIndexOf(value, coder, count, str, >>>>>>>>>>>> fromIndex); >>>>>>>>>>>> } >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> OK, I will go with your version if it's OK with Sherman. >>>>>>>>>>> >>>>>>>>>>> dl >>>>>>>>>>> >>>>>>>>>>>> Best regards, >>>>>>>>>>>> Vladimir Ivanov >>>>>>>>>>>> >>>>>>>>>>>>> On 3/17/17 5:58 AM, Vladimir Ivanov wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I have the same concern. Can we fix the immediate problem in >>>>>>>>>>>>>>>> 9 and >>>>>>>>>>>>>>>> integrate verification logic in 10? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> OK, Tobias is suggesting having verification logic only >>>>>>>>>>>>>>> inside the >>>>>>>>>>>>>>> intrinsics. Are you suggesting removing that as well? >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yes and put them back in 10. >>>>>>>>>>>>>> >>>>>>>>>>>>>>> I'm OK with removing all the verification, but that won't reduce >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>> library changes much. I could undo the renaming to >>>>>>>>>>>>>>> Trusted.getChar, but >>>>>>>>>>>>>>> we would still have the bounds checks moved into StringUTF16. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I suggest to go with a point fix for 9: just add missing range >>>>>>>>>>>>>> checks. >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>> >>>>> >>> >> >