Re: RFR 8224789 : Parsing repetition count in regex does not detect numeric overflow
As an academic note, note that non-overflowing strings can be recognized with a regex :-) For signed numbers in base 10: # MAX_VALUE in base 10: 2147483647 ((-)?0*((\p{Digit}){1,9}) |([0-1]\p{Digit}{9}) |(20\p{Digit}{8}) |(21[0-3]\p{Digit}{7}) |(214[0-6]\p{Digit}{6}) |(2147[0-3]\p{Digit}{5}) |(21474[0-7]\p{Digit}{4}) |(214748[0-2]\p{Digit}{3}) |(2147483[0-5]\p{Digit}{2}) |(21474836[0-3]\p{Digit}{1}) |(214748364[0-7]) )| (-0*2147483648) -Joe On 5/29/2019 1:14 PM, Brent Christian wrote: Looks fine. Thanks, Ivan. -Brent On 5/29/19 7:27 AM, Roger Riggs wrote: Thanks Ivan, Looks good. On 05/28/2019 10:08 PM, Ivan Gerasimov wrote: Hi Brent! On 5/28/19 4:06 PM, Brent Christian wrote: Hi, Ivan I agree with Roger that there are more test cases than necessary. Otherwise I think it looks pretty good. Okay. Then let's make the list of invalid ranges shorter, but add a randomly generated value! Please find the updated webrev here: http://cr.openjdk.java.net/~igerasim/8224789/01/webrev/ With kind regards, Ivan I find the addExact/multiplyExact code less readable. I'm not sure what could be done about it - maybe some different indentation: cmin = Math.addExact( Math.multiplyExact(cmin, 10), ch - '0'); though that makes for some long lines. Just a thought. Thanks, -Brent On 5/24/19 11:28 PM, Ivan Gerasimov wrote: Hello! When Pattern.compile() parses the repetition count in the expressions like '.{100}', '.{1,2}' or '.{3,}' it fails to detect numeric overflow if the result is still non-negative. Could you please help review the patch? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8224789 WEBREV: http://cr.openjdk.java.net/~igerasim/8224789/00/webrev/ Also, reading a char at line 3274 is done with skip(), so the exception thrown at 3315 displays the position of the error more accurately.
Re: RFR 8224789 : Parsing repetition count in regex does not detect numeric overflow
Looks fine. Thanks, Ivan. -Brent On 5/29/19 7:27 AM, Roger Riggs wrote: Thanks Ivan, Looks good. On 05/28/2019 10:08 PM, Ivan Gerasimov wrote: Hi Brent! On 5/28/19 4:06 PM, Brent Christian wrote: Hi, Ivan I agree with Roger that there are more test cases than necessary. Otherwise I think it looks pretty good. Okay. Then let's make the list of invalid ranges shorter, but add a randomly generated value! Please find the updated webrev here: http://cr.openjdk.java.net/~igerasim/8224789/01/webrev/ With kind regards, Ivan I find the addExact/multiplyExact code less readable. I'm not sure what could be done about it - maybe some different indentation: cmin = Math.addExact( Math.multiplyExact(cmin, 10), ch - '0'); though that makes for some long lines. Just a thought. Thanks, -Brent On 5/24/19 11:28 PM, Ivan Gerasimov wrote: Hello! When Pattern.compile() parses the repetition count in the expressions like '.{100}', '.{1,2}' or '.{3,}' it fails to detect numeric overflow if the result is still non-negative. Could you please help review the patch? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8224789 WEBREV: http://cr.openjdk.java.net/~igerasim/8224789/00/webrev/ Also, reading a char at line 3274 is done with skip(), so the exception thrown at 3315 displays the position of the error more accurately.
Re: RFR 8224789 : Parsing repetition count in regex does not detect numeric overflow
Thanks Ivan, Looks good. On 05/28/2019 10:08 PM, Ivan Gerasimov wrote: Hi Brent! On 5/28/19 4:06 PM, Brent Christian wrote: Hi, Ivan I agree with Roger that there are more test cases than necessary. Otherwise I think it looks pretty good. Okay. Then let's make the list of invalid ranges shorter, but add a randomly generated value! Please find the updated webrev here: http://cr.openjdk.java.net/~igerasim/8224789/01/webrev/ With kind regards, Ivan I find the addExact/multiplyExact code less readable. I'm not sure what could be done about it - maybe some different indentation: cmin = Math.addExact( Math.multiplyExact(cmin, 10), ch - '0'); though that makes for some long lines. Just a thought. Thanks, -Brent On 5/24/19 11:28 PM, Ivan Gerasimov wrote: Hello! When Pattern.compile() parses the repetition count in the expressions like '.{100}', '.{1,2}' or '.{3,}' it fails to detect numeric overflow if the result is still non-negative. Could you please help review the patch? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8224789 WEBREV: http://cr.openjdk.java.net/~igerasim/8224789/00/webrev/ Also, reading a char at line 3274 is done with skip(), so the exception thrown at 3315 displays the position of the error more accurately.
Re: RFR 8224789 : Parsing repetition count in regex does not detect numeric overflow
Hi Brent! On 5/28/19 4:06 PM, Brent Christian wrote: Hi, Ivan I agree with Roger that there are more test cases than necessary. Otherwise I think it looks pretty good. Okay. Then let's make the list of invalid ranges shorter, but add a randomly generated value! Please find the updated webrev here: http://cr.openjdk.java.net/~igerasim/8224789/01/webrev/ With kind regards, Ivan I find the addExact/multiplyExact code less readable. I'm not sure what could be done about it - maybe some different indentation: cmin = Math.addExact( Math.multiplyExact(cmin, 10), ch - '0'); though that makes for some long lines. Just a thought. Thanks, -Brent On 5/24/19 11:28 PM, Ivan Gerasimov wrote: Hello! When Pattern.compile() parses the repetition count in the expressions like '.{100}', '.{1,2}' or '.{3,}' it fails to detect numeric overflow if the result is still non-negative. Could you please help review the patch? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8224789 WEBREV: http://cr.openjdk.java.net/~igerasim/8224789/00/webrev/ Also, reading a char at line 3274 is done with skip(), so the exception thrown at 3315 displays the position of the error more accurately. -- With kind regards, Ivan Gerasimov
Re: RFR 8224789 : Parsing repetition count in regex does not detect numeric overflow
Hi, Ivan I agree with Roger that there are more test cases than necessary. Otherwise I think it looks pretty good. I find the addExact/multiplyExact code less readable. I'm not sure what could be done about it - maybe some different indentation: cmin = Math.addExact( Math.multiplyExact(cmin, 10), ch - '0'); though that makes for some long lines. Just a thought. Thanks, -Brent On 5/24/19 11:28 PM, Ivan Gerasimov wrote: Hello! When Pattern.compile() parses the repetition count in the expressions like '.{100}', '.{1,2}' or '.{3,}' it fails to detect numeric overflow if the result is still non-negative. Could you please help review the patch? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8224789 WEBREV: http://cr.openjdk.java.net/~igerasim/8224789/00/webrev/ Also, reading a char at line 3274 is done with skip(), so the exception thrown at 3315 displays the position of the error more accurately.
Re: RFR 8224789 : Parsing repetition count in regex does not detect numeric overflow
Thank you Roger for reviewing! On 5/28/19 9:33 AM, Roger Riggs wrote: Hi Ivan, test/jdk/java/util/regex/RegExTest.java: 4954: The test should print the failing exception information and 4951: a message if the Pattern.compile does not fail to distinguish that failure from any others. Yes, it will be useful, if the test fails. So, I added some debug output. The webrev was updated in place: http://cr.openjdk.java.net/~igerasim/8224789/00/webrev/ I don't think there is a need for so many cases of values > 2^31; there is no different code path for all those values. They are fairly cheap, but redundant. I wanted to cover cases when the number would overflow 64-bit integer, so that the test would fail if the count were implemented as a long. Also, I tried to cover different configurations of the quantifier {N}, {N,}, {N,M} with both N and M overflowing int, unsigned int, long, unsigned long. With kind regards, Ivan Otherwise, looks fine. Thanks, Roger On 05/25/2019 02:28 AM, Ivan Gerasimov wrote: Hello! When Pattern.compile() parses the repetition count in the expressions like '.{100}', '.{1,2}' or '.{3,}' it fails to detect numeric overflow if the result is still non-negative. Could you please help review the patch? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8224789 WEBREV: http://cr.openjdk.java.net/~igerasim/8224789/00/webrev/ Also, reading a char at line 3274 is done with skip(), so the exception thrown at 3315 displays the position of the error more accurately. -- With kind regards, Ivan Gerasimov
Re: RFR 8224789 : Parsing repetition count in regex does not detect numeric overflow
Hi Ivan, test/jdk/java/util/regex/RegExTest.java: 4954: The test should print the failing exception information and 4951: a message if the Pattern.compile does not fail to distinguish that failure from any others. I don't think there is a need for so many cases of values > 2^31; there is no different code path for all those values. They are fairly cheap, but redundant. Otherwise, looks fine. Thanks, Roger On 05/25/2019 02:28 AM, Ivan Gerasimov wrote: Hello! When Pattern.compile() parses the repetition count in the expressions like '.{100}', '.{1,2}' or '.{3,}' it fails to detect numeric overflow if the result is still non-negative. Could you please help review the patch? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8224789 WEBREV: http://cr.openjdk.java.net/~igerasim/8224789/00/webrev/ Also, reading a char at line 3274 is done with skip(), so the exception thrown at 3315 displays the position of the error more accurately.
RFR 8224789 : Parsing repetition count in regex does not detect numeric overflow
Hello! When Pattern.compile() parses the repetition count in the expressions like '.{100}', '.{1,2}' or '.{3,}' it fails to detect numeric overflow if the result is still non-negative. Could you please help review the patch? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8224789 WEBREV: http://cr.openjdk.java.net/~igerasim/8224789/00/webrev/ Also, reading a char at line 3274 is done with skip(), so the exception thrown at 3315 displays the position of the error more accurately. -- With kind regards, Ivan Gerasimov