Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-09 Thread 'Thomas Bushnell, BSG' via golang-nuts
On Tue, Jun 9, 2020 at 11:23 AM Axel Wagner 
wrote:

> If you actually read OPs second E-Mail, they did and they didn't find it
> very clear. With that in mind, your message isn't very nice.
> (Also, not to be repetitive or anything: ~80% of the messages in this
> thread aren't actually concerned with what the complexity class *is*, but
> whether it matters)
>

The OP stopped participating in this exciting thread a long time ago. I
went and read through the code, and it seems pretty clear to me that it's
linear.

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/CA%2BYjuxvQgnTKMM4FF4%3D4pspM-2nBJScNfCNinDv-2cNA09N%3DaQ%40mail.gmail.com.


Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-09 Thread 'Axel Wagner' via golang-nuts
If you actually read OPs second E-Mail, they did and they didn't find it
very clear. With that in mind, your message isn't very nice.
(Also, not to be repetitive or anything: ~80% of the messages in this
thread aren't actually concerned with what the complexity class *is*, but
whether it matters)

On Tue, Jun 9, 2020 at 5:12 PM 'Thomas Bushnell, BSG' via golang-nuts <
golang-nuts@googlegroups.com> wrote:

> I'm surprised none of you have taken all this time to just go read the
> code, where it is clearly linear.
>
> On Mon, Jun 8, 2020 at 9:12 PM Robert Engels 
> wrote:
>
>> The OP specifically asked about compilation - in fact it’s in the title.
>> You stated the compilation complexity is a DoS attack vector. I argued that
>> it is not. That is all.
>>
>> On Jun 8, 2020, at 6:39 PM, Michael Jones 
>> wrote:
>>
>> 
>> Ray, only the discussion is exponential.
>>
>> On Mon, Jun 8, 2020 at 4:23 PM 'Axel Wagner' via golang-nuts <
>> golang-nuts@googlegroups.com> wrote:
>>
>>> On Tue, Jun 9, 2020 at 12:48 AM Kurtis Rader 
>>> wrote:
>>>
 You're talking past each other. Robert is talking about limiting the
 length of the regex, not the string(s) evaluated by the regex.

>>>
>>> So am I. Note that e.g. a Code Search based on PCRE would break, even if
>>> you limit *both* (or rather, any limit causing it to not break would result
>>> in a completely useless piece of software).
>>>
>>> It should be possible to compile any regex of a reasonable length in a
 matter of microseconds. Regardless of whether the application of the regex
 to a given input is near linear (as in the case of the Go RE
 implementation) or exponential (as in the case of PCRE).

>>>
>>> This might be, where we talk past each other. I am using application as
>>> an example for concrete numbers on how quickly exponential growth can
>>> devolve. Of course, compilation of a regexp is fast - I am aware of that.
>>> As it turns out, so is application of a regexp, if you use RE2 (or Go's
>>> regexp package).
>>>
>>> What I take issue with are the statements that a) the question about the
>>> complexity of compiling a regexp is irrelevant, because b) limiting the
>>> algorithmic complexity of a function to counteract resource exhaustion
>>> attacks "never works". RE2 is an excellent example to show that it does
>>> work; it is carefully designed for linear complexity of the combined
>>> compilation+matching of a regular expression.
>>>
>>> I'm not saying compiling a regular expression is an
>>> exponential operation. I'm saying *if it was*, you could never reasonably
>>> build something like Code Search. And we can use the fact that
>>> *application* indeed is (in most engines) exponential in the combined input
>>> length of expression+search text as a good basis to make that case.
>>>
>>> Of course we don't even need this fantasy world to make the case - after
>>> all, saying "you can't build Code Search on PCRE, but you can on RE2" is
>>> just as effective to prove the effectiveness of lowering algorithmic
>>> complexity. It's just that OP's question was about compilation, so to talk
>>> about why OP's question *is* relevant, we need to talk about what would
>>> happen if the answer *wasn't* "it's linear".
>>>
>>> (I also genuinely don't understand the instinct to tell someone "why
>>> would you care?" instead of telling them the answer, FWIW. Especially in a
>>> case like this, where the answer is pretty simple)
>>>
>>> I'm pretty sure Robert is not arguing that the scaling problems of the
 regex implementation used by Perl, and too many others, can be mitigated
 simply by limiting the size of the string to be matched by the regex. If
 compiling a regex of reasonable length takes a non-negligible amount of
 time something is wrong.

 On Mon, Jun 8, 2020 at 3:22 PM Wojciech S. Czarnecki 
 wrote:

> Dnia 2020-06-08, o godz. 16:22:24
> Robert Engels  napisał(a):
>
> > it is trivial to limit the input size to something a user could
> input.
>
> With exponential complexity simple regex /(x+x+)+y/ blows up at input
> of 20 to 30 x-es.
> See: https://www.regular-expressions.info/catastrophic.html
>
> [Cut long explanations... Axel just posted most of what I was writing
> regarding trade-offs).
>
> Hope this helps,
>
> --
> Wojciech S. Czarnecki
>  << ^oo^ >> OHIR-RIPE
>
> --
> You received this message because you are subscribed to the Google
> Groups "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to golang-nuts+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/golang-nuts/20200609002207.0a161adf%40xmint
> .
>


 --
 Kurtis Rader
 Caretaker of the exceptional canines Junior and Hank

 --
 You received this message because you are subscribed 

Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-09 Thread 'Thomas Bushnell, BSG' via golang-nuts
I'm surprised none of you have taken all this time to just go read the
code, where it is clearly linear.

On Mon, Jun 8, 2020 at 9:12 PM Robert Engels  wrote:

> The OP specifically asked about compilation - in fact it’s in the title.
> You stated the compilation complexity is a DoS attack vector. I argued that
> it is not. That is all.
>
> On Jun 8, 2020, at 6:39 PM, Michael Jones  wrote:
>
> 
> Ray, only the discussion is exponential.
>
> On Mon, Jun 8, 2020 at 4:23 PM 'Axel Wagner' via golang-nuts <
> golang-nuts@googlegroups.com> wrote:
>
>> On Tue, Jun 9, 2020 at 12:48 AM Kurtis Rader 
>> wrote:
>>
>>> You're talking past each other. Robert is talking about limiting the
>>> length of the regex, not the string(s) evaluated by the regex.
>>>
>>
>> So am I. Note that e.g. a Code Search based on PCRE would break, even if
>> you limit *both* (or rather, any limit causing it to not break would result
>> in a completely useless piece of software).
>>
>> It should be possible to compile any regex of a reasonable length in a
>>> matter of microseconds. Regardless of whether the application of the regex
>>> to a given input is near linear (as in the case of the Go RE
>>> implementation) or exponential (as in the case of PCRE).
>>>
>>
>> This might be, where we talk past each other. I am using application as
>> an example for concrete numbers on how quickly exponential growth can
>> devolve. Of course, compilation of a regexp is fast - I am aware of that.
>> As it turns out, so is application of a regexp, if you use RE2 (or Go's
>> regexp package).
>>
>> What I take issue with are the statements that a) the question about the
>> complexity of compiling a regexp is irrelevant, because b) limiting the
>> algorithmic complexity of a function to counteract resource exhaustion
>> attacks "never works". RE2 is an excellent example to show that it does
>> work; it is carefully designed for linear complexity of the combined
>> compilation+matching of a regular expression.
>>
>> I'm not saying compiling a regular expression is an
>> exponential operation. I'm saying *if it was*, you could never reasonably
>> build something like Code Search. And we can use the fact that
>> *application* indeed is (in most engines) exponential in the combined input
>> length of expression+search text as a good basis to make that case.
>>
>> Of course we don't even need this fantasy world to make the case - after
>> all, saying "you can't build Code Search on PCRE, but you can on RE2" is
>> just as effective to prove the effectiveness of lowering algorithmic
>> complexity. It's just that OP's question was about compilation, so to talk
>> about why OP's question *is* relevant, we need to talk about what would
>> happen if the answer *wasn't* "it's linear".
>>
>> (I also genuinely don't understand the instinct to tell someone "why
>> would you care?" instead of telling them the answer, FWIW. Especially in a
>> case like this, where the answer is pretty simple)
>>
>> I'm pretty sure Robert is not arguing that the scaling problems of the
>>> regex implementation used by Perl, and too many others, can be mitigated
>>> simply by limiting the size of the string to be matched by the regex. If
>>> compiling a regex of reasonable length takes a non-negligible amount of
>>> time something is wrong.
>>>
>>> On Mon, Jun 8, 2020 at 3:22 PM Wojciech S. Czarnecki 
>>> wrote:
>>>
 Dnia 2020-06-08, o godz. 16:22:24
 Robert Engels  napisał(a):

 > it is trivial to limit the input size to something a user could input.

 With exponential complexity simple regex /(x+x+)+y/ blows up at input
 of 20 to 30 x-es.
 See: https://www.regular-expressions.info/catastrophic.html

 [Cut long explanations... Axel just posted most of what I was writing
 regarding trade-offs).

 Hope this helps,

 --
 Wojciech S. Czarnecki
  << ^oo^ >> OHIR-RIPE

 --
 You received this message because you are subscribed to the Google
 Groups "golang-nuts" group.
 To unsubscribe from this group and stop receiving emails from it, send
 an email to golang-nuts+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/golang-nuts/20200609002207.0a161adf%40xmint
 .

>>>
>>>
>>> --
>>> Kurtis Rader
>>> Caretaker of the exceptional canines Junior and Hank
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "golang-nuts" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to golang-nuts+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/golang-nuts/CABx2%3DD9sKeHnqUAqB61y5Ts-U_f%2BctVAuS4BC0ae8phhhcp1iw%40mail.gmail.com
>>> 
>>> .
>>>
>> --
>> You r

Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-08 Thread Robert Engels
The OP specifically asked about compilation - in fact it’s in the title. You 
stated the compilation complexity is a DoS attack vector. I argued that it is 
not. That is all. 

> On Jun 8, 2020, at 6:39 PM, Michael Jones  wrote:
> 
> 
> Ray, only the discussion is exponential.
> 
>> On Mon, Jun 8, 2020 at 4:23 PM 'Axel Wagner' via golang-nuts 
>>  wrote:
>>> On Tue, Jun 9, 2020 at 12:48 AM Kurtis Rader  wrote:
>> 
>>> You're talking past each other. Robert is talking about limiting the length 
>>> of the regex, not the string(s) evaluated by the regex.
>> 
>> So am I. Note that e.g. a Code Search based on PCRE would break, even if you 
>> limit *both* (or rather, any limit causing it to not break would result in a 
>> completely useless piece of software).
>> 
>>> It should be possible to compile any regex of a reasonable length in a 
>>> matter of microseconds. Regardless of whether the application of the regex 
>>> to a given input is near linear (as in the case of the Go RE 
>>> implementation) or exponential (as in the case of PCRE).
>> 
>> This might be, where we talk past each other. I am using application as an 
>> example for concrete numbers on how quickly exponential growth can devolve. 
>> Of course, compilation of a regexp is fast - I am aware of that. As it turns 
>> out, so is application of a regexp, if you use RE2 (or Go's regexp package).
>> 
>> What I take issue with are the statements that a) the question about the 
>> complexity of compiling a regexp is irrelevant, because b) limiting the 
>> algorithmic complexity of a function to counteract resource exhaustion 
>> attacks "never works". RE2 is an excellent example to show that it does 
>> work; it is carefully designed for linear complexity of the combined 
>> compilation+matching of a regular expression.
>> 
>> I'm not saying compiling a regular expression is an exponential operation. 
>> I'm saying *if it was*, you could never reasonably build something like Code 
>> Search. And we can use the fact that *application* indeed is (in most 
>> engines) exponential in the combined input length of expression+search text 
>> as a good basis to make that case.
>> 
>> Of course we don't even need this fantasy world to make the case - after 
>> all, saying "you can't build Code Search on PCRE, but you can on RE2" is 
>> just as effective to prove the effectiveness of lowering algorithmic 
>> complexity. It's just that OP's question was about compilation, so to talk 
>> about why OP's question *is* relevant, we need to talk about what would 
>> happen if the answer *wasn't* "it's linear".
>> 
>> (I also genuinely don't understand the instinct to tell someone "why would 
>> you care?" instead of telling them the answer, FWIW. Especially in a case 
>> like this, where the answer is pretty simple)
>> 
>>> I'm pretty sure Robert is not arguing that the scaling problems of the 
>>> regex implementation used by Perl, and too many others, can be mitigated 
>>> simply by limiting the size of the string to be matched by the regex. If 
>>> compiling a regex of reasonable length takes a non-negligible amount of 
>>> time something is wrong.
>>> 
 On Mon, Jun 8, 2020 at 3:22 PM Wojciech S. Czarnecki  
 wrote:
 Dnia 2020-06-08, o godz. 16:22:24
 Robert Engels  napisał(a):
 
 > it is trivial to limit the input size to something a user could input.
 
 With exponential complexity simple regex /(x+x+)+y/ blows up at input of 
 20 to 30 x-es.
 See: https://www.regular-expressions.info/catastrophic.html
 
 [Cut long explanations... Axel just posted most of what I was writing 
 regarding trade-offs).
 
 Hope this helps,
 
 -- 
 Wojciech S. Czarnecki
  << ^oo^ >> OHIR-RIPE
 
 -- 
 You received this message because you are subscribed to the Google Groups 
 "golang-nuts" group.
 To unsubscribe from this group and stop receiving emails from it, send an 
 email to golang-nuts+unsubscr...@googlegroups.com.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/golang-nuts/20200609002207.0a161adf%40xmint.
>>> 
>>> 
>>> -- 
>>> Kurtis Rader
>>> Caretaker of the exceptional canines Junior and Hank
>>> -- 
>>> You received this message because you are subscribed to the Google Groups 
>>> "golang-nuts" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>> email to golang-nuts+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/golang-nuts/CABx2%3DD9sKeHnqUAqB61y5Ts-U_f%2BctVAuS4BC0ae8phhhcp1iw%40mail.gmail.com.
>> 
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "golang-nuts" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to golang-nuts+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/golang-nuts/CAEk

Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-08 Thread Michael Jones
Ray, only the discussion is exponential.

On Mon, Jun 8, 2020 at 4:23 PM 'Axel Wagner' via golang-nuts <
golang-nuts@googlegroups.com> wrote:

> On Tue, Jun 9, 2020 at 12:48 AM Kurtis Rader  wrote:
>
>> You're talking past each other. Robert is talking about limiting the
>> length of the regex, not the string(s) evaluated by the regex.
>>
>
> So am I. Note that e.g. a Code Search based on PCRE would break, even if
> you limit *both* (or rather, any limit causing it to not break would result
> in a completely useless piece of software).
>
> It should be possible to compile any regex of a reasonable length in a
>> matter of microseconds. Regardless of whether the application of the regex
>> to a given input is near linear (as in the case of the Go RE
>> implementation) or exponential (as in the case of PCRE).
>>
>
> This might be, where we talk past each other. I am using application as an
> example for concrete numbers on how quickly exponential growth can devolve.
> Of course, compilation of a regexp is fast - I am aware of that. As it
> turns out, so is application of a regexp, if you use RE2 (or Go's regexp
> package).
>
> What I take issue with are the statements that a) the question about the
> complexity of compiling a regexp is irrelevant, because b) limiting the
> algorithmic complexity of a function to counteract resource exhaustion
> attacks "never works". RE2 is an excellent example to show that it does
> work; it is carefully designed for linear complexity of the combined
> compilation+matching of a regular expression.
>
> I'm not saying compiling a regular expression is an exponential operation.
> I'm saying *if it was*, you could never reasonably build something like
> Code Search. And we can use the fact that *application* indeed is (in most
> engines) exponential in the combined input length of expression+search text
> as a good basis to make that case.
>
> Of course we don't even need this fantasy world to make the case - after
> all, saying "you can't build Code Search on PCRE, but you can on RE2" is
> just as effective to prove the effectiveness of lowering algorithmic
> complexity. It's just that OP's question was about compilation, so to talk
> about why OP's question *is* relevant, we need to talk about what would
> happen if the answer *wasn't* "it's linear".
>
> (I also genuinely don't understand the instinct to tell someone "why would
> you care?" instead of telling them the answer, FWIW. Especially in a case
> like this, where the answer is pretty simple)
>
> I'm pretty sure Robert is not arguing that the scaling problems of the
>> regex implementation used by Perl, and too many others, can be mitigated
>> simply by limiting the size of the string to be matched by the regex. If
>> compiling a regex of reasonable length takes a non-negligible amount of
>> time something is wrong.
>>
>> On Mon, Jun 8, 2020 at 3:22 PM Wojciech S. Czarnecki 
>> wrote:
>>
>>> Dnia 2020-06-08, o godz. 16:22:24
>>> Robert Engels  napisał(a):
>>>
>>> > it is trivial to limit the input size to something a user could input.
>>>
>>> With exponential complexity simple regex /(x+x+)+y/ blows up at input of
>>> 20 to 30 x-es.
>>> See: https://www.regular-expressions.info/catastrophic.html
>>>
>>> [Cut long explanations... Axel just posted most of what I was writing
>>> regarding trade-offs).
>>>
>>> Hope this helps,
>>>
>>> --
>>> Wojciech S. Czarnecki
>>>  << ^oo^ >> OHIR-RIPE
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "golang-nuts" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to golang-nuts+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/golang-nuts/20200609002207.0a161adf%40xmint
>>> .
>>>
>>
>>
>> --
>> Kurtis Rader
>> Caretaker of the exceptional canines Junior and Hank
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "golang-nuts" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to golang-nuts+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/golang-nuts/CABx2%3DD9sKeHnqUAqB61y5Ts-U_f%2BctVAuS4BC0ae8phhhcp1iw%40mail.gmail.com
>> 
>> .
>>
> --
> You received this message because you are subscribed to the Google Groups
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-nuts+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/golang-nuts/CAEkBMfGnRrAt%3Dx3buNcfoOc9xGqVj_tfMEviU%3D7Amjw01e%3Df_w%40mail.gmail.com
> 

Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-08 Thread 'Axel Wagner' via golang-nuts
On Tue, Jun 9, 2020 at 12:48 AM Kurtis Rader  wrote:

> You're talking past each other. Robert is talking about limiting the
> length of the regex, not the string(s) evaluated by the regex.
>

So am I. Note that e.g. a Code Search based on PCRE would break, even if
you limit *both* (or rather, any limit causing it to not break would result
in a completely useless piece of software).

It should be possible to compile any regex of a reasonable length in a
> matter of microseconds. Regardless of whether the application of the regex
> to a given input is near linear (as in the case of the Go RE
> implementation) or exponential (as in the case of PCRE).
>

This might be, where we talk past each other. I am using application as an
example for concrete numbers on how quickly exponential growth can devolve.
Of course, compilation of a regexp is fast - I am aware of that. As it
turns out, so is application of a regexp, if you use RE2 (or Go's regexp
package).

What I take issue with are the statements that a) the question about the
complexity of compiling a regexp is irrelevant, because b) limiting the
algorithmic complexity of a function to counteract resource exhaustion
attacks "never works". RE2 is an excellent example to show that it does
work; it is carefully designed for linear complexity of the combined
compilation+matching of a regular expression.

I'm not saying compiling a regular expression is an exponential operation.
I'm saying *if it was*, you could never reasonably build something like
Code Search. And we can use the fact that *application* indeed is (in most
engines) exponential in the combined input length of expression+search text
as a good basis to make that case.

Of course we don't even need this fantasy world to make the case - after
all, saying "you can't build Code Search on PCRE, but you can on RE2" is
just as effective to prove the effectiveness of lowering algorithmic
complexity. It's just that OP's question was about compilation, so to talk
about why OP's question *is* relevant, we need to talk about what would
happen if the answer *wasn't* "it's linear".

(I also genuinely don't understand the instinct to tell someone "why would
you care?" instead of telling them the answer, FWIW. Especially in a case
like this, where the answer is pretty simple)

I'm pretty sure Robert is not arguing that the scaling problems of the
> regex implementation used by Perl, and too many others, can be mitigated
> simply by limiting the size of the string to be matched by the regex. If
> compiling a regex of reasonable length takes a non-negligible amount of
> time something is wrong.
>
> On Mon, Jun 8, 2020 at 3:22 PM Wojciech S. Czarnecki 
> wrote:
>
>> Dnia 2020-06-08, o godz. 16:22:24
>> Robert Engels  napisał(a):
>>
>> > it is trivial to limit the input size to something a user could input.
>>
>> With exponential complexity simple regex /(x+x+)+y/ blows up at input of
>> 20 to 30 x-es.
>> See: https://www.regular-expressions.info/catastrophic.html
>>
>> [Cut long explanations... Axel just posted most of what I was writing
>> regarding trade-offs).
>>
>> Hope this helps,
>>
>> --
>> Wojciech S. Czarnecki
>>  << ^oo^ >> OHIR-RIPE
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "golang-nuts" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to golang-nuts+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/golang-nuts/20200609002207.0a161adf%40xmint
>> .
>>
>
>
> --
> Kurtis Rader
> Caretaker of the exceptional canines Junior and Hank
>
> --
> You received this message because you are subscribed to the Google Groups
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-nuts+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/golang-nuts/CABx2%3DD9sKeHnqUAqB61y5Ts-U_f%2BctVAuS4BC0ae8phhhcp1iw%40mail.gmail.com
> 
> .
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/CAEkBMfGnRrAt%3Dx3buNcfoOc9xGqVj_tfMEviU%3D7Amjw01e%3Df_w%40mail.gmail.com.


Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-08 Thread Kurtis Rader
You're talking past each other. Robert is talking about limiting the length
of the regex, not the string(s) evaluated by the regex. It should be
possible to compile any regex of a reasonable length in a matter of
microseconds. Regardless of whether the application of the regex to a given
input is near linear (as in the case of the Go RE implementation) or
exponential (as in the case of PCRE). I'm pretty sure Robert is not arguing
that the scaling problems of the regex implementation used by Perl, and too
many others, can be mitigated simply by limiting the size of the string to
be matched by the regex. If compiling a regex of reasonable length takes a
non-negligible amount of time something is wrong.

On Mon, Jun 8, 2020 at 3:22 PM Wojciech S. Czarnecki 
wrote:

> Dnia 2020-06-08, o godz. 16:22:24
> Robert Engels  napisał(a):
>
> > it is trivial to limit the input size to something a user could input.
>
> With exponential complexity simple regex /(x+x+)+y/ blows up at input of
> 20 to 30 x-es.
> See: https://www.regular-expressions.info/catastrophic.html
>
> [Cut long explanations... Axel just posted most of what I was writing
> regarding trade-offs).
>
> Hope this helps,
>
> --
> Wojciech S. Czarnecki
>  << ^oo^ >> OHIR-RIPE
>
> --
> You received this message because you are subscribed to the Google Groups
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-nuts+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/golang-nuts/20200609002207.0a161adf%40xmint
> .
>


-- 
Kurtis Rader
Caretaker of the exceptional canines Junior and Hank

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/CABx2%3DD9sKeHnqUAqB61y5Ts-U_f%2BctVAuS4BC0ae8phhhcp1iw%40mail.gmail.com.


Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-08 Thread Wojciech S. Czarnecki
Dnia 2020-06-08, o godz. 16:22:24
Robert Engels  napisał(a):

> it is trivial to limit the input size to something a user could input.

With exponential complexity simple regex /(x+x+)+y/ blows up at input of 20 to 
30 x-es.
See: https://www.regular-expressions.info/catastrophic.html

[Cut long explanations... Axel just posted most of what I was writing regarding 
trade-offs).

Hope this helps,

-- 
Wojciech S. Czarnecki
 << ^oo^ >> OHIR-RIPE

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/20200609002207.0a161adf%40xmint.


Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-08 Thread 'Axel Wagner' via golang-nuts
They don't have to be *arbitrarily* long. They actually can be quite short.
2^n is large, even for relatively small n.
Again, I refer to the graphs in the blog post above. We are talking about
input length of less than 100 characters to get CPU churn of *minutes*.
That's not a huge limit for a search-box.

Yes, the compile-time is negligible - but it's negligible *because it's
linear in the input size*. And match-times are only small, because RE2 made
the conscious decision (for *exactly* the reasons I outline here) to forego
features that were very common and even expected of regexp-engines, so it
could guarantee a linear match time. That was novel at the time - almost no
engine did that and the most commonly used most certainly didn't. When
people in this thread say "regular expressions are great because they give
linear match time on the matched text size", they forget that most "regular
expression" engines don't actually give you that. They only give that, if
you specifically design with that in mind.

I also explained why "it is trivial to limit the input size to something a
user could input" is simply wrong, in practice. If you have exponential
growth, the difference between 20 and 30 characters (again, we are talking
search-terms here, it is not at all unreasonable to look for a 30 character
identifier) is three orders of magnitude. If the curve is too steep, you
can't reasonably separate between sensible input lengths from attacks. If
you get a nice, linear curve, that becomes trivial. Just multiply any
sensible limit you could think of by 100 and you're still going to end up
well within spec.

And even if we ignore the DoS-potential - just considering what that does
to your load-patterns, that some user's queries will be quasi free, while
others will be a million times or more as expensive is a good argument to
pay attention to this. Just from an operational POV.

Seriously - you should *absolutely* consider the algorithmic complexity of
functions you are calling with user-provided input.

On Mon, Jun 8, 2020 at 11:22 PM Robert Engels  wrote:

> You can’t say it arbitrary regex inputted by a user then claim they can be
> arbitrarily long - these are incompatible. For any reasonable user input
> the compile time is negligible, and it is trivial to limit the input size
> to something a user could input.
>
> On Jun 8, 2020, at 2:27 PM, Axel Wagner 
> wrote:
>
> 
> And that exact logic is what wouldn't work, if compilation or matching
> would be exponential, for example.
> Being able to say "as long as the inputs don't get astronomically long,
> the running time of the algorithm will be reasonable" is *exactly* the
> conclusion that a small complexity class gives you.
>
> On Mon, Jun 8, 2020 at 7:17 PM Robert Engels 
> wrote:
>
>> If the input regex string is bounded by a typical user input box - the
>> compilation time will be negligible when compared to the runtimes.
>>
>> On Jun 8, 2020, at 11:07 AM, Axel Wagner 
>> wrote:
>>
>> 
>> Oh, true, I overlooked that detail. But FTR, they clarify their question
>> subsequently and specifically ask whether compilation is linear in
>> expression size. No Must* in that clarification.
>>
>> On Mon, Jun 8, 2020 at 6:05 PM Thomas Bushnell, BSG 
>> wrote:
>>
>>> On Mon, Jun 8, 2020 at 12:02 PM Axel Wagner <
>>> axel.wagner...@googlemail.com> wrote:
>>>
 On Mon, Jun 8, 2020 at 5:41 PM Thomas Bushnell, BSG <
 tbushn...@google.com> wrote:

> The OP was about MustCompile, so I think it's clear they are not using
> patterns passed in by external requests.
>

 I don't think that's clear at all. How do you assume patterns from
 external sources can be matched, if not by compiling them?

>>>
>>> "MustCompile is like Compile but panics if the expression cannot be
>>> parsed."
>>>
>>> Thomas
>>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/CAEkBMfHxYgCfPtLZoZFppmsZYmYZMts9%3Dk4Q4pKv7ygAiR5SnA%40mail.gmail.com.


Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-08 Thread Robert Engels
You can’t say it arbitrary regex inputted by a user then claim they can be 
arbitrarily long - these are incompatible. For any reasonable user input the 
compile time is negligible, and it is trivial to limit the input size to 
something a user could input. 

> On Jun 8, 2020, at 2:27 PM, Axel Wagner  wrote:
> 
> 
> And that exact logic is what wouldn't work, if compilation or matching would 
> be exponential, for example.
> Being able to say "as long as the inputs don't get astronomically long, the 
> running time of the algorithm will be reasonable" is *exactly* the conclusion 
> that a small complexity class gives you.
> 
>> On Mon, Jun 8, 2020 at 7:17 PM Robert Engels  wrote:
>> If the input regex string is bounded by a typical user input box - the 
>> compilation time will be negligible when compared to the runtimes.   
>> 
 On Jun 8, 2020, at 11:07 AM, Axel Wagner  
 wrote:
 
>>> 
>>> Oh, true, I overlooked that detail. But FTR, they clarify their question 
>>> subsequently and specifically ask whether compilation is linear in 
>>> expression size. No Must* in that clarification.
>>> 
 On Mon, Jun 8, 2020 at 6:05 PM Thomas Bushnell, BSG  
 wrote:
> On Mon, Jun 8, 2020 at 12:02 PM Axel Wagner 
>  wrote:
 
>> On Mon, Jun 8, 2020 at 5:41 PM Thomas Bushnell, BSG 
>>  wrote:
> 
>> The OP was about MustCompile, so I think it's clear they are not using 
>> patterns passed in by external requests.
> 
> I don't think that's clear at all. How do you assume patterns from 
> external sources can be matched, if not by compiling them?
 
 "MustCompile is like Compile but panics if the expression cannot be 
 parsed."
 
 Thomas 

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/2ADCFB57-FF38-4DE1-8FC6-D660F089BB89%40ix.netcom.com.


Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-08 Thread 'Axel Wagner' via golang-nuts
And that exact logic is what wouldn't work, if compilation or matching
would be exponential, for example.
Being able to say "as long as the inputs don't get astronomically long, the
running time of the algorithm will be reasonable" is *exactly* the
conclusion that a small complexity class gives you.

On Mon, Jun 8, 2020 at 7:17 PM Robert Engels  wrote:

> If the input regex string is bounded by a typical user input box - the
> compilation time will be negligible when compared to the runtimes.
>
> On Jun 8, 2020, at 11:07 AM, Axel Wagner 
> wrote:
>
> 
> Oh, true, I overlooked that detail. But FTR, they clarify their question
> subsequently and specifically ask whether compilation is linear in
> expression size. No Must* in that clarification.
>
> On Mon, Jun 8, 2020 at 6:05 PM Thomas Bushnell, BSG 
> wrote:
>
>> On Mon, Jun 8, 2020 at 12:02 PM Axel Wagner <
>> axel.wagner...@googlemail.com> wrote:
>>
>>> On Mon, Jun 8, 2020 at 5:41 PM Thomas Bushnell, BSG <
>>> tbushn...@google.com> wrote:
>>>
 The OP was about MustCompile, so I think it's clear they are not using
 patterns passed in by external requests.

>>>
>>> I don't think that's clear at all. How do you assume patterns from
>>> external sources can be matched, if not by compiling them?
>>>
>>
>> "MustCompile is like Compile but panics if the expression cannot be
>> parsed."
>>
>> Thomas
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/CAEkBMfEfJRPh5Zuedva87-PGTJOLHMYmpvsr4whxdFrx6TTySw%40mail.gmail.com.


Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-08 Thread Robert Engels
If the input regex string is bounded by a typical user input box - the 
compilation time will be negligible when compared to the runtimes.   

> On Jun 8, 2020, at 11:07 AM, Axel Wagner  
> wrote:
> 
> 
> Oh, true, I overlooked that detail. But FTR, they clarify their question 
> subsequently and specifically ask whether compilation is linear in expression 
> size. No Must* in that clarification.
> 
>> On Mon, Jun 8, 2020 at 6:05 PM Thomas Bushnell, BSG  
>> wrote:
>>> On Mon, Jun 8, 2020 at 12:02 PM Axel Wagner  
>>> wrote:
>> 
 On Mon, Jun 8, 2020 at 5:41 PM Thomas Bushnell, BSG  
 wrote:
>>> 
 The OP was about MustCompile, so I think it's clear they are not using 
 patterns passed in by external requests.
>>> 
>>> I don't think that's clear at all. How do you assume patterns from external 
>>> sources can be matched, if not by compiling them?
>> 
>> "MustCompile is like Compile but panics if the expression cannot be parsed."
>> 
>> Thomas 

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/D411E257-EE2F-49AA-960A-6BC588AB547C%40ix.netcom.com.


Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-08 Thread 'Axel Wagner' via golang-nuts
Oh, true, I overlooked that detail. But FTR, they clarify their question
subsequently and specifically ask whether compilation is linear in
expression size. No Must* in that clarification.

On Mon, Jun 8, 2020 at 6:05 PM Thomas Bushnell, BSG 
wrote:

> On Mon, Jun 8, 2020 at 12:02 PM Axel Wagner 
> wrote:
>
>> On Mon, Jun 8, 2020 at 5:41 PM Thomas Bushnell, BSG 
>> wrote:
>>
>>> The OP was about MustCompile, so I think it's clear they are not using
>>> patterns passed in by external requests.
>>>
>>
>> I don't think that's clear at all. How do you assume patterns from
>> external sources can be matched, if not by compiling them?
>>
>
> "MustCompile is like Compile but panics if the expression cannot be
> parsed."
>
> Thomas
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/CAEkBMfHH1m18LXy%3DmstyRV6OTn7ojTuOgnQY8KhvyaW0oTWzTg%40mail.gmail.com.


Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-08 Thread 'Axel Wagner' via golang-nuts
On Mon, Jun 8, 2020 at 3:38 PM Robert Engels  wrote:

> I will claim it also doesn’t matter because you need external bounding
> anyway. Take a simple recursive directory listing. It is linear time
> bounded.
>

Depends on what we're talking about as the "input" here. It is certainly
not linearly bound by the name of the directory, as you pointed out. So the
input would have to be the FS then.


> Perform it on the top level on a sufficiently large directory and it might
> run for days consuming TBs of memory - easily becoming a DOS attack point.
> So regardless of the complexity you need other constraints anyway. Build
> those in at the request handling level to avoid DOS and UX issues.
>

I disagree. Because for this to be the case, the attacker would then have
to be able to provide a sufficiently large FS as an input. Of course you
shouldn't design a system where someone can run directory listings against
an arbitrarily large FS with an arbitrarily short query. But the reason is
*precisely* that in that scenario, a directory listing is *not* bounded by
the input-size.

That's the core of the argument here: If your algorithm is linear or
sublinear in the input-size, then any attacker wanting to consume
resources, must spent a proportional amount of their own resources.



>
> On Jun 8, 2020, at 8:27 AM, 'Axel Wagner' via golang-nuts <
> golang-nuts@googlegroups.com> wrote:
>
> 
>
>
> On Mon, Jun 8, 2020 at 3:19 PM Robert Engels 
> wrote:
>
>> I don’t see anything in the blog post referring to compilation time -
>> only runtime.
>>
>
> Sure. It's more a general point about you saying algorithmic complexity
> doesn't matter.
>
>
>> That being said I couldn’t review the graphs on my phone.
>>
>> Even still, to prevent DOS you can bound the compilation time as easily
>> as the runtime. If the lib doesn’t support it a simple fork to add an emit
>> with cancel out stage is all that is required.
>>
>
> Or, you know, make sure that it just has a guaranteed linear complexity.
> Then you don't even need "an emit with a cancel out stage".
>
> FTR, the whole issue here is a) someone asked about the algorithmic
> complexity of a stdlib function and b) people told them off saying the
> question isn't relevant. It is. You might not care (you should, IMO. But if
> you don't, that's on you). But it's definitely a relevant question to ask.
>
>
>>
>> On Jun 8, 2020, at 8:04 AM, 'Axel Wagner' via golang-nuts <
>> golang-nuts@googlegroups.com> wrote:
>>
>> 
>> On Mon, Jun 8, 2020 at 2:53 PM Robert Engels 
>> wrote:
>>
>>> Attempting to prevent DOS attacks through algorithm efficiency never
>>> works
>>>
>>
>> Uhm, no, it totally does. Code Search is a real-world example of it
>> working at least once.
>>
>>
>>> you have to have resource throttling.
>>>
>>
>> Well, yes. But reigning in algorithmic complexity makes that far easier
>> and . If the cost of a query is proportional to its length, you can just
>> limit the length of queries to some gratuitous upper bound of reasonable.
>> But if it's quadratic or even exponential, that no longer works; if the
>> cost can be doubled just by adding a character to the query, you have to
>> restrict query length to a restrictive *lower* bound on reasonable - which
>> is inherently user-unfriendly.
>>
>> Really, I'd argue that algorithmic efficiency is the *only* real
>> effective measure against a cost DoS.
>>
>> I’m guessing the IO cost of pulling the text in this case has a better
>>> chance of creating a DOS than the regex compile.
>>>
>>
>> You might guess that, but you'd be wrong. Again, just look at the graph
>> on top of this blog post .
>> You get *minutes* of match-times for queries and corpus of a couple hundred
>> characters.
>>
>> Regexp-based code search couldn't exist without carefully designing
>> around algorithmic complexity.
>>
>>
>>>
>>> On Jun 8, 2020, at 7:40 AM, 'Axel Wagner' via golang-nuts <
>>> golang-nuts@googlegroups.com> wrote:
>>>
>>> 
>>> Hi Amnon,
>>>
>>> if you read the blog posts I linked above, you'll find examples of where
>>> we care very much. RE2 was developed for enabling regular expression search
>>> in a large source code corpus. In that scenario, the attacker controls both
>>> the regular expression and (to a degree) the text to be searched. If they
>>> could craft expression/text pairs that are costly to compile and/or match,
>>> then this could enable a denial of service attack.
>>>
>>> So, guaranteeing linear compile- *and* match-times is actually pretty
>>> relevant for some real-world use-cases.
>>>
>>> Best,
>>>
>>> Axel
>>>
>>> On Mon, Jun 8, 2020 at 10:16 AM Amnon Baron Cohen 
>>> wrote:
>>>
 Should we care?

 Regular expressions are generally small.
 So the asymptotic complexity is not particularly important.

 But regular expressions are often used to search large amounts of input.

 regexp gives us fast, guaranteed linear search times.
 But we pay for 

Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-08 Thread 'Thomas Bushnell, BSG' via golang-nuts
On Mon, Jun 8, 2020 at 12:02 PM Axel Wagner 
wrote:

> On Mon, Jun 8, 2020 at 5:41 PM Thomas Bushnell, BSG 
> wrote:
>
>> The OP was about MustCompile, so I think it's clear they are not using
>> patterns passed in by external requests.
>>
>
> I don't think that's clear at all. How do you assume patterns from
> external sources can be matched, if not by compiling them?
>

"MustCompile is like Compile but panics if the expression cannot be parsed."

Thomas

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/CA%2BYjuxufXoJsPGGXNFRZE0u3BdK4fQE%3DUXaJ11LRKqR7yoHQPA%40mail.gmail.com.


Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-08 Thread 'Axel Wagner' via golang-nuts
On Mon, Jun 8, 2020 at 5:41 PM Thomas Bushnell, BSG 
wrote:

> The OP was about MustCompile, so I think it's clear they are not using
> patterns passed in by external requests.
>

I don't think that's clear at all. How do you assume patterns from external
sources can be matched, if not by compiling them?


>
> On Mon, Jun 8, 2020 at 9:39 AM Robert Engels 
> wrote:
>
>> I will claim it also doesn’t matter because you need external bounding
>> anyway. Take a simple recursive directory listing. It is linear time
>> bounded. Perform it on the top level on a sufficiently large directory and
>> it might run for days consuming TBs of memory - easily becoming a DOS
>> attack point. So regardless of the complexity you need other constraints
>> anyway. Build those in at the request handling level to avoid DOS and UX
>> issues.
>>
>> On Jun 8, 2020, at 8:27 AM, 'Axel Wagner' via golang-nuts <
>> golang-nuts@googlegroups.com> wrote:
>>
>> 
>>
>>
>> On Mon, Jun 8, 2020 at 3:19 PM Robert Engels 
>> wrote:
>>
>>> I don’t see anything in the blog post referring to compilation time -
>>> only runtime.
>>>
>>
>> Sure. It's more a general point about you saying algorithmic complexity
>> doesn't matter.
>>
>>
>>> That being said I couldn’t review the graphs on my phone.
>>>
>>> Even still, to prevent DOS you can bound the compilation time as easily
>>> as the runtime. If the lib doesn’t support it a simple fork to add an emit
>>> with cancel out stage is all that is required.
>>>
>>
>> Or, you know, make sure that it just has a guaranteed linear complexity.
>> Then you don't even need "an emit with a cancel out stage".
>>
>> FTR, the whole issue here is a) someone asked about the algorithmic
>> complexity of a stdlib function and b) people told them off saying the
>> question isn't relevant. It is. You might not care (you should, IMO. But if
>> you don't, that's on you). But it's definitely a relevant question to ask.
>>
>>
>>>
>>> On Jun 8, 2020, at 8:04 AM, 'Axel Wagner' via golang-nuts <
>>> golang-nuts@googlegroups.com> wrote:
>>>
>>> 
>>> On Mon, Jun 8, 2020 at 2:53 PM Robert Engels 
>>> wrote:
>>>
 Attempting to prevent DOS attacks through algorithm efficiency never
 works

>>>
>>> Uhm, no, it totally does. Code Search is a real-world example of it
>>> working at least once.
>>>
>>>
 you have to have resource throttling.

>>>
>>> Well, yes. But reigning in algorithmic complexity makes that far easier
>>> and . If the cost of a query is proportional to its length, you can just
>>> limit the length of queries to some gratuitous upper bound of reasonable.
>>> But if it's quadratic or even exponential, that no longer works; if the
>>> cost can be doubled just by adding a character to the query, you have to
>>> restrict query length to a restrictive *lower* bound on reasonable - which
>>> is inherently user-unfriendly.
>>>
>>> Really, I'd argue that algorithmic efficiency is the *only* real
>>> effective measure against a cost DoS.
>>>
>>> I’m guessing the IO cost of pulling the text in this case has a better
 chance of creating a DOS than the regex compile.

>>>
>>> You might guess that, but you'd be wrong. Again, just look at the graph
>>> on top of this blog post .
>>> You get *minutes* of match-times for queries and corpus of a couple hundred
>>> characters.
>>>
>>> Regexp-based code search couldn't exist without carefully designing
>>> around algorithmic complexity.
>>>
>>>

 On Jun 8, 2020, at 7:40 AM, 'Axel Wagner' via golang-nuts <
 golang-nuts@googlegroups.com> wrote:

 
 Hi Amnon,

 if you read the blog posts I linked above, you'll find examples of
 where we care very much. RE2 was developed for enabling regular expression
 search in a large source code corpus. In that scenario, the attacker
 controls both the regular expression and (to a degree) the text to be
 searched. If they could craft expression/text pairs that are costly to
 compile and/or match, then this could enable a denial of service attack.

 So, guaranteeing linear compile- *and* match-times is actually pretty
 relevant for some real-world use-cases.

 Best,

 Axel

 On Mon, Jun 8, 2020 at 10:16 AM Amnon Baron Cohen 
 wrote:

> Should we care?
>
> Regular expressions are generally small.
> So the asymptotic complexity is not particularly important.
>
> But regular expressions are often used to search large amounts of
> input.
>
> regexp gives us fast, guaranteed linear search times.
> But we pay for this with slower compilation times.
>
> In my opinion, this is a good tradeoff.
>
>
>
> On Wednesday, 3 June 2020 18:07:12 UTC+1, Ray Pereda wrote:
>>
>> I believe that the complexity of regexp.MustCompile()
>>  is linear based on this
>> comment in the regexp packa

Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-08 Thread 'Thomas Bushnell, BSG' via golang-nuts
The OP was about MustCompile, so I think it's clear they are not using
patterns passed in by external requests.

On Mon, Jun 8, 2020 at 9:39 AM Robert Engels  wrote:

> I will claim it also doesn’t matter because you need external bounding
> anyway. Take a simple recursive directory listing. It is linear time
> bounded. Perform it on the top level on a sufficiently large directory and
> it might run for days consuming TBs of memory - easily becoming a DOS
> attack point. So regardless of the complexity you need other constraints
> anyway. Build those in at the request handling level to avoid DOS and UX
> issues.
>
> On Jun 8, 2020, at 8:27 AM, 'Axel Wagner' via golang-nuts <
> golang-nuts@googlegroups.com> wrote:
>
> 
>
>
> On Mon, Jun 8, 2020 at 3:19 PM Robert Engels 
> wrote:
>
>> I don’t see anything in the blog post referring to compilation time -
>> only runtime.
>>
>
> Sure. It's more a general point about you saying algorithmic complexity
> doesn't matter.
>
>
>> That being said I couldn’t review the graphs on my phone.
>>
>> Even still, to prevent DOS you can bound the compilation time as easily
>> as the runtime. If the lib doesn’t support it a simple fork to add an emit
>> with cancel out stage is all that is required.
>>
>
> Or, you know, make sure that it just has a guaranteed linear complexity.
> Then you don't even need "an emit with a cancel out stage".
>
> FTR, the whole issue here is a) someone asked about the algorithmic
> complexity of a stdlib function and b) people told them off saying the
> question isn't relevant. It is. You might not care (you should, IMO. But if
> you don't, that's on you). But it's definitely a relevant question to ask.
>
>
>>
>> On Jun 8, 2020, at 8:04 AM, 'Axel Wagner' via golang-nuts <
>> golang-nuts@googlegroups.com> wrote:
>>
>> 
>> On Mon, Jun 8, 2020 at 2:53 PM Robert Engels 
>> wrote:
>>
>>> Attempting to prevent DOS attacks through algorithm efficiency never
>>> works
>>>
>>
>> Uhm, no, it totally does. Code Search is a real-world example of it
>> working at least once.
>>
>>
>>> you have to have resource throttling.
>>>
>>
>> Well, yes. But reigning in algorithmic complexity makes that far easier
>> and . If the cost of a query is proportional to its length, you can just
>> limit the length of queries to some gratuitous upper bound of reasonable.
>> But if it's quadratic or even exponential, that no longer works; if the
>> cost can be doubled just by adding a character to the query, you have to
>> restrict query length to a restrictive *lower* bound on reasonable - which
>> is inherently user-unfriendly.
>>
>> Really, I'd argue that algorithmic efficiency is the *only* real
>> effective measure against a cost DoS.
>>
>> I’m guessing the IO cost of pulling the text in this case has a better
>>> chance of creating a DOS than the regex compile.
>>>
>>
>> You might guess that, but you'd be wrong. Again, just look at the graph
>> on top of this blog post .
>> You get *minutes* of match-times for queries and corpus of a couple hundred
>> characters.
>>
>> Regexp-based code search couldn't exist without carefully designing
>> around algorithmic complexity.
>>
>>
>>>
>>> On Jun 8, 2020, at 7:40 AM, 'Axel Wagner' via golang-nuts <
>>> golang-nuts@googlegroups.com> wrote:
>>>
>>> 
>>> Hi Amnon,
>>>
>>> if you read the blog posts I linked above, you'll find examples of where
>>> we care very much. RE2 was developed for enabling regular expression search
>>> in a large source code corpus. In that scenario, the attacker controls both
>>> the regular expression and (to a degree) the text to be searched. If they
>>> could craft expression/text pairs that are costly to compile and/or match,
>>> then this could enable a denial of service attack.
>>>
>>> So, guaranteeing linear compile- *and* match-times is actually pretty
>>> relevant for some real-world use-cases.
>>>
>>> Best,
>>>
>>> Axel
>>>
>>> On Mon, Jun 8, 2020 at 10:16 AM Amnon Baron Cohen 
>>> wrote:
>>>
 Should we care?

 Regular expressions are generally small.
 So the asymptotic complexity is not particularly important.

 But regular expressions are often used to search large amounts of input.

 regexp gives us fast, guaranteed linear search times.
 But we pay for this with slower compilation times.

 In my opinion, this is a good tradeoff.



 On Wednesday, 3 June 2020 18:07:12 UTC+1, Ray Pereda wrote:
>
> I believe that the complexity of regexp.MustCompile()
>  is linear based on this
> comment in the regexp package overview.
> 
>
> "The regexp implementation provided by this package is guaranteed to
> run in time linear in the size of the input"
>
>
> What is the complexity of regexp.MustCompile()
> ? Is it linear in the
>

Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-08 Thread Robert Engels
I will claim it also doesn’t matter because you need external bounding anyway. 
Take a simple recursive directory listing. It is linear time bounded. Perform 
it on the top level on a sufficiently large directory and it might run for days 
consuming TBs of memory - easily becoming a DOS attack point. So regardless of 
the complexity you need other constraints anyway. Build those in at the request 
handling level to avoid DOS and UX issues. 

> On Jun 8, 2020, at 8:27 AM, 'Axel Wagner' via golang-nuts 
>  wrote:
> 
> 
> 
> 
>> On Mon, Jun 8, 2020 at 3:19 PM Robert Engels  wrote:
>> I don’t see anything in the blog post referring to compilation time - only 
>> runtime.
> 
> Sure. It's more a general point about you saying algorithmic complexity 
> doesn't matter.
>  
>> That being said I couldn’t review the graphs on my phone. 
>> 
>> Even still, to prevent DOS you can bound the compilation time as easily as 
>> the runtime. If the lib doesn’t support it a simple fork to add an emit with 
>> cancel out stage is all that is required. 
> 
> Or, you know, make sure that it just has a guaranteed linear complexity. Then 
> you don't even need "an emit with a cancel out stage".
> 
> FTR, the whole issue here is a) someone asked about the algorithmic 
> complexity of a stdlib function and b) people told them off saying the 
> question isn't relevant. It is. You might not care (you should, IMO. But if 
> you don't, that's on you). But it's definitely a relevant question to ask.
>  
>> 
 On Jun 8, 2020, at 8:04 AM, 'Axel Wagner' via golang-nuts 
  wrote:
 
>>> 
 On Mon, Jun 8, 2020 at 2:53 PM Robert Engels  wrote:
>>> 
 Attempting to prevent DOS attacks through algorithm efficiency never works
>>> 
>>> Uhm, no, it totally does. Code Search is a real-world example of it working 
>>> at least once. 
>>>  
 you have to have resource throttling. 
>>> 
>>> Well, yes. But reigning in algorithmic complexity makes that far easier and 
>>> . If the cost of a query is proportional to its length, you can just limit 
>>> the length of queries to some gratuitous upper bound of reasonable. But if 
>>> it's quadratic or even exponential, that no longer works; if the cost can 
>>> be doubled just by adding a character to the query, you have to restrict 
>>> query length to a restrictive *lower* bound on reasonable - which is 
>>> inherently user-unfriendly.
>>>  
>>> Really, I'd argue that algorithmic efficiency is the *only* real effective 
>>> measure against a cost DoS.
>>> 
 I’m guessing the IO cost of pulling the text in this case has a better 
 chance of creating a DOS than the regex compile. 
>>> 
>>> You might guess that, but you'd be wrong. Again, just look at the graph on 
>>> top of this blog post. You get *minutes* of match-times for queries and 
>>> corpus of a couple hundred characters.
>>> 
>>> Regexp-based code search couldn't exist without carefully designing around 
>>> algorithmic complexity.
>>>  
 
>> On Jun 8, 2020, at 7:40 AM, 'Axel Wagner' via golang-nuts 
>>  wrote:
>> 
> 
> Hi Amnon,
> 
> if you read the blog posts I linked above, you'll find examples of where 
> we care very much. RE2 was developed for enabling regular expression 
> search in a large source code corpus. In that scenario, the attacker 
> controls both the regular expression and (to a degree) the text to be 
> searched. If they could craft expression/text pairs that are costly to 
> compile and/or match, then this could enable a denial of service attack.
> 
> So, guaranteeing linear compile- *and* match-times is actually pretty 
> relevant for some real-world use-cases.
> 
> Best,
> 
> Axel
> 
>> On Mon, Jun 8, 2020 at 10:16 AM Amnon Baron Cohen  
>> wrote:
>> Should we care?
>> 
>> Regular expressions are generally small. 
>> So the asymptotic complexity is not particularly important.
>> 
>> But regular expressions are often used to search large amounts of input.
>> 
>> regexp gives us fast, guaranteed linear search times.
>> But we pay for this with slower compilation times.
>> 
>> In my opinion, this is a good tradeoff.
>> 
>> 
>> 
>>> On Wednesday, 3 June 2020 18:07:12 UTC+1, Ray Pereda wrote:
>>> I believe that the complexity of regexp.MustCompile() is linear based 
>>> on this comment in the regexp package overview.
>>> "The regexp implementation provided by this package is guaranteed to 
>>> run in time linear in the size of the input"
>>> 
>>> What is the complexity of regexp.MustCompile()? Is it linear in the 
>>> length of the regular expression?
>>> 
>>> -ray
>>> 
>>> 
>>> 
>> 
>> -- 
>> You received this message because you are subscribed to the Google 
>> Groups "golang-nuts" group.
>> To unsubscribe from this group and stop receiving emails from it, send 
>> an email to gol

Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-08 Thread 'Axel Wagner' via golang-nuts
On Mon, Jun 8, 2020 at 3:19 PM Robert Engels  wrote:

> I don’t see anything in the blog post referring to compilation time - only
> runtime.
>

Sure. It's more a general point about you saying algorithmic complexity
doesn't matter.


> That being said I couldn’t review the graphs on my phone.
>
> Even still, to prevent DOS you can bound the compilation time as easily as
> the runtime. If the lib doesn’t support it a simple fork to add an emit
> with cancel out stage is all that is required.
>

Or, you know, make sure that it just has a guaranteed linear complexity.
Then you don't even need "an emit with a cancel out stage".

FTR, the whole issue here is a) someone asked about the algorithmic
complexity of a stdlib function and b) people told them off saying the
question isn't relevant. It is. You might not care (you should, IMO. But if
you don't, that's on you). But it's definitely a relevant question to ask.


>
> On Jun 8, 2020, at 8:04 AM, 'Axel Wagner' via golang-nuts <
> golang-nuts@googlegroups.com> wrote:
>
> 
> On Mon, Jun 8, 2020 at 2:53 PM Robert Engels 
> wrote:
>
>> Attempting to prevent DOS attacks through algorithm efficiency never works
>>
>
> Uhm, no, it totally does. Code Search is a real-world example of it
> working at least once.
>
>
>> you have to have resource throttling.
>>
>
> Well, yes. But reigning in algorithmic complexity makes that far easier
> and . If the cost of a query is proportional to its length, you can just
> limit the length of queries to some gratuitous upper bound of reasonable.
> But if it's quadratic or even exponential, that no longer works; if the
> cost can be doubled just by adding a character to the query, you have to
> restrict query length to a restrictive *lower* bound on reasonable - which
> is inherently user-unfriendly.
>
> Really, I'd argue that algorithmic efficiency is the *only* real effective
> measure against a cost DoS.
>
> I’m guessing the IO cost of pulling the text in this case has a better
>> chance of creating a DOS than the regex compile.
>>
>
> You might guess that, but you'd be wrong. Again, just look at the graph on
> top of this blog post . You
> get *minutes* of match-times for queries and corpus of a couple hundred
> characters.
>
> Regexp-based code search couldn't exist without carefully designing around
> algorithmic complexity.
>
>
>>
>> On Jun 8, 2020, at 7:40 AM, 'Axel Wagner' via golang-nuts <
>> golang-nuts@googlegroups.com> wrote:
>>
>> 
>> Hi Amnon,
>>
>> if you read the blog posts I linked above, you'll find examples of where
>> we care very much. RE2 was developed for enabling regular expression search
>> in a large source code corpus. In that scenario, the attacker controls both
>> the regular expression and (to a degree) the text to be searched. If they
>> could craft expression/text pairs that are costly to compile and/or match,
>> then this could enable a denial of service attack.
>>
>> So, guaranteeing linear compile- *and* match-times is actually pretty
>> relevant for some real-world use-cases.
>>
>> Best,
>>
>> Axel
>>
>> On Mon, Jun 8, 2020 at 10:16 AM Amnon Baron Cohen 
>> wrote:
>>
>>> Should we care?
>>>
>>> Regular expressions are generally small.
>>> So the asymptotic complexity is not particularly important.
>>>
>>> But regular expressions are often used to search large amounts of input.
>>>
>>> regexp gives us fast, guaranteed linear search times.
>>> But we pay for this with slower compilation times.
>>>
>>> In my opinion, this is a good tradeoff.
>>>
>>>
>>>
>>> On Wednesday, 3 June 2020 18:07:12 UTC+1, Ray Pereda wrote:

 I believe that the complexity of regexp.MustCompile()
  is linear based on this
 comment in the regexp package overview.
 

 "The regexp implementation provided by this package is guaranteed to
 run in time linear in the size of the input"


 What is the complexity of regexp.MustCompile()
 ? Is it linear in the
 length of the regular expression?

 -ray



 --
>>> You received this message because you are subscribed to the Google
>>> Groups "golang-nuts" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to golang-nuts+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/golang-nuts/162b28e7-bd81-47d4-afb7-7fe9f8a15b8do%40googlegroups.com
>>> 
>>> .
>>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "golang-nuts" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to golang-nuts+unsubscr...@googlegroups.com.
>> To view 

Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-08 Thread Robert Engels
I don’t see anything in the blog post referring to compilation time - only 
runtime. That being said I couldn’t review the graphs on my phone. 

Even still, to prevent DOS you can bound the compilation time as easily as the 
runtime. If the lib doesn’t support it a simple fork to add an emit with cancel 
out stage is all that is required. 

> On Jun 8, 2020, at 8:04 AM, 'Axel Wagner' via golang-nuts 
>  wrote:
> 
> 
>> On Mon, Jun 8, 2020 at 2:53 PM Robert Engels  wrote:
> 
>> Attempting to prevent DOS attacks through algorithm efficiency never works
> 
> Uhm, no, it totally does. Code Search is a real-world example of it working 
> at least once. 
>  
>> you have to have resource throttling. 
> 
> Well, yes. But reigning in algorithmic complexity makes that far easier and . 
> If the cost of a query is proportional to its length, you can just limit the 
> length of queries to some gratuitous upper bound of reasonable. But if it's 
> quadratic or even exponential, that no longer works; if the cost can be 
> doubled just by adding a character to the query, you have to restrict query 
> length to a restrictive *lower* bound on reasonable - which is inherently 
> user-unfriendly.
>  
> Really, I'd argue that algorithmic efficiency is the *only* real effective 
> measure against a cost DoS.
> 
>> I’m guessing the IO cost of pulling the text in this case has a better 
>> chance of creating a DOS than the regex compile. 
> 
> You might guess that, but you'd be wrong. Again, just look at the graph on 
> top of this blog post. You get *minutes* of match-times for queries and 
> corpus of a couple hundred characters.
> 
> Regexp-based code search couldn't exist without carefully designing around 
> algorithmic complexity.
>  
>> 
 On Jun 8, 2020, at 7:40 AM, 'Axel Wagner' via golang-nuts 
  wrote:
 
>>> 
>>> Hi Amnon,
>>> 
>>> if you read the blog posts I linked above, you'll find examples of where we 
>>> care very much. RE2 was developed for enabling regular expression search in 
>>> a large source code corpus. In that scenario, the attacker controls both 
>>> the regular expression and (to a degree) the text to be searched. If they 
>>> could craft expression/text pairs that are costly to compile and/or match, 
>>> then this could enable a denial of service attack.
>>> 
>>> So, guaranteeing linear compile- *and* match-times is actually pretty 
>>> relevant for some real-world use-cases.
>>> 
>>> Best,
>>> 
>>> Axel
>>> 
 On Mon, Jun 8, 2020 at 10:16 AM Amnon Baron Cohen  
 wrote:
 Should we care?
 
 Regular expressions are generally small. 
 So the asymptotic complexity is not particularly important.
 
 But regular expressions are often used to search large amounts of input.
 
 regexp gives us fast, guaranteed linear search times.
 But we pay for this with slower compilation times.
 
 In my opinion, this is a good tradeoff.
 
 
 
> On Wednesday, 3 June 2020 18:07:12 UTC+1, Ray Pereda wrote:
> I believe that the complexity of regexp.MustCompile() is linear based on 
> this comment in the regexp package overview.
> "The regexp implementation provided by this package is guaranteed to run 
> in time linear in the size of the input"
> 
> What is the complexity of regexp.MustCompile()? Is it linear in the 
> length of the regular expression?
> 
> -ray
> 
> 
> 
 
 -- 
 You received this message because you are subscribed to the Google Groups 
 "golang-nuts" group.
 To unsubscribe from this group and stop receiving emails from it, send an 
 email to golang-nuts+unsubscr...@googlegroups.com.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/golang-nuts/162b28e7-bd81-47d4-afb7-7fe9f8a15b8do%40googlegroups.com.
>>> 
>>> -- 
>>> You received this message because you are subscribed to the Google Groups 
>>> "golang-nuts" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>> email to golang-nuts+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/golang-nuts/CAEkBMfGXknH1ZQK7%3DYWay_ruVitjubh3CgWk5hxrTcNLdry%3D_g%40mail.gmail.com.
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to golang-nuts+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/golang-nuts/CAEkBMfGBDB%2BE%3DiVUDdD0_6oa-mXQYPVnjv%2BCpGGGJYaODP8NYA%40mail.gmail.com.

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/B116

Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-08 Thread 'Axel Wagner' via golang-nuts
On Mon, Jun 8, 2020 at 2:53 PM Robert Engels  wrote:

> Attempting to prevent DOS attacks through algorithm efficiency never works
>

Uhm, no, it totally does. Code Search is a real-world example of it working
at least once.


> you have to have resource throttling.
>

Well, yes. But reigning in algorithmic complexity makes that far easier and
. If the cost of a query is proportional to its length, you can just limit
the length of queries to some gratuitous upper bound of reasonable. But if
it's quadratic or even exponential, that no longer works; if the cost can
be doubled just by adding a character to the query, you have to restrict
query length to a restrictive *lower* bound on reasonable - which is
inherently user-unfriendly.

Really, I'd argue that algorithmic efficiency is the *only* real effective
measure against a cost DoS.

I’m guessing the IO cost of pulling the text in this case has a better
> chance of creating a DOS than the regex compile.
>

You might guess that, but you'd be wrong. Again, just look at the graph on
top of this blog post . You get
*minutes* of match-times for queries and corpus of a couple hundred
characters.

Regexp-based code search couldn't exist without carefully designing around
algorithmic complexity.


>
> On Jun 8, 2020, at 7:40 AM, 'Axel Wagner' via golang-nuts <
> golang-nuts@googlegroups.com> wrote:
>
> 
> Hi Amnon,
>
> if you read the blog posts I linked above, you'll find examples of where
> we care very much. RE2 was developed for enabling regular expression search
> in a large source code corpus. In that scenario, the attacker controls both
> the regular expression and (to a degree) the text to be searched. If they
> could craft expression/text pairs that are costly to compile and/or match,
> then this could enable a denial of service attack.
>
> So, guaranteeing linear compile- *and* match-times is actually pretty
> relevant for some real-world use-cases.
>
> Best,
>
> Axel
>
> On Mon, Jun 8, 2020 at 10:16 AM Amnon Baron Cohen 
> wrote:
>
>> Should we care?
>>
>> Regular expressions are generally small.
>> So the asymptotic complexity is not particularly important.
>>
>> But regular expressions are often used to search large amounts of input.
>>
>> regexp gives us fast, guaranteed linear search times.
>> But we pay for this with slower compilation times.
>>
>> In my opinion, this is a good tradeoff.
>>
>>
>>
>> On Wednesday, 3 June 2020 18:07:12 UTC+1, Ray Pereda wrote:
>>>
>>> I believe that the complexity of regexp.MustCompile()
>>>  is linear based on this
>>> comment in the regexp package overview.
>>> 
>>>
>>> "The regexp implementation provided by this package is guaranteed to run
>>> in time linear in the size of the input"
>>>
>>>
>>> What is the complexity of regexp.MustCompile()
>>> ? Is it linear in the
>>> length of the regular expression?
>>>
>>> -ray
>>>
>>>
>>>
>>> --
>> You received this message because you are subscribed to the Google Groups
>> "golang-nuts" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to golang-nuts+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/golang-nuts/162b28e7-bd81-47d4-afb7-7fe9f8a15b8do%40googlegroups.com
>> 
>> .
>>
> --
> You received this message because you are subscribed to the Google Groups
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-nuts+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/golang-nuts/CAEkBMfGXknH1ZQK7%3DYWay_ruVitjubh3CgWk5hxrTcNLdry%3D_g%40mail.gmail.com
> 
> .
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/CAEkBMfGBDB%2BE%3DiVUDdD0_6oa-mXQYPVnjv%2BCpGGGJYaODP8NYA%40mail.gmail.com.


Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-08 Thread Robert Engels
Attempting to prevent DOS attacks through algorithm efficiency never works - 
you have to have resource throttling. 

I’m guessing the IO cost of pulling the text in this case has a better chance 
of creating a DOS than the regex compile. 

> On Jun 8, 2020, at 7:40 AM, 'Axel Wagner' via golang-nuts 
>  wrote:
> 
> 
> Hi Amnon,
> 
> if you read the blog posts I linked above, you'll find examples of where we 
> care very much. RE2 was developed for enabling regular expression search in a 
> large source code corpus. In that scenario, the attacker controls both the 
> regular expression and (to a degree) the text to be searched. If they could 
> craft expression/text pairs that are costly to compile and/or match, then 
> this could enable a denial of service attack.
> 
> So, guaranteeing linear compile- *and* match-times is actually pretty 
> relevant for some real-world use-cases.
> 
> Best,
> 
> Axel
> 
>> On Mon, Jun 8, 2020 at 10:16 AM Amnon Baron Cohen  wrote:
>> Should we care?
>> 
>> Regular expressions are generally small. 
>> So the asymptotic complexity is not particularly important.
>> 
>> But regular expressions are often used to search large amounts of input.
>> 
>> regexp gives us fast, guaranteed linear search times.
>> But we pay for this with slower compilation times.
>> 
>> In my opinion, this is a good tradeoff.
>> 
>> 
>> 
>>> On Wednesday, 3 June 2020 18:07:12 UTC+1, Ray Pereda wrote:
>>> I believe that the complexity of regexp.MustCompile() is linear based on 
>>> this comment in the regexp package overview.
>>> "The regexp implementation provided by this package is guaranteed to run in 
>>> time linear in the size of the input"
>>> 
>>> What is the complexity of regexp.MustCompile()? Is it linear in the length 
>>> of the regular expression?
>>> 
>>> -ray
>>> 
>>> 
>>> 
>> 
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "golang-nuts" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to golang-nuts+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/golang-nuts/162b28e7-bd81-47d4-afb7-7fe9f8a15b8do%40googlegroups.com.
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to golang-nuts+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/golang-nuts/CAEkBMfGXknH1ZQK7%3DYWay_ruVitjubh3CgWk5hxrTcNLdry%3D_g%40mail.gmail.com.

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/08669F7D-3D67-4A84-BA5B-395556550E0A%40ix.netcom.com.


Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-08 Thread 'Axel Wagner' via golang-nuts
Hi Amnon,

if you read the blog posts I linked above, you'll find examples of where we
care very much. RE2 was developed for enabling regular expression search in
a large source code corpus. In that scenario, the attacker controls both
the regular expression and (to a degree) the text to be searched. If they
could craft expression/text pairs that are costly to compile and/or match,
then this could enable a denial of service attack.

So, guaranteeing linear compile- *and* match-times is actually pretty
relevant for some real-world use-cases.

Best,

Axel

On Mon, Jun 8, 2020 at 10:16 AM Amnon Baron Cohen  wrote:

> Should we care?
>
> Regular expressions are generally small.
> So the asymptotic complexity is not particularly important.
>
> But regular expressions are often used to search large amounts of input.
>
> regexp gives us fast, guaranteed linear search times.
> But we pay for this with slower compilation times.
>
> In my opinion, this is a good tradeoff.
>
>
>
> On Wednesday, 3 June 2020 18:07:12 UTC+1, Ray Pereda wrote:
>>
>> I believe that the complexity of regexp.MustCompile()
>>  is linear based on this
>> comment in the regexp package overview.
>> 
>>
>> "The regexp implementation provided by this package is guaranteed to run
>> in time linear in the size of the input"
>>
>>
>> What is the complexity of regexp.MustCompile()
>> ? Is it linear in the length
>> of the regular expression?
>>
>> -ray
>>
>>
>>
>> --
> You received this message because you are subscribed to the Google Groups
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-nuts+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/golang-nuts/162b28e7-bd81-47d4-afb7-7fe9f8a15b8do%40googlegroups.com
> 
> .
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/CAEkBMfGXknH1ZQK7%3DYWay_ruVitjubh3CgWk5hxrTcNLdry%3D_g%40mail.gmail.com.


Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-03 Thread 'Axel Wagner' via golang-nuts
Hi,

I can very much recommend reading this list of blog posts by Russ Cox
. They explain the theory behind all of
this. And AIUI, yes, compilation should also be linear. The cost, notably,
is that the regexp package actually can only compile actual regular
expressions which only recognize regular languages, by omitting features
such as back-references. Other regexp-engines are more powerful, at the
expense of no longer being able to have this linear runtime characteristic.
So there is still a tradeoff (though I tend to agree with the one made by
RE2 and thus the regexp-package).


On Wed, Jun 3, 2020 at 11:43 PM Ray Pereda  wrote:

> Typo fix, regexp#MatchString  and
> related Match functions are linear. That is an amazing guarantee and makes
> regular
> expressions 10x more useful than regexps in most other programming
> languages.
>
> Question: Does the *compile* of regular expressions also guaranteed
> linear runtime? I'm considering regular expressions with
> 1000s of keywords; very long regexps. I looked at the source and it was
> not obvious if that is the case. Intricate code.
>
> On Wednesday, June 3, 2020 at 10:07:12 AM UTC-7, Ray Pereda wrote:
>>
>> I believe that the complexity of regexp.MustCompile()
>>  is linear based on this
>> comment in the regexp package overview.
>> 
>>
>> "The regexp implementation provided by this package is guaranteed to run
>> in time linear in the size of the input"
>>
>>
>> What is the complexity of regexp.MustCompile()
>> ? Is it linear in the length
>> of the regular expression?
>>
>> -ray
>>
>>
>>
>> --
> You received this message because you are subscribed to the Google Groups
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-nuts+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/golang-nuts/f51125e7-f66e-45e8-af3a-381f96071d9a%40googlegroups.com
> 
> .
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/CAEkBMfHP3ymt%2BL63fdw%3Dob8%3Df-FfBtsC%3DGhaO1Q97N3eszvogQ%40mail.gmail.com.


Re: [go-nuts] Re: what is the complexity of regexp.MustCompile()?

2020-06-03 Thread Michael Jones
If you have thousands of fixed strings, a map is your friend. Remarkably
so.

On Wed, Jun 3, 2020 at 2:43 PM Ray Pereda  wrote:

> Typo fix, regexp#MatchString  and
> related Match functions are linear. That is an amazing guarantee and makes
> regular
> expressions 10x more useful than regexps in most other programming
> languages.
>
> Question: Does the *compile* of regular expressions also guaranteed
> linear runtime? I'm considering regular expressions with
> 1000s of keywords; very long regexps. I looked at the source and it was
> not obvious if that is the case. Intricate code.
>
> On Wednesday, June 3, 2020 at 10:07:12 AM UTC-7, Ray Pereda wrote:
>>
>> I believe that the complexity of regexp.MustCompile()
>>  is linear based on this
>> comment in the regexp package overview.
>> 
>>
>> "The regexp implementation provided by this package is guaranteed to run
>> in time linear in the size of the input"
>>
>>
>> What is the complexity of regexp.MustCompile()
>> ? Is it linear in the length
>> of the regular expression?
>>
>> -ray
>>
>>
>>
>> --
> You received this message because you are subscribed to the Google Groups
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-nuts+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/golang-nuts/f51125e7-f66e-45e8-af3a-381f96071d9a%40googlegroups.com
> 
> .
>
-- 

*Michael T. jonesmichael.jo...@gmail.com *

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/CALoEmQy9TBAJo4Oz_wYsWvacDW_Q_y49bj%3DUCuY1tcdqSR5Ouw%40mail.gmail.com.