Excerpts from Edward Z. Yang's message of Fri Jun 05 17:47:00 -0400 2009:
> Now that you mention it, the messages that tickle this bug on my side also
> have one extremely long line. That's very interesting.
Here is the culprit, laid out to bear its full shame:
/\w.*:$/
I thought this was a suspicious-looking regex; a simple test confirmed my
belief:
line = ":a" * 10000
line =~ /\w.*:$/
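Ba boom ba boom ba boom. This is a textbook case of catastrophic backtracking:
every \w in the line is a possible starting point, and from each one .* runs to
the end of the line and then backs up a character at a time looking for a final
colon. On a long line that does not end in a colon, that is quadratic work.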
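A rough way to see the growth is to time the match at a few line lengths.
This is only a sketch: the helper names (pathological_line, time_match) and
the sizes are made up for illustration, and the absolute numbers depend on
the machine and the Ruby version. Newer interpreters have regexp
optimizations that may flatten the curve.

```ruby
require 'benchmark'

# The original pattern from lib/sup/message.rb.
SLOW = /\w.*:$/

# Build a line shaped like the test case above: many word characters,
# many colons, and no colon at the very end (illustrative helper).
def pathological_line(n)
  ":a" * n
end

# Time a single match attempt against a line of n repetitions.
def time_match(pattern, n)
  line = pathological_line(n)
  Benchmark.realtime { line =~ pattern }
end

# Doubling n should roughly quadruple the time if the cost is quadratic.
[1000, 2000, 4000].each do |n|
  printf("n=%5d  %.4fs\n", n, time_match(SLOW, n))
end
```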
I have two possible fixes. They take about the same time in regular cases,
but the second one is faster for really long strings:
First, the simple one:
diff --git a/lib/sup/message.rb b/lib/sup/message.rb
index 5993729..0ddd3af 100644
--- a/lib/sup/message.rb
+++ b/lib/sup/message.rb
@@ -26,7 +26,7 @@ class Message
QUOTE_PATTERN = /^\s{0,4}[>|\}]/
BLOCK_QUOTE_PATTERN = /^-----\s*Original Message\s*----+$/
- QUOTE_START_PATTERN = /\w.*:$/
+ QUOTE_START_PATTERN = /\w\W*:$/
SIG_PATTERN = /(^-- ?$)|(^\s*----------+\s*$)|(^\s*_________+\s*$)|(^\s*--~--~-)|(^\s*--\+\+\*\*==)/
MAX_SIG_DISTANCE = 15 # lines from the end
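The replacement pattern should accept exactly the same lines as the old one,
because whatever sits between the last word character and the final colon
contains no word characters by definition. A small sanity check, assuming
single-line input; the sample lines and the same_verdict? helper are
invented for illustration and are not part of sup:

```ruby
OLD_PATTERN = /\w.*:$/
NEW_PATTERN = /\w\W*:$/

# Made-up sample lines covering the interesting shapes.
SAMPLES = [
  "On Fri, Jun 5, 2009, someone wrote:",  # typical attribution line
  "wrote :",                              # space before the colon
  "::::",                                 # colons but no word character
  "no trailing colon",
  "a:b",                                  # colon not at end of line
  "",
  ":a" * 50,                              # shape of the pathological case
  ":a" * 50 + ":",
]

# True when both patterns agree on whether the line matches at all.
def same_verdict?(line)
  !!(line =~ OLD_PATTERN) == !!(line =~ NEW_PATTERN)
end

SAMPLES.each do |line|
  puts format("%-5s %s", same_verdict?(line), line[0, 40].inspect)
end
```

The two patterns can report different match offsets (the old one starts at
the first word character, the new one at the last), which does not matter
when the result is only used as a true/false test.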
And the slightly more complicated one (but optimal for large n):
diff --git a/lib/sup/message.rb b/lib/sup/message.rb
index 5993729..c5481a6 100644
--- a/lib/sup/message.rb
+++ b/lib/sup/message.rb
@@ -26,7 +26,6 @@ class Message
QUOTE_PATTERN = /^\s{0,4}[>|\}]/
BLOCK_QUOTE_PATTERN = /^-----\s*Original Message\s*----+$/
- QUOTE_START_PATTERN = /\w.*:$/
SIG_PATTERN = /(^-- ?$)|(^\s*----------+\s*$)|(^\s*_________+\s*$)|(^\s*--~--~-)|
MAX_SIG_DISTANCE = 15 # lines from the end
@@ -449,7 +448,7 @@ private
when :text
newstate = nil
- if line =~ QUOTE_PATTERN || (line =~ QUOTE_START_PATTERN && nextline =~ QUO
+ if line =~ QUOTE_PATTERN || (line =~ /:$/ && line =~ /\w/ && nextline =~ QU
newstate = :quote
elsif line =~ SIG_PATTERN && (lines.length - i) < MAX_SIG_DISTANCE
newstate = :sig
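The idea in this second patch is to replace one pattern that can backtrack
with two trivial ones, each of which is a single linear scan. Pulled out
into a standalone method it might look like the sketch below; quote_start?
is a hypothetical name, and the patch itself inlines the condition rather
than adding a helper.

```ruby
# Hypothetical helper, not part of sup: true when the line ends in a colon
# and contains at least one word character.
def quote_start?(line)
  # Two independent checks: /:$/ only looks at the end of the line, and
  # /\w/ stops at the first word character it finds.
  !!(line =~ /:$/ && line =~ /\w/)
end

puts quote_start?("Excerpts from someone's message:")
puts quote_start?("::::")
puts quote_start?(":a" * 10000)
```

This assumes the input is a single line, as in the per-line loop shown in
the diff. On a string with embedded newlines the two checks could be
satisfied by different lines, which the original pattern would not allow.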
There are a number of micro-optimizations that could be made to message
parsing, but this will basically fix the egregious problem.
Cheers,
Edward