How does disallowing newlines in regexes solve the problem? I would think the failing manifest would still lex incorrectly, since there's no newline in it.
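
A rough way to check this is to run the old and new :REGEX patterns from the quoted diff against both manifests mentioned in this thread. This is only a sketch: it assumes an anchored StringScanner#scan at the first slash is a fair stand-in for what the lexer does at that point, and regex_at_first_slash is a made-up helper, not part of Puppet.

```ruby
require 'strscan'

# Patterns copied from the quoted diff.
OLD_REGEX = %r{/[^/]*/}     # before the patch
NEW_REGEX = %r{/[^/\n]*/}   # after the patch: no newline inside a regex

# Manifests quoted in the thread.
TWO_LINES = "$var = 4096 / 4\n$var2 = \"/tmp/file\""
ONE_LINE  = '$var = 4096 / 4 / 4'

# Hypothetical helper: try to match the pattern exactly at the first '/'
# in the source. Returns the matched text, or nil if nothing matches there.
def regex_at_first_slash(pattern, source)
  scanner = StringScanner.new(source)
  scanner.pos = source.index('/')  # ASCII input, so char index == byte pos
  scanner.scan(pattern)
end

p regex_at_first_slash(OLD_REGEX, TWO_LINES)  # "/ 4\n$var2 = \"/"
p regex_at_first_slash(NEW_REGEX, TWO_LINES)  # nil
p regex_at_first_slash(OLD_REGEX, ONE_LINE)   # "/ 4 /"
p regex_at_first_slash(NEW_REGEX, ONE_LINE)   # "/ 4 /"
```

Under that assumption, the newline restriction rescues the two-line case from the commit message, but 4096 / 4 / 4 still matches as a regex.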
Would it be reasonable to have regexes not be recognized when surrounded by numbers? As you say, this breaks out of the LALR a bit, but it's at least straightforward to say something like, "if the previous token was a number, then this can't be a regex" in the lexer, right?

On Sep 21, 2009, at 1:57 PM, Brice Figureau wrote:

>
> This is a temporary fix for 0.25.1.
> Unfortunately I don't have any good/better fix for this problem except adding
> once again more parsing in the lexer (ie remember if the previous token
> is a token that can be followed by a :REGEX token).
>
> Frankly, the real fix would be to move to a PEG parser or anything not
> LALR(1) :-(
>
> A possible solution, but I don't know if it's possible would be to move the
> regex parsing to the Racc grammar instead of the lexer. Then we might be able
> to play with the parser error recovery system, something I'll try.
>
> Anyway, if someone has a good/better idea please raise your hand.
>
> To explain the issue have a look to this failing manifest:
> $var = 4096 / 4 / 4
> Which is parsed as [4096, :REGEX, 4] instead of the mathematical expression.
>
> Brice,
>
> Original Commit Msg:
> This is not the real fix. It is just an hot-fix to limit
> the issue.
> The issue is that the lexer regexes have precedences over simple
> '/' (divide).
> In the following expression:
> $var = 4096 / 4
> $var2 = "/tmp/file"
>
> The / 4... part is mis-lexed as a regex instead of a mathematical
> expression.
> The current fix limits regex to one-line.
>
> Signed-off-by: Brice Figureau <[email protected]>
> ---
>  lib/puppet/parser/lexer.rb |    2 +-
>  spec/unit/parser/lexer.rb  |    4 ++++
>  2 files changed, 5 insertions(+), 1 deletions(-)
>
> diff --git a/lib/puppet/parser/lexer.rb b/lib/puppet/parser/lexer.rb
> index e027a69..0db6c22 100644
> --- a/lib/puppet/parser/lexer.rb
> +++ b/lib/puppet/parser/lexer.rb
> @@ -171,7 +171,7 @@ class Puppet::Parser::Lexer
>          [self,value]
>      end
>
> -    TOKENS.add_token :REGEX, %r{/[^/]*/} do |lexer, value|
> +    TOKENS.add_token :REGEX, %r{/[^/\n]*/} do |lexer, value|
>          # Make sure we haven't matched an escaped /
>          while value[-2..-2] == '\\'
>              other = lexer.scan_until(%r{/})
> diff --git a/spec/unit/parser/lexer.rb b/spec/unit/parser/lexer.rb
> index 1c3e91b..3c73ca9 100755
> --- a/spec/unit/parser/lexer.rb
> +++ b/spec/unit/parser/lexer.rb
> @@ -460,6 +460,10 @@ describe Puppet::Parser::Lexer::TOKENS[:REGEX] do
>          @token.regex.should =~ '/this is a regex/'
>      end
>
> +    it 'should not match if there is \n in the regex' do
> +        @token.regex.should_not =~ "/this is \n a regex/"
> +    end
> +
>      describe "when including escaped slashes" do
>          before { @lexer = Puppet::Parser::Lexer.new }
>
> --
> 1.6.4
>

--
The Number 1 Sign You Have Nothing to Do at Work...
The 4th Division of Paperclips has overrun the Pushpin Infantry
and General White-Out has called for a new skirmish.
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com
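
The "if the previous token was a number, then this can't be a regex" idea from the top of this message could look roughly like the sketch below. It is a toy lexer, not Puppet's: the tokenize method, the token names other than :REGEX, and the OPERAND_ENDINGS list are all assumptions made for illustration, and treating variables like numbers goes slightly beyond what the message proposes.

```ruby
require 'strscan'

# Assumption: token types after which a '/' must be a division, because
# they can end an operand. The message only mentions numbers.
OPERAND_ENDINGS = [:NUMBER, :VARIABLE].freeze

# Hypothetical mini-lexer handling only numbers, variables, '=', '/'
# and one-line regexes. Returns an array of [type, text] pairs.
def tokenize(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    next if scanner.skip(/\s+/)            # ignore whitespace
    previous = tokens.last && tokens.last[0]
    if (text = scanner.scan(/\d+/))
      tokens << [:NUMBER, text]
    elsif (text = scanner.scan(/\$\w+/))
      tokens << [:VARIABLE, text]
    elsif (text = scanner.scan(/=/))
      tokens << [:EQUALS, text]
    elsif !OPERAND_ENDINGS.include?(previous) &&
          (text = scanner.scan(%r{/[^/\n]*/}))
      # Only try :REGEX when the previous token cannot end an operand.
      tokens << [:REGEX, text]
    elsif (text = scanner.scan(%r{/}))
      tokens << [:DIV, text]
    else
      raise ArgumentError, "unexpected input at position #{scanner.pos}"
    end
  end
  tokens
end

p tokenize('$var = 4096 / 4 / 4').map(&:first)
p tokenize('$x = /foo/').map(&:first)
```

With this extra bit of lexer state, the failing manifest comes out as a chain of numbers and divisions, while a slash-delimited pattern after '=' is still taken as a regex. The cost is the one already noted in the thread: the lexer has to remember context that an LALR(1) setup would normally leave to the grammar.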
