On 23/09/09 21:03, Luke Kanies wrote:
> On Sep 23, 2009, at 11:53 AM, Brice Figureau wrote:
> 
>> On 23/09/09 20:42, Luke Kanies wrote:
>>> On Sep 23, 2009, at 11:34 AM, Markus Roberts wrote:
>>>
>>>>>> Given the ridiculous complexity we're looking at adding to work
>>>>>> around this, making the lexer easier by adding the leading 'm'
>>>>>> or whatever seems like a good step.  A bit ugly lexically, but
>>>>>> way easier to implement and will help us avoid more pain in
>>>>>> the future.
>>>>> Definitely.
>>>>> For which version^H^H^H^H codename do we plan the change?
>>>> Well, it's a bugfix that just happens to horribly break backward
>>>> compatibility.  But I'd say 0.25.1, because we're basically choosing
>>>> between breaking compatibility with code in 0.24 (incorrectly lexing
>>>> division) and 0.25 (changing the regex lexing).
>>>>
>>>> Thoughts?
>>>>
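(To make the tradeoff concrete: untested, made-up manifest snippets,
not taken from the thread. After a value, '/' has to lex as division,
while where a value can start it has to open a regex:

    $half = $total / 2                 # '/' is DIV here
    if $hostname =~ /^web\d+$/ { }     # '/' starts a REGEX here

A leading 'm', as in Perl's m//, would make the second case
unambiguous to the lexer, at the cost of writing m/^web\d+$/.)
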
>>>> Yeah.  I don't like the leading 'm'.
>>>>
>>>> Further, I think the complexity in the workaround is coming from
>>>> the approach (maintaining special-purpose state) rather than the
>>>> problem.
>>>>
>>>> I'd favor (as I mentioned on the ticket) adding a general-purpose
>>>> "acceptable?" test to tokens and having the lexer use it in the
>>>> longest-match-first logic.
>>>>
>>>> Make "acceptable?" default to true, but in the case of REGEX have it
>>>> check that it's in a value context (that is, that the last token
>>>> generated was in the set LBRACE, RBRACE, COMMA, etc.).  As far as I
>>>> can see, the last token gives a clean division between where DIV is
>>>> acceptable and where REGEX is acceptable, so there need only be one
>>>> place (REGEX.acceptable?) where we do anything funky--and that's just
>>>> an [...].include? test.
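(If I understand the proposal, the mechanics would be something like
this: an untested Ruby sketch with invented names, not the real lexer
API:

    # Each token type answers acceptable?(context); the lexer keeps the
    # last token it emitted in the context and asks each candidate
    # before accepting its match.
    class Token
      attr_reader :name, :regex

      def initialize(name, regex)
        @name, @regex = name, regex
      end

      # Default: a token is acceptable anywhere.
      def acceptable?(context)
        true
      end
    end

    class RegexToken < Token
      # '/' can only start a regex where a value can start, and the
      # previous token tells us that (list illustrative, not exhaustive).
      VALUE_POSITION = [:NODE, :LBRACE, :RBRACE, :MATCH, :NOMATCH, :COMMA]

      def acceptable?(context)
        VALUE_POSITION.include?(context[:after])
      end
    end

    # e.g.:
    # RegexToken.new(:REGEX, %r{/[^/\n]*/}).acceptable?(:after => :LBRACE)
    #   # => true

The longest-match loop stays generic; REGEX is the only token that
overrides the predicate.)
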
>>>>
>>>> I'm working on debugging a 24.8/25.0 problem at the moment but I'll
>>>> try to code this up later today.
>> OK, thanks.
>>
>>> Ok, I look forward to seeing your code.  I also am not that fond of
>>> the m//, but in the absence of another clean solution...
>> Started working on this m// thing and found a bug in the lexer: in
>> find_regex_token, when several tokens match, only the first one wins,
>> which is the opposite of what we want.
>>
>> Why?
>> Because in the "if match_length > length" branch we scan the input
>> string, which advances the scanner pointer, so any other matching
>> token no longer gets a chance to match. That doesn't look like a
>> feature to me (though I might be wrong).
>>
>> diff --git a/lib/puppet/parser/lexer.rb b/lib/puppet/parser/lexer.rb
>> index 0db6c22..020963d 100644
>> --- a/lib/puppet/parser/lexer.rb
>> +++ b/lib/puppet/parser/lexer.rb
>> @@ -297,12 +300,12 @@ class Puppet::Parser::Lexer
>>
>>              # We've found a longer match
>>              if match_length > length
>> -                value = @scanner.scan(token.regex)
>> -                length = value.length
>> +                length = match_length
>>                  matched_token = token
>>              end
>>          end
>>
>> +        value = @scanner.scan(matched_token.regex) unless matched_token.nil?
>>          return matched_token, value
>>      end
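(For context, here's the whole method as it would read with the patch
applied. This is reconstructed around the hunk, so the lines outside
the diff are assumptions and may not match the 0.25 source exactly:

    def find_regex_token
        matched_token = nil
        value = ""
        length = 0

        TOKENS.regex_tokens.each do |token|
            # StringScanner#match? returns the match length (or nil)
            # without consuming any input.
            next unless match_length = @scanner.match?(token.regex)

            # Remember the longest match, but don't scan yet: scanning
            # here advances the pointer and stops later tokens from
            # matching at all, which is the bug above.
            if match_length > length
                length = match_length
                matched_token = token
            end
        end

        # Consume the input exactly once, for the winning token.
        value = @scanner.scan(matched_token.regex) unless matched_token.nil?
        return matched_token, value
    end

Behavior is unchanged except that the longest match now actually wins.)
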
> 
> You're right - the scanning should be outside the loop.

I'll file a ticket and send a patch tomorrow if someone doesn't beat
me to it.

I'm wondering what the side effects of this fix will be :-(
-- 
Brice Figureau
My Blog: http://www.masterzen.fr/

