On Tue, 2009-09-22 at 18:07 -0700, Markus Roberts wrote:
> So I like the idea but I'm not thrilled about some of the details (as
> will always be the case with a hack of this sort I suppose).
>
> + attr_accessor :line, :indefine, :no_regex
>
> I'd call this "regex_permitted" or "may_be_followed_by_regex"; ditto
> with :no_regex_after
Sure, I don't mind.
>
> def add_tokens(hash)
> hash.each do |regex, name|
> - add_token(name, regex)
> + options = {}
> + if name.is_a?(Hash)
> + options = name
> + name = options[:name]
> + options.delete(:name)
> + end
> + add_token(name, regex, options)
> end
> end
>
> I'd be inclined to write this:
>
> def add_tokens(hash)
> hash.each do |regex, options|
> options = {:name => options} unless options.is_a? Hash
> name = options.delete(:name)
> add_token(name, regex, options)
> end
> end
>
> but that's a minor stylistic matter.
OK.
> @@ -365,6 +373,12 @@ class Puppet::Parser::Lexer
>
> @commentstack.push(comment)
> end
>
> + if token.no_regex_after
> + @no_regex = true
> + elsif not token.no_regex_after.nil?
> + @no_regex = false
> + end
> +
>
> I'm not understanding why this isn't just
>
> @no_regex = token.no_regex_after
Because we don't touch no_regex if the token is in no_regex_after dunno
mode.
Otherwise:
$var = 4096 / 23 / 4
When lexing 4096, we'd reset @no_regex to false, and we'd parse / 23 /
as a regex.
Essentially no_regex_after has 3 distinct values:
* true, means we can't lex a regex until it is said we can
* false, we now can lex a regex
* nil, leave no_regex in the mode it was
I know this is kludgy like the whole patch. I'd favor another approach
if I'd have one.
> I recognise that the behavior isn't the same and suspect that it has
> something to do with
>
> ')' => :RPAREN,
> - '=' => :EQUALS,
> + '=' => { :name => :EQUALS, :no_regex_after => true },
> '+=' => :APPENDS,
>
> but I'm not clear on the purpose. We're setting @no_regex to true
> when we see an :EQUALS and then leaving it unchanged until we see
> a :CASE, :NODE, etc. Why not take a more restrictive stance and only
> allow regular expressions directly after :CASE, :NODE, etc.?
I don't remember exactly, but I wen't the other way because there was a
case that wouldn't work easily (I tried at least 3 differents
alternatives yesterday in a small amount of time so things are now
blurred in my mind).
> I guess my inclination would have been to (as I noted on the ticket)
> make the test a property of the :REGEX token type (using a general
> "applicable" test that didn't refer to the REGEX rules), and let it
> look at the most recently generated token(s) to make the judgement
> rather than scattering the infomation on all the various token types.
> But it may be that I'm missing some subtilty of the syntax that makes
> that infeasable.
My feeling is that either way will fail. There are too much places where
a / can be mis-lexed as a regex.
The more I think about it, the more I think the parser should parse the
regex, and the lexer shouldn't do anything about those.
Maybe with an untested grammar like this:
regex: DIV regex_content DIV
regex_content: regex_content regex_string
| LPAREN regex_content RPAREN
| regex_content STAR
| regex_content QMARK
| LBRACE character_class RBRACE
...
character_class: string
Of course the lexer would have to provide some support, especially it
should lex spaces between / /...
What do you think?
--
Brice Figureau
Follow the latest Puppet Community evolutions on www.planetpuppet.org!
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups
"Puppet Developers" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/puppet-dev?hl=en
-~----------~----~----~----~------~----~------~--~---