On 22/10/08 20:26, Brice Figureau wrote:
> Hi,
> 
> On 21/10/08 17:13, Luke Kanies wrote:
>> On Oct 21, 2008, at 3:35 AM, Brice Figureau wrote:
>>
>>> I'm volunteering for this. I'll assign the ticket to myself, if nobody
>>> else wants it.
>>> [snipped]
>>>
>>> I'll keep the list posted with my progress and findings.
>>
>> Awesome.
> 
> I've started working on this. I'm following Luke's detailed advice in 
> the ticket (ie accumulate comment content, attach it on ast nodes when 
> they're found...).
> 
> That's the easy part...

That's what I thought, until I remembered how an LR parser works :-(

Let's imagine we have the following pseudo-manifest:

1 # comment
2 # comment
3 class test {
4   # comment
5   resource {}
6   # comment
7   $a = 10
8 }

Now, while lexing this, I'm accumulating the comments (let's say the 
first two comments). The parser then shifts the token 'class', then 
'test', then '{', and then a new comment is lexed.
At this stage no AST node has been created yet, so I can't attach my 
comments.

We lex and parse the various statements, and when the parser shifts the 
last RBRACE, it can reduce the whole parser stack to a class. It is only 
at this moment that I can attach the accumulated comments.

Unfortunately, I don't see a simple way to avoid also accumulating the 
inner comments of lines 4 and 6, or to avoid losing the comments of 
lines 1 and 2 in the process.

I need a comment stack whose level changes each time the parser 
"enters" a new statement (ie this stack should "follow" the parser 
stack). Comments accumulate at the current level, and when the parser 
reduces the statement, I pop the stack and use the result as this 
statement's comment.
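
A minimal Ruby sketch of such a comment stack, under one possible 
reading of the idea. The class name CommentStack and its methods are 
hypothetical, nothing here exists in Puppet, and it assumes something 
calls push when a statement body opens and pop when the statement is 
reduced:

```ruby
# Hypothetical helper (not part of Puppet): a stack of comment
# accumulators meant to follow the nesting of the parsed statements.
class CommentStack
  def initialize
    @levels = [[]] # the bottom level collects top-of-file comments
  end

  # Called by the lexer for every comment it sees.
  def accumulate(text)
    @levels.last << text
  end

  # Called when we "enter" a statement body (assumed: on LBRACE).
  def push
    @levels.push([])
  end

  # Called when the statement is reduced: drop the inner level, then
  # return (and clear) the comments seen just before the statement.
  def pop
    @levels.pop if @levels.size > 1
    doc = @levels.last.join("\n")
    @levels.last.clear
    doc
  end
end

# Replaying the pseudo-manifest above by hand:
stack = CommentStack.new
stack.accumulate("line 1")
stack.accumulate("line 2")
stack.push                 # '{' of class test
stack.accumulate("line 4")
stack.push                 # '{' of resource
resource_doc = stack.pop   # resource reduced: "line 4"
stack.accumulate("line 6")
class_doc = stack.pop      # class reduced: "line 1\nline 2"
```

In this replay the comment of line 6 is silently dropped, because the 
assignment on line 7 never opens a brace. That is exactly the kind of 
mismatch between lexed tokens and parser reductions that makes the 
approach fragile.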

The hard part is knowing when we "enter" a new statement. I can hardcode 
this knowledge in the lexer (ie when we lex a LBRACE), but this doesn't 
feel right. Or maybe I can push a new accumulator on the stack each time 
we lex something that is not a comment, but since there is no one-to-one 
mapping between parser reductions and lexed tokens, I'm not sure it'll 
work (the two will drift out of step for sure).

Another alternative would be to teach the parser about comment tokens, 
but I fear this would be a deep modification (comments can be placed 
just about anywhere).

Of course, if only class, define or node comments matter, that's 
easier: I can just change the stack level each time we lex a 
class/define/node token. But that's not really generic.
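
The restricted variant could look roughly like the sketch below. The 
token stream is a made-up array of [type, value] pairs standing in for 
the real lexer output, and docs_for_keywords is a hypothetical name. It 
only shows the capture step; a real implementation would still have to 
hold the captured block until the statement is reduced:

```ruby
# Hypothetical sketch: only class/define/node statements get a comment.
DOC_KEYWORDS = [:CLASS, :DEFINE, :NODE].freeze

# tokens: array of [type, value] pairs (an assumed lexer output format).
# Returns [keyword, comment block] pairs in the order they were met.
def docs_for_keywords(tokens)
  pending = [] # comments seen since the last non-comment token
  docs = []
  tokens.each do |type, value|
    if type == :COMMENT
      pending << value
    else
      docs << [value, pending.join("\n")] if DOC_KEYWORDS.include?(type)
      pending = [] # any non-comment token ends the current block
    end
  end
  docs
end
```

Fed the pseudo-manifest above, this keeps the comments of lines 1 and 2 
for the class and discards those of lines 4 and 6, since neither is 
followed by one of the three keywords.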

Oh, I wish racc allowed mid-rule actions (like Bison does). I could 
then grab the comments as soon as the parser shifts the class token (or 
any other token that starts a statement), and associate those comments 
with the statement as soon as it is reduced. But racc seems quite 
primitive in this area :-(

Any good ideas or advice on how to solve this?
Or did I miss something?
-- 
Brice Figureau
Days of Wonder
http://www.daysofwonder.com
