[il-antlr-interest: 34428] [antlr-interest] Lexer grammar for filtering
Hi All,

I have a simple grammar to collapse whitespace and comments from C source code input. I would also like to filter out some variables with a specific name. These have a strict format, so no "real" C parsing is needed.
It works fine, but for example the line "#define PC_HASH_VALUE 1" is not recognized. As far as I remember from previous ANTLR usage, this worked straight away. Any suggestions? Thanks!

/* [ CODE ] */
lexer grammar Collapse;

options {
  language = Java;
  filter = true;
}

@header {
package rewriter;
import java.util.*;
import java.io.*;
}

@members {
PrintStream out;

public Collapse(CharStream input, PrintStream out) {
  this(input);
  this.out = out;
}
}

PC: 'PC_HASH_VALUE' text=DIGIT {$channel=HIDDEN;};

fragment
DIGIT: '0'..'9';

COMMENT
  : '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
  | '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
  ;

WS : ( ' '
  | '\t'
  | '\r'
  | '\n'
  ) {$channel=HIDDEN;}
  ;

ELSE : c=. {out.print((char)$c);} ; // match any char and emit
/* [ END ] */

List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address

--
You received this message because you are subscribed to the Google Groups "il-antlr-interest" group.
To post to this group, send email to il-antlr-inter...@googlegroups.com.
To unsubscribe from this group, send email to il-antlr-interest+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/il-antlr-interest?hl=en.
[il-antlr-interest: 34429] Re: [antlr-interest] Lexer grammar for filtering
Hi Balazs,

Since PC is not a parser rule, you need to account for the space(s) between 'PC_HASH_VALUE' and DIGIT.
And since you've set `filter=true`, you don't need a fall-through rule ELSE, AFAIK.

Regards,

Bart.

On Mon, Oct 17, 2011 at 11:15 AM, Balazs Varnai wrote:
> Works fine but for example a line "#define PC_HASH_VALUE 1" is not
> recognized. As far I remember from previous ANTLR usage, this was working
> straight away. Any suggestions? Thanks!
[il-antlr-interest: 34430] Re: [antlr-interest] Lexer grammar for filtering
Hi Bart,

Thanks! Adding a WS* solved the problem.
I use the ELSE rule because this is a kind of reverse filter: I specify what to exclude on the way from input to output.

Regards,
Balazs

On Mon, Oct 17, 2011 at 11:20 AM, Bart Kiers wrote:
> Since PC is not a parser rule, you need to account for the space(s) between
> 'PC_HASH_VALUE' and DIGIT.
> And since you've set `filter=true`, you don't need a fall-through rule
> ELSE, AFAIK.
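For reference, the change described in this thread might look like the sketch below. It assumes ANTLR 3 syntax as in the original grammar and touches only the PC rule; dropping the unused `text=` label is an editorial simplification, not something stated in the thread.

```antlr
// Lexer rules match raw characters, so nothing skips the blank between
// the literal and the digit; the rule has to consume it explicitly.
PC
  : 'PC_HASH_VALUE' WS* DIGIT {$channel=HIDDEN;}
  ;
```

With this rule, the run "PC_HASH_VALUE 1" in "#define PC_HASH_VALUE 1" should be matched by PC and hidden, while the characters of "#define" still fall through to ELSE and are printed. As far as I know, a lexer in filter mode silently skips characters that no rule matches, which is why a catch-all like ELSE is what lets the remaining text reach `out`.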
[il-antlr-interest: 34433] Re: [antlr-interest] Rewrite action causing error in parser?
x
@init {Boolean isPresent = false;}
  : (A { isPresent = true; })? B
    -> {isPresent}? ^(A B)
    -> ^(IMAGINE B)
  ;

Jim

> -----Original Message-----
> From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-boun...@antlr.org] On Behalf Of Ross Bamford
> Sent: Sunday, October 16, 2011 4:41 PM
> To: Maximilien Colange
> Cc: antlr-interest@antlr.org
> Subject: Re: [antlr-interest] Rewrite action causing error in parser?
>
> Further investigation confirms that this does seem to crop up quite a
> bit, which suggests I'm definitely doing something wrong. I'm just
> hoping that someone might be able to suggest a different way to do what
> I need to do (i.e. insert an imaginary token in place of the optional
> one if it's not specified)?
>
> Thanks,
> Ross
>
> On Sun, Oct 16, 2011 at 11:53 PM, Maximilien Colange wrote:
>
> > It appears that this "bug" is frequently reported.
> > It would be nice if ANTLR raised an error (or a warning) when a token
> > is given a reference in a syntactic predicate.
> >
> > However, I do not know whether it is easy to detect. I already
> > encountered this problem, and it occurred in a "hidden" ANTLR-generated
> > syntactic predicate. I am afraid the error is difficult to detect in
> > such cases.
> >
> > And just for curiosity, why is it not possible to reference local
> > variables or to assign from a token in a syntactic predicate?
> >
> > --
> > Maximilien
> >
> > On 10/15/11 11:34 PM, Jim Idle wrote:
> > > Your problem does not look to be the rewrite rule, but the fact that
> > > you are referencing a local variable in a predicate, or have tried
> > > to assign from a token in a predicate.
> > >
> > > Look for something like this
> > >
> > > ((id=IDENTIFIER)=> id=IDENTIFIER)?
> > >
> > > But regardless, this is the rewrite rule that is the problem as far
> > > as I can see. Try commenting it out for instance.
> > >
> > > Jim
> > >
> > >> -----Original Message-----
> > >> From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-boun...@antlr.org] On Behalf Of Ross Bamford
> > >> Sent: Saturday, October 15, 2011 5:40 AM
> > >> To: antlr-interest@antlr.org
> > >> Subject: [antlr-interest] Rewrite action causing error in parser?
> > >>
> > >> Hi all,
> > >>
> > >> I have a grammar I'm currently working on (posted in another thread
> > >> the other day), which has the following rule:
> > >>
> > >> meth_call_expr
> > >>   : (id = IDENTIFIER DOT)? func_call_expr
> > >>     -> ^(METHOD_CALL { ($id==null) ? adaptor.create(SELF, "SELF") : adaptor.create(IDENTIFIER, $id.getText()) } func_call_expr)
> > >>   ;
> > >>
> > >> As you can see, I'm using an action in the rewrite rule to insert
> > >> either the (optional) IDENTIFIER, or an imaginary SELF node if
> > >> IDENTIFIER is not specified. The problem I'm having is that this
> > >> generates a parser that won't compile. Specifically, it generates the
> > >> following bit of code (edited by hand for brevity and to highlight
> > >> the error):
> > >>
> > >> /* [ CODE ] */
> > >> // $ANTLR start synpred6_BasicLang
> > >> public final void synpred6_BasicLang_fragment() throws RecognitionException {
> > >>
> > >>     Token =null; //<-- ERROR HERE
> > >>
> > >>     /* ... later on ... */
> > >>
> > >>     switch (alt23) {
> > >>         case 1 :
> > >>             // C:\\Users\\chantelle\\workspace\\basiclang\\src\\com\\roscopeco\\basiclang\\parser\\BasicLang.g:99:8: id= IDENTIFIER DOT
> > >>             {
> > >>             id=(Token)match(input,IDENTIFIER,FOLLOW_IDENTIFIER_in_synpred6_BasicLang232);
> > >>             if (state.failed) return ; //<-- AND HERE
> > >>
> > >>             match(input,DOT,FOLLOW_DOT_in_synpred6_BasicLang234);
> > >>             if (state.failed) return ;
> > >>
> > >>             }
> > >>             break;
> > >>
> > >>     }
> > >> /* [ END ] */
> > >>
> > >> Obviously the problem is the "Token =null" line, which should be
> > >> "Token id = null". Changing it by hand fixes the errors and makes
> > >> the parser work as expected.
> > >>
> > >> So I have two questions - is this the right way to go about
> > >> inserting an imaginary token if an optional token isn't in the
> > >> input, and if so, what am I doing wrong to cause the error above?
> > >>
> > >> Thanks in advance,
> > >> Ross
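Applied to the meth_call_expr rule from the original question, the predicate-gated rewrite that Jim posts at the top of this message might look like the sketch below. It assumes ANTLR 3 with output=AST and that METHOD_CALL and SELF are declared as imaginary tokens in the grammar, as the action in Ross's rule suggests; the flag name hasTarget is made up for illustration.

```antlr
meth_call_expr
@init { boolean hasTarget = false; }  // hypothetical flag, not in the original grammar
  : (IDENTIFIER DOT { hasTarget = true; })? func_call_expr
    // explicit target present: keep the matched IDENTIFIER in the tree
    -> {hasTarget}? ^(METHOD_CALL IDENTIFIER func_call_expr)
    // no target: insert an imaginary SELF node instead
    ->              ^(METHOD_CALL SELF func_call_expr)
  ;
```

Since the rule no longer labels the optional token, there is no `id` reference to be copied into a generated syntactic predicate, which should sidestep the "Token =null" declaration shown in the quoted code. The choice between the two trees is made by a semantic predicate on the rewrite itself rather than by an action inside it.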