[il-antlr-interest: 34428] [antlr-interest] Lexer grammar for filtering

2011-10-17 Thread Balazs Varnai
Hi All,

I have a simple grammar to collapse whitespace and comments from C source
code input. I would also like to filter out some variables with a specific
name. These have a strict format, so no "real" C parsing is needed.
It works fine, except that a line such as "#define PC_HASH_VALUE 1" is not
recognized. As far as I remember from previous ANTLR usage, this worked
straight away. Any suggestions? Thanks!

/*  [ CODE ]  */
lexer grammar Collapse;

options {
  language = Java;
  filter = true;
}
@header {
package rewriter;
import java.util.*;
import java.io.*;

}

@members {
PrintStream out;

public Collapse(CharStream input, PrintStream out) {
this(input);
this.out = out;
}
}

PC: 'PC_HASH_VALUE' text=DIGIT {$channel=HIDDEN;};

fragment
DIGIT: '0'..'9';

COMMENT
:   '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
|   '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
;

WS  :   ( ' '
| '\t'
| '\r'
| '\n'
) {$channel=HIDDEN;}
;

ELSE : c=. {out.print((char)$c);} ; // match any char and emit
/*  [ END ]  */

List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: 
http://www.antlr.org/mailman/options/antlr-interest/your-email-address

-- 
You received this message because you are subscribed to the Google Groups 
"il-antlr-interest" group.
To post to this group, send email to il-antlr-inter...@googlegroups.com.
To unsubscribe from this group, send email to 
il-antlr-interest+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/il-antlr-interest?hl=en.



[il-antlr-interest: 34429] Re: [antlr-interest] Lexer grammar for filtering

2011-10-17 Thread Bart Kiers
Hi Balazs,

Since PC is not a parser rule, you need to account for the space(s) between
'PC_HASH_VALUE' and DIGIT.
And since you've set `filter=true`, you don't need a fall-through rule ELSE,
AFAIK.

Regards,

Bart.
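
To illustrate the point about spaces outside of ANTLR, here is a minimal Java sketch that uses java.util.regex as a rough analogue of the two lexer rules. The class name PcRuleSketch and both patterns are made up for this illustration and are not part of the posted grammar. A lexer rule, like a regex, matches characters literally, so nothing skips the blank between 'PC_HASH_VALUE' and the digit unless the rule says so.

```java
import java.util.regex.Pattern;

class PcRuleSketch {
    // Rough analogue of: PC : 'PC_HASH_VALUE' DIGIT ;  (no whitespace allowed)
    static final Pattern STRICT = Pattern.compile("PC_HASH_VALUE[0-9]");

    // Rough analogue of a rule that allows blanks before the digit,
    // e.g. PC : 'PC_HASH_VALUE' (' ' | '\t')* DIGIT ;  (assumed form)
    static final Pattern WITH_WS = Pattern.compile("PC_HASH_VALUE[ \\t]*[0-9]");

    // True if the pattern occurs anywhere in the line.
    static boolean found(Pattern p, String line) {
        return p.matcher(line).find();
    }

    public static void main(String[] args) {
        String line = "#define PC_HASH_VALUE 1";
        System.out.println(found(STRICT, line));   // prints false
        System.out.println(found(WITH_WS, line));  // prints true
    }
}
```

In a parser rule the hidden WS tokens would be skipped between tokens automatically; inside a single lexer rule there is no such skipping, which is why the strict form misses the line.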






[il-antlr-interest: 34430] Re: [antlr-interest] Lexer grammar for filtering

2011-10-17 Thread Balazs Varnai
Hi Bart,

Thanks! Adding a WS* solved the problem.
I use the ELSE rule because this is a kind of reverse filter: I specify what
to exclude, and everything else is copied from input to output.

Regards,
Balazs
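
The "reverse filter" idea can be sketched in plain Java, without ANTLR, roughly as follows. This is only an approximation under stated assumptions: the class ReverseFilterSketch and its EXCLUDE pattern are hypothetical, and regex alternation is not the same as ANTLR's lexer prediction. Each alternative mirrors one hidden rule from the grammar (PC with optional blanks, the two COMMENT forms, WS); any character that no alternative matches is copied through, which is the job of the ELSE rule.

```java
import java.util.regex.Pattern;

class ReverseFilterSketch {
    // Everything matched here is dropped; everything else is echoed.
    static final Pattern EXCLUDE = Pattern.compile(
          "PC_HASH_VALUE[ \\t]*[0-9]"   // PC entries, blanks allowed before the digit
        + "|//[^\\r\\n]*\\r?\\n"        // line comments including the line ending
        + "|/\\*.*?\\*/"                // block comments, non-greedy
        + "|[ \\t\\r\\n]",              // single whitespace characters
        Pattern.DOTALL);                // let '.' cross line breaks in block comments

    // Returns the input with all excluded spans removed.
    static String filter(String input) {
        return EXCLUDE.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String src = "int a; // c\n#define PC_HASH_VALUE 1\n/* x */int b;";
        System.out.println(filter(src)); // prints inta;#defineintb;
    }
}
```

As in the posted grammar, whitespace is removed entirely rather than collapsed to a single blank, so adjacent words run together in the output.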





[il-antlr-interest: 34433] Re: [antlr-interest] Rewrite action causing error in parser?

2011-10-17 Thread Jim Idle
x
@init {Boolean isPresent = false;}
  : (A { isPresent = true; })? B
    -> {isPresent}? ^(A B)
    ->              ^(IMAGINE B)
  ;

Jim
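
For readers who want the control flow spelled out, the rule above can be mimicked by a small hand-written Java sketch. This is not ANTLR-generated code: the class OptionalTokenSketch, the string token names, and the parenthesised tree notation are all assumptions made for illustration. The point is that a plain flag set in an action records whether the optional token was seen, and the rewrite then chooses between the real root and the imaginary one, so no token label is needed inside a predicate.

```java
import java.util.Arrays;
import java.util.List;

class OptionalTokenSketch {
    // Hypothetical stand-in for rule x: (A)? B, rewritten to (A B) or (IMAGINE B).
    static String rewrite(List<String> tokens) {
        int pos = 0;
        boolean isPresent = false;

        // Optional A: remember that it was seen, as the action in the rule does.
        if (pos < tokens.size() && tokens.get(pos).equals("A")) {
            isPresent = true;
            pos++;
        }

        // Mandatory B.
        if (pos >= tokens.size() || !tokens.get(pos).equals("B")) {
            throw new IllegalArgumentException("expected B at position " + pos);
        }

        // Choose the tree root based on the flag, like the predicated rewrites.
        return isPresent ? "(A B)" : "(IMAGINE B)";
    }

    public static void main(String[] args) {
        System.out.println(rewrite(Arrays.asList("A", "B"))); // prints (A B)
        System.out.println(rewrite(Arrays.asList("B")));      // prints (IMAGINE B)
    }
}
```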

> -----Original Message-----
> From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-boun...@antlr.org] On Behalf Of Ross Bamford
> Sent: Sunday, October 16, 2011 4:41 PM
> To: Maximilien Colange
> Cc: antlr-interest@antlr.org
> Subject: Re: [antlr-interest] Rewrite action causing error in parser?
>
> Further investigation confirms that this does seem to crop up quite a
> bit, which suggests I'm definitely doing something wrong. I'm just
> hoping that someone might be able to suggest a different way to do what
> I need to do (i.e. insert an imaginary token in place of the optional
> one if it's not specified)?
>
> Thanks,
> Ross
>
>
> On Sun, Oct 16, 2011 at 11:53 PM, Maximilien Colange wrote:
>
> > It appears that this "bug" is frequently reported.
> > It would be nice if ANTLR raised an error (or a warning) when a token
> > is given a reference in a syntactic predicate.
> >
> > However, I do not know whether it is easy to detect. I have already
> > encountered this problem, and it occurred in a "hidden" ANTLR-generated
> > syntactic predicate. I am afraid the error is difficult to detect in
> > such cases.
> >
> > And just out of curiosity, why is it not possible to reference local
> > variables or to assign from a token in a syntactic predicate?
> >
> > --
> > Maximilien
> >
> > Le 10/15/11 11:34 PM, Jim Idle a écrit :
> > > Your problem does not look to be the rewrite rule, but the fact that
> > > you are referencing a local variable in a predicate, or have tried
> > > to assign from a token in a predicate.
> > >
> > > Look for something like this
> > >
> > > ((id=IDENTIFIER)=>  id=IDENTIFIER)? 
> > >
> > >
> > > But regardless, this is the rewrite rule that is the problem as far
> > > as I can see. Try commenting it out for instance.
> > >
> > > Jim
> > >
> > >> -----Original Message-----
> > >> From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-boun...@antlr.org] On Behalf Of Ross Bamford
> > >> Sent: Saturday, October 15, 2011 5:40 AM
> > >> To: antlr-interest@antlr.org
> > >> Subject: [antlr-interest] Rewrite action causing error in parser?
> > >>
> > >> Hi all,
> > >>
> > >> I have a grammar I'm currently working on (posted in another thread
> > >> the other day), which has the following rule:
> > >>
> > >> meth_call_expr
> > >>     :   (id = IDENTIFIER DOT)? func_call_expr
> > >>         -> ^(METHOD_CALL
> > >>              { ($id==null) ? adaptor.create(SELF, "SELF")
> > >>                            : adaptor.create(IDENTIFIER, $id.getText()) }
> > >>              func_call_expr)
> > >>     ;
> > >>
> > >> As you can see, I'm using an action in the rewrite rule to insert
> > >> either the (optional) IDENTIFIER, or an imaginary SELF node if
> > >> IDENTIFIER is not specified. The problem I'm having is that this
> > >> generates a parser that won't compile. Specifically, it generates the
> > >> following bit of code (edited by hand for brevity and to highlight the
> > >> error):
> > >>
> > >> /*  [ CODE ]  */
> > >>  // $ANTLR start synpred6_BasicLang
> > >>  public final void synpred6_BasicLang_fragment() throws RecognitionException {
> > >>
> > >>      Token =null; //<-- ERROR HERE
> > >>
> > >>      /* ... later on ... */
> > >>
> > >>      switch (alt23) {
> > >>          case 1 :
> > >>              // C:\\Users\\chantelle\\workspace\\basiclang\\src\\com\\roscopeco\\basiclang\\parser\\BasicLang.g:99:8: id= IDENTIFIER DOT
> > >>              {
> > >>              id=(Token)match(input,IDENTIFIER,FOLLOW_IDENTIFIER_in_synpred6_BasicLang232); if (state.failed) return ; //<-- AND HERE
> > >>
> > >>              match(input,DOT,FOLLOW_DOT_in_synpred6_BasicLang234); if (state.failed) return ;
> > >>
> > >>              }
> > >>              break;
> > >>
> > >>      }
> > >> /*  [ END ]  */
> > >>
> > >> Obviously the problem is the "Token =null" line, which should be
> > >> "Token id = null". Changing it by hand fixes the errors and makes
> > >> the parser work as expected.
> > >>
> > >> So I have two questions - is this the right way to go about
> > >> inserting an imaginary token if an optional token isn't in the
> > >> input, and if so, what am I doing wrong to cause the error above?
> > >>
> > >> Thanks in advance,
> > >> Ross
> > >>
>
> List: http://www.antlr.org/mailman