[il-antlr-interest: 29365] Re: [antlr-interest] ANTLR Basic Question

Jason H King Fri, 09 Jul 2010 12:26:01 -0700

Ah, one a fellow near-clueless user can help with.
You might want to follow the example here:
http://www.antlr.org/wiki/pages/viewpage.action?pageId=1782
and implement a case-insensitive file stream.
That should make your rules more straightforward; then you have rules like
(quote each literal in whichever case your stream normalizes to)
DATA : 'DATA_' ;
LOOP : 'LOOP_' ;
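
To show the idea behind a case-insensitive stream, here is a minimal standalone sketch. It is not the ANTLR API: the class CaseInsensitiveChars and its methods are hypothetical names made up for illustration. It assumes the stream upper-cases characters only when the lexer looks ahead, while the stored text keeps its original case. With ANTLR 3 you would presumably apply the same trick in the lookahead method of your own file stream subclass, as the wiki page describes.

```java
// Standalone sketch, not the ANTLR API: class and method names are hypothetical.
final class CaseInsensitiveChars {
    static final int EOF = -1;

    private final char[] data; // original characters, case preserved
    private int p = 0;         // index of the next unread character

    CaseInsensitiveChars(String input) {
        this.data = input.toCharArray();
    }

    /** Lookahead: the i-th upcoming character (1-based), upper-cased for matching only. */
    int la(int i) {
        int index = p + i - 1;
        if (i < 1 || index >= data.length) {
            return EOF;
        }
        return Character.toUpperCase(data[index]);
    }

    /** True if the upcoming characters match the upper-case literal; consumes them on success. */
    boolean match(String upperLiteral) {
        for (int i = 0; i < upperLiteral.length(); i++) {
            if (la(i + 1) != upperLiteral.charAt(i)) {
                return false; // nothing is consumed on a mismatch
            }
        }
        p += upperLiteral.length();
        return true;
    }

    /** Original text between two positions, with the case left untouched. */
    String text(int start, int stop) {
        return new String(data, start, stop - start);
    }

    /** Position of the next unread character. */
    int index() {
        return p;
    }

    public static void main(String[] args) {
        CaseInsensitiveChars in = new CaseInsensitiveChars("DaTa_global");
        boolean matched = in.match("DATA_");
        // The literal matches regardless of case, but the consumed text is unchanged.
        System.out.println(matched + " " + in.text(0, in.index()));
    }
}
```

The point of the design is that only the lookahead is normalized, so the grammar needs a single spelling of each keyword while the token text still reports what was actually in the file.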


--- [email protected] wrote:

From: Klaus Martinschitz <[email protected]>
To: [email protected]
Subject: [antlr-interest] ANTLR Basic Question
Date: Fri, 09 Jul 2010 21:10:55 +0200

  Hi ANTLR Gurus,

A beginner's question.
I want to write a compiler for the Crystallographic Information File
format (CIF). I won't explain the syntax in detail, only the problem I
am facing.

The data starts with a token

'data_'

followed by arbitrary characters and an EOL, e.g.

data_global
.

There is also a token

'loop_';

Somewhere in my BNF I write something like

DATA
     :(('d'|'D')('a'|'A')('t'|'T')('a'|'A')'_')
     ;

LOOP
     :
     (('l'|'L')('o'|'O')('o'|'O')('p'|'P')'_')
     ;

dataBlockHeading
     :    (DATA NONBLANCKCHAR+ EOL)
     ;

dataItem
     :    (tag WHITESPACE value) | (LOOP loopHeader loopBody)
     ;

The first two definitions are tokens (lexer rules); the last two are
parser rules. My problem is the following. The file starts with

data_global

BUT the *lo* of data_g*lo*bal is matched by the LOOP token. How can
this be if the parser is in the dataBlockHeading rule? Shouldn't the
parser know that the characters *lo* belong to NONBLANCKCHAR and not to
LOOP?

I have attached the whole grammar at the end of this message.

Thanks for your help.

Regards,
Klaus












grammar CIF1_1;

options{
language=Java;
}

@lexer::header{
package at.netcrystals.cif_1_1.parser;
}

@parser::header{
package at.netcrystals.cif_1_1.parser;
}


DATA
     :(('d'|'D')('a'|'A')('t'|'T')('a'|'A')'_')
     ;

LOOP
     :
     (('l'|'L')('o'|'O')('o'|'O')('p'|'P')'_')
     ;

fragment ORDINARYCHAR
     :    '!' | '%' | '&' | '(' | ')' | '*' | '+' | ',' | '-' | '.' | '/'
     |    '0'..'9'
     |    ':' | '<' | '=' | '>' | '?' | '@'
     |    'A'..'Z'
     |    '\\' | '^' | '`'
     |    'a'..'z'
     |    '{' | '|' | '}' | '~'
     ;


NONBLANCKCHAR
     :    ORDINARYCHAR | '"' | '#' | '$' | '\'' | '_' | ';' | '[' | ']'
     ;



WHITESPACE
     :    '\t'|' '
     ;


/************************************************************************************************
     WhiteSpace and Comments
************************************************************************************************/






EOL
     :'\n'|'\r\n'
     ;






/************************************************************************************************
*
* Root
*
************************************************************************************************/

cif
     :      (dataBlock)   EOF
     ;

dataBlock
     :    (dataBlockHeading dataItems)
     ;

dataBlockHeading
     :    (DATA NONBLANCKCHAR+ EOL)
     ;


dataItems
     :    dataItem* EOL
     ;

dataItem
     :    (tag WHITESPACE value) | (LOOP loopHeader loopBody)
     ;

tag
     :    NONBLANCKCHAR+
     ;


value
     :    '.' | '?' | charString
     ;

charString
     :    singleQuotedString
     ;

singleQuotedString
     :    '\'' NONBLANCKCHAR* '\''
     ;

loopHeader
     :    ( (WHITESPACE tag)+)
     ;

loopBody
     :    value (WHITESPACE value)+
     ;





List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: 
http://www.antlr.org/mailman/options/antlr-interest/your-email-address


