I am currently trying to learn ANTLR from the definitive guide book.
My question concerns the XML grammar below. As an exercise to help me
understand ANTLR, I am trying to rewrite the grammar from the
XMLLexer.g example as a combined parser and lexer grammar.

When debugging under ANTLRWorks 1.3, I get a missing token exception
on GENERIC_ID within the "attribute" parser rule. I tried to solve the
problem by changing GENERIC_ID to a non-fragment lexer rule, and also
to a parser rule, but this causes the XML declaration at the beginning
to break. I can't understand why it would break the recognition of
"XML" when it comes before the attribute call.
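
For reference, the non-fragment variation is sketched below. It assumes
the only change is dropping the fragment keyword from GENERIC_ID; LETTER
and the rest of XML.g stay exactly as listed further down.

```antlr
// Sketch of the variation described above (assumption: only the
// 'fragment' keyword is removed, so GENERIC_ID becomes a regular
// lexer rule instead of a helper rule).
GENERIC_ID
    : ( LETTER | '_' | ':' )
      ( options {greedy=true;} :
        LETTER | '0'..'9' | '.' | '-' | '_' | ':' )*
    ;

// Unchanged helper rule, repeated so the snippet reads on its own.
fragment LETTER
    : 'a'..'z'
    | 'A'..'Z'
    ;
```

With this version the exception in "attribute" goes away, but the
xmldecl rule, which spells out ('x'|'X') ('m'|'M') ('l'|'L'), stops
matching.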

Any help in understanding this situation would be much appreciated.


XML.g:
-----------------------------------------------------------------

grammar XML;

options {
backtrack = true;
}

document
        :       xmldecl WS? doctype
        ;

doctype
    :
        '<!DOCTYPE' WS? GENERIC_ID

        WS?
        (
            ( 'SYSTEM' WS? VALUE
            | 'PUBLIC' WS? VALUE WS? VALUE
            )
            ( WS )?
        )?
        ( INTERNAL_DTD

        )?
                '>'
        ;

INTERNAL_DTD : '[' (options {greedy=false;} : .)* ']' ;

pi :
        '<?' GENERIC_ID WS?

        ( attribute WS? )*  '?>'
        ;

xmldecl :
        '<?' ('x'|'X') ('m'|'M') ('l'|'L') WS?

        attribute  '?>'
        ;


element
    : ( start_tag
            (element
            | PCDATA

            | cdata

            | comment

            | pi
            )*
            end_tag
        | emptyelement
        )
    ;

start_tag
    : '<' WS? GENERIC_ID WS?

        ( attribute WS? )* '>'
    ;

emptyelement
    : '<' WS? GENERIC_ID WS?

        ( attribute WS? )* '/>'
    ;

attribute
    : GENERIC_ID WS? '=' WS? VALUE

    ;

end_tag
    : '</' WS? GENERIC_ID WS? '>'

    ;

comment
        :       '<!--' (options {greedy=false;} : .)* '-->'
        ;

cdata
        :       '<![CDATA[' (options {greedy=false;} : .)* ']]>'
        ;



fragment GENERIC_ID
    : ( LETTER | '_' | ':')
        ( options {greedy=true;} :
        LETTER | '0'..'9' | '.' | '-' | '_' | ':' )*
        ;

fragment LETTER
        : 'a'..'z'
        | 'A'..'Z'
        ;


 WS  :
        (   ' '
        |   '\t'
        |  ( '\n'
            |   '\r\n'
            |   '\r'
            )
        )+
    ;

fragment PCDATA : (~'<')+ ;

fragment VALUE :
        ( '\"' (~'\"')* '\"'
        | '\'' (~'\'')* '\''
        )
        ;
