I'm trying to use instaparse to parse Clojure code so that I can reformat 
it, but I'm unsure how to handle special forms. Should I parse special 
forms such as let and defn with their own grammar rules, or should I 
instead inspect the content of the terminals to determine which lists 
should be treated as special forms?

For example, let's say I want to write a function which takes the parse 
tree returned by instaparse and arranges all the let bindings as 
recommended by the Clojure style guide 
(https://github.com/bbatsov/clojure-style-guide#source-code-layout--organization).
 
There are two approaches I could take:

1) Build the recognition into the grammar itself:

    S = Form*

    <Form> = !SpecialForm List | ReaderMacro | Literal | Vector | Map |
             SpecialForm | !SpecialForm Symbol

        List = '(' Form* ')'

    ...

    <SpecialForm> = defn | let | try | JavaMemberAccess | JavaConstructor
        defn = '(' "defn" Symbol String? MapMetadata? VectorDestructuring Form* ')'

    <Destructuring> = VectorDestructuring | MapDestructuring
        VectorDestructuring = '[' (Symbol | Destructuring)* ('&' (Symbol | Destructuring))? ']'
        MapDestructuring = Map

2) Don't try to detect the let bindings in the grammar. Instead, search the 
resulting parse tree for lists with "let" content.
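
For illustration, approach 2 could be sketched as below. This is only a minimal sketch under assumptions: the tree is in instaparse's default hiccup format, it has the shape the List and Symbol rules above would produce, and sample-tree is written by hand rather than produced by the parser. The names let-form?, find-let-forms and sample-tree are hypothetical.

```clojure
;; Assumption: nodes look like [:Tag child ...], with literal tokens as
;; strings, so a let form is [:List "(" [:Symbol "let"] ... ")"].

(defn let-form?
  "True if node is a :List whose first element after the opening paren
  is the symbol let."
  [node]
  (and (vector? node)
       (= :List (first node))
       (= [:Symbol "let"] (nth node 2 nil))))

(defn find-let-forms
  "Returns every let list in the tree, in depth-first pre-order."
  [tree]
  ;; rest skips the tag keyword, so only child nodes and tokens are visited
  (filter let-form? (tree-seq vector? rest tree)))

;; Hand-written stand-in for the parse of:
;;   (let [x 1] (inc x))
;;   (println "hi")
(def sample-tree
  [:S
   [:List "("
    [:Symbol "let"]
    [:Vector "[" [:Symbol "x"] [:Integer "1"] "]"]
    [:List "(" [:Symbol "inc"] [:Symbol "x"] ")"]
    ")"]
   [:List "(" [:Symbol "println"] [:String "\"" "hi" "\""] ")"]])

(find-let-forms sample-tree)
```

With this shape, the grammar stays free of special-form rules and the decision about what counts as a let is made entirely in the tree walk.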

Which of these is a better approach? I sadly didn't take compilers in 
college so I'm kind of playing this by ear; I'm sure if I had I'd have a 
better idea of what the best practice is here.

Thanks!

(Full code for my project is at 
https://github.com/MoyTW/clojure-toys/tree/master/formatter if needed)

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
S = Form*

<Form> = List | ReaderMacro | Literal | Vector | Map | 
         SpecialForm | Symbol
    
    List = '(' Form* ')'
    
    <ReaderMacro> = Quote | SyntaxQuote | Var | Dispatch | Comment | Metadata | 
                    QuotedInternal (*TODO - Slash*)
        Quote = "'" Form
        SyntaxQuote = '`' Form
        Dispatch = '#' DispatchMacro
            <DispatchMacro> = Set | Var | Regex | AnonFuncLit (*TODO - IgnoreForm*)
                Set = '{' Form* '}'
                Var = "'" Form
                Regex = String
                AnonFuncLit = '(' Form* ')'
        Comment = ';' #'[^\n]*'
        <Metadata> = SymbolMetadata | KeywordMetadata | StringMetadata | MapMetadata
            SymbolMetadata = "^" Symbol Form
            KeywordMetadata = "^" Keyword Form
            StringMetadata = "^" String Form
            MapMetadata = "^" Map Form
        <QuotedInternal> = Unquote | UnquoteSplice | GenSym
            Unquote = '~' Form (*TODO - This should ONLY be used INSIDE a quoted form!*)
            UnquoteSplice = '~@' Form (*TODO - This should ONLY be used INSIDE a quoted form!*)
            GenSym = Symbol '#' (*TODO - This should ONLY be used INSIDE a quoted form!*)

    Symbol = Division | Custom
        <Division> = '/'
        <Custom> = #'[a-zA-Z\*\+\!\-\_\?\=<>%&][a-zA-Z0-9\*\+\!\-\_\?\=\.<>%&]*/?[a-zA-Z0-9\*\+\!\-\_\?\=\.<>%&]*'

    <Literal> = String | Number | Character | Boolean | Keyword | NilLiteral
        String = '"' #'(\\\"|[^"])*' '"' (*Matches \\\" or any char not \"*)
        <Number> = Integer | Float | Ratio (* TODO - add in support for hex/oct forms*)
            Integer = #'[+-]?[0-9]+r?[0-9]*' (*The r is so you can do 8r52 - 8 radix 52*)
            Float = #'[+-]?([0-9]*\.[0-9]+|[0-9]+\.[0-9]*)' | (*Decimal form*)
                    #'[+-]?[0-9]+\.?[0-9]*e[+-]?[0-9]+' (*Exponent form*)
            Ratio = #'[+-]?[0-9]+/[0-9]+'
        Character = #'\\.' | '\\newline' | '\\space' | '\\tab' | '\\formfeed' |
                    '\\backspace' | '\\return'
                    (* TODO - add in support for unicode character representations!*)
        Boolean = 'true' | 'false'
        Keyword = #'::?[a-zA-Z0-9\*\+\!\-\_\?]*'
        NilLiteral = 'nil'
        
    Vector = '[' Form* ']'
    Map = '{' (Form Form)* '}'
S = Form*

<Destructuring> = VectorDestructuring | MapDestructuring
    VectorDestructuring = '[' (Symbol | Destructuring)* ('&' (Symbol | Destructuring))? ']'
    MapDestructuring = Map

<Form> = !SpecialForm List | ReaderMacro | Literal | Vector | Map | 
         SpecialForm | !SpecialForm Symbol
    
    List = '(' Form* ')'
    
    <ReaderMacro> = Quote | SyntaxQuote | Var | Dispatch | Comment | Metadata | 
                    QuotedInternal (*TODO - Slash*)
        Quote = "'" Form
        SyntaxQuote = '`' Form
        Dispatch = '#' DispatchMacro
            <DispatchMacro> = Set | Var | Regex | AnonFuncLit (*TODO - IgnoreForm*)
                Set = '{' Form* '}'
                Var = "'" Form
                Regex = String
                AnonFuncLit = '(' Form* ')'
        Comment = ';' #'[^\n]*'
        <Metadata> = SymbolMetadata | KeywordMetadata | StringMetadata | MapMetadata
            SymbolMetadata = "^" Symbol Form
            KeywordMetadata = "^" Keyword Form
            StringMetadata = "^" String Form
            MapMetadata = "^" Map Form
        <QuotedInternal> = Unquote | UnquoteSplice | GenSym
            Unquote = '~' Form (*TODO - This should ONLY be used INSIDE a quoted form!*)
            UnquoteSplice = '~@' Form (*TODO - This should ONLY be used INSIDE a quoted form!*)
            GenSym = Symbol '#' (*TODO - This should ONLY be used INSIDE a quoted form!*)

    Symbol = Division | Custom
        <Division> = '/'
        <Custom> = #'[a-zA-Z\*\+\!\-\_\?\=<>%&][a-zA-Z0-9\*\+\!\-\_\?\=\.<>%&]*/?[a-zA-Z0-9\*\+\!\-\_\?\=\.<>%&]*'
    
    (*Symbol = Slash | Name ('/' Name)?
        <Slash> = '/'
        <Name> = NameHead NameRest* (':' NameRest+)*
            <NameHead> = #'[a-zA-Z]' | '*' | '+' | '!' | '-' | '_' | '?' | 
                         '>' | '<' | '=' | '$'
            <NameRest> = NameHead | #'[0-9]' | '&' | '.'*)

    <Literal> = String | Number | Character | Boolean | Keyword | NilLiteral
        String = '"' #'(\\\"|[^"])*' '"' (*Matches \\\" or any char not \"*)
        <Number> = Integer | Float | Ratio (* TODO - add in support for hex/oct forms*)
            Integer = #'[+-]?[0-9]+r?[0-9]*' (*The r is so you can do 8r52 - 8 radix 52*)
            Float = #'[+-]?([0-9]*\.[0-9]+|[0-9]+\.[0-9]*)' | (*Decimal form*)
                    #'[+-]?[0-9]+\.?[0-9]*e[+-]?[0-9]+' (*Exponent form*)
            Ratio = #'[+-]?[0-9]+/[0-9]+'
        Character = #'\\.' | '\\newline' | '\\space' | '\\tab' | '\\formfeed' |
                    '\\backspace' | '\\return'
                    (* TODO - add in support for unicode character representations!*)
        Boolean = 'true' | 'false'
        Keyword = #'::?[a-zA-Z0-9\*\+\!\-\_\?]*'
        NilLiteral = 'nil'
        
    Vector = '[' Form* ']'
    Map = '{' (Form Form)* '}'
    
    <SpecialForm> = defn | let | try | JavaMemberAccess | JavaConstructor
        defn = '(' "defn" Symbol String? MapMetadata? VectorDestructuring Form* ')'
        let = '(' "let" LetBinding Form* ')'
            LetBinding = '[' ((Symbol | Destructuring) Form)* ']'
        try = '(' "try" Form* CatchClause* FinallyClause? ')'
            CatchClause = '(' "catch" Symbol Symbol Form* ')'
            FinallyClause = '(' "finally" Form* ')'
        JavaMemberAccess = '.' Symbol Form*
        JavaConstructor = Form '.' Form*
