I'm trying to use instaparse to parse Clojure code so that I can reformat
it, but I'm having an issue with how to handle special forms. Should I
attempt to parse special forms such as let and defn into their own rules,
or should I rely instead on the content of the terminals to determine
which lists should be treated as special forms?
For example, let's say I want to write a function which takes the parse
tree returned by instaparse and arranges all the let bindings as
recommended by the Clojure style guide
(https://github.com/bbatsov/clojure-style-guide#source-code-layout--organization).
There are two approaches I could take:
1) Build the recognition into the grammar itself:
> S = Form*
>
> <Form> = !SpecialForm List | ReaderMacro | Literal | Vector | Map |
>          SpecialForm | !SpecialForm Symbol
>
> List = '(' Form* ')'
>
> ...
>
> <SpecialForm> = defn | let | try | JavaMemberAccess | JavaConstructor
> defn = '(' "defn" Symbol String? MapMetadata? VectorDestructuring
>        Form* ')'
>
> <Destructuring> = VectorDestructuring | MapDestructuring
> VectorDestructuring = '[' (Symbol | Destructuring)* ('&' (Symbol |
>                       Destructuring))? ']'
> MapDestructuring = Map
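With this approach I'd expect each let form to come back as its own tagged
node, so the formatter could pick them out by tag alone. A rough sketch of
what I mean, assuming instaparse's default hiccup-style output. The tree
below is hand-written from the let and LetBinding rules in my full grammar
(it is not real parser output), and nodes-tagged is a hypothetical helper:

```clojure
;; Hand-written stand-in for the tree I'd expect from (let [x 1] x).
;; Assumption: hiccup-style vectors, literal tokens kept as plain strings.
(def tagged-tree
  [:S
   [:let "(" "let"
    [:LetBinding "[" [:Symbol "x"] [:Integer "1"] "]"]
    [:Symbol "x"]
    ")"]])

(defn nodes-tagged
  "Returns every node in tree whose tag equals `tag` (hypothetical helper)."
  [tag tree]
  (filter #(and (vector? %) (= tag (first %)))
          ;; Walk every node; only vectors have children (everything after the tag).
          (tree-seq vector? rest tree)))

;; The binding vector is found without inspecting any symbol text.
(nodes-tagged :LetBinding tagged-tree)
```

The downside I can see is that every special form I care about needs its
own rule, plus the negative lookahead on List and Symbol.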
2) Don't try to detect the let bindings in the grammar. Instead, search the
resulting parse tree for lists with "let" content.
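For the second approach, I imagine something like the following. It is only
a sketch: the tree is hand-written in the hiccup-style shape I'd expect from
the plain grammar for (let [x 1] (let [y 2] (+ x y))), whitespace is ignored,
and let-list?, find-let-lists and binding-pairs are hypothetical names:

```clojure
;; Hand-written stand-in for the generic parse tree (not real parser output).
(def plain-tree
  [:S
   [:List "("
    [:Symbol "let"]
    [:Vector "[" [:Symbol "x"] [:Integer "1"] "]"]
    [:List "("
     [:Symbol "let"]
     [:Vector "[" [:Symbol "y"] [:Integer "2"] "]"]
     [:List "(" [:Symbol "+"] [:Symbol "x"] [:Symbol "y"] ")"]
     ")"]
    ")"]])

(defn let-list?
  "True when node is a :List whose first form is the symbol let."
  [node]
  (and (vector? node)
       (= :List (first node))
       ;; Index 0 is the tag, index 1 is the \"(\" token, index 2 is the first form.
       (= [:Symbol "let"] (nth node 2 nil))))

(defn find-let-lists
  "Returns every let list in the tree, outermost first."
  [tree]
  (filter let-list? (tree-seq vector? rest tree)))

(defn binding-pairs
  "Splits the binding vector of a let list into (name value) pairs."
  [let-node]
  (let [[_ _ _ binding-vec] let-node
        ;; Keep only child nodes, dropping the \"[\" and \"]\" string tokens.
        forms (filter vector? (rest binding-vec))]
    (partition 2 forms)))

(map binding-pairs (find-let-lists plain-tree))
```

Here the grammar stays generic and all knowledge of let lives in the
tree-walking code.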
Which of these is a better approach? I sadly didn't take compilers in
college so I'm kind of playing this by ear; I'm sure if I had I'd have a
better idea of what the best practice is here.
Thanks!
(Full code for my project is at
https://github.com/MoyTW/clojure-toys/tree/master/formatter if needed)
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
S = Form*
<Form> = List | ReaderMacro | Literal | Vector | Map |
SpecialForm | Symbol
List = '(' Form* ')'
<ReaderMacro> = Quote | SyntaxQuote | Var | Dispatch | Comment | Metadata |
QuotedInternal (*TODO - Slash*)
Quote = "'" Form
SyntaxQuote = '`' Form
Dispatch = '#' DispatchMacro
<DispatchMacro> = Set | Var | Regex | AnonFuncLit (*TODO -
IgnoreForm*)
Set = '{' Form* '}'
Var = "'" Form
Regex = String
AnonFuncLit = '(' Form* ')'
Comment = ';' #'[^\n]*'
<Metadata> = SymbolMetadata | KeywordMetadata | StringMetadata |
MapMetadata
SymbolMetadata = "^" Symbol Form
KeywordMetadata = "^" Keyword Form
StringMetadata = "^" String Form
MapMetadata = "^" Map Form
<QuotedInternal> = Unquote | UnquoteSplice | GenSym
Unquote = '~' Form (*TODO - This should ONLY be used INSIDE a
quoted form!*)
UnquoteSplice = '~@' Form (*TODO - This should ONLY be used INSIDE
a quoted form!*)
GenSym = Symbol '#' (*TODO - This should ONLY be used INSIDE a
quoted form!*)
Symbol = Division | Custom
<Division> = '/'
<Custom> =
#'[a-zA-Z\*\+\!\-\_\?\=<>%&][a-zA-Z0-9\*\+\!\-\_\?\=\.<>%&]*/?[a-zA-Z0-9\*\+\!\-\_\?\=\.<>%&]*'
<Literal> = String | Number | Character | Boolean | Keyword | NilLiteral
String = '"' #'(\\\"|[^"])*' '"' (*Matches \\\" or any char not \"*)
<Number> = Integer | Float | Ratio (* TODO - add in support for hex/oct
forms*)
Integer = #'[+-]?[0-9]+r?[0-9]*' (*The r is so you can do 8r52 - 8
radix 52*)
Float = #'[+-]?([0-9]*\.[0-9]+|[0-9]+\.[0-9]*)' | (*Decimal form*)
#'[+-]?[0-9]+\.?[0-9]*e[+-]?[0-9]+' (*Exponent form*)
Ratio = #'[+-]?[0-9]+/[0-9]+'
Character = #'\\.' | '\\newline' | '\\space' | '\\tab' | '\\formfeed' |
'\\backspace' | '\\return'
(* TODO - add in support for unicode character
representations!*)
Boolean = 'true' | 'false'
Keyword = #'::?[a-zA-Z0-9\*\+\!\-\_\?]*'
NilLiteral = 'nil'
Vector = '[' Form* ']'
Map = '{' (Form Form)* '}'
S = Form*
<Destructuring> = VectorDestructuring | MapDestructuring
VectorDestructuring = '[' (Symbol | Destructuring)* ('&' (Symbol |
Destructuring))? ']'
MapDestructuring = Map
<Form> = !SpecialForm List | ReaderMacro | Literal | Vector | Map |
SpecialForm | !SpecialForm Symbol
List = '(' Form* ')'
<ReaderMacro> = Quote | SyntaxQuote | Var | Dispatch | Comment | Metadata |
QuotedInternal (*TODO - Slash*)
Quote = "'" Form
SyntaxQuote = '`' Form
Dispatch = '#' DispatchMacro
<DispatchMacro> = Set | Var | Regex | AnonFuncLit (*TODO -
IgnoreForm*)
Set = '{' Form* '}'
Var = "'" Form
Regex = String
AnonFuncLit = '(' Form* ')'
Comment = ';' #'[^\n]*'
<Metadata> = SymbolMetadata | KeywordMetadata | StringMetadata |
MapMetadata
SymbolMetadata = "^" Symbol Form
KeywordMetadata = "^" Keyword Form
StringMetadata = "^" String Form
MapMetadata = "^" Map Form
<QuotedInternal> = Unquote | UnquoteSplice | GenSym
Unquote = '~' Form (*TODO - This should ONLY be used INSIDE a
quoted form!*)
UnquoteSplice = '~@' Form (*TODO - This should ONLY be used INSIDE
a quoted form!*)
GenSym = Symbol '#' (*TODO - This should ONLY be used INSIDE a
quoted form!*)
Symbol = Division | Custom
<Division> = '/'
<Custom> =
#'[a-zA-Z\*\+\!\-\_\?\=<>%&][a-zA-Z0-9\*\+\!\-\_\?\=\.<>%&]*/?[a-zA-Z0-9\*\+\!\-\_\?\=\.<>%&]*'
(*Symbol = Slash | Name ('/' Name)?
<Slash> = '/'
<Name> = NameHead NameRest* (':' NameRest+)*
<NameHead> = #'[a-zA-Z]' | '*' | '+' | '!' | '-' | '_' | '?' |
'>' | '<' | '=' | '$'
<NameRest> = NameHead | #'[0-9]' | '&' | '.'*)
<Literal> = String | Number | Character | Boolean | Keyword | NilLiteral
String = '"' #'(\\\"|[^"])*' '"' (*Matches \\\" or any char not \"*)
<Number> = Integer | Float | Ratio (* TODO - add in support for hex/oct
forms*)
Integer = #'[+-]?[0-9]+r?[0-9]*' (*The r is so you can do 8r52 - 8
radix 52*)
Float = #'[+-]?([0-9]*\.[0-9]+|[0-9]+\.[0-9]*)' | (*Decimal form*)
#'[+-]?[0-9]+\.?[0-9]*e[+-]?[0-9]+' (*Exponent form*)
Ratio = #'[+-]?[0-9]+/[0-9]+'
Character = #'\\.' | '\\newline' | '\\space' | '\\tab' | '\\formfeed' |
'\\backspace' | '\\return'
(* TODO - add in support for unicode character
representations!*)
Boolean = 'true' | 'false'
Keyword = #'::?[a-zA-Z0-9\*\+\!\-\_\?]*'
NilLiteral = 'nil'
Vector = '[' Form* ']'
Map = '{' (Form Form)* '}'
<SpecialForm> = defn | let | try | JavaMemberAccess | JavaConstructor
defn = '(' "defn" Symbol String? MapMetadata? VectorDestructuring Form*
')'
let = '(' "let" LetBinding Form* ')'
LetBinding = '[' ((Symbol | Destructuring) Form)* ']'
try = '(' "try" Form* CatchClause* FinallyClause? ')'
CatchClause = '(' "catch" Symbol Symbol Form* ')'
FinallyClause = '(' "finally" Form* ')'
JavaMemberAccess = '.' Symbol Form*
JavaConstructor = Form '.' Form*