GHC AST Annotations
Now that the landmines have hopefully been cleared from the AST via [1], I would like to propose changing the location information in the AST.

Right now the locations of syntactic markers such as do/let/where/in/of in the source are discarded from the AST, although they are retained in the rich token stream.

The haskell-src-exts package deals with this by using the SrcSpanInfo data type [2], which contains the SrcSpan as per the current GHC Located type but also has a list of SrcSpans for the syntactic markers, depending on the particular AST fragment being annotated.

In addition, the annotation type is provided as a parameter to the AST, so that it can be changed as required; see [3].

The motivation for this change is then

1. Simplify the roundtripping and modification of source by explicitly capturing the missing location information for the syntactic markers.

2. Allow the annotation to be a parameter so that it can be replaced with a different one in tools; for example, HaRe would include the tokens for the AST fragment leaves.

3. Aim for some level of compatibility with haskell-src-exts so that tools developed for it could be easily ported to GHC, for example exactprint [4].

I would like feedback as to whether this would be acceptable, or if the same goals should be achieved a different way.

Regards
Alan

[1] https://phabricator.haskell.org/D157
[2] http://hackage.haskell.org/package/haskell-src-exts-1.15.0.1/docs/Language-Haskell-Exts-SrcLoc.html#t:SrcSpanInfo
[3] http://hackage.haskell.org/package/haskell-src-exts-1.15.0.1/docs/Language-Haskell-Exts-Annotated-Syntax.html#t:Annotated
[4] http://hackage.haskell.org/package/haskell-src-exts-1.15.0.1/docs/Language-Haskell-Exts-Annotated-ExactPrint.html#v:exactPrint

___
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs
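To make the SrcSpanInfo approach in the message above concrete, here is a minimal, self-contained sketch. The SrcSpan and SrcSpanInfo types are simplified stand-ins defined locally (the two field names follow haskell-src-exts, but this is neither GHC's nor HSE's actual definition), and the toy Exp type, the example expression, and its column numbers are assumptions made purely for illustration.

```haskell
-- Simplified stand-in for a source span: one line, start column, end column
-- (end exclusive). Not GHC's real SrcSpan.
data SrcSpan = SrcSpan
  { spanLine     :: Int
  , spanStartCol :: Int
  , spanEndCol   :: Int
  } deriving (Eq, Show)

-- The enclosing span of a node plus the spans of its syntactic markers.
data SrcSpanInfo = SrcSpanInfo
  { srcInfoSpan   :: SrcSpan
  , srcInfoPoints :: [SrcSpan]
  } deriving (Eq, Show)

-- A toy AST (hypothetical) parameterised over its annotation type.
data Exp l
  = Var l String
  | Let l String (Exp l) (Exp l)   -- let x = e1 in e2
  deriving (Eq, Show)

-- Annotated form of the one-line source text:  let x = y in x
example :: Exp SrcSpanInfo
example =
  Let (SrcSpanInfo (SrcSpan 1 1 15)
                   [ SrcSpan 1 1 4       -- the keyword "let"
                   , SrcSpan 1 11 13 ])  -- the keyword "in"
      "x"
      (Var (SrcSpanInfo (SrcSpan 1 9 10) []) "y")
      (Var (SrcSpanInfo (SrcSpan 1 14 15) []) "x")

-- Recover the marker spans recorded on the root node.
keywordSpans :: Exp SrcSpanInfo -> [SrcSpan]
keywordSpans (Var l _)     = srcInfoPoints l
keywordSpans (Let l _ _ _) = srcInfoPoints l
```

In this sketch the meaning of each entry in the list depends on the constructor it is attached to, and the leaf nodes carry an empty list.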
RE: GHC AST Annotations
In general I’m fine with this direction of travel. Some specifics:

· You’d have to be careful to document, for every data constructor in HsSyn, what the association is between the [SrcSpan] in the SrcSpanInfo and the “sub-entities”.

· Many of the sub-entities will have their own SrcSpanInfo wrapped around them, so there’s some unhelpful duplication. Maybe you only want the SrcSpanInfo to list the [SrcSpan]s for the sub-entities (like the syntactic keywords) that do not show up as children in the syntax tree?

Anyway do by all means create a GHC Trac wiki page to describe your proposed design, concretely.

Simon

From: ghc-devs [mailto:ghc-devs-boun...@haskell.org] On Behalf Of Alan & Kim Zimmerman
Sent: 28 August 2014 15:00
To: ghc-devs@haskell.org
Subject: GHC AST Annotations

> Now that the landmines have hopefully been cleared from the AST via [1] I would like to propose changing the location information in the AST.
> [...]
Re: GHC AST Annotations
For what it's worth, my thought is not to use SrcSpanInfo (which, to me, is the wrong way to slice the abstraction) but instead to add SrcSpan fields to the relevant nodes. For example:

  | HsDo        SrcSpan              -- of the word "do"
                BlockSrcSpans
                (HsStmtContext Name) -- The parameterisation is unimportant
                                     -- because in this context we never use
                                     -- the PatGuard or ParStmt variant
                [ExprLStmt id]       -- "do":one or more stmts
                PostTcType           -- Type of the whole expression

...

data BlockSrcSpans = LayoutBlock Int  -- the parameter is the indentation level
                                 ...  -- stuff to track the appearance of any semicolons
                   | BracesBlock ...  -- stuff to track the braces and semicolons

The way I understand it, the SrcSpanInfo proposal means that we would have lots of empty SrcSpanInfos, no? Most interior nodes don't need one, I think.

Popping up a level, I do support the idea of including this info in the AST.

Richard

On Aug 28, 2014, at 11:54 AM, Simon Peyton Jones wrote:
> In general I’m fine with this direction of travel. Some specifics:
> [...]
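The alternative suggested above, explicit SrcSpan fields only on the nodes that need them, can be sketched on a toy AST as follows. Everything here is a stand-in defined locally: the SrcSpan type is simplified, the Expr and Stmt types are hypothetical, and the two fields of BracesBlock are an assumption (the message leaves them elided, and semicolon tracking is omitted entirely).

```haskell
-- Simplified stand-in for a source span; not GHC's real SrcSpan.
data SrcSpan = SrcSpan
  { spanLine     :: Int
  , spanStartCol :: Int
  , spanEndCol   :: Int
  } deriving (Eq, Show)

-- How a block of statements was laid out in the source.
data BlockSrcSpans
  = LayoutBlock Int              -- indentation level of the block
  | BracesBlock SrcSpan SrcSpan  -- spans of "{" and "}" (assumed fields)
  deriving (Eq, Show)

-- Hypothetical statement type, just enough to populate a block.
newtype Stmt = Stmt String
  deriving (Eq, Show)

-- Only the constructor that has a keyword carries a keyword span.
data Expr
  = Lit Int                          -- no keyword, so no extra field
  | Do SrcSpan BlockSrcSpans [Stmt]  -- span of the word "do", then layout
  deriving (Eq, Show)

-- Span of the introducing keyword, if the node has one.
keywordSpan :: Expr -> Maybe SrcSpan
keywordSpan (Do kw _ _) = Just kw
keywordSpan (Lit _)     = Nothing
```

Compared with a per-node list of spans, each span here has a fixed position in its constructor, and nodes without markers carry nothing extra.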
Re: GHC AST Annotations
This does have the advantage of being explicit. I modelled the initial proposal on HSE as a proven solution, and I think that they were trying to keep it non-invasive, to allow both an annotated and non-annotated AST.

I think the key question is whether it is acceptable to sprinkle this kind of information throughout the AST. For someone interested in source-to-source conversions (like me) this is great; others may find it intrusive.

The other question, which is probably orthogonal to this, is whether we want the annotation to be a parameter to the AST, which allows it to be overridden by various tools for various purposes, or fixed as in Richard's suggestion.

A parameterised annotation allows the annotations to be manipulated via something like the following, from HSE:

  -- | AST nodes are annotated, and this class allows manipulation of the annotations.
  class Functor ast => Annotated ast where

    -- | Retrieve the annotation of an AST node.
    ann :: ast l -> l

    -- | Change the annotation of an AST node. Note that only the annotation of the node itself is affected, and not
    --   the annotations of any child nodes. If all nodes in the AST tree are to be affected, use fmap.
    amap :: (l -> l) -> ast l -> ast l

Alan

On Thu, Aug 28, 2014 at 7:11 PM, Richard Eisenberg wrote:
> For what it's worth, my thought is not to use SrcSpanInfo (which, to me, is the wrong way to slice the abstraction) but instead to add SrcSpan fields to the relevant nodes.
> [...]
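As a small illustration of the Annotated class quoted above, the sketch below instantiates it for a toy expression type. The class signature is the one from the message; the Exp type, the use of Int as the annotation, and the sample value are assumptions for illustration only. It shows the difference between amap (root node only) and fmap (every node).

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Class as quoted in the message, redeclared locally so the example
-- is self-contained.
class Functor ast => Annotated ast where
  -- Retrieve the annotation of a node.
  ann  :: ast l -> l
  -- Change the annotation of this node only, not of its children.
  amap :: (l -> l) -> ast l -> ast l

-- A toy AST (hypothetical) with the annotation as its type parameter.
data Exp l
  = Var l String
  | App l (Exp l) (Exp l)
  deriving (Eq, Show, Functor)

instance Annotated Exp where
  ann (Var l _)   = l
  ann (App l _ _) = l

  amap f (Var l n)   = Var (f l) n
  amap f (App l a b) = App (f l) a b

-- The expression "f x", annotated with arbitrary integer labels.
sample :: Exp Int
sample = App 0 (Var 1 "f") (Var 2 "x")

-- Only the root label changes.
rootOnly :: Exp Int
rootOnly = amap (+ 10) sample

-- Every label changes, via the derived Functor instance.
allNodes :: Exp Int
allNodes = fmap (+ 10) sample
```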
Re: GHC AST Annotations
I have started capturing the discussion here https://ghc.haskell.org/trac/ghc/wiki/GhcAstAnnotations.

On Thu, Aug 28, 2014 at 8:34 PM, Alan & Kim Zimmerman wrote:
> This does have the advantage of being explicit.
> [...]
RE: GHC AST Annotations
> I think the key question is whether it is acceptable to sprinkle this kind of information throughout the AST. For someone interested in source-to-source conversions (like me) this is great, others may find it intrusive.

It’s probably not too bad if you use record syntax; thus

  | HsDo { hsdo_do_loc :: SrcSpan       -- of the word "do"
         , hsdo_blocks :: BlockSrcSpans
         , hsdo_ctxt   :: HsStmtContext Name
         , hsdo_stmts  :: [ExprLStmt id]
         , hsdo_type   :: PostTcType }

Simon

From: Alan & Kim Zimmerman [mailto:alan.z...@gmail.com]
Sent: 28 August 2014 19:35
To: Richard Eisenberg
Cc: Simon Peyton Jones; ghc-devs@haskell.org
Subject: Re: GHC AST Annotations

> This does have the advantage of being explicit.
> [...]
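One way to read the record-syntax suggestion above is that code which does not care about locations can name only the fields it uses. The sketch below shows this on a toy type; the Expr type and its field names are hypothetical (they are not the hsdo_* fields of the message), and the SrcSpan type is a simplified local stand-in.

```haskell
-- Simplified stand-in for a source span: line, start column, end column.
data SrcSpan = SrcSpan Int Int Int
  deriving (Eq, Show)

-- Toy AST (hypothetical): the "do" node records where its keyword was.
data Expr
  = Lit Int
  | Do { doKeywordLoc :: SrcSpan   -- span of the word "do"
       , doStmts      :: [Expr] }
  deriving (Eq, Show)

-- A consumer that ignores locations: the record pattern mentions only
-- the field it needs, so adding more location fields would not break it.
countStmts :: Expr -> Int
countStmts Do { doStmts = ss } = length ss
countStmts (Lit _)             = 0

-- Record update changes one field and leaves the keyword location as it was.
dropLast :: Expr -> Expr
dropLast e@Do { doStmts = ss } = e { doStmts = take (length ss - 1) ss }
dropLast e                     = e
```

With positional constructors, both functions would have to be edited every time a location field is added.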
Re: GHC AST Annotations
A further use case would be to be able to convert all the locations to be relative, or include a relative portion, so that as tools manipulate the AST by adding or removing parts the layout can be preserved.

I think I may need to make a wip branch for this and experiment; it is always easier to comment on concrete things.

Alan

On Thu, Aug 28, 2014 at 10:38 PM, Simon Peyton Jones wrote:
> It’s probably not too bad if you use record syntax; thus
> [...]
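The idea of relative locations mentioned above can be sketched as a pair of conversions between absolute positions and offsets from a preceding position. This is only one possible encoding, chosen here as an assumption: the Pos and Delta types and both function names are hypothetical, and nothing in the thread fixes this representation.

```haskell
-- Absolute position: (line, column), both 1-based.
type Pos = (Int, Int)

-- Relative offset from a preceding position: lines down, then columns.
data Delta = Delta
  { deltaLines :: Int
  , deltaCols  :: Int
  } deriving (Eq, Show)

-- Offset of the second position relative to the first. On the same line
-- the column part is the distance between the two columns; on a different
-- line it is the absolute column of the second position.
toDelta :: Pos -> Pos -> Delta
toDelta (prevLine, prevCol) (curLine, curCol)
  | curLine == prevLine = Delta 0 (curCol - prevCol)
  | otherwise           = Delta (curLine - prevLine) curCol

-- Inverse of toDelta: rebuild an absolute position from an anchor.
fromDelta :: Pos -> Delta -> Pos
fromDelta (prevLine, prevCol) (Delta 0 dc) = (prevLine, prevCol + dc)
fromDelta (prevLine, _)       (Delta dl c) = (prevLine + dl, c)
```

If the anchor moves because something was inserted before it, applying fromDelta to the new anchor places the element at the same relative distance as before.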
Re: GHC AST Annotations
Since Alan is trying to do something for HaRe that I want for HLint on top of haskell-src-exts, he asked me for my opinions on the proposal. There seem to be two approaches to take:

* Add SrcSpans throughout. The HSE approach of having a list of inner source spans is nasty - the details of which source span goes where are entirely undocumented and hard to discover. Even worse, for things like instance, which may or may not have a where after, the number of inner SrcSpans changes. Simon's idea of hsdo_do_loc is much cleaner, and easily extends to Maybe SrcSpan if the keyword is optional.

* Having the annotation be a type parameter gives much greater flexibility. In particular, it would let you mark certain nodes as being added/deleted. However, since SrcSpan has an Int in it, you can always pass around a separate IntMap and make the SrcSpan really be an index into more detailed information. It's nasty, but only the people who use it pay for it.

Both approaches have disadvantages. You could always combine both ideas, and have a SrcSpan and entirely separately an annotation (which defaults to (), rather than SrcSpanInfo), but maybe that's too much extra baggage on the AST.

Thanks, Neil

On Sat, Aug 30, 2014 at 3:32 PM, Alan & Kim Zimmerman wrote:
> A further use case would be to be able to convert all the locations to be relative, or include a relative portion, so that as tools manipulate the AST by adding or removing parts the layout can be preserved.
> [...]
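The "index into a separate IntMap" idea above can be sketched as follows. This is an illustration under stated assumptions: the toy Exp type carries a bare Int key instead of a real SrcSpan, and NodeInfo, key, and infoFor are hypothetical names. Data.IntMap.Strict comes from the containers package that ships with GHC.

```haskell
import qualified Data.IntMap.Strict as IntMap

-- Hypothetical extra information a tool might want to attach to a node.
data NodeInfo = Original | Added | Deleted
  deriving (Eq, Show)

-- Toy AST whose only annotation is an Int, used as a key into a side table.
data Exp
  = Var Int String
  | App Int Exp Exp
  deriving (Eq, Show)

-- The key stored on a node.
key :: Exp -> Int
key (Var k _)   = k
key (App k _ _) = k

-- The side table, owned by the tool rather than by the AST.
type Annotations = IntMap.IntMap NodeInfo

-- Look up the tool's information for a node; nodes with no entry are
-- treated as unmodified originals.
infoFor :: Annotations -> Exp -> NodeInfo
infoFor anns e = IntMap.findWithDefault Original (key e) anns

-- Example table: the node with key 2 was added by the tool.
exampleAnns :: Annotations
exampleAnns = IntMap.fromList [(2, Added)]
```

The AST type stays unparameterised, and only tools that build such a table pay for it.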
Re: GHC AST Annotations
If this is done right it can enable this sort of thing:
http://www.davidchristiansen.dk/2014/09/06/pretty-printing-idris/

On Fri, Sep 5, 2014 at 5:11 PM, Alan & Kim Zimmerman wrote:
> Hi Neil
>
> Thanks for the feedback.
>
> I am going to start putting together a proof of concept, aiming to identify what annotations are needed to roundtrip source.
>
> The first version will make use of the index into a separate structure scheme, so that it can be used with existing GHC ASTs. Hopefully the information gained will help in understanding what is needed for the changes to the future AST.
>
> The concept I will be working with is a pretty-printer, where relative spacing for the particular elements is derived from the initial SrcSpan information. Any new elements added or changed in the AST can then have only relative information, and the final render should honour the layout from the original.
>
> It may be possible to harmonise this with Chris Done's hindent package, which is a code-specific pretty printer for haskell-src-exts.
>
> Alan
>
> On Sat, Aug 30, 2014 at 11:18 PM, Neil Mitchell wrote:
>> Since Alan is trying to do something for HaRe that I want for HLint on top of haskell-src-exts, he asked me for my opinions on the proposal. There seem to be two approaches to take:
>>
>> * Add SrcSpan's throughout. The HSE approach of having a list of inner source spans is nasty - the details of which source span goes where is entirely undocumented and hard to discover. Even worse, for things like instance, which may or may not have a where after, the number of inner SrcSpan's changes. Simon's idea of hsdo_do_loc is much cleaner, and easily extends to Maybe SrcSpan if the keyword is optional.
>>
>> * Having the annotation be a type parameter gives much greater flexibility. In particular, it would let you mark certain nodes as being added/deleted. However, since SrcSpan has an Int in it, you can always pass around a separate IntMap and make the SrcSpan really be an index into more detailed information. It's nasty, but only the people who use it pay for it.
>>
>> Both approaches have disadvantages. You could always combine both ideas, and have a SrcSpan and entirely separately an annotation (which defaults to (), rather than SrcSpanInfo), but maybe that's too much extra baggage on the AST.
>>
>> Thanks, Neil
>>
>> On Sat, Aug 30, 2014 at 3:32 PM, Alan & Kim Zimmerman wrote:
>>> A further use case would be to be able to convert all the locations to be relative, or include a relative portion, so that as tools manipulate the AST by adding or removing parts the layout can be preserved.
>>>
>>> I think I may need to make a wip branch for this and experiment, it is always easier to comment on concrete things.
>>>
>>> Alan
>>>
>>> On Thu, Aug 28, 2014 at 10:38 PM, Simon Peyton Jones <simo...@microsoft.com> wrote:
>>>> I think the key question is whether it is acceptable to sprinkle this kind of information throughout the AST. For someone interested in source-to-source conversions (like me) this is great, others may find it intrusive.
>>>>
>>>> It’s probably not too bad if you use record syntax; thus
>>>>
>>>>   | HsDo { hsdo_do_loc :: SrcSpan   -- of the word "do"
>>>>          , hsdo_blocks :: BlockSrcSpans
>>>>          , hsdo_ctxt   :: HsStmtContext Name
>>>>          , hsdo_stmts  :: [ExprLStmt id]
>>>>          , hsdo_type   :: PostTcType }
>>>>
>>>> Simon
>>>>
>>>> From: Alan & Kim Zimmerman [mailto:alan.z...@gmail.com]
>>>> Sent: 28 August 2014 19:35
>>>> To: Richard Eisenberg
>>>> Cc: Simon Peyton Jones; ghc-devs@haskell.org
>>>> Subject: Re: GHC AST Annotations
>>>>
>>>> This does have the advantage of being explicit. I modelled the initial proposal on HSE as a proven solution, and I think that they were trying to keep it non-invasive, to allow both an annotated and non-annotated AST.
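The "index into a separate structure" scheme mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Span`, `Keyword`, `KeywordTable`, `recordKeyword` and `lookupKeyword` are hypothetical names, not GHC API, and the table is keyed on the whole span rather than on an Int inside it as Neil suggests.

```haskell
module Main where

import qualified Data.Map.Strict as Map

-- Toy stand-in for GHC's SrcSpan: (line, column) of start and end.
data Span = Span { spanStart :: (Int, Int), spanEnd :: (Int, Int) }
  deriving (Eq, Ord, Show)

-- Hypothetical tags for the syntactic markers discussed in the thread.
data Keyword = KwDo | KwLet | KwIn | KwWhere | KwOf
  deriving (Eq, Ord, Show)

-- Side table kept outside the AST:
-- (span of the enclosing AST node, keyword) -> span of the keyword itself.
type KeywordTable = Map.Map (Span, Keyword) Span

-- Record where a keyword was seen inside a given node.
recordKeyword :: Span -> Keyword -> Span -> KeywordTable -> KeywordTable
recordKeyword node kw loc = Map.insert (node, kw) loc

-- An optional keyword that was not written simply has no entry.
lookupKeyword :: Span -> Keyword -> KeywordTable -> Maybe Span
lookupKeyword node kw = Map.lookup (node, kw)

-- A do block spanning lines 1 to 3, with "do" at columns 8 to 10.
doBlock :: Span
doBlock = Span (1, 8) (3, 20)

exampleTable :: KeywordTable
exampleTable = recordKeyword doBlock KwDo (Span (1, 8) (1, 10)) Map.empty

main :: IO ()
main = do
  print (lookupKeyword doBlock KwDo exampleTable)
  print (lookupKeyword doBlock KwWhere exampleTable)
```

The AST itself is untouched here, which is the attraction for a proof of concept against existing GHC ASTs; the cost is that the table and the tree have to be kept in step by whoever edits the tree.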
Re: GHC AST Annotations
I have created https://ghc.haskell.org/trac/ghc/ticket/9628 for this, and have decided to first tackle adding a type parameter to the entire AST, so that tool writers can add custom information as required.

My first stab at this is as follows

```
data HsModule r name
  = HsModule {
      ann :: r,
        -- ^ Annotation for external tool writers
      hsmodName :: Maybe (Located ModuleName),
        -- ^ @Nothing@: \"module X where\" is omitted (in which case the next
        --   field is Nothing too)
      hsmodExports :: Maybe [LIE name],
```

Salient points

1. It comes as the first type parameter, and is called r
2. It gets added as the first field of the syntax element
3. It is always called ann

Before undertaking this particular change, I would appreciate some feedback.

Regards
Alan

On Thu, Aug 28, 2014 at 8:34 PM, Alan & Kim Zimmerman wrote:
> This does have the advantage of being explicit. I modelled the initial proposal on HSE as a proven solution, and I think that they were trying to keep it non-invasive, to allow both an annotated and non-annotated AST.
>
> I think the key question is whether it is acceptable to sprinkle this kind of information throughout the AST. For someone interested in source-to-source conversions (like me) this is great, others may find it intrusive.
>
> The other question, which is probably orthogonal to this, is whether we want the annotation to be a parameter to the AST, which allows it to be overridden by various tools for various purposes, or fixed as in Richard's suggestion.
>
> A parameterised annotation allows the annotations to be manipulated via something like this, from HSE:
>
>   -- |AST nodes are annotated, and this class allows manipulation of the annotations.
>   class Functor ast => Annotated ast where
>     -- |Retrieve the annotation of an AST node.
>     ann :: ast l -> l
>     -- |Change the annotation of an AST node. Note that only the annotation of the node itself is affected, and not
>     -- the annotations of any child nodes. if all nodes in the AST tree are to be affected, use fmap.
>     amap :: (l -> l) -> ast l -> ast l
>
> Alan
>
> On Thu, Aug 28, 2014 at 7:11 PM, Richard Eisenberg wrote:
>> For what it's worth, my thought is not to use SrcSpanInfo (which, to me, is the wrong way to slice the abstraction) but instead to add SrcSpan fields to the relevant nodes. For example:
>>
>>   | HsDo SrcSpan               -- of the word "do"
>>          BlockSrcSpans
>>          (HsStmtContext Name)  -- The parameterisation is unimportant
>>                                -- because in this context we never use
>>                                -- the PatGuard or ParStmt variant
>>          [ExprLStmt id]        -- "do": one or more stmts
>>          PostTcType            -- Type of the whole expression
>>
>>   ...
>>
>>   data BlockSrcSpans = LayoutBlock Int  -- the parameter is the indentation level
>>                                    ...  -- stuff to track the appearance of any semicolons
>>                      | BracesBlock ...  -- stuff to track the braces and semicolons
>>
>> The way I understand it, the SrcSpanInfo proposal means that we would have lots of empty SrcSpanInfos, no? Most interior nodes don't need one, I think.
>>
>> Popping up a level, I do support the idea of including this info in the AST.
>>
>> Richard
>>
>> On Aug 28, 2014, at 11:54 AM, Simon Peyton Jones wrote:
>>> In general I’m fine with this direction of travel. Some specifics:
>>>
>>> · You’d have to be careful to document, for every data constructor in HsSyn, what the association is between the [SrcSpan] in the SrcSpanInfo and the “sub-entities”
>>> · Many of the sub-entities will have their own SrcSpanInfo wrapped around them, so there’s some unhelpful duplication. Maybe you only want the SrcSpanInfo to list the [SrcSpan]s for the sub-entities (like the syntactic keywords) that do not show up as children in the syntax tree?
>>>
>>> Anyway do by all means create a GHC Trac wiki page to describe your proposed design, concretely.
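To make the parameterised-annotation idea concrete, here is a minimal sketch on a toy expression type rather than on HsSyn. The `Expr` type and `sample` value are invented for illustration; the `Annotated` class is a local copy of the shape quoted from HSE above, not an import of the real one.

```haskell
{-# LANGUAGE DeriveFunctor #-}
module Main where

-- Toy AST: the annotation type r is the type parameter and is stored as
-- the first field of every constructor, as in the proposal.
data Expr r
  = Var r String
  | App r (Expr r) (Expr r)
  deriving (Eq, Show, Functor)

-- Local class with the same shape as the HSE class quoted in the thread.
class Functor ast => Annotated ast where
  -- Retrieve the annotation of the node itself.
  ann  :: ast l -> l
  -- Change the annotation of the node itself, leaving children alone.
  amap :: (l -> l) -> ast l -> ast l

instance Annotated Expr where
  ann (Var r _)   = r
  ann (App r _ _) = r
  amap f (Var r x)   = Var (f r) x
  amap f (App r a b) = App (f r) a b

-- Annotations here are plain Ints; a tool could use spans, tokens, etc.
sample :: Expr Int
sample = App 1 (Var 2 "f") (Var 3 "x")

main :: IO ()
main = do
  print (ann sample)               -- annotation of the root only
  print (amap (+ 10) sample)       -- root annotation changed, children kept
  print (fmap (+ 10) sample)       -- every annotation changed
  print (fmap (const ()) sample)   -- annotations erased to ()
```

The last line shows why a parameter is attractive for tools: the same tree can be re-annotated wholesale, for example to `()` for clients that do not care about annotations.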
RE: GHC AST Annotations
Dear Alan,

Nice going and thanks for undertaking yet another useful AST transformation! A few thoughts (do with them as you see fit):

- Always called "ann"; doesn't this require OverloadedRecordFields? You're in danger of delaying your modification (scheduled to land in 7.10). Other than that, as before, from a design perspective: yes please.

- In terms of presentation/comments; when I first started looking at (i.e. traversing, selectively printing etc.) the AST, I was always really annoyed that every child in the tree has one extra step of indirection, due to the location annotations being "L loc thing", as opposed to a loc-field as part of the thing. I would simply call it annotation (no talk of external tool writers). In time, I hope GHC-annotations also move to that field.

Regards,
Philip

From: Alan & Kim Zimmerman
Sent: 23 September 2014 20:57
To: Richard Eisenberg
Cc: ghc-devs@haskell.org
Subject: Re: GHC AST Annotations

I have created https://ghc.haskell.org/trac/ghc/ticket/9628 for this, and have decided to first tackle adding a type parameter to the entire AST, so that tool writers can add custom information as required.

My first stab at this is as follows

```
data HsModule r name
  = HsModule {
      ann :: r,
        -- ^ Annotation for external tool writers
      hsmodName :: Maybe (Located ModuleName),
        -- ^ @Nothing@: \"module X where\" is omitted (in which case the next
        --   field is Nothing too)
      hsmodExports :: Maybe [LIE name],
```

Salient points

1. It comes as the first type parameter, and is called r
2. It gets added as the first field of the syntax element
3. It is always called ann

Before undertaking this particular change, I would appreciate some feedback.

Regards
Alan
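The "extra step of indirection" Philip describes can be seen by writing the same traversal over both encodings. The sketch below uses toy types only: `Span`, `ExprW`, `ExprF` and the `vars*` functions are made up for illustration, and `Located`/`L` are defined locally in the spirit of GHC's wrapper rather than imported from it.

```haskell
module Main where

-- Toy location: start and end column.
data Span = Span Int Int
  deriving (Eq, Show)

-- Style 1: a wrapper around every child, in the spirit of "L loc thing".
data Located a = L Span a
  deriving (Eq, Show)

data ExprW
  = VarW String
  | AppW (Located ExprW) (Located ExprW)
  deriving (Eq, Show)

-- Style 2: the location is a field of each constructor.
data ExprF
  = VarF Span String
  | AppF Span ExprF ExprF
  deriving (Eq, Show)

-- Collecting variable names: every pattern has to unwrap the L first.
varsW :: Located ExprW -> [String]
varsW (L _ (VarW x))   = [x]
varsW (L _ (AppW f a)) = varsW f ++ varsW a

-- The same traversal with the location stored inline.
varsF :: ExprF -> [String]
varsF (VarF _ x)   = [x]
varsF (AppF _ f a) = varsF f ++ varsF a

sampleW :: Located ExprW
sampleW = L (Span 1 4) (AppW (L (Span 1 2) (VarW "f")) (L (Span 3 4) (VarW "x")))

sampleF :: ExprF
sampleF = AppF (Span 1 4) (VarF (Span 1 2) "f") (VarF (Span 3 4) "x")

main :: IO ()
main = do
  print (varsW sampleW)
  print (varsF sampleF)
```

The wrapper keeps the node types free of location fields, while the inline style removes one layer of pattern matching at the price of an extra field on every constructor.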
RE: GHC AST Annotations
On 26 September 2014 09:08, p.k.f.holzensp...@utwente.nl wrote:
> - In terms of presentation/comments; when I first started looking at (i.e. traversing, selectively printing etc.) the AST, I was always really annoyed that every child in the tree has one extra step of indirection, due to the location annotations being "L loc thing", as opposed to a loc-field as part of the thing. I would simply call it annotation (no talk of external tool writers). In time, I hope GHC-annotations also move to that field.

Replacing the (L loc thing) story by adding a location field to every single data constructor of HsSyn would be entirely possible. But it would mean adding a lot of extra fields. I don’t have a strong opinion either way, but other clients of the GHC API would be affected. What we can’t do is have both the (L loc thing) and an extra field!

> - Always called "ann"; doesn't this require OverloadedRecordFields? You're in danger of delaying your modification (scheduled to land in 7.10). Other than that, as before, from a design perspective: yes please.

Tiresomely it is indeed the case that (for now anyway) the field would need a different name in each data type.

Simon
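One way to live with a differently named annotation field in each data type, while still offering tools a single accessor, is a small type class. This is a sketch under assumptions, not something proposed in the thread: the record types, the field names `modAnn`/`impAnn` and the class `HasAnn` are all hypothetical.

```haskell
module Main where

-- Two toy syntax elements. Without an extension for overloaded or
-- duplicate record fields, both cannot declare a field called "ann"
-- in the same module, so each gets its own prefixed name.
data Module r = Module { modAnn :: r, modName :: String }
  deriving (Eq, Show)

data Import r = Import { impAnn :: r, impName :: String }
  deriving (Eq, Show)

-- Hypothetical class giving one name for "the annotation of this node".
class HasAnn t where
  getAnn :: t r -> r

instance HasAnn Module where
  getAnn = modAnn

instance HasAnn Import where
  getAnn = impAnn

main :: IO ()
main = do
  print (getAnn (Module "module-annotation" "Main"))
  print (getAnn (Import (42 :: Int) "Data.List"))
```

The record fields stay unambiguous for construction and update, and generic tool code only needs `getAnn`.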
GHC AST Annotations (again)
Hi all

Hopefully I will be able to stop harassing everyone on this topic soon.

The final versions of the patches for this are ready for review.

It has been split into three parts

D412 Extends the HsLit values to have an extra field for the original source text, which can differ from the literal value.

D426 adds various extra locations in the HsSyn AST to allow the additions of API annotations everywhere needed.

D438 modifies the Lexer/Parser to produce API annotations and comments as part of the ParsedSource result.

Please review if you are interested.

Links

https://ghc.haskell.org/trac/ghc/wiki/GhcAstAnnotations
https://ghc.haskell.org/trac/ghc/ticket/9628
https://phabricator.haskell.org/D412
https://phabricator.haskell.org/D426
https://phabricator.haskell.org/D438

Regards
Alan
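The motivation for the D412 change can be illustrated with a toy literal type that keeps the text as written next to the parsed value. This is a sketch only: `Lit`, `exactText` and `normalisedText` are invented names and do not reflect the actual HsLit constructors in the patch.

```haskell
module Main where

-- Toy literal: the String is the text as it appeared in the source,
-- the second field is the value the lexer computed from it.
data Lit
  = IntLit String Integer
  | CharLit String Char
  deriving (Eq, Show)

-- Round-tripping tools want the original spelling back.
exactText :: Lit -> String
exactText (IntLit src _)  = src
exactText (CharLit src _) = src

-- Rendering from the value alone loses the original spelling.
normalisedText :: Lit -> String
normalisedText (IntLit _ n)  = show n
normalisedText (CharLit _ c) = show c

-- The same number, written in hexadecimal in the source.
hexSixteen :: Lit
hexSixteen = IntLit "0x10" 16

-- The same character, written with a hex escape in the source.
escapedA :: Lit
escapedA = CharLit "'\\x41'" 'A'

main :: IO ()
main = do
  putStrLn (exactText hexSixteen)
  putStrLn (normalisedText hexSixteen)
  putStrLn (exactText escapedA)
  putStrLn (normalisedText escapedA)
```

Both spellings denote the same value, so a printer that only has the value cannot reproduce what the programmer wrote; carrying the source text avoids that.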
Re: GHC AST Annotations (again)
I have placed an ordering on it so that D426 comes first, then D438 and finally D412, there was a very minor merge update required for D412.

On Tue, Nov 4, 2014 at 11:41 PM, Alan & Kim Zimmerman wrote:
> Hi all
>
> Hopefully I will be able to stop harassing everyone on this topic soon.
>
> The final versions of the patches for this are ready for review.
>
> It has been split into three parts
>
> D412 Extends the HsLit values to have an extra field for the original source text, which can differ from the literal value.
>
> D426 adds various extra locations in the HsSyn AST to allow the additions of API annotations everywhere needed.
>
> D438 modifies the Lexer/Parser to produce API annotations and comments as part of the ParsedSource result.
>
> Please review if you are interested.
>
> Links
>
> https://ghc.haskell.org/trac/ghc/wiki/GhcAstAnnotations
> https://ghc.haskell.org/trac/ghc/ticket/9628
> https://phabricator.haskell.org/D412
> https://phabricator.haskell.org/D426
> https://phabricator.haskell.org/D438
>
> Regards
> Alan