Re: [Haskell-cafe] Re: [Haskell] Re: ANN: Haddock version 2.1.0

2008-05-12 Thread David Waern
2008/5/9 Claus Reinke [EMAIL PROTECTED]:


  Ah, I didn't think about the GHC options that change the lexical
  syntax. You're right, using the GHC lexer should be easier.
 

  and, if you do that, you could also make the GHC lexer
  squirrel away the comments (including pragmas, if they aren't
  already in the AST) someplace safe, indexed by, or at least
  annotated with, their source locations, and make this comment/
  pragma storage available via the GHC API. (1a)

  then, we'd need a way to merge those comments and pragmas
  back into the output during pretty printing, and we'd have made
  the first small step towards source-to-source transformations:
  making code survive semantically intact over (pretty . parse). (1b)

  that would still not quite fulfill the GHC API comment ticket (*),
  but that was only a quick sketch, not a definite design. it might
  be sufficient to let each GHC API client do its own search to
  associate bits of comment/pragma storage with bits of AST.

  if i understand you correctly, you are going to do (1a), so
  if you could add that to the GHC API, we'd only need (1b)
  to go from useable-for-analysis-and-extraction to
  useable-for-transformation.

  is that going to be a problem?

I'll have a look to see if doing 1a) is possible without too much
work. And then if I actually implement something, adding it to the GHC
API shouldn't be a problem.

David
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: [Haskell] Re: ANN: Haddock version 2.1.0

2008-05-09 Thread Simon Marlow

David Waern wrote:

2008/5/8 Simon Marlow [EMAIL PROTECTED]:

So basically you want to run a lexer over the source again to collect all
the comments?


Yes.


You really want to use GHC's lexer, because otherwise you
have to write another lexer.


I don't mind writing a lexer that just collects the comments. It
should be simpler than a full Haskell lexer, right? It wouldn't need
to handle layout, for instance. Using GHC is also a good option.


I'm not sure it's that much easier to write a lexer that just collects 
comments.  For example, is there a comment here?


  3#--foo

with -XMagicHash it is (3# followed by a comment), but without -XMagicHash 
it is not (3 followed by the operator #--).  You have to implement a 
significant chunk of the options that GHC supports to get it right.  I'd 
say it's probably easier to work with GHC's lexer.
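
A minimal sketch of the ambiguity above, assuming a toy single-line
lexer. The names lineCommentStart and isSymbolChar are hypothetical and
this is not GHC's lexer: it ignores string and character literals, block
comments and pragmas, and the Bool argument merely stands in for
-XMagicHash.

```haskell
import Data.Char (isAlphaNum)

-- ASCII characters that may appear in a Haskell operator symbol.
isSymbolChar :: Char -> Bool
isSymbolChar c = c `elem` "!#$%&*+./<=>?@\\^|-~:"

-- | 0-based column at which a line comment starts, if there is one.
-- The Bool is a stand-in for -XMagicHash (an assumption of this sketch).
lineCommentStart :: Bool -> String -> Maybe Int
lineCommentStart magicHash = go 0
  where
    go _ [] = Nothing
    go col s@(c:cs)
      | isAlphaNum c || c == '_' =
          let isWordChar x = isAlphaNum x || x == '_' || x == '\''
              (word, rest) = span isWordChar s
              -- With MagicHash, trailing '#'s belong to the literal or name.
              (hashes, rest')
                | magicHash = span (== '#') rest
                | otherwise = ("", rest)
          in go (col + length word + length hashes) rest'
      | isSymbolChar c =
          let (sym, rest) = span isSymbolChar s
          in if length sym >= 2 && all (== '-') sym
               then Just col                  -- a run of only dashes: comment
               else go (col + length sym) rest -- any other run: an operator
      | otherwise = go (col + 1) cs
```

Here lineCommentStart True "3#--foo" finds a comment at column 2, while
lineCommentStart False "3#--foo" finds none, because "#--" is lexed as
one operator. Even this toy version has to know about one language
option, which is the point being made.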


Cheers,
Simon


[Haskell-cafe] Re: [Haskell] Re: ANN: Haddock version 2.1.0

2008-05-09 Thread David Waern
2008/5/9 Simon Marlow [EMAIL PROTECTED]:
 David Waern wrote:

 2008/5/8 Simon Marlow [EMAIL PROTECTED]:

 So basically you want to run a lexer over the source again to collect all
 the comments?

 Yes.

 You really want to use GHC's lexer, because otherwise you
 have to write another lexer.

 I don't mind writing a lexer that just collects the comments. It
 should be simpler than a full Haskell lexer, right? It wouldn't need
 to handle layout, for instance. Using GHC is also a good option.

 I'm not sure it's that much easier to write a lexer that just collects
 comments.  For example, is there a comment here?

  3#--foo

 with -XMagicHash it is (3# followed by a comment), but without -XMagicHash
 it is not (3 followed by the operator #--).  You have to implement a
 significant chunk of the options that GHC supports to get it right.  I'd say
 it's probably easier to work with GHC's lexer.

Ah, I didn't think about the GHC options that change the lexical
syntax. You're right, using the GHC lexer should be easier.

David


Re: [Haskell-cafe] Re: [Haskell] Re: ANN: Haddock version 2.1.0

2008-05-09 Thread Claus Reinke



Ah, I didn't think about the GHC options that change the lexical
syntax. You're right, using the GHC lexer should be easier.


and, if you do that, you could also make the GHC lexer
squirrel away the comments (including pragmas, if they aren't
already in the AST) someplace safe, indexed by, or at least
annotated with, their source locations, and make this comment/
pragma storage available via the GHC API. (1a)

then, we'd need a way to merge those comments and pragmas
back into the output during pretty printing, and we'd have made
the first small step towards source-to-source transformations: 
making code survive semantically intact over (pretty . parse). (1b)
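
A rough sketch of how (1a) and (1b) could fit together, under heavy
simplifying assumptions: every name here (Loc, Located, CommentStore,
renderWithComments) is hypothetical and not part of the GHC API, and
already pretty-printed declarations are modelled as plain strings tagged
with their original source location.

```haskell
import Data.List (sortOn)

-- A source position: line and column.
data Loc = Loc { locLine :: Int, locCol :: Int }
  deriving (Eq, Ord, Show)

-- Something tagged with the place it came from.
data Located a = Located { loc :: Loc, payload :: a }
  deriving Show

-- (1a) comments and pragmas kept to the side, indexed by source location.
newtype CommentStore = CommentStore [Located String]

-- (1b) merge the stored comments and pragmas back into the printed
-- declarations, ordered by their original source locations.
renderWithComments :: CommentStore -> [Located String] -> String
renderWithComments (CommentStore cs) decls =
  unlines (map payload (sortOn loc (cs ++ decls)))
```

With a store holding "{-# INLINE f #-}" at line 1 and a comment at line
3, and declarations at lines 2 and 4, the output interleaves them in
source order. A real pretty printer would change line numbers, so the
merge would have to work relative to AST nodes rather than on absolute
positions.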


that would still not quite fulfill the GHC API comment ticket (*),
but that was only a quick sketch, not a definite design. it might 
be sufficient to let each GHC API client do its own search to 
associate bits of comment/pragma storage with bits of AST. 


if i understand you correctly, you are going to do (1a), so
if you could add that to the GHC API, we'd only need (1b)
to go from useable-for-analysis-and-extraction to 
useable-for-transformation.


is that going to be a problem?

claus

(*) knowing the source location of some piece of AST is not
sufficient for figuring out whether it has any immediately
preceding or following comments (there might be other AST
fragments in between, closer to the next comment). 

but, if one knows the nearest comment segment for each
piece of AST, one could then build a map where the closest
AST pieces are mapped to (Just commentID), and the other
AST pieces are mapped to Nothing.
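
One possible reading of that map, as a minimal sketch: for each comment,
find the AST piece nearest to it, and give that piece (Just commentID).
The names attachComments, CommentId and Line are hypothetical, AST
pieces are modelled as (name, line) pairs, and distance is measured in
lines only, which is an assumption of this sketch.

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

type Line = Int
type CommentId = Int

-- | Map every AST piece to Just the id of a comment for which it is the
-- nearest piece, or to Nothing if no comment has it as nearest piece.
attachComments
  :: [(CommentId, Line)]          -- comments with their lines
  -> [(String, Line)]             -- AST pieces with their lines
  -> [(String, Maybe CommentId)]
attachComments comments nodes =
  [ (name, lookup name owners) | (name, _) <- nodes ]
  where
    -- For each comment, the name of the AST piece nearest to it.
    owners = [ (fst (closest cl), cid) | (cid, cl) <- comments ]
    closest cl = minimumBy (comparing (\(_, nl) -> abs (nl - cl))) nodes
```

For comments 1 and 2 on lines 2 and 10, and pieces f, g, h on lines 3,
5 and 11, this yields f mapped to Just 1, g to Nothing and h to Just 2:
g has a source location, but another piece sits closer to each comment.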




[Haskell-cafe] Re: [Haskell] Re: ANN: Haddock version 2.1.0

2008-05-08 Thread Simon Marlow

David Waern wrote:

2008/5/2 Claus Reinke [EMAIL PROTECTED]:

2008/5/2 Simon Marlow [EMAIL PROTECTED]:


David Waern wrote:


No it doesn't, but it's on the TODO list. It needs a fix in GHC.

By the way, I'm going to experiment with doing the parsing of comments
on the Haddock side instead of in GHC. If that works out, we won't have
to fix these things in GHC anymore.


 Sounds great - along the lines that we discussed on cvs-ghc a while
 back?

Yes, something along the lines of separately parsing the comments and
recording their source locations, and then trying to match them with
the source locations of the AST nodes.


 yay!-) i hope that the haddock-independent part (parsing, preserving,
 and accessing comments) becomes part of the GHC API in a form that would
 fix trac ticket #1886, then we could finally start writing (ghc) haskell
 source-to-source transformations without losing pragmas or comments!
 losing layout would still be a pain, but that could be dealt with
 later - at least the code would remain functional under some
 form of (pretty . id . parse).


Hmm. When it comes to Haddock, things are simpler than in a refactoring
situation, since we don't need to know exactly where the comments
appear in the concrete syntax. The original Haddock parser is very
liberal in where you can place comments. For example, it doesn't
matter if you place a comment before or after a comma in a record
field list, it is still attached to the previous (or next, depending
on the type of comment) field. I need to take another look at the
grammar to confirm that this is true in general, though. But anyway,
my plan was to do this entirely in Haddock, not do the preserving
part that you mention, and not do anything to GHC.
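
A small sketch of that liberal attachment rule, assuming a record field
list has already been reduced to a flat stream of items. The Item type
and attachDocs are hypothetical names, not Haddock's implementation;
DocNext stands in for a "-- |" comment (documents the next field) and
DocPrev for a "-- ^" comment (documents the previous field).

```haskell
data Item
  = Field String    -- a record field name
  | DocNext String  -- stands in for "-- |": documents the next field
  | DocPrev String  -- stands in for "-- ^": documents the previous field
  | Comma
  deriving (Eq, Show)

-- | Pair each field with its documentation, ignoring where commas fall.
attachDocs :: [Item] -> [(String, [String])]
attachDocs = reverse . fst . foldl step ([], [])
  where
    -- State: (fields seen so far, newest first; docs pending for the
    -- next field).
    step (done, pending) item = case item of
      Comma      -> (done, pending)
      DocNext d  -> (done, pending ++ [d])
      DocPrev d  -> case done of
        (name, ds) : rest -> ((name, ds ++ [d]) : rest, pending)
        []                -> (done, pending)  -- no previous field: dropped
      Field name -> ((name, pending) : done, [])
```

Because Comma leaves the state unchanged, the streams
[Field "x", Comma, DocPrev "the x", ...] and
[Field "x", DocPrev "the x", Comma, ...] produce the same result, which
is why exact comment positions in the concrete syntax do not matter for
this use case.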


So basically you want to run a lexer over the source again to collect all 
the comments?  You really want to use GHC's lexer, because otherwise you 
have to write another lexer.  So a flag to GHC's lexer that says whether it 
should return comments or not seems like a reasonable way to go.  But if 
you're doing that, you might as well have the parser collect all the 
comments off to the side during parsing, to avoid having to lex the file 
twice, right?
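
The two options can be sketched roughly as follows, with a deliberately
crude line-based stand-in for the lexer. Token, lexLines and
parseCollecting are hypothetical names, not GHC's, and only whole-line
comments whose first word is exactly "--" are recognised.

```haskell
import Data.Either (partitionEithers)

-- A token tagged with its line number.
data Token
  = TComment Int String
  | TCode Int String
  deriving (Eq, Show)

-- | Stand-in lexer. The flag says whether comment tokens are returned
-- or silently dropped.
lexLines :: Bool -> String -> [Token]
lexLines keepComments src =
  [ tok | (n, l) <- zip [1 ..] (lines src), tok <- lexLine n l ]
  where
    lexLine n l = case words l of
      ("--" : _)
        | keepComments -> [TComment n l]
        | otherwise    -> []
      []               -> []            -- blank line: no token
      _                -> [TCode n l]

-- | Stand-in parser: a single pass over the tokens that keeps the code
-- and collects the comments, with their lines, off to the side.
parseCollecting :: [Token] -> ([String], [(Int, String)])
parseCollecting = partitionEithers . map classify
  where
    classify (TCode _ t)    = Left t
    classify (TComment n t) = Right (n, t)
```

Calling parseCollecting (lexLines True src) yields both the code and the
located comments from one pass, whereas lexLines False src behaves like
an ordinary lexer that discards comments.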


Cheers,
Simon



[Haskell-cafe] Re: [Haskell] Re: ANN: Haddock version 2.1.0

2008-05-08 Thread David Waern
2008/5/8 Simon Marlow [EMAIL PROTECTED]:
 So basically you want to run a lexer over the source again to collect all
 the comments?

Yes.

 You really want to use GHC's lexer, because otherwise you
 have to write another lexer.

I don't mind writing a lexer that just collects the comments. It
should be simpler than a full Haskell lexer, right? It wouldn't need
to handle layout, for instance. Using GHC is also a good option.

 So a flag to GHC's lexer that says whether it
 should return comments or not seems like a reasonable way to go.  But if
 you're doing that, you might as well have the parser collect all the
 comments off to the side during parsing, to avoid having to lex the file
 twice, right?

Yes.

David