Re: [Haskell-cafe] Re: [Haskell] Re: ANN: Haddock version 2.1.0
2008/5/9 Claus Reinke [EMAIL PROTECTED]:
>> Ah, I didn't think about the GHC options that change the lexical syntax. You're right, using the GHC lexer should be easier.
>
> and, if you do that, you could also make the GHC lexer squirrel away the comments (including pragmas, if they aren't already in the AST) someplace safe, indexed by, or at least annotated with, their source locations, and make this comment/pragma storage available via the GHC API. (1a)
>
> then, we'd need a way to merge those comments and pragmas back into the output during pretty printing, and we'd have made the first small step towards source-to-source transformations: making code survive semantically intact over (pretty . parse). (1b)
>
> that would still not quite fulfill the GHC API comment ticket (*), but that was only a quick sketch, not a definite design. it might be sufficient to let each GHC API client do its own search to associate bits of comment/pragma storage with bits of AST.
>
> if i understand you correctly, you are going to do (1a), so if you could add that to the GHC API, we'd only need (1b) to go from useable-for-analysis-and-extraction to useable-for-transformation. is that going to be a problem?

I'll have a look to see if doing (1a) is possible without too much work. And then if I actually implement something, adding it to the GHC API shouldn't be a problem.

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
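Step (1b) above, merging stored comments and pragmas back into pretty-printed output, might look roughly like the following. This is only a sketch under strong assumptions: every piece of output is keyed by a single source line number, and `mergeBack` is a hypothetical name, not part of the GHC API or of Haddock.

```haskell
import Data.List (sortOn)

-- | Interleave pretty-printed declarations with stored comments/pragmas,
-- ordering everything by the source line it originally came from.
-- Both inputs are (source line, text) pairs; this representation is an
-- assumption made for illustration only.
mergeBack :: [(Int, String)] -> [(Int, String)] -> String
mergeBack decls comments =
  -- sortOn is stable, so a comment recorded on the same line as a
  -- declaration is emitted before that declaration.
  unlines (map snd (sortOn fst (comments ++ decls)))
```

For instance, `mergeBack [(2, "f x = x")] [(1, "-- identity")]` puts the comment line before the declaration. A real implementation would need full source spans rather than line numbers, and would have to decide what to do when a transformation moves or deletes the code a comment was attached to.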
[Haskell-cafe] Re: [Haskell] Re: ANN: Haddock version 2.1.0
David Waern wrote:
> 2008/5/8 Simon Marlow [EMAIL PROTECTED]:
>> So basically you want to run a lexer over the source again to collect all the comments?
>
> Yes.
>
>> You really want to use GHC's lexer, because otherwise you have to write another lexer.
>
> I don't mind writing a lexer that just collects the comments. It should be simpler than a full Haskell lexer, right? It wouldn't need to handle layout, for instance. Using GHC is also a good option.

I'm not sure it's that much easier to write a lexer that just collects comments. For example, is there a comment here?

    3#--foo

With -XMagicHash it is (3# followed by a comment), but without -XMagicHash it is not (3 followed by the operator #--). You have to implement a significant chunk of the options that GHC supports to get it right. I'd say it's probably easier to work with GHC's lexer.

Cheers,
Simon
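The 3#--foo example can be reproduced with a toy lexer. The sketch below is not GHC's lexer: `lexToy`, `Token` and the other names are hypothetical, and it only knows about decimal literals, identifiers, operator symbols and line comments. It does follow the Haskell rule that a run of dashes starts a comment only when the dashes are not part of a longer operator lexeme, which is why the extension flag changes the answer.

```haskell
import Data.Char (isAlpha, isAlphaNum, isDigit, isSpace)

data Token
  = TInt String      -- integer literal, e.g. "3" or (with MagicHash) "3#"
  | TOp String       -- operator symbol, e.g. "#--"
  | TIdent String    -- identifier, e.g. "foo"
  | TComment String  -- line comment, including the leading dashes
  deriving (Eq, Show)

-- ASCII symbol characters that may appear in a Haskell operator.
isSymbolChar :: Char -> Bool
isSymbolChar c = c `elem` "!#$%&*+./<=>?@\\^|-~:"

-- | Toy lexer. The Bool stands in for -XMagicHash (an illustrative flag,
-- not how GHC passes its options around).
lexToy :: Bool -> String -> [Token]
lexToy magicHash = go
  where
    go [] = []
    go s@(c : cs)
      | isSpace c = go cs
      | isDigit c =
          let (ds, rest) = span isDigit s
          in case rest of
               -- With MagicHash, a trailing '#' belongs to the literal.
               ('#' : rest') | magicHash -> TInt (ds ++ "#") : go rest'
               _ -> TInt ds : go rest
      | isAlpha c =
          let (name, rest) = span isAlphaNum s
          in TIdent name : go rest
      | isSymbolChar c =
          -- Maximal munch: take the whole run of symbol characters first.
          let (sym, rest) = span isSymbolChar s
          in if isLineCommentStart sym
               then let (body, rest') = break (== '\n') rest
                    in TComment (sym ++ body) : go rest'
               else TOp sym : go rest
      | otherwise = go cs -- skip characters the toy lexer does not handle

    -- Only a lexeme made up entirely of two or more dashes opens a comment.
    isLineCommentStart sym = length sym >= 2 && all (== '-') sym
```

Running `lexToy True "3#--foo"` and `lexToy False "3#--foo"` gives two different token streams, one with a comment and one without. A comment-only lexer would have to track this flag, and every other option that alters lexical syntax, to stay in agreement with GHC.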
[Haskell-cafe] Re: [Haskell] Re: ANN: Haddock version 2.1.0
2008/5/9 Simon Marlow [EMAIL PROTECTED]:
> David Waern wrote:
>> 2008/5/8 Simon Marlow [EMAIL PROTECTED]:
>>> So basically you want to run a lexer over the source again to collect all the comments?
>>
>> Yes.
>>
>>> You really want to use GHC's lexer, because otherwise you have to write another lexer.
>>
>> I don't mind writing a lexer that just collects the comments. It should be simpler than a full Haskell lexer, right? It wouldn't need to handle layout, for instance. Using GHC is also a good option.
>
> I'm not sure it's that much easier to write a lexer that just collects comments. For example, is there a comment here?
>
>     3#--foo
>
> With -XMagicHash it is (3# followed by a comment), but without -XMagicHash it is not (3 followed by the operator #--). You have to implement a significant chunk of the options that GHC supports to get it right. I'd say it's probably easier to work with GHC's lexer.

Ah, I didn't think about the GHC options that change the lexical syntax. You're right, using the GHC lexer should be easier.

David
Re: [Haskell-cafe] Re: [Haskell] Re: ANN: Haddock version 2.1.0
> Ah, I didn't think about the GHC options that change the lexical syntax. You're right, using the GHC lexer should be easier.

and, if you do that, you could also make the GHC lexer squirrel away the comments (including pragmas, if they aren't already in the AST) someplace safe, indexed by, or at least annotated with, their source locations, and make this comment/pragma storage available via the GHC API. (1a)

then, we'd need a way to merge those comments and pragmas back into the output during pretty printing, and we'd have made the first small step towards source-to-source transformations: making code survive semantically intact over (pretty . parse). (1b)

that would still not quite fulfill the GHC API comment ticket (*), but that was only a quick sketch, not a definite design. it might be sufficient to let each GHC API client do its own search to associate bits of comment/pragma storage with bits of AST.

if i understand you correctly, you are going to do (1a), so if you could add that to the GHC API, we'd only need (1b) to go from useable-for-analysis-and-extraction to useable-for-transformation. is that going to be a problem?

claus

(*) knowing the source location of some piece of AST is not sufficient for figuring out whether it has any immediately preceding or following comments (there might be other AST fragments in between, closer to the next comment). but, if one knows the nearest comment segment for each piece of AST, one could then build a map where the closest AST pieces are mapped to (Just commentID), and the other AST pieces are mapped to Nothing.
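The map described in the footnote (*) could be sketched as follows. Everything here is an assumption for illustration: spans are reduced to line ranges, `NodeId` and `CommentId` are made-up identifiers, and "closest" simply means the smallest gap in lines. None of it reflects an actual GHC API design.

```haskell
import Data.List (foldl', minimumBy)
import qualified Data.Map.Strict as Map
import Data.Ord (comparing)

-- | A source span reduced to its first and last line (an assumption).
data Span = Span { spanStart :: Int, spanEnd :: Int } deriving (Eq, Show)

newtype NodeId = NodeId Int deriving (Eq, Ord, Show)
newtype CommentId = CommentId Int deriving (Eq, Ord, Show)

-- | Gap in lines between two spans; 0 when they overlap.
distance :: Span -> Span -> Int
distance (Span s1 e1) (Span s2 e2)
  | e1 < s2 = s2 - e1
  | e2 < s1 = s1 - e2
  | otherwise = 0

-- | Every AST node starts out mapped to Nothing; for each comment, the
-- node closest to it is remapped to Just that comment's id.
attachComments
  :: [(NodeId, Span)] -> [(CommentId, Span)] -> Map.Map NodeId (Maybe CommentId)
attachComments nodes comments = foldl' attach unattached comments
  where
    unattached = Map.fromList [(n, Nothing) | (n, _) <- nodes]
    attach m (cid, cspan)
      | null nodes = m
      | otherwise =
          let (nearest, _) = minimumBy (comparing (distance cspan . snd)) nodes
          in Map.insert nearest (Just cid) m
```

In this sketch a node that is nearest to several comments keeps only the last one processed; a client that wants all of them would store a list per node instead of a `Maybe`.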
[Haskell-cafe] Re: [Haskell] Re: ANN: Haddock version 2.1.0
David Waern wrote:
> 2008/5/2 Claus Reinke [EMAIL PROTECTED]:
>>> 2008/5/2 Simon Marlow [EMAIL PROTECTED]:
>>>> David Waern wrote:
>>>>> No it doesn't, but it's on the TODO list. It needs a fix in GHC. By the way, I'm going to experiment with doing the parsing of comments on the Haddock side instead of in GHC. If that works out, we won't have to fix these things in GHC anymore.
>>>>
>>>> Sounds great - along the lines that we discussed on cvs-ghc a while back?
>>>
>>> Yes, something along the lines of separately parsing the comments and recording their source locations, and then trying to match them with the source locations of the AST nodes.
>>
>> yay!-) i hope that the haddock-independent part (parsing, preserving, and accessing comments) becomes part of the GHC API in a form that would fix trac ticket #1886, then we could finally start writing (ghc) haskell source-to-source transformations without losing pragmas or comments! losing layout would still be a pain, but that could be dealt with later - at least the code would remain functional under some form of (pretty . id . parse).
>
> Hmm. When it comes to Haddock, things are simpler than in a refactoring situation, since we don't need to know exactly where the comments appear in the concrete syntax. The original Haddock parser is very liberal in where you can place comments. For example, it doesn't matter if you place a comment before or after a comma in a record field list, it is still attached to the previous (or next, depending on the type of comment) field. I need to take another look at the grammar to confirm that this is true in general, though.
>
> But anyway, my plan was to do this entirely in Haddock, not do the preserving part that you mention, and not do anything to GHC.

So basically you want to run a lexer over the source again to collect all the comments? You really want to use GHC's lexer, because otherwise you have to write another lexer. So a flag to GHC's lexer that says whether it should return comments or not seems like a reasonable way to go.

But if you're doing that, you might as well have the parser collect all the comments off to the side during parsing, to avoid having to lex the file twice, right?

Cheers,
Simon
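The two ideas in this message, a lexer flag that controls whether comments are returned and a single pass that sets the comments aside, can be illustrated on a toy token stream. The types and function names below (`Located`, `applyCommentFlag`, `collectComments`) are hypothetical and only model the shape of the idea; they are not GHC's lexer or parser interfaces.

```haskell
import Data.Either (partitionEithers)

-- | A value tagged with the line and column where it starts.
data Located a = Located { locLine :: Int, locCol :: Int, locValue :: a }
  deriving (Eq, Show)

data Token = TComment String | TOther String
  deriving (Eq, Show)

isComment :: Token -> Bool
isComment (TComment _) = True
isComment _ = False

-- | Models the proposed lexer flag: True keeps comment tokens in the
-- stream, False drops them as an ordinary lexer would.
applyCommentFlag :: Bool -> [Located Token] -> [Located Token]
applyCommentFlag True = id
applyCommentFlag False = filter (not . isComment . locValue)

-- | One pass over the stream: comments (with their locations) go to the
-- side, everything else is what the parser proper would consume.
collectComments :: [Located Token] -> ([Located String], [Located Token])
collectComments = partitionEithers . map classify
  where
    classify (Located l c (TComment s)) = Left (Located l c s)
    classify t = Right t
```

With the flag on, `collectComments` yields both the located comments and the comment-free token stream from one traversal, which is the "avoid having to lex the file twice" point made above.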
[Haskell-cafe] Re: [Haskell] Re: ANN: Haddock version 2.1.0
2008/5/8 Simon Marlow [EMAIL PROTECTED]:
> So basically you want to run a lexer over the source again to collect all the comments?

Yes.

> You really want to use GHC's lexer, because otherwise you have to write another lexer.

I don't mind writing a lexer that just collects the comments. It should be simpler than a full Haskell lexer, right? It wouldn't need to handle layout, for instance. Using GHC is also a good option.

> So a flag to GHC's lexer that says whether it should return comments or not seems like a reasonable way to go. But if you're doing that, you might as well have the parser collect all the comments off to the side during parsing, to avoid having to lex the file twice, right?

Yes.

David