Re: Haddock strings in .hi files

2014-03-21 Thread Simon Marlow
Ok, I buy the argument that if we're already compiling everything, we 
shouldn't have to re-typecheck it all in Haddock. Of course if you're 
*not* already compiling everything, then the argument doesn't apply: 
Haddock does support generating documentation from source files without 
precompiling them, but I think if you ask the GHC API to load modules 
with -fno-code it should do the right thing: load up the .hi files if 
they're up to date, or typecheck the modules otherwise.
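
As a rough illustration of the "load the .hi file if it is up to date, otherwise typecheck" behaviour described above, here is a minimal sketch. It is not GHC's actual logic: the names `Timestamp`, `Action`, `chooseAction` and `planModules` are hypothetical, and comparing timestamps is a deliberate simplification of a real recompilation check.

```haskell
-- Timestamps are plain Integers to keep the sketch free of dependencies.
type Timestamp = Integer

-- What to do for one module when documentation is requested.
data Action = LoadInterface | Typecheck
  deriving (Eq, Show)

-- Hypothetical, simplified up-to-date check: reuse the interface file only
-- if it exists and is at least as new as the source file.
chooseAction :: Timestamp -> Maybe Timestamp -> Action
chooseAction _ Nothing = Typecheck
chooseAction srcTime (Just hiTime)
  | hiTime >= srcTime = LoadInterface
  | otherwise         = Typecheck

-- Plan a whole package: (module name, source time, interface time if any).
planModules :: [(String, Timestamp, Maybe Timestamp)] -> [(String, Action)]
planModules mods = [ (name, chooseAction src hi) | (name, src, hi) <- mods ]
```

In this sketch a freshly built package yields `LoadInterface` for every module, which is the case where the repeated work would be avoided.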


So I think having GHC spit out the docs as a side-effect of compilation 
is fine, so long as we don't have to do all the Haddock processing 
inside GHC itself, and provided this eliminates Haddock's own interface 
files (which are a pain).  If the docs go in the .hi file, then they 
must go in a separate section that is lazily parsed - we already do this 
for various other sections in the .hi file.
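
To make the "separate section that is parsed lazily" idea concrete, here is a small sketch that relies on Haskell's lazy record fields. Everything in it is an assumption made for illustration: `Iface`, `loadIface` and `lookupDoc` are hypothetical names, the tab-separated docs format is invented, and none of this reflects GHC's real interface file representation.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical, simplified stand-in for an interface file.
data Iface = Iface
  { ifaceExports :: [String]            -- needed by every importing module
  , ifaceDocs    :: [(String, String)]  -- lazy field: decoded only on demand
  }

-- Hypothetical layout: the docs section is raw text with one
-- "name<TAB>docstring" entry per line.
loadIface :: [String] -> String -> Iface
loadIface exports rawDocs = Iface
  { ifaceExports = exports
  , ifaceDocs    = map splitEntry (lines rawDocs)  -- stays a thunk until used
  }
  where
    splitEntry l = case break (== '\t') l of
      (name, '\t' : doc) -> (name, doc)
      (name, _)          -> (name, "")

-- Look up a docstring, as a :doc-style command in GHCi might.
lookupDoc :: Iface -> String -> String
lookupDoc iface name =
  fromMaybe "(no documentation)" (lookup name (ifaceDocs iface))
```

A client that only inspects `ifaceExports` never forces the docs section, so ordinary compilation would not pay for decoding docstrings it does not use.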


I don't think this is easy, but it's probably doable.  The code that 
attaches docs to declarations is currently part of Haddock itself, so 
perhaps this has to move into GHC.


Cheers,
Simon

On 20/03/2014 16:41, Edward Kmett wrote:

My knowledge of precisely how haddock works is somewhat fuzzy in that it
arises from a series of discussions a couple of years back.

My observation was mostly that when I run 'cabal install' it goes through all
the modules building my .hi files, etc. Then I run cabal haddock and it
spends all that time redoing the same work, just to go through and get
at some information that we had right up until the moment we finished
building.

I'm not wedded to bolting the information into the .hi files being the
solution, but the idea that we could avoid redoing that work is
tantalizing. I'm mostly trying to avoid redoing all the same work twice
in the build cycle of the average user.

If there is an alternative strategy, such as, oh, I don't know, making
haddock able to hook in plugin-style late as we're generating the .hi
file to spit out what it needs to something else and
interrogate/rename/whatever it needs from the rest of the GHC API, I'd be
totally open to that as well.

-Edward


On Thu, Mar 20, 2014 at 12:18 PM, Mateusz Kowalczyk
fuuze...@fuuzetsu.co.uk wrote:

On 20/03/14 16:08, Edward Kmett wrote:
  One strong reason for considering at least including the haddocks in the
  .hi files is build times.

  Currently if you have cabal configured to build and document every package
  running hackage requires you to recompile your entire source tree a second
  time to get information that we just dropped on the floor before spitting
  out the .hi file.

  For most of the users of GHC this is a 50% difference in compile times if
  they have cabal configured to generate haddocks.

  GHC doesn't have to understand the haddocks any more than it does now to
  support it, just include the content.

  Haddock could then just go through and load the .hi files rather than
  starting from scratch with parsing and typechecking the entire module,
  running template-haskell, just to get at the documentation.

  Any pythonesque :doc command support to me would be gravy.

  The reason I care at all is the build times. I regularly lose minutes out
  of each build just to regenerate docs and wind up skipping building them as
  much as I can get away with to avoid the pain.

  -Edward
 
 

As Simon M points out, we still have to run the renamer which seems to
be tightly bound with the type-checker. Where do you suggest the
sizeable performance increase would be coming from in this case? For all
the existing packages, we already read the docs from .haddock files so
there's no difference there. For new packages we have to type-check and
generate .haddock anyway so there's no difference there either.

It's not really about GHC having to know more about Haddock, it's about
Haddock having to use GHC anyway, whether the strings are embedded
or not.

--
Mateusz K.



___
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs


Re: Haddock strings in .hi files

2014-03-21 Thread Edward Kmett
On Fri, Mar 21, 2014 at 7:38 AM, Simon Marlow marlo...@gmail.com wrote:

 Ok, I buy the argument that if we're already compiling everything, we
 shouldn't have to re-typecheck it all in Haddock. Of course if you're *not*
 already compiling everything, then the argument doesn't apply: Haddock does
 support generating documentation from source files without precompiling
 them, but I think if you ask the GHC API to load modules with -fno-code it
 should do the right thing: load up the .hi files if they're up to date, or
 typecheck the modules otherwise.


Definitely. That said, cabal installing a package with documentation is by
far the most common scenario, and that is the thing that could be sped up
the most here.

So I think having GHC spit out the docs as a side-effect of compilation is
 fine, so long as we don't have to do all the Haddock processing inside GHC
 itself, and provided this eliminates Haddock's own interface files (which
 are a pain).  If the docs go in the .hi file, then they must go in a
 separate section that is lazy parsed - we already do this for various other
 sections in the .hi file.


Exactly.


 I don't think this is easy, but it's probably doable.  The code that
 attached docs to declarations is currently part of Haddock itself, so
 perhaps this has to move into GHC.


This was originally scoped to be around the level of work of a GSoC
project, and folks were worried that it came in a bit light, so it taking
some effort isn't unreasonable. I don't think we have an application for it
yet this year, so we probably have some time to chew it over unless someone
applies to do this in the next few hours. (We had one student, but she
backed out at the last second due to other constraints.)

-Edward


RE: Haddock strings in .hi files

2014-03-20 Thread Simon Peyton Jones
| The current design is intended to separate Haddock from GHC as much as
| possible, but putting documentation in .hi files would be going in the
| opposite direction.  There would have to be a compelling reason to do
| that, something that we couldn't do another way.

Actually it would in many ways be easier to put Haddock stuff in .hi files.  
But since GHC writes the .hi file, that would essentially mean merging GHC and 
Haddock into a single compile-and-documentation-generator.   That would be cool 
in a way -- for example, GHCi would natively have access to the Haddock docs 
for a function.

As Simon M says, the big reason not to do that is because it couples together 
two large projects, making each harder to develop independently.  And that's a 
pretty big reason.

Simon

| -Original Message-
| From: ghc-devs [mailto:ghc-devs-boun...@haskell.org] On Behalf Of Simon
| Marlow
| Sent: 19 March 2014 11:40
| To: Mateusz Kowalczyk; ghc-devs@haskell.org
| Subject: Re: Haddock strings in .hi files
| 
| On 18/03/2014 18:20, Mateusz Kowalczyk wrote:
|  Hi all,
| 
|  I saw https://ghc.haskell.org/trac/ghc/ticket/5467 pop up in my inbox
|  and it reminded me of something I've been wondering for a while: why
|  do we not store Haddock docstrings in the interface file?
| 
|  I think that if we did, we could do some great things:
| 
|  1. Show docs in GHCi (I vaguely recall someone working on this ~1 year
|  ago, does anyone have any info?)
| 
|  2. Allow Haddock to work a lot faster: the big majority of time spent
|  when creating documentation is actually spent by Haddock calling
|  various GHC functions, such as type-checking the modules. Only a small
|  amount of time is actually spent by Haddock on other tasks such as
|  parsing or outputting the documentation. If we could simply get
|  everything we need from the .hi files, we save ourselves a lot of
| time.
| 
| Don't you still have to run the renamer at least?  And in GHC, renaming
| is tied up with typechecking, so it's hard to do one without the other.
|   Furthermore, if there is a missing type signature it's useful to be
| able to put the inferred type in the documentation.  I think I'm missing
| the point somewhere - how does putting docs in the .hi file let you
| avoid typechecking?
| 
| I'm not really sure I see the benefit.  If Haddock provided a library
| that we can call from GHCi to get documentation, then we could show
| documentation in GHCi.
| 
| The current design is intended to separate Haddock from GHC as much as
| possible, but putting documentation in .hi files would be going in the
| opposite direction.  There would have to be a compelling reason to do
| that, something that we couldn't do another way.
| 
| Cheers,
| Simon
| 
| 
|  3. Allow Haddock to create partial documentation: a complaint I
|  sometimes hear is if anything at all in the project doesn't type
|  check, we don't get any documentation at all. I think that it'd be
|  viable to generate only the documentation for the modules/functions
|  that do type-check and perhaps skip type signatures for everything
| else.
| 
|  Points 1. and 2. are of clear benefit. Point 3. is a simple
|  afterthought and thinking about it some more, I think that maybe it'd
|  be possible to do this with what we have right now: is type-checking
|  separate parts of the module supported? Can we retrieve documentation
|  for the parts that don't type-check?
| 
|  I am asking for input on what people think. I am not familiar at all
|  with what goes into the .hi file (and I can't find anything concrete!
|  Am I missing some wiki page?) at all and why. At the very least, 1.
|  should be easy to implement.
| 
|  It was suggested that I submit a proposal for this as part of GSoC,
|  namely implementing 1. and 2.. I admit that having much faster
|  documentation builds would be amazing and Edward K. and Carter S. seem
|  to think that this is very do-able in the 3 month period that GSoC
|  runs over.
| 
|  While I say all this, I have already submitted my proposal on a
|  different topic. I am considering writing this up and submitting this
|  as well but I am looking for some insight into the problem first.
| 
|  If there are any students around still looking for ideas, please do
|  speak up if you want to snatch this. If there are people that are
|  eager to mentor something like this then I suppose they should speak
| up too.
| 
|  Thanks!
| 


Re: Haddock strings in .hi files

2014-03-20 Thread Edward Kmett
One strong reason for considering at least including the haddocks in the
.hi files is build times.

Currently if you have cabal configured to build and document every package
running hackage requires you to recompile your entire source tree a second
time to get information that we just dropped on the floor before spitting
out the .hi file.

For most of the users of GHC this is a 50% difference in compile times if
they have cabal configured to generate haddocks.

GHC doesn't have to understand the haddocks any more than it does now to
support it, just include the content.

Haddock could then just go through and load the .hi files rather than
starting from scratch with parsing and typechecking the entire module,
running template-haskell, just to get at the documentation.

Any pythonesque :doc command support to me would be gravy.

The reason I care at all is the build times. I regularly lose minutes out
of each build just to regenerate docs and wind up skipping building them as
much as I can get away with to avoid the pain.

-Edward


On Thu, Mar 20, 2014 at 4:08 AM, Simon Peyton Jones
simo...@microsoft.com wrote:

 | The current design is intended to separate Haddock from GHC as much as
 | possible, but putting documentation in .hi files would be going in the
 | opposite direction.  There would have to be a compelling reason to do
 | that, something that we couldn't do another way.

 Actually it would in many ways be easier to put Haddock stuff in .hi
 files.  But since GHC writes the .hi file, that would essentially mean
 merging GHC and Haddock into a single compile-and-documentation-generator.
   That would be cool in a way -- for example, GHCi would natively have
 access to the Haddock docs for a function.

 As Simon M says, the big reason not to do that is because it couples
 together two large projects, making each harder to develop independently.
  And that's a pretty big reason.

 Simon

 | -Original Message-
 | From: ghc-devs [mailto:ghc-devs-boun...@haskell.org] On Behalf Of Simon
 | Marlow
 | Sent: 19 March 2014 11:40
 | To: Mateusz Kowalczyk; ghc-devs@haskell.org
 | Subject: Re: Haddock strings in .hi files
 |
 | On 18/03/2014 18:20, Mateusz Kowalczyk wrote:
 |  Hi all,
 | 
 |  I saw https://ghc.haskell.org/trac/ghc/ticket/5467 pop up in my inbox
 |  and it reminded me of something I've been wondering for a while: why
 |  do we not store Haddock docstrings in the interface file?
 | 
 |  I think that if we did, we could do some great things:
 | 
 |  1. Show docs in GHCi (I vaguely recall someone working on this ~1 year
 |  ago, does anyone have any info?)
 | 
 |  2. Allow Haddock to work a lot faster: the big majority of time spent
 |  when creating documentation is actually spent by Haddock calling
 |  various GHC functions, such as type-checking the modules. Only a small
 |  amount of time is actually spent by Haddock on other tasks such as
 |  parsing or outputting the documentation. If we could simply get
 |  everything we need from the .hi files, we save ourselves a lot of
 | time.
 |
 | Don't you still have to run the renamer at least?  And in GHC, renaming
 | is tied up with typechecking, so it's hard to do one without the other.
 |   Furthermore, if there is a missing type signature it's useful to be
 | able to put the inferred type in the documentation.  I think I'm missing
 | the point somewhere - how does putting docs in the .hi file let you
 | avoid typechecking?
 |
 | I'm not really sure I see the benefit.  If Haddock provided a library
 | that we can call from GHCi to get documentation, then we could show
 | documentation in GHCi.
 |
 | The current design is intended to separate Haddock from GHC as much as
 | possible, but putting documentation in .hi files would be going in the
 | opposite direction.  There would have to be a compelling reason to do
 | that, something that we couldn't do another way.
 |
 | Cheers,
 | Simon
 |
 |
 |  3. Allow Haddock to create partial documentation: a complaint I
 |  sometimes hear is if anything at all in the project doesn't type
 |  check, we don't get any documentation at all. I think that it'd be
 |  viable to generate only the documentation for the modules/functions
 |  that do type-check and perhaps skip type signatures for everything
 | else.
 | 
 |  Points 1. and 2. are of clear benefit. Point 3. is a simple
 |  afterthought and thinking about it some more, I think that maybe it'd
 |  be possible to do this with what we have right now: is type-checking
 |  separate parts of the module supported? Can we retrieve documentation
 |  for the parts that don't type-check?
 | 
 |  I am asking for input on what people think. I am not familiar at all
 |  with what goes into the .hi file (and I can't find anything concrete!
 |  Am I missing some wiki page?) at all and why. At the very least, 1.
 |  should be easy to implement.
 | 
 |  It was suggested that I submit a proposal for this as part of GSoC,
 |  namely implementing 1. and 2.. I admit

Re: Haddock strings in .hi files

2014-03-20 Thread Edward Kmett
My knowledge of precisely how haddock works is somewhat fuzzy in that it
arises from a series of discussions a couple of years back.

My observation was mostly that when I run 'cabal install' it goes through all
the modules building my .hi files, etc. Then I run cabal haddock and it
spends all that time redoing the same work, just to go through and get at
some information that we had right up until the moment we finished building.

I'm not wedded to bolting the information into the .hi files being the
solution, but the idea that we could avoid redoing that work is
tantalizing. I'm mostly trying to avoid redoing all the same work twice in
the build cycle of the average user.

If there is an alternative strategy, such as, oh, I don't know, making
haddock able to hook in plugin-style late as we're generating the .hi file
to spit out what it needs to something else and interrogate/rename/whatever
it needs from the rest of the GHC API, I'd be totally open to that as well.

-Edward


On Thu, Mar 20, 2014 at 12:18 PM, Mateusz Kowalczyk fuuze...@fuuzetsu.co.uk
 wrote:

 On 20/03/14 16:08, Edward Kmett wrote:
  One strong reason for considering at least including the haddocks in the
  .hi files is build times.

  Currently if you have cabal configured to build and document every package
  running hackage requires you to recompile your entire source tree a second
  time to get information that we just dropped on the floor before spitting
  out the .hi file.

  For most of the users of GHC this is a 50% difference in compile times if
  they have cabal configured to generate haddocks.

  GHC doesn't have to understand the haddocks any more than it does now to
  support it, just include the content.

  Haddock could then just go through and load the .hi files rather than
  starting from scratch with parsing and typechecking the entire module,
  running template-haskell, just to get at the documentation.

  Any pythonesque :doc command support to me would be gravy.

  The reason I care at all is the build times. I regularly lose minutes out
  of each build just to regenerate docs and wind up skipping building them as
  much as I can get away with to avoid the pain.
 
  -Edward
 
 

 As Simon M points out, we still have to run the renamer which seems to
 be tightly bound with the type-checker. Where do you suggest the
 sizeable performance increase would be coming from in this case? For all
 the existing packages, we already read the docs from .haddock files so
 there's no difference there. For new packages we have to type-check and
 generate .haddock anyway so there's no difference there either.

 It's not really about GHC having to know more about Haddock, it's about
 Haddock having to use GHC anyway, whether the strings are embedded or not.

 --
 Mateusz K.



Re: Haddock strings in .hi files

2014-03-20 Thread Malcolm Wallace

On 20 Mar 2014, at 16:41, Edward Kmett wrote:

 My observation was mostly that I run 'cabal install' it goes through all the 
 modules building my .hi files, etc. Then I run cabal haddock and it spends 
 all that time redoing the same work, just to go through and get at some 
 information that we had right up until the moment we finished building.
 
 I'm not wedded to bolting the information into the .hi files being the 
 solution, but the idea that we could avoid redoing that work is tantalizing.

One obvious solution could be for Haddock to learn how to read the existing .hi 
files, solely to read out the type signature of any exported entity that does 
not have an explicit signature in the source file.
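
A minimal sketch of that idea, under stated assumptions: signatures are modelled as plain strings, and `SigSource` and `resolveSigs` are hypothetical names rather than anything in Haddock or the GHC API. An explicit signature from the source wins, and the interface file is consulted only as a fallback.

```haskell
-- Where a documented signature came from.
data SigSource = FromSource | FromInterface
  deriving (Eq, Show)

-- Hypothetical inputs, all as plain strings: exported names, signatures
-- written in the source, and types recorded in an existing .hi file.
resolveSigs
  :: [String]            -- exported names
  -> [(String, String)]  -- explicit signatures from the source file
  -> [(String, String)]  -- types read from the interface file
  -> [(String, Maybe (String, SigSource))]
resolveSigs exports srcSigs hiSigs = map resolve exports
  where
    resolve name = (name, pick name)
    -- Prefer what the author wrote; fall back to the inferred type.
    pick name = case lookup name srcSigs of
      Just ty -> Just (ty, FromSource)
      Nothing -> fmap (\ty -> (ty, FromInterface)) (lookup name hiSigs)
```

An export that appears in neither table maps to `Nothing`, which a documentation tool could render without a signature.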

Regards,
Malcolm


Re: Haddock strings in .hi files

2014-03-19 Thread Simon Marlow

On 18/03/2014 18:20, Mateusz Kowalczyk wrote:

Hi all,

I saw https://ghc.haskell.org/trac/ghc/ticket/5467 pop up in my inbox
and it reminded me of something I've been wondering for a while: why do
we not store Haddock docstrings in the interface file?

I think that if we did, we could do some great things:

1. Show docs in GHCi (I vaguely recall someone working on this ~1 year
ago, does anyone have any info?)

2. Allow Haddock to work a lot faster: the big majority of time spent
when creating documentation is actually spent by Haddock calling various
GHC functions, such as type-checking the modules. Only a small amount of
time is actually spent by Haddock on other tasks such as parsing or
outputting the documentation. If we could simply get everything we need
from the .hi files, we save ourselves a lot of time.


Don't you still have to run the renamer at least?  And in GHC, renaming 
is tied up with typechecking, so it's hard to do one without the other. 
 Furthermore, if there is a missing type signature it's useful to be 
able to put the inferred type in the documentation.  I think I'm missing 
the point somewhere - how does putting docs in the .hi file let you 
avoid typechecking?


I'm not really sure I see the benefit.  If Haddock provided a library 
that we can call from GHCi to get documentation, then we could show 
documentation in GHCi.


The current design is intended to separate Haddock from GHC as much as 
possible, but putting documentation in .hi files would be going in the 
opposite direction.  There would have to be a compelling reason to do 
that, something that we couldn't do another way.


Cheers,
Simon



3. Allow Haddock to create partial documentation: a complaint I
sometimes hear is if anything at all in the project doesn't type check,
we don't get any documentation at all. I think that it'd be viable to
generate only the documentation for the modules/functions that do
type-check and perhaps skip type signatures for everything else.

Points 1. and 2. are of clear benefit. Point 3. is a simple afterthought
and thinking about it some more, I think that maybe it'd be possible to
do this with what we have right now: is type-checking separate parts of
the module supported? Can we retrieve documentation for the parts that
don't type-check?

I am asking for input on what people think. I am not familiar at all
with what goes into the .hi file (and I can't find anything concrete! Am
I missing some wiki page?) at all and why. At the very least, 1. should
be easy to implement.

It was suggested that I submit a proposal for this as part of GSoC,
namely implementing 1. and 2.. I admit that having much faster
documentation builds would be amazing and Edward K. and Carter S. seem
to think that this is very do-able in the 3 month period that GSoC runs
over.

While I say all this, I have already submitted my proposal on a
different topic. I am considering writing this up and submitting this as
well but I am looking for some insight into the problem first.

If there are any students around still looking for ideas, please do
speak up if you want to snatch this. If there are people that are eager
to mentor something like this then I suppose they should speak up too.

Thanks!




Re: Haddock strings in .hi files

2014-03-19 Thread Mateusz Kowalczyk
On 19/03/14 11:39, Simon Marlow wrote:
 On 18/03/2014 18:20, Mateusz Kowalczyk wrote:
 Hi all,

 I saw https://ghc.haskell.org/trac/ghc/ticket/5467 pop up in my inbox
 and it reminded me of something I've been wondering for a while: why do
 we not store Haddock docstrings in the interface file?

 I think that if we did, we could do some great things:

 1. Show docs in GHCi (I vaguely recall someone working on this ~1 year
 ago, does anyone have any info?)

 2. Allow Haddock to work a lot faster: the big majority of time spent
 when creating documentation is actually spent by Haddock calling various
 GHC functions, such as type-checking the modules. Only a small amount of
 time is actually spent by Haddock on other tasks such as parsing or
 outputting the documentation. If we could simply get everything we need
 from the .hi files, we save ourselves a lot of time.
 
 Don't you still have to run the renamer at least?  And in GHC, renaming 
 is tied up with typechecking, so it's hard to do one without the other. 
   Furthermore, if there is a missing type signature it's useful to be 
 able to put the inferred type in the documentation.  I think I'm missing 
 the point somewhere - how does putting docs in the .hi file let you 
 avoid typechecking?

This is a very good point and precisely why I have e-mailed ghc-devs
first. I think you're correct. I suppose that idea is out of the window!

 I'm not really sure I see the benefit.  If Haddock provided a library 
 that we can call from GHCi to get documentation, then we could show 
 documentation in GHCi.

Considering that (as you point out), we can't really get rid of the time
spent in GHC on renaming/type-checking, this does seem like the best way
to go now. I think there's an old ticket somewhere about providing such
a library that would let you work with Haddock interface files. I'll
investigate that approach in some spare time.

 The current design is intended to separate Haddock from GHC as much as 
 possible, but putting documentation in .hi files would be going in the 
 opposite direction.  There would have to be a compelling reason to do 
 that, something that we couldn't do another way.

I agree. Without being able to reap the major benefit of the idea (fast
docs), it is not worth doing it now.

 Cheers,
 Simon
 
 

Thanks!

-- 
Mateusz K.


Haddock strings in .hi files

2014-03-18 Thread Mateusz Kowalczyk
Hi all,

I saw https://ghc.haskell.org/trac/ghc/ticket/5467 pop up in my inbox
and it reminded me of something I've been wondering for a while: why do
we not store Haddock docstrings in the interface file?

I think that if we did, we could do some great things:

1. Show docs in GHCi (I vaguely recall someone working on this ~1 year
ago, does anyone have any info?)

2. Allow Haddock to work a lot faster: the big majority of time spent
when creating documentation is actually spent by Haddock calling various
GHC functions, such as type-checking the modules. Only a small amount of
time is actually spent by Haddock on other tasks such as parsing or
outputting the documentation. If we could simply get everything we need
from the .hi files, we save ourselves a lot of time.

3. Allow Haddock to create partial documentation: a complaint I
sometimes hear is if anything at all in the project doesn't type check,
we don't get any documentation at all. I think that it'd be viable to
generate only the documentation for the modules/functions that do
type-check and perhaps skip type signatures for everything else.
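
Point 3 can be sketched as a simple partition of per-module results. This is only an illustration under assumptions: `CheckResult` and `partialDocs` are hypothetical names, a failed module is represented by an error message, and a successful one by a list of (name, docstring) pairs.

```haskell
import Data.Either (partitionEithers)

type ModuleName = String

-- Hypothetical result of checking one module: an error message, or the
-- (name, docstring) pairs that could be extracted from it.
type CheckResult = Either String [(String, String)]

-- Split a package into modules that failed and modules we can document,
-- instead of giving up on the first failure.
partialDocs
  :: [(ModuleName, CheckResult)]
  -> ([(ModuleName, String)], [(ModuleName, [(String, String)])])
partialDocs results = partitionEithers (map tag results)
  where
    tag (m, Left err)   = Left (m, err)
    tag (m, Right docs) = Right (m, docs)
```

The first component could be reported as warnings while the second is rendered as usual.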

Points 1. and 2. are of clear benefit. Point 3. is a simple afterthought
and thinking about it some more, I think that maybe it'd be possible to
do this with what we have right now: is type-checking separate parts of
the module supported? Can we retrieve documentation for the parts that
don't type-check?

I am asking for input on what people think. I am not at all familiar
with what goes into the .hi file and why (and I can't find anything
concrete! Am I missing some wiki page?). At the very least, 1. should
be easy to implement.

It was suggested that I submit a proposal for this as part of GSoC,
namely implementing 1. and 2. I admit that having much faster
documentation builds would be amazing and Edward K. and Carter S. seem
to think that this is very do-able in the 3 month period that GSoC runs
over.

While I say all this, I have already submitted my proposal on a
different topic. I am considering writing this up and submitting this as
well but I am looking for some insight into the problem first.

If there are any students around still looking for ideas, please do
speak up if you want to snatch this. If there are people that are eager
to mentor something like this then I suppose they should speak up too.

Thanks!

-- 
Mateusz K.