Re: darcs patch: FIX #1463 (implement 'ghc-pkg find-module')

Duncan Coutts Sun, 11 Nov 2007 15:42:43 -0800

On Sun, 2007-11-11 at 11:22 +0000, Claus Reinke wrote:
> > whatever format it likes. But we do need some info on the installed
> > packages. We can do that now by calling ghc-pkg list and ghc-pkg
> > describe on each one. ghc-pkg describe returns a standard format defined
> > in Distribution.InstalledPackageInfo in the Cabal library.
> > 
> > All we're asking for is a slightly faster way of doing that. When we
> > have 150 installed packages calling ghc-pkg 150 times will take three
> > and a half minutes (when I had 150 hackage packages installed I clocked
> > ghc-pkg at 1.5s to describe any package). Even if ghc-pkg were faster,
> > it's still just more convenient to do queries ourselves than by asking
> > ghc-pkg all the time.
> 
> if we can sort out how the '--bulk' query option i implemented can 
> fit in despite the regex dependency, you would have that option.
> 
>     http://hackage.haskell.org/trac/ghc/ticket/1839#comment:3
> 
> but what then? do you dump, read, and parse all package databases
> for all installed ghcs (not to mention other haskell implementations) 
> every time cabal is run?


Yes, but just for the configured compiler if that's ghc. For other
implementations we use whatever they provide which at the moment is
nothing, though nhc is going to have a package database at some point.

> or do you store copies of all of them and keep them up-to-date? do you
> replicate all the functionality of ghc-pkg, ghc --make, hugs,
> nhc-make, ..?

Not of ghc-pkg, though we use a lot of the information it provides. We
are looking to replicate the functionality of ghc --make. Work on that
has been progressing recently.

> > As I see it, all of these cool features depend on Cabal being based on a
> > make-like system internally and doing dependency chasing itself. If you
> > follow cabal-devel you'll see we've started on some prototyping work in
> > that direction and we would welcome people to join in the fun.
> 
> i never understood how cabal came out without such a system
> in the first place.

Initially people implemented what was essential to get things working.
So they relied on ghc --make.

> but as it is now, i have to wonder: wouldn't it
> be easier to specify what cabal needs and have ways to expose
> the haskell implementation's dependency chasing results?
> 
> at what point does it become more useful to implement functionality
> in cabal rather than augmenting what exists and having cabal as a
> unifying interface? for everything beyond pure haskell import?

The reason we have to do it in Cabal is that there's no way otherwise to
deal properly with pre-processors. ghc --make does do cpp but it'd be
crazy to ask it to deal with all the other kinds of pre-processors.

The other reason we cannot just take the output of the compiler's dep
chasing is the issue I explained before about search path shadowing.

Also, if we had to rely only on ghc -M (or equivalent) for the .hs
dependencies then we would still require users to specify all the other
hidden modules rather than just the root exposed modules. This is
because dep chasing has to be interleaved with running pre-processors.

Foo.hs may import module Bar from Bar.hs but Bar.hs may not exist yet as
it may be generated from Bar.hs.pp (or .chs or .hsc or whatever) so
supposing we only know that module Foo is an exposed module, we cannot
find out about the Bar.hs dependency with ghc -M because ghc expects
Bar.hs to exist already! If we do the dep chasing per-module we can then
go and look for rule schema that might get us a Bar.hs from Bar.hs.pp
etc.
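
To make the interleaving concrete, here is a minimal sketch. It is not
the prototype's code: the names Source, findSource and chase are made
up for illustration, the file system is passed in as a plain list, and
importsOf stands in for "run the pre-processor if needed, then read the
imports". It also ignores hierarchical module names and search paths.

```haskell
import Data.Maybe (listToMaybe)

-- | How a module's .hs file can be obtained (hypothetical type).
data Source
  = Plain FilePath              -- ^ the .hs file already exists
  | Generated FilePath FilePath -- ^ the .hs to generate, and its input file
  deriving (Eq, Show)

-- | Illustrative pre-processor input extensions, tried in order.
ppExtensions :: [String]
ppExtensions = [".hs.pp", ".chs", ".hsc", ".ly"]

-- | Find the source for one module among the files known to exist.
-- Falls back to a pre-processor input when the .hs file is missing.
findSource :: [FilePath] -> String -> Maybe Source
findSource existing modName
  | hs `elem` existing = Just (Plain hs)
  | otherwise = listToMaybe
      [ Generated hs input
      | ext <- ppExtensions
      , let input = modName ++ ext
      , input `elem` existing ]
  where
    hs = modName ++ ".hs"

-- | Chase dependencies one module at a time, starting from the roots.
-- A module with no source at all is recorded as Nothing and its
-- imports are not followed.
chase :: [FilePath] -> (String -> [String]) -> [String]
      -> [(String, Maybe Source)]
chase existing importsOf = go []
  where
    go seen [] = reverse seen
    go seen (m:ms)
      | m `elem` map fst seen = go seen ms
      | otherwise =
          let src  = findSource existing m
              next = maybe [] (const (importsOf m)) src
          in go ((m, src) : seen) (ms ++ next)
```

With only Foo.hs and Bar.chs on disk and Foo importing Bar, chasing from
Foo finds Bar as a file to be generated from Bar.chs, which is exactly
the step that ghc -M cannot take.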

> > If Cabal can do dependency chasing then it can figure out the required
> > modules and packages. It could go further than what ghci can do by also
> > finding what pre-processors are necessary and the ultimate source files
> > for each module.
> 
> isn't that too optimistic? unless there is a specified path from 
> the Main module, say, any such auxiliaries would need to be
> specified explicitly either way, just as the paths in which to
> find them.

We can ask for the source search paths. That information already has to
be supplied in the .cabal file. If we cannot find all the dependencies
then it just fails and reports to the user where it looked. They can
then add the missing source dirs as necessary.
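
As a rough sketch of that behaviour (the function name locate and the
way the file system is passed in as a predicate are assumptions made
for this example, not Cabal's API), a lookup can return either the file
it found or the full list of places it tried:

```haskell
-- | Look for a module's .hs file in each source dir, in order.
-- On failure, return every candidate path that was tried so the
-- error message can tell the user where we looked.
locate :: (FilePath -> Bool) -> [FilePath] -> String
       -> Either [FilePath] FilePath
locate exists srcDirs modName =
  case filter exists candidates of
    (p:_) -> Right p
    []    -> Left candidates
  where
    candidates = [ dir ++ "/" ++ toPath modName ++ ".hs" | dir <- srcDirs ]
    -- Data.Foo lives in Data/Foo.hs
    toPath = map (\c -> if c == '.' then '/' else c)
```

A Left result carries enough information to print the list of
directories searched, which is what the user needs in order to fix the
source dirs.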

>  and even if you have a Parser.hs pointing back to
> a Parser.ly, you may end up with having to rename Parser.ly
> so that cabal doesn't touch it.. ;-)

I don't follow. Cabal should find the ultimate source file for each
module and should rebuild things as appropriate.

In this case it is slightly ambiguous what to do because Cabal never
puts generated files in the src dirs, so it actually doesn't generate
Parser.hs from Parser.ly, it generates dist/build/Parser.hs from
Parser.ly. So probably it should ignore Parser.hs and perhaps warn that
it is doing so.

> >> do you have a specification of what cabal needs from 
> >> an implementation, how it would like to query for that
> >> information, and an idea of how ghc, hugs, nhc, etc. 
> >> would implement that spec? 
> > 
> > We have some rough general ideas. Certainly a collection of
> > InstalledPackageInfo records is enough information. We need to be able
> > to do things like map module names to packages and possibly to
> > individual files to be able to track changes in installed packages which
> > might require a rebuild of the current code.
> 
> i don't think the cabal level has enough information to make
> such decisions - if i empty all modules in a package, cabal
> can't see the difference before calling the compiler;

I'm not sure what you mean. Do you mean breaking an installed package by
deleting .hi files that the package db thinks are exposed modules? Or do
you mean just changing a package to expose no modules? Or do you mean
making all exposed modules export nothing?

> nor can cabal see what dependencies were used to build a package,

Yes it can. The installed package info lists the dependent packages.
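
For illustration only, here is the kind of query this allows. Pkg is a
cut-down stand-in invented for this example; the real
InstalledPackageInfo record in the Cabal library has many more fields.

```haskell
-- | A simplified installed-package record (hypothetical).
data Pkg = Pkg
  { pkgName        :: String
  , exposedModules :: [String]
  , depends        :: [String]   -- names of the packages it was built against
  } deriving (Eq, Show)

-- | Which installed packages expose the given module?
providers :: [Pkg] -> String -> [String]
providers db m = [ pkgName p | p <- db, m `elem` exposedModules p ]

-- | A package together with everything it depends on, transitively.
closure :: [Pkg] -> String -> [String]
closure db root = go [] [root]
  where
    go seen [] = reverse seen
    go seen (n:ns)
      | n `elem` seen = go seen ns
      | otherwise     =
          go (n : seen) (ns ++ concat [ depends p | p <- db, pkgName p == n ])
```

Both queries work on a collection of records alone, without calling
the compiler.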

> let alone whether they are up to date. without such information,
> cabal can only call the compiler and ask it to bring a target
> up to date.



> also, do you want --make-style (sources point to their
> dependencies) or make-style (separate specification of
> dependencies)?

We want to automatically discover the dependencies. Separate
specifications of deps get out of sync.

> it might help to specify all the possible links in the intended 
> dependency chain, such as:
> 
>     - module A import B -> A.hs needs B.hs
>     - module A -- generated from A.ly -> A.hs needs A.ly
>     - module A -- generated from A.hsc -> A.hs needs A.hsc
>     - name:A; build-depends: B -> A.cabal needs B.cabal
>     ..

Yes, we've been calling these kinds of things rules. Or rule schema if
we're abstracting over the specific file name and only looking at the
file type.
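
One possible way to write that distinction down, purely as a sketch
(the types Rule and Schema and the helper names are invented for this
example and are not taken from the prototype):

```haskell
import Data.List (isSuffixOf)

-- | A concrete rule: this target is built from these inputs.
data Rule = Rule { target :: FilePath, inputs :: [FilePath] }
  deriving (Eq, Show)

-- | A rule schema abstracts over the file name and keeps only the
-- file types involved.
data Schema = Schema { fromExt :: String, toExt :: String }
  deriving (Eq, Show)

-- | Two of the schemas from the list above, as examples.
schemas :: [Schema]
schemas = [Schema ".ly" ".hs", Schema ".hsc" ".hs"]

-- | Turn a schema into a concrete rule for one base name.
instantiate :: Schema -> String -> Rule
instantiate (Schema from to) base = Rule (base ++ to) [base ++ from]

-- | Drop an extension from a path, if the path ends with it.
stripExt :: String -> FilePath -> Maybe String
stripExt ext path
  | ext `isSuffixOf` path = Just (take (length path - length ext) path)
  | otherwise             = Nothing

-- | All concrete rules that could produce the given target.
rulesFor :: [Schema] -> FilePath -> [Rule]
rulesFor ss tgt =
  [ instantiate s base | s <- ss, Just base <- [stripExt (toExt s) tgt] ]
```

Asking for rules that produce A.hs yields one candidate from A.ly and
one from A.hsc; which of them applies depends on which input file
actually exists.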

> and then to pin down how each of those is going to be
> specified, and to check whether there is sufficient information
> at the cabal level to make any decisions, and to find all
> dependencies.

I'm fairly confident that we've got enough information.

You can follow the work on the make prototype code. We've been posting
stuff to the cabal-devel mailing list.

http://haskell.org/pipermail/cabal-devel/2007-October/thread.html#1297

Just today we got our first QuickCheck property working. It asserts that
for a random dep graph, after making, every file is up to date wrt its
deps. Here's random dep graph #42:
http://www.haskell.org/~duncan/cabal/bar.svg
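
The shape of that property can be sketched with plain pure functions.
This is not the prototype's code: the graph and timestamp
representations are assumptions for the example, the graph must be
acyclic, and a real QuickCheck run would additionally need a generator
for random acyclic graphs.

```haskell
type File  = String
type Graph = [(File, [File])]  -- each target with its direct deps (acyclic)
type Times = [(File, Int)]     -- modification times; absent means time 0

timeOf :: Times -> File -> Int
timeOf ts f = maybe 0 id (lookup f ts)

-- | Every target is at least as new as each of its deps.
upToDate :: Graph -> Times -> Bool
upToDate g ts = and [ timeOf ts f >= timeOf ts d | (f, ds) <- g, d <- ds ]

-- | Bring one file up to date: build its deps first, then rebuild the
-- file itself (stamping it newer than everything) if any dep is newer.
build :: Graph -> Times -> File -> Times
build g ts f =
  let ds    = maybe [] id (lookup f g)
      ts'   = foldl (build g) ts ds
      stale = any (\d -> timeOf ts' d > timeOf ts' f) ds
      now   = 1 + maximum (0 : map snd ts')
  in if stale then (f, now) : filter ((/= f) . fst) ts' else ts'

-- | Bring every target in the graph up to date.
make :: Graph -> Times -> Times
make g ts = foldl (build g) ts (map fst g)

-- | The property: after making, every file is up to date wrt its deps.
prop_upToDate :: Graph -> Times -> Bool
prop_upToDate g ts = upToDate g (make g ts)
```

Because make is a pure function from timestamps to timestamps here, the
property can be checked without touching the file system.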

We're following the xmonad example and keeping the core pure,
parametrised over a monad. So we're testing in a pure monad and the real
thing will be a monad that's layered over IO.

We'll continue posting our ideas and prototype progress to cabal-devel
and we'd most appreciate critiques, advice and patches.

Duncan
