Re: planning for ghc-6.10.1 and hackage [or: combining packages to yield new type correct programs]

Simon Marlow Thu, 02 Oct 2008 04:41:53 -0700

Don Stewart wrote:

Here's a summary of why this is non-trivial,


* We're trying to compose packages on the user's machine to yield new
  type correct programs.

* We're using cabal dependencies to decide when it is safe to do this.
  Hopefully we don't rule in any type incorrect combinations, nor rule
  out too many type correct combinations.

* This is scary - in fact, we think the package system admits type
  incorrect programs (based on Typeable, or initialised global state),
  as it is similar to the runtime linking problem for modules.

I think what you're referring to is the problem that occurs if the program
links in more than one copy of Data.Typeable, which would then invalidate
the assumptions that make Data.Typeable's use of unsafeCoerce safe. I
wouldn't call this "type-incorrect" - it's a violation of assumptions made
by Data.Typeable, not type-incorrectness in the Haskell sense.

But you'll be glad to know this doesn't happen anyway, because
Data.Typeable's state is held by the RTS these days, for exactly this reason.

However, there are libraries which do have private state (e.g.
System.Random). We'd prefer not to have more than one copy of the state,
but it's not usually fatal: in the case of System.Random, different clients
might get streams of random numbers initialised from different seeds, but
that's indistinguishable from sharing a single stream of random numbers.
Often this global-state stuff is for caching, which works just fine when
multiple clients use different versions of the library - it's just a bit
less efficient.
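
For illustration, private top-level state of this kind is usually written
with the unsafePerformIO idiom sketched below. The names counter and nextId
are hypothetical and are not taken from System.Random; the point is only
that each linked copy of such a module would get its own IORef.

```haskell
import Data.IORef (IORef, newIORef, atomicModifyIORef')
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical library-private state. NOINLINE keeps GHC from
-- duplicating the IORef by inlining the definition at use sites.
{-# NOINLINE counter #-}
counter :: IORef Int
counter = unsafePerformIO (newIORef 0)

-- Hand out the current value and bump the counter atomically.
nextId :: IO Int
nextId = atomicModifyIORef' counter (\n -> (n + 1, n))
```

If two versions of the defining package were linked into one program, each
would carry a separate counter, so identifiers would be unique per copy
rather than globally. Whether that matters depends on what the library
promised.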

* We use constraint solving to determine when composition is safe, by
  looking at "package >= 3 && < 4" style constraints. That is, we try to
  guess when the composition would yield a type correct program.

The way to make this completely safe is to ensure that the resulting
program only has one of each module, rather than one of each package
version - that's an approximation, because the package name might have
changed too.

Now, we want to relax this in various ways. One way is the base-3/base-4
situation, where base-3 has a lot of the same modules as base-4, but all
they do is re-export stuff from other packages. How do we know this is
safe? Well, we don't - the only way is to check whether the resulting
program typechecks.

Another way we want to relax it is when a dependency is "private" to a
package; that is, the package API is completely independent of the
dependency, and hence changing the dependency cannot cause compilation
failure elsewhere. We've talked in the past about how it would be nice to
distinguish private from non-private dependencies.

Let's be clear: there are only two ways that something could "go wrong"
when composing packages:


 1. the composition is not type-correct; you get a compile-time error

 2. some top-level state is duplicated; if the programmer has been
    careful in their use of unsafePerformIO, then typically this won't
    lead to a run-time error.

So it's highly unlikely you end up with a program that goes wrong at
runtime, and in those cases arguably the library developer has made
incorrect assumptions about unsafePerformIO.

* Again, we're using constraint solving on this language to
  determine when composition of Haskell module sets (aka packages) would
  yield type correct Haskell programs

  All without attempting to do type checking of the interfaces between
  packages -- the very thing that says whether this is sound!

True - but we already know that package/version pairs are a proxy for
interfaces, and subject to user failure. If the package says that it
compiles against a given package/version pair, there's no guarantee that it
actually does; that's up to the package author to ensure. Now obviously
we'd like something more robust here, but that's a separate problem - not
an unimportant one, but separate from the issue of how to make
cabal-install work with GHC 6.10.1.

cabal-install has to start from the assumption that all the dependencies
are correct. Then it can safely construct a complete program by combining
all the constraints, and additionally ensuring that the combination has no
more than one of each module (and possibly relaxing this restriction when
we know it is safe to do so).
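
As a rough sketch of the "no more than one of each module" check (this is
not how cabal-install is implemented; the plan representation and the name
duplicateModules are assumptions made for illustration), one could list the
modules that more than one package in an install plan provides:

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)

type PackageId  = String   -- e.g. "foo-1.0" (illustrative only)
type ModuleName = String

-- | Modules exposed by more than one package in the plan, each paired
--   with the packages that provide it. An empty result means the plan
--   has at most one of each module.
duplicateModules :: [(PackageId, [ModuleName])] -> [(ModuleName, [PackageId])]
duplicateModules plan =
  [ (m, map snd grp)
  | grp@((m, _) : _) <- groupBy ((==) `on` fst) (sortOn fst pairs)
  , length grp > 1
  ]
  where
    -- Flatten the plan into (module, providing package) pairs.
    pairs = [ (m, p) | (p, ms) <- plan, m <- ms ]
```

A relaxation such as base-3/base-4 would then amount to filtering known-safe
pairs out of this result before rejecting the plan.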

* So, the solver for cabal-install has to be updated to allow the same
  package to have multiple, conflicting versions, as long as version X
  depends on version Y, and then not reject programs that produce this
  constraint.


Right.

* This is non-trivial; we think this refactoring is possible, but it is
  hard. Ultimately we're still making optimistic assumptions about
  when module sets can be combined to produce type correct programs,
  and conservative assumptions, at the same time.
  What we need is a semantics for packages, that in turn uses a
  semantics for modules, that explains interfaces in terms of types.

The semantics is quite straightforward: a module is identified by the pair
(package-id, module name), and then you just use the Haskell module system
semantics. That is, replace all module names in the program with
(package-id, module name) pairs according to which packages are in scope in
the context of each module, and then proceed to interpret the program as in
Haskell 98.
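
A minimal sketch of that renaming step, assuming a simplified model in which
the packages in scope for a module are given as a list of exposed module
names (the types and the function resolve are hypothetical, not GHC's
actual data structures):

```haskell
type PackageId  = String
type ModuleName = String

-- | Packages in scope for one importing module, with the modules
--   each of them exposes.
type Scope = [(PackageId, [ModuleName])]

-- | Replace a bare module name by its (package-id, module name) pair.
--   Fails if no package in scope, or more than one, exposes the module.
resolve :: Scope -> ModuleName -> Either String (PackageId, ModuleName)
resolve scope m =
  case [ p | (p, ms) <- scope, m `elem` ms ] of
    [p] -> Right (p, m)
    []  -> Left ("module not in scope: " ++ m)
    ps  -> Left ("ambiguous module " ++ m ++ ": " ++ unwords ps)
```

Two modules with the same name from different packages then become distinct
identities, and the rest of the program can be read with the ordinary
Haskell 98 rules.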

The main problem with looking at things this way is that you need to see
the whole program - which is what I've been arguing against in the context
of instances. So I agree that looking for a semantics for packages that
lets you treat them as an abstract entity would be useful. Still, the
above interpretation of packages is a good starting point, because it tells
you whether a higher-level semantics is really equivalent.

* The end result is that cabal-install should be able to find automated
  install plans for packages that ask for base-3, even when base-4 is on
  the system as well, and it uses pieces of base-3 libraries and base-4
  libraries. Some more programs will work than if we didn't ship base-3.

So I'm not sure exactly how cabal-install works now, but I imagine you
could search for a solution with a backtracking algorithm, and prune
solutions that involve multiple versions of the same package, unless those
two versions are allowed to co-exist (e.g. base-3/base-4). If backtracking
turns out to be too expensive, then maybe more heavyweight
constraint-solving would be needed, but I'd try the simple way first.
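
To make the idea concrete, here is a small sketch of such a backtracking
search. It is not cabal-install's solver: versions are reduced to a single
Int, dependencies to a half-open range, and the names Dep, Index, solve and
mayCoexist are all assumptions made for illustration.

```haskell
import Data.Maybe (listToMaybe)

type Name    = String
type Version = Int                       -- major version only, a simplification
data Dep     = Dep Name Version Version  -- name, inclusive lower, exclusive upper
type Index   = [((Name, Version), [Dep])]
type Plan    = [(Name, Version)]

-- | Find the first install plan satisfying the goals. The list monad
--   supplies the backtracking; candidates are tried in index order.
solve :: (Name -> Bool) -> Index -> [Dep] -> Maybe Plan
solve mayCoexist index goals = listToMaybe (go goals [])
  where
    go [] plan = [plan]
    go (Dep n lo hi : rest) plan
      -- A version already in the plan satisfies this goal.
      | any inRange chosen = go rest plan
      -- Prune: another version is chosen and co-existence is not allowed.
      | not (null chosen) && not (mayCoexist n) = []
      -- Otherwise try every candidate version, adding its own deps as goals.
      | otherwise =
          [ result
          | ((n', v), deps) <- index
          , n' == n, inRange v
          , result <- go (deps ++ rest) ((n, v) : plan)
          ]
      where
        chosen    = [ v | (n', v) <- plan, n' == n ]
        inRange v = lo <= v && v < hi
```

With an index in which one library needs base 4 and another needs base 3,
the search succeeds only when mayCoexist returns True for "base"; with
co-existence disallowed it exhausts the candidates and returns Nothing.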

What happens with automatic flag assignments? Presumably we can decide
what the flag assignment for each package is up-front?


Cheers,
        Simon
_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
