Select.hSelect

1999-09-11 Thread Sigbjorn Finne (Intl Vendor)


FYI to Haskell WishList listeners - I've checked in support for hSelect
to the CVS repository. Code in ghc/lib/misc/Select.lhs, documentation
in ghc/docs/users_guide/libmisc.vsgml. As always, commits haven't been
subject to a testing procedure that satisfies ISO 9001 requirements...
i.e., give it a go, if interested.

The need for Select.hSelect should be smaller now that the IO library is
more MT-friendly, but I hope it will still serve a useful role in some
contexts.
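
For the curious, here is a usage sketch. It is not part of the
announcement, and the signature is my assumption (modelled on select(2):
lists of handles to watch for reading, writing, and exceptional
conditions, plus a microsecond timeout); check Select.lhs for the real
interface.

   import Select (hSelect)   -- ghc/lib/misc, as announced above
   import IO (stdin)

   main :: IO ()
   main = do
     -- Wait up to one second for stdin to become readable.
     (readable, _, _) <- hSelect [stdin] [] [] (Just 1000000)
     putStrLn (if null readable then "timed out" else "stdin is ready")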

--sigbjorn



RE: Haskell Wish list: library documentation

1999-09-11 Thread trb

S. Alexander Jacobson writes:
> At 03:55 AM 9/9/99, Mark P Jones wrote:
> > Some folks out there want to use Haskell to write real programs.  For
> > them, there's Haskell 98.
>
> To be clear, I am not an academic researcher.  I develop real world
> web sites.  I would really like to use Haskell for this process, but the
> tools and libraries are not yet there to get this done in an effective
> manner.

My interest in Haskell is also for writing real programs, some of them business
applications. I'm also interested in using it on web sites (and CGI
applications). I have sometimes used Haskell in my professional life, and deeply
wish to renounce lesser languages forever - writing programs in Haskell is
interesting (C++ stopped interesting me about five years ago) and
mind-expanding.

> Much of the substance of our real world web development involves:
> 1. talking to databases
> 2. talking to other web servers (via http)
> 3. processing XML schema and data files
> 4. generating HTML/XML which mixes string constants with values
> derived from databases and xml files
>
> Haskell98 is good at none of these things, but Haskell so clearly could be!

Saying H98 is no good at these things is an overstatement, but I agree that it
is too lame compared with the much more elegant approaches possible given some
extensions (many of which are already in ghc).

> Putting aside sexy type system innovations, even basic stuff like the File
> and Directory Libraries clearly needs a major overhaul.  Absent a working
> FFI standard, development of the File and Directory libraries is happening
> out in space. Real world directory APIs, like LDAP, NDS, and Active
> Directory, are emerging, compatibility with which would be really nice.
> Javasoft has defined a reasonably sane JNDI spec which subsumes a lot of
> file and directory functionality.  If Lambada goes up, then it would be
> really sane for the standard File and Directory APIs to mirror JNDI more
> closely (and an easy implementation would be a shallow wrapper around the
> underlying JNDI functionality!).  I would make similar points about HTTP
> and SQL libs but there is no standard Haskell Net or Database module as
> there is in e.g. Java.  (however doesn't Lambada assume type system
> extensions?)

It is good for Haskell to be inter-operable with proprietary tools, but I don't
want it to depend on them to implement its standard libraries. I prefer Free
Software.

> Of course, it may be worth someone's time to construct collection classes
> to abstract out database semantics, but the fact that MPTC appears
> designed for the task would make any reasonable developer unwilling to
> restrict themselves to H98.

Indeed (except that extensible records are also needed to do a really good job
of database access).

> On the sexy type system front, the reason why I have been campaigning so
> hard for generic programming interfaces (either via Derive, PolyP, or a
> Categorical Prelude), is that they substantially facilitate much of the
> real work associated with talking to databases and XML files.  The actual
> application logic encoded into most of these apps is really small.  The
> hard part is building the plumbing to connect all the data sources and
> sinks together.  Generic programming interfaces mean that you don't spend
> enormous amounts of time manually translating Haskell datatypes into and
> out of SQL and XML schema (a desired blue sky application is automatic
> optimal compression of valid XML data files as an HTTP transfer encoding
> option).

I agree. Computer science and other scientific applications tend to be clever
programs such as compilers, where data is well structured and processing is
complex. But business applications typically have rather shallow processing,
with lots of semi-arbitrary types of data records to be edited, moved around,
and stored; handling these records often accounts for most of the code in a
business application. In these cases polytypic programming techniques can save a
lot of work. At present I'm hopeful that Derive can meet my polytypic needs,
since neither PolyP nor a categorical prelude looks likely to be usable soon.

One problem with both PolyP and the categorical prelude is that they do not seem
to support records with more than two fields. Of course, one can build up
multi-field records from nested two-field types (just like a binary tree), but
is this the right way, or just a workable way? Also, the categorical prelude
(I am not sure about PolyP) does not provide a way to get access to type/field
names (this is not interesting from a CS POV, but would be very useful for
automatically generated GUIs for editing records).
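
For concreteness, here is the nested two-field encoding spelled out
(my example, not from the original message):

   -- A flat three-field record ...
   data Person = Person { name :: String, age :: Int, email :: String }

   -- ... and the nested two-field representation (pairs of pairs)
   -- that a polytypic library would actually traverse:
   type PersonRep = (String, (Int, String))

   toRep :: Person -> PersonRep
   toRep (Person n a e) = (n, (a, e))

   fromRep :: PersonRep -> Person
   fromRep (n, (a, e)) = Person n a e

Note that the representation type has lost the field names, which is
exactly why automatically generated GUIs would need separate access to
them.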

> Yes, you can use various extensions that people have around, but the
> plumbing overhead of wrapping them into any individual project typically
> exceeds the overhead of doing the actual work.  The economies of scale you
> get from sharing work are just not there yet.

Nonetheless, I have no intention of 

Problems with Haskell 98 Random Specification/Implementation

1999-09-11 Thread Mark P Jones

To those who use or know about random numbers and Haskell:

A couple of months ago, John Hughes sent me mail about a problem that
he had uncovered with the implementation of the Random library in Hugs.
He had been using the "split" function in an attempt to generate a
stream of random number generators, each of which he hoped would be
different from the others.  But instead he found that he actually
ended up with many different copies of the *same* random number generator.
A disappointing, and frustratingly non-random, result.

If you don't happen to recall, split is a member of the RandomGen class,
with type split :: RandomGen g => g -> (g, g); it takes a single random
number generator as its argument, and returns a pair of two new generators as
its result.  The only thing that the specification requires is that the
two generators returned are (a) distinct and (b) `independently robust'
from a statistical point of view.  To the best of my knowledge, the
implementation in Hugs meets this modest specification.  Sadly, assuming
only this specification, you cannot write the function that John was
looking for and be sure that it will generate more than two different
generators.

For example, the specification allows even the following trivial
implementation for split:  split _ = (g1, g2), where g1 and g2 are some
arbitrary but constant pair of distinct, and independently robust
generators.  With this implementation, you can split as often as you
want and you'll never get more than two generators.

Hugs and GHC (as far as I can tell) both use definitions of the form:

   split g = (g, f g)

for some function f.  (My understanding of the code in GHC is that it
uses an unsafe function for f, breaking referential transparency; I hope
the optimizer knows about this.)  Note that this definition returns the
argument as a result; the specification doesn't prohibit that; all it
requires is that the two results returned be distinct.  But if you try
to generate a list of n different generators using:

   take n (iterate (fst . split) g)

then you will be sorely disappointed; you might as well have written
replicate n g.  (On the other hand, if you were lucky enough to have
used (snd . split), instead of (fst . split), then you wouldn't have
noticed the problem ...)
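
To see the problem concretely, here is a small test (mine, not from
Mark's message) using the Haskell 98 Random library:

   import Random

   -- n generators obtained by repeatedly keeping the first half of split:
   gens :: Int -> StdGen -> [StdGen]
   gens n g = take n (iterate (fst . split) g)

   main :: IO ()
   main = do
     let g = mkStdGen 42
     -- With split g = (g, f g), every generator in (gens 5 g) *is* g,
     -- so all five draws below print the same number.
     print (map (fst . next) (gens 5 g))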

I know very little about the mathematics or pragmatics of random
number generators, so I'm not sure that I know how to fix this
problem.  However, starting from this position of ignorance, I have
hacked up a new version of "split" for the standard "StdGen" that
will appear in the next release of Hugs (real soon now!).  Judging
from the tests that I've tried so far, it seems to work much
better than the old version.  That said:

 - Take care if you use Random.split in your programs, because it
   may not do what you expect.

 - There should probably be an erratum about this for the Haskell 98
   library report ... if somebody can figure out what it should say.

 - If you use Hugs, be aware that the implementation of Random.split
   was hacked up by someone who has no way of justifying that
   implementation, beyond some simple experiments.

 - If you know something about the mechanics of random number
   generators, here's an area where Haskell specifications and
   implementations could benefit from your knowledge!

All the best,
Mark






Re: 3 wishes

1999-09-11 Thread Jose Emilio Labra Gayo



On Fri, 10 Sep 1999, Simon Raahauge DeSantis wrote:

> On Fri, Sep 10, 1999 at 07:40:03PM +0200, Jose Emilio Labra Gayo wrote:
> >
> > 1.- Show "some" Warnings
> > ------------------------
[...]
> >
> > As an example, suppose you write a long program and you define:
> >
> >    f = "First definition . . . ."
> >
> > And you define two or more pages below a second and different
> > (while type correct) definition for f
> >
> >    f = "Second definition . . ."
> >
[...]

> Have you ever actually had this happen to you? I tried to load the following
> program into hugs:

[...example deleted...]

> And the result I got:
> ERROR "/tmp/foo.hs" (line 1): "l" multiply defined
>
> I think having function definitions in multiple places is forbidden by the
> report (though I haven't checked).
 
Well, it was just an example. There are a lot of common bugs that the
system can try to detect (I am talking about "warnings", not "errors").

Repeated definitions can be written as:

f x = "hello"

-- A long comment

f x = "bye"

And the system could report a warning (not an error). 

> > 2.- Try to recover from the first error
> > ---------------------------------------
> > Hugs could give a list of [line number]-[error message] instead
> > of reporting only the first error.
>
> My experience with this in C compilers (an older version of Symantec C/C++
> springs to mind, and also gcc) is that often fixing the first error clears
> up some of the subsequent error reports (i.e. they were caused by parsing
> breaking down). I think this would be especially true with the layout rule
> and all.
 
I think that it depends on the way you develop your program.
Sometimes you write it interactively (one function at a time)
and you only need the first error.
But sometimes you get a whole program and try to compile it.
I remember when I was adapting some programs from Haskell 1.4
to Haskell 98, and it would have been very useful if the system
had reported more errors than just the first one.

Regards, Jose Labra
http://lsi.uniovi.es/~labra





Functional Dependencies

1999-09-11 Thread Mark P Jones

[Simon mentioned my work on `functional dependencies' in one of his
messages a couple of days ago, so I thought I'd better post an
explanation!]

A couple of months ago, I developed and implemented an extension to
Hugs that has the potential to make multiple parameter type classes
more useful.  The key idea is to allow class declarations to be
annotated with `functional dependencies'---an idea that has previously
received rather more attention in the theory of relational databases.
For example, we could use the following declaration as the starting
point for a simple `collection' class library:

   class Collects e ce | ce -> e where
      empty  :: ce
      insert :: e -> ce -> ce
      member :: e -> ce -> Bool

The new part here is the clause "| ce -> e", which tells us that the
"ce" parameter (the type of the collection) uniquely determines the
"e" parameter (the type of the elements).  Dependencies like this
can be used to avoid ambiguity problems; to obtain more accurate, and
less cluttered inferred types; and to allow more general sets of
instances than constructor classes.  If this has got your interest,
you can find more information at http://www.cse.ogi.edu/~mpj/fds.html.
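
As a concrete illustration (my example, not from the original message),
a list instance of the class might read:

   instance Eq e => Collects e [e] where
     empty  = []
     insert = (:)
     member = elem

The dependency also shows up in inferred types: with "ce -> e" in place,
a definition like

   f x y coll = insert x (insert y coll)

gets the type (Collects e ce) => e -> e -> ce -> ce, rather than a more
cluttered type with two independent element type variables.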

Support for this extension is included in the imminent September 1999
release of Hugs 98.  In the meantime, Jeff Lewis, a colleague here at
OGI, has been putting together an implementation for GHC.  I've also
written a longer paper that gives more technical details, and explains
how this idea fits in with other work, but there are a few things I
want to do to that before it is ready for distribution, probably after
I get back from the Haskell Workshop.

I hope folks will find this useful.  A month or two back, somebody on
this list said that we had no `peer review' process ... people
just do something, then say "here it is, use it".  Well I've done
something ... here it is ... I hope that some of you will use it ...
and I do very much appreciate your feedback.  The peer review
starts today!

All the best,
Mark






RE: Implementation of type classes

1999-09-11 Thread Mark P Jones

Hi Heribert,

| The idea is that for every class assertion in the type of a variable,
| the variable gets an additional parameter that will be instantiated by a
| value representing the appropriate instance declaration. These values
| are tuples (let's call them "instance tuples") containing
| - one component for every superclass (viz. an instance tuple) and
| - one component for every class method (viz. a function with an
|   additional argument expecting the instance tuple)

This is exactly the scheme proposed by Wadler and Blott in their paper
"How to make ad-hoc polymorphism less ad-hoc"!  The things that you've
called "instance tuples" here are more commonly referred to as
"dictionaries", and so the translation that is used to implement
overloading by adding these extra dictionary parameters is known as
the "dictionary passing translation".  It is also the implementation
technique used in all current Haskell compilers/interpreters, as far
as I'm aware.  I'm not particularly proud of them, and they weren't
the main topic, but you can find a couple of chapters going into
many of the gory details of all this in my thesis (Chaps. 7 and 8, to
be precise).
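
A minimal sketch of the translation (my example, not Heribert's): the
class becomes a record of methods, each instance becomes a value of that
record type, and each overloaded function takes the record as an extra
argument.

   -- Source program:
   class MyEq a where
     eq :: a -> a -> Bool

   elemBy :: MyEq a => a -> [a] -> Bool
   elemBy x = any (eq x)

   -- After the dictionary-passing translation, written out by hand:
   data MyEqD a = MyEqD { eqD :: a -> a -> Bool }

   intEqD :: MyEqD Int            -- plays the role of: instance MyEq Int
   intEqD = MyEqD { eqD = (==) }

   elemByD :: MyEqD a -> a -> [a] -> Bool
   elemByD d x = any (eqD d x)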

[Historical aside: I did, however, produce a version of Gofer at one
stage that didn't use dictionaries: the compiler still used a
dictionary passing translation internally, but then followed that
with a specialization phase so that no dictionary values were used
at run-time.  To my initial surprise, it actually resulted in smaller
compiled programs in every example that I tried.  Two caveats: (1) the
technique doesn't work well with separate compilation, and would need
a very fancy linker; (2) the introduction of polymorphic recursion in
Haskell means that there are programs for which the technique cannot
be used.  You can read more about this in my paper on "Dictionary-free
overloading".]

| - Does this work completely as intended or have I missed something, such
|   as strictness problems or the like?

Well I hope I've dealt with all this above.  One thing to note,
however: when you add extra arguments to a variable definition, you
can end up with a function where there wasn't one before, and that
has an impact on which computations get shared, and which don't.
So it can have an impact, and that was one of the original motivations
for the "dreaded monomorphism restriction", which is still there
lurking in the definition of Haskell 98.
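
A small example of the sharing point (mine, not from the thread):

   -- With the monomorphism restriction, xs gets a single monomorphic
   -- type (here forced by the signature), is computed once, and is
   -- shared by both uses in total:
   xs :: [Integer]
   xs = [1 .. 100000]

   total :: Integer
   total = sum xs + product xs

   -- Without the restriction, an unannotated xs = [1 .. 100000] could
   -- get the overloaded type Num a => [a]; under the dictionary
   -- translation that is a *function* of a Num dictionary, so each use
   -- might rebuild the whole list.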

| - Does this still work if more complex features of the type/class system
|   are used?

Yes to all the things you listed.  (Except your last point --- "dependent
types" --- to which I'd reply that I'm not exactly sure what you were
referring to ... perhaps the kind of work that Hongwei Xi has been
doing?  Or maybe to the kinds of things that you've seen in Lennart's
Cayenne?  Or perhaps something else.)

| - Can the transformed program be evaluated as efficiently as the
|   original one? (The original program gets an advantage if some class
|   constraints are resolved at compile time. But perhaps a similar effect
|   for the transformed program can be achieved with GHC's transformation
|   rules.)

The real question here is how the program can be executed at all
*without* doing the translation.  This is one of the criticisms
that has been made of Haskell's class mechanisms, and one of the
motivations for other proposals.  You might, for example, want to
take a first look at Odersky, Wadler, and Wehr's "A second look
at overloading" to see the kind of compromises that you might need
to make and the benefits that you might hope to gain.

| - Do type classes provide an advantage with respect to expressiveness of
|   types? For example, can the types of an API be formulated in a more
|   flexible way with classes as compared to the transformed style?

They shouldn't, in theory, but along the route to Haskell, some
additional expressiveness snuck in.  Dictionary components in
Haskell, for example, can have polymorphic types (think of fmap
in the Functor class), but the components of a tuple in a regular
Haskell type cannot.  Similarly, before Haskell went official and
allowed polymorphic recursion, you could still code it up by
hacking with classes.  You can restore the balance by adding
support for rank-2 polymorphism, as is done in Hugs and GHC.
Once that's done you don't get any extra expressiveness by programming
with type classes, but it might still make some programs seem easier
to code (because you don't have to add the extra parameters yourself).
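
A sketch of that asymmetry (my example, in today's GHC syntax for the
rank-2 extension mentioned above):

   {-# LANGUAGE RankNTypes #-}

   -- fmap is polymorphic in a and b inside the Functor class, so the
   -- corresponding dictionary field needs a rank-2 type:
   data FunctorD f = FunctorD
     { fmapD :: forall a b. (a -> b) -> f a -> f b }

   listFunctorD :: FunctorD []
   listFunctorD = FunctorD { fmapD = map }

Without rank-2 polymorphism no ordinary tuple or record component could
be given fmap's polymorphic type.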

[Historical aside 2: when constructor classes were first introduced,
several people (myself included) wrote about the "Expressiveness
of constructor classes" ... it was only later that I realized how
much of that expressiveness came from rank-2 polymorphism instead
of higher-order kinds.  I wrote something about this in the tail
sections of my paper on "First-class polymorphism with type inference"
if you want to know more.]

| Most of this is probably well-known 

Re: Implementation of type classes

1999-09-11 Thread Jan Skibinski



On Sat, 11 Sep 1999, Heribert Schuetz wrote:

 
> Most of this is probably well-known stuff and written down in papers.
> Which ones? The Haskell report concentrates on the static semantics of
> classes and instances. It looks like the dynamic semantics is expected
> to be already understood by the reader or intuitively clear. Are the
> references [6] (Jones: A system of constructor classes: overloading and
> implicit higher-order polymorphism) and [11] (Wadler/Blott: How to make
> ad hoc polymorphism less ad hoc) of the report available on the Web?
 

Have you read "Typing Haskell in Haskell" by Mark P. Jones?
(It was written not so long ago, is available on the Web, and will be
presented at the Haskell Workshop 1999.)
It might help you with some of your questions.

Jan