Re: [Haskell-cafe] Correspondence between libraries and modules

2012-05-25 Thread Rustom Mody
On Wed, May 23, 2012 at 9:44 PM, Gregg Lebovitz glebov...@gmail.com wrote:

  Rustom,

 I am drafting a document that captures some of the social norms from the
 comments on this list, mostly from Brandon and Wren. I have captured the
 discussion about module namespace and am sorting out the comments on the
 relationship between libraries and packages.

 My initial question to the list was to try and identify where Haskell is
 different from other open source distributions. From what I can tell, the
 issues are very similar. The module name space seems to have
 characteristics very similar to the include file hierarchy of linux
 distributions.

 If you have some spare cycles and would like to contribute, I think
 everyone would appreciate your help and effort

 Gregg


Hi Gregg.

One of the common complaints one gets from a first year programming student
(and it's now about 3 decades that I have been dealing with these!) is:

The compiler/interpreter etc HAS a BUG!!!

So...
While I am an old geezer with programming and functional programming --
doing, teaching, playing, implementing, or just plain thinking -- I am
too much of a noob to ghc to risk falling into the 1st year student trap
above.

Yes perhaps not a typical noob...
Some things are easier for me than for the typical noob -- all the 'classical'
good stuff like pattern-matching, lambda-calculus, type-inferencing,
polymorphism etc.
And this is helpful for understanding the 'modern good stuff', starting with
monads and onwards.

But then I get hit -- finding my way round hackage, installing with cabal
etc -- even tho I'm an ol-time unix hacker and sysadmin-er.

So I guess it's best to assume (as of now) that I don't know the ropes rather
than that something is wrong/broken with them.

O well... If the noob trap is one error, playing it safe is probably another,
so here goes with me saying things that I (probably) know nothing about:
1. cabal was a beautiful system 10 years ago.  Now it's being forcibly
scaled up 2 (3?) orders of magnitude and is creaking at the seams
2. There are too many conflicting suggestions out there on the web for a noob
- use system install (eg apt-get) or use cabal
- cabal in user area or system area etc
- the problem is exponentiated by the absence of cabal uninstall
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Correspondence between libraries and modules

2012-05-25 Thread Chris Wong
Rustom:

 O well... If the noob trap is one error, playing it safe is probably another,
 so here goes with me saying things that I (probably) know nothing about:
 1. cabal was a beautiful system 10 years ago.  Now it's being forcibly scaled
 up 2 (3?) orders of magnitude and is creaking at the seams

The problem is, Cabal is not a package management system. The name
gives it away: it is the Common Architecture for *Building*
Applications and Libraries. Cabal is to Haskell what GNU autotools +
make are to C: a thin wrapper that checks for dependencies and invokes
the compiler. All that boring
not-making-your-package-break-everything-else stuff belongs to the
distribution maintainer, not Hackage and Cabal.

 2. There are too many conflicting suggestions out there on the web for a noob
     - use system install (eg apt-get) or use cabal

Use apt-get. Your distribution packages are usually new enough, have
been tested thoroughly, and most importantly, do not conflict with
each other.

     - cabal in user area or system area etc

Installing with --user is usually best, since user-installed packages won't
clobber system packages and if^H^Hwhen they do go wrong, you can simply rm -r
~/.ghc. For actual coding, it's better to use a sandboxing tool such
as [cabal-dev][] instead.

[cabal-dev]: http://hackage.haskell.org/package/cabal-dev

     - the problem is exponentiated by the absence of cabal uninstall

See above.

By the way, someone else wrote a whole article about it:
https://ivanmiljenovic.wordpress.com/2010/03/15/repeat-after-me-cabal-is-not-a-package-manager/

Hope that clears it up for you.

Chris



Re: [Haskell-cafe] Correspondence between libraries and modules

2012-05-23 Thread Rustom Mody
On Tue, Apr 24, 2012 at 7:29 PM, Gregg Lebovitz glebov...@gmail.com wrote:



 On 4/23/2012 10:17 PM, Brandon Allbery wrote:

  On Mon, Apr 23, 2012 at 17:16, Gregg Lebovitz glebov...@gmail.com wrote:

  On 4/23/2012 3:39 PM, Brandon Allbery wrote:

   The other dirty little secret that is carefully being avoided here is
 the battle between the folks for whom Haskell is a language research
 platform and those who use it to get work done.  It's not entirely
 inaccurate to say the former group would regard a fragmented module
 namespace as a good thing, specifically because it discourages people from
 considering it to be stable

  Brandon, I find that a little hard to believe.  If the issues are
 similar to other systems and languages, then  I think it is more likely
 that no one has volunteered to work on it.  You volunteering to help?



 Does haskell/hackage have something like debian's lintian?

Debian has a detailed policy document that keeps evolving:
http://www.debian.org/doc/debian-policy/
Lintian tries hard to automate (as much as possible) policy-compliance
http://lintian.debian.org/manual/index.html

Eg how packages should use the file system
 http://tldp.org/LDP/Linux-Filesystem-Hierarchy/html/
Even 'boring' legal stuff like license-checking is somewhat automated
http://dep.debian.net/deps/dep5/

And most important are the dos and don'ts for package dependencies, which
make possible nice pics like http://collab-maint.alioth.debian.org/debtree/

Of course as Wren pointed out, the Linux communities have enough manpower
to police their distributions which haskell perhaps cannot.

My question is really: Would not something like a haskell-lintian make such
sanity checking easier and more useful for everyone?


Re: [Haskell-cafe] Correspondence between libraries and modules

2012-05-23 Thread Gregg Lebovitz

  
  
Rustom,

I am drafting a document that captures some of the social norms from
the comments on this list, mostly from Brandon and Wren. I have
captured the discussion about module namespace and am sorting out
the comments on the relationship between libraries and packages.

My initial question to the list was to try and identify where Haskell
is different from other open source distributions. From what I can
tell, the issues are very similar. The module name space seems to
have characteristics very similar to the include file hierarchy of
linux distributions.

If you have some spare cycles and would like to contribute, I think
everyone would appreciate your help and effort

Gregg

On 5/23/2012 4:24 AM, Rustom Mody wrote:

 On Tue, Apr 24, 2012 at 7:29 PM, Gregg Lebovitz glebov...@gmail.com wrote:

  On 4/23/2012 10:17 PM, Brandon Allbery wrote:

   On Mon, Apr 23, 2012 at 17:16, Gregg Lebovitz glebov...@gmail.com wrote:

    On 4/23/2012 3:39 PM, Brandon Allbery wrote:

     The other dirty little secret that is carefully being avoided here is
     the battle between the folks for whom Haskell is a language research
     platform and those who use it to get work done. It's not entirely
     inaccurate to say the former group would regard a fragmented module
     namespace as a good thing, specifically because it discourages people
     from considering it to be stable

    Brandon, I find that a little hard to believe. If the issues are
    similar to other systems and languages, then I think it is more likely
    that no one has volunteered to work on it. You volunteering to help?

 Does haskell/hackage have something like debian's lintian?

 Debian has a detailed policy document that keeps evolving:
 http://www.debian.org/doc/debian-policy/
 Lintian tries hard to automate (as much as possible) policy-compliance
 http://lintian.debian.org/manual/index.html

 Eg how packages should use the file system
 http://tldp.org/LDP/Linux-Filesystem-Hierarchy/html/
 Even 'boring' legal stuff like license-checking is somewhat automated
 http://dep.debian.net/deps/dep5/

 And most important is the dos and donts for package dependency making
 possible nice pics http://collab-maint.alioth.debian.org/debtree/

 Of course as Wren pointed out, the Linux communities have enough manpower
 to police their distributions which haskell perhaps cannot.

 My question is really: Would not something like a haskell-lintian make such
 sanity checking easier and more useful for everyone?



Re: [Haskell-cafe] Correspondence between libraries and modules

2012-05-07 Thread Ben Franksen
Alvaro Gutierrez wrote:
 I've only dabbled in Haskell, so please excuse my ignorance: why isn't
 there a 1-to-1 mapping between libraries and modules?
 
 As I understand it, a library can provide any number of unrelated modules,
 and conversely, a single module could be provided by more than one
 library. I can see how this affords library authors more flexibility, but
 at a cost: there is no longer a single, unified view of the library
 universe. (The alternative would be for every module to be its own,
 hermetic library.) So I'm very interested in the rationale behind that
 aspect of the library system.

I am probably repeating arguments brought forward by others, but I really 
like that the Haskell module name space is ordered along functionality 
rather than authorship. If I ever manage to complete an implementation of the
EPICS pvData project in Haskell, it will certainly inherit the Java module 
naming convention and thus will contain modules named Org.Epics.PvData.XXX, 
*but* if I need to add utility functions to the API that are generic list 
processing functions they will certainly live in the Data.List.* name space 
and if I need to add type level stuff (which is likely) it will be published 
under Data.Type.* etc. This strikes me as promoting re-use: makes it far 
easier and more likely to factor out these things into a separate general 
purpose library or maybe even integrate them into a widely known standard 
library. It also gives you a much better idea what the thing you export is 
doing than if it is from, say, Org.Epics.PvData.Util. Finally, it gives the 
package author an incentive to actually do the refactoring that makes it 
obvious where the function belongs to, functionally.
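
To make the naming argument concrete, here is a minimal sketch of the kind of generic helper that would be placed by function rather than by author. The module name Data.List.Chunks and the function chunksOf are hypothetical, chosen only for illustration; they are not taken from pvData or from any particular existing library.

```haskell
-- Hypothetical file Data/List/Chunks.hs. In a real package it would begin
-- with:  module Data.List.Chunks (chunksOf) where
-- Nothing below mentions EPICS or pvData, so a name under Data.List.*
-- describes it better than Org.Epics.PvData.Util would.

-- | Split a list into consecutive pieces of length n; the last piece may
-- be shorter. A non-positive n yields no pieces.
chunksOf :: Int -> [a] -> [[a]]
chunksOf n xs
  | n <= 0 || null xs = []
  | otherwise         = piece : chunksOf n rest
  where
    (piece, rest) = splitAt n xs
```

Because the function has no dependency on the rest of the project, moving it into a separate general purpose package later would only change which package provides the module, not the import lines of its clients.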

Cheers
Ben




Re: [Haskell-cafe] Correspondence between libraries and modules

2012-04-26 Thread Gregg Lebovitz

On 4/24/2012 11:44 PM, wren ng thornton wrote:
To pick another similar namespacing issue, consider the problem of 
Google Code. In Google Code there's a single namespace for projects, 
and the Google team spends a lot of effort on maintaining that 
namespace and resolving conflicts. (I know folks who've worked in the 
lab next door to that team. So, yes, they do spend a lot of work on 
it.) Whereas if you consider BitBucket or GitHub, each user is given a 
separate project namespace, and therefore the only thing that has to 
be maintained is the user namespace--- which has to be done anyways in 
order to deal with logins. The model of Google Code, SourceForge, and 
Java all assume that projects and repositories are scarce resources. 
Back in the day that may have been true (or may not), but today it is 
clearly false. Repos are cheap and everyone has a dozen side projects.


Actually, I like the idea of combining an assigned user name with the
repo name as the namespace. We already have login names for haskell.org,
so why not use those? I agree that it is not an end-all, but it would be a
start. My top level namespace would be Org.Haskell.Glebovitz. It is
democratic and it identifies the code by the repo and the user that
created it. If someone else decided to use their github id, then their
modules would be org.github.username or org.github.project. Of course
people can choose to ignore the namespace common practice, but they can
do that anyway.




If you look at the case of Perl and CPAN, there's the same old story: 
universal authority. Contrary to Java, CPAN does very much actively 
police (or rather, vet) the namespace. However, this extreme level of
policing requires a great deal of work and serves to drive away a 
great many developers from publishing their code on CPAN.


I'm not as familiar with the innards of how various Linux distros 
manage things, but they're also tasked with the additional burden of 
needing to pull in stuff from places like CPAN, Hackage, etc. Because 
of that, their namespace situation seems quite different from that of 
Hackage or CPAN on their own. I do know that Debian at least (and 
presumably the others as well) devote a great deal of manpower to all 
this.


Yes, but that goes back to my comments about upstream and downstream. 
Hackage can try to solve the problem for itself, but eventually someone 
is going to put together a distribution, whether it be Ubuntu or
Microsoft, and they will have to sort out the name collisions for their
packages and modules. If we have a good naming scheme to start with, it 
will make the downstream problem a bit easier. Even so, they will 
probably change it anyways. I know that ubuntu and fedora take different 
approaches to packaging. When I try to use a package like Qt on these 
different platforms, I have to figure out which package contains which 
library.




So we have (1) the Java model where there are rules that no one
follows; (2) the Google Code, CPAN, and Linux distro model of devoting 
a great deal of community resources to maintaining the rules; and (3) 
the BitBucket, GitHub, Hackage model of having few institutionalized 
rules and leaving it to social factors. The first option buys us 
nothing over the last, excepting a false sense of security and the 
ability to alienate private open-source developers.



I think my combo of formalized namespace and social rules would work 
best here.  The problem is that we do have module collisions because the 
namespace is too simple. Right now it is not an issue because the 
community is not huge. Eventually it will be a problem if Haskell 
popularity grows.


There is no technical solution to this problem, at least not any used 
by the communities you cite. The only solutions on offer require a 
great deal of human effort, which is always a 
social/political/economic matter. The only technical avenues I see are 
ways of making the problem less problematic, such as GitHub and 
BitBucket distinguishing the user namespace from each user's project 
namespace, such as the -XPackageImports extension (which is 
essentially the same as GitHub/BitBucket), or such as various ideas 
about using tree-grafting to rearrange the module namespace on a 
per-project basis thereby allowing clients to resolve the conflicts 
rather than requiring a global solution. I'm quite interested in that 
last one, though I don't have any time for it in the foreseeable future.
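
For readers who have not seen the -XPackageImports extension mentioned above, a minimal sketch follows. It uses GHC's package-qualified import syntax to say which package a module should come from; "containers" is the real package that provides Data.Map, while wordCounts is just an illustrative name. It assumes GHC with the containers package visible.

```haskell
{-# LANGUAGE PackageImports #-}

-- Name the providing package explicitly, so the import stays unambiguous
-- even if another visible package also exposes a module called Data.Map.
import qualified "containers" Data.Map as Map

-- | Count how often each word occurs in the input list.
wordCounts :: [String] -> Map.Map String Int
wordCounts ws = Map.fromListWith (+) [(w, 1) | w <- ws]
```

This does not remove the collision; it only lets the client pick a side when two packages claim the same module name.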


There probably is a technical solution, but no one is going to discover 
it and build it anytime soon. I think we all agree that a centralized
global solution is out. No one would want to manage it. I do think the 
repo.username namespace has potential. The problem is that informal 
social convention works if the community is small. Once it starts to 
grow it has to be codified to some degree.





Re: [Haskell-cafe] Correspondence between libraries and modules

2012-04-25 Thread Gregg Lebovitz



On 4/24/2012 11:49 PM, wren ng thornton wrote:

On 4/24/12 9:59 AM, Gregg Lebovitz wrote:

The question of how to support rapid innovation and stable deployment is
not an us versus them problem. It is one of staging releases. The Linux
kernel is a really good example. The Linux development team innovates
faster than the community can absorb it. The same was true of the GNU
team. Distributions addressed the gap by staging releases.


In that case, what you are interested in is not Hackage (the too-fast 
torrent of development) but rather the Haskell Platform (a policed set 
of stable/core libraries with staged releases).


No, that was not what I was thinking because a stable policed set of 
core libraries is at the opposite end of the spectrum from how you 
describe Hackage. What I am suggesting is a way of creating an upstream 
that feeds increasingly stable code into an ever increasing set of 
stable and useful components. Using the current open system model, the 
core compiler team for gcc releases the compiler along with the libgcc
and libstdc++ libraries. The GNU folks release more useful libraries,
and then projects like GNOME build on the other components. Right now we 
have Hackage that moves too fast and the Haskell core that rightfully
moves more slowly.


Maybe the answer is to add a rating system to Hackage and mark packages 
as experimental, unsupported, and supported, or use a 5 star rating 
system like the app store. Later on when we have appropriate testing 
tools, we can include a rating from the automated tests.




I forget who the best person to contact is these days if you want to 
get involved with helping the HP, but I'm sure someone on the list 
will say shortly :)






Re: [Haskell-cafe] Correspondence between libraries and modules

2012-04-24 Thread Gregg Lebovitz

  
  


On 4/23/2012 10:17 PM, Brandon Allbery wrote:

 On Mon, Apr 23, 2012 at 17:16, Gregg Lebovitz glebov...@gmail.com wrote:

  On 4/23/2012 3:39 PM, Brandon Allbery wrote:

   The other dirty little secret that is carefully being avoided here is
   the battle between the folks for whom Haskell is a language research
   platform and those who use it to get work done.  It's not entirely
   inaccurate to say the former group would regard a fragmented module
   namespace as a good thing, specifically because it discourages people
   from considering it to be stable

  Brandon, I find that a little hard to believe.  If the issues are
  similar to other systems and languages, then I think it is more likely
  that no one has volunteered to work on it.  You volunteering to help?

 Yes, you do find it hard to believe; so hard that you went straight past
 it and tried to point to the "easy" technical solution to the problem you
 decided to see in place of the real one, which doesn't have a technical
 solution.

Brandon, I am very glad to make your acquaintance. I think you have
given these issues much thought. That is good.

No, I don't think I "went straight past it". I think we are trying to
address the same issue, but from different directions. If you take
the time to look at my history, you'll find that I spent my career
bridging the very gap you make so very salient.

Here's where we differ: you see an untenable political issue, and I
see a technical one. The question of how to support rapid innovation
and stable deployment is not an us versus them problem. It is one of
staging releases. The Linux kernel is a really good example. The
Linux development team innovates faster than the community can
absorb it. The same was true of the GNU team. Distributions
addressed the gap by staging releases.

I fought this very battle in the 1980s with the Andrew system. The
technology coming out of the ITC (research community) was evolving
faster than users could absorb. Researchers want to innovate and
push the limits and users want stability. I've spoken with many in
the Haskell research community, and I never heard anyone say "no, we
want to obfuscate Haskell so that we never have to make it stable."
I think both communities want success. The question is how to build
a system that will address both.

From your history, I see you are knowledgeable and well known on the
deployment side of technology. You also understand what Haskell
needs to move forward. So I ask you again, are you volunteering to
help?


  

  


  
  

  

  




Re: [Haskell-cafe] Correspondence between libraries and modules

2012-04-24 Thread wren ng thornton

On 4/23/12 11:39 AM, Gregg Lebovitz wrote:

On 04/23/2012 12:03 AM, wren ng thornton wrote:

However, until better technical support is implemented (not just for
GHC, but also jhc, UHC,...) it's best to follow social practice.


Wren, I am new to Haskell and not aware of all of the conventions. Is
there a place where I can find information on these social practices?
Are they documented some place?


Not that I know of, though they're fairly standard for any open-source 
programming community. E.g., when it comes to module names: familiarize 
yourself with what's out there; try to fit in with the patterns you 
see[1]; don't intentionally clash, steal namespaces[2], or squat on 
valuable territory[3]; be reasonable and conscientious when interacting 
with people.



[1] e.g., the use of Data.* for data structures which are 
predominantly/universally treated as such, vs the use of Control.* for 
things which are often thought of as control structures (monads, etc). 
The use of Foo.Bar.Strict and Foo.Bar.Lazy when you provide both strict 
and lazy versions of some whole API, usually with Foo.Bar re-exporting 
whichever one seems the sensible default. The use of Foo.Bar.Class to 
resolve circular import issues when defining a class and a bunch of 
datatypes with instances. Etc.


[2] I mean things like if some package is providing a bunch of Foo.Bar.* 
modules, and it's the only one doing so, then you should try to get in 
touch with the maintainer before you start publishing your own Foo.Bar.* 
modules--- in order to collaborate, to send patches up-stream, or just 
to let them know what's going on.


[3] Witness an unintentional breach of this myself a while back. When I 
was hacking up the exact-combinatorics package for my own use, I put 
things in Math.Combinatorics.* since that's a reasonable place and 
wasn't in use; but I didn't think of that fact when I decided to publish 
the code. When pointed out, I promptly moved everything to 
Math.Combinatorics.Exact.* since that project is only interested in 
exact combinatorics and I have no intention of codifying all of 
combinatoric theory; hence using Math.Combinatorics.* would be squatting 
on very valuable names.
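
As an illustration of the Foo.Bar.Strict / Foo.Bar.Lazy pattern from footnote [1], the containers package follows it with Data.Map.Strict and Data.Map.Lazy, and, as far as I know, Data.Map itself re-exports the lazy API as the default. The sketch below only shows the observable difference between the two variants; the names lazySize and strictForces are made up for this example, and it assumes a containers version that ships both modules.

```haskell
import Control.Exception (ErrorCall, evaluate, try)
import qualified Data.Map.Lazy as ML
import qualified Data.Map.Strict as MS

-- | The lazy API stores the value as an unevaluated thunk, so asking for
-- the size never touches it.
lazySize :: Int
lazySize = ML.size (ML.insert "k" (error "never forced" :: Int) ML.empty)

-- | The strict API evaluates the value on insertion, so the same call
-- raises the error. Returns True when the error was observed.
strictForces :: IO Bool
strictForces = do
  r <- try (evaluate (MS.size (MS.insert "k" (error "forced" :: Int) MS.empty)))
         :: IO (Either ErrorCall Int)
  return (either (const True) (const False) r)
```

Both modules share the same Map type and nearly the same function names, which is what makes it practical for a top-level module to re-export whichever one is the sensible default.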




However, centralization is prone to bottlenecks and systemic failure.
As such, while it would be nice to ensure that a given module is
provided by only one package, there is no mechanism in place to
enforce this (except at compile time for the code that links the
conflicting modules together).


 From someone new to the community, it seems that yes centralization has
its issues, but it also seems that practices could be put in place that
minimize the bottlenecks and systemic failures.

Unless I greatly misunderstand the challenges, there seem to be a lot of
ways to approach this problem and none of them are new. We all use
systems that are composed of many modules neatly combined into complete
systems. Linux distributions do this well. So does Java. Maybe we should
borrow from their experiences and think about how we put packages
together and what mechanisms we need to resolve inter-package dependencies.


Java attempts to resolve the issue by imposing universal authority (use 
reverse urls for the first part of your package name). Many Java 
developers flagrantly ignore that claim to authority. Sun/Oracle has no 
interest in actually policing these violations, and there's no central 
repository for leveraging social pressure to do it. Moreover, 
open-source developers who do not have a commercial/institutional 
affiliation are specifically placed in a tough spot, and are elided from 
public discourse because of that fact, which is extremely problematic on 
too many levels to get into here. Furthermore, many developers 
---especially among open-source and academic authors--- have an inherent 
distrust for ambient authority like this.


To pick another similar namespacing issue, consider the problem of 
Google Code. In Google Code there's a single namespace for projects, and 
the Google team spends a lot of effort on maintaining that namespace and 
resolving conflicts. (I know folks who've worked in the lab next door to 
that team. So, yes, they do spend a lot of work on it.) Whereas if you 
consider BitBucket or GitHub, each user is given a separate project 
namespace, and therefore the only thing that has to be maintained is the 
user namespace--- which has to be done anyways in order to deal with 
logins. The model of Google Code, SourceForge, and Java all assume that 
projects and repositories are scarce resources. Back in the day that may 
have been true (or may not), but today it is clearly false. Repos are 
cheap and everyone has a dozen side projects.


If you look at the case of Perl and CPAN, there's the same old story: 
universal authority. Contrary to Java, CPAN does very much actively 
police (or rather, vet) the namespace. However, this extreme level of
policing requires a great deal of work and serves to drive away a great 
many 

Re: [Haskell-cafe] Correspondence between libraries and modules

2012-04-24 Thread wren ng thornton

On 4/24/12 9:59 AM, Gregg Lebovitz wrote:

The question of how to support rapid innovation and stable
deployment is not an us versus them problem. It is one of staging releases. The
Linux kernel is a really good example. The Linux development team innovates
faster than the community can absorb it. The same was true of the GNU team.
Distributions addressed the gap by staging releases.


In that case, what you are interested in is not Hackage (the too-fast 
torrent of development) but rather the Haskell Platform (a policed set 
of stable/core libraries with staged releases).


I forget who the best person to contact is these days if you want to get 
involved with helping the HP, but I'm sure someone on the list will say 
shortly :)


--
Live well,
~wren



Re: [Haskell-cafe] Correspondence between libraries and modules

2012-04-24 Thread wren ng thornton

On 4/23/12 3:06 PM, Alvaro Gutierrez wrote:

I see. The first thing that comes to mind is the notion of module
granularity, which of course is subjective, so whether a single module or
multiple ones should handle e.g. doubles and integrals is a good question;
are there guidelines as to how those choices are made?


I'm not sure if there are any guidelines per se; that's more of a 
general software engineering problem. If you browse around on Hackage 
you'll get a fairly good idea what the norms are though. Everyone seems 
to have settled on a common range of scope--- with notable exceptions 
like the containers library with far too many functions per module, and 
some of Ed Kmett's work on category theory which tends towards very few 
declarations per module.



At any rate, why do these modules, with sufficiently-different
functionality, live in the same library -- is it that they share some
common bits of implementation, or to ease the management of source code?


I contacted Don Stewart (the former maintainer) to see whether he 
thought I should release the integral stuff on its own, or integrate it 
into bytestring-lexing. We agreed that it made more sense to try to 
build up a core library for lexing various common data types, rather 
than having a bunch of little libraries. He'd just never had time to get 
around to developing bytestring-lexing further; so I took over.


Eventually I plan to add rendering functions for floating point, and to 
split up the parsers for different floating point formats[1], so that it 
more closely resembles the integral stuff. But that won't be until this 
fall or later, unless someone requests it sooner.



[1] Having an omni-parser can be helpful when you want to be liberal 
about your input. But when you're writing parsers for a specified 
format, usually they're not that liberal so we need to offer restricted 
lexers in order to give code reuse.
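
A rough sketch of the idea in footnote [1], using integers on plain Strings rather than floating point on ByteStrings to keep it short. This is not the bytestring-lexing API; every name below (lexBase, lexDecimal, lexHexadecimal, lexLiberal) is hypothetical. The restricted lexers are the reusable pieces, and the liberal one is built on top of them.

```haskell
import Data.Char (digitToInt, isDigit, isHexDigit)

-- | Shared helper: lex a non-empty run of digits accepted by `ok` in the
-- given base, returning the value and the unconsumed input.
lexBase :: Integer -> (Char -> Bool) -> String -> Maybe (Integer, String)
lexBase base ok s = case span ok s of
  ([], _)    -> Nothing
  (ds, rest) -> Just (foldl step 0 ds, rest)
  where
    step acc d = acc * base + toInteger (digitToInt d)

-- | Restricted lexer: decimal digits only.
lexDecimal :: String -> Maybe (Integer, String)
lexDecimal = lexBase 10 isDigit

-- | Restricted lexer: hexadecimal digits only, without any prefix.
lexHexadecimal :: String -> Maybe (Integer, String)
lexHexadecimal = lexBase 16 isHexDigit

-- | Liberal lexer: accept "0x"-prefixed hexadecimal, otherwise decimal.
lexLiberal :: String -> Maybe (Integer, String)
lexLiberal ('0' : x : rest)
  | x `elem` "xX", Just r <- lexHexadecimal rest = Just r
lexLiberal s = lexDecimal s
```

A parser for a format that only allows decimal can call lexDecimal directly and reject everything else, while a forgiving command-line tool can use lexLiberal.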




When dealing with FFI code, because of the impedance mismatch between
Haskell and imperative languages like C, it's clear that there's going to
be some massaging of the API beyond simply declaring FFI calls. As such,
clearly we'd like to have separate modules for doing the low-level binding
vs presenting a high-level API. Moreover, depending on what you're
interfacing with, you may be forced to have multiple low-level modules.


Ah, that's a good use case. Is the lower-level module usually made public
as well, or is it only an implementation detail?


Depends on the project. For ByteStrings, most of that is hidden away as 
implementation details. For binding to C libraries, I think the current 
advice is to offer the low-level interface so that if there's something 
the high-level interface can't handle well, people have some easy recourse.
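
The low-level versus high-level split can be sketched with a single C function. Here the raw binding to strlen from the C standard library plays the part of the low-level module, and a small wrapper plays the part of the high-level API; in a real package the two would normally live in separate modules. The names c_strlen and byteLength are made up for this example, and it assumes GHC on a platform with a standard libc.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CSize (..))

-- Low-level binding: mirrors the C signature and exposes C types directly.
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- | High-level wrapper: takes an ordinary String, handles marshalling, and
-- returns an Int. The result counts bytes in the marshalled C string, so it
-- can differ from the character count for non-ASCII input.
byteLength :: String -> IO Int
byteLength s = withCString s (\cstr -> fmap fromIntegral (c_strlen cstr))
```

Exporting c_strlen as well as byteLength is the "easy recourse" mentioned above: a client who needs to call strlen on a CString obtained elsewhere can do so without re-declaring the binding.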




On the other hand, the main purpose of packages or libraries is as a unit of
distribution, code reuse, and separate compilation. Even with the Haskell
culture of making small libraries, most worthwhile units of
distribution/reuse/compilation tend to be larger than a single
namespace/concern. Thus, it makes sense to have more than one module per
package, because otherwise we'd need some higher level mechanism in order
to manage the collections of package-modules which should be considered a
single unit (i.e., clients will almost always want the whole bunch of them).


This is the part that I'm trying to get a better sense of. I can see how in
some cases, it makes sense for more than one module to form a unit, because
they are tightly coupled semantically or implementation-wise -- so clients
will indeed want the whole bunch. On the other hand, several libraries
provide modules that are all over the place, in a way that doesn't form a
unit of any kind (e.g. MissingH), and it's not clear that you would want
any Network stuff when all you need is String utilities.


Yeah, MissingH and similar libraries are just grab-bags full of stuff. 
Usually grab-bag libraries think of themselves as place-holders, with 
the intention of breaking things out once there's something of a large 
enough size to warrant being its own package. (Whether the breaking out 
actually happens is another matter.) But to get the general sense of 
things, you should ignore them.


Instead, consider one of the parsing libraries like uu-parsinglib, 
attoparsec, parsec, frisby. There are lots of pieces to a parsing 
framework, but it makes sense to distribute them together.


Or, consider one of the base libraries for iteratees, enumerators, 
pipes, conduits, etc. Like parsing, these offer a whole framework. You 
won't usually need 100% of it, but everyone needs a different 80%.


Or to mention some more of my own packages, consider stm-chans, 
unification-fd, or unix-bytestrings. In unification-fd, the stuff 
outside of Control.Unification.* could be moved elsewhere, but the stuff 
within there makes sense to be split up yet distributed together. For 
stm-chans because of the similarity in interfaces, use cases, etc, 

Re: [Haskell-cafe] Correspondence between libraries and modules

2012-04-24 Thread Henk-Jan van Tuyl
On Wed, 25 Apr 2012 05:44:28 +0200, wren ng thornton w...@freegeek.org  
wrote:



On 4/23/12 11:39 AM, Gregg Lebovitz wrote:

On 04/23/2012 12:03 AM, wren ng thornton wrote:

However, until better technical support is implemented (not just for
GHC, but also jhc, UHC,...) it's best to follow social practice.


Wren, I am new to Haskell and not aware of all of the conventions. Is
there a place where I can find information on these social practices?
Are they documented some place?


Not that I know of, though they're fairly standard for any open-source  
programming community. E.g., when it comes to module names: familiarize  
yourself with what's out there; try to fit in with the patterns you  
see[1]; don't intentionally clash, steal namespaces[2], or squat on  
valuable territory[3]; be reasonable and conscientious when interacting  
with people.


The following page gives you some idea of the module names:
  http://www.haskell.org/haskellwiki/Hierarchical_module_names

An overview of pages about programming style:
  http://www.haskell.org/haskellwiki/Category:Style

Regards,
Henk-Jan van Tuyl


--
http://Van.Tuyl.eu/
http://members.chello.nl/hjgtuyl/tourdemonad.html
Haskell programming
--

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Correspondence between libraries and modules

2012-04-23 Thread Gregg Lebovitz

On 04/23/2012 12:03 AM, wren ng thornton wrote:

However, until better technical support is implemented (not just for
GHC, but also jhc, UHC,...) it's best to follow social practice.


Wren, I am new to Haskell and not aware of all of the conventions. Is 
there a place where I can find information on these social practices? 
Are they documented some place?



However, centralization is prone to bottlenecks and systemic failure.
As such, while it would be nice to ensure that a given module is
provided by only one package, there is no mechanism in place to
enforce this (except at compile time for the code that links the
conflicting modules together).


From someone new to the community, it seems that yes centralization has 
its issues, but it also seems that practices could be put in place that 
minimize the bottlenecks and systemic failures.


Unless I greatly misunderstand the challenges, there seem to be a lot of 
ways to approach this problem and none of them are new. We all use 
systems that are composed of many modules neatly combined into complete 
systems. Linux distributions do this well. So does Java. Maybe we should 
borrow from their experiences and think about how we put packages 
together and what mechanisms we need to resolve inter-package dependencies.


Am I missing something that makes this problem harder than other systems 
and languages? Is anyone currently working on the packaging  and 
distribution issues? If not, does anyone else want to work on it?




Re: [Haskell-cafe] Correspondence between libraries and modules

2012-04-23 Thread Alvaro Gutierrez
Thanks for the write-up -- it's been very helpful!

On Mon, Apr 23, 2012 at 12:03 AM, wren ng thornton w...@freegeek.org wrote:

 Consider one of my own libraries (chosen randomly via Safari's url
 autocompletion):



 http://hackage.haskell.org/package/bytestring-lexing

 When I inherited this package there were the Data.ByteString.Lex.Double
 and Data.ByteString.Lex.Lazy.Double modules, which were separated
 because they provide the same API but for strict vs lazy ByteStrings. Both
 of those modules are concerned with lexing floating point numbers. I
 inherited the package because I wanted to publicize some code I had for
 lexing integers in various formats. Since that's quite a different task
 than lexing floating point numbers, I put it in its own module:
 Data.ByteString.Lex.Integral.


I see. The first thing that comes to mind is the notion of module
granularity, which of course is subjective, so whether a single module or
multiple ones should handle e.g. doubles and integrals is a good question;
are there guidelines as to how those choices are made?

At any rate, why do these modules, with sufficiently-different
functionality, live in the same library -- is it that they share some
common bits of implementation, or to ease the management of source code?

When dealing with FFI code, because of the impedance mismatch between
 Haskell and imperative languages like C, it's clear that there's going to
 be some massaging of the API beyond simply declaring FFI calls. As such,
 clearly we'd like to have separate modules for doing the low-level binding
 vs presenting a high-level API. Moreover, depending on what you're
 interfacing with, you may be forced to have multiple low-level modules.


Ah, that's a good use case. Is the lower-level module usually made public
as well, or is it only an implementation detail?


 On the other hand, the main purpose of packages or libraries is as unit of
 distribution, code reuse, and separate compilation. Even with the Haskell
 culture of making small libraries, most worthwhile units of
 distribution/reuse/compilation tend to be larger than a single
 namespace/concern. Thus, it makes sense to have more than one module per
 package, because otherwise we'd need some higher level mechanism in order
 to manage the collections of package-modules which should be considered a
 single unit (i.e., clients will almost always want the whole bunch of them).


This is the part that I'm trying to get a better sense of. I can see how in
some cases, it makes sense for more than one module to form a unit, because
they are tightly coupled semantically or implementation-wise -- so clients
will indeed want the whole bunch. On the other hand, several libraries
provide modules that are all over the place, in a way that doesn't form a
unit of any kind (e.g. MissingH), and it's not clear that you would want
any Network stuff when all you need is String utilities.

However, centralization is prone to bottlenecks and systemic failure. As
 such, while it would be nice to ensure that a given module is provided by
 only one package, there is no mechanism in place to enforce this (except at
 compile time for the code that links the conflicting modules together).
 With few exceptions, it's considered bad form to knowingly use the same
 module name as is being used by another package. In part, it's bad form
 because egos are involved; but it's also bad form because there's poor
 technical support for resolving namespace collisions for module names. In
 GHC you can use -XPackageImports, which is workable but conflates issues of
 code with issues of provenance, which the Haskell Report intentionally
 keeps separate. However, until better technical support is implemented (not
 just for GHC, but also jhc, UHC,...) it's best to follow social practice.


But the way you describe it, it seems that despite centralization having
those disadvantages, it is more or less the way the system works, socially
(egos, bad form, etc.) and technically (because of the lack of compiler
support) -- except that it is ad-hoc instead of mechanically enforced. In
other words, I don't see what the advantages of allowing ambiguity
currently are.

Some people figured to solve the new issue by implementing it both ways in
 separate packages, but reusing the same module names. (Witness for example
 mtl-2 aka monads-fd, vs monads-tf.) In practice, that didn't work out so
 well. Part of the reason for failure is that although fundeps and TF/ATs
 are formally equivalent in theory, in practice the implementation of TF/ATs
 has(had?) been missing some necessary machinery, and consequentially the
 TF/AT versions were not as powerful as the original fundep versions. Though
 the butterfly dependency issues certainly didn't help.


Ah, interesting. So, perhaps I misunderstand, but this seems like an
argument in favor of having uniquely-named modules (e.g. Foo.FD and
Foo.TF) instead of 

Re: [Haskell-cafe] Correspondence between libraries and modules

2012-04-23 Thread Brandon Allbery
On Mon, Apr 23, 2012 at 11:39, Gregg Lebovitz glebov...@gmail.com wrote:

 Am I missing something that makes this problem harder than other systems
 and languages? Is anyone currently working on the packaging  and
 distribution issues? If not, does anyone else want to work on it?


The other dirty little secret that is carefully being avoided here is the
battle between the folks for whom Haskell is a language research platform
and those who use it to get work done.  It's not entirely inaccurate to say
the former group would regard a fragmented module namespace as a good
thing, specifically because it discourages people from considering it to be
stable

-- 
brandon s allbery  allber...@gmail.com
wandering unix systems administrator (available) (412) 475-9364 vm/sms


Re: [Haskell-cafe] Correspondence between libraries and modules

2012-04-23 Thread Gregg Lebovitz

  
  


On 4/23/2012 3:39 PM, Brandon Allbery wrote:

 On Mon, Apr 23, 2012 at 11:39, Gregg Lebovitz glebov...@gmail.com wrote:

  Am I missing something that makes this problem harder than other systems
  and languages? Is anyone currently working on the packaging and
  distribution issues? If not, does anyone else want to work on it?

 The other dirty little secret that is carefully being avoided here is the
 battle between the folks for whom Haskell is a language research platform
 and those who use it to get work done.  It's not entirely inaccurate to say
 the former group would regard a fragmented module namespace as a good
 thing, specifically because it discourages people from considering it to be
 stable

 -- 
 brandon s allbery  allber...@gmail.com
 wandering unix systems administrator (available) (412) 475-9364 vm/sms

Brandon, I find that a little hard to believe.  If the issues are
similar to other systems and languages, then I think it is more
likely that no one has volunteered to work on it.  You volunteering
to help?
  

  

  




Re: [Haskell-cafe] Correspondence between libraries and modules

2012-04-23 Thread Brandon Allbery
On Mon, Apr 23, 2012 at 17:16, Gregg Lebovitz glebov...@gmail.com wrote:

 On 4/23/2012 3:39 PM, Brandon Allbery wrote:

  The other dirty little secret that is carefully being avoided here is
 the battle between the folks for whom Haskell is a language research
 platform and those who use it to get work done.  It's not entirely
 inaccurate to say the former group would regard a fragmented module
 namespace as a good thing, specifically because it discourages people from
 considering it to be stable

 Brandon, I find that a little hard to believe.  If the issues are similar
 to other systems and languages, then  I think it is more likely that no one
 has volunteered to work on it.  You volunteering to help?


Yes, you do find it hard to believe; so hard that you went straight past it
and tried to point to the easy technical solution to the problem you
decided to see in place of the real one, which doesn't have a technical
solution.

-- 
brandon s allbery  allber...@gmail.com
wandering unix systems administrator (available) (412) 475-9364 vm/sms


Re: [Haskell-cafe] Correspondence between libraries and modules

2012-04-22 Thread Brandon Allbery
On Sun, Apr 22, 2012 at 13:15, Alvaro Gutierrez radi...@google.com wrote:

 As I understand it, a library can provide any number of unrelated modules,
 and conversely, a single module could be provided by more than one library.
 I can see how this affords library authors more flexibility, but at a cost:
 there is no longer a single, unified view of the library universe. (The
 alternative would be for every module to be its own, hermetic library.) So
 I'm very interested in the rationale behind that aspect of the library
 system.


One reason:  modules serve multiple purposes; one of these is namespacing,
and in the case of interfaces to foreign libraries that may force a
division that would otherwise not exist.

More generally, making libraries and modules one-to-one means that either
modules exist solely for libraries, or libraries must be artificially
split.  Perhaps this indicates that modules have too many other functions,
but in that case you should propose an alternative system to replace them.

As to multiple libraries providing the same module:  the Haskell ecosystem
is still evolving and it's not always appropriate to give a particular
implementation sole ownership of a general module name.  Type families vs.
functional dependencies are an example of this (theoretically type families
were considered superior but to date they haven't lived up to it and
recently some cases were shown that fundeps can solve but type families
can't; parallel monad libraries based on both still exist).  New container
implementations have existed as standalone packages, some of which later
merge with standard packages while others are discarded.  Your proposal to
reject this reflects a static library ecosystem that does not exist.  (It
could be enforced dictatorially, but there is no Guido van Rossum of
Haskell and a mistake in an evolving system is difficult to fix after the
fact even with a dictator; we're already living with some difficult to fix
issues not related to modules.)

-- 
brandon s allbery  allber...@gmail.com
wandering unix systems administrator (available) (412) 475-9364 vm/sms


Re: [Haskell-cafe] Correspondence between libraries and modules

2012-04-22 Thread Alvaro Gutierrez
Thanks for your response.

On Sun, Apr 22, 2012 at 4:45 PM, Brandon Allbery allber...@gmail.com wrote:

 One reason:  modules serve multiple purposes; one of these is namespacing,
 and in the case of interfaces to foreign libraries that may force a
 division that would otherwise not exist.


Interesting. Could you elaborate on what the other purposes are, and
perhaps point to an instance of the foreign library case?

More generally, making libraries and modules one-to-one means that either
 modules exist solely for libraries, or libraries must be artificially
 split.  Perhaps this indicates that modules have too many other functions,
 but in that case you should propose an alternative system to replace them.


Oh, I don't intend to replace it -- at most I want to understand why the
system is set up the way it is, what the cons/pros are, and so on. I've
come across a lot of design discussions for various Haskell features, but
not this one; are there any?

As to multiple libraries providing the same module:  the Haskell ecosystem
 is still evolving and it's not always appropriate to give a particular
 implementation sole ownership of a general module name.  Type families vs.
 functional dependencies are an example of this (theoretically type families
 were considered superior but to date they haven't lived up to it and
 recently some cases were shown that fundeps can solve but type families
 can't; parallel monad libraries based on both still exist).  New container
 implementations have existed as standalone packages, some of which later
 merge with standard packages while others are discarded.


I see. I didn't imagine there was as much variability with respect to
module names and implementations as you suggest.

I'm confused as to how type families vs. fundeps play a role here -- as far
as I can tell both are compiler extensions that do not provide modules.

I'm interested to see examples where two or more well-known yet unrelated
modules clash under the same name; I can't imagine them coexisting in
public very long -- wouldn't the confusion among users (e.g. when looking
for documentation) be enough to either reconcile the modules or change one
of the names?



 Your proposal to reject this reflects a static library ecosystem that does
 not exist.  (It could be enforced dictatorially, but there is no Guido van
 Rossum of Haskell and a mistake in an evolving system is difficult to fix
 after the fact even with a dictator; we're already living with some
 difficult to fix issues not related to modules.)


Right, assuming there could only be one implementation of a module, this is
one of the main drawbacks; on the flip side, it is a feature in that
there is no confusion as to what Foo.Bar.Qux means. As it is, any import
requires out-of-band information in order to be resolved (both cognitively
and by the compiler), in the form of the library it comes from. (There's
also versioning information, but that could be equally specified
per-library or per-module.)

On the other hand, enforcing a single implementation is orthogonal to
having a 1-to-1 module/library mapping. That is, you could allow multiple
implementations either way.

Alvaro


Re: [Haskell-cafe] Correspondence between libraries and modules

2012-04-22 Thread wren ng thornton

On 4/22/12 6:30 PM, Alvaro Gutierrez wrote:

On Sun, Apr 22, 2012 at 4:45 PM, Brandon Allbery allber...@gmail.com wrote:

One reason:  modules serve multiple purposes; one of these is namespacing,
and in the case of interfaces to foreign libraries that may force a
division that would otherwise not exist.


Interesting. Could you elaborate on what the other purposes are, and
perhaps point to an instance of the foreign library case?


The main purpose of namespacing (IMO) is to separate concerns and make 
it easier to figure out how a project fits together. The primary goal of 
modules is to resolve namespacing issues.


Consider one of my own libraries (chosen randomly via Safari's url 
autocompletion):


http://hackage.haskell.org/package/bytestring-lexing

When I inherited this package there were the Data.ByteString.Lex.Double 
and Data.ByteString.Lex.Lazy.Double modules, which were separated 
because they provide the same API but for strict vs lazy ByteStrings. 
Both of those modules are concerned with lexing floating point numbers. 
I inherited the package because I wanted to publicize some code I had 
for lexing integers in various formats. Since that's quite a different 
task than lexing floating point numbers, I put it in its own module: 
Data.ByteString.Lex.Integral.


When dealing with FFI code, because of the impedance mismatch between 
Haskell and imperative languages like C, it's clear that there's going 
to be some massaging of the API beyond simply declaring FFI calls. As 
such, clearly we'd like to have separate modules for doing the low-level 
binding vs presenting a high-level API. Moreover, depending on what 
you're interfacing with, you may be forced to have multiple low-level 
modules. For example, if you use Google protocol buffers via the hprotoc 
package, then it will generate a separate module for each buffer type. 
That's fine, but usually it's not something you want to foist on your users.
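
As a rough illustration of the low-level vs high-level split described 
above, here is a minimal sketch that binds C's sin from math.h. The 
names c_sin and sine are made up for this example, and both layers are 
shown in one file only so that it compiles on its own; in a real package 
the raw import would more likely sit in its own (possibly unexposed) 
module, with only the wrapper offered to users.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.Types (CDouble (..))

-- Low-level layer: the raw FFI declaration, speaking in C types.
-- (c_sin is a hypothetical name chosen for this sketch.)
foreign import ccall unsafe "math.h sin"
  c_sin :: CDouble -> CDouble

-- High-level layer: the API a client would actually want, using
-- ordinary Haskell types and hiding the CDouble conversions.
sine :: Double -> Double
sine = realToFrac . c_sin . realToFrac

main :: IO ()
main = print (sine 0)  -- prints 0.0
```

The point is only the layering: clients import the wrapper and never 
need to see CDouble or the foreign import itself.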



On the other hand, the main purpose of packages or libraries is as unit 
of distribution, code reuse, and separate compilation. Even with the 
Haskell culture of making small libraries, most worthwhile units of 
distribution/reuse/compilation tend to be larger than a single 
namespace/concern. Thus, it makes sense to have more than one module per 
package, because otherwise we'd need some higher level mechanism in 
order to manage the collections of package-modules which should be 
considered a single unit (i.e., clients will almost always want the 
whole bunch of them).


However, centralization is prone to bottlenecks and systemic failure. As 
such, while it would be nice to ensure that a given module is provided 
by only one package, there is no mechanism in place to enforce this 
(except at compile time for the code that links the conflicting modules 
together). With few exceptions, it's considered bad form to knowingly 
use the same module name as is being used by another package. In part, 
it's bad form because egos are involved; but it's also bad form because 
there's poor technical support for resolving namespace collisions for 
module names. In GHC you can use -XPackageImports, which is workable but 
conflates issues of code with issues of provenance, which the Haskell 
Report intentionally keeps separate. However, until better technical 
support is implemented (not just for GHC, but also jhc, UHC, ...) it's 
best to follow social practice.
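
For reference, a package-qualified import under -XPackageImports looks 
roughly like the sketch below. It names the base package only so that 
the example builds without extra dependencies; in an actual clash the 
string would hold the name of whichever package you mean. Note how the 
package name ends up inside the source file, which is the mixing of code 
and provenance mentioned above.

```haskell
{-# LANGUAGE PackageImports #-}
module Main where

-- The string literal names the package that must provide the module.
-- "base" is used here purely so the sketch is self-contained.
import "base" Data.List (sort)

main :: IO ()
main = print (sort [3, 1, 2 :: Int])  -- prints [1,2,3]
```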




I'm confused as to how type families vs. fundeps play a role here -- as far
as I can tell both are compiler extensions that do not provide modules.


Both TFs (or rather associated types) and fundeps aim to solve the same 
problem. Namely: when using multi-parameter type classes, it is often 
desirable to declare that one parameter is wholly defined by other 
parameters, either for semantic reasons or (more often) to help type 
inference. Since they both aim to solve the same problem, this raises a 
new problem: for some given type class, do I implement it with TF/ATs or 
with fundeps?
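
To make the choice concrete, here is a small sketch of one interface 
written both ways. The class and method names (ContainerFD, ContainerTF, 
Elem, and so on) are invented for illustration and are not taken from 
mtl, monads-tf, or any other package. In both versions the container 
type determines the element type; the fundep version says so with 
"c -> e", the associated type version with "Elem c".

```haskell
{-# LANGUAGE MultiParamTypeClasses  #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE TypeFamilies           #-}
{-# LANGUAGE FlexibleInstances      #-}
module Main where

-- Fundep version: two class parameters, with c determining e.
class ContainerFD c e | c -> e where
  emptyFD  :: c
  insertFD :: e -> c -> c
  toListFD :: c -> [e]

instance ContainerFD [a] a where
  emptyFD  = []
  insertFD = (:)
  toListFD = id

-- Associated type version: one class parameter, and the element
-- type is computed from it by the type function Elem.
class ContainerTF c where
  type Elem c
  emptyTF  :: c
  insertTF :: Elem c -> c -> c
  toListTF :: c -> [Elem c]

instance ContainerTF [a] where
  type Elem [a] = a
  emptyTF  = []
  insertTF = (:)
  toListTF = id

main :: IO ()
main = do
  print (toListFD (insertFD 'x' (emptyFD :: String)))  -- prints "x"
  print (toListTF (insertTF 'y' (emptyTF :: String)))  -- prints "y"
```

For a toy class like this the two encodings behave the same; the 
differences discussed in this thread only show up with more demanding 
uses.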


Some people figured to solve the new issue by implementing it both ways 
in separate packages, but reusing the same module names. (Witness for 
example mtl-2 aka monads-fd, vs monads-tf.) In practice, that didn't 
work out so well. Part of the reason for failure is that although 
fundeps and TF/ATs are formally equivalent in theory, in practice the 
implementation of TF/ATs has (had?) been missing some necessary 
machinery, and consequently the TF/AT versions were not as powerful 
as the original fundep versions. Though the butterfly dependency issues 
certainly didn't help.




I'm interested to see examples where two or more well-known yet unrelated
modules clash under the same name; I can't imagine them coexisting in
public very long -- wouldn't the confusion among users (e.g. when looking
for documentation) be enough to either reconcile the modules or change one
of the names?


That's not much of a problem in practice. There are lots of different