Re: Proposal: priority queues in containers

2010-03-21 Thread Louis Wasserman
Okay, here's my current plan:  I think I'm going to withdraw the ticket for
containers, and launch a package to be included in a future release of the
Haskell Platform.

First off, I'd like to know what the procedure is for adding something to
the Platform.  I haven't been able to find a mailing list dedicated to it,
so I don't know where it should be discussed.  (In all likelihood, I'm
missing some obvious link on either Trac or HaskellWiki or something...)

First, people should look at my current setup, and complain at me about the
design.  Haddock documentation is
herehttp://code.haskell.org/containers-pqueue/pqueue-1.0.0/html/,
though I've only so far documented the minimum-queue variants; in general, I
aimed to use the same function names used by Data.Map and Data.Set wherever
possible, so extract, top, and delete have been replaced by the
function names used for the analogous methods in Data.Map and Data.Set.

Probably the single aspect of the design I'm least happy about is how I
distinguish between the priority queues that distinguish a key and an
element, and those which don't.  Currently, the Data.Map-style key/element
version is named MinPQueue k a, and lives in the module
Data.PQueue.Prio.Min, and the Data.Set-style version is named MinQueue a,
and lives in the module Data.PQueue.Min.  Can anybody think of better names
for these things?

Finally, I'm not going to consider any more alternative implementations at
this point.  My implementation has performed outstandingly on benchmarks,
and has beaten a number of competitors.  If someone believes that they have
a superior implementation, they should email me off-list, and perhaps a
later version of the package will be based on a different implementation,
but for the moment I'm sticking to the type-magical binomial heap
implementation.

Louis Wasserman
wasserman.lo...@gmail.com
http://profiles.google.com/wasserman.louis
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Proposal: priority queues in containers

2010-03-19 Thread Isaac Dupree

On 03/19/10 09:39, Louis Wasserman wrote:

Yo,



  * I'm not comfortable with having two redundant modules, one for Min- and
one for MaxQueue



I'm pretty sure there won't be a containers-compatible solution, certainly
not a solution compatible with the style of containers as it's currently
written


On 03/19/10 06:26, Ross Paterson wrote:

If keys are separate, the two versions could be easily achieved using
an inversion adaptor on Ord, which has been proposed before and would
have many other uses.


I agree with Ross. We should add Data.Ord.Opposite, or whatever we 
decide to call it,
(newtype Opposite a = Opposite { getOpposite :: a } deriving (Eq, Show, 
...), and the obvious Ord instance.)


This works with Data.Map too.  Additionally I suggest, since folding a 
Map produces keys lowest-first, that we choose the same behavior for the 
priority queue.


On the other hand, separating the key and the value gives us the same 
Data.Map vs. Data.Set mess (which maybe is not a bad mess, but, it would 
suggest there should be two different PQ modules too, one with 
associated non-ordered data and the other with only the priorities).


-Isaac
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Proposal: priority queues in containers

2010-03-18 Thread Louis Wasserman
Oh, god, so much to respond to...heh.

 Submit this package for canonicalization as part of the Haskell Platform. I
 would for one would support its inclusion.


This is an option I seriously hadn't considered.  To be fair, that's because
I've never used the Platform myself, preferring rather to have the most
up-to-date version of GHC at all times, heh.  That said, while I'd be okay
with this option, I'd prefer putting it into containers, because I feel like
a canonical, reliable priority queue implementation is the sort of thing a
self-respecting language ought to have built in.

As does Python. In Python, though, the PQ implementation is not a built-in
 class in the default namespace (as are dict and set).  Rather, it is one of
 the batteries included libraries that come with Python. I think that's the
 right place for it in Haskell, too.


I don't know Python, but according to Wikipedia, dict and set are built into
the language.  I don't think it's a fair comparison: set and dict in Python
seem to have a role almost as ubiquitous as [] in Haskell, much more
ubiquitous than e.g. Data.Set or Data.Map.  I'm also not entirely sure that
batteries included doesn't describe containers, given all the other
packages that come with GHC.

* There is no distinction between keys and priority values.  A utility
 type

  Prio p a with the instance Ord p = Ord (Prio p a) is exported to
 allow

  usage of distinct keys and priority values.

I disagree with this one.  It requires an Ord instance that isn't really an
 ordering, and makes a Functor instance impossible.  I would prefer an
 interface separating keys and values like that of Data.Map (which would also
 increase consistency within the package).


I'd be okay with separating out a priority/value version.  However, I'm
still clueless as to what you're talking about with respect to Functor --
what's wrong with the following?
data Prio p a = Prio p a
instance Ord p = Ord (Prio p a) where ...
instance Functor (Prio p) where fmap f (Prio p a) = Prio p (f a)

I can understand if you're uncomfortable with (==) and (\ x y - compare x y
== EQ) being inequivalent, but neither the H98 Report nor the Prelude make
any such claim, as far as I can tell.

 The Foldable instance breaks the abstraction.  I think it should
 process elements in order.


I think that wanting to iterate over the priority queue in the middle of the
computation, without caring about order, is a perfectly legitimate desire
for a programmer!  Moreover, the only way to make a Foldable instance
process elements in order would be something like
data Ord a = PQueue a = 
which I think is an awfully dirty approach.  I'm not at all a fan of adding
restrictions like that, not least because it adds lots of awkward overhead.
Would you be okay with not exporting a Foldable instance at all, but still
exporting a toList method which doesn't guarantee any ordering on the return
list?

My advice would be not to stress over whether priority queues go into
 containers. It's not some pristine thing of beauty that deserves treatment
 with velvet gloves.


I'm...not sure how to respond to this claim.  At least part of me wants to
say, I genuinely do think the containers package is a piece of art...and
then another part pipes up, except for the inconsistencies between the
various minView/maxView versions, and the little differences between IntMap
and Map, and...  That said, I wouldn't be a fan of scrapping the style
which the containers package has at least tried to create.  I could be
convinced that rewriting the rest of containers would be a good thing to do,
though...and I might just want to do that myself.  Hah.

 How does this implementation compare with using Map/Set as a
 priority queue?


Continuing the discussion of the benchmarks: first, Jim, it appears that I'm
the one who made a n00b benchmarking error.  TT_TT  That said, as you
found, my implementation is still slightly faster when the benchmark is
corrected.  Some comments:

   - QuickBinom needs to have O(1) findMin for a reasonable comparison.  I
   added that in the benchmark below, and here.
   - I can't think of any more optimizations for the sparse binomial heap --
   I genuinely think it's not going to get better.
   - There is a way to optimize my implementation still further, but it
   makes my code much less readable.  (Specifically, I start the BinomForest at
   Succ Zero, and unpack the first child of every node still in the forest.
Modifying the whole implementation that way, though, makes it unreadably
   ugly...and I think QuickBinom is possibly already at that point.  I started
   implementing it, and realized just how ugly it was, and I stopped, but I can
   finish it if I have to.)

Sorting 500,000 elements, compiled with -O2, run with +RTS -H128m -K64m,
with another few implementations thrown in for good measure:
Times (ms)
   min  mean  +/-sdmedian  max
Pairing:1440.090  1482.893

Re: Proposal: priority queues in containers

2010-03-18 Thread Evan Laforge
On Thu, Mar 18, 2010 at 7:43 AM, Louis Wasserman
wasserman.lo...@gmail.com wrote:
 Oh, god, so much to respond to...heh.

You did request feedback back there, didn't you :P

 As does Python. In Python, though, the PQ implementation is not a built-in
 class in the default namespace (as are dict and set).  Rather, it is one of
 the batteries included libraries that come with Python. I think that's the
 right place for it in Haskell, too.

 I don't know Python, but according to Wikipedia, dict and set are built into
 the language.  I don't think it's a fair comparison: set and dict in Python
 seem to have a role almost as ubiquitous as [] in Haskell, much more

It's not really the same.  pqueue is not in the built-in namespace in
python, that's like the Prelude in haskell.  pqueue *is* in the
default library, which every python installation will have since it
comes with the default download, this is what's meant by batteries
included.  So that's like containers: you have to explicitly import
it, but you shouldn't worry about installations that don't have it
because it comes with the compiler.

The main difference is that python doesn't have cabal and doesn't have
anything like the haskell platform and installing new packages, while
easy, is not as automatic as cabal can be.

 After about five hours' work (!!!) I *finally* managed to install Criterion
 yesterday, so I'll send out those tests ASAP.

I wanted to use criterion too at one point, but it looked too hard to
install so I was scared away...
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Proposal: priority queues in containers

2010-03-18 Thread Milan Straka
Hi,

  After about five hours' work (!!!) I *finally* managed to install Criterion
  yesterday, so I'll send out those tests ASAP.
 
 I wanted to use criterion too at one point, but it looked too hard to
 install so I was scared away...

Why is that? Because of the Chart depencency?

You can install progression -- a wrapper over criterion, which uses
gnuplot to draw graphs, and disable the Chart dependency.

I did just
cabal --user install -f -Chart criterion progression
and it installed without a problem, on ghc-6.12.1, ghc-6.13.something,
both Linux and Windows.

Cheers,
Milan
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Proposal: priority queues in containers

2010-03-18 Thread Bas van Dijk
On Thu, Mar 18, 2010 at 4:43 PM, Louis Wasserman
wasserman.lo...@gmail.com wrote:
  Submit this package for canonicalization as part of the Haskell Platform.
 I would for one would support its inclusion.

 This is an option I seriously hadn't considered.  To be fair, that's because
 I've never used the Platform myself, preferring rather to have the most
 up-to-date version of GHC at all times, heh.  That said, while I'd be okay
 with this option, I'd prefer putting it into containers, because I feel like
 a canonical, reliable priority queue implementation is the sort of thing a
 self-respecting language ought to have built in.

I don't like libraries getting bigger, I like them getting smaller.

When they're smaller they're easier to understand and easier to upgrade.

So I would also advice proposing your package for the HP (Haskell Platform).

I'm even for splitting containers into sub-packages: maps, sets,
sequence, graph and tree. Those sub-packages would then need to be
added to the HP.

Then we could turn containers into a meta-package that depends on
these sub-packages (similar to how the HP works[1]).

Finally we could deprecate containers and after some time remove it.

(I'm also for splitting base even more... but one thing at a time)

regards,

Bas

[1] http://hackage.haskell.org/platform/2009.2.0.2/haskell-platform.cabal
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Proposal: priority queues in containers

2010-03-18 Thread Milan Straka
Hi,

 I don't like libraries getting bigger, I like them getting smaller.
 
 When they're smaller they're easier to understand and easier to upgrade.
 
 So I would also advice proposing your package for the HP (Haskell Platform).
 
 I'm even for splitting containers into sub-packages: maps, sets,
 sequence, graph and tree. Those sub-packages would then need to be
 added to the HP.
 
 Then we could turn containers into a meta-package that depends on
 these sub-packages (similar to how the HP works[1]).
 
 Finally we could deprecate containers and after some time remove it.
 
 (I'm also for splitting base even more... but one thing at a time)

personally I am against splitting containers. It is a collection of
several basic data structures with similar design decisions
(reasonably efficient, can be used persistently, decent API).
I think these structures should stay together, to have a library of data
structures for common usage.

I am for adding the priority queues to the containers.

Cheers,
Milan Straka
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Proposal: priority queues in containers

2010-03-18 Thread Louis Wasserman
First off: I've finally gotten set up with code.haskell.org.  A darcs repo
of my binomial heap implementation is at
http://code.haskell.org/containers-pqueue/.  Also on that site is the haddock
documentationhttp://code.haskell.org/containers-pqueue/containers-0.3.0.0/html/,
which I'm sure many people will appreciate.  Somebody else (Ross?) had
requested a separate version of the priority queue to handle priorities and
values separately, so I'm working on that.
I've also deleted the Foldable instance of MinQueue, though I still offer a
clearly documented unordered toList, which will stay in place.

Well, I only tested one thing (heap sort), and QuickBinom is actually

faster under different options (-prof -auto-all or without calling
 performGC before every heapsort).

-prof -auto-all blocks a significant number of optimizations from actually
happening.  Essentially, if the compiler has to figure out how much time is
taken by some particular function, then it's not allowed to inline or
eliminate uses of that function.  I don't consider it a fair comparison.
 Moreover, calling performGC makes sense -- it essentially wipes the slate,
making each successive comparison unbiased by the previous one.

Louis, you note later in this email that your implementation is done.
 That seems important to me. If we fix a sane interface now, the
 implementation can be changed later if we find something more
 efficient, right?


Absolutely true.

However, I finally assembled a benchmark that I think is a fair comparison
-- a heapsort, essentially length . Data.List.unfoldr extract . foldr
insert empty.   The
resultshttp://code.haskell.org/containers-pqueue/bench-chart.pdfare
pretty supportive of my implementation.  (Original timing data,
outputted by Progression, is
herehttp://code.haskell.org/containers-pqueue/bench-final.csv.
 I think all of the original code for the benchmark is in the
code.haskell.org folder, just not part of the darcs repo.  However, I had to
slightly modify my copy of Progression to force the GC, so YMMV.)

I'm still pretty strongly in favor of putting priority queues into
containers: other programming languages consider it necessary for inclusion
into standardized libraries, people will be more likely to use appropriate
data structures for their needs when reliable, friendly implementations are
already at their fingertips, and other reasons already discussed.

In general, I think priority queues should be treated the same as Data.Map
or Data.Set, like java.util.* or the C++ STL treat their respective versions
of those structures.  I don't think there's likely to be any agreement any
time soon to split up containers, so my inclination is to put pqueues into
containers, and maybe if we agree later to split containers, then priority
queues will be part of that decision.

On a somewhat different note: writing unit tests in the existing framework
is awkward and hard!  Writing QuickCheck tests is trivial and much more
exhaustive than what the existing tests look like.  The existing containers
tests appear to check one particular example that was the source of a
preexisting bug, rather than examining exhaustively that methods work
correctly, to eliminate potentially new bugs.  I mean, Data.IntSet's one
existing test is

main = print $ isProperSubsetOf (fromList [2,3]) $ fromList [2,3,4]

which might have found some preexisting bug, but certainly doesn't convince
me of Data.IntSet's correctness.

Is this normal?  Is it permissible to use QuickCheck inside unit tests?  (A
collection of QuickCheck tests -- all of which my implementation passes --
is in the code.haskell.org directory.)

Louis Wasserman
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Proposal: priority queues in containers

2010-03-18 Thread Bas van Dijk
On Thu, Mar 18, 2010 at 10:39 PM, Milan Straka f...@ucw.cz wrote:
 personally I am against splitting containers. It is a collection of
 several basic data structures with similar design decisions
 (reasonably efficient, can be used persistently, decent API).
 I think these structures should stay together, to have a library of data
 structures for common usage.

But when turning containers into a meta-package these structures will
stay together while at the same time allowing people to use and
upgrade them separately.

regards,

Bas
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Proposal: priority queues in containers

2010-03-18 Thread Milan Straka
Hi,

 On Thu, Mar 18, 2010 at 10:39 PM, Milan Straka f...@ucw.cz wrote:
  personally I am against splitting containers. It is a collection of
  several basic data structures with similar design decisions
  (reasonably efficient, can be used persistently, decent API).
  I think these structures should stay together, to have a library of data
  structures for common usage.
 
 But when turning containers into a meta-package these structures will
 stay together while at the same time allowing people to use and
 upgrade them separately.

If the metapackage stays, then there is probably little difference.
I was under impression you wanted to remove the metapackage and leave
the individual structures as packages (that was my understanding of your
mail).

I would like to have basic data structures connected together. I do not
really mind if the modules are in one library or in several, as long as
I could say I want 'containers'.

But independently on the form of the containers package, I would like it
to contain a priority queue. Including it to the containers package
seems as a good first step.

Cheers,
Milan
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Proposal: priority queues in containers

2010-03-18 Thread Thomas Schilling
On 18 March 2010 22:02, Louis Wasserman wasserman.lo...@gmail.com wrote:
 I'm still pretty strongly in favor of putting priority queues into
 containers: other programming languages consider it necessary for inclusion
 into standardized libraries, people will be more likely to use appropriate
 data structures for their needs when reliable, friendly implementations are
 already at their fingertips, and other reasons already discussed.

The Haskell Platform is really is intended to be available at your
fingertips.  Unfortunately, the following does not work (although I
thought it's supposed to)

 $ cabal install haskell-platform

Nevertheless, the libraries bundled with GHC are those libraries that
GHC itself needs and which therefore cannot be upgraded independently.
 The real standard libraries are the Haskell Platform and if your
package is part of the platform, then your package *is* in status
equivalent to things like java.util.*.

This weekend's Hackathon in Zürich will partly be dedicated to getting
the next release of the Platform release ready.  If you can get your
package into the following platform release (due 6 months after the
current release), then this would surely make it the default package
for anyone in need of a PQ.

/ Thomas
-- 
Push the envelope.  Watch it bend.
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Proposal: priority queues in containers

2010-03-18 Thread Louis Wasserman
Okay, let me ask the following question:

Would anybody besides me be heartbroken if priority queues *weren't* put
into containers, but were instead put into the Platform?

Louis Wasserman
wasserman.lo...@gmail.com
http://profiles.google.com/wasserman.louis


On Thu, Mar 18, 2010 at 6:50 PM, Thomas Schilling
nomin...@googlemail.comwrote:

 On 18 March 2010 22:02, Louis Wasserman wasserman.lo...@gmail.com wrote:
  I'm still pretty strongly in favor of putting priority queues into
  containers: other programming languages consider it necessary for
 inclusion
  into standardized libraries, people will be more likely to use
 appropriate
  data structures for their needs when reliable, friendly implementations
 are
  already at their fingertips, and other reasons already discussed.

 The Haskell Platform is really is intended to be available at your
 fingertips.  Unfortunately, the following does not work (although I
 thought it's supposed to)

 $ cabal install haskell-platform

 Nevertheless, the libraries bundled with GHC are those libraries that
 GHC itself needs and which therefore cannot be upgraded independently.
  The real standard libraries are the Haskell Platform and if your
 package is part of the platform, then your package *is* in status
 equivalent to things like java.util.*.

 This weekend's Hackathon in Zürich will partly be dedicated to getting
 the next release of the Platform release ready.  If you can get your
 package into the following platform release (due 6 months after the
 current release), then this would surely make it the default package
 for anyone in need of a PQ.

 / Thomas
 --
 Push the envelope.  Watch it bend.

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Proposal: priority queues in containers

2010-03-18 Thread Brandon S. Allbery KF8NH

On Mar 18, 2010, at 18:33 , Milan Straka wrote:
I would like to have basic data structures connected together. I do  
not
really mind if the modules are in one library or in several, as long  
as

I could say I want 'containers'.


This is what the Haskell Platform is for.  No real need to add more  
stuff to containers; if anything, the libraries stored there should be  
decoupled so that e.g. the recent Data.Map proposals wouldn't disturb  
programs not using Data.Map.


--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allb...@kf8nh.com
system administrator [openafs,heimdal,too many hats] allb...@ece.cmu.edu
electrical and computer engineering, carnegie mellon universityKF8NH




PGP.sig
Description: This is a digitally signed message part
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Proposal: priority queues in containers

2010-03-18 Thread Brandon S. Allbery KF8NH

On Mar 18, 2010, at 19:50 , Thomas Schilling wrote:

The Haskell Platform is really is intended to be available at your
fingertips.  Unfortunately, the following does not work (although I
thought it's supposed to)

$ cabal install haskell-platform


Other way around:  installing the Haskell Platorm gives you ghc, a set  
of libraries known to work with that version of ghc and with each  
other, and cabal-install.


--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allb...@kf8nh.com
system administrator [openafs,heimdal,too many hats] allb...@ece.cmu.edu
electrical and computer engineering, carnegie mellon universityKF8NH




PGP.sig
Description: This is a digitally signed message part
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users