[isabelle-dev] Merge-Sort Implementation

2011-10-27 Thread Christian Sternagel

Hi all,

please find attached the formalization of the merge-sort algorithm as 
used in GHC's standard library. See also:


http://hackage.haskell.org/packages/archive/base/latest/doc/html/src/Data-List.html#sort

Due to experiments and comments found there, I suggest that this 
implementation is used in future Isabelle releases for Haskell 
code-generation ;)


Some compliments are also in order:

1) I was positively surprised that the mutually recursive functions 
sequences, ascending, and descending where accepted without further ado 
by the function package.


2) Sledgehammer is great! It shrunk the proof of sorted_sequences from 7 
explicit cases (about 50 lines) -- with lots of tedious instantiations 
to make simp or force solve the goal -- to 4 lines. And the resulting 
metis proof is much faster than my original one.


cheers

chris
theory Sort_Impl
imports ~~/src/HOL/Library/Multiset
begin

section {* GHC version of merge sort *}

context linorder
begin

text {*
Split a list into chunks of ascending and descending parts, where
descending parts are reversed. Hence, the result is a list of
sorted lists.
*}
fun sequences :: 'a list = 'a list list
  and ascending :: 'a = ('a list = 'a list) = 'a list = 'a list list
  and descending :: 'a = 'a list = 'a list = 'a list list
where
  sequences (a#b#xs) = (if a  b
then descending b [a] xs
else ascending b (op # a) xs)
| sequences xs = [xs]

| ascending a f (b#bs) = (if ~ a  b
then ascending b (f o op # a) bs
else f [a] # sequences (b#bs))
| ascending a f bs = f [a] # sequences bs

| descending a as (b#bs) = (if a  b
then descending b (a#as) bs
else (a#as) # sequences (b#bs))
| descending a as bs = (a#as) # sequences bs

fun merge :: 'a list = 'a list = 'a list where
  merge (a#as) (b#bs) = (if a  b
then b # merge (a#as) bs
else a # merge as (b#bs))
| merge [] bs = bs
| merge as [] = as

fun merge_pairs :: 'a list list = 'a list list where
  merge_pairs (a#b#xs) = merge a b # merge_pairs xs
| merge_pairs xs = xs

lemma merge_length[simp]: length (merge xs ys) = length xs + length ys
  by (induct xs ys rule: merge.induct) simp_all

lemma merge_pairs_length[simp]: length (merge_pairs xs) = length xs
  by (induct xs rule: merge_pairs.induct) simp_all

fun merge_all :: 'a list list = 'a list where
  merge_all [] = []
| merge_all [x] = x
| merge_all xs = merge_all (merge_pairs xs)

definition msort :: 'a list = 'a list where
  msort = merge_all o sequences

lemma set_merge[simp]: set (merge xs ys) = set xs Un set ys
  by (induct xs ys rule: merge.induct) auto

lemma sorted_merge[simp]: sorted xs == sorted ys == sorted (merge xs ys)
  by (induct xs ys rule: merge.induct) (auto simp: sorted_Cons)

lemma multiset_of_merge[simp]: multiset_of (merge xs ys) = multiset_of xs + 
multiset_of ys
  by (induct xs ys rule: merge.induct) (auto simp: ac_simps)

lemma sorted_merge_pairs[simp]:
  ALL x:set xs. sorted x == ALL x:set (merge_pairs xs). sorted x
  by (induct xs rule: merge_pairs.induct) simp_all

lemma multiset_of_merge_pairs[simp]:
  multiset_of (concat (merge_pairs xs)) = multiset_of (concat xs)
  by (induct xs rule: merge_pairs.induct) (auto simp: ac_simps)

lemma sorted_merge_all: ALL x:set xs. sorted x == sorted (merge_all xs)
  by (induct xs rule: merge_all.induct) simp_all

lemma multiset_of_merge_all[simp]:
  multiset_of (merge_all xs) = multiset_of (concat xs)
  by (induct xs rule: merge_all.induct) (auto simp: ac_simps)

lemma
  shows sorted_sequences: ALL x:set (sequences xs). sorted x
  and [| ALL x:set (f []). x = a; sorted (f []); ALL xs ys. f (xs@ys) = f xs 
@ ys;
ALL x. f [x] = f [] @ [x] |] == ALL x:set (ascending a f xs). sorted x
  and [| ALL x:set bs. c = x; sorted bs |] == ALL x:set (descending c bs 
xs). sorted x
  by (induct xs and a f xs and c bs xs rule: 
sequences_ascending_descending.induct)
 (simp_all add: sorted_append sorted_Cons,
  metis append_Cons append_Nil le_less_linear order_trans,
  metis less_le less_trans)

lemma
  shows multiset_of_sequences[simp]: multiset_of (concat (sequences xs)) = 
multiset_of xs
  and (!!x xs. multiset_of (f (x#xs)) = {#x#} + multiset_of (f []) + 
multiset_of xs)
== multiset_of (concat (ascending a f xs)) = {#a#} + multiset_of (f []) + 
multiset_of xs
  and multiset_of (concat (descending c bs xs)) = {#c#} + multiset_of bs + 
multiset_of xs
by (induct xs and a f xs and c bs xs rule: 
sequences_ascending_descending.induct)
   (simp_all add: ac_simps)

lemma multiset_of_msort[simp]: multiset_of (msort xs) = multiset_of xs
  by (simp add: msort_def o_def)

lemma sorted_msort[simp]: sorted (msort xs)
  by (simp add: msort_def o_def sorted_merge_all[OF sorted_sequences])

lemma sort_msort: sort = msort
  by (intro ext properties_for_sort) simp_all

end

end
___
isabelle-dev mailing list
isabelle-...@in.tum.de
https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev


Re: [isabelle-dev] Merge-Sort Implementation

2011-10-27 Thread Florian Haftmann
Hi Christian,

 please find attached the formalization of the merge-sort algorithm as
 used in GHC's standard library. See also:
 
 http://hackage.haskell.org/packages/archive/base/latest/doc/html/src/Data-List.html#sort

interesting to read that comment.  The exiting quicksort implementation
in HOL is indeed taken from Isabelle's ML library.  Don't know what the
ancient motivation for quicksort has been (maybe others can comment on
this).

A critical question: according to the comment, this should easily
generalize to a stable sort_key implementation (as also the current
quicksort does).  Have you undertaken this?

All the best,
Florian

-- 

Home:
http://www.in.tum.de/~haftmann

PGP available:
http://home.informatik.tu-muenchen.de/haftmann/pgp/florian_haftmann_at_informatik_tu_muenchen_de



signature.asc
Description: OpenPGP digital signature
___
isabelle-dev mailing list
isabelle-...@in.tum.de
https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev


Re: [isabelle-dev] Merge-Sort Implementation

2011-10-27 Thread Makarius

On Thu, 27 Oct 2011, Florian Haftmann wrote:

The exiting quicksort implementation in HOL is indeed taken from 
Isabelle's ML library.  Don't know what the ancient motivation for 
quicksort has been (maybe others can comment on this).


In ancient times, the Isabelle/ML library had a really slow insertion 
sort.  I replaced that by quicksort according to the discussion of sorting 
algorithms in Larry's ML for the working programmer (1st edition), when 
I was a young student. Later the implementation was slightly tuned in 
conjunction with a lot of profiling on the real applications in the system 
-- the most critical one being normalization of sorts, which often happens

in the inference kernel.

In recent years, mergesort definitely gained more popularity than 
quicksort in general public.  Norbert Schirmer was the first to look at 
this again in ML, but he did not find a significant difference to the 
existing Isabelle/ML implementation. I've occasionally reconsidered the 
question myself, but there was never the critical mass to make a change to 
this important detail of the trusted code base.



Makarius
___
isabelle-dev mailing list
isabelle-...@in.tum.de
https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev


Re: [isabelle-dev] Merge-Sort Implementation

2011-10-27 Thread Lawrence Paulson
If my memory is correct, quicksort was the clear winner in the performance 
tests that I undertook for my book.
Larry

On 27 Oct 2011, at 13:50, Florian Haftmann wrote:

 interesting to read that comment.  The exiting quicksort implementation
 in HOL is indeed taken from Isabelle's ML library.  Don't know what the
 ancient motivation for quicksort has been (maybe others can comment on
 this).

___
isabelle-dev mailing list
isabelle-...@in.tum.de
https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev


Re: [isabelle-dev] Merge-Sort Implementation

2011-10-27 Thread Makarius

On Thu, 27 Oct 2011, Lawrence Paulson wrote:

If my memory is correct, quicksort was the clear winner in the 
performance tests that I undertook for my book.


I can confirm it for some later measurements of the refined implementation 
-- data from 2005..2007, IIRC.



There are some anecdotes about other tips from the book, such as the 
floating-point based random generator.  This made its way into Moscow ML, 
from there into Metis, from there back into the Metis version of Isabelle. 
Here it caused huge performance issues, especially in parallel Poly/ML. 
Later David Matthews improved that significantly, and the random generator 
for Metis was replaced by an old word-based version according to Knuth.


Nonetheless, we still have the real-based Library.random from ML for the 
working programmer, because without it quickcheck performs really bad. 
See also 
http://dilbert.com/dyn/str_strip/0//000/00/0/2000/300/2318/2318.strip.gif



Makarius
___
isabelle-dev mailing list
isabelle-...@in.tum.de
https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev


Re: [isabelle-dev] Merge-Sort Implementation

2011-10-27 Thread Lukas Bulwahn

On 10/27/2011 04:38 PM, Florian Haftmann wrote:

Nonetheless, we still have the real-based Library.random from ML for
the working programmer, because without it quickcheck performs really
bad.


AFAIR this has only been the case for the ancient SML quickcheck,
whereas my quickcheck implementation comes with a random generator in
HOL based on a cousin in Haskell.  If this is true, we could throw away
Library.random.  Maybe Lukas can comment on this.


What Florian mentioned is correct.

A closer code inspection tells me:

Matrix/Compute_Oracle/am_sml.ML still uses it to get a unique 
identifier, probably this can be replaced by a more standard 
serial_string ()
Slegdehammer uses it to randomly announce information from Geoff 
Sutcliffe to our users.
Mutabelle has another copy of a Random engine for some random-choice 
function.


It is seems feasible to get rid of the Random engine if one can replace 
these occurrences by something else appropriate.



Lukas
___
isabelle-dev mailing list
isabelle-...@in.tum.de
https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev