[isabelle-dev] Merge-Sort Implementation
Hi all, please find attached the formalization of the merge-sort algorithm as used in GHC's standard library. See also: http://hackage.haskell.org/packages/archive/base/latest/doc/html/src/Data-List.html#sort Due to experiments and comments found there, I suggest that this implementation is used in future Isabelle releases for Haskell code-generation ;) Some compliments are also in order: 1) I was positively surprised that the mutually recursive functions sequences, ascending, and descending where accepted without further ado by the function package. 2) Sledgehammer is great! It shrunk the proof of sorted_sequences from 7 explicit cases (about 50 lines) -- with lots of tedious instantiations to make simp or force solve the goal -- to 4 lines. And the resulting metis proof is much faster than my original one. cheers chris theory Sort_Impl imports ~~/src/HOL/Library/Multiset begin section {* GHC version of merge sort *} context linorder begin text {* Split a list into chunks of ascending and descending parts, where descending parts are reversed. Hence, the result is a list of sorted lists. *} fun sequences :: 'a list = 'a list list and ascending :: 'a = ('a list = 'a list) = 'a list = 'a list list and descending :: 'a = 'a list = 'a list = 'a list list where sequences (a#b#xs) = (if a b then descending b [a] xs else ascending b (op # a) xs) | sequences xs = [xs] | ascending a f (b#bs) = (if ~ a b then ascending b (f o op # a) bs else f [a] # sequences (b#bs)) | ascending a f bs = f [a] # sequences bs | descending a as (b#bs) = (if a b then descending b (a#as) bs else (a#as) # sequences (b#bs)) | descending a as bs = (a#as) # sequences bs fun merge :: 'a list = 'a list = 'a list where merge (a#as) (b#bs) = (if a b then b # merge (a#as) bs else a # merge as (b#bs)) | merge [] bs = bs | merge as [] = as fun merge_pairs :: 'a list list = 'a list list where merge_pairs (a#b#xs) = merge a b # merge_pairs xs | merge_pairs xs = xs lemma merge_length[simp]: length (merge xs ys) = length xs + length ys by (induct xs ys rule: merge.induct) simp_all lemma merge_pairs_length[simp]: length (merge_pairs xs) = length xs by (induct xs rule: merge_pairs.induct) simp_all fun merge_all :: 'a list list = 'a list where merge_all [] = [] | merge_all [x] = x | merge_all xs = merge_all (merge_pairs xs) definition msort :: 'a list = 'a list where msort = merge_all o sequences lemma set_merge[simp]: set (merge xs ys) = set xs Un set ys by (induct xs ys rule: merge.induct) auto lemma sorted_merge[simp]: sorted xs == sorted ys == sorted (merge xs ys) by (induct xs ys rule: merge.induct) (auto simp: sorted_Cons) lemma multiset_of_merge[simp]: multiset_of (merge xs ys) = multiset_of xs + multiset_of ys by (induct xs ys rule: merge.induct) (auto simp: ac_simps) lemma sorted_merge_pairs[simp]: ALL x:set xs. sorted x == ALL x:set (merge_pairs xs). sorted x by (induct xs rule: merge_pairs.induct) simp_all lemma multiset_of_merge_pairs[simp]: multiset_of (concat (merge_pairs xs)) = multiset_of (concat xs) by (induct xs rule: merge_pairs.induct) (auto simp: ac_simps) lemma sorted_merge_all: ALL x:set xs. sorted x == sorted (merge_all xs) by (induct xs rule: merge_all.induct) simp_all lemma multiset_of_merge_all[simp]: multiset_of (merge_all xs) = multiset_of (concat xs) by (induct xs rule: merge_all.induct) (auto simp: ac_simps) lemma shows sorted_sequences: ALL x:set (sequences xs). sorted x and [| ALL x:set (f []). x = a; sorted (f []); ALL xs ys. f (xs@ys) = f xs @ ys; ALL x. f [x] = f [] @ [x] |] == ALL x:set (ascending a f xs). sorted x and [| ALL x:set bs. c = x; sorted bs |] == ALL x:set (descending c bs xs). sorted x by (induct xs and a f xs and c bs xs rule: sequences_ascending_descending.induct) (simp_all add: sorted_append sorted_Cons, metis append_Cons append_Nil le_less_linear order_trans, metis less_le less_trans) lemma shows multiset_of_sequences[simp]: multiset_of (concat (sequences xs)) = multiset_of xs and (!!x xs. multiset_of (f (x#xs)) = {#x#} + multiset_of (f []) + multiset_of xs) == multiset_of (concat (ascending a f xs)) = {#a#} + multiset_of (f []) + multiset_of xs and multiset_of (concat (descending c bs xs)) = {#c#} + multiset_of bs + multiset_of xs by (induct xs and a f xs and c bs xs rule: sequences_ascending_descending.induct) (simp_all add: ac_simps) lemma multiset_of_msort[simp]: multiset_of (msort xs) = multiset_of xs by (simp add: msort_def o_def) lemma sorted_msort[simp]: sorted (msort xs) by (simp add: msort_def o_def sorted_merge_all[OF sorted_sequences]) lemma sort_msort: sort = msort by (intro ext properties_for_sort) simp_all end end ___ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
Re: [isabelle-dev] Merge-Sort Implementation
Hi Christian, please find attached the formalization of the merge-sort algorithm as used in GHC's standard library. See also: http://hackage.haskell.org/packages/archive/base/latest/doc/html/src/Data-List.html#sort interesting to read that comment. The exiting quicksort implementation in HOL is indeed taken from Isabelle's ML library. Don't know what the ancient motivation for quicksort has been (maybe others can comment on this). A critical question: according to the comment, this should easily generalize to a stable sort_key implementation (as also the current quicksort does). Have you undertaken this? All the best, Florian -- Home: http://www.in.tum.de/~haftmann PGP available: http://home.informatik.tu-muenchen.de/haftmann/pgp/florian_haftmann_at_informatik_tu_muenchen_de signature.asc Description: OpenPGP digital signature ___ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
Re: [isabelle-dev] Merge-Sort Implementation
On Thu, 27 Oct 2011, Florian Haftmann wrote: The exiting quicksort implementation in HOL is indeed taken from Isabelle's ML library. Don't know what the ancient motivation for quicksort has been (maybe others can comment on this). In ancient times, the Isabelle/ML library had a really slow insertion sort. I replaced that by quicksort according to the discussion of sorting algorithms in Larry's ML for the working programmer (1st edition), when I was a young student. Later the implementation was slightly tuned in conjunction with a lot of profiling on the real applications in the system -- the most critical one being normalization of sorts, which often happens in the inference kernel. In recent years, mergesort definitely gained more popularity than quicksort in general public. Norbert Schirmer was the first to look at this again in ML, but he did not find a significant difference to the existing Isabelle/ML implementation. I've occasionally reconsidered the question myself, but there was never the critical mass to make a change to this important detail of the trusted code base. Makarius ___ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
Re: [isabelle-dev] Merge-Sort Implementation
If my memory is correct, quicksort was the clear winner in the performance tests that I undertook for my book. Larry On 27 Oct 2011, at 13:50, Florian Haftmann wrote: interesting to read that comment. The exiting quicksort implementation in HOL is indeed taken from Isabelle's ML library. Don't know what the ancient motivation for quicksort has been (maybe others can comment on this). ___ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
Re: [isabelle-dev] Merge-Sort Implementation
On Thu, 27 Oct 2011, Lawrence Paulson wrote: If my memory is correct, quicksort was the clear winner in the performance tests that I undertook for my book. I can confirm it for some later measurements of the refined implementation -- data from 2005..2007, IIRC. There are some anecdotes about other tips from the book, such as the floating-point based random generator. This made its way into Moscow ML, from there into Metis, from there back into the Metis version of Isabelle. Here it caused huge performance issues, especially in parallel Poly/ML. Later David Matthews improved that significantly, and the random generator for Metis was replaced by an old word-based version according to Knuth. Nonetheless, we still have the real-based Library.random from ML for the working programmer, because without it quickcheck performs really bad. See also http://dilbert.com/dyn/str_strip/0//000/00/0/2000/300/2318/2318.strip.gif Makarius ___ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
Re: [isabelle-dev] Merge-Sort Implementation
On 10/27/2011 04:38 PM, Florian Haftmann wrote: Nonetheless, we still have the real-based Library.random from ML for the working programmer, because without it quickcheck performs really bad. AFAIR this has only been the case for the ancient SML quickcheck, whereas my quickcheck implementation comes with a random generator in HOL based on a cousin in Haskell. If this is true, we could throw away Library.random. Maybe Lukas can comment on this. What Florian mentioned is correct. A closer code inspection tells me: Matrix/Compute_Oracle/am_sml.ML still uses it to get a unique identifier, probably this can be replaced by a more standard serial_string () Slegdehammer uses it to randomly announce information from Geoff Sutcliffe to our users. Mutabelle has another copy of a Random engine for some random-choice function. It is seems feasible to get rid of the Random engine if one can replace these occurrences by something else appropriate. Lukas ___ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev