Re: [Haskell-cafe] Re: Church Encoding Function
Hi, Am Samstag, den 10.03.2007, 14:52 -0500 schrieb Stefan Monnier: I'm pretty sure you can define a catamorphism for any regular algebraic data type. Actually, so-called negative occurrences in (regular) data types cause problems. Try to define the catamorphism of data Exp = Num Int | Lam (Exp - Exp) | App Exp Exp to see the problem, I guess Robert is still true as this Exp contains a non-algebraic type ((-)), therefore is not a regular algebraic type. (Based on an assumed meaning of “algebraic data type”). But let’s see... Am I right to assume that I have found a catamorphism if and only if the that function, applied to the data type constructors (in the right order) gives me the identity on this data type? maybe Nothing Just == id :: Maybe - Maybe \b - if b then True else False == id :: Bool - Bool foldr (:) []== id :: [a] - [a] uncurry (,) == id :: (a,b) - (a,b) either Left Right == id :: Either a b - Either a b Does that also mean that catamorphism for a given type are unique (besides argument re-ordering)? Now some brainstorming about the above Exp. I guess we want: exp Num Lam App e == id e therefore exp :: (Int - Exp) ((Exp - Exp) - Exp) (Exp - Exp - Exp) Exp now we want the return type to not matter exp :: forall b. (Int - b) ((Exp - Exp) - b) (Exp - Exp - b) Exp So my guess would be: exp f _ _ (Num i) = f i exp _ f _ (Lam g) = f g exp _ _ f (App e1 e2) = f e1 e2 Looks a bit stupid, but seems to work, especially as there is not much a function with type (Exp - Exp) - b can do, at least on it’s own. Is that a catamorphism? Greetings, Joachim -- Joachim Breitner e-Mail: [EMAIL PROTECTED] Homepage: http://www.joachim-breitner.de ICQ#: 74513189 ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Church Encoding Function
When you make these kind of elimination functions you have a choice. Namely if you want to do case analysis only, or primitive recursion. For non-recursive data types they come out the same, so 'maybe', 'either', 'uncurry', etc are the same. But for lists it differs: -- Case analysis caseList :: (a - [a] - r) - r - [a] - r caseList c n [] = n caseList c n (x:xs) = c x xs And primitive recursion is foldr. It's easy to write the case analysis function for any data type; it just encodes what 'case' does. If you're in a language with general recursion there is no loss in expressive power by just having the case analysis functions instead of the primitive recursive ones since you can recurse yourself. But, in a language without Y you'd be stuck with boring functions with the likes of foldr. -- Lennart On Mar 11, 2007, at 08:27 , Joachim Breitner wrote: Hi, Am Samstag, den 10.03.2007, 14:52 -0500 schrieb Stefan Monnier: I'm pretty sure you can define a catamorphism for any regular algebraic data type. Actually, so-called negative occurrences in (regular) data types cause problems. Try to define the catamorphism of data Exp = Num Int | Lam (Exp - Exp) | App Exp Exp to see the problem, I guess Robert is still true as this Exp contains a non-algebraic type ((-)), therefore is not a regular algebraic type. (Based on an assumed meaning of “algebraic data type”). But let’s see... Am I right to assume that I have found a catamorphism if and only if the that function, applied to the data type constructors (in the right order) gives me the identity on this data type? maybe Nothing Just == id :: Maybe - Maybe \b - if b then True else False == id :: Bool - Bool foldr (:) []== id :: [a] - [a] uncurry (,) == id :: (a,b) - (a,b) either Left Right == id :: Either a b - Either a b Does that also mean that catamorphism for a given type are unique (besides argument re-ordering)? Now some brainstorming about the above Exp. I guess we want: exp Num Lam App e == id e therefore exp :: (Int - Exp) ((Exp - Exp) - Exp) (Exp - Exp - Exp) Exp now we want the return type to not matter exp :: forall b. (Int - b) ((Exp - Exp) - b) (Exp - Exp - b) Exp So my guess would be: exp f _ _ (Num i) = f i exp _ f _ (Lam g) = f g exp _ _ f (App e1 e2) = f e1 e2 Looks a bit stupid, but seems to work, especially as there is not much a function with type (Exp - Exp) - b can do, at least on it’s own. Is that a catamorphism? Greetings, Joachim -- Joachim Breitner e-Mail: [EMAIL PROTECTED] Homepage: http://www.joachim-breitner.de ICQ#: 74513189 ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] idea for avoiding temporaries
On 3/11/07, Donald Bruce Stewart [EMAIL PROTECTED] wrote: bulat.ziganshin: Hello Claus, Saturday, March 10, 2007, 4:36:22 AM, you wrote: ah, ok, i'm not used to thinking in such scales;-) (perhaps you should get in touch with those SAC people, after all - i don't know what their state of play is, but many years ago, they started in an office near mine, and they were definitely thinking about large arrays, even about how to distribute them, and computations on them; last days i learned details of google's MapReduce system. seems that this approach is very interesting for dealing with large arrays. files (arrays) are splitted into chunks, operations are splitted into chunks, too. afaik, some C compilers are already able to automatically split vector operations into several threads? at least, it will be interesting to implement same technique for GHC, may be just in form of library, like google does See the data parallel arrays library: http://haskell.org/haskellwiki/GHC/Data_Parallel_Haskell -- Don (which, btw, is much more interesting than google's mapreduce stuff, since it does *nested* data parallelism) -- Sebastian Sylvan +46(0)736-818655 UIN: 44640862 ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: idea for avoiding temporaries
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Simon Marlow wrote: GHC doesn't have any kind of uniqueness analysis right now. It's pretty hard to do in general: imagine a function that takes an array as an argument and delivers an array as a result. It'll probably need two versions: one when the argument is unique, one for when it's not. What if the compiler might only create the version that requires uniqueness, and callers that don't already have uniqueness must make a copy? (or that could be a trivial wrapper function alongside the main uniqueness-requiring version). Would this be at all significantly worthwhile? Isaac -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.3 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFF9BRtHgcxvIWYTTURAmDnAJ9+e5M5k6PmmfHJwqpZrsIopNX5tQCg06rk CRsmtjedyOJ1ARvYijYUCp4= =abyA -END PGP SIGNATURE- ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] idea for avoiding temporaries
* The algorithm as written already tries to express minimal storage. The only question is, do +=, -=, and := update their left-hand side in place, or do we think of them (in the Haskell model of the universe) as fresh arrays, and the previous values as newly-created garbage? My challenge to fellow Haskell hackers: find a way to express this such that it doesn't look so imperative. * Even if we do a good job with += and -=, which is what David seems to be looking for, we still have those irritating := assignments--- we'd like to throw out the old p and reuse the space in the last line. And we'd like to have one piece of storage to hold the q on each successive iteration. So even if we solve David's problem, we still haven't matched the space requirements of the imperative code. i find it difficult to discuss performance issues without concrete code examples, so i decided to convert Jan-Willem's loop code into Haskell. at first, i just naively translated the loop to recursion over lists, then i tried to produce an update-inplace variant based on some form of arrays, and finally i added a variant based on strict lists (would be nice to have standard libraries for (head-/spine-)strict lists, btw). both the array and strict list versions avoid some intermediate structures; for the arbitrarily invented, relatively small inputs i've tried, strict lists are the clear winner, thanks to lower memory traffic, but i'd like some feedback from the experts: -are there any obvious inefficiencies in the array code? -what does your real code look like, in view of scaling to much larger inputs? -i tried to keep the computations equal, but seem to have introduced a small variance in the strict list version, which i can't seem to spot by staring at it. any ideas? -i assume the memory overhead of binary strict lists is unacceptable for large inputs, but i'm still waiting for those polymorphic bytestrings..;-) while playing with this code, it occurred to me that situations as that described by David (small numbers of huge structures of constant size, with no nested pointers) are less suited for compacting garbage collection than for slot-reusing reference counting. GC wins in the common situation of many variable-sized, frequently created/destroyed structures, by touching only those objects that are live when space runs out, while RC has a higher continuous overhead. as i mentioned earlier, i'm not sure whether update-in-place vs recreate is so bad for whole-array updates, but the real win could be not having to copy the huge live arrays arround at GC time (either from old to new space, or within one space, to create contiguous free slots while compacting the occupied slots). so instead of update-in-place in virtual memory space, which will still be copied at GC time, one would want to pin the virtual region in which the arrays are allocated to a real memory region, and tell the GC to keep its hands off it (none of that space will be released until the loop ends, all of it will be released after the loop ends and a copy of z has been made). does this make sense to the number crunchers and memory managers out there? and is there a way to allocate Haskell arrays to such pinned memory regions? claus CG.hs Description: Binary data ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Performance Help
I have re-written the sha1 code so that it is (hopefully) easy to see that it faithfully implements the algorithm (see http://www.itl.nist.gov/fipspubs/fip180-1.htm). Having got rid of the space leak, I have been trying to improve performance. Currently, the haskell code is 2 orders of magnitude slower than the sha1sum that ships with my linux. [EMAIL PROTECTED]:~/sha1/testdist/sha1 time ./perfTest perfTest c7eae62ddabb653bb9ce4eb18fa8b94264f92a76 Success real0m2.152s user0m2.112s sys 0m0.028s [EMAIL PROTECTED]:~/sha1/testdist/sha1 time sha1sum perfTest c7eae62ddabb653bb9ce4eb18fa8b94264f92a76 perfTest real0m0.057s user0m0.008s sys 0m0.004s I've played around with profiling and doubled the performance of the haskell code but I'm nowhere near the C performance. Sun Mar 11 19:32 2007 Time and Allocation Profiling Report (Final) perfTest +RTS -p -RTS eg total time =6.75 secs (135 ticks @ 50 ms) total alloc = 1,483,413,752 bytes (excludes profiling overheads) COST CENTREMODULE %time %alloc oneBlock Data.Digest.SHA1 39.3 40.1 $ Data.Digest.SHA1 20.7 21.6 f Data.Digest.SHA1 13.36.2 getWord32s Data.Digest.SHA1 7.46.6 test2 Main 5.98.7 blockWord8sIn32Data.Digest.SHA1 5.25.3 blockWord8sIn512 Data.Digest.SHA1 3.04.4 padData.Digest.SHA1 1.53.5 k Data.Digest.SHA1 1.50.0 fromBytes Data.Digest.SHA1 0.03.5 Here's the code that is taking the majority of the time. ($) :: [Word32] - [Word32] - [Word32] a $ b = zipWith (+) a b -- Word128 - Word512 - Word128 oneBlock ss xs = Word128 (as!!80) (bs!!80) (cs!!80) (ds!!80) (es!!80) where ws = xs ++ map (rotL 1) (zipWith4 xxxor wm3s wm8s wm14s ws) where xxxor a b c d = a `xor` b `xor` c `xor` d wm3s = drop (16-3) ws wm8s = drop (16-8) ws wm14s = drop (16-14) ws as = ai:ts bs = bi:as cs = ci:(map (rotL 30) bs) ds = di:cs es = ei:ds ts = (map (rotL 5) as) $ (zipWith4 f [0..] bs cs ds) $ es $ (map k [0..]) $ ws Word128 ai bi ci di ei = ss Any help would be appreciated. I've put a copy of a working system here if anyone wants to experiment (http://www.haskell.org/crypto/downloads/sha1.tar.gz). Thanks, Dominic. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Performance Help
On Sun, Mar 11, 2007 at 08:18:44PM +, Dominic Steinitz wrote: Word128 ai bi ci di ei = ss 128 is not divisible by 5. You should probably rename that type :) Stefan ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Performance Help
On Sunday 11 March 2007 20:46, Stefan O'Rear wrote: On Sun, Mar 11, 2007 at 08:18:44PM +, Dominic Steinitz wrote: Word128 ai bi ci di ei = ss 128 is not divisible by 5. You should probably rename that type :) Stefan I must have been thinking of MD5. Yes Word160 would be better. Dominic. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re[2]: [Haskell-cafe] idea for avoiding temporaries
Hello Claus, Sunday, March 11, 2007, 10:03:59 PM, you wrote: both the array and strict list versions avoid some intermediate structures; for the arbitrarily invented, relatively small inputs i've tried, strict lists are the clear winner, thanks to lower memory traffic, but i'd like some feedback from the experts: -are there any obvious inefficiencies in the array code? obviously, arrays version should create no temporary cells. the problems was mainly due to 2 factors: 1) readArray m (i,j) 2) 'op' in 'l' which was passed as real closure and was not inlined due to weakness of ghc optimizer also, we should help strictness analyzer by marking all the variables used in tight loops as strict. after that is done, we got 1000 times less temporary data allocated and 5x faster execution. now it's a bit faster than strict lists -- Best regards, Bulatmailto:[EMAIL PROTECTED] CG.hs Description: Binary data ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Re[2]: [Haskell-cafe] idea for avoiding temporaries
Hi Bulat, obviously, arrays version should create no temporary cells. that's why the memory traffic surprised me. i knew there had to be something wrong. the problems was mainly due to 2 factors: 1) readArray m (i,j) yes, indeed. since we are dealing in bulk operations, we might as well take advantage of that, so dropping the repeated bounds-checks inside the loops makes a lot of sense. moral/note to self: bulk array operations are your friend (i knew that!-), but you need to use that when defining them (unsafeRead et al are only for library writers, but library writers ought to use them; and i was writing a small inlined library) 2) 'op' in 'l' which was passed as real closure and was not inlined due to weakness of ghc optimizer it irked me having to write the same loop twice, but i didn't consider the consequences. an INLINE pragma on l almost seems sufficient to regain the loss, so i prefer that; but writing out the loop twice is still a tiny tick faster.. we should help strictness analyzer by marking all the variables used in tight loops as strict. ah, there were a few surprises in that one. i thought i had considered possible strictness problems, but obviously, i missed a few relevant possibilities. annotating everything is the safe option, and clearly documents the intentions, but i cannot help thinking about which annotations could be omitted: - modArray: a and i are both used anyway - i index in loop is definitely checked (but j isn't, and some others weren't, either; so better safe than sorry) - some of the parameters need not be annotated in this example, but should be if one wanted to reuse the code elsewhere - the one i really don't get yet is the one on the accumulators (!s in l, in dotA/matA); i thought i had covered that by using $! in applying the loop, but annotating the formal loop parameter, apart from being nicer, also seems faster.. moral/note to self: don't try to be clever, try to be clear..; strictness in formal parameters is better than strictness in actual parameters; bang-patterns are good;-) after that is done, we got 1000 times less temporary data allocated and 5x faster execution. now it's a bit faster than strict lists is this now anywhere near what the number-crunchers use, when they use Haskell?-) Bulat, thanks for looking into it and for isolating the issues so quickly!-) claus ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] State of Ignorance
Hans van Thiel wrote: toMyState :: String - MyState String Int toMyState x = MyStateC (repl1 x) where the monad is defined as: data MyState a b = MyStateC ([a] - ([a], b)) instance Monad (MyState a) where return x = MyStateC (\tb - (tb, x)) (MyStateC st) = f = MyStateC (\tb - let (newtb, y) = st tb (MyStateC trans) = f y in trans newtb ) Now I understand this (barely) in the sense that = works through defining how f takes its values from st. So, f would be the function: toMystate and trans would be: (repl1 x). But then y would have to be of type String, whereas the y in the tuple would have type Int, since it is generated by st. I just don't get it. No, y has type String. In more detail: in order for foo = toMyState to make sense, these must be the types: foo :: MyState String String st :: [String] - ([String], String) y is forced to have type String for the sake of toMyState. They work, but now I don't understand sequence, defined in the Prelude. From: a Tour of the Haskell Monad Functions http://members.chello.nl/hjgtuyl/tourdemonad.html sequence :: Monad m = [m a] - m [a] sequence = foldr mcons (return []) where mcons p q = p = \x - q = \y - return (x : y) This has a different bind (=) from the one in the MyState monad, yet is appears to perform all the state transformations. The = used by sequence is the same = in the MyState monad, since you instantiate m to MyState String. Therefore, sequence performs all the state transformations correctly, since = is correct. A method of understanding a program is to reinvent it. Let us do that to sequence. Preliminary: I have three actions: p = toMyState house b = toMyState tree c = toMyState house I want to perform them in that order; moreover, each returns a number, and I want to collect all three numbers into a list, also in that order. If I may hardcode everything, I may write: do x - p y0 - b y1 - c return [x,y0,y1] But we always want to generalize. I do not want to hardcode p,b,c; I want to take them from a parameter, which can be a list of arbitrary length in general. To see how, I first rewrite my hardcoded version as: do x - p y - do {y0 - b; y1 - c; return [y0,y1]} return (x:y) I have the middle line handle the rest of the list, and if somehow it works correctly (later I will replace it by recursion), I just have to handle the first of the list before, and return the combined answer after. Now I use recursion in the middle, and introduce parameters: mysequence (p:bc) = do x - p y - mysequence bc return (x:y) mysequence [] = return [] Or in terms of = : mysequence (p:bc) = p = \x - mysequence bc = \y - return (x:y) mysequence [] = return [] This is now equivalent to the sequence code in the standard prelude. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Haskell Weekly News: March 12, 2007
--- Haskell Weekly News http://sequence.complete.org/hwn/20070312 Issue 59 - March 12, 2007 --- Welcome to issue 59 of HWN, a weekly newsletter covering developments in the [1]Haskell community. This week we see the 2007 Haskell Workshop announcement, Haskell.org's participation in the Google Summer of Code gets underway, and of course, new libraries! 1. http://haskell.org/ Announcements Google Summer of Code and Haskell.org. Malcolm Wallace [2]announced that Haskell.org has once again applied to be a mentoring organisation for the Google Summer of Code. If you are a student who would like to earn money hacking in Haskell, or you are a non-student who has a cool idea for a coding project but no time to do it yourself, then visit the [3]SoC wiki to gather ideas, and add yourself to the list of interested people! Add new ideas for projects! 2. http://article.gmane.org/gmane.comp.lang.haskell.cafe/20232 3. http://hackage.haskell.org/trac/summer-of-code Haskell Workshop Call for Papers. Gabriele Keller [4]announced the initial call for papers for the Haskell Workshop 2007, part of the 2007 International Conference on Functional Programming (ICFP). The purpose of the Haskell Workshop is to discuss experience with Haskell, and possible future developments for the language. The scope of the workshop includes all aspects of the design, semantics, theory, application, implementation, and teaching of Haskell. 4. http://article.gmane.org/gmane.comp.lang.haskell.general/14977 Data.CompactString 0.3: Unicode ByteString. Twan van Laarhoven [5]announced version 0.3 of the Data.CompactString library. Data.CompactString is a wrapper around Data.ByteString supporting Unicode strings. 5. http://article.gmane.org/gmane.comp.lang.haskell.general/14973 harchive-0.2: backup and restore software in Haskell. David Brown [6]announced release 0.2 of [7]harchive, a program for backing up and restoring data. The package is available [8]from Hackage. 6. http://article.gmane.org/gmane.comp.lang.haskell.general/14972 7. http://www.davidb.org/darcs/harchive/ 8. http://hackage.haskell.org/cgi-bin/hackage-scripts/package/harchive-0.2 New release of regex packages. Chris Kuklewicz [9]announced new versions of the regex-* packages (base,compat,dfa,parsec,pcre,posix,tdfa,tre). There is a new [10]wiki page with documentation relating to these packages. All packages are available from [11]Hackage, under the [12]Text Category. 9. http://article.gmane.org/gmane.comp.lang.haskell.cafe/20189 10. http://haskell.org/haskellwiki/Regular_expressions 11. http://hackage.haskell.org/packages/hackage.html 12. http://hackage.haskell.org/packages/archive/pkg-list.html#cat:Text StaticDTD: type safe markup combinators from DTDs. Marcel Manthe [13]announced a tool that transforms a Document Type Definition to a library. The resulting library contains combinators that assure proper nesting of elements. The plan is to add more constraints that will also take care of the order of occurrence of children. The parsing of the DTD is done with HaXml. The code is [14]available via darcs. 13. http://article.gmane.org/gmane.comp.lang.haskell.cafe/20218 14. http://m13s07.vlinux.de/darcs/StaticDTD/ IPv6 support for network package. Bryan O'Sullivan [15]announced that he'd added IPv6 support to the network package. 15. http://thread.gmane.org/gmane.comp.lang.haskell.libraries/6363 Type-level binary arithmetic library. Oleg Kiselyov and Chung-chieh Shan [16]announced a [17]new library for arbitrary precision binary arithmetic over natural kinds. The library supports addition/subtraction, predecessor/successor, multiplication/division, exp2, full comparisons, GCD, and the maximum. At the core of the library are multi-mode ternary relations Add and Mul where any two arguments determine the third. Such relations are especially suitable for specifying static arithmetic constraints on computations. The type-level numerals have no run-time representation; correspondingly, all arithmetic operations are done at compile time and have no effect on run-time. 16. http://article.gmane.org/gmane.comp.lang.haskell.general/14961 17. http://pobox.com/~oleg/ftp/Computation/resource-aware-prog/BinaryNumber.hs Haskell' This section covers the [18]Haskell' standardisation process. * [19]Deriving Functor 18. http://hackage.haskell.org/trac/haskell-prime 19. http://thread.gmane.org/gmane.comp.lang.haskell.prime/2135 Libraries This week's proposals and extensions to the [20]standard libraries. * [21]Add IPv6 support to network library * [22]Error handling conventions 20.
Re: [Haskell-cafe] Performance Help
Dominic Steinitz wrote: Any help would be appreciated. I notice that there's not much user-accessible documentation of what you can expect GHC (or some other Haskell implementation) to do and not do with a given piece of code. For example, you have a lot of little definitions that clearly traverse the same lists many times. I've no idea where I would look, except for the compiler source, to get a sense for when, if ever, the compiler might apply CSE, fusion, or any other techniques that come to mind. So transmitting folk wisdom on what the compiler might do with any given piece of code counts as another half chapter in the Practical Haskell book that ought to get written :-) b ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe