Re: [Haskell-cafe] SYB looping very, very mysteriously
On Mon, Dec 7, 2009 at 3:38 PM, David Fox dds...@gmail.com wrote:
On Sat, Dec 5, 2009 at 2:38 AM, Andrea Vezzosi sanzhi...@gmail.com wrote:
On Fri, Dec 4, 2009 at 8:51 PM, Jeremy Shaw jer...@n-heptane.com wrote:

I have stripped things down to the bare minimum, and tested under GHC 6.10, GHC 6.12, Linux, and Mac OS X. Results are consistent. In the following code:

1. if you load the code into ghci and evaluate e, it will hang, but (defaultValueD dict) :: Expression returns fine.
2. if you change the gunfold instance for Proposition to error "gunfold", it stops hanging -- even though this code is never called.
3. if you change (Data ctx [Expression], Sat (ctx Expression)) => Data ctx Expression to (Data ctx Expression, ...) => ..., it stops hanging.

If someone could explain why each of these cases performs as it does, that would be awesome! Right now it is a big mystery to me. e calls dict, and there is only one instance of dict available, which should call error right away. I can't see how something could get in the way there...

It's less of a mystery if you think about the actual dictionaries GHC uses to implement typeclasses. The instance for Data ctx [a] depends on Data ctx a, so by requiring Data ctx [Expression] in the Data ctx Expression instance you're indeed making a loop there, though typeclasses do allow this, and the implementation has to be lazy enough to permit it. Strange that with a direct Data ctx Expression => Data ctx Expression loop we don't get the same problem. The reason the implementation of Proposition's gunfold matters is probably that k gets passed the dictionary for Data DefaultD Expression at the site of its call, and some optimization is making it stricter than necessary. Looks like we need a GHC dev here to fully unravel the mystery; in the meantime I'll try to reduce the test case even further.

I have posted a GHC bug for this: http://hackage.haskell.org/trac/ghc/ticket/3731 and an syb-with-class bug, in case it is not a GHC bug (perhaps due to undecidable instances?): http://code.google.com/p/syb-with-class/issues/detail?id=3

While trying to make the test case on the GHC Trac self-contained I've found a workaround which involves changing the Default [a] instance in happstack-data. I've attached the patch. I don't know with certainty why it works, so it'd be nice if you could test it in a large codebase at least. The problem is that one should work out how all these dictionaries get constructed, and that means staring at a lot of Core.

[Attachment: workaround-for-default-wrt-http___hackage_haskell_org_trac_ghc_ticket_3731.dpatch]
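[Editor's note: a self-contained sketch (toy classes, not the syb-with-class code) of the kind of dictionary recursion Andrea describes. The instance for T requires the instance for MyList T, which in turn requires the instance for T; GHC ties this knot with lazily built dictionaries, which is why such loops are legal at all, and why anything that forces a dictionary too eagerly can diverge.]

  {-# LANGUAGE FlexibleContexts, UndecidableInstances #-}

  class C a where
    size :: a -> Int

  newtype MyList a = MyList [a]

  -- The list-like instance needs the element instance...
  instance C a => C (MyList a) where
    size (MyList xs) = sum (map size xs)

  data T = T

  -- ...and the element instance needs the list-like instance, mirroring
  -- the (Data ctx [Expression], ...) => Data ctx Expression shape above.
  instance C (MyList T) => C T where
    size T = 1

  main :: IO ()
  main = print (size (MyList [T, T]))  -- prints 2; the dictionaries are
                                       -- mutually recursive but lazy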
Re: [Haskell-cafe] Failing to install hxt
I think the best bet is to: cabal install tagsoup-0.6 --reinstall

This somehow didn't work, since hxt-8.3.2 insisted on using tagsoup-0.8 even when I removed every tagsoup-0.8 directory.

And perhaps send the HXT maintainer a cabal patch?

Ok, did this.
[Haskell-cafe] Re: PROPOSAL: Web application interface
Michael Snoyman wrote:

[--snip--]
Next, I have made the ResponseBodyClass typeclass specifically with the goal of allowing optimizations for lazy bytestrings and sending files. The former seems far-fetched; the latter provides the ability to use a sendfile system call instead of copying the file data into memory. However, in the presence of gzip encoding, how useful is this optimization?
[--snip--]

I'm hoping that the "Web" bit in your project title doesn't literally mean that WAI is meant to be restricted to solely serving content to browsers. With that caveat in mind:

For non-WWW HTTP servers it can be extremely useful to have sendfile. An example is my Haskell UPnP Media Server (hums) application. It's sending huge files (AVIs, MP4s, etc.) over the network, and since these files are already compressed as much as they're ever going to be, gzip would be useless. The CPU load of my hums server went from 2-5% to 0% when streaming files, just from switching from a Haskell I/O based solution to proper sendfile.

Lack of proper support for sendfile() was indeed one of the reasons that I chose to roll my own HTTP server for hums. I should note that this was quite a while ago and I haven't really gone back to reevaluate that choice -- there are too many HTTP stacks to choose from right now and I don't have the time to properly evaluate them all.

For this type of server, response *streaming* is also extremely important for those cases where you cannot use sendfile, so I'd hate to see a standard WAI interface preclude that. (No, lazy I/O is NOT an option -- the HTTP clients in a typical UPnP media client behave so badly that you'll run out of file descriptors in no time. Trust me, I've tried.)

Cheers,
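[Editor's note: a minimal sketch (hypothetical types, not the actual WAI proposal) of why a dedicated file variant in the response body type enables the sendfile optimization discussed above, and how push-style streaming avoids lazy I/O.]

  import qualified Data.ByteString as B
  import qualified Data.ByteString.Lazy as L

  -- Hypothetical response-body type: distinct constructors let a
  -- backend pick the best delivery strategy for each case.
  data ResponseBody
    = ResponseLBS L.ByteString
      -- ^ in-memory body; may be gzip-encoded before sending
    | ResponseFile FilePath
      -- ^ backend may hand this to sendfile(2), bypassing Haskell I/O
    | ResponseStream ((B.ByteString -> IO ()) -> IO ())
      -- ^ push-style streaming without lazy I/O: the application is
      --   given a 'send' action and drives the output itself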
Re: [Haskell-cafe] Failing to install hxt
Thank you, that worked. I wonder two things:

1) How to downgrade cabal packages? Just installing tagsoup-0.6 and removing every trace of tagsoup-0.8 somehow didn't work for me. cabal still insisted that there should be a tagsoup-0.8.

2) Is the commonly used version scheme ">= X.Y.Z" ideal? Maybe package maintainers should depend on exactly one version they know works? I mean, how can you guarantee that every future version of a package will work?

Daniel Fischer daniel.is.fisc...@web.de wrote on 24 January 2010 at 00:59:

  cabal unpack hxt-8.3.2
  (open hxt.cabal in kate, change the range of the required tagsoup to >= 0.6 && < 0.7)
  cd hxt-8.3.2
  cabal install

Perhaps not the _best_ solution, but quick, and it usually works.
Re: [Haskell-cafe] Re: could we get a Data instance for Data.Text.Text?
Hi,

The problem with Data for Text isn't that we have to write a new instance, but that you could argue that proper handling of Text with Data would not use a type class, but would have special knowledge baked into Data. That's far worse than the Serialise problem mentioned above, and no one other than the Data authors could solve it. Of course, I don't believe that, but it is a possible interpretation.

The Serialise problem is a serious one. I can't think of any good solutions, but I recommend you give knowledge of your serialise class to Derive (http://community.haskell.org/~ndm/derive/) and then at least the instances can be auto-generated. Writing lots of boilerplate and regularly ripping it up is annoying; setting up something to generate it for you reduces the pain.

The only safe rule is: if you don't control the class, C, or you don't control the type constructor, T, don't make instance C T.

I agree in principle, but in the real world you can't live by this rule. For example, I want to use Uniplate to traverse the tree built by haskell-src-exts. Using Data.Data is too slow, so I need to make my own instances. HSE provides something like 50 types that need instances, and it has to be exactly those types. Also, Uniplate requires instances of a particular class it has.

Read my recent blog post (http://neilmitchell.blogspot.com/2010/01/optimising-hlint.html): I optimised Uniplate for working with HSE on top of the Data instances - it's now significantly faster in some cases, which may mean you don't need to resort to the Direct stuff. Of course, if you do, then generating them with Derive is the way to go.

Thanks, Neil
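[Editor's note: to illustrate the trade-off Neil mentions, here is a minimal sketch (with a hypothetical toy Expr type, nothing to do with HSE's 50 types) of a hand-written instance from uniplate's Data.Generics.Uniplate.Direct module, the "Direct stuff" that avoids the slower Data.Data-based traversal.]

  import Data.Generics.Uniplate.Direct

  data Expr = Lit Int
            | Add Expr Expr
            | Neg Expr

  -- One equation per constructor: |* marks each child to traverse.
  instance Uniplate Expr where
    uniplate (Add a b) = plate Add |* a |* b
    uniplate (Neg a)   = plate Neg |* a
    uniplate x         = plate x

  main :: IO ()
  main = print (length (universe (Add (Lit 1) (Neg (Lit 2)))))
  -- prints 4: the term itself plus its three subexpressions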
Re: [Haskell-cafe] Spelling checker exercise
On Sunday 24 January 2010 06:19:58, Matthew Phillips wrote:

Thanks very much Daniel for giving my (amateurish!) exercise such an in-depth look-over. Comments inline below.

On 23/01/2010, at 12:11 AM, Daniel Fischer wrote:

That does help, but the worst part is building the map. That takes a couple of seconds in Python, too. Just building the map takes 1.95s for Python, 3.6s (including GC) with strict ByteStrings, 4.2s with lazy ByteStrings and 6s with plain Strings here.

Correction: I took the total time for a no-argument run as the time it took Python to build the map; adding timing for the map-building, that says it takes ~1.45 seconds. I'm clueless as to what Python needs the remaining half second for.

I get the Python version running in about 1s, compared to 5.6s for the Haskell:

  $ python --version
  Python 2.6

With psyco.full():

  $ time python ./norvig.py becuase
  Trained in 1.16967606544 seconds
  ./norvig.py because
  1.52user 0.08system 0:01.60elapsed 100%CPU

Without psyco:

  $ time python ./norvig.py becuase
  Trained in 1.45706319809 seconds
  ./norvig.py because
  1.95user 0.08system 0:02.03elapsed 100%CPU

  $ time python ./norvig.py
  Trained in 1.46250891685 seconds
  ./norvig.py
  1.95user 0.09system 0:02.04elapsed 100%CPU

  $ time python spelling.py because
  real 0m1.071s
  user 0m0.821s
  sys  0m0.139s

  $ time ./spelling becuase
  becuase -> because
  real 0m5.589s
  user 0m4.554s
  sys  0m0.307s

And, strangely, the rewrite you provided (I called it spelling_df) runs a fair bit slower:

  $ time ./spelling_df becuase
  becuase -> because
  real 0m8.087s
  user 0m6.966s
  sys  0m0.193s

  $ time ./spelling korrekt
  korrekt -> correct
  real 0m5.970s
  user 0m4.885s
  sys  0m0.300s

  $ time ./spelling_df korrekt
  korrekt -> correct
  real 0m8.616s
  user 0m7.538s
  sys  0m0.187s

I think I know what happened here:

  $ ghc -fforce-recomp --make matthew -o matthew0
  [1 of 1] Compiling Main ( matthew.hs, matthew.o )
  Linking matthew0 ...
  $ ghc -O2 -fforce-recomp --make matthew -o matthew2
  [1 of 1] Compiling Main ( matthew.hs, matthew.o )
  Linking matthew2 ...
  $ time ./matthew0 becuase
  becuase -> because
  7.07user 0.21system 0:07.28elapsed 99%CPU
  $ time ./matthew2 becuase
  becuase -> because
  6.01user 0.19system 0:06.21elapsed 100%CPU
  $ ghc -fforce-recomp --make spellingBS -o spelling0
  [1 of 1] Compiling Main ( spellingBS.hs, spellingBS.o )
  Linking spelling0 ...
  $ ghc -O2 -fforce-recomp --make spellingBS -o spelling2
  [1 of 1] Compiling Main ( spellingBS.hs, spellingBS.o )
  Linking spelling2 ...
  $ time ./spelling0 becuase
  becuase -> because
  9.78user 0.09system 0:09.87elapsed 100%CPU
  $ time ./spelling2 becuase
  becuase -> because
  3.57user 0.03system 0:03.60elapsed 100%CPU

I habitually compile all code with -O2, unless I have a specific reason not to. I tend to forget that some compile without optimisations. For some kinds of code that makes hardly any difference; for others it makes a big difference. *** Don't even think of using ByteStrings without optimising. ***

readFile does not appear in my profile.

Apologies, I should have said that I'd inserted some SCCs to try to tease out the cost of readFile (i.e. {-# SCC "readFile" #-}).

If you insert an SCC for updateMap, where

  updateMap model word = {-# SCC "updateMap" #-} insertWith' (+) word 1 model

, you'll see that the really bad citizen is updateMap (splitWords is rather bad, too; together they take some 95% of the time in that profile).

Maybe I'm doing this wrong, but I see splitWords in spelling_df taking 80% of runtime. Adding SCCs like this:

  splitWords = {-# SCC "filter" #-} filter (not . B.null) .
               {-# SCC "splitWith" #-} B.splitWith isNogud .
               {-# SCC "toLower" #-} B.map toLower

gives me:

  splitWords Main 216       1  0.0  0.0 78.6 91.8
  filter     Main 217       1  1.9  3.0 78.6 91.8
  splitWith  Main 218       1 28.4 36.8 76.7 88.8
  isNogud    Main 221 6488666  4.2  4.1  4.2  4.1
  toLower    Main 219       1 44.2 47.9 44.2 47.9

i.e. it seems that splitWith and toLower (!) are the culprits. Why, I have no idea. Am I reading this wrong?

No, you're just compiling it wrong :) If I profile without optimising, I get

  Sun Jan 24 11:37 2010

      Time and Allocation Profiling Report (Final)

      pspellingBS0 +RTS -P -RTS becuase

  total time  = 16.46 secs (823 ticks @ 20 ms)
  total alloc = 4,088,410,184 bytes (excludes profiling overheads)

  COST CENTRE MODULE %time %alloc ticks bytes
  toLower     Main    36.3   45.9   299 468806205
  splitWith   Main    30.6   33.9   252 346657565
  train       Main    25.8   13.5   212 138466403
  isNogud     Main     4.3    3.8    35  38931996
  filter      Main     2.1    2.7    17  27466769

which is compatible with your results. With -O2:
Re: [Haskell-cafe] Failing to install hxt
On Sunday 24 January 2010 12:24:37, you wrote:

Thank you, that worked. I wonder two things: 1) How to downgrade cabal packages? Just installing tagsoup-0.6 and removing every trace of tagsoup-0.8 somehow didn't work for me. cabal still insisted that there should be a tagsoup-0.8.

Hm, I'd think if there's no trace of tagsoup-0.8, cabal would use the installed versions to satisfy dependencies by preference. But maybe, because it sees version 0.8 available, it prefers to use the latest version available and allowed.

2) Is the commonly used version scheme ">= X.Y.Z" ideal?

No. Far from it. It's rotten, as you saw.

Maybe package maintainers should depend on exactly one version they know works? I mean, how can you guarantee that every future version of a package will work?

You can't. Well, if you only use map, filter and arithmetic, "build-depends: base" is a pretty safe bet, but otherwise, adhere to the recommended dependency scheme: build-depends: foo >= X.Y.Z && < U. Specifying one exact version is too restrictive; you'll end up with everybody having umpteen versions of almost all packages installed. Minor version bumps which leave the API unchanged shouldn't break anything, and small additions to the API should rarely break things, so if people adhere to http://www.haskell.org/haskellwiki/Package_versioning_policy , build-depends: foo >= X.Y.Z && < X.(Y+1) should be almost safe, and build-depends: foo >= X.Y.Z && < (X+1) should be fine for relatively stable packages like base.
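[Editor's note: for concreteness, a .cabal fragment (hypothetical package names and version numbers) spelling out the PVP-style bounds recommended above.]

  -- PVP-style bounds: allow patch-level updates, exclude the next
  -- minor version (foo) or the next major version (base).
  Build-Depends: foo  >= 0.6.1 && < 0.7,
                 base >= 3 && < 4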
[Haskell-cafe] Re: Space Efficiency When Sorting a List of Many Lists
Peter Green wrote:

Another interesting problem is starting from a file of single wagers and trying to compress them (i.e. the inverse of 'explosion'). I believe this problem is analogous to Set Cover and therefore NP-Complete. Heuristics are the order of the day here.

Interesting indeed! What kind of heuristics do you employ there? It seems quite difficult to me, somehow.

Essentially wager generation in multi-leg races works like this:

(1) We have estimated win probabilities for the entrants in each of the race legs. Assume that I have a black box which produces these estimated probabilities.

(2) We generate the CP of the entrant numbers in the race fields: Leg 1 Entrants x Leg 2 Entrants x ... For each combination generated by the CP, we apply some formula to the probabilities associated with the combination's entrant numbers to arrive at a decision about whether or not that particular combination represents a viable wager.

(3) If a combination does represent a viable wager, we then need to ensure that we have sized its wager amount in a correct and optimal manner. This gives output looking like:

  1, 1, 2, 3 -> $1.10
  1, 1, 2, 4 -> $1.20
  ..
  15, 15, 15, 12 -> $0.35

Unfortunately these are hard to compress well - and that is ignoring issues of what to do about differing wager sizes when we can match two combinations for merging together. So one possible hack is to insert step 1(a), where we k-means cluster the win probabilities in each leg, such that we end up with multiple entrant numbers per cluster mean (which is a probability). We then proceed through steps 2 and 3, but instead of generating single combinations, we are now directly producing wagers which look like:

  1+6+12/3+4+7/13+14 -> $1.20

In this case, k-means clustering will have clustered 1, 6 and 12 together with a cluster mean/probability of (say) 0.1, etc. This is a bit of a hack, as clearly it will result in the wagers not covering exactly the same single combinations as if we had generated pure single wagers without k-means clustering. Hence my interest in doing set comparisons!

Another brute force greedy approach is to start with single combinations and do something like this: [...] Unfortunately this approach never seems to result in more than a 10x compression factor. Can do much better with the k-means approach - at the cost of missing some theoretically desirable combinations and betting on some spurious combinations.

As always, would be very interested to hear if you have any ideas/insights on first seeing this kind of problem. I'm almost certainly too fixed in my thinking due to long association with the wagering problem domain and probably cannot see the wood for the trees!

I think that you're on the right track with trying to compress the output before actually generating it, i.e. compressing after step (1) instead of after step (3). My reasoning is that I don't see a good reason why, a priori, a *random* list of single wager combinations could be compressed well into *cartesian products*. I mean, if you have data without apparent structure, the best you can do is to pipe it through gzip, which has nothing to do with cartesian products at all. So, the fact that your compression works at all is a strong hint that your data does have a lot of structure, which you saw in the graphviz visualizations and which you are wisely exploiting at creation time with step (1a).

(The greedy algorithm you mentioned is doomed to failure because it doesn't respect symmetries. The optimal result is invariant under

* interchanging columns
* renaming all numbers in one column

but the greedy algorithm gives completely different results when you apply these operations to the input.)

To achieve better compression, you have to open the black boxes in (1) and (2). The good news is that you don't need to open them completely; instead you just want to know properties like, for example:

* Prob(entrant A wins in leg 1) = Prob(entrant A wins in leg 2)
* Prob(entrant A wins) is independent of other participants
* if {A,B} is a viable wager for leg 1, so are {A} and {B} (monotonicity)
* Prob(A, B, C) = Prob(A in leg 1) * Prob(B in leg 2) * Prob(C in leg 3) (independence)
* etc...

In other words, the idea is that the probability for a race outcome with multiple legs is, say, the product of the probabilities of the single legs, which then leads to these hypercube structures in your list of wagers. (I don't know which properties you need exactly because I don't understand the details about the probabilities, choosing wagers and pricing them.) Armed with such properties, there are systematic ways to exploit them for efficient algorithms. See for example the case study: De Moor and Gibbons, Bridging the algorithm gap: A linear-time functional program for paragraph formatting. http://tinyurl.com/ydt5n5z (This is a .ps.gz file!) Understanding this might also lead to a
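[Editor's note: to make the 'explosion' direction concrete, a minimal Haskell sketch (hypothetical representation; real wagers would carry prices too). A compressed wager holds one group of entrants per leg, and its expansion is exactly the cartesian product.]

  -- e.g. the wager 1+6+12/3+4+7/13+14 is [[1,6,12],[3,4,7],[13,14]]
  type CompressedWager = [[Int]]  -- one list of entrants per leg
  type SingleWager     = [Int]    -- one entrant per leg

  -- The cartesian product of the legs, via the list monad.
  explode :: CompressedWager -> [SingleWager]
  explode = sequence

  -- ghci> length (explode [[1,6,12],[3,4,7],[13,14]])
  -- 18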
[Haskell-cafe] ghc: unrecognised flags: -I
Hi Cafe! (That's a capital i, for anyone with font issues.)

I posted this question to beginners a while ago, but received no meaningful response, so I'm trying cafe instead :-)

Most packages I try to install off Hackage with cabal are giving me the error "ghc: unrecognised flags: -I". For example, when I cabal install -v storable-complex I get the following:

  /usr/bin/ghc -package-name storable-complex-0.2.1 --make -hide-all-packages -i -idist/build -i. -idist/build/autogen -Idist/build/autogen -Idist/build -I -optP-include -optPdist/build/autogen/cabal_macros.h -odir dist/build -hidir dist/build -stubdir dist/build -package base-3.0.3.1 -O Foreign.Storable.Complex
  ghc: unrecognised flags: -I

I've updated everything several times since this error started occurring, but it persists. I've tried digging to find out where the lone -I is being injected but can't seem to track it down. Has anyone encountered this before, or more realistically, can anyone give me some advice on how to narrow this problem down further? I'm running cabal version 1.6.0.1, ghc 6.10.4 on OS X 10.5.

Thanks guys!
Re: [Haskell-cafe] Installing encoding package from Hackage
I was able to build encoding-0.6.2. My steps:

(0. Update the package list with 'cabal update'.)

1. Unpack the package with 'cabal unpack encoding'.

2. Edit encoding.cabal. We need to add HaXml as a dependency. It should be added to the 'Build-Depends' field. I've added "HaXml == 1.20.2" below, because 1.20.2 is the latest version available at the time of writing.

  Library
    if flag(splitBase)
      if flag(newGHC)
        Build-Depends: bytestring, base >= 3 && < 5, binary, mtl,
                       containers, extensible-exceptions, array,
                       template-haskell, regex-compat, ghc-prim,
                       ghc >= 6.10, HaXml == 1.20.2
      else
        Build-Depends: bytestring, base >= 3 && < 5, binary, mtl,
                       containers, extensible-exceptions, array,
                       template-haskell, regex-compat, ghc < 6.10,
                       HaXml == 1.20.2
    else
      Build-Depends: base < 3, binary, extensible-exceptions,
                     template-haskell

3. Install with 'cabal install' run in the directory next to encoding.cabal.

It looks like the package maintainer forgot to specify this dependency.

Best regards
Krzysztof Skrzętnicki
Re: [Haskell-cafe] Re: PROPOSAL: Web application interface
On Sun, 24 Jan 2010 12:23:46 +0100, Bardur Arantsson s...@scientician.net wrote:

[--snip--]

Lack of proper support for sendfile() was indeed one of the reasons that I chose to roll my own HTTP server for hums. I should note that this was quite a while ago and I haven't really gone back to reevaluate that choice -- there are too many HTTP stacks to choose from right now and I don't have the time to properly evaluate them all.

Good reason indeed.

[--snip--]

(No, lazy I/O is NOT an option -- the HTTP clients in a typical UPnP media client behave so badly that you'll run out of file descriptors in no time. Trust me, I've tried.)

Is the experiment easily re-doable? I would like to try using safe-lazy-io instead.

-- Nicolas Pouillard http://nicolaspouillard.fr
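[Editor's note: for context, a minimal sketch (hypothetical serving loop, not hums code) of the failure mode Bardur describes. With lazy I/O, each readFile keeps its file handle open until the client has consumed the whole body, so stalled UPnP clients pin descriptors until the process hits its fd limit.]

  import qualified Data.ByteString.Lazy as L

  -- Hypothetical: serve each requested file by handing the lazy body
  -- to some 'send' action. The handle opened by L.readFile is only
  -- closed once the ByteString is fully forced, i.e. at the client's
  -- pace; a slow client holds the descriptor indefinitely.
  serveAll :: [FilePath] -> (L.ByteString -> IO ()) -> IO ()
  serveAll paths send = mapM_ (\p -> L.readFile p >>= send) paths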
[Haskell-cafe] Naive booleans and numbers - type-checking fails
Dear cafe,

I'm trying to prepare a naive definition of booleans, numbers and some helper functions in such a way that definitions in lambda calculus can be used in a straightforward way in Haskell. I ran into trouble quite soon while translating lambda expressions to Haskell - defining prefn as a helper for prev. When using the Haskell-ish if-then-else there is no problem (the code commented out), while using the defined if-then-else (mif), type-checking fails, but just for this case! Other simple tests are OK for mif. Do I need some extra option for the type-checker, or is it a principal failure (an infinite type is reported)? I'm running ghci 6.10.4.

  mtrue x y = x
  mfalse x y = y

  m0 f n = n
  m1 f n = f n
  m2 f n = f (f n)

  msucc x g m = x g (g m)

  iszero m = m (\_ -> mfalse) mtrue

  mif c t f = c t f

  mpair f s = \e -> e f s
  mfst p = p mtrue
  msnd p = p mfalse

  -- mprefn f p = if mex True False then mth else mel
  mprefn f p = mif mex mth mel
    where
      mex = mfst p
      mth = mpair mfalse (msnd p)
      mel = mpair mfalse (f (msnd p))

Please, a change of definitions is not a solution; I'm trying to follow available resources, so using something else is not an option. :-(

Thanks for any help

Dusan
[Haskell-cafe] Haskell Weekly News: Issue 147 - January 24, 2010
---
Haskell Weekly News
http://sequence.complete.org/hwn/20100124
Issue 147 - January 24, 2010
---

Welcome to issue 147 of HWN, a newsletter covering developments in the [1]Haskell community. This week on the HWN, we have a new blogger in town: Bartek, over at 'The Power of Types Compels You', wrote an excellent article which is highlighted in the blogs section. Many new packages this week as well, and some interesting discussions about performance and existential types. So, Haskellers, till next week, your Haskell Weekly News!

Announcements

ForSyDe DSL v3.1. Seyed Hosein Attarzadeh Niaki [2]announced a new version of the ForSyDe DSL. This version provides more freedom in the declaration of process functions, as well as compatibility with base 4.

parameterized-data library v0.1.4. Seyed Hosein Attarzadeh Niaki [3]announced a new version of the parameterized-data library. The parameterized-data library provides fixed-size vectors; this update provides minor compatibility fixes.

Updates and a new member of the Monadic Regions family. Bas van Dijk [4]announced several new updates and a new member of the Monadic Regions family.

bindings-DSL 1.0.4 (Category: FFI). Mauricio CA [5]announced a new version of his package bindings-DSL.

Haskell XML Toolbox Version 8.5.0. Uwe Schmidt [6]announced a new version of the Haskell XML Toolbox (HXT). The main change in this version is the separation of the XPath and XSLT modules.

Cardinality-0.1. Andrey Sisoyev [7]announced the release of his package Cardinality, a package for transforming between container types safely.

afv-0.0.3. Tom Hawkins [8]announced a new release of AFV, an infinite state model checker for simple, iterative C programs.

CfP: PPDP 2010. Temur Kutsia [9]announced a call for papers for the PPDP 2010 conference in July.

Mini-announce: A few package updates. Andrew Coppin [10]announced several new releases and updates to AC-EasyRaster-GTK, AC-Vector, and other packages.

amqp-0.1. Holger Reinhardt [11]announced the release of his AMQP library. It currently only works with RabbitMQ and supports most of the 0-8 spec. An introduction to AMQP can be found [12]here.

nntp 0.0.3. Maciej Piechotka [13]announced a bugfix release to nntp.

haskell-src-exts 1.7.0. Niklas Broberg [14]announced a new version of haskell-src-exts, featuring many new changes to the API.

Discussion

Existential Types (I guess). Ozgur Akgun [15]asked about a problem he encountered when trying to use existential types.

Is Haskell capable of matching C in string processing performance? John Millikin [16]asked a question about Haskell performance... Quick! Someone get Don Stewart!

Blog noise

[17]Haskell news from the [18]blogosphere. Blog posts from people new to the Haskell community are marked with ++, be sure to welcome them!

* ++Bartek Paczesiowa++: [19]Pure, extensible exceptions and self-returning functions. Bartek, over at 'The Power of Types Compels You', wrote a fantastic introduction to his forthcoming exceptions package. It's funny, and a great read about using type families (particularly, equality constraints) to create a pure exception handling mechanic. Definitely worth a visit.
* Neil Mitchell: [20]Optimising HLint.
* Kevin Reid (kpreid): [21]Darcs repositories back.
* Darcs: [22]darcs weekly news #52.
* Bryan O'Sullivan: [23]New GHC I/O manager, first sets of benchmark numbers.
* Galois, Inc: [24]POPL 2010, Day 1.
* Galois, Inc: [25]Tech Talk: A Scalable I/O Manager for GHC.
* Galois, Inc: [26]PADL/PEPM 2010, Day 2.
* Neil Brown: [27]The Process Composition Monad.
* Galois, Inc: [28]PADL/PEPM 2010, Day 1.
* Thomas M. DuBuisson: [29]GHC on ARM.
* David Sankel: [30]Semantic Editor Combinators - one of my favorite blog posts.
* David Sankel: [31]Best Haskell Papers of 2009.
* Gtk2HS: [32]Compiling with ghc 6.12.
* Arch Haskell News: [33]wxHaskell packaged for Arch.
* Darcs: [34]darcs weekly news #51.
* Don Stewart (dons): [35]Playing with the new Haskell epoll event library.
* Dan Piponi (sigfpe): [36]Target Enumeration with the Euler Characteristic. Parts 1 & 2.

Quotes of the Week

* tensorpudding: the Plot monad allow you to keep the story pure by containing all the glaring time travel silliness
* sproingie: quickcheck myLanguage.hs -- Web browser created after 285,731 tests
* RossPaterson: I'm afraid you voided the warranty when you used UndecidableInstances.
* aavogt: strong static typing is not a substitute for sleep
* tensorpudding: fixity goes up
Re: [Haskell-cafe] Re: PROPOSAL: Web application interface
On Sun, Jan 24, 2010 at 1:23 PM, Bardur Arantsson s...@scientician.net wrote:

[--snip--]

For this type of server, response *streaming* is also extremely important for those cases where you cannot use sendfile, so I'd hate to see a standard WAI interface preclude that. (No, lazy I/O is NOT an option -- the HTTP clients in a typical UPnP media client behave so badly that you'll run out of file descriptors in no time. Trust me, I've tried.)

Both sendfile and response streaming are among the top priorities in the WAI proposal. As far as "web" goes, I think the term is just a synonym for HTTP here. I'd be especially interested to hear input from people using Haskell for non-standard HTTP applications, because I want WAI to be as general as possible. Please let me know if you see anything that you would like added. The code is all available at http://github.com/snoyberg/wai

Michael
Re: [Haskell-cafe] Naive booleans and numbers - type-checking fails
Doesn't the simply typed lambda calculus introduce if-then-else as a primitive precisely so that it can be typed? It's not an illuminating answer to your question, and I'd welcome clarification for my own understanding, but I don't think you can solve the problem without appealing to Haskell's built-in if-then-else.

Best wishes

Stephen
Re: [Haskell-cafe] Naive booleans and numbers - type-checking fails
On Sun, Jan 24, 2010 at 3:12 PM, Stephen Tetley stephen.tet...@gmail.com wrote:

Doesn't the simply typed lambda calculus introduce if-then-else as a primitive precisely so that it can be typed? It's not an illuminating answer to your question, and I'd welcome clarification for my own understanding, but I don't think you can solve the problem without appealing to Haskell's built-in if-then-else.

Yes, encoding data types as pure typed lambda terms requires rank-2 types. I'd recommend that Dušan Kolář start giving types to all these functions. However, it will eventually be necessary to go beyond Haskell 98 to give the appropriate types.
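[Editor's note: a minimal sketch of the rank-2 route suggested here (one possible typing, not the only one). Wrapping the Church booleans in a newtype with a polymorphic field gives mif a type that works at every result type, which is exactly what the unwrapped encoding cannot express.]

  {-# LANGUAGE RankNTypes #-}

  newtype MBool = MBool { runMBool :: forall a. a -> a -> a }

  mtrue, mfalse :: MBool
  mtrue  = MBool (\x _ -> x)
  mfalse = MBool (\_ y -> y)

  mif :: MBool -> a -> a -> a
  mif = runMBool

  main :: IO ()
  main = putStrLn (mif mtrue "yes" "no")  -- prints "yes"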
Re: [Haskell-cafe] Re: PROPOSAL: Web application interface
Minor spec question: what should be the defined behavior when an application requests that a file be sent and it does not exist?
Re: [Haskell-cafe] Spelling checker exercise
On 24/01/2010, at 10:22 PM, Daniel Fischer wrote:

... I think I know what happened here:

  $ ghc -fforce-recomp --make matthew -o matthew0

... I habitually compile all code with -O2, unless I have a specific reason not to. I tend to forget that some compile without optimisations. For some kinds of code that makes hardly any difference, for others it makes a big difference.

I used the flags -funbox-strict-fields -fvia-C -optc-O2, but *not* -O2. Whoops! I could kick myself: I blindly assumed that -optc-O2 would turn on optimisation, but of course that's just the flag for GCC.

  $ time ./spelling becuase
  becuase -> because
  real 0m4.467s
  user 0m3.865s
  sys  0m0.280s

  $ time ./spelling_df becuase
  becuase -> because
  real 0m2.422s
  user 0m2.198s
  sys  0m0.081s

Your previous version is close to twice as fast, and now only 2.5 times slower than Python.

[snipped new version of code with toLower removed]

With your suggested changes, your latest version on my machine:

  $ time ./spelling_df becuase
  becuase -> because
  real 0m1.731s
  user 0m1.572s
  sys  0m0.056s

Now, we're getting close: 4.7s -> 2.3s -> 1.7s.

But once you start needing two edits (try korrekt), correct and edits1 start to show up. That shows that Norvig's algorithm isn't really good. With two edit steps, you create a _lot_ of strings you need to look up, far more than there are in the map. That takes time. It'll be better to scan the map for entries with an edit distance (< 3) if you have a good method to check that.

Indeed:

  $ time ./nLDBSWSpelling becuase
  becuase -> because
  2.84user 0.02system 0:02.86elapsed 100%CPU
  $ time ./nLDBSWSpelling korrekt
  korrekt -> correct
  2.83user 0.05system 0:02.88elapsed 100%CPU

Not sure if I see what you're saying here: do you mean to point out the 2.86 vs 2.88 elapsed?

Another thing is

  allWords = keysSet wordCounts

Ouch. For each correction, you construct that set anew. Just use member from Data.Map instead of Data.Set.member and look up the words in the map.

Whoops! Just to be clear though: Haskell will memoise the result of allWords for a given invocation of correct?

Yes. But not across different invocations.

So this would only make a large difference for multiple corrections?

Right. But that's the interesting use case, isn't it?

It will be when I get the rest of it working, yes :)

I will look at using Bloom filters or tries instead of Data.Map, but I wonder if readFile should be taking nearly 30% of the run time, even for a 6MB file?

No way. But it doesn't seem to, from my GHC's point of view.

Apologies, I should have said: just to be sure I wasn't using the SCC incorrectly, I split out the readFile call into myReadFile. The prof line comes out as:

  myReadFile Main 210 1 35.8 8.6 35.8 8.6

i.e. 35.8% of runtime.

Can I see the exact code which gives that profile? Somehow, things which shouldn't must be attributed to readFile.

The version at this link has myReadFile split out.

http://github.com/scramjet/spelling/blob/31071edb2166b2bc4d350358900d975228fd43b9/spelling.hs

Doing the same to your version has the same result:

  myReadFile Main 210 1 45.7 9.6 45.7 9.6

It does seem that the profiler is wrong or misleading somehow.

Two other quick questions:

(1) you added parentheses to candidates:

  candidates = known [word] `or` ((known e1) `or` known_edits2)

The or's should short circuit, so that if known [word] is non-empty, the others shouldn't be evaluated. I read the above as forcing evaluation of the second or first: am I wrong?

(2) you eliminated the fold in correct in favour of a tail-recursive search in maxCount: was this for style or performance reasons (or both :)?

Cheers,

Matthew.
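[Editor's note: for reference, a minimal sketch of the short-circuiting behaviour question (1) asks about, assuming an 'or' in the style Norvig translations typically use (a stand-in, not necessarily this thread's exact definition).]

  -- Hypothetical definition: pick the first non-empty list, otherwise
  -- fall through to the second.
  or' :: [a] -> [a] -> [a]
  or' a b = if null a then b else a

  -- Because or' is non-strict in its second argument, grouping as
  -- x `or'` (y `or'` z) still evaluates neither y nor z when x is
  -- non-empty; the parentheses change association, not evaluation.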
[Haskell-cafe] ANN: afv-0.1.0
AFV is an infinite state model checker to verify assertions in embedded C programs.

New in this release:

- Starts analysis from the 'main' entry point. Automatically identifies the main loop (for (;;), while (1)).
- Better counterexample generation.
- Enforces stateless expressions.

Unfortunately, the list of unsupported C is still long. No...

- arrays
- pointers
- structs
- unions
- typedefs
- type casts
- static top-level declarations
- non-void functions
- switch statements
- arbitrary loop statements

http://hackage.haskell.org/package/afv
Re: [Haskell-cafe] Spelling checker exercise
On Monday 25 January 2010 05:34:50, Matthew Phillips wrote:

On 24/01/2010, at 10:22 PM, Daniel Fischer wrote:

... I think I know what happened here:

  $ ghc -fforce-recomp --make matthew -o matthew0

... I habitually compile all code with -O2, unless I have a specific reason not to. I tend to forget that some compile without optimisations. For some kinds of code that makes hardly any difference, for others it makes a big difference.

I used the flags -funbox-strict-fields -fvia-C -optc-O2, but *not* -O2. Whoops! I could kick myself: I blindly assumed that -optc-O2 would turn on optimisation, but of course that's just the flag for GCC.

By the way, compiling via C nowadays hardly ever produces faster code than the native code generator (most times I tried since 6.10, if there was a difference, native was faster).

[--snip--]

  $ time ./nLDBSWSpelling becuase
  becuase -> because
  2.84user 0.02system 0:02.86elapsed 100%CPU
  $ time ./nLDBSWSpelling korrekt
  korrekt -> correct
  2.83user 0.05system 0:02.88elapsed 100%CPU

Not sure if I see what you're saying here: do you mean to point out the 2.86 vs 2.88 elapsed?

Well, above the code, I had the times for Norvig's algorithm (creating all two-step edits and checking which are known). And, without profiling:

  $ time ./spellingBSW becuase
  becuase -> because
  2.84user 0.03system 0:02.88elapsed 99%CPU

Finally, building the set of two-step edits takes longer than the additional lookups:

  $ time ./spellingBSW becuase
  becuase -> because
  2.84user 0.03system 0:02.88elapsed 99%CPU
  $ time ./spellingBSW korrekt
  korrekt -> correct
  3.50user 0.02system 0:03.52elapsed 100%CPU

vs.

  $ time ./spellingBSWL becuase
  becuase -> because
  2.79user 0.04system 0:02.83elapsed 100%CPU
  $ time ./spellingBSWL3 korrekt
  korrekt -> correct
  3.20user 0.02system 0:03.23elapsed 99%CPU

For becuase, which is a one-step edit, all take the same time (within normal fluctuation), 2.8x seconds. For the two-step edit korrekt, the version building Sets takes 3.5 seconds, the version which doesn't build a Set of two-step edits takes 3.2 seconds, *and the version scanning the Map of known words for entries with a Levenshtein distance [modified to account for transpositions] less than 3 takes 2.8y seconds, practically the same time as for the one-step edit*.
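[Editor's note: a minimal sketch (not Daniel's actual nLDBSWSpelling code) of the map-scanning strategy described above. Instead of generating every two-step edit of the query, keep only the known words within a bounded edit distance; a production version would, as noted, also count transpositions and use laziness to abort each row early.]

  import qualified Data.Map as M

  -- Plain dynamic-programming Levenshtein distance (no transpositions).
  editDistance :: String -> String -> Int
  editDistance s t = last (foldl transform [0 .. length s] t)
    where
      transform row@(n:ns) c = scanl calc (n + 1) (zip3 s row ns)
        where
          calc left (c', diag, above) =
            minimum [above + 1, left + 1, diag + fromEnum (c /= c')]
      transform [] _ = []

  -- Keep the known words within distance k of the (misspelled) query.
  closeWords :: Int -> String -> M.Map String Int -> M.Map String Int
  closeWords k w = M.filterWithKey (\cand _ -> editDistance w cand <= k)

  -- ghci> editDistance "korrekt" "correct"
  -- 2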
It becomes more obvious, perhaps, if we test a couple of two-steppers:

Lazy Levenshtein:

  $ time ./nLDBSWSpelling becrase korrekt halmos territoir gutzenperg
  becrase -> because
  korrekt -> correct
  halmos -> holmes
  territoir -> territory
  gutzenperg -> gutenberg
  2.94user 0.03system 0:02.97elapsed 100%CPU

That's just something like 0.1 - 0.15 seconds more than for becuase, which makes 0.02 - 0.04 seconds per two-edit correction.

Sets:

  $ time ./spellingBSW becrase korrekt halmos territoir gutzenperg
  becrase -> because
  korrekt -> correct
  halmos -> holmes
  territoir -> territory
  gutzenperg -> gutenberg
  7.48user 0.03system 0:07.55elapsed 99%CPU

Gevalt! That takes almost a second per two-edit correction.

List:

  $ time ./spellingBSWL3 becrase korrekt halmos territoir gutzenperg
  becrase -> because
  korrekt -> correct
  halmos -> holmes
  territoir -> territory
  gutzenperg -> gutenberg
  5.54user 0.04system 0:05.58elapsed 99%CPU

Better, but still bad, roughly half a second per correction.

Python without psyco:

  $ time python ./norvig.py becrase korrekt halmos territoir gutzenperg
  Trained in 1.46993017197 seconds
  because correct holmes territory gutenberg
  3.00user 0.08system 0:03.09elapsed 99%CPU

  $ time python ./norvig.py becuase
  Trained in 1.49027395248 seconds
  because
  1.46user 0.08system 0:01.55elapsed 100%CPU

about 0.3 seconds per correction, and with psyco:

  $ time python ./norvig.py becrase korrekt halmos territoir gutzenperg
  Trained in 1.22417902946 seconds
  because correct holmes territory gutenberg
  2.12user 0.09system 0:02.21elapsed 100%CPU

  $ time python ./norvig.py becuase
  Trained in 1.17486715317 seconds
  because