Re: [Haskell-cafe] SYB looping very, very mysteriously

2010-01-24 Thread Andrea Vezzosi
On Mon, Dec 7, 2009 at 3:38 PM, David Fox dds...@gmail.com wrote:
 On Sat, Dec 5, 2009 at 2:38 AM, Andrea Vezzosi sanzhi...@gmail.com wrote:
 On Fri, Dec 4, 2009 at 8:51 PM, Jeremy Shaw jer...@n-heptane.com wrote:
 I have stripped things down to the bare minimum, and tested under GHC 6.10,
 GHC 6.12, Linux, and Mac OS X. Results are consistent.

 In the following code,

  1. if you load the code into ghci and evaluate e, it will hang, but
 (defaultValueD dict) :: Expression returns fine
  2. if you change the gunfold instance for Proposition to error "gunfold",
 it stops hanging -- even though this code is never called.
  3. if you change (Data ctx [Expression], Sat (ctx Expression)) => Data ctx
 Expression to (Data ctx Expression, ...) => ... it stops hanging.

 If someone could explain why each of these cases behaves as it does, that
 would be awesome! Right now it is a big mystery to me... e calls dict... and
 there is only one instance of dict available, which should call error right
 away. I can't see how something could get in the way there...


 It's less of a mystery if you think about the actual dictionaries ghc
 uses to implement typeclasses.
 The instance for Data ctx [a] depends on Data ctx a, so by requiring
 Data ctx [Expression] in the Data ctx Expression instance you're
 indeed making a loop there, though typeclasses do allow this, and the
 implementation has to be lazy enough to permit it.
 Strange that with a direct Data ctx Expression => Data ctx Expression
 loop we don't get the same problem.
 The reason the implementation of Proposition's gunfold matters is
 probably that k gets passed the dictionary for Data DefaultD
 Expression at the site of its call and some optimization is making it
 stricter than necessary.

 Looks like we need a ghc dev here to fully unravel the mystery; in the
 meantime I'll try to reduce the test case even further.

 I have posted a ghc bug for this:
 http://hackage.haskell.org/trac/ghc/ticket/3731
 and a syb-with-class bug, in case it is not a ghc bug (perhaps due to
 undecidable instances?):
 http://code.google.com/p/syb-with-class/issues/detail?id=3


While trying to make the test case on the ghc trac self-contained I've
found a workaround which involves changing the Default [a] instance in
happstack-data. I've attached the patch.

I don't know with certainty why it works, so it'd be nice if you
could test it on a large codebase at least.

The problem is that one should work out how all these dictionaries get
constructed, and that means staring at a lot of Core.
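For readers who want to see the shape of the loop without reading the full syb-with-class source, here is a minimal sketch (class bodies elided, names following syb-with-class; this is an illustration, not the actual code from the ticket):

{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances,
             FlexibleContexts, UndecidableInstances #-}

class Sat a where
  dict :: a

-- every Data dictionary carries a Sat dictionary as a superclass
class Sat (ctx a) => Data ctx a

-- the list instance is built from the element instance ...
instance (Sat (ctx [a]), Data ctx a) => Data ctx [a]

data Expression = Expression

-- ... so this context ties the knot: the Expression dictionary is
-- routed through the [Expression] dictionary and back again
instance (Data ctx [Expression], Sat (ctx Expression))
      => Data ctx Expression

GHC does accept such recursive dictionaries, but as described above, they only work if every consumer is lazy enough in them.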


workaround-for-default-wrt-http___hackage_haskell_org_trac_ghc_ticket_3731.dpatch
Description: Binary data


Re: [Haskell-cafe] Failing to install hxt

2010-01-24 Thread hask...@kudling.de

 I think the best bet is to:
 
     cabal install tagsoup-0.6 --reinstall


This somehow didn't work, since hxt-3.8.2 insisted on using tagsoup-0.8 even
when i removed every tagsoup-0.8 directory.

 And perhaps send the HXT maintainer a cabal patch?



Ok, did this.


[Haskell-cafe] Re: PROPOSAL: Web application interface

2010-01-24 Thread Bardur Arantsson

Michael Snoyman wrote:

[--snip--]

Next, I have made the ResponseBodyClass typeclass specifically with the goal
of allowing optimizations for lazy bytestrings and sending files. The former
seems far-fetched; the latter provides the ability to use a sendfile system
call instead of copying the file data into memory. However, in the presence
of gzip encoding, how useful is this optimization?

[--snip--]

I'm hoping that the Web bit in your project title doesn't literally 
mean that WAI is meant to be restricted to solely serving content to 
browsers. With that caveat in mind:


For non-WWW HTTP servers it can be extremely useful to have sendfile. An 
example is my Haskell UPnP Media Server (hums) application. It's sending 
huge files (AVIs, MP4s, etc.) over the network and since these files are 
already compressed as much as they're ever going to be, gzip would be 
useless. The CPU load of my hums server went from 2-5% to 0% when 
streaming files just from switching from a Haskell I/O based solution to 
proper sendfile.
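
(For the curious: with the sendfile package on Hackage, the body of such a response can be pushed out roughly as below. This is a sketch; the module and signature are quoted from memory, so treat them as an assumption.)

import Network.Socket (Socket)
-- assumed API of the sendfile package:
--   sendFile :: Socket -> FilePath -> IO ()
import Network.Socket.SendFile (sendFile)

-- send a media file to a connected client; the kernel moves the
-- bytes, so nothing is copied through the Haskell heap
serveBody :: Socket -> FilePath -> IO ()
serveBody conn path = sendFile conn path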


Lack of proper support for sendfile() was indeed one of the reasons that 
I chose to roll my own HTTP server for hums. I should note that this was 
quite a while ago and I haven't really gone back to reevaluate that 
choice -- there's too many HTTP stacks to choose from right now and I 
don't have the time to properly evaluate them all.


For this type of server, response *streaming* is also extremely 
important for those cases where you cannot use sendfile, so I'd hate to 
see a standard WAI interface preclude that. (No, lazy I/O is NOT an 
option -- the HTTP clients in a typical UPnP media client behave so 
badly that you'll run out of file descriptors in no time. Trust me, I've 
tried.)


Cheers,



Re: [Haskell-cafe] Failing to install hxt

2010-01-24 Thread hask...@kudling.de
Thank you, that worked. 


I wonder two things:


1) How to downgrade cabal packages? Just installing tagsoup-0.6 and removing
every trace of tagsoup-0.8 somehow didn't work for me. cabal still insisted that
there should be a tagsoup-0.8.


2) Is the commonly used version scheme ">= X.Y.Z" ideal? Maybe package
maintainers should depend on exactly one version they know works? I mean,
how can you guarantee that every future version of a package will work?





Daniel Fischer daniel.is.fisc...@web.de wrote on 24 January 2010 at 00:59:

 cabal unpack hxt-8.3.2

 open hxt.cabal in kate, change the range of the required tagsoup to
 (>= 0.6 && < 0.7)

 cd hxt-8.3.2
 cabal install

 perhaps not the _best_ solution, but quick and it usually works
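
The edited dependency line in hxt.cabal would then read something like this (other dependencies elided):

build-depends: ..., tagsoup >= 0.6 && < 0.7, ...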


Re: [Haskell-cafe] Re: could we get a Data instance for Data.Text.Text?

2010-01-24 Thread Neil Mitchell
Hi,

The problem with Data for Text isn't that we have to write a new
instance, but that you could argue that proper handling of Text with
Data would not be using a type class, but have special knowledge baked
in to Data. That's far worse than the Serialise problem mentioned
above, and no one other than the Data authors could solve it. Of
course, I don't believe that, but it is a possible interpretation.

The Serialise problem is a serious one. I can't think of any good
solutions, but I recommend you give knowledge of your serialise class
to Derive (http://community.haskell.org/~ndm/derive/) and then at
least the instances can be auto-generated. Writing lots of boilerplate
and regularly ripping it up is annoying, setting up something to
generate it for you reduces the pain.

 The only safe rule is: if you don't control the class, C, or you don't
 control the type constructor, T, don't make instance C T.

 I agree in principle, but in the real world you can't live by this rule.
 Example, I want to use Uniplate to traverse the tree built by 
 haskell-src-exts,
 Using Data.Data is too slow, so I need to make my own instances.
 HSE provides like 50 types that need instances, and it has to be
 exactly those types.
 Also, Uniplate requires instances of a particular class it has.

Read my recent blog post
(http://neilmitchell.blogspot.com/2010/01/optimising-hlint.html), I
optimised Uniplate for working with HSE on top of the Data instances -
it's now significantly faster in some cases, which may mean you don't
need to resort to the Direct stuff. Of course, if you do, then
generating them with Derive is the way to go.
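
For anyone who wants to try the Data-based route before reaching for the Direct combinators, here is a minimal sketch (a toy type stands in for HSE's ~50 types):

{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, Typeable)
import Data.Generics.Uniplate.Data (transformBi)

-- a stand-in expression type; deriving Data is all that the
-- combinators in Data.Generics.Uniplate.Data need
data Expr = Lit Int | Neg Expr | Add Expr Expr
  deriving (Show, Data, Typeable)

-- rewrite  Neg (Neg e)  to  e  everywhere in a tree
simplify :: Expr -> Expr
simplify = transformBi f
  where f (Neg (Neg e)) = e
        f e             = e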

Thanks, Neil


Re: [Haskell-cafe] Spelling checker exercise

2010-01-24 Thread Daniel Fischer
On Sunday 24 January 2010 06:19:58, Matthew Phillips wrote:
 Thanks very much Daniel for giving my (amateurish!) exercise such an
 in-depth look-over. Comments inline below.

 On 23/01/2010, at 12:11 AM, Daniel Fischer wrote:
 
  That does help, but the worst part is building the map. That takes a
  couple of seconds in Python, too. Just building the map takes 1.95s
  for Python, 3.6s (including GC) with strict ByteStrings, 4.2s with
  lazy ByteStrings and 6s with plain Strings here.

Correction: I mistook the total time of a no-argument run for the time it 
took Python to build the map; adding timing around the map-building shows 
it takes ~1.45 seconds. I'm clueless as to what Python needs the remaining 
half second for.
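
(For reference, the Haskell side of the map-building being timed here looks roughly like the following sketch; the corpus file name is assumed, and it uses strict ByteStrings and the strict insertWith' discussed later in this thread:)

import Data.Char (isAlpha, toLower)
import Data.List (foldl')
import qualified Data.ByteString.Char8 as B
import qualified Data.Map as M

-- count word frequencies in the training corpus
buildMap :: IO (M.Map B.ByteString Int)
buildMap = do
  txt <- B.readFile "big.txt"           -- assumed corpus file
  let ws = filter (not . B.null)        -- drop empty fragments
         . B.splitWith (not . isAlpha)  -- split on non-letters
         . B.map toLower                -- case-fold first
         $ txt
  return $! foldl' (\m w -> M.insertWith' (+) w 1 m) M.empty ws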


 I get the Python version running in about 1s, compared to 5.6s for the
 Haskell:

$ python --version
Python 2.6

With psyco.full():
$ time python ./norvig.py becuase
Trained in  1.16967606544 seconds
./norvig.py
because
1.52user 0.08system 0:01.60elapsed 100%CPU 

without psyco:
$ time python ./norvig.py becuase
Trained in  1.45706319809 seconds
./norvig.py
because
1.95user 0.08system 0:02.03elapsed 100%CPU
$ time python ./norvig.py
Trained in  1.46250891685 seconds
./norvig.py
1.95user 0.09system 0:02.04elapsed 100%CPU 


 $ time python spelling.py
 because

 real  0m1.071s
 user  0m0.821s
 sys   0m0.139s

 $ time ./spelling becuase
 becuase -> because

 real  0m5.589s
 user  0m4.554s
 sys   0m0.307s

 And, strangely, the rewrite you provided (I called it spelling_df)
 runs a fair bit slower:

 $ time ./spelling_df becuase
 becuase -> because

 real  0m8.087s
 user  0m6.966s
 sys   0m0.193s

 $ time ./spelling korrekt
 korrekt -> correct

 real  0m5.970s
 user  0m4.885s
 sys   0m0.300s

 $ time ./spelling_df korrekt
 korrekt -> correct

 real  0m8.616s
 user  0m7.538s
 sys   0m0.187s


I think I know what happened here:

$ ghc -fforce-recomp --make matthew -o matthew0
[1 of 1] Compiling Main ( matthew.hs, matthew.o )
Linking matthew0 ...
$ ghc -O2 -fforce-recomp --make matthew -o matthew2
[1 of 1] Compiling Main ( matthew.hs, matthew.o )
Linking matthew2 ...
$ time ./matthew0 becuase
becuase -> because
7.07user 0.21system 0:07.28elapsed 99%CPU
$ time ./matthew2 becuase
becuase -> because
6.01user 0.19system 0:06.21elapsed 100%CPU

$ ghc -fforce-recomp --make spellingBS -o spelling0
[1 of 1] Compiling Main ( spellingBS.hs, spellingBS.o )
Linking spelling0 ...
$ ghc -O2 -fforce-recomp --make spellingBS -o spelling2
[1 of 1] Compiling Main ( spellingBS.hs, spellingBS.o )
Linking spelling2 ...
$ time ./spelling0 becuase
becuase -> because
9.78user 0.09system 0:09.87elapsed 100%CPU
$ time ./spelling2 becuase
becuase -> because
3.57user 0.03system 0:03.60elapsed 100%CPU

I habitually compile all code with -O2, unless I have a specific reason not 
to. I tend to forget that some compile without optimisations.
For some kinds of code that makes hardly any difference, for others it 
makes a big difference.

*** Don't even think of using ByteStrings without optimising. ***

 
  readFile does not appear in my profile.

 Apologies, I should have said that I'd inserted some SCCs to try to
 tease out the cost of readFile (i.e. {-# SCC "readFile" #-}).

  If you insert an SCC for updateMap,
 
  where updateMap model word = {-# SCC "updateMap" #-} insertWith' (+)
  word 1 model
 
  , you'll see that the really bad citizen is updateMap (splitWords is
  rather bad, too, together they take some 95% of the time in that
  profile).

 Maybe I'm doing this wrong, but I see splitWords in spelling_df taking
 80% of runtime. Adding SCC's like this:

   splitWords = {-# SCC "filter" #-} filter (not . B.null) . {-# SCC
 "splitWith" #-} B.splitWith isNogud . {-# SCC "toLower" #-} B.map
 toLower

 gives me:

   splitWords  Main 216       1    0.0  0.0  78.6 91.8
    filter     Main 217       1    1.9  3.0  78.6 91.8
     splitWith Main 218       1   28.4 36.8  76.7 88.8
      isNogud  Main 221 6488666    4.2  4.1   4.2  4.1
     toLower   Main 219       1   44.2 47.9  44.2 47.9

 i.e. it seems that splitWith and toLower (!) are the culprits. Why,
 I have no idea.

 Am I reading this wrong?


No, you're just compiling it wrong :)
If I profile without optimising, I get
Sun Jan 24 11:37 2010 Time and Allocation Profiling Report  (Final)

   pspellingBS0 +RTS -P -RTS becuase

total time  =   16.46 secs   (823 ticks @ 20 ms)
total alloc = 4,088,410,184 bytes  (excludes profiling overheads)

COST CENTRE   MODULE  %time %alloc  ticks bytes

toLower   Main  36.3   45.9    299  468806205
splitWith Main  30.6   33.9    252  346657565
train     Main  25.8   13.5    212  138466403
isNogud   Main   4.3    3.8     35   38931996
filter    Main   2.1    2.7     17   27466769

which is compatible with your results.
With -O2:
 

Re: [Haskell-cafe] Failing to install hxt

2010-01-24 Thread Daniel Fischer
On Sunday 24 January 2010 12:24:37, you wrote:
 Thank you, that worked.


 I wonder two things:


 1) How to downgrade cabal packages? Just installing tagsoup-0.6 and
 removing every trace of tagsoup-0.8 somehow didn't work for me. cabal
 still insisted that there should be a tagsoup-0.8.

Hm, I'd think if there's no trace of tagsoup-0.8, cabal would use the 
installed versions to satisfy dependencies by preference. But maybe, 
because it sees version 0.8 available, it prefers to use the latest version 
available and allowed.



 2) Is the commonly used version scheme ">= X.Y.Z" ideal?

No. Far from it. It's rotten, as you saw.

 Maybe package maintainers should depend on exactly one version
 they know that works? I mean how can you guarantee, that every
 future version of a package will work?


You can't.
Well, if you only use map, filter and arithmetics,
build-depends: base
is a pretty safe bet, but otherwise, adhere to the recommended dependency-
scheme,

build-depends: foo >= X.Y.Z && < U

Specifying one exact version is too restrictive, you'll end up with 
everybody having umpteen versions of almost all packages installed.
Minor version bumps which leave the API unchanged shouldn't break anything, 
small additions to the API should rarely break things, so if people adhere 
to http://www.haskell.org/haskellwiki/Package_versioning_policy ,

build-depends: foo >= X.Y.Z && < X.(Y+1)

should be almost safe,

build-depends: foo >= X.Y.Z && < (X+1)

should be fine for relatively stable packages like base.
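
For concreteness, a dependency field following that scheme might read like this (package names and bounds purely illustrative):

build-depends: base >= 3.0.3 && < 4,
               containers >= 0.2 && < 0.3,
               bytestring >= 0.9.1 && < 0.10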



[Haskell-cafe] Re: Space Efficiency When Sorting a List of Many Lists

2010-01-24 Thread Heinrich Apfelmus
Peter Green wrote:

 Another interesting problem is starting from a file of single wagers
 and
 trying to compress them (i.e. the inverse of 'explosion'). I believe this
 problem is analogous to Set Cover and therefore NP-Complete. Heuristics
 are the order of the day here.

 Interesting indeed! What kind of heuristics do you employ there? It
 seems quite difficult to me, somehow.

 Essentially wager generation in multi-leg races works like this:
 
 (1) We have estimated win probabilities for the entrants in each of the
 race legs. Assume that I have a black box which produces these estimated
 probabilities.
 
 (2) We generate the CP (cartesian product) of the entrant numbers in the
 race fields: Leg 1 Entrants x Leg 2 Entrants x ..
 For each combination generated by the CP, we apply some formula to the
 probabilities associated with combination entrant numbers to arrive at a
 decision about whether or not that particular combination represents a
 viable wager.
 
 (3) If a combination does represent a viable wager, we then need to
 ensure that we have sized its wager amount in a correct and optimal manner.
 
 This gives output looking like:
 
 1, 1, 2, 3 -> $1.10
 1, 1, 2, 4 -> $1.20
 ..
 ..
 15, 15, 15, 12 -> $0.35
 
 Unfortunately these are hard to compress well - and that is ignoring
 issues of what to do about differing wager sizes when we can match two
 combinations for merging together.
 
 So one possible hack is to insert step 1(a) where we k-means cluster the
 win probabilities in each leg, such that we end up with multiple entrant
 numbers per cluster mean (which is a probability). We then proceed
 through steps 2 and 3, but instead of generating single combinations, we
 are now directly producing wagers which look like:
 
 1+6+12/3+4+7/13+14 -> $1.20
 
 In this case, k-means clustering will have clustered 1,6,12 together
 with cluster mean/probability of (say) 0.1, etc.
 
 This a bit of a hack, as clearly it will result in the wagers not
 covering exactly the same single combinations as if we had generated
 pure single wagers without k-means clustering. Hence my interest in
 doing set comparisons!
 
 Another brute force greedy approach is to start with single combinations
 and do something like this:
 [...]
 
 Unfortunately this approach never seems to result in more than a 10x
 compression factor. Can do much better with the k-means approach - at
 the cost of missing some theoretically desirable combinations and
 betting on some spurious combinations.
 
 As always, would be very interested to hear if you have any
 ideas/insights on first seeing this kind of problem. I'm almost
 certainly too fixed in my thinking due to long association with the
 wagering problem domain and probably cannot see the wood for the trees!

I think that you're on the right track with trying to compress the
output before actually generating it, i.e. compressing after step (1)
instead of after step (3).


My reasoning is that I don't see a good reason why, a priori, a *random*
list of single wager combinations could be compressed well into
*cartesian products*. I mean, if you have data without apparent
structure, the best you can do is to pipe it through  gzip , which has
nothing to do with cartesian products at all.

So, the fact that your compression works at all is a strong hint that
your data does have a lot of structure, which you saw in the graphviz
visualizations and which you are wisely exploiting at creation time with
step (1a).

(The greedy algorithm you mentioned is doomed to failure because it
doesn't respect symmetries. The optimal result is invariant under

* interchanging columns
* renaming all numbers in one column

but the greedy algorithm gives completely different results when you
apply these operations to the input.)


To achieve better compression, you have to open the black boxes in (1)
and (2). The good news is that you don't need to open them completely;
instead you just want to know properties like for example

* Prob( entrant A wins in leg 1) = Prob( entrant A wins in leg 2)
* Prob( entrant A wins) is independent of other participants
* if {A,B} is a viable wager for leg 1 , so are {A} and {B}
  (monotonicity)
* Prob( A, B, C ) = Prob(A in leg 1) * Prob(B in leg 2)
  * Prob(C in leg 3)
  (independence)
* etc...

In other words, the idea is that the probability for race outcome with
multiple legs is, say, the product of the probabilities of the single
legs, which then leads to these hypercube structures in your list of wagers.
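
As a toy illustration of that point (invented names, and the per-leg cut-offs are an assumed simplification): if viability decomposes into a cut-off per leg, the viable combinations are literally a cartesian product, i.e. one hypercube line like 1+6+12/3+4+7/13+14.

type Prob = Double

hypercube :: [[(Int, Prob)]]  -- entrants with win probabilities, per leg
          -> [Prob]           -- assumed per-leg viability cut-offs
          -> [[Int]]          -- every viable combination
hypercube legs cuts =
  sequence [ [ e | (e, p) <- leg, p >= c ]
           | (leg, c) <- zip legs cuts ]
  -- sequence over the list monad is exactly the cartesian product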

(I don't know which properties you need exactly because I don't
understand the details about the probabilities, choosing wagers and
pricing them.)

Armed with such properties, there are systematic ways to exploit them
for efficient algorithms. See for example the case study

  De Moor and Gibbons.
  Bridging the algorithm gap: A linear-time functional program for
  paragraph formatting.
  http://tinyurl.com/ydt5n5z  (This is a  .ps.gz  file!)


Understanding this might also lead to a 

[Haskell-cafe] ghc: unrecognised flags: -I

2010-01-24 Thread Lyndon Maydwell
Hi Cafe!

That's a capital i for anyone with font issues.

I posted this question to beginners a while ago, but received no
meaningful response so I'm trying cafe instead :-)

Most packages I try to install off Hackage with cabal are giving me
the error "ghc: unrecognised flags: -I".

For example, when I cabal install -v storable-complex I get the following:

/usr/bin/ghc -package-name storable-complex-0.2.1 --make
-hide-all-packages -i -idist/build -i. -idist/build/autogen
-Idist/build/autogen -Idist/build -I -optP-include
-optPdist/build/autogen/cabal_macros.h -odir dist/build -hidir
dist/build -stubdir dist/build -package base-3.0.3.1 -O
Foreign.Storable.Complex
ghc: unrecognised flags: -I

I've updated everything several times since this error started
occurring, but it persists. I've tried digging to find out where the
lone -I is being injected but can't seem to track it down.

Has anyone encountered this before, or more realistically, can anyone
give me some advice on how to narrow this problem down further?

I'm running cabal version 1.6.0.1, ghc 6.10.4 on OS X 10.5.

Thanks guys!


Re: [Haskell-cafe] Installing encoding package from Hackage

2010-01-24 Thread Krzysztof Skrzętnicki
I was able to build encoding-0.6.2. My steps:

(0. Update package list with 'cabal update')
1. Unpack package with 'cabal unpack encoding'
2. Edit encoding.cabal. We need to add HaXml as a dependency. It should
be added to the 'Build-Depends' field. I've added HaXml == 1.20.2 below,
because 1.20.2 is the latest version available at the time of
writing.

Library
  if flag(splitBase)
    if flag(newGHC)
      Build-Depends: bytestring, base >= 3 && < 5, binary, mtl,
                     containers, extensible-exceptions, array, template-haskell,
                     regex-compat, ghc-prim, ghc >= 6.10, HaXml == 1.20.2
    else
      Build-Depends: bytestring, base >= 3 && < 5, binary, mtl,
                     containers, extensible-exceptions, array, template-haskell,
                     regex-compat, ghc < 6.10, HaXml == 1.20.2
  else
    Build-Depends: base < 3, binary, extensible-exceptions, template-haskell

3. Install with 'cabal install' run in the directory containing encoding.cabal.


It looks like package maintainer forgot to specify this dependency.

Best regards

Krzysztof Skrzętnicki


Re: [Haskell-cafe] Re: PROPOSAL: Web application interface

2010-01-24 Thread Nicolas Pouillard
On Sun, 24 Jan 2010 12:23:46 +0100, Bardur Arantsson s...@scientician.net 
wrote:
 Michael Snoyman wrote:
 
 [--snip--]
  Next, I have made the ResponseBodyClass typeclass specifically with the goal
  of allowing optimizations for lazy bytestrings and sending files. The former
  seems far-fetched; the latter provides the ability to use a sendfile system
  call instead of copying the file data into memory. However, in the presence
  of gzip encoding, how useful is this optimization?
 [--snip--]
 
 I'm hoping that the Web bit in your project title doesn't literally 
 mean that WAI is meant to be restricted to solely serving content to 
 browsers. With that caveat in mind:
 
 For non-WWW HTTP servers it can be extremely useful to have sendfile. An 
 example is my Haskell UPnP Media Server (hums) application. It's sending 
 huge files (AVIs, MP4s, etc.) over the network and since these files are 
 already compressed as much as they're ever going to be, gzip would be 
 useless. The CPU load of my hums server went from 2-5% to 0% when 
 streaming files just from switching from a Haskell I/O based solution to 
 proper sendfile.
 
 Lack of proper support for sendfile() was indeed one of the reasons that 
 I chose to roll my own HTTP server for hums. I should note that this was 
 quite a while ago and I haven't really gone back to reevaluate that 
 choice -- there's too many HTTP stacks to choose from right now and I 
 don't have the time to properly evaluate them all.

Good reason indeed.

 For this type of server, response *streaming* is also extremely 
 important for those cases where you cannot use sendfile, so I'd hate to 
 see a standard WAI interface preclude that. (No, lazy I/O is NOT an 
 option -- the HTTP clients in a typical UPnP media client behave so 
 badly that you'll run out of file descriptors in no time. Trust me, I've 
 tried.)

Is the experiment easily re-doable? I would like to try using safe-lazy-io
instead.

-- 
Nicolas Pouillard
http://nicolaspouillard.fr


[Haskell-cafe] Naive booleans and numbers - type-checking fails

2010-01-24 Thread Dušan Kolář

Dear cafe,

  I'm trying to prepare a naive definition of booleans, numbers and 
some helper functions in such a way that the lambda-calculus definitions 
can be used in a straightforward way in Haskell. I ran into trouble 
quite soon while translating the lambda expressions to Haskell, when 
defining prefn as a helper for prev. With the Haskell-ish if-then-else 
there is no problem (the code commented out), but with the defined 
if-then-else (mif) type-checking fails, and just for this case! 
Other simple tests of mif are OK. Do I need some extra option for the 
type-checker, or is it a fundamental failure (an infinite type is 
reported)? I'm running ghci 6.10.4.


 mtrue  x y = x
 mfalse x y = y

 m0 f n = n
 m1 f n = f n
 m2 f n = f (f n)

 msucc x g m = x g (g m)
 iszero m = m (\_ -> mfalse) mtrue

 mif c t f = c t f
 mpair f s = \e -> e f s
 mfst p = p mtrue
 msnd p = p mfalse

 -- mprefn f p = if mex True False then mth else mel
 mprefn f p = mif mex mth mel
   where
 mex = mfst p
 mth = mpair mfalse (msnd p)
 mel = mpair mfalse (f (msnd p))


Please note that changing the definitions is not a solution; I'm trying to 
follow available resources, so using something else is not an option. :-(


Thanks for any help

  Dusan



[Haskell-cafe] Haskell Weekly News: Issue 147 - January 24, 2010

2010-01-24 Thread jfredett

---
Haskell Weekly News
http://sequence.complete.org/hwn/20100124
Issue 147 - January 24, 2010
---
   Welcome to issue 147 of HWN, a newsletter covering developments in the
   [1]Haskell community.

   This week on the HWN, we have a new blogger in town, Bartek, over at
   'The Power of Types Compels You', wrote an excellent article which is
   highlighted in the blogs section. Many new packages this week as well,
   and some interesting discussions about performance and existential
   types. So, Haskellers, till next week, your Haskell Weekly News!

Announcements

   ForSyDe DSL v3.1. Seyed Hosein Attarzadeh Niaki [2]announced a new
   version of the ForSyDe DSL. This version provides more freedom in the
   declaration of process functions, as well as compatibility with base 4.

   parameterized-data library v0.1.4. Seyed Hosein Attarzadeh Niaki
   [3]announced a new version of the parameterized-data library. The
   parameterized-data library provides fixed-sized vectors, this update is
   provide minor compatibility fixes.

   Updates and a new member of the Monadic Regions family. Bas van Dijk
   [4]announced several new updates and a new member of the Monadic
   Regions family.

   bindings-DSL 1.0.4 (Category: FFI). Mauricio CA [5]announced a new
   version of his package bindings-DSL.

   Haskell XML Toolbox Version 8.5.0. Uwe Schmidt [6]announced a new
   version of the Haskell XML Toolbox (HXT). The main change in this
   version is the separation of the XPath and XSLT modules.

   Cardinality-0.1. Andrey Sisoyev [7]announced the release of his package
   Cardinality, a package for transforming between container types safely.

   afv-0.0.3. Tom Hawkins [8]announced a new release of AFV, an infinite
   state model checker for simple, iterative C programs.

   CfP: PPDP 2010. Temur Kutsia [9]announced a call for papers for the
   PPDP 2010 conference in July

   Mini-announce: A few package updates. Andrew Coppin [10]announced
   several new releases and updates to AC-EasyRaster-GTK, AC-Vector, and
   other packages.

   amqp-0.1. Holger Reinhardt [11]announced the release of his AMQP
   library. It currently only works with RabbitMQ and supports most of the
   0-8 spec. An introduction to AMQP can be found [12]here.

   nntp 0.0.3. Maciej Piechotka [13]announced a bugfix release to nntp.

   haskell-src-exts 1.7.0. Niklas Broberg [14]announced a new version of
   haskell-src-exts, featuring many new changes to the API.

Discussion

   Existential Types (I guess). Ozgur Akgun [15]asked about a problem he
   encountered when trying to use Existential types.

   Is Haskell capable of matching C in string processing performance? John
   Millikin [16]asked a question about Haskell Performance... Quick!
   Someone get Don Stewart!

Blog noise

   [17]Haskell news from the [18]blogosphere. Blog posts from people new
   to the Haskell community are marked with ++, be sure to welcome them!
 * ++Bartek Paczesiowa++: [19]Pure, extensible exceptions and
   self-returning functions. Bartek, over at 'The Power of Types
   Compels You', wrote a fantastic introduction to his forthcoming
   exceptions package. It's funny, and a great read about using type
   families (particularly, equality constraints) to create a pure
   exception handling mechanic. Definitely worth a visit.
 * Neil Mitchell: [20]Optimising HLint.
 * Kevin Reid (kpreid): [21]Darcs repositories back.
 * Darcs: [22]darcs weekly news #52.
 * Bryan O'Sullivan: [23]New GHC I/O manager, first sets of benchmark
   numbers.
 * Galois, Inc: [24]POPL 2010, Day 1.
 * Galois, Inc: [25]Tech Talk: A Scalable I/O Manager for GHC.
 * Galois, Inc: [26]PADL/PEPM 2010, Day 2.
 * Neil Brown: [27]The Process Composition Monad.
 * Galois, Inc: [28]PADL/PEPM 2010, Day 1.
 * Thomas M. DuBuisson: [29]GHC on ARM.
 * David Sankel: [30]Semantic Editor Combinators - one of my favorite
   blog posts.
 * David Sankel: [31]Best Haskell Papers of 2009.
 * Gtk2HS: [32]Compiling with ghc 6.12.
 * Arch Haskell News: [33]wxHaskell packaged for Arch.
 * Darcs: [34]darcs weekly news #51.
 * Don Stewart (dons): [35]Playing with the new Haskell epoll event
   library.
 * Dan Piponi (sigfpe): [36]Target Enumeration with the Euler
   Characteristic. Parts 1 & 2.

Quotes of the Week

 * tensorpudding: the Plot monad allow you to keep the story pure by
   containing all the glaring time travel silliness
 * sproingie: quickcheck myLanguage.hs -- Web browser created after
   285,731 tests
 * RossPaterson: I'm afraid you voided the warranty when you used
   UndecidableInstances.
 * aavogt: strong static typing is not a substitute for sleep
 * tensorpudding: fixity goes up

Re: [Haskell-cafe] Re: PROPOSAL: Web application interface

2010-01-24 Thread Michael Snoyman
On Sun, Jan 24, 2010 at 1:23 PM, Bardur Arantsson s...@scientician.net wrote:

 Michael Snoyman wrote:

 [--snip--]

  Next, I have made the ResponseBodyClass typeclass specifically with the
 goal
 of allowing optimizations for lazy bytestrings and sending files. The
 former
 seems far-fetched; the latter provides the ability to use a sendfile
 system
 call instead of copying the file data into memory. However, in the
 presence
 of gzip encoding, how useful is this optimization?

 [--snip--]

 I'm hoping that the Web bit in your project title doesn't literally mean
 that WAI is meant to be restricted to solely serving content to browsers.
 With that caveat in mind:

 For non-WWW HTTP servers it can be extremely useful to have sendfile. An
 example is my Haskell UPnP Media Server (hums) application. It's sending
 huge files (AVIs, MP4s, etc.) over the network and since these files are
 already compressed as much as they're ever going to be, gzip would be
 useless. The CPU load of my hums server went from 2-5% to 0% when streaming
 files just from switching from a Haskell I/O based solution to proper
 sendfile.

 Lack of proper support for sendfile() was indeed one of the reasons that I
 chose to roll my own HTTP server for hums. I should note that this was quite
 a while ago and I haven't really gone back to reevaluate that choice --
 there's too many HTTP stacks to choose from right now and I don't have the
 time to properly evaluate them all.

 For this type of server, response *streaming* is also extremely important
 for those cases where you cannot use sendfile, so I'd hate to see a standard
 WAI interface preclude that. (No, lazy I/O is NOT an option -- the HTTP
 clients in a typical UPnP media client behave so badly that you'll run out
 of file descriptors in no time. Trust me, I've tried.)

Both sendfile and response streaming are among the top priorities in the WAI
proposal. As for "Web", I think the term is just a synonym for HTTP here.

I'd be especially interested to hear input from people using Haskell for
non-standard HTTP applications, because I want WAI to be as general as
possible. Please let me know if you see anything that you would like added.
The code is all available at http://github.com/snoyberg/wai

Michael


Re: [Haskell-cafe] Naive booleans and numbers - type-checking fails

2010-01-24 Thread Stephen Tetley
Doesn't the simply typed lambda calculus introduce if-then-else as a
primitive precisely so that it can be typed?

It's not an illuminating answer to your question and I'd welcome
clarification for my own understanding, but I don't think you can
solve the problem without appealing to Haskell's built-in
if-then-else.

Best wishes

Stephen


Re: [Haskell-cafe] Naive booleans and numbers - type-checking fails

2010-01-24 Thread Derek Elkins
On Sun, Jan 24, 2010 at 3:12 PM, Stephen Tetley
stephen.tet...@gmail.com wrote:
 Doesn't the simply typed lambda calculus introduce if-then-else as a
 primitive precisely so that it can be typed?

 It's not an illuminating answer to your question and I'd welcome
 clarification for my own understanding, but I don't think you can
 solve the problem without appealing to Haskell's built-in
 if-then-else.

Yes, encoding data types as pure typed lambda terms requires rank-2
types.  I'd recommend that Dušan Kolář start giving types to all these
functions.  However, it will, eventually, be necessary to go beyond
Haskell 98 to give the appropriate types.
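
For example, here is a sketch of the rank-2 route (my wrapping, not Dušan's original code): newtypes give the encodings the polymorphic types they need.

{-# LANGUAGE RankNTypes #-}

newtype CBool = CBool { runBool :: forall a. a -> a -> a }

mtrue, mfalse :: CBool
mtrue  = CBool (\x _ -> x)
mfalse = CBool (\_ y -> y)

mif :: CBool -> a -> a -> a
mif (CBool c) t f = c t f

newtype CNat = CNat { runNat :: forall a. (a -> a) -> a -> a }

msucc :: CNat -> CNat
msucc (CNat x) = CNat (\g m -> x g (g m))

-- the Church numeral is instantiated at result type CBool here
iszero :: CNat -> CBool
iszero (CNat m) = m (\_ -> mfalse) mtrue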


Re: [Haskell-cafe] Re: PROPOSAL: Web application interface

2010-01-24 Thread Michael Snoyman
Minor spec question: what should be the defined behavior when an application
requests that a file be sent and it does not exist?


Re: [Haskell-cafe] Spelling checker exercise

2010-01-24 Thread Matthew Phillips
On 24/01/2010, at 10:22 PM, Daniel Fischer wrote:

...

 I think I know what happened here:
 
 $ ghc -fforce-recomp --make matthew -o matthew0

...

 I habitually compile all code with -O2, unless I have a specific reason not 
 to. I tend to forget that some compile without optimisations.
 For some kinds of code that makes hardly any difference, for others it 
 makes a big difference.

I used the flags -funbox-strict-fields -fvia-C -optc-O2, but *not* -O2. 
Whoops! I could kick myself: I blindly assumed that -optc-O2 would turn on 
optimisation, but of course that's just the flag for GCC.

  $ time ./spelling becuase
  becuase -> because

  real  0m4.467s
  user  0m3.865s
  sys   0m0.280s

  $ time ./spelling_df becuase
  becuase -> because

  real  0m2.422s
  user  0m2.198s
  sys   0m0.081s

Your previous version is close to twice as fast, and now only 2.5 times slower 
than Python.

snipped new version of code with toLower removed

With your suggested changes, your latest version on my machine:

  $ time ./spelling_df becuase
  becuase -> because

  real  0m1.731s
  user  0m1.572s
  sys   0m0.056s

Now, we're getting close: 4.7s -> 2.3s -> 1.7s.

 But once you start needing two edits (try korrekt), correct and edits1
 start to show up. That shows that Norvig's algorithm isn't really
 good. With two edit steps, you create a _lot_ of strings you need to
 look up, far more than there are in the map. That takes time. It'll be
 better to scan the map for entries with an edit distance (< 3) if you
 have a good method to check that
 
 Indeed:
 $ time ./nLDBSWSpelling becuase
 becuase -> because
 2.84user 0.02system 0:02.86elapsed 100%CPU
 $ time ./nLDBSWSpelling korrekt
 korrekt -> correct
 2.83user 0.05system 0:02.88elapsed 100%CPU

Not sure if I see what you're saying here: do you mean to point out the 2.86 vs 
2.88 elapsed?

 Another thing is
 
 allWords = keysSet wordCounts
 
 Ouch. For each correction, you construct that set anew. Just use
 member from Data.Map instead of Data.Set.member and look up the words
 in the map.
 
 Whoops! Just to be clear though: Haskell will memoise the result of
 allWords for a given invocation of correct?
 
 Yes. But not across different invocations.
 
 So this would only make a large difference for multiple corrections?
 
 Right. But that's the interesting use case, isn't it?

It will be when I get the rest of it working, yes :)

 I will look at using Bloom Filters or
 Tries instead of Data.Map, but I wonder if readFile should be taking
 nearly 30% of the run time, even for a 6MB file?
 
 No way. But it doesn't seem to, from my GHC's point of view.
 
 Just to be sure I wasn't using the SCC incorrectly, I split out the
 readFile call into myReadFile. The prof line comes out as:
 
  myReadFile  Main  210  1  35.8  8.6  35.8  8.6
 
 i.e. 35.8%  of runtime.
 
 
 Can I see the exact code which gives that profile?
 Somehow, things which shouldn't must be attributed to readFile.

The version at this link has myReadFile split out.

http://github.com/scramjet/spelling/blob/31071edb2166b2bc4d350358900d975228fd43b9/spelling.hs

Doing the same to your version has the same result:

  myReadFile  Main  210  1  45.7  9.6  45.7  9.6

It does seem that the profiler is wrong or misleading somehow.

Two other quick questions:

(1) you added parentheses to candidates:

  candidates = known [word] `or` ((known e1) `or` known_edits2)

The or's should short-circuit, so that if known [word] is non-empty, the 
others shouldn't be evaluated. I read the above as forcing evaluation of the 
second `or` first: am I wrong?
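
(For reference, a sketch of the assumed definition of the `or` in question:)

import Prelude hiding (or)

-- with this definition the parentheses above only change association;
-- under lazy evaluation the inner `or` (and hence known_edits2) is
-- still only demanded if the earlier lists turn out to be empty
or :: [a] -> [a] -> [a]
or a b = if null a then b else a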

(2) you eliminated the fold in correct in favour of a tail-recursive search 
in maxCount: was this for style or performance reasons (or both :)?

Cheers,

Matthew.


[Haskell-cafe] ANN: afv-0.1.0

2010-01-24 Thread Tom Hawkins
AFV is an infinite state model checker to verify assertions in
embedded C programs.

New in this release:

- Starts analysis from 'main' entry point.  Automatically identifies
the main loop (for (;;), while (1)).
- Better counter example generation.
- Enforces stateless expressions.

Unfortunately, the list of unsupported C is still long.  No...
- arrays
- pointers
- structs
- unions
- typedefs
- type casts
- static top-level declarations
- non void functions
- switch statements
- arbitrary loop statements


http://hackage.haskell.org/package/afv


Re: [Haskell-cafe] Spelling checker exercise

2010-01-24 Thread Daniel Fischer
On Monday 25 January 2010 05:34:50, Matthew Phillips wrote:
 On 24/01/2010, at 10:22 PM, Daniel Fischer wrote:

 ...

  I think I know what happened here:
 
  $ ghc -fforce-recomp --make matthew -o matthew0

 ...

  I habitually compile all code with -O2, unless I have a specific
  reason not to. I tend to forget that some compile without
  optimisations. For some kinds of code that makes hardly any
  difference, for others it makes a big difference.

 I used the flags -funbox-strict-fields -fvia-C -optc-O2, but *not*
 -O2. Whoops! I could kick myself: I blindly assumed that -optc-O2 would
 turn on optimisation, but of course that's just the flag for GCC.

By the way, compiling via C nowadays hardly ever produces faster code than 
the native code generator (most times I tried since 6.10, if there was a 
difference, native was faster).


   $ time ./spelling becuase
   becuase -> because

   real0m4.467s
   user0m3.865s
   sys 0m0.280s

   $ time ./spelling_df becuase
   becuase -> because

   real0m2.422s
   user0m2.198s
   sys 0m0.081s

 Your previous version is close to twice as fast, and now only 2.5 times
 slower than Python.

 snipped new version of code with toLower removed

 With your suggested changes, your latest version on my machine:

   $ time ./spelling_df becuase
   becuase -> because

   real0m1.731s
   user0m1.572s
   sys 0m0.056s

 Now, we're getting close: 4.7s - 2.3s - 1.7s.

  But once you start needing two edits (try korrekt), correct and
  edits1 start to show up. That shows that Norvig's algorithm isn't
  really good. With two edit steps, you create a _lot_ of strings you
  need to look up, far more than there are in the map. That takes
  time. It'll be better to scan the map for entries with an edit
  distance (< 3) if you have a good method to check that
 
  Indeed:
  $ time ./nLDBSWSpelling becuase
  becuase -> because
  2.84user 0.02system 0:02.86elapsed 100%CPU
  $ time ./nLDBSWSpelling korrekt
  korrekt -> correct
  2.83user 0.05system 0:02.88elapsed 100%CPU

 Not sure if I see what you're saying here: do you mean to point out the
 2.86 vs 2.88 elapsed?

Well, above the code, I had the times for Norvig's algorithm (creating all 
two-step edits and checking which are known):

  And, without profiling:
  $ time ./spellingBSW becuase
  becuase -> because
  2.84user 0.03system 0:02.88elapsed 99%CPU
 
  Finally, building the set of two-step edits takes longer than the
  additional lookups:
 
  $ time ./spellingBSW becuase
  becuase -> because
  2.84user 0.03system 0:02.88elapsed 99%CPU
  $ time ./spellingBSW korrekt
  korrekt -> correct
  3.50user 0.02system 0:03.52elapsed 100%CPU
 
  vs.
 
  $ time ./spellingBSWL becuase
  becuase -> because
  2.79user 0.04system 0:02.83elapsed 100%CPU
  $ time ./spellingBSWL3 korrekt
  korrekt -> correct
  3.20user 0.02system 0:03.23elapsed 99%CPU

For becuase, which is a one-step edit, all take the same time (within 
normal fluctuation), 2.8x seconds.
For the two-step edit korrekt, the version building Sets takes 3.5 
seconds, the version which doesn't build a Set of two-step edits takes 3.2 
seconds, *and the version scanning the Map of known words for entries with 
a Levenshtein distance [modified to account for transpositions] less than 
3, takes 2.8y seconds, practically the same time as for the one-step edit*.
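
(A rough sketch of such a bounded check - plain String version, transpositions omitted, names invented; one would keep only the map keys that pass it:)

-- True iff the Levenshtein distance of the two words is at most 2;
-- the budget k shrinks with every edit, cutting the search off early
withinTwo :: String -> String -> Bool
withinTwo a b = go a b (2 :: Int)
  where
    go _ _ k | k < 0 = False
    go [] ys k = length ys <= k
    go xs [] k = length xs <= k
    go (x:xs) (y:ys) k
      | x == y    = go xs ys k
      | otherwise = go xs (y:ys) (k - 1)   -- deletion
                 || go (x:xs) ys (k - 1)   -- insertion
                 || go xs ys (k - 1)       -- substitution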

It becomes more obvious, perhaps, if we test a couple of two-steppers:

Lazy Levenshtein:

$ time ./nLDBSWSpelling becrase korrekt halmos territoir gutzenperg
becrase -> because
korrekt -> correct
halmos -> holmes
territoir -> territory
gutzenperg -> gutenberg
2.94user 0.03system 0:02.97elapsed 100%CPU

just something like 0.1 to 0.15 seconds more than for becuase; that makes 
0.02 to 0.04 seconds per two-edit correction.

Sets:

$ time ./spellingBSW becrase korrekt halmos territoir gutzenperg
becrase -> because
korrekt -> correct
halmos -> holmes
territoir -> territory
gutzenperg -> gutenberg
7.48user 0.03system 0:07.55elapsed 99%CPU

Good grief! That takes almost a second per two-edit correction.

List:

$ time ./spellingBSWL3 becrase korrekt halmos territoir gutzenperg
becrase -> because
korrekt -> correct
halmos -> holmes
territoir -> territory
gutzenperg -> gutenberg
5.54user 0.04system 0:05.58elapsed 99%CPU

Better, but still bad, roughly half a second per correction.

Python without psyco:

$ time python ./norvig.py becrase korrekt halmos territoir gutzenperg
Trained in  1.46993017197 seconds
because
correct
holmes
territory
gutenberg
3.00user 0.08system 0:03.09elapsed 99%CPU
$ time python ./norvig.py becuase
Trained in  1.49027395248 seconds
because
1.46user 0.08system 0:01.55elapsed 100%CPU

about 0.3 seconds per correction

and with psyco:

$ time python ./norvig.py becrase korrekt halmos territoir gutzenperg
Trained in  1.22417902946 seconds
because
correct
holmes
territory
gutenberg
2.12user 0.09system 0:02.21elapsed 100%CPU
$ time python ./norvig.py becuase
Trained in  1.17486715317 seconds
because