Re: [Haskell-cafe] Simple network client

2008-01-30 Thread Bryan O'Sullivan
Adam Langley wrote:
 On Jan 30, 2008 1:07 PM, Adam Langley [EMAIL PROTECTED] wrote:
 So, if I don't hear otherwise soon, I'll probably push a new version
 of binary-strict later on today with the interface above.
 
 It's in the darcs now, http://darcs.imperialviolet.org/binary-strict

Thanks!

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] ANN: bytestringparser-0.2, a Parsec-like parser for lazy ByteStrings

2008-01-25 Thread Bryan O'Sullivan
Some time ago, Jeremy Shaw wrote a Parsec clone for lazy ByteStrings.
I've been using it for a while, and have made substantial changes to it
along the way.

It's very fast, using the same manual unpacking trick as the binary
package to keep performance nippy.  It also integrates with the latest
flava in parsing combinators, Control.Applicative, providing instances
of the Applicative and Alternative typeclasses.

The API is rather smaller than that of Parsec, but it's more than
adequate for parsing e.g. text-based network protocols.

Download:

http://hackage.haskell.org/packages/archive/bytestringparser/0.2/bytestringparser-0.2.tar.gz

Haddocks:

http://darcs.serpentine.com/bytestringparser/dist/doc/html/bytestringparser/Text-ParserCombinators-ByteStringParser.html

Darcs repository:

darcs get http://darcs.serpentine.com/bytestringparser

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Draft chapters of Real World Haskell now publicly available

2008-01-23 Thread Bryan O'Sullivan
Wolfgang Jeltsch wrote:

 Covering reactive programming would indeed be interesting.

I agree.  However, we have no plans to cover this topic.  I don't
believe any of us has used FRP, and my impression of it as an approach
is that it's not yet cooked.  We already have our hands and TOC full
covering well-established topics, never mind tracking the leading edge
of research.  (For similar reasons, we won't be writing about ndp, even
though it's got a lot more obvious crowd appeal.)

b

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Draft chapters of Real World Haskell now publicly available

2008-01-22 Thread Bryan O'Sullivan
Paul Moore wrote:

 I'm posting here because there doesn't seem to be an overall comment
 section, but the TOC seems to cover less ground than I expected. Is
 the TOC meant to be complete?

No, it's less than a third of the whole thing.

Here's the announcement from last May, including a more detailed TOC:
http://www.realworldhaskell.org/blog/2007/05/23/real-world-haskell-its-time/

We've since added a few chapters, and I wouldn't be surprised if we
subtract a couple in the interests of length later in the process.

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Draft chapters of Real World Haskell now publicly available

2008-01-21 Thread Bryan O'Sullivan
John, Don and I are pleased to announce the beginning of the public beta
programme for our upcoming book, Real World Haskell.  For further
details, please see the following blog entry:

http://www.realworldhaskell.org/blog/2008/01/21/finally-the-public-beta-programme-begins/

Thanks to all of the Haskell community members who have so far performed
sterling service in commenting on our closed drafts.

We look forward to your feedback!

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Poisx select support

2008-01-16 Thread Bryan O'Sullivan
Spencer Janssen wrote:

 For C's void *, I'd use Ptr ().

Ptr a seems to be more usual, and hews closer to the idea that it's a
pointer to an opaque value.

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Computer Science Books using Haskell

2008-01-14 Thread Bryan O'Sullivan
PR Stanley wrote:

 Can the list recommend books that use Haskell - or any FP language but
 preferably Haskell - to illustrate the principles of compilers and/or
 algorithms?

Try Andrew Appel's Modern Compiler Implementation in ML
http://www.cs.princeton.edu/~appel/modern/ml/ which, as it uses SML for
everything, should translate quite readily to Haskell.

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] US Homeland Security program language security risks

2008-01-09 Thread Bryan O'Sullivan
Yitzchak Gale wrote:

 Perhaps Coverity's interest could be
 piqued if they were made aware of Haskell's emergence
 as an important platform in security-sensitive
 industries such as finance and chip design, and of
 the significant influence that Haskell is having on the
 design of all other major programming languages.

During one of Simon PJ's tutorials at OSCON last year, a Coverity
engineer was in the audience.  He told us afterwards that he downloaded
the GHC source and gave a try at analysing it while Simon talked.  He
didn't get far, of course; their software wasn't built for the tricks
that -fvia-C plays.  But they have at least one person who was that
interested.

However, it would cost several million dollars to produce a tool as
slick as Coverity's for Haskell (Prevent is really very impressive).
That would rival Coverity's RD expenditure to date; they're a small
company.  I'd have a hard time believing that any such investment could
be recouped through commercial sales within the next decade.

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANN: A triple of new packages for talking to the outside world

2008-01-08 Thread Bryan O'Sullivan
Adam Langley wrote:

 Ok, see http://www.imperialviolet.org/IncrementalGet.hs

That's excellent!  This is just the sort of thing one wants if getting
dribs and drabs of information instead of a steady stream.  For example,
I need to reconstruct TCP streams from individual packets captured off
the wire, and this is a much easier mechanism to use than playing tricks
with the direct-mode Get monad.

 Questions:
   1) Should Finished include the remainder of the ByteString (e.g.
 that which wasn't used by that parser)

Yes, definitely.  I had to add a runGetState to the existing Get monad
so that I could recover the unparsed residual, so I'm sure it will be
necessary here.

   2) I've no idea what I've done to the parse speed

Getting the API right is the appropriate thing to be doing first.
Afterwards, the rewrite rule ninjas can stage a night attack on
performance problems.

 But if this is useful to you, make any requests. I'll (hopefully) do
 them, clean it up and push a new release of binary-strict.

I'm lobbying for Don and company to include this stuff in the regular
binary distribution.  A proliferation of almost-identical packages
doesn't serve the community all that well.

Thanks for the nice work!  I'll try to put that code to use in perhaps a
few days, and let you know how the API works out in practice.

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANN: A triple of new packages for talking to the outside world

2008-01-07 Thread Bryan O'Sullivan

 It would seem that there would be three possible outcomes from an
 incremental Get:
   - Failure: some bitstreams are just invalid and no amount of extra
 data will ever fix that
   - Complete [Result]: the last chunk of data has been processed.
 Maybe this should also include the remainder of the data?
   - Partial Cont [Result]: needs more data, but here's a (possibly
 empty) list of results so far.

Yes, that's more or less exactly what I had in mind.

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANN: A triple of new packages for talking to the outside world

2008-01-06 Thread Bryan O'Sullivan
Adam Langley wrote:

 This is mostly a cut-n-paste job from the excellent binary package
 which provides Data.Binary.Strict.Get - a monad which is a drop in
 replacement for Get, but which parses strict ByteStrings and returns
 an Either,

Ooh, nice.  We could really do with an incremental version, too, which
could be spoonfed chunks of bytes, and dole out values as
deserialisation completes.

Passing back a Left String is in some sense not much of an improvement
over calling error, as if I merely doesn't have enough bytes accumulated
yet, I can't restart parsing from the point the bytes ran out.  Any
chance you'd like to hand back a continuation instead?

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Quanta. Was: Wikipedia on first-class object

2008-01-06 Thread Bryan O'Sullivan
Achim Schneider wrote:

 There is this story about some military (US afair) training a neural
 net to detect tanks in images, I can't find the link right now.
 
 It worked, with amazing 100% accuracy.
 
 Then they threw another batch of images at the net.
 
 It worked, with devastating 50% accuracy.

Indeed.  This is not the sort of thing you want to get wrong.

http://blog.wired.com/defense/2007/10/robot-cannon-ki.html

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] ANN / CFP - LLVM bindings for Haskell

2008-01-03 Thread Bryan O'Sullivan
This is an early release of Haskell bindings for the popular LLVM
compiler infrastructure project.

If you don't know what LLVM is, it's a wonderful toybox of compiler
components, from a complete toolchain supporting multiple architectures
through a set of well-defined APIs and IR formats that are designed for
building interesting software with.

The official LLVM home page is here:

  http://llvm.org/

The Haskell bindings are based on Gordon Henriksen's C bindings.  The C
bindings are almost untyped, but the Haskell bindings re-add type safety
to prevent runtime crashes and general badness.

Currently, the entire code generation system is implemented, with most
LLVM data types supported (notably absent are structs).  Also plugged in
is JIT support, so you can generate code at runtime from Haskell and run
it immediately.  I've attached an example.

Please join in the hacking fun!

  darcs get http://darcs.serpentine.com/llvm

If you want a source tarball, fetch it from here:

  http://darcs.serpentine.com/llvm/llvm-0.0.2.tar.gz

(Hackage can't host code that uses GHC 6.8.2's language extension names
yet.)

There's very light documentation at present, but it ought to be enough
to get you going.

b
{-# LANGUAGE TypeOperators #-}

module Fibonacci (main) where

import Control.Monad (forM_)
import Data.Int (Int32)
import System.Environment (getArgs)

import qualified LLVM.Core as Core
import qualified LLVM.Core.Builder as B
import qualified LLVM.Core.Constant as C
import qualified LLVM.Core.Instruction as I
import qualified LLVM.Core.Type as T
import qualified LLVM.Core.Value as V
import qualified LLVM.Core.Utils as U
import qualified LLVM.ExecutionEngine as EE

buildFib :: T.Module - IO (V.Function T.Int32 T.Int32)
buildFib m = do
  let one = C.const (1::Int32)
  two = C.const (2::Int32)
  (fib, entry) - U.defineFunction m fib (T.function undefined undefined)
  bld - B.createBuilder
  exit - Core.appendBasicBlock fib return
  recurse - Core.appendBasicBlock fib recurse
  let arg = V.params fib

  B.positionAtEnd bld entry
  test - B.icmp bld  I.IntSLE arg two
  B.condBr bld test exit recurse

  B.positionAtEnd bld exit
  B.ret bld one

  B.positionAtEnd bld recurse
  x1 - B.sub bld  arg one
  fibx1 - B.call bld  fib x1

  x2 - B.sub bld  arg two
  fibx2 - B.call bld  fib x2

  B.add bld  fibx1 fibx2 = B.ret bld
  return fib

main :: IO ()
main = do
  args - getArgs
  let args' = if null args then [10] else args

  m - Core.createModule fib
  fib - buildFib m
  V.dumpValue fib

  prov - Core.createModuleProviderForExistingModule m
  ee - EE.createExecutionEngine prov
  
  forM_ args' $ \num - do
putStr $ fib  ++ num ++  = 
parm - EE.createGeneric (read num :: Int)
gv - EE.runFunction ee fib [parm]
print (EE.fromGeneric gv :: Int)

define i32 @fib(i32) {
entry:
icmp sle i32 %0, 2  ; i1:1 [#uses=1]
br i1 %1, label %return, label %recurse

return: ; preds = %entry
ret i32 1

recurse:; preds = %entry
sub i32 %0, 1   ; i32:2 [#uses=1]
call i32 @fib( i32 %2 ) ; i32:3 [#uses=1]
sub i32 %0, 2   ; i32:4 [#uses=1]
call i32 @fib( i32 %4 ) ; i32:5 [#uses=1]
add i32 %3, %5  ; i32:6 [#uses=1]
ret i32 %6
}

fib 10 = 55
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANN / CFP - LLVM bindings for Haskell

2008-01-03 Thread Bryan O'Sullivan
Don Stewart wrote:

 (Hackage can't host code that uses GHC 6.8.2's language extension names
 yet.)
 
 {-# LANGUAGE XYZ #-} pragmas? If so, I'm pretty sure they're 
 supported, since xmonad uses them, and is on hackage.

Language pragmas in general are fine, but I believe I'm using a few that
are new to Cabal 1.2.3.0, which isn't being used to power Hackage yet.
Or thus quoth Duncan.

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANN / CFP - LLVM bindings for Haskell

2008-01-03 Thread Bryan O'Sullivan
Ross Paterson wrote:

 It should be able to now.

Thank you!

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Compiler backend question

2008-01-01 Thread Bryan O'Sullivan
Peter Verswyvelen wrote:

 Well, I don't know about the licensing, but according to
 http://en.wikipedia.org/wiki/GNU_Compiler_Collection#Front_ends, a new
 cleaner intermediate language was created in 2005 for GCC, which might be
 more general?

It's still very difficult to work with GCC from the perspective of an
external tool.  Its current IR is still targeted towards the languages
it currently supports and the needs of its back end, so it omits a fair
bit of information that third-party tools would find useful.

 I just wanted to see if it is *possible* to feed just all the C code from
 GHC into a C compiler, and then generate a single executable from that C
 code (including the GHC runtime).

It would be a fair bit of work.  My recollection is that GHC knows
that it is talking to GCC when you go through -fvia-c, so it uses some
gcc extensions to handle things like register reservation.  A few
compilers support some of those gcc extensions, so you might have some
level of success with those.  The Intel and PathScale compilers do, for
example, and LLVM's clang front end has some level of gcc compatibility
already.

In the case of the PathScale compiler (which I used to work on), when
invoked in whole-program mode it emits files that look like regular .o
files, but are in fact just the serialised IR for a compilation unit.
It's also open source, so it would be easier to tinker with than the
Intel compiler.

 Actually this LTCG thingy was first supported by the
 Intel compiler I believe.

No, that kind of whole-program optimisation long predates its inclusion
in the Intel compiler.  The earliest I saw it in deployment was in MIPS
compilers circa 1992, and I don't recall it being a new idea at the time.

Anyway, the point about GCC being an unsuitable vehicle for this sort of
thing remains.  You'd do better looking at LLVM, for which I've lately
been working on nice Haskell bindings:

darcs get http://darcs.serpentine.com/llvm

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Comments on reading two ints off Bytestring

2007-12-23 Thread Bryan O'Sullivan
Paulo J. Matos wrote:

 I guess the latter is the correct guess.

Good guess!

You can take advantage of the fact that the Maybe type is an instance of
the Monad typeclass to chain those computations together, getting rid of
all of the explicit case analysis.

import qualified Data.ByteString.Char8 as B
import Data.Char (isDigit)

readTwoInts :: B.ByteString - Maybe ((Int, Int), B.ByteString)
readTwoInts r = do
  (a, s) - B.readInt . B.dropWhile (not . isDigit) $ r
  (b, t) - B.readInt . B.dropWhile (not . isDigit) $ s
  return ((a, b), t)

Let's try that in ghci:

  *Main readTwoInts (B.pack hello 256 299 remainder)
  Just ((256,299), remainder)

The case analysis is still happening, it's just being done behind your
back by the (=) combinator, leaving your code much tidier.  (And why
is there no explicit use of (=) above?  Read about desugaring of do
notation in the Haskell 98 report.)

The learning you'll want to do, to be able to reproduce code such as the
above, is about monads.

Cheers,

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] [OT] A nice organized collection of threads in Haskell-Cafe

2007-12-08 Thread Bryan O'Sullivan
Albert Y. C. Lai wrote:

 I can't blame you for being not observant. Afterall, this is precisely
 what I'm alluding to with everyone can haz PC [...]

Please don't flame people on the list.

Thank you,

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Type error in final generator

2007-12-08 Thread Bryan O'Sullivan
Loganathan Lingappan wrote:

 main = do
 hSetBuffering stdin LineBuffering
 numList - processInputs
 foldr (+) 0 numList

The type of main is understood to be IO (), so it can't return anything.
 You could work around this by rewriting the last line above as follows:

print (foldr (+) 0 numList)

This prints the number, which is presumably what you want, and print has
type IO (), so it works out nicely here.

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] fast Array operations: foldl, drop

2007-11-29 Thread Bryan O'Sullivan

Henning Thielemann wrote:

I thought operations like foldl' and drop must be very fast on arrays
(especially UArray) with appropriate pointer tricks,


These kinds of functions are only much use on one-dimensional arrays, 
which look sufficiently list-like that the ideas translate fairly 
cleanly.  For higher dimensions, there are enough options in terms of 
traversal direction and what exactly e.g. a fold should fold over 
(single elements? lower-dimensional slices?) that a sensible API doesn't 
exactly leap out.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Waiting for thread to finish

2007-11-28 Thread Bryan O'Sullivan

Andrew Coppin wrote:

Dan Weston wrote:
Silly or not, if I compile with -threaded, I always link in the 
one-liner C file:


  char *ghc_rts_opts = -N2;


Ah... you learn something useful every day! I was going to ask on IRC 
whether there's any way to do this - but now I don't need to bother. :-)


But wait, there's more!  If you're using the threaded RTS, you often 
need to know how many threads you can run concurrently, for example to 
explicitly split up a compute-bound task.  This value is exposed at 
runtime by the numCapabilities variable in the GHC.Conc module.


http://www.haskell.org/ghc/docs/latest/html/libraries/base-3.0.0.0/GHC-Conc.html#v%3AnumCapabilities

This variable is new in GHC 6.8.1 (thanks, Simon!), so don't try to use 
it with an older release.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Unix (Posix) low-level device driver functionality?

2007-11-23 Thread Bryan O'Sullivan

Galchin Vasili wrote:

In any case, it seems like the GHC 
documentation allows raw driver I/Obut when I look at the actual GHC 
6.8.1 libraries I don't see low level driver functionailty.


On Unix, at least, you don't need anything special to write userspace 
device drivers.  You normally open a special file in /dev, maybe memory 
map some of it into your address space, then use read, write, select, 
ioctl, and memory-mapped I/O.


There isn't a direct Haskell interface to ioctl, but you can write an 
FFI wrapper for it and use hsc2hs to marshal Storable instances as 
necessary, based on their C definitions.


Using the normal Unix read, write, and select functions is a bit tricky 
due to GHC's default of non-blocking I/O.  Device drivers are often very 
tetchy about the exact mode in which they're accessed, and a driver 
written for blocking I/O can simply fall over if you try non-blocking 
access.  The easiest thing to do is to rewrap them (and open, of course) 
using the FFI so that you can control the blocking behaviour.


As for memory-mapped I/O for doing the likes of PIO or DMA, 
Data.ByteString has an mmap wrapper, but it's not currently built, and 
wouldn't be especially useful for the kind of super-low-level control 
you need if you're bit-banging a device directly.  For that, you would 
do best to work in C, where it's more predictable what the compiler will 
and won't do with code around memory barriers and the like, and expose 
the code to Haskell in a controlled manner via the FFI.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] expanded standard lib

2007-11-21 Thread Bryan O'Sullivan

Magnus Therning wrote:



 “Rubygems is source-intrusive. The require instruction is replaced by a
 require_gem instruction to allow for versioned dependencies. Debian and
 most other systems think that dealing with versioned dependencies
 outside of the source is a better idea.”


To drag the conversation reluctantly back to Haskell: I comaintain all 
of the Haskell packages for Fedora, one of the biggest Linux 
distributions.  I also wrote cabal-rpm, which automates the task of 
turning a Cabal package into an RPM.  I also use Debian and MacPorts.


On none of these platforms do we have a 
Cabal-versus-native-package-manager problem.  It simply doesn't occur. 
All of the needed interactions and scripting that I've come across are 
already provided for by Cabal.  When we've had problems, it's been easy 
to write patches, and to have them accepted.


Thanks for not perpetuating the discussion,

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] expanded standard lib

2007-11-20 Thread Bryan O'Sullivan

Krzysztof Kościuszkiewicz wrote:


I would advocate using a comment system that is similar to the one
at http://djangobook.com/.


That's an appealing idea, but the devil lies in the details.

I wrote just such a comment system for  draft chapters of our book, and 
it's seen a lot of use.


However, what I do is add ID tags to the DocBook source, and the XSLT 
processor passes those through to the final HTML.  This isn't easily 
generalised to other tools, as each needs its own approach.


An alternative is to embed identifiers in the generated HTML, but this 
is brittle in its own way.  Few people generate HTML by hand, and most 
tools that do so have a habit of making huge changes to the output 
structure in response to minor user edits.  Stably identifying chunks of 
text across multiple versions of a document is thus somewhat fiddly.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] timeout and waitForProcess

2007-11-20 Thread Bryan O'Sullivan

Tim Bauer wrote:

Does anyone know if the new System.Timeout.timeout combinator can wakeup from
System.Process.waitForProcess?


No, this is expected behaviour per the documentation:

The technique works very well for computations executing inside of the 
Haskell runtime system, but it doesn't work at all for non-Haskell code. 
Foreign function calls, for example, cannot be timed out with this 
combinator simply because an arbitrary C function cannot receive 
asynchronous exceptions.


In principle, this FFI restriction could be partly lifted on POSIX 
systems, at least for some library calls, by use of thread cancellation. 
 However, the thread cancellation rules are sufficiently subtle that I 
have never heard of anyone actually using this facility.  I wouldn't 
want to be trying to write or maintain this code, though.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] expanded standard lib

2007-11-19 Thread Bryan O'Sullivan

Neil Mitchell wrote:


- The packages seem to be of quite variable quality. Some are excellent,
some are rather poor (or just not maintained any more).


The problem is that only one person gets to comment on the quality of
a library, the author, who is about the least objective person.


Not necessarily.  CPAN has a nice voting system for packages, which is 
quite widely used.


Another useful proxy for quality that CPAN is missing is download 
statistics.  The maintainers handwave about this being due to the wide 
geographic distribution of mirrors, but  I think that any download 
statistics would be better than none.


Clearly, we can do both of these things with Hackage, and I think they'd 
be very useful (particularly the voting).  Another small but useful 
thing that Hackage is missing is a notion of how fresh a package is. 
You have to hand-construct an URL to get a directory listing from Apache 
to find out how old a particular release a tarball is.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] WideFinder

2007-11-10 Thread Bryan O'Sullivan

Sterling Clover wrote:

Maps are a good choice for parallelism because they merge 
efficiently, but for the iterative aspect their performance leaves a lot 
to be desired.


This is not consistent with my observations, I must say.

What I've found to dominate the benchmark are straightforward string 
search and manipulation.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: ByteString search code available in easy-to-digest form

2007-11-09 Thread Bryan O'Sullivan

ChrisK wrote:

Yeah, my code wants to open up the internals of Lazy bytestrings.  Until
recently this was possible toChunks, but it would be best to rewrite it for the
newest Lazy representation (which comes with new shiny ghc 6.8.1).


I've updated the stringsearch package on hackage so that it ought to 
compile against both 6.6.1 and 6.8.1.  The toChunks function hasn't 
changed in any way.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Sinus in Haskell

2007-11-09 Thread Bryan O'Sullivan

Hans van Thiel wrote:


Can anybody explain the results for 1.0, 2.0 and 3.0 times pi below?


It's due to rounding error in the platform's math library.  You'll see 
the same results in most other languages that call into libm.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANN: FileManip 0.1, an expressive file manipulationlibrary

2007-11-08 Thread Bryan O'Sullivan

Claus Reinke wrote:


the somewhat pained tone of that email was because this was a library
i might have liked to use, hindered by two all too typical issues.


To resurrect an old thread, version 0.3.1 is now BSD3-licensed, for your 
hacking pleasure, and updated to work with GHC 6.8.1.



portability is another matter,  [...]


Unfortunately, the portable System.Directory API is fairly crippled.  I 
gather there's a unix-compat package for Windows now, which might get 
you closer to a usable API.  If you find that you can get the two 
working happily, please send a patch or two.  (Unfortunately, I don't 
have time to hack Haskell on Windows myself.)


Thanks,

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] ByteString search code available in easy-to-digest form

2007-11-07 Thread Bryan O'Sullivan
I've packaged up the fast Boyer-Moore and Knuth-Morris-Pratt code that 
Chris Kuklewicz posted a few months ago:


  http://article.gmane.org/gmane.comp.lang.haskell.libraries/7363

The consensus at the time was that the code was not ready for rolling 
into the bytestring package, but now it's easy to install and start 
working with.


API docs:

  http://darcs.serpentine.com/stringsearch/dist/doc/html/stringsearch/

Patches against the darcs repo welcome:

  darcs get http://darcs.serpentine.com/stringsearch

Credit to Justin Bailey, Daniel Fischer, and Chris Kuklewicz for their 
hard work.


(Currently only tested against GHC 6.6.1, FYI.)

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ByteString search code available in easy-to-digest form

2007-11-07 Thread Bryan O'Sullivan

Don Stewart wrote:


Do we have any benchmarks, for say, 1G files, versus linear, naive
(strict) search?


Chris mentioned that he did, but I haven't had time to write anything 
benchmarky yet.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] The question of ByteString

2007-11-02 Thread Bryan O'Sullivan

Andrew Coppin wrote:

1. Why do I have to type ByteString in my code? Why isn't the compiler 
automatically performing this optimisation for me?


One reason is that ByteString is stricter than String.  Even lazy 
ByteString operates on 64KB chunks.  You can see how this might lead to 
problems with a String like this:


foo ++ undefined

The first three elements of this list are well-defined, but if you touch 
the fourth, you die.


2. ByteString makes text strings faster. But what about other kinds of 
collections? Can't we do something similar to them that makes them go 
faster?


Not as easily.  The big wins with ByteString are, as you observe, that 
the data are tiny, uniformly sized, and easily unboxed (though using 
ForeignPtr seems to be a significant win compared to UArray, too).  This 
also applies to other basic types like Int and Double, but leave those 
behind, and you get problems.


If your type is an instance of Storable, it's going to have a uniform 
size, but it might be expensive to flatten and unflatten it, so who 
knows whether or not it's truly beneficial.  If it's not an instance of 
Storable, you have to store an array of boxed values, and we know that 
arrays of boxes have crummy locality of reference.


Spencer Janssen hacked up the ByteString code to produce StorableVector 
as part of last year's SoC, but it never got finished off:


http://darcs.haskell.org/SoC/fps-soc/Data/StorableVector/

More recently, we've been pinning our hopes on the new list fusion stuff 
to give many of the locality of reference benefits of StorableVector 
with fewer restrictions, and all the heavy work done in a library.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Why can't Haskell be faster?

2007-11-01 Thread Bryan O'Sullivan

Ketil Malde wrote:


Python used to do pretty well here compared
to Haskell, with rather efficient hashes and text parsing, although I
suspect ByteString IO and other optimizations may have changed that
now. 


It still does just fine.  For typical munge a file with regexps, lists, 
and maps tasks, Python and Perl remain on par with comparably written 
Haskell.  This because the scripting-level code acts as a thin layer of 
glue around I/O, regexps, lists, and dicts, all of which are written in 
native code.


The Haskell regexp libraries actually give us something of a leg down 
with respect to Python and Perl.  The aggressive use of polymorphism in 
the return type of (=~) makes it hard to remember which of the possible 
return types gives me what information.  Not only did I write a regexp 
tutorial to understand the API in the first place, I have to reread it 
every time I want to match a regexp.


A suitable solution would be a return type of RegexpMatch a = Maybe a 
(to live alongside the existing types, but aiming to become the one 
that's easy to remember), with appropriate methods on a, but I don't 
have time to write up a patch.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Regex API ideas

2007-11-01 Thread Bryan O'Sullivan

ChrisK wrote:


The Haskell regexp libraries actually give us something of a leg down
with respect to Python and Perl.


True, the pure Haskell library is not as fast as a C library.


Actually, I wasn't referring to the performance of the libraries, merely 
to the non-stick nature of the API.  For my purposes, regex-pcre 
performs well (though I owe you some patches to make it, and other regex 
back ends, compile successfully out of the box).



But more interesting to me is learning what API you would like to see.
What would you like the code that uses the API to be?


Python's regexp API is pretty easy to use, and also to remember.  Here's 
what it does for match objects.


http://docs.python.org/lib/match-objects.html

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Equality Question

2007-10-15 Thread Bryan O'Sullivan

PR Stanley wrote:


is const = id?


No, const is saturated with 2 arguments, id with 1.

const 1 2 - 1
id 1 2 - type error

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Filesystem questions

2007-10-14 Thread Bryan O'Sullivan

Yitzchak Gale wrote:


Your library is very nice. But - it suffers from the
same problem. You use unsafe IO operations to build
a lazy IO list, and we all know what grief that can
lead to.


This is little different from the approach taken by Python's os.walk, 
which lazily yields the contents of a directory tree as it traverses it. 
 I'm a little unclear on why one appears good in your eyes, while the 
other is not, beyond perhaps the depth/breadth knob and differences in 
amount of documentation.  Maybe you could expand on that a bit?


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Filesystem questions

2007-10-14 Thread Bryan O'Sullivan

Yitzchak Gale wrote:


I do think that it is much better to provide IO laziness
using monad transformers (or whatever) rather than
unsafe IO.


That's fair enough.  I think it would be great if you were to turn your 
ideas into a library and provide a few examples of its use.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Filesystem questions

2007-10-13 Thread Bryan O'Sullivan

Yitzchak Gale wrote:


Python also has os.walk, a very convenient functional (sort of)
tool for recursing through directories. (It sounds trivial, but
it is not, there are enough annoying details that this function
saves huge amounts of time.) Very embarrassing that Haskell
is missing this.


See System.FilePath.Find in 
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/FileManip-0.2



How about a built-in function that represents a directory tree
as a lazy Data.Tree?


Not a very good idea.  Representing a directory structure as a tree 
makes people think they can manipulate it as if it were a tree, which 
leads to all kinds of nasty bugs when the real world bleeds through the 
holes in the abstraction.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Manual constructor specialization

2007-10-09 Thread Bryan O'Sullivan

Johan Tibell wrote:

I have a rope data type [...]


Perhaps you should talk to Derek Elkins about his.  It would be nice if 
we had fewer, more canonical implementations of popular data structures, 
instead of a proliferation of half bakery.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] signals lib

2007-09-28 Thread Bryan O'Sullivan

brad clawsie wrote:

does System.POSIX.Signals bind to OS specific real-time POSIX signal
apis? (i.e., kqueue on freebsd).


No, just the usual portable signals.

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Library Process (was Building production stable software in Haskell)

2007-09-24 Thread Bryan O'Sullivan

David Menendez wrote:


Using Cabal directly, I can simply run the configure/build/install
process three times with different configuration options.

Is this possible with systems like RPM/apt/port/etc?


Yes.  In the case of RPM and dpkg, we prefix a library's name with the 
name and version of the compiler name against which it's built.  This 
means that you can have hugs20051030-binary, ghc661-binary, and so on 
all installed at once without a problem.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: interaction between OS processes

2007-09-14 Thread Bryan O'Sullivan

Aaron Denney wrote:


If you want expect like functionality, i.e. working for arbitrary
client programs, you'll need to use pseudottys, as expect, script,
screen, xterm, etc. do.


I packaged up a patch for System.Posix to add this a month or three ago, 
but forgot to follow through on it.  Thanks for the (albeit indirect) 
reminder :-)


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Data.Binary Endianness

2007-09-11 Thread Bryan O'Sullivan

Jules Bean wrote:

For these reasons, although it is very cool, I don't think it can be 
recommended as a basis for long-term file format definitions.


Indeed, the authors have never claimed that this is what it's for. 
Unfortunately, because the authors haven't *disclaimed* this as a 
purpose, people have fairly reasonably assumed that this *is* the intent 
of the package.


In conversations with Don and Duncan, they've always been quite clear 
that Data.Binary is intended to shovel bits rapidly and with a 
reasonable interface.  All of the things of which you speak, and more 
useful ones such as RTTI and representation of cyclic data, ought to 
live in a higher-level library.  Said library merely hasn't been written 
yet.


(All of the above speaks of the 'high-level' Data.Binary not the 
'low-level'.)


Data.Binary *is* the low-level Data.Binary :-)

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Block-wise lazy sequences in Haskell

2007-09-05 Thread Bryan O'Sullivan

Henning Thielemann wrote:

 I thought it must be possible to define an unboxed array type with 
Storable elements.


Yes, this just hasn't been done.  There would be a few potentially 
tricky corners, of course; Storable instances are not required to be 
fixed in size, though all the precanned instances are.  Using arbitrary 
Storable instances would make it necessary to scan an array linearly to 
get to a particular element, defeating one of the advantages of e.g. 
ByteStrings.



Further on, I wonder why pairs are not instances of Storable.


I think it hasn't been done simply because it hasn't been done.  The 
upcoming fusion-based list rewrite might hold some promise for relieving 
the pressure for this kind of work.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] interaction between OS processes

2007-09-02 Thread Bryan O'Sullivan

Andrea Rossato wrote:

 Most likely, the content of s sits in a local buffer and never leaves this 
 process, following most OS conventions and as others point out. Another 
 process waiting for it will deadlock.



Yes, I knew it was something related to the underneath OS. I'll have
to study Unix seriously


Your problem may be buffering-related (I haven't read your code to 
check), but if so, there's a fair likelihood that it has nothing to do 
with the OS.  GHC's runtime does its own buffer management on Handles. 
It's quite possible that your deadlock lies at that level, rather than 
anything lower.  Are you calling hFlush after writing to your pipe?


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] redirecting stdout

2007-08-30 Thread Bryan O'Sullivan

Chad Scherrer wrote:

Is it possible to write a function

redirect :: Handle - IO () - IO ()

so that redirect h action is just like action, except that all the
output written to stdout now gets sent to h instead?


No.  The file descriptor used for IO is wired into a Handle, just as in 
a FILE * in C.  You can change where stdout points using hDuplicateTo, 
but that affects the entire process.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] GHC 6.6.1 and SELinux issues

2007-08-29 Thread Bryan O'Sullivan

Alexander Vodomerov wrote:


I've put GHC in unconfined_execmem_t and it started to work fine.  But
the problem is not in GHC -- it is in programs compiled by GHC. They
also require exec/write memory. Only root can grant unconfined_execmem
privileges, so simple user can not run binaries compiled by GHC. How do
you solve this problem?


Running chcon -t unconfined_execmem_exec_t as root will let you run 
the binaries, which you probably already knew.


The underlying problem is harder to fix: the default SELinux policy 
doesn't allow PROT_EXEC pages to be mapped with PROT_WRITE, for obvious 
reasons.  The solution is expensive in terms of address space and TLB 
entries: map the same pages twice, once only with PROT_EXEC, and once 
only with PROT_WRITE.


There's already a Trac ticket filed against this problem, but Simon 
Marlow marked it as closed because he couldn't test the code he wrote to 
try to fix it, and nobody stepped in to help out at the time: 
http://hackage.haskell.org/trac/ghc/ticket/738


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] [ANN] pcap 0.3.1, for user-level network packet capture

2007-08-27 Thread Bryan O'Sullivan
I've taken over maintenance of the pcap library (an interface to 
libpcap, for user-level network packet capture), and released a new version.


Home page: http://www.serpentine.com/software/pcap/
API docs: http://darcs.serpentine.com/pcap/dist/doc/html/pcap/
Download: http://hackage.haskell.org/packages/archive/pcap/

darcs repo: darcs get http://darcs.serpentine.com/pcap/

Thanks are due to Gregory Wright for originally writing the package, and 
Dominic Steinitz and Nick Burlett for their respective maintenance efforts.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: [ANN] An efficient lazy suffix tree library

2007-08-27 Thread Bryan O'Sullivan

ChrisK wrote:


That is almost certainly because the algorithm expects the source string to have
a unique character at its end.


Chris is correct.  I'll ensure that the docs make this clear.

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Diagnosing stack overflow

2007-08-17 Thread Bryan O'Sullivan

Joe Buehler wrote:


What is the point in building this huge thunk if it can't be evaluated
without a stack overflow?


It's not that there's a point to it, it's just the behaviour of foldl. 
  Hence you shouldn't be using foldl.


GHC's strictness analyser can sometimes save you from yourself if you're 
compiling with -O, but it's better to just avoid foldl and use foldr or 
Data.List.foldl' instead.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Diagnosing stack overflow

2007-08-16 Thread Bryan O'Sullivan

Justin Bailey wrote:

I am trying to determine why my stack overflows in my medium sized
program (it has several modules but maybe only 1000 LOC total). On
Windows, at least, the ghcprof visualization tool doesn't work. Any
suggestions besides an output trace?


You shouldn't need ghcprof.  Just compiling with -prof -auto-all will be 
enough to get you able to use allocation profiling, then running with 
+RTS -p -RTS will generate an allocation profile as a fairly readable 
text file.



It may be the function below, which tries to determine if a list of
strict bytestrings is longer than the count given.


Taking stabs in the dark is not a good idea, and sprinkling strictness 
annotations around in an undirected manner won't help, either, however 
much it might feel like doing something concrete.  Start with looking at 
the profile output.  You'll probably find it's a different part of your 
code entirely that's causing the problem.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] When is waitForProcess not necessary?

2007-08-03 Thread Bryan O'Sullivan

Dougal Stanton wrote:


I had to do this recently, so you might be interested in my approach:

http://193.219.108.225/code/blogpost/BlogPost.hs

The idea here is to run arbitrary text (blog posts) through Markdown
and Smartypants before sending them out to the wider world.


Pardon me while I veer off-topic, but you could also use Pandoc to do 
this.  No forking required.

http://sophos.berkeley.edu/macfarlane/pandoc/

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OS swapping and haskell data structures

2007-08-01 Thread Bryan O'Sullivan

Alex Jacobson wrote:
If you create a Data.Map or Data.Set larger than fits in physical 
memory, will OS level swapping enable your app to behave reasonably or 
will things just die catastrophically as you hit a memory limit?


Relying on the OS to page portions of your app in and out should always 
be the fallback of last resort.  You are fairly guaranteed to get 
terrible performance because the VM subsystem can't anticipate your 
app's memory access patterns, and catastrophic death of either your app 
or other system processes is a strong possibility (Google for OOM 
killer if you want some horror stories).  In many cases, you can't even 
rely on paging being possible.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Strange behavior of executeFile

2007-07-29 Thread Bryan O'Sullivan

Krzysztof Kościuszkiewicz wrote:


This works for files, but randomly fails when stdin is connected to
a pipe (pstops complains that it can't seek input).


GHC's file handles are backed by non-blocking file descriptors.  The 
child process run by executeFile inherits the stdin, stdout and stderr 
file descriptors of your Haskell process, so they're unexpectedly (from 
its perspective) in non-blocking mode.


Due to POSIX sharing semantics, you can't simply switch those file 
descriptors to blocking in the child, because they'll then become 
blocking in the parent, too.


Anything involving sharing file descriptors between processes becomes 
similarly broken if the GHC runtime starts using a file descriptor as a 
Handle.  You're not the only one to be surprised by this behaviour, but 
unfortunately it's not trivial to work around.


Simon Marlow was going to look into this problem a few months ago, but I 
don't know if he's had a chance to.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Strange results when trying to create large Bool arrays.

2007-07-12 Thread Bryan O'Sullivan

Ketil Malde wrote:


I'd really like to have errors on overflow, at least as an option, even
if it is costly in terms of performance.  Is there a Trac ticket or
something for this?


Not that I know of.  I filed a Trac ticket against ByteString's readInt 
function before I noticed that read has the same problem, and it was 
closed because read does the same thing.  I've been reluctant to pop my 
head over the parapet since.


CPUs generally don't trap on integer overflow, so generating the 
additional tests and jumps necessary to handle this would be a bit 
involved, and would certainly cost a few percent in performance. 
There's also overflow in truncation and sign conversions to worry about, 
e.g. Word32 - Word16, Word32 - Int (on 32-bit systems), etc.


On the other hand, integer overflows have long been a popular attack 
vector for getting programs to misbehave in the exploit community.  If 
you Google for ia32 integer overflow or i386 integer overflow, the 
first several *pages* of results in each case consist entirely of 
security advisories.  Haskell's FFI makes it as vulnerable as the 
libraries it interfaces to.


Here's a cute-looking paper entitled Efficient and accurate detection 
of integer-based attacks.


http://www.cs.cmu.edu/~dbrumley/pubs/integer-ndss-07.pdf

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Numbers : [Haskell-cafe] Number overflow

2007-07-12 Thread Bryan O'Sullivan

Carter T Schonwald wrote:
Out of curiosity, what ever happened to the proposal a while back to 
refactor the Num class etc so that the operations would be grouped 
according to what abstract algebra notions they correspond to?


The numeric prelude proposals have a wiki page:

  http://www.haskell.org/haskellwiki/Mathematical_prelude_discussion

I think it's one of those things that doesn't have enough people itching 
over it for the collective mind to scratch.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Strange results when trying to create large Bool arrays.

2007-07-11 Thread Bryan O'Sullivan

Richard Kelsall wrote:


main = do
n - getArgs = readIO . head :: IO Int
a - newArray (1,n) True :: IO (IOUArray Int Bool)
printf Created array 1 .. %8d \n (n::Int) :: IO ()

It appears to work up to quite large numbers, but then gets strange.
When I give it an array size of 1,000,000,000,000 it returns this

Created array 1 .. -727379968

Presumably the Int has overflowed without warning when read from the
argument.


Yes, that's right.


I guess there must be a switch to make it produce a nice
error message rather than overflowing without warning.


Actually, there isn't.  Int is a bit of an odd fish that way; it's a 
window onto the underlying machine's behaviour, not a tidy, well-behaved 
mathematical ring.  In a similar vein, I was initially perplexed when I 
found that an expression like this produces garbage instead of an error:


  read 111 :: Int

I have not seen a lot of interest expressed in fixing this sort of 
misbehaviour, which jars a little with the usual emphasis on stringency 
and testing.



It seems to randomly claim to have successfully created huge sizes
of array.


This may be outside of Haskell's control, as you're not actually 
touching the memory you allocate.  I wouldn't be surprised if the 
underlying page allocation is succeeding by virtue of the OS being 
willing to overcommit resources that may not actually be used.  (This 
would be normal behaviour on Linux, for example.)  In such a case, the 
Haskell runtime may not receive an error until you try to actually touch 
the data.


You can get GHC to fix an upper limit on the heap size it will try to 
use, by passing +RTS -M768m -RTS to your compiled program on the 
command line.  That should cause your program to crash more reliably.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] haskell - db, on Solaris

2007-07-11 Thread Bryan O'Sullivan

Daniil Elovkov wrote:


Would you please tell me, what would be my choice if I wanted to
interact with MySql and Oracle from a Haskell program on Solaris?


http://www.haskell.org/haskellwiki/Applications_and_libraries/Database_interfaces

If you want to talk to MySQL, you have a few choices.

HDBC has an ODBC interface that lets you use any ODBC provider, so 
you'll be able to talk to both MySQL and Oracle with it.


HaskellDB can bridge to HDBC, I believe, so that will let you do 
type-safe SQL.


There's also Takusen, which can talk to Oracle, but not MySQL.

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Strange results when trying to create large Bool arrays.

2007-07-11 Thread Bryan O'Sullivan

Albert Y. C. Lai wrote:

I'm just being picky here: where the underlying machine's behaviour is 
2's complement binary, it (Int, +, *) is actually a tidy, well-behaved 
mathematical ring, isomorphic to Z / 2^n Z.


Yes, naturally it wasn't until a few moments after I had sent the 
message that I noticed my error.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] haskell - db, on Solaris

2007-07-11 Thread Bryan O'Sullivan

Daniil Elovkov wrote:


Yes, thanks. But the emphasis was on Solaris. I don't quite understand
what is the common way to access databases on Solaris. Is it odbc?


ODBC is a standard, fairly portable database interface.  Since HDBC has 
ODBC bindings, it can in principle talk to any database that provides an 
ODBC interface.

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Data.ByteString.dropWhile

2007-07-09 Thread Bryan O'Sullivan

Donald Bruce Stewart wrote:


I'd just manually write a 'tidy' loop (in the Data.ByteString style) (which
would avoid all allocations), since it seems pretty useful.


That would indeed be very useful to have as a library function.  I've 
pined for Python's strip() string method (removes leading and trailing 
whitespace) for a while.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] directory tree?

2007-06-25 Thread Bryan O'Sullivan

Chad Scherrer wrote:



Now it doesn't complain about too many open files, but instead I get
this runtime error:

LPS *** Exception: user error (Codec.Compression.Zlib: incorrect header 
check)


Are you sure you really have gzip files?  If you're on a Linux or 
similar box, what does file myfile.z report to you?  It should say 
something like gzip compressed data.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] directory tree?

2007-06-24 Thread Bryan O'Sullivan

Chad Scherrer wrote:


What got me thinking about this is I'd like to be able to do something
like this in just a couple lines of code:

gunzip -c ./2*/*.z

... and feed the result into a giant lazy ByteString.


Using my FileManip library, you'd do that like this.

import Codec.Compression.GZip
import qualified Data.ByteString.Lazy as B
import System.FilePath.Glob

foo :: IO B.ByteString
foo = namesMatching */*.gz =
  fmap B.concat . mapM (fmap decompress . B.readFile)

http://hackage.haskell.org/cgi-bin/hackage-scripts/package/FileManip-0.2

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory usage in repeated reading of an external program's output

2007-06-21 Thread Bryan O'Sullivan

Andrea Rossato wrote:


Now I'm going to profile for memory usage: I've seen that some GC
happens if you are patient enough.


Yes, the process will hit a steady state of a few megabytes of heap 
after a short time.


By the way, your program leaks ProcessHandles.

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory usage in repeated reading of an external program's output

2007-06-21 Thread Bryan O'Sullivan

Andrea Rossato wrote:


Still I do not understand you reference to the leak problem. Could you
please elaborate a bit?


The runProcess function returns a ProcessHandle.  If you don't call 
waitForProcess on that handle, you'll leak those handles.  On Unix-like 
systems, this means you'll accumulate zombie processes and potentially 
fill your process table, DoSing your machine.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] hSetBuffering woes

2007-06-17 Thread Bryan O'Sullivan

Eric wrote:

I'm writing  a simple HTTP server and am trying to implement the POST 
method.


That's a rather general problem statement, indeed :-)  For an 
application like this, I'd suggest that explicit resource management is 
the way to go, and that you should not be using hGetContents at all, 
under any guise.  For example, any scheme involving reading an entire 
stream is going to do completely the wrong thing in the face of HTTP 
keepalive.  Also, code that leaves open sockets piling up in drifts, to 
eventually be shoveled up by the RTS, is going to be trivially easy to DoS.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] hSetBuffering woes

2007-06-16 Thread Bryan O'Sullivan

Eric wrote:

I tried to turn off buffering with the command hSetBuffering (from 
System.IO) but my app still blocks on hGetContents (from 
Data.ByteString). Does anyone know what's happening?


The hGetContents function can't behave the way you want, because it's 
defined to return the entire rest of the input stream.


If you want to stick with strict ByteStrings, use hGetNonBlocking 
instead, but you'll need to block between reads of the handle yourself, 
using System.IO.hWaitForInput.


Otherwise, use lazy ByteStrings.  That version of hGetContents will 
lazily yield chunks that are as big as can be read without blocking as 
they arrive (up to a limit of 64KB), and will hWaitForInput for you.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] hSetBuffering woes

2007-06-16 Thread Bryan O'Sullivan

Eric wrote:

I've converted to lazy bytestrings. After reading in the bytes from a 
network connection I want to save them to a file but now the appendFile 
function blocks:


Well, yes.  It's presumably waiting for data from the network 
connection, because it wants to write out the entire ByteString, and 
whoever you're receiving data from hasn't closed the connection.


If you stepped back and stated the more general problem you are trying 
to solve, we could help you more effectively.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] embedded build language?

2007-06-14 Thread Bryan O'Sullivan

Greg Fitzgerald wrote:
Has anyone embedded a build language in Haskell?  Something like Rake 
http://rake.rubyforge.org/ is to Ruby, but in Haskell or any 
statically-typed functional language.


The closest I've seen is a tiny snippet from a blog posting:

http://ashish.typepad.com/ashishs_niti/2007/06/another_dsl_emb.html

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: [Haskell] Who pays for *.haskell.org machines?

2007-06-13 Thread Bryan O'Sullivan

Bryan Burgers wrote:


Similarly, the Perl community has a foundation, and I believe giving
to it is tax-deductible. You could look in to how they do it.


Setting up a 501(c)(3) foundation is a morass of paperwork.  If people 
within the US are interested in writing tax deductible cheques, a far 
less onerous thing to do would be to look at the Software Freedom 
Conservancy (http://conservancy.softwarefreedom.org/).  This offers many 
advantages not available to a small group, not least protection from 
personal liability for individual volunteer contributors.


I've worked before with the lawyers and administrators at the SFC and 
its parent organisation, the Software Freedom Law Center, and they are 
wonderful, motivated people.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] QuickCheck Arbitrary - infinite recursion on recursive GADTs?

2007-06-04 Thread Bryan O'Sullivan

Marc Weber wrote:


data A = A INt
   | B [A]

instace Arbitrary A where
  arbitrary = oneof [ liftM A arbitrary
, liftM B arbitrary
]

But now QuickCheck will propably create a test value
A ( B [ A ( B [ A  - no end

Is there an easy QuickCheck way to prevent this?


There are two successive sections in the QuickCheck manual that cover 
exactly this topic.


http://www.cs.chalmers.se/~rjmh/QuickCheck/manual_body.html#15

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OpenGL

2007-05-30 Thread Bryan O'Sullivan

Jon Harrop wrote:


Where should I go to get started with OpenGL and Haskell?


Take a look at Gtk2Hs, which has OpenGL bindings.

For example, see http://darcs.haskell.org/gtk2hs/demo/opengl/

b

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] New book: Real-World Haskell!

2007-05-29 Thread Bryan O'Sullivan

Jules Bean wrote:


No offense to the darcs creators, but

1) Only current Haskellers use it; everyone else either uses
Subversion or is migrating to it;


If that is true, then they have missed the point. DVC is a real win for 
most workflows.


We are indeed using darcs, so this discussion is a bit moot.

Regards,

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Slower with ByteStrings?

2007-05-26 Thread Bryan O'Sullivan

Jason Dagit wrote:


I think, given my simple algorithm that means that (==) for
ByteStrings is slower than (==) for String.  Is this possible?


Yes indeed.  Over ByteStrings, (==) is implemented as a call to memcmp. 
 For small strings, this loses by a large margin because it has to go 
through the FFI.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] New book: Real-World Haskell!

2007-05-23 Thread Bryan O'Sullivan

Dougal Stanton wrote:


That is fantastic news to hear. I realise this may be jumping the gun
a bit but could you say anything about predicted timelines?


Not just yet, but it will be a much faster process with three seasoned 
verbmonkeys at work than if we had just one.



Are you
starting from a clean slate or have you been squirrelling away lots of
material already?


A bit of both.

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] New book: Real-World Haskell!

2007-05-23 Thread Bryan O'Sullivan
I'll condense my remaining replies to this thread into a single message, 
to save people a little noise.


Henning Thielemann:


I guess there will also be some lines about how to write
efficient code by using ByteString et. al.?


You bet!


What about a public darcs repository where people can constantly download
and review modifications?   People could even send patches to the authors 
(editors?).


We'll certainly consider those possibilities.  I don't know how our 
publisher will feel about them, but they've been great so far.


Alfonso Acosta:


Existential types are a must  [...]


Yes, we'll be sure to cover existentials.  Thanks for the nudge (and 
ndm, and others too).


Hans van Thiel:


Compared to Erlang. While other functional languages are mentioned
occoasionally on this list, Erlang is notably absent.


I'm reluctant to do comparisons with other languages; that's too easy to 
interpret as evangelism, a game I don't want to play.  It also invites 
the plaintive cry that someone's favourite language was dissed through 
omission.



Number two on my wish list: interfacing with Java.


The temptation to cover new and exciting material is of course strong. 
LambdaVM is both, but it's also an in-progress one-person master's 
project.  We think we'd do best to focus on libraries and extensions 
that are either distributed as standard or widely used.



How about 'Applying Haskell' or something like that as
the working title; what is the 'real world' anyway?


The details may vary, but it's roughly this:

newtype IO a = IO (State# RealWorld - (# State# RealWorld, a #))

:-)

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] ANN: FileManip 0.1, an expressive file manipulation library

2007-05-02 Thread Bryan O'Sullivan
The FileManip package provides expressive functions and combinators for 
searching, matching, and manipulating files.


It provides three modules.

System.FilePath.Find lets you search a filesystem hierarchy efficiently:

  find always (extension ==? .rb) = mapM_ remove

System.FilePath.GlobPattern lets you perform glob-style pattern 
matching, without going through a regexp engine:


  foo.c ~~ *.c
 == True

System.FilePath.Manip lets you rename files procedurally, edit files in 
place, or save old copies as backups:


  modifyWithBackup (. bak)
   (unlines . map (takeWhile (/= ',')) . lines)
   myPoorFile.csv
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANN: FileManip 0.1, an expressive file manipulation library

2007-05-02 Thread Bryan O'Sullivan

Bryan O'Sullivan wrote:
The FileManip package provides expressive functions and combinators for 
searching, matching, and manipulating files.


As seems to be my habit, I forgot something important, namely where to 
get FileManip from.


http://hackage.haskell.org/cgi-bin/hackage-scripts/package/FileManip-0.1

Online docs:

http://darcs.serpentine.com/filemanip/dist/doc/html/FileManip/

Darcs repo:

darcs get http://darcs.serpentine.com/filemanip
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANN: FileManip 0.1, an expressive file manipulationlibrary

2007-05-02 Thread Bryan O'Sullivan

Claus Reinke wrote:


i have no intention to participate
in yet-another-licencing-discussion, i would just like to ask whether 
those limitations of your offering are an accident or intended?


I didn't use the LGPL by accident.  However, I might be amenable to 
persuasion, perhaps more so if you climb down from that thing that looks 
awfully like a high horse from here.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANN: FileManip 0.1, an expressive filemanipulationlibrary

2007-05-02 Thread Bryan O'Sullivan

Claus Reinke wrote:

if i wanted to use that library 
for
anything i want to distribute, my only chance to avoid the source 
re-distribution

and advertising clauses would be dynamic linking


I believe that the magical protective properties of dynamic linking 
amount to no more than folklore.  So you might not want to bet your 
proprietary farm on that distinction without first seeking legal advice.


the unix dependency - it could be inherent in 
the

design, or an accident of the author's current platform.


Unfortunately, the standard libraries do not provide portable ways to 
check file status.  Much of what's currently in the unix library would 
in fact compile and work fine on Windows, and could usefully be moved 
from unix to a more portable posix library.


Regarding your soapbox, the FileManip library uses Neil Mitchell's new 
filepath library for precisely the purpose of portable file name handling.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Haskell version of Norvig's Python Spelling Corrector

2007-04-24 Thread Bryan O'Sullivan

Udo Stenzel wrote:


There is another bug of this sort in your code.  Consider


incWordCount w m = M.insertWith (+) w 1 m


There is no reason to evaluate the sum inside the map, instead an
unevaluated thunk is put in there.


Would not Data.Map.insertWith' do the trick?

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] faster factorial function via FFI?

2007-04-24 Thread Bryan O'Sullivan

Dan Weston wrote:

A thing of beauty is a joy forever. Simple, fast, elegant.



factorial :: Integer - Integer
factorial = product . zipWith (^) . factorisedFactorial


Well... The zipWith (^) should be map (uncurry (^)).

And the performance of this approach is strongly dependent on the 
efficiency of your prime sieve, so you're moving the complexity around, 
not eliminating it.


The binary splitting method doesn't need a source of primes, and 
performs half decently on numbers such as fact 1e6 (5.5 million digits 
computed in about 5 seconds).


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] faster factorial function via FFI?

2007-04-24 Thread Bryan O'Sullivan

[EMAIL PROTECTED] wrote:


Yes and no.  Standard algorithms for computing and manipulating
combinatorial-sized Integers strongly depend on the properties of
your Integer implementation.

Manipulating lists of prime factors can also be more efficient,
because most of the numbers you deal with are machine-word-sized.


Yep.  By the way, if approximations are good enough, the OP could use 
Gosper's formula:


gosper :: Integral a = a - a

gosper n | n  143 = let n' = fromIntegral n
 g = sqrt ((n' * 2 + 1/3) * pi)
   * n'**n' * exp (-n')
 in round g

The accuracy of this approximation increases with n, until you hit the 
ceiling of whatever your Double implementation can manage (142, typically).


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Announce: DisTract: Distributed Bug Tracker implemented in Haskell

2007-04-23 Thread Bryan O'Sullivan
Nice.  You might find Bugs Everywhere 
http://www.panoramicfeedback.com/opensource/ interesting for comparison.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] faster factorial function via FFI?

2007-04-23 Thread Bryan O'Sullivan

Dan Drake wrote:


This is combinatorics, so I can't just say oh, this is small and cross
it off like physicists do. :)


Binary splitting is much faster than the naive approach, but still easy 
to understand.  That's fac1 in the attached file.


I ran out of time to write an efficient implementation using swing 
numbers, but my slow, crummy version is present as fac2, just as a data 
point.


b
{-# OPTIONS_GHC -fbang-patterns #-}

import Data.Bits (Bits, (..))
import System.Environment (getArgs)

fac0 :: Integral a = a - a

fac0 n = product [1..n]

fac1 :: Integral a = a - a

fac1 n = prod n 0
where prod a b = let d = a - b
 in if d  0
then 0
else case d of
   0 - 1
   1 - a
   2 - a * (a - 1)
   3 - a * (a - 1) * (a - 2)
   _ - let m = (a + b) `div` 2
in prod a m * prod m b


fac2 :: (Integral a, Bits a) = a - a

fac2 0 = 1
fac2 n = let f2 = fac2 (n `div` 2)
 in f2 * f2 * swing n
where swing n = let b = case n `mod` 4 of
  0 - 1
  1 - n `div` 2 + 1
  2 - 2
  3 - 2 * (n `div` 2 + 2)
z = 2 * (n - ((n + 1) .. 1))
in loop b z 1
  where loop !b !z !i
| i == n `div` 4 + 1 = b
| otherwise = let b' = (b * z) `div` i
  z' = z - 4
  in loop b' z' (i + 1)

fac :: (Integral a, Bits a) = a - a

fac n | n  500   = fac1 n
  | otherwise = fac2 n


main :: IO ()

main = do
  (f:ns) - getArgs
  let func = case f of
   0 - fac0
   1 - fac1
   2 - fac2
   _ - error Huh?
  print (map (odd . func . (read :: String - Integer)) ns)
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Haskell version of Norvig's Python Spelling Corrector

2007-04-22 Thread Bryan O'Sullivan

Ketil Malde wrote:


Hm - nobody suggested using ByteStrings yet?


I wrote an independent port of Norvig's spellchecker, because I figured 
it would make the basis of an interesting tutorial.  For such a small 
program, it's been quite a challenge!


I started out using lazy ByteStrings, Data.Map, and Data.Set.  As Albert 
observed, using Data.Set is poison for heap size and performance.  The 
result of switching from sets to lists was a  90% reduction in memory 
usage, and a big (but unmeasured) speedup.


After this switch, I found that spellchecking one word still took 4x as 
long in Haskell as Norvig's Python program.  Since I was checking only 
one word in each case, essentially all of the runtime was taken up by 
building the word frequency map.


In my profile results, I find that simply converting words to lower case 
accounts for a whopping 40% of time and allocation (see the attachment 
for my definition of the train function).


COST CENTREMODULE   %time %alloc

lower  Spell 40.5   41.2
train  Spell 26.3   14.3
mkWordsSpell 21.9   24.1

I was interested in building a profile-enabled version of fps to see 
what might be going on inside the library, but was stymied by not 
knowing how to get GHC 6.6's base to coexist peacefully with fps (hiding 
base isn't very practical).


My experiments are available here:

darcs get http://darcs.serpentine.com/spell

Norvig's training data is available from http://norvig.com/big.txt

b
import qualified Data.Map as M
import qualified Data.ByteString.Lazy.Char8 as B
import qualified Data.ByteString.Lazy as X

type Model = M.Map B.ByteString Int

train :: B.ByteString - Model

train = foldl' updateMap M.empty . map lower . mkWords
where updateMap model word = M.insertWith' (+) word 1 model
  mkWords = filter (not . B.null) . X.splitWith isNotAlpha
  lower !s = X.map toLower s
  isNotAlpha !c = c  0x41 || (c  0x5a  c  0x61) || c  0x7a
  toLower !c | c = 0x41  c = 0x5a = c + 0x20
 | otherwise = c
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Haskell version of Norvig's Python Spelling Corrector

2007-04-22 Thread Bryan O'Sullivan

Pete Kazmier wrote:


Bryan, out of curiosity, is a non bytestring version of your code
faster?


No, but the difference is smaller than I expected: the lazy ByteString 
version is about 1.8x faster than using String.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Haskell version of Norvig's Python Spelling Corrector

2007-04-22 Thread Bryan O'Sullivan

Bryan O'Sullivan wrote:

In my profile results, I find that simply converting words to lower case 
accounts for a whopping 40% of time and allocation (see the attachment 
for my definition of the train function).


COST CENTREMODULE  %time %alloc

lower  Spell40.5   41.2
train  Spell26.3   14.3
mkWordsSpell21.9   24.1


A little more instrumentation says this (using the darcs head of fps 
built with -auto-all):


loopU  NewData.ByteString.Fusion  25.4   28.8
splitWith  NewData.ByteString 15.4   17.2
train  Spell  10.26.1
isNotAlpha Spell   9.4   12.2
compareBytes   NewData.ByteString  8.89.6
compareBytes   NewData.ByteString.Lazy 7.40.4
inlinePerformIONewData.ByteString.Base 6.60.0

(At Stefan's suggestion, I renamed the modules in fps to NewData.*, in 
order to get around the name clashes when I try to concurrently use fps 
and base.)


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: How Albus Dumbledore would sell Haskell

2007-04-20 Thread Bryan O'Sullivan

Derek Elkins wrote:

Game search is exactly an example use in Why Functional Programming 
Matters (http://www.math.chalmers.se/~rjmh/Papers/whyfp.html).  That 
paper, 23 years later, is still pretty compelling.  Perhaps, it should 
just be modernized and somewhat expanded.


I'll echo Lennart's response to the theorem proving suggestion :-)

Tom Moertel's parallel port scanner is much more the kind of thing that 
will get people's attention.  It's practical; it's parallel; and it's 
short.  And for Simon's convenience, it's already been written, so he 
can concentrate on presenting it, rather than writing it.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Tutorial on Haskell

2007-04-16 Thread Bryan O'Sullivan

Mark T.B. Carroll wrote:


I'm afraid no
examples come easily to mind, though.


Here's a simple one: reading a flattened graph from disk.  If your 
flattened representation contains forward references, you have to fix 
them up in a strict language.  In a lazy language, you can refer to 
elements you haven't yet read, eliminating that book-keeping.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Tutorial on Haskell

2007-04-16 Thread Bryan O'Sullivan

Neil Bartlett wrote:


E.g. perhaps some kind of instant messaging server? Or Twitter except
scalable.


A twitter-alike will quite probably get people's attention.  And of 
course anything that breaks the it's good for compilers! stereotype is 
to be commended :-)


Also on the subject of scaling, Ralf Lammel's paper on looking at 
MapReduce through a strongly typed functional lens has been quite a hit. 
A tutorial along the lines of dealing safely with lots of data, in a 
cluster of systems, would likely go down well.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] newbie concatenating monad question

2007-03-24 Thread Bryan O'Sullivan

Leandro Penz wrote:


My idea is to have a monad with a concatenating , so that I can:

bulidStuff = do
  func1
  func2
  func3
  func4


You could do this, but it's easier to take advantage of the fact that [] 
is an instance of MonadPlus, and just use `mplus`.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Partial Evaluation

2007-03-21 Thread Bryan O'Sullivan

jim burton wrote:
I am reading Hudak's paper Modular Domain Specific Languages and Tools 
[1] and am confused by his use of the term `Partial Evaluation'. I 
understand it to mean supplying some but not all arguments to a 
function, e.g. (+3) but it seems to mean something else too.


That's partial application you're thinking of.  In the context of inline 
operators, this is referred to as a section.


Partial evaluation actually executes some of a partially applied 
function, generating a new function that is specialised to the given 
arguments.


Here's a silly, contrived example to illustrate the difference. 
Consider this function:


sumPlus :: Num a = [a] - a - a
sumPlus xs y = sum xs + y

Here's a partially applied version of the function:

sumPlus ([3,2] :: [Int])

Its type is

Int - Int

If you partially evaluate this function, you're evaluating as much of 
the function as possible at compile time (usually in a separate 
partial evaluation tool that generates a whole new source file for the 
real compiler), and getting a new specialised function:


sumPlus32 :: Int - Int
sumPlus32 y = 5 + y

You could expect a decent compiler to apply this kind of transformation 
under limited circumstances, but partial evaluation is much more 
ambitious.  The canonical example is partially evaluating a language 
interpreter by giving it a chunk of source as the input to specialise 
on.  This produces a new interpreter that is specialised to interpret 
exactly that source (aka the first Futamura projection), and which you 
might hope would do so more efficiently than the fully general interpreter.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Lazy IO and closing of file handles

2007-03-19 Thread Bryan O'Sullivan

Pete Kazmier wrote:


I understand the intent of this code, but I am having a hard time
understanding the implementation, specifically the combination of
'fix', 'flip', and 'interate'.  I looked up 'fix' and I'm unsure how
one can call 'flip' on a function that takes one argument.


If you look at the code, that's not really what's happening.  See the 
embedded anonymous function below?


  flip fix accum $
 \iterate accum - do
   ...

It's a function of two arguments.  All flip is doing is switching the 
order of the arguments to fix, in this case for readability.  If you 
were to get rid of the flip, you'd need to remove the accum after 
fix and move it after the lambda expression, which would make the 
expression much uglier to write and read.  So all the flip is doing 
here is tidying up the code.


(If you're still confused, look at the difference between forM and mapM. 
 The only reason forM exists is readability when you have - in terms of 
the amount of screen space they consume - a big function and a small 
piece of data, just as here.)


As to why it's okay to call flip on fix at all, look at the types 
involved.


fix :: (a - a) - a
flip :: (a - b - c) - b - a - c

By substitution:

flip fix :: a - ((a - b) - a - b) - b

In the case above, accum has type a, and the lambda has type
(a - IO a) - a - IO a, and these fit nicely into the type expected by 
flip fix.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Haskell and SSL

2007-03-17 Thread Bryan O'Sullivan

Thomas David Baker wrote:

I know that darcs uses curl in a similar way for some stuff but 
it still feels like I'm doing the Wrong Thing.


No, you're not.  SSL is very complicated, which is why there are no 
Haskell libraries that implement or usefully wrap it.


There's a proposal to write a decent curl wrapper for SOC, which would 
provide some level of SSL support.  We'll see where it goes.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Defining types (newbie)

2007-03-14 Thread Bryan O'Sullivan

John Fouhy wrote:


But if I want to combine tcEqOne and tcGtThree I run into type
problems, because one of them uses Strings and the other Integers.


Yep.  The observation you can make here is that TC is really a fancy way 
to write a function that takes some parameters and returns a Bool.  So 
And and Or don't really care what types their parameters have, so long 
as they return Bools when evaluated.


If you could get rid of the type variable on the left, you'd be set, 
because then And and Or would each take ThingCompares of any type.  And 
you can!  See section 7.3.4 of the Hugs manual for existential types: 
http://cvs.haskell.org/Hugs/pages/hugsman/exts.html#sect7.3.4


This will let you move your type variable a into the TC branch alone as 
forall a. TC ..., so it will not, er, infect the rest of the branches 
of the type.  Then And and Or will neither care, nor be able to find 
out, about what TC really has inside, which is as you might wish.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Defining types (newbie)

2007-03-13 Thread Bryan O'Sullivan

John Fouhy wrote:


In Haskell, I envisage writing something like:

data ThingCompare = TC Op Field


This wants to be a bit more concrete:

data ThingCompare = TC (a - a - Bool) (Thing - a)

so that you could then have something to execute your comparison thingy:

runTC :: ThingCompare - Thing - Thing - Bool
runTC (TC compare extract) a b = compare (extract a) (extract b)

and construct a value of it:

tcEqFirst = TC (==) first_field

so to compare two Things, you'd do something like this:

runTC tcEqFirst a b

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Performance Help

2007-03-11 Thread Bryan O'Sullivan

Dominic Steinitz wrote:


Any help would be appreciated.


I notice that there's not much user-accessible documentation of what you 
can expect GHC (or some other Haskell implementation) to do and not do 
with a given piece of code.  For example, you have a lot of little 
definitions that clearly traverse the same lists many times.  I've no 
idea where I would look, except for the compiler source, to get a sense 
for when, if ever, the compiler might apply CSE, fusion, or any other 
techniques that come to mind.  So transmitting folk wisdom on what the 
compiler might do with any given piece of code counts as another half 
chapter in the Practical Haskell book that ought to get written :-)


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] System.FilePath Re[2]: ANN: HSH 1.2.0

2007-03-10 Thread Bryan O'Sullivan

Bulat Ziganshin wrote:

-- i've added crossposts to John Meacham and Einar Karttunen because
-- you also denoted interest in new i/o library


Me, too :-)


just a couple of
ideas:



- portable async i/o which is able to work via select/epoll/...


I think you mean non-blocking I/O here, right?  Async is not the same thing.


- interfacing with Streams and FPS libraries


It should also integrate cleanly with the network stack, which needs an 
overhaul about as badly as the I/O library.  Einar Karttunen maintains 
his network-alt library at


http://www.cs.helsinki.fi/u/ekarttun/network-alt/

which I think is a good starting point.

b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


<    1   2   3   4   5   >