Re: [Haskell-cafe] symbol type?

2007-10-11 Thread Thomas Conway
On 10/10/07, Michael Vanier <[EMAIL PROTECTED]> wrote:
> Is there an implementation of a symbol type in Haskell i.e. a string which 
> has a constant-time
> comparison operation?

To borrow Prolog terminology, it sounds like you're looking for an
"atom" data type.

I've not done it, but I've plotted to implement a module according to
the following sketch:

module Data.Atom where

data Atom 

atom :: String -> Atom -- or ByteString

name :: Atom -> String -- or ByteString

instance Eq Atom where ...

instance Ord Atom where ...

The constructor function would do hash-consing using unsafePerformIO
internally to build a [hash] table of extant Atoms. If ByteString is
used for the internal "name", the hash consing means that you can use
the Ptr for O(1) equality tests. The implementation of compare would
still need to do a normal string comparison, after doing an initial
equality test.

If you do the O(1) equality test before doing a full compare, the
performance will be very good in many situations, since non-equal
comparisons tend to terminate quickly. The exception of course is
strings with long common prefixes (e.g. URLs). For symbol names in a
compiler, this is unlikely to be a significant problem.
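
To make the sketch concrete, here is one minimal way it might look. This
is a sketch only: it uses a Map held in an IORef rather than a real hash
table, and the unique-id field is an illustrative choice for getting the
O(1) equality test, not part of the original plan.

module Data.Atom (Atom, atom, name) where

import qualified Data.Map as Map
import Data.IORef
import System.IO.Unsafe (unsafePerformIO)

data Atom = Atom !Int String   -- unique id plus the interned name

instance Eq Atom where
    Atom i _ == Atom j _ = i == j            -- O(1) equality

instance Ord Atom where
    compare (Atom i s) (Atom j t)
        | i == j    = EQ                     -- O(1) shortcut first
        | otherwise = compare s t            -- then full string compare

{-# NOINLINE table #-}
table :: IORef (Map.Map String Atom)
table = unsafePerformIO (newIORef Map.empty)

-- Hash-consing: intern the string, reusing any existing Atom.
-- Ids are never reused because entries are never removed.
atom :: String -> Atom
atom s = unsafePerformIO $ atomicModifyIORef table $ \m ->
    case Map.lookup s m of
        Just a  -> (m, a)
        Nothing -> let a = Atom (Map.size m) s
                   in (Map.insert s a m, a)

name :: Atom -> String
name (Atom _ s) = s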

cheers,
T.
-- 
Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] New slogan for haskell.org

2007-10-08 Thread Thomas Conway
On 10/8/07, Alistair Bayley <[EMAIL PROTECTED]> wrote:
> For me, a large part of Haskell's attraction are the features which
> reflect good engineering practice: strong, static type checking;
> purely functional code; good FFI. It should be easier to write simple,
> reliable software in Haskell than in most other languages; IMO,
> getting the unwashed hordes to use Haskell would be a great
> improvement in software industry productivity.

I recently put it thus to a friend: When it's hard writing a program
in Haskell, it is usually because Haskell makes you think about what
you are doing, and as a consequence you end up doing your thinking
sooner rather than later. This usually results in better software with
fewer bugs.

cheers,
T.
-- 
Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] New slogan for haskell.org

2007-10-08 Thread Thomas Conway
I just had a conversation today that seems relevant to this thread. I
was chatting with a friend who is working in the academic sector, and
I was observing that Melbourne Uni (my old school), is switching in
the new year from teaching Haskell as a first language, to teaching
Python. I was dismayed, but not surprised.

Anyway, I was talking about this with my friend, who said that he
understood the main reason for the change was that students were not
being "switched on" or excited by learning Haskell the way they used to
be by learning C. He put it down to the fact that in C, you are more
obviously "making the computer do stuff", and that Haskell is
sufficiently high level and abstract that beginner programmers don't
get that thrill of feeling like they're making the computer work for
them.

I must say, I get that! But at the same time, of course, the high-level
abstraction is exactly what *we* love about Haskell.

cheers,
T.
-- 
Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: bizarre memory usage with data.binary

2007-10-04 Thread Thomas Conway
On 10/4/07, Jules Bean <[EMAIL PROTECTED]> wrote:
> ...and indeed it can't be done, except by the naive brute-force method
> of comparing every subtree, possibly optimised by cryptographically
> hashing a representation of every subtree, since sharing isn't an
> observable property.

At least one Prolog implementation (I forget which, I'm sorry), had a
[de]serialisation library which used a hash-consing approach.
Basically, it did its serialization using a post-order traversal and
emitted references to previous values when the same value had already
been emitted. Not rocket science. Actually, I've heard a Prolog guy -
Bart Demoen - talk about doing pretty much this during GC to improve
sharing.
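
As a rough sketch of that approach (all names here are illustrative, and
a Map of already-seen subtrees stands in for real hash-consing):

import qualified Data.Map as Map
import Control.Monad.State

data Tree = Leaf Int | Node Tree Tree deriving (Eq, Ord, Show)

data Token = TLeaf Int | TNode | TRef Int deriving Show

-- Post-order emission: children first, then the node. A subtree that
-- has been emitted before is replaced by a reference to its first
-- occurrence (subtrees are numbered in order of first emission).
serialise :: Tree -> [Token]
serialise t0 = reverse (snd (execState (go t0) (Map.empty, [])))
  where
    go t = do
        (seen, _) <- get
        case Map.lookup t seen of
            Just i  -> emit (TRef i)
            Nothing -> do
                case t of
                    Leaf n   -> emit (TLeaf n)
                    Node l r -> go l >> go r >> emit TNode
                modify (\(s, o) -> (Map.insert t (Map.size s) s, o))

    emit tok = modify (\(s, o) -> (s, tok : o))

-- e.g. serialise (Node (Leaf 1) (Leaf 1))
--        == [TLeaf 1, TRef 0, TNode]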

cheers,
T.
 --
Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] 'data' syntax - a suggestion

2007-09-27 Thread Thomas Conway
On 9/28/07, David Menendez <[EMAIL PROTECTED]> wrote:
> I'm not sure there is a clash.
>
> data B k v where ...
>
> is easily distinguished from
>
> data B k v = ... where ...

Indeed.

Although Richard's proposal was simpler, I reckon it's worth
discussing whether the where clause should allow normal
type/data/newtype declarations, effectively introducing a new scope.
There are obviously some type variable quantification and name
resolution issues that should yield several conference papers.

Here are a couple of examples:


data Tree key val
    = Leaf key val
    | Node BST key val BST
    where
    type BST = Tree key val


data RelaxedTree key val
    = Leaf Bal [(key,val)]
    | Node Bal [(key,RelaxedTree key val)]
    where
    data Bal = Balanced | Unbalanced

-- 
Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] 'data' syntax - a suggestion

2007-09-27 Thread Thomas Conway
On 9/27/07, ok <[EMAIL PROTECTED]> wrote:
> I have often found myself wishing for a small extension to the syntax of
> Haskell 'data' declarations.  It goes like this:
['where' clause to allow locally defined names in type declarations]

Nice.

Quite a few times I've found myself declaring type synonyms for this
reason, but you end up polluting the global namespace.

+1 vote.

-- 
Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] C's fmod in Haskell

2007-09-24 Thread Thomas Conway
On 9/24/07, Henning Thielemann <[EMAIL PROTECTED]> wrote:
> Unfortunately, properFraction doesn't work well with negative numbers.

This old chestnut! It's a common problem in practice. As I recall,
C's % operator allows implementations to yield a result of either
sign. I just checked ISO 9899:1999, which defines fmod. It
specifies that the result of fmod(x,y) should have the same sign as x.
Interesting.

cheers,
T.
-- 
Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] C's fmod in Haskell

2007-09-23 Thread Thomas Conway
On 9/17/07, Peter Verswyvelen <[EMAIL PROTECTED]> wrote:
>
>  Maybe this is a stupid question, but I don't find something similar in
> Haskell.
>
>  I find mod and rem, which work on integers. But I'm looking for a function
> similar to C's fmod.
>
>  Of course I can write it myself, but I guess it must already exist under a
> different name?

This one worried me for a while, but help is at hand in the standard library.

In Prelude there is a function properFraction which splits a RealFrac
into its integer part and its fractional part. You can use this to
implement fmod. Assuming properFraction is efficient (which it
probably isn't), you can implement fmod really quite efficiently.

In particular, x `fmod` 1.0 == (snd . properFraction) x
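
For reference, a minimal sketch along those lines. Note that truncate
rounds toward zero, so the result takes the sign of x, matching C99's
fmod:

fmod :: RealFrac a => a -> a -> a
fmod x y = x - fromIntegral n * y
    where
    n = truncate (x / y) :: Integer

-- fmod 5.5 2.0 == 1.5;  fmod (-5.5) 2.0 == -1.5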

cheers,
T.
-- 
Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Monad.Reader 8: Haskell, the new C++

2007-09-13 Thread Thomas Conway
On 9/14/07, Thomas Schilling <[EMAIL PROTECTED]> wrote:
> The type system doesn't help you avoid writing non-terminating programs,
> so i see no problem with it being possible giving programmers the power
> to express and check more complex properties of their programs -- as
> long as type-checking remains sound.  From a practical standpoint,
> non-terminating type checks are just as much a bug as non-terminating
> library functions.  Type systems need more thought anyways, so why not
> make sure it's terminating, too?  The other extreme is to use dependent
> types everywhere, but this has a bit more drastic consequences to the
> accessibility and practicality of the language.

While I love all the exceedingly cool type hackery, I also like the
compiler to terminate.

Some people in this forum may be old enough to remember Turbo Prolog.
It did mode inference (i.e. data-flow analysis) on programs, but
unfortunately it didn't always terminate. So what you got was a hung
compiler, leaving you to guess what it was about your [quite possibly
correct] program that caused the analysis to loop.

With C++ templates, the problem is addressed by having a limit to the
depth of the "call stack" for template evaluation. I recall with Forte
5, there was no flag to let you increase the depth, so at one point
we had to do something like

if (0) {
    // Horrible nasty expression to force the evaluation of some of
    // the lower parts of the template stack
}

This works because (at least in Forte 5, and probably most
implementations) template instantiations are hash-consed.

I would *much* rather have a simpler type system than a compiler
which might not terminate.

cheers,
T.
-- 
Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Learn Prolog...

2007-09-04 Thread Thomas Conway
On 9/5/07, Stefan O'Rear <[EMAIL PROTECTED]> wrote:
> I've always wondered why Prolog uses DFS, instead of some complete
> method like DFID or Eppstein's hybrid BFS...  having to worry about
> clause order seems so out of place.

Well, a couple of reasons are pretty well agreed in the Prolog community:

1. Order of side-effects.

2. Efficiency of implementation.

and arguably

3. Hysterical Raisins.

-- 
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Learn Prolog...

2007-09-04 Thread Thomas Conway
On 9/5/07, Derek Elkins <[EMAIL PROTECTED]> wrote:
> That's because Prolog is -ugly-.  The only reason I recommend it is
> because it's archetypical and there aren't any other logic languages
> with anywhere near the mindshare/significance.  For a thing of sheer
> beauty, see, e.g. LolliMon.

Oh, look, I quite like Prolog in some respects. Especially Nu-Prolog
which has safe negation.

I also highly recommend learning it. It will enrich the way you think
about problems, especially if you get to know it well enough to
understand how logic variables are implemented, etc.

T.
ps I feel obliged to put in a good word for Mercury which I worked on,
along with a few other denizens in this forum. See
www.mercury.cs.mu.oz.au.
-- 
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Learn Prolog...

2007-09-04 Thread Thomas Conway
On 9/2/07, Andrew Coppin <[EMAIL PROTECTED]> wrote:
>
> > One of standard exercices in Prolog is the construction of the
> > meta-interpreter of Prolog in Prolog. While this is cheating, I recommend
> > it to you. It opens eyes.
>
> Ever tried implementing Haskell in Haskell? ;-)

In many respects, Haskell is a much higher-level language than Prolog.
Before you all gasp, consider the following argument.

In Prolog, you need to pay close attention to the exact order in which
things are executed. In predicate logic you may write (using
haskellish term notation)

a(xs,ys,zs) <= (xs = [] /\ ys = zs) \/ (xs = (x:ws) /\ zs = x:vs /\
a(ws, ys, vs))

but to interpret this as a *program* you have to consider how it will
be executed. In particular, using SLD resolution, conjunction (/\, or
',' in Prolog notation) is not commutative as it is in predicate
logic.

This is good in one respect - it is easier to write efficient Prolog
than efficient Haskell - but at the same time, compilers can't
rewrite programs (i.e. optimize) nearly so effectively. And that's not
even taking into account that Prolog is impure, so the compiler has to
watch out for side effects.

Also, still considering "append": as a high-level specification of the
relationship between lists and their concatenation, it is ill-typed:

append([], Ys, Ys).
append([X|Xs], Ys, [X|Zs]) :- append(Xs, Ys, Zs).

(Ye Gods! My Prolog *has* rusted quickly!)

From the clauses, it is clear that the first argument must satisfy

list([]).
list([X|Xs]) :- list(Xs).

but the same is not true of the second and third arguments.

?- append([1,2,3], 4, Zs).
Zs = [1|[2|[3|4]]]
?-
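
For contrast, a sketch of the Haskell counterpart: the list type on
both arguments means the odd call above is simply rejected at compile
time.

append :: [a] -> [a] -> [a]
append [] ys = ys
append (x:xs) ys = x : append xs ys

-- append [1,2,3] 4   -- rejected by the type checker: 4 is not a list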

Lee Naish has written in detail on this subject.

Another argument in favour of Haskell being high-level is John Hughes'
"glue" argument. If you don't know what I mean, go and read "Why
Functional Programming Matters".

Hey, that was fun. I have barely written *any* Prolog since I finished
my thesis. :-)

cheers,
T.
-- 
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] STM, IO and b-trees

2007-08-20 Thread Thomas Conway
On 8/21/07, Ben <[EMAIL PROTECTED]> wrote:
> some basic questions: you have limits on key and value size?  fixed
> page sizes per tree?

I have chosen Yes and Yes respectively, though these are *choices*.
Personally, I think fixed sized pages are fine. Handling unlimited
sized keys is not hard, especially if you think they will be rare: you
use a bit to flag an immediate key vs a remote key, and keep an
overflow file for long keys.

Another idea that I've had, though I have not had the opportunity to
implement it is to use arithmetic coding on the keys. If you carve up
the zero-to-one interval in sorted order, then the arithmetic coded
bitstrings sort (lexicographically) in the same order as the
uncompressed strings. From the top of my head, most *english* words
can compress in this fashion with a 0 order model to about 2-3 bits
per character. This doesn't eliminate the key length limit by itself,
but it should increase the limit dramatically in practice.

My implementation uses 64K pages, with a limit of 32K - 8 bytes on the
size of an individual key.

> on point 2: it sounds like you'r suggesting using STM after all.  not
> sure if i understand what you've said:

something like

type PagePtr t = TVar (Address, Maybe t)

data Node = Node (PagePtr Node) [(Key,PagePtr Node)] | Leaf [(Key,Data)]

> work with a page cache.  do the various b-tree algorithms in STM
> transactions on the cached pages.  retry STM transactions when pages
> aren't loaded. on successful STM transaction completion, write out the
> dirty pages.

Yes, except you might want to be clever about flushing dirty pages more lazily.

My implementation isn't crash-proof (i.e. doesn't support crash
recovery - the environment in which my code operated means that if
something bad happens, I can rebuild from external data).

>  probably use the trick where the STM transaction returns
> an IO action which you then perform.  probably use ordinary page-level
> locks to deal with concurrency on IO -- STM doesn't help.

Maybe. See the SPJ video on STM. His basic point is that STM helps get
rid of the stupid concurrency bugs, leaving just the "more
interesting" ones for you to deal with. :-)

Actually, using relaxed balance and making all the tree updates local
means that my btree code successfully supports a very high degree of
concurrency.

> as far as implementing the page cache: using a TVar Map would
> essentially be creating a global lock on the entire cache.

Exactly, which is why you want to push the TVars down.

>  so you
> suggest using other implementations.  your suggestion sounds like an
> array with linear search, or probably something more sophisticated.  i
> would imagine a balanced binary tree where the nodes are TVars might
> work nice, though rebalacing would touch a lot of nodes.  (it does
> seem strange haskell doesn't have more concurrent data structures.)

Yes, I've chatted with Andrew Bromage about the need for

Data.Map.Concurrent
Data.Set.Concurrent
etc.

I have a concurrent hash table which works very nicely. Think

class Hashable t where
    hash :: t -> Word64

type HashTable k v = TArray Word64 [(k,v)]
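
As an illustration, insertion on that representation might look like
this (a sketch only: insertHT is a made-up name, and it assumes buckets
are indexed by the hash modulo the array size):

import Control.Concurrent.STM
import Control.Concurrent.STM.TArray
import Data.Array.MArray
import Data.Word (Word64)

insertHT :: Hashable k => HashTable k v -> k -> v -> STM ()
insertHT ht k v = do
    (lo, hi) <- getBounds ht
    let i = lo + hash k `mod` (hi - lo + 1)   -- bucket index
    bucket <- readArray ht i
    writeArray ht i ((k,v) : bucket)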

Another alternative that others have suggested is tries (radix
trees). Something along these lines:

type Trie v = 

insert :: [Bit] -> v -> Trie v -> STM ()

> on point 1: you're saying relaxed balance is better for STM-style
> optimistic transactions?  i'm using
>
> B-Trees with Relaxed Balance (1993)
> Kim S. Larsen, Rolf Fagerberg
> IPPS: 9th International Parallel Processing Symposium

Yes. There's a tech report version which includes details of deletion,
which IIRC the paper you mention does not. Try citeseer or google.

The reason I believe relaxed balance works particularly well with STM
is that all the operations (insert, delete, rebalance) operate
*locally*. That is, they only modify a single node or a couple of
proximate nodes. In STM terms, this means a given operation only
*writes* a couple of TVars close to where the operation took place.
One of the cool things is that Larsen et al prove that as you progress
up the tree the number of rebalancing operations drops geometrically.

Effectively, this means you very rarely need to get a write-lock on
the root (or top few nodes) of the tree, so randomly dispersed
operations are unlikely to conflict with one another.

Also, by separating the insert/delete bits from the rebalance bits,
both are simpler to code, and you have fewer edge cases to think
about.

cheers,
T
-- 
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] STM, IO and b-trees

2007-08-20 Thread Thomas Conway
On 8/21/07, Ben <[EMAIL PROTECTED]> wrote:
> for sake of argument, suppose an enterprising haskell newbie wanted to
> code up concurrent b-trees (really b-link trees) in haskell.  if i am
> understanding STM correctly, it will NOT help in any way with the
> implementation, because of the IO-intensive nature of the algorithms?
> so i will have to resort to the usual games with locks and latches?

I have produced exactly such an implementation in my day-job (so I
can't, at this stage, give you the code, I'm afraid), but I'll happily
give you some tips:

1. Investigate relaxed balance.

BTrees with relaxed balance enable you to break up operations into
much smaller transactions, which will reduce the amount of rerunning
on transactions (big transactions are more likely to contain
conflicts).

Also, getting all the edge cases right is hard with strict balance.
Especially in the presence of deletions. It is VASTLY simpler with
relaxed balance, though there are a few little tricks. If it was too
easy, it wouldn't be any fun (see 3, below). Hint: Although the
on-disk version doesn't need or want parent pointers, you might want
them for your in-memory version of pages.

2. Separate the IO from the BTree-stuff.

Conceptually keep a TVar (Map Address ByteString). In the
transaction, use this to find pages. If the page is not there, throw
an exception containing the desired address. In a wrapper, catch the
exception, read the page, add it to the map as a separate transaction
then retry the original transaction. I say "conceptually" because
something like TArray Address (Maybe ByteString), or
similar will yield much better concurrency. In general, you want to
push the TVars down as far as possible.
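
To illustrate, a hedged sketch of that wrapper pattern in today's stm
API (the post predates it); PageMiss, getPage and runPaged are
illustrative names, not from the original code:

import Control.Concurrent.STM
import Control.Exception
import qualified Data.Map as Map
import Data.ByteString (ByteString)
import Data.Word (Word64)

type Address = Word64

newtype PageMiss = PageMiss Address deriving Show
instance Exception PageMiss

-- In-transaction lookup: return the page, or abort the transaction
-- with the address we are missing.
getPage :: TVar (Map.Map Address ByteString) -> Address -> STM ByteString
getPage cache addr = do
    m <- readTVar cache
    case Map.lookup addr m of
        Just page -> return page
        Nothing   -> throwSTM (PageMiss addr)

-- The wrapper: run the transaction; on a miss, fault the page in as a
-- separate transaction, then retry the original transaction.
runPaged :: TVar (Map.Map Address ByteString)
         -> (Address -> IO ByteString)      -- loads a page from disk
         -> STM a -> IO a
runPaged cache load act = loop
    where
    loop = atomically act `catch` \(PageMiss addr) -> do
        page <- load addr
        atomically (modifyTVar' cache (Map.insert addr page))
        loop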

3. Have Fun

STM is very cool, so make sure you enjoy making it all hang together. :-)
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Haskell vs GC'd imperative languages, threading, parallelizeability (is that a word? :-D )

2007-08-10 Thread Thomas Conway
On Friday 10 August 2007 03:51:49 Hugh Perkins wrote:
> Getting back to the original problem, which is: threading.  Donald, one of
> the things that is very interesting about Haskell is it's potential for
> automatic threading, ie you write a trivial algorithm that looks like it
> runs in a single thread, and the runtime splits it across multiple cores
> automatically.

In the interests of clarity, it is useful and important to distinguish
between threading (i.e. concurrency) and parallelism. The former
relates to the programmer's view of the world, and the latter relates
to the computer's model of execution. As nomenclature, it is not
universal, but is very common, and helps avoid conflating two quite
different things.

cheers,
Tom
-- 
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Dynamic thread management?

2007-08-10 Thread Thomas Conway
On 8/11/07, Hugh Perkins <[EMAIL PROTECTED]> wrote:
> > - parallelism must be quite coarse to offset overheads
> >(which I think is the problem with expecting things like map and fold
> > to parallelised automagically; they're just too small grained for it to
> > be worthwhile)
>
> Someone else said that.  I dont understand  what you mean.

There are many papers about this in the Parallel Logic Programming
area. It is commonly called "Embarrassing Parallelism". Creating a
thread, or even just scheduling a chunk of work for evaluation has
packaging-up costs, synchronization costs, and so on. It is all too
easy for these costs to outweigh the work to be done, so by
parallelizing your code, you make it run slower.

So, if you want to parallelize "map f xs", unless f is *really*
expensive, you'll only get a benefit if you can break xs into chunks
of e.g. 10^3 elements or more, and more like 10^5 or more for more
usual 'f's. Some tasks, like Monte Carlo integration, are very amenable
to such, but most tasks are not.
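
For illustration, this is roughly how the chunking looks with the
modern Control.Parallel.Strategies API (which postdates this message);
the chunk size n is the tuning knob discussed above:

import Control.Parallel.Strategies

-- Evaluate the list in parallel, one spark per chunk of n elements;
-- n must be large enough that each spark's work outweighs the
-- scheduling overhead.
chunkedParMap :: Int -> (a -> b) -> [a] -> [b]
chunkedParMap n f xs = map f xs `using` parListChunk n rseq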

cheers,
T.
-- 
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Small question

2007-08-09 Thread Thomas Conway
On 8/10/07, Hugh Perkins <[EMAIL PROTECTED]> wrote:
> On 8/10/07, Stefan O'Rear <[EMAIL PROTECTED]> wrote:
> > Haskell's purpose: To be a generally cool language
> > Haskell's competition: C++, SML, ... hundreds of thousands more and I make
> no assertion of a representative sample ...
> >
>
> Well, C++ is not really competitive with Haskell, because C++ does not have
> a GC, and it's trivial to corrupt the stack/heap.

Beg to differ. I offer the following proof by contradiction. :-)

In my current job, I had a version-1 implementation in Python which
had severe performance problems, and was not amenable to concurrency
(The Python interpreter has a global lock, so you can only execute
python bytecodes from one thread at a time. :-(). The natural
alternative implementation language was C++, but I argued successfully
that a Haskell implementation would be significantly easier to make
concurrent.

Saying that it's trivial to corrupt the stack/heap in C++ is a bit
like saying it's easy to fall of a bicycle. Sure it is, but there are
also well understood techniques for avoiding doing so. :-) In C++ that
I write, I almost never use bare pointers. Using auto_ptr, shared_ptr,
etc, handle most of the memory management issues. When they don't, one
can usually make a analogous class to manage the lifetime for you.

cheers,
Tom
-- 
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Zippers, Random Numbers & Terrain

2007-08-01 Thread Thomas Conway
On 8/2/07, apfelmus <[EMAIL PROTECTED]> wrote:
> That concludes the infinite terrain generation for one dimension. For
> higher dimension, one just needs to use 2D objects instead of intervals
> to split into two or more pieces. For instance, one can divide
> equilateral triangles into 4 smaller ones. In fact, it doesn't matter
> whether the starting triangle is equilateral or not when using the
> midpoints of the three sides to split it into four smaller triangles.

Nice. The issue of the RNG running backwards was what made me realize
that rather than using StdGen in the nodes, if you simply number them
(Hmmm - the nodes are countably infinite :-)), you can then [e.g.] use
a cryptographic hash or similar to turn them into random numbers. You
can seed the hash to generate different terrains.

You may be interested that in some of the code I wrote for the right
angle isosceles triangle case, I got into precision problems. It turns
out that all the vertices lie on positions with coordinates that are
precisely sums of 2^-k (e.g. 0.5, 0.125, 0.625), yet each time you
subdivide, the scaling factor on the side length is sqrt 2/2. The
resultant rounding meant that instead of getting 0.5, I got
0.53, or some such.

After pondering on this for a while, I realized instead of
representing the scale of the triangle as a Double, I could use
(Either Double Double), with Left x representing the scale x, and
Right x representing the scale x * sqrt 2 / 2. That way, all the
rounding problems can be made to go away. Well, not all of them -
after all Double has limited digits of mantissa, but down to quite
small scales, the arithmetic will be precise. Actually, you could use
(Either Rational Rational), except that performance would be [even
more] atrocious.
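
Concretely, the trick might be coded like this (a sketch with made-up
names): Left x stands for the scale x, Right x for x * sqrt 2 / 2, so
each subdivision just flips constructors and only touches the Double
every second step.

type Scale = Either Double Double

subdivide :: Scale -> Scale
subdivide (Left x)  = Right x         -- x  becomes  x * sqrt 2 / 2
subdivide (Right x) = Left (x / 2)    -- x * sqrt 2 / 2  becomes  x / 2

toDouble :: Scale -> Double
toDouble (Left x)  = x
toDouble (Right x) = x * sqrt 2 / 2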

cheers,
T.
-- 
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Backpatching

2007-08-01 Thread Thomas Conway
On 8/2/07, Daniel McAllansmith <[EMAIL PROTECTED]> wrote:
> On Wednesday 01 August 2007 17:44, Thomas Conway wrote:
> > This sounds like a common problem type. Is there a well known solution
> > to this sort of problem?
>
> Have you looked into Tying the Knot?
> http://www.haskell.org/haskellwiki/Tying_the_Knot

I'll need to look further, but I'd say this looks to be on the money.
In many ways it's isomorphic to the logic variable solution (in
Prolog) - it works so long as you don't look at the binding before the
symbol table is complete (i.e. at the end of parsing). Neat. Except
that the Haskell one does not require evil like var/1 to determine
that there are missing bindings. That's pleasing - I knew I walked
away from Prolog for a good reason[*].

T.
[*] It did take a while though - my first Prolog interpreter was
MicroProlog on the Sinclair Spectrum. And that was circa 1980. :-)
-- 
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Backpatching

2007-07-31 Thread Thomas Conway
Hi All,

One of the things I've been working on lately is some ASN.1 stuff. One
of the first things I wrote in Haskell was an ASN.1 parser. It only
worked for a subset, and I'm revisiting it to make it handle a
larger subset.

One of the things that gets messy is that in lots of places you can
put either a thing or a reference to a thing (i.e. the name of a thing
defined elsewhere). For example, consider the production:

NamedNumber ::= identifier "(" SignedNumber ")"
              | identifier "(" DefinedValue ")"

If we ignore the second alternative, the natural translation into a
Parsec parser would look like:

namedNumber = do
    name <- identifier
    val <- parens signedNumber
    return (name, val)

Now to handle the second alternative is easy enough:

namedNumber = do
    name <- identifier
    val <- parens (fmap Left signedNumber <|> fmap Right definedValue)
    return (name, val)

however, because names can be used before they are defined, the result
type goes from being

type NamedNumber = (Name,Integer)

to

type NamedNumber = (Name,Either Integer Name)

Nothing too terrible so far. The messiness comes in when you
consider the number of places that you have to replace a type 't' with
(Either t Name). I'd really like to avoid having to do this.

If I were using Prolog, I could finesse the problem by introducing
a free variable and filling it in when I come across the definition[*].
Logic variable backpatching. :-)

So one possibility would be to return a closure:

...
return $ \bindings -> (name,resolve val bindings)

resolve :: (Either t Name) -> Map Name t -> t

or something like that. Then when you get to the end, you apply the
bindings and voila, out pops the simple type. I'm not sure this will
work quite as well as it sounds.
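
To make the closure idea concrete, a minimal sketch (Name, Deferred and
the error handling in resolve are illustrative):

import qualified Data.Map as Map

type Name = String
type Deferred t = Map.Map Name t -> t   -- a value awaiting the symbol table

resolve :: Either t Name -> Map.Map Name t -> t
resolve (Left v)  _        = v
resolve (Right n) bindings =
    case Map.lookup n bindings of
        Just v  -> v
        Nothing -> error ("undefined name: " ++ n)

-- The parser returns \bindings -> (name, resolve val bindings); applying
-- the completed table at the end of parsing yields the simple type.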

This sounds like a common problem type. Is there a well known solution
to this sort of problem?

cheers,
Tom
[*] And then at the end use var/1 to look for undefined names. Urk.
Actually, if I were using Prolog in the way most Prolog programmers use
it, I wouldn't be thinking about the types.
-- 
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Operational Semantics of Haskell

2007-07-31 Thread Thomas Conway
On 8/1/07, Lewis-Sandy, Darrell <[EMAIL PROTECTED]> wrote:
> Is there a good source for the operational semantics of Haskell?  I am
> trying to squeeze the most efficiency out of a bit of code and am looking to
> remove unnecessary reductions.

You're kind of asking two questions - the first you ask explicitly,
the second "how do I optimize my program" others have already
answered.

The first question is a very tricky one for haskell. For example, see
the following:

http://www.cs.chalmers.se/~rjmh/Haskell/Messages/Archived.cgi?id=102

-- 
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Zippers, Random Numbers & Terrain

2007-07-29 Thread Thomas Conway
Hi All,

To amuse myself while waiting for test-runs to complete, I was
thinking about random terrain generation. I came across a bunch of
nice posts by Torben Mogensen, where he describes a neat way of
constructing random terrains by recursively subdividing right angled
isosceles triangles. It got me thinking - it's all well and good
subdividing to give more detail as you zoom in, but what about when
you zoom out?

This got me thinking that it would be cool to make an infinite terrain
generator using a zipper, so you can zoom in/out infinitely, and by
implication, infinitely in any direction.

One of the key components that seems to be necessary is a random
number generator zipper. In Mogensen's scheme, you have a number
associated with each point, and when you subdivide, you create a new
RNG seed from the numbers at each end of the hypotenuse which you are
bisecting. These numbers are used to generate height variation. The
trick is to make the combination order independent (e.g. xor). This
is easy for zooming in, but it's not clear how to do this for zooming
out.
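
For illustration (childSeed is a made-up name): xor is commutative, so
the two endpoints of the bisected hypotenuse can be combined in either
order and still yield the same seed.

import Data.Bits (xor)
import System.Random (StdGen, mkStdGen)

childSeed :: Int -> Int -> StdGen
childSeed a b = mkStdGen (a `xor` b)   -- childSeed a b == childSeed b a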

It's probably sufficient to assume a (parameterizable) hashing/mixing
scheme, and to simply number the nodes in some deterministic fashion.
The subdivision is binary, so we could number the children
deterministically. If we use "decimals", from some arbitrary starting
point, we can extend in the "fractional" direction when we zoom in,
and extend in the "whole number" direction when we zoom out.

I'm only just discovering zippers, so my question to the learned
members of this forum is: Am I on the right track? Is a scheme like
this going to work?

cheers,
Tom
-- 
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Re : [Haskell-cafe] Indentation woes

2007-07-26 Thread Thomas Conway
On 7/27/07, Neil Mitchell <[EMAIL PROTECTED]> wrote:
> Personally, I have no problem with the current way (and would consider
> anything other than 4 leading spaces in the first example to be evil).
> However, if you are using a text editor which doesn't automatically
> indent the start of following lines, it might be a bit more annoying.
> Of course, if your editor is that bad you should consider changing to
> virtually anything which isn't notepad.

Or pico. :-) You'd be amazed at the number of undergraduates I taught who
refused to learn to use gvim, emacs, or any other *programming*
editor, and instead spent 75% of their time battling the editor.


I must say, I agree about the indentation question. My experience is
that if you use 4 space indentation and run out of columns, then it's
time to refactor, and think more about how your logical structure is
working (you may recall, this list recently introduced me to the Maybe
monad transformer for exactly this reason). It's actually a pretty
effective rule of thumb.

cheers,
T.
-- 
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Maintaining the community

2007-07-17 Thread Thomas Conway

On 7/18/07, Hugh Perkins <[EMAIL PROTECTED]> wrote:

Am I the only person who finds it interesting/worrying that there are few to
no people in the group who are ex-C# programmers.  I mean, you could argue
that C# programmers are simply too stupid to do Haskell, but ... you know,
there is another explanation ;-)


I wouldn't say too stupid, but it may be a cultural thing. People
working in C++ are more likely to be doing what I would call
"technical" programming, and correspondingly more likely to be
interested in Haskell, and to appreciate what it has to offer from
painful personal experience. From what I know of the marketplace,
people working in C# are more likely to be doing client/integration
work where technical finesse is less important, and are therefore less
likely to see the point.

cheers,
T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Maintaining the community

2007-07-17 Thread Thomas Conway

On 7/17/07, Martin Coxall <[EMAIL PROTECTED]> wrote:

Me too, which is why I find your statement that expertise in C++ is
easy to acquire. Seeing some of my colleagues' code is enough to tell
me that this is most definitely not the case.


You're quite right. That was careless on my part. Though the way C++
is taught at the undergraduate level, and the way it is perceived by
the inexperienced, both suggest that it isn't so hard.

But then again, I've taught Java at the ugrad level, and what do I
know about Java, other than that it'd be quite a nice place to have a
holiday some time :)

T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Maintaining the community

2007-07-16 Thread Thomas Conway

On 7/16/07, Malcolm Wallace <[EMAIL PROTECTED]> wrote:


OK, so I'm not genuinely suggesting that you must possess or be studying
for a PhD, to grok Haskell.  But I find nothing alarming about the
suggestion that one needs a fairly high level of intelligence, and some
training, in order to be able to use Haskell effectively.


When I was a teenager I thought people with PhDs were minor deities.
Having done one, and knowing lots of people with them, I can tell you,
the vast majority of people with a PhD (including me)  have merely
above average intelligence. A PhD is not a mark of intelligence. It's
a mark of persistence. (Shall we say obsession?)

I think Malcolm's analogy to other professions is quite apt. If we
expect to be taken seriously as professionals, it would be
unsurprising to find that we need to engage in some strenuous [mental]
effort to acquire the skills.

And this is where I think Haskell has it all over C++, Java, and the
rest. Haskell is easy to learn at a simple level, and hard to learn at
the expert level, but once learned is very powerful and has excellent
payoffs in terms of productivity. With C++ or Java, the expertise is
somewhat easier to acquire, but you never get the payoff. And before
you all flame, yes, I do know C++ at an expert level, and that is
exactly why, after 7 years of writing server software in C++, I now
want to do it in Haskell.

cheers,
T
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Indentation Creep

2007-07-15 Thread Thomas Conway

On 7/16/07, Claus Reinke <[EMAIL PROTECTED]> wrote:

enjoy (i hope:-),


Very much. Work hasn't been so much fun since... well, for a long time. :-)

One small question, which might reveal if I've understood things aright.


(do (v,e) <- dmin' l
    (do guard e
        me <- empty m
        re <- nullT r
        write m p (v, me && re))
      `mplus` return (v,e))
  `mplus` (do v <- readTVar' m
              re <- nullT r
              write m p (v,re))
  `mplus` (do (v,e) <- dmin' r
              when e $ writeTVar' p Empty
              return (v,e))
  `mplus` error "emit nasal daemons"


If I refactor this a little into dminLeft dminMiddle and dminRight, I
believe I should be able to replace this with:

    ...
    msum [ dminLeft l m r, dminMiddle m r, dminRight r
         , error "emit nasal daemons" ]
    where
    dminLeft l m r = do
        (v,e) <- dmin' l
        (do guard e
            me <- empty m
            re <- nullT r
            write m p (v, me && re))
          `mplus` return (v,e)
    dminMiddle m r = do
        v <- readTVar' m
        re <- nullT r
        write m p (v,re)
    dminRight r = do
        (v,e) <- dmin' r
        when e $ writeTVar' p Empty
        return (v,e)
    ...

Is this correct? And if so, is GHC likely to do constant folding on
msum over the list skeleton?

T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Indentation Creep

2007-07-15 Thread Thomas Conway

On 7/15/07, Miguel Mitrofanov <[EMAIL PROTECTED]> wrote:
   [nice solution deleted]

Shiny.

Everyone's suggestions show that in order to advance to a level 3
Haskell Mage[*], I need to spend a chunk of time learning to grok
monad transformers.

Thanks for the suggestions. If anyone cares I can post the complete
module, after maybe transforming it a little. ;-)

cheers,
T
[*]
Level -1 - Java programmer.
Level 0 - Can write simple list comprehensions and call "print" to
show the results.
Level 1 - Figured out elementary IO, but thinks "monad" is some kind
of swear word.
Level 2 - Has used Parsec, and can use the Error monad, and mean it.
Level 3 - Can use monad transformers deliberately
...
Level 42 - Has realized "monad" really is some kind of swear word. :-)

--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Indentation Creep

2007-07-14 Thread Thomas Conway

On 7/14/07, Aaron Denney <[EMAIL PROTECTED]> wrote:

It might be a bit clearer if every level of the tree were a flat map of
pointers.  You can even parametrize on this map type...


Yes, this would be an obvious generalization, though if I were to
modify the details of the structure, I'd be inclined to go in exactly
the opposite direction: rather than have the keys be [Bit], use some
Bits b => b, and use an Int argument to recurse down the tree.

The motivation for this structure is that I wanted a queue from which
I could remove elements from the middle efficiently, using only
local updates (for high concurrency).

The structure I was replacing used a doubly linked list with TVars as
pointers between nodes. As I hinted in the original post, this was
ugly, and seemed to be leaking memory (I actually think there might be
some issues with the GHC implementation of TVars and GC - I'm not
certain, but I think the leak *may* be a bug in GHC, and as I posted
separately, GC was taking an awfully large proportion of the time).
One way of achieving what I wanted was to keep a "timestamp" counter
and use (Map TimeStamp k). The problem with Map is that it is hostile
to concurrency - all modifications are "global" wrt the Map.

The structure that is required in this instance is a structure with
enough TVars linking the pieces that updates are local - a write to
one TVar doesn't interact with reads in other parts of the structure.
For example a binary tree with TVars between the nodes. Except a
(vanilla) binary tree would be rotten in this case because the new
keys arrive in strictly increasing order - a worst case for such a
structure. So while I could have modified Map, or my own home-rolled AVL
tree, to stick TVars between the nodes, there were reasons not to:

1. Tries are easy to implement and offer O(b) ~= O(1) for keys with b
bits. Thanks to apfelmus for reminding me of this recently.

2. In Haskell, it's *fun* rolling new data structures to see how
elegant you can be. (A favorite quote of mine is "Elegance is not
Optional", I think due to Richard O'Keefe.)

3. This structure is used in an inner loop, so a structure giving O(1)
operations was desirable.

Anyway, the point of the original post was to find tricks for avoiding
indentation creep, rather than the trie itself.

cheers,
T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Indentation Creep

2007-07-13 Thread Thomas Conway

Hi All,

In the best spirit of Haskelling, I thought I'd try dropping in a
completely different data structure in a spot where I thought the
existing one was (1) ugly (2) leaking memory. In particular, I wrote a
Trie implementation. Now the point is actually not much to do with the
data structure itself, but code layout. I mention this particular data
structure only because it is the one I was working on, but it seems to
come up quite often.

Consider the following function:

-- (import implied in the original post)
import Control.Concurrent.STM

data Trie t = Empty | Trie (TriePtr t) (MaybePtr t) (TriePtr t)
type TriePtr t = TVar (Trie t)
type MaybePtr t = TVar (Maybe t)

data Bit = Zero | One
    deriving Show

dmin p = do
    mv <- dmin' p
    case mv of
        Nothing -> error "dmin: no values"
        Just (v,_) -> return v

dmin' p = do
    t <- readTVar p
    case t of
        Empty -> return Nothing
        Trie l m r -> do
            mv <- dmin' l
            case mv of
                Nothing -> do
                    mv <- readTVar m
                    case mv of
                        Nothing -> do
                            mv <- dmin' r
                            case mv of
                                Nothing -> error "emit nasal daemons"
                                Just (v,e) -> do
                                    if e
                                        then writeTVar p Empty
                                        else return ()
                                    return mv
                        Just v -> do
                            re <- null r
                            case re of
                                False -> writeTVar m Nothing
                                True  -> writeTVar p Empty
                            return (Just (v,re))
                Just (v,e) -> do
                    case e of
                        True -> do
                            me <- empty m
                            re <- null r
                            case me && re of
                                False -> writeTVar m Nothing
                                True  -> writeTVar p Empty
                            return (Just (v,me && re))
                        False -> return mv
    where
    -- (helper analogous to `empty`; not shown in the original post)
    null q = do
        t <- readTVar q
        case t of
            Empty -> return True
            _     -> return False
    empty m = do
        v <- readTVar m
        case v of
            Nothing -> return True
            Just _  -> return False

All that case analysis causes indentation to creep, and lots of
vertical space "feels" wasted. Is that just a fact of life, or is
there Haskell magic that I still need to learn?

cheers,
T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Re[8]: [Haskell-cafe] Toy compression algorithms [was: A very edgy language]

2007-07-12 Thread Thomas Conway

On 7/12/07, Bulat Ziganshin <[EMAIL PROTECTED]> wrote:

about which particular algorithm you said? Moffat?


Well, both Andrew and I had Alistair Moffat as a lecturer in our time.
So, surely. :-)

If my memory serves me, you'll find Alistair has published work on
quite a number of algorithms, some of which are symbol based (0, 1 or
higher order), and others which are word based.

cheers,
T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] better error expression in IO function

2007-07-11 Thread Thomas Conway

On 7/12/07, Andrew Coppin <[EMAIL PROTECTED]> wrote:

It's fairly common to use the Either type for this. By convention,
"Right" means "correct", and by elimination "Left" means an error...


Presumably, this is because the world is dominated by dull,
conventional, right handed people. :-)

cheers,
Tom Southpaw Conway
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Number overflow

2007-07-11 Thread Thomas Conway

On 7/12/07, Andrew Coppin <[EMAIL PROTECTED]> wrote:

What, as in the way that simple strings are lists, and you change it to
something less flexible but faster if you actually need it?

I suppose that makes some sense...


Not much. There is a very significant difference between the two.

The differences between Int and Integer operations are mostly constant factors.

The differences between String and ByteString operations are
algorithmic (consider length).

T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Strange results when trying to create large Bool arrays.

2007-07-11 Thread Thomas Conway

On 7/12/07, Stefan O'Rear <[EMAIL PROTECTED]> wrote:

Indeed.  I beleive that Int should be removed from the Prelude.


metoo.

Actually, one of the really annoying things that I am finding in my
code which is a mixture of ByteString, Word16, Word32 and Word64, is
that all the standard libraries use Int everywhere, so I end up having
to get out the fromIntegral spray-gun, which uglifies the code and
makes it much harder to read.

Is there a compelling reason (hysterical raisins is not a compelling
reason) why Data.*.{length,size,take,drop,etc} use Int and not Num n
=> n or similar?

If the answer is efficiency, then cannot they use Int# or similar
internally, and have an optimizable fromIntegral *inside* them?

T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Toy compression algorithms

2007-07-11 Thread Thomas Conway

On 7/12/07, Andrew Coppin <[EMAIL PROTECTED]> wrote:

Yes - but making it use a non-flat model opens a whole Pandora's Box of
fiddly programming. ;-)


This could just about be Rule No 1 of haskell programming: if it's
fiddly, then you haven't thought about the problem hard enough.

Corollary No 1: any expression requiring more than 80 columns is fiddly.

:-)

I say this in jest, but it is "ha ha, only serious".

cheers,
T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] The Garbage Collector Ate My Homework

2007-07-10 Thread Thomas Conway

Hmm, looks like the garbage collector got hungry again:

  1,093,741,664,672 bytes allocated in the heap
  1,006,759,632,160 bytes copied during GC (scavenged)
     72,181,353,728 bytes copied during GC (not scavenged)
        400,940,412 bytes maximum residency (8853 sample(s))

      76353 collections in generation 0 ( 3724.54s)
       8853 collections in generation 1 (19689.00s)

       1065 Mb total memory in use

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time  2619.54s  (3103.46s elapsed)
  GC    time 23413.54s  (23569.24s elapsed)
  RP    time    0.00s  (  0.00s elapsed)
  PROF  time  2167.91s  (2204.09s elapsed)
  EXIT  time    0.00s  (  0.08s elapsed)
  Total time 28201.00s  (28876.80s elapsed)

  %GC time       83.0%  (81.6% elapsed)

  Alloc rate    417,531,666 bytes per MUT second

  Productivity   9.3% of total user, 9.1% of total elapsed

And that was with -H700M.

What a shame the heap profiler (-hb) aborts on my program. :-( Still,
-hc and -hd seem to work; the only hassle is the hours of running time
required to get things to go bad.

cheers,
T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Data.ByteString.dropWhile

2007-07-09 Thread Thomas Conway

So the following isn't as clever as the line-noise Don posted, but
should be in the ball-park.

dropFromEnds p = dropWhile p . dropWhileEnd p

dropWhileEnd p bs = take (findFromEndUntil (not . p) bs) bs

takeWhileEnd p bs = drop (findFromEndUntil p bs) bs

{- findFromEndUntil is in ByteString.hs, but is not exported -}

T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Data.ByteString.dropWhile

2007-07-09 Thread Thomas Conway

Well, maybe I should be asking a higher level question then.

I have a function

tidy = reverse . dropWhile punk . reverse . dropWhile punk
    where
    punk = isPunctuation . chr . fromIntegral

which is leading to a significant amount of allocation, and you can see why.

The way I'd like to write it is

tidy = dropWhile punk . dropWhileEnd punk
   where 

which has the obvious advantage of avoiding quite a bit of
intermediate allocation.

Is there a another way?

I note that since I'm using a nice declarative language, the compiler
CLEARLY should be transforming the first form into the second. :-)

T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Data.ByteString.dropWhile

2007-07-09 Thread Thomas Conway

Hi All,

I notice that Data.ByteString has span and spanEnd. Is there a known
particular reason why dropWhile and takeWhile don't have corresponding
*End functions? If not, what is the protocol for adding them?

cheers,
T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Too many packages on hackage? :-)

2007-07-09 Thread Thomas Conway

On 7/9/07, Ketil Malde <[EMAIL PROTECTED]> wrote:

The current deliverables seem to consist of a tar file and a package
description, neither of them accurately dated.


Clearly we need to store them in a treap. :-)

--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Clearly, Haskell is ill-founded

2007-07-08 Thread Thomas Conway

On 7/9/07, Donald Bruce Stewart <[EMAIL PROTECTED]> wrote:

And he's patented it...

http://www.patentstorm.us/patents/5355496-description.html


Clearly a winner then. :-)

T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Clearly, Haskell is ill-founded

2007-07-08 Thread Thomas Conway

I don't know if you saw the following linked off /.

http://www.itwire.com.au/content/view/13339/53/

An amazon link for the book is here:

http://www.amazon.com/Computer-Science-Reconsidered-Invocation-Expression/dp/0471798142

The basic claim appears to be that discrete mathematics is a bad
foundation for computer science. I suspect the subscribers to this
list would beg to disagree.

Enjoy,
T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A very edgy language (was: A very nontrivial parser)

2007-07-07 Thread Thomas Conway

On 7/8/07, Dave Bayer <[EMAIL PROTECTED]> wrote:

This of course sets up the best answer to this debate: For a hard
problem, one can express better algorithms in Haskell that would
simply be too painful to code in other languages, swamping any
considerations about the speed of Haskell versus C for a given
algorithm.


This is certainly true. I've coded up, in less than six months,
something that uses better algorithms and finer grained concurrency
than the software I used to work on, and the latter represented 5 or
more man-years of coding. However this is server software, which is
long running, so performance and memory usage are pretty important, and
these are relatively hard to get right in Haskell. OTOH, as you can tell,
I think it's a good trade off - I did convince the mgt to let me do it
in Haskell in the first place. :-)

--
Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A very edgy language (was: A very nontrivial parser)

2007-07-07 Thread Thomas Conway

On 7/7/07, Albert Y. C. Lai <[EMAIL PROTECTED]> wrote:

Non-strict (most implementations lazy): rarely useful if you ask the
mainstream.




Of your propositions, I must say this one has the most merit, though
not exactly as stated. :-) Being non-strict does allow some nice
expressiveness, but has one teeny tiny downside - the performance
model for haskell programs is at best inscrutable. Even using the
decent profiling tools in GHC, it can be almost impossible to
understand why a non-trivial program behaves the way it does. In my
current project, we restart the server periodically because there's a
memory leak in there somewhere that I can't track down. Now, I'm not
saying that someone else might not spot it easily, but I hope you see
my point:

I can look at the source code of a C function, and I can pretty much
guess what machine code will be generated for it (issues like
instruction scheduling and register allocation aside). The same is
essentially true for C++, Lisp, Prolog, Java, Mercury, &c, &c, &c, but
not for Haskell.

I wind up using -prof -auto-all as standard GHC flags so that if error
gets called, I have a vague chance of figuring out what's going on.




Static typing: extreme paranoia.


I've been working in a mostly Python shop this last year, and it
reinforces my belief that people who don't like strong static typing
are yahoos, not professionals interested in producing high quality
code. Maybe I just don't get the line between professionalism and
paranoia. ;-)


Purely functional: vocal minority of edgy people.


Ever used Prolog? Compromising purity in a declarative language can
seem like a good idea in the short term, but in the long term, it
usually causes untold grief. Especially, in the case of Prolog, the
cut operator which interferes with the natural operation of
backtracking. It overflows into the operation of negation, and creates
all kinds of bother.


So I did my PhD in the Mercury group at .mu.oz.au. Mercury is a retake
on logic programing. It is pure.

In 1995 I arrived in the US for my first logic programming conference,
and on the first evening, before the conference proper began, went out
with a bunch of attendees. I got chatting with a really nice Canadian
guy, Jamie Andrews, and five minutes into the conversation, on finding
out he was a semantics researcher, asked what I thought was a terribly
witty question "So are you presenting *another* semantics for the
'cut' operator?"

"Um, well, yes, actually" was his reply.

Apart from showing what a precocious prat I was, OMG 12 years ago, it
tells you something about what happens when you ride rough-shod over
purity. It creates gainful employment for hundreds of researchers for
decades trying to put the genie back in the bottle.


ML and friends have had a much easier time of it than Prolog, I
concede, but the problem of finding practical paradigms of programming
in pure languages that combine expressiveness with clean semantics is
actually well worth the short term inconvenience. Those with good
memories will know that the use of monads to express IO took some
time, and that there were several less successful, though more-or-less
pure attempts before. There was the pair of lazy streams model; the
continuation passing model; the linear types model (deployed by Clean,
of course); and maybe others. The cool thing is that they were all
fairly painful to use, and rather than give up, the researchers kept
trying new things and came upon monads. The extra cool thing is that
monads have turned out to be really useful for a whole lot of other
things than just a way of expressing IO or even IO and mutable state
(which linear types captures).
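
To make that history concrete, here is a minimal sketch (my own
reconstruction, not the historical Haskell 1.2 API) of the lazy-streams
model: the whole program is a pure function from a lazy list of
responses to a list of requests, and the runtime plays the operating
system.

data Request  = GetLine | PutLine String
data Response = Line String | Ok

type Dialogue = [Response] -> [Request]

-- Echo one line. The first request must be produced before the first
-- response is inspected - which is exactly where laziness earns its
-- keep, and where this style got painful.
echoOnce :: Dialogue
echoOnce resps = GetLine : case resps of
    Line l : _ -> [PutLine l]
    _          -> []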

As SPJ notes in his Hair Shirt talk, monads are not perfect, since
they are often used in ways which over-sequentialize code using them,
so we have people working on arrows, and other more sophisticated
mechanisms, which in time will probably lead to more expressive
paradigms.

cheers,
T.
--
Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] interrupting an accept()ing thread

2007-07-05 Thread Thomas Conway

On 7/6/07, Lukas Mai <[EMAIL PROTECTED]> wrote:

I don't see how this solves the problem. AFAICS acceptLoop never returns
and sok is never closed. On the other hand, my program doesn't need a
liveOpCount because the subthreads take care of themselves. It's just the
accept loop I need to break somehow.


Well, it works because the sub-thread dies when the program exits, so
the socket gets closed then.

T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] interrupting an accept()ing thread

2007-07-05 Thread Thomas Conway

On 7/6/07, Lukas Mai <[EMAIL PROTECTED]> wrote:

Hello, cafe!


Have you been reading my mind? See the other recent Cafe thread (um,
titled something about System.Exit).

Here's my solution:

import Control.Concurrent (forkIO)
import Control.Concurrent.STM
import qualified Network.Socket

acceptLoop sok reqChan = do
    req <- Network.Socket.accept sok
    atomically (writeTChan reqChan req)
    acceptLoop sok reqChan

mainLoop reqChan quitVar liveOpCountVar = do
    action <- atomically (quitNow `orElse` getReq)
    case action of
        Nothing -> return ()
        Just (reqSok, reqAddr) -> do
            atomically $ do
                liveOpCount <- readTVar liveOpCountVar
                writeTVar liveOpCountVar (liveOpCount + 1)
            forkIO (doSession reqSok reqAddr quitVar liveOpCountVar)
            mainLoop reqChan quitVar liveOpCountVar
  where
    quitNow = do
        q <- readTVar quitVar
        case q of
            True  -> return Nothing
            False -> retry

    getReq = do
        req <- readTChan reqChan
        return (Just req)

doit sok = do
    reqChan <- atomically newTChan
    quitVar <- atomically (newTVar False)
    liveOpCountVar <- atomically (newTVar 0)
    forkIO (acceptLoop sok reqChan)
    mainLoop reqChan quitVar liveOpCountVar
    atomically $ do
        liveOpCount <- readTVar liveOpCountVar
        if liveOpCount > 0
            then retry
            else return ()

Although doSession is not included, obviously when you want to quit,
something in doSession should set quitVar to True. Also, as suggested
elsewhere, doSession should use a "finally" clause to make sure the
live op count gets decremented - along the lines of the sketch below.
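
Something along these lines (a skeleton only - the real request
handling is elided, and handleRequest/decrement are names I've made up
here):

import Control.Concurrent.STM
import Control.Exception (finally)

doSession reqSok reqAddr quitVar liveOpCountVar =
    handleRequest `finally` decrement
  where
    handleRequest = do
        -- ... serve the request on reqSok/reqAddr; a shutdown request
        -- would do: atomically (writeTVar quitVar True)
        return ()
    decrement = atomically $ do
        n <- readTVar liveOpCountVar
        writeTVar liveOpCountVar (n - 1)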

T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] System.Exit

2007-07-05 Thread Thomas Conway

FWIW,

I implemented the STM based solution, and it works a treat. It's less
of a hack than the version where I got the child thread to send a
SIGINT. ;-)

T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Binary serialization, was Re: Abstraction leak

2007-07-05 Thread Thomas Conway

I was explaining Haskell to a perl/python hacking friend recently and
characterized things thus:

Perl is a horrible language with fantastic libraries.
Haskell is a fantastic language with horrible libraries.

Actually, many of the libraries that exist for Haskell *are*
fantastic, it's just that Haskell lacks the *coverage* that Perl or
Python have.

cheers,
T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] System.Exit

2007-07-04 Thread Thomas Conway

Hi All,

Can anyone tell me what System.Exit.exitWith is actually supposed to
do? As far as I can tell, it seems to be a synonym of (return ()).

Okay, I'll stop being provocative and try and be helpful.

So I have a web server which, like the one in The Literature(TM),
essentially has a main loop:

doit sok = do
   (reqSok,reqAddr) <- Network.Socket.accept sok
   forkIO (handleRequest reqSok reqAddr)
   doit sok

This is all well and good, but how do you *stop* a server? Well, you
have a request (blah blah auth blah blah) which tells it to shut
down, /quit for example.

I have a function to handle the quit request that looks something like:

quitHandler sok addr = do
   tidyUpEverything 
   sendOkResponse sok
   sClose sok
   System.Exit.exitWith ExitSuccess

All nice and simple. All except one detail: it doesn't actually work.

It prints

exit: ExitSuccess

but the "doit" loop keeps going. Of course, it goes totally spacko,
because of the call to tidyUpEverything, but it doesn't exit.

So, if I set an IORef/TVar inside quitHandler which I inspect either
just before or just after the call to Network.Socket.accept, I could
exit the loop, but that only helps once the next request comes in.

I contemplated a solution involving Control.Exception.throwTo, but I
actually read the doco (!) which states the following:


If the target thread is currently making a foreign call, then the
exception will not be raised (and hence throwTo will not return) until
the call has completed. This is the case regardless of whether the
call is inside a block or not.


So no joy there.

Ideas anyone?

And is exitWith broken, or is it the doco that's broken?

cheers,
T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Abstraction leak

2007-07-04 Thread Thomas Conway

On 7/5/07, Donald Bruce Stewart <[EMAIL PROTECTED]> wrote:

Yep. The current impl is:

mmapFile :: FilePath -> IO ByteString
mmapFile f = mmap f >>= \(fp,l) -> return $! PS fp 0 l

mmap :: FilePath -> IO (ForeignPtr Word8, Int)
mmap = do
 ...
 p  <- mmap l fd
 fp <- newForeignPtr p unmap -- attach unmap finaliser
 return fp



Which, if I read it correctly, is not safe in a concurrent/multitasking
environment, since it wraps the underlying mmapped region. In many
programs, I'm sure this won't be a problem. Unfortunately, the system
I'm working on is multi-threaded, and we definitely want to update
regions. Perhaps I'll have to bite the bullet and implement the
Mapping thing I described. The really unfortunate thing is that I'd
really like to be able to do it within the STM monad, with rollback,
etc - escaping to the IO monad is annoying.

FWIW, the technique I use to handle this kind of situation may be of
general interest. Consider a cache of structures reconstituted from an
external file. If a requested item is not in the cache, we throw an
exception that is caught by a wrapper function in the IO monad, which
reads the requested structure, sticks it in the cache, and then reruns
the transaction. There are a few details you have to get right,
including making sure none of the items you require to complete the
operation get evicted by another thread, but it works very nicely.
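
In outline, it looks something like this - a minimal sketch with
made-up names (Key, Item, loadFromDisk) and none of the anti-eviction
machinery; with older GHCs you would use throwDyn/catchDyn instead of
the extensible-exceptions style:

{-# LANGUAGE DeriveDataTypeable #-}
import Control.Concurrent.STM
import Control.Exception
import qualified Data.Map as Map
import Data.Typeable

type Key  = Int
type Item = String

data CacheMiss = CacheMiss Key deriving (Show, Typeable)
instance Exception CacheMiss

-- Look the item up inside STM; a miss aborts the transaction by
-- throwing, which also rolls back any writes it has made.
lookupItem :: TVar (Map.Map Key Item) -> Key -> STM Item
lookupItem cacheVar k = do
    cache <- readTVar cacheVar
    case Map.lookup k cache of
        Just v  -> return v
        Nothing -> throwSTM (CacheMiss k)

-- The IO wrapper: on a miss, fault the item in and rerun the whole
-- transaction from scratch.
withCache :: TVar (Map.Map Key Item) -> (Key -> IO Item) -> STM a -> IO a
withCache cacheVar loadFromDisk txn = go
  where
    go = atomically txn `catch` \(CacheMiss k) -> do
        v <- loadFromDisk k
        atomically $ do
            cache <- readTVar cacheVar
            writeTVar cacheVar (Map.insert k v cache)
        go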

T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Abstraction leak

2007-07-04 Thread Thomas Conway

Anyone trying to do any of this?


I've done some work in this area. I'm particularly interested in
manipulating ASN.1 in haskell. Actually, my first use of Parsec was an
ASN.1 parser. I'd done one previously in Spirit (the Boost C++ rip-off
of parsec), but semantic actions were horrible in the extreme. Mmmm
Parsec.

In the indexing system I'm currently building in Haskell for my day
job, I'm serializing several data structures, and using Data.Bits and
Data.ByteString heavily.

I was using HaXml, but I found it was very slow. So instead, I'm using
an internal (within the indexing system) representation that is more
akin to WBXML:

import Control.Monad.State
import Data.ByteString as ByteString
import Data.List as List
import Data.Sequence as Seq

data DocTree
    = DocElem ByteString [(ByteString,ByteString)] [DocTree]
    | DocText ByteString

serialize tree =
    ByteString.concat $ Seq.toList $ execState (serialize' tree) Seq.empty

serialize' (DocText txt) = do
    stuff <- get
    put (stuff |> pack [0])
    putStr txt
serialize' (DocElem name attrs kids) = do
    stuff <- get
    put (stuff |> pack [1])
    putStr name
    putNum (List.length attrs)
    mapM_ (putPair putStr putStr) attrs
    putNum (List.length kids)
    mapM_ serialize' kids

putStr ...

You get the idea. Actually, the *real* code is trickier - it grovels
first to find all the element names and numbers them. Likewise with
attribute names (per element). The extra grovel is well worth it - it
takes a little longer to serialize, but is more compact and
deserializes quicker.

Also worth noting - whether you compile a dictionary of element names
or not, the result is much much much more space efficient than using
HaXml, since it can all be decoded out of a single ByteString
containing the document tree, with no actual string copying at all.
That's the kind of [de]serialization I like. :-) Mind you, I still
have to use HaXml when I first read documents into the system, and a
very nice job it does too.

T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] The Garbage Collector Ate My Homework

2007-07-03 Thread Thomas Conway

Well, not quite, but look at the following:

118,342,689,824 bytes allocated in the heap
144,831,738,780 bytes copied during GC (scavenged)
335,086,064 bytes copied during GC (not scavenged)
255,257,516 bytes maximum residency (42 sample(s))

222884 collections in generation 0 (3891.90s)
42 collections in generation 1 (153.99s)

   536 Mb total memory in use

  INIT  time     0.00s  (   0.00s elapsed)
  MUT   time   233.66s  ( 776.99s elapsed)
  GC    time  4045.89s  (4251.52s elapsed)
  EXIT  time     0.00s  (   0.00s elapsed)
  Total time  4279.55s  (5028.52s elapsed)

  %GC time      94.5%  (84.5% elapsed)

  Alloc rate    506,470,897 bytes per MUT second

  Productivity   5.5% of total user, 4.6% of total elapsed

Can anyone offer general suggestions for how to fix this!

T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Odd lack of laziness

2007-06-23 Thread Thomas Conway

On 6/23/07, Chaddaï Fouché <[EMAIL PROTECTED]> wrote:


isLength1 [x] = "Ok"
isLength1 _ = "Nok"


excellent.


How is [x] big in any way? If you need to test for more than one
element you can just put a guard with length


Invoking length is more strict than is necessary, though this may not
be a problem. If you want a lazier solution, you could try:

hasLength 0 [] = True
hasLength 0 (_:_) = False
hasLength n [] = False
hasLength n (_:rest) = hasLength (n - 1) rest

This only evaluates at most enough of the list skeleton to verify
whether or not it has the right length, where invoking length would
evaluate the whole list skeleton.

cheers,
T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Collections

2007-06-22 Thread Thomas Conway

On 6/22/07, apfelmus <[EMAIL PROTECTED]> wrote:

I guess you have considered Software Transactional Memory for atomic
operations?
   http://research.microsoft.com/~simonpj/papers/stm/index.htm

Also, write-once-read-many data structures (like lazy evaluation uses
them all the time) are probably very easy to get locked correctly.


STM was *the* justification to the mgt for letting me use Haskell
rather than C++. :-)

However, you do need to take care, because in this context it would be
easy to end up creating great big transactions which conflict with one
another, which quite aside from wasting CPU on retries, can in extreme
cases lead to starvation. A bit like laziness, STM is fantastic for
correctness, but can be a bit obtuse for performance. With that
proviso, I think STM is better than sliced bread.[*]

Incidentally, I read Herlihy's papers on lock free data structures
early on in my work on parallelism and concurrency for Mercury in the
mid 90's. What a shame I didn't have the wit to understand them
properly at the time, or Mercury might have had STM 10 years ago. :-)

T.
[*] People who know me well, would realize that since I bake my own
bread and slice it with a bread-knife myself, comparison to sliced
bread may be faint praise. It isn't.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Collections

2007-06-21 Thread Thomas Conway

On 6/22/07, Duncan Coutts <[EMAIL PROTECTED]> wrote:

You might find that lazy IO is helpful in this case. The primitive that
implements lazy IO is unsafeInterleaveIO :: IO a -> IO a


Personally, unsafeInterleaveIO is so horribly evil that even just
having typed the name, I'll have to put the keyboard through the
dishwasher (see http://www.coudal.com/keywasher.php). Also, I need to
support concurrent querying and updates, and trying to manage the
locking is quite hard enough as it is, without trying to keep track of
which postings vectors have closures pointing to them!


Note that using a Map will probably not help since it needs to read all
the keys to be able to construct it so that'd pull in all the data from
disk.


Well, in the case I'm dealing with, the map can contain the current
key from each postings vector, and the closure for reading the
remainder of the vector. E.g. Map Key ([IO (Maybe Key)]).

T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Collections

2007-06-21 Thread Thomas Conway

On 6/20/07, apfelmus <[EMAIL PROTECTED]> wrote:

Eh, why not a simple mergesort that also deletes duplicates?


I had to sit down and think about this, and while for the simple case
that I showed, your equivalent code is definitely simpler, and
probably more efficient.

The actual case that I'm dealing with, where I believe Data.Map (or
similar, incl finger trees) has a benefit is one in which it's not
simply a case of lists of items, yielding a list of items. I'm
manipulating an on-disk inverted index, so rather than a simple list
of items, the code is actually monadic, doing IO to retrieve the items
off disk, and the cost of creating the intermediate lists is
unwearable. The key problem is that you loose the laziness because of
the IO monad, so if you're not careful, you end up trying to store the
complete intermediate lists.

If you can assume you can hold enough stuff in memory, then nice
elegant lazy algorithms work beautifully. I'm doing external
algorithms which have to work on Gbs of data, which I can't hold in
memory. So the type signature of my merge is approximately:

type Reader t = IO (Maybe t)
type Writer t = t -> IO ()
merge :: [Reader t] -> Writer t -> IO ()
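
For illustration, a sketch of how that merge can go, using Data.Map as
the priority queue (not the production code; note the added Ord
constraint, and that duplicates are passed straight through here).
Reader and Writer are as defined above:

import qualified Data.Map as Map

merge :: Ord t => [Reader t] -> Writer t -> IO ()
merge readers writer = do
    firsts <- mapM prime readers
    go (Map.fromListWith (++) (concat firsts))
  where
    -- pull the next element from a reader, keyed for the heap
    prime r = do
        mx <- r
        return (case mx of Nothing -> []; Just x -> [(x, [r])])
    go heap = case Map.minViewWithKey heap of
        Nothing -> return ()
        Just ((x, rs), heap') -> do
            writer x
            refilled <- mapM prime rs
            go (foldr (\(k, v) m -> Map.insertWith (++) k v m)
                      heap' (concat refilled))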

Actually, the scope of the problem is such that I could *almost*
finesse the problem by reading the globs of data as ByteStrings, then
use lazy evaluation internally, and write the outputs

merge blobLocators writer = do
   lists <- fmap decode $ mapM readOffDisk blobLocators
   mapM_ writer (mergesort lists)

but I need to keep a firm lid on the resource usage, so I can't. 

cheers,
T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Parsec question

2007-06-21 Thread Thomas Conway

On 6/21/07, Jules Bean <[EMAIL PROTECTED]> wrote:

I would write

primary = PrimaryIdentifier `fmap` identifer
   <|> PrimaryLiteral`fmap` stringLiteral

(I prefer fmap to liftM but they are the same for monads). To my mind
this fits the general pattern of 'constructor comes before contents'.
with is, of course, just fmap with the parameters reversed.


Nice. I only discovered the joys of Control.Monad, fmap included,
relatively recently, well after I'd spent quite some time writing
parsec parsers.

I note that it was Parsec that converted me to Haskell from Mercury[*]
- I wrote a library of parser combinators for Mercury, but the lambda
notation in Mercury is not nearly so clean as Haskell's, so lacking
special syntax for monadic code, the results looked horrible. Much
faster - Mercury is a strict language, so it's easier to compile into
respectable machine code - but horrible to read.

cheers,
T.
[*] Well, that, and the fact that my thesis was about Mercury, which
meant that once I was finished, I never wanted to look at Mercury
again. ;-)
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Parsec question

2007-06-21 Thread Thomas Conway

On 6/21/07, Dave Tapley <[EMAIL PROTECTED]> wrote:

> primary = (identifier >>= (return . PrimaryIdentifier)) <|> (stringLiteral >>= (return . PrimaryLiteral))
> identifier = (many1 letter) >>= (return . Identifier)
> stringLiteral = (char '\'') >> (manyTill anyChar (char '\'')) >>= (return . StringLiteral)


I have found this a sufficiently common pattern that I have a little
combinator to tidy it up:

p `with` f = p >>= (return . f)

so I can write

primary = (identifier `with` PrimaryIdentifier) <|> (stringLiteral
`with` PrimaryLiteral)

Obviously you could write it in terms of liftM, choose a different name, &c, &c.

FWIW, the other little combinator I invariably use is

p `returning` x = p >>= (\_ -> return x)

which I end up calling with () as the second argument so often
(especially during development) that I usually have another combinator

void p = p >> return ()

YMMV,
T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Collections

2007-06-20 Thread Thomas Conway

On 6/21/07, Andrew Coppin <[EMAIL PROTECTED]> wrote:

Lennart Augustsson wrote:
> I don't think the collection type (a,b) is best thought of as a loop.

True. That's a rather special type; I haven't seen anything remotely
like it in any other language.


Is it that special? How is it different to the C++ STL std::pair
template type? I must be missing something.

--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Collections

2007-06-19 Thread Thomas Conway

On 6/20/07, Spencer Janssen <[EMAIL PROTECTED]> wrote:

 (I often miss priority queues)


I must say, there have been several times lately where I have wanted
such a type. Actually, I've found Data.Map isn't too bad if you use
deleteMin/deleteMax. What's more, deleteM{in,ax} & insert are both
O(log n) for most reasonable implementations that could be used for a
Data.Map-like structure, which works out much the same in many cases.

In particular, I find myself wanting to use a priority queue for
N-way sorted merge, which you can do with Data.Map: (compiles, so
clearly works even though I have not tested it. ;-) )

import Data.List as List
import Data.Map as Map

merge :: Ord t => [[t]] -> [t]
merge lists = merge' $ Map.fromListWith mergeTails $ concatMap makePair lists
    where
    makePair [] = []
    makePair (x:xs) = [(x,xs)]

-- two input lists can share a head; keep both tails by merging them
mergeTails xs ys = merge [xs, ys]

merge' heap
    | Map.null heap = []
    | otherwise = x : (merge' $ removeEqual x $ reinsert xs heap')
    where
    ((x,xs), heap') = deleteFindMin heap

reinsert [] heap = heap
reinsert (x:xs) heap = Map.insertWith mergeTails x xs heap

removeEqual x heap
    | Map.null heap = heap
    | x /= y    = heap
    | otherwise = removeEqual x $ reinsert ys heap'
    where
    ((y,ys), heap') = deleteFindMin heap


The other thing I have found myself doing often is using splitLookup
followed by union, though what I really want is "join" being the dual
of split - i.e. requiring all the keys in the rhs to be greater than
the keys in the lhs. My own AVL tree implementation has this operation
which is O(log n), which is rather better than union's O(n log n).

T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: String Hashing

2007-06-19 Thread Thomas Conway

On 6/19/07, apfelmus <[EMAIL PROTECTED]> wrote:
   [nice exposition of how you could do lazy look-ma-no-updates tries]

Clearly, I still think too much like a logic programmer. :-)

T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OS design & FP aesthetics

2007-06-18 Thread Thomas Conway

On 6/19/07, Creighton Hogg <[EMAIL PROTECTED]> wrote:

Okay, I remember seeing an example of this before , but I'm not sure if I
see what language based security Haskell's type system could provide in
protecting address spaces from each other.  Normally I've seen capabilities
used so that you can't access anything you can't name.  Can you elaborate a
little?


Nothing new here. Haskell might be more elegant than some of the
earlier proposals, but the idea is old. FWIW, my PhD supervisor,
Zoltan Somogyi did his PhD thesis on exactly such a scheme, using
logic programming, rather than lazy functional programming. I'm not
sure if the thesis itself is online.

cheers,
T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: String Hashing

2007-06-18 Thread Thomas Conway

On 6/19/07, apfelmus <[EMAIL PROTECTED]> wrote:

 Trie it is,
 not balanced tree.
 A logarithm in this
 would be new to me. :)


True enough, my braino.


As a side node, Mr. Exp says: 64 is large enough for the size needs of
any logarithm.


Que?


> type HashTable k v = TVar (Array Int (TVar [(k,v)]))

Don't you want a TArray Int [(k,v)]?


Essentially the same.


In any case, you could be able to set up an infinite trie and have lazy
evaluation allocate space as needed:

 data Trie a = Branch (TVar a) (Trie a) (Trie a)


Tree-like structures are quite hostile to highly concurrent manipulation.

It helps to introduce TVar indirections at each level:

data Trie a = Branch (TVar a) (TVar (Trie a)) (TVar (Trie a))

Then you can update a subtree without having to modify the spine of the tree.

There is some very fine work on this by Kim Larsen (and others), see
for example http://citeseer.ist.psu.edu/2986.html

T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] String Hashing

2007-06-18 Thread Thomas Conway

On 6/19/07, Jan-Willem Maessen <[EMAIL PROTECTED]> wrote:

This looks like a version of the Bob Jenkins hash function from
burtleburtle.net.  I implemented the 32-bit version of this as follows:


Indeed. It's the 64-bit version. 32 bits is oh-so-last-century. ;-)


mix :: Int32 -> Int32 -> Int32 -> (Int32 -> Int32 -> Int32 -> a) -> a

   [deletia]

I mention this because your code writes the whole thing out
longhand---which might be faster, or might not, but certainly misses
the highest-level structural patterns in the original.  Your use of a
data type to represent triples is probably better nowadays than my
rather quirky use of CPS (in other words, this could have been a
function Triple -> Triple instead of the rather odd type you see above).


Well, the main difference is that the CPS version just folds the uses of
(.) into the individual groups of arithmetic. Actually, without
knowing what GHC *actually* does, it is conceivable that a compiler
could do better with the CPS version, precisely because there's one
less layer of abstraction to inline/fold away. I'll have to give it a
go if I get a chance (this is code for my Real Job (TM), and tuning
the life out of the code isn't necessary right now, but I thought I'd
float this, as much because I might learn something, as anything. The
thinkon flux in this list is pretty favourable).


That said, I assume you instrumented your code and determined that
hash collisions are actually a bottleneck for you, and that a hash
table is the right structure to begin with?


I'm implementing a species of database (trade secrets, blah, blah,
blah), which needs to handle *large* data sets, and actually, an
external B-tree is probably better than an external hash table. I
decided to do a hash table first though to iron out some of the issues
to do with concurrent external structures. A linear hash table is
pretty simple compared to a B-tree. The Jenkins' hash function comes
into it because you really want to avoid overfull buckets.

It's also one of those cases, where you'd like the compiler to do a
good job. If the compiler can't do a good job of straight line
operations on [essentially] built in data types, then what hope have
we of convincing anyone, including ourselves, that Haskell is fit for
Real Programs (TM).


 When last I
checked the result was faster than Data.Map, but not by much, and
using strings probably wipes out that advantage vs. tries.



   No Strings, darling! No Strings.


cheers,
T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: String Hashing

2007-06-18 Thread Thomas Conway

On 6/18/07, apfelmus <[EMAIL PROTECTED]> wrote:

Do you need the hash function for a hash table or for
fingerprints/signatures? In the former case, Tries are a much better
choice. For launching your own trie, see also


I'm actually using them for bucket addressing for external indexing
with a linear hash table. (Yes, the hashing does count, because
buckets are cached in memory.)

Actually, if one wants a concurrent dictionary, using something in the vein of

type HashTable k v = TVar (Array Int (TVar [(k,v)]))

has very good performance. It always seems something of a shame that
if you want all the benefits of lazy functional programming, you too
often have to settle for O(n log n) data structures.
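
For concreteness, a stripped-down sketch of that structure (fixed
size, no resizing; the hash function is passed in, and all the names
here are made up for the example). Updates on different buckets only
share the read of the outer TVar, so they rarely conflict:

import Control.Concurrent.STM
import Control.Monad (replicateM)
import Data.Array

type HashTable k v = TVar (Array Int (TVar [(k, v)]))

newTable :: Int -> STM (HashTable k v)
newTable n = do
    buckets <- replicateM n (newTVar [])
    newTVar (listArray (0, n - 1) buckets)

bucketFor :: (k -> Int) -> Array Int (TVar [(k, v)]) -> k -> TVar [(k, v)]
bucketFor hash tab k = tab ! (hash k `mod` (hi + 1))
    where (_, hi) = bounds tab

insert :: Eq k => (k -> Int) -> HashTable k v -> k -> v -> STM ()
insert hash tabVar k v = do
    tab <- readTVar tabVar
    let bVar = bucketFor hash tab k
    bucket <- readTVar bVar
    writeTVar bVar ((k, v) : filter ((/= k) . fst) bucket)

lookupHT :: Eq k => (k -> Int) -> HashTable k v -> k -> STM (Maybe v)
lookupHT hash tabVar k = do
    tab <- readTVar tabVar
    bucket <- readTVar (bucketFor hash tab k)
    return (lookup k bucket)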


Incidentally, while I've got your attention, I note that if you use a
good quality hash function like the one I ripped off, you don't need
to use [mostly] prime numbers for sizing your hash tables, and you can
use powers of two instead, which simplifies a bunch of things. This is
kind of obvious when you think about it, but every hash function I
came across as an undergraduate or even as a post-grad, with the
exception of md5 et al, was not good.  The dogma was that *good* hash
functions are too expensive for everyday use. So a word of advice to
all you worthy tertiary educators - this is not the 1970s any more -
good, cheap hash functions do exist, so update your course notes. :-)


We return you now to your normal haskell programming....

cheers,
T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] String Hashing

2007-06-17 Thread Thomas Conway

FWIW, here's a link to the original c code:
   http://www.burtleburtle.net/bob/hash/evahash.html

--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] String Hashing

2007-06-17 Thread Thomas Conway

Hi All,

I'm trying to figure out how to maximum performance out of one of my
inner loops which involves string hashing.

Consider the following hash function, which is a transliteration of a
good one written in C:

--8x--8x--8x--8x--8x--8x--8x--8x--8x
module HashStr where

import Data.Bits
import Data.ByteString as BLOB
import Data.Word

data Triple = Triple !Word64 !Word64 !Word64

hashStr :: ByteString -> Word64
hashStr str = hashBlock (Triple gold gold gold) str
   where
   gold = 0x9e3779b97f4a7c13

hashBlock (Triple a b c) str
    | BLOB.length str > 0 = hashBlock (mix (Triple a' b' c')) rest
    | otherwise           = c
    where
    a' = a + BLOB.foldl' make 0 (slice 0)
    b' = b + BLOB.foldl' make 0 (slice 1)
    c' = c + BLOB.foldl' make 0 (slice 2)
    make x w = (x `shiftL` 8) + fromIntegral w
    slice n = BLOB.take 8 $ BLOB.drop (8 * n) str
    rest = BLOB.drop 24 str

    mix :: Triple -> Triple
    mix = (\(Triple a b c) -> Triple (a - b) b c) .
          (\(Triple a b c) -> Triple (a - c) b c) .
          (\(Triple a b c) -> Triple (a `xor` (c `shiftR` 43)) b c) .
          (\(Triple a b c) -> Triple a (b - c) c) .
          (\(Triple a b c) -> Triple a (b - a) c) .
          (\(Triple a b c) -> Triple a (b `xor` (a `shiftL` 9)) c) .
          (\(Triple a b c) -> Triple a b (c - a)) .
          (\(Triple a b c) -> Triple a b (c - b)) .
          (\(Triple a b c) -> Triple a b (c `xor` (b `shiftR` 8))) .
          (\(Triple a b c) -> Triple (a - b) b c) .
          (\(Triple a b c) -> Triple (a - c) b c) .
          (\(Triple a b c) -> Triple (a `xor` (c `shiftR` 38)) b c) .
          (\(Triple a b c) -> Triple a (b - c) c) .
          (\(Triple a b c) -> Triple a (b - a) c) .
          (\(Triple a b c) -> Triple a (b `xor` (a `shiftL` 23)) c) .
          (\(Triple a b c) -> Triple a b (c - a)) .
          (\(Triple a b c) -> Triple a b (c - b)) .
          (\(Triple a b c) -> Triple a b (c `xor` (b `shiftR` 5))) .
          (\(Triple a b c) -> Triple (a - b) b c) .
          (\(Triple a b c) -> Triple (a - c) b c) .
          (\(Triple a b c) -> Triple (a `xor` (c `shiftR` 35)) b c) .
          (\(Triple a b c) -> Triple a (b - c) c) .
          (\(Triple a b c) -> Triple a (b - a) c) .
          (\(Triple a b c) -> Triple a (b `xor` (a `shiftL` 49)) c) .
          (\(Triple a b c) -> Triple a b (c - a)) .
          (\(Triple a b c) -> Triple a b (c - b)) .
          (\(Triple a b c) -> Triple a b (c `xor` (b `shiftR` 11))) .
          (\(Triple a b c) -> Triple (a - b) b c) .
          (\(Triple a b c) -> Triple (a - c) b c) .
          (\(Triple a b c) -> Triple (a `xor` (c `shiftR` 12)) b c) .
          (\(Triple a b c) -> Triple a (b - c) c) .
          (\(Triple a b c) -> Triple a (b - a) c) .
          (\(Triple a b c) -> Triple a (b `xor` (a `shiftL` 18)) c) .
          (\(Triple a b c) -> Triple a b (c - a)) .
          (\(Triple a b c) -> Triple a b (c - b)) .
          (\(Triple a b c) -> Triple a b (c `xor` (b `shiftR` 22)))

--8x--8x--8x--8x--8x--8x--8x--8x--8x

Obviously, we'd like all those lambdas and composes to be inlined,
along with all the intermediate Triple structures. So, how do you
convince ghc to do this? Alternatively, how would you *translate*,
rather than transliterate, the mix function?
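
Perhaps something like this (untested - just naming each intermediate
value, following the C source's execution order):

mix' :: Triple -> Triple
mix' (Triple a0 b0 c0) = Triple a4 b4 c4
    where
    a1 = (a0 - b0 - c0) `xor` (c0 `shiftR` 43)
    b1 = (b0 - c0 - a1) `xor` (a1 `shiftL` 9)
    c1 = (c0 - a1 - b1) `xor` (b1 `shiftR` 8)
    a2 = (a1 - b1 - c1) `xor` (c1 `shiftR` 38)
    b2 = (b1 - c1 - a2) `xor` (a2 `shiftL` 23)
    c2 = (c1 - a2 - b2) `xor` (b2 `shiftR` 5)
    a3 = (a2 - b2 - c2) `xor` (c2 `shiftR` 35)
    b3 = (b2 - c2 - a3) `xor` (a3 `shiftL` 49)
    c3 = (c2 - a3 - b3) `xor` (b3 `shiftR` 11)
    a4 = (a3 - b3 - c3) `xor` (c3 `shiftR` 12)
    b4 = (b3 - c3 - a4) `xor` (a4 `shiftL` 18)
    c4 = (c3 - a4 - b4) `xor` (b4 `shiftR` 22)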

--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Implementing Mathematica

2007-06-14 Thread Thomas Conway

 You have brought up prolog, unification, etc .. and knowing this is the
Haskell board, just wondering what anyones thoughts on the hybrid haskell
based language CURRY, for these kind of problems.  It seems that it's
development is stalled... and sorry ahead of time if I am wrong on that
point.


In a previous life, I was a logic programming zealot, and looked at
curry. As far as I was concerned, it shared a significant problem with
most logic programming work: the designers had a bit of a slack
attitude to semantics. The key problem was that it had
non-deterministic functions so you could write (haskell syntax):
    main = do
        if f 42 /= f 42
            then putStr "Look ma, no referential transparency\n"
            else return ()

and expect the putStr could get executed.

For reference, it's not a problem in Prolog (if you overlook IO being
done with side effects ;-)) because the variables are explicit, and
not just a notational convenience as they are in lambda calculus:

main :-
   f(42,X), f(42,Y), ....

cheers,
T.
--
Dr Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Switch optimization

2007-06-10 Thread Thomas Conway

Hi All,

I'm writing some variable byte codec routines, which are used in
inner^N loops, and which I want to make really efficient. There are
several spots where the code uses lookup tables.

The best example is the table, which given the first byte, returns the
number of additional bytes. It is a precomputed version of the
following function:


codeLength :: Word8 -> Int
codeLength w
 | w .&. 0x80 == 0   = 0
 | otherwise = 1 + (codeLength $ w `shiftL` 1)


from which we construct a 256 entry table:

codeLen 0 = 0
codeLen 1 = 0
codeLen 2 = 0
...
codeLen 127 = 0
codeLen 128 = 1
...
codeLen 255 = 8

Now, compiling with ghc 6.6.1 and -O3, I see that it generates a long
chain of conditional branches. Even taking laziness into account,
this seems like terribly inefficient code. I wrote this thinking it
would be more efficient than constructing a CAF array:

codeLens = listArray (0,255) (map codeLength [0..255])

but I'm guessing the CAF version would probably work out much more
efficient in the long run.
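
For reference, the array version spelled out (a sketch, reusing
codeLength from above):

import Data.Array
import Data.Word

codeLens :: Array Word8 Int
codeLens = listArray (0, 255) (map codeLength [0 .. 255])

codeLen' :: Word8 -> Int
codeLen' w = codeLens ! w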

However, I think ghc (and any other compiler) should detect the
following properties:

1. all the clauses are mutually exclusive, so the sequencing is
irrelevant to the semantics

2. Given an explicit type declaration with domain Word8, the 256 cases
cover all the possible constructors of the type, so there are no
missing clauses.

I would have expected the generated code to have the form:

codeLen:
    x <- pick up arg
    return closure (codeLen' x)

codeLen':
    x' <- force x
    update precomputed-table[x']

Even if you leave out property 2 and include bounds checks, this seems
like an important kind of function to optimize well. So what have I
missed, or is it time to learn how to hack on ghc?

T.
--
Thomas Conway
[EMAIL PROTECTED]

Silence is the perfectest herald of joy:
I were but little happy, if I could say how much.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Why Perl is more learnable than Haskell

2007-04-11 Thread Thomas Conway

On 4/11/07, riccardo cagnasso <[EMAIL PROTECTED]> wrote:

If you first language is LISP probably you find easy Haskell and difficult
pearl.


I must say I agree here. I spent 10 years programming in prolog before
I tried haskell. Most of my problems with haskell are because it has a
rather opaque performance model (e.g. when should you use tail
recursion, and when should you not). But I happily acknowledge that my
experience is probably atypical. ;-)

cheers,
T.
--
Dr Thomas Conway  You are beautiful; but learn to work,
[EMAIL PROTECTED] for you cannot eat your beauty.
 -- Congo proverb
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] RE: What I learned from my first serious attempt low-level Haskell programming

2007-04-08 Thread Thomas Conway

| 6. The inliner is a bit too greedy.  Removing the slow-path code from
|singleton doesn't help because popSingleton is only used once; but
|if I explicitly {-# NOINLINE popSingleton #-}, the code for
|singleton itself becomes much smaller, and inlinable (15% perf
|gain).  Plus the new singleton doesn't allocate memory, so I can
|use even MORE realWorld#s.

That's a hard one!  Inlining functions that are called just once is a huge win 
usually. I don't know how to spot what you did in an automated way.


Yeah. We found this to be an issue with the Mercury compiler. We
processed functions (well, okay predicates, in the case of Mercury) in
dependency order. We experimented with top down and bottom up order.
Bottom up inlining is great for eliminating all the little access and
convenience functions one writes, and top down gets the case above (at
least most of the time). IIRC, our experiments showed that overall,
bottom up inlining performed significantly better than top down, or
arbitrary order.

Bottom up inlining worked really well round the leaves because it
frequently replaced a call (requiring register saves, etc) with
structure packing/unpacking which didn't require register
saves/restores. Thus it eliminated calls altogether. It is also
advantageous when it allows producers and consumers to be merged,
eliminating memory allocations (as noted above). That said, I had
better point out that Mercury is strict, which simplifies things
rather.

Andrew Appel's code generator that used dynamic programming to select
between different generated code sequences comes to mind as potential
inspiration for a super-duper inliner.

cheers,
Tom
--
Dr Thomas Conway  You are beautiful; but learn to work,
[EMAIL PROTECTED] for you cannot eat your beauty.
 -- Congo proverb
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] idea for avoiding temporaries

2007-03-08 Thread Thomas Conway

On 3/9/07, David Roundy <[EMAIL PROTECTED]> wrote:

Nothing is being done concurrently, so I don't see what STM would gain us.
What is it that you're thinking we could gain from STM?


Its shiny and new, so it will make your code look sexy? :-)

So what happened to linear types? I remember reading Once Upon A Type
and Linear Types can Change the World during my hons year (and that
was more than 10 years ago). Mercury uses linear modes which are much
the same.

One of the arguments Fergus Henderson made was that in the case of
arrays, the update cost if you have to copy is too much, so you use
linear types/modes to make it impossible to do so. While you can
debate the pros and cons, the fact that a trivial "mistake" in your
program could change it from using in-place update to using copying
does lend weight to the argument for distinguishing the copying and
non-copying forms at the type level.

cheers,
T.
--
Dr Thomas Conway  You are beautiful; but learn to work,
[EMAIL PROTECTED] for you cannot eat your beauty.
 -- Congo proverb
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] TVars & throw

2007-03-08 Thread Thomas Conway

On 3/8/07, Chris Kuklewicz <[EMAIL PROTECTED]> wrote:

What happens in your throw/catch case if I have
  stm1 = do some_stm_code_that_throws_your_exception
  stm2 = return Foo
and I run "atomically (stm1 `orElse` stm2)" ?
Answer: The exception will prevent running stm2.


In the *specific* case of the external dictionary code, there is no
orElse case to worry about since the algorithms are deterministic in
this sense. For a more general case, it is an interesting problem to
consider. You might want something like
throwIfNothingElseSeemsLikeAGoodIdea, so that you can attempt any
alternatives, and if none succeed then throw (and I guess choose the
first if more than one alternative wants to throw - I like
determinism).


> Um, is
>unsafeIOToSTM $ atomically trans
> going to run you into problems?

YES!


My point is that onRetry really wants to be not "IO t" but
"IO_except_STM t", but there's no way of ensuring that, which
unfortunately antagonizes modularity in the same kind of way that
locks do. It requires you to know how the IO actions you use are
implemented, in case they use STM inside. Since the actions
may come out of a library somewhere, you need to know how the library
is implemented.

 Why are tricky problems tricky? I thought Haskell was supposed
to make everything easy. ;-)

T.
--
Dr Thomas Conway  You are beautiful; but learn to work,
[EMAIL PROTECTED] for you cannot eat your beauty.
 -- Congo proverb
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] lhs2TeX

2007-03-07 Thread Thomas Conway

Hi all,

I noticed that in the polycode mode, lhs2TeX produces odd looking
output for hex constants.
E.g.
\begin{code}
wibble = 0x7F
\end{code}

It produces a nice little gap between the 0 and the x. Inspecting the
latex source, it even goes to some effort to do so:
\mathrm{0}\;\Varid{x7F}

I couldn't see anything in the manual to show me how.

Anyone?
--
Dr Thomas Conway  You are beautiful; but learn to work,
[EMAIL PROTECTED] for you cannot eat your beauty.
 -- Congo proverb
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] TVars & throw

2007-03-07 Thread Thomas Conway

Hi All,

Consider the following:

foo = do
    v <- newTVar "hi there!"
    throwDyn v

main = do
    catchDyn (atomically foo) $ \v -> do
        x <- atomically (readTVar v)
        putStr x


I.e. throw information that gets rolled back from inside a
transaction, catch it and use it.

This looks bad to me. I assume it actually works, but should it?

T.
--
Dr Thomas Conway  You are beautiful; but learn to work,
[EMAIL PROTECTED] for you cannot eat your beauty.
 -- Congo proverb
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Usage of . and $

2007-03-07 Thread Thomas Conway

On 3/7/07, mm <[EMAIL PROTECTED]> wrote:

> f . g . h $ x

Alternativly,

(f . g . h) x

will work, too.


It always irks me that you don't actually save any horizontal space
using $. That is,
   (e) x
has the same number of characters (incl spaces) as
   e $ x

T.
--
Dr Thomas Conway  You are beautiful; but learn to work,
[EMAIL PROTECTED] for you cannot eat your beauty.
 -- Congo proverb
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] weak references and sharing

2007-03-06 Thread Thomas Conway

Hi All,

I have the situation where I have something like the following:

type Foo = TVar Int

data Clasp t = Clasp t

makeClasp x f
    = do
    let c = Clasp x
    mkWeakRef c (Just (f x))
    return c

finis p
    = do
    atomically $ do
        x <- readTVar p
        writeTVar p (x - 1)

... makeClasp p finis ...

The Foo object has a greater lifetime than the Clasp.

The bit I want you to focus on is the three lines:
   let c = Clasp x
   mkWeakRef c (Just (f x))
   return c

We can only be sure that (f x) will be called after we let go of the
returned value if we can be sure that the language implementation
preserves the (explicit) sharing of 'c' between the call to mkWeakRef
and return. In particular, if the implementation rewrote the fragment
as:
   mkWeakRef (Clasp x) (Just (f x))
   return (Clasp x)
then the finalizer might be called sooner than we expect. The problem
is that, as I read it, making a distinction between the two fragments
would be a violation of referential transparency.

So here's the question: is there a way of writing this that is
guaranteed to work on all correct implementations of Haskell?

cheers,
T.

ps FWIW, the actual situation is that we have a TVar pointing to a
record representing a 'page' of an external data structure. For some
operations we need to make sure the the page is in-memory, but when
all the operations using a particular page are finished, we'd like to
flush it back to disk if it has been modified. Now it'd be really nice
to use the garbage collector to tell us when noone is using the
in-memory version of the page, so we can write it out. However, if we
can't be certain that we've got a genuine hard-reference to the
object, then we can't be sure it will not be prematurely flushed.
--
Dr Thomas Conway  You are beautiful; but learn to work,
[EMAIL PROTECTED] for you cannot eat your beauty.
 -- Congo proverb
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: weak references & caches

2007-02-08 Thread Thomas Conway

So I just coded up the approach. It compiles, so I assume it works as
intended. ;-)

Comments and clever obervations extremely welcome.

cheers,
T.
module W where

import Control.Concurrent.STM
import qualified Data.Map as M
import System.Mem.Weak
import GHC.Conc

makePtr tabPtr a
    = do
    tab <- readTVar tabPtr
    case M.lookup a tab of
        Nothing -> allocPtr tabPtr tab a
        Just wp -> do
            mp <- unsafeIOToSTM (deRefWeak wp)
            case mp of
                Nothing  -> allocPtr tabPtr tab a
                Just ptr -> return ptr
  where
    allocPtr tabPtr tab a = do
        ptr <- newTVar (Just a, Nothing)
        wp <- unsafeIOToSTM (mkWeakPtr ptr Nothing)
        let tab' = M.insert a wp tab
        writeTVar tabPtr tab'
        return ptr
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] weak references & caches

2007-02-08 Thread Thomas Conway

Hi All,

I'm hacking some (external) B-Tree code, and amongst the numerous
interesting problems I've come up against[*], is to do with managing
which pages/nodes are in memory and which are not.

We use TVars to point from one (in-memory version of a) page to
another, so we have a type like the following:

data Node
   = Leaf [(Key,Value)]
   | Node NodePtr [(Key,NodePtr)]

type NodePtr = TVar (Maybe Address, Maybe Node)

type Address = Integer

Addresses are just page numbers or file offsets, or similar.

NodePtr is the interesting bit - the following table summarizes what
the combination of values mean:

explain (Just addr, Nothing)   = "An external node we have not read in yet"
explain (Nothing, Just node)   = "A new node for which we have not yet allocated a page"
explain (Just addr, Just node) = "A node we've read in"
explain (Nothing, Nothing)     = "Emit nasal daemons"

Now, we want a cache to hang on to nodes that we've read in to hold
them in memory, to save repeatedly reading them off disk
unnecessarily. But more than that, we want to make sure that when we
create a NodePtr pointing to a particular Address, that all references
to this particular address go through the same NodePtr, otherwise in a
concurrent situation we could well end up with two copies of the page
in memory, both receiving updates, leading to havoc. At the same time,
we'd like to be able to evict pages according to some policy (the
Oracle hot-list cold-list cache is attractive, if implemented
carefully).

So one scheme that comes to mind is to keep a table mapping addresses
to (Weak NodePtr). When we ask for a node, first we consult the cache
(if we have one) and get the NodePtr right back. If the addressed node
is not in the cache then we check the table to see if there is/was a
known NodePtr for that address. If there is none then we can allocate
a new one and stick it in the table, and if it is there then we return
it.  This is also nice because we can attach finalizers to the (Weak
NodePtr)s to write them out if dirty.

So the question is, is there a simpler scheme (though this one isn't
too bad), and more importantly, is this likely to be horribly slow?
How efficient is dereferencing a weak reference, and what is the
likely impact on performance of having 10^3-10^5 weak references
around?

cheers,
Tom
[*]Film at 11. I'm planning to write it all up, maybe for a
conference, maybe just for the Monad Reader, or something.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] casting

2006-10-10 Thread Thomas Conway

On 10/11/06, Bulat Ziganshin <[EMAIL PROTECTED]> wrote:


it's a sort of problem that bites me many times when i start to wrote
Streams library :)  although you said that you discovered the
dictionaries mechanism, i propose you to read
http://haskell.org/haskellwiki/OOP_vs_type_classes page where you can
find something that you don't yet know


Yes, I did actually know this stuff, it was just paged out, and I
posted before thinking. :-)

Something to do with having two small kids, one of whom gets up
way too early in the morning (i.e. 5 - 5:30am). :-)

In theory, however, there is no particular reason why a runtime type
cast couldn't be implemented to allow downcasting, but it does require
the runtime system to have a repository of instance declarations, or
associate with static data for a type the classes of which it is an
instance. It's been a few years, but if my recollection is correct,
the Mercury runtime system does this.

T.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Vertical tabs in source code and other obscure chars

2006-10-09 Thread Thomas Conway

Mostly hysterical raisins, I think.

T.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] casting

2006-10-09 Thread Thomas Conway

Thanks Misha & Matthias.

I now get what's going on. The mention of the word "dictionary"
revealed it all. I've spent the last 7 years programming in C++, and
had dynamic_cast<> firmly fixed in my head. I totally forgot that
Fergus Henderson and I independently reinvented dictionary passing for
the implementation of Mercury circa 1995. :-)

I also realized that the concrete signatures from my actual
application don't quite satisfy the requirements anyway, and that the
problem evaporates when I get the signatures right. :-)

T.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: casting

2006-10-09 Thread Thomas Conway

On 10/9/06, I wrote:

So, can anyone suggest how I can achieve my goal?


And how many milliolegs of type hackery will it take? ;-)

Tom
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] casting

2006-10-09 Thread Thomas Conway

Hi All

I'm having some difficulty with typeclasses.

What I'm trying to do should be obvious, but it's still giving me
trouble. I want to take a packaged item, and strengthen the
constraints on its type. Rather than being just any type that is an
instance of A, I want to do a runtime check and do something different
if it is a type that is also an instance of B (which implies that it
is an instance of A also).

I've attached some code that attempts this, but ghci (er, 6.4.2 with
-fglasgow-exts) complains:

Compiling Main ( Cast.hs, interpreted )

Cast.hs:15:22:
   Ambiguous type variable `b' in the constraint:
 `B b' arising from use of `qux' at Cast.hs:15:22-24
   Probable fix: add a type signature that fixes these type variable(s)
Failed, modules loaded: none.

So, can anyone suggest how I can achieve my goal?

thanks,
Tom


Cast.hs
Description: Binary data
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] layout monads

2006-10-03 Thread Thomas Conway

Hi All,

Next monad query [*]

In the 1995 paper "Composing Haggis", layout is done using a monad to
compose individual elements. To modernize the syntax consider (forgive
the operator, but it avoids parentheses):

infixl 1 <|
f <| x = f x

mylayout
    = do
    hbox <| do
        button "ok"
        button "cancel"

It was nice, because you didn't need to worry about temporary
structures, variable names, and suchlike.

In the 1996 version of the paper, this silently changed to

mylayout
    = do
    o <- button "ok"
    c <- button "cancel"
    hbox [o, c]

Does anyone know why the change occurred? IMO, the former was much more elegant.

Also, assuming you want to use a different underlying monad (e.g. IO),
how would you implement a layout monad?
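
One sketch that occurs to me (not from the paper -- Widget, button,
and hbox here are stand-ins of my own) is to use Writer, or WriterT
over IO, to collect children implicitly:

import Control.Monad.Writer

data Widget = Button String | HBox [Widget]
    deriving Show

type Layout = Writer [Widget]

button :: String -> Layout ()
button s = tell [Button s]

-- Run the body, then replace its emitted children with one HBox node.
hbox :: Layout a -> Layout a
hbox = censor (\ws -> [HBox ws])

mylayout :: Layout ()
mylayout =
    hbox $ do
        button "ok"
        button "cancel"

-- execWriter mylayout  ==>  [HBox [Button "ok",Button "cancel"]]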

FWIW, I'm not actually interested in widgets per se, but in building
html pages in such a way that you can specify the style
elements/scripts with the code that creates the nodes of the document
tree, but serialize them separately.

Tom
* Maybe we need a [EMAIL PROTECTED] list. On second thoughts,
no, because it would kill off the other mailing lists. ;-)


[Haskell-cafe] Either as a Monad instance

2006-10-02 Thread Thomas Conway

Hi All,

I've been [trying to] grapple with the various monads and
transformers, and it occurs to me that the standard instance for
Either as a monadic type is unnecessarily restrictive. Is there a
compelling reason that it is not just

instance Monad (Either e) where
    return = Right
    Left e  >>= _ = Left e
    Right x >>= f = f x

abort :: e -> Either e a
abort = Left
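
For example, the sort of thing it would permit (safeDiv is a made-up
example, relying on the instance and abort above):

safeDiv :: Integer -> Integer -> Either String Integer
safeDiv _ 0 = abort "division by zero"
safeDiv x y = return (x `div` y)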

Tom


[Haskell-cafe] Anonymous types

2006-09-14 Thread Thomas Conway

Hi all,

Is there any deep and meaningful reason why Haskell doesn't have
anonymous discriminated union types?

I'm thinking of an example like:

data Amount = Amount Integer (Mg|G|Kg|T)

Now this particular case is perhaps unconvincing - a separate Units
type would be quite sensible; however, I'm thinking of translating
ASN.1 definitions into Haskell. ASN.1 allows SEQUENCE (record types)
and CHOICE (discriminated unions) as a kind of type constructor.
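
(For contrast, the named form I'd rather not have to write everywhere:

-- One freshly invented name per anonymous CHOICE.
data Units  = Mg | G | Kg | T
data Amount = Amount Integer Units
)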

Not having to invent names for anonymous SEQUENCE and CHOICE types
would be nice.

T.


Re: [Haskell-cafe] Bit string

2006-09-14 Thread Thomas Conway

On 9/15/06, Lennart Augustsson <[EMAIL PROTECTED]> wrote:

It's hard to tell what the best representation is if you don't know
what the operations that you are going to perform are.  If all you
are going to do is I/O of bitstrings, then [Bool] could be great.  If
you need to do bitwise boolean ops then Integer is a wise choice.


And is I/O of bitstrings represented as Integer going to be
significantly worse than [Bool]? It is certainly a much denser
representation. Actually, the main operations are probably tests (is a
certain bit set) and encoding and decoding (to and from BER, PER,
etc.).

The operations that need to be efficient are:

-- test to see if the Nth bit is set
testBit :: BitString -> Integer -> Bool

-- set the Nth bit
setBit :: BitString -> Integer -> BitString

-- clear the Nth bit
clearBit :: BitString -> Integer -> BitString

I can see the potential for creating large intermediate Integers (e.g.
1 `shift` n) if the number of bits is large.
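
For what it's worth, a minimal sketch of the Integer-backed version,
implementing exactly those signatures on top of Data.Bits (imported
qualified to avoid name clashes; Data.Bits indexes with Int, hence
the fromIntegral):

import qualified Data.Bits as B

newtype BitString = BitString Integer

testBit :: BitString -> Integer -> Bool
testBit (BitString n) i = B.testBit n (fromIntegral i)

setBit :: BitString -> Integer -> BitString
setBit (BitString n) i = BitString (B.setBit n (fromIntegral i))

clearBit :: BitString -> Integer -> BitString
clearBit (BitString n) i = BitString (B.clearBit n (fromIntegral i))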

T.


[Haskell-cafe] Bit string

2006-09-14 Thread Thomas Conway

Hi all,

I'm doing some work with ASN.1, and I'm thinking about how to
represent the type "BIT STRING".

The obvious, but inefficient, representation would be

type BitString = [Bool]


A reasonable sparse representation might be

type BitString = [Integer]

where the integers represent the positions of the non-zero bits, but
other implementations (e.g. C & C++ based ones) mostly seem to use
dense representations, and there are certain predictability advantages
to be had by following existing implementations in this regard.

But given that a value of type BIT STRING is allowed to have leading 0
bits, which have no semantic effect, another representation would be

type BitString = [Word64]

(I'm a modern type, and believe in 64-bit computing ;-), but you could
use Word32 if you liked).

However, after a few moments' consideration, I realized that if I was
going to use a representation like that, I could probably use

type BitString = Integer

which already has [I assume] a carefully optimized implementation.
Also, it ends up being significantly more strict than a list
representation, which is probably a good thing in most situations.

My question for all present is: have I missed a problem with using
Integer, or have I overlooked a better representation?

T.


Re: [Haskell-cafe] C++ class = neutered (haskell class + haskell existential)

2006-08-20 Thread Thomas Conway

On 8/20/06, John Meacham <[EMAIL PROTECTED]> wrote:

C++ templates are a whole nother ball of wax.


And that's putting it politely. ;-)

T.


Re: [Haskell-cafe] C++ class = neutered (haskell class + haskell existential)

2006-08-17 Thread Thomas Conway

On 8/18/06, John Meacham <[EMAIL PROTECTED]> wrote:
   [lots of good argument before and after deleted]


There is a major difference though: in C++ (or Java, or Sather, or C#,
etc.) the dictionary is always attached to the value, the actual class
data type you pass around. In Haskell, the dictionary is passed
separately and the appropriate one is inferred by the type system. C++
doesn't infer; it just assumes everything will be carrying around its
dictionary with it.


C++ programmers deal with this using a number of techniques, mostly
involving templates.

Actually, there is one technique using C++ templates that I really
want to see going mainstream in the Haskell implementations.
Existential types are already there; now I want to see associated
types (trait types in C++). Maybe I've been doing too much C++
programming in the last few years, but a lot of the time when I end
up using multiparameter type classes, what I really want is an
associated type. For example:

class Monad s => Store s where
    type Key s
    insert    :: Binary -> s (Key s)
    retrStore :: Key s -> s Binary
    ...

so that part of the instance is a choice of the key type.
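
To make that concrete under GHC's type families, a toy instance might
look like this (MemStore and Binary = String are stand-ins of mine,
not from any real store):

{-# LANGUAGE TypeFamilies, GeneralizedNewtypeDeriving #-}
import Control.Monad.State
import qualified Data.Map as Map

type Binary = String  -- stand-in for an opaque blob type

class Monad s => Store s where
    type Key s
    insert    :: Binary -> s (Key s)
    retrStore :: Key s -> s Binary

-- A toy in-memory store: Int keys handed out sequentially.
newtype MemStore a = MemStore (State (Int, Map.Map Int Binary) a)
    deriving (Functor, Applicative, Monad)

instance Store MemStore where
    type Key MemStore = Int
    insert b = MemStore $ do
        (n, m) <- get
        put (n + 1, Map.insert n b m)
        return n
    retrStore k = MemStore $ gets (\(_, m) -> m Map.! k)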

For those who are interested, I'm sure the relevant papers are readily
available on citeseer/Google. :-)

Tom


Re: [Haskell-cafe] How can we detect and fix memory leak due to lazyness?

2006-08-07 Thread Thomas Conway

Perhaps your instances will work correctly with this data declaration?


Perhaps it might.  But that misses an important point.

The biggest impediment to developing large robust applications with
Haskell is the opacity of its performance model.  Haskell is fantastic
in very many ways, but this is a really serious difficulty.  I can
make a seemingly slight change to my program and the performance
changes dramatically.  What's worse, the connection between the cause
of the blowup and the place where it is observed can often be quite
subtle [*].

There's a classic example of two one-line Haskell programs, one of
which uses O(1) stack space and the other O(n) stack space, even
though they compute the same result, and which are so similar you
have to stare at them for five minutes before you can spot the
difference.
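
The pair I have in mind is presumably something like this, with the
tick mark being the five-minute difference:

import Data.List (foldl')

sum1, sum2 :: [Integer] -> Integer
sum1 = foldl  (+) 0   -- lazy accumulator: builds an O(n) chain of thunks
sum2 = foldl' (+) 0   -- strict accumulator: O(1) space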

Hughes' "Why functional programming matters" argues [rightly] that
lazy FP provides a better "glue", allowing greater abstraction at the
semantic level. The flip side, which IIRC he doesn't mention, is the
opacity of the performance model.

Here's a question for the experts.  What generalizations can I make
about the performance of lazy functions under composition? In
particular, if all my individual functions are well behaved, will the
program as a whole be well behaved?

cheers,
Tom
[*] Gosh, this is beginning to sound like a diatribe on the evils of
pointers and manual memory management in C. Interesting.