Re: unique identifiers as a separate library

Isaac Dupree Wed, 24 Dec 2008 07:47:17 -0800

Iavor Diatchki wrote:

 Also, your
implementation of it could be more efficient: it doesn't need to do locking,
so I suggest modifyIORef rather than atomicModifyIORef (Actually you'll have
to use readIORef >>= writeIORef >> return, instead, because modifyIORef has
a different type than atomicModifyIORef).


I don't think that that's quite right.  I was thinking that it should
be OK to use different supplies that share a reference in different
threads.  So, for example, split the supply and pass each version to a
new thread.  This should be OK with the dupable primitives because the
thunks would be evaluated in different threads.  However, we still
need to synchronize access to the reference, or we'll get incorrect
values.

That's true (if you're careful enough about which thread is evaluating
the splitting -- the existing split* functions are actually *not* strict
enough for this, and they probably shouldn't be either, so an additional
warning that the user may have to add some `seq`s or `evaluate`s might
be needed). Anyway, if atomicModifyIORef isn't that big of an overhead,
then no problem :-)
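
To make the point about synchronization concrete, here is a minimal sketch of several threads drawing from one shared reference. It is not the value-supply code; `nextId` and `drawConcurrently` are hypothetical names invented for this illustration, and only functions from base are used.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM, replicateM)
import Data.IORef (IORef, newIORef, atomicModifyIORef)
import Data.List (sort)

-- Hypothetical helper: return the current counter value and store the
-- incremented one, as a single atomic step.
nextId :: IORef Int -> IO Int
nextId ref = atomicModifyIORef ref (\n -> (n + 1, n))

-- Fork `threads` workers that each draw `perThread` ids from one shared
-- counter, wait for all of them, and return every id drawn, sorted.
drawConcurrently :: Int -> Int -> IO [Int]
drawConcurrently threads perThread = do
  ref <- newIORef 0
  dones <- forM [1 .. threads] $ \_ -> do
    done <- newEmptyMVar
    _ <- forkIO (replicateM perThread (nextId ref) >>= putMVar done)
    return done
  sort . concat <$> mapM takeMVar dones
```

Because the read and the write happen in one atomic step, no two workers can be handed the same id. With a separate readIORef followed by writeIORef, two threads could both read the same value before either writes.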

in fact, for lists (as you get an incomplete-pattern-match warning there, but
you know the list is always infinite, because you made it with "iterate"),
you could instead use an infinite-list type, Data.Stream from package
"Stream"[*]; as Stream is not a sum type (it only has one possible
constructor: Cons), it might even be a bit more efficient!


You are right but the benefit is small enough that I don't think it
warrants a dependency on another package.

I'm going to try to argue to convince you that it's entirely
appropriate to use Data.Stream :-)


From my somewhat mathematical point of view, that is.

1. Data.Stream is a small module that implements a very well-known,
practically standard (though under-used), and simple data type. All you
really need from it is import Data.Stream(Stream(Cons)), plus (iterate,
streamToList) if you're keeping your current interface. It's nothing you
should be afraid to depend on, if you're using its concept (which you
are). It's much smaller than the 'containers' package, which similarly
you don't use but if you needed a Map or something you obviously should.

2. The more compelling argument, that it took me a good night's sleep
to think of:

Supply is an infinite binary tree. Stream is an infinite unary tree
(er, more commonly called infinite "list"). They're both codata. They're
both comonads. (although they don't go so far as to depend on
category-extras to provide an instance.) The primary function/purpose of
Supply is an *efficient* way to turn a Stream into a Supply. I daresay
it would, even, be more fundamental to provide interface

newSupply      :: Stream a -> IO (Supply a)
than
newSupply      :: a -> (a -> a) -> IO (Supply a)

(although it might be worth keeping both interfaces? mainly for
compatibility, since one's just a Data.Stream.iterate away from the
other and your haddocks could say so)

In fact, it's terribly annoying to turn a (Stream a) into a
(a, a -> a) -- in fact I don't think it can be done in general (you can
turn it into a (b, b -> (a, b)) though a-la unfold, with b = Stream a)
-- so you should definitely provide the above (Stream a -> IO (Supply a))
interface. (Although if you're cowardly enough about the extra
dependency, I guess you could make it the riskier ([a] -> IO (Supply a)),
risk being if the user provides a finite list... Sorry for conditionally
insulting you; it seems a horribly underhanded thing for me to do :-)
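
A small sketch of the unfold point, under stated assumptions: the `Stream` type below is a local stand-in (Data.Stream's own type also has the single constructor Cons), and `iterateS`, `uncons`, `runSteps` and `oneOneTwo` are hypothetical names made up for this example.

```haskell
-- Local stand-in for an always-infinite list: one constructor, no Nil.
data Stream a = Cons a (Stream a)

-- Build a stream from a successor function and a seed, which is the
-- shape of the existing (a, a -> a) interface.
iterateS :: (a -> a) -> a -> Stream a
iterateS f x = Cons x (iterateS f (f x))

-- The step function of the unfold form (b, b -> (a, b)), with the
-- state b being the remaining stream itself.
uncons :: Stream a -> (a, Stream a)
uncons (Cons a rest) = (a, rest)

-- Hypothetical driver: run a state-passing generator for n steps.
runSteps :: Int -> b -> (b -> (a, b)) -> [a]
runSteps n s step
  | n <= 0    = []
  | otherwise = let (a, s') = step s in a : runSteps (n - 1) s' step

-- 1, 1, 2, 3, 4, ... cannot be written as `iterateS f x`: f 1 would
-- have to be both 1 and 2. It is still trivially an unfold.
oneOneTwo :: Stream Integer
oneOneTwo = Cons 1 (Cons 1 (iterateS (+ 1) 2))
```

Here `runSteps 5 oneOneTwo uncons` walks the stream with the stream itself as the state, which is why a Stream-based newSupply is strictly more general than the seed-and-successor one.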


Likewise, it would be nice for
split          :: Supply a -> Stream (Supply a)
...and then you would not even depend on Data.List anymore!
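
For illustration, a Stream-returning split could look roughly like the sketch below. Everything here is an assumption made for the example: `Stream` and `Supply` are local stand-ins rather than the library's types, and `numbered` is a hypothetical pure supply used only so the result can be inspected without IO.

```haskell
-- Local stand-ins for the two codata types under discussion.
data Stream a = Cons a (Stream a)
data Supply a = Node a (Supply a) (Supply a)

supplyValue :: Supply a -> a
supplyValue (Node a _ _) = a

-- Peel off the left subtree at each step and keep splitting the right
-- one; no list (and no incomplete pattern match) is involved.
splitStream :: Supply a -> Stream (Supply a)
splitStream (Node _ l r) = Cons l (splitStream r)

-- Hypothetical pure supply: node n has children 2n and 2n+1.
numbered :: Integer -> Supply Integer
numbered n = Node n (numbered (2 * n)) (numbered (2 * n + 1))

-- Take a finite prefix of a stream so it can be inspected.
takeS :: Int -> Stream a -> [a]
takeS n (Cons a rest)
  | n <= 0    = []
  | otherwise = a : takeS (n - 1) rest
```

Every consumer of `splitStream` can pattern match on Cons alone, so the "list is always infinite" invariant is carried by the type instead of by a comment.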

(I don't happen to think the arguments that Data.List is better than
Stream for definitely-infinite lists are very convincing; except
possibly that List will be more up-to-date with respect to stream-fusion
optimizations, and even then, value-supply doesn't actually rely on any
of those optimizations; it actually does keep around the Stream (or
List), or in the case of Num/Enum, it doesn't use one in the first
place.)

On the other hand, I still want the Stream-based interfaces, but my
initial argument isn't even that necessary: consider implementing the
current signature:

newSupply      :: a -> (a -> a) -> IO (Supply a)
currently by:
newSupply x f   = genericNewSupply (iterate f x) listGenSym
but we don't need to use a list at all, it could be like:
newSupply x f   = genericNewSupply x (\a -> (f a, a))

(with atomicModifyIORef added as appropriate depending on your
refactorings)
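
To show how the list-free version could hang together, here is a rough sketch. It is not the actual value-supply source: `newSupplyWith` is a hypothetical stand-in for genericNewSupply (whose real signature may differ), `newSupplySketch` stands in for newSupply, and the sketch uses plain unsafeInterleaveIO from base rather than the dupable variants.

```haskell
import Data.IORef (newIORef, atomicModifyIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

-- An infinite binary tree of lazily generated values.
data Supply a = Node a (Supply a) (Supply a)

supplyValue :: Supply a -> a
supplyValue (Node a _ _) = a

supplyLeft, supplyRight :: Supply a -> Supply a
supplyLeft  (Node _ l _) = l
supplyRight (Node _ _ r) = r

-- Hypothetical stand-in for genericNewSupply: keep a state of type s in
-- an IORef and run one atomic step each time a node's value is forced.
newSupplyWith :: s -> (s -> (s, a)) -> IO (Supply a)
newSupplyWith start step = do
  ref <- newIORef start
  let gen = unsafeInterleaveIO $ do
        v <- unsafeInterleaveIO (atomicModifyIORef ref step)
        l <- gen
        r <- gen
        return (Node v l r)
  gen

-- The seed-and-successor interface, with no intermediate list: the
-- state is the next value, and each step returns the current one.
newSupplySketch :: a -> (a -> a) -> IO (Supply a)
newSupplySketch x f = newSupplyWith x (\a -> (f a, a))
```

Which value lands on which node depends on the order in which nodes are forced, so callers should rely only on the values being distinct.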



P.S. more code cleanup, if you didn't notice it:

0.4:

-- XXX: Is the atomic necessary?
import Data.IORef(IORef,newIORef,atomicModifyIORef)

yes the atomic is necessary, so you can update the comment to say why?
(because multiple threads might be reading from a Supply-structure
created from the same newSupply run, and we don't want memory corruption
etc.)

import System.IO.Unsafe(unsafePerformIO,unsafeInterleaveIO)

#if __GLASGOW_HASKELL__ >= 608
import GHC.IOBase(unsafeDupableInterleaveIO,unsafeDupablePerformIO)
#else
...
unsafeDupablePerformIO :: IO a -> a
unsafeDupablePerformIO = unsafePerformIO

various "PerformIO" variants to delete (I'm absurdly pleased that we
implemented this without needing any)


-Isaac
_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
