Re: unique identifiers as a separate library

2008-12-18 Thread Simon Marlow

Sebastian Fischer wrote:

> for a project I am using the libraries Unique, UniqSupply, and UniqFM
> from the package ghc. It seems odd to have a dependency on a whole
> compiler only in order to use its (highly efficient and, hence,
> preferred) implementation of unique identifiers.
>
> Would it be possible to put everything concerned with unique identifiers
> in GHC into a separate package on Hackage?
>
> This may also allow me to get a fix for
> http://hackage.haskell.org/trac/ghc/ticket/2880 without reinstalling GHC.


Sure, that would be a useful chunk to separate out from GHC.  However, 
looking at the code I see that our unique supply monad is really not a lazy 
monad at all:


thenUs :: UniqSM a -> (a -> UniqSM b) -> UniqSM b
thenUs expr cont us
  = case (expr us) of { (result, us') -> cont result us' }

which is strict, and even the lazy version:

lazyThenUs :: UniqSM a -> (a -> UniqSM b) -> UniqSM b
lazyThenUs (USM expr) cont
  = USM (\us -> let (result, us') = expr us in unUSM (cont result) us')

doesn't really split the supply, because it will force the left side as 
soon as the unique on the right side is demanded.
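
For contrast, here is a minimal sketch of a bind that really does split the supply. It is not GHC's code: `Supply`, `mkSupply`, `SplitM` and `freshM` are hypothetical names, and the supply is a toy infinite tree of Ints. Each side of `>>=` receives its own half of the supply, so the right side never has to wait for the left.

```haskell
import Control.Monad (ap, liftM)

-- Toy splittable supply (hypothetical stand-in for UniqSupply):
-- an infinite binary tree in which every node carries a distinct Int.
data Supply = Node Int Supply Supply

-- Node n has children 2n and 2n+1, so all labels are distinct.
mkSupply :: Supply
mkSupply = go 1
  where go n = Node n (go (2 * n)) (go (2 * n + 1))

split :: Supply -> (Supply, Supply)
split (Node _ l r) = (l, r)

fresh :: Supply -> Int
fresh (Node n _ _) = n

-- Each computation reads from its own sub-supply; nothing is threaded
-- back out, so there is no data dependency between the two sides.
newtype SplitM a = SplitM { runSplitM :: Supply -> a }

instance Functor SplitM where
  fmap = liftM

instance Applicative SplitM where
  pure x = SplitM (const x)
  (<*>) = ap

instance Monad SplitM where
  SplitM m >>= k =
    SplitM (\s -> let (l, r) = split s in runSplitM (k (m l)) r)

freshM :: SplitM Int
freshM = SplitM fresh

main :: IO ()
main = do
  -- The left computation fails when run, but the continuation ignores
  -- its result, so the unique on the right is still produced.
  let stuck = SplitM (\_ -> error "left side forced") :: SplitM ()
  print (runSplitM (stuck >>= \_ -> freshM) mkSupply)
  -- Two draws in sequence come from different subtrees.
  print (runSplitM ((,) <$> freshM <*> freshM) mkSupply)
```

With this shape the monad laws hold only up to the choice of uniques, which is the usual trade-off for a splitting supply.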


Given that our monad is strict, there's no need for it to use 
mkSplitUniqSupply; it could just call genSym to create new uniques.  I 
notice there are other parts of the compiler that do make use of the lazy 
splittable unique supply in their own monads, but I'm not sure if they 
really need it.
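
To illustrate the point about strictness, a strict unique monad needs nothing more than a counter. The sketch below is an assumption-laden stand-in, not GHC's implementation: `CounterM` and `freshC` are hypothetical names, and a pure Int state plays the role that a call to genSym would play.

```haskell
import Control.Monad (ap, liftM, replicateM)

-- A strict state monad over a plain Int counter (hypothetical sketch).
newtype CounterM a = CounterM { runCounterM :: Int -> (a, Int) }

instance Functor CounterM where
  fmap = liftM

instance Applicative CounterM where
  pure x = CounterM (\n -> (x, n))
  (<*>) = ap

instance Monad CounterM where
  -- The case expression forces the left computation before the
  -- continuation runs, just like the strict thenUs.
  CounterM m >>= k =
    CounterM (\n -> case m n of (a, n') -> runCounterM (k a) n')

-- Hand out the current counter value and bump it.
freshC :: CounterM Int
freshC = CounterM (\n -> n `seq` (n, n + 1))

main :: IO ()
main = print (fst (runCounterM (replicateM 3 freshC) 0))  -- prints [0,1,2]
```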


Cheers,
Simon
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: unique identifiers as a separate library

2008-12-18 Thread Sebastian Fischer

On Dec 17, 2008, at 10:54 AM, Sebastian Fischer wrote:

> Would it be possible to put everything concerned with unique
> identifiers in GHC into a separate package on Hackage?



I have wrapped up (a tiny subset of) GHC's uniques into the package  
`uniqueid` and put it on Hackage:


http://hackage.haskell.org/cgi-bin/hackage-scripts/package/uniqueid

It only provides

type Id
hashedId :: Id -> Int

type IdSupply
initIdSupply  :: Char -> IO IdSupply
splitIdSupply :: IdSupply -> (IdSupply,IdSupply)
idFromSupply  :: IdSupply -> Id

instance Eq Id
instance Ord Id
instance Show Id

The main difference is due to my fear of depending on the foreign  
function `genSymZh` which I replaced by a global counting IORef.


The other difference is that the Show instance does not rely on GHC's  
static flags and can hence be used outside of GHC sessions.


The code is on github:

http://github.com/sebfisch/uniqueid

Extensions welcome!

Cheers,
Sebastian


Re: unique identifiers as a separate library

2008-12-18 Thread Isaac Dupree

Sebastian Fischer wrote:

> On Dec 17, 2008, at 10:54 AM, Sebastian Fischer wrote:
>
>> Would it be possible to put everything concerned with unique
>> identifiers in GHC into a separate package on Hackage?
>
> I have wrapped up (a tiny subset of) GHC's uniques into the package
> `uniqueid` and put it on Hackage:


thanks!

> The main difference is due to my fear of depending on the foreign
> function `genSymZh` which I replaced by a global counting IORef.


which carries its own risk.  Maybe you should mark it NOINLINE?
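
For reference, the usual shape of a global counter with the NOINLINE pragma looks roughly like this. It is only a sketch: `counter` and `nextInt` are illustrative names and not necessarily what the uniqueid package defines. Without the pragma, GHC is free to inline `counter` at each use site, which could create several separate IORefs.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafePerformIO)

-- One shared counter for the whole program. NOINLINE keeps GHC from
-- duplicating the unsafePerformIO call and thereby the IORef.
{-# NOINLINE counter #-}
counter :: IORef Int
counter = unsafePerformIO (newIORef 0)

-- Return the current value and advance the counter.
-- Note: this read-then-write is NOT atomic, so it is not thread-safe.
nextInt :: IO Int
nextInt = do
  n <- readIORef counter
  writeIORef counter (n + 1)
  return n

main :: IO ()
main = do
  a <- nextInt
  b <- nextInt
  print (a, b)  -- prints (0,1)
```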

Potential code criticisms / suggestions for it as a library:

Unboxed types: the code uses them, so it only works on GHC, 
even though other implementations have unsafe IO too.  In 
theory, strictness annotations should be able to achieve the 
same efficiency.


Char is supposed to represent a Unicode character -- but 
this code handles it inconsistently:

For 64-bit Int#, any Unicode character works.
For 32-bit Int#, it assumes Char fits in the first 8 bits 
(ASCII and a little more).
If Int# (or Int) can be 30-bit (as Haskell 98 permits), 
correctness suffers even more.

Is the Char really a necessary part of the design?  The only 
way you provide to extract it or depend on its value is 
indirectly via the Show instance.  With a 32-bit Int, its 
presence there costs a limit of 2^24 (about 16 million) IDs 
before problems happen, whereas billions is still not a 
great limit but at least is somewhat larger.  (Applications 
that are long-running or deal with huge amounts of data 
could be affected.)
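
The arithmetic behind these numbers can be sketched as follows. The layout is an assumption made for illustration (an 8-bit tag in the top of a 32-bit word, a 24-bit counter below it), and `pack` and `unpack` are hypothetical helpers, not the package's functions.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word32)

-- Assumed layout: [ 8-bit tag | 24-bit counter ] in one 32-bit word.
counterMask :: Word32
counterMask = 0x00FFFFFF

pack :: Char -> Word32 -> Word32
pack tag n = (fromIntegral (ord tag) `shiftL` 24) .|. (n .&. counterMask)

unpack :: Word32 -> (Char, Word32)
unpack w = (chr (fromIntegral (w `shiftR` 24)), w .&. counterMask)

main :: IO ()
main = do
  -- Number of distinct counter values that fit in 24 bits.
  print (2 ^ (24 :: Int) :: Int)                 -- prints 16777216
  -- An ASCII tag round-trips.
  print (unpack (pack 'a' 5))                    -- prints ('a',5)
  -- The 2^24-th ID wraps around and collides with the first one.
  print (pack 'a' 0 == pack 'a' (2 ^ (24 :: Int)))  -- prints True
  -- A Char above 255 loses its high bits: code point 955 comes back as 187.
  print (ord (fst (unpack (pack '\955' 0))))     -- prints 187
```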


unsafeDupableInterleaveIO: this Dupable was safe for GHC 
to use because GHC is single-threaded.  Is it safe in a 
library setting?  I guess likewise, the IORef global 
variable wouldn't be thread-safe... but this one isn't even 
safe between separate runs of initIdSupply.  On the other 
hand, thread-safety probably makes it much less efficient 
(if you can find a way to use atomic int CPU instructions, 
it might not be too bad, or else per-thread counters... or 
just declare how unsafe it is)
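
One low-cost way to make the counter itself thread-safe is `atomicModifyIORef'` from Data.IORef. The sketch below assumes the counter is passed explicitly (`nextId` is a hypothetical helper) and checks that four threads drawing 1000 IDs each never see a duplicate. It says nothing about the safety of unsafeDupableInterleaveIO.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM, replicateM)
import Data.IORef (IORef, atomicModifyIORef', newIORef)
import Data.List (sort)

-- Atomically return the current value and increment the counter.
nextId :: IORef Int -> IO Int
nextId ref = atomicModifyIORef' ref (\n -> (n + 1, n))

main :: IO ()
main = do
  ref <- newIORef 0
  -- Start four threads; each reports its 1000 IDs through an MVar.
  results <- forM [1 .. 4 :: Int] $ \_ -> do
    done <- newEmptyMVar
    _ <- forkIO (replicateM 1000 (nextId ref) >>= putMVar done)
    return done
  ids <- concat <$> mapM takeMVar results
  -- Every value from 0 to 3999 was handed out exactly once.
  print (sort ids == [0 .. 3999])  -- prints True
```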


unsafePerformIO: it's not totally necessary here.  Its only 
function is to make IDs generated by different runs of 
initIdSupply be distinct.  So it could, anyway, probably be 
refactored to only use unsafePerformIO global-ness once per 
initIdSupply and just use unsafeInterleaveIO within (where 
currently nextInt is called).
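
One possible reading of this suggestion, sketched under assumptions: the supply tree is built lazily with unsafeInterleaveIO, and every node draws from a counter that is handed in explicitly. `Supply` and `newSupplyFrom` are hypothetical names; the caller could pass a single global IORef so that separate supplies stay distinct.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef)
import Data.List (nub)
import System.IO.Unsafe (unsafeInterleaveIO)

-- A lazily materialised tree of IDs (hypothetical stand-in for IdSupply).
data Supply = Node Int Supply Supply

-- Build a supply whose nodes take their IDs from the given counter.
-- unsafeInterleaveIO defers each node until it is first demanded.
newSupplyFrom :: IORef Int -> IO Supply
newSupplyFrom ref = gen
  where
    gen = unsafeInterleaveIO $ do
      n <- atomicModifyIORef' ref (\i -> (i + 1, i))
      l <- gen
      r <- gen
      return (Node n l r)

split :: Supply -> (Supply, Supply)
split (Node _ l r) = (l, r)

fresh :: Supply -> Int
fresh (Node n _ _) = n

main :: IO ()
main = do
  ref <- newIORef 0
  s1 <- newSupplyFrom ref
  s2 <- newSupplyFrom ref  -- same counter, so IDs do not overlap
  let (a, b) = split s1
      ids = [fresh s1, fresh a, fresh b, fresh s2]
  print ids
  print (length (nub ids) == 4)  -- prints True
```

The concrete numbers depend on the order in which nodes are forced, but each node takes its ID exactly once, so the IDs are pairwise distinct.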


-Isaac


Re: unique identifiers as a separate library

2008-12-18 Thread ajb

G'day all.

Quoting Sebastian Fischer s...@informatik.uni-kiel.de:


> I have wrapped up (a tiny subset of) GHC's uniques into the package
> `uniqueid` and put it on Hackage:
>
> http://hackage.haskell.org/cgi-bin/hackage-scripts/package/uniqueid


First off, thanks for this.


> The main difference is due to my fear of depending on the foreign
> function `genSymZh` which I replaced by a global counting IORef.


Why not depend on this instead?

http://hackage.haskell.org/cgi-bin/hackage-scripts/package/value-supply

Cheers,
Andrew Bromage