Hi,

just so that everybody knows the background of my usage of newtypes: in RNA folding I used to represent nucleotides using nullary constructors, and got a very nice speedup by switching to newtypes. But, of course, I had billions of accesses (n^3, n ~= 1000).
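A minimal sketch of the two representations, not the actual RNA-folding code. All names here (NucADT, nA..nU, toNuc, fromNuc, pairIndex, canPair) are made up for illustration, and whether the newtype is really faster depends on the GHC version and the access pattern, so measure before committing to it.

```haskell
-- Readable variant: an ADT with nullary constructors.
data NucADT = A | C | G | U
  deriving (Eq, Show, Enum, Bounded)

-- Newtype variant: a plain Int at runtime, no constructor of its own.
newtype Nuc = Nuc Int
  deriving (Eq, Ord, Show)

-- Named constants recover some of the readability of the ADT.
nA, nC, nG, nU :: Nuc
nA = Nuc 0
nC = Nuc 1
nG = Nuc 2
nU = Nuc 3

-- Convert at the boundary; the hot loop works on Nuc only.
toNuc :: NucADT -> Nuc
toNuc = Nuc . fromEnum

-- Going back is partial, since any Int can be wrapped in Nuc.
fromNuc :: Nuc -> Maybe NucADT
fromNuc (Nuc i)
  | i >= 0 && i <= 3 = Just (toEnum i)
  | otherwise        = Nothing

-- The Int inside can be used directly as an index into a flat 4x4 table.
pairIndex :: Nuc -> Nuc -> Int
pairIndex (Nuc a) (Nuc b) = 4 * a + b

-- Canonical and G-U wobble pairs, checked via the flat index.
-- A real inner loop would index an unboxed array instead of using elem.
canPair :: Nuc -> Nuc -> Bool
canPair x y = pairIndex x y `elem` allowed
  where
    allowed =
      [ pairIndex p q
      | (p, q) <- [(nA, nU), (nU, nA), (nC, nG), (nG, nC), (nG, nU), (nU, nG)]
      ]
```

Loaded in GHCi, `canPair nG nU` is True and `canPair nA nC` is False. The ADT stays available for parsing and printing, while the newtype is what the O(n^3) loop touches.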
For representing large sequences (laziness, etc.), one can use any representation for the strand information. Ketil's basic typedefs should suit us well for sequence things. I will have a type class MkPrimary with (mkPrimary :: SeqData -> VU.Vector Nuc), for example.

And maybe this is where one can say the following: if it is performance critical (nucleotides in RNA folding), one should be able to define a 'Prim' instance using rl's primitive package. If performance is not that critical, use whatever is most convenient?!

Btw., if you need performance, consider staying away from some types like Word. There are fun open bugs where Int is 2.5x faster than Word. ;-)

Regards,
Christian

>On Thu, Jul 14, 2011 at 11:29 AM, Christian Höner zu Siederdissen
><[email protected]> wrote:
>> Hi,
>>
>> newtype Strand = Strand Int
>>
>> uses a single-constructor datatype "Int" as strand repr.
>>
>> While "Bool" is algebraic with two constructors. This is not
>> optimized completely. Or maybe "was", I don't know the current
>> status of GHC regarding this but I think it is still open.
>
>Hmmm, using Int may or may not be better. Or maybe Word8, since we
>only need one bit.
>
>> So "True" is a pointer toward a global "True" with an indirection,
>> while "true = Strand 0" would be an actual "0 :: Int#".
>
>If your data type has something like
>
>  data X = X {..., strand :: !Strand, ...}
>
>then, although not unpacked, the strand will always be a pointer
>without indirections (i.e. not a thunk), right?
>
>> And at least 1 year ago, I had much better performance using
>> newtypes of Ints instead of "data Nuc = A | C | G | U"
>
>I've seen this as well, but with an enumeration of more than 20
>constructors. But it's not like Strand is the bottleneck of some
>application, I think. My concern is about losing readability without
>gaining anything measurable in real tests. ADTs are really nice =).
>
>Cheers, =D
>
>--
>Felipe.
_______________________________________________
Biohaskell mailing list
[email protected]
http://malde.org/cgi-bin/mailman/listinfo/biohaskell
