Re: RFC: Unpacking sum types
Hi, Am Dienstag, den 01.09.2015, 10:23 -0700 schrieb Johan Tibell: > I have a draft design for unpacking sum types that I'd like some > feedback on. In particular feedback both on: > > * the writing and clarity of the proposal and > * the proposal itself. > > https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes The current proposed layout for a data D a = D a {-# UNPACK #-} !(Maybe a) would be [D’s pointer] [a] [tag (0 or 1)] [Just’s a] So the representation of D foo (Just bar) is [D_info] [] [1] [] and of D foo Nothingis [D_info] [] [0] [] where dummy is something that makes the GC happy. But assuming this dummy object is something that is never a valid heap objects of its own, then this should be sufficient to distinguish the two cases, and we could actually have that the representation of D foo (Just bar) is [D_info] [] [] and of D foo Nothingis [D_info] [] [] and an case analysis on D would compare the pointer in the third word with the well-known address of dummy to determine if we have Nothing or Just. This saves one word. If we generate a number of such static dummy objects, we can generalize this tag-field avoiding trick to other data types than Maybe. It seems that it is worth doing that if * the number of constructors is no more than the number of static dummy objects, and * there is one constructor which has more pointer fields than all other constructors. Also, this trick cannot be applied repeatedly: If we have data D = D {-# UNPACK #-} !(Maybe a) | D'Nothing data E = E {-# UNPACK #-} !(D a) then it cannot be applied when unpacking D into E. (Or maybe it can, but care has to be taken that D’s Nothing is represented by a different dummy object than Maybe’s Nothing.) Anyways, this is an optimization that can be implemented once unboxed sum type are finished and working reliably. Greetings, Joachim -- Joachim “nomeata” Breitner m...@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nome...@joachim-breitner.de • GPG-Key: 0xF0FBF51F Debian Developer: nome...@debian.org signature.asc Description: This is a digitally signed message part ___ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
RFC: Unpacking sum types
I have a draft design for unpacking sum types that I'd like some feedback on. In particular feedback both on: * the writing and clarity of the proposal and * the proposal itself. https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes -- Johan ___ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Re: RFC: Unpacking sum types
I wonder: are there issues with strict/unpacked fields in the sum type, with regard to the 'fill in stuff' behavior? For example: data C = C1 !Int | C2 ![Int] data D = D1 !Double {-# UNPACK #-} !C Naively we might think: data D' = D1 !Double !Tag !Int ![Int] But this is obviously not going to work at the Haskell-implemented-level. Since we're at a lower level, we could just not seq the things from the opposite constructor, but are there problems that arise from that? Also of course the !Int will probably also be unpacked, so such prim types need different handling (fill with 0, I guess). -- Also, I guess this is orthogonal, but having primitive, unboxed sums (analogous to unboxed tuples) would be nice as well. Conceivably they could be used as part of the specification of unpacked sums, since we can apparently put unboxed tuples in data types now. I'm not certain if they would cover all cases, though (like the strictness concerns above). -- Dan On Tue, Sep 1, 2015 at 1:23 PM, Johan Tibellwrote: > I have a draft design for unpacking sum types that I'd like some feedback > on. In particular feedback both on: > > * the writing and clarity of the proposal and > * the proposal itself. > > https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes > > -- Johan > > > ___ > ghc-devs mailing list > ghc-devs@haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > ___ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Re: RFC: Unpacking sum types
Just a small comment about syntax. Why is there an "_n" suffix on the type constructor? Isn't it syntactically evident how many things are in the |# .. | .. #| block? More generally, are the parser changes and the wild new syntax strictly necessary? Could we instead just have a new keyword, but have at look like a normal type constructor? For example, the type: (Sum# T1 T2 T3) Where "UnboxedSum" can't be partially applied, and is variable arity. Likewise, "MkSum#" could be a keyword/syntactic-form: (MkSum# 1 3 expr) case x of MkSum# 1 3 v -> e Here "1" and "3" are part of the syntactic form, not expressions. But it can probably be handled after parsing and doesn't require the "_n_m" business. -Ryan On Tue, Sep 1, 2015 at 6:10 PM Johan Tibellwrote: > After some discussions with SPJ I've now rewritten the proposal in terms > of unboxed sums (which should suffer from the extra seq problem you mention > above). > > On Tue, Sep 1, 2015 at 11:31 AM, Dan Doel wrote: > >> I wonder: are there issues with strict/unpacked fields in the sum >> type, with regard to the 'fill in stuff' behavior? >> >> For example: >> >> data C = C1 !Int | C2 ![Int] >> >> data D = D1 !Double {-# UNPACK #-} !C >> >> Naively we might think: >> >> data D' = D1 !Double !Tag !Int ![Int] >> >> But this is obviously not going to work at the >> Haskell-implemented-level. Since we're at a lower level, we could just >> not seq the things from the opposite constructor, but are there >> problems that arise from that? Also of course the !Int will probably >> also be unpacked, so such prim types need different handling (fill >> with 0, I guess). >> >> -- >> >> Also, I guess this is orthogonal, but having primitive, unboxed sums >> (analogous to unboxed tuples) would be nice as well. Conceivably they >> could be used as part of the specification of unpacked sums, since we >> can apparently put unboxed tuples in data types now. I'm not certain >> if they would cover all cases, though (like the strictness concerns >> above). >> >> -- Dan >> >> >> On Tue, Sep 1, 2015 at 1:23 PM, Johan Tibell >> wrote: >> > I have a draft design for unpacking sum types that I'd like some >> feedback >> > on. In particular feedback both on: >> > >> > * the writing and clarity of the proposal and >> > * the proposal itself. >> > >> > https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes >> > >> > -- Johan >> > >> > >> > ___ >> > ghc-devs mailing list >> > ghc-devs@haskell.org >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> > >> > > ___ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Re: RFC: Unpacking sum types
> > If we expose it on the Haskell level, I find MkSum_1_2# the right thing > to do: It makes it clear that (conceptually) there really is a > constructor of that name, and it is distinct from MkSum_2_2#, and the > user cannot do computation with these indices. > I don't mind MkSum_1_2#, it avoids the awkwardness of attaching it to a closing delimiter. But... it does still introduce the idea of cutting up tokens to get numbers out of them, which is kind of hacky. (There seems to be a conserved particle of hackiness here that can't be eliminate, but it doesn't bother me too much.) ___ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Re: RFC: Unpacking sum types
After some discussions with SPJ I've now rewritten the proposal in terms of unboxed sums (which should suffer from the extra seq problem you mention above). On Tue, Sep 1, 2015 at 11:31 AM, Dan Doelwrote: > I wonder: are there issues with strict/unpacked fields in the sum > type, with regard to the 'fill in stuff' behavior? > > For example: > > data C = C1 !Int | C2 ![Int] > > data D = D1 !Double {-# UNPACK #-} !C > > Naively we might think: > > data D' = D1 !Double !Tag !Int ![Int] > > But this is obviously not going to work at the > Haskell-implemented-level. Since we're at a lower level, we could just > not seq the things from the opposite constructor, but are there > problems that arise from that? Also of course the !Int will probably > also be unpacked, so such prim types need different handling (fill > with 0, I guess). > > -- > > Also, I guess this is orthogonal, but having primitive, unboxed sums > (analogous to unboxed tuples) would be nice as well. Conceivably they > could be used as part of the specification of unpacked sums, since we > can apparently put unboxed tuples in data types now. I'm not certain > if they would cover all cases, though (like the strictness concerns > above). > > -- Dan > > > On Tue, Sep 1, 2015 at 1:23 PM, Johan Tibell > wrote: > > I have a draft design for unpacking sum types that I'd like some feedback > > on. In particular feedback both on: > > > > * the writing and clarity of the proposal and > > * the proposal itself. > > > > https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes > > > > -- Johan > > > > > > ___ > > ghc-devs mailing list > > ghc-devs@haskell.org > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > > ___ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Re: RFC: Unpacking sum types
Hi, Am Mittwoch, den 02.09.2015, 01:44 + schrieb Ryan Newton: > Why is there an "_n" suffix on the type constructor? Isn't it > syntactically evident how many things are in the |# .. | .. #| > block? Correct. > More generally, are the parser changes and the wild new syntax > strictly necessary? If we just add it to Core, to support UNPACK, then there is no parser involved anyways, and the pretty-printer may do fancy stuff. (Why not unicode subscript numbers like ₂ :-)) But we probably want to provide this also on the Haskell level, just like unboxed products, right? Then we should have a nice syntax. Personally, I find (# a | b | c #) visually more pleasing. (The disadvantage is that this works only for two or more alternatives, but the one-alternative-unboxed-union is isomorphic to the one-element -unboxed-tuple anyways, isn’t it?) > Likewise, "MkSum#" could be a keyword/syntactic-form: > >(MkSum# 1 3 expr) > case x of MkSum# 1 3 v -> e > > Here "1" and "3" are part of the syntactic form, not expressions. > But it can probably be handled after parsing and doesn't require the > "_n_m" business. If we expose it on the Haskell level, I find MkSum_1_2# the right thing to do: It makes it clear that (conceptually) there really is a constructor of that name, and it is distinct from MkSum_2_2#, and the user cannot do computation with these indices. Greetings, Joachim -- Joachim “nomeata” Breitner m...@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nome...@joachim-breitner.de • GPG-Key: 0xF0FBF51F Debian Developer: nome...@debian.org signature.asc Description: This is a digitally signed message part ___ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
Re: RFC: Unpacking sum types
Hi, just an idea that crossed my mind: Can we do without the worker/wrapper dance for data constructors if we instead phrase that in terms of pattern synonyms? Maybe that's a refactoring/code consolidation opportunity. Good night, Joachim Am 1. September 2015 10:23:35 PDT, schrieb Johan Tibell: >I have a draft design for unpacking sum types that I'd like some >feedback >on. In particular feedback both on: > > * the writing and clarity of the proposal and > * the proposal itself. > >https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes > >-- Johan > > > > >___ >ghc-devs mailing list >ghc-devs@haskell.org >http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs ___ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs