Re: TTG: Handling Source Locations

Adam Gundry Tue, 12 Feb 2019 13:39:23 -0800

While we're on the topic, is there any plan to get rid of all those
panics? AFAICS they are entirely unnecessary: we should just use an
empty datatype for unused constructor extension points, then we can
eliminate it to get whatever we like. See #15247.


Adam


On 12/02/2019 15:40, Shayan Najd wrote:
>> Someone could easily call rnPatAndThen when they should call rnLPatAndThen. 
>> This would cause a panic.
> 
> With Solution A, there shouldn't be two functions `rnLPatAndThen` and
> `rnPatAndThen` anyways. There should be only `rnPatAndThen` with an
> extra case for the wrapper node.
> 
>> There's also the problem that the pattern-match checker can't usefully look 
>> through view patterns.
> 
> Yes, I have reported it while back. I don't know of the progress in fixing 
> this.
> 
> On Tue, 12 Feb 2019 at 16:24, Richard Eisenberg <r...@cs.brynmawr.edu> wrote:
>>
>> That's true, but how would it play out in practice? For example, take a look 
>> at RnPat. There is a rnLPatAndThen which uses wrapSrcSpanCps to extract the 
>> location and then call rnPatAndThen. rnPatAndThen, in turn, just panics if 
>> it sees the extension point, because that's an unexpected constructor. 
>> Someone could easily call rnPatAndThen when they should call rnLPatAndThen. 
>> This would cause a panic.
>>
>> There's also the problem that the pattern-match checker can't usefully look 
>> through view patterns. If there is a nested pattern-match (that is, we see 
>> dL->L _ (SomeOtherConstructor), then there is no way to guarantee a complete 
>> pattern-match short of a catch-all. So it doesn't seem to me that the 
>> pattern-match checker is really helping us achieve what we want here.
>>
>> Richard
>>
>>> On Feb 12, 2019, at 9:30 AM, Shayan Najd <sh.n...@gmail.com> wrote:
>>>
>>>> My problem, though, is that this is just a convention -- no one checks it. 
>>>> It would be easy to forget.
>>>
>>> I am not sure if I understand: shouldn't the totality checker warn if
>>> there is no pattern for the wrapper constructor (hence enforce the
>>> convention)?
>>>
>>>
>>> On Tue, 12 Feb 2019 at 15:19, Richard Eisenberg <r...@cs.brynmawr.edu> 
>>> wrote:
>>>>
>>>>
>>>>
>>>>> On Feb 12, 2019, at 5:19 AM, Shayan Najd <sh.n...@gmail.com> wrote:
>>>>>
>>>>> About the new code, the convention is straightforward: anytime you
>>>>> destruct an AST node, assume a wrapper node inside (add an extra
>>>>> case), or use the smart constructors/pattern synonyms.
>>>>
>>>> Aha! This, I did not know. So, you're saying that all the consumers of the 
>>>> GHC AST need to remember to use dL every time they pattern-match. With the 
>>>> new design, using dL when it's unnecessary doesn't hurt, but forgetting it 
>>>> is problematic. So: just use it every time. My problem, though, is that 
>>>> this is just a convention -- no one checks it. It would be easy to forget.
>>>>
>>>>> On Feb 12, 2019, at 6:00 AM, Simon Peyton Jones via ghc-devs 
>>>>> <ghc-devs@haskell.org> wrote:
>>>>>
>>>>> One way to think of it is this: we can now put SrcSpans where they make 
>>>>> sense, rather than everywhere.
>>>>
>>>> This has some logic to it, but I'm not quite sold. Another way of saying 
>>>> this is that the new design favors flexibility for the producer, at the 
>>>> cost of requiring consumers to be aware of and consistently apply the 
>>>> convention Shayan describes above. The problem is, though, that if the 
>>>> producer is stingy in adding source locations, the consumer won't know 
>>>> which locations are expected to be informative. Is the consumer expected 
>>>> to collect locations from a variety of places and try to combine them 
>>>> somehow? I doubt it. So this means that the flexibility for the producer 
>>>> isn't really there -- the type system will accept arbitrary choices of 
>>>> where to put locations, but consumers won't get the locations where they 
>>>> expect them.
>>>>
>>>>>  We can still say (Located t) in places where we want to guarantee a 
>>>>> SrcSpan.
>>>>
>>>> This seems to go against the TTG philosophy. We can do this in, say, the 
>>>> return type of a function, but we can't in the AST proper, because that's 
>>>> shared among a number of clients, some of whom don't want the source 
>>>> locations.
>>>>
>>>>>
>>>>> Yes, this lets us add more than one; that's redundant but not harmful.
>>>>
>>>> I disagree here. If we add locations to a node twice, then we'll have to 
>>>> use dL twice to find the underlying constructor. This is another case 
>>>> there the type system offers the producer flexibility but hamstrings the 
>>>> consumer.
>>>>
>>>>
>>>>> On Feb 12, 2019, at 7:32 AM, Vladislav Zavialov <vladis...@serokell.io> 
>>>>> wrote:
>>>>>
>>>>> I claim an SrcSpan makes sense everywhere, so this is not a useful
>>>>> distinction. Think about it as code provenance, an AST node always
>>>>> comes from somewhere
>>>>
>>>> I agree with this observation. Perhaps SrcSpan is a bad name, and 
>>>> SrcProvenance is better. We could even imagine using the new HasCallStack 
>>>> feature to track where generated code comes from (perhaps only in DEBUG 
>>>> compilers). Do we need to do this today? I'm not sure there's a crying 
>>>> need. But philosophically, we are able to attach a provenance to every 
>>>> slice of AST, so there's really no reason for uninformative locations.
>>>>
>>>>> My concrete proposal: let's just put SrcSpan in the extension fields
>>>>> of each node
>>>>
>>>> I support this counter-proposal. Perhaps if it required writing loads of 
>>>> extra type instances, I wouldn't be as much in favor. But we already have 
>>>> to write those instances -- they just change from NoExt to SrcSpan. This 
>>>> seems to solve all the problems nicely, at relatively low cost. And, I'm 
>>>> sure it's more efficient at runtime than either the previous ping-pong 
>>>> style or the current scenario, as we can pattern-match on constructors 
>>>> directly, requiring one less pointer-chase or function call.
>>>>
>>>> One downside of this proposal is that it means that more care will have to 
>>>> be taken when setting the extension field of AST nodes after a pass, 
>>>> making sure to preserve the location. (This isn't really all that 
>>>> different from location-shuffling today.) A quick mock-up shows that 
>>>> record-updates can make this easier:
>>>>
>>>>> data Phase = Parsed | Renamed
>>>>>
>>>>> data Exp p = Node (XNode p) Int
>>>>>
>>>>> type family XNode (p :: Phase)
>>>>> type instance XNode p = NodeExt p
>>>>>
>>>>> data NodeExt p where
>>>>>  NodeExt :: { flag :: Bool, fvs :: RenamedOnly p String } -> NodeExt p
>>>>>
>>>>> type family RenamedOnly p t where
>>>>>  RenamedOnly Parsed _ = ()
>>>>>  RenamedOnly Renamed t = t
>>>>>
>>>>> example :: Exp Parsed
>>>>> example = Node (NodeExt { flag = True, fvs = () }) 5
>>>>>
>>>>> rename :: Exp Parsed -> Exp Renamed
>>>>> rename (Node ext n) = Node (ext { fvs = "xyz" }) n
>>>>
>>>> Note that the extension point is a record type that has a field available 
>>>> only after renaming. We can then do a type-changing record update when 
>>>> producing the renamed node, preserving the flag in the code above. What's 
>>>> sad is that, if there were no renamer-only field, we couldn't do a 
>>>> type-changing empty record update as the default case. (Haskell doesn't 
>>>> have empty record updates. Perhaps it should. They would be useful in 
>>>> doing a type-change on a datatype with a phantom index. A clever compiler 
>>>> could even somehow ensure that such a record update is completely compiled 
>>>> away.) In any case, this little example is essentially orthogonal to my 
>>>> points above, and the choice of whether to use records or other structures 
>>>> are completely local to the extension point. I just thought it might make 
>>>> for a nice style.
>>>>
>>>> Thanks,
>>>> Richard

-- 
Adam Gundry, Haskell Consultant
Well-Typed LLP, https://www.well-typed.com/
_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Re: TTG: Handling Source Locations

Reply via email to