Re: [racket-users] Pattern: reusing the same name in macro-generated definitions

zeRusski Fri, 05 Apr 2019 03:24:39 -0700


> If I understand correctly, the fourth paragraph here is relevant?
>
> https://docs.racket-lang.org/reference/syntax-model.html#%28part._transformer-model%29
>
I dreaded someone pointing me there. I read it a year ago, took a lot of 
head scratching and careful reading before I convinced myself that I'd 
grokked it. Both the vocabulary used and apparently my understanding 
dissipated after a year. Had to read it again :)


> So, `foo-impl` is a binding introduced by the macro and gets that 
> macro invocation's fresh macro-introduction scope. 
>
> Whereas for example `name` is syntax coming from outside the macro, 
> and doing `(define-foo (blerg ___) ___)` twice would be an error due 
> to redefining `blerg`. 
>

Ok. Let's see if I can explain away all mysteries by carefully following 
the syntax and expansion model. Someone please read it through and poke 
holes. In the process, if I'm not mistaken, we are going to discover an 
error in the Scopes section of the docs.

Step aside everyone, I'm putting my Matthew hat on. Here goes nothing! 

We are trying to answer the following questions about this piece of code (I 
annotated some identifiers with indexes [in sq brackets]):

(define[1] (foo-impl[1] op a b) (op a b))

(define-simple-macro (define-foo (name:id formal:id ...) body:expr ...)
  (begin
    (define[2] (foo-impl[2] formal ...) body ...)
    (define-syntax (name stx)
         .....
      (foo-impl[3] . args)
         .....)))

(define-foo (bar op a b) (op a b))[4]
(define-foo (baz op a b) (op a b))[5]
(bar + 1 2)
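
For anyone who wants to run this, here is a minimal sketch with the elided
parts filled in. The syntax-parse clause inside the generated macro is only
my guess at what the "....." could stand for, the tagged list returned by
the module-level /foo-impl/ and the extra 100 in /baz/ are there purely to
make it visible which definition gets called, and the index annotations are
dropped.

```racket
#lang racket
(require syntax/parse/define
         (for-syntax syntax/parse))

;; [1] module-level definition; tagged so we can tell when it is used
(define (foo-impl op a b) (list 'module-level (op a b)))

(define-simple-macro (define-foo (name:id formal:id ...) body:expr ...)
  (begin
    ;; [2] generated definition; gets a fresh macro-introduction scope
    ;; on every expansion of define-foo
    (define (foo-impl formal ...) body ...)
    (define-syntax (name stx)
      ;; assumed stand-in for the elided transformer body
      (syntax-parse stx
        [(_ . args) #'(foo-impl . args)]))))  ; [3]

(define-foo (bar op a b) (op a b))      ; [4]
(define-foo (baz op a b) (op a b 100))  ; [5], body changed on purpose

(bar + 1 2)       ; => 3, uses the foo-impl generated at [4]
(baz + 1 2)       ; => 103, uses the foo-impl generated at [5]
(foo-impl + 1 2)  ; => '(module-level 3), uses [1]
```

All three definitions of /foo-impl/ coexist in one module without a
redefinition error, which is exactly what Q1 below is about.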


> Q1: My original question was why the two call sites [4] and [5] do not
> complain about redefinition of /foo-impl/. After all, every time the
> transformer is invoked it generates a definition of the same identifier
> /foo-impl/, which I can easily see in the macro expansion at the relevant
> call site. However, if I were to type the /foo-impl/ definitions by hand at
> top level or module level, Racket would yell at me. Why do the two cases
> look about the same (i.e. end up producing visually the same code) but
> provoke different reactions from the compiler?


Suppose the transformer at [4] did its job and we are now expanding the
code it produced, that is, the binding [2] it introduced and the use of
/foo-impl/ at [3]. The generated [2] will have at least two scopes: one from
the macro definition site, i.e. where /define-foo/ is defined, the other the
fresh macro-introduction scope. When the reference to /foo-impl/ at [3] gets
resolved we look for bindings of /foo-impl/ whose scope sets are subsets of
the reference's scope set, that is, of the identifier at [3]. We find two
such bindings: [1] and [2]. Which one do we choose? We choose the one whose
scope set is a superset of every other candidate's. Here [2] has at least
one extra scope (the macro-intro scope) compared to [1], so we use [2].

Now, why is there no ambiguity as to which /foo-impl/ to use when we
expand and eval [5]? Well, we go through the same motions, but there is one
extra /foo-impl/ binding, the one generated at [4], so we have to choose
from a total of 3 bindings when resolving any /foo-impl/ ref generated at
[5]. And again we choose the [2] that results from the expansion of [5].
Why? It carries the fresh macro-intro scope of the expansion at [5], which
differs from the macro-intro scope at [4], so there is no ambiguity between
the two generated bindings. For any ref to /foo-impl/ generated by [5] we
reject the /foo-impl/ binding generated at [4] because its scope set isn't a
subset of the ref's. To choose between [1] and the generated [2] we use the
same logic as in the previous paragraph: [2] wins because its scope set is
bigger.

So, the three bindings of the identifier /foo-impl/ that the code above 
introduces at [1], [4] and [5] (latter two generated from the template at 
[2]) are not at all the same, at least not in the syntax model of Racket. 
Identifiers aren't merely compared by name, their scope sets have the final 
say in how identifiers are resolved.
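
The resolution rule I've been applying by hand can be sketched as a toy
model. This is not how the expander is implemented, just the
subset/superset rule from the docs written out over plain sets. The scope
labels ('module, 'intro-4, 'intro-5, 'use-bar, 'use-baz) and the binding
names are made up for illustration, and the model ignores phases and
everything else a real scope set involves.

```racket
#lang racket

;; A binding is modelled as (cons label scope-set).
;; resolve returns the label of the chosen binding, 'unbound or 'ambiguous.
(define (resolve ref-scopes bindings)
  ;; candidates: bindings whose scope set is a subset of the reference's
  (define candidates
    (for/list ([b (in-list bindings)]
               #:when (subset? (cdr b) ref-scopes))
      b))
  ;; winner: the candidate whose scope set is a superset of all others
  (define best
    (for/first ([b (in-list candidates)]
                #:when (for/and ([other (in-list candidates)])
                         (subset? (cdr other) (cdr b))))
      b))
  (cond
    [(null? candidates) 'unbound]
    [best (car best)]
    [else 'ambiguous]))

;; The three foo-impl bindings from the example
(define foo-impl-bindings
  (list (cons 'binding-1        (set 'module))
        (cons 'binding-2-from-4 (set 'module 'intro-4))
        (cons 'binding-2-from-5 (set 'module 'intro-5))))

(resolve (set 'module 'intro-4 'use-bar) foo-impl-bindings) ; 'binding-2-from-4
(resolve (set 'module 'intro-5 'use-baz) foo-impl-bindings) ; 'binding-2-from-5
(resolve (set 'module) foo-impl-bindings)                   ; 'binding-1
(resolve (set 'module 'intro-4 'intro-5) foo-impl-bindings) ; 'ambiguous
```

The last call shows the only way to get an ambiguity here: a reference that
somehow carried both macro-intro scopes at once.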

> Q2: My macro introduces a new binding for /foo-impl/ at [2]. How is that
> identifier different from the /define/ identifier at [2]? That is to ask
> why the /define/ at [2] is bound, as we expect, to the /define/ in Racket,
> while any reference to /foo-impl/ in the subsequent template code refers to
> the binding at [2].


The part about any /foo-impl/ ref in the template referring to the binding
at [2] we already answered in Q1. Other identifiers, e.g. /define/ at [2],
are resolved the same way as discussed in Q1. This particular /define/
resolves using the scopes of the macro definition site, that of
/define-foo/, and the only /define/ binding visible there is Racket's own.

> Q3: If we were to remove the /foo-impl/ binding at [2], any template code
> would happily refer to the /foo-impl/ binding at [1]. How so?


Again Q1 kinda answers that. Put simply there are fewer /foo-impl/ bindings 
to choose from, and the one at [1] happens to have the scope set that is a 
subset of any use in the template e.g. at [3].
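
A small sketch of this case, with the generated definition removed from the
template. The macro name /define-foo/shared/ and the name /qux/ are made up
for the example, and the syntax-parse clause is again my assumption for the
elided transformer body.

```racket
#lang racket
(require syntax/parse/define
         (for-syntax syntax/parse))

;; [1] the only foo-impl binding left
(define (foo-impl op a b) (list 'module-level (op a b)))

;; Like define-foo, but the template no longer defines its own foo-impl
(define-simple-macro (define-foo/shared name:id)
  (define-syntax (name stx)
    (syntax-parse stx
      [(_ . args) #'(foo-impl . args)])))

(define-foo/shared qux)

(qux + 1 2) ; => '(module-level 3), the reference falls back to [1]
```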

> Q4: If instead of [5] I were to just copy-paste [4], would Racket yell
> about an attempt to redefine /bar/? Why? (This is inspired by Greg's comment
> about defining /blerg/.)


Yes it would. The /bar/ id in both expansions of /define-foo/ would have the
exact same scope set (from the macro use site) and no macro-intro scope to
disambiguate. That's because we passed the /bar/ id to the macro as a syntax
object. Technically, each /bar/ gets a fresh macro-intro scope on the way
"into" the transformer, but it is removed on the way "out", i.e. in the code
generated by the transformer. In reverse, any identifier in the macro
template starts with no macro-intro scope but ends up with one in the
generated code. The docs refer to this process as "flipping" the macro-intro
scope.
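
The flipping can be poked at directly with make-syntax-introducer, which
hands you a function that adds, removes or flips one fresh scope. This is a
sketch of the idea only: the expander does the flip for you around every
transformer call, and the names below (/intro/, /id-in/, /id-flipped/,
/id-out/) are just for the demo.

```racket
#lang racket

;; One fresh scope, standing in for a macro-introduction scope
(define intro (make-syntax-introducer))

;; An identifier as it would arrive from the macro use site
(define id-in #'bar)

;; On the way "into" the transformer the scope is flipped on ...
(define id-flipped (intro id-in 'flip))

;; ... and on the way "out" it is flipped off again
(define id-out (intro id-flipped 'flip))

(bound-identifier=? id-in id-flipped) ; #f, scope sets differ by one scope
(bound-identifier=? id-in id-out)     ; #t, back to the use-site scope set
```

An identifier that only exists in the template never had the scope, so the
single flip on the way out leaves it with the scope, which is why two
expansions produce distinct /foo-impl/ bindings but the same /bar/.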


In conclusion: Racket's macro system requires some careful thought; looking 
at macro expansion will only take you so far. It took a lot out of me to 
think things through and put em in writing. Ben was right. Indeed, Racket 
macros are hygienic but I guess saying that it'll "gensym" for you is 
basically waving a lot of details away. Because instead of renaming we use 
these scope sets (glorified tokens or tags, really) and syntax objects 
carry those sets with them and can borrow, erase, lend them, I guess Racket 
can encode really bizarre scoping rules when you so desire (or you don't 
and just screwed up). It is certainly rich and expressive. But I wonder if 
there are shortcuts one could take to quickly reason in situations like 
these? Ones that would give the right answer 99% of time without 
deliberating about scope sets n all. Maybe it just gets easier every time 
you do it.


Now about that error in the docs:
https://docs.racket-lang.org/reference/syntax-model.html#%28tech._scope._set%29

> An identifier refers to a particular binding when the reference’s symbol
> and the identifier’s symbol are the same, and when the reference’s scope
> set is a subset of the binding’s scope set.


Should probably read:

> An identifier refers to a particular binding when the reference’s symbol
> and the identifier’s symbol are the same, and when the reference’s scope
> set is a superset of the binding’s scope set.


or perhaps equivalently

> An identifier refers to a particular binding when the reference’s symbol
> and the identifier’s symbol are the same, and when the binding’s scope set
> is a subset of the reference’s scope set.


Unless I'm mistaken, that much should be obvious from the examples. It
amounts to the fact that as we go deeper (in nesting) into the code tree
the number of scopes attached to identifiers can only grow, so any
reference to something "previously" defined has "more" scopes, not fewer,
than its potential bindings. (Caveat: strictly this may not always be true,
because a macro transformer could get fancy and borrow another identifier's
scopes for whatever it generates, but whatever.)

Also the next sentence about ref resolution kinda hints at the correct 
wording:

> For a given identifier, multiple bindings may have scope sets that are
> subsets of the identifier’s; in that case, the identifier refers to the
> binding whose set is a superset of all others; if no such binding exists,
> the reference is ambiguous

Was that convincing? Do I misunderstand how scope sets work after all? Do I 
need to PR?

Thanks

