Re: Gensym collisions can be engineered.

2013-08-17 Thread J.-F. Rompre
Interesting discussion - I was pretty surprised to see this can happen

I don't understand all the details about the solutions discussed in this 
thread, but it seems that the only way to ensure no local can be clobbered, 
gensym'ed or not, is to introduce into the RT a local resolution phase  
table which maps all user (as found in code) symbols in the current scope 
to a unique symbol - sort of a gensym for regular vars, to that in the 
following example, the local G__2244 defined by the user would get remapped 
internally.
The table could be wiped out at the end of its scope...admittedly this is 
probably a challenge to provide for nested scopes.

(defmacro clobber [  forms ] 
 (let [ club (gensym) ] 
   `(let [ ~club squirrel ] 
   ~@forms
 ;;the following to help find a clobbered var 
in the scope 
 ;; enclosing the macro call 
 (println clobbering symbol:  '~club)
)))

 give it a few REPL (gensym) tries, and:

user= (let [ G__2244 mountain lion ] (clobber (println I am a  
G__2244)))
I am a  squirrel
clobbering symbol:  G__2244



On Saturday, November 7, 2009 11:59:08 PM UTC-5, John Harrop wrote:

 user= (def q 'G__723)
 #'user/q
 user= (def r (gensym))
 #'user/r
 user= q
 G__723
 user= r
 G__723
 user= (= q r)
 true

 It's possible to anticipate the next gensym name that will be generated 
 and then engineer a collision, and therefore possibly variable capture 
 unintended by the author of a macro.

 It looks like manually generating a name does not remove it from the 
 pool of allowable future gensym names.

 This shouldn't tend to cause accidental problems in practice, since gensym 
 names tend not to collide with the kinds of identifiers programmers 
 ordinarily use. Nonetheless it can be partially fixed comparatively easily 
 by adding to the runtime a WeakHashMap into which a reference to any 
 symbol, however created, is placed and modifying the gensym-generator to, 
 at the point where it currently returns a value, first check if the 
 WeakHashMap contains it already and if so, generate the next one, and the 
 next, as needed until it gets a not-in-use name. This requires gensym 
 creation and normal symbol creation to occur in a global lock but symbol 
 creation rarely occurs at runtime and even more rarely in any kind of 
 hot-spot at runtime. The use of WeakHashMap would prevent the runtime 
 clogging up with ungarbagecollectable symbols, so the memory overhead of 
 adding this would be smallish, one machine pointer per in-use symbol or so, 
 equivalent to if every symbol had a few more characters in its name.

 This would stop gensym from producing a name already in use. It wouldn't 
 prevent a gensym being generated somewhere and *then* the identical name 
 being put together someplace else and passed to the symbol function, 
 though; a small loophole. Collisions between gensyms in preexisting code 
 and an enclosing lexical scope in new code would become impossible, but 
 collisions between gensyms in preexisting code and an enclosed lexical 
 scope (e.g. a macro-invocation body, such as a loop body) would remain 
 theoretically possible.

 That last loophole can't really be plugged without giving Clojure true 
 gensyms (uninterned anywhere), which would bring with it its own 
 architectural problems. If that change were made, the above REPL 
 interaction would be possible up to the (= q r) evaluation, but that would 
 return false despite the names looking the same, so the symbols wouldn't 
 really collide even though a collision of their printed representations 
 could still be engineered. One bothersome consequence though would be that 
 the textual output of macroexpand-1 could no longer always be substituted 
 for a macro call in source code without altering the run-time semantics of 
 that code, even when the macro's expansion-generation does not have side 
 effects.

 (To reproduce that REPL interaction for yourself, evaluate (gensym) at the 
 REPL and note the number. Add seven and then evaluate (def q 'G__#) with # 
 replaced with the sum. Then evaluate (def r (gensym)) and, if you like, 
 skip straight to (= q r). If that doesn't work, the number it goes up by 
 must have changed between Clojure 1.0 and whatever version you're using; 
 evaluate (gensym), then (def q 'xyzzy), then (gensym) again and note the 
 increment and use that instead of seven. Oh, and don't have any background 
 threads running in that JVM instance that might do something autonomously 
 that causes a gensym to be generated.)



-- 
-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, 

Re: Gensym collisions can be engineered.

2009-11-12 Thread Kevin Tucker
Personally I think preventing unexpected gensym collisions is the more
important property, otherwise it's not even worth having, might as
well just make your own cryptic names.   I don't think you can have
both.

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Gensym collisions can be engineered.

2009-11-11 Thread Kevin Tucker
Yeah, sorry, missed that.

How does making the gensyms unreadable make things worse for
macroexpand than they are in CL?  If the gensym is used more than once
in the expansion (like bound to something in a let then referenced),
then reading the expansion back in will read two different symbols and
it won't work anyway.  If it doesn't get printed in reread it isn't a
problem in either system.  Or am I missunderstanding what you are
saying?

* (defmacro result (e) (let ((g (gensym))) `(let ((,g ,e)) ,g)))
* (result (+ 1 2))
3
* (macroexpand-1 '(result (+ 1 2)))
(LET ((#:G645 (+ 1 2))) #:G645)
* (LET ((#:G645 (+ 1 2))) #:G645)
ERROR

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Gensym collisions can be engineered.

2009-11-11 Thread John Harrop
On Wed, Nov 11, 2009 at 10:46 PM, Kevin Tucker tuckerke...@gmail.comwrote:

 Yeah, sorry, missed that.

 How does making the gensyms unreadable make things worse for
 macroexpand than they are in CL?


It doesn't. Just worse than they currently are in Clojure. :)

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Gensym collisions can be engineered.

2009-11-10 Thread John Harrop
On Mon, Nov 9, 2009 at 10:04 PM, Kevin Tucker tuckerke...@gmail.com wrote:

 I in CL they can be read but aren't interned in any package so every
 time you read it you get a different symbol.


Yes, I know; I said that myself in the first post. But your first post
inspired in me the idea of simply making symbols that would collide with
gensyms unreadable period. :)

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Gensym collisions can be engineered.

2009-11-09 Thread Kevin Tucker

This is something that I have been wondering about too.  In CL the
symbols gensym produces can not be read by the reader so there can be
no collision cause the only way to get a handle on the symbol is to
create it with gensym and hold on to it.  In other words you couldn't
construct a symbol that would = one gensym returns even after the fact
if you wanted to except by copying the reference.  At least that was
my understanding of it.  Seem like those semantics work pretty well.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: Gensym collisions can be engineered.

2009-11-09 Thread John Harrop
On Sun, Nov 8, 2009 at 5:56 PM, Kevin Tucker tuckerke...@gmail.com wrote:

 This is something that I have been wondering about too.  In CL the
 symbols gensym produces can not be read by the reader so there can be
 no collision cause the only way to get a handle on the symbol is to
 create it with gensym and hold on to it.  In other words you couldn't
 construct a symbol that would = one gensym returns even after the fact
 if you wanted to except by copying the reference.  At least that was
 my understanding of it.  Seem like those semantics work pretty well.


Actually, this suggests another possibility for collision prevention:
disallow one character from appearing in symbols read by the reader or
produced by (symbol a-string). Both would throw an exception if they saw it.
It should be a character rarely if ever used in symbols; something like ~ or
' would be ideal. Gensyms could then include this character and be assured
of never colliding with non-gensyms. (This would include the auto-generated
parameter names produced by the #(foo %1 bar %2) reader macro, as well as
those produced in `(foo x# ~bar) and by (gensym).)

There's still a problem with macroexpand output, but instead of run-time
behavior changes the output just wouldn't compile. The illegal character
might also be stripped from symbols in macroexpand output, to make output
that will compile and should have the intended semantics but will have the
original slight chance of a collision of names.

Making the gensym symbol names begin with a leading ' gets you halfway
already since no such symbol name can be generated by the reader already;
the ' will be treated as (quote ...). It also won't throw an exception,
though, but it will be clear why it doesn't work to anyone who knows about
quoting. Having the symbol function reject strings that start with ' would
then be the only other needed change, and is extremely unlikely to break any
existing code.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: Gensym collisions can be engineered.

2009-11-09 Thread Kevin Tucker

I in CL they can be read but aren't interned in any package so every
time you read it you get a different symbol.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---



Re: Gensym collisions can be engineered.

2009-11-08 Thread André Ferreira

It's one thing to try to protect the programmer from accidentally
shooting himself on the foot, but I don't think it is as necessary for
a language to prevent him from doing it on purpose.

On 8 nov, 02:59, John Harrop jharrop...@gmail.com wrote:
 user= (def q 'G__723)
 #'user/q
 user= (def r (gensym))
 #'user/r
 user= q
 G__723
 user= r
 G__723
 user= (= q r)
 true

 It's possible to anticipate the next gensym name that will be generated and
 then engineer a collision, and therefore possibly variable capture
 unintended by the author of a macro.

 It looks like manually generating a name does not remove it from the pool
 of allowable future gensym names.

 This shouldn't tend to cause accidental problems in practice, since gensym
 names tend not to collide with the kinds of identifiers programmers
 ordinarily use. Nonetheless it can be partially fixed comparatively easily
 by adding to the runtime a WeakHashMap into which a reference to any symbol,
 however created, is placed and modifying the gensym-generator to, at the
 point where it currently returns a value, first check if the WeakHashMap
 contains it already and if so, generate the next one, and the next, as
 needed until it gets a not-in-use name. This requires gensym creation and
 normal symbol creation to occur in a global lock but symbol creation rarely
 occurs at runtime and even more rarely in any kind of hot-spot at runtime.
 The use of WeakHashMap would prevent the runtime clogging up with
 ungarbagecollectable symbols, so the memory overhead of adding this would be
 smallish, one machine pointer per in-use symbol or so, equivalent to if
 every symbol had a few more characters in its name.

 This would stop gensym from producing a name already in use. It wouldn't
 prevent a gensym being generated somewhere and *then* the identical name
 being put together someplace else and passed to the symbol function, though;
 a small loophole. Collisions between gensyms in preexisting code and an
 enclosing lexical scope in new code would become impossible, but collisions
 between gensyms in preexisting code and an enclosed lexical scope (e.g. a
 macro-invocation body, such as a loop body) would remain theoretically
 possible.

 That last loophole can't really be plugged without giving Clojure true
 gensyms (uninterned anywhere), which would bring with it its own
 architectural problems. If that change were made, the above REPL interaction
 would be possible up to the (= q r) evaluation, but that would return false
 despite the names looking the same, so the symbols wouldn't really collide
 even though a collision of their printed representations could still be
 engineered. One bothersome consequence though would be that the textual
 output of macroexpand-1 could no longer always be substituted for a macro
 call in source code without altering the run-time semantics of that code,
 even when the macro's expansion-generation does not have side effects.

 (To reproduce that REPL interaction for yourself, evaluate (gensym) at the
 REPL and note the number. Add seven and then evaluate (def q 'G__#) with #
 replaced with the sum. Then evaluate (def r (gensym)) and, if you like, skip
 straight to (= q r). If that doesn't work, the number it goes up by must
 have changed between Clojure 1.0 and whatever version you're using; evaluate
 (gensym), then (def q 'xyzzy), then (gensym) again and note the increment
 and use that instead of seven. Oh, and don't have any background threads
 running in that JVM instance that might do something autonomously that
 causes a gensym to be generated.)
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~--~~~~--~~--~--~---