user=> (def q 'G__723)
#'user/q
user=> (def r (gensym))
#'user/r
user=> q
G__723
user=> r
G__723
user=> (= q r)
true

It's possible to anticipate the next gensym name that will be generated and
then engineer a collision, and with it, possibly, a variable capture that
the author of a macro never intended.

It looks like "manually" generating a name does not remove it from the pool
of allowable future gensym names.

This shouldn't tend to cause accidental problems in practice, since gensym
names tend not to collide with the kinds of identifiers programmers
ordinarily use. Nonetheless it can be partially fixed comparatively easily.
Add to the runtime a WeakHashMap into which a reference to any symbol,
however created, is placed. Then modify the gensym generator so that, at the
point where it currently returns a value, it first checks whether the
WeakHashMap already contains that name and, if so, generates the next one,
and the next, as needed until it gets a name that is not in use. This
requires gensym creation and normal symbol creation to occur under a global
lock, but symbol creation rarely occurs at runtime and even more rarely in
any kind of hot spot. Using a WeakHashMap would keep the runtime from
clogging up with symbols that can never be garbage-collected, so the memory
overhead of adding this would be smallish: one machine pointer per in-use
symbol or so, equivalent to every symbol having a few more characters in
its name.
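
As a rough user-level sketch of that idea (not a patch to the runtime): the
names in-use, register-symbol! and checked-gensym below are hypothetical, and
the registry only knows about symbols that are explicitly registered, whereas
the proposal above would have the runtime record every symbol it creates.

```clojure
(import 'java.util.WeakHashMap)

;; Hypothetical registry standing in for the runtime-wide table.
;; WeakHashMap compares keys with equals/hashCode, so a symbol with the
;; same name is found even if it is a different object.
(defonce in-use (WeakHashMap.))

(defn register-symbol!
  "Record sym as in use and return it. (Hypothetical helper.)"
  [sym]
  (locking in-use
    (.put ^WeakHashMap in-use sym Boolean/TRUE))
  sym)

(defn checked-gensym
  "Like gensym, but keeps generating until the name is not in the registry."
  ([] (checked-gensym "G__"))
  ([prefix]
   ;; The lock plays the role of the global lock mentioned above.
   (locking in-use
     (loop [candidate (gensym prefix)]
       (if (.containsKey ^WeakHashMap in-use candidate)
         (recur (gensym prefix))
         (do (.put ^WeakHashMap in-use candidate Boolean/TRUE)
             candidate))))))
```

With this in place, a name registered by hand via register-symbol! is
skipped by checked-gensym for as long as the registered symbol is still
reachable.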

This would stop gensym from producing a name already in use. It wouldn't
prevent a gensym being generated somewhere and *then* the identical name
being put together someplace else and passed to the symbol function, though;
a small loophole. Collisions between gensyms in preexisting code and an
enclosing lexical scope in new code would become impossible, but collisions
between gensyms in preexisting code and an enclosed lexical scope (e.g. a
macro-invocation body, such as a loop body) would remain theoretically
possible.

That last loophole can't really be plugged without giving Clojure "true"
gensyms (uninterned anywhere), which would bring with it its own
architectural problems. If that change were made, the above REPL interaction
would be possible up to the (= q r) evaluation, but that would return false
despite the names looking the same, so the symbols wouldn't really collide
even though a collision of their printed representations could still be
engineered. One bothersome consequence though would be that the textual
output of macroexpand-1 could no longer always be substituted for a macro
call in source code without altering the run-time semantics of that code,
even when the macro's expansion-generation does not have side effects.

(To reproduce that REPL interaction for yourself, evaluate (gensym) at the
REPL and note the number. Add seven and then evaluate (def q 'G__#) with #
replaced with the sum. Then evaluate (def r (gensym)) and, if you like, skip
straight to (= q r). If that doesn't work, the number it goes up by must
have changed between Clojure 1.0 and whatever version you're using; evaluate
(gensym), then (def q 'xyzzy), then (gensym) again and note the increment
and use that instead of seven. Oh, and don't have any background threads
running in that JVM instance that might do something autonomously that
causes a gensym to be generated.)
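
The same experiment can be sketched as a single expression, which avoids
having to work out how far the counter moves between REPL evaluations. This
is only a sketch: the function name next-gensym-predictable? is made up, and
it assumes that gensym names are "G__" followed by a counter value and that
nothing else (another thread, say) draws from that counter between the two
gensym calls.

```clojure
(defn next-gensym-predictable?
  "Build by hand the name we expect the next gensym to have, then compare.
  (Hypothetical helper, for illustration only.)"
  []
  (let [probe (gensym)                              ; e.g. G__723
        n     (Long/parseLong (subs (name probe) 3)) ; strip the \"G__\" prefix
        guess (symbol (str "G__" (inc n)))          ; hand-built name
        nxt   (gensym)]                             ; the real next gensym
    (= guess nxt)))
```

Under those assumptions, (next-gensym-predictable?) is expected to return
true, which is the collision shown in the REPL interaction.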
