I think the latest draft 6 ballot resolutions 
(http://trac.sacrideo.us/wg/wiki/WG1Ballot6Results) on circular data structures 
still have problems; below are two simple alternatives that I think would solve 
them.


First, trac #338 has the display procedure "generate loops like write", but it 
appears to me that the later resolution in #442 means that the resolution of 
#338 started with a false premise.  After all, the write procedure no longer 
generates "loops like write", as was assumed by #338 vote.  So #338 needs 
revisiting.

Second, trac #442's resolution imposes a large overhead that is completely 
unnecessary to meet the stated goal (termination).  The comments in #442 make 
it clear that for ordinary "write", the primary thing some people cared about 
was termination in the presence of cycles.

But if that were the goal, I would have expected the specification to permit 
far more efficient implementations than it appears to permit.  For example, I 
would have expected the spec to permit an approach I'll call the "suspicious 
counter" implementation:
* Run equal?/write/display for a while, and recurse *without* tracking cycles. 
Instead, have a running counter of the number of cons/vectors that have been 
iterated so far.  The running counter can be handled with tail recursion, so 
this counting would be very efficient.
* If the counter exceeds some "suspicious" value, THEN start tracking to see if 
we have a cycle (this tracking might be done with a hashtable, not available in 
the spec).

This would still impose a significant overhead, because ANY cycle-detection 
system imposes a massive overhead, but on average its overhead would be FAR 
smaller than the one currently mandated by #442 and #338 in most cases.

However, the resolution of #442 (write+simple+shared) seems to forbid a 
"suspicious counter" implementation.  The current resolution of #442 still 
requires labels when there's a cycle, even though it seems unlikely that the 
user actually wants (or cares) about labels. If the user wanted labels, the 
user would have called write-shared.  I recommend that write-shared be used 
when you actually want labels, and if the purpose is just to ensure termination 
of write and display with cycles, then just require that they terminate in the 
presence of cycles and say *nothing* *else* other than what happens when a 
cycle is detected.  In particular, don't require labelling; that would forbid a 
"suspicious counter" implementation.  This would make "write" and "display" 
more consistent with "equal?", since "equal?" already only requires termination 
in the presence of cycles.  I think it's reasonable to have write and display 
raise a condition; I/O routines already have the possibility of trouble .

In addition, there seems to be a major gap in the definition of "equal?" 
involving cycles.  It currently says, "Even if its arguments are circular data 
structures, equal? must always terminate."  That additional sentence adds a 
MAJOR new performance and implementation cost on "equal?" compared to R5RS.  
But what, exactly, is that procedure supposed to do if a cycle is detected?  
Return #f?  Throw something?  Either?  This additional requirement imposes a 
major new performance cost, but the spec doesn't even promise to do something 
if it happens.  That's a costly effort to produce an undefined result.  (My 
preference would be #f, f "equal?" has cycle detection.  People expect equal? 
to just return true or false, and few people would put guards around "equal?").

I think redefining these standard functions, which *explicitly* did not 
guarantee termination in the presence of cycles in R5RS and before, is 
unfortunate.  I understand the rationale.  But cycles are exceedingly rare, and 
detecting them has a significant overhead.  The spec is setting up a situation 
where NON-compliant Scheme will be demonstrably faster than compliant ones in 
the normal case, because there is no standard way to request the traditional 
semantics (WITHOUT cycle detection).  In particular, cycle detection would 
typically be done by a hash table (not in the standard!) or via lists 
(sloooow), both of which impose creating and growing data structures, etc.  Why 
impose such a big overhead on "normal" procedures for an extremely unusual 
case, especially since Scheme for decades has *specifically* made clear that 
users couldn't count on cycle detection in these procedures?  You don't need to 
be Neo to use traditional "write" and "equal?".  Scheme already has many ways 
to get into infinite loops; this is an expensive way to prevent an odd case 
while not really providing additional safety.

Don't get me wrong, I think cycle detection is valuable. But these 
cycle-detection procedures have fundamentally different same space and time 
properties from the R5RS procedures.  New procedures, with different semantics 
and impacts, should have new names!  Those new procedures should have the 
semantics above, where they ONLY are guaranteed to terminate in the presence of 
cycles and where the result of "equal?" in cycles is actually defined.

Alternatively, if you really do want to change equal?/write/display to do cycle 
detection (compared to R5RS), I think it is *critical* that the spec do the 
following:
* In "equal?", change "Even if its arguments are circular data structures, 
equal? must always terminate" to "Even if its arguments are circular data 
structures, equal? must always terminate; equal? must return #f if it 
terminates due to a cycle."
* "write" should specifically say: "Shared structure is not represented using 
datum labels (see write-shared)."
* Both "write" and "display" should say: "This procedure must always terminate; 
if a circular data structure is detected, a condition is raised."

Please do NOT require that write and display print labels; the user didn't 
request them.  And similarly, what exactly equal? will do if given a cycle 
needs to be clear, or users can't depend on it.

In general, I like how R7RS is shaping up; this seems to be a sore spot in an 
otherwise really promising spec.  I especially can't wait for a library and 
exception system that might actually *PORT* across Schemes.

Please re-examine #338, #442, and "equal?" in this light.

Thanks very much.

--- David A. Wheeler

_______________________________________________
Scheme-reports mailing list
[email protected]
http://lists.scheme-reports.org/cgi-bin/mailman/listinfo/scheme-reports

Reply via email to