Re: [r6rs-discuss] Proposed NON-features for small Scheme, part 8: string-set! must die

Thomas Lord Sun, 20 Sep 2009 12:08:32 -0700

On Sat, 2009-09-19 at 16:13 -1000, Shiro Kawai wrote:
> From: Thomas Lord


> > [....]

> RnRS abandoning mutable strings does *not* prevent such
> tiny Scheme from having mutable strings as an implementation's
> extension.

And vice versa.   

Both are "nice to have" and I would expect that
most implementations will want to support both.
It would be good to sanctify some specification 
of both types and how they relate.


> [...]
> Requiring string ports (string builder) shouldn't be much
> burden to the tiny Scheme; 

String ports are an example of a generic
problem for which disjointed, piecemeal 
solutions seem the wrong approach (puns intended).
String ports are not the only kind of alternative
port desirable.  Similarly it would be reasonable
to support and encourage new types of numbers,
new types of sequences, and so forth.

In other words, it comes up in many areas that
we look at some traditional "primitive" type 
in Scheme and want to be able to create "variations".
String ports are just one example.

Common Lisp sets an interesting example by
defining certain generic functions, such
as the sequence functions.   It provides a 
mechanism for creating new, disjoint types
and for creating and extending the behavior
of generics.

Core Scheme should get, as simply as practical,
to support for generics and program-created 
disjoint types - and then define these other 
things (like string ports) by giving implementations
in terms of those.  (Implementations are not
obligated to use the definitional implementations,
of course.)
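
As a rough illustration of that idea (in Python rather than
Scheme, purely so the sketch can be run as-is): a generic
write operation plus two program-created, disjoint port types.
The names StringPort, CountingPort, write_string and
port_contents are hypothetical and come from no Scheme report
or implementation.

```python
from functools import singledispatch


class StringPort:
    """A program-created type, disjoint from the built-in ones (hypothetical)."""

    def __init__(self):
        self.chunks = []


class CountingPort:
    """Another disjoint port type: it only counts the characters written."""

    def __init__(self):
        self.count = 0


@singledispatch
def write_string(port, text):
    """Generic operation: there is no method for unknown port types."""
    raise TypeError(f"no write_string method for {type(port).__name__}")


@write_string.register
def _(port: StringPort, text):
    # Method for string ports: accumulate the pieces written so far.
    port.chunks.append(text)


@write_string.register
def _(port: CountingPort, text):
    # Method for counting ports: keep only the running length.
    port.count += len(text)


def port_contents(port: StringPort) -> str:
    """Return everything written to a string port as one string."""
    return "".join(port.chunks)


# Usage: the same generic call works on both port types.
sp, cp = StringPort(), CountingPort()
for p in (sp, cp):
    write_string(p, "hello, ")
    write_string(p, "world")
```

The point of the sketch is only that a string port needs no
special status once generics and disjoint types exist; a
built-in version would merely have to behave the same way.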


> I feel that your discussion explains why mutable
> string benefits tiny Scheme, but doesn't support why
> mutable strings should be in the standard.

That is because we have to first agree on the 
desired form and function of the standard.

The classic old "50 pagers" made distinctions
between "core", "syntax", and "library" specifications.
The hint was that given the specified core and 
primitive syntax, the rest could be defined in terms
of those.   Oddly, in my opinion, library items 
were given only narrative definitions.

My thought for R7/small is for an even smaller
than traditional core, with "the rest" given both
narrative and code definitions.

I am not sure I would want to see "the rest"
polarized into "required" and "optional".  Rather,
"the rest" should comprise just about everything 
commonly found in implementations and anything widely
acclaimed, and various subsets of what is provided
or not provided from those things can be given names.

That gives a lattice of feature sets where each point
is what might be "built in".  It gives the opportunity
to give names to acclaimed "sweet spots" on that lattice.
Of course, using the definitional implementations of
features not found in a given implementation, programs can
always add any features they need but that aren't built-in.

Implementations then "conform" if they provide that
tiny core - e.g., with only vectors and fixnums, generics,
extensible syntax and lexical language, user-made disjoint
types, lambda, and minimal ports.   Implementations are
"correct" to the extent that any additional standard features
they happen to build in conform to the definitional implementations
of those features.

That would, as a side effect, take the R7 authors out 
of directly enumerating what an implementation MUST build 
in beyond the tiny core.   The choice of sweet spots on 
the lattice of feature mixes is not one that requires 
community-wide agreement.
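
One way to picture the lattice, again as a hedged Python sketch
rather than a proposal: feature sets are plain sets ordered by
inclusion, "conforms" means "contains the tiny core", and a
sweet spot is just a named set.  The core list follows the one
given above; the sweet-spot names and the extra features in
them are made up for illustration.

```python
# The tiny core, following the list given in the text above.
CORE = frozenset({
    "vectors", "fixnums", "generics", "extensible-syntax",
    "disjoint-types", "lambda", "minimal-ports",
})

# Hypothetical named points on the lattice (names are illustrative only).
SWEET_SPOTS = {
    "tiny": CORE,
    "text": CORE | {"mutable-strings", "string-ports"},
    "numeric": CORE | {"bignums", "flonums"},
}


def conforms(built_in):
    """An implementation conforms if it builds in at least the core."""
    return CORE <= frozenset(built_in)


def missing(built_in, spot):
    """Features a program would load from the definitional implementations."""
    return SWEET_SPOTS[spot] - frozenset(built_in)


def join(a, b):
    """Least upper bound of two feature sets: everything in either."""
    return frozenset(a) | frozenset(b)


def meet(a, b):
    """Greatest lower bound of two feature sets: what both provide."""
    return frozenset(a) & frozenset(b)


# Usage: an implementation that builds in the core plus string ports.
impl = CORE | {"string-ports"}
needs = missing(impl, "text")  # what a program must add to reach "text"
```

Nothing here requires agreement on which points deserve names;
any subset that contains CORE is a legitimate point.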


> > > 2) Algorithms where you want to modify strings in the middle are rare,
> > 
> > Claims like this always make the hairs on the back of my 
> > neck stand up.  There are two problems with them.
> > First, there isn't a really good empirical way to establish
> > such claims.   Second, rarity per se, is not the most 
> > important consideration.
> [...]
> > Rarity is not an especially compelling argument.  More 
> > important is *importance*.   The question is less "how often
> > do I need to reach for string mutation?" so much as the
> > question is "how painful is it if when I want string mutation
> > I can't have it?".
> 
> It is plausible, but could you support your opinion with
> some concrete observation, experience, or algorithms?
> The counter observation of that 9 years of experience in
> Gauche community.

One of the more fun projects I've done in Scheme
was an Emacs-like text editor.   For that, I found
a very nice data-structure (good trade-offs) was
a kind of unholy mix of "gap buffers" (like in GNU
Emacs) with "ropes" (big strings represented as 
(in this case) splay trees of smaller strings).  
Modifying strings in the middle was important to 
good performance for this.  Not being able to modify
strings in the middle with expected-case decent
efficiency would have meant too much copying of data
or too high a fragmentation of long strings.
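
For readers who have not met gap buffers, here is a minimal
sketch of just that half of the structure, in Python for
runnability.  It is not the editor's actual code: the splay
tree of smaller strings is left out entirely, and the class
and method names are assumptions.  What it shows is why
in-place mutation matters: an edit near the gap touches only
a few slots instead of copying the whole text.

```python
class GapBuffer:
    """Text stored in one mutable array with a movable gap (sketch only)."""

    def __init__(self, text="", gap=16):
        self.buf = list(text) + [None] * gap
        self.gap_start = len(text)      # first free slot
        self.gap_end = len(self.buf)    # one past the last free slot

    def __len__(self):
        return len(self.buf) - (self.gap_end - self.gap_start)

    def _move_gap(self, pos):
        # Shift characters across the gap, in place, until it starts at pos.
        while self.gap_start > pos:
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def _grow(self, need):
        # Widen the gap by inserting free slots at its end.
        extra = max(need, len(self.buf))
        self.buf[self.gap_end:self.gap_end] = [None] * extra
        self.gap_end += extra

    def insert(self, pos, text):
        if not 0 <= pos <= len(self):
            raise IndexError(pos)
        self._move_gap(pos)
        if self.gap_end - self.gap_start < len(text):
            self._grow(len(text))
        for ch in text:
            self.buf[self.gap_start] = ch
            self.gap_start += 1

    def delete(self, pos, count):
        if count < 0 or pos < 0 or pos + count > len(self):
            raise IndexError(pos)
        self._move_gap(pos)
        self.gap_end += count           # deleted characters join the gap

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])


# Usage: edits in the middle mutate the same underlying array.
g = GapBuffer("hello world")
g.insert(5, ",")
g.delete(0, 1)
g.insert(0, "J")
```

In the mixed structure described above, each leaf of the tree
would be a small buffer of this kind, so an edit stays local
to one leaf.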

I noticed in the list of string-set! uses that Aubrey
posted from SLIB, one of the uses came from a library
that provided a "format" function: something that takes
a format string and a bunch of other parameters and
creates a new string (like sprintf in C).  That 
strikes me as another case where string mutation is
very handy for avoiding excess data copying and 
consing.

In any application where I/O filtering (read 
some input, tweak it, write output) needs to be
efficient, again, to avoid excessive data copying
string mutation is a big boon.


> > In the absence of mutation, when people want to implement
> > "a string like thing whose contents and length can 
> > change over time" 
> 
> string-set! and string-fill! aren't length-changing operations,
> so discussing the length-changing case is somewhat irrelevant.

> Surely a length-changing operation is useful. Mutable strings
> via string-set! don't give it to you, though.


Given the problems posed by Unicode encodings,
I think the time is ripe to bite the bullet and
make string-replace! the primitive (it replaces,
in situ, an arbitrary substring with an arbitrary string).
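
A hedged sketch of what such a primitive might look like,
written in Python over a mutable UTF-8 byte buffer.  The name
string_replace and its argument order are assumptions, not
taken from any report, and the index translation is
deliberately naive.  It illustrates one reading of the Unicode
point: with a variable-width encoding, even a one-for-one
character swap can change the stored length, so "set one
character" is already a special case of "replace a substring".

```python
def string_replace(buf: bytearray, start: int, end: int, new: str) -> None:
    """Replace characters [start, end) of the UTF-8 text in buf, in place.

    Sketch only: character indices are translated to byte offsets by
    decoding the whole buffer, which a real implementation would avoid.
    """
    text = buf.decode("utf-8")
    if not 0 <= start <= end <= len(text):
        raise IndexError((start, end))
    b_start = len(text[:start].encode("utf-8"))
    b_end = len(text[:end].encode("utf-8"))
    # Slice assignment mutates the same bytearray, growing or shrinking it.
    buf[b_start:b_end] = new.encode("utf-8")


# Usage: swapping one character for another changes the byte length.
s = bytearray("naive".encode("utf-8"))
same_object = s
string_replace(s, 2, 3, "\u00ef")    # "i" (1 byte) -> i with diaeresis (2 bytes)
string_replace(s, 0, 0, "so ")       # pure insertion at the front
```

string-set! then falls out as the case where end is start + 1
and the replacement is a single character.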



-t



_______________________________________________
r6rs-discuss mailing list
[email protected]
http://lists.r6rs.org/cgi-bin/mailman/listinfo/r6rs-discuss
