Re: Guile: What's wrong with this?

2012-01-03 Thread Bruce Korb

Hi Mike,

Thank you for the explanation.  However:

On 01/03/12 07:03, Mike Gran wrote:

>> It worked until I "upgraded" to openSuSE 12.1.
>>
>>   $ guile --version
>>   guile (GNU Guile) 2.0.2
>>   .
>>
>>   (set! tmp-text (get "act-text"))
>>   (set! TMP-text (string-upcase tmp-text))
>>
>> ERROR: In procedure string-upcase:
>> ERROR: string is read-only: ""
>
> There does seem to be some strangeness w.r.t. read-only
> strings going on.
>
> On Guile 1.8.8 if you create a string this way, it is
> not read-only.
>
> guile>  (define y "hello")
> guile>  (string-set! y 0 #\x)
> guile>  y
> "xello"
>
> On Guile 2.0.3, if you create a string the same way, it
> is read-only for some reason.
>
> scheme@(guile-user)>  (define y "hello")
> scheme@(guile-user)>  (string-set! y 0 #\x)
> ERROR: In procedure string-set!:
> ERROR: string is read-only: "hello"
>
> %string-dump can be used to confirm this


There are a couple of issues:

1.  "string-upcase" should only read the string
    (as opposed to "string-upcase!", which rewrites it).
2.  it is completely, utterly wrong to mutilate the
    Guile library into such a contortion that it
    interprets this:
        (define y "hello")
    to be a request to create an immutable string anyway.
    It very, very plainly says, "make 'y' and fill it with
    the string "hello".  Making it read only is crazy.

Furthermore, I do not even have an obvious way to deal
with the problem, short of a massive rewrite.
I define variables this way all over the place.
rewriting the code to
   (define y (string-append "hell" "o"))
everywhere is stupid, laborious, time consuming for me,
and time consuming at execution time.

Guile 2.0.1, 2.0.2 and 2.0.3 need some rethinking.  Dang!



Re: Guile: What's wrong with this?

2012-01-03 Thread Mike Gran
> From: Bruce Korb 

> 2.  it is completely, utterly wrong to mutilate the
>     Guile library into such a contortion that it
>     interprets this:
>         (define y "hello")
>     to be a request to create an immutable string anyway.
>     It very, very plainly says, "make 'y' and fill it with
>     the string "hello".  Making it read only is crazy.

Agreed.

-Mike




Re: Guile: What's wrong with this?

2012-01-03 Thread Ludovic Courtès
Hi Bruce,

And happy new year!

Bruce Korb  skribis:

> Thank you for the explanation.  However:
>
> On 01/03/12 07:03, Mike Gran wrote:
>>> It worked until I "upgraded" to openSuSE 12.1.
>>>
>>>   $ guile --version
>>>   guile (GNU Guile) 2.0.2
>>>   .
>>>
>>>   (set! tmp-text (get "act-text"))
>>>   (set! TMP-text (string-upcase tmp-text))
>>>
>>> ERROR: In procedure string-upcase:
>>> ERROR: string is read-only: ""

[...]

>> On Guile 2.0.3, if you create a string the same way, it
>> is read-only for some reason.
>>
>> scheme@(guile-user)>  (define y "hello")
>> scheme@(guile-user)>  (string-set! y 0 #\x)
>> ERROR: In procedure string-set!:
>> ERROR: string is read-only: "hello"
>>
>> %string-dump can be used to confirm this
>
> There are a couple of issues:
>
> 1.  "string-upcase" should only read the string
> (as opposed to "string-upcase!", which rewrites it).

Yes, that’s weird.  I can’t get string-upcase to raise a read-only
exception with 2.0.3, though.  Could you try with 2.0.3, or come up with
a reduced case?

> 2.  it is completely, utterly wrong to mutilate the
> Guile library into such a contortion that it
> interprets this:
> (define y "hello")
> to be a request to create an immutable string anyway.
> It very, very plainly says, "make 'y' and fill it with
> the string "hello".  Making it read only is crazy.

It stems from the fact that string literals are read-only, per R5RS
(info "(r5rs) Storage model"):

  In many systems it is desirable for constants (i.e. the values of literal
  expressions) to reside in read-only-memory.  To express this, it is
  convenient to imagine that every object that denotes locations is
  associated with a flag telling whether that object is mutable or immutable.
  In such systems literal constants and the strings returned by
  `symbol->string' are immutable objects, while all objects created by
  the other procedures listed in this report are mutable.  It is an error
  to attempt to store a new value into a location that is denoted by an
  immutable object.

In Guile this has been the case since commit
190d4b0d93599e5b58e773dc6375054c3a6e3dbf.

The reason for this is that Guile’s compiler tries hard to avoid
duplicating constants in the output bytecode.  Thus, modifying a
constant would actually change all other occurrences of that constant in
the code, making it a non-constant.  ;-)

> Furthermore, I do not even have an obvious way to deal
> with the problem,

You can use:

  (define y (string-copy "hello"))

> short of a massive rewrite.
> I define variables this way all over the place.
> rewriting the code to
>    (define y (string-append "hell" "o"))
> everywhere is stupid, laborious, time consuming for me,
> and time consuming at execution time.

I agree that this is laborious, and I’m sorry about that.  I can only
say that Guile < 2.0 being more permissive than the standard turns out
to be a mistake, in hindsight.

Thanks,
Ludo’.
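
A minimal sketch of the string-copy workaround suggested above. The variable names `greeting' and `shout' are illustrative only and do not come from the thread; only standard string procedures are used:

```scheme
;; Sketch: obtain a mutable string from a literal by copying it.
;; `greeting' and `shout' are hypothetical names.
(define greeting (string-copy "hello"))   ; fresh, mutable string

(string-set! greeting 0 #\x)              ; mutates the copy, not the literal
(display greeting)                        ; prints: xello
(newline)

;; string-upcase returns a new string and leaves its argument untouched.
(define shout (string-upcase greeting))
(display shout)                           ; prints: XELLO
(newline)
```

Because the copy is made explicitly, the same code should behave identically whether or not the implementation marks literals as read-only.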




Re: Guile: What's wrong with this?

2012-01-03 Thread Bruce Korb

On 01/03/12 14:24, Ludovic Courtès wrote:

>> 2.  it is completely, utterly wrong to mutilate the
>>     Guile library into such a contortion that it
>>     interprets this:
>>         (define y "hello")
>>     to be a request to create an immutable string anyway.
>>     It very, very plainly says, "make 'y' and fill it with
>>     the string "hello".  Making it read only is crazy.
>
> It stems from the fact that string literals are read-only, per R5RS
> (info "(r5rs) Storage model"):
>
>    [[blah, blah, blah]]
>
> In Guile this has been the case since commit
> 190d4b0d93599e5b58e773dc6375054c3a6e3dbf.
>
> The reason for this is that Guile’s compiler tries hard to avoid
> duplicating constants in the output bytecode.  Thus, modifying a


You have changed the interface without deprecation or any other multi-year
process.
Please change it back.  Please fix the problem by adding
(define-strict y "hello")
to have this new semantic.  Thank you.



Re: Guile: What's wrong with this?

2012-01-03 Thread Ludovic Courtès
Bruce,

Bruce Korb  skribis:

> On 01/03/12 14:24, Ludovic Courtès wrote:
>>> 2.  it is completely, utterly wrong to mutilate the
>>>  Guile library into such a contortion that it
>>>  interprets this:
>>>  (define y "hello")
>>>  to be a request to create an immutable string anyway.
>>>  It very, very plainly says, "make 'y' and fill it with
>>>  the string "hello".  Making it read only is crazy.
>>
>> It stems from the fact that string literals are read-only, per R5RS
>> (info "(r5rs) Storage model"):
>>
>>    [[blah, blah, blah]]
>>
>> In Guile this has been the case since commit
>> 190d4b0d93599e5b58e773dc6375054c3a6e3dbf.
>>
>> The reason for this is that Guile’s compiler tries hard to avoid
>> duplicating constants in the output bytecode.  Thus, modifying a
>
> You have changed the interface without deprecation or any other multi-year 
> process.

I could be just as offensive by suggesting that R5RS is 14 years old,
etc., but I’d rather work towards an acceptable solution with you.

Could you point me to the affected code?  What would you think of using
string-copy as I suggested?  The disadvantage is that you need to modify
your code, but hopefully that can be automated with a sed script or so;
the advantage is that it would work with all versions of Guile.

Thanks,
Ludo’.



Re: Guile: What's wrong with this?

2012-01-03 Thread Bruce Korb

On 01/03/12 15:33, Ludovic Courtès wrote:

> Could you point me to the affected code?  What would you think of using
> string-copy as I suggested?  The disadvantage is that you need to modify
> your code, but hopefully that can be automated with a sed script or so;
> the advantage is that it would work with all versions of Guile.


The disadvantage is that I know I have "clients" that have rolled their
own templates, presumably by copy-and-edit processes that will invariably
include (define var "string") syntax.  Likely a better approach is to
re-define the "define" function to my own C code and call the proper
scm_whathaveyou functions under the covers.

I'm sorry about being irritable.  This is the third problem with 2.x.
First a pre-defined value disappeared.  A very minor nuisance.
Then it turned out that the string functions would now clear the
high order bit on strings, so they are no longer byte arrays and
there is no replacement but to roll my own.  I stopped supporting
byte arrays.  A noticeable nuisance.

Now it turns out that the conventional, ordinary way of creating
a string variable yields a read-only string.  Ouch.  So I am cranky
and sorry about being so.

So I guess that's my fix.  Write another function dependent
upon Guile internals, much like scm_c_eval_string_from_file_line(),
by copying scm_define() code, checking for a string value and copying
that string -- if it is read-only?  Should I check for that?

What about "set!"?  Should I check for a read-only value there, too?
I do confess it feels a little bit like unraveling something.  It is scary.



Re: Guile: What's wrong with this?

2012-01-03 Thread Mike Gran
>   In many systems it is desirable for constants (i.e. the values of literal
>   expressions) to reside in read-only-memory.  To express this, it is
>   convenient to imagine that every object that denotes locations is
>   associated with a flag telling whether that object is mutable or immutable.
>   In such systems literal constants and the strings returned by
>   `symbol->string' are immutable objects, while all objects created by
>   the other procedures listed in this report are mutable.  It is an error
>   to attempt to store a new value into a location that is denoted by an
>   immutable object.
> 
> In Guile this has been the case since commit
> 190d4b0d93599e5b58e773dc6375054c3a6e3dbf.
> 
> The reason for this is that Guile’s compiler tries hard to avoid
> duplicating constants in the output bytecode.  Thus, modifying a
> constant would actually change all other occurrences of that constant in
> the code, making it a non-constant.  ;-)

This is a terrible example of the RnRS promoting some strange idea of
mathematical purity over being useful.
 
The idea that the correct way to initialize a string is
(define x (string-copy "string")) is awkward.  "string" is read-only
but copying it makes it modifiable?  Copying implies mutability?

Copying doesn't imply modifying mutability in any other data type.

Why not change the behavior of 'define' to be (define y (substring str 0))
when STR is a read-only string?  This would preserve the shared memory if
the variable is never modified but still make the string copy-on-write.
 
Regards,
 
Mike



Re: Guile: What's wrong with this?

2012-01-03 Thread Noah Lavine
Hello,

> Then it turned out that the string functions would now clear the
> high order bit on strings, so they are no longer byte arrays and
> there is no replacement but to roll my own.  I stopped supporting
> byte arrays.  A noticeable nuisance.

This is just a side note to the main discussion, but there is now a
'bytevector' datatype you can use. Does that work for you? If not,
what functionality is missing?

Thanks,
Noah
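
A small sketch of the bytevector datatype mentioned above, using the (rnrs bytevectors) module that ships with Guile 2.0. The buffer name `buf' and the byte values are made up for illustration:

```scheme
(use-modules (rnrs bytevectors))

;; `buf' is a hypothetical 4-byte buffer, zero-filled.
(define buf (make-bytevector 4 0))

;; Bytes with the high-order bit set are stored unchanged.
(bytevector-u8-set! buf 0 #xC3)
(bytevector-u8-set! buf 1 #xFF)

(display (bytevector-u8-ref buf 0))   ; prints: 195
(newline)
(display (bytevector->u8-list buf))   ; prints: (195 255 0 0)
(newline)
```

Unlike strings, a bytevector holds raw octets, so no character encoding is applied to its contents.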



Re: Guile: What's wrong with this?

2012-01-04 Thread nalaginrut
> >   In many systems it is desirable for constants (i.e. the values of literal
> >   expressions) to reside in read-only-memory.  To express this, it is
> >   convenient to imagine that every object that denotes locations is
> >   associated with a flag telling whether that object is mutable or immutable.
> >   In such systems literal constants and the strings returned by
> >   `symbol->string' are immutable objects, while all objects created by
> >   the other procedures listed in this report are mutable.  It is an error
> >   to attempt to store a new value into a location that is denoted by an
> >   immutable object.
> > 
> > In Guile this has been the case since commit
> > 190d4b0d93599e5b58e773dc6375054c3a6e3dbf.
> > 
> > The reason for this is that Guile’s compiler tries hard to avoid
> > duplicating constants in the output bytecode.  Thus, modifying a
> > constant would actually change all other occurrences of that constant in
> > the code, making it a non-constant.  ;-)
> 
> This is a terrible example of the RnRS promoting some strange idea of
> mathematical purity over being useful.
>  
> The idea that the correct way to initialize a string is
> (define x (string-copy "string")) is awkward.  "string" is read-only
> but copying it makes it modifiable?  Copying implies mutability?
>  
> Copying doesn't imply modifying mutability in any other data type.
>  
> Why not change the behavior of 'define' to be (define y (substring str 0))
> when STR is a read-only string?  This would preserve the shared memory if
> the variable is never modified but still make the string copy-on-write.
>  
> Regards,
>  
> Mike
> 

Hi guys! I was just passing by and saw your dispute.
I was confused by the new immutable string design too, but I use a
macro, "make-mutable-string", which hides string-copy behind an
abstraction.  If efficiency becomes an issue, one could implement
"make-mutable-string" with a bytevector instead.  And it's easy to
substitute with sed.

BTW, can't we make an efficient "mutable-string" module as an
alternative, behaving just like the old version?  I mean it could be a
Guile-specific feature.

-- 
GNU Powered it
GPL Protected it
GOD Blessed it

HFG - NalaGinrut





Re: Guile: What's wrong with this?

2012-01-04 Thread David Kastrup
Mike Gran  writes:

>>   In many systems it is desirable for constants (i.e. the values of literal
>>   expressions) to reside in read-only-memory.  To express this, it is
>>   convenient to imagine that every object that denotes locations is
>>   associated with a flag telling whether that object is mutable or immutable.
>>   In such systems literal constants and the strings returned by
>>   `symbol->string' are immutable objects, while all objects created by
>>   the other procedures listed in this report are mutable.  It is an error
>>   to attempt to store a new value into a location that is denoted by an
>>   immutable object.
>> 
>> In Guile this has been the case since commit
>> 190d4b0d93599e5b58e773dc6375054c3a6e3dbf.
>> 
>> The reason for this is that Guile’s compiler tries hard to avoid
>> duplicating constants in the output bytecode.  Thus, modifying a
>> constant would actually change all other occurrences of that constant in
>> the code, making it a non-constant.  ;-)
>
> This is a terrible example of the RnRS promoting some strange idea of
> mathematical purity over being useful.
>  
> The idea that the correct way to initialize a string is
> (define x (string-copy "string")) is awkward.  "string" is read-only
> but copying it makes it modifiable?  Copying implies mutability?
>  
> Copying doesn't imply modifying mutability in any other data type.

Huh?

(set-car! '(4 5) 3) => bad
(set-car! (list-copy '(4 5)) 3) => ok

Similar with literal vectors.

Why should strings be different here?

-- 
David Kastrup
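
The parallel drawn above between lists, vectors, and strings can be sketched as follows. The names `lst', `vec', and `str' are illustrative; the sketch only shows the "ok" side, where each literal is copied before it is mutated, and does not attempt to mutate a literal:

```scheme
;; Copy each literal before mutating it; the literals stay untouched.
(define lst (list-copy '(4 5)))
(define vec (vector-copy '#(4 5)))
(define str (string-copy "45"))

(set-car! lst 3)
(vector-set! vec 0 3)
(string-set! str 0 #\3)

(write (list lst vec str))   ; prints: ((3 5) #(3 5) "35")
(newline)
```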




Re: Guile: What's wrong with this?

2012-01-04 Thread Mark H Weaver
Bruce Korb  writes:
> 2.  it is completely, utterly wrong to mutilate the
> Guile library into such a contortion that it
> interprets this:
> (define y "hello")
> to be a request to create an immutable string anyway.
> It very, very plainly says, "make 'y' and fill it with
> the string "hello".  Making it read only is crazy.

No, `define' does not copy an object, it merely makes a new reference to
an existing object.  This is also true in C for that matter, so this
behavior is quite mainstream.  For example, the following program dies
with SIGSEGV on most modern systems, including GNU/Linux:

  int
  main()
  {
    char *y = "hello";
    y[0] = 'a';
    return 0;
  }

Scheme and Guile are the same as C in this respect.  Earlier versions of
Guile didn't make a copy of the string in this case either, but it
lacked the mechanism to detect this error, and allowed you to modify the
string literal in the program text itself, which is a _very_ bad idea.

For example, look at what Guile 1.8 does:

  guile> (let loop ((i 0))
           (define y "hello")
           (display y)
           (newline)
           (string-set! y i #\a)
           (loop (1+ i)))
  hello
  aello
  aallo
  aaalo
  aaaao
  aaaaa
  

So you see, even in Guile 1.8, (define y "hello") didn't do what you
thought it did.  It didn't fill y with the string "hello".  You were
actually changing the program text itself, and that was a serious
mistake.

I'm sincerely sorry that you got yourself into this mess, but I don't
see any good way out of it.  To fix it as you suggest would be like
suggesting that C should change the semantics of char *y = "hello" to
automatically do a strcpy because some existing programs were in the
habit of modifying the string constants of the program text.  That way
lies madness.

If you want to make a copy of a string constant from the program text as
a starting point for mutating the string, then you need to explicitly
copy it, just like in C.

  Mark



Re: Guile: What's wrong with this?

2012-01-04 Thread Ian Price
Bruce Korb  writes:

> You have changed the interface without deprecation or any other multi-year
> process.
> Please change it back.  Please fix the problem by adding
> (define-strict y "hello")
> to have this new semantic.  Thank you.

Fixing it with define-strict is ridiculous, as y is still mutable, it is
the string "hello" which is not. As for mutable strings, I consider them
a mistake to begin with, but if people expect them to be mutable, and
historically they are mutable (in guile), it is a mistake to change this
without prior warning.

-- 
Ian Price

"Programming is like pinball. The reward for doing it well is
the opportunity to do it again" - from "The Wizardry Compiled"



Re: Guile: What's wrong with this?

2012-01-04 Thread Mike Gran
> From: Mark H Weaver 
> No, `define' does not copy an object, it merely makes a new reference to
> an existing object.  This is also true in C for that matter, so this
> behavior is quite mainstream.  For example, the following program dies
> with SIGSEGV on most modern systems, including GNU/Linux:
> 
>   int
>   main()
>   {
>     char *y = "hello";
>     y[0] = 'a';
>     return 0;
>   }

 
True, but the following also is quite mainstream
int main()
{
  char y[6] = "hello";
  y[0] = 'a';
  return 0;
}
 
C provides a way to create and initialize a mutable string.
 
> Scheme and Guile are the same as C in this respect.  Earlier versions of
> Guile didn't make a copy of the string in this case either, but it
> lacked the mechanism to detect this error, and allowed you to modify the
> string literal in the program text itself, which is a _very_ bad idea.

It all depends on your mental model.  You're saying that (define y "hello")
attaches "hello" to y, and since "hello" is immutable, the string y
contains must be immutable.  This is an argument based on purity, not
utility.
 
If you follow that logic, then Guile is left without any shorthand
to create and initialize a mutable string other than
 
(define y (substring "hello" 0))
or 
(define y (string-copy "hello"))
 
Someone coming from any other language would be surprised to find that
the above is what you need to do to create and initialize a mutable string,
I think.
 
But 'define' just as easily can be considered a generic constructor
that is overloaded in a C++ sense, and when "hello" is a string, y is
assigned a copy-on-write version of the immutable string.
 
It was wrong to change this without deprecating it first.
 
Thanks,
 
Mike Gran



Re: Guile: What's wrong with this?

2012-01-04 Thread David Kastrup
Mike Gran  writes:

> If you follow that logic, then Guile is left without any shorthand
> to create and initialize a mutable string other than
>  
> (define y (substring "hello" 0))
> or 
> (define y (string-copy "hello"))

Sure.  Guile does not have shorthands for _mutable_ literals for lists
or vectors either.  One of the most significant points of a literal is
that you can rely on it staying the same.

> Someone coming from any other language would be surprised to find that
> the above is what you need to do to create and initialize a mutable
> string, I think.

I don't know any language that permits the modification of literals.

> But 'define' just as easily can be considered a generic constructor
> that is overloaded in a C++ sense,

It can be considered a lot of things that don't make sense.

> and when "hello" is a string, y is assigned a copy-on-write version of
> the immutable string.    It was wrong to change this without
> deprecating it first.

Modifying literals _never_ _ever_ was guaranteed to lead to predictable
results.  Undefined behavior before, undefined behavior afterwards.
There is no point in _deprecating_ something that _always_ was undefined
behavior.

-- 
David Kastrup




Re: Guile: What's wrong with this?

2012-01-04 Thread Andy Wingo
On Wed 04 Jan 2012 09:29, Mike Gran  writes:

>   char y[6] = "hello";
>  
> C provides a way to create and initialize a mutable string.

This one is more like

  (define y (string #\h #\e #\l #\l #\o))

just like

  (define y (list #\h #\e #\l #\l #\o))
  (define y (vector #\h #\e #\l #\l #\o))

etc.

> It all depends on your mental model.  You're saying that (define y "hello")
> attaches "hello" to y, and since "hello" is immutable, the string y
> contains must be immutable.

This is what the Scheme standard says, yes.

> This is an argument based on purity, not utility.

You don't think optimizations are of any use, then?  :-)  Immutable
literals allow literals to be coalesced, leading to the impressive 2x
speed improvements in Dorodango startup time, some months back.

> It was wrong to change this without deprecating it first.

I am not certain that is the case.  Mutating string literals has always
been an error in Scheme.  It did "work" with Guile 1.8 and before; but
since 1.9.0 when the compiler was introduced and started coalescing
literals, it has had the possibility to cause bugs.  The changes in
2.0.1 prevented those bugs by marking those strings as immutable.

I was going to propose a workaround with an option to change
vm-i-loader.c:43 and vm-i-loader.c:115 to use a
scm_i_mutable_string_literals_p instead of 1, but that really seems like
the path to perdition: previously compiled modules would start creating
mutable strings where they really shouldn't.

We could add a compiler option to turn string literals into (string-copy
FOO).  Perhaps that's the thing to do.

Andy
-- 
http://wingolog.org/



Re: Guile: What's wrong with this?

2012-01-04 Thread David Kastrup
Andy Wingo  writes:

> We could add a compiler option to turn string literals into
> (string-copy FOO).  Perhaps that's the thing to do.

What for?  It would mean that a literal would not be eq? to itself, a
nightmare for memoization purposes.

And for what?  For making code with explicitly undefined behavior
exhibit a particular behavior that is undesirable in general.

-- 
David Kastrup




Re: Guile: What's wrong with this?

2012-01-04 Thread Bruce Korb

On 01/04/12 04:19, Ian Price wrote:

> ...  As for mutable strings, I consider them
> a mistake to begin with,...


Let's step back and consider the whole point of Guile in the first place.

My understanding is that one primary purpose is to be a facilitation
language so that application developers have less to worry about and
futz over.  An extension language, if you like that phrase.  As such,
it would seem to me that a primary design goal would be to make the
pathway as smooth as possible, rather than trying to emulate C and/or
official Scheme language specs as closely as possible.  To me, my primary
concern is doing my little thing with the least total hassle.  Having
to study up on and thoroughly understand the Scheme language seems
a lot harder than just using Perl (or what-have-you).  Most scripting
languages don't cut you off at the knees (change interfaces).

So my main question is:

  Which is the higher priority, language purity or ease of use?

The answer to that question answers several other things, like
whether or not strings should be "allowed" to have high order bits
set (not be pure ASCII) and whether or not to make read only strings
be copy-on-write vs. fault-on-write.


> We could add a compiler option to turn string literals into (string-copy
> FOO).  Perhaps that's the thing to do.


No, because your clients have no control over how Guile gets built.
We _do_ have control over startup code, however:

  (if (defined? 'set-copy-on-write-strings)
      (set-copy-on-write-strings #t))

Or, better, keep historical behavior and add:

  (if (defined? 'set-no-copy-on-write-strings)
      (set-no-copy-on-write-strings #t))

and fix the 1.9 bug (scribbling on shared strings) by making them
copy-on-write thingys.

Thank you.



Re: Guile: What's wrong with this?

2012-01-04 Thread Mark H Weaver
Mike Gran  writes:

>> From: Mark H Weaver 
>> No, `define' does not copy an object, it merely makes a new reference to
>> an existing object.  This is also true in C for that matter, so this
>> behavior is quite mainstream.  For example, the following program dies
>> with SIGSEGV on most modern systems, including GNU/Linux:
>> 
>>   int
>>   main()
>>   {
>>     char *y = "hello";
>>     y[0] = 'a';
>>     return 0;
>>   }
>
>  
> True, but the following also is quite mainstream
> int main()
> {
>   char y[6] = "hello";
>   y[0] = 'a';
>   return 0;
> }
>  
> C provides a way to create and initialize a mutable string.

Scheme and Guile provide ways to do that too, but that's _never_ what
`define' has done.

>> Scheme and Guile are the same as C in this respect.  Earlier versions of
>> Guile didn't make a copy of the string in this case either, but it
>> lacked the mechanism to detect this error, and allowed you to modify the
>> string literal in the program text itself, which is a _very_ bad idea.
>
> It all depends on your mental model.  You're saying that (define y "hello")
> attaches "hello" to y, and since "hello" is immutable, the string y
> contains must be immutable.  This is an argument based on purity, not
> utility.

If we were designing a new language, then it would at least be pertinent
to argue this point.  However, this is the way `define' has _always_
worked in every variant of Scheme, and the same is true of the analogous
`set' in Lisp from the very beginning.

> If you follow that logic, then Guile is left without any shorthand
> to create and initialize a mutable string other than
>  
> (define y (substring "hello" 0))
> or 
> (define y (string-copy "hello"))

Guile provides all the machinery you need to define shorthand syntax if
you like, e.g:

  (define-syntax-rule (define-string v s) (define v (string-copy s)))

For that matter, you could also do something like this:

  (define-syntax define
    (lambda (x)
      (with-syntax ((orig-define #'(@ (guile) define)))
        (syntax-case x ()
          ((_ (proc arg ...) e0 e1 ...)
           #'(orig-define proc (lambda (arg ...) e0 e1 ...)))
          ((_ v e)
           (identifier? #'v)
           (if (string? (syntax->datum #'e))
               #'(orig-define v (string-copy e))
               #'(orig-define v e)))))))

This will change `define' (in the module where it's defined) to
automatically copy a bare string literal on the right side.  Note that
this check is done at compile-time, so it can't look at the dynamic type
of an expression.

If that's not good enough and you're willing to take the efficiency hit
at runtime for _every_ use of `define', you could change `define' to
wrap the right-hand expression within a procedure call to check for
read-only strings:

  (define (copy-if-string x)
    (if (string? x)
        (string-copy x)
        x))

  (define-syntax define
    (lambda (x)
      (with-syntax ((orig-define #'(@ (guile) define)))
        (syntax-case x ()
          ((_ (proc arg ...) e0 e1 ...)
           #'(orig-define proc (lambda (arg ...) e0 e1 ...)))
          ((_ v e)
           #'(orig-define v (copy-if-string e)))))))

Scheme's nice handling of hygiene should make redefining `define' within
your own modules (including (guile-user)) harmless.  If it doesn't,
that's a bug and we'd like to hear about it.

> It was wrong to change this without deprecating it first.

The only change here was to add the machinery to detect an error that
was _always_ an error.  It _never_ did what you say that it should do.

What it did before was fail to detect that you were changing the string
constant in the program text itself.  The Guile 1.8 example I gave in my
last email in this thread demonstrates that.

To make that point even clearer, I'll post the full copy of the error
message Guile 1.8 gave when my loop ran past the end of the string:

  guile> (let loop ((i 0))
           (define y "hello")
           (display y)
           (newline)
           (string-set! y i #\a)
           (loop (1+ i)))
  hello
  aello
  aallo
  aaalo
  aaaao
  aaaaa

  Backtrace:
  In standard input:
     2: 0* [loop 0]
  In unknown file:
     ?: 1  (letrec ((y "aaaaa")) (display y) ...)
     ...
     ?: 2  (letrec ((y "aaaaa")) (display y) ...)
  In standard input:
     2: 3* [string-set! "aaaaa" {5} #\a]

  standard input:2:60: In procedure string-set! in expression (string-set! y i ...):
  standard input:2:60: Value out of range 0 to 4: 5
  ABORT: (out-of-range)
  guile> 

Take a look at the backtrace, where it helpfully shows you an excerpt of
the source code (admittedly after some transformation).  See how the
source code itself has been modified?  This is what Bruce's code does.
It was _always_ a serious error in the code, even if it went undetected
in earlier versions of Guile.

 Mark
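
A usage sketch for the `define-string' shorthand proposed above, written here with plain syntax-rules rather than define-syntax-rule, as an assumption that this form is the more portable of the two. The variable `y' and its contents are illustrative:

```scheme
;; Shorthand that copies the initial value, so the variable always
;; holds a fresh, mutable string.
(define-syntax define-string
  (syntax-rules ()
    ((_ name init)
     (define name (string-copy init)))))

(define-string y "hello")
(string-set! y 0 #\x)     ; mutates the copy, not the literal
(display y)               ; prints: xello
(newline)
```

This keeps the standard meaning of `define' intact and makes the copy visible at each definition site, at the cost of having to rename the affected definitions.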



Re: Guile: What's wrong with this?

2012-01-04 Thread Andy Wingo
On Wed 04 Jan 2012 12:16, Bruce Korb  writes:

>> We could add a compiler option to turn string literals into (string-copy
>> FOO).  Perhaps that's the thing to do.
>
> No, because your clients have no control over how Guile gets built.
> We _do_ have control over startup code, however:

I meant the Scheme compiler, Bruce -- the one that is in Guile.  Not the
C compiler used to compile Guile.

Andy
-- 
http://wingolog.org/



Re: Guile: What's wrong with this?

2012-01-04 Thread Bruce Korb

On 01/04/12 08:47, Andy Wingo wrote:

> I was going to propose a workaround with an option to change
> vm-i-loader.c:43 and vm-i-loader.c:115 to use a
> scm_i_mutable_string_literals_p instead of 1, but that really seems like
> the path to perdition: previously compiled modules would start creating
> mutable strings where they really shouldn't.


Instead, long-standing, previously written code was invalidated with 1.9,
even if we were not smacked down until 2.0.1.

Just because an obscure-to-those-not-living-and-breathing-Scheme-daily
document said it was okay doesn't make it okay to those whacked by it.
I would think recompiling should not be a great burden, *ESPECIALLY*
given that it is a recent invention and therefore likely to have some
initial issues that need dealing with.  Like this, for example.



Re: Guile: What's wrong with this?

2012-01-04 Thread Andy Wingo
On Wed 04 Jan 2012 12:14, David Kastrup  writes:

> Andy Wingo  writes:
>
>> We could add a compiler option to turn string literals into
>> (string-copy FOO).  Perhaps that's the thing to do.
>
> What for?  It would mean that a literal would not be eq? to itself, a
> nightmare for memoization purposes.

  (eq? "hello" "hello")

This expression may be true or false.  It will be true in some
circumstances and false in others, in all versions of Guile.

> And for what?  For making code with explicitly undefined behavior
> exhibit a particular behavior that is undesirable in general.

The Scheme reports and the Guile manual are both positive and negative
specification: they require the implementation to do certain things, and
they allow it to do certain others.  Eq? on literals is one of the
liberties afforded to the implementation, and with good reason.  Correct
programs don't assume anything about the identities (in the sense of
eq?) of literals.

Andy
-- 
http://wingolog.org/



Re: Guile: What's wrong with this?

2012-01-04 Thread David Kastrup
Bruce Korb  writes:

> On 01/04/12 04:19, Ian Price wrote:
>> ...  As for mutable strings, I consider them
>> a mistake to begin with,...
>
> Let's step back and consider the whole point of Guile in the first place.
>
> My understanding is that one primary purpose is to be a facilitation
> language so that application developers have less to worry about and
> futz over.  An extension language, if you like that phrase.  As such,
> it would seem to me that a primary design goal would be to make the
> pathway as smooth as possible, rather than trying to emulate C and/or
> official Scheme language specs as closely as possible.  To me, my primary
> concern is doing my little thing with the least total hassle.  Having
> to study up on and thoroughly understand the Scheme language seems
> a lot harder than just using Perl (or what-have-you).  Most scripting
> languages don't cut you off at the knees (change interfaces).
>
> So my main question is:
>
>   Which is the higher priority, language purity or ease of use?

Encouraging language abuse like making _literals_ not eq? to themselves
makes a language unpredictable.  That is not a road to ease of use.  It
is a dead end.

> and fix the 1.9 bug (scribbling on shared strings) by making them
> copy-on-write thingys.

So you want to give eq? unpredictable semantics as well.  What else has
made your black list of things to sacrifice in order to keep undefined
code working in a particular undefined way?

-- 
David Kastrup




Re: Guile: What's wrong with this?

2012-01-04 Thread David Kastrup
Bruce Korb  writes:

> On 01/04/12 08:47, Andy Wingo wrote:
>> I was going to propose a workaround with an option to change
>> vm-i-loader.c:43 and vm-i-loader.c:115 to use a
>> scm_i_mutable_string_literals_p instead of 1, but that really seems like
>> the path to perdition: previously compiled modules would start creating
>> mutable strings where they really shouldn't.
>
> Instead, long-standing, previously written code was invalidated with
> 1.9, even if we were not smacked down until 2.0.1.

Yes, that is an inherent problem of writing code with undefined
behavior.  The only way to keep it working in the exact same manner is
to use the exact same interpreter.  And in the age of allocation
randomization and multi-threading, not even that is reliable.

> Just because an obscure-to-those-not-living-and-breathing-Scheme-daily
> document said it was okay doesn't make it okay to those whacked by it.

There was _never_ _any_ document that stated writing to literals was ok.
You did so entirely on your own initiative and just were lucky that it
happened to work under certain circumstances for a while.  If people
like to whack themselves, there is little one can do to keep them from
doing so.  They'll always find a way.

-- 
David Kastrup




Re: Guile: What's wrong with this?

2012-01-04 Thread David Kastrup
Andy Wingo  writes:

> On Wed 04 Jan 2012 12:14, David Kastrup  writes:
>
>> Andy Wingo  writes:
>>
>>> We could add a compiler option to turn string literals into
>>> (string-copy FOO).  Perhaps that's the thing to do.
>>
>> What for?  It would mean that a literal would not be eq? to itself, a
>> nightmare for memoization purposes.
>
>   (eq? "hello" "hello")
>
> This expression may be true or false.  It will be true in some
> circumstances and false in others, in all versions of Guile.

To itself.  Not to a literal written in the same manner.

(define (zap) "hello")
(eq? (zap) (zap))

This expression may not choose to be true or false.
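
A minimal sketch of why the (string-copy FOO) rewrite collides with this,
assuming nothing beyond standard string-copy (the name `fresh' is made up
for illustration).  It only demonstrates the part that is certain: a copy
is never eq? to the literal it was copied from, so a literal that is
silently copied on every evaluation cannot be eq? to itself.

```scheme
(define (zap) "hello")

;; string-copy returns newly allocated storage, so the copy is a
;; different object even though its contents are the same.
(define fresh (string-copy (zap)))

(display (eq? fresh (zap)))      ; prints #f
(newline)
(display (equal? fresh (zap)))   ; prints #t
(newline)
```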

>> And for what?  For making code with explicitly undefined behavior
>> exhibit a particular behavior that is undesirable in general.
>
> The Scheme reports and the Guile manual are both positive and negative
> specification: they require the implementation to do certain things,
> and they allow it to do certain others.  Eq? on literals is one of the
> liberties afforded to the implementation, and with good reason.
> Correct programs don't assume anything about the identities (in the
> sense of eq?) of literals.

Of _different_ literals spelled in the same way.  But one and the same
literal has to be eq? to itself.  It can't just replace itself with a
non-eq? copy on a whim.

-- 
David Kastrup




Re: Guile: What's wrong with this?

2012-01-04 Thread Andy Wingo
On Wed 04 Jan 2012 12:49, David Kastrup  writes:

> (define (zap) "hello")
> (eq? (zap) (zap))
>
> This expression may not choose to be true or false.

Indeed, good point.

Andy
-- 
http://wingolog.org/



Re: Guile: What's wrong with this?

2012-01-04 Thread Ian Price
Bruce Korb  writes:

> On 01/04/12 08:47, Andy Wingo wrote:
>> I was going to propose a workaround with an option to change
>> vm-i-loader.c:43 and vm-i-loader.c:115 to use a
>> scm_i_mutable_string_literals_p instead of 1, but that really seems like
>> the path to perdition: previously compiled modules would start creating
>> mutable strings where they really shouldn't.
>
> Instead, long-standing, previously written code was invalidated with 1.9,
long-standing, previously written _buggy_ code.

> even if we were not smacked down until 2.0.1.
>
> Just because an obscure-to-those-not-living-and-breathing-Scheme-daily
> document said it was okay doesn't make it okay to those whacked by it.
There's an old saying, "Ignorance of the law is no excuse". If I wrote C
code that doesn't conform to the C standard and depended on
implementation specific behaviour, I have no recourse if it breaks on a
different compiler. Guile explicitly claims to conform to the r5rs (and
partially to the r6rs), both of which make this behaviour undefined, and
srfi 13 explicitly makes this an error. (And FWIW I would not consider
the R5RS obscure to people who have used scheme for even a short while,
nor is it a terrific burden to read at 50 pages)

Now, if you want to argue your position, it'd be better to argue that
guile goes beyond r[56]rs in making these promises with regards to strings.

For instance, substring-fill! as found at
https://www.gnu.org/software/guile/manual/html_node/String-Modification.html
implies that string literals are mutable

— Scheme Procedure: substring-fill! str start end fill
— C Function: scm_substring_fill_x (str, start, end, fill)

Change every character in str between start and end to fill.

  (define y "abcdefg")
  (substring-fill! y 1 3 #\r)
  y
  ⇒ "arrdefg"

So too does string-upcase!
(https://www.gnu.org/software/guile/manual/html_node/Alphabetic-Case-Mapping.html),
if we assume y is the same binding in both functions

— Scheme Procedure: string-upcase! str [start [end]]
— C Function: scm_substring_upcase_x (str, start, end)
— C Function: scm_string_upcase_x (str)

Destructively upcase every character in str.

  (string-upcase! y)
  ⇒ "ARRDEFG"
  y
  ⇒ "ARRDEFG"

The same goes for string-downcase! and string-capitalize!

I think it would be fair to say that someone could surmise that literal
strings are meant to be mutable from these examples, and, if we do go
down the immutable string literal route these examples would need to be
addressed.

On the other hand, you can argue that string literal immutability is
implied by

— Scheme Procedure: string-for-each-index proc s [start [end]]
— C Function: scm_string_for_each_index (proc, s, start, end)

Call (proc i) for each index i in s, from left to right.

For example, to change characters to alternately upper and lower case,

  (define str (string-copy "studly"))
  (string-for-each-index
   (lambda (i)
     (string-set! str i
                  ((if (even? i) char-upcase char-downcase)
                   (string-ref str i))))
   str)
  str ⇒ "StUdLy"

but on a purely numerical basis, mutability 4 - 0 immutability

> I would think recompiling should not be a great burden, *ESPECIALLY*
At this stage, I think that argument is fair enough, other people's
mileage may vary.

-- 
Ian Price

"Programming is like pinball. The reward for doing it well is
the opportunity to do it again" - from "The Wizardy Compiled"



Re: Guile: What's wrong with this?

2012-01-04 Thread Mark H Weaver
Andy Wingo  writes:
> We could add a compiler option to turn string literals into (string-copy
> FOO).  Perhaps that's the thing to do.

I think this would be fine, as long as the default is _not_ to copy
string literals.  This would help Bruce a great deal with very little
effort on our part, without mucking up the semantics for anyone else.

David Kastrup  writes:
> What for?  It would mean that a literal would not be eq? to itself, a
> nightmare for memoization purposes.

I agree that it should not be the default behavior, but I don't see the
harm in allowing users to compile their own code this way.  The
memoization argument is a bit thin.  How often is it useful to memoize
against string arguments using eq? as the equality predicate?  Remember,
this would only be for code that explicitly changed this compilation
option.

 Best,
  Mark



Re: Guile: What's wrong with this?

2012-01-04 Thread Andy Wingo
On Wed 04 Jan 2012 13:31, Mark H Weaver  writes:

> Andy Wingo  writes:
>> We could add a compiler option to turn string literals into (string-copy
>> FOO).  Perhaps that's the thing to do.
>
> I think this would be fine, as long as the default is _not_ to copy
> string literals.  This would help Bruce a great deal with very little
> effort on our part, without mucking up the semantics for anyone else.

Yes, this was what I was thinking.

> David Kastrup  writes:
>> What for?  It would mean that a literal would not be eq? to itself, a
>> nightmare for memoization purposes.
>
> I agree that it should not be the default behavior, but I don't see the
> harm in allowing users to compile their own code this way.

Well, we can fix this too: we can make

  "foo"

transform to

  (copy-once UNIQUE-GENSYM str)

with

(define (copy-once key str)
  (or (hashq-ref mutable-string-literals key)
      (let ((value (string-copy str)))
        (hashq-set! mutable-string-literals key value)
        value)))
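
To see what that would buy, here is a self-contained sketch of the idea.
It is an illustration only, not what the compiler would emit: creating
`mutable-string-literals' with make-hash-table and using a quoted symbol
as a stand-in for UNIQUE-GENSYM are both assumptions.

```scheme
;; Assumed: the table is an ordinary hash table keyed with eq?.
(define mutable-string-literals (make-hash-table))

(define (copy-once key str)
  (or (hashq-ref mutable-string-literals key)
      (let ((value (string-copy str)))
        (hashq-set! mutable-string-literals key value)
        value)))

;; 'literal-1 stands in for the per-literal UNIQUE-GENSYM.
(define a (copy-once 'literal-1 "foo"))
(define b (copy-once 'literal-1 "foo"))

(display (eq? a b))     ; prints #t: both evaluations yield one object
(newline)
(string-set! a 0 #\g)   ; allowed, because the stored value is a copy
(display b)             ; prints goo: the mutation is visible through b
(newline)
```

So the literal stays eq? to itself and is writable, at the price of the
old "scribbling on the program text" behavior coming back.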

Andy
-- 
http://wingolog.org/



Re: Guile: What's wrong with this?

2012-01-04 Thread Mark H Weaver
Ian Price  writes:

> — Scheme Procedure: substring-fill! str start end fill
> — C Function: scm_substring_fill_x (str, start, end, fill)
>
> Change every character in str between start and end to fill.
>
>   (define y "abcdefg")
>   (substring-fill! y 1 3 #\r)
>   y
>   ⇒ "arrdefg"
>
> So too does string-upcase!

Ugh, thanks for pointing this out!  Fixed.  Any others?

Mark



Re: Guile: What's wrong with this?

2012-01-04 Thread Mark H Weaver
Andy Wingo  writes:

>> David Kastrup  writes:
>>> What for?  It would mean that a literal would not be eq? to itself, a
>>> nightmare for memoization purposes.
>>
>> I agree that it should not be the default behavior, but I don't see the
>> harm in allowing users to compile their own code this way.
>
> Well, we can fix this too: we can make
>
>   "foo"
>
> transform to
>
>   (copy-once UNIQUE-GENSYM str)
>
> with
>
> (define (copy-once key str)
>   (or (hashq-ref mutable-string-literals key)
>       (let ((value (string-copy str)))
>         (hashq-set! mutable-string-literals key value)
>         value)))

Although this is a closer emulation of the previous (broken) behavior,
IMHO this would be less desirable than simply doing (string-copy "foo")
on every evaluation of "foo", which seems to be what Bruce (and probably
others) expected "foo" to do.

For example, based on the mental model that Bruce apparently had when he
wrote his code, he might have written something like this:

  (define (hello-world-with-one-char-changed i c)
    (define str "Hello world")
    (string-set! str i c)
    str)

Your UNIQUE-GENSYM hack emulates the previous behavior that makes the
above procedure buggy.  Simply changing "Hello world" to
(string-copy "Hello world") would make the procedure work, and I believe
conforms better to what Bruce expects.
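
Written out by hand, the copying version would look something like this
sketch (the two calls at the end are only illustrative):

```scheme
(define (hello-world-with-one-char-changed i c)
  ;; A fresh copy is made on every call, so the literal itself is
  ;; never modified and each call starts from "Hello world".
  (define str (string-copy "Hello world"))
  (string-set! str i c)
  str)

(display (hello-world-with-one-char-changed 0 #\J))  ; prints Jello world
(newline)
(display (hello-world-with-one-char-changed 4 #\!))  ; prints Hell! world
(newline)
```

The second call does not see the #\J from the first one, which is the
behavior the procedure's name promises.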

Of course, I'm only talking about what I think should be done when the
compiler option is changed to non-default behavior.  I strongly believe
that the _default_ behavior should stay as it is now.

  Mark



Re: Guile: What's wrong with this?

2012-01-04 Thread Bruce Korb

On 01/04/12 10:26, Ian Price wrote:

>> Just because an obscure-to-those-not-living-and-breathing-Scheme-daily
>> document said it was okay doesn't make it okay to those whacked by it.
>
> There's an old saying, "Ignorance of the law is no excuse". If I wrote C
> code that doesn't conform to the C standard


I did.  The standard changed.  My code broke.  The fix for
read-only string literals was obvious and straightforward.
The fix for pointer aliasing is virtually impossible, except
to -fno-strict-aliasing for GCC.  Yes, new code, fine, but
the millions of lines of old code I deal with? No way.

I think I've seen a reasonable way to go forward:  an option
to always copy newly defined strings.  I am also a little curious:
since this fault occurred on a string brought in via my C function
named ag_scm_get() and it created the value with a call to
scm_str02scm, shouldn't that function have created a mutable
string copy?


> Now, if you want to argue your position, it'd be better to argue that
> guile goes beyond r[56]rs in making these promises with regards to strings.


My number 1 argument may not be the strongest argument.
My number 1 argument is that Guile, being an extension language,
needs to be as forgiving and easy to use as it can possibly be
because its client programmers (programmers using it) want to
know as absolutely little as possible about it.  No, I do *not*
want to read, understand and remember 50 pages of stuff so that
I can use Guile as an extension language.  The memory barrier is
much, *MUCH* lower for other scripting languages.


> For instance, substring-fill! as found at
> https://www.gnu.org/software/guile/manual/html_node/String-Modification.html
> implies that string literals are mutable
>
> — Scheme Procedure: substring-fill! str start end fill
> — C Function: scm_substring_fill_x (str, start, end, fill)
>
> Change every character in str between start and end to fill.
>
>   (define y "abcdefg")
>   (substring-fill! y 1 3 #\r)
>   y
>   ⇒ "arrdefg"


Who knows where I learned the idiom.  I learned the minimal amount of Guile
needed for my purposes a dozen years ago.  My actual problem stems from this:


Backtrace:
In ice-9/boot-9.scm:
 170: 3 [catch #t # ...]
In unknown file:
   ?: 2 [catch-closure]
In ice-9/eval.scm:
 420: 1 [eval # ()]
In unknown file:
   ?: 0 [string-upcase ""]

ERROR: In procedure string-upcase:
ERROR: string is read-only: ""
Scheme evaluation error.  AutoGen ABEND-ing in template
confmacs.tlib on line 209
Failing Guile command:  = = = = =

(set! tmp-text (get "act-text"))
   (set! TMP-text (string-upcase tmp-text))


What in heck is string-upcase doing trying to write to its input string?
Why was the string returned by ag_scm_get() (the function bound to "get")
an immutable string anyway?


SCM
ag_scm_get(SCM agName, SCM altVal)
{
    tDefEntry*  pE;
    ag_bool     x;

    pE = (! AG_SCM_STRING_P(agName)) ? NULL :
        findDefEntry(ag_scm2zchars(agName, "ag value"), &x);

    if ((pE == NULL) || (pE->valType != VALTYP_TEXT)) {
        if (AG_SCM_STRING_P(altVal))
            return altVal;
        return AG_SCM_STR02SCM(zNil);
    }

    return AG_SCM_STR02SCM(pE->val.pzText);
}


"AG_SCM_STR02SCM" is either scm_makfrom0str or scm_from_locale_string,
depending on the age of the Guile library.  "zNil" is a pointer to a NUL
byte that is, indeed, in read only memory, but surely scm_from_locale_string
was not written to probe the memory it is handed, detect that, and set the
read-only attribute.  Further, it cannot be implemented in a way that does
not copy the string, because I will most certainly call
scm_from_locale_string using a pointer to memory that is immediately
deallocated.  It *MUST* copy the string.  So what is this really about anyway?


> I think it would be fair to say that someone could surmise that literal
> strings are meant to be mutable from these examples, and, if we do go
> down the immutable string literal route these examples would need to be
> addressed.


:)  I think so.  Meanwhile, I think the solution is to allow
Guile clients to say, with some initialization code of some sort,
"copy my input strings" so the immutability flag is not set.
(I do think it correct to not scribble on shared strings.)

Thank you for your help!  Regards, Bruce



Re: Guile: What's wrong with this?

2012-01-04 Thread Andy Wingo
On Wed 04 Jan 2012 14:29, Mark H Weaver  writes:

> Although this is a closer emulation of the previous (broken) behavior,
> IMHO this would be less desirable than simply doing (string-copy "foo")
> on every evaluation of "foo", which seems to be what Bruce (and probably
> others) expected "foo" to do.

Thing is, why are we doing this?  We know what the correct behavior is,
as you say:

> Of course, I'm only talking about what I think should be done when the
> compiler option is changed to non-default behavior.  I strongly believe
> that the _default_ behavior should stay as it is now.

The correct behavior is the status quo.  We are considering adding a
hack to produce different behavior for compatibility purposes.  We don't
have to worry about correctness in that case, only compatibility.  IMO
anyway :)

Andy
-- 
http://wingolog.org/



Re: Guile: What's wrong with this?

2012-01-04 Thread Bruce Korb

On 01/04/12 11:43, Andy Wingo wrote:

> The correct behavior is the status quo.  We are considering adding a
> hack to produce different behavior for compatibility purposes.  We don't
> have to worry about correctness in that case, only compatibility.  IMO
> anyway :)


It would be a nice added benefit if it worked as one would expect.
viz., you make actual, writable copies of strings you pull in so that
if the string-upcase function were to modify its input, then it
would not affect other SCMs with values that happen to be the same
sequence of bytes.



Re: Guile: What's wrong with this?

2012-01-04 Thread David Kastrup
Bruce Korb  writes:

> On 01/04/12 11:43, Andy Wingo wrote:
>> The correct behavior is the status quo.  We are considering adding a
>> hack to produce different behavior for compatibility purposes.  We don't
>> have to worry about correctness in that case, only compatibility.  IMO
>> anyway :)
>
> It would be a nice added benefit if it worked as one would expect.
> viz., you make actual, writable copies of strings you pull in so that
> if the string-upcase function were to modify its input, then it
> would not affect other SCMs with values that happen to be the same
> sequence of bytes.

If string-upcase modifies its input (or needs a mutable string to start
with), this is a bug, in contrast to what string-upcase! may do.
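
As a quick illustration of the intended contract (a sketch; the names
`s' and `u' are made up):

```scheme
(define s (string-copy "abc"))   ; fresh, mutable string

;; string-upcase must leave its argument alone and return a new string.
(define u (string-upcase s))
(display s) (newline)            ; prints abc
(display u) (newline)            ; prints ABC

;; string-upcase! is the destructive variant and modifies s in place.
(string-upcase! s)
(display s) (newline)            ; prints ABC
```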

-- 
David Kastrup




Re: Guile: What's wrong with this?

2012-01-04 Thread David Kastrup
Bruce Korb  writes:

> Who knows where I learned the idiom.  I learned the minimal amount of
> Guile needed for my purposes a dozen years ago.  My actual problem
> stems from this:
>
>> Backtrace:
>> In ice-9/boot-9.scm:
>>  170: 3 [catch #t # ...]
>> In unknown file:
>>    ?: 2 [catch-closure]
>> In ice-9/eval.scm:
>>  420: 1 [eval # ()]
>> In unknown file:
>>    ?: 0 [string-upcase ""]
>>
>> ERROR: In procedure string-upcase:
>> ERROR: string is read-only: ""
>> Scheme evaluation error.  AutoGen ABEND-ing in template
>>  confmacs.tlib on line 209
>> Failing Guile command:  = = = = =
>>
>> (set! tmp-text (get "act-text"))
>>    (set! TMP-text (string-upcase tmp-text))
>
> What in heck is string-upcase doing trying to write to its input
> string?

This looks like it might be just a bug.  Could be that string-upcase
creates its own copy of the string incorrectly including the immutable
bit and then tries changing the string.

No reason to play helter-skelter with the language.  Instead the bug
should be fixed.

-- 
David Kastrup




Re: Guile: What's wrong with this?

2012-01-04 Thread Andy Wingo
On Wed 04 Jan 2012 15:08, Bruce Korb  writes:

> On 01/04/12 11:43, Andy Wingo wrote:
>> The correct behavior is the status quo.  We are considering adding a
>> hack to produce different behavior for compatibility purposes.  We don't
>> have to worry about correctness in that case, only compatibility.  IMO
>> anyway :)
>
> It would be a nice added benefit if it worked as one would expect.

I think that in this case, your expectations are just incorrect.  I
don't mean this rudely.  I think you will be happier and more productive
if you change your expectations in this regard to better match "reality"
(the state of things, common practice, conventional Scheme wisdom, etc).

Andy
-- 
http://wingolog.org/



Re: Guile: What's wrong with this?

2012-01-04 Thread Ludovic Courtès
Hi!

Mike Gran  skribis:

>>   In many systems it is desirable for constants (i.e. the values of literal
>>   expressions) to reside in read-only-memory.  To express this, it is
>>   convenient to imagine that every object that denotes locations is
>>   associated with a flag telling whether that object is mutable or immutable.
>>   In such systems literal constants and the strings returned by
>>   `symbol->string' are immutable objects, while all objects created by
>>   the other procedures listed in this report are mutable.  It is an error
>>   to attempt to store a new value into a location that is denoted by an
>>   immutable object.

[...]

> The idea that the correct way to initialize a string is
> (define x (string-copy "string")) is awkward.  "string" is read-only,
> but copying it makes it modifiable?  Copying implies mutability?

Sort-of:

  -- library procedure: string-copy string
  Returns a newly allocated copy of the given STRING.

And a “newly allocated copy” is mutable.

> Copying doesn't imply modifying mutability in any other data type.

It’s not about modifying mutability of an object (this can’t be done),
but about fresh vs. constant storage.

> Why not change the behavior of 'define' to be (define y (substring str 0))
> when STR is a read-only string?  This would preserve the shared memory if
> the variable is never modified but still make the string copy-on-write.

I think all sorts of literal strings would have to be treated the same.

FTR, all these evaluate to #t:

  (apply eq? "hello" '("hello"))
  (apply eq? '(1 2 3) '((1 2 3)))
  (apply eq? '#(1 2 3) '(#(1 2 3)))

This is fine per R5RS (info "(r5rs) Equivalence predicates"), but
different from Guile <= 1.8.

(I use ‘apply’ here to fool peval, which otherwise evaluates the
expressions to #f at compile-time.  Andy: should peval be hacked to give
the same answer?)

Thanks,
Ludo’.



Re: Guile: What's wrong with this?

2012-01-04 Thread Ludovic Courtès
Hi Bruce,

Bruce Korb  skribis:

> On 01/03/12 15:33, Ludovic Courtès wrote:
>> Could you point me to the affected code?  What would you think of using
>> string-copy as I suggested?  The disadvantage is that you need to modify
>> your code, but hopefully that can be automated with a sed script or so;
>> the advantage is that it would work with all versions of Guile.
>
> The disadvantage is that I know I have "clients" that have rolled their
> own templates, presumably by copy-and-edit processes that will invariably
> include (define var "string") syntax.

If the user's files are evaluated rather than compiled/loaded, this is
not a problem:

  scheme@(guile-user)> (eval (call-with-input-string "(define foo \"sdf\")" read)
                             (interaction-environment))
  $9 = #
  scheme@(guile-user)> (string-set! (variable-ref $9) 1 #\x)
  scheme@(guile-user)> (variable-ref $9)
  $10 = "sxf"

Could you check whether this is the case?

In case it’s not, I have another possible solution in mind.  ;-)

> I'm sorry about being irritable.  This is the third problem with 2.x.

Yeah, I understand it can be really annoying and frustrating.  Believe
me, despite the breadth and depth of changes between 1.8 and 2.0, we did
our best to avoid such nuisances.  Hopefully we can help solve them with
you, so you can really benefit from 2.0 (it’s a significantly nicer
piece of software!)

Thanks,
Ludo’.



Re: Guile: What's wrong with this?

2012-01-04 Thread Bruce Korb

On 01/04/12 12:56, Andy Wingo wrote:

> On Wed 04 Jan 2012 15:08, Bruce Korb  writes:
>
>> On 01/04/12 11:43, Andy Wingo wrote:
>>> The correct behavior is the status quo.  We are considering adding a
>>> hack to produce different behavior for compatibility purposes.  We don't
>>> have to worry about correctness in that case, only compatibility.  IMO
>>> anyway :)
>>
>> It would be a nice added benefit if it worked as one would expect.
>
> I think that in this case, your expectations are just incorrect.  I
> don't mean this rudely.  I think you will be happier and more productive
> if you change your expectations in this regard to better match "reality"
> (the state of things, common practice, conventional Scheme wisdom, etc).


Going forward, yes, sure, like the pointer aliasing thing.
It was just never an issue with the original C model and it
became such later.  In this case, expectations were built
upon perl and shell scripting models, and it seemed to work
that way.  In any case, the specific problem that actually
triggered this whole thread was that scm_from_locale_string
seems to be returning a reference to an immutable string
(unexpected) *AND* the string-upcase function is objecting
to it (also unexpected).  Otherwise, I'd have gone on oblivious
to any sort of issue.  :)

Cheers - Bruce



Re: Guile: What's wrong with this?

2012-01-04 Thread Ian Price
Bruce Korb  writes:

> On 01/04/12 04:19, Ian Price wrote:
>> ...  As for mutable strings, I consider them
>> a mistake to begin with,...
>
> Let's step back and consider the whole point of Guile in the first place.
This was not intended as an answer to this question, nor to be
representative of the guile developers / users / what-have-you, but a
personal opinion.

> So my main question is:
>
>   Which is the higher priority, language purity or ease of use?
That is a loaded question, as it presupposes ease of use is always the
same thing as impurity e.g. A zipper is just as usable IMO as a gap
buffer, and doesn't require mutability.

My opinion of mutable strings is that they have little practical use to
me in my day to day programming, frankly I can count the number of times
I've done it in any high level language (so not C etc) over the past 4
or so years on one hand, and I consider most of those uses mistaken in
hindsight. It isn't just functional programming types who care about
this, Python is a great example of a language which has not been
hindered by immutable strings.

The most common string operations in practice (for me) are
concatenation, substrings, comparison/searching, and iteration, and I
would think a better foundation for strings could be found by starting
there rather than with the premise that strings are basically a specific
type of vector.

And again, just to be clear, I'm not making a proposal, just stating an
opinion.

-- 
Ian Price

"Programming is like pinball. The reward for doing it well is
the opportunity to do it again" - from "The Wizardy Compiled"



Re: Guile: What's wrong with this?

2012-01-04 Thread Bruce Korb

On 01/04/12 13:52, Ian Price wrote:

>> So my main question is:
>>
>>    Which is the higher priority, language purity or ease of use?
>
> That is a loaded question, as it presupposes ease of use is always the
> same thing as impurity e.g. ...


Absolutely not.  Making decisions is always about trade-offs,
otherwise it is not really a decision.  Should you give preference
to language aesthetics, or preference to ease of use *when*
there is a divergence?  More often than not, language purity
(consistency) *improves* ease of use.  Here we are looking at
something that does not appear to me to improve ease of use.
You have to go to some extra trouble to be certain that a string
value that you have assigned to an SCM is not read only.
That is not convenience.  If Guile were to implement copy on write,
then the user would not have to care whether a string were
shared read only or not.  It would be easier to use.  The only code
that would care at all would be the Guile internals.  (Where it
belongs -- my completely unhumble opinion :)



Re: Guile: What's wrong with this?

2012-01-04 Thread Bruce Korb

Hi Ludo,

On 01/04/12 13:17, Ludovic Courtès wrote:

> If the users files are evaluated rather than compiled/loaded, this is
> not a problem:


I do *all* guile processing via the ag_scm_c_eval_string_from_file_line
function.  I suck up a string from my input file, determine that it
needs guile processing and invoke that function.  It has this profile:


SCM
ag_scm_c_eval_string_from_file_line(
    char const * pzExpr, char const * pzFile, int line);

#define SCM_EVAL_CONST(_s) \
    do { static file_line_t const fl = { __LINE__ - 1, __FILE__, _s }; \
        pzLastScheme = fl.text; \
        ag_scm_c_eval_string_from_file_line(fl.text, fl.file, fl.line); \
    } while (0)


and I *can* redefine define because I start Guile with my own
initialization:


#define SCHEME_INIT_FILE "directive.h"
static const int  schemeLine = __LINE__+2;
static char const zSchemeInit[3846] = // this is generated code...
"(use-modules (ice-9 common-list))\n\
..";

    pzLastScheme = zSchemeInit;
    ag_scm_c_eval_string_from_file_line(
        zSchemeInit, SCHEME_INIT_FILE, schemeLine);

    SCM_EVAL_CONST("(add-hook! before-error-hook error-source-line)\n"
                   "(use-modules (ice-9 stack-catch))");



> Could you check whether this is the case?


So it is the case.  My processing consists of slicing up the input
into a bunch of slivers based on markers.  I look at each sliver to
see how to process it.  Some are emitted directly, others trigger
internal mechanisms, a few are handed off to a separate server shell
process and finally, if the text starts with an open parenthesis
or a semi-colon (Guile comment marker), then Guile gets it via that call.

Thanks -Bruce



Re: Guile: What's wrong with this?

2012-01-04 Thread Ludovic Courtès
Hi!

Mark H Weaver  skribis:

> For example, look at what Guile 1.8 does:
>
>   guile> (let loop ((i 0))
>            (define y "hello")
>            (display y)
>            (newline)
>            (string-set! y i #\a)
>            (loop (1+ i)))
>   hello
>   aello
>   aallo
>   aaalo
>   aaaao
>   aaaaa
>   
>
> So you see, even in Guile 1.8, (define y "hello") didn't do what you
> thought it did.  It didn't fill y with the string "hello".  You were
> actually changing the program text itself, and that was a serious
> mistake.

Indeed, funny example!

Ludo’.




Re: Guile: What's wrong with this?

2012-01-04 Thread Ludovic Courtès
Hello,

Bruce Korb  skribis:

> So my main question is:
>
>   Which is the higher priority, language purity or ease of use?

FWIW I think “language purity” is one way to achieve “ease of use” (FSVO
“language purity” at least.)

Ludo’.




Re: Guile: What's wrong with this?

2012-01-04 Thread Mark H Weaver
Bruce Korb  writes:

>> ERROR: In procedure string-upcase:
>> ERROR: string is read-only: ""
>> Scheme evaluation error.  AutoGen ABEND-ing in template
>>  confmacs.tlib on line 209
>> Failing Guile command:  = = = = =
>>
>> (set! tmp-text (get "act-text"))
>>    (set! TMP-text (string-upcase tmp-text))
>
> What in heck is string-upcase doing trying to write to its input string?
> Why was the string returned by ag_scm_get() (the function bound to "get")
> an immutable string anyway?

Good questions indeed.  I spent a bunch of time investigating this, and
found some bugs that might have caused this problem, although I'm not
certain.

Bruce: Can you please see if the patch below fixes this problem?

Mike: Would you be willing to review this (very small) patch to see if
it makes sense to you?  I'd like a second opinion from someone familiar
with that subsystem before I commit it.

 Thanks,
   Mark


From a8da72937ff4d04e8d39531773cc05e676b2be1c Mon Sep 17 00:00:00 2001
From: Mark H Weaver 
Date: Wed, 4 Jan 2012 17:59:27 -0500
Subject: [PATCH] Fix bugs related to mutation-sharing substrings

* libguile/strings.c (scm_i_is_narrow_string, scm_i_try_narrow_string,
  scm_i_string_set_x): Check to see if the provided string is a
  mutation-sharing substring, and do the right thing in that case.
  Previously, if such a string was passed to these functions, they would
  behave very badly: while trying to fetch and/or mutate the cell
  containing the stringbuf, they were actually fetching or mutating the
  cell containing the original shared string.  That's because
  mutation-sharing substrings store the original string in CELL_1,
  whereas all other strings store the stringbuf there.
---
 libguile/strings.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/libguile/strings.c b/libguile/strings.c
index 666a951..1628aee 100644
--- a/libguile/strings.c
+++ b/libguile/strings.c
@@ -436,6 +436,9 @@ scm_i_string_length (SCM str)
 int
 scm_i_is_narrow_string (SCM str)
 {
+  if (IS_SH_STRING (str))
+    str = SH_STRING_STRING (str);
+
   return !STRINGBUF_WIDE (STRING_STRINGBUF (str));
 }
 
@@ -446,6 +449,9 @@ scm_i_is_narrow_string (SCM str)
 int
 scm_i_try_narrow_string (SCM str)
 {
+  if (IS_SH_STRING (str))
+    str = SH_STRING_STRING (str);
+
   SET_STRING_STRINGBUF (str, narrow_stringbuf (STRING_STRINGBUF (str)));
 
   return scm_i_is_narrow_string (str);
@@ -664,6 +670,12 @@ scm_i_string_strcmp (SCM sstr, size_t start_x, const char *cstr)
 void
 scm_i_string_set_x (SCM str, size_t p, scm_t_wchar chr)
 {
+  if (IS_SH_STRING (str))
+    {
+      p += STRING_START (str);
+      str = SH_STRING_STRING (str);
+    }
+
   if (chr > 0xFF && scm_i_is_narrow_string (str))
     SET_STRING_STRINGBUF (str, wide_stringbuf (STRING_STRINGBUF (str)));
 
-- 
1.7.5.4
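
For readers who do not know the string internals, the class of bug the
patch addresses can be modeled in a few lines of plain C.  This is only
a sketch: `toy_string', `toy_set' and the field names are hypothetical
stand-ins, and the layout is not libguile's (the real code reuses one
cell for either the stringbuf or the parent string).  The point is that
a shared substring must be resolved to its parent, with the start offset
added, before the buffer is touched.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: names and layout are hypothetical, not libguile's. */
typedef struct toy_string {
    int is_shared;              /* 1 if this is a mutation-sharing substring */
    struct toy_string *parent;  /* valid only when is_shared is 1 */
    char *buf;                  /* valid only when is_shared is 0 */
    size_t start;               /* offset into the parent's buffer */
    size_t len;
} toy_string;

/* Store c at index p, resolving a shared substring to its parent first,
   in the same spirit as the checks the patch adds. */
static void toy_set(toy_string *s, size_t p, char c)
{
    if (s->is_shared) {
        p += s->start;
        s = s->parent;
    }
    s->buf[p] = c;
}

/* Demo: a write through a shared substring must land in the parent. */
static char toy_demo(void)
{
    char storage[] = "hello";
    toy_string whole = { 0, NULL, storage, 0, 5 };
    toy_string tail = { 1, &whole, NULL, 3, 2 };   /* views "lo" */

    toy_set(&tail, 0, 'L');   /* index 0 of the view is index 3 of the parent */
    return storage[3];
}
```

Without the `is_shared' check, `toy_set' would dereference `buf' on an
object whose meaningful field is `parent', which is the toy equivalent
of the misbehaviour described in the commit message.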



Re: Guile: What's wrong with this?

2012-01-04 Thread Mike Gran
> From: Bruce Korb 
>>>     Which is the higher priority, language purity or ease of use?

>>  That is a loaded question, as it presupposes ease of use is always the
>>  same thing as impurity e.g. ...

> Absolutely not.  Making decisions is always about trade-offs,
> otherwise it is not really a decision.  Should you give preference
> to language aesthetics, or preference to ease of use *when*
> there is a divergence?  More often than not, language purity
> (consistency) *improves* ease of use.  Here we are looking at
> something that does not appear to me to improve ease of use.
> You have to go to some extra trouble to be certain that a string
> value that you have assigned to an SCM is not read only.
> That is not convenience.  If Guile were to implement copy on write,
> then the user would not have to care whether a string were
> shared read only or not.  It would be easier to use.  The only code
> that would care at all would be the Guile internals.  (Where it
> belongs -- my completely unhumble opinion :)

Well, I've read all the posts in this thread, and I was pretty aware
of the arguments about read-only strings before this.  So since I
have little left to contribute, I'll sign off with one final
statement about it...
 
I agree completely with Bruce's statement above.
 
The mutability of strings in Guile 1.8 was a feature, not a weakness.
Even though it wasn't properly implemented, as Mark pointed out, it
did what I meant every time I used it.
 
I believe that mutability should be the default in all data types.
Creating an immutable compound data type -- be it a string, pair,
vector or whatever -- should never be the default, and should always
be the case that requires extra syntax.
 
R{5,6,7}RS disagrees with me on that, of course.  I think R{5,6,7}RS
is wrong. 
 
I understand the efficiency argument for immutable strings (and pairs).
I don't care, because Guile has never been slow for anything I've asked
it to do.
 
That, I guess, is my completely unhumble opinion. :)
 
Regards,
 
Mike



Re: Guile: What's wrong with this?

2012-01-04 Thread Bruce Korb

On 01/04/12 15:19, Mark H Weaver wrote:
> Bruce Korb  writes:
>
>>> ERROR: In procedure string-upcase:
>>> ERROR: string is read-only: ""
>>> Scheme evaluation error.  AutoGen ABEND-ing in template
>>>  confmacs.tlib on line 209
>>> Failing Guile command:  = = = = =
>>>
>>> (set! tmp-text (get "act-text"))
>>>    (set! TMP-text (string-upcase tmp-text))
>>
>> What in heck is string-upcase doing trying to write to its input string?
>> Why was the string returned by ag_scm_get() (the function bound to "get")
>> an immutable string anyway?
>
> Good questions indeed.  I spent a bunch of time investigating this, and
> found some bugs that might have caused this problem, although I'm not
> certain.
>
> Bruce: Can you please see if the patch below fixes this problem?


OK.  I'll have to play this weekend.  I have to download and install
Guile sources and, unfortunately, this thread notwithstanding, I do
have a day job.

Thank you so much!!  Regards, Bruce



Re: Guile: What's wrong with this?

2012-01-04 Thread Mark H Weaver
Bruce Korb  writes:
> You have to go to some extra trouble to be certain that a string
> value that you have assigned to an SCM is not read only.

If you're going to mutate a string, you'd better be safe and make a copy
before mutating it, unless you know very clearly where it came from.
Otherwise, you might be mutating a string that some other data structure
still references, and it might not take kindly to having its string
mutated behind its back.

The fact that some string (whose origin you don't know about) might be
read-only is the least of your problems.  At least that problem will now
be flagged immediately, which is far better than the subtle and
hard-to-debug damage that might be caused by mutating a string that other
data structures may reference.

All mutable values in Scheme are pointers.  In the case of strings, that
means that they're like "char *", not "char []".  A great deal of code
freely makes copies of these pointers instead of copying the underlying
string itself.  That's a very old tradition, because it is rare to
mutate strings in Scheme.
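
The "char *", not "char []" point can be tried out directly in C.  The
following is a minimal sketch with hypothetical helper names
(`upcase_in_place', `upcase_copy'); it only illustrates that copying a
pointer shares the characters, and that copying the buffer first is
what protects the original.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Upcase a buffer in place, like a "!" procedure in Scheme. */
static void upcase_in_place(char *s)
{
    for (; *s; s++)
        *s = (char) toupper((unsigned char) *s);
}

/* Return a fresh upcased copy and leave the input alone; caller frees. */
static char *upcase_copy(const char *s)
{
    char *copy = malloc(strlen(s) + 1);
    if (copy == NULL)
        return NULL;
    strcpy(copy, s);
    upcase_in_place(copy);
    return copy;
}

/* Returns 1 if a second pointer to the same buffer sees the mutation. */
static int alias_sees_mutation(void)
{
    char buffer[] = "hello";
    char *a = buffer;
    char *b = a;               /* copies the pointer, not the characters */

    upcase_in_place(a);
    return strcmp(b, "HELLO") == 0;
}

/* Returns 1 if copying first leaves the original untouched. */
static int copy_protects_original(void)
{
    char buffer[] = "hello";
    char *up = upcase_copy(buffer);
    int ok = up != NULL && strcmp(up, "HELLO") == 0
             && strcmp(buffer, "hello") == 0;

    free(up);
    return ok;
}
```

`upcase_copy' plays the role of `string-upcase' and `upcase_in_place'
the role of `string-upcase!'.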

> If Guile were to implement copy on write, then the user would not have
> to care whether a string were shared read only or not.  It would be
> easier to use.

Guile already implements copy-on-write strings, but only in the sense of
postponing the copy done by `string-copy', `substring', etc.

Implementing copy-on-write transparently without the user explicitly
making a copy (that is postponed) is _impossible_.  The problem is that
although we could make a new copy of the string, we have no way to know
which pointers to the old object should be changed to point to the new
one.  We cannot read the user's mind.
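
To illustrate what "postponing the copy" can mean, here is a small C
model.  It is a sketch under stated assumptions: `lazy_string',
`lazy_copy' and `lazy_set' are hypothetical names and this is not how
Guile lays out its strings.  It works only because the explicit copy
created a second handle, so it is unambiguous which handle receives the
private buffer on the first write.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only; names are hypothetical and this is not Guile's layout. */
typedef struct {
    char *buf;      /* character storage, possibly shared with another handle */
    int shared;     /* 1 while buf may also be referenced by another handle */
} lazy_string;

/* "Copy" without copying: the new handle borrows the same buffer. */
static lazy_string lazy_copy(lazy_string *orig)
{
    lazy_string copy = { orig->buf, 1 };

    orig->shared = 1;
    return copy;
}

/* The first write through a sharing handle gives it a private buffer. */
static int lazy_set(lazy_string *s, size_t i, char c)
{
    if (s->shared) {
        char *own = malloc(strlen(s->buf) + 1);
        if (own == NULL)
            return -1;
        strcpy(own, s->buf);
        s->buf = own;
        s->shared = 0;
    }
    s->buf[i] = c;
    return 0;
}

/* Returns 1 if writing to the copy leaves the original unchanged. */
static int lazy_demo(void)
{
    char storage[] = "hello";
    lazy_string a = { storage, 0 };
    lazy_string b = lazy_copy(&a);     /* no characters copied yet */
    int ok;

    if (lazy_set(&b, 0, 'j') != 0)     /* b gets its own buffer here */
        return 0;
    ok = strcmp(a.buf, "hello") == 0 && strcmp(b.buf, "jello") == 0;
    free(b.buf);
    return ok;
}
```

With a plain second reference to `a' instead of a handle made by
`lazy_copy', there would be nothing to attach the new buffer to.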

Mark



Re: Guile: What's wrong with this?

2012-01-04 Thread Ludovic Courtès
Hi,

Bruce Korb  skribis:

> On 01/04/12 13:17, Ludovic Courtès wrote:
>> If the users files are evaluated rather than compiled/loaded, this is
>> not a problem:
>
> I do *all* guile processing via the ag_scm_c_eval_string_from_file_line
> function.

[...]

>> Could you check whether this is the case?
>
> So it is the case.

So this is good news: it means you only have to modify your own code
without worrying about your users’ code (modulo the fact that modifying
literals is still a bad idea, as others pointed out.)

BTW, were you able to find a stripped-down example that reproduces the
‘string-upcase’ problem?

Thanks,
Ludo’.



Re: Guile: What's wrong with this?

2012-01-04 Thread Mark H Weaver
I wrote:
>   (define-syntax define
>     (lambda (x)
>       (with-syntax ((orig-define #'(@ (guile) define)))
>         (syntax-case x ()
>           ((_ (proc arg ...) e0 e1 ...)
>            #'(orig-define proc (lambda (arg ...) e0 e1 ...)))
>           ((_ v e)
>            (identifier? #'v)
>            (if (string? (syntax->datum #'e))
>                #'(orig-define v (string-copy e))
>                #'(orig-define v e)))))))

In case you're planning to use this, I just realized that this syntax
definition has a flaw: it won't handle cases like this:

  (define (map f . xs) ...)

To fix this flaw, change the two lines after syntax-case to:

>           ((_ (proc . args) e0 e1 ...)
>            #'(orig-define proc (lambda args e0 e1 ...)))

The other macro I provided has the same flaw, and the same fix applies.

  Mark



Re: Guile: What's wrong with this?

2012-01-04 Thread David Kastrup
Bruce Korb  writes:

> On 01/04/12 13:52, Ian Price wrote:
>>> So my main question is:
>>>
>>>Which is the higher priority, language purity or ease of use?
>> That is a loaded question, as it presupposes ease of use is always the
>> same thing as impurity e.g. ...
>
> Absolutely not.  Making decisions is always about trade-offs,
> otherwise it is not really a decision.

That does not apparently preclude the option of marketing it as one.

-- 
David Kastrup




Re: Guile: What's wrong with this?

2012-01-05 Thread Bruce Korb

On 01/04/12 15:59, Mark H Weaver wrote:
> Implementing copy-on-write transparently without the user explicitly
> making a copy (that is postponed) is _impossible_.  The problem is that
> although we could make a new copy of the string, we have no way to know
> which pointers to the old object should be changed to point to the new
> one.  We cannot read the user's mind.


So because it might be the case that one reference might want to
see changes made via another reference then the whole concept is
trashed?  "all or nothing"?  Anyway, such a concept should be kept
very simple:  functions that modify their argument make copies of
any input argument that is read only.  Any other SCM's lying about
that refer to the unmodified object continue referring to that
same unmodified object.  No mind reading required.

   (define a "hello")
   (define b a)
   (string-upcase! a)
   b

yields "hello", not "HELLO".  Simple, comprehensible and, of course,
not the problem I was having.  :)

"it goes without saying (but I'll say it anyway)":

   (define a (string-copy "hello"))
   (define b a)
   (string-upcase! a)
   b

*does* yield "HELLO" and not "hello".  Why the inconsistency?

  Because it is better to do what is almost certainly expected
  rather than throw errors.

It is an ease of use over language purity thing.



Re: Guile: What's wrong with this?

2012-01-05 Thread Mark H Weaver
Bruce Korb  writes:
> So because it might be the case that one reference might want to
> see changes made via another reference then the whole concept is
> trashed?  "all or nothing"?  Anyway, such a concept should be kept
> very simple:  functions that modify their argument make copies of
> any input argument that is read only.  Any other SCM's lying about
> that refer to the unmodified object continue referring to that
> same unmodified object.  No mind reading required.
>
>    (define a "hello")
>    (define b a)
>    (string-upcase! a)
>    b

In order to do as you suggest, we'd have to change `string-upcase!' from
procedure to syntax.  That's because `string-upcase!' gets a _copy_ of
the pointer contained in `a', and is unable to change the pointer in
`a'.  This is fundamental to the semantics of Scheme.  We cannot change
it without breaking a _lot_ of code.

If we changed every string mutation procedure to syntax, then you
wouldn't be able to do things like this:

  (string-upcase! (proc arg ...))
  (map string-upcase! list-of-strings)

Also, if you wrote a procedure like this:

  (define (downcase-all-but-first! s)
    (string-downcase! s)
    (string-set! s 0 (char-upcase (string-ref s 0))))

it would work properly for mutable strings, but if you passed a
read-only string, it would do nothing at all from the caller's point of
view, because it would change the pointer in the local parameter s, but
not the caller's pointer.

These proposed semantics are bad because they don't compose well.
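
The same effect can be shown in C, where arguments are also passed by
value.  This is a sketch only: `upcase_or_copy' is a hypothetical
mutator implementing the proposed "copy the argument if it is read-only"
rule, and the `read_only' flag is passed by hand because plain C strings
carry no such mark.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical "copy if read-only" mutator, as proposed in the thread.
   The pointer s is the callee's own copy of the caller's pointer. */
static void upcase_or_copy(char *s, int read_only)
{
    char *p;

    if (read_only) {
        char *copy = malloc(strlen(s) + 1);
        if (copy == NULL)
            return;
        strcpy(copy, s);
        s = copy;              /* rebinds only the local parameter */
    }
    for (p = s; *p; p++)
        *p = (char) toupper((unsigned char) *p);
    if (read_only)
        free(s);               /* nobody else can ever reach the copy */
}

/* Returns 1 if the caller can observe the upcasing. */
static int caller_sees_change(int read_only)
{
    char text[] = "hello";

    upcase_or_copy(text, read_only);
    return strcmp(text, "HELLO") == 0;
}
```

In the read-only case the work is done on a copy that is thrown away, so
the call is a silent no-op from the caller's side.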

> "it goes without saying (but I'll say it anyway)":
>
>    (define a (string-copy "hello"))
>    (define b a)
>    (string-upcase! a)
>    b
>
> *does* yield "HELLO" and not "hello".  Why the inconsistency?

You are proceeding from the assumption that each variable contains its
own string buffer, when in fact they contain pointers, and (define b a)
copies only the pointer.  In other words, the code above is like:

  char *a = string_copy ("hello");
  char *b = a;
  string_upcase_x (a);
  return b;

What you are asking for cannot be done without changing the fundamental
semantics of Scheme at a very deep level.

 Mark



Re: Guile: What's wrong with this?

2012-01-05 Thread Mark H Weaver
Replying to myself...

>> "it goes without saying (but I'll say it anyway)":
>>
>>    (define a (string-copy "hello"))
>>    (define b a)
>>    (string-upcase! a)
>>    b
>>
>> *does* yield "HELLO" and not "hello".  Why the inconsistency?
>
> You are proceeding from the assumption that each variable contains its
> own string buffer, when in fact they contain pointers, and (define b a)
> copies only the pointer.  In other words, the code above is like:
>
>   char *a = string_copy ("hello");
>   char *b = a;
>   string_upcase_x (a);
>   return b;

Of course, in Scheme (and C) it is possible to do what you want by
changing string-upcase! (string_upcase_x) from a procedure to a macro,
but as you know, macros in C have significant disadvantages.  Scheme
macros are vastly more powerful and robust, but they also have
significant disadvantages compared with procedures.

Here's how you could do what you want with Scheme macros:

  (define-syntax-rule
    (string-upcase!! x)
    (set! x (string-upcase x)))
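
For comparison, a rough C counterpart of that macro might look like the
sketch below.  `STRING_UPCASE_XX' and `string_upcase' are hypothetical
names chosen to mirror the Scheme ones; the macro rebinds the variable
it is given, which a function receiving a copy of the pointer cannot do.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Return a freshly allocated upcased copy of s; caller frees. */
static char *string_upcase(const char *s)
{
    size_t i, n = strlen(s);
    char *out = malloc(n + 1);

    if (out == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        out[i] = (char) toupper((unsigned char) s[i]);
    out[n] = '\0';
    return out;
}

/* Hypothetical macro: rebinds the variable itself, like (set! x ...). */
#define STRING_UPCASE_XX(x) ((x) = string_upcase(x))

/* Returns 1 if a is rebound while b still sees the old string. */
static int macro_demo(void)
{
    const char *a = "hello";
    const char *b = a;
    int ok;

    STRING_UPCASE_XX(a);
    ok = a != NULL && strcmp(a, "HELLO") == 0 && strcmp(b, "hello") == 0;
    free((void *) a);
    return ok;
}
```

The disadvantages show up quickly: the macro only accepts something that
can be assigned to, and it cannot be passed around the way a function
pointer (or a Scheme procedure given to `map') can.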

  Mark



Re: Guile: What's wrong with this?

2012-01-05 Thread David Kastrup
Bruce Korb  writes:

> On 01/04/12 15:59, Mark H Weaver wrote:
>> Implementing copy-on-write transparently without the user explicitly
>> making a copy (that is postponed) is _impossible_.  The problem is that
>> although we could make a new copy of the string, we have no way to know
>> which pointers to the old object should be changed to point to the new
>> one.  We cannot read the user's mind.
>
> So because it might be the case that one reference might want to
> see changes made via another reference then the whole concept is
> trashed?

Yes.  Because different references can't be distinguished, it would mean
that you'd not actually have a reference to the modified copy after
modifying it.  Which renders the modification useless.

> "all or nothing"?  Anyway, such a concept should be kept very simple:
> functions that modify their argument make copies of any input argument
> that is read only.  Any other SCM's lying about that refer to the
> unmodified object continue referring to that same unmodified object.
> No mind reading required.

>    (define a "hello")
>    (define b a)
>    (string-upcase! a)
>    b
>
> yields "hello", not "HELLO".  Simple, comprehensible and, of course,
> not the problem I was having.  :)

It is neither simple, nor comprehensible.

> "it goes without saying (but I'll say it anyway)":
>
>    (define a (string-copy "hello"))
>    (define b a)
>    (string-upcase! a)
>    b
>
> *does* yield "HELLO" and not "hello".  Why the inconsistency?
>
>   Because it is better to do what is almost certainly expected
>   rather than throw errors.
>
> It is an ease of use over language purity thing.

You probably don't realize how ironic that is.

-- 
David Kastrup




Re: Guile: What's wrong with this?

2012-01-05 Thread Mark H Weaver
Bruce Korb  writes:
> Anyway, such a concept should be kept
> very simple:  functions that modify their argument make copies of
> any input argument that is read only.  Any other SCM's lying about
> that refer to the unmodified object continue referring to that
> same unmodified object.  No mind reading required.
>
>    (define a "hello")
>    (define b a)
>    (string-upcase! a)
>    b

I suspect that what you really want is for `define' (and maybe some
other things) to automatically do a deep copy instead of merely making a
new reference to an existing object.

For example, you seem to want (define a "hello") to make a fresh copy of
the string literal, and for (define b a) to make another copy so that
changes to the string referenced by `b' do not affect the string
referenced by `a'.

You seem to not want to think about aliasing issues.  Indeed, it would
be more intuitive if we always copied everything deeply, but that would
be strictly less powerful, not to mention far less efficient, especially
when handling large structures.

`define' merely makes a new reference to an existing object.  If you
want a copy, you must explicitly ask for one (though this could be
hidden by custom syntax).  It would not be desirable for the language to
make copies automatically as part of the core `define' syntax.  For one
thing, sometimes you don't want a copy.  Sometimes you want shared
mutable objects.

Even if you do want to copy, there are different kinds of copies.  How
deeply do you want to copy?  If it's a hierarchical list, do you want to
copy only the first level of the list, or do you want to recurse?
Suppose this hierarchical list contains strings.  Do you want to copy
the strings too, or just the list structure?  I could go on and on.
There's no good universal copier; it depends on your purposes.
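
The "how deeply?" question is easy to see with an array of strings in C.
The sketch below uses hypothetical helpers (`copy_shallow', `copy_deep')
and shows only two of the many possible depths: copying the container
alone versus copying the container and every string in it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Shallow copy: a new array of pointers that still share the strings. */
static char **copy_shallow(char **items, size_t n)
{
    char **out = malloc(n * sizeof *out);

    if (out != NULL)
        memcpy(out, items, n * sizeof *out);
    return out;
}

/* Deep copy: a new array and a fresh buffer for every string. */
static char **copy_deep(char **items, size_t n)
{
    size_t i;
    char **out = malloc(n * sizeof *out);

    if (out == NULL)
        return NULL;
    for (i = 0; i < n; i++) {
        out[i] = malloc(strlen(items[i]) + 1);
        if (out[i] == NULL) {
            while (i > 0)
                free(out[--i]);
            free(out);
            return NULL;
        }
        strcpy(out[i], items[i]);
    }
    return out;
}

/* Returns 1 if mutating through the copy changes the original's string,
   0 if it does not, and -1 on allocation failure. */
static int copy_shares_strings(int deep)
{
    char first[] = "hello", second[] = "world";
    char *items[2];
    char **copy;
    int shared;

    items[0] = first;
    items[1] = second;
    copy = deep ? copy_deep(items, 2) : copy_shallow(items, 2);
    if (copy == NULL)
        return -1;
    copy[0][0] = 'j';
    shared = first[0] == 'j';
    if (deep) {
        free(copy[0]);
        free(copy[1]);
    }
    free(copy);
    return shared;
}
```

Neither version is "the" right copy; which one is wanted depends on
whether the strings are meant to stay shared.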

If you want an abbreviated way to both `define' and `copy', then you'll
need to make new syntax to do that.  Guile provides all the power you
need to do this.

  Mark



Re: Guile: What's wrong with this?

2012-01-05 Thread Mike Gran
> `define' merely makes a new reference to an existing object.  If you
> want a copy, you must explicitly ask for one (though this could be
> hidden by custom syntax).  It would not be desirable for the language to
> make copies automatically as part of the core `define' syntax.  For one
> thing, sometimes you don't want a copy.  Sometimes you want shared
> mutable objects.

It is curious that the action of 'copy' really means the
action of 'create a copy with different properties'.
 
Shouldn't (string-copy "a") create another immutable string?
 
Likewise, shouldn't (substring "abc" 1) return an immutable substring?



Re: Guile: What's wrong with this?

2012-01-05 Thread Mark H Weaver
Mike Gran  writes:
> It is curious that action of 'copy' really means the
> action of 'create a copy with different properties'.
>  
> Shouldn't (string-copy "a") create another immutable string?

Why would you want to copy an immutable string?

> Likewise, shouldn't (substring "abc" 1) return an immutable substring?

As I understand it, in the Scheme standards (at least before R6RS's
immutable pairs) the rationale behind marking literal constants as
immutable is solely to avoid needlessly making copies of those literals,
while flagging accidental attempts to modify them, since that is almost
certainly a mistake.

If that is the only rationale for marking things read-only, then there's
no reason to mark copies read-only.  The philosophy of Scheme (at least
before R6RS) was clearly to make almost all data structures mutable.

Following that philosophy, in Guile, even though (substring "abc" 1)
postpones copying the string buffer, it must create a new heap object.
Once you've done that, it is feasible to implement copy-on-write.

Now, the immutable pairs of R6RS and Racket have an entirely different
rationale, namely that they enable vastly more effective optimization in
a compiler.  In this case, presumably you'd want copies to retain the
immutability.

 Mark



Re: Guile: What's wrong with this?

2012-01-05 Thread Noah Lavine
Hello all,

I must admit that I do not know much about why R5RS says that literals
are constant, but I think there is a misunderstanding.

Bruce does not want `define' to always copy its result. I think what
he wants is for literals embedded in source code to be mutable. This
would, of course, imply that each literal in the source code would be
a new copy, even if they were identical.

Weirdly enough, that is how my intuition works too. After all, if I
made a string object in Scheme without going to any trouble, I would
get a mutable object. If I write down a string, I expect to get the
same sort of object. Bruce is also right that this enables quick and
easy programming that munges strings.

And I think the argument about putting strings in constant memory is
bad - constant memory is an implementation detail. If it happens that
we can store literals more efficiently when they are not mutated, then
perhaps we should just detect that case and switch representations.

Of course there is a trade-off here between ease of implementation and
ease of use. This change seems pretty unimportant to me, especially if
Python does all right with immutable strings, so I do not think it's
important for us to support it. I just don't buy the arguments against
supporting it.

Noah

On Thu, Jan 5, 2012 at 8:41 PM, Mark H Weaver  wrote:
> Mike Gran  writes:
>> It is curious that action of 'copy' really means the
>> action of 'create a copy with different properties'.
>>
>> Shouldn't (string-copy "a") create another immutable string?
>
> Why would you want to copy an immutable string?
>
>> Likewise, shouldn't (substring "abc" 1) return an immutable substring?
>
> As I understand it, in the Scheme standards (at least before R6RS's
> immutable pairs) the rationale behind marking literal constants as
> immutable is solely to avoid needlessly making copies of those literals,
> while flagging accidental attempts to modify them, since that is almost
> certainly a mistake.
>
> If that is the only rationale for marking things read-only, then there's
> no reason to mark copies read-only.  The philosophy of Scheme (at least
> before R6RS) was clearly to make almost all data structures mutable.
>
> Following that philosophy, in Guile, even though (substring "abc" 1)
> postpones copying the string buffer, it must create a new heap object.
> Once you've done that, it is feasible to implement copy-on-write.
>
> Now, the immutable pairs of R6RS and Racket have an entirely different
> rationale, namely that they enable vastly more effective optimization in
> a compiler.  In this case, presumably you'd want copies to retain the
> immutability.
>
>     Mark
>



Re: Guile: What's wrong with this?

2012-01-06 Thread David Kastrup
Mike Gran  writes:

>> `define' merely makes a new reference to an existing object.  If you
>> want a copy, you must explicitly ask for one (though this could be
>> hidden by custom syntax).  It would not be desirable for the language to
>> make copies automatically as part of the core `define' syntax.  For one
>> thing, sometimes you don't want a copy.  Sometimes you want shared
>> mutable objects.
>
> It is curious that action of 'copy' really means the
> action of 'create a copy with different properties'.
>  
> Shouldn't (string-copy "a") create another immutable string?

That would be rather pointless.  You could just use the original string.

> Likewise, shouldn't (substring "abc" 1) return an immutable substring?

Why wouldn't you be using substring/shared if you are not going to
modify either string?

-- 
David Kastrup




Re: Guile: What's wrong with this?

2012-01-06 Thread Mike Gran
> From: Mark H Weaver 
 >>  It is curious that action of 'copy' really means the
>>  action of 'create a copy with different properties'.
>>   
>>  Shouldn't (string-copy "a") create another immutable string?
> 
> Why would you want to copy an immutable string?
> 
> >>  Likewise, shouldn't (substring "abc" 1) return an immutable substring?

I was being too snarky and rhetorical.  Gotta stop writing e-mail
before getting coffee.
 
To say something possibly semi-constructive...
 
The word 'string' in Scheme is overloaded to mean both string
immutables and string mutables.   Since a string immutable
can't be modified to be a mutable, they really are different
object types.  String mutables appear to still exist in the 
latest draft of R7RS. 
 
Many of the procedures that operate on strings are overloaded
to take both immutables and mutables, but some, like string-set!
take only mutables.
 
There is an obvious syntax to construct a string immutable
object: namely to have it appear as a literal in the source code.
There thus isn't a need for a constructor function.
 
There is a need for a constructor function to create string mutables,
because a literal string in the source code indicates a string immutable.
 
There are such constructors: (string <char> ...) and (make-string k <char>)
which is fine.
 
But there is no constructor for a string mutable that initializes
it with a string in Guile 2.0.  There was in Guile 1.8, where
you could do (define <variable> <string-literal>).
 
So instead, syntactically, we now have to use 'string-copy' or 'substring'
for its *side-effects*, namely that it doesn't mark the copy immutable.
Those are rather poor and confusing names for constructors.
 
If making such a suggestion weren't pointless, I'd pitch the idea
of overloading 'string' or 'make-string' so they can be used as
a constructor of a string mutable.  Something like
(string <string>) or (make-string <string>).  This
would be clearer than using string-copy, I think.
 
Thanks,
 
Mike



Re: Guile: What's wrong with this?

2012-01-06 Thread David Kastrup
Mike Gran  writes:

> There is an obvious syntax to construct a string immutable
> object: namely to have it appear as a literal in the source code.
> There thus isn't a need for a constructor function.

Huh?  There are _lots_ of strings which are better computed than spelled
out.

> But there is no constructor for a string mutable that initializes
> it with a string in Guile 2.0.

(string-copy "x")

> There was in Guile 1.8, where
> you could do (define  ).

No, it wasn't.

guile> (define (x) "x")
guile> (x)
"x"
guile> (string-upcase! (x))
"X"
guile> (x)
"X"
guile>

As you can see, reevaluating the definition suddenly delivers a changed
result, because we are not talking about modifying a mutable string
initialized with a literal, but about modifying the literal itself.

Whether or not you replace the function body with
(define y "x") y
instead of just "x" does not change the result and does not change
what happens.  y does not refer to a string initialized from the
literal, it refers to the literal.  And changing the literal is a really
bad idea.

Just because you do not understand what the code did previously does not
mean that the behavior was well-defined.

-- 
David Kastrup




Re: Guile: What's wrong with this?

2012-01-06 Thread Mark H Weaver
Mike Gran  writes:
> The word 'string' in Scheme is overloaded to mean both string
> immutables and string mutables.   Since a string immutable
> can't be modified to be a mutable, they really are different
> object types.  String mutables appear to still exist in the 
> latest draft of R7RS. 
>  
> Many of the procedures that operate on strings will are overloaded
> to take both immutables and mutables, but some, like string-set!
> take only mutables.

This is the wrong way to think about it.  In Scheme, mutable and
immutable strings are _not_ different types.

The way to think about it is that in Scheme, the program text itself is
immutable, including any literals contained in it.  This is true of
_all_ literals, including '(literal lists), '#(literal vectors),
"literal strings", #'(literal syntax) and any other types that might be
added in the future that would otherwise be mutable.

Imagine that you were evaluating Scheme by hand on paper.  You have your
program written on one page, and you have another scratch page used for
the data structures that your program creates during evaluation.
Suppose your program contains a very large lookup table, written as a
literal list.  This lookup table is on your program page.

Now, suppose you are asked to evaluate (lookup key big-lookup-table).

The way Scheme works is that `big-lookup-table' is _not_ copied.  As
`lookup' traverses the table, it contains pointers within the program
page itself.  However, Scheme prohibits you from modifying _anything_
that happens to be on the program page.  It's not a question of type.
It's a question of which page the data happens to be on.
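
The "one type, two pages" idea can be sketched in C as a single string
struct with a flag.  Everything here is hypothetical (`page_string' and
its helpers are illustration only, not Guile's representation): the
literal and the copy have the same type, and only the flag decides
whether a write is refused.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only; names are hypothetical.  One string type, with a flag
   recording whether the object lives on the "program page". */
typedef struct {
    char *chars;
    int read_only;
} page_string;

/* Refuse writes to read-only strings instead of silently changing them. */
static int page_string_set(page_string *s, size_t i, char c)
{
    if (s->read_only)
        return -1;             /* Guile 2.0 signals an error here */
    s->chars[i] = c;
    return 0;
}

/* A copy goes to the "scratch page": same type, but writable. */
static page_string page_string_copy(const page_string *s)
{
    page_string out = { NULL, 0 };

    out.chars = malloc(strlen(s->chars) + 1);
    if (out.chars != NULL)
        strcpy(out.chars, s->chars);
    return out;
}

/* Returns 1 if the literal is protected and the copy is writable. */
static int page_demo(void)
{
    char program_text[] = "hello";
    page_string literal = { program_text, 1 };
    page_string scratch = page_string_copy(&literal);
    int ok;

    if (scratch.chars == NULL)
        return 0;
    ok = page_string_set(&literal, 0, 'x') == -1
         && page_string_set(&scratch, 0, 'x') == 0
         && strcmp(literal.chars, "hello") == 0
         && strcmp(scratch.chars, "xello") == 0;
    free(scratch.chars);
    return ok;
}
```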

Now, we _could_ force you to copy big-lookup-table from the program page
onto the scratch page before doing `lookup', just in case `lookup' might
try to mutate its structure.  But that would be a lot of wasted effort.

Alternatively, we could allow you to modify the program itself.  This
is what Guile 1.8 did.  You _could_ make an argument that this is
desirable, on the grounds that we should trust that the programmer knows
what he's doing.

However, it's clear that Bruce did _not_ understand what he was doing.
I don't think that he (or you) realized that the following procedure was
buggy in Guile 1.8:

  (define (ten-spaces-with-one-star-at i)
    (define s "          ")
    (string-set! s i #\*)
    s)

Guile 1.8's permissivity allowed Bruce to unwittingly create a large
body of code that was inherently buggy.  IMHO, it would have been much
better to nip that in the bud and alert him to the fact that he was
doing something that was almost certainly unwise.
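
A C analogue makes the bug visible.  This is a sketch with hypothetical
names: `star_at_buggy' keeps its ten spaces in one static buffer, much
as the Guile 1.8 procedure above kept them in the program text, while
`star_at_fixed' builds a fresh buffer per call, which is the role
`string-copy' plays in Scheme.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Buggy analogue: the ten spaces live in one buffer that is part of the
   program, so every call scribbles on the same storage. */
static const char *star_at_buggy(size_t i)
{
    static char s[] = "          ";   /* ten spaces, shared by all calls */

    s[i] = '*';
    return s;
}

/* Fixed analogue: each call gets a fresh buffer of ten spaces.
   The caller owns the result and must free it. */
static char *star_at_fixed(size_t i)
{
    char *s = malloc(11);

    if (s == NULL)
        return NULL;
    memset(s, ' ', 10);
    s[10] = '\0';
    s[i] = '*';
    return s;
}

/* Count the stars in a string. */
static int count_stars(const char *s)
{
    int n = 0;

    for (; *s; s++)
        n += (*s == '*');
    return n;
}
```

Calling the buggy version with index 0 and then index 5 leaves two stars
in the shared buffer; the fixed version returns exactly one star each
time.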

> There is a need for a constructor function to create string mutables,
> because a literal string in the source code indicates a string immutable.
>  
> There are such constructors: (string  ...) and (make-string k )
> which is fine.
>  
> But there is no constructor for a string mutable that initializes
> it with a string in Guile 2.0.

Yes there is: (string-copy "string-literal")

If you don't like the name, then rename it:

  (define mutable-string string-copy)

 Mark



Re: Guile: What's wrong with this?

2012-01-06 Thread Bruce Korb

On 01/06/12 10:13, Mark H Weaver wrote:
> Imagine that you were evaluating Scheme by hand on paper.  You have your
> program written on one page, and you have another scratch page used for
> the data structures that your program creates during evaluation.
> Suppose your program contains a very large lookup table, written as a
> literal list.  This lookup table is on your program page.
>
> Now, suppose


That is where my mental model diverges!!


sprintf(buf, "(define %s \"%s\")", "foo", my_str);
scm_eval_string(buf);
sprintf(buf, "(string-upcase! %s)", "foo");
// the string from my_str in "buf" is now scribbled over and completely gone
scm_eval_string(buf);


Since I know the program I initially wrote (the define) is now gone,
the string must have been copied off somewhere.  I think one's first
guess is that it was copied to someplace modifiable.  However, that
would be incorrect.  It is copied off to writable memory, but marked
as read-only for the purposes of Guile.  Not intuitively obvious.


> Guile 1.8's permissivity allowed Bruce to unwittingly create a large
> body of code that was inherently buggy.  IMHO, it would have been much
> better to nip that in the bud and alert him to the fact that he was
> doing something that was almost certainly unwise.


Fail early and fail hard.  Yes.  But after all these discussions, I
now doubt I have too many places where I am expecting to change a
static value.  Most of the strings that I wind up  altering are
created with a scm_from_locale_string() C function call.  Very few
strings are ever actually initialized with (define foo "something"),
other than when creating placeholders because you cannot define
within a nested collection of functions.  e.g.
  (if (whatever)
      (define foo (get "this"))
      (define foo (get "that"))  )
  (string-upcase! foo)



Anyway, I did compile and build my toy and guile with CFLAGS='-g -O0'.
The error message did not show.  Instead it seg faulted while trying
to make this call:  scm_from_locale_string("");

There must be a corruption somewhere.  It is either asymptomatic with
Guile 1.8 (viz. my fault) or it is introduced with Guile 2.0 (meaning
a Guile code issue).  More in a few days.

Thank you.



Re: Guile: What's wrong with this?

2012-01-06 Thread David Kastrup
Bruce Korb  writes:

> On 01/06/12 10:13, Mark H Weaver wrote:
>> Imagine that you were evaluating Scheme by hand on paper.  You have your
>> program written on one page, and you have another scratch page used for
>> the data structures that your program creates during evaluation.
>> Suppose your program contains a very large lookup table, written as a
>> literal list.  This lookup table is on your program page.
>>
>> Now, suppose
>
> That is where my mental model diverges!!

The mental model of the computer is what counts.

>> sprintf(buf, "(define %s \"%s\")", "foo", my_str);
>> scm_eval_string(buf);
>> sprintf(buf, "(string-upcase! %s)", "foo");
>> // the string from my_str in "buf" is now scribbled over and completely gone
>> scm_eval_string(buf);
>
> Since I know the program I initially wrote (the define) is now gone,

Why would a define be gone?

> the string must have been copied off somewhere.

I don't think you understand the concept of garbage collection.
_Everything_ in Scheme exists permanently regarding all observable
semantics (well, weak hash tables are a somewhat weird exception).
Definitions, variables, continuations.  There is no concept like a stack
of local values that would get erased.  Thanks to call/cc, there is not
even a return stack that would get erased.  Every object carries its own
lifetime with it.  It dies when nobody remembers it, not because of
being in some scope or whatever else.
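
A small sketch of that point (the names `make-counter' and `next-id' are
made up for illustration): the variable `count' outlives the call that
created it, because the returned procedure still remembers it.

```scheme
;; `count' is local to make-counter, yet it survives after make-counter
;; returns, since the lambda below still refers to it.
(define (make-counter)
  (let ((count 0))
    (lambda ()
      (set! count (+ count 1))
      count)))

(define next-id (make-counter))

(define a (next-id))   ; 1
(define b (next-id))   ; 2
```

Only when nothing refers to `next-id' any more can the collector reclaim
`count'.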

> I think one's first guess is that it was copied to someplace
> modifiable.  However, that would be incorrect.  It is copied off to
> writable memory, but marked as read-only for the purposes of Guile.
> Not intuitively obvious.

Also wrong.

-- 
David Kastrup




Re: Guile: What's wrong with this?

2012-01-06 Thread Mark H Weaver
David Kastrup  writes:

> Bruce Korb  writes:
>
>>> sprintf(buf, "(define %s \"%s\")", "foo", my_str);
>>> scm_eval_string(buf);
>>> sprintf(buf, "(string-upcase! %s)", "foo");
>>> // the string from my_str in "buf" is now scribbled over and completely gone
>>> scm_eval_string(buf);
>>
>> Since I know the program I initially wrote (the define) is now gone,
>
> Why would a define be gone?

I think what Bruce means here is that, in theory, the string object
created in the above `define' might have held a reference to part of his
buffer `buf'.  And indeed, we do make a copy of that buffer.  So why not
make a mutable copy?

The reason is that, even though we make a copy of the program as we read
it (converting from the string representation of `buf' into our internal
representation), we'd like to be able to use the program multiple times.

When I speak of the "program text", I'm not referring to the string
representation of the program, but rather the internal representation.

If we allow the user to unwittingly modify the program, it might work
once but fail thereafter, as in:

  (define ten-spaces-with-one-star-at
    (lambda (i)
      (define s "          ")
      (string-set! s i #\*)
      s))

Now, some reasonable people might say "Why arbitrarily limit the user?
He might know what he's doing, and he might really want to do this!"

Scheme provides a nice way to do this too:

  (define ten-spaces-with-new-star-at
    (let ((s (make-string 10 #\space)))
      (lambda (i)
        (string-set! s i #\*)
        s)))

I normally lean toward assuming that the user knows what he's doing, but
in this case I think Scheme got it right.  Accidentally modifying
literals is a very common mistake, and is almost never a good idea.

If you want to make a program with internal mutable state, Scheme
provides free variables, as used in the example above.
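
To make the difference concrete, here is a rough sketch contrasting the
two intentions.  The names `fresh-star-at' and `shared-star-at' are
invented for this illustration; neither touches a literal, so both should
behave the same way regardless of how literals are treated.

```scheme
;; A new string on every call: each result has exactly one star.
(define (fresh-star-at i)
  (let ((s (make-string 10 #\space)))
    (string-set! s i #\*)
    s))

;; One string shared by all calls through a free variable:
;; the stars accumulate from call to call.
(define shared-star-at
  (let ((s (make-string 10 #\space)))
    (lambda (i)
      (string-set! s i #\*)
      s)))

(define fresh-1  (string-count (fresh-star-at 2) #\*))    ; 1
(define fresh-2  (string-count (fresh-star-at 5) #\*))    ; 1
(define shared-1 (string-count (shared-star-at 2) #\*))   ; 1
(define shared-2 (string-count (shared-star-at 5) #\*))   ; 2
```

Either behaviour is available on purpose; what the read-only literal rules
out is getting the second one by accident.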

 Mark



Re: Guile: What's wrong with this?

2012-01-07 Thread Mark H Weaver
Bruce Korb  writes:
> Fail early and fail hard.  Yes.  But after all these discussions, I
> now doubt I have too many places where I am expecting to change a
> static value.

That's good news! :)

> Most of the strings that I wind up altering are created with a
> scm_from_locale_string() C function call.

BTW, beware that scm_from_locale_string() is only appropriate for
strings that came from the user (e.g. command-line arguments, reading
from a port, etc).  When converting string literals from your own source
code, you should use scm_from_latin1_string() or scm_from_utf8_string().

Similarly, to make symbols from C string literals, use
scm_from_latin1_symbol() or scm_from_utf8_symbol().

Caveat: these functions did not exist in Guile 1.8.  If your C string
literals are ASCII-only, I guess it won't matter in practice which
function you use, although it would be good to spread the understanding
that C string literals should not be interpreted according to the user's
locale.

Best,
 Mark



Re: Guile: What's wrong with this?

2012-01-07 Thread Ian Price
Mark H Weaver  writes:

> As I understand it, in the Scheme standards (at least before R6RS's
> immutable pairs) the rationale behind marking literal constants as
> immutable is solely to avoid needlessly making copies of those literals,
> while flagging accidental attempts to modify them, since that is almost
> certainly a mistake.
Erm, if you don't count literals, which were already immutable, then
R6RS doesn't have immutable pairs. It does move the mutators to a
separate module, but that is not really equivalent, because even if
you don't import (rnrs mutable-pairs), another module may mutate pairs
returned by your library. Ditto for strings, etc.

To quote section 5.10
"Literal constants, the strings returned by symbol->string, records with
no mutable fields, and other values explicitly designated as immutable
are immutable objects, while all objects created by the other procedures
listed in this report are mutable."
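
A minimal sketch of what that means in practice, assuming only standard
procedures: the string returned by symbol->string counts as immutable,
while the one returned by string-copy is freshly allocated and may be
modified.

```scheme
;; symbol->string yields a string that must be treated as immutable;
;; string-copy returns a fresh string that is safe to mutate.
(define name (string-copy (symbol->string 'hello)))

(string-set! name 0 #\j)
;; name is now "jello"
```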

-- 
Ian Price

"Programming is like pinball. The reward for doing it well is
the opportunity to do it again" - from "The Wizardry Compiled"



Re: Guile: What's wrong with this?

2012-01-07 Thread Mark H Weaver
Hi Ian!

Ian Price  writes:

> Mark H Weaver  writes:
>
>> As I understand it, in the Scheme standards (at least before R6RS's
>> immutable pairs) the rationale behind marking literal constants as
>> immutable is solely to avoid needlessly making copies of those literals,
>> while flagging accidental attempts to modify them, since that is almost
>> certainly a mistake.
> Erm, if you don't count literals, which were already immutable, then
> R6RS doesn't have immutable pairs. It does move the mutators to a
> separate module, but that is not really equivalent, because even if
> you don't import (rnrs mutable-pairs), another module may mutate pairs
> returned by your library. Ditto for strings, etc.
>
> To quote section 5.10
> "Literal constants, the strings returned by symbol->string, records with
> no mutable fields, and other values explicitly designated as immutable
> are immutable objects, while all objects created by the other procedures
> listed in this report are mutable."

Ah, I guess you're right.  I never studied the R6RS carefully outside of
its handling of numerics.  I wrote "at least before R6RS" to indicate
that I was only knowledgeable about earlier versions.

Racket's immutable pairs represent a break from the older tradition.  Last
I looked anyway, Racket's mutable pairs cannot even be accessed with the
standard `car' and `cdr'.  Therefore, they really are a different (and
incompatible) type from immutable pairs.

I still suspect that the rationale behind immutable pairs in the R6RS is
to discourage mutation of pairs, to give compiler implementations such
as Racket the freedom to make pairs truly immutable and thus benefit from
better optimization.  However, I mistakenly implied that immutable pairs
were a distinct type in the R6RS itself, and for that I apologize.

Thanks,
  Mark