Re: [PATCHES] concat, read, substr, added 'ord' operator, and a SURPRISE

2001-11-13 Thread Tom Hughes

In message <[EMAIL PROTECTED]>
  Dan Sugalski <[EMAIL PROTECTED]> wrote:

> At 03:35 AM 11/11/2001 -0500, James Mastros wrote:
>
> >No, it isn't.  I'm not sure s->strlen is always guaranteed to be correct;
> >string_length(s) is.  (I found a case where it was wrong when coding my
> >version of ord() once, though that ended up being a problem with my
> >version of chr().  The point is that string_length is an API, but the
> >contents of the struct are not.)
> 
> We shouldn't cheat--the string length field should be considered a black
> box until we need the speed, at which point we play Macro Games and change
> string_length into a direct fetch.

As far as I know the strlen member should always be correct. I was
certainly trying to make sure it was because strings.pod explicitly
says that it will be and that it can be used directly instead of
calling string_length().

Tom

-- 
Tom Hughes ([EMAIL PROTECTED])
http://www.compton.nu/




Re: [PATCHES] concat, read, substr, added 'ord' operator, and a SURPRISE

2001-11-12 Thread Simon Cozens

On Mon, Nov 12, 2001 at 12:05:04PM -0500, Dan Sugalski wrote:
> Will. Docs, darn it! Must have docs! Tests, too, but if you have docs you 
> can rope someone into writing the tests and the lot of 'ya can submit a 
> chunk of patches. :)

And if you have docs and tests, you might be able to convince someone
here to commit it. :) (That's a funny way of saying "the patch looks
good, add a bit of polish and I'll commit it")

-- 
It's difficult to see the picture when you are inside the frame.



Re: [PATCHES] concat, read, substr, added 'ord' operator, and a SURPRISE

2001-11-12 Thread Dan Sugalski

At 03:35 AM 11/11/2001 -0500, James Mastros wrote:
>On Sun, 11 Nov 2001, Alex Gough wrote:
> > On Sun, 11 Nov 2001, Alex Gough wrote:
> > > ook, cool, but string_length returns an INTVAL, not an int.
> > Remember that people who say "negative" usually mean "positive", they
> > just don't know it yet.  Always look on the bright si-ide of life, de
> > do, de do de do de do.
>Huha?  Anyway, it doesn't matter.  INTVALs and ints will be converted
>implicitly in C.

Doesn't matter--we should do The Right Thing from the beginning. (And we're 
not guaranteed they're the same, since we're not guaranteeing that 
sizeof(INTVAL) == sizeof(int))
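
To make the size concern concrete, here is a minimal sketch of a checked narrowing from INTVAL to int. It assumes INTVAL is wider than int; the typedef below is only a stand-in (the real type is chosen at configure time), and intval_to_int is a hypothetical helper, not part of Parrot.

```c
#include <limits.h>
#include <stdbool.h>

/* Assumption: INTVAL is wider than int on this build. long long is
   only a stand-in for whatever the configure step actually picks. */
typedef long long INTVAL;

/* Hypothetical helper: narrow an INTVAL to int only when the value fits.
   Returns true and stores the result on success; returns false and
   leaves *out untouched when the value is out of range for int. */
static bool intval_to_int(INTVAL v, int *out)
{
    if (v < INT_MIN || v > INT_MAX) {
        return false;   /* an implicit conversion would not preserve v */
    }
    *out = (int)v;
    return true;
}
```

Keeping the variable as an INTVAL end to end avoids the question entirely; a helper like this only matters at a boundary that really needs an int.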

> > Yes, also this doesn't follow the style of the rest of string.c
> > (s->strlen is your friend)
>No, it isn't.  I'm not sure s->strlen is always guaranteed to be correct;
>string_length(s) is.  (I found a case where it was wrong when coding my
>version of ord() once, though that ended up being a problem with my
>version of chr().  The point is that string_length is an API, but the
>contents of the struct are not.)

We shouldn't cheat--the string length field should be considered a black 
box until we need the speed, at which point we play Macro Games and change 
string_length into a direct fetch.

It's possible that for speed reasons some code will not bother setting the 
length field in the string struct. The GC system needs the bytes allocated 
to be right, but that's it.
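
One way the "Macro Games" could play out is sketched below. The STRING layout is a simplified assumption and STRING_LENGTH_FAST is a hypothetical build flag; neither is taken from the actual sources. Callers always write string_length(s), so the switch to a direct fetch needs no changes at the call sites.

```c
#include <stddef.h>

typedef long INTVAL;   /* stand-in type; an assumption */

/* Simplified, assumed layout; the real struct has more fields. */
typedef struct {
    void  *bufstart;   /* start of the character buffer */
    INTVAL buflen;     /* bytes allocated (what the GC cares about) */
    INTVAL strlen;     /* length in characters */
} STRING;

#ifdef STRING_LENGTH_FAST
/* Speed build: the API name becomes a direct field fetch. */
#  define string_length(s) ((s)->strlen)
#else
/* Default build: a real function, so the field stays a black box and
   the implementation is free to compute the length lazily later. */
static INTVAL string_length(const STRING *s)
{
    return s->strlen;
}
#endif
```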

> > string_ord should be more like this anyhow:
> > if (index < 0) {
> >     index = s->strlen + index; /* zero based */
> > }
>I'd tend to disagree; this is more of a language thing than a runtime
>thing.  If it's done on a Parrot level, then it should be in the oplib,
>not in string.c (I'd also say that this goes for all the
>negative-index-from-end stuff that Perl does.  (Which I like, but biases
>the Parrot's core in a way that I don't like.))

I don't mind biasing Parrot's core this way. negative-from-the-end doesn't 
get in the way of other languages, so I'm fine with it, and it's 
potentially a speedup.
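
For reference, negative-from-the-end indexing amounts to something like the sketch below. normalize_index is a hypothetical helper name, and returning -1 for an out-of-range index is an assumption made for the example, not a decision from this thread.

```c
typedef long INTVAL;   /* stand-in type; an assumption */

/* Hypothetical helper: map a possibly negative index onto 0..length-1.
   A negative index counts back from the end, so -1 is the last
   character. Returns -1 when the result falls outside the string. */
static INTVAL normalize_index(INTVAL index, INTVAL length)
{
    if (index < 0) {
        index += length;          /* -1 becomes length - 1, and so on */
    }
    if (index < 0 || index >= length) {
        return -1;                /* still out of range after adjusting */
    }
    return index;
}
```

A language without negative indexing never passes a negative value, so for it the only cost is the extra comparison.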

> > Also, is it wise to be #defining every one of our errors to be 1,
> > aren't these better being an enum, or is there merely not yet a plan
> > for exceptions that works?
>I think there's not yet a plan for exceptions.  (Note, though, that the
>values of the exception #defines are currently only used for errorlevels
>on the interpreter's death.)

Exceptions are close to the top of the list for things, since we've gotten 
to the point where stuff can throw more than fatal errors.
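
Until there is an exception plan, the enum suggestion quoted above might look roughly like this. All names here are hypothetical and chosen only for illustration; the point is that each error gets a distinct value instead of every #define being 1.

```c
/* Hypothetical error codes; the names are illustrative only. */
typedef enum {
    ERR_NONE = 0,
    ERR_STRING_INDEX_OOB,     /* index outside the string */
    ERR_UNKNOWN_ENCODING,     /* encoding not recognised */
    ERR_NULL_STRING           /* NULL passed where a string was expected */
} error_code_t;

/* Map a code to a short description, e.g. for a message printed
   just before the interpreter exits. */
static const char *error_name(error_code_t e)
{
    switch (e) {
    case ERR_NONE:             return "no error";
    case ERR_STRING_INDEX_OOB: return "string index out of bounds";
    case ERR_UNKNOWN_ENCODING: return "unknown encoding";
    case ERR_NULL_STRING:      return "null string";
    }
    return "unknown error";
}
```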

> > (The general gist of the patch is damn good, btw)
>I agree.  (The documentation is even more lacking than in mine, though.
>Dan'll complain.  Or should, anyway.)

Will. Docs, darn it! Must have docs! Tests, too, but if you have docs you 
can rope someone into writing the tests and the lot of 'ya can submit a 
chunk of patches. :)

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: [PATCHES] concat, read, substr, added 'ord' operator, and a SURPRISE

2001-11-11 Thread James Mastros

On Sun, 11 Nov 2001, Alex Gough wrote:
> On Sun, 11 Nov 2001, Alex Gough wrote:
> > ook, cool, but string_length returns an INTVAL, not an int.
> Remember that people who say "negative" usually mean "positive", they
> just don't know it yet.  Always look on the bright si-ide of life, de
> do, de do de do de do.
Huha?  Anyway, it doesn't matter.  INTVALs and ints will be converted
implicitly in C.

> Yes, also this doesn't follow the style of the rest of string.c
> (s->strlen is your friend)
No, it isn't.  I'm not sure s->strlen is always guaranteed to be correct;
string_length(s) is.  (I found a case where it was wrong when coding my
version of ord() once, though that ended up being a problem with my
version of chr().  The point is that string_length is an API, but the
contents of the struct are not.)

> and I'm not sure that (char*)[index]
> is the right way to get ord for $encoding.
It is in fact quite wrong; its return value is useless for anything but
single-byte encodings and 7-bit (ASCII-only) UTF-8.

> Have we actually worked out what ord should do yet?
The meaning that I like is this:

Two-arg ord sets $1 to the codepoint of $2's first character in $2's
current chartype.  (This means that it isn't very useful without another
op to get the current chartype of a string.) 

Thus, ord(s, index) looks something like:
    if (s->encoding == ENCODING_SINGLEBYTE) {
        /* cast the buffer pointer first, then index it */
        return ((unsigned char *)s->bufstart)[index];
    } else {
        transcode(s, ENCODING_UTF32, NULL);
        return ((utf32_t *)s->bufstart)[index];
    }
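
A self-contained variant of the idea above, as a sketch only: the STRING layout, the encoding_t enum, and the -1 return for out-of-range or unsupported cases are all assumptions made for the example. It does not transcode; a variable-width buffer is simply rejected, where the version above would convert to UTF-32 first.

```c
#include <stdint.h>

typedef long INTVAL;   /* stand-in type; an assumption */

/* Assumed encoding tags, named after the constants used above. */
typedef enum {
    ENCODING_SINGLEBYTE,
    ENCODING_UTF8,
    ENCODING_UTF32
} encoding_t;

/* Simplified, assumed string layout. */
typedef struct {
    const void *bufstart;   /* start of the character buffer */
    INTVAL      strlen;     /* length in characters */
    encoding_t  encoding;   /* how the buffer is encoded */
} STRING;

/* Return the codepoint at character position index, or -1 if the
   index is out of range or the encoding is variable-width (which
   would need a transcode to a fixed-width form first). */
static INTVAL string_ord(const STRING *s, INTVAL index)
{
    if (index < 0 || index >= s->strlen) {
        return -1;
    }
    switch (s->encoding) {
    case ENCODING_SINGLEBYTE:
        /* unsigned char keeps bytes above 0x7F from going negative */
        return ((const unsigned char *)s->bufstart)[index];
    case ENCODING_UTF32:
        /* fixed width: one 32-bit unit per character */
        return (INTVAL)((const uint32_t *)s->bufstart)[index];
    default:
        return -1;
    }
}
```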


> string_ord should be more like this anyhow:
> if (index < 0) {
>     index = s->strlen + index; /* zero based */
> }
I'd tend to disagree; this is more of a language thing than a runtime
thing.  If it's done on a Parrot level, then it should be in the oplib,
not in string.c (I'd also say that this goes for all the
negative-index-from-end stuff that Perl does.  (Which I like, but biases
the Parrot's core in a way that I don't like.))

> Also, is it wise to be #defining every one of our errors to be 1,
> aren't these better being an enum, or is there merely not yet a plan
> for exceptions that works?
I think there's not yet a plan for exceptions.  (Note, though, that the
values of the exception #defines are currently only used for errorlevels
on the interpreter's death.)

> (The general gist of the patch is damn good, btw)
I agree.  (The documentation is even more lacking than in mine, though.
Dan'll complain.  Or should, anyway.)

-=- James Mastros
-- 
Put bin Laden out like a bad cigar: http://www.fieler.com/terror
"You know what happens when you bomb Afghanistan?  That's right, you knock
over the rubble."   -=- SLM