Nicholas Clark wrote:
> 
> On Wed, Nov 22, 2000 at 01:24:50PM -0500, Chaim Frenkel wrote:
> > I'd offer the possibility that there are two (or perhaps more)
> > different problems here.  One is the current bunch of bytes (string,
> > executable to be twiddled).  Another, which the attributes on strings
> > seem to be, is structured data.
> >
> > Squeezing attributes onto a buffer, seems to be shoehorning a more
> > general problem onto a specific implementation.
> >
> > Getting an efficient representation of a meaningful structure should
> > be done as a new data type.
> >
> > (I'm thinking of representing COBOL records/data, or even XML documents)

That (XML) is what I was thinking of as well when writing the proposal.
Hmm, I should modify it to use the XML buzzword; this could greatly
enhance its obvious value.  Maybe the title should be:

"Perl should use XML as its basic data type instead of linear strings"

How does that sound?

> Have I misunderstood you if I suggest that "two or more" is actually a
> continuous range of representation from
> 
> 1 (contiguous linear) string data with 0 or more attributes attached to each
>   character where the string's text is the backbone
>   [and the global and local order of the characters in the string is crucial
>    to the value and equality with other variables]
> 
> 2 structured data (eg XML) where the string's text is just part of the data
>   held in the structure, and you could sort the data in different ways
>   without changing its value
> 
> Are those end members in a continuum? or are hybrids of the 2 impossible?
> Am I barking up the wrong tree completely?

I would see (1) as the simplest form of (2), so once handling (2) is
solved, (1) is also handled.  This is from a functional point of view;
performance is another issue.  It may well be that the solution to
(2) needs only minor tweaking to be fast enough for (1) compared to
the current solution, or that a completely separate implementation is
warranted.
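
To make the "simplest form" claim a bit more concrete, here is a minimal
sketch in Python (not Perl internals).  The names Node,
from_attributed_string and flatten are made up for illustration; nothing
like them exists in the proposal.  It only shows that a flat attributed
string can be stored as a one-level tree, so code written for the tree
also covers the string.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical structured-data node: text, attributes, children."""
    text: str = ""
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def from_attributed_string(chars):
    """Case (1): a flat run of (character, attributes) pairs becomes a
    one-level tree whose children are leaves in string order."""
    return Node(children=[Node(text=c, attrs=dict(a)) for c, a in chars])


def flatten(node):
    """Recover the linear text by a depth-first walk; works at any depth."""
    return node.text + "".join(flatten(child) for child in node.children)


# Case (1): four characters, the first one carrying an attribute.
plain = from_attributed_string(
    [("P", {"bold": True}), ("e", {}), ("r", {}), ("l", {})]
)

# Case (2): the same text held in a deeper structure.
nested = Node(children=[
    Node(text="Pe"),
    Node(children=[Node(text="r"), Node(text="l")]),
])
```

Both flatten(plain) and flatten(nested) walk the same kind of node, which
is the functional point; whether such a walk is fast enough for ordinary
strings is exactly the open performance question.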

I'm with Chaim Frenkel, who wrote:
> If for no other reason, there are many ways of having the attributes
> distribute across deletions, additions, and moves. That is a policy
> decision that should not be made at the Perl internals level.

This means, IMHO, that the basic data structure for (1) must be
extensible in such a way that it can be morphed into the one for (2).
But the implementation of the functions that work on (2) is separable
from that of the functions that work on (1).
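
As an illustration of keeping that policy out of the container, here is a
small Python sketch.  Everything in it (AttrString, inherit_left,
inherit_none) is a hypothetical name chosen for this example, and only
insertion is shown; deletions and moves would need their own hooks.

```python
def inherit_left(attrs, pos):
    """Hypothetical policy: a new character copies the attributes of the
    character to its left, or gets none at the start of the string."""
    return dict(attrs[pos - 1]) if pos > 0 else {}


def inherit_none(attrs, pos):
    """Hypothetical policy: a new character starts with no attributes."""
    return {}


class AttrString:
    """Illustrative container: parallel lists of characters and attributes."""

    def __init__(self, text, attrs=None, policy=inherit_none):
        self.chars = list(text)
        self.attrs = attrs if attrs is not None else [{} for _ in text]
        self.policy = policy  # supplied by the user, not fixed by the container

    def insert(self, pos, text):
        # Ask the policy once per inserted character, before mutating,
        # so every new character gets its own attribute dict.
        new_attrs = [self.policy(self.attrs, pos) for _ in text]
        self.chars[pos:pos] = list(text)
        self.attrs[pos:pos] = new_attrs

    def __str__(self):
        return "".join(self.chars)


s = AttrString("ab", [{"bold": True}, {}], policy=inherit_left)
s.insert(1, "XY")

t = AttrString("ab", [{"bold": True}, {}])  # default: inherit_none
t.insert(1, "X")
```

With inherit_left the inserted "XY" picks up the bold attribute from "a";
with inherit_none it does not.  The container code is the same in both
cases, which is the separation Chaim is asking for.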

David Mitchell has a proposal for how this could be done:
> One way round this is to leave the semantics to the implementor of the SV type.
> This could be done by having vtable methods for *all* string ops
> known to Perl; in particular m//, s// and tr//.
> 
> The way this could work is for the Perl core to provide a generic regex
> library, which uses only the public interface to SVs to extract
> and manipulate its contents. Standard string SVs would have the relevant
> vtable entries point to these generic regex functions.
> However, if someone wants to implement an HTML SV type, say, then
> (if they are keen enough) they can write their own m//, s// methods
> which are efficient (because they can access the internal representation),
> and can have whatever semantics the author wishes.
> 
> However, since the internals of regexes are a dark art to me, I don't know
> whether it is sensible to have a single regex compiler, but multiple
> regex executors (if that's the right terminology).
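
A rough Python sketch of that dispatch idea follows.  It is not how Perl
SVs or vtables are actually laid out; StringSV, HtmlSV, generic_match and
perl_match are invented names, and Python's re module stands in for the
generic regex library.

```python
import re


def generic_match(sv, pattern):
    """Generic matcher: sees the value only through its public as_text()."""
    return re.search(pattern, sv.as_text()) is not None


class StringSV:
    """Stand-in for a standard string SV: its match entry simply points
    at the generic routine."""

    def __init__(self, text):
        self._text = text

    def as_text(self):
        return self._text

    def match(self, pattern):
        return generic_match(self, pattern)


class HtmlSV(StringSV):
    """Hypothetical HTML type with its own match semantics: patterns are
    applied to the visible text only, with tags skipped."""

    def __init__(self, parts):
        # parts: list of (kind, text) pairs, kind is "tag" or "text"
        self._parts = parts

    def as_text(self):
        return "".join(text for _, text in self._parts)

    def match(self, pattern):
        # Uses the internal representation directly instead of as_text().
        visible = "".join(t for kind, t in self._parts if kind == "text")
        return re.search(pattern, visible) is not None


def perl_match(sv, pattern):
    """What the core would do for m//: call the value's own entry."""
    return sv.match(pattern)


doc = HtmlSV([("tag", "<b>"), ("text", "bold"), ("tag", "</b>")])
```

Here perl_match(StringSV("<b>bold</b>"), r"<b>") succeeds, while
perl_match(doc, r"<b>") does not, because the HTML type chose to ignore
tags.  The sketch says nothing about the compiler/executor split raised
above; it only shows per-type dispatch.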

I'm very happy with how this discussion is going.  Do you also
feel that this could be of immense value to a lot of Perl users
out there?

Best regards,

Roland
--
[EMAIL PROTECTED]
