David Mitchell wrote:
> 
> Roland Giersig <[EMAIL PROTECTED]> wrote:
> 
> > Maybe:
> >
> > "Perl6 should excel at manipulating *formatted* text."
> 
> Quite possibly, although as a previous poster has pointed out,
> formatted text != XML.

Yes, but both share a common underlying structure: they are both
chunks of text that have (hierarchical) attributes attached, so
a data structure that can handle general XML can also hold e.g.
an RTF document.
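
A minimal sketch of such a structure, in Python purely for illustration.
The Chunk type, its field names, and the plain_text helper are all
hypothetical, assumed here only to make the idea concrete:

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """A piece of text with attributes and optional nested chunks (hypothetical)."""
    text: str = ""
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def plain_text(chunk):
    """Ignore all attributes and return the underlying text, depth first."""
    return chunk.text + "".join(plain_text(c) for c in chunk.children)


# A document-like tree: the attributes only carry style.
doc = Chunk(children=[
    Chunk("Perl should excel at "),
    Chunk("formatted", {"style": "bold"}),
    Chunk(" text."),
])

# A record-like tree: the attributes carry part of the data.
record = Chunk(attrs={"tag": "person", "id": "42"}, children=[
    Chunk("Alice", {"tag": "name"}),
])

print(plain_text(doc))
```

The same container holds both trees; only the weight carried by the
attributes differs.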

> ie in the sense that HTML, RTF, TeX etc, have a natural sense of containing
> a single piece of text with embedded attributes - which could in principle
> be stripped away or ignored. So for example, a regex would operate on the
> whole underlying text. XML on the other hand is far more general. For example
> one particular XML document might hold the names and contact details
> for a thousand individuals. Trying to treat the document as a single
> string and applying a regex to it doesn't have any particularly strong
> semantic doo-dah. In that particular example, it might make sense to
> think of the XML doc as an array of thinggies rather than a single string.

This is caused by the (well, mostly) clear distinction between style
and content in documents: if you take away the style (which carries
rather little information), you still have the full content.

A database in XML in contrast has its information equally divided
between data attributes and data content, so you cannot simply strip 
the attributes.

This does not mean that a data structure cannot equally hold both.  But
to do something sensible with it, a regex machine must be able to
match on both attributes and textual content.  This is what my proposal
is about.
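
As a rough illustration of matching on both, here is a small Python
sketch. The flat (attributes, text) representation, the sample data, and
the match_chunks helper are assumptions made up for this example, not a
proposed syntax:

```python
import re

# Each chunk is (attributes, text): a hypothetical flat form of attributed text.
chunks = [
    ({"tag": "name"}, "Alice"),
    ({"tag": "email"}, "alice@example.org"),
    ({"tag": "name"}, "Bob"),
]


def match_chunks(chunks, attr_cond, text_pattern):
    """Return the text of every chunk whose attributes contain all of
    attr_cond and whose text matches text_pattern."""
    rx = re.compile(text_pattern)
    return [
        text
        for attrs, text in chunks
        if all(attrs.get(k) == v for k, v in attr_cond.items())
        and rx.search(text)
    ]


# Match on the attribute (tag == "name") and on the text (starts with "A").
print(match_chunks(chunks, {"tag": "name"}, r"^A"))
```

A plain regex over the concatenated text could not tell the name "Alice"
from the start of the email address; the attribute condition can.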

Hope this clarifies it somewhat.

Hmm, I'm back to

"Perl should weave its magic upon attributed text chunks 
 instead of linear text."

Roland
--
[EMAIL PROTECTED]
