Gianugo Rabellino wrote:

Murray Altheim wrote:

Gianugo Rabellino wrote:

1. I couldn't find any SAX support in the API. Is that on purpose?

I'm not sure what you mean by this. Isn't "SAX support" rather orthogonal to the design of the API? Maybe an example would help here.

Sure thing. I think that the three methods in the XNode interface:

getContentAsDOM()
modifyContentAsDOM()
setContentAsDOM()

should have their SAX counterparts, like

getContentAsSAX(ContentHandler handler)
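
For readers following the thread, here is a minimal sketch of what a method with this shape might do, assuming the node content is already held as a DOM Document. The class name SaxReplaySketch and the helper methods are hypothetical and not part of XNode or Xindice; the sketch only uses the standard JAXP identity transform to replay a DOM tree as SAX events on the caller's handler, so the method returns nothing and the handler receives the content.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxReplaySketch {

    // Hypothetical stand-in for the proposed getContentAsSAX(handler):
    // replays an in-memory DOM tree as SAX events on the caller's handler.
    static void getContentAsSAX(Document content, ContentHandler handler) throws Exception {
        // An identity transform from a DOMSource to a SAXResult walks the tree
        // and fires startElement/characters/endElement on the handler.
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(content), new SAXResult(handler));
    }

    // Helper for the example only: parse a string into a namespace-aware DOM.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Collects element names in document order from the replayed SAX events.
    static List<String> elementNames(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        getContentAsSAX(parse(xml), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                names.add(qName);
            }
        });
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames("<doc><title>t</title><body/></doc>"));
    }
}
```

Whether a real implementation would replay a stored DOM or stream directly from the database is a design question this sketch does not address.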


This doesn't make any sense to me. SAX is an event-based API, so
asking a method to return an object would return what? The whole
inner guts of Xindice is about the passing of objects, with the
only role of SAX being the construction of those objects during
the parsing of some XML content.


or, better yet, how about a

Resource getResource();
setResource(Resource resource);

returning an XML:DB resource?


You already have the XML:DB API to do this sort of thing. XNode was
developed to operate at a bit simpler, perhaps higher level, that of
essentially a DOM object (the DOM Document or Element node, depending
on how one looks at it) with the added ability to add metadata due
to XNode's encapsulation. That's it: it's nothing more than that. The
combination of XML:DB, XUpdate, XPath and XNode pretty much provides
you with anything you might need in terms of views or query methods.
Certainly a more complex mechanism could be developed, but that's not
the goal of XNode (since the other three methods provide quite a lot
already).


2. AFAIU metadata, in your proposal, are a fixed set (xnode:created and xnode:modified) plus an extensible and free set of properties based on key/value pairs. Can this be expressive enough? I mean, originally I had in mind:

- a set of fixed metadata a bit more extended than yours (basically mimicking the Unix "stat.h" structure). These are valid potentially for any XML:DB database, they share a common vocabulary and can be expressed as XML attributes;

Yes, but I wouldn't assume that they are as common as you might suppose, and the wide variety of Xindice/XNode applications would seem to suggest that there may be a number of metadata approaches, where including a fixed set might conflict with the design of one of them, either by name or semantically. After adding the extension mechanism I thought about even moving 'created' and 'modified' and leaving *only* ID. Not everyone needs the same set.

Definitely. Yet there are some metadata that belong to the concept of "document". A document is created, modified, accessed and owned by someone and possibly some group. This is a basic set that IMHO can act as a foundation.


But not all database applications *need* that. XNode allows it, and
it's certainly possible to standardize the metadata named properties
(such as using DC elements), but I'd resist assuming that *everyone*
wants a specific set. For example, my application doesn't need ownership
metadata, individual or group. Somebody else does. Now if we were to
put a specific ownership property in XNode (ie., hardwired), there
very well might be database applications that literally don't agree
with that ownership semantic, and that have simpler, more complex, or
just different semantics. The definitions of a specific property can
be quite important in determining its usage. It's not one-size-fits-all.


- a set of application-specific metadata (implemented by any vendor of XML database). This is more a matter regarding the XAPI effort;

This is supportable within XNode as is, unless one really needs to be able to namespace the individual metadata elements because one wants to mix them. Even then, the property name could contain the necessary resolution.

OK.

- an extension mechanism based on RDF triples or some technology of this kind. I'm afraid that plain key/value pairs aren't expressive enough to cover all the possible needs.

I don't see this. In what way does a node-based metadata scheme need the lexical complexity, confusion, and lack of normal XML validation facility of RDF? If your needs for metadata are as complex as that, such as wanting to put Dublin Core in XNode, you don't need RDF to do that, just use the Dublin Core element names. I believe there are docs on their site regarding non-RDF use of DC.

Fine. Yet, as you correctly point out later on:

 > My take on metadata and Xindice is that if one's metadata needs are
 > really that complex, it's likely to be a better solution to keep it
 > separated from the nodes themselves and use some linking solution,
 > rather than bloating XNode with a lot of features that not everyone
 > might want.

This is the whole point. There are two approaches, each with pros and cons. The point is finding a balance between a simple solution that might end up being poor and a rich one that might be overkill.

It was pointed out many times that it is perfectly possible to write an Xindice-based application using metadata stored in a parallel collection or some trick of that kind. Now, I'm a bit against it: I don't think that moving this concern into the application domain is a good thing.

I'd rather have hooks directly in the API to do my metadata stuff. And if we are to follow this path, then the API should be as simple as possible yet rich and powerful enough to allow for complex needs.


This is why I'm a bit scared about the key/value approach, which might be too simple for many environments. Think about workflow: I think this can be a good candidate for metadata, yet I fail to see how it can be expressed with a simple (yet poor) attribute approach. The DC suggestion is OK, but only for the publishing industry; it falls short elsewhere.


Because XNode's metadata model is simple, ie., the metadata is
associated with the node by virtue of it being in the <xnode:Header>
element, querying is simple. There's an effective tuple built in:

         node -------------> property
                    |
                    |
                   name

This operates at the node level. It'd be possible to have metadata
pointing *into* the node, but as I mentioned in my previous post,
this starts to stray away from simplicity pretty quickly.
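
As an illustration of how simple node-level lookup can be, here is a minimal sketch that reads the flat name/value properties from a header into a map. The element names xnode:Header and xnode:Property and the name/value attributes come from this thread; the namespace URI, the xnode:XNode wrapper element, and the class and method names are assumptions made for the example, not the actual XNode definitions. The "dc:subject" property name shows a prefix carried inside the name, as suggested earlier for mixing vocabularies.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XNodePropertySketch {

    // Hypothetical namespace URI; the thread does not give the real one.
    static final String XNODE_NS = "urn:example:xnode";

    // Returns the header's direct Property children as name -> value pairs.
    static Map<String, String> headerProperties(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        Map<String, String> props = new LinkedHashMap<>();
        NodeList headers = doc.getElementsByTagNameNS(XNODE_NS, "Header");
        if (headers.getLength() == 0) {
            return props; // no header, so no metadata
        }
        Element header = (Element) headers.item(0);
        // Only direct children count: the model is one node, many named properties.
        for (Node child = header.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element
                    && XNODE_NS.equals(child.getNamespaceURI())
                    && "Property".equals(child.getLocalName())) {
                Element property = (Element) child;
                props.put(property.getAttribute("name"), property.getAttribute("value"));
            }
        }
        return props;
    }

    public static void main(String[] args) throws Exception {
        String xml =
            "<xnode:XNode xmlns:xnode='" + XNODE_NS + "'>"      // wrapper name is assumed
          + "<xnode:Header>"
          + "<xnode:Property name='dc:subject' value='metadata'/>"
          + "<xnode:Property name='owner' value='gianugo'/>"
          + "</xnode:Header>"
          + "<content/>"
          + "</xnode:XNode>";
        System.out.println(headerProperties(xml));
    }
}
```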

You're mostly correct in that DC was developed by the library community
to maintain metadata (ie., library catalog data) about their holdings.
But it's really quite useful outside of that context. For example,
using DC's "subject" one can go a long way in describing things, esp.
given the overlap with topic maps. Note that the Dublin Core web site
has a number of online documents describing how to extend DC to include
other categorization schemes.

Then again, if DC doesn't work in your application, don't use it. That's
why I didn't put DC into XNode. But XNode *allows* storage of DC content,
as per the DC recommendations for storing DC properties in attribute
values. (I'll have to dig up that document; I wrote up a spec last year
about extending HTML/XHTML's metadata capabilities using this particular
DC spec as a basis).


OK, RDF is overkill. Definitely. But properties probably aren't enough. There might be a balance. Hmm... just a stone in the lake, but how about nested properties? Something like:

<xnode:Property name="changelog" value="1.0">
  <xnode:Property name="version" value="0.9">
    <xnode:Property name="author" value="Murray Altheim"/>
    <xnode:Property name="comment" value="Bugs and fixes"/>
    <xnode:Property name="date" value="2002-04-16"/>
  </xnode:Property>
  <xnode:Property name="version" value="0.9">
    <xnode:Property name="author" value="Gianugo Rabellino"/>
    <xnode:Property name="comment" value="Initial proposal"/>
    <xnode:Property name="date" value="2002-04-15"/>
  </xnode:Property>
</xnode:Property>

How about this?

Seems pretty ripe for abuse, really. How would you efficiently and accurately query it? If you really want this level of metadata, I'd begin to store the complex metadata in another XNode and point to it from within the original node, esp. since it seems you'd want to treat that metadata as a document in its own right.
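
A rough sketch of that linking idea, under stated assumptions: the original node carries one flat property whose value is the ID of a second node, and the second node holds the changelog as an ordinary document that plain XPath can query. The in-memory map standing in for a collection, the IDs, the "metadata-ref" property name, the un-namespaced element names, and the class name are all hypothetical; namespaces are omitted to keep the XPath short.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class LinkedMetadataSketch {

    // Hypothetical in-memory stand-in for a collection: node ID -> XML content.
    static final Map<String, String> STORE = new HashMap<>();
    static {
        // The document node keeps a single flat property that points elsewhere.
        STORE.put("doc-1",
            "<node><header>"
          + "<property name='metadata-ref' value='changelog-1'/>"
          + "</header><content/></node>");
        // The complex metadata lives in its own node, as a document in its own right.
        STORE.put("changelog-1",
            "<changelog>"
          + "<entry><author>Murray Altheim</author><date>2002-04-16</date></entry>"
          + "<entry><author>Gianugo Rabellino</author><date>2002-04-15</date></entry>"
          + "</changelog>");
    }

    // Evaluates an XPath expression against the stored node with the given ID.
    static String eval(String id, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, new InputSource(new StringReader(STORE.get(id))));
    }

    // Follows the pointer from the document node, then queries the metadata node.
    static String authorOn(String docId, String date) throws Exception {
        String ref = eval(docId, "/node/header/property[@name='metadata-ref']/@value");
        return eval(ref, "/changelog/entry[date='" + date + "']/author");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(authorOn("doc-1", "2002-04-15"));
    }
}
```

The cost of this arrangement is a second lookup per query; the benefit is that the node header stays flat and the changelog can use whatever structure suits it.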

Cheers,

Murray

......................................................................
Murray Altheim                         <mailto:m.altheim @ open.ac.uk>
Knowledge Media Institute
The Open University, Milton Keynes, Bucks, MK7 6AA, UK

     In the evening
     The rice leaves in the garden
     Rustle in the autumn wind
     That blows through my reed hut.  -- Minamoto no Tsunenobu


