Re: UUID musings and area bashing

Joern Nettingsmeier Tue, 15 Aug 2006 09:32:30 -0700

Andreas Hartmann wrote:
> Hi Joern,
> 
> thanks for your comprehensive comments!


>> * are UUIDs unique across publications?
>>   -> if yes, {pubId} is redundant. do we want to drag it along?
> 
> It would be great if we could omit it, but this would require a
> performant lookup mechanism. Or we just put all content in a
> single box, and add the publication ID to the meta data. This
> sounds quite appealing to me.

+1 for all-in-one-box, but -1 for adding it to the per-{UUID+lang}
metadata. that's suboptimal from an index efficiency pov. the
publication should maintain a list of which UUIDs belong to it. this is
also easier to debug, since you can see everything at a glance.

moreover, the property "belonging to <pub>" is orthogonal to revisions,
while per-{UUID+lang} metadata is not.
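
a rough sketch of what i mean, in python for brevity. all names here (ContentBox, Publication, "doc-1") are made up for illustration and are not existing lenya classes; the point is only that membership lives in one place per publication and is untouched when a translation changes:

```python
# hypothetical sketch: none of these classes exist in lenya.

class ContentBox:
    """the single box: all content of all publications, keyed by (uuid, lang)."""

    def __init__(self):
        self._translations = {}

    def put(self, uuid, lang, content):
        self._translations[(uuid, lang)] = content

    def get(self, uuid, lang):
        return self._translations[(uuid, lang)]


class Publication:
    """a publication only records which UUIDs belong to it."""

    def __init__(self, pub_id):
        self.pub_id = pub_id
        self.uuids = set()

    def add(self, uuid):
        self.uuids.add(uuid)

    def contains(self, uuid):
        return uuid in self.uuids


box = ContentBox()
pub = Publication("default")

box.put("doc-1", "en", "hello")
pub.add("doc-1")

# updating the content does not touch the membership list
box.put("doc-1", "en", "hello again")
```

listing `pub.uuids` shows the whole publication in one place, and no per-translation metadata has to be scanned to answer "does this UUID belong here?".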

>> * UUIDs are definitely orthogonal to revisions. we do not need to access
>> revisions other than "current" most of the time, but we should make it
>> possible now in order to avoid having to tack on another mechanism for
>> situations where revisions are involved.
> 
> +1, this sounds useful.

so the "unique index", to borrow from database theory, would be the tuple
{UUID+lang+revision}.
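
as a minimal sketch (python, hypothetical names, integer revision numbers are my assumption and nothing we have agreed on), the key and the "current unless told otherwise" lookup could look like this:

```python
from typing import NamedTuple


class TranslationKey(NamedTuple):
    """the proposed unique index: one entry per uuid, language and revision."""
    uuid: str
    lang: str
    revision: int


class RevisionStore:
    """hypothetical in-memory store, not a lenya API."""

    def __init__(self):
        self._data = {}     # TranslationKey -> content
        self._latest = {}   # (uuid, lang) -> highest revision number so far

    def save(self, uuid, lang, content):
        """store content as a new revision and return its number."""
        revision = self._latest.get((uuid, lang), 0) + 1
        self._data[TranslationKey(uuid, lang, revision)] = content
        self._latest[(uuid, lang)] = revision
        return revision

    def load(self, uuid, lang, revision=None):
        """return the given revision, or the current one if none is given."""
        if revision is None:
            revision = self._latest[(uuid, lang)]
        return self._data[TranslationKey(uuid, lang, revision)]
```

callers that only care about "current" never mention a revision, but older ones stay addressable through the same mechanism.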

>> [terminology]
>>
>> are we really "addressing documents"?
>>
>> currently, i find in the sitemaps the term "document-uuid".
>> that implies we use the term "document" to mean "the set of all stored
>> data snippets (including meta) that corresponds to a particular UUID".
>>
>> so we are not addressing documents. we are addressing particular
>> instances of a document in a certain publication, area and language.
> 
> At the moment, a document is a particular translation in a particular
> area in a particular publication (we didn't yet change the terminology,
> at least as far as the class names are concerned).
>
> IIRC we agreed upon the term "translation" for this.
> We don't have a class for "the object that contains all translations
> for a UUID in a certain area" yet. That would be a document/resource/asset
> (IIRC "document" was the preferred term).

so let me propose the following:

<section status="draft" normative="yes">
the entirety of all data pertaining to what is traditionally called a
"web page" is called a *document* within lenya.

documents are uniquely identified by *UUIDs*, which may therefore be
called *document UUIDs* for extra clarity.

documents contain one or more *translations*. "translation" here refers
to the actual content, and includes the "original language version",
being a general category.
each translation has *metadata* associated with it.

the terms MUST, MUST NOT and SHOOTING OFFENSE are to be interpreted as
described in RFC2119.
</section>
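
to make the terminology concrete, here is a small sketch in python. the class and field names are hypothetical and only mirror the three terms defined above (document, translation, metadata); nothing here corresponds to existing lenya classes:

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class Translation:
    """actual content in one language; the original version is just another one."""
    lang: str
    content: str
    metadata: dict = field(default_factory=dict)  # per-translation metadata


@dataclass
class Document:
    """everything pertaining to one 'web page', identified by its document UUID."""
    uuid: str = field(default_factory=lambda: str(uuid4()))
    translations: dict = field(default_factory=dict)  # lang -> Translation

    def add_translation(self, translation):
        self.translations[translation.lang] = translation


doc = Document()
doc.add_translation(Translation("de", "hallo welt", {"title": "Startseite"}))
doc.add_translation(Translation("en", "hello world", {"title": "Home"}))
```

note that the UUID sits on the document, while the metadata sits on each translation.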

>> [areas]
>>
>> thinking about andreas' suggestion, it becomes ever more evident to me
>> that the area concept is flawed. areas should be done away with
>> altogether in the not too distant future.
> 
> I agree that it has to be reconsidered, but should we address it in 1.4?

HELL NO! :-D

this is 1.5 stuff. but i should think that the 1.4 cycle will be short
anyway.

>>> An internal link URL might look like this:
>>>
>>>   document://{pubId}:{area}:{uuid}:{language}
>>
>> what about lenya: and lenyadoc:? i must confess i have never quite
>> grasped the concept...
> 
> lenya:// is one layer below this, it addresses repository nodes.

does that mean that it's obsolete now? or if not, what is it currently
used for?
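
for reference, the document:// form quoted above is easy to take apart. the following python sketch assumes that none of the four parts contains a colon (true for standard UUIDs, an assumption for the rest); the function and class names are hypothetical:

```python
from typing import NamedTuple

SCHEME = "document://"


class DocumentLink(NamedTuple):
    pub_id: str
    area: str
    uuid: str
    language: str


def parse_document_link(url):
    """split document://{pubId}:{area}:{uuid}:{language} into its four parts."""
    if not url.startswith(SCHEME):
        raise ValueError("not a document:// link: %r" % url)
    parts = url[len(SCHEME):].split(":")
    # exactly four non-empty parts are expected
    if len(parts) != 4 or not all(parts):
        raise ValueError("malformed document:// link: %r" % url)
    return DocumentLink(*parts)
```

if {pubId} and {area} are ever dropped, only the expected number of parts changes.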

> lenyadoc:// is probably fine for links. Maybe we should just use that one.

>> in any case, the protocol should definitely begin with "lenya...", so
>> that it's immediately obvious what's going on in the sitemap.
>>
>> i would even go as far as suggesting that all our input modules and
>> pseudo-protocols that are not suited for upstream cocoon be re-named
>> lenya-fallback, {lenya-docinfo:...} etc.
>> this would greatly reduce the learning curve for our users, and make
>> life easier for casual committers from other apache projects, since it's
>> obvious if custom magic is at work, as opposed to core cocoon
>> functionality.
> 
> -0.5, I'd prefer to keep them short, but it's OK with me to change it.

i strongly feel that cocoon namespaces must be restructured, even at the
cost of increased verbosity. it should be easy to register both the
traditional and the prefixed name for a grace period, and move the
sitemaps over piecemeal without breaking external code too soon.
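
the grace period could work roughly like this. this is a python sketch of the idea only, not of how cocoon or lenya actually register modules; ModuleRegistry and the short name "fallback" are stand-ins:

```python
import warnings


class ModuleRegistry:
    """hypothetical registry that knows a module under two names for a while."""

    def __init__(self):
        self._modules = {}
        self._deprecated = {}  # traditional name -> prefixed name

    def register(self, name, module, legacy_name=None):
        self._modules[name] = module
        if legacy_name is not None:
            # the traditional name keeps working during the grace period
            self._modules[legacy_name] = module
            self._deprecated[legacy_name] = name

    def lookup(self, name):
        if name in self._deprecated:
            warnings.warn(
                "%r is deprecated, use %r" % (name, self._deprecated[name]),
                DeprecationWarning,
                stacklevel=2,
            )
        return self._modules[name]
```

sitemaps using the traditional name keep working but leave a warning behind, so they can be found and moved over one at a time.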





-- 
"I don't need backups. I need restore!" - Trad.

--
Jörn Nettingsmeier, EDV-Administrator
Institut für Politikwissenschaft
Universität Duisburg-Essen, Standort Duisburg
Mail: [EMAIL PROTECTED], Telefon: 0203/379-2736

