[EMAIL PROTECTED] wrote:
On 12/7/05, Felix Röthenbacher <[EMAIL PROTECTED]> wrote:

Andreas Hartmann wrote:

>> [...]

I have always been critical of the "area" concept, and I dislike #1
and #2 for the reasons stated.  This sounds like the time to improve
the architecture by using something like #3.

In Lenya 1.2 terms, I would prefer:
content/docpath/doc-id/language/versions

There have been suggestions to move "language" even earlier in the
path.  In JCR, language could be either a node or a property.  I
cannot think of any reason to have it as a node.

I also think that the language attribute should be implemented as
a property, but a language version of a document should be
implemented as a separate node.


"versions" could be creation-time-based names for each revision
"200512071751.xml", plus "index.xml" or "live.xml" for the currently
published version.  Archived revisions could be prefixed with "a", and
trashed ones with "d" (for about to be "deleted").  Publishing is simply overwriting
"live.xml" with the desired revision.

Are you referring to JCR versioning, or do you propose to implement
a versioning system of your own? From your argument, the latter
seems to be the case. If so, can you give us more detail on what
the advantages over JCR versioning would be?


The advantage is that every revision of a document is in one place.  The
disadvantage was that the live version would not be distinct in the file
system, and that distinction was useful.  That disadvantage does not apply
when using Jackrabbit because direct file operations are very difficult there.
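
The file-based naming scheme described above (creation-time names,
an "a" or "d" prefix, and publishing by overwriting "live.xml") can
be sketched roughly as follows.  This is only an illustration in
Python, not Lenya code; the names revision_name and publish are
made up for the example, and it assumes plain files rather than a
JCR repository.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def revision_name(created: datetime, prefix: str = "") -> str:
    """Return a creation-time-based name, e.g. '200512071751.xml'.

    prefix: '' for a normal revision, 'a' for archived, 'd' for trash.
    (Hypothetical helper, not part of Lenya.)
    """
    return f"{prefix}{created:%Y%m%d%H%M}.xml"


def publish(versions_dir: Path, revision: str) -> Path:
    """Publish by overwriting live.xml with the chosen revision."""
    live = versions_dir / "live.xml"
    shutil.copyfile(versions_dir / revision, live)
    return live


# Usage: create one revision in a scratch directory and publish it.
with tempfile.TemporaryDirectory() as tmp:
    versions = Path(tmp)
    name = revision_name(datetime(2005, 12, 7, 17, 51))
    (versions / name).write_text("<doc/>", encoding="utf-8")
    publish(versions, name)
    print(name, (versions / "live.xml").exists())
```

Copying rather than renaming keeps the dated revision in place, so
every revision of the document stays in the same directory.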

===
In JCR terms, use properties for the revision nodes:
- status = "archived", "deleted", "active", "draft" (useful so someone
else does not publish it before it is ready)
- created = {time}
- edited = {time of last edit}
- language (??)

It seems that the vocabulary is not clearly defined. Are you referring
to workflow states or to revisions in the sense used in a versioning
system like SVN or CVS?


There should also be a history (for each language) at the document
node recording which revisions were published and when.  The document node
should also specify which revision is the live one (unless that node
is duplicated with a special identifier).

Combining all this into the simplest structure (N=Node, P=Property,
-=1, +=many):
N:Content
+ N:DocumentID
- + N:SubdocumentID (same structure as DocumentID)
- - N:Document (extra node to ease distinguishing between the
Document's Nodes and Nodes for the subdocuments.)
- - - P:Type (DocumentType = XHTML...)
- - - P:LiveRevision (for each language)
- - - P:Status = Draft, Review, Published, Version (published but
another version is ready)
- - - P:Visibility
- - - P:Creator
- - - P:Languages Available
- - - P:{Other document properties}
- - - N:History of publishing for each language
- - + N:Revision
- - - - P:Status
- - - - P:Language
- - - - P:Editor
- - - - P:{Other revision properties}
- - - - N:ContentInformation
- - - - - N|P:Document-type-specific fields
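
To make the outline above a little more concrete, here is a rough
in-memory sketch of the Document and Revision nodes, written in
Python rather than against the JCR API.  The class and field names
are assumptions chosen to mirror the outline (P:Status, P:Language,
P:LiveRevision, N:History); nothing here is an existing Lenya or
Jackrabbit interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Revision:
    status: str    # P:Status, e.g. "draft" or "active"
    language: str  # P:Language
    editor: str    # P:Editor
    # N:ContentInformation: document-type-specific fields
    content: Dict[str, str] = field(default_factory=dict)


@dataclass
class Document:
    doc_type: str  # P:Type, e.g. "XHTML"
    # + N:Revision, keyed by a revision name
    revisions: Dict[str, Revision] = field(default_factory=dict)
    # P:LiveRevision, one entry per language: language -> revision name
    live_revision: Dict[str, str] = field(default_factory=dict)
    # N:History: (language, revision name, time published)
    history: List[Tuple[str, str, str]] = field(default_factory=list)

    def publish(self, name: str, when: str) -> None:
        """Mark a revision as live for its language and record it."""
        rev = self.revisions[name]
        self.live_revision[rev.language] = name
        self.history.append((rev.language, name, when))

    def live(self, language: str) -> Optional[Revision]:
        """Return the live revision for a language, if any."""
        name = self.live_revision.get(language)
        return self.revisions.get(name) if name is not None else None
```

In this sketch publishing only moves a pointer and appends to the
history, so a rollback would be another publish call naming an
older revision.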

As discussed before, one possibility is to keep the actual content
separate from any representation (MVC model), meaning that the
content has a flat hierarchy and the actual hierarchy is built
on a sitetree that is kept in a separate place, apart from the data.
Are you referring to the sitetree or the document content?
Should the information be stored with the document or within
the sitetree?


Which properties apply to the document and which apply to the
revision?  Title, Navigation Title, Subject, and Description could be
either.  They could also be moved to Nodes with their own versioning,
but that may be overkill.  Does rollback to a revision also roll back
those fields? Do we need a history of changes to those fields? Should
changing the Title create a new revision? Those answers will
determine the best structure.

I think every modification should result in a new revision.


Lenya 1.2's various sitemap.xml files should be built dynamically from
the content.  That may require caching for good performance.

Do you mean sitetree.xml or sitemap.xmap?


===
Another decision is whether subdocuments are Nodes of the document.
There has been much discussion about using a unique identifier for
every document.  This can be implemented as a minor change to the
above.  First, discard the "N:Document" node: with no N:SubdocumentID
nodes, it is no longer necessary.  Second, add a P:ParentDocumentID.

Now use a flat structure:
N:Content
- N:Hierarchy (created from the Document Nodes, could include
information such as Visibility, substitute for sitetree.xml)
+ N:DocumentUniqueID
- - P:ParentDocumentID - only if a subdocument
- - P:Path
- - Everything under N:Document above
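
As a rough sketch of the flat structure, the Python below keeps
documents in one map keyed by unique ID, derives the hierarchy from
the ParentDocumentID values, and moves a subdocument by changing
only that property.  The IDs, titles, and function names are
invented for the example, and the move does no cycle checking.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Flat store: DocumentUniqueID -> properties (sample data, made up).
documents: Dict[str, Dict[str, Optional[str]]] = {
    "d1": {"parent": None, "title": "Departments"},
    "d2": {"parent": "d1", "title": "Engineering"},
    "d3": {"parent": "d2", "title": "Alice"},
}


def build_hierarchy(docs) -> Dict[Optional[str], List[str]]:
    """Derive N:Hierarchy (parent ID -> child IDs) from the flat nodes."""
    children = defaultdict(list)
    for doc_id, props in docs.items():
        children[props["parent"]].append(doc_id)
    return dict(children)


def move(docs, doc_id: str, new_parent: Optional[str]) -> None:
    """Move a subdocument by changing only its ParentDocumentID."""
    docs[doc_id]["parent"] = new_parent
```

After a move the hierarchy is simply rebuilt (or patched) from the
document nodes; no content nodes are relocated.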

This is really good for many reasons:
1. No need to distinguish between content Node and subdocument Nodes.
2. Easy operation on multiple or all documents, such as building the
Search index.

Right.

3. Easy moving of subdocuments. Just change the ParentID, rather than
moving Nodes.

Would you like to have a UID stored with each document for modeling the
parent/child relationship? Or would it be better to have a separate
sitetree?

4. Easy creation of flat views.  See all documents by Title, Author,
Status, Type.  The Status view will show all documents having versions
waiting to be published.  The Type view can show all Employees, even
if they are created as subdocuments of Departments.
5. All information is stored in the Document Nodes, so "Visibility"
and "Navigation Title" are available when working with the document.
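
The flat views in point 4 amount to grouping the document nodes by
one property.  A minimal Python sketch, with made-up sample data
and a hypothetical view_by helper, might look like this.  The
"Version" status follows the meaning given earlier (published, but
another version is ready).

```python
from collections import defaultdict
from typing import Dict, List

# Sample flat document list; all values are invented for illustration.
documents = [
    {"id": "d1", "title": "Engineering", "type": "Department",
     "status": "Published", "parent": None},
    {"id": "d2", "title": "Alice", "type": "Employee",
     "status": "Version", "parent": "d1"},
    {"id": "d3", "title": "Bob", "type": "Employee",
     "status": "Published", "parent": "d1"},
]


def view_by(docs, prop: str) -> Dict[str, List[str]]:
    """Group document titles by the value of one property."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[prop]].append(doc["title"])
    return {key: sorted(titles) for key, titles in groups.items()}


# The Type view lists all Employees even though they are subdocuments.
print(view_by(documents, "type"))
# The Status view exposes documents with a version waiting to be published.
print(view_by(documents, "status"))
```

Because the parent relationship is just another property, the
grouping never has to walk a tree.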

It also imposes additional concerns:
1. The tree of documents (N:Hierarchy) must be maintained for easy
traversal.  It should only contain the DocumentUniqueIDs.  Creating
sitetree.xml should access the document Nodes for the Visibility
property. This allows creating menus based on other properties.
Eventually someone will add individual document security, and menus
will be built to show only the documents allowed to each person (based
on their name and/or Groups).

2. What happens to orphans?  Subdocuments will not automatically be
deleted when the Document Node is deleted.  It would be easy to create
a view of documents that have ParentDocumentIDs that no longer exist.
Or the Delete process could also delete all subdocuments recursively.
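
Both options for orphans can be sketched briefly.  The Python below
assumes the flat store is reduced to a map from document ID to
ParentDocumentID; find_orphans and delete_recursive are
hypothetical names, not existing Lenya functions.

```python
from typing import Dict, List, Optional


def find_orphans(parents: Dict[str, Optional[str]]) -> List[str]:
    """List documents whose ParentDocumentID no longer exists."""
    return sorted(
        doc_id
        for doc_id, parent in parents.items()
        if parent is not None and parent not in parents
    )


def delete_recursive(parents: Dict[str, Optional[str]], doc_id: str) -> None:
    """Delete a document and, first, all of its subdocuments."""
    children = [c for c, p in parents.items() if p == doc_id]
    for child in children:
        delete_recursive(parents, child)
    parents.pop(doc_id, None)


# Sample data (made up): "d4" points at a parent that was already removed.
parents = {"d1": None, "d2": "d1", "d3": "d2", "d4": "gone"}
print(find_orphans(parents))
```

The orphan view reports problems after the fact, while the
recursive delete prevents them from arising in the first place.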

I think there should be some kind of referential integrity to avoid any
orphans.

Thanks for some clarifications of your ideas!

- Felix


solprovider



--
Felix Röthenbacher                  [EMAIL PROTECTED]
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
