On 1/9/06, Andreas Hartmann <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
> > On 1/6/06, Andreas Hartmann <[EMAIL PROTECTED]> wrote:
> >>At the moment, the concept of multiple URLs per document is typically
> >>used for language versions (foo_{defaultlanguage}.html = foo.html)
> >>and to support different URL suffixes (foo, foo.htm, foo.html).
> > Language is a property required for determining the response.  It must
> > be included in the request, or the default is used.  Language does not
> > need to be in the URL; it could be handled by the session information
> > for the visitor.
> At the moment, we only support languages as part of the URL (see
> DocumentBuilder method signatures). Actually, I wouldn't like to
> allow other means of language specification for the following reasons:
> - Providing the language in session/cookie/etc. information can easily
>    be implemented using a redirect to the corresponding URL (including
>    the language).
> - IMO the content represented by a URL (I mean the raw information
>    content, excluding personalization etc.) shouldn't depend on session
>    information etc.
> - The API is probably easier when you can build document objects
>    based on string information rather than complex request objects.
>    (sure, the API shouldn't limit the flexibility, but see above -
>    IMO redirects are sufficient)

I agree language should remain part of the URL.  It is easier and
cleaner.  Some publications may not have the same documents for every
language, and multiple language publications should provide an easy
method for switching languages.

The advantage of moving language to the Session is that /foo.html always
opens in the desired language for that visitor.  We could add it as an
option someday, but the functionality does not need integration into
core (a little work in an XMAP could handle it).

> > JCR allows more options for the structure, but we still have to decide
> > if we want the ability for a document to have several parents.  This
> > functionality may be useful in rare cases.  Is it worth the additional
> > complexity to make Lenya useful for those cases?  It would be much
> > more difficult to add it later than to integrate it during the
> > migration to JCR.
> Maybe you'd like to start a thread about this? I agree that it is
> worth discussing.

OK.  I'll start that next.

> >>The ambiguity
> >>that multiple, arbitrary URLs can point to a document would be removed.
> It might occur that multiple URLs represent a single Document object.
> With URLs u1 and u2, it is not possible to check if u1 and u2 represent
> the same document without using the DocumentBuilder.

With the flat structure, every document has a UNID (Unique Identifier):
getUNID(url1) == getUNID(url2)
tells if they are the same document.  The display may be different
based on the module/area/usecase used for the display:
/live/foo.html
/map/foo.html
both refer to the same document, but the first displays the document
and the second displays the sitemap with the document highlighted.
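
A minimal sketch of that comparison, in Java.  Every name in it
(UnidSketch, register, sameDocument) is hypothetical and not part of
Lenya.  It assumes the first URL segment names the module and the rest
identifies the document:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; nothing here is existing Lenya API.
class UnidSketch {
    // Maps a document path (the URL without its module segment) to a UNID.
    private final Map<String, String> unidByPath = new HashMap<>();

    void register(String path, String unid) {
        unidByPath.put(path, unid);
    }

    // Drops the leading module segment ("/live", "/map", ...) and looks up
    // the remainder. Returns null when the document is unknown.
    String getUNID(String url) {
        int second = url.indexOf('/', 1);
        String path = second < 0 ? url : url.substring(second);
        return unidByPath.get(path);
    }

    // Two URLs denote the same document when both resolve to the same UNID.
    boolean sameDocument(String url1, String url2) {
        String unid = getUNID(url1);
        return unid != null && unid.equals(getUNID(url2));
    }
}
```

With "/foo.html" registered, /live/foo.html and /map/foo.html resolve
to the same UNID even though the modules render them differently.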

> > Language must be a property of all documents.  "/foo.html" is a
> > shortcut to the default.
> > Our 1.2 publication forces /foo.html to redirect the browser to
> > /foo_{currentLanguage}.html.
> This is IMO a good idea. The redirect removes the necessity to represent
> /foo.html and /foo_en.html by the same document.

I am uncertain of real world usage of /foo.html.  Most publications
use the default language.  Others might return a "Choose Language"
module.  Is there another usage?

> > The "normal" hierarchical View is for documents to appear
> > under their parent.
> Just a note - would that be the parent in the default (URL related)
> site structure, i.e. /foo would be the parent of /foo/bar?

Yes.

> > Additional Views could be specified (although the
> > only one I can imagine quickly would be a "Only Primary Children" View
> > if we allow documents to be children of multiple parents.)  The View
> > choice would be specified and used by the module, especially if we use
> > the "Area" part of the URL to specify the module.  Examples:
> > /pub/live/docID uses the "live" module which displays a Document while
> > using the hierarchical View for menus.
> >
> > /pub/map[/docID] uses the "map" module which uses the hierarchical
> > View to display the entire structure with the optional document
> > highlighted.
> >
> > /pub/index/titles[/docID] uses the "title" module which uses the flat View
> > to display the entire structure sorted alphabetically by Title with
> > the optional document highlighted.
> > /pub/index/created
> > /pub/index/published (last published at bottom)
> > /pub/index/published-reverse (last published at top)
>
> That's actually not exactly what I had in mind, but it is very interesting
> as well. I was only thinking of navigation widgets that operate differently
> on the same URL space and therefore would have to be tracked using
> the session etc.
>
> The examples you're referring to would imply reserved URL spaces.
> Actually this concept is not yet supported by the Lenya internals
> (sure, you can implement it using Cocoon internals), and is IMO
> too complex to be discussed in this thread (though it obviously is
> related).

I believe this is the simplest method of handling
modules/areas/usecases.  Move "live" and "authoring" from "Areas" to
"Usecases".  Make all the Usecases into "Modules".  Use the Area part
of the URLs to specify Modules.  Done.

It is so easy that I almost patched Lenya 1.2 to handle it.  It removes
the need for "?lenya.usecase=".

There is no need to track anything using Session.  As in the language
discussion earlier, there are many advantages to keeping everything in
the URL.
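
To illustrate, here is a rough Java sketch of reading the module from
the Area position of the URL.  The class and field names are made up
for this mail and are not Lenya code.  It assumes URLs of the form
/publication/module[/rest], and leaves the meaning of the rest to the
module:

```java
// Hypothetical sketch; nothing here is existing Lenya API.
class ModuleUrlSketch {
    final String publication;
    final String module;       // taken from what is the "Area" segment today
    final String documentPath; // empty when the URL names no document

    ModuleUrlSketch(String publication, String module, String documentPath) {
        this.publication = publication;
        this.module = module;
        this.documentPath = documentPath;
    }

    // Splits "/pub/module/rest" into its three parts.
    static ModuleUrlSketch parse(String url) {
        // With limit 4 the pieces are: "", publication, module, rest.
        String[] parts = url.split("/", 4);
        if (parts.length < 3 || !parts[0].isEmpty()
                || parts[1].isEmpty() || parts[2].isEmpty()) {
            throw new IllegalArgumentException(
                    "Expected /publication/module[/document]: " + url);
        }
        String rest = parts.length == 4 ? "/" + parts[3] : "";
        return new ModuleUrlSketch(parts[1], parts[2], rest);
    }
}
```

Under these assumptions /pub/live/docID selects the "live" module and
/pub/map selects the "map" module with no document, so no session
state is needed to know which module handles the request.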

======
On 1/9/06, Andreas Hartmann <[EMAIL PROTECTED]> wrote:
> Michael Wechner wrote:
> > Andreas Hartmann wrote:
> >> 2. A document can be represented by an arbitrary number of URLs.
> > you mean like "softlinks"?
> Actually not quite. Softlinks would be fine, because that would imply
> a "real", native URL. But Lenya supports multiple "native" URLs for
> a Document object. In the filesystem, that would mean that one and
> the same file occurs in multiple paths, without the ability of singling
> out one of them.

getUNID(any url)
getPrimaryURL(UNID) uses the main hierarchical index to create the "native" URL.

getAllURLs(UNID) would be publication-specific, and would be difficult to write.


> >> 3. For each document, there is exactly one canonical URL.
> > what do you mean by canonical URL?
> We once created the term to be able to denote a kind of "primary" URL.
> It is not clearly defined what this URL should look like. You can
> probably compare it to a canonical filesystem path.
>
> If you generate the canonical URLs of two documents d1 and d2, and the
> canonical URLs are equal, then d1 and d2 represent the same document.

As I wrote earlier, if you need this test, compare the UNIDs.

> The DefaultDocumentBuilder returns /foo.html for the default language
> and /foo_xx.html for the other languages as canonical URLs.

Is that desirable?  Should the document for the default language
return "/foo_{language}.html"?  The consistency makes XSL programming
much easier.  Before I converted our publication to always use the
language in the URLs, every function related to language had to check
if the language existed, approximately doubling the code required.  I
noticed the same extra code because of inconsistent URLs throughout
Lenya.  If we add a few lines as early as possible to add the language
to any requested URL that does not contain the language, much of Lenya
would be cleaner.
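
Those "few lines" could look roughly like the Java sketch below.  It is
only an illustration: the class name, the method name and the
"_xx.html" pattern are assumptions taken from the examples in this
thread, not existing Lenya code.  A real version would check the
suffix against the publication's configured languages instead of
accepting any two letters:

```java
import java.util.regex.Pattern;

// Hypothetical sketch; nothing here is existing Lenya API.
class LanguageUrlSketch {
    // Assumed convention: "<name>_<two-letter language>.html".
    private static final Pattern WITH_LANGUAGE =
            Pattern.compile("^.*_[a-z]{2}\\.html$");

    // Returns the URL with the language suffix added when it is missing.
    static String withLanguage(String url, String defaultLanguage) {
        if (WITH_LANGUAGE.matcher(url).matches()) {
            return url; // the language is already part of the URL
        }
        String suffix = ".html";
        if (url.endsWith(suffix)) {
            String base = url.substring(0, url.length() - suffix.length());
            return base + "_" + defaultLanguage + suffix;
        }
        return url; // other suffixes are left alone in this sketch
    }
}
```

Running every requested URL through something like this as early as
possible means later code can rely on the language always being
present.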

> > I am not sure if I understand you correctly, but I would say we should
> > go with (2), but
> > I guess if you make an example, e.g.
> > /en/developers/andreas-hartmann
> > /de/entwickler/andreas-hartmann
> > /en/committers/andi
> > /de/committers/andreas
> In my opinion, only one of these URLs should actually represent the document.
> The others should merely point to the document, i.e. by redirects, URL
> rewriting or another concept like this. If you ask the document for its
> URL, there should be only one option that can be returned.

Is this "multiple parents" or "softlinks"?  In other words, should one
document be accessible through multiple parents, or are there several
documents that dereference to one document with the content?  (Let us
support both, and let the editors decide when each is appropriate.)

> >> Another question: With multiple site structures, how does the system keep
> >> track of the currently selected site structure?
> >>   - URL prefix
> > that would be my first suggestion, similar to "context"  for servlets
> Yes, but it would require reserved URL spaces.

Back to my "Use Areas for Modules".  I do not think of them as
"reserved".  The intention is to remove the currently reserved "live"
and "authoring" URL spaces by using the same process for all URLs.

Keep the Usecase/Module framework, and use the old Area to specify
which Module to use.  This adds much flexibility without losing
backwards-compatibility.  (We should keep the "?lenya.usecase="
override format for backwards-compatibility.)

solprovider
