Michael Wechner wrote:
Andreas Hartmann wrote:
Michael Wechner wrote:
[...]
2. A document can be represented by an arbitrary number of URLs.
you mean like "softlinks"?
Actually not quite. Softlinks would be fine, because that would imply
a "real", native URL. But Lenya supports multiple "native" URLs for
a Document object. In the filesystem, that would mean that one and
the same file occurs in multiple paths, without the ability of singling
out one of them.
you mean duplicated content?
No, the content is the same. IMO it's just confusing that we have a
1:n mapping from document objects to URLs, especially if this is not
really necessary and can be implemented in a higher layer.
3. For each document, there is exactly one canonical URL.
what do you mean by canonical URL?
We once created the term to be able to denote a kind of "primary" URL.
It is not clearly defined what this URL should look like. You can
probably compare it to a canonical filesystem path.
If you generate the canonical URLs of two documents d1 and d2, and the
canonical URLs are equal, then d1 and d2 represent the same document.
The DefaultDocumentBuilder returns /foo.html for the default language
and /foo_xx.html for the other languages as canonical URLs.
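The naming rule just described can be sketched in a few lines of Java. This is only an illustration under assumptions: the class and method names below are hypothetical and are not the actual DefaultDocumentBuilder API; only the /foo.html and /foo_xx.html scheme is taken from the text above.

```java
/** Hypothetical helper for illustration; not part of the Lenya API. */
final class CanonicalUrlSketch {

    private CanonicalUrlSketch() {}

    /**
     * Builds a canonical URL from a document path and a language:
     * the default language gets no suffix, every other language gets "_xx".
     */
    static String canonicalUrl(String documentPath, String language, String defaultLanguage) {
        String suffix = language.equals(defaultLanguage) ? "" : "_" + language;
        return documentPath + suffix + ".html";
    }

    /** Two documents count as the same document if their canonical URLs are equal. */
    static boolean sameDocument(String canonicalUrl1, String canonicalUrl2) {
        return canonicalUrl1.equals(canonicalUrl2);
    }
}
```

With "en" as the default language, canonicalUrl("/foo", "en", "en") yields /foo.html and canonicalUrl("/foo", "de", "en") yields /foo_de.html.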
let's assume the default language is English and you ask for foo/en,
then you will receive foo.html instead of foo_en.html?
Yes, this is possible, and I don't like it.
If you operate on a Document object which was built using the URL /foo/en
and your DocumentBuilder resolves this to (documentID=foo, language=en),
it might return /foo_en.html as the canonical URL. IMO this behaviour is
confusing, that's why I'd like to remove the DocumentBuilder.
I would expect it the other way around ...
That depends entirely on your implementation. You can have it this way
or the other way round. I don't like either of them.
This is reflected in the following methods:
DocumentBuilder.buildDocument(...)
DocumentBuilder.buildCanonicalUrl(...)
if you use the DocumentBuilder, then I guess the above is correct,
but I don't think one has to use the DocumentBuilder
It's virtually impossible to use Lenya without the DocumentBuilder.
hm ... well, I do at least with 1.2.x
Really? Are you sure it's not called somewhere? :)
[...]
I guess if you make an example, e.g.
/en/developers/andreas-hartmann
/de/entwickler/andreas-hartmann
/en/committers/andi
/de/committers/andreas
In my opinion, only one of these URLs should actually represent the
document.
The others should merely point to the document, i.e. by redirects, URL
rewriting or another concept like this. If you ask the document for its
URL, there should be only one option that can be returned.
well, I don't think so; to me the above is just a navigation thing, and
within the repository you have a different path or a UUID
For the same content? I see that there are different language versions,
but I'm talking about a single content item.
Option (2) implies that, when a document is created, its URL and its location
in the site structure have to be determined. IMO this is just a GUI issue.
In most cases, a default site structure which corresponds to the URL space
will be used to create documents. These documents can be referenced from
other site structures later on.
I'm not particularly fond of the DocumentBuilder concept. With option (2)
and the default site structure it would be obsolete, because the document
could be derived directly from the default site structure. The ambiguity
that multiple, arbitrary URLs can point to a document would be removed.
----
The question is if multiple URLs for a document should be allowed at all.
sure, why not? I think there are many use cases for that and existing URL
spaces which couldn't be handled by Lenya if it doesn't support this...
Sure, the system should allow multiple URLs pointing to a document.
But, as I already mentioned, there are several concepts to support this:
- redirects
- URL rewriting (proxy)
- soft links
- ... (?)
We don't have to support multiple URLs to natively *represent* a document.
well, it seems to me you can do this very easily by separating the
navigation framework properly
from the "repository space", whereas one can offer the repository
navigation as default or "canonical" navigation
Sure, it can easily be implemented, that's what I'm suggesting.
Actually I don't think this is necessary. At the moment, many publications
show the following behaviour:
/foo.html -> Hello World!
/foo_en.html -> Hello World!
/foo_de.html -> Hallo Welt!
Why is the support for /foo_en.html necessary? I see only two reasons:
1. Laziness. You don't have to find out the default language to create a URL.
2. You can switch the default language without creating dead URLs.
IMO neither of them outweighs the disadvantages of an ambiguous URL space.
In fact, (2) should probably be avoided because the content of a document
page changes (it becomes a different language version). So IMO it could look
like this:
/foo.html -> Hello World!
/foo_en.html -> 404
/foo_de.html -> Hallo Welt!
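The stricter mapping above could be sketched as follows. This is a minimal illustration, not Lenya code: the class, the method, and the simplifying assumption that a two-letter "_xx" suffix always denotes a language are all hypothetical. An empty result stands for a 404.

```java
import java.util.Optional;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch; none of these names exist in Lenya. */
final class StrictUrlSpaceSketch {

    // "/foo.html" has no suffix; "/foo_de.html" has the suffix "de".
    // Assumption: a two-letter "_xx" suffix is always a language code.
    private static final Pattern PAGE_URL =
            Pattern.compile("(/.+?)(?:_([a-z]{2}))?\\.html");

    private StrictUrlSpaceSketch() {}

    /**
     * Returns the language served for a URL, or an empty Optional (a 404)
     * if the URL is not part of the unambiguous URL space.
     */
    static Optional<String> resolveLanguage(String url, String defaultLanguage,
                                            Set<String> availableLanguages) {
        Matcher m = PAGE_URL.matcher(url);
        if (!m.matches()) {
            return Optional.empty();
        }
        String suffix = m.group(2);
        if (suffix == null) {
            // No suffix: this is the one URL of the default language version.
            return Optional.of(defaultLanguage);
        }
        if (suffix.equals(defaultLanguage) || !availableLanguages.contains(suffix)) {
            // /foo_en.html is rejected so that /foo.html stays the only URL.
            return Optional.empty();
        }
        return Optional.of(suffix);
    }
}
```

With "en" as the default language and {"en", "de"} available, /foo.html resolves to "en", /foo_de.html to "de", and /foo_en.html to nothing.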
what if you switch the default language to german,
... which is not a good thing to do IMO, see above ...
why not?
Because you change the content language of a URL. Imagine you make
a bookmark of an English page, and two weeks later you go there
and get Japanese content ...
It seems very simple to me, because it's just a matter of not
favoring any (pre)selected language ...
then suddenly all foo_de become 404?!
You could solve this using redirects, as Solprovider suggested.
Or using softlinks.
I don't think that's necessary, because these documents/URLs do exist,
so why not let people retrieve them?!
What do you mean? The client doesn't matter here. She can't tell if
she accesses a document by its native URL or by an internal softlink.
The HTML page will be exactly the same.
Actually this would simplify the URL mapping concept by merging document ID
(or better document path to avoid confusion with the UUID) and language.
In the site structure, there wouldn't be multiple language versions
of a document, but only links to documents. The connection between the actual
language versions of a document would be represented in another location
(see ContentNode and Document in o.a.l.cms.repo for more information).
Assuming we have two documents which are language versions of the same content:
* language="en" uuid="1-en"
* language="de" uuid="1-de"
This could be represented for instance by the following default site structures:
1. /foo.html
/foo_de.html
<node id="foo" document-uuid="1-en"/>
<node id="foo_de" document-uuid="1-de"/>
(note that the language suffix "_de" is just a part of the URL)
I am not sure if this is a good idea and what the consequences are
... my belly tells me that it's a bad idea ;-)
(e.g. in the case of switching the default language)
OK, how about this:
<node id="foo" softlink="1-en"/>
<node id="foo_en" document-uuid="1-en"/>
<node id="foo_de" document-uuid="1-de"/>
If you change the default language, you'd have to change the links
(automatically), but IMO this price can be paid.
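To make the softlink variant concrete, here is a rough Java sketch of such a site structure. Everything in it is an assumption for illustration: the class and its methods are hypothetical and do not correspond to anything in o.a.l.cms.repo. Each document UUID has exactly one native node, and softlink nodes only point at a UUID.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of a site structure with native nodes and softlinks. */
final class SiteStructureSketch {

    // node id -> UUID of the document natively represented by that node
    private final Map<String, String> documentNodes = new HashMap<>();
    // node id -> UUID of the document the softlink points to
    private final Map<String, String> softlinkNodes = new HashMap<>();

    /** Adds a native node; a document may have only one of these. */
    void addDocumentNode(String nodeId, String uuid) {
        if (documentNodes.containsValue(uuid)) {
            throw new IllegalArgumentException(
                    "Document " + uuid + " already has a native node");
        }
        documentNodes.put(nodeId, uuid);
    }

    /** Creates a softlink node or retargets an existing one. */
    void setSoftlink(String nodeId, String targetUuid) {
        softlinkNodes.put(nodeId, targetUuid);
    }

    /** UUID served for a node id, whether the node is native or a softlink. */
    Optional<String> resolve(String nodeId) {
        String uuid = documentNodes.get(nodeId);
        if (uuid == null) {
            uuid = softlinkNodes.get(nodeId);
        }
        return Optional.ofNullable(uuid);
    }

    /** The single native node of a document: the only URL it reports for itself. */
    Optional<String> nativeNodeId(String uuid) {
        for (Map.Entry<String, String> entry : documentNodes.entrySet()) {
            if (entry.getValue().equals(uuid)) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty();
    }
}
```

Switching the default language then amounts to one call such as setSoftlink("foo", "1-de"); the native nodes foo_en and foo_de, and therefore the URLs the documents report for themselves, stay untouched.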
I rather think that the default language should be a redirect to the
actual language, e.g.
/foo.html is being redirected to /foo_de.html
or
/foo.html is being redirected to /de/foo.html
Hmm - isn't that what I'm suggesting?
[...]
Yes, my statements were of rather general nature. Is there anything
particular you'd like an example for?
yes, but I think it's best if we just start usecases in the Wiki and
start defining a common language,
otherwise I am afraid that we might disagree on stuff we actually agree
on and vice versa ;-)
Actually I'm afraid so as well :)
I started the discussion to point to an IMO unclean design issue of our
API which IMO makes the implementation complicated and sometimes confusing,
but I see that the scope obviously became much larger than that.
-- Andreas