Michael Wechner wrote:
Andreas Hartmann wrote:

Michael Wechner wrote:

[...]

2. A document can be represented by an arbitrary number of URLs.



you mean like "softlinks"?



Actually not quite. Softlinks would be fine, because that would imply
a "real", native URL. But Lenya supports multiple "native" URLs for
a Document object. In the filesystem, that would mean that one and
the same file occurs in multiple paths, without the ability of singling
out one of them.



you mean duplicated content?

No, the content is the same. IMO it's just confusing that we have a
1:n mapping from document objects to URLs, especially if this is not
really necessary and can be implemented in a higher layer.


3. For each document, there is exactly one canonical URL.



what do you mean by canonical URL?



We once created the term to be able to denote a kind of "primary" URL.
It is not clearly defined what this URL should look like. You can
probably compare it to a canonical filesystem path.

If you generate the canonical URLs of two documents d1 and d2, and the
canonical URLs are equal, then d1 and d2 represent the same document.

The DefaultDocumentBuilder returns /foo.html for the default language
and /foo_xx.html for the other languages as canonical URLs.
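
A minimal sketch of that naming rule, only to make the discussion concrete. The class and method names are hypothetical and this is not the actual DefaultDocumentBuilder code; it just assumes the rule stated above (no suffix for the default language, "_xx" for all others):

```java
final class CanonicalUrlSketch {

    // Hypothetical helper: derives the canonical URL from a document path
    // and a language, following the rule described above.
    static String canonicalUrl(String documentPath, String language, String defaultLanguage) {
        if (language.equals(defaultLanguage)) {
            // The default language gets the URL without a language suffix.
            return documentPath + ".html";
        }
        // Every other language gets a "_xx" suffix.
        return documentPath + "_" + language + ".html";
    }

    public static void main(String[] args) {
        System.out.println(canonicalUrl("/foo", "en", "en")); // /foo.html
        System.out.println(canonicalUrl("/foo", "de", "en")); // /foo_de.html
    }
}
```

With this rule, (foo, en) never yields /foo_en.html as long as "en" is the default language, which is exactly the behaviour being questioned here.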

let's assume the default language is English and you ask for foo/en, then you will
receive foo.html instead of foo_en.html?

Yes, this is possible, and I don't like it.
If you operate on a Document object which was built using the URL /foo/en
and your DocumentBuilder resolves this to (documentID=foo, language=en),
it might return /foo_en.html as the canonical URL. IMO this behaviour is
confusing, that's why I'd like to remove the DocumentBuilder.


I would expect it the other way around ...

That depends entirely on your implementation. You can have it this way
or the other way round. I don't like either of them.


This is reflected in the following methods:

  DocumentBuilder.buildDocument(...)
  DocumentBuilder.buildCanonicalUrl(...)





if you use the DocumentBuilder, then I guess the above is correct, but I don't think one has to use the DocumentBuilder



It's virtually impossible to use Lenya without the DocumentBuilder.


hm ... well, I do at least with 1.2.x

Really? Are you sure it's not called somewhere? :)


[...]

I guess if you make an example, e.g.

/en/developers/andreas-hartmann

/de/entwickler/andreas-hartmann

/en/committers/andi

/de/committers/andreas



In my opinion, only one of these URLs should actually represent the document.
The others should merely point to the document, i.e. by redirects, URL
rewriting or another concept like this. If you ask the document for its
URL, there should be only one option that can be returned.




well, I don't think so; to me the above is just a navigation thing, and within
the repository you have a different path or a UUID

For the same content? I see that there are different language versions,
but I'm talking about a single content item.


Option (2) implies that, when a document is created, its URL and its location in the site structure have to be determined. IMO this is just a GUI issue. In most cases, a default site structure which corresponds to the URL space, will be used to create documents. These documents can be referenced from
other site structures later on.

I'm not particularly fond of the DocumentBuilder concept. With option (2) and the default site structure it would be obsolete, because the document could be derived directly from the default site structure. The ambiguity
that multiple, arbitrary URLs can point to a document would be removed.

----

The question is if multiple URLs for a document should be allowed at all.





sure, why not? I think there are many use cases for that, and existing URL spaces
which couldn't be handled by Lenya if it didn't support this...



Sure, the system should allow multiple URLs to point to a document.
But, as I already mentioned, there are several concepts to support this:

- redirects
- URL rewriting (proxy)
- soft links
- ... (?)

We don't have to support multiple URLs to natively *represent* a document.




well, it seems to me you can do this very easily by separating the navigation framework properly from the "repository space", whereas one can offer the repository navigation as default or "canonical" navigation

Sure, it can easily be implemented, that's what I'm suggesting.


Actually I don't think this is necessary. At the moment, many publications
show the following behaviour:

/foo.html       -> Hello World!
/foo_en.html    -> Hello World!
/foo_de.html    -> Hallo Welt!

Why is the support for /foo_en.html necessary? I see only two reasons:

1. Laziness. You don't have to find out the default language to create a URL.
2. You can switch the default language without creating dead URLs.

IMO neither of them outweighs the disadvantages of an ambiguous URL space. In fact, (2) should probably be avoided because the content of a document page changes (it becomes a different language version). So IMO it could look
like this:

/foo.html       -> Hello World!
/foo_en.html    -> 404
/foo_de.html    -> Hallo Welt!
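
To illustrate, here is a rough sketch of a resolver for such an unambiguous URL space. Everything in it is an assumption made for the example (the class name, the method, the fixed ".html" extension and the set of known languages); it is not existing Lenya code:

```java
import java.util.Optional;
import java.util.Set;

final class UrlResolverSketch {

    // Hypothetical resolver: returns the language a URL stands for,
    // or an empty Optional if the URL should answer with a 404.
    static Optional<String> resolveLanguage(String url, String defaultLanguage, Set<String> languages) {
        if (!url.endsWith(".html")) {
            return Optional.empty();
        }
        String name = url.substring(0, url.length() - ".html".length());
        int sep = name.lastIndexOf('_');
        String suffix = sep < 0 ? "" : name.substring(sep + 1);
        if (!languages.contains(suffix)) {
            // No language suffix: this is the URL of the default language version.
            return Optional.of(defaultLanguage);
        }
        if (suffix.equals(defaultLanguage)) {
            // /foo_en.html is not a valid URL while "en" is the default language.
            return Optional.empty();
        }
        return Optional.of(suffix);
    }

    public static void main(String[] args) {
        Set<String> languages = Set.of("en", "de");
        System.out.println(resolveLanguage("/foo.html", "en", languages));    // Optional[en]
        System.out.println(resolveLanguage("/foo_en.html", "en", languages)); // Optional.empty
        System.out.println(resolveLanguage("/foo_de.html", "en", languages)); // Optional[de]
    }
}
```

Each language version is then reachable by exactly one URL.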






what if you switch the default language to German,



... which is not a good thing to do IMO, see above ...



why not?

Because you change the content language of a URL. Imagine you make
a bookmark of an English page, and two weeks later you go there
and get Japanese content ...


It seems very simple to me, because it's just a matter of not favoring any (pre)selected language ...


then suddenly all foo_de become 404?!



You could solve this using redirects, as Solprovider suggested.
Or using softlinks.



I don't think that's necessary, because these documents/URLs do exist, so why not let people retrieve them?!

What do you mean? The client doesn't matter here. She can't tell whether
she accesses a document by its native URL or by an internal softlink.
The HTML page will be exactly the same.


Actually this would simplify the URL mapping concept by merging document ID (or better document path to avoid confusion with the UUID) and language. In the site structure, there wouldn't be multiple language versions of a document, but only links to documents. The connection between the actual language versions of a document would be represented in another location
(see ContentNode and Document in o.a.l.cms.repo for more information).

Assuming we have two documents which are language versions of the
same content:

* language="en" uuid="1-en"
* language="de" uuid="1-de"

This could be represented for instance by the following default site structures:

1. /foo.html
   /foo_de.html

   <node id="foo" document-uuid="1-en"/>
   <node id="foo_de" document-uuid="1-de"/>

   (note that the language suffix "_de" is just a part of the URL)





I am not sure if this is a good idea and what the consequences are ... my belly tells me that it's a bad idea ;-)
(e.g. in the case of switching the default language)




OK, how about this:

    <node id="foo" softlink="1-en"/>
    <node id="foo_en" document-uuid="1-en"/>
    <node id="foo_de" document-uuid="1-de"/>

If you change the default language, you'd have to change the links
(automatically), but IMO this price can be paid.
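
A rough sketch of how such a site structure could behave, just to show the idea. The class, the node type and all method names are made up for this example and do not exist in Lenya; a softlink node here simply carries the UUID of the document it points to:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

final class SiteStructureSketch {

    // Hypothetical node: either natively represents a document or merely links to it.
    static final class Node {
        final String id;
        final String uuid;
        final boolean softlink;

        Node(String id, String uuid, boolean softlink) {
            this.id = id;
            this.uuid = uuid;
            this.softlink = softlink;
        }
    }

    private final Map<String, Node> nodes = new LinkedHashMap<>();

    void addDocumentNode(String id, String uuid) {
        nodes.put(id, new Node(id, uuid, false));
    }

    // Adding a softlink under an existing id re-points it.
    void addSoftlink(String id, String uuid) {
        nodes.put(id, new Node(id, uuid, true));
    }

    // Any node, native or softlink, resolves to a document UUID.
    Optional<String> resolve(String id) {
        return Optional.ofNullable(nodes.get(id)).map(n -> n.uuid);
    }

    // Softlinks are skipped, so the first native node for the document is returned.
    Optional<String> nativeNodeId(String uuid) {
        return nodes.values().stream()
                .filter(n -> !n.softlink && n.uuid.equals(uuid))
                .map(n -> n.id)
                .findFirst();
    }

    public static void main(String[] args) {
        SiteStructureSketch site = new SiteStructureSketch();
        site.addSoftlink("foo", "1-en");
        site.addDocumentNode("foo_en", "1-en");
        site.addDocumentNode("foo_de", "1-de");

        System.out.println(site.resolve("foo"));       // Optional[1-en]
        System.out.println(site.nativeNodeId("1-en")); // Optional[foo_en]

        // Switching the default language only re-points the softlink.
        site.addSoftlink("foo", "1-de");
        System.out.println(site.resolve("foo"));       // Optional[1-de]
    }
}
```

Asking a document for its URL then has one answer as long as each document has a single native node, while /foo keeps working as a link.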



I rather think that the default language should be a redirect to the actual language, e.g.

/foo.html is being redirected to /foo_de.html

or

/foo.html is being redirected to /de/foo.html

Hmm - isn't that what I'm suggesting?

[...]

Yes, my statements were of a rather general nature. Is there anything
particular you'd like an example for?




yes, but I think it's best if we just start use cases in the Wiki and start defining a common language, otherwise I am afraid that we might disagree on stuff we actually agree on and vice versa ;-)

Actually I'm afraid so as well :)

I started the discussion to point to an IMO unclean design issue of our
API which IMO makes the implementation complicated and sometimes confusing,
but I see that the scope obviously became much larger than that.

-- Andreas

