[ http://issues.apache.org/jira/browse/FOR-677?page=all ]
David Crossley updated FOR-677:
-------------------------------
Fix Version/s: 0.9
(was: 0.8-dev)
Moving this issue to the next release. As said above:
"We need to have Forrest "relativise" and "absolutise" the links, or make the
linkmap intelligent enough to realise that "root/index.html" is the same as
"/root/index.html"
The latter happens in the Cocoon Linkrewriter Block.
> leading slash in gathered URIs causes double the number of links to be
> processed
> --------------------------------------------------------------------------------
>
> Key: FOR-677
> URL: http://issues.apache.org/jira/browse/FOR-677
> Project: Forrest
> Issue Type: Bug
> Components: Core operations
> Affects Versions: 0.7, 0.8-dev
> Reporter: David Crossley
> Fix For: 0.9
>
>
> Running 'forrest' starts at the virtual document called linkmap.html where the
> Cocoon crawler gathers the initial set of links, then starts crawling and
> generating pages. Any new links are pushed onto the linkmap. However, for
> some sites, such as our own "seed-sample" and our "site-author", there is a
> sudden jump in the number of URIs remaining to be processed.
> This is due to a URI with a leading slash (e.g. /samples/faq.html). When that
> URI is processed, it gains a whole new set of links all with leading slashes,
> and so the list of URIs is potentially doubled.
> This issue could be due to a user error, i.e. adding a link that deliberately
> begins with a slash. Sometimes, that is unavoidable.
> However, we do have a sitemap transformer to "relativize" and "absolutize"
> the links. Should it always trim the leading slash? Or are there cases where
> that should not happen, so that we cannot generalise?
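
As a rough illustration of the doubling described above, and of the option of
making the linkmap treat "root/index.html" and "/root/index.html" as the same
entry, here is a minimal Python sketch. It is not Forrest or Cocoon code: the
crawl() function, the normalise() helper and the three-page SITE map are all
hypothetical. It assumes that a page reached through a leading-slash URI
reports its own links with leading slashes too, as the report describes.

```python
from collections import deque


def normalise(uri: str) -> str:
    """Treat "/root/index.html" and "root/index.html" as the same key."""
    return uri.lstrip("/")


def crawl(start, links_of, key=lambda uri: uri):
    """Breadth-first crawl; returns the URIs processed, in order.

    links_of maps a slash-free page path to the links found on that page.
    key decides when two URIs count as "already seen".
    """
    seen = {key(start)}
    queue = deque([start])
    processed = []
    while queue:
        uri = queue.popleft()
        processed.append(uri)
        leading = "/" if uri.startswith("/") else ""
        for link in links_of.get(normalise(uri), []):
            # Assumption: a page reached via a leading-slash URI yields
            # leading-slash links, mimicking the behaviour in the report.
            target = link if link.startswith("/") else leading + link
            if key(target) not in seen:
                seen.add(key(target))
                queue.append(target)
    return processed


# Hypothetical three-page site; one page links with a leading slash.
SITE = {
    "index.html": ["samples/index.html", "samples/faq.html"],
    "samples/index.html": ["index.html", "/samples/faq.html"],
    "samples/faq.html": ["index.html", "samples/index.html"],
}

naive = crawl("index.html", SITE)                   # compares raw strings
deduped = crawl("index.html", SITE, key=normalise)  # ignores leading slash
```

With this toy site, len(naive) is 6 and len(deduped) is 3: the single
"/samples/faq.html" link makes the naive crawl process every page twice.
Note that stripping every leading slash is itself a simplification; it would
also mangle a protocol-relative link such as "//host/page.html", which is one
case where trimming cannot be applied blindly.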