Rasik Pandey wrote:
Ross Gardler wrote:
Ferdinand Soethe wrote:
Good point. However, I don't think OAI has a "minimal" form; I did some
preliminary research into it a few months ago. Let me check and I'll
report back.
However, I'd still like to see support for Google sitemaps since we can
do it very quickly and it is more "approachable" than OAI since everyone
knows Google.
If we go for the Google format, I'd suggest using slightly more than
the minimum format, in this form (as documented in
https://www.google.com/webmasters/sitemaps/docs/en/protocol.html):
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
<url>
<loc>http://www.yoursite.com/catalog?item=83&amp;desc=vacation_usa</loc>
<lastmod>2004-11-23</lastmod>
</url>
</urlset>
and include the 'lastmod' right away as that would be the key to speedy
updates. Can we do that?
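
For illustration only, here is a minimal Python sketch of producing a sitemap in the form above, with 'lastmod' emitted only when it is known. The function name build_sitemap and the (url, lastmod) input shape are assumptions made for this example, not anything that exists in our code base; only the namespace and the element names come from the Google documentation quoted above.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the sample above (Google sitemap schema 0.84).
SITEMAP_NS = "http://www.google.com/schemas/sitemap/0.84"


def build_sitemap(pages):
    """Build a sitemap document as a string.

    pages: iterable of (url, lastmod) pairs; lastmod is a 'YYYY-MM-DD'
    string, or None when the modification date is not known.
    (Hypothetical helper, named for this sketch only.)
    """
    # Serialize the sitemap namespace as the default namespace (no prefix).
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element("{%s}urlset" % SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "{%s}url" % SITEMAP_NS)
        # ElementTree escapes '&' as '&amp;' when serializing the text.
        ET.SubElement(url, "{%s}loc" % SITEMAP_NS).text = loc
        if lastmod:
            ET.SubElement(url, "{%s}lastmod" % SITEMAP_NS).text = lastmod
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap([
        ("http://www.yoursite.com/catalog?item=83&desc=vacation_usa", "2004-11-23"),
        ("http://www.yoursite.com/", None),
    ]))
```

Leaving 'lastmod' optional means the minimal form and the extended form can come from the same code path.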
Why not use RSS 2.0 as the format
(http://www.google.com/webmasters/sitemaps/docs/en/other.html#feed)?
It's not the format of the document that is the problem; that part is
easy. The hard part is knowing when a page has been regenerated because
of a change.
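
One possible way to approach that, sketched here in Python purely as an assumption (it is not how our build currently works): keep a digest of each generated page and only move 'lastmod' forward when the digest changes. The name update_lastmod and the in-memory dict used as state are hypothetical; a real build would have to persist the state between runs.

```python
import hashlib
from datetime import date


def update_lastmod(state, url, content, today=None):
    """Return the lastmod date to publish for a page.

    state:   dict mapping url -> (digest, lastmod), kept between builds
    content: the generated page as bytes
    today:   'YYYY-MM-DD' string; defaults to the current date
    (Hypothetical helper, named for this sketch only.)
    """
    today = today or date.today().isoformat()
    digest = hashlib.sha256(content).hexdigest()
    previous = state.get(url)
    if previous is not None and previous[0] == digest:
        # Page was regenerated but its content is identical: keep the old date.
        return previous[1]
    # New page, or content really changed: record the new digest and date.
    state[url] = (digest, today)
    return today


if __name__ == "__main__":
    state = {}
    print(update_lastmod(state, "/index.html", b"<html>v1</html>", today="2004-11-23"))
    print(update_lastmod(state, "/index.html", b"<html>v1</html>", today="2004-11-24"))
    print(update_lastmod(state, "/index.html", b"<html>v2</html>", today="2004-11-25"))
```

Note that a page embedding a build timestamp would hash differently on every run, so this only helps where the generated output is stable.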
I'd recommend getting the minimal version done, then looking at a way of
getting the lastmod as well.
What do you consider the minimal? In RSS, <pubDate> and <link>?
The minimum required by Google, i.e. those marked required in the following:
http://www.google.com/webmasters/sitemaps/docs/en/protocol.html#xmlTagDefinitions
(or if we used RSS instead whatever is required in that format).
Did you see that Google wants the URLs to be URL-encoded? Does our
XSLT engine have a function for that?
http://www.exslt.org/str/functions/encode-uri/index.html
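
To make concrete what "URL-encoded" means for a <loc> value, here is a small Python sketch using the standard urllib.parse module. It illustrates the encoding only and is not a suggestion to replace the XSLT or Cocoon route; the helper name encode_url is an assumption made for this example.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_url(url):
    """Percent-encode the path and query of a URL (UTF-8 based).

    '%' is treated as safe so that already-encoded URLs are not encoded
    twice; the flip side is that a literal '%' in the input is left as-is.
    (Hypothetical helper, named for this sketch only.)
    """
    parts = urlsplit(url)
    path = quote(parts.path, safe="/%")
    # Keep '=' and '&' so the query string structure survives.
    query = quote(parts.query, safe="=&%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


if __name__ == "__main__":
    print(encode_url("http://www.yoursite.com/my page.html?item=83&desc=vacation usa"))
```

The '&' in the query is left alone here; escaping it as '&amp;' is a separate step that belongs to XML serialization.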
Why not use the
http://cocoon.apache.org/2.1/userdocs/transformers/encodeurl-transformer.html?
Why not indeed. Thanks for the pointer.
Ross