On 12/31/2010 03:58 AM, Derek Munneke wrote:
For indexing use a sitemap.xml
As far as I understand, sitemap.xml is possibly only part of the
solution. I mean, I can list in the sitemap two URLs such as:
http://acme.com/foo/bar/#1
http://acme.com/foo/bar/#2
(and I'm not sure, BTW, whether the crawler swallows the #hash), but in
the end the crawler will always download the very same page, whose
contents are dynamically adapted by JavaScript. AFAIK the Google crawler
can run some JavaScript (the best reference I know of is here:
http://blogs.forbes.com/velocity/2010/06/25/google-isnt-just-reading-your-links-its-now-running-your-code/),
but I doubt it would run my slideshow code, which, unless stopped by
pressing a specific button, runs forever (the cited blog is in fact all
about the problem of detecting when a script terminates). I suppose I
could detect that the page has been requested by the Google crawler and,
in that case, run a very simple script that adds a text section to the
HTML with the description of the photo and then stops. But how would I
verify that it works? It could take days before the crawler revisits my
page after any change, and if it didn't work I would never be sure about
what's broken. I hope Google provides some more documentation about this
problem.
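To make the "very same page" point concrete: the part after the # (the fragment) is a purely client-side concept and is never sent in the HTTP request, so both sitemap entries above resolve to the same resource on the server. A minimal sketch with java.net.URI (the acme.com URLs are just the examples from above):

```java
import java.net.URI;

public class FragmentDemo {
    public static void main(String[] args) {
        // Two "different" slideshow URLs that differ only in the fragment.
        URI first  = URI.create("http://acme.com/foo/bar/#1");
        URI second = URI.create("http://acme.com/foo/bar/#2");

        // The request line a crawler sends is built from the path (and
        // query), never from the fragment: both URIs fetch the same page.
        System.out.println(first.getPath());      // /foo/bar/
        System.out.println(second.getPath());     // /foo/bar/

        // The fragment only exists for the client (here, the JavaScript
        // slideshow picks the photo to show from it).
        System.out.println(first.getFragment());  // 1
        System.out.println(second.getFragment()); // 2
    }
}
```

So from the crawler's point of view the sitemap lists the same URL twice, unless some extra mechanism maps each fragment to distinct server-side content.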
--
Fabrizio Giudici - Java Architect, Project Manager
Tidalwave s.a.s. - "We make Java work. Everywhere."
java.net/blog/fabriziogiudici - www.tidalwave.it/people
fabrizio.giud...@tidalwave.it
--
You received this message because you are subscribed to the Google Groups "The Java Posse" group.
To post to this group, send email to javapo...@googlegroups.com.
To unsubscribe from this group, send email to javaposse+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/javaposse?hl=en.