On 12/31/2010 03:58 AM, Derek Munneke wrote:

For indexing use a sitemap.xml


As far as I understand, sitemap.xml is probably only part of the solution. I mean, I can list in the sitemap two URLs such as:

http://acme.com/foo/bar/#1
http://acme.com/foo/bar/#2
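
For what it's worth, the sitemap protocol itself is simple; a minimal file listing those two URLs would look like the sketch below (just an illustration, assuming the acme.com URLs above). Note that Google's AJAX-crawling proposal expects the "hash-bang" form #! rather than a plain #, precisely because plain fragments are normally dropped:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://acme.com/foo/bar/#!1</loc>
  </url>
  <url>
    <loc>http://acme.com/foo/bar/#!2</loc>
  </url>
</urlset>
```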

(and BTW I'm not sure whether the #hash is swallowed by the crawler), but in the end the crawler will always download the very same page, whose contents are dynamically adapted by JavaScript.

AFAIK the Google crawler can run some JavaScript (the best reference I know of is here: http://blogs.forbes.com/velocity/2010/06/25/google-isnt-just-reading-your-links-its-now-running-your-code/), but I doubt it would run my slideshow code, which, unless stopped by pressing a specific button, runs forever (the cited blog is all about the problem of detecting the termination of a script).

I suppose I could detect that the page has been fetched by the Google crawler and, in that case, run a very simple script that adds a text section to the HTML with the description of the photo and then stops. But how would I know whether it works? It could take days before the crawler gets my page after any change, and if it didn't work I would never be sure about what's broken. I hope Google provides some more documentation about this problem.
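
If it helps, the detection problem is what Google's AJAX-crawling proposal tries to sidestep: instead of sniffing the crawler, you publish "pretty" #! URLs, and the crawler re-requests them with the fragment moved into an _escaped_fragment_ query parameter, which your server can answer with a static HTML snapshot. A minimal sketch of that URL mapping (a hypothetical helper, simplified: fragment URL-encoding is omitted):

```java
// Sketch of the URL rewriting in Google's AJAX-crawling scheme:
// "http://acme.com/foo/bar/#!1" is fetched by the crawler as
// "http://acme.com/foo/bar/?_escaped_fragment_=1", so the server
// can detect that request and return a pre-rendered snapshot.
public class EscapedFragment {

    public static String toCrawlerUrl(String prettyUrl) {
        int i = prettyUrl.indexOf("#!");
        if (i < 0) {
            return prettyUrl; // no hash-bang: the URL is crawled as-is
        }
        String base = prettyUrl.substring(0, i);
        String fragment = prettyUrl.substring(i + 2);
        String sep = base.contains("?") ? "&" : "?";
        return base + sep + "_escaped_fragment_=" + fragment;
    }

    public static void main(String[] args) {
        System.out.println(toCrawlerUrl("http://acme.com/foo/bar/#!1"));
    }
}
```

On the server side, a request carrying _escaped_fragment_ would then get the photo description as plain HTML, so you never need to guess whether the crawler ran your JavaScript.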

--
Fabrizio Giudici - Java Architect, Project Manager
Tidalwave s.a.s. - "We make Java work. Everywhere."
java.net/blog/fabriziogiudici - www.tidalwave.it/people
fabrizio.giud...@tidalwave.it

--
You received this message because you are subscribed to the Google Groups "The Java 
Posse" group.
To post to this group, send email to javapo...@googlegroups.com.
To unsubscribe from this group, send email to 
javaposse+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/javaposse?hl=en.
