Hi guys,

I am very interested in your approach. My own "similar" solution
to this problem (I have built an e-commerce platform with GWT) was to
write a small program that generates one HTML file per product in
the store, that is, one HTML file with the appropriate tags (title,
description, keywords) and an iframe that loads the GWT history-style
URL of the product. After the iframe, I add static HTML links
to other products. My servlet also builds a sitemap that links to
these files.
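
A minimal sketch of such a per-product page generator, for illustration only. The `Product` fields, the `#product-<id>` history token and the `product-<id>.html` file naming are assumptions, not the actual format used by the platform, and the sitemap step is left out.

```java
import java.util.List;

class ProductPageGenerator {

    /** Minimal product data; the field names are hypothetical. */
    static final class Product {
        final String id, title, description, keywords;

        Product(String id, String title, String description, String keywords) {
            this.id = id;
            this.title = title;
            this.description = description;
            this.keywords = keywords;
        }
    }

    /** Escapes the characters that are unsafe in HTML text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /**
     * Builds one static page: meta tags, an iframe pointing at the GWT
     * history URL (assumed to be appUrl + "#product-" + id), then plain
     * links to the static pages of the other products.
     */
    static String render(Product p, List<Product> others, String appUrl) {
        StringBuilder html = new StringBuilder();
        html.append("<html>\n<head>\n")
            .append("<title>").append(escape(p.title)).append("</title>\n")
            .append("<meta name=\"description\" content=\"")
            .append(escape(p.description)).append("\">\n")
            .append("<meta name=\"keywords\" content=\"")
            .append(escape(p.keywords)).append("\">\n")
            .append("</head>\n<body>\n")
            .append("<iframe src=\"").append(appUrl).append("#product-")
            .append(escape(p.id)).append("\"></iframe>\n");
        for (Product o : others) {
            html.append("<a href=\"product-").append(escape(o.id)).append(".html\">")
                .append(escape(o.title)).append("</a>\n");
        }
        return html.append("</body>\n</html>\n").toString();
    }
}
```

Each returned string would be written to its own file, and the same list of file names could feed the sitemap.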

What do you think about this "solution"? Maybe a mixture of the
two approaches is the key.

Regards


On 27 feb, 16:26, Nicolas Wetzel <wetz...@gmail.com> wrote:
> Hi all,
>
> I'm working for a company that builds a music broadcasting website based
> on GWT: www.awdio.com
>
> On SEO, we've found some interesting ways to cope with an Ajax-specific
> problem: search engine crawlers do not run a JavaScript engine, so they
> cannot retrieve the full HTML produced by the GWT script (or by any other
> Ajax framework script). So each page, i.e. each "GWT screen", cannot be
> indexed by them. Rather than duplicating each page with a hand-made static
> HTML page accessible through the noscript tag, we produce them with a Java
> program which launches an SWT Browser (Eclipse 3.4) with the start URL:
> http://www.awdio.com.
>
> The main issue with this approach is that the client program has no means of
> knowing when the page is fully rendered by the javascript process.
> In the gwt awdio code we implemented a "semaphore" (flag) which notifies the
> SWTBrowser based client of the completion.
>
> This semaphore works with a hidden <DIV> drawing</DIV> which is either
> present in the DOM or not, i.e. the generated HTML content either contains
> it or not. With org.eclipse.swt.browser.Browser.getText() we can retrieve
> the HTML content and test for the presence of the above-mentioned flag.
>
> To do that, the Java program listens for the Browser status text event
> (org.eclipse.swt.browser.StatusTextListener).
>
> Also, when the page is loaded, the program gets the content and looks up
> all the internal links <a href="#  built by the GWT Hyperlink widget. Before
> storing the HTML content in a cacheable static page, all the '#' are
> replaced by a '/' so that the bot will get fully qualified URLs (the
> crawlers do not handle anchors).
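
A minimal sketch of the '#' to '/' rewriting step described above, assuming the internal links are plain `<a ... href="#token">` tags with double-quoted attributes. The class name `HashLinkRewriter` is hypothetical, and a regular expression is only one possible way to do it; the original program may work differently.

```java
import java.util.regex.Pattern;

class HashLinkRewriter {

    // Matches the start of an <a> tag up to an href that begins with '#'.
    // Group 1: everything up to and including href="   Group 2: the token.
    private static final Pattern HASH_HREF = Pattern.compile(
            "(<a\\s[^>]*?href=\")#([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    /** Turns href="#events" into href="/events"; other links are left alone. */
    static String rewrite(String html) {
        return HASH_HREF.matcher(html).replaceAll("$1/$2\"");
    }

    public static void main(String[] args) {
        System.out.println(rewrite("<a href=\"#events\">Events</a>"));
    }
}
```

Links whose href does not start with '#', such as absolute URLs to other sites, are not touched by this pattern.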
>
> Finally, the program follows each link with the SWT Browser so that static
> versions of all the pages can be produced automatically.
>
> Lastly, on the awdio server, a front-end servlet detects the user agent of
> the request: if it's a search engine, the statically produced page is
> returned; otherwise the GWT host page is returned.
> As far as I understand, this might be considered "shadowing". But the
> content seen by the crawler is exactly the same as the one seen by the user
> (after Javascript execution).
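
A minimal sketch of the user-agent check such a front-end servlet could perform, written as plain methods so that it does not depend on the servlet API. The marker list is illustrative and far from complete, and the names `CrawlerDetector`, `isCrawler` and `choosePage` are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class CrawlerDetector {

    // Substrings found in the User-Agent header of some well-known crawlers.
    // Illustrative only: a real deployment would need a maintained list.
    private static final List<String> BOT_MARKERS =
            Arrays.asList("googlebot", "bingbot", "msnbot", "slurp");

    /** Returns true if the User-Agent header looks like a search engine bot. */
    static boolean isCrawler(String userAgent) {
        if (userAgent == null) {
            return false;
        }
        String ua = userAgent.toLowerCase(Locale.ROOT);
        for (String marker : BOT_MARKERS) {
            if (ua.contains(marker)) {
                return true;
            }
        }
        return false;
    }

    /** Picks the static snapshot for crawlers and the GWT host page otherwise. */
    static String choosePage(String userAgent, String staticPage, String hostPage) {
        return isCrawler(userAgent) ? staticPage : hostPage;
    }
}
```

In a servlet, the header value would come from `request.getHeader("User-Agent")`.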
>
> In the onModuleLoad of the awdio EntryPoint, the path part of the URL is
> parsed to build the corresponding history token. So when
> www.awdio.com/events is requested in a browser, it reacts in the same way
> as if the user had clicked on an internal link (#events).
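
A minimal sketch of how the path could be mapped to a history token, assuming the token is simply the path without its leading slash. The helper name `toHistoryToken` is hypothetical. In a GWT EntryPoint the input would typically come from `Window.Location.getPath()` and the result would be passed to `History.newItem(...)`; those calls are only mentioned in comments so that the sketch runs without GWT.

```java
class PathToToken {

    /**
     * Converts a request path such as "/events" into the history token
     * "events". An empty or root path gives the empty token.
     */
    static String toHistoryToken(String path) {
        if (path == null) {
            return "";
        }
        String token = path;
        // Drop leading slashes: "/events" -> "events".
        while (token.startsWith("/")) {
            token = token.substring(1);
        }
        return token;
    }

    public static void main(String[] args) {
        // In onModuleLoad this would be roughly:
        //   History.newItem(toHistoryToken(Window.Location.getPath()));
        System.out.println(toHistoryToken("/events"));
    }
}
```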
>
> Everything looks fine, but there is still a big issue...
>
> If a user copies and pastes one of our URLs on his own site, it will contain
> the hash sign (e.g. http://www.awdio.com/#events). This means that the
> search engine will not rank pages independently (all pages will be
> considered as a single one: http://www.awdio.com).
>
> We can still add "link to this page" buttons wherever necessary, but it's
> not satisfactory.
>
> To conclude, it seems that this whole solution solves the AJAX indexing
> issue, with the very annoying exception of page ranking (due to the #anchor
> URLs). Maybe Google should start to consider #anchors as having a new
> meaning for our Web 2.0 generation? Maybe by considering a specific value
> of the "rel" attribute (e.g. <A HREF="#mypage" rel="ispagelink">My
> Page</A>)?