On Thu, Apr 3, 2008 at 6:09 PM, Dan Kaplan <[EMAIL PROTECTED]> wrote:
> Ok, at least I'm not missing anything. I understand the benefits it's
> providing with its stateful framework. Developing a site with Wicket is
> easier than with any other framework I've used. But this statefulness,
> which makes websites so easy to develop, seems to be counterproductive to
> SEO:

well, perhaps the differentiator here is that wicket is made for web
applications, not web sites.

> GoogleBot will follow and index stateful links. Worst case scenario, these
> actually become visible to google users and when they click the link it
> takes them to an "invalid session" page. They think, "This site is broken"
> and move on to the next link of their search result.

yep, you need to make sure that all stateful links are behind a login
or something similar that the bot can't get past.

> Another approach to solving this is to block all the stateful pages in my
> robots.txt file. But how can I block these links in robots.txt since they
> change per session? Is there any way to know what the url will resolve to
> when googlebot tries to visit my site so I can tell it to disallow:
> /?wicket:interface=:10:1::: and ?wicket:interface=:0:1::: and ...?

no, there isn't a way; you have to use wildcards.

on the other hand, it is not that difficult to develop the stateless
paging navigator. it will take a bit of work, though.

-igor

> > -----Original Message-----
> > From: Igor Vaynberg [mailto:[EMAIL PROTECTED]
> > Sent: Thursday, April 03, 2008 5:45 PM
> > To: users@wicket.apache.org
> > Subject: Re: Removing the jsessionid for SEO
> >
> > On Thu, Apr 3, 2008 at 5:31 PM, Dan Kaplan <[EMAIL PROTECTED]> wrote:
> > > Ok I did a little preliminary research on this. Right now
> > > PagingNavigator uses PagingNavigationLinks to represent its pages.
> > > This extends Link. I'm supposed to override PagingNavigator's
> > > newPagingNavigationLink() method to accomplish this (I think), but
> > > past that, this isn't very straightforward to me.
> > >
> > > Do I need to create my own BookmarkablePagingNavigationLink? When I
> > > do... what next? I really don't know enough about
> > > BookmarkablePageLinks to do this. Right now, all the magic happens
> > > inside PagingNavigationLink. Won't I have to move all that logic into
> > > the WebPage that I'm passing into BookmarkablePagingNavigationLink?
> > > This seems like a lot of work. Am I missing something critical?
> >
> > no, you are not missing anything. you see, when you go stateless, like
> > what you want, then you have to recreate all the magic stuff that
> > makes stateful links Just Work. Without state you are back to the
> > servlet/mvc programming model: you have to encode the state that you
> > want into the link, then on the trip back decode it, recreate
> > something from it, and then apply that something onto the components.
> > This is the crapwork that wicket does for you usually.
> >
> > -igor
> >
> > > > -----Original Message-----
> > > > From: Igor Vaynberg [mailto:[EMAIL PROTECTED]
> > > > Sent: Thursday, April 03, 2008 3:40 PM
> > > > To: users@wicket.apache.org
> > > > Subject: Re: Removing the jsessionid for SEO
> > > >
> > > > you subclass the PagingNavigator and make it use bookmarkable links
> > > > also. it has factory methods for all the links it uses.
> > > >
> > > > -igor
> > > >
> > > > On Thu, Apr 3, 2008 at 3:36 PM, Dan Kaplan <[EMAIL PROTECTED]> wrote:
> > > > > I wasn't talking about the links that are on the list (I already
> > > > > make those bookmarkable). I'm talking about the links that the
> > > > > Navigator generates. How do I make it so page 2 is bookmarkable?
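The encode/decode/apply round trip described above ("encode the state that you want into the link, then on the trip back decode it ... and then apply that something onto the components") can be sketched in plain Java. This is only an illustration under stated assumptions, not Wicket code: the class StatelessPaging, its methods, and the "page" query parameter are all hypothetical names. In a real Wicket page the value would presumably travel in the parameters of a bookmarkable link rather than in a hand-built string.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of Wicket): keeps the paging state in the URL. */
final class StatelessPaging {

    /** Encode: put the zero-based page index into the link's query string. */
    static String pageUrl(String basePath, int page) {
        if (page < 0) {
            throw new IllegalArgumentException("page must be >= 0");
        }
        return basePath + "?page=" + page;
    }

    /**
     * Decode: read the page index back out of the request parameters.
     * Missing, malformed or negative values fall back to the first page,
     * so a stale or hand-edited link still resolves to something useful.
     */
    static int pageFrom(Map<String, String> params) {
        String raw = params.get("page");
        if (raw == null) {
            return 0;
        }
        try {
            return Math.max(Integer.parseInt(raw.trim()), 0);
        } catch (NumberFormatException e) {
            return 0;
        }
    }

    /** Apply: clamp the decoded index to the pages that actually exist. */
    static int clamp(int page, int itemCount, int itemsPerPage) {
        if (itemsPerPage <= 0) {
            throw new IllegalArgumentException("itemsPerPage must be > 0");
        }
        int pageCount = Math.max(1, (itemCount + itemsPerPage - 1) / itemsPerPage);
        return Math.min(Math.max(page, 0), pageCount - 1);
    }

    public static void main(String[] args) {
        // Link generation side: what the navigator would render for "page 3".
        System.out.println(pageUrl("/items", 2));

        // Request side: the parameters as they would arrive with that link.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("page", "2");
        System.out.println(clamp(pageFrom(params), 45, 10));
    }
}
```

Because every link carries its own page index, a crawler without a session (or a user arriving from a search result) lands on the same content each time, which is the property the thread is after.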
> > > > >
> > > > > -----Original Message-----
> > > > > From: Igor Vaynberg [mailto:[EMAIL PROTECTED]
> > > > > Sent: Thursday, April 03, 2008 3:30 PM
> > > > > To: users@wicket.apache.org
> > > > > Subject: Re: Removing the jsessionid for SEO
> > > > >
> > > > > instead of
> > > > >
> > > > >     item.add(new Link("foo") { public void onClick() { /* ... */ } });
> > > > >
> > > > > do
> > > > >
> > > > >     item.add(new BookmarkablePageLink("foo", Page.class));
> > > > >
> > > > > -igor
> > > > >
> > > > > On Thu, Apr 3, 2008 at 3:28 PM, Dan Kaplan <[EMAIL PROTECTED]> wrote:
> > > > > > How? I asked how to do it before and nobody suggested this as a
> > > > > > possibility.
> > > > > >
> > > > > > -----Original Message-----
> > > > > > From: Igor Vaynberg [mailto:[EMAIL PROTECTED]
> > > > > > Sent: Thursday, April 03, 2008 3:26 PM
> > > > > > To: users@wicket.apache.org
> > > > > > Subject: Re: Removing the jsessionid for SEO
> > > > > >
> > > > > > dataview can work in a stateless mode, just use bookmarkable
> > > > > > links inside it
> > > > > >
> > > > > > -igor
> > > > > >
> > > > > > On Thu, Apr 3, 2008 at 3:22 PM, Dan Kaplan <[EMAIL PROTECTED]> wrote:
> > > > > > > Regardless, at the very least this makes your site look
> > > > > > > "weird" and unprofessional when google puts a jsessionid on
> > > > > > > your url. There has got to be some negative effect when
> > > > > > > google visits it the second time and the jsessionid has
> > > > > > > changed but it sees the same exact content. Worst case,
> > > > > > > it'll think you're trying to trick it.
> > > > > > >
> > > > > > > About those 404s, I'm finding that with the fix I provided I
> > > > > > > don't get a 404, but the links refresh the page I'm already
> > > > > > > on. I.e., if I'm on A, and a link to B is non-bookmarkable,
> > > > > > > clicking B refreshes A.
> > > > > > >
> > > > > > > This issue is very disconcerting to me. It's one of the
> > > > > > > reasons I wish that DataView had an option to work in
> > > > > > > stateless mode. Cause if I ban cookies and Googlebot visits
> > > > > > > my home page (with a navigator on it), it'll try to follow
> > > > > > > all these page links and from its perspective, they all lead
> > > > > > > back to the first page. So it's kinda a catch-22: include
> > > > > > > the jsessionid in the urls and get bad SEO, or remove the
> > > > > > > jsessionid and get bad SEO :(
> > > > > > >
> > > > > > > Perhaps the answer to my prayers is a combination of the
> > > > > > > noindex/nofollow meta tag with a sitemap.xml. I'm thinking I
> > > > > > > can put a nofollow on the home page (so googlebot doesn't
> > > > > > > try to follow the navigator links) and use the sitemap.xml
> > > > > > > to point out the individual pages I want it to index.
> > > > > > >
> > > > > > > Matej: can you go into more detail about your hybrid URL
> > > > > > > statement? Won't google index, for example, /home and
> > > > > > > /home.1 if I use it? When it follows the next page, won't
> > > > > > > the url become /home.1.2 or something? That .2 is a page
> > > > > > > version: if google indexes that and tries to visit it again,
> > > > > > > won't it report an invalid session?
> > > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Matej Knopp [mailto:[EMAIL PROTECTED]
> > > > > > > Sent: Thursday, April 03, 2008 11:10 AM
> > > > > > > To: users@wicket.apache.org
> > > > > > > Subject: Re: Removing the jsessionid for SEO
> > > > > > >
> > > > > > > On the other hand, crawling non-bookmarkable pages is not
> > > > > > > very useful anyway, since a ?wicket:interface url will
> > > > > > > always get page expired when you click on the result.
> > > > > > >
> > > > > > > However, preserving the session makes a lot of sense with
> > > > > > > hybrid urls. Google remembers the original url (without page
> > > > > > > instance) while indexing the real page (after redirect).
> > > > > > >
> > > > > > > I think though that the crawler is quite advanced. I would
> > > > > > > think it supports cookies (at least JSESSIONID) and also
> > > > > > > evaluates some of the javascript on the page.
> > > > > > >
> > > > > > > -Matej
> > > > > > >
> > > > > > > On Thu, Apr 3, 2008 at 6:56 PM, Igor Vaynberg <[EMAIL PROTECTED]> wrote:
> > > > > > > > right. if you strip sessionid then all your
> > > > > > > > nonbookmarkable urls will resolve to a 404. that will
> > > > > > > > probably drop your rank a lot faster....
> > > > > > > >
> > > > > > > > -igor
> > > > > > > >
> > > > > > > > On Thu, Apr 3, 2008 at 9:16 AM, Johan Compagner <[EMAIL PROTECTED]> wrote:
> > > > > > > > > the problem is that then you have to have all stateless
> > > > > > > > > pages. Else google can't crawl your website.
> > > > > > > > > And if that is the case then you could be completely
> > > > > > > > > stateless so you don't have a session (id) to worry
> > > > > > > > > about at all.
> > > > > > > > >
> > > > > > > > > johan
> > > > > > > > >
> > > > > > > > > On Thu, Apr 3, 2008 at 4:54 PM, Zappaterrini, Larry <[EMAIL PROTECTED]> wrote:
> > > > > > > > > > When Google asks to not have special treatment for
> > > > > > > > > > their bot, they are referring to content more than
> > > > > > > > > > anything. Regarding the session id being coded in the
> > > > > > > > > > URL, see the Technical guidelines section of Google's
> > > > > > > > > > Webmaster Guidelines -
> > > > > > > > > >
> > > > > > > > > > http://www.google.com/support/webmasters/bin/answer.py?answer=35769#design
> > > > > > > > > >
> > > > > > > > > > It specifically recommends "allow(ing) search bots to
> > > > > > > > > > crawl your sites without session IDs or arguments that
> > > > > > > > > > track their path through the site."
> > > > > > > > > >
> > > > > > > > > > -----Original Message-----
> > > > > > > > > > From: Johan Compagner [mailto:[EMAIL PROTECTED]
> > > > > > > > > > Sent: Thursday, April 03, 2008 7:35 AM
> > > > > > > > > > To: users@wicket.apache.org
> > > > > > > > > > Subject: Re: Removing the jsessionid for SEO
> > > > > > > > > >
> > > > > > > > > > isn't google always saying that you shouldn't alter
> > > > > > > > > > the behavior of your site depending on whether it is
> > > > > > > > > > their bot or not?
> > > > > > > > > >
> > > > > > > > > > On Thu, Apr 3, 2008 at 1:00 PM, Artur W. <[EMAIL PROTECTED]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi!
> > > > > > > > > > >
> > > > > > > > > > > igor.vaynberg wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > also by doing what you have done users with
> > > > > > > > > > > > cookies disabled won't be able to use your site...
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > In my opinion the session id is a problem. Google
> > > > > > > > > > > indexes the same page again and again.
> > > > > > > > > > >
> > > > > > > > > > > About the users without cookies, we can do it like
> > > > > > > > > > > this:
> > > > > > > > > > >
> > > > > > > > > > >   // Nested in the application class. Needs
> > > > > > > > > > >   // javax.servlet.http.HttpServletResponse,
> > > > > > > > > > >   // org.apache.wicket.RequestCycle and
> > > > > > > > > > >   // org.apache.wicket.protocol.http.{WebRequest,WebResponse}.
> > > > > > > > > > >   static class Unbuffered extends WebResponse {
> > > > > > > > > > >
> > > > > > > > > > >     // Lowercase substrings that identify known crawlers.
> > > > > > > > > > >     private static final String[] botAgents = {
> > > > > > > > > > >       "onetszukaj", "googlebot", "appie", "architext",
> > > > > > > > > > >       "jeeves", "bjaaland", "ferret", "gulliver",
> > > > > > > > > > >       "harvest", "htdig", "linkwalker", "lycos_",
> > > > > > > > > > >       "moget", "muscatferret", "myweb", "nomad",
> > > > > > > > > > >       "scooter", "yahoo! slurp china", "slurp",
> > > > > > > > > > >       "weblayers", "antibot", "bruinbot", "digout4u",
> > > > > > > > > > >       "echo!", "ia_archiver", "jennybot", "mercator",
> > > > > > > > > > >       "netcraft", "msnbot", "petersnews",
> > > > > > > > > > >       "unlost_web_crawler", "voila", "webbase",
> > > > > > > > > > >       "webcollage", "cfetch", "zyborg", "wisenutbot",
> > > > > > > > > > >       "robot", "crawl", "spider" }; /* and so on... */
> > > > > > > > > > >
> > > > > > > > > > >     public Unbuffered(final HttpServletResponse res) {
> > > > > > > > > > >       super(res);
> > > > > > > > > > >     }
> > > > > > > > > > >
> > > > > > > > > > >     // Bots get the URL as-is, so no jsessionid is appended.
> > > > > > > > > > >     @Override
> > > > > > > > > > >     public CharSequence encodeURL(final CharSequence url) {
> > > > > > > > > > >       return isAgent() ? url : super.encodeURL(url);
> > > > > > > > > > >     }
> > > > > > > > > > >
> > > > > > > > > > >     private static boolean isAgent() {
> > > > > > > > > > >       String agent = ((WebRequest) RequestCycle.get().getRequest())
> > > > > > > > > > >           .getHttpServletRequest().getHeader("User-Agent");
> > > > > > > > > > >       if (agent == null) {
> > > > > > > > > > >         return false; // no header: treat as a normal client
> > > > > > > > > > >       }
> > > > > > > > > > >       agent = agent.toLowerCase();
> > > > > > > > > > >       for (String bot : botAgents) {
> > > > > > > > > > >         if (agent.indexOf(bot) != -1) {
> > > > > > > > > > >           return true;
> > > > > > > > > > >         }
> > > > > > > > > > >       }
> > > > > > > > > > >       return false;
> > > > > > > > > > >     }
> > > > > > > > > > >   }
> > > > > > > > > > >
> > > > > > > > > > > I didn't test this code, but I did a similar thing
> > > > > > > > > > > in my old Spring application and it works.
> > > > > > > > > > >
> > > > > > > > > > > Take care,
> > > > > > > > > > > Artur
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > View this message in context:
> > > > > > > > > > > http://www.nabble.com/Removing-the-jsessionid-for-SEO-tp16464534p16467396.html
> > > > > > > > > > > Sent from the Wicket - User mailing list archive at Nabble.com.
> > > > > > > > > > >
> > > > > > > > > > > ---------------------------------------------------------------------
> > > > > > > > > > > To unsubscribe, e-mail: users-[EMAIL PROTECTED]
> > > > > > > > > > > For additional commands, e-mail: [EMAIL PROTECTED]
> > > > > > > > > >
> > > > > > > > > > ______________
> > > > > > > > > >
> > > > > > > > > > The information contained in this message is
> > > > > > > > > > proprietary and/or confidential. If you are not the
> > > > > > > > > > intended recipient, please: (i) delete the message and
> > > > > > > > > > all copies; (ii) do not disclose, distribute or use
> > > > > > > > > > the message in any manner; and (iii) notify the sender
> > > > > > > > > > immediately. In addition, please be aware that any
> > > > > > > > > > message addressed to our domain is subject to
> > > > > > > > > > archiving and review by persons other than the
> > > > > > > > > > intended recipient. Thank you.