If you want to be SEO friendly, change the application so that the session ID is not in the URL.
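
For what it's worth, here is a rough sketch of that idea for URLs you build yourself (canonical links, sitemap entries, redirects). The names SessionUrls and stripSessionId are made up for illustration, and the code assumes the session ID shows up as the usual ;jsessionid=... path parameter. It does not stop the container from appending the ID in the first place; for that you still need cookie-based sessions or a filter like the one quoted further down.

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Tomcat or the servlet API. */
final class SessionUrls {
    // Matches the ";jsessionid=<value>" path parameter added by URL rewriting.
    // The value runs until the next ';', '?' or '#', or the end of the string.
    private static final Pattern JSESSIONID =
            Pattern.compile(";jsessionid=[^;?#]*", Pattern.CASE_INSENSITIVE);

    private SessionUrls() {
        // static helper, no instances
    }

    /** Returns the URL with any jsessionid path parameter removed. */
    static String stripSessionId(String url) {
        return JSESSIONID.matcher(url).replaceAll("");
    }
}
```

With this, SessionUrls.stripSessionId("/shop/item;jsessionid=ABC123?id=7") gives "/shop/item?id=7", and a URL without the parameter comes back unchanged.
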
Google treats each unique URL as a separate page. On a site that puts jsessionid in its URLs, every visit produces new URLs, so Google sees the same content on multiple pages and the score for those SEO terms goes down.

Simple answer: use cookies for sessions. Keep URLs clean and descriptive, using - between words.

On 2/9/10, Pid <p...@pidster.com> wrote:
> On 09/02/2010 16:32, Marian Simpetru wrote:
>> jsessionid in URLs returned around 79 million search results.
>
> Yep. I know they're there.
>
>> google search on jsessionid SEO will give you lots of examples.
>>
>> On a question asked to google, they reply by explaining the algorithm
>> (multiple URL with same content -> lower ranking, JSESSIONID=zzz ->
>> multiple URLS)
>>
>> I can see there is a penalty in google webmaster tools. Can't say on
>> other websites...
>
> This I also know.
>
> But as I said, it would be surprising *to me* to find that Google
> weren't trying to filter this type of noise out of their URL indexes.
>
> Having thought about it a little more, I would like to add that we
> implement XML Sitemaps on our site and this may be having an effect on
> matters.
>
> http://sitemaps.org/protocol.php
>
>
> When I look in my logs I can see sequential(ish) requests for URLs from
> all of the bots hitting our site and they do not have session id
> parameters appended.
>
> Of 68600 URLs appearing in the Google index of the site I have in mind,
> only 46 match a search for jsessionid and some of those appear because
> the HTML contains a URL to another site with the parameter present.
>
> The total number of URLs referenced in the XML sitemaps is somewhat
> below the total indexed on this domain and the difference is markedly
> larger than 46.
>
> I, perhaps hastily, have concluded that search engines are somehow
> storing pages without the session id parameter present in the URL.
>
>
> p
>
>
>> Marian
>>
>> On Tue, 2010-02-09 at 16:07 +0000, Pid wrote:
>>> On 09/02/2010 15:46, Christopher Schultz wrote:
>>> > -----BEGIN PGP SIGNED MESSAGE-----
>>> > Hash: SHA1
>>> >
>>> > Marian,
>>> >
>>> > On 2/9/2010 9:31 AM, Marian Simpetru wrote:
>>> >> Google act as a non cookie browser and hence he is served with non
>>> >> unique URLs (because of session ID is appended to URL).
>>> >
>>> > I heard at one point that Google's crawler *did* support cookies. I
>>> > never verified that, but it sounds like they currently do not support
>>> > them.
>>> >
>>> >> Question is: Is there a way to configure tomcat to only use cookies
>>> >> (not append jsessionid to URL for cookie0less browsers).
>>> >
>>> > It's not a Tomcat configuration, but you can always write a filter
>>> > like this:
>>> >
>>> > public class NoURLRewriteFilter
>>> >   implements Filter
>>> > {
>>> >   public void doFilter(...) {
>>> >     chain.doFilter(request, new HttpServletResponseWrapper(response) {
>>> >       public String encodeURL(String url) { return url; }
>>> >       public String encodeUrl(String url) { return url; }
>>> >       public String encodeRedirectURL(String url) { return url; }
>>> >       public String encodeRedirectUrl(String url) { return url; }
>>> >     });
>>> >   }
>>> > }
>>> >
>>> > Now, this will likely cause an explosion in the number of sessions
>>> > generated by Google's crawler. You might want to couple this with a
>>> > separate filter (or just create a GoogleCrawlerFilter that does all
>>> > this) that identifies Google's (and others) user agent and intercepts
>>> > calls to getSession() and either refuses to create a session (probably
>>> > not a good idea) or returns a fake session that gets discarded after
>>> > every request. Another option would be to set the session timeout to
>>> > something like 10 seconds so the session dies relatively quickly
>>> > instead of sticking around for a long time, wasting memory.
>>> >
>>> >> Maybe a better idea would be that someone from Apache Tomcat should
>>> >> push to google with some standards tomcat implement in this respect
>>> >> so that google change the algorithm and not punish with low ranking
>>> >> websites powered by tomcat.
>>> >
>>> > This is not a "Tomcat problem": it's a problem with any site that
>>> > requires sessions to maintain state on the server.
>>> >
>>> > I agree with Chuck: fix your webapp to tolerate Google's crawler, or
>>> > suffer the consequences.
>>> >
>>> > Something else you can do is use a robots.txt file to prevent the
>>> > crawler from hitting certain URLs. That might help.
>>>
>>> I'm not doing anything special, I don't think.
>>> Google bots hit our site, the session count goes up a bit.
>>> Google does not include jsessionid in the URLs it indexes.
>>>
>>> It may be that the site has been around for long enough that the Google
>>> algorithms know that we have a session id should be removed from a URL.
>>>
>>> It would be surprising to me if Google (et al) was not trying to remove
>>> PHPSESSIONID and JSESSIONID data from URLs.
>>>
>>>
>>> p
>>>
>>>
>>> > - -chris
>>> > -----BEGIN PGP SIGNATURE-----
>>> > Version: GnuPG v1.4.10 (MingW32)
>>> > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>>> >
>>> > iEYEARECAAYFAktxg08ACgkQ9CaO5/Lv0PBxDACgweTaZAglz476s7TvYo63//2a
>>> > IgcAoIp0u2ZxOes8fFPuUAoP2FrHk/VN
>>> > =FjsP
>>> > -----END PGP SIGNATURE-----
>>> >
>>> > ---------------------------------------------------------------------
>>> > To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
>>> > For additional commands, e-mail: users-h...@tomcat.apache.org
>>> >
>>>
>
>

--
Sent from my mobile device

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org