If you want to be SEO-friendly, change the application so that the
session ID is not in the URL.

Google treats each unique URL as a separate page. On each visit to a site
that appends jsessionid to its URLs, it sees the same content under
multiple URLs, and the ranking for those search terms goes down.
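
To make the duplicate-URL point concrete, here is a minimal Java sketch. It is only an illustration: the class and method names (SessionUrls, stripSessionId, distinctPages) and the example.com URLs are made up, and it does not claim to show how Google or any other crawler actually handles these URLs. It strips the ";jsessionid=..." path parameter and shows that URLs differing only in session ID collapse to a single page.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class SessionUrls {
    // Matches the ";jsessionid=..." path parameter up to the next
    // path parameter, the query string, the fragment, or the end of the URL.
    private static final Pattern JSESSIONID =
            Pattern.compile(";jsessionid=[^?#;]*", Pattern.CASE_INSENSITIVE);

    // Returns the URL with any jsessionid path parameter removed.
    static String stripSessionId(String url) {
        return JSESSIONID.matcher(url).replaceAll("");
    }

    // Returns the distinct URLs that remain once session IDs are stripped.
    static Set<String> distinctPages(List<String> urls) {
        Set<String> pages = new LinkedHashSet<>();
        for (String url : urls) {
            pages.add(stripSessionId(url));
        }
        return pages;
    }

    public static void main(String[] args) {
        List<String> crawled = Arrays.asList(
                "http://example.com/products;jsessionid=AAA111",
                "http://example.com/products;jsessionid=BBB222");
        // Both entries refer to the same content under different URLs.
        System.out.println(distinctPages(crawled));
    }
}
```

With cookie-based sessions the server never emits the first form, so there is nothing to strip in the first place.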

Simple answer: Use cookies for sessions.

Keep URLs clean and descriptive, using hyphens (-) between words.
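
As a rough sketch of what "clean and descriptive" could look like in code, the following Java helper turns a page title into a hyphen-separated URL segment. The class name UrlSlug and the method slugify are assumptions for this example, not part of Tomcat or any library, and the exact rules (lowercasing, dropping accents, ASCII only) are one possible choice among many.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper, for illustration only.
final class UrlSlug {
    // Converts a title such as "Tomcat Session Tracking & SEO"
    // into a lowercase, hyphen-separated URL segment.
    static String slugify(String title) {
        // Decompose accented characters, then drop the combining marks.
        String ascii = Normalizer.normalize(title, Normalizer.Form.NFD)
                .replaceAll("\\p{M}", "");
        return ascii.toLowerCase(Locale.ROOT)
                // Any run of characters other than a-z and 0-9 becomes one hyphen.
                .replaceAll("[^a-z0-9]+", "-")
                // Trim hyphens left over at either end.
                .replaceAll("^-+|-+$", "");
    }

    public static void main(String[] args) {
        System.out.println(slugify("Tomcat Session Tracking & SEO"));
    }
}
```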

On 2/9/10, Pid <p...@pidster.com> wrote:
> On 09/02/2010 16:32, Marian Simpetru wrote:
>> jsessionid in URLs returned around 79 million search results.
>
> Yep. I know they're there.
>
>> google search on jsessionid SEO will give you lots of examples.
>>
>> On a question asked to google, they reply by explaining the algorithm
>> (multiple URL with same content -> lower ranking, JSESSIONID=zzz ->
>> multiple URLS)
>>
>> I can see there is a penalty in google webmaster tools. Can't say on
>> other websites...
>
> This I also know.
>
> But as I said, it would be surprising *to me* to find that Google
> weren't trying to filter this type of noise out of their URL indexes.
>
> Having thought about it a little more, I would like to add that we
> implement XML Sitemaps on our site and this may be having an effect on
> matters.
>
>   http://sitemaps.org/protocol.php
>
>
> When I look in my logs I can see sequential(ish) requests for URLs from
> all of the bots hitting our site and they do not have session id
> parameters appended.
>
> Of 68600 URLs appearing in the Google index of the site I have in mind,
> only 46 match a search for jsessionid and some of those appear because
> the HTML contains a URL to another site with the parameter present.
>
> The total number of URLs referenced in the XML sitemaps is somewhat
> below the total indexed on this domain and the difference is markedly
> larger than 46.
>
> I, perhaps hastily, have concluded that search engines are somehow
> storing pages without the session id parameter present in the URL.
>
>
> p
>
>
>> Marian
>>
>> On Tue, 2010-02-09 at 16:07 +0000, Pid wrote:
>>> On 09/02/2010 15:46, Christopher Schultz wrote:
>>> >  -----BEGIN PGP SIGNED MESSAGE-----
>>> >  Hash: SHA1
>>> >
>>> >  Marian,
>>> >
>>> >  On 2/9/2010 9:31 AM, Marian Simpetru wrote:
>>> >>  Google acts as a cookieless browser and hence is served non-unique
>>> >>  URLs (because the session ID is appended to the URL).
>>> >
>>> >  I heard at one point that Google's crawler *did* support cookies. I
>>> >  never verified that, but it sounds like they currently do not support
>>> >  them.
>>> >
>>> >>  Question is: Is there a way to configure Tomcat to only use cookies
>>> >>  (not append jsessionid to the URL for cookie-less browsers)?
>>> >
>>> >  It's not a Tomcat configuration, but you can always write a filter
>>> >  like this:
>>> >
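>>> >  import java.io.IOException;
>>> >  import javax.servlet.*;
>>> >  import javax.servlet.http.*;
>>> >
>>> >  public class NoURLRewriteFilter
>>> >      implements Filter
>>> >  {
>>> >     public void init(FilterConfig config) { }
>>> >     public void destroy() { }
>>> >
>>> >     public void doFilter(ServletRequest request, ServletResponse response,
>>> >                          FilterChain chain)
>>> >         throws IOException, ServletException {
>>> >       // Wrap the response so the encode* methods return URLs unchanged,
>>> >       // which stops the container from appending ;jsessionid to them.
>>> >       chain.doFilter(request,
>>> >           new HttpServletResponseWrapper((HttpServletResponse) response) {
>>> >         public String encodeURL(String url) { return url; }
>>> >         public String encodeUrl(String url) { return url; }
>>> >         public String encodeRedirectURL(String url) { return url; }
>>> >         public String encodeRedirectUrl(String url) { return url; }
>>> >       });
>>> >     }
>>> >  }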
>>> >
>>> >  Now, this will likely cause an explosion in the number of sessions
>>> >  generated by Google's crawler. You might want to couple this with a
>>> >  separate filter (or just create a GoogleCrawlerFilter that does all
>>> >  this) that identifies Google's (and others) user agent and intercepts
>>> >  calls to getSession() and either refuses to create a session (probably
>>> >  not a good idea) or returns a fake session that gets discarded after
>>> >  every request. Another option would be to set the session timeout to
>>> >  something like 10 seconds so the session dies relatively quickly
>>> >  instead of sticking around for a long time, wasting memory.
>>> >
>>> >>  Maybe a better idea would be for someone from Apache Tomcat to push
>>> >>  Google on the standards Tomcat implements in this respect, so that
>>> >>  Google changes the algorithm and does not punish websites powered by
>>> >>  Tomcat with a low ranking.
>>> >
>>> >  This is not a "Tomcat problem": it's a problem with any site that
>>> >  requires sessions to maintain state on the server.
>>> >
>>> >  I agree with Chuck: fix your webapp to tolerate Google's crawler, or
>>> >  suffer the consequences.
>>> >
>>> >  Something else you can do is use a robots.txt file to prevent the
>>> >  crawler from hitting certain URLs. That might help.
>>>
>>> I'm not doing anything special, I don't think.
>>> Google bots hit our site, the session count goes up a bit.
>>> Google does not include jsessionid in the URLs it indexes.
>>>
>>> It may be that the site has been around for long enough that the Google
>>> algorithms know that we have a session id that should be removed from a
>>> URL.
>>>
>>> It would be surprising to me if Google (et al) was not trying to remove
>>> PHPSESSIONID and JSESSIONID data from URLs.
>>>
>>>
>>> p
>>>
>>>
>>> >  - -chris
>>> >  -----BEGIN PGP SIGNATURE-----
>>> >  Version: GnuPG v1.4.10 (MingW32)
>>> >  Comment: Using GnuPG with Mozilla -http://enigmail.mozdev.org/
>>> >
>>> >  iEYEARECAAYFAktxg08ACgkQ9CaO5/Lv0PBxDACgweTaZAglz476s7TvYo63//2a
>>> >  IgcAoIp0u2ZxOes8fFPuUAoP2FrHk/VN
>>> >  =FjsP
>>> >  -----END PGP SIGNATURE-----
>>> >
>>> >  ---------------------------------------------------------------------
>>> >  To unsubscribe, e-mail:users-unsubscr...@tomcat.apache.org
>>> > <mailto:users-unsubscr...@tomcat.apache.org>
>>> >  For additional commands, e-mail:users-h...@tomcat.apache.org
>>> > <mailto:users-h...@tomcat.apache.org>
>>> >
>>>
>

