Hello everybody,

I just want to add my 2 cents to this discussion.

At IndyPhone we too wanted to get rid of jsessionid URLs in Google's index.
Yes, it would be nice if Googlebot were as clever as Yahoo's crawler and
simply removed them itself. But it doesn't.

So I implemented a servlet filter which checks the User-Agent header for
Googlebot and skips the URL rewriting just for those clients. As this
generates lots of new sessions, the filter invalidates the session right
after the request. Also, if a crawler requests a URL containing a
jsessionid (one it stored before the filter was in place), the filter
redirects the crawler to the same URL, just without the jsessionid
parameter. That way, the index gets updated for those old URLs.
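
To make the two checks above concrete, here is a minimal sketch in plain
Java. It is not the actual IndyPhone filter: the class name CrawlerUrlTool
and its method names are hypothetical, the User-Agent test is a simple
substring match, and the jsessionid is assumed to appear in the usual
";jsessionid=..." path-parameter form.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper, not the original filter code.
final class CrawlerUrlTool {

    // Assumption: the session id appears as ";jsessionid=<id>" and ends
    // at the next '?', '#' or ';'.
    private static final Pattern JSESSIONID =
            Pattern.compile(";jsessionid=[^?#;]*", Pattern.CASE_INSENSITIVE);

    private CrawlerUrlTool() {}

    // True if the User-Agent header value mentions Googlebot.
    static boolean isGooglebot(String userAgent) {
        return userAgent != null
                && userAgent.toLowerCase(Locale.ROOT).contains("googlebot");
    }

    // True if the URL carries a jsessionid path parameter.
    static boolean hasJsessionid(String url) {
        return url != null && JSESSIONID.matcher(url).find();
    }

    // Returns the URL with any jsessionid path parameter removed.
    static String stripJsessionid(String url) {
        return JSESSIONID.matcher(url).replaceAll("");
    }

    // Returns the URL a crawler should be redirected to, or null when
    // no redirect is needed (not a crawler, or no jsessionid present).
    static String redirectTargetFor(String userAgent, String requestUrl) {
        if (isGooglebot(userAgent) && hasJsessionid(requestUrl)) {
            return stripJsessionid(requestUrl);
        }
        return null;
    }
}
```

In a real filter these helpers would be called from doFilter. One way to
skip the rewriting, for example, is to pass an HttpServletResponseWrapper
down the chain whose encodeURL and encodeRedirectURL return the URL
unchanged, and to call invalidate() on the session once the chain returns.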

Now we have almost none of those URLs in Google's index.

If anyone is interested in the code, I'd be willing to publish it. As it
is not Wicket-specific, I could contribute it to some generic servlet tools
open-source project - is there something like that at Apache or elsewhere?

But maybe Google is smarter by now, and such a filter is no longer required?

-- 
greetings from Berlin,

Rüdiger Schulz

www.2rue.de
www.indyphone.de - Coole Handy Logos einfach selber bauen
