This looks more like a session-handling issue on the WebSphere server than a Nutch fetching issue. Nutch doesn't create the session itself; it simply doesn't store cookies or session information, so WebSphere creates a new session for every fetch.
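
To make the mechanism concrete, here is a minimal toy model in Python. It is not Nutch or WebSphere code: `ToySessionServer` and `crawl` are hypothetical names invented for illustration, and the page count of 1000 is an arbitrary assumption. It only shows that a client which never sends back the session cookie ends up with one server-side session per request.

```python
import itertools


class ToySessionServer:
    """Hypothetical stand-in for a server that tracks sessions by cookie."""

    def __init__(self):
        self.sessions = {}               # session id -> number of requests seen
        self._ids = itertools.count(1)

    def handle(self, cookie=None):
        # A request without a known session cookie gets a brand-new session.
        if cookie not in self.sessions:
            cookie = f"sess-{next(self._ids)}"
            self.sessions[cookie] = 0
        self.sessions[cookie] += 1
        return cookie                    # plays the role of a Set-Cookie header


def crawl(server, pages, keep_cookies):
    """Fetch `pages` pages and return how many sessions the server now holds."""
    cookie = None
    for _ in range(pages):
        returned = server.handle(cookie)
        if keep_cookies:
            cookie = returned            # a cookie-aware client reuses the session
    return len(server.sessions)


stateless_sessions = crawl(ToySessionServer(), 1000, keep_cookies=False)
sticky_sessions = crawl(ToySessionServer(), 1000, keep_cookies=True)
print(stateless_sessions, sticky_sessions)  # prints "1000 1"
```

The cookie-less client is the situation described above: every fetch looks like a first visit to the server.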

While having a single stored session for fetching from the same domain in Nutch seems like it might be interesting functionality, I don't believe that currently exists. My suggestion is to look into tuning the WebSphere session timeouts. My guess is that they are set very high, so sessions created early in the crawl are still being held when the server runs out of resources.
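
As a rough back-of-the-envelope sketch of why the timeout matters, the Python snippet below estimates how many sessions the server is still holding at a given point in the crawl. The function `live_sessions` is hypothetical, and the fetch rate (5 pages per second) and the two timeouts (24 hours and 30 minutes) are assumed numbers for illustration only; only the 16-hour mark comes from the report below. It also assumes one new session per fetch and that sessions are dropped exactly when they time out.

```python
def live_sessions(fetches_per_second, timeout_seconds, crawl_seconds):
    """Estimate sessions still held by the server after `crawl_seconds`.

    Assumes every fetch creates a new session and none are reused.
    """
    # Sessions older than the timeout have already expired, so only the
    # fetches inside the most recent timeout window still count.
    window = min(timeout_seconds, crawl_seconds)
    return int(fetches_per_second * window)


crawl_time = 16 * 3600  # 16 hours into the crawl, in seconds

long_timeout = live_sessions(5, 24 * 3600, crawl_time)   # assumed 24 h timeout
short_timeout = live_sessions(5, 30 * 60, crawl_time)    # assumed 30 min timeout
print(long_timeout, short_timeout)  # prints "288000 9000"
```

Under these assumptions, a timeout longer than the crawl means no session ever expires while the crawl is running, whereas a short timeout caps the number of live sessions regardless of how long the crawl runs.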

Dennis

kazam wrote:
Hi there,
I am generating Nutch indexes for our site, which runs on a WebSphere
server. The indexing takes about 20 hours to complete. However, after about
15-16 hours the WebSphere server crashes because too many sessions are being
created.
It seems that each fetch creates a new session. Is there a way for all
Nutch fetches to be done via a single session?
Has anyone else encountered such a problem? All ideas are welcome.

Thanks.
