This looks more like a session-handling issue on the WebSphere server
than a Nutch fetching issue. Nutch doesn't create the session itself; it
simply doesn't store cookies or session information between requests, so
WebSphere creates a new session for every fetch.
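To illustrate the mechanism, here is a toy sketch (not WebSphere's or
Nutch's actual code; ToyServer and crawl are made-up names) of a server
that creates a session whenever a request arrives without a known
session ID. A client that drops the cookie ends up with one session per
request, while a client that sends it back reuses a single session.

```python
import uuid


class ToyServer:
    """Hypothetical stand-in for an app server that tracks sessions."""

    def __init__(self):
        self.sessions = set()

    def handle(self, session_id=None):
        # Reuse the session only if the client sent back a known ID;
        # otherwise create a new one, as a servlet container would.
        if session_id not in self.sessions:
            session_id = uuid.uuid4().hex
            self.sessions.add(session_id)
        return session_id


def crawl(server, pages, keep_cookie):
    """Fetch `pages` pages and return how many sessions the server holds."""
    cookie = None
    for _ in range(pages):
        returned = server.handle(cookie)
        # A cookie-less fetcher throws the session ID away after each fetch.
        cookie = returned if keep_cookie else None
    return len(server.sessions)


stateless = crawl(ToyServer(), 1000, keep_cookie=False)
stateful = crawl(ToyServer(), 1000, keep_cookie=True)
print(stateless, stateful)
```

The page count of 1000 is arbitrary; the point is only that the session
count grows with the number of fetches when no cookie is returned.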
While keeping a single stored session when fetching from the same domain
seems like it might be interesting functionality for Nutch, I don't
believe it currently exists. My suggestion is to look into tuning the
WebSphere session timeouts. My guess is that they are set very high, so
sessions pile up faster than they expire.
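As a rough back-of-the-envelope sketch of why the timeout matters: if
every fetch creates a session that is never touched again, the number of
live sessions is about the fetch rate times the timeout (or times the
elapsed crawl time, if that is shorter). The rate and timeout values
below are made-up assumptions, not measurements from your server.

```python
def live_sessions(fetches_per_second, timeout_seconds, elapsed_seconds):
    """Estimate live sessions when each fetch opens a session that is
    never reused and expires only after the inactivity timeout."""
    # Sessions older than the timeout have expired; before the first
    # expiry, everything created so far is still alive.
    return fetches_per_second * min(timeout_seconds, elapsed_seconds)


RATE = 2                 # hypothetical fetches per second
ELAPSED = 16 * 3600      # about 16 hours into the crawl

short_timeout = live_sessions(RATE, 30 * 60, ELAPSED)     # 30-minute timeout
long_timeout = live_sessions(RATE, 24 * 3600, ELAPSED)    # 24-hour timeout
print(short_timeout, long_timeout)
```

With a timeout longer than the crawl itself, nothing ever expires and
the count just keeps climbing until the server runs out of memory.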
Dennis
kazam wrote:
Hi there,
I am generating Nutch indexes for our site, which runs on a WebSphere
server. The indexing takes about 20 hours to complete. However, after about
15-16 hours the WebSphere server crashes because too many sessions are being
created.
It seems that each fetch creates a new session. Is there a way for all
Nutch fetches to be done via a single session?
Has anyone else encountered such a problem? All ideas are welcome.
Thanks.