Any hints on how to increase the session time of the Nutch crawl thread? I tried crawling with a single thread, but still no luck.
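To check whether the pages really depend on a session cookie, independent of Nutch, something like the following could be used. This is only a minimal sketch with the Python standard library, not Nutch code: the function names are hypothetical, and it assumes the session is carried in cookies and that the URLs follow the `?page_number=N` pattern from the original message.

```python
import http.cookiejar
import urllib.request


def make_session_fetcher(timeout=30):
    """Return a fetch(url) function whose requests all share one cookie jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

    def fetch(url):
        # Cookies set by earlier responses are sent back automatically.
        with opener.open(url, timeout=timeout) as response:
            return response.read()

    return fetch


def crawl_in_sequence(fetch, base_url, last_page):
    """Fetch page_number=1..last_page in order; return {page: body length}."""
    sizes = {}
    for page in range(1, last_page + 1):
        body = fetch(f"{base_url}?page_number={page}")
        sizes[page] = len(body)
    return sizes


def first_empty_page(sizes):
    """Return the lowest page number with an empty body, or None."""
    for page in sorted(sizes):
        if sizes[page] == 0:
            return page
    return None
```

Calling `crawl_in_sequence(make_session_fetcher(), "http://www.myhost.com/", 10)` and passing the result to `first_empty_page` would show where the responses become empty. If that run gets past page 6 while a run without the cookie jar does not, the problem is more likely session cookies than a timeout.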
----
Thanks/Regards,
Parvez

On Tue, Sep 8, 2009 at 4:02 PM, Mohamed Parvez <[email protected]> wrote:

> I have paginated pages, which only work if they are crawled in a given
> sequence and in the same session.
>
> For example, the first URLs are:
>
> http://www.myhost.com/?page_number=1
> http://www.myhost.com/?page_number=2
> http://www.myhost.com/?page_number=3
>
> The first page has a link to the second page.
> The second page has links to the first and second pages.
> The third page has links to the third and second pages.
> And so on...
>
> Nutch is able to crawl the first 6 pages, but beyond that it is not
> able to crawl or is getting an empty result.
>
> If I manually click through the pagination in a browser, I can reach
> the end with no problem.
>
> Is the Nutch crawl session timing out? How do we increase it?
>
> I tried crawling with one thread, but got the same result.
>
> Any suggestions?
>
> ---
> Thanks/Regards,
> Parvez
