Hi,

Thank you for your help. I set db.max.outlinks.per.page to -1. Now there is no limit on the number of files crawled under one folder.
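
For reference, the override discussed in this thread can be sketched roughly as below. This is only an illustration under stated assumptions: the XML is embedded as a string rather than read from a real conf folder, the thread itself only mentions nutch-default.xml (local overrides are commonly kept in conf/nutch-site.xml instead), and read_properties is a hypothetical helper, not part of Nutch. The small Python check just confirms that the property parses to the intended value.

```python
import xml.etree.ElementTree as ET

# Assumed content of a Nutch configuration override file.
# A negative value for db.max.outlinks.per.page is what the thread
# reports as removing the per-page outlink limit.
SITE_XML = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>db.max.outlinks.per.page</name>
    <value>-1</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Hypothetical helper: map each <property> name to its value string."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.iter("property")
    }


props = read_properties(SITE_XML)
# Show the value that would be picked up for the outlink limit.
print("db.max.outlinks.per.page =", props["db.max.outlinks.per.page"])
```

The same pattern could be used to inspect other properties mentioned in the thread, such as http.content.limit, before starting a crawl.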

2008/9/2 Onur Deniz <[EMAIL PROTECTED]>

> Hi,
>
> If 32 is the limit for all URLs, I think you may not have edited
> nutch-default.xml. Take a look at that file under the conf folder.
>
> Setting db.max.outlinks.per.page to -1 may solve your problem. But also
> take a look at the other variables; those may also cause problems in the
> future, like http.content.limit...
>
> Hope this helps.
>
> Regards,
> Onur Deniz
>
> --- On Tue, 9/2/08, 宫照 <[EMAIL PROTECTED]> wrote:
>
> > From: 宫照 <[EMAIL PROTECTED]>
> > Subject: can not deal too many files under one folder
> > To: nutch-user@lucene.apache.org
> > Date: Tuesday, September 2, 2008, 6:43 AM
> >
> > Hi all,
> >
> > I have posted this problem before, but it was not solved.
> >
> > I use Nutch to crawl some documents on an intranet. Some URLs have
> > many documents under them.
> >
> > After crawling, I find that if there are more than 32 files under one
> > folder, I can search only the first 32 documents; the documents after
> > them cannot be searched. I checked the index in Luke and it shows the
> > same situation.
> >
> > It seems that only the first 32 documents are dealt with; if there are
> > more than 32 documents, the rest are not crawled, and every URL has
> > the same problem.
> >
> > Does anybody know the reason?
> >
> > Regards,
> > Gong Zhao