Hi,

Thank you for your help.
I set db.max.outlinks.per.page to -1. Now there is no limit on the number of
files under one folder.
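
For anyone hitting the same limit, the change might look roughly like the
fragment below. This is only a sketch: it assumes the override is placed in
conf/nutch-site.xml (which takes precedence over nutch-default.xml) rather
than edited into the defaults file directly, and the description text is my
own wording, not copied from the shipped file.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Assumed location: conf/nutch-site.xml, overriding nutch-default.xml -->
  <property>
    <name>db.max.outlinks.per.page</name>
    <!-- A negative value removes the cap on outlinks processed per page -->
    <value>-1</value>
    <description>Maximum number of outlinks processed for one page;
    -1 means no limit.</description>
  </property>
</configuration>
```

After changing the value, the affected urls need to be crawled and indexed
again before the additional documents become searchable.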


2008/9/2 Onur Deniz <[EMAIL PROTECTED]>

> hi,
>
> if 32 is a limit for all urls, i think you did not edit nutch-default.xml
> maybe...
>
> take a look at that file under the conf folder.
>
> setting db.max.outlinks.per.page to -1 may solve your problem. but also
> take a look at the other variables. those may also cause problems in the
> future, like http.content.limit...
>
> hope this helps..
>
> regards
>
> onur deniz
>
>
>
> --- On Tue, 9/2/08, 宫照 <[EMAIL PROTECTED]> wrote:
>
> > From: 宫照 <[EMAIL PROTECTED]>
> > Subject: can not deal too many files under one folder
> > To: nutch-user@lucene.apache.org
> > Date: Tuesday, September 2, 2008, 6:43 AM
> > Hi all,
> >
> > I have posted this problem before, but it was not solved.
> >
> > I use nutch to crawl some documents on our intranet.
> > For some urls there are many documents under them.
> >
> > I find that after crawling, if there are more than 32 files
> > under one folder, I can only search the first 32 documents; the
> > documents after them cannot be searched. I checked the index in
> > Luke, and it shows the same situation.
> >
> > It means nutch only deals with the first 32 documents. If a folder
> > has more than 32 documents, the rest are not crawled, and every
> > url has the same problem.
> >
> > Does anybody know the reason?
> >
> > regards,
> > Gong Zhao
>
>
>
>
