> take a look at that file under the conf folder.
>
> setting db.max.outlinks.per.page to -1 may solve your problem. but also
> take a look at the other variables. those may also cause problems in future,
> like http.content.limit...
>
> hope this helps..
>
> regards
>
> onur
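
For reference, a minimal sketch of what the overrides suggested above might look like. It assumes "that file" is conf/nutch-site.xml (the usual place to override defaults from nutch-default.xml), and the -1 for http.content.limit is my own assumption rather than something onur stated, so check the property descriptions in your Nutch version before copying it:

```xml
<!-- conf/nutch-site.xml (assumed location): local overrides of nutch-default.xml -->
<configuration>
  <property>
    <name>db.max.outlinks.per.page</name>
    <!-- -1 as suggested in the reply above: do not cap the outlinks kept per page -->
    <value>-1</value>
  </property>
  <property>
    <name>http.content.limit</name>
    <!-- assumption: -1 disables truncation of downloaded content -->
    <value>-1</value>
  </property>
</configuration>
```

If the documents are fetched over file: rather than http:, the matching size limit would presumably be the file protocol's own setting instead of http.content.limit.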
Hi all,
I have posted this problem before, but it was not solved.
I use nutch to crawl some documents on our intranet. For some urls there
are many documents under them.
I find that after crawling, if there are more than 32 files under one
folder, I can only search the first 32 documents; other documents afte
Hi all,
I ran into a problem when using nutch.
I use it to crawl some documents on our intranet. For some urls there
are many documents under them.
I find that after crawling, I can only search 32 documents; the documents after
those cannot be searched. I checked the index in Luke, and it shows the same situation.
It m
.
regards,
Gong Zhao
2008/7/28 wuqi <[EMAIL PROTECTED]>
> Try setting the log level for the Dedup program to "DEBUG" in your
> log4j.properties file and you may find the cause..
>
> - Original Message -
> *From:* 宫照 <[EMAIL PROTECTED]>
> *To:* nutch-user@lucen
the
> segment file..
>
>
>
> - Original Message -
> From: "宫照" <[EMAIL PROTECTED]>
> To: ; <[EMAIL PROTECTED]>
> Sent: Friday, July 25, 2008 9:53 AM
> Subject: Re: nutch fetched but no indexed
>
>
> > Hi Patrick,
> >
> >
)|...
>
>
>
>
> That's the only thing I can think of at first glance.
>
> Patrick
> -Original Message-
> From: 宫照 [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, July 23, 2008 11:27 PM
> To: nutch-user@lucene.apache.org
> Subject: nutch fetched but n
Hi everybody,
I face a problem when using nutch. I use nutch to crawl our intranet. It worked
well before, but recently I added some urls to crawl. These urls are
different from the normal ones. The new urls look like this:
http://compass.mydomain.com/go/247460034
there are many folders or documents under this url
Hi,
I have the same problem. Because there are some bugs in hadoop-0.12.2, I
want to switch to hadoop-0.17.0, but the api has changed, so we can't use it
directly. If you find a way to solve this problem, let me know.
Regards,
gong zhao
2008/7/15 kranthi reddy <[EMAIL PROTECTED]>:
> Hi,
>
> I am
>
> for example:
>
>
> <name>plugin.includes</name>
> <value>protocol-(httpclient|file)|urlfilter-(regex)|parse-(text|html|js|pdf|msword)|index-(basic)|query-(basic|site|url)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
>
>
>
>
> On Tue,
hi everybody,
I set up nutch-0.9, and I can search txt and html on the local system. Now I
want to search pdf and msword; can you tell me how to do that?
BR,
mingkong
11 matches