Thanks for the pointer.
It does the job perfectly!
-----Original Message-----
From: Dennis Kubes [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, August 29, 2006 7:59 AM
To: nutch-dev@lucene.apache.org
Subject: Re: Hadoop job question
Although it is kinda hacking the system you may be able to do it in
Stefan Groschupf wrote:
> Hi,
>
>> + You may have problems with some imports in parse-mp3 and parse-rtf
>> plugins. Because of incompatibility with the Apache license, they were
>> left out of the sources. You can find them here:
>> +
>> + http://nutch.cvs.sourceforge.net/nutch/nutch/src/plugin/parse-mp3/lib/
Hi,
> + You may have problems with some imports in the parse-mp3 and
> parse-rtf plugins. Because of incompatibility with the Apache license,
> they were left out of the sources. You can find them here:
> +
> + http://nutch.cvs.sourceforge.net/nutch/nutch/src/plugin/parse-mp3/lib/
> +
> + http://nutch.cvs.
Mladen Adamovic wrote:
> Hi!
>
> I want to get more insight into various search engine algorithms. I
> have a broad knowledge of standard data structures & algorithms
> (hash values, trees, graphs, etc.). I thought that Lucene would be
> a good place to start looking for information, and indeed I've f
Hi!
I want to get more insight into various search engine algorithms. I have
a broad knowledge of standard data structures & algorithms (hash values,
trees, graphs, etc.). I thought that Lucene would be a good place to
start looking for information, and indeed I've found some decent
information at N
Although it is kinda hacking the system you may be able to do it in the
map method by writing a custom MapRunner and having an object that lives
in the MapRunner but that you set into each mapper instance.
Dennis
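
A minimal sketch of the pattern described above, in case it helps to see it laid out. It deliberately does not use the Hadoop API: SketchMapper, SharedState, CountingMapper and SketchMapRunner are all hypothetical stand-ins invented for illustration, and the uppercase transformation is just a placeholder for real map logic. The only point shown is that the shared object is owned by the runner and handed to the mapper through a setter before any record is mapped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical stand-in for a mapper; NOT Hadoop's Mapper interface.
interface SketchMapper {
    void map(String key, String value, BiConsumer<String, String> output);
}

// Hypothetical object that lives in the runner for the whole task.
final class SharedState {
    private long recordsSeen = 0;

    void recordSeen() { recordsSeen++; }

    long recordsSeen() { return recordsSeen; }
}

// A mapper that receives the runner-owned object through a setter.
final class CountingMapper implements SketchMapper {
    private SharedState shared;

    void setShared(SharedState shared) { this.shared = shared; }

    @Override
    public void map(String key, String value, BiConsumer<String, String> output) {
        shared.recordSeen();                       // state survives across map() calls
        output.accept(key, value.toUpperCase());   // placeholder map logic
    }
}

// Hypothetical runner; NOT Hadoop's MapRunner. It owns the shared object,
// sets it into the mapper, then drives the record loop itself.
final class SketchMapRunner {
    private final SharedState shared = new SharedState();

    List<String> run(List<String[]> records) {
        CountingMapper mapper = new CountingMapper();
        mapper.setShared(shared);
        List<String> out = new ArrayList<>();
        for (String[] record : records) {
            mapper.map(record[0], record[1], (k, v) -> out.add(k + "=" + v));
        }
        return out;
    }

    SharedState shared() { return shared; }
}
```

In a real job the runner would be whatever class the framework lets you plug in for driving the map loop; the wiring above is an assumption about how that could look, not a description of the actual classes.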
HUYLEBROECK Jeremy RD-ILAB-SSF wrote:
> I currently have a MR task that reads a Se
Hi Stefan,
Yes, you're right. The index built without deduping does not have the first
instance of the problem (though of course, it's also filled with duplicates,
so it has other problems). It still shows the problems with missing
redirects, though this could be something else (will investigate
Hi,
I am making some changes to CrawlDatum, but there are some things I don't
quite understand. My idea is to add an int hop field to CrawlDatum and set
it to 0 in the Injector. Then, after fetching, the hop of each newly found
URL can be calculated as the parent URL's hop + 1. I am trying to find
where new URLs are added to the webDB. If somebody could
expl
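
A rough sketch of the hop idea described above, to make the intended bookkeeping concrete. HopDatum is a hypothetical, simplified stand-in and is not Nutch's CrawlDatum; the method names injected and outlink are invented for illustration. It only shows the arithmetic: injected URLs start at hop 0, and every URL discovered from a page gets the parent's hop plus one.

```java
// Hypothetical, simplified stand-in; NOT Nutch's CrawlDatum.
final class HopDatum {
    final String url;
    final int hop;

    private HopDatum(String url, int hop) {
        this.url = url;
        this.hop = hop;
    }

    // What the injector step would do under this idea: seed URLs start at hop 0.
    static HopDatum injected(String url) {
        return new HopDatum(url, 0);
    }

    // What the update step would do for a URL found on this page:
    // the child's hop is the parent's hop + 1.
    HopDatum outlink(String childUrl) {
        return new HopDatum(childUrl, hop + 1);
    }
}
```

Where this would hook into the real injector and update code is exactly the open question in the message, so the sketch makes no claim about that.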
Hi Doug,
I'm pretty sure that your problem is related to the deduping of your
index.
In general, the hash of the content of a page is used as the key for the
dedup tool.
We also ran into the forwarding problem in another case.
https://issues.apache.org/jira/browse/NUTCH-353
So maybe we should
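
To illustrate the general point about content hashes being used as the dedup key, here is a small sketch. It is not the Nutch dedup tool: ContentDedupSketch and its methods are hypothetical, MD5 is an assumption chosen only because it ships with the JDK, and "keep the first URL seen" is an arbitrary policy picked for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of hash-keyed deduplication; NOT Nutch code.
final class ContentDedupSketch {

    // Hex-encoded MD5 digest of the page content (hash choice is an assumption).
    static String contentHash(String content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(content.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // pages maps url -> content. The result maps content hash -> the first
    // URL seen with that content; later URLs with identical content are dropped.
    static Map<String, String> dedup(Map<String, String> pages) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> page : pages.entrySet()) {
            kept.putIfAbsent(contentHash(page.getValue()), page.getKey());
        }
        return kept;
    }
}
```

With a key like this, any two URLs that serve byte-identical content collapse into a single entry, whichever URL happens to be kept.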