Sorry for the late reply, but thanks for your quick response, Doug. I really
appreciate it.
Any idea when nutch-0.8 will be officially released?
Thanks and Regards,
Pushpesh
On 1/10/06, Doug Cutting <[EMAIL PROTECTED]> wrote:
>
> Pushpesh Kr. Rajwanshi wrote:
> > Just wanted to confirm that this
Pushpesh Kr. Rajwanshi wrote:
Just wanted to confirm that this distributed crawl you
did using nutch version 0.7.1 or some other version? And was that a
successful distributed crawl using map reduce or some work around for
distributed crawl?
No, this is 0.8-dev. This was using in early Decembe
Hi Doug,
Thanks a lot for the time you took to write such a detailed and
informative reply. Just wanted to confirm that this distributed crawl you
did using nutch version 0.7.1 or some other version? And was that a
successful distributed crawl using map reduce or some work around for
d
Earl Cahill wrote:
Any chance you could walk through your implementation?
Like how the twenty boxes were assigned? Maybe
upload your confs somewhere, and outline what commands
you actually ran?
All 20 boxes are configured identically, running a Debian 2.4 kernel.
These are dual-processor box
+1
On Mon, 2006-01-02 at 13:39 -0800, Earl Cahill wrote:
> Any chance you could walk through your implementation?
> Like how the twenty boxes were assigned? Maybe
> upload your confs somewhere, and outline what commands
> you actually ran?
>
> Thanks,
> Earl
>
> --- Doug Cutting <[EMAIL PROTEC
Any chance you could walk through your implementation?
Like how the twenty boxes were assigned? Maybe
upload your confs somewhere, and outline what commands
you actually ran?
Thanks,
Earl
--- Doug Cutting <[EMAIL PROTECTED]> wrote:
> Pushpesh Kr. Rajwanshi wrote:
> > I want to know if anyone i
Pushpesh Kr. Rajwanshi wrote:
I want to know if anyone has successfully run a distributed crawl on
multiple machines involving millions of pages, and how hard it is to
do that. Do I just have to do some configuration and setup, or some
implementation as well?
I recently performed a
Hi there,
Thanks for the reply again. What volume of data are you crawling, and on how many
machines? Which version of nutch are you using: 0.7.1 or another? Actually,
it is working more or less fine, but I want to know how many resources
(machines) I will need to crawl 20,000 websites in a day. If
Hi
I have had no problem doing distributed crawl.
On 12/28/05, Pushpesh Kr. Rajwanshi <[EMAIL PROTECTED]> wrote:
> Hi NN,
>
> Thanks for replying. Actually, I wanted to know whether distributed crawling in
> nutch works well, and with how much success. I have succeeded in setting
> up distributed
Hi NN,
Thanks for replying. Actually, I wanted to know whether distributed crawling in
nutch works well, and with how much success. I have succeeded in setting
up a distributed crawl on 2 machines (1 master and 1 slave), but when I try
with more than two machines there seems to be a problem, especially while I
Have you tried the following:
http://wiki.apache.org/nutch/HardwareRequirements
and
http://wiki.apache.org/nutch/
There is no quick answer if one is planning to crawl millions of
pages. Read, try, and read again.
On 12/28/05, Pushpesh Kr. Rajwanshi <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I want to know if
Hi,
I want to know if anyone has successfully run a distributed crawl on
multiple machines involving millions of pages, and how hard it is to
do that. Do I just have to do some configuration and setup, or some
implementation as well?
Also can anyone tell me if i want to crawl around 2