Hey Kevin,

You are right. It's around 30-40 TB for Google. But as far as Nutch is
concerned, I think Jack Yu is right.

Here is what he said, in case you did not receive it:
0.1 billion pages take about 1.5 TB
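As a quick sanity check, the figures in this thread can be combined in a back-of-envelope calculation. This is just arithmetic on the numbers quoted above (Jack Yu's 0.1 billion pages per 1.5 TB, and the 15-billion-page index estimate), not a measured Nutch result:

```python
# Back-of-envelope crawl storage estimate from the figures in this thread.
# Assumptions: decimal units (1 TB = 1e9 KB); per-page size derived from
# Jack Yu's figure of 0.1 billion pages per 1.5 TB.

pages_nutch = 0.1e9            # 0.1 billion pages
size_nutch_tb = 1.5            # ~1.5 TB for those pages
kb_per_page = size_nutch_tb * 1e9 / pages_nutch
print(f"~{kb_per_page:.0f} KB stored per page")          # ~15 KB

pages_whole_web = 15e9         # estimated index size from the thread
est_tb = pages_whole_web * kb_per_page / 1e9
print(f"whole-web estimate at that density: ~{est_tb:.0f} TB")  # ~225 TB
```

At ~15 KB per page, a 15-billion-page crawl would need roughly 225 TB, so the per-page density matters far more than the page count when sizing storage.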

Regards,
Gaurang

2009/10/4 kevin chen <kevinc...@bdsing.com>

>
> The estimated size of Google's index is 15 billion pages. So even at 1 KB
> per page, that comes to 15 TB. But I think the average page size is way
> more than 1 KB.
>
>
> On Sun, 2009-10-04 at 17:28 -0700, Gaurang Patel wrote:
> > All-
> >
> > I am a novice at using Nutch. Can anyone tell me the estimated size
> > (I suppose in TB) that will be required to store the crawled results
> > for the whole web? I want an estimate of the storage requirements for
> > my project, which uses the Nutch web crawler.
> >
> >
> >
> > Regards,
> > Gaurang Patel
>
>
