Stefan Groschupf wrote:

Let's do some calculations:
2 billion pages (Google has 8 billion):
100 kilobytes * 2 000 000 000 = 186.264515 terabytes per month
1 * 100 MBit/s line for one month = 33.1776 TB
186 / 33 = 5.6
The cheapest offer for 100 MBit/s I found was 1000 USD per month.
So you pay 6000 USD per month just for crawling, without a single user query.
If you have _only_ 1 million queries per day, that is another 3 TB of traffic.
Math.round(idea) = 20,000 USD per month, assuming all servers are in the same location.
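
The estimate above can be replayed as a small Python sketch. The inputs (2 billion pages, 100 kilobytes per page, a 100 MBit/s line at 1000 USD per month, 1 million queries per day) come from the message; the 30-day month, the use of binary units throughout, the 100 kilobytes per query response, and all function names are assumptions made here for illustration. With consistent binary units the line carries somewhat less per month than the figure quoted above, so the line count comes out a little higher; the order of magnitude stays the same.

```python
import math

# Figures taken from the message.
PAGES = 2_000_000_000            # pages fetched per month
PAGE_BYTES = 100 * 1024          # 100 kilobytes per page (binary, assumption)
LINE_BITS_PER_SECOND = 100 * 10**6   # one 100 MBit/s line
LINE_COST_USD = 1000             # cheapest monthly offer quoted
QUERIES_PER_DAY = 1_000_000

# Assumptions made for this sketch, not stated in the message.
DAYS_PER_MONTH = 30
RESPONSE_BYTES = 100 * 1024      # traffic per query, inferred from "3 TB"
TIB = 1024 ** 4                  # bytes per tebibyte


def crawl_volume_tib(pages=PAGES, page_bytes=PAGE_BYTES):
    """Monthly crawl traffic in TiB."""
    return pages * page_bytes / TIB


def line_capacity_tib(bits_per_second=LINE_BITS_PER_SECOND,
                      days=DAYS_PER_MONTH):
    """Traffic one fully saturated line can carry in a month, in TiB."""
    seconds = days * 24 * 3600
    return bits_per_second / 8 * seconds / TIB


def query_volume_tib(queries_per_day=QUERIES_PER_DAY,
                     response_bytes=RESPONSE_BYTES,
                     days=DAYS_PER_MONTH):
    """Monthly query traffic in TiB."""
    return queries_per_day * days * response_bytes / TIB


def lines_needed(volume_tib, capacity_tib):
    """Whole lines required, since a line cannot be rented fractionally."""
    return math.ceil(volume_tib / capacity_tib)


if __name__ == "__main__":
    crawl = crawl_volume_tib()
    capacity = line_capacity_tib()
    lines = lines_needed(crawl, capacity)
    print(f"crawl traffic:  {crawl:.1f} TiB per month")
    print(f"line capacity:  {capacity:.1f} TiB per month")
    print(f"lines needed:   {lines}")
    print(f"crawl cost:     {lines * LINE_COST_USD} USD per month")
    print(f"query traffic:  {query_volume_tib():.1f} TiB per month")
```

This covers traffic only. Changing PAGES or PAGE_BYTES shows how quickly the bill moves with crawl size.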


yes, that's quite a lot of money and I appreciate it as food for thought


(Please note there were already some offers of servers and bandwidth from companies as well as from non-profit organizations.)


have you collected these offers somewhere?


We are talking only about traffic here: no servers, no power for the servers, no server maintenance, etc.


ok


If you are now thinking about a P2P model, see the old discussions about P2P search engines in the SourceForge mail archive.


ok, will check them

You might think about the semantic web, but from my point of view it will be years before that is realistic.


right, but one has to start somewhere ;-)

Thanks

Michi


Stefan





On 17.03.2005 at 00:09, Michael Wechner wrote:

Hi

I was recently thinking that it would be fun to start a non-profit organization
in order to run Nutch as a really "transparent and open" search engine, very
similar to, for instance, Google, but really focusing only on search.


Thanks to all Nutch devs the software is there or it's getting there, but it would be a nice challenge to get the infrastructure founded on a non-profit basis. I can only speak for myself, but I would be happy to provide a dedicated root server and appropriate bandwidth (whatever that means). And I am sure many others would join in order to challenge, for instance, Google. 1000 servers is quite a lot of money, but one server is affordable for all kinds of people and companies.


I am aware that servers are not the only thing needed, but I would be interested in what the community thinks about such an "infrastructure" project.

Thanks

Michi

--
Michael Wechner
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
[EMAIL PROTECTED]                        [EMAIL PROTECTED]



---------------------------------------------------------------
company:        http://www.media-style.com
forum:        http://www.text-mining.org
blog:            http://www.find23.net







_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
