RE: Cannot access svn.apache.org -- mirror?

2008-11-14 Thread Dan Segel
Please remove me from these emails!

-Original Message-
From: Kevin Peterson [EMAIL PROTECTED]
Sent: Friday, November 14, 2008 7:33 PM
To: core-user@hadoop.apache.org
Subject: Cannot access svn.apache.org -- mirror?

I'm trying to import Hadoop Core into our local repository using piston
( http://piston.rubyforge.org/index.html ).

I can't seem to access svn.apache.org, though. I've also tried the EU
mirror. No errors, just an eventual timeout. Traceroute fails at
corv-car1-gw.nero.net. I got the same errors a couple of weeks ago, but
assumed they were just temporary downtime. I have found some messages
from earlier this year about a similar problem, where some people could
access it fine and others just couldn't connect. I'm able to access it
from a remote shell account, but not from my machine.
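Since traceroute dies mid-path, it may help to separate DNS resolution
from TCP reachability. Below is a minimal probe sketch in Python; the
hostname comes from the message above, while the ports and timeout are
illustrative assumptions on my part:

# Minimal reachability probe: distinguishes a DNS failure from a TCP timeout.
# Host is from the thread; ports and timeout are illustrative assumptions.
import socket

HOST = "svn.apache.org"

for port in (80, 443):
    try:
        addr = socket.gethostbyname(HOST)  # DNS step
        with socket.create_connection((addr, port), timeout=10):  # TCP step
            print(f"{HOST}:{port} ({addr}) reachable")
    except socket.timeout:
        print(f"{HOST}:{port} timed out (matches the symptom in the thread)")
    except OSError as exc:
        print(f"{HOST}:{port} failed: {exc}")

If DNS resolves but the connection times out, that points at a routing
or filtering problem between your network and the Apache hosts, which
would be consistent with the traceroute dying at corv-car1-gw.nero.net.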

Has anyone been able to work around this? Is there any mirror of the
Hadoop repository?



Gigablast.com search engine, 10 billion pages!!!

2008-06-05 Thread Dan Segel
Our ultimate goal is to basically replicate the gigablast.com search engine.
They claim to have fewer than 500 servers holding 10 billion pages,
indexed, spidered, and updated on a routine basis...  I am looking at
500 million pages indexed per node, with a total of 20 nodes.
Each node will feature two quad-core processors, 4 TB of disk (RAID 5), and
32 GB of RAM.  I believe this can be done; however, how many searches per
second do you think would be realistic in this setup?  We are aiming for
roughly 25 searches per second spread across the 20 nodes... I could
really use some advice with this one.

Thanks,
D. Segel
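
The sizing question above is mostly arithmetic, so a small sanity check
may help. Here is a minimal sketch in Python, using only the figures
quoted in the post and assuming a fully sharded index where every query
fans out to all nodes (my assumption, not something Gigablast has
confirmed):

# Back-of-the-envelope check of the figures in the post above. The node
# counts and page counts come from the message; the fan-out model is an
# assumption for illustration.
NODES = 20
PAGES_PER_NODE = 500_000_000

total_pages = NODES * PAGES_PER_NODE
print(f"total index size: {total_pages:,} pages")  # 10,000,000,000

TARGET_QPS = 25  # cluster-wide searches per second

# With a sharded index, each query fans out to every node, so each node
# must answer the full 25 qps, not 25 / 20.
per_node_qps = TARGET_QPS
budget_ms = 1000 / per_node_qps  # serial time budget per query per node
print(f"per-node load with full fan-out: {per_node_qps} qps "
      f"(~{budget_ms:.0f} ms per query if handled one at a time)")

Under that fan-out assumption, per-node throughput is the binding
constraint: 25 qps cluster-wide means every node sustains 25 qps over
its 500-million-page shard, with roughly a 40 ms serial budget per query
(more headroom if the eight cores per node serve queries concurrently).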

