> > memory and disk space...what about a quad PIII/550 Xeon with
> > 2GB RAM with a 5TB RAID array on a T3.  Something such as that would take
> > away a good amount of the theoretical hindrances.  With those out of the
> > way, would it be possible to index the web?
>
> I still wouldn't recommend it. It's only recently that we've been
> receiving feedback on scaling to huge (i.e. 500,000+ URLs) indexes. In
> particular, the 3.1.x series requires the htmerge phase and that requires
> sorting the word database. For even modest-sized databases, it can take an
> enormous amount of RAM to sort.

I agree 110%. I have a 500,000+ URL search engine running on a dual PIII
500 box with 1 GB of RAM, running FreeBSD with 8 IBM 10,000 RPM LVD drives
in a vinum software RAID 0 array (60MB/sec under bonnie). A lot of searches
are quite slow (think "john smith")...

I'm thinking about building a 10-box cluster, with a small search engine on
each box feeding its raw results up to another box that merges the sorted
results for final display.
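
A rough sketch of what the merging box might do, assuming each search box
returns its hits already sorted by descending score as (score, url) pairs.
The function name and the tuple layout are made up for illustration and are
not part of ht://Dig:

```python
import heapq
from itertools import islice
from typing import Iterable, List, Tuple

# Hypothetical hit format: (relevance score, URL).
Hit = Tuple[float, str]


def merge_shard_results(shard_results: Iterable[Iterable[Hit]],
                        limit: int = 10) -> List[Hit]:
    """Merge per-box result lists into one list of the best `limit` hits.

    Every input list must already be sorted by descending score.
    heapq.merge walks the lists lazily, so only the first `limit`
    hits are ever pulled, not the full result set of every box.
    """
    merged = heapq.merge(*shard_results,
                         key=lambda hit: hit[0],
                         reverse=True)
    return list(islice(merged, limit))


if __name__ == "__main__":
    box_a = [(0.9, "http://a.example/1"), (0.4, "http://a.example/2")]
    box_b = [(0.8, "http://b.example/1"), (0.1, "http://b.example/2")]
    for score, url in merge_shard_results([box_a, box_b], limit=3):
        print(score, url)
```

Because the merge is lazy, the front-end box only does work proportional to
the number of results it actually displays, plus one comparison heap entry
per search box.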

Randy Winch

