> > memory and disk space...what about a quad PIII/550 Xeon with
> > 2GB RAM and a 5TB RAID array on a T3? Something like that would take
> > away a good amount of the theoretical hindrances. With those out of the
> > way, would it be possible to index the web?
>
> I still wouldn't recommend it. It's only recently that we've been
> receiving feedback on scaling to huge (i.e. 500,000+ URLs) indexes. In
> particular, the 3.1.x series requires the htmerge phase and that requires
> sorting the word database. For even modest-sized databases, it can take an
> enormous amount of RAM to sort.
I agree 110%. I have a 500,000+ URL search engine running on a dual PIII
500 box with 1 GB of RAM, running FreeBSD with 8 IBM 10,000 RPM LVD drives
in a vinum software RAID 0 array (60 MB/sec under bonnie). A lot of searches
are quite slow (think "john smith").
I'm thinking about building a 10-box cluster with a small search running on
each box, feeding raw data up to another box that merges the sorted results
for final display.
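
Roughly, the merge step on the front box could look like the sketch below.
This is only an illustration in Python, not anything ht://Dig provides: the
function name merge_shard_results, the (score, url) tuple layout, and the
example URLs are all made up, and it assumes each box already returns its
hits sorted by descending score.

```python
import heapq
from itertools import islice


def merge_shard_results(shard_results, limit=10):
    """Merge per-box hit lists into one ranked list.

    Assumption: every list in shard_results holds (score, url) tuples
    that are already sorted by descending score on the box they came from.
    """
    # heapq.merge streams the inputs lazily, so the front box never has to
    # hold or re-sort the full result set; reverse=True matches the
    # descending order of the inputs.
    merged = heapq.merge(*shard_results, key=lambda hit: hit[0], reverse=True)
    # Only the first `limit` hits are needed for the results page.
    return list(islice(merged, limit))


# Hypothetical output from three of the search boxes.
shard_a = [(0.92, "http://a.example/1"), (0.40, "http://a.example/2")]
shard_b = [(0.88, "http://b.example/1"), (0.35, "http://b.example/2")]
shard_c = [(0.95, "http://c.example/1")]

top = merge_shard_results([shard_a, shard_b, shard_c], limit=3)
for score, url in top:
    print(f"{score:.2f} {url}")
```

Since only the top few hits get displayed, each box would only need to send
its own best N results, which keeps the traffic to the merging box small.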
Randy Winch