According to Sleepycat Software:
>> There are actually more than you'd imagine. Even my "lowly" student
>> organization here at Williams College has somewhere around 80,000
>> documents right now. The school itself (around 2,000 undergraduates) has
>> upwards of 150,000 documents.
>
>Well, yeah, but we're still talking a problem that can be solved
>for $10.
Which still leaves the issue of performance...
>> As you might expect, people indexing lots of documents are also willing to
>> invest in development. Currently the word index is on the order of the
>> size of the documents indexed--it's quite clear that compression would
>> help. This helps large-scale users get around OS limits on file sizes
>> (e.g. users bumping into the 2GB limit), as well as smaller-scale users
>> since they'd use less disk and/or get better transfer time. Benchmarking
>> indicates that disk latency is still a problem for us.
>
>I don't mean to be a jerk, honest, but the obvious answer here is
>to switch to a different operating system. There are lots of free
>OS releases that support large filesystems. Memory is cheap, disk
>is cheap, development is very, very expensive. I expect to support
>100+GB caches in Berkeley DB this fall and we already support large
>databases (I'm seeing 1TB databases in the field, now). If you
>wait 12 months, this problem will almost certainly go away, and
>it's unclear how much of a development effort you can deploy in
>under 12 months.
Switching operating systems is not an option for quite a number
of people. Larger disks still leave the problem of disk-transfer
time, which degrades the performance of *any* database
application when the database is larger than necessary. You will
always gain speed from a better, more compact database
organization.
So money is not the issue, and neither is hardware.
And if we "wait 12 months", the problem won't go away either,
because by then we will have to deal with even larger document
collections.
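To put rough numbers on the transfer-time point, here is a minimal
sketch. The 20 MB/s sustained throughput and the 50% compression
ratio are purely hypothetical assumptions chosen for illustration;
neither is a measurement of ht://Dig or Berkeley DB, and the helper
name scan_seconds is made up for this example. It models sequential
transfer only and ignores seek latency and decompression cost.

```python
MB = 1024 * 1024
GB = 1024 * MB


def scan_seconds(size_bytes: int, throughput_bytes_per_s: float) -> float:
    """Time to read size_bytes sequentially at a sustained throughput."""
    if throughput_bytes_per_s <= 0:
        raise ValueError("throughput must be positive")
    return size_bytes / throughput_bytes_per_s


# Assumptions for illustration only, not measured values.
ASSUMED_THROUGHPUT = 20 * MB   # hypothetical sustained disk throughput
ASSUMED_RATIO = 0.5            # hypothetical compressed/uncompressed size

full_index = 2 * GB                               # index at the 2GB file limit
compressed_index = int(full_index * ASSUMED_RATIO)

full_time = scan_seconds(full_index, ASSUMED_THROUGHPUT)
compressed_time = scan_seconds(compressed_index, ASSUMED_THROUGHPUT)

print(f"uncompressed: {full_time:.1f} s, compressed: {compressed_time:.1f} s")
```

Under this simple model the transfer time scales linearly with the
size of the database, so a faster disk changes the constant but a
smaller index still reads proportionally faster on the same hardware.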
just some thoughts,
Torsten
--
InWise - Wirtschaftlich-Wissenschaftlicher Internet Service GmbH
Waldhofstraße 14 Tel: +49-4101-403605
D-25474 Ellerbek Fax: +49-4101-403606
E-Mail: [EMAIL PROTECTED] Internet: http://www.inwise.de