> Very long documents are useful for testing for anomalies, but they're
> not so useful as retrieved documents, nor typical of applications.
That's what I thought, too. I'm kinda curious to see how Gutenberg compares to Wikipedia for things like merge policy, in particular by-docs vs. by-bytes. But while it's a curiosity, I don't see that it has all that much practical value, so I'm not sure I can get it to the top of the to-do list.