On 3/25/2004 11:50 PM, Glenn Linderman wrote:

For sorted lists of text, like dictionaries, one quick-to-decode technique that saves a fair amount of space is to start each string with the number of leading bytes it shares with the previous string, and then append the remainder of the string.

In other words, the list of words

though
thought
thoughtful

would reduce to

0though
6t
7ful
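
This scheme is sometimes called front coding or incremental encoding. As a rough illustration only, here is a minimal sketch of the encode and decode steps in Python. The function names are made up for this example and do not come from any existing module. It counts characters rather than bytes, and it keeps the count and the remainder as a pair instead of gluing them into one string, on the assumption that a real format would need some way to tell where the count ends.

```python
def shared_prefix_length(a, b):
    """Return how many leading characters a and b have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def front_encode(words):
    """Turn a sorted list of words into (shared_count, remainder) pairs."""
    pairs = []
    previous = ""
    for word in words:
        shared = shared_prefix_length(previous, word)
        pairs.append((shared, word[shared:]))
        previous = word
    return pairs


def front_decode(pairs):
    """Rebuild the original word list from (shared_count, remainder) pairs."""
    words = []
    previous = ""
    for shared, remainder in pairs:
        word = previous[:shared] + remainder
        words.append(word)
        previous = word
    return words
```

Decoding only ever needs the previous word, which is why the technique is quick to decode when reading the list front to back.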

I seem to recall once stumbling across a Perl module that does this sort of thing, but I'm not getting the right keywords in my searches to find it again. Or else I'm searching in the wrong places (CPAN, Google).

Anyone know where such a module might be hiding?

Hi Glenn,


I think the term you're thinking of is stemming. Maybe Lingua-Stem <http://search.cpan.org/dist/Lingua-Stem/> is what you're looking for?

Regards,
Randy.


_______________________________________________ Perl-Win32-Users mailing list [EMAIL PROTECTED] To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
