+1 for better dictionaries.

BTW, on a related note, this is an awesome book:

http://www.amazon.com/Algorithms-Strings-Trees-Sequences-Computational/dp/0521585198

On Wed, Jul 6, 2011 at 9:41 AM, Jörn Kottmann <[email protected]> wrote:

> On 7/6/11 4:38 PM, [email protected] wrote:
>
>> but it also consumes less memory after loading. This LGPL dictionary
>> library uses an FSA data structure that requires less memory than a
>> Hashtable to store 500k words, and it is also fast enough at runtime.
>>
>
> Yeah, it would be nice to have a better dictionary in OpenNLP. We also
> discussed using Bloom filters, which I believe might be good enough
> for feature generation in many cases anyway.
>
> Jörn
>
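
To make the Bloom filter idea concrete, here is a rough sketch in Java of
what a membership-only dictionary could look like. The class name
BloomDictionary, its methods, and the sizing parameters are all
hypothetical and not part of OpenNLP. The hashing (two seeded FNV-1a
hashes combined by double hashing) is just one reasonable choice, not
something anyone in this thread proposed.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/**
 * Hypothetical sketch of a Bloom-filter-backed word list.
 * Not an OpenNLP class; names and parameters are assumptions.
 */
final class BloomDictionary {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    BloomDictionary(int numBits, int numHashes) {
        if (numBits <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("numBits and numHashes must be positive");
        }
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Seeded 32-bit FNV-1a hash over the UTF-8 bytes of a word.
    private static int fnv1a(byte[] data, int seed) {
        int h = 0x811C9DC5 ^ seed;
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= 0x01000193;
        }
        return h;
    }

    // Double hashing: the i-th bit position is derived from two base hashes.
    private int index(int h1, int h2, int i) {
        return Math.floorMod(h1 + i * h2, numBits);
    }

    void add(String word) {
        byte[] data = word.getBytes(StandardCharsets.UTF_8);
        int h1 = fnv1a(data, 0);
        int h2 = fnv1a(data, 0x9E3779B9);
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(h1, h2, i));
        }
    }

    /** May return true for a word that was never added, but never false for one that was. */
    boolean mightContain(String word) {
        byte[] data = word.getBytes(StandardCharsets.UTF_8);
        int h1 = fnv1a(data, 0);
        int h2 = fnv1a(data, 0x9E3779B9);
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(h1, h2, i))) {
                return false;
            }
        }
        return true;
    }
}
```

The trade-off is that the filter stores only bits, so it cannot list its
entries or return anything attached to a word, and it will occasionally
answer "yes" for a word that was never added. For an "is this token in
the dictionary" feature that may be acceptable, and the false-positive
rate can be lowered by giving it more bits per word.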



-- 
Jason Baldridge
Assistant Professor, Department of Linguistics
The University of Texas at Austin
http://www.jasonbaldridge.com
http://twitter.com/jasonbaldridge
