On 04/10/2013 03:59 PM, William Colen wrote:
Yes, good point. We discussed it already, and as far as I remember the
conclusion was that it is a good idea. My issue with the XML implementation
is its load time, and it requires a lot of memory for a big dictionary.
Let me have a closer look tonight.
Is the memory issue caused by the fact that the dictionaries (e.g.
POSDictionary) use
the Java HashMap with String keys and values?
Did you implement your own POSDictionary for your thesis?
The current dictionary package has an API to read a dictionary from and
serialize it to the
XML format. That could be changed to a binary format, which
could be much faster to load.
But as far as I understand, the main issue we have is the
representation of the dictionary in memory,
not its serialization.
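
To make the in-memory question concrete, here is a rough sketch of one
possible direction: keep the HashMap, but store each distinct tag set only
once and let all words with the same tags share it. The class and method
names (CompactTagDictionary, put, getTags, distinctTagSets) are made up for
illustration and are not the existing POSDictionary API; whether this helps
enough would have to be measured on a real dictionary.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only; not the OpenNLP POSDictionary API. */
public class CompactTagDictionary {

    // One canonical immutable list per distinct tag set, so that e.g.
    // [NN, VB] is allocated once no matter how many words carry it.
    private final Map<List<String>, List<String>> sharedTagSets = new HashMap<>();

    // Word -> reference to the shared tag list.
    private final Map<String, List<String>> wordToTags = new HashMap<>();

    /** Adds or replaces the tags for a word, reusing an equal tag list if one exists. */
    public void put(String word, String... tags) {
        List<String> key = List.of(tags); // immutable copy (Java 9+)
        List<String> canonical = sharedTagSets.computeIfAbsent(key, k -> k);
        wordToTags.put(word, canonical);
    }

    /** Returns the immutable tag list for the word, or null if the word is unknown. */
    public List<String> getTags(String word) {
        return wordToTags.get(word);
    }

    /** Number of distinct tag lists actually held in memory. */
    public int distinctTagSets() {
        return sharedTagSets.size();
    }
}
```

With this layout the per-word cost is one map entry plus the key String;
the tag Strings and their containers are paid for once per distinct tag
set rather than once per word.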
Jörn