On 04/26/2013 09:06 AM, Svetoslav Marinov wrote:
I'm wondering what the max size (if such a limit exists) is for training a NER model. I have a corpus of 2,600,000 sentences annotated with just one category, 310 MB in size. However, the training never finishes: with 8 GB of memory it resulted in a Java out-of-memory exception, and when I increased it to 16 GB it just died with no error message.
Do you use the command line interface or the API for the training? At which stage of the training did you get the out of memory exception? Where did it just die when you used 16 GB of memory (maybe do a jstack)?

Jörn
