The model size depends on the number of features you have: each feature
is stored as a String object in memory, together with its weights, which are
stored as doubles.
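
For a very rough back-of-envelope check (every number below is just an
assumption, not a measurement of your model), something like this gives the
order of magnitude:

    // Rough sizing sketch -- feature count, outcome count and average
    // feature string length are made-up assumptions, not real figures.
    public class ModelSizeEstimate {
        public static void main(String[] args) {
            long numFeatures = 10000000L;   // assumed number of distinct features
            int numOutcomes = 5;            // assumed number of outcomes
            int avgChars = 20;              // assumed average feature string length

            // Assumed per-String cost: ~48 bytes of object/array overhead plus
            // 2 bytes per char, ignoring the HashMap entry overhead on top.
            long stringBytes = numFeatures * (48L + 2L * avgChars);
            // Assumed one 8-byte double per feature/outcome weight.
            long weightBytes = numFeatures * numOutcomes * 8L;

            System.out.printf("roughly %.1f GB%n", (stringBytes + weightBytes) / 1e9);
        }
    }

Actual overhead depends on the JVM and on how the model stores things
internally, so treat it only as a sanity check.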

How much training data do you have? How many features and outcomes does the data have?

Jörn

On 05/28/2014 12:32 AM, William Colen wrote:
Usually you don't need a huge training data set to have an effective model.
You can measure the tradeoff between the training data set size, the cutoff,
and the algorithm using the 10-fold cross-validation tool included in the
OpenNLP command line interface. You would need to run different experiments,
changing these parameters. In your case, not only the F-measure is important,
but also the model size.
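
If you would rather drive the same experiment from Java instead of the
command line, a minimal sketch could look like the following. It assumes the
1.5.x-era API and a hypothetical training file called ner.train in OpenNLP's
name finder format; constructor signatures differ a bit between releases:

    import java.io.FileInputStream;
    import java.io.InputStreamReader;

    import opennlp.tools.namefind.NameSample;
    import opennlp.tools.namefind.NameSampleDataStream;
    import opennlp.tools.namefind.TokenNameFinderCrossValidator;
    import opennlp.tools.util.ObjectStream;
    import opennlp.tools.util.PlainTextByLineStream;
    import opennlp.tools.util.TrainingParameters;

    public class CrossValidate {
        public static void main(String[] args) throws Exception {
            // Training parameters to vary between experiments (cutoff, algorithm, ...).
            TrainingParameters params = new TrainingParameters();
            params.put(TrainingParameters.CUTOFF_PARAM, "5");
            params.put(TrainingParameters.ITERATIONS_PARAM, "100");

            // ner.train is a hypothetical file in the name finder training format.
            ObjectStream<NameSample> samples = new NameSampleDataStream(
                new PlainTextByLineStream(
                    new InputStreamReader(new FileInputStream("ner.train"), "UTF-8")));

            TokenNameFinderCrossValidator cv =
                new TokenNameFinderCrossValidator("en", null, params);
            cv.evaluate(samples, 10); // 10-fold cross-validation
            System.out.println(cv.getFMeasure());
        }
    }

The TokenNameFinderCrossValidator command line tool does the same thing
without any code.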


2014-05-27 18:59 GMT-03:00 Jeffrey Zemerick <[email protected]>:

I do not, William. I assumed it was due to the large training data set. I
will look into the things you mentioned. Thanks!


On Tue, May 27, 2014 at 3:35 PM, William Colen <[email protected]> wrote:
Do you know why your model is so big?

You can reduce its size by using a higher cutoff or by trying Perceptron.
You can also try using an entity dictionary, which keeps the algorithm from
storing the entities in the form of features.
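
As a starting point, the training parameters you would vary might look like
this (the values are only examples; the entity dictionary is wired in
separately through the feature generator configuration and is not shown here):

    import opennlp.tools.util.TrainingParameters;

    public class SmallerModelParams {
        public static void main(String[] args) {
            TrainingParameters params = new TrainingParameters();
            // A higher cutoff drops features seen fewer times than the
            // threshold; try values above the default of 5.
            params.put(TrainingParameters.CUTOFF_PARAM, "10");
            // Select the perceptron trainer instead of the default maxent (GIS) one.
            params.put(TrainingParameters.ALGORITHM_PARAM, "PERCEPTRON");
            params.put(TrainingParameters.ITERATIONS_PARAM, "100");

            // These parameters would then be handed to the name finder training
            // call, or to the command line trainer through a params file.
            System.out.println(params.getSettings());
        }
    }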

I am not aware of a way to avoid loading it into memory.

Regards,
William

2014-05-27 16:11 GMT-03:00 Jeffrey Zemerick <[email protected]>:

Hi Users,

Is anyone aware of a way to load a TokenNameFinder model and use it without
storing the entire model in memory? My models take up about 6 GB of memory.
I see in the code that the model files are unzipped and put into a HashMap.
Is it possible to store the data structure off-heap somewhere?
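
For reference, this is roughly how I am loading and using the model today,
which is what pulls everything onto the heap (the model path and sentence
below are just placeholders):

    import java.io.File;

    import opennlp.tools.namefind.NameFinderME;
    import opennlp.tools.namefind.TokenNameFinderModel;
    import opennlp.tools.util.Span;

    public class LoadModel {
        public static void main(String[] args) throws Exception {
            // Loading the model reads the zipped model file and materializes
            // its contents (feature strings and weights) in the JVM heap.
            TokenNameFinderModel model =
                new TokenNameFinderModel(new File("my-ner-model.bin")); // placeholder path
            NameFinderME finder = new NameFinderME(model);

            String[] tokens = { "Jeff", "lives", "in", "Ohio", "." }; // placeholder sentence
            Span[] names = finder.find(tokens);
            for (Span span : names) {
                System.out.println(span);
            }
        }
    }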

Thanks,
Jeff

