I prefer the API as it gives me more flexibility and fits the overall
architecture of our components. But here is part of my set-up:

Cutoff: 6
Iterations: 200
A CustomFeatureGenerator looking at the 4 previous and 2 subsequent
tokens.
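
For reference, a rough sketch of how this can be wired up via the API
(OpenNLP 1.5.x). This is not my actual code: the file name, the language
code "en", and the plain TokenFeatureGenerator standing in for my
CustomFeatureGenerator are just placeholders.

import java.io.FileInputStream;
import java.nio.charset.Charset;
import java.util.Collections;

import opennlp.tools.namefind.NameFinderME;
import opennlp.tools.namefind.NameSample;
import opennlp.tools.namefind.NameSampleDataStream;
import opennlp.tools.namefind.TokenNameFinderModel;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;
import opennlp.tools.util.TrainingParameters;
import opennlp.tools.util.featuregen.AdaptiveFeatureGenerator;
import opennlp.tools.util.featuregen.CachedFeatureGenerator;
import opennlp.tools.util.featuregen.TokenFeatureGenerator;
import opennlp.tools.util.featuregen.WindowFeatureGenerator;

public class NerTrainingSketch {

    public static void main(String[] args) throws Exception {
        // Training data in the OpenNLP name finder format ("train.txt" is a placeholder).
        ObjectStream<String> lines = new PlainTextByLineStream(
                new FileInputStream("train.txt"), Charset.forName("UTF-8"));
        ObjectStream<NameSample> samples = new NameSampleDataStream(lines);

        // Cutoff 6 and 200 iterations, as in the setup above.
        TrainingParameters params = new TrainingParameters();
        params.put(TrainingParameters.CUTOFF_PARAM, Integer.toString(6));
        params.put(TrainingParameters.ITERATIONS_PARAM, Integer.toString(200));

        // Features from a window of 4 previous and 2 subsequent tokens; a plain
        // TokenFeatureGenerator stands in for the custom generator here.
        AdaptiveFeatureGenerator featureGen = new CachedFeatureGenerator(
                new WindowFeatureGenerator(new TokenFeatureGenerator(), 4, 2));

        TokenNameFinderModel model = NameFinderME.train("en", null, samples, params,
                featureGen, Collections.<String, Object> emptyMap());

        samples.close();
    }
}

The heap size is of course still set on the JVM that runs this
(java -Xmx... as William wrote).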

So, I gave it a whole night, and in the morning I saw that the process was dead.
But I'll give it another try and will let you know.

Thank you!

Svetoslav


On 2013-04-26 12:42, "Jörn Kottmann" <[email protected]> wrote:

>I always edit the opennlp script and change it to what I need.
>
>Anyway, we have a TwoPassDataIndexer which writes the features to disk
>to save memory during indexing. Depending on how you train, you might
>have a cutoff=5, which probably eliminates a lot of your features and
>therefore saves a lot of memory.
>
>The indexing might just need a bit of time; how long did you wait?
>
>Jörn
>
>On 04/26/2013 12:33 PM, William Colen wrote:
>> From the command line you can specify the memory using
>>
>> MAVEN_OPTS="-Xmx4048m"
>>
>> You can also set it as a JVM argument if you are using the API:
>>
>> java -Xmx4048m ...
>>
>>
>>
>> On Fri, Apr 26, 2013 at 4:30 AM, Svetoslav Marinov <
>> [email protected]> wrote:
>>
>>> I use the API. Can one specify the memory size via the command line? I
>>> think the default there is 1024M? With 8G of memory it failed during
>>> "computing event counts...", with 16G during indexing: "Computing event
>>> counts...  done.
>>> 50153300 events
>>>          Indexing..."
>>>
>>> Svetoslav
>>>
>>> On 2013-04-26 09:12, "Jörn Kottmann" <[email protected]> wrote:
>>>
>>>> On 04/26/2013 09:06 AM, Svetoslav Marinov wrote:
>>>>> I'm wondering what is the max size (if such exists) for training a
>>>>> NER model? I have a corpus of 2 600 000 sentences annotated with just
>>>>> one category, 310M in size. However, the training never finishes: 8G
>>>>> of memory resulted in a Java out of memory exception, and when I
>>>>> increased it to 16G it just died with no error message.
>>>> Do you use the command line interface or the API for the training?
>>>> At which stage of the training did you get the out of memory exception?
>>>> Where did it just die when you used 16G of memory (maybe do a jstack)?
>>>>
>>>> Jörn
>>>>
>>>
>>>
>
>
