The artifactMap contains a manifest (a Properties object).
You should store the EOS chars in this manifest. We need a smart way to
convert them into a String.
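
One simple way to do the conversion, as a minimal sketch: a char[] maps
directly to a String and back, so it can be stored as a plain manifest
entry. The class name, the method names and the key "eosCharacters" are
made up for illustration and are not taken from the OpenNLP code.

```java
import java.util.Properties;

public class EosManifestSketch {

    // Hypothetical manifest key, chosen only for this example.
    static final String EOS_KEY = "eosCharacters";

    // Store the EOS chars as one String value, e.g. {'.', '?', '!'} -> ".?!".
    static void putEosChars(Properties manifest, char[] eosChars) {
        manifest.setProperty(EOS_KEY, new String(eosChars));
    }

    // Read the EOS chars back; returns null if the entry was never set.
    static char[] readEosChars(Properties manifest) {
        String value = manifest.getProperty(EOS_KEY);
        return value == null ? null : value.toCharArray();
    }

    public static void main(String[] args) {
        Properties manifest = new Properties();
        putEosChars(manifest, new char[] {'.', '?', '!'});
        System.out.println(manifest.getProperty(EOS_KEY)); // prints .?!
    }
}
```

Since the value is an ordinary String property, no separate serializer
for char[] is needed in this variant.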

The Sentence Detector should then retrieve the EOS chars from the model,
e.g. via a new method getEosChars.
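
A rough sketch of what such an accessor could look like, assuming the
EOS chars sit in the manifest as a single String. SentenceModelSketch is
a stand-in class, not the real SentenceModel, and the key name and
resolveEosChars are hypothetical. The fallback mirrors the behaviour
described in the quoted mail: use the model's EOS chars if they are set,
otherwise the language dependent ones.

```java
import java.util.Properties;

public class SentenceModelSketch {

    // Hypothetical manifest key, chosen only for this example.
    static final String EOS_KEY = "eosCharacters";

    private final Properties manifest;

    SentenceModelSketch(Properties manifest) {
        this.manifest = manifest;
    }

    // Returns the EOS chars stored in the manifest, or null if none were stored.
    public char[] getEosChars() {
        String value = manifest.getProperty(EOS_KEY);
        return value == null ? null : value.toCharArray();
    }

    // What the detector could do: prefer the model's chars, else fall back
    // to the defaults for the model's language.
    static char[] resolveEosChars(SentenceModelSketch model, char[] languageDefaults) {
        char[] fromModel = model.getEosChars();
        return fromModel != null ? fromModel : languageDefaults;
    }
}
```

Returning null for "not set" keeps older models, which have no such
entry, working with the language defaults.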

Have a look at the other model classes as well, e.g. the tokenizer model.
It stores some settings in the manifest. That is a good place to look for a
code sample.

Jörn


On Thu, Feb 9, 2012 at 12:38 PM, Katrin Tomanek
<[email protected]> wrote:

> Hi,
>
> I am moving the discussion on making the EOS characters of the sentence
> splitter configurable to the dev list (it was previously on the user list).
>
> I am currently trying to make the EOS characters a parameter of the
> SentenceDetectorME and store it as model parameter.
>
> Thus far, this works fine (although it requires changes in quite a few
> places in the code).
>
> I am putting a "char[] eosCharacters" into the artifactMap in SentenceModel.
> When predicting with a model, I test whether the EOS parameter is set and,
> if so, I use these EOS symbols, otherwise the language-dependent ones.
>
> Anyway, I am now running into trouble when serializing the model with
> the new "char[]" parameter:
>
> Writing sentence detector model ... Exception in thread "main"
> java.lang.IllegalStateException: Missing serializer for eosCharacters
>
> I know that I would have to write such a serializer; however, I am a bit
> lost here. Any hints? Maybe there is already a serializer for char[] which
> I could easily use.
>
> Best
> Katrin
>
