Hi Jörn,

I did that:


  public SentenceModel(String languageCode, AbstractModel sentModel,
      boolean useTokenEnd, Dictionary abbreviations, char[] eosCharacters,
      Map<String, String> manifestInfoEntries) {

    super(COMPONENT_NAME, languageCode, manifestInfoEntries);

    artifactMap.put(MAXENT_MODEL_ENTRY_NAME, sentModel);

    setManifestProperty(TOKEN_END_PROPERTY, Boolean.toString(useTokenEnd));

    // Abbreviations are optional
    if (abbreviations != null)
      artifactMap.put(ABBREVIATIONS_ENTRY_NAME, abbreviations);

    // EOS characters are optional
    if (eosCharacters != null)
      artifactMap.put(EOS_CHARACTERS_ENTRY_NAME, eosCharArrayToString(eosCharacters));

    checkArtifactMap();
  }

The EOS char array is converted to a String, which is written to the manifest.

Still, when serializing the model, I get:

Exception in thread "main" java.lang.IllegalStateException: Missing serializer for eosCharacters


Best,
Katrin

On 02/09/2012 12:48 PM, Joern Kottmann wrote:
The artifactMap contains a manifest (which is a Properties object).
You should store the EOS chars in this manifest. We need a smart way to
convert them into a String.

The Sentence Detector should then retrieve the EOS chars from the model,
e.g. via a new method getEosChars.

Have a look at the other model classes as well, e.g. the tokenizer model.
It stores some settings in the manifest. That is a good place to look for a
code sample.
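
As a rough illustration of the manifest approach described above, here is a minimal, self-contained sketch that uses a plain java.util.Properties object in place of the model manifest. The class name, the property key, and the helper methods are assumptions made for this example (the helper names only mirror the ones mentioned in this thread); this is not the actual OpenNLP code.

```java
import java.util.Properties;

class EosManifestSketch {

    // Hypothetical property key, chosen for this example only.
    static final String EOS_CHARACTERS_PROPERTY = "eosCharacters";

    // Converts the EOS characters into a single String, one char per position.
    static String eosCharArrayToString(char[] eosCharacters) {
        return new String(eosCharacters);
    }

    // Reads the EOS characters back; returns null if the property is not set.
    static char[] getEosChars(Properties manifest) {
        String value = manifest.getProperty(EOS_CHARACTERS_PROPERTY);
        return value != null ? value.toCharArray() : null;
    }

    public static void main(String[] args) {
        // Stand-in for the manifest; a real model would own this object.
        Properties manifest = new Properties();
        manifest.setProperty(EOS_CHARACTERS_PROPERTY,
                eosCharArrayToString(new char[] {'.', '!', '?'}));

        System.out.println(new String(getEosChars(manifest))); // prints .!?
    }
}
```

Because the value is stored as an ordinary String property, no dedicated serializer for char[] should be needed in this sketch; the Properties object is written out like any other manifest entry.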

Jörn


On Thu, Feb 9, 2012 at 12:38 PM, Katrin Tomanek
<[email protected]> wrote:

Hi,

I am moving the discussion on making the EOS characters of the sentence
splitter configurable to the dev list (it was previously on the user list).

I am currently trying to make the EOS characters a parameter of the
SentenceDetectorME and store them as a model parameter.

Thus far, this works fine (although it requires changes in quite a few
places in the code).

I am putting a "char[] eosCharacters" into the artifactMap in SentenceModel.
When predicting with a model, I test whether the EOS parameter is set, and
if so I use these EOS symbols; otherwise I use the language-dependent ones.
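
The fallback described above could be sketched roughly as follows. This is a standalone illustration: the class, the method resolveEosChars, and the default tables are hypothetical, and the default values are placeholders for this example, not taken from the OpenNLP sources.

```java
import java.util.HashMap;
import java.util.Map;

class EosFallbackSketch {

    // Placeholder language-dependent defaults, for illustration only.
    static final Map<String, char[]> DEFAULT_EOS_BY_LANGUAGE = new HashMap<>();
    static final char[] GENERIC_DEFAULT = {'.', '!', '?'};

    static {
        DEFAULT_EOS_BY_LANGUAGE.put("th", new char[] {' ', '\n'});
    }

    // Prefers the EOS characters stored with the model; falls back to the
    // language-dependent defaults when the model does not define any.
    static char[] resolveEosChars(char[] modelEosChars, String languageCode) {
        if (modelEosChars != null) {
            return modelEosChars;
        }
        return DEFAULT_EOS_BY_LANGUAGE.getOrDefault(languageCode, GENERIC_DEFAULT);
    }

    public static void main(String[] args) {
        // The model defines its own EOS characters: these are used.
        System.out.println(new String(resolveEosChars(new char[] {';'}, "en"))); // prints ;
        // The model defines none: the defaults for the language are used.
        System.out.println(new String(resolveEosChars(null, "en"))); // prints .!?
    }
}
```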

However, I am now running into trouble when serializing the model with
the new "char[]" parameter:

Writing sentence detector model ... Exception in thread "main"
java.lang.IllegalStateException: Missing serializer for eosCharacters

I know that I would have to write such a serializer; however, I am a bit
lost here. Any hints? Maybe there is already a serializer for char[] which
I could easily use?

Best
Katrin




--
Dr. Katrin Tomanek
Averbis GmbH
Tennenbacher Strasse 11
D-79106 Freiburg

Fon: +49 (0) 761 - 203 97696
Fax: +49 (0) 761 - 203 97694
E-Mail: [email protected]

Geschäftsführer: Dr. med. Philipp Daumke, Dr. Kornél Markó
Sitz der Gesellschaft: Freiburg i. Br.
AG Freiburg i. Br., HRB 701080
