Hi again,

OK, found it. Now I understand what you meant by "manifest". I did this:

--------------------------------
if (eosCharacters != null)
  setManifestProperty(EOS_CHARACTERS_PROPERTY, eosCharArrayToString(eosCharacters));
--------------------------------

Now it works.
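
For readers following along, the manifest approach can be sketched with a
plain java.util.Properties object. This is a simplified stand-in, not the
actual SentenceModel code: the class name ManifestSketch, the key value
"eosCharacters", and the serialize/deserialize helpers are assumptions made
for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Properties;

// Simplified stand-in for a model with a manifest; NOT the real OpenNLP class.
final class ManifestSketch {

    // Hypothetical key; the real constant's value is not shown in this thread.
    static final String EOS_CHARACTERS_PROPERTY = "eosCharacters";

    private final Properties manifest = new Properties();

    ManifestSketch(char[] eosCharacters) {
        // EOS characters are optional, so only store them when present.
        if (eosCharacters != null) {
            manifest.setProperty(EOS_CHARACTERS_PROPERTY, new String(eosCharacters));
        }
    }

    // Returns the stored EOS characters, or null if none were set.
    char[] getEosCharacters() {
        String value = manifest.getProperty(EOS_CHARACTERS_PROPERTY);
        return value == null ? null : value.toCharArray();
    }

    // Properties already knows how to write itself, so no extra serializer is needed.
    byte[] serialize() {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            manifest.store(out, null);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static ManifestSketch deserialize(byte[] bytes) {
        ManifestSketch model = new ManifestSketch(null);
        try {
            model.manifest.load(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return model;
    }

    public static void main(String[] args) {
        ManifestSketch model = new ManifestSketch(new char[] {'.', '?', '!'});
        ManifestSketch restored = deserialize(model.serialize());
        System.out.println(Arrays.toString(restored.getEosCharacters()));
    }
}
```

The point of the sketch is that a String value in the manifest travels with
the Properties object itself, whereas a separate artifact entry would need
its own serializer.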

Best
Katrin

On 02/09/2012 02:20 PM, Katrin Tomanek wrote:
Hi Jörn,

I did this:


public SentenceModel(String languageCode, AbstractModel sentModel,
    boolean useTokenEnd, Dictionary abbreviations, char[] eosCharacters,
    Map<String, String> manifestInfoEntries) {

  super(COMPONENT_NAME, languageCode, manifestInfoEntries);

  artifactMap.put(MAXENT_MODEL_ENTRY_NAME, sentModel);

  setManifestProperty(TOKEN_END_PROPERTY, Boolean.toString(useTokenEnd));

  // Abbreviations are optional
  if (abbreviations != null)
    artifactMap.put(ABBREVIATIONS_ENTRY_NAME, abbreviations);

  // EOS characters are optional
  if (eosCharacters != null)
    artifactMap.put(EOS_CHARACTERS_ENTRY_NAME,
        eosCharArrayToString(eosCharacters));

  checkArtifactMap();
}

The EOS char array is transformed to a string which is written to the
manifest.

Still, when serializing the model, I get:

Exception in thread "main" java.lang.IllegalStateException: Missing
serializer for eosCharacters


Best,
Katrin

On 02/09/2012 12:48 PM, Joern Kottmann wrote:
The artifactMap contains a manifest (that is, a Properties object).
You should store the EOS chars in this manifest. We need a smart way to
convert them into a String.
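
One simple conversion, sketched here under the assumption that every EOS
character fits in a single Java char, is to concatenate the characters into
one String and split them again with toCharArray(). The class name
EosCharCodec and the method eosStringToCharArray are hypothetical;
eosCharArrayToString is named elsewhere in this thread, but its actual
implementation is not shown.

```java
import java.util.Arrays;

// Illustrative helper; not part of the library.
final class EosCharCodec {

    private EosCharCodec() {
    }

    // Concatenates the EOS characters into one String, e.g. {'.', '?'} -> ".?".
    static String eosCharArrayToString(char[] eosCharacters) {
        return String.valueOf(eosCharacters);
    }

    // Inverse operation: every char of the stored String is one EOS character.
    static char[] eosStringToCharArray(String eosCharacters) {
        return eosCharacters.toCharArray();
    }

    public static void main(String[] args) {
        String stored = eosCharArrayToString(new char[] {'.', '?', '!'});
        System.out.println(stored);
        System.out.println(Arrays.toString(eosStringToCharArray(stored)));
    }
}
```

No delimiter is needed with this scheme, so a space or a comma can itself be
an EOS character. Characters outside the Basic Multilingual Plane would need
code points rather than char values.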

The Sentence Detector should then retrieve the EOS chars from the model,
e.g. via a getEosChars method.

Have a look at the other model classes as well, e.g. the tokenizer model.
It stores some settings in the manifest. That is a good place to look
for a code sample.

Jörn


On Thu, Feb 9, 2012 at 12:38 PM, Katrin Tomanek
<[email protected]> wrote:

Hi,

I am moving the discussion on making the EOS characters of the sentence
splitter configurable to the dev list (it was previously on the user list).

I am currently trying to make the EOS characters a parameter of the
SentenceDetectorME and store them as a model parameter.

Thus far, this works fine (although it requires changes in quite a few
places in the code).

I am putting a "char[] eosCharacters" into the artifactMap in SentenceModel.
When predicting with a model, I test whether the EOS parameter is set and,
if so, I use these EOS symbols; otherwise I use the language-dependent ones.
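
A minimal sketch of that fallback logic follows. The class
EosSelectionSketch, both method names, and the default character sets are
hypothetical and only illustrate the decision; they are not taken from the
library.

```java
import java.util.Arrays;

// Illustrative only: choose model-supplied EOS characters, else per-language defaults.
final class EosSelectionSketch {

    private EosSelectionSketch() {
    }

    // Hypothetical language-dependent defaults; real defaults may differ.
    static char[] defaultEosCharacters(String languageCode) {
        if ("el".equals(languageCode)) {
            // Greek uses ';' as a question mark (illustrative assumption).
            return new char[] {'.', '!', ';'};
        }
        return new char[] {'.', '!', '?'};
    }

    // Use the characters stored with the model when present, otherwise fall back.
    static char[] resolveEosCharacters(char[] fromModel, String languageCode) {
        return fromModel != null ? fromModel : defaultEosCharacters(languageCode);
    }

    public static void main(String[] args) {
        // Model provides its own EOS characters.
        System.out.println(Arrays.toString(resolveEosCharacters(new char[] {'|'}, "en")));
        // Model has none, so the language defaults apply.
        System.out.println(Arrays.toString(resolveEosCharacters(null, "en")));
    }
}
```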

However, I am now running into trouble when serializing the model with
the new "char[]" parameter:

Writing sentence detector model ... Exception in thread "main"
java.lang.IllegalStateException: Missing serializer for eosCharacters

I know that I would have to write such a serializer; however, I am a bit
lost here. Any hints? Maybe there is already a serializer for char[]
that I could easily use.

Best
Katrin






--
Dr. Katrin Tomanek
Averbis GmbH
Tennenbacher Strasse 11
D-79106 Freiburg

Fon: +49 (0) 761 - 203 97696
Fax: +49 (0) 761 - 203 97694
E-Mail: [email protected]

Managing Directors: Dr. med. Philipp Daumke, Dr. Kornél Markó
Registered office: Freiburg i. Br.
AG Freiburg i. Br., HRB 701080
