I made a mistake; it should be:

The quick brown
fox jumps over the lazy dog

I will encode my training sentence in one line as:

The quick brown <LF> fox jumps over the lazy dog <LF>

However, I am not sure whether I can leave out the space after "dog", so I am
switching to

The quick brown <LF> fox jumps over the lazy dog<LF>
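
As a minimal Java sketch of this encoding: it assumes that <LF> is written as a literal placeholder for a line feed inside a sentence, as in the example above. The class and method names are hypothetical and not part of OpenNLP, and whether the trainer interprets the placeholder depends on the OpenNLP version in use.

```java
class TrainingLineEncoder {
    // Placeholder used in this thread for a line feed (assumption, see above).
    static final String LF_TAG = "<LF>";

    // Turns one raw sentence that may span several lines into a single
    // training line: inner line feeds become " <LF> ", and a trailing
    // line feed becomes "<LF>" with no space before it.
    static String encode(String rawSentence) {
        // Treat Windows line endings like plain line feeds.
        String normalized = rawSentence.replace("\r\n", "\n");
        boolean endsWithLf = normalized.endsWith("\n");
        String body = endsWithLf
                ? normalized.substring(0, normalized.length() - 1)
                : normalized;
        // Keep empty segments (limit -1) so consecutive line feeds survive.
        String encoded = String.join(" " + LF_TAG + " ", body.split("\n", -1));
        return endsWithLf ? encoded + LF_TAG : encoded;
    }
}
```

Calling TrainingLineEncoder.encode("The quick brown\nfox jumps over the lazy dog\n") returns the second variant shown above, with no space between "dog" and the final <LF>.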

Regards, Markus



2017-09-28 9:21 GMT+02:00 Markus Kreuzthaler <[email protected]>:

> Hi William!
>
> I found this issue, which has apparently been fixed:
> https://issues.apache.org/jira/browse/OPENNLP-602
>
> So when I have a sentence like:
>
> The quick brown
> fox jumps over the lazy dog
>
> I will encode my training sentence in one line as:
>
> The quick brown fox <LF> jumps over the lazy dog <LF>
>
> However, I am not sure whether I can leave out the space after "dog", so I am
> switching to
>
> The quick brown fox <LF> jumps over the lazy dog<LF>
>
> I will give it a try, or maybe someone can give me a hint which version is
> correct...
>
> Thank you!
>
> Regards, Markus
>
>
> 2017-09-27 17:44 GMT+02:00 William Colen <[email protected]>:
>
>> The sentence detector will have a hard time learning from samples without
>> an EOS (end of sentence) mark. This is common in headlines of articles,
>> for example.
>> I usually remove sentences with no clear EOS from the training/evaluation
>> corpus.
>> At runtime, I apply some code to split sentences at new lines if I can
>> clearly identify one as a complete headline.
>>
>>
>> Regards
>> William
>>
>> 2017-09-27 11:10 GMT-03:00 Gary Underwood <[email protected]>:
>>
>> > The sentences for training are in the format of 1 per line so it should be
>> > fine as it is (unless you have sentences that also span lines).
>> >
>> > Gary Underwood
>> > [email protected]
>> >
>> >
>> >
>> > > On Sep 27, 2017, at 6:49 AM, Markus Kreuzthaler <[email protected]> wrote:
>> > >
>> > > Hello!
>> > >
>> > > How should I prepare the training data for sentence detection when I
>> > > have cases where sentences end with just a newline character, without
>> > > e.g. a period / full stop at the end of the training sentence?
>> > >
>> > > Is there some special encoding for this case?
>> > >
>> > > Thank you for your help!
>> > >
>> > > Regards, Markus
>> >
>> >
>>
>
>
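
William's suggestion in the quoted thread, to remove sentences with no clear EOS from the training corpus, could be sketched roughly as below. This is only an illustration in plain Java: the set of EOS characters (".", "!", "?") and the class and method names are assumptions and are not taken from OpenNLP.

```java
import java.util.List;
import java.util.stream.Collectors;

class EosFilter {
    // Characters treated as a clear end-of-sentence mark (assumption).
    static final String EOS_CHARS = ".!?";

    // True if the line, ignoring surrounding whitespace, ends with an EOS character.
    static boolean hasClearEos(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return false;
        }
        char last = trimmed.charAt(trimmed.length() - 1);
        return EOS_CHARS.indexOf(last) >= 0;
    }

    // Keeps only the training lines (one sentence per line) that end with a clear EOS.
    static List<String> keepSentencesWithEos(List<String> lines) {
        return lines.stream()
                .filter(EosFilter::hasClearEos)
                .collect(Collectors.toList());
    }
}
```

With one sentence per line as input, a headline such as "Quick fox outruns lazy dog" would be dropped, while "The quick brown fox jumps over the lazy dog." would be kept.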
