Re: Sentence Detector

2017-08-25 Thread William Colen
The writer made a mistake by not adding a space after the period. The sentence
detector model does not know how to handle this, because a period with no
space after it rarely marks a sentence boundary.

This is very common in social network text. I apply some regex to check that
it is not a URL, email, or number, then add the missing space.
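
A minimal sketch of that kind of preprocessing, written in Python for brevity
(the same patterns should carry over to java.util.regex with little change).
The function name and the three patterns are illustrative assumptions, not part
of OpenNLP, and the rules are deliberately conservative: a space is added only
when the period follows a lowercase letter or digit and is followed by a
capitalized word.

```python
import re

# Hypothetical patterns for this sketch; tune them for your own data.
# Tokens that look like URLs or email addresses are left untouched.
URL_OR_EMAIL = re.compile(r"^(?:https?://|www\.)\S+$|^\S+@\S+\.\S+$", re.IGNORECASE)
# Plain numbers such as 3.14 or 1,000.50 are left untouched.
NUMBER = re.compile(r"^\d+(?:[.,]\d+)*$")
# A period right after a lowercase letter or digit and right before a
# capitalized word, e.g. "2010.The".
MISSING_SPACE = re.compile(r"(?<=[a-z0-9])\.(?=[A-Z][a-z])")


def add_missing_spaces(text):
    """Insert a space after sentence-final periods that lack one."""

    def fix(token):
        if URL_OR_EMAIL.match(token) or NUMBER.match(token):
            return token
        return MISSING_SPACE.sub(". ", token)

    # Rewrite each whitespace-delimited token; original whitespace is kept.
    return re.sub(r"\S+", lambda m: fix(m.group(0)), text)


print(add_missing_spaces("on Thursday 25 November 2010.The Centre offers a new focus"))
```

The repaired text can then be passed to the sentence detector as usual. Note
that abbreviations such as "U.S.A" are not touched, because the period there
follows an uppercase letter.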



2017-08-25 9:31 GMT-03:00 Manoj B. Narayanan :

> Hi,
>
> The OpenNLP sentence detector detects sentences when the period at the end
> of a sentence and the next word are separated by a space. If there is no
> space in between it doesn't split them. Is there a way that could help me
> solve this?
>
> *Example.1*
>
> It is with great pleasure that I write to invite you to the launch of the
> University of Reading’s Centre for Food Security on Thursday 25 November
> 2010. The Centre offers a new focus for research on the challenges
> of meeting global demands for food in a sustainable way.
>
> *Output1*
> It is with great pleasure that I write to invite you to the launch of the
> University of Reading’s Centre for Food Security on Thursday 25 November
> 2010.
> The Centre offers a new focus for research on the challenges of meeting
> global demands for food in a sustainable way.
>
>
> *Example.2*
>
> It is with great pleasure that I write to invite you to the launch of the
> University of Reading’s Centre for Food Security on Thursday 25 November
> 2010.The Centre offers a new focus for research on the challenges of
> meeting global demands for food in a sustainable way.
>
> *Output2*
> It is with great pleasure that I write to invite you to the launch of the
> University of Reading’s Centre for Food Security on Thursday 25 November
> 2010.The Centre offers a new focus for research on the challenges of
> meeting global demands for food in a sustainable way.
>
> Thanks,
> Manoj.
>


Sentence Detector

2017-08-25 Thread Manoj B. Narayanan
Hi,

The OpenNLP sentence detector detects sentences when the period at the end
of a sentence and the next word are separated by a space. If there is no
space in between it doesn't split them. Is there a way that could help me
solve this?

*Example.1*

It is with great pleasure that I write to invite you to the launch of the
University of Reading’s Centre for Food Security on Thursday 25 November
2010. The Centre offers a new focus for research on the challenges
of meeting global demands for food in a sustainable way.

*Output1*
It is with great pleasure that I write to invite you to the launch of the
University of Reading’s Centre for Food Security on Thursday 25 November
2010.
The Centre offers a new focus for research on the challenges of meeting
global demands for food in a sustainable way.


*Example.2*

It is with great pleasure that I write to invite you to the launch of the
University of Reading’s Centre for Food Security on Thursday 25 November
2010.The Centre offers a new focus for research on the challenges of
meeting global demands for food in a sustainable way.

*Output2*
It is with great pleasure that I write to invite you to the launch of the
University of Reading’s Centre for Food Security on Thursday 25 November
2010.The Centre offers a new focus for research on the challenges of
meeting global demands for food in a sustainable way.

Thanks,
Manoj.