You say that named entity recognition is not generalised beyond Mail,
but the support library is there for anyone to use.  See for
example
https://developer.apple.com/documentation/foundation/nslinguistictagger/identifying_people_places_and_organizations

In Python, you can use NLTK to do roughly the same.

There's no real point in reimplementing this stuff in Pharo.
Just set up a separate process, send text to it, and receive
results back.
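
To make the separate-process idea concrete, here is a minimal sketch of
what the worker side could look like in Python. Everything in it is an
assumption for illustration: the one-JSON-object-per-line protocol, the
"text" request key, the names extract_entities and serve, and the two
regular expressions, which are deliberately naive stand-ins for a real
NLTK (or other NER) call.

```python
import json
import re
import sys

# Illustrative patterns only; real e-mail and phone formats are messier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def extract_entities(text):
    """Return a dict of entity lists found in one piece of text.

    Hypothetical helper: swap the regexes for an NER library call here.
    """
    return {
        "emails": EMAIL_RE.findall(text),
        "phones": [m.strip() for m in PHONE_RE.findall(text)],
    }


def serve(infile=sys.stdin, outfile=sys.stdout):
    """Read one JSON request per line, write one JSON reply per line."""
    for line in infile:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between requests
        request = json.loads(line)
        reply = extract_entities(request["text"])
        outfile.write(json.dumps(reply) + "\n")
        outfile.flush()  # let the caller see each reply immediately


# A worker script would end with a call to serve().
```

The Pharo image would then start this script once, write a line such as
{"text": "..."} to its stdin, and read one line of JSON back from its
stdout for each request. Keeping the protocol line-oriented means the
Pharo side only needs a JSON parser and a pipe.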


On Thu, 7 Mar 2019 at 22:53, Cédrick Béler <cdric...@gmail.com> wrote:

> Hi all,
>
> I’ve often got the need to analyse some random unstructured text to
> discover (structured) information (in email for instance), to extract :
> - emails
> - telephone numbers
> - addresses
> - events
> - person names (according to a list of known persons),
> - etc…
>
> Apple do it in email for instance (strangely, this is not generalized).
>
>
> So my questions are :
> - do we have something equivalent in Smalltalk/Pharo ? (I didn’t find)
> - if not, what strategy would you use ?
> => I do really stupid text analysis (substrings, finding @, …, parsing
> according to the text structure when there is… kind of Soup parsing…)
> => I feel this is a job for PetitParser ? And would be a nice fit for the
> new GToolkit.
>
> All ideas or suggestions are welcome ;-)
>
>
> TIA,
>
> Cédrick
>
>
>
>
