Hello,
I don't have any numbers for you. Performance depends heavily on the
model you are using, the configured feature generation and the number of
features in your training data.
To get a reliable number you will probably have to run a test on your own machines.
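A rough sketch of such a test, in case it helps. It only uses the JDK and is not part of OpenNLP: the class name `ThroughputBenchmark`, the `measure` method and the capitalised-token "extractor" are hypothetical stand-ins made up for illustration. In a real run the extractor would wrap whatever combination of name finder models you have loaded, and the documents would come from your own corpus.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical helper, not an OpenNLP class: times an arbitrary extractor over a list of documents.
final class ThroughputBenchmark {

    /** Totals of one measured run. */
    static final class Result {
        final int documents;
        final long bytes;
        final long entities;
        final double seconds;

        Result(int documents, long bytes, long entities, double seconds) {
            this.documents = documents;
            this.bytes = bytes;
            this.entities = entities;
            this.seconds = seconds;
        }

        double documentsPerSecond() {
            return documents / seconds;
        }

        double megabytesPerSecond() {
            return bytes / (1024.0 * 1024.0) / seconds;
        }
    }

    /**
     * Runs the extractor over all documents and records the elapsed time.
     * The extractor returns the number of entities it found in one document.
     */
    static Result measure(List<String> docs, ToIntFunction<String> extractor, int warmupRounds) {
        // Untimed warm-up passes, so JIT compilation has less influence on the measurement.
        for (int i = 0; i < warmupRounds; i++) {
            for (String doc : docs) {
                extractor.applyAsInt(doc);
            }
        }

        long entities = 0;
        long start = System.nanoTime();
        for (String doc : docs) {
            entities += extractor.applyAsInt(doc);
        }
        // Guard against a zero interval on very small inputs.
        long elapsedNanos = Math.max(1L, System.nanoTime() - start);

        // Input size is counted outside the timed region.
        long bytes = 0;
        for (String doc : docs) {
            bytes += doc.getBytes(StandardCharsets.UTF_8).length;
        }
        return new Result(docs.size(), bytes, entities, elapsedNanos / 1e9);
    }

    public static void main(String[] args) {
        // Placeholder documents; replace with text from your own data.
        List<String> docs = Arrays.asList(
                "Pierre Vinken joined the board in New York.",
                "The deal was worth 5 million dollars.");

        // Assumption: a dummy extractor that counts capitalised tokens.
        // Swap in code that runs your person, location, organisation and money models.
        ToIntFunction<String> extractor = doc -> {
            int count = 0;
            for (String token : doc.split("\\s+")) {
                if (!token.isEmpty() && Character.isUpperCase(token.charAt(0))) {
                    count++;
                }
            }
            return count;
        };

        Result r = measure(docs, extractor, 3);
        System.out.printf("%.1f documents/second, %.3f MB/second, %d entities%n",
                r.documentsPerSecond(), r.megabytesPerSecond(), r.entities);
    }
}
```

Running it once per model and once with all models combined should show how much the combination costs on your hardware. The numbers are only meaningful with a realistic amount of input, since two short sentences finish in microseconds.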
All modern CPUs have multiple cores these
Okay, I have no problem with that. I'll look over some other issues.
In the meantime, I think I would like to work on medical de-identification.
How would I go about starting this work? What would I need to know?
On Mon, Mar 16, 2015 at 7:15 PM, Joern Kottmann kottm...@gmail.com wrote:
Hello,
thanks for your interest in OpenNLP. We already have a lot of candidates
for those GSoC issues.
You are welcome to suggest something you would like to work on here on
the dev list, create an issue for it and contribute some code to solve
it.
The best way to get started is probably to
Hi,
I wanted some information regarding the performance of the OpenNLP entity
extraction models in documents/second and MB/second.
Currently I am using the person, location, organisation and money extraction
models.
If possible, please also tell me the speeds when a combination of models is used.
Thank you
I would certainly like to get involved in this then.
I looked over the paper and its results were highly positive. So does this
mean that we would be implementing their model that gave such good results?
Also, I was looking at the OpenNLP issues on the JIRA page and I really
liked this one--
OpenNLP is a standard library used by many Apache NLP projects. The clinical
text engine (ctakes.apache.org) is one such use of OpenNLP. There is a
medical data privacy engine (de-identification) that performs the medical concept
recognition and privacy features described in the paper. We used it to
conduct