[ https://issues.apache.org/jira/browse/CONNECTORS-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117964#comment-15117964 ]
Karl Wright commented on CONNECTORS-1270:
-----------------------------------------
[~rafaharo], I tried the connector out, and it looks like the job specification
still relies on a set of file paths. This is not going to work in a
multi-process environment, since every process would need the same files at
the same local paths. I thought there had been some discussion about having a
canned set of model resources included in the jar? It doesn't look like that
was ever done...
The included script downloads five model files locally:
{code}
wget -O ${MODELS_DIR}/en-sent.bin http://opennlp.sourceforge.net/models-1.5/en-sent.bin
wget -O ${MODELS_DIR}/en-token.bin http://opennlp.sourceforge.net/models-1.5/en-token.bin
wget -O ${MODELS_DIR}/en-ner-person.bin http://opennlp.sourceforge.net/models-1.5/en-ner-person.bin
wget -O ${MODELS_DIR}/en-ner-location.bin http://opennlp.sourceforge.net/models-1.5/en-ner-location.bin
wget -O ${MODELS_DIR}/en-ner-organization.bin http://opennlp.sourceforge.net/models-1.5/en-ner-organization.bin
{code}
It seems to me that there are a couple of ways forward. First possibility: if
these models are accessible by URL, and are licensed in a manner compatible
with Apache redistribution, we could just incorporate them in the build and
(for instance) bundle them as resources in the opennlp connector jar. Second
possibility: we could download the models on the fly in the connector, given
their URLs. For the second possibility to make any sense, though, the download
would have to happen when a connection is configured, not as part of the
specification information, which would rearrange the connector somewhat.
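
To make the two possibilities concrete, here is a rough sketch of what a model lookup might look like without local file paths. Everything in it is an assumption for illustration only: the class name ModelSource, the /models/ resource prefix, and the method names are hypothetical and are not part of the contributed connector or of ManifoldCF. Only the five file names and the base URL come from the download script. It uses the standard library alone and returns plain streams, which could then be handed to the OpenNLP model constructors that accept an InputStream.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper; not part of ManifoldCF or the contributed connector. */
final class ModelSource {
  /** The five model files fetched by the download script. */
  public static final List<String> MODEL_FILES = Collections.unmodifiableList(Arrays.asList(
      "en-sent.bin", "en-token.bin", "en-ner-person.bin",
      "en-ner-location.bin", "en-ner-organization.bin"));

  // Assumed location of bundled models inside the connector jar.
  private static final String RESOURCE_PREFIX = "/models/";
  // Base URL as used by the download script.
  private static final String BASE_URL = "http://opennlp.sourceforge.net/models-1.5/";

  private ModelSource() {}

  /** Classpath location a bundled model would have under the assumed layout. */
  public static String resourcePath(String modelFile) {
    return RESOURCE_PREFIX + modelFile;
  }

  /** Remote location of a model, built from the script's base URL. */
  public static String downloadUrl(String modelFile) {
    return BASE_URL + modelFile;
  }

  /** First possibility: read the model from a resource bundled in the jar. */
  public static InputStream openBundled(String modelFile) throws IOException {
    InputStream in = ModelSource.class.getResourceAsStream(resourcePath(modelFile));
    if (in == null) {
      throw new FileNotFoundException("No bundled model at " + resourcePath(modelFile));
    }
    return in;
  }

  /** Second possibility: fetch the model by URL, e.g. when a connection is configured. */
  public static InputStream openRemote(String modelFile) throws IOException {
    return URI.create(downloadUrl(modelFile)).toURL().openStream();
  }
}
```

Either way the caller only ever sees a stream, so no process depends on a path that exists on one machine but not another. Whether the models can actually be bundled still hinges on the licensing question above.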
> Import OpenNLP connector into trunk
> -----------------------------------
>
> Key: CONNECTORS-1270
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1270
> Project: ManifoldCF
> Issue Type: Task
> Reporter: Karl Wright
> Assignee: Rafa Haro
> Fix For: ManifoldCF 2.4
>
>
> An OpenNLP connector has been contributed on github. Need to import it into
> MCF, first to a branch, then to trunk.