[ 
https://issues.apache.org/jira/browse/CONNECTORS-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117964#comment-15117964
 ] 

Karl Wright commented on CONNECTORS-1270:
-----------------------------------------

[~rafaharo], having tried the connector out, I see that the job specification 
still relies on a set of file paths.  This is not going to work in a 
multi-process environment.  I thought there had been some discussion about 
including a canned set of model resources in the jar?  It doesn't look like 
that was ever done...

The included script downloads five model files locally:

{code}
wget -O ${MODELS_DIR}/en-sent.bin http://opennlp.sourceforge.net/models-1.5/en-sent.bin
wget -O ${MODELS_DIR}/en-token.bin http://opennlp.sourceforge.net/models-1.5/en-token.bin
wget -O ${MODELS_DIR}/en-ner-person.bin http://opennlp.sourceforge.net/models-1.5/en-ner-person.bin
wget -O ${MODELS_DIR}/en-ner-location.bin http://opennlp.sourceforge.net/models-1.5/en-ner-location.bin
wget -O ${MODELS_DIR}/en-ner-organization.bin http://opennlp.sourceforge.net/models-1.5/en-ner-organization.bin
{code}

It seems to me that there are a couple of ways forward.  First possibility: If 
these files are accessible by URL, and are licensed in a manner compatible 
with Apache redistribution, we could just incorporate them in the build and 
(for instance) bundle them as resources in the opennlp connector jar.  Second 
possibility: We could download the models on the fly in the connector, given 
their URLs.  For the second possibility to make any sense, though, the URLs 
would have to be supplied when a connection is configured, not as part of the 
job specification, which would rearrange the connector somewhat.
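
To make the two possibilities concrete, here is a minimal sketch of what 
model loading might look like in each case.  Everything in it is an 
assumption for illustration only: the class name ModelSource, the resource 
prefix /opennlp/models/, and both method names are hypothetical and are not 
taken from the contributed connector.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;

/** Hypothetical helper for illustration; not part of the contributed connector. */
final class ModelSource {
    // Assumed location of the bundled models inside the connector jar.
    static final String RESOURCE_PREFIX = "/opennlp/models/";

    private ModelSource() {}

    /** Maps a model file name such as "en-sent.bin" to its assumed classpath location. */
    static String resourcePath(String modelName) {
        if (modelName == null || modelName.isEmpty() || modelName.contains("/")) {
            throw new IllegalArgumentException("Bad model name: " + modelName);
        }
        return RESOURCE_PREFIX + modelName;
    }

    /** First possibility: open a model that was bundled as a resource in the jar. */
    static InputStream openBundled(String modelName) throws IOException {
        String path = resourcePath(modelName);
        InputStream in = ModelSource.class.getResourceAsStream(path);
        if (in == null) {
            // getResourceAsStream returns null when the resource is absent.
            throw new FileNotFoundException("Model not on classpath: " + path);
        }
        return in;
    }

    /** Second possibility: fetch a model from a URL held in the connection configuration. */
    static InputStream openRemote(String modelUrl) throws IOException {
        return URI.create(modelUrl).toURL().openStream();
    }
}
```

Either stream could then be handed to an OpenNLP model constructor that 
accepts an InputStream (for example, new SentenceModel(in)), so no process 
would depend on a local file path.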


> Import OpenNLP connector into trunk
> -----------------------------------
>
>                 Key: CONNECTORS-1270
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1270
>             Project: ManifoldCF
>          Issue Type: Task
>            Reporter: Karl Wright
>            Assignee: Rafa Haro
>             Fix For: ManifoldCF 2.4
>
>
> An OpenNLP connector has been contributed on github.  Need to import it into 
> MCF, first to a branch, then to trunk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
