[ https://jira.nuxeo.com/browse/NXSEM-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=95577#comment-95577 ]
Olivier Grisel commented on NXSEM-12:
-------------------------------------
Work is under way at:
https://github.com/ogrisel/pignlproc/tree/master/examples/ner-corpus
It is mostly working. Redirect handling is still needed so that important
entities linked through a redirect page, such as China => People's Republic
of China, are no longer skipped.
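
To illustrate the missing redirect step, here is a minimal sketch in Python. It
is not the pignlproc code itself: the function names, the in-memory
dictionaries (`redirects`, `entity_types`) and the `max_hops` limit are all
hypothetical, chosen only to show the idea of mapping a link target to its
canonical page before looking up its entity type.

```python
def resolve_redirect(title, redirects, max_hops=5):
    """Follow redirects from `title` until a non-redirect page is reached.

    `redirects` is an assumed dict mapping a redirect page title to its
    target title. `max_hops` and the `seen` set guard against redirect loops.
    """
    seen = {title}
    for _ in range(max_hops):
        target = redirects.get(title)
        if target is None or target in seen:
            # Either a canonical page, or a loop: stop where we are.
            return title
        seen.add(target)
        title = target
    return title


def keep_typed_links(links, redirects, entity_types):
    """Yield (anchor_text, canonical_title, entity_type) for typed links.

    `links` is an iterable of (anchor_text, target_title) pairs taken from a
    sentence; `entity_types` is an assumed dict from canonical title to a
    type such as "Person", "Organization" or "Place".
    """
    for anchor, target in links:
        canonical = resolve_redirect(target, redirects)
        entity_type = entity_types.get(canonical)
        if entity_type is not None:
            yield anchor, canonical, entity_type


if __name__ == "__main__":
    redirects = {"China": "People's Republic of China"}
    entity_types = {"People's Republic of China": "Place"}
    links = [("China", "China"), ("tea", "Tea")]
    for row in keep_typed_links(links, redirects, entity_types):
        print(row)
```

Without the `resolve_redirect` call, the lookup for "China" would find no type
and the link would be dropped, which is the behaviour described above. In a
Hadoop job the same effect would more likely come from a join against a
redirects table than from an in-memory dictionary.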
> hadoop script to build NER training corpus from wikipedia sentences with
> links to Person, Organization or Places
> ----------------------------------------------------------------------------------------------------------------
>
> Key: NXSEM-12
> URL: https://jira.nuxeo.com/browse/NXSEM-12
> Project: Nuxeo Semantic R&D
> Issue Type: Task
> Reporter: Olivier Grisel
> Assignee: Olivier Grisel
> Fix For: 5.4.2
>
>