[ 
https://issues.apache.org/jira/browse/STANBOL-90?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rupert Westenthaler resolved STANBOL-90.
----------------------------------------

    Resolution: Fixed
    
> Create a maven artifact to embed all the default stanbol models data
> --------------------------------------------------------------------
>
>                 Key: STANBOL-90
>                 URL: https://issues.apache.org/jira/browse/STANBOL-90
>             Project: Stanbol
>          Issue Type: New Feature
>            Reporter: Olivier Grisel
>            Assignee: Olivier Grisel
>
> To make Stanbol useful, especially in offline mode, it needs some statistical 
> models and entity / topic indices. Those indices can be huge (several GB for 
> all the entities of DBpedia and GeoNames, for instance) and hence cannot be 
> packaged as part of the default distribution. However, it is very desirable to 
> embed some default statistical models and indices:
> - opennlp sentence detector for English
> - opennlp name finder models for English for organizations, people, places
> - solr index for the top 10000 most popular entities (of type organizations, 
> people, places) as measured by number of incoming links in the Wikipedia 
> article graph.
> - solr index for the top 1000 most popular topics, as measured by the number 
> of Wikipedia articles categorized in this category or subcategory
> The goal is to keep that maven artifact smaller than 100 MB (ideally even 
> smaller) so that it does not create a big barrier to entry for people 
> downloading the default distribution of Stanbol.
> To avoid slowing down the svn repo, those data files will not be put under 
> version control; only the pom.xml and a script to rebuild the artifact from a 
> previous version of the jar will be.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
