Hey Alberto,

We started indexing Czech using [2] but did not finish because a stemmer was 
missing. At the moment we only use Snowball stemmers via Lucene, but we would 
be more than happy to receive contributions.

For [2], stemmers need to implement org.dbpedia.spotlight.db.model.Stemmer.
To keep it general enough, you could add a companion object Stemmer, e.g.:

> object Stemmer {
> 
>   // Pick a stemmer by language code; ??? marks the parts still to fill in.
>   def forLanguage(lang: String): Stemmer =
>     if (lang == "cs")
>       ???  // your stemmer
>     else
>       ???  // default Snowball stemmer
> 
> }
> 
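
To make the dispatch idea a bit more concrete, here is a self-contained sketch 
that runs without Spotlight or Lucene on the classpath. Everything in it is an 
assumption for illustration: the local Stemmer trait with a single stem method 
only stands in for org.dbpedia.spotlight.db.model.Stemmer (I have not checked 
it against the real signature), PassThroughStemmer is a placeholder for the 
default Snowball stemmer, and ToyCzechStemmer is a hypothetical suffix 
stripper, not a usable Czech stemmer.

```scala
object StemmerSketch {

  // Stand-in for org.dbpedia.spotlight.db.model.Stemmer.
  // The single stem method is an assumption, not the project's confirmed API.
  trait Stemmer {
    def stem(token: String): String
  }

  // Placeholder for the default Snowball stemmer: returns the token unchanged.
  object PassThroughStemmer extends Stemmer {
    def stem(token: String): String = token
  }

  // Hypothetical toy stemmer: lowercases the token and strips the first
  // matching ending, as long as at least three characters remain.
  object ToyCzechStemmer extends Stemmer {
    private val suffixes = Seq("ami", "ech", "em", "y", "u", "a")

    def stem(token: String): String = {
      val lower = token.toLowerCase(java.util.Locale.ROOT)
      suffixes
        .find(s => lower.endsWith(s) && lower.length - s.length >= 3)
        .map(s => lower.dropRight(s.length))
        .getOrElse(lower)
    }
  }

  // Companion object holding the per-language dispatch.
  object Stemmer {
    def forLanguage(lang: String): Stemmer = lang match {
      case "cs" => ToyCzechStemmer
      case _    => PassThroughStemmer
    }
  }
}
```

With this in place, StemmerSketch.Stemmer.forLanguage("cs").stem("hradem") 
strips the "em" ending, while any other language code falls through to the 
placeholder and leaves tokens untouched.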

Calls to Stemmer.forLanguage would then have to be added to 
org.dbpedia.spotlight.db.CreateSpotlightModel and 
org.dbpedia.spotlight.db.SpotlightModel, as well as to the pignlproc scripts 
that do the counting in pignlproc.index.GetCounts*.

In any case, using the DB-backed core [2] together with the model-quickstarter 
scripts [1] should be the easiest method and give the best results.

[1] https://github.com/jodaiber/model-quickstarter

Best,
Joachim


On 17.05.2013 at 14:00, Alberto Reggiori wrote:

> 
> Hi all
> 
> I am in the process of trying out a customised setup of DBpedia Spotlight in 
> Slovak following the instructions at [1][2][3], possibly configuring/adding 
> custom tokenisers/stemmers [4][5] (and in parallel perhaps looking at 
> defining the necessary DBpedia infobox mappings).
> 
> Before I duplicate any work, I am wondering if anyone on this list has been 
> playing with Slavonic languages, such as Slovak and Czech, and if there is 
> any publicly available work/project out there.
> 
> Thank you very much in advance for any follow up
> 
> 
> Best regards
> 
> Alberto
> 
> [1] 
> https://github.com/dbpedia-spotlight/dbpedia-spotlight/wiki/Internationalization-(Lucene-backed-core)
> [2] 
> https://github.com/dbpedia-spotlight/dbpedia-spotlight/wiki/Internationalization-(DB-backed-core)
> [3] 
> https://github.com/dbpedia-spotlight/dbpedia-spotlight/wiki/Internationalization
> [4] 
> http://vi.ikt.ui.sav.sk/Projekty/Projekty_2008%2F%2F2009/Hana_Pifková_-_Stemer
> [5] http://www.languagetool.org/languages/
> ------------------------------------------------------------------------------
> AlienVault Unified Security Management (USM) platform delivers complete
> security visibility with the essential security capabilities. Easily and
> efficiently configure, manage, and operate all of your security controls
> from a single console and one unified framework. Download a free trial.
> http://p.sf.net/sfu/alienvault_d2d
> _______________________________________________
> Dbp-spotlight-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dbp-spotlight-users
