On Wed, Apr 30, 2014 at 10:37 AM, Cristian Petroaca
<cristian.petro...@gmail.com> wrote:
> Hi All,
>
> I'm currently working on https://issues.apache.org/jira/browse/STANBOL-1279.
>
> I am using the SiteManager to get a Site with referenceId = "dbpedia" and
> am querying data related to some NERs (querying by NER label and type).
> This works and I do get results from the dbpedia index.
>
> What I want to do is this :
>
> 1. I want to be able to store and get yago class types in the dbpedia data.
> This data is stored in the yago-types.nt file from the dbpedia 3.9
> downloads. Is it possible to create a new dbpedia index with the 3.9 files
> using this script
> https://svn.apache.org/repos/asf/stanbol/trunk/entityhub/indexing/dbpedia/dbpedia-3.8/fetch_data_en_int.sh
> ?

Yep. Just make sure you change

    DBPEDIA=http://downloads.dbpedia.org/3.8

to point at the DBpedia 3.9 downloads.
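
As a rough sketch, the change can be applied with sed. The snippet below runs against a throwaway file that holds only the original line, so it is safe to try; for the real thing you would run the same sed command against your checkout of fetch_data_en_int.sh. I have not verified that every file name the script lists is unchanged in 3.9, so that is worth checking before starting the download.

```shell
# Self-contained demo: a throwaway file holding the original line.
script=$(mktemp)
echo 'DBPEDIA=http://downloads.dbpedia.org/3.8' > "$script"

# Switch the download base URL from the 3.8 to the 3.9 release.
# "-i.bak" edits in place and keeps a backup copy; this form is
# accepted by both GNU and BSD sed.
sed -i.bak 's|downloads\.dbpedia\.org/3\.8|downloads.dbpedia.org/3.9|' "$script"

# Show the rewritten line.
cat "$script"
```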

BTW: you can also remove

        #corrects encoding and recompress using gz
        bzcat ${filename}.bz2 \
            | sed 's/\\\\/\\u005c\\u005c/g;s/\\\([^u"]\)/\\u005c\1/g' \
            | gzip -c > ${filename}.gz
        rm -f ${filename}.bz2

as this is no longer necessary.
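
With that block gone, the per-file step reduces to a plain download. The following is only a sketch under assumptions: `fetch_dump` and `DRY_RUN` are hypothetical names that do not appear in the script, the `en/` directory layout is assumed, and it presumes the indexing tool can read the .bz2 files directly, which you should confirm for your version.

```shell
DBPEDIA=http://downloads.dbpedia.org/3.9

# Hypothetical helper (not part of fetch_data_en_int.sh):
# download one dump and keep it compressed as .bz2.
#   $1 = directory below $DBPEDIA (assumed layout, e.g. "en")
#   $2 = dump file name without the .bz2 suffix
fetch_dump() {
    url="${DBPEDIA}/$1/$2.bz2"
    if [ -n "${DRY_RUN:-}" ]; then
        # Dry run: only print the URL that would be fetched.
        echo "$url"
        return 0
    fi
    # Skip files that are already present; -c resumes partial downloads.
    [ -f "$2.bz2" ] || wget -c "$url"
}

# Print the URL instead of downloading, so this can be tried offline.
DRY_RUN=1
fetch_dump en mappingbased_properties_en.nt
```

Unset `DRY_RUN` to perform the actual download.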

>
> 2. I want to access some specific dbpedia properties such as
> dbpedia-owl:locationCity and others. These are already present in the
> mappingbased_properties_en.nt
> file which is in the fetch_data_en_int.sh script but are not in the
> https://svn.apache.org/repos/asf/stanbol/trunk/entityhub/indexing/dbpedia/src/main/resources/indexing/config/mappings.txt
> file.
> Should I include them there and do a dbpedia index rebuild?

Exactly. If the size of the resulting Solr index is an issue, I also
recommend removing the properties you do not need.
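
A minimal sketch of adding such a property to the mappings file without creating duplicates. It works on a temporary stand-in file; in practice `mappings` would point at indexing/config/mappings.txt. The `dbp-ont:` prefix is an assumption on my side, so use whichever prefix or full URI the existing lines in your mappings.txt use for http://dbpedia.org/ontology/.

```shell
# Stand-in for indexing/config/mappings.txt so the demo is self-contained.
mappings=$(mktemp)

# Property to index; the "dbp-ont:" prefix is an assumption.
prop='dbp-ont:locationCity'

# Append the line only if no identical line exists yet
# (-x: match the whole line, -F: fixed string, -q: quiet).
grep -qxF "$prop" "$mappings" || echo "$prop" >> "$mappings"

# Running the same command again leaves the file unchanged.
grep -qxF "$prop" "$mappings" || echo "$prop" >> "$mappings"

cat "$mappings"
```

After editing the real file, the index has to be rebuilt for the new property to show up.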

>
> I've already described this in the "Named entity coref resolution based on
> dbpedia" mail thread but I thought of creating a new mail for visibility
> and for not clogging the other thread.

The old thread is already much too long anyway. Please make sure that
the important points and decisions from that thread are also reflected
in the description of STANBOL-1279.

best
Rupert

>
> Thanks,
> Cristian



-- 
| Rupert Westenthaler             rupert.westentha...@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO 
..........................................................................
| http://redlink.co/
