On Wed, Apr 30, 2014 at 10:37 AM, Cristian Petroaca
<[email protected]> wrote:
> Hi All,
>
> I'm currently working on https://issues.apache.org/jira/browse/STANBOL-1279.
>
> I am using the SiteManager to get a Site with referenceId = "dbpedia" and
> am querying data related to some NERs (querying by NER label and type).
> This works and I do get results from the dbpedia index.
>
> What I want to do is this :
>
> 1. I want to be able to store and get yago class types in the dbpedia data.
> This data is stored in the yago-types.nt file from the dbpedia 3.9
> downloads. Is it possible to create a new dbpedia index with the 3.9 files
> using this script
> https://svn.apache.org/repos/asf/stanbol/trunk/entityhub/indexing/dbpedia/dbpedia-3.8/fetch_data_en_int.sh
> ?
Yep. Just make sure you change
DBPEDIA=http://downloads.dbpedia.org/3.8
so that it points to dbpedia 3.9.
BTW: you can also remove
#corrects encoding and recompress using gz
bzcat ${filename}.bz2 \
| sed 's/\\\\/\\u005c\\u005c/g;s/\\\([^u"]\)/\\u005c\1/g' \
| gzip -c > ${filename}.gz
rm -f ${filename}.bz2
as this is no longer necessary.
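
A minimal sketch of the version change, assuming you work on a local copy of the script and that it contains the DBPEDIA= line quoted above. The helper name set_dbpedia_version is hypothetical and not part of Stanbol. It only rewrites that one line, so any other version-specific names in the script would still need to be checked by hand.

```shell
# Hypothetical helper (not part of Stanbol): rewrite the DBPEDIA= line
# in a local copy of the fetch script so it points at another release.
# Assumes the script contains a line starting with
#   DBPEDIA=http://downloads.dbpedia.org/
set_dbpedia_version() {
    script="$1"    # path to the local copy of the script
    version="$2"   # e.g. 3.9
    tmp="${script}.tmp"
    # Write the edited script to a temp file, then copy the content back
    # with cat so the original file permissions are kept.
    sed "s|^DBPEDIA=http://downloads.dbpedia.org/.*|DBPEDIA=http://downloads.dbpedia.org/${version}|" \
        "$script" > "$tmp" \
        && cat "$tmp" > "$script" \
        && rm -f "$tmp"
}
```

For example: set_dbpedia_version fetch_data_en_int.sh 3.9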
>
> 2. I want to access some specific dbpedia properties such as
> dbpedia-owl:locationCity and others. These are already present in the
> mappingbased_properties_en.nt
> file which is in the fetch_data_en_int.sh script but are not in the
> https://svn.apache.org/repos/asf/stanbol/trunk/entityhub/indexing/dbpedia/src/main/resources/indexing/config/mappings.txt
> file.
> Should I include them there and do a dbpedia index rebuild?
Exactly. If the size of the created Solr index is an issue, I also
recommend removing the properties you do not need.
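
Adding a property could be sketched as below, assuming a local copy of mappings.txt with one field mapping per line. The helper name add_mapping is hypothetical. Writing the property as a full URI is also an assumption on my side; if the file already uses a namespace prefix for the dbpedia ontology, use the same notation as the existing entries.

```shell
# Hypothetical helper (not part of Stanbol): append a field mapping to a
# local copy of mappings.txt unless the exact line is already present.
add_mapping() {
    mappings="$1"   # path to the local mappings.txt
    property="$2"   # mapping line to add
    # -x matches whole lines only, -F treats the property as a fixed string
    if ! grep -qxF -- "$property" "$mappings"; then
        printf '%s\n' "$property" >> "$mappings"
    fi
}
```

For example: add_mapping mappings.txt 'http://dbpedia.org/ontology/locationCity'
The index needs to be rebuilt afterwards for the new property to show up.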
>
> I've already described this in the "Named entity coref resolution based on
> dbpedia" mail thread but I thought of creating a new mail for visibility
> and for not clogging the other thread.
The old thread is already much too long anyway. Please make sure that
the important points and decisions of that thread are also reflected in
the description of STANBOL-1279.
best
Rupert
>
> Thanks,
> Cristian
--
| Rupert Westenthaler [email protected]
| Bodenlehenstraße 11 ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO
..........................................................................
| http://redlink.co/