Hi all,

Luis says:

> First I assumed that the infobox types (instance_types_en.nt.bz2) contains
> all the tags which spotlight is able to extract from a text. However as I
> found out later, it does not. For example
> http://dbpedia.org/resource/Photography is missing from the
> instance_types dump. I assume it has something to do with the fact that the
> instance_types are called "infobox types" and therefore do not contain
> everything? However I was not able to find the Photography URI in any file
> in combination with its ontology type (owl#Thing). Does a dump for this
> even exist? Do you know anything about that?
>

Luis, instance types are extracted from the mappings handcrafted at
http://mappings.dbpedia.org, which map infobox templates to ontology classes.
This means that if a page does not have an infobox, it will not have a type
statement.
You can get a list of all DBpedia resources that DBpedia Spotlight cares
about by running our indexing code and grabbing the "Concept URIs" file. As
an alternative, you could decompress the DBpedia download files, cut out the
subject column and deduplicate the union ("bzcat | cut | sort | uniq").
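
For anyone who prefers a script to a shell pipeline, here is a rough Python
sketch of the same idea. It assumes the dumps are bzip2-compressed N-Triples
files with one statement per line, and it only looks at the subject position,
so a resource that appears only as an object would be missed. The function
names and the `prefix` parameter are made up for illustration; they are not
part of the DBpedia or Spotlight code.

```python
import bz2
from typing import Iterable, Iterator, Set

# Assumption: DBpedia resources are the URIs under this namespace.
DBPEDIA_RESOURCE_PREFIX = "http://dbpedia.org/resource/"


def subjects_from_ntriples(
    lines: Iterable[str], prefix: str = DBPEDIA_RESOURCE_PREFIX
) -> Iterator[str]:
    """Yield the subject URI of each N-Triples line (the "cut" step).

    Blank lines, comment lines and subjects outside `prefix` are skipped.
    Duplicates are not removed here.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The subject is the first whitespace-delimited token, e.g. <http://...>
        subject = line.split(None, 1)[0]
        if subject.startswith("<" + prefix) and subject.endswith(">"):
            yield subject[1:-1]  # drop the angle brackets


def collect_resources(dump_paths: Iterable[str]) -> Set[str]:
    """Return the deduplicated subjects over all dumps (the "sort | uniq" step)."""
    resources: Set[str] = set()
    for path in dump_paths:
        # bz2.open in text mode decompresses on the fly (the "bzcat" step)
        with bz2.open(path, "rt", encoding="utf-8") as handle:
            resources.update(subjects_from_ntriples(handle))
    return resources
```

Calling `collect_resources` with the paths of the downloaded dumps (for
example instance_types_en.nt.bz2 plus whichever other files you fetched)
would return the union as a Python set.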

But, folks, I think that Luis has a point. On the one hand, by definition,
everything is an owl:Thing, so even if we do not include this statement
explicitly, any reasoner should be able to infer it. On the other hand,
we do not assume that people are using reasoners, so I wonder what the
right thing to do here is. A few options that I considered:
1) add documentation explaining that clients should infer "rdf:type
owl:Thing" for every resource mentioned anywhere in DBpedia
2) provide a new dataset "obvious_instance_types_en.nt" with statements
like "?resource rdf:type owl:Thing" for every DBpedia Resource that we
include in our dumps
3) include "?resource rdf:type owl:Thing" in instance_types for every
resource that does not have another type there
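
To make options 2 and 3 a bit more concrete, here is a small Python sketch of
how such statements could be generated. It is only an illustration under some
assumptions: the input is N-Triples with one statement per line, the set of
all resources has already been collected somehow, and the helper names
`typed_subjects` and `thing_statements` are hypothetical, not existing
extraction framework code.

```python
from typing import Iterable, Iterator, Set

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_THING = "http://www.w3.org/2002/07/owl#Thing"


def typed_subjects(lines: Iterable[str]) -> Set[str]:
    """Return the subjects that have at least one rdf:type statement."""
    typed: Set[str] = set()
    for line in lines:
        # Split into subject, predicate and the rest (object plus final dot)
        parts = line.strip().split(None, 2)
        if len(parts) == 3 and parts[1] == "<" + RDF_TYPE + ">":
            typed.add(parts[0][1:-1])  # drop the angle brackets
    return typed


def thing_statements(
    all_resources: Iterable[str], already_typed: Iterable[str] = ()
) -> Iterator[str]:
    """Yield '<resource> rdf:type owl:Thing' lines in N-Triples syntax.

    With an empty `already_typed` this corresponds to option 2 (a statement
    for every resource); passing the typed subjects corresponds to option 3
    (only resources that have no other type).
    """
    for uri in sorted(set(all_resources) - set(already_typed)):
        yield "<%s> <%s> <%s> ." % (uri, RDF_TYPE, OWL_THING)
```

For option 3, `typed_subjects` would be run over the lines of instance_types
and its result passed as `already_typed`; for option 2 the second argument is
simply left out.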

Is there another easy way to grab the set of all resources in DBpedia?

Cheers,
Pablo