> On 11/2/2011 1:38 PM, Dario wrote:
>> Hello,
>>
>> for research purposes I'm interested in differentiating DBpedia entities
>> into two types, those which are actual existing elements (i.e., things
>> you can see and touch) and generalizations of elements (i.e., things
>> which are abstractions of existing elements).
>> Examples of the first ones could be: John Turturro, the Golden Gate
>> Bridge, the Enola Gay bomber.
>> Examples of generalizations could be: Football, the femur, Boeing B-29
>> Superfortress bomber.
> This is a great research topic, but it's also of considerable
> commercial importance. If one were interested in converting DBpedia
> facts to text or creating a user interface, it would be good to know
> about "abstract" vs "concrete"
Hi Paul,

I agree. From my point of view, concrete entities have a different nature than abstract entities, which justifies treating them differently. Those differences have been discussed for thousands of years (see Aristotle, for example), and we should not ignore all that previous work. I may be naive in this matter, but I think this subject is one of the keys to the success of the Semantic Web and of the field of AI in general.

>
> There is the possibility of defining classes
> (:SomethingThatCanHaveAMember) or (:SomethingThatIsOnlyAnInstance) but
> also the possibility of defining an abstract/concrete score which is
> numerical. People tend to be very concrete, but we have no idea who
> :D.B._Cooper was or what his fate was. :Captain_Kirk is more abstract
> than :William_Shatner. When dealing with the more difficult stuff, a
> numeric score might be the best you can do.

To be honest, I had not thought about a numeric value of abstractness before. Although it may be useful in practice (I'll have to think about that), I do not think it is 'natural'. As I see it, it would be more like a measure of ignorance. Take your neighbor, for example. Imagine you have never seen him, but you know he's a man because you have heard his voice. He is certainly a concrete entity. The issue is that your ignorance about that entity causes you to generalize about him and treat him, in some respects, like an abstract entity. You would assume he has two eyes, for example. That topic also interests me and falls within my PhD, since my main interest is knowledge discovery based on the Semantic Web.

>> After reading some documentation on the DBpedia, including the latest
>> article published, it looks to me that such difference is never made.
>> Furthermore I wonder if it is even possible to make that difference
>> based on the information available. Unfortunately, I do not know enough
>> about the Wikipedia semantics to answer that.
>>
>> The only solution I can think of is manually tagging entities. That
>> could be facilitated by grouping elements (e.g., every entity of class
>> Person is an existing entity). However, other classes would require
>> individual treatment.
>>
>> So my questions are these:
>> -Is there a difference in DBpedia between existing entities and general
>> entities?
>> -Is there information available in the Wikipedia to make such difference?
>> -Based on the DBpedia, is there any other method beyond manual tagging
>> to make that difference?
>> -Of the DBpedia Ontology, which classes could be considered as holding
>> existing entities? Person, Place, Planet, Work, ...?
>>
>> I know is quite an abstract question, and not fully related with
>> technical aspects of the DBpedia, but I think this is the place to ask.
>>
> I think the strategy of starting with types and then refining
> the results is best. You could probably get a large majority of topics
> properly typed, particularly if you use type information from
> Freebase, which is more accurate and comprehensive than DBpedia types.
> The hard ones are going to be the things that fall through the cracks in
> the type system, like
>
> http://dbpedia.org/page/Fire
>
> but note that Freebase has 18 types for this topic, so you're not
> without hope.
>
> http://www.freebase.com/edit/topic/en/fire
>
> Maybe it's a fair guess to say that "things that fall through the
> cracks" are abstract.
>
> I say: try the obvious thing with types, then do some evaluation. If
> you're not happy with it, maybe you'll think of another heuristic
> (traditional knowledge engineering) or maybe you can train a machine
> learning algorithm to make the distinction. Evaluate again and repeat
> until you've got enough for a paper... or a product that's "good enough
> to use".
>
> I'd love to see a Turtle file published with these classifications
> because I could use them.
>

Thanks for the advice.
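
For reference, the type-first pass suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: the two class lists, the `ex:` namespace, the `classify` and `to_turtle` helpers, and the sample entities with their types are all made up for the example and are not taken from DBpedia or Freebase. The default for untyped entities follows the guess that "things that fall through the cracks" are abstract.

```python
# Hypothetical class lists: which ontology classes count as concrete or
# abstract is an assumption for illustration, not an official mapping.
CONCRETE_TYPES = {"Person", "Place", "Planet"}
ABSTRACT_TYPES = {"Sport", "Species", "AnatomicalStructure"}


def classify(types, fallthrough="abstract"):
    """Label an entity from its set of ontology class names.

    Concrete types are checked first, so an entity carrying both kinds
    of type is labeled concrete. Entities matching neither list get the
    `fallthrough` label.
    """
    types = set(types)
    if types & CONCRETE_TYPES:
        return "concrete"
    if types & ABSTRACT_TYPES:
        return "abstract"
    return fallthrough


def to_turtle(labels):
    """Serialize {entity_name: label} as Turtle using a made-up ex: vocabulary."""
    lines = ["@prefix ex: <http://example.org/abstractness#> .", ""]
    for name in sorted(labels):
        category = labels[name].capitalize()
        lines.append(
            f"<http://dbpedia.org/resource/{name}> ex:category ex:{category} ."
        )
    return "\n".join(lines)


# Hand-written sample input (assumed types, not retrieved from DBpedia).
SAMPLE_TYPES = {
    "John_Turturro": {"Person"},
    "Golden_Gate_Bridge": {"Place"},
    "Football": {"Sport"},
    "Femur": {"AnatomicalStructure"},
    "Fire": set(),  # no usable type: falls through the cracks
}

labels = {name: classify(types) for name, types in SAMPLE_TYPES.items()}
print(to_turtle(labels))
```

A pass based only on types cannot separate the Enola Gay from the Boeing B-29 Superfortress if both carry the same aircraft type, so cases like that would still need individual treatment or a further heuristic, followed by evaluation.
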
It seems that Freebase is a little more specific about that distinction than DBpedia. I'll study their data sets in detail.

There are several approaches to solving this problem: Data Mining, Natural Language Processing, Machine Learning, Ontologies... none of which has worked well enough so far. My proposed solution, which I have not seen anyone else try, uses inference methods to discover and learn about entities. I'd say it's a mix of DM, ML and Ont. But I need some data to start with. Freebase or DBpedia may just be it.

Hopefully in one or two years you will have your Turtle file.

Dario.

>
>
>
> ------------------------------------------------------------------------------
> RSA(R) Conference 2012
> Save $700 by Nov 18
> Register now
> http://p.sf.net/sfu/rsa-sfdev2dev1
> _______________________________________________
> Dbpedia-discussion mailing list
> Dbpedia-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>