>   On 11/2/2011 1:38 PM, Dario wrote:
>> Hello,
>>
>> for research purposes I'm interested in differentiating DBpedia entities
>> into two types, those which are actual existing elements (i.e., things
>> you can see and touch) and generalizations of elements (i.e., things
>> which are abstractions of existing elements).
>> Examples of the first ones could be: John Turturro, the Golden Gate
>> Bridge, the Enola Gay bomber.
>> Examples of generalizations could be: Football, the femur, Boeing B-29
>> Superfortress bomber.
>      This is a great research topic,  but it's also of considerable
> commercial importance.  If one were interested in converting DBpedia
> facts to text or creating a user interface,  it would be good to know
> about "abstract" vs "concrete"

Hi Paul,

I agree. From my point of view, concrete entities are different in nature from
abstract entities, which justifies treating them differently. Those
differences have been discussed for thousands of years (see Aristotle, for
example), and we should not ignore all that previous work.
I may be naive on this matter, but I think this subject is one of the keys
to the success of the Semantic Web and of the field of AI in general.


>
>      There is the possibility of defining classes
> (:SomethingThatCanHaveAMember) or (:SomethingThatIsOnlyAnInstance) but
> also the possibility of defining an abstract/concrete score which is
> numerical.  People tend to be very concrete,  but we have no idea who
> :D.B._Cooper was or what his fate was.  :Captain_Kirk is more abstract
> than :William_Shatner.  When dealing with the more difficult stuff,  a
> numeric score might be the best you can do.

To be honest, I had not thought about a numeric value of abstractness before.
Although it may be useful in practice (I'll have to think about that), I do not
think it is 'natural'. As I see it, it would be more like a measure of ignorance.
Take your neighbor, for example. Imagine you have never seen him, but you know
he's a man because you have heard his voice. He is certainly a concrete entity.
The issue here is that your ignorance about that entity causes you to generalize
about him and treat him, in some respects, like an abstract entity. You would
assume he has two eyes, for example.
That topic also interests me and falls within my PhD, since my main interest is
knowledge discovery based on the Semantic Web.
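
To make the "score vs. ignorance" point concrete, here is a minimal sketch in
Python. It is only an illustration under my own assumptions: the function name
and the idea of collecting 'concrete'/'abstract' votes (for example, one vote
per known type of an entity) are hypothetical and not something DBpedia
provides. It returns the amount of evidence next to the score, so that "we know
nothing" can be told apart from "the evidence is mixed".

```python
def abstractness(votes):
    """Score one entity from a list of 'concrete'/'abstract' votes.

    Hypothetical helper: returns (score, evidence), where score is the
    fraction of 'abstract' votes (1.0 = fully abstract) and evidence is
    the number of votes the score is based on.
    """
    if not votes:
        # No information at all: a neutral score backed by zero evidence.
        return 0.5, 0
    abstract_votes = sum(1 for v in votes if v == "abstract")
    return abstract_votes / len(votes), len(votes)


# The unseen neighbor: nothing known yet, so the score says nothing.
neighbor = abstractness([])
# An entity whose known types disagree: same score, but real evidence.
mixed = abstractness(["abstract", "concrete"])
```

Both calls give a score of 0.5, but only the second one is backed by any
evidence, which is the difference I mean between abstractness and ignorance.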

>> After reading some documentation on the DBpedia, including the latest
>> article published, it looks to me that such difference is never made.
>> Furthermore I wonder if it is even possible to make that difference
>> based on the information available. Unfortunately, I do not know enough
>> about the Wikipedia semantics to answer that.
>>
>> The only solution I can think of is manually tagging entities. That
>> could be facilitated by grouping elements (e.g., every entity of class
>> Person is an existing entity). However, other classes would require
>> individual treatment.
>>
>> So my questions are these:
>> -Is there a difference in DBpedia between existing entities and general
>> entities?
>> -Is there information available in the Wikipedia to make such difference?
>> -Based on the DBpedia, is there any other method beyond manual tagging
>> to make that difference?
>> -Of the DBpedia Ontology, which classes could be considered as holding
>> existing entities? Person, Place, Planet, Work, ...?
>>
>> I know is quite an abstract question, and not fully related with
>> technical aspects of the DBpedia, but I think this is the place to ask.
>>
>          I think the strategy of starting with types and then refining
> the results is best.  You could probably get a large majority of topics
> properly typed,  particularly if you use type information from
> Freebase,  which is more accurate and comprehensive than DBpedia types.
> The hard ones are going to be the things that fall through the cracks in
> the type system,  like
>
> http://dbpedia.org/page/Fire
>
> but note that Freebase has 18 types for this topic,  so you're not
> without hope.
>
> http://www.freebase.com/edit/topic/en/fire
>
> Maybe it's a fair guess to say that "things that fall through the
> cracks" are abstract.
>
> I say:  try the obvious thing with types,  then do some evaluation.  If
> you're not happy with it,  maybe you'll think of another heuristic
> (traditional knowledge engineering) or maybe you can train a machine
> learning algorithm to make the distinction.  Evaluate again and repeat
> until you've got enough for a paper...  or a product that's "good enough
> to use".
>
> I'd love to see a Turtle file published with these classifications
> because I could use them.
>

Thanks for the advice. It seems that Freebase draws that distinction a little
more specifically than DBpedia does. I'll study their data sets in detail.
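
As a first pass at the "types first, then refine" strategy, something like the
following Python sketch might do. Everything in it is an assumption for
illustration: the class-to-label table is not a vetted list, the `ex:` prefix
and `ex:kind` property are made up, and the sample entities and their types are
hand-written rather than read from DBpedia or Freebase. Untyped entities get
the "falls through the cracks, so probably abstract" guess, and entities with
conflicting types are left for individual treatment.

```python
# Hypothetical mapping from ontology class names to a label.
# Illustrative only; a real table would need evaluation.
CLASS_LABELS = {
    "Person": "concrete",
    "Place": "concrete",
    "Planet": "concrete",
    "Sport": "abstract",
    "Species": "abstract",
}


def classify(types, fallback="abstract"):
    """Label an entity from its class names.

    Returns 'concrete' or 'abstract' when the known types agree,
    `fallback` when no type is recognised, and 'unknown' on conflict.
    """
    labels = {CLASS_LABELS[t] for t in types if t in CLASS_LABELS}
    if not labels:
        return fallback
    if len(labels) == 1:
        return labels.pop()
    return "unknown"


def to_turtle(entity_types):
    """Serialise the labels as Turtle, using a made-up ex: vocabulary."""
    lines = [
        "@prefix dbr: <http://dbpedia.org/resource/> .",
        "@prefix ex: <http://example.org/kind#> .",
    ]
    for name, types in sorted(entity_types.items()):
        lines.append(f"dbr:{name} ex:kind ex:{classify(types)} .")
    return "\n".join(lines)


# Hand-written sample data, not taken from any dump.
sample = {
    "John_Turturro": ["Person"],
    "Football": ["Sport"],
    "Fire": [],
}
turtle = to_turtle(sample)
```

The evaluation step would then consist of checking a sample of the output by
hand and adjusting the table or the fallback before trying anything heavier.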

There are several approaches to this problem: Data Mining, Natural Language
Processing, Machine Learning, Ontologies... None of them has worked well enough
so far. My proposed solution, which I have not seen anyone else try, uses
inference methods to discover and learn about entities. I'd say it's a mix of DM,
ML and Ont. But I need some data to start with. Freebase or DBpedia may be just
that. Hopefully in one or two years you will have your Turtle file.

Dario.

>
>
>
> ------------------------------------------------------------------------------
> RSA(R) Conference 2012
> Save $700 by Nov 18
> Register now
> http://p.sf.net/sfu/rsa-sfdev2dev1
> _______________________________________________
> Dbpedia-discussion mailing list
> Dbpedia-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>


