Hi Rafa, I don't have a concrete heuristic yet, but I'm working on it. I'll post it here so that you guys can give me feedback on it.
What are "locality" features? I looked at BART and other coref tools such as ArkRef and CherryPicker, and they don't provide this kind of coreference.

Cristian

2014-01-30 Rafa Haro <rh...@apache.org>:

> Hi Cristian,
>
> Without having more details about your concrete heuristic, in my honest
> opinion, such an approach could produce a lot of false positives. I don't
> know if you are planning to use some "locality" features to detect such
> coreferences, but you need to take into account that it is quite usual for
> coreferenced mentions to occur even in different paragraphs. Although I'm
> not an expert in Natural Language Understanding, I would say it is quite
> difficult to get decent precision/recall rates for coreferencing using
> fixed rules. Maybe you can try other tools like BART
> (http://www.bart-coref.org/).
>
> Cheers,
> Rafa Haro
>
> On 30/01/14 10:33, Cristian Petroaca wrote:
>
>> Hi,
>>
>> One of the necessary steps for implementing the Event extraction Engine
>> feature (https://issues.apache.org/jira/browse/STANBOL-1121) is to have
>> coreference resolution in the given text. This is provided now via the
>> stanford-nlp project, but as far as I saw this module performs mostly
>> pronominal (He, She) or nominal (Barack Obama and Mr. Obama) coreference
>> resolution.
>>
>> In order to get more coreferences from the text I thought of creating some
>> logic that would detect this kind of coreference:
>> "Apple reaches new profit heights. The software company just announced its
>> 2013 earnings."
>> Here "The software company" obviously refers to "Apple".
>> So I'd like to detect mentions that corefer with a Named Entity, where the
>> mention matches the rdf:type of the Named Entity, in this case "company",
>> and also has attributes which can be found in the dbpedia categories of
>> the named entity, in this case "software".
>>
>> The detection of coreferences such as "The software company" in the text
>> would also be done by either using the new Pos Tag Based Phrase extraction
>> Engine (noun phrases) or by using a dependency tree of the sentence and
>> picking up only subjects or objects.
>>
>> At this point I'd like to know if this kind of logic would be useful as a
>> separate Enhancement Engine (in case the precision and recall are good
>> enough) in Stanbol?
>>
>> Thanks,
>> Cristian
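
The rule proposed in the original message could be sketched roughly as follows. This is only a toy illustration in Python, not Stanbol code. The names `NamedEntity` and `is_candidate_coreference` are hypothetical, and the type and category words for "Apple" are hand-written stand-ins for what would be looked up from the entity's rdf:type and dbpedia categories. It assumes the noun phrase has already been extracted, treats the last word as the head noun, and requires at least one modifier that matches a category word.

```python
from dataclasses import dataclass, field

# Determiners are ignored when comparing a noun phrase with an entity.
DETERMINERS = {"the", "this", "that", "these", "those", "a", "an"}


@dataclass
class NamedEntity:
    """A named entity with hand-written, illustrative knowledge-base labels."""
    name: str
    type_labels: set = field(default_factory=set)     # stand-in for rdf:type labels
    category_words: set = field(default_factory=set)  # stand-in for dbpedia category words


def is_candidate_coreference(noun_phrase: str, entity: NamedEntity) -> bool:
    """Return True if the noun phrase is a candidate mention of the entity.

    Simplified rule: the last word is taken as the head noun and must equal
    one of the entity's type labels; there must be at least one modifier,
    and every modifier must appear among the entity's category words.
    """
    words = [w.lower() for w in noun_phrase.split()]
    words = [w for w in words if w not in DETERMINERS]
    if not words:
        return False
    head, modifiers = words[-1], words[:-1]
    if head not in entity.type_labels:
        return False
    return bool(modifiers) and all(m in entity.category_words for m in modifiers)


# Toy data for the example sentence; real values would come from the knowledge base.
apple = NamedEntity(
    name="Apple",
    type_labels={"company"},
    category_words={"software"},
)

print(is_candidate_coreference("The software company", apple))
print(is_candidate_coreference("The oil company", apple))
```

A check like this only proposes candidates. Deciding which earlier entity a matching phrase actually refers to, possibly several paragraphs back, would still need to be handled separately.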