looks very promising! what about IKS? http://wiki.iks-project.eu/index.php/Main_Page
2012/3/23 Rupert Westenthaler <[email protected]>

> Hi Stanbol community
>
> Let me forward this very good discussion and proposal for integrating
> DBpedia Spotlight with Apache Stanbol.
>
> Feedback is very welcome!
>
> best
> Rupert Westenthaler
>
> > From: Pablo Mendes <[email protected]>
> > Subject: Re: [Dbp-spotlight-users] [GSoC 2012] Project Proposal for
> >   "Integrate DBpedia Spotlight as Enhancement Engine within Apache
> >   Stanbol" (Siwei Yu)
> > Date: 23 March 2012 16:02:24 CET
> > To: Siwei Yu <[email protected]>
> > Cc: Rupert Westenthaler <[email protected]>, [email protected]
> >
> > Hi Siwei, (switching to dbp-spotlight-developers, so as to avoid
> > spamming users in dbp-spotlight-users)
> > Please see answers below.
> >
> > On Fri, Mar 23, 2012 at 3:51 PM, Siwei Yu <[email protected]> wrote:
> > Dear Pablo and Rupert,
> >
> > I'm sorry to post an incomplete email just now. Please ignore the
> > previous email.
> >
> > No problem. I figured it was an accidental ctrl+enter.
> >
> > Thanks a lot for your instructions! According to your comments, let me
> > summarise the current status of the services mapped to the four stages:
> > (1) Spotting, (2) Candidate Selection, (3) Disambiguation, (4)
> > Filtering
> > /annotate: (1), (2), (3) first candidate, (4)
> > /candidate: (1), (2), (3) all candidates
> > /disambiguate: (3)
> > /feedback: not implemented
> > Please let me know if the previous summary is incorrect.
> >
> > Correct.
> >
> > However, in Apache Stanbol each Enhancement Engine in an Enhancement
> > Chain handles a single task (Rupert, is that true?). The
> > functions of Enhancement Engines are not supposed to overlap.
> > We need to adjust the services of DBpedia Spotlight as follows:
> > /spot: (1), to be implemented in this project, for
> > DBpediaSpotlightSpotEngine
> >
> > It is likely that we will implement /spot for release v0.6, which may
> > happen before GSoC starts.
> >
> > /candidate: (2), to be refactored from current status, for
> > DBpediaSpotlightCandidateEngine
> > /disambiguate: (3), to be refactored from current status, for
> > DBpediaSpotlightDisambiguateEngine
> >
> > We would probably provide a wrapper, rather than a refactored version.
> >
> > /filter: (4), to be implemented in this project, for
> > DBpediaSpotlightFilterEngine
> > As to /annotate, I think it's a complicated service which does not
> > fit Apache Stanbol's "single task for each Enhancement
> > Engine" requirement. But we can retain it in DBpedia Spotlight for
> > other users (i.e. not for Apache Stanbol).
> >
> > Sounds like /annotate would be an enhancement chain.
> >
> > The /feedback API could be interesting, and I'd like to try to
> > implement it. More details should be discussed beforehand. However, I'm
> > not sure there's enough time to complete it in this two-month summer.
> >
> > I don't feel like wrapping DBpedia Spotlight classes is enough for a
> > summer-long coding project.
> > You should include the /feedback API in your project to make it stronger.
> > This API should take in feedback from any CMS, as Stanbol is
> > CMS-agnostic.
> > It should be able to store that feedback and later let engines query it,
> > in order to learn from their mistakes.
> > You could think, for example, about filtering implementations that would
> > use feedback data to stop making the same mistakes.
> > This is potentially the most interesting part of this project idea.
> >
> > If the project scope discussed above is generally OK, I'd like to
> > think about the project plan and come up with a project proposal
> > draft.
> >
> > By the way, I have two small questions about DBpedia Spotlight Spotting
> > and the Enhancement Chain:
> > 1. For Pablo, it's mentioned in [3] that there are three
> > implementations for Spotting: Ling Pipe Spotter, Trie Spotter, Ling
> > Pipe Chunk Spotter.
> > How does /annotate determine which implementation is best
> > for a service request? Can the user choose among
> > them manually by sending different parameter(s)?
> >
> > We also have by now 4 other implementations. We have to update the
> > documentation.
> > Please see:
> > http://www.wiwiss.fu-berlin.de/en/institute/pwo/bizer/research/publications/Mendes-Daiber-Rajapakse-Sasaki-Bizer-DBpediaSpotlight-LREC2012.pdf
> >
> > 2. For Rupert, could you please show me some examples of an Enhancement
> > Chain? I've studied some Enhancement Engines here [1]. I can
> > understand how an individual Enhancement Engine works and how to
> > implement a new one. After studying [2], I find the Enhancement Chain a
> > little confusing. Could you please point me to the source code of the
> > implementation of a concrete Enhancement Chain? I want to know the
> > data I/O interface from one Enhancement Engine to another. In other
> > words, how does the output of one Enhancement Engine become the input of
> > another one?
> >
> > Best regards,
> > Siwei Yu
> >
> > [1] http://incubator.apache.org/stanbol/docs/trunk/enhancer/engines/list.html
> > [2] http://incubator.apache.org/stanbol/docs/trunk/enhancer/chains/
> > [3] http://wiki.dbpedia.org/spotlight/technicaldocumentation?v=3qy
> >
> > > On Wed, Mar 21, 2012 at 4:27 PM, Rupert Westenthaler
> > > <[email protected]> wrote:
> > >>
> > >> Hi Siwei Yu, Pablo
> > >>
> > >> see my comments inline. To make it more readable I also removed the
> > >> parts of the mail that are not relevant to my comments.
> > >>
> > >> On Wed, Mar 21, 2012 at 12:01 AM, Pablo Mendes <[email protected]> wrote:
> > >> > On Tue, Mar 20, 2012 at 4:24 PM, Siwei Yu <[email protected]> wrote:
> > >> >> 2. Should I develop one Enhancement Engine containing three services,
> > >> >> or three engines (i.e. each service as an engine)? It's maybe related
> > >> >> to the service function granularity. What's your opinion?
> > >> >
> > >> > We could have one engine for each task separately, and an
> > >> > enhancement chain should connect them together. We should also
> > >> > introduce a REST API /spot for (1). We could perhaps make
> > >> > /candidates implement only (2) and make /annotate accept a
> > >> > &verbose=on to act like the current /candidates does.
> > >> >
> > >> > Besides all of this reorganization that has to happen, Rupert is
> > >> > the guy from Stanbol that can help you position your application
> > >> > in that regard.
> > >> >
> > >>
> > >> I fully agree with that.
> > >>
> > >> Having separate EnhancementEngines for spotting, candidate selection
> > >> and disambiguation would provide a lot of additional flexibility to
> > >> experienced Stanbol users as they could even use parts of the DBpedia
> > >> Spotlight functionalities within their existing enhancement engines.
> > >>
> > >> The definition of a DBpedia Spotlight EnhancementChain ensures that
> > >> typical users can use Spotlight without the need to know the inner
> > >> workings. Users would just need to send enhancement requests to
> > >> "http://{host}:{port}/enhancer/chain/dbpedia" assuming that the
> > >> DBpedia Spotlight chain is called "dbpedia". There would even be the
> > >> possibility to make the DBpedia Spotlight EnhancementChain the default
> > >> enhancement chain so that requests to "/enhancer" would be processed
> > >> by it.
> > >>
> > >> >>
> > >> >> By the way, my name is Siwei Yu. I have good knowledge of semantic
> > >> >> technologies, such as RDF, OWL, SPARQL. I'm also familiar with the
> > >> >> mainstream Java based RDF/OWL processing tools like owlapi, Jena,
> > >> >> Sesame, AllegroGraph. I have strong Java coding skills with good
> > >> >> knowledge of the software design patterns. My research background
> > >> >> meets the requirements very well. I believe it'll be a wonderful
> > >> >> summer working with the DBpedia Spotlight community.
> > >> >
> > >> > It would be good if you leveraged some of your Semantic Web
> > >> > background in your application. The idea of a /feedback API, which
> > >> > receives corrections made by the users could fit well in this regard.
> > >> >
> > >>
> > >> A feedback API is also something that would be interesting for the
> > >> Stanbol Enhancer.
> > >>
> > >> best
> > >> Rupert Westenthaler
> > >>
> > >> --
> > >> | Rupert Westenthaler             [email protected]
> > >> | Bodenlehenstraße 11             ++43-699-11108907
> > >> | A-5500 Bischofshofen
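
To make the /feedback idea from the thread a bit more concrete, here is a minimal sketch of how stored user corrections could feed the filtering stage (4). It is only an illustration under stated assumptions: the names Annotation, FeedbackStore and filter_annotations are hypothetical and are not part of DBpedia Spotlight or Apache Stanbol, the store is a plain in-memory set rather than anything persistent, and the example URIs are just sample data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical (surface form, DBpedia URI) pair from disambiguation."""
    surface_form: str
    uri: str


class FeedbackStore:
    """Hypothetical in-memory store for corrections sent by any CMS."""

    def __init__(self):
        self._rejected = set()

    def reject(self, annotation):
        """Record that a user marked this annotation as wrong."""
        self._rejected.add(annotation)

    def is_rejected(self, annotation):
        """Let an engine query whether this annotation was rejected before."""
        return annotation in self._rejected


def filter_annotations(annotations, store):
    """Stage (4) sketch: drop annotations that users already rejected."""
    return [a for a in annotations if not store.is_rejected(a)]


# A user corrects one wrong link; later requests no longer repeat it.
store = FeedbackStore()
store.reject(
    Annotation("Berlin", "http://dbpedia.org/resource/Berlin,_New_Hampshire")
)

candidates = [
    Annotation("Berlin", "http://dbpedia.org/resource/Berlin"),
    Annotation("Berlin", "http://dbpedia.org/resource/Berlin,_New_Hampshire"),
]
kept = filter_annotations(candidates, store)
```

Keeping the store separate from the filter mirrors the split discussed above: any CMS can write feedback, and any engine in a chain can query it. A real version would presumably need persistence and some notion of context, which this sketch leaves out.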
