On 27.10.2011, at 16:59, Ali Anil SINACI wrote:
>> 
>> 
>> * The LMF semantic search component overlaps greatly with the
>> "contenthub/search/engines/solr" component recently contributed by Anil.
>> Related to this, it would be great if Anil could have a look at [2] and
>> check for similarities/differences and possible integration paths.
>> 
> 
> I had a look at the semantic search component of LMF. As you pointed out,
> LMF semantic search provides a convenient way to index any part of documents
> with the help of the RDFPath language. I think that we can make use of this
> feature in the contenthub. As I described in my previous e-mail, the
> contenthub currently indexes a number of semantic fields based on DBPedia
> relations. These relations are hardcoded. The RDFPath language can be used to
> indicate specific semantic fields to be indexed along with the content itself.
> Let me describe what we have in mind with a scenario:
> 
> A user provides a domain ontology (e.g. for the music domain) and submits it
> to the Entityhub to be used in the enhancement process. Suppose the domain
> ontology includes a vast amount of information about artists, their albums,
> etc. I assume that this ontology does not include conceptual definitions (it
> only includes ABox definitions). The user writes an RDF Path Program (in LMF
> terminology) to indicate the fields to be indexed when a content item has an
> enhancement related to any path in that program. Suppose the user submits a
> content item along with the RDF Path Program(s) to be used to determine the
> fields to be indexed. The enhancement engines find an entity (or many
> entities). Now, we execute the selected RDF Path Program(s) and embed the
> results into the Solr representation of the content item.
> 
> If you have any other suggestions, please let me know so that we can discuss 
> in detail (in SRDC) before the meeting.
> 
This is exactly what I was thinking about. Let me only add that such additional
knowledge to be included in the Semantic Index might come not only from the
Entityhub, but also from other sources (like the CMS via the CMS adapter).
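
To make the scenario above concrete, here is a minimal sketch in Python. It
is only an illustration under assumptions: the toy graph, the "ex:" names, the
field names, and the dict that stands in for a path program are all
hypothetical, and the dict is not RDFPath syntax. It only shows the idea of
following property paths from the entities found by the enhancement engines
and adding the results as extra fields to the indexed document.

```python
# Toy ABox: subject -> property -> list of values (hypothetical music data).
GRAPH = {
    "ex:artist1": {"rdfs:label": ["Some Artist"], "ex:album": ["ex:album1"]},
    "ex:album1": {"rdfs:label": ["First Album"]},
}

# Stand-in for a path program: index field name -> property path.
# This is NOT RDFPath syntax, only the same idea expressed as plain data.
PROGRAM = {
    "artist_name": ["rdfs:label"],
    "album_title": ["ex:album", "rdfs:label"],
}


def follow(graph, start, path):
    """Follow a property path from one node and return all reachable values."""
    nodes = [start]
    for prop in path:
        nodes = [v for n in nodes for v in graph.get(n, {}).get(prop, [])]
    return nodes


def index_document(content, entities, program, graph):
    """Build a flat, Solr-like document: the content plus one field per path."""
    doc = {"content": content}
    for field, path in program.items():
        values = [v for e in entities for v in follow(graph, e, path)]
        if values:  # skip fields for which no path matched
            doc[field] = values
    return doc
```

Calling index_document("some text", ["ex:artist1"], PROGRAM, GRAPH) would give
a document with the content plus the "artist_name" and "album_title" fields.
The graph argument could just as well be filled from the CMS adapter instead
of the Entityhub.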

If you would like to help me with an implementation of the RdfPathLanguage (e.g.
the Clerezza-based implementation, or maybe a Jena-based implementation), please
let me know. Help would be very welcome, because I already have a lot of
things on my TODO list before the meeting in November (such as defining a
proposal for the Stanbol Enhancement Structure).

>> * The Semantic Search Interface: The Contenthub currently defines its own
>> query API (supports keyword based search as well as "field ->  value" like
>> constraints, supports facets). The LMF directly exposes the RESTful API of
>> the semantic Solr index. I strongly prefer the approach of the LMF, because
>> of the two points already described above.
> 
> We think that we do not have to choose here. We can keep a simple wrapper
> around the Solr interface (contenthub's own query API) while providing the
> Solr RESTful API as is. IMO a wrapper around the Solr interface would be
> beneficial. On the other hand, in this interface we try to make use of an
> ontology in the OntologyResourceSearchEngine. This might help to figure out
> new keywords based on the subsumption hierarchy inside the ontology. However,
> I think this may lead to performance issues and may not be useful at all. We
> can decide on this later.

You forgot to mention one additional advantage of using the Solr RESTful API:
if we do that, one could create the Semantic Index and then copy it over to some
other SolrServer without the need to run Stanbol directly on the production
infrastructure.
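
As a rough illustration of why this works: a client of the copied index only
needs plain Solr HTTP requests. The sketch below builds a request URL for
Solr's standard /select handler with faceting enabled. The base URL, the core
name "contenthub" and the field name "entities" are assumptions made up for
the example; only the Solr parameters themselves (q, fq, wt, facet,
facet.field) are standard.

```python
from urllib.parse import urlencode


def build_select_url(base_url, keywords, facet_fields, filters=None):
    """Build a URL for Solr's /select handler with facets and field filters."""
    params = [("q", keywords), ("wt", "json"), ("facet", "true")]
    # One facet.field parameter per field to facet on.
    params += [("facet.field", field) for field in facet_fields]
    for field, value in (filters or {}).items():
        # Quote the value so a URI (which contains ':' and '/') is one term.
        # Values that themselves contain '"' would need extra escaping.
        params.append(("fq", f'{field}:"{value}"'))
    return f"{base_url.rstrip('/')}/select?{urlencode(params)}"


# Hypothetical Solr location and field name, for illustration only.
url = build_select_url(
    "http://localhost:8983/solr/contenthub",
    "paris",
    ["entities"],
    {"entities": "http://dbpedia.org/resource/Paris"},
)
```

Nothing in such a request depends on Stanbol, so it would work against any
SolrServer that holds a copy of the index.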

In general I would suggest first focusing the discussion on the unique features
we would like to provide with the Semantic Search component. I already included
three features I would like to have in my first mail (Query preprocessing,
Entity Facets, Semantic Facets). As you mention, the
OntologyResourceSearchEngine is very relevant to such features.
However, adding such features does not necessarily mean creating our own query
language. One could also try to add such features directly to Solr by
implementing some Solr extensions.

> 
>>  But I am also of the opinion that a semantic search interface should at
>> least provide the following three additional features:
>>     1. Query preprocessing: e.g. substitute "Paris" in the query with
>> "http://dbpedia.org/resource/Paris";
>>     2. Entity Facets: if a keyword matches an Entity (e.g. "Paris" ->
>> "dbpedia:Paris", "dbpedia:Paris_Texas", "dbpedia:Paris_Hilton") then provide
>> a Facet to the user over such possible matches;
>>     3. Semantic Facets: if a user uses an instance of an ontology type (e.g.
>> a Place, Person, Organization) in a query, then provide facets over semantic
>> relations for such types (e.g. friends for persons, products/services for
>> Organizations, nearby Points-Of-Interest for Places, Participants for
>> Events, …). To implement features like that we need components that provide
>> query preprocessing capabilities based on data available in the Entityhub,
>> Ontonet … . To me it seems that the
>> contenthub/search/engines/ontologyresource component already provides some
>> functionality related to this, so this might be a good starting point.
> 
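
To illustrate the first two features in the list above, here is a small
Python sketch. It is only a sketch under assumptions: the hard-coded label
index is hypothetical (in Stanbol such data would come from the Entityhub),
and picking the first candidate as the substitution is an arbitrary choice
made to keep the example short. The URIs are the expanded forms of the
"dbpedia:" examples above.

```python
import re

# Hypothetical label -> candidate entity URIs lookup (lower-cased labels).
LABEL_INDEX = {
    "paris": [
        "http://dbpedia.org/resource/Paris",
        "http://dbpedia.org/resource/Paris_Texas",
        "http://dbpedia.org/resource/Paris_Hilton",
    ],
}


def preprocess_query(query, label_index=LABEL_INDEX):
    """Return (rewritten_query, entity_facets) for a keyword query.

    Each keyword that matches a known label is replaced by the first
    candidate URI (query preprocessing); all candidates for that keyword
    are returned so that they can be offered as an entity facet.
    """
    entity_facets = {}

    def substitute(match):
        word = match.group(0)
        candidates = label_index.get(word.lower())
        if not candidates:
            return word  # unknown keyword: leave it untouched
        entity_facets[word] = candidates
        return f'"{candidates[0]}"'

    rewritten = re.sub(r"\w+", substitute, query)
    return rewritten, entity_facets
```

For the query "hotels in Paris" this rewrites "Paris" to the quoted DBpedia
URI and returns the three candidates as the facet for that keyword. A real
implementation would also have to handle multi-word labels and ranking of
the candidates.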

best
Rupert

> -- 
> Ali Anil SINACI
> 
> Software Research, Development and Consultancy Ltd.
> Phone: +90 (312) 2101763
> Fax: +90 (312) 2101837
> E-mail: [email protected]
> 
