Re: Newbie questions

Suat Gonul Thu, 15 Mar 2012 01:17:26 -0700

Hi Michel,

In its current state, Contenthub is a knowledge repository which indexes
documents and information related to those documents in Solr cores. It
provides different kinds of search over the indexes by making use of Solr
capabilities.


LDPath is basically an RDF path language which allows querying RDF data
with different kinds of features. You can have a look at the link I
provided in the previous mail for those features. Contenthub uses this
language to create semantic, domain-specific indexes. When you populate
these indexes with your own documents, the enhancements for each
document are first obtained from the Enhancer. Then the LDPath endpoint
of Entityhub is queried for each named entity detected in the document,
using the same LDPath program that was used to create the selected Solr
index. In this way, the actual content of the document, its
enhancements and other semantic information related to its entities are
materialized as a single document in Solr.
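
To make the flow above more concrete, here is a rough Python sketch of
the materialization step. It is only an illustration, not Contenthub
code: the in-memory ENTITY_STORE stands in for Entityhub, the entity
URI and its values are made up, and the function names and the
"content"/"entities" field names are hypothetical. Only the two field
definitions mirror the LDPath program from my previous mail.

```python
from typing import Dict, List

NCI = "http://www.mindswap.org/2003/nciOncology.owl#"

# Field name -> property URI, mirroring the two LDPath field definitions
# (nci_semantic_type and nci_umls_cui) from the earlier mail.
LDPATH_FIELDS: Dict[str, str] = {
    "nci_semantic_type": NCI + "Semantic_Type",
    "nci_umls_cui": NCI + "UMLS_CUI",
}

# Hypothetical stand-in for Entityhub: entity URI -> property URI -> values.
# The entity and its values are placeholders, not real ontology data.
ENTITY_STORE: Dict[str, Dict[str, List[str]]] = {
    NCI + "ExampleEntity": {
        NCI + "Semantic_Type": ["example-semantic-type"],
        NCI + "UMLS_CUI": ["example-cui"],
    },
}


def evaluate_fields(entity_uri: str,
                    program: Dict[str, str],
                    store: Dict[str, Dict[str, List[str]]]) -> Dict[str, List[str]]:
    """Collect the values of every program field for a single entity."""
    properties = store.get(entity_uri, {})
    return {field: list(properties.get(prop, []))
            for field, prop in program.items()}


def materialize(content: str,
                entity_uris: List[str],
                program: Dict[str, str],
                store: Dict[str, Dict[str, List[str]]]) -> Dict[str, object]:
    """Merge the content, the detected entities and their field values
    into one flat dictionary, similar in spirit to a single Solr document."""
    doc: Dict[str, object] = {"content": content, "entities": list(entity_uris)}
    for uri in entity_uris:
        for field, values in evaluate_fields(uri, program, store).items():
            doc.setdefault(field, []).extend(values)
    return doc


if __name__ == "__main__":
    example = materialize("Some text about cancer.",
                          [NCI + "ExampleEntity"],
                          LDPATH_FIELDS,
                          ENTITY_STORE)
    print(example)
```

In the real system the entity lookup is done through the LDPath
endpoint of Entityhub and the result is stored in the selected Solr
core; the sketch only shows how the three kinds of information end up
in one record.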

I am preparing a demo (planning to finish it today or tomorrow), and once
it is ready I will share the link, as it might give an idea about the
usage of Contenthub.

Best,
Suat

On 03/14/2012 05:16 PM, Michel Benevento wrote:
> Suat,
>
> I am happy to report some initial success, as I got the OWL file indexed and 
> loaded and it produces results (steps 1 & 2) through enhancer.
>
> I am not quite sure what you mean by points 3 & 4 because I don't know LDPath 
> or what contenthub exactly is. Are you still talking about the content 
> management use case or is this related to public delivery of content? I will 
> study some more, since I am still figuring out what Stanbol's role could be 
> in live search scenarios or how we want to deploy this. Any advice is welcome 
> of course.
>
> Anyway, thanks so far, more experiments await.
>
> Brgds,
> Michel
>
>
> On 14 mrt. 2012, at 10:21, Suat Gonul wrote:
>
>> Hi Michel,
>>
>> Let me tell you about some generic use cases about usage of the ontology
>> you pointed out.
>>
>> 1) You can create a referenced site from this ontology so that this
>> referenced site can be used to extract entities during the enhancement
>> process of a content item.
>>
>> To create a referenced site from the ontology you pointed, you can use
>> the instructions that are explained in [1]. Also the documentation page
>> for using custom vocabularies in Stanbol is at [2]. The mail thread at
>> [3] includes valuable information about index creation and problems you
>> can encounter.
>>
>> Let me propose a mappings configuration for the ontology you pointed
>> out. Basically, the properties you specified in these configurations
>> will be fetched from the ontology and indexed. You can directly write
>> full URIs of the properties you want to index e.g:
>>
>> http://www.w3.org/2000/01/rdf-schema#label
>> http://www.mindswap.org/2003/nciOncology.owl#id
>> http://www.mindswap.org/2003/nciOncology.owl#code
>> http://www.mindswap.org/2003/nciOncology.owl#Semantic_Type
>> http://www.mindswap.org/2003/nciOncology.owl#Preferred_Name
>> http://www.mindswap.org/2003/nciOncology.owl#UMLS_CUI
>> http://www.mindswap.org/2003/nciOncology.owl#CTRM_ID
>> http://www.mindswap.org/2003/nciOncology.owl#NSC_Code
>> http://www.mindswap.org/2003/nciOncology.owl#Synonym
>>
>>
>> 2) To be able to use the referenced site in the enhancement process, the
>> easiest way from my side is to configure a "KeywordLinkingEngine".
>> @Stanbolers, please correct me if I'm wrong. [2] contains documentation
>> about the "KeywordLinkingEngine".
>>
>> 3) You can create semantic Solr indexes to store your documents together
>> with their enhancements obtained via Stanbol Enhancer. This can be done
>> thanks to LDPath[4] integration of Contenthub. You can use RESTful
>> services of Contenthub to create new Solr cores based on the provided
>> LDPath. See also the documentation at [5]. Here is a possible LDPath
>> program that can be used to create semantic Solr indexes.
>>
>> @prefix rdfs : <http://www.w3.org/2000/01/rdf-schema#>;
>> @prefix nci : <http://www.mindswap.org/2003/nciOncology.owl#>;
>>
>> nci_semantic_type = nci:Semantic_Type :: xsd:string;
>> nci_umls_cui = nci:UMLS_CUI :: xsd:string;
>>
>> When you submit a document to the Solr core created based on this
>> LDPath, all named entities detected from the document will be queried
>> with this LDPath. This means, the fields of entities will be obtained
>> from Entityhub and they will be stored along with the actual content.
>>
>> 4) You can use the semantic Solr indexes directly through their HTML
>> entry points or you can use Contenthub with different kinds of search
>> functionalities e.g. keyword search, faceted search.
>>
>> This is a long story and there is much to read. Please ask if you have
>> any further questions.
>>
>> Best,
>> Suat
>>
>>
>> [1]
>> http://svn.apache.org/repos/asf/incubator/stanbol/trunk/entityhub/indexing/genericrdf/README.md
>> [2] http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html
>> [3] http://markmail.org/message/mvwik2ykgkindnzh
>> [4] http://code.google.com/p/ldpath/
>> [5] http://incubator.apache.org/stanbol/docs/trunk/contenthub/contenthub5min
>>
>>
>> On 03/13/2012 11:26 PM, Michel Benevento wrote:
>>> Hello,
>>>
>>> I am evaluating Stanbol as a 'content enhancer' for medical (cancer) 
>>> related content. I am new to the world of ontologies and all things 
>>> Stanbol, so please bear with me as I ask a few simple questions about 
>>> getting an initial ontology loaded.  I found an owl file here that I would 
>>> like to use for evaluation purposes: 
>>> http://www.mindswap.org/2003/CancerOntology/
>>>
>>> I downloaded and installed Stanbol successfully and it runs OK.
>>>
>>> Questions:
>>> - is it possible to 'load' a Stanbol ontology with this OWL file? Does this 
>>> even make sense?
>>> - If so, please provide step by step instructions, I have a clean install 
>>> up and running and nothing more.
>>> - If not, how would I go about this? Please be as elaborate as you can.
>>>
>>> Thanks a lot,
>>> Michel Benevento.
>>>
>>>
>
