Hi Suat,

Great news! I will have a detailed look next week.

best
Rupert

On Thu, Jul 19, 2012 at 4:15 PM, Suat Gonul <[email protected]> wrote:
> By the way, STANBOL-471 is the initial issue dedicated to this structure.
>
>
> On 07/19/2012 05:12 PM, Suat Gonul wrote:
>> Hi everyone,
>>
>> I have just committed the initial implementation of the index part of
>> the 2-layered structure of Contenthub. So, we have initial
>> implementations for both Store and Index layers now. Currently, this
>> work is carried out on the "contenthub-two-layered-structure" branch.
>> So, to try out this new structure, the contenthub module on this branch
>> should be built.
>>
>> I would be very glad to hear your feedback. Below, you can see the logs
>> from the commit:
>>
>> Best,
>> Suat
>>
>> Logs:
>> Initial version of the default implementation of the SemanticIndex
>> interface which is defined in STANBOL-499.
>>
>> SemanticIndex is one part of the 2-layered structure of Contenthub. The
>> other part is the Store which is defined in STANBOL-498.
>>
>> Default implementation of the SemanticIndex interface
>> (LDPathSemanticIndex) is based on the LDPath language. A new
>> LDPathSemanticIndex can be created by providing name, description and
>> LDPath values. In the scope of LDPathSemanticIndex the provided LDPath
>> program is used in two ways which will be explained later in this log.
>>
>> Each instance of this implementation checks the changes in the Store at
>> regular intervals in a separate thread and the interval length is
>> configurable. After processing the changes in the Store, the last
>> revision is stored persistently. In this way, when the index is
>> restarted, it will check the changes since the latest persisted
>> revision. However, when the LDPath program is changed, the
>> LDPathSemanticIndex will index the ContentItems from scratch. During
>> this period the index will be in the REINDEXING state and does not
>> allow other index or remove operations. After reindexing is completed,
>> the state of the index will be ACTIVE.
>>
>> LDPath usages in LDPathSemanticIndex
>> ====================================
>> a) It is used to configure the underlying Solr core. The LDPath program
>> determines the index fields, and Solr-specific properties such as
>> "multiValued" and "termVectors" can be configured.
>>
>> b) When indexing of a ContentItem is in progress, each named entity
>> contained in the enhancements of the ContentItem will be queried through
>> the Entityhub. Then, the values obtained from the Entityhub will be
>> indexed along with the actual content as additional metadata. This
>> additional metadata is fully compatible with the underlying Solr core.
>>
>> This ability to create customized indexes allows compatibility with
>> different domains or use-cases.
>>
>> Creating, Retrieving LDPathSemanticIndex instances
>> ==================================================
>> The {stanbol_host}/index endpoint can be used to retrieve already
>> registered SemanticIndexes. An LDPathSemanticIndex can be created
>> through the RESTful service, i.e. {stanbol_host}/index/ldpath, or
>> through the Felix Web Console by configuring an "Apache Stanbol
>> Contenthub LDPath Based Semantic Index".
>>
>> Each instance of LDPathSemanticIndex is registered as an OSGi component.
>> So, they can be obtained through ServiceTracker/@Reference.
>> Name(Semantic-Index-Name) and description(Semantic-Index-Name)
>> properties can be used to retrieve specific instances of
>> LDPathSemanticIndex from the OSGi environment. Also, the
>> SemanticIndexManager service provides retrieval of indexes according to
>> their names and EndpointTypes.
>>
>> Search over the LDPathSemanticIndex
>> ===================================
>> The previous search functionality of the Contenthub has not changed.
>> It is wrapped under two types of endpoints: 1) RESTful endpoints 2)
>> OSGi based Java endpoints. There are two RESTful endpoints: SOLR and
>> CONTENTHUB. The SOLR endpoint can be used to query the actual
>> underlying Solr core. The CONTENTHUB endpoint offers a search option
>> whose results contain additional information besides the resultant
>> documents: facets regarding the resultant documents and related
>> keywords about the original query term. This endpoint is more
>> experimental and open to changes.
>



-- 
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
