[jira] [Commented] (STANBOL-187) Extendable indexing infrastructure for the Entityhub

Florent ANDRE (JIRA) Fri, 03 Jun 2011 02:58:41 -0700

    [ 
https://issues.apache.org/jira/browse/STANBOL-187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13043272#comment-13043272
 ]


Florent ANDRE commented on STANBOL-187:
---------------------------------------

Hi Rupert, 

Thanks for this useful addition. It works well (faster than a D2RQ call) and is 
very simple to use (the auto-install of configured bundles is perfect!).

I have two remarks, though (Last Changed Rev: 1130735):

1) When indexing a SKOS file, only multi-word terms are indexed; single-word 
terms are not. I first observed this on my own thesaurus, then also in the 
iptc one. I tried this request:
$ curl -X POST -F "[email protected]" 
http://localhost:8080/entityhub/site/iptc/query 
with these queries:

1.A) @fieldQuery.json = 
{
    "offset": "0", 
    "limit": "30", 
    "constraints": [
        { 
          "type": "value", 
          "field": "http:\/\/www.w3.org\/2004\/02\/skos\/core#prefLabel", 
          "value": "Africa", 
        } 
    ]
}

==> returns no results.

1.B) @fieldQuery.json =
{
    "offset": "0", 
    "limit": "30", 
    "constraints": [
        { 
          "type": "value", 
          "field": "http:\/\/www.w3.org\/2004\/02\/skos\/core#prefLabel", 
          "value": "South America", 
        } 
    ]
}


==> returns results.

"Africa" and "South America" are both skos:prefLabel values in world-region.rdf 
in the iptc dataset.
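
To make the comparison easier to reproduce, here is a minimal Python sketch 
that builds both query bodies from one template, so that the label is the only 
thing that differs between the two cases. The helper name build_field_query is 
hypothetical (it is not part of Stanbol); the field URI, offset, limit and 
labels are the ones from the queries above.

```python
import json

# Field URI taken from the queries above (unescaped form of the same string).
SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"


def build_field_query(label, offset=0, limit=30):
    """Build a field query with a single value constraint on skos:prefLabel.

    Hypothetical helper for illustration only. Offset and limit are emitted
    as strings to mirror the queries in this comment.
    """
    return {
        "offset": str(offset),
        "limit": str(limit),
        "constraints": [
            {"type": "value", "field": SKOS_PREF_LABEL, "value": label},
        ],
    }


if __name__ == "__main__":
    # One single-word label and one multi-word label, as in cases 1.A and 1.B.
    for label in ("Africa", "South America"):
        print(json.dumps(build_field_query(label), indent=4))
```

Each printed document can be saved as fieldQuery.json and posted with the curl 
request above.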

2) When I open the "Entity hub referenced site configuration" for the imported 
site in the Felix/Sling console configuration, the "Fields mapping" part 
contains the whole mapping.txt file, including blank and comment (#) lines, 
and not only the mappings. This may be expected.
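
For illustration, this is roughly what I mean by "only mappings": a small 
Python sketch that drops blank lines and comment (#) lines from the file 
content. The function name mapping_lines and the sample text are made up for 
this example; this is not how the Entityhub actually reads the file.

```python
def mapping_lines(text):
    """Return only the lines that carry a mapping.

    Blank lines and lines whose first non-space character is '#' are dropped.
    Hypothetical helper, not part of Stanbol.
    """
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            kept.append(stripped)
    return kept


if __name__ == "__main__":
    # Placeholder content; the real mapping.txt will differ.
    sample = "# label mappings\n\nskos:prefLabel\n   # another comment\nskos:altLabel\n"
    for entry in mapping_lines(sample):
        print(entry)
```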

Cheers.
++


> Extendable indexing infrastructure for the Entityhub
> ----------------------------------------------------
>
>                 Key: STANBOL-187
>                 URL: https://issues.apache.org/jira/browse/STANBOL-187
>             Project: Stanbol
>          Issue Type: Improvement
>          Components: Entity Hub
>            Reporter: Rupert Westenthaler
>            Assignee: Rupert Westenthaler
>
> Currently the Entityhub includes some utilities to create indexes for 
> dbPedia, geonames and dblp. There is also a generic RDF indexer that is 
> used by the dbPedia and dblp utilities; however, this implementation is 
> not extendable either, and not really suitable for adding the features 
> requested by issues like STANBOL-92, STANBOL-93 and STANBOL-163.
> The goal is to create an infrastructure that provides an implementation of
>  - the indexing workflow
>  - configuration and initialization
> and defines interfaces that allow plugging in
>  - different Data Sources
>  - entity ranking implementations
>  - entity data mapper (e.g. filtering some fields, schema translations ...)
>  - indexing targets (the Yard that stores the indexed entities)
> The existing indexing utilities need to be moved to the new infrastructure.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
