[jira] [Comment Edited] (STANBOL-488) EnhancementProperties

Rupert Westenthaler (JIRA) Wed, 30 Apr 2014 01:47:32 -0700

    [ 
https://issues.apache.org/jira/browse/STANBOL-488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889425#comment-13889425
 ]


Rupert Westenthaler edited comment on STANBOL-488 at 4/30/14 8:44 AM:
----------------------------------------------------------------------

*NOTE:* This comment documents how Enhancement Properties are planned to be 
added to Stanbol 1.0.0. 

h2. Enhancement Property naming and definition

EnhancementProperties should be defined both as an RDF datatype property 
within an ontology and as constants in a Java class or interface. The URI 
version will use the enhancement property namespace 
(`http://stanbol.apache.org/ontology/enhancementproperties#`) with the ID of the 
property as local name. IDs SHOULD use the 
`{level-1}.{level-2}.{property-name}` syntax typically used for Java 
properties. Property IDs are case sensitive and should use only lower-case 
characters. The '-' char shall be used to make property names consisting of 
multiple words easier to read. 

Globally defined properties will use '`enhancer`' as {level-1}. For properties 
specific to an Enhancement Engine, a possibly shortened/simplified name of the 
engine should be used as {level-1}.

Typical examples of Enhancement Properties are `enhancer.max-suggestions`, 
`enhancer.min-confidence` and `entity-co-mention.adjust-existing-confidence`.
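The naming rules above could be captured on the Java side roughly as follows. This is only a sketch: the class name `EnhancementPropertyIds`, its methods and the exact regular expression are assumptions made for illustration and are not part of the Stanbol API. Allowing digits within a level is also an assumption.

```java
import java.util.regex.Pattern;

/** Hypothetical constants holder; names are illustrative, not Stanbol API. */
final class EnhancementPropertyIds {

    /** Namespace used for the URI version of a property (taken from the text). */
    static final String NAMESPACE =
            "http://stanbol.apache.org/ontology/enhancementproperties#";

    static final String MAX_SUGGESTIONS = "enhancer.max-suggestions";
    static final String MIN_CONFIDENCE = "enhancer.min-confidence";

    // At least two dot-separated levels; each level consists of lower-case
    // words (letters or digits, an assumption) joined by '-'.
    private static final Pattern ID_PATTERN =
            Pattern.compile("[a-z0-9]+(-[a-z0-9]+)*(\\.[a-z0-9]+(-[a-z0-9]+)*)+");

    private EnhancementPropertyIds() {}

    /** Returns true if the ID follows the recommended naming convention. */
    static boolean isConventionalId(String id) {
        return id != null && ID_PATTERN.matcher(id).matches();
    }

    /** Builds the URI of a property by using its ID as the local name. */
    static String toUri(String id) {
        return NAMESPACE + id;
    }
}
```

Since the syntax is only a SHOULD, such a check would more likely be used to log a warning than to reject a property.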

The definition as RDF property will use the URI and MAY also include the XSD 
data type of supported values. 

{code}

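    @prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix owl:    <http://www.w3.org/2002/07/owl#> .
    @prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
    @prefix ehprop: <http://stanbol.apache.org/ontology/enhancementproperties#> .

    ehprop:enhancer.max-suggestions     rdf:type        owl:DatatypeProperty ;
        rdfs:range      xsd:integer .

    ehprop:enhancer.min-confidence      rdf:type        owl:DatatypeProperty ;
        rdfs:range      xsd:double .

    ehprop:entity-co-mention.adjust-existing-confidence rdf:type owl:DatatypeProperty ;
        rdfs:range      xsd:double .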

{code}

*TODO:* Maybe the Enhancer should provide a service to register and look up 
defined enhancement properties.


h2. Java Interface

EnhancementProperties are passed to EnhancementEngines as a Map with String 
keys and Object values. This map is passed as an additional argument to the 
computeEnhancement(..) method of the EnhancementEngine interface. An engine 
only acts on the keys it supports and is expected to silently ignore 
unsupported keys. For illegal values of supported keys EnhancementEngines 
SHOULD throw an EnhancementPropertyException (a subclass of 
EnhancementException).

Engines MUST support parsing of EnhancementProperty values from their String 
representation. Typically this means parsing numeric values from Strings. 
In case of multiple values the value will be a Collection of Objects. Present 
properties MUST NOT be NULL. However, collections used as values might contain 
NULL values.
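The value handling described above could look roughly like the following inside an engine. This is a minimal sketch under stated assumptions: `PropertyValues` is a hypothetical helper, and `EnhancementPropertyException` is declared locally as a plain checked exception because the proposed class does not exist yet (the proposal makes it a subclass of EnhancementException).

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Hypothetical stand-in for the exception type proposed in the text. */
class EnhancementPropertyException extends Exception {
    EnhancementPropertyException(String message) {
        super(message);
    }
}

/** Hypothetical helper for reading integer-valued properties. */
final class PropertyValues {
    private PropertyValues() {}

    /** Converts a single value (a Number or its String representation). */
    static Integer toInteger(String key, Object value)
            throws EnhancementPropertyException {
        if (value instanceof Number) {
            // Note: non-integral numbers are truncated by intValue().
            return ((Number) value).intValue();
        }
        if (value instanceof String) {
            try {
                return Integer.valueOf(((String) value).trim());
            } catch (NumberFormatException e) {
                throw new EnhancementPropertyException(
                        "Illegal value '" + value + "' for property " + key);
            }
        }
        throw new EnhancementPropertyException(
                "Unsupported value " + value + " for property " + key);
    }

    /** Accepts a single value or a Collection; NULL elements are skipped. */
    static List<Integer> toIntegers(String key, Object value)
            throws EnhancementPropertyException {
        List<Integer> result = new ArrayList<>();
        if (value instanceof Collection) {
            for (Object element : (Collection<?>) value) {
                if (element != null) {
                    result.add(toInteger(key, element));
                }
            }
        } else {
            result.add(toInteger(key, value));
        }
        return result;
    }
}
```

Skipping NULL elements is one possible reading of the rule that collections might contain NULL values; an engine could equally treat them as illegal.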


*TODO:* 

* Check if EnhancementProperties should also be passed to the canEnhance(..) 
method as an additional parameter.
* Allow EnhancementEngine to expose the list of supported Enhancement Properties

h2. Definition of Enhancement Properties for Chains 

Defining Enhancement Properties for an Enhancement Chain allows 
chain-specific configuration. Enhancement Properties 
defined for a Chain will get applied to every request for that chain. Note that 
they might get overridden by Enhancement Properties passed with specific 
requests.

EnhancementProperties that are defined for Chains are represented as part of 
the ExecutionPlan. Properties added to the `ep:ExecutionPlan` node will get 
applied to every `ep:ExecutionNode` of the execution plan. Those defined for a 
specific `ep:ExecutionNode` apply only to the engine represented by this node. 
Properties of a specific `ep:ExecutionNode` override the same property set for 
the `ep:ExecutionPlan`.

As all information of the execution plan is copied by the EnhancementJobManager 
to the ExecutionMetadata, it is possible to access execution plan level 
EnhancementProperties during execution.

h2. Passing Enhancement Properties with Enhancement Requests

Enhancement Properties can also be passed with a single enhancement request. In 
this case the properties will get attached directly to the `em:ChainExecution` 
and `em:EngineExecution` nodes. 

The following listing shows a possible API for adding execution scoped 
properties to an EnhancementJob.

{code}

    @Reference
    EnhancementJobManager jobManager;

    String maxSuggestions = "enhancer.max-suggestions";
    String minConfidence = "enhancer.min-confidence";

    EnhancementJob job = EnhancementJobManager.createJob(contentItem, chainName);
    //set max suggestions to 5 for all engines
    job.setProperty(maxSuggestions, 5);
    //set min confidence of 0.33 to dbpedia-linking
    job.setProperty("dbpedia-linking", minConfidence, 0.33);
    //set min confidence of 0.85 for confidence filter engine
    //assuming this is a post-processing engine after disambiguation
    job.setProperty("conf-filter", minConfidence, 0.85);

    //now we can execute the job
    jobManager.execute(job);

{code}

*NOTE:* The setProperty(..) methods will change the state of the 
ExecutionMetadata by modifying the RDF graph.

h3. Passing Enhancement Properties via the Enhancer RESTful Service

Enhancement Properties can be passed as query parameters via their ID. In case a 
property should only be applied to a specific engine, 
`{engine-name}:{property-id}` needs to be used.

The following shows the curl request generating the equivalent of the example 
used in the above section:

{code}

    curl -X POST -H "Accept: text/turtle" -H "Content-type: text/plain" \
        --data "The Eiffel Tower is located in Paris." \
        "http://localhost:8080/enhancer?enhancer.max-suggestions=5&dbpedia-linking:enhancer.min-confidence=0.33&conf-filter:enhancer.min-confidence=0.85"

{code}

For formally defined EnhancementProperties one could implement an automatic 
validation/conversion of values (e.g. parsing the Integer value for the 
`enhancer.max-suggestions` property).
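On the server side, the `{engine-name}:{property-id}` convention could be handled by a small splitting step such as the one sketched below. The class `RequestProperties` and its fields are hypothetical. The sketch assumes that the query parameters are already URL-decoded and single-valued, and that every parameter is an Enhancement Property; a real service would first filter out its other parameters.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical holder for request scoped properties. */
final class RequestProperties {

    /** Properties that apply to all engines, keyed by property ID. */
    final Map<String, String> global = new HashMap<>();

    /** Engine specific properties: engine name -> (property ID -> value). */
    final Map<String, Map<String, String>> byEngine = new HashMap<>();

    /** Splits decoded query parameters into global and engine specific ones. */
    static RequestProperties fromQueryParameters(Map<String, String> params) {
        RequestProperties result = new RequestProperties();
        for (Map.Entry<String, String> param : params.entrySet()) {
            String name = param.getKey();
            int separator = name.indexOf(':');
            if (separator < 0) {
                // no engine prefix: applies to the whole request
                result.global.put(name, param.getValue());
            } else {
                String engine = name.substring(0, separator);
                String propertyId = name.substring(separator + 1);
                result.byEngine
                        .computeIfAbsent(engine, e -> new HashMap<>())
                        .put(propertyId, param.getValue());
            }
        }
        return result;
    }
}
```

Values are kept as Strings here, which matches the requirement that engines must be able to parse property values from their String representation.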

h2. Enhancement Properties Precedence

Properties with a higher precedence will override those with a lower one:

1. Engine specific request properties
2. Global request properties
3. `ep:ExecutionNode` properties
4. `ep:ExecutionPlan` properties 

In other words: request scoped properties always override chain scoped 
properties, and within the same scope engine specific properties always 
override globally defined ones.
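The precedence rules can be illustrated with a simple lookup over the four scopes. This is only a sketch: the proposal stores the properties in the RDF graph of the ExecutionMetadata, while the hypothetical `PropertyPrecedence.resolve(..)` below assumes they have already been read into one Map per scope.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that resolves the effective value of a property. */
final class PropertyPrecedence {
    private PropertyPrecedence() {}

    /**
     * Checks the scopes from highest to lowest precedence and returns the
     * first value found, or null if the key is not set in any scope.
     */
    static Object resolve(String key,
            Map<String, Object> engineRequest,   // 1. engine specific request
            Map<String, Object> globalRequest,   // 2. global request
            Map<String, Object> executionNode,   // 3. ep:ExecutionNode
            Map<String, Object> executionPlan) { // 4. ep:ExecutionPlan
        List<Map<String, Object>> scopes = Arrays.asList(
                engineRequest, globalRequest, executionNode, executionPlan);
        for (Map<String, Object> scope : scopes) {
            if (scope != null && scope.containsKey(key)) {
                return scope.get(key);
            }
        }
        return null;
    }
}
```

For example, a `enhancer.min-confidence` value set as a global request property would win over values defined on the `ep:ExecutionNode` or the `ep:ExecutionPlan`, but not over an engine specific request property.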




was (Author: rwesten):
*NOTE:* This comment contains a documentation of how Enhancement Properties are 
planed to be added to Stanbol 1.0.0. 

h2. Enhancement Property naming and definition

EnhancementProperties should be defined both as an RDF data type property 
within an Ontology and as constants in some Java Class or Interface. The URI 
version will use the enhancement property namespace 
(`http://stanbol.apache.org/ontology/enhancementproperties#`) and the ID of the 
property as local name. IDs SHOULD use the 
`{level-1}.{level-2}.{property-name}` syntax as typically used for java 
properties. Properties are case sensitive and should only use lower case 
characters. The '-' char shall be used to make properties with multiple names 
easier to read. 

Globally defined properties will use '`enhancer`' as {level-1}. For Enhancement 
Engine specific properties a possible shorted/simplified name of the engine 
should be used as {level-1}.

Typical examples for Enhancement Properties are `enhancer.max-suggestions`, 
`enhancer.min-confidence`, `entity-co-mention.adjust-existing-confidence`

The definition as RDF property will use the URI and MAY also include the XSD 
data type of supported values. 

{code}

    @prefix ehprop <http://stanbol.apache.org/ontology/enhancementproperties#>
    
    ehprop:enhancer.max-suggestions     rdf:type        rdfs:DatatypeProperty,
        xsd:datatype    xsd:Integer;

    ehprop:enhancer.min-confidence      rdf:type        rdfs:DatatypeProperty,
        xsd:datatype    xsd:Double;

    ehprop:entity-co-mention.adjust-existing-confidence rdf:type        
rdfs:DatatypeProperty,
        xsd:datatype    xsd:Double;

{code}

*TODO:* Maybe the Enhancer should provide a service to register & lookup 
defined enhancement properties


h2. Java Interface

EnhancementProperties are parsed to EnhancementEngines as a Map with String 
keys and Object values. This map is parsed as additional argument to the 
computeEnhancement(..) method of the EnhancementEngine interface. Keys need to 
be supported EnhancementEngines. Engines are expected to silently ignore 
unsupported keys. For illegal values of supported keys EnhancementEngines 
SHOULD throw an EnhancementPropertyExceptions (a subclass of the 
EnhancementException.

Engines MUST support parsing of EnhancementProperty values from their String 
representation. Typically this means that parsing Numeric values from Strings. 
In case of multiple values the value will be a Collection of Objects. Present 
properties MUST NOT be NULL. However collections used as value might contain 
NULL values.


*TODO:* 

* Check if EnhancementProperties should be also parsed to the canEnhance(..) 
method os additional parameter.
* Allow EnhancementEngine to expose the list of supported Enhancement Properties

h2. Definition of Enhancement Properties for Chains 

The definition of Enhancement Properties for Enhancement Chains allow to have 
chain specific configurations for an Enhancement Chain. Enhancement Properties 
defined for a Chain will get applied to every request for that chain. Note that 
they might get overridden by Enhancement Properties parsed with specific 
requests.

EnhancementProperties that are defined for Chains are represented as part of 
the ExecutionPlan. Properties added to the the `ep:ExecutionPlan` node will get 
applied to every engine defined by the chain. Those defined for an 
`ep:ExecutionNode` apply only for the engine represented by this node. Property 
values defined on a engine execution will override values of the same property 
define for the chain execution.

The EnhancementJobManager is responsible for copying over EnhancementProperties 
to the ExecutionMetadata on every request for a Chain. Enhancement Properties 
defined for the `ep:ExecutionPlan` need to be applied to the 
`em:ChainExecution` node and properties present for `ep:ExecutionNode's` need 
to be applied for the respective `em:EngineExecution` nodes. This is required 
as Enhancement Properties can also be parsed with single requests (see next 
section) and those are supposed to override properties statically defined for 
the chain.

The enhancer.servicesapi module will provide utilities for this task.

h2. Parsing Enhancement Properties with Enhancement Requests

Enhancement Properties can also be parsed with single enhancement request. In 
this case the properties will get attached directly to the `em:ChainExecution` 
and `em:EngineExecution` nodes. 

To ensure that request specific properties do override properties statically 
defined for the chain it is important that this takes place after the 
ExecutionMetadata where initialized by the EnhancementJobManager. Setting 
request specific Enhancement Properties will be possible by using the new  
EnhancementJob class.

{code}

    @Reference
    EnhancementJobManager jobManager;

    String maxSuggestions = "enhancer.max-suggestions";
    String minConfidence = "enhancer.min-confidence";

    EnhancementJob job = EnhancementJobManager.createJob(contentItem, 
chainName);
    //set max suggestions to 5 for all engine
    job.setProperty(maxSuggestions, 5);
    //set min confidence of 0.33 to dbpedia-linking
    job.setProperty("dbpedia-linking", minConfidence, 0.33);
    //set min confidence of 0.85 for confidence filter engine
    //assuming this is a post-processing engine after disambiguation
    job.setProperty("conf-filter", minConfidence, 0.85);

    //now we can execute the job
    jobManager.execute(job);

{code}

*NOTE:* that the setProperty(..) methods will change the state of the 
ExecutionMetadata by modifying the RDF graph.


h2. Parsing Enhancement Properties via the Enhancer RESTful Service

Enhancement Properties can be parsed as query parameter via their ID. In case a 
property should only be applied to a specific engine 
`{engine-name}:{property-id}` needs to be used.

The following shows the curl request generating the equivalent of the example 
used in the above section:

{code}

    curl -X POST -H "Accept: text/turtle" -H "Content-type: text/plain" \
        --data "The Eifeltower is located in Paris." 
        http://localhost:8080/enhancer?enhancer.max-suggestions=5&\
        dbpedia-linking:enhancer.min-confidence=0.33&\
        conf-filter:enhancer.min-confidence=0.85

{code}

For formally defined EnhancementProperties one could implement a automatic 
validation/conversion of values (e.g. parsing the Integer value for the 
`enhancer.max-suggestions` property).



> EnhancementProperties
> ---------------------
>
>                 Key: STANBOL-488
>                 URL: https://issues.apache.org/jira/browse/STANBOL-488
>             Project: Stanbol
>          Issue Type: New Feature
>          Components: Enhancer
>    Affects Versions: 1.0.0
>            Reporter: Rupert Westenthaler
>            Assignee: Rupert Westenthaler
>             Fix For: 1.0.0
>
>
> Enhancement Properties aim to provide Chain and Request scoped configurations 
> to EnhancementEngines. 
> __IMPORTANT NOTE:__ This Issue introduces incompatible API changes to core 
> interfaces of the Stanbol Enhancer. This includes the `EnhancementEngine` 
> interface.
> Expected usages include:
> * pass-through of user names and passwords for EnhancementEngines that 
> depend on external services. This will allow such engines to use the user 
> account of the one passing the request (request scope) or the one 
> configuring the chain (chain configuration scope)
> * pass request specific constraints (e.g. the minimum confidence level for 
> Enhancements). The acceptable confidence might depend on the actual context 
> of the client application (e.g. if the user will review results or not)
> * configure dereferencing on a per-request basis (e.g. depending on the 
> requirements of the UI showing the enhancement results)
> * reduce the number of configured engine instances (e.g. when specifying the 
> minimum required confidence level for a chain or on request level one would 
> only need a single instance of a confidence-level-filter-engine; The same 
> would be true for dbpedia-linking engines with a different amount of 
> suggested results) 
> * mapping of HTTP header fields to enhancement properties (e.g. for using the 
> "Content-Language" header for specifying the language of the content)
> NOTES:
> * See detailed description in comments dating from February 2014 or later. 
> * This is not the initial description of this issue. This is important as the 
> first 5 comments do refer to the old description. You can still read the old 
> version by looking at the history of this issue. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)
