Re: One Content Item, many representations

Rupert Westenthaler Thu, 20 Oct 2011 04:31:53 -0700

Hi

> On 10/20/2011 10:35 AM, florent andré wrote:
>> With a Camel route you have a built-in splitter [2] and, as its counterpart,
>> an aggregator [3].
>>
>> For both you can define particular split/aggregate business logic.


So you use this to send the different parts of an email to different
Stanbol Instances and after that you merge the enhancement results
together?

On Thu, Oct 20, 2011 at 11:08 AM, florent andré
<[email protected]> wrote:
> maybe this one : http://www.semanticdesktop.org/ontologies/nmo/
> What do you think about that ? Others more suitable ?

In the case of E-Mails the semanticdesktop NMO ontology looks ok.

I think that the decision on how to model relations between
ContentItems should be up to the Stanbol user. Stanbol returns an RDF
graph that connects all enhancements to the ContentItems they were
extracted from. Users can then use any ontology they like to link
such ContentItems together (e.g. in the business logic of the
aggregator).
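
As a rough illustration of that idea, the following Python sketch merges
the enhancement graphs returned for the mail body and for each attachment
and adds one linking triple per attachment. Graphs are modelled as plain
sets of (subject, predicate, object) tuples instead of going through an RDF
library. The function name merge_enhancements, the example URIs and the
choice of nmo:hasAttachment (with the namespace assumed below) are
illustrative assumptions, not something Stanbol or Camel provides.

```python
# Assumed namespace of the NMO ontology; verify against the ontology itself.
NMO = "http://www.semanticdesktop.org/ontologies/2007/03/22/nmo#"


def merge_enhancements(mail_uri, body_graph, attachment_graphs):
    """Merge per-part enhancement graphs and link attachments to the mail.

    body_graph: set of (s, p, o) triples returned for the mail body.
    attachment_graphs: dict mapping attachment URI -> set of triples.
    Returns a new set; the input graphs are left untouched.
    """
    merged = set(body_graph)
    for attachment_uri, graph in attachment_graphs.items():
        merged |= graph
        # User-chosen linking triple (hypothetical choice of property).
        merged.add((mail_uri, NMO + "hasAttachment", attachment_uri))
    return merged
```

In a Camel route this kind of function would be the place where the
aggregator's business logic lives; any other ontology could be swapped in
for the linking property.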


Also note that this is related to the following two topics:

1. Content Adapter Pattern: (User sends PDF; Enhancement Engine asks
the ContentAdapter to get the Text version of the PDF). The
ContentAdapter could not only support the conversion of Format A >>
Format B but also - as in the case of E-Mails - know that there is
already a Text AND an HTML version.

2. Definition of the Stanbol Enhancement Structure (see STANBOL-351)
[1]. Here one could argue that Stanbol should support parent-child
relations between ContentItems.
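
The Content Adapter idea from point 1 could look roughly like the
following Python sketch. The class ContentAdapter, its register/adapt
methods and the dictionary keyed by media type are hypothetical names used
only for illustration; they are not an existing Stanbol API.

```python
class ContentAdapter:
    """Hypothetical adapter: return an existing representation or convert one."""

    def __init__(self):
        # Maps (source media type, target media type) -> conversion function.
        self._converters = {}

    def register(self, source_type, target_type, convert):
        self._converters[(source_type, target_type)] = convert

    def adapt(self, representations, target_type):
        """representations: dict of media type -> content of one ContentItem."""
        # Case 1: the item already carries the requested version, e.g. an
        # e-mail that has both a text/plain and a text/html body.
        if target_type in representations:
            return representations[target_type]
        # Case 2: convert from any available format (Format A >> Format B).
        for source_type, content in representations.items():
            convert = self._converters.get((source_type, target_type))
            if convert is not None:
                return convert(content)
        raise LookupError("no representation or converter for " + target_type)
```

An engine that needs plain text would then call something like
adapter.adapt(parts, "text/plain") without caring whether the text was
already present or had to be converted.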

best
Rupert

> ++
>
>>
>>
>> This idea will then not be so hard to implement:
>>  >> One could also add some additional triples that link the attachment
>>  >> with the Mail and that the content of the Mail is available as a
>>  >> text and html version.
>>
>> Is there a particular / recommended / standard type of triple to
>> describe that:
>> - the attachment graph is linked to the Mail graph
>> - the content is available as text and html
>> ?
>>
>> Thanks.
>>
>> [1] : http://camel.apache.org/enterprise-integration-patterns.html
>> [2] : http://camel.apache.org/splitter.html
>> [3] : http://camel.apache.org/aggregator2.html
>>
>> On 10/20/2011 09:07 AM, Fabian Christ wrote:
>>>
>>> Hi,
>>>
>>> if I remember correctly, we had the idea to allow different chains of
>>> enhancement engines to be configured under different URLs. Maybe
>>> Florent's use case is interesting for this. Florent could create an
>>> engine that is able to split the different content types and then
>>> start enhancement with different chains for each content type. If
>>> chains can call other chains, it would be possible to define such
>>> complex workflows for content enhancement.
>>>
>>> Best,
>>> - Fabian
>>>
>>> 2011/10/19 Rupert Westenthaler<[email protected]>:
>>>>
>>>> Hi florent
>>>>
>>>> I would use two enhancement requests
>>>>
>>>> 1. for the Text and
>>>> 2. for the Attachment.
>>>>
>>>> and then merge the returned RDF graphs with the enhancements. One
>>>> could also add some additional triples that link the attachment with
>>>> the Mail and that the content of the Mail is available as a text and
>>>> html version.
>>>>
>>>> best
>>>> Rupert
>>>>
>>>> On Wed, Oct 19, 2011 at 6:22 PM, florent andré
>>>> <[email protected]> wrote:
>>>>>
>>>>> Hi Stanbolers !
>>>>>
>>>>> Imagine a classic html mail with an attachment.
>>>>> This mail is in fact composed of (at least) 3 parts:
>>>>> * text/plain mail body
>>>>> * html mail body
>>>>> * attachment.
>>>>>
>>>>> One html mail + attachment can be considered as one CI - one piece of
>>>>> information/knowledge sent by one person.
>>>>>
>>>>> In fact, text plain and html will have (pretty much*) the same
>>>>> metadata, and keeping both is interesting:
>>>>> - text plain for processing and annotation positions
>>>>> - html to keep the source and to be able to enhance the html with rdfa,
>>>>> links,...
>>>>>
>>>>> The attachment will mostly have different metadata, but this
>>>>> metadata is in a way related to that of the mail body...
>>>>>
>>>>> It would be a pity - IMO - to manage attachment and mail body
>>>>> metadata in a totally disconnected way (aka two different Content Items).
>>>>>
>>>>> Note that this use case also matches CMS articles with files (pdf,
>>>>> odt...) to download for further reading.
>>>>>
>>>>> And now the real question:
>>>>> How can we nicely manage this kind of "composed thing"?
>>>>>
>>>>> Insights are very welcome ! :)
>>>>> Have a good day
>>>>> ++
>>>>>
>>>>>
>>>>> * pretty much, because one can imagine being able to extract some more
>>>>> metadata from html (color, font size, rdfa, ...)
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> | Rupert Westenthaler [email protected]
>>>> | Bodenlehenstraße 11 ++43-699-11108907
>>>> | A-5500 Bischofshofen
>>>>
>>>
>>>
>>>
>



-- 
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
