Re: Save Skos thesauri, modify, map and share them

Alessandro Adamou Mon, 06 Feb 2012 09:49:54 -0800

Hi Florent,

Sorry for getting back to you so late on your feedback, it kind of slipped through - hope you can still put my answer to use now.



On 1/19/12 2:52 PM, florent andré wrote:

What I clearly understand now is that everything is stored in the Stanbol/Clerezza store.
And that I can store thesauri directly via the repository or via OntoNet.
What is not totally clear to me now are the concepts of "spaces", "session" and "scope"...

Don't worry about spaces right now. You can just assume for now that into scope S1 you will load:

- The SKOS ontology into its core space "S1.core"

- The thesauri for a specific user into its custom space "S1.custom" (if these thesauri can coexist without creating inconsistencies; otherwise a single user might need multiple scopes).
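
To make the layout above concrete, here is a toy Python model of a scope with its two spaces. This is not the OntoNet API: the class and method names (Scope, load_core, load_custom) are made up for illustration, the thesaurus IRI is hypothetical, and ontologies are only referenced by IRI strings.

```python
from dataclasses import dataclass, field


@dataclass
class Scope:
    """Toy stand-in for an OntoNet scope: one core space, one custom space."""
    name: str
    core: list = field(default_factory=list)    # e.g. the SKOS ontology
    custom: list = field(default_factory=list)  # e.g. one user's thesauri

    def load_core(self, ontology_iri: str) -> None:
        self.core.append(ontology_iri)

    def load_custom(self, ontology_iri: str) -> None:
        self.custom.append(ontology_iri)

    def space_ids(self) -> list:
        # Mirrors the "S1.core" / "S1.custom" naming used in this thread.
        return [f"{self.name}.core", f"{self.name}.custom"]


s1 = Scope("S1")
s1.load_core("http://www.w3.org/2004/02/skos/core")
s1.load_custom("http://example.org/thesauri/TA1")  # hypothetical thesaurus IRI
```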


/****** Free-thought rambling ******/

A much more sophisticated use of spaces, not currently implemented, could be:

- load the SKOS ontology in S1.core
- load the initial, unabridged thesaurus in S1.custom

- then we could have "user spaces" on top of the custom spaces (similar to the now-deprecated session spaces) to store diffs for each user's reorganization of the thesaurus.


e.g. if the initial thesaurus says that Concept1 and Concept2 are siblings:
- in S1.userA : "Concept1 skos:narrower Concept2"
- in S1.userC : "Concept2 skos:altLabel Concept1"

... and these "user spaces" would not see each other, unless one user explicitly decides to inspect another user's thesauri and handle overlaps. The hardship in implementing this is the monotonic nature of the Semantic Web: we would have to implement axiom negation and such.
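
A rough sketch of why the diffs are the hard part, using plain Python sets of triples rather than anything from Stanbol. The function name user_view, the ex: prefix and the ex:Parent concept are assumptions for the sake of the example. Additions merge trivially; the retraction has to be tracked on the side, because merging RDF graphs can only ever add statements.

```python
# Triples are plain (subject, predicate, object) tuples of strings.
BASE = {
    ("ex:Concept1", "skos:broader", "ex:Parent"),
    ("ex:Concept2", "skos:broader", "ex:Parent"),
}


def user_view(base, added, retracted=frozenset()):
    """Return what one user would see: base plus additions, minus retractions."""
    return (set(base) | set(added)) - set(retracted)


# User A makes Concept2 a child of Concept1. Adding the new triple is easy...
added_a = {("ex:Concept1", "skos:narrower", "ex:Concept2")}
# ...but "Concept2 is no longer directly under Parent" is a retraction, which
# plain (monotonic) graph merging cannot express; here it is tracked explicitly.
retracted_a = {("ex:Concept2", "skos:broader", "ex:Parent")}

view_a = user_view(BASE, added_a, retracted_a)
view_c = user_view(BASE, set())  # user C made no changes and sees only the base
```

Each user's view is computed independently, so A's changes never leak into C's view unless C asks for them.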

/****** END Free-thought rambling ******/

In my use case I will have many users, each saving one or more thesauri:
- user A will store thesaurus 1 (TA1) and TA2
- user B will store thesaurus 1 (TB1) and TB2 and ...
When user C stores his TC1, he will choose to map it to an already existing one, depending on which is the most appropriate.
Let's say he selects TA2.
So the mapping will be done between TC1 and TA2 (and any other combinations can be done afterwards).

...which is why I think it is best to use scopes here: one user needs to reference knowledge managed by another user.

Sessions are volatile, so you shouldn't use them for cross-user referencing. Sessions come into play the moment you need to do something like "Hey Stanbol, here are my data for today, please classify them according to this SKOS schema here and that one there".

... for my user C :
- I create a sessionC
- I create a "C-skos-thesaurus" scope
- Load TC1 in "C-skos-thesaurus".coreSpace
- load TA2 in "C-skos-thesaurus".customSpace
- then store mappings done in sessionC

Is that a good use of session, scope and space?


Quite close indeed. I would imagine a scenario such as this.

User A:
* creates "A-skos-thesaurus" scope

* loads the SKOS ontology in "A-skos-thesaurus".core and TA2 in "A-skos-thesaurus".custom


User C:
* creates "C-skos-thesaurus" scope

* loads the SKOS ontology in "C-skos-thesaurus".core and TC1 in "C-skos-thesaurus".custom

* opens session C-201202061828

* stores SKOS mappings in an ontology http://example.org/C-skos-alignments and loads it onto C-201202061828

--> the mappings will be stored to persistence even if they are loaded on a session, but you will not need the time-identifier to retrieve it: it will be stored as [stanbol-host]/ontonet/http://example.org/C-skos-alignments
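
A minimal helper that just follows the pattern quoted above. The function name ontonet_location and the host value are placeholders of mine, and since nothing here says whether the ontology IRI has to be percent-encoded, it is appended as-is.

```python
def ontonet_location(stanbol_host: str, ontology_iri: str) -> str:
    """Join host and ontology IRI following the pattern quoted above.

    The IRI is appended as-is; whether it must be percent-encoded is not
    stated in this thread, so no encoding is applied here.
    """
    return f"{stanbol_host.rstrip('/')}/ontonet/{ontology_iri}"


# "http://localhost:8080" is a placeholder host, not a documented default.
url = ontonet_location("http://localhost:8080", "http://example.org/C-skos-alignments")
```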


If C also wants to obtain additional inferred mappings, he will:
* attach scopes "C-skos-thesaurus" and "A-skos-thesaurus" to C-201202061828.
* call Stanbol Reasoners and tell it to classify session C-201202061828
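
The scenario above can be pictured with another toy model (again not the OntoNet API; Session, attach_scope and reasoning_input are invented names, and the TC1/TA2 IRIs are hypothetical). It only shows what ends up visible to a reasoner once both scopes are attached to the session.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Toy session: holds its own ontologies plus references to attached scopes."""
    session_id: str
    ontologies: list = field(default_factory=list)
    attached: dict = field(default_factory=dict)  # scope name -> ontology IRIs

    def attach_scope(self, scope_name: str, scope_ontologies: list) -> None:
        self.attached[scope_name] = list(scope_ontologies)

    def reasoning_input(self) -> list:
        # Session ontologies first, then the contents of each attached scope,
        # in attachment order and without duplicates.
        seen, out = set(), []
        for iri in self.ontologies + [i for s in self.attached.values() for i in s]:
            if iri not in seen:
                seen.add(iri)
                out.append(iri)
        return out


SKOS = "http://www.w3.org/2004/02/skos/core"
session = Session("C-201202061828", ["http://example.org/C-skos-alignments"])
session.attach_scope("C-skos-thesaurus", [SKOS, "http://example.org/TC1"])
session.attach_scope("A-skos-thesaurus", [SKOS, "http://example.org/TA2"])
```

The SKOS ontology sits in the core space of both scopes but only needs to be handed to the reasoner once, hence the de-duplication.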

Even better, if you think you can benefit from partitioning the
thesaurus somehow, you can manage multiple scopes with one partition in
the custom space of each. This usually comes into play if you need to
perform some reasoning.

Ok, so my understanding from our exchange is that overlaps are possible and partitioning is unlikely.

As for the *concepts*, there's no rewriting of entity IRIs, nor were we
sure we should do it, as logically it would open a can of worms - that is,
unless we add an OWL equivalence statement every time a concept is
"moved", but even so all the "old" names should still be dereferenceable!


Thesauri I will import don't have prior IRIs (they are in CSV).

Ok - I don't know of any CSV support in Stanbol, though... they should be ported to RDF beforehand.
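
For the porting step, something along these lines might do as a starting point. It is only a sketch under assumptions: the CSV layout (columns id, label, broader), the base IRI and the function name csv_to_ntriples are all invented here, ids are assumed to be IRI-safe, and labels are assumed to be single-line.

```python
import csv
import io

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def csv_to_ntriples(csv_text: str, base_iri: str) -> list:
    """Turn rows with columns id,label,broader into N-Triples lines."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        concept = f"<{base_iri}{row['id']}>"
        lines.append(f"{concept} <{RDF_TYPE}> <{SKOS}Concept> .")
        # Escape backslashes and quotes so the literal stays valid N-Triples.
        label = row["label"].replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{concept} <{SKOS}prefLabel> "{label}" .')
        if row["broader"]:
            lines.append(f"{concept} <{SKOS}broader> <{base_iri}{row['broader']}> .")
    return lines


sample = "id,label,broader\nc1,Animals,\nc2,Mammals,c1\n"
triples = csv_to_ntriples(sample, "http://example.org/thesauri/TC1/")
```

The resulting lines can be written to a file and loaded like any other RDF document.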

So I can set them up as I want, in line with the server name.

Okay. I would not rule out adding a REST resource to OntoNet that, if it identifies the URI to be an entity, returns its signature (e.g. by a SPARQL DESCRIBE or an EntityHub query)
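
As an illustration of the SPARQL DESCRIBE option only (no such REST resource exists yet, and the helper name describe_query and the entity IRI are made up), the query for one entity could be built like this:

```python
def describe_query(entity_iri: str) -> str:
    """Build a SPARQL DESCRIBE query for a single entity IRI."""
    # Reject the printable characters (and spaces) that SPARQL does not
    # allow inside an IRI reference written between angle brackets.
    if any(ch in entity_iri for ch in '<>"{}|^`\\ '):
        raise ValueError(f"not a usable IRI: {entity_iri!r}")
    return f"DESCRIBE <{entity_iri}>"


query = describe_query("http://example.org/thesauri/TC1/c2")
```

What a DESCRIBE actually returns is left to the SPARQL endpoint, so the "signature" you get back may differ between stores.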

Getting old names is really problematic... only current ones will be interesting... Redirecting from old to current with the help of the modification history could be really good...

It's not in the short-term plans for OntoNet, but for now I guess we could add some "safe" OWL equivalences. Anyway if the deal is to only bring current names "to the front" for now, it shouldn't be a big bottleneck.
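
To give an idea of what redirecting through a modification history could look like, here is a small sketch. Everything in it is an assumption: the history is a plain old-to-new dictionary, the IRIs are invented, and owl:sameAs is my pick for the "safe" equivalence, which is not settled anywhere in this thread.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def current_name(iri: str, renames: dict) -> str:
    """Follow old -> new renames until reaching a name that was never renamed."""
    seen = set()
    while iri in renames:
        if iri in seen:
            raise ValueError(f"rename cycle at {iri}")
        seen.add(iri)
        iri = renames[iri]
    return iri


def equivalences(renames: dict) -> list:
    """One owl:sameAs triple per old name, pointing at its current name."""
    return [(old, OWL_SAME_AS, current_name(old, renames)) for old in renames]


# Hypothetical modification history: c2 was renamed twice.
history = {
    "http://example.org/t/c2": "http://example.org/t/mammal",
    "http://example.org/t/mammal": "http://example.org/t/mammals",
}
```

With this history, current_name resolves both old names to the latest one, and equivalences(history) yields one triple per old name.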

* what would your mappings look like? depending on the complexity, you
could find Stanbol Rules to be of use too.
For now (it's not clearly defined though), mapping will be done with SKOS properties.

Ok, then I guess we just need to call different reasoning primitives (enrich rather than classify) when needed.

* Is it better to use rules in this case (mapping TC1 / TA2)?
Constraints (for now) are to be able to get:
- original thesaurus (just TC1)
- or the complete one (TC1 and TA2 with mappings)

I'm not sure I understand, but it seems to me that the latter could use Stanbol Rules.

No clear idea of the size of each individual thesaurus... The point here is more the number of thesauri...
IMO: 15+ not-so-big thesauri.

Ok, so we're around the tens-to-hundreds of triples as I read in the proposal.


HTH

Best,

Alessandro

--
M.Sc. Alessandro Adamou

Alma Mater Studiorum - Università di Bologna
Department of Computer Science
Mura Anteo Zamboni 7, 40127 Bologna - Italy

Semantic Technology Laboratory (STLab)
Institute for Cognitive Science and Technology (ISTC)
National Research Council (CNR)
Via Nomentana 56, 00161 Rome - Italy


"As for the charges against me, I am unconcerned. I am beyond their timid, lying 
morality, and so I am beyond caring."
(Col. Walter E. Kurtz)

Not sent from my iSnobTechDevice
