Re: [Crm-sig] Interface between CRM and other frameworks

2020-02-27 Thread Ethan Gruber
I trust someone else's URIs when those URIs are backed by an organization
that is committed to their long-term maintenance and that has a fairly high
degree of community buy-in, e.g., the Getty, Geonames, Pleiades, Nomisma,
etc. I don't trust an entity that wants to take the ID of an authoritative
institution and re-cast it in a new URL pattern. People should be using
authoritative vocabulary URIs in their data, not intermediate web service
URLs. Push the web service calls onto the developer and keep them out of
the model. E.g., you could create a resolver (your bullet point #2 'we
develop a web service which implements this mapping, taking one URL from
the external resource as its input and returning CRM RDF'):
http://example.org/resolver?uri=http://vocab.getty.edu/ulan/500018666 that
reads the URI pattern, performs the underlying query, and returns the
concept as CIDOC CRM in a serialization or profile of your choosing.
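
As a rough sketch of the first step of such a resolver, the code below reads
the authority URI out of the resolver call and works out which vocabulary it
belongs to. The pattern table, the function name and the choice of Python are
assumptions made for illustration; the underlying query and the CRM
serialization are deliberately left out.

```python
import re
from urllib.parse import urlparse, parse_qs

# Hypothetical table of authority URI patterns the resolver recognises.
AUTHORITY_PATTERNS = {
    "ulan": re.compile(r"^https?://vocab\.getty\.edu/ulan/(\d+)$"),
    "tgn": re.compile(r"^https?://vocab\.getty\.edu/tgn/(\d+)$"),
}


def parse_resolver_request(resolver_url):
    """Return (authority, identifier, authority_uri) for a resolver call."""
    query = parse_qs(urlparse(resolver_url).query)
    values = query.get("uri")
    if not values:
        raise ValueError("missing 'uri' parameter")
    authority_uri = values[0]
    # Dispatch on the URI pattern; the data keeps the authoritative URI itself.
    for name, pattern in AUTHORITY_PATTERNS.items():
        match = pattern.match(authority_uri)
        if match:
            return name, match.group(1), authority_uri
    raise ValueError(f"unrecognised authority URI: {authority_uri}")
```

The point of the design is that only the developer's tooling ever sees the
resolver URL; the RDF data continues to carry the authoritative URI.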

I like your idea of a resolver for transforming geographic and personal
authorities into CRM, but I simply disagree that people should be inserting
new URIs back into their data.

On Thu, Feb 27, 2020 at 11:15 AM Richard Light 
wrote:

> On 27/02/2020 13:46, Ethan Gruber wrote:
>
> I really disagree with alternative URL patterns and using them in RDF.
> That URL pattern is *not* the concept,
>
> The URL has well-defined semantics (e.g.) "this is Richard Light's CRM
> rendition of the Geonames place Burgess Hill". It's a derived concept.
>
> and whoever generates these URLs is responsible for maintaining them
> permanently.
>
> Absolutely. If someone sets up a (sub-)domain to provide this sort of
> service, they should do so with the same degree of commitment as someone
> setting up a Linked Data resource from scratch.  But then, if we can never
> get to the point of trusting, and using, 'someone else's URLs', we will
> forever remain cowering in our silos, as I mentioned below.  And our data
> would not, in any useful sense, be Linked Data.
>
> A web service like this works in theory, but I would say that the majority
> of the LOD-oriented vocabulary systems used in cultural heritage do not
> come with SPARQL endpoints. Each of them offers some flavor of
> machine-readable data, so you'd have to build your web service around REST
> calls for RDF/XML or JSON and build a mapping from those serializations
> into Linked Art JSON-LD.
>
> Yes, you probably would [have to do this].  So you would need to decide
> whether it was worth the investment in the mapping work and software
> development, given the scale and utility of the resource this would give
> you (and everyone else) access to.  I don't see that as an argument for not
> adopting the approach at all.
>
> Also, please note that I am discussing this in the context of the CRM SIG,
> not Linked Art.  So the service would simply aim to generate a generic
> CRM-valid sub-graph. Also, I would expect it to offer a reasonable range of
> serializations, not just JSON-LD.
>
> The best solution is to relax CIDOC CRM to allow people to use
> vocabularies that aren't built on CIDOC CRM. Domains and ranges should be
> considered guidelines, not absolutes. There's nothing technically
> prohibitive about inserting CRM linking to Getty URIs describing artistic
> objects into a SPARQL endpoint, and also loading those Getty vocabularies
> into the same endpoint, and then building SPARQL queries that exploit the
> capabilities of both data models. Using property paths in SPARQL is more
> scalable in production than activating inferencing engines.
>
> That's sort of where I started from ("be relaxed about the semantic
> discontinuity" below).  However, I suspect there are members of this group
> who would be far from relaxed about this.
>
> As it happens, another respondent has just pointed out what looks like a
> solution to the original problem which exercised the Linked Art group
> (using ULAN and TGN in a CRM-compatible setting), and I have forwarded
> their comments to that group.
>
> Richard
>
>
> Ethan
>
> On Thu, Feb 27, 2020 at 7:01 AM Richard Light 
> wrote:
>
>> Hi,
>>
>> The Linked Art group has been discussing the issue of URIs which point to
>> resources in other frameworks (
>> https://github.com/linked-art/linked.art/issues/307). The discussion has
>> noted the advice in our RDF implementation document (
>> http://www.cidoc-crm.org/sites/default/files/Implementing%20the%20CIDOC%20Conceptual%20Reference%20Model%20in%20RDF_0.pdf),
>> in particular the advice that skos:Concept should not be used for people or
>> places. This raises an issue in relation to ULAN and TGN, two Getty
>> vocabularies which Linked Art would expect to be able to use. Various
>> work-rounds have been proposed, of varying complexity.
>>
>> After giving this issue some thought, I contributed the following to the
>> discussion:
>>
>> Interesting problem. This issue will crop up wherever you want to exploit
>> the potential of Linked Data by linking out across a 'boun

Re: [Crm-sig] Interface between CRM and other frameworks

2020-02-27 Thread Richard Light
On 27/02/2020 13:46, Ethan Gruber wrote:

> I really disagree with alternative URL patterns and using them in RDF. That
> URL pattern is *not* the concept,

The URL has well-defined semantics (e.g.) "this is Richard Light's CRM 
rendition of the Geonames place Burgess Hill". It's a derived concept.

> and whoever generates these URLs is responsible for maintaining them
> permanently.

Absolutely. If someone sets up a (sub-)domain to provide this sort of service, 
they should do so with the same degree of commitment as someone setting up a 
Linked Data resource from scratch.  But then, if we can never get to the point 
of trusting, and using, 'someone else's URLs', we will forever remain cowering 
in our silos, as I mentioned below.  And our data would not, in any useful 
sense, be Linked Data.

> A web service like this works in theory, but I would say that the majority of
> the LOD-oriented vocabulary systems used in cultural heritage do not come with
> SPARQL endpoints. Each of them offers some flavor of machine-readable data, so
> you'd have to build your web service around REST calls for RDF/XML or JSON and
> build a mapping from those serializations into Linked Art JSON-LD.

Yes, you probably would [have to do this].  So you would need to decide whether 
it was worth the investment in the mapping work and software development, given 
the scale and utility of the resource this would give you (and everyone else) 
access to.  I don't see that as an argument for not adopting the approach at 
all.

Also, please note that I am discussing this in the context of the CRM SIG, not 
Linked Art.  So the service would simply aim to generate a generic CRM-valid 
sub-graph. Also, I would expect it to offer a reasonable range of 
serializations, not just JSON-LD.

> The best solution is to relax CIDOC CRM to allow people to use vocabularies
> that aren't built on CIDOC CRM. Domains and ranges should be considered
> guidelines, not absolutes. There's nothing technically prohibitive about
> inserting CRM linking to Getty URIs describing artistic objects into a SPARQL
> endpoint, and also loading those Getty vocabularies into the same endpoint, and
> then building SPARQL queries that exploit the capabilities of both data models.
> Using property paths in SPARQL is more scalable in production than activating
> inferencing engines.

That's sort of where I started from ("be relaxed about the semantic 
discontinuity" below).  However, I suspect there are members of this group who 
would be far from relaxed about this.

As it happens, another respondent has just pointed out what looks like a 
solution to the original problem which exercised the Linked Art group (using 
ULAN and TGN in a CRM-compatible setting), and I have forwarded their comments 
to that group.

Richard

> Ethan

On Thu, Feb 27, 2020 at 7:01 AM Richard Light wrote:

Hi,

The Linked Art group has been discussing the issue of URIs which point to 
resources in other frameworks 
(https://github.com/linked-art/linked.art/issues/307). The discussion has noted 
the advice in our RDF implementation document 
(http://www.cidoc-crm.org/sites/default/files/Implementing%20the%20CIDOC%20Conceptual%20Reference%20Model%20in%20RDF_0.pdf),
 in particular the advice that skos:Concept should not be used for people or 
places. This raises an issue in relation to ULAN and TGN, two Getty 
vocabularies which Linked Art would expect to be able to use. Various 
work-rounds have been proposed, of varying complexity.

After giving this issue some thought, I contributed the following to the 
discussion:

Interesting problem. This issue will crop up wherever you want to exploit the 
potential of Linked Data by linking out across a 'boundary' to a LD resource 
which plays by different rules to your own. So it's not just a Linked Art 
problem. The alternatives would appear to be:

  *   be relaxed about the semantic discontinuity
  *   insist the rest of the LD world changes to fit your world view (which 
appears to be the CRM SIG position)
  *   cower inside your silo and ignore everything outside it

I would argue for adopting the first option. The external resource will still 
dereference for you; it will still deliver a machine-readable payload. As 
mentioned above, you won't find any Linked Art or CRM concepts in there, but 
does that matter?

There might be benefit in inventing a relationship for Linked Art which says, 
in effect, "this is an equivalent but 'external' concept".

To go beyond this, assuming that resources such as Geonames will continue to 
happily ignore our existence, I would suggest a dynamic mapping service, which 
takes e.g. a Geonames URL, retrieves its contents, and re-expresses those 
assertions in a CRM-compatible format. Make the call to that service a URL in 
our Linked Data with the Geonames URL as a parameter, and we will have extended 
our Linked Data graph to include a virtual CRM-compatible Geonames. Rinse and 
repeat with other external res

Re: [Crm-sig] Interface between CRM and other frameworks

2020-02-27 Thread Ethan Gruber
I really disagree with alternative URL patterns and using them in RDF. That
URL pattern is *not* the concept, and whoever generates these URLs is
responsible for maintaining them permanently. A web service like this works
in theory, but I would say that the majority of the LOD-oriented vocabulary
systems used in cultural heritage do not come with SPARQL endpoints. Each
of them offers some flavor of machine-readable data, so you'd have to build
your web service around REST calls for RDF/XML or JSON and build a
mapping from those serializations into Linked Art JSON-LD.

The best solution is to relax CIDOC CRM to allow people to use vocabularies
that aren't built on CIDOC CRM. Domains and ranges should be considered
guidelines, not absolutes. There's nothing technically prohibitive about
inserting CRM linking to Getty URIs describing artistic objects into a
SPARQL endpoint, and also loading those Getty vocabularies into the same
endpoint, and then building SPARQL queries that exploit the capabilities of
both data models. Using property paths in SPARQL is more scalable in
production than activating inferencing engines.
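
To make the property-path point concrete, here is a minimal sketch of the kind
of query I mean, wrapped in Python only to build the string. It assumes the
object records use crm:P2_has_type to point at a Getty concept URI and that the
loaded Getty data exposes skos:broader; the function name and the template are
illustrative, and sending the query to an endpoint is left out.

```python
from string import Template

# Assumed modelling: objects carry crm:P2_has_type pointing at a Getty concept,
# and the Getty vocabulary loaded alongside them exposes skos:broader.
QUERY_TEMPLATE = Template("""\
PREFIX crm: <http://www.cidoc-crm.org/cidoc-crm/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT DISTINCT ?object WHERE {
  ?object crm:P2_has_type/skos:broader* <$concept> .
}
""")


def build_type_query(concept_uri):
    """Build a query for objects typed with the concept or any narrower one."""
    # Refuse characters that are not allowed inside a SPARQL IRI reference.
    if any(ch in concept_uri for ch in '<>" {}|\\^`'):
        raise ValueError("not a safe IRI")
    return QUERY_TEMPLATE.substitute(concept=concept_uri)
```

The path crm:P2_has_type/skos:broader* walks from the CRM side into the
vocabulary hierarchy in a single pattern, with no inferencing switched on.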

Ethan

On Thu, Feb 27, 2020 at 7:01 AM Richard Light 
wrote:

> Hi,
>
> The Linked Art group has been discussing the issue of URIs which point to
> resources in other frameworks (
> https://github.com/linked-art/linked.art/issues/307). The discussion has
> noted the advice in our RDF implementation document (
> http://www.cidoc-crm.org/sites/default/files/Implementing%20the%20CIDOC%20Conceptual%20Reference%20Model%20in%20RDF_0.pdf),
> in particular the advice that skos:Concept should not be used for people or
> places. This raises an issue in relation to ULAN and TGN, two Getty
> vocabularies which Linked Art would expect to be able to use. Various
> work-rounds have been proposed, of varying complexity.
>
> After giving this issue some thought, I contributed the following to the
> discussion:
>
> Interesting problem. This issue will crop up wherever you want to exploit
> the potential of Linked Data by linking out across a 'boundary' to a LD
> resource which plays by different rules to your own. So it's not just a
> Linked Art problem. The alternatives would appear to be:
>
> - be relaxed about the semantic discontinuity
> - insist the rest of the LD world changes to fit your world view
>   (which appears to be the CRM SIG position)
> - cower inside your silo and ignore everything outside it
>
> I would argue for adopting the first option. The external resource
> will still dereference for you; it will still deliver a machine-readable
> payload. As mentioned above, you won't find any Linked Art or CRM concepts
> in there, but does that matter?
>
> There might be benefit in inventing a relationship for Linked Art
> which says, in effect, "this is an equivalent but 'external' concept".
>
> To go beyond this, assuming that resources such as Geonames will
> continue to happily ignore our existence, I would suggest a dynamic mapping
> service, which takes e.g. a Geonames URL, retrieves its contents, and
> re-expresses those assertions in a CRM-compatible format. Make the call to
> that service a URL in our Linked Data with the Geonames URL as a parameter,
> and we will have extended our Linked Data graph to include a virtual
> CRM-compatible Geonames. Rinse and repeat with other external resources
> which are big enough to be of interest to us, and too big to re-design
> along CRM lines.
>
> On reflection, I increasingly like the idea of a dynamic mapping service.
> Maybe we should add something along those lines to the RDF implementation
> document?  The way it would work could be as follows:
>
>
> - we analyse the RDF which is generated by the external resource and
>   re-express those parts of it which are CRM-compatible in CRM RDF (i.e. do a
>   mapping). Some concepts may not map, and would be excluded from the process
> - we develop a web service which implements this mapping, taking one
>   URL from the external resource as its input and returning CRM RDF
> - we support a variant URL pattern which maps to this web service,
>   e.g. https://geonames.cidoc-crm.org/2654308/ for
>   https://www.geonames.org/2654308/burgess-hill.html
> - CIDOC CRM users quote these variant URLs in their RDF data
>
> This approach makes no demands on the external system; it simply exploits
> the fact that it is providing machine-processible data.  Once installed, it
> will deliver whatever resources are in the external system, i.e. you don't
> need to keep updating your 'copy'.  In effect, it extends the scope of the
> CRM-compatible graph to include this external resource (and all the
> resources that *it* mentions).
>
> Where the external resource has a SPARQL end-point, it may be possible to
> implement the mapping (at least in simple cases) by a suitable CONSTRUCT
> statement.
>
> Thoughts?
>
> Richard
> --
> *Richard Light*
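
For reference, the variant URL pattern in the quoted proposal (a
geonames.cidoc-crm.org address standing in for a Geonames page) could be
handled along the following lines. This is only a sketch under stated
assumptions: the host name is the one suggested in the proposal and is not
known to be a live service, the function names are hypothetical, and the
slug-free Geonames address is assumed to be an acceptable form of the source
URL.

```python
import re
from urllib.parse import urlparse

# Host taken from the proposal; assumed here, not a known live service.
VARIANT_HOST = "geonames.cidoc-crm.org"
SOURCE_HOSTS = {"www.geonames.org", "geonames.org"}


def variant_to_source(variant_url):
    """Map a variant URL to the Geonames URL it stands for (no slug)."""
    parsed = urlparse(variant_url)
    match = re.fullmatch(r"/(\d+)/?", parsed.path)
    if parsed.hostname != VARIANT_HOST or not match:
        raise ValueError(f"not a variant URL: {variant_url}")
    return f"https://www.geonames.org/{match.group(1)}/"


def source_to_variant(source_url):
    """Map a Geonames page URL to the corresponding variant URL."""
    parsed = urlparse(source_url)
    # Keep only the numeric identifier; the trailing slug is ignored.
    match = re.match(r"/(\d+)(?:/|$)", parsed.path)
    if parsed.hostname not in SOURCE_HOSTS or not match:
        raise ValueError(f"not a Geonames URL: {source_url}")
    return f"https://{VARIANT_HOST}/{match.group(1)}/"
```

Because the mapping depends only on the numeric identifier, the variant URL can
always be reduced back to the authoritative one.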