[ol-tech] SPAM Re: a few notes on rdf views

Erik Hetzner Tue, 08 Jun 2010 19:23:54 -0700

At Tue, 08 Jun 2010 17:03:45 -0600,
Lee Passey wrote:
> I'm certain it's not ideologically pure, but I think it's very 
> practical. The W3C states that the motivation for creating RDF was "to 
> represent information in a minimally constraining, flexible way." In 
> information processing there is a natural and inevitable tension between 
> constraints and flexibility. Human beings (and presumably really good 
> AI) are very good at deriving meaning from ambiguity. Computer 
> algorithms, not so much. So if what I want is a way to represent 
> information about the relationships between web resources, and present 
> the relationship data to a human to sort out, flexibility is good. If 
> what I want to do is data mining, flexibility is bad.
> 
> I tend to be much more interested in data mining and automated data 
> processing than in just presenting another pretty web page to the world; 
> constraints work for me.


The problem is, RDF is a completely different world, not a way of
getting around schemas or DTDs. It happens to be (often) serialized in
XML, but the model is completely different.

In my opinion, RDF is more constraining than XML, because it forces
the designer to think clearly about the underlying model, rather than
presenting a lot of different metadata fields. For instance, modeling
an article as (from [1]):

<info:doi/10.1134/S0003683806040089> a bibo:Article ;
    dc:title "Effect of argillaceous minerals on the growth of phosphate-mobilizing bacteria Bacillus subtilis"@en ;
    […]
    dc:isPartOf <urn:issn:23346587> ;
    bibo:volume "42" ;
    bibo:issue "4" ;
    bibo:pageStart "388" ;
    bibo:pageEnd "391" ;
    dc:creator <http://examples.net/contributors/2> ;
    dc:creator <http://examples.net/contributors/1> ;
    bibo:authorList ( <http://examples.net/contributors/2>
                      <http://examples.net/contributors/1> ) .

<urn:issn:23346587> a bibo:Journal ;
    dc:title "Applied Biochemistry and Microbiology"@en ;
    bibo:shortTitle "App Biochem and Biol"@en .

is a lot more constraining than (from [2]):

<mods ...>
  <titleInfo>
    <nonSort>The</nonSort><title>Urban Question as a Scale Question</title>
    <subTitle>Reflections on Henri Lefebre, Urban Theory and the Politics of Scale</subTitle>
  </titleInfo>
  <name type="personal">
    <namePart type="given">Neil</namePart>
    <namePart type="family">Brenner</namePart>
    <role><roleTerm type="text">author</roleTerm></role>
  </name>
  <typeOfResource>text</typeOfResource>
  <genre>article</genre>
  <originInfo>
    <issuance>monographic</issuance>
  </originInfo>
  <relatedItem type="host">
    <titleInfo><title>International Journal of Urban and Regional Research</title></titleInfo>
    <originInfo>
      <issuance>continuing</issuance>
    </originInfo>
    <part>
      <detail type="volume"><number>24</number></detail>
      <detail type="issue"><number>2</number><caption>no.</caption></detail>
      <extent unit="pages"><start>361</start><end>378</end></extent>
      <date>2000</date>
    </part>
  </relatedItem>
  <identifier>BrennerN2000a</identifier>
</mods>

In the former, each piece (article, journal, author) is clearly
identified with a URI, which eliminates the need to match on strings.
The model is well-known rather than ad hoc, and it is clear how
additional metadata could be attached to its various parts. All of
this makes the RDF more constrained than the XML, not less.
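
To make the string-matching point concrete, here is a rough sketch in
Python. It assumes the Turtle above has been flattened into plain
(subject, predicate, object) tuples, with prefixed names kept as short
strings. The `objects` helper and the publisher URI are made up for
illustration and do not come from any library or from [1].

```python
# Triples as plain (subject, predicate, object) tuples; prefixed names are
# kept as strings for brevity instead of being expanded to full URIs.
ARTICLE = "info:doi/10.1134/S0003683806040089"
JOURNAL = "urn:issn:23346587"

triples = [
    (ARTICLE, "rdf:type", "bibo:Article"),
    (ARTICLE, "dc:isPartOf", JOURNAL),
    (ARTICLE, "bibo:volume", "42"),
    (JOURNAL, "rdf:type", "bibo:Journal"),
    (JOURNAL, "dc:title", "Applied Biochemistry and Microbiology"),
    (JOURNAL, "bibo:shortTitle", "App Biochem and Biol"),
]


def objects(graph, subject, predicate):
    """Return every object for a subject and predicate (hypothetical helper)."""
    return [o for s, p, o in graph if s == subject and p == predicate]


# Follow the URI from the article to the journal; no title strings are compared.
journal = objects(triples, ARTICLE, "dc:isPartOf")[0]
journal_title = objects(triples, journal, "dc:title")[0]

# Attaching more metadata to the journal is just another triple on the same URI.
# The publisher URI below is a made-up placeholder.
triples.append((JOURNAL, "dc:publisher", "http://example.org/publishers/1"))
```

The journal is found by identity, so a second article pointing at the
same ISSN URI would land on the same node however its title happened
to be spelled.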

Anyhow, your mileage may vary.
 
> Maybe not. If you look at the web documentation, OL claims that the JSON 
> API "is deprecated now. This is retained only for backward compatibility 
> and RESTful API should be used instead of this." Again according to the 
> web documentation, the RESTful API is equivalent to the RDF interface. 
> My understanding of the word "deprecated" is that it is a warning 
> against use in the future so that it may be phased out. If OL is going 
> to phase out the JSON API, then whatever replaces it should be a 
> complete representation of the underlying data object (which is, in 
> fact, just a stored record of the JSON text object), at least for data 
> mining purposes.
> 
> I always use the JSON API because I'm assured of getting all the data. 
> If OL said, "whoops, we really aren't deprecating the JSON API, and it 
> will always be available" then I would cease to care about the RDF 
> representation, as it would no longer be of any interest to me.

If I read [3] correctly, while the “JSON API” is deprecated, the JSON
format of the “RESTful API” is not. So perhaps this conversation will
go nowhere. :)

> And in my mind, this is the biggest problem with RDF. If I'm writing an 
> application to derive biographical data from an RDF feed, an infinite 
> number of alternatives makes it useless. As the Pointed Man in the 
> Pointless Forest said, "a point in every direction is as good as no 
> point at all." A controlled vocabulary (and by controlled I mean 
> limited, constricted and constrained) is critical to automated data 
> processing.
>
> In the end, I don't care if an author's name is represented by:
> 
> […]
> 
> But it should only be represented by one of these, not by all. If I need 
> it transformed into a different vocabulary, that's what XSLT is for. In 
> all probability FOAF is probably good enough for whatever consumer of OL 
> data emerges. But it shouldn't be selected simply because it's the 
> newest craze, and it certainly shouldn't be selected with the idea that 
> if it's not good enough OL will just add a new, parallel XML tree. At 
> some point, somebody needs to say, "This far shalt thou go, and no farther."

As I see it, you can just ignore what you don’t want. As long as the
graph has the schema that you understand, you can use that. The more
the merrier. If OL outputs FOAF & RDA, & it conforms to the semantics
of both, great. If I know what FOAF is, I can use that, but if I only
understand RDA, I can use that instead, and not worry about the
differences between the semantics of RDA & FOAF, because OL has done
that for me. I really don’t see the problem. A graph can be trimmed
wherever you like.
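
Roughly what I mean by trimming, as a small Python sketch. The FOAF
namespace is the real one; the RDA namespace, the `preferredName`
property, the name values, and the `trim` helper are placeholders I
made up for illustration, not anything OL actually emits.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDA = "http://example.org/rda/"  # placeholder namespace, not the real RDA one

AUTHOR = "http://examples.net/contributors/1"

# One author described in two vocabularies at once (values are made up).
graph = [
    (AUTHOR, FOAF + "name", "A. Author"),
    (AUTHOR, RDA + "preferredName", "Author, A."),
]


def trim(triples, namespace):
    """Keep only the triples whose predicate lives in the given namespace."""
    return [t for t in triples if t[1].startswith(namespace)]


# A FOAF-only consumer never has to look at the RDA statements, and vice versa.
foaf_view = trim(graph, FOAF)
rda_view = trim(graph, RDA)
```

Each consumer sees a complete description in the vocabulary it
understands, and the extra statements cost it nothing.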

For the record, XSLT is not very useful for dealing with RDF/XML,
unless one constrains (!) the syntax of the RDF/XML, since the same
graph can be serialized as many different XML trees.
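
A small illustration, sketched in Python with the standard library's
ElementTree standing in for an XSLT match pattern. Both documents
below encode the same single triple (a journal and its title), once
with a property element and once with a property attribute. The
`titles_by_path` helper is made up for this example.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/terms/",
}

# The triple written with a property element...
AS_ELEMENT = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="urn:issn:23346587">
    <dc:title>Applied Biochemistry and Microbiology</dc:title>
  </rdf:Description>
</rdf:RDF>
"""

# ...and the same triple written with a property attribute.
AS_ATTRIBUTE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="urn:issn:23346587"
                   dc:title="Applied Biochemistry and Microbiology"/>
</rdf:RDF>
"""


def titles_by_path(document):
    """Find titles with a fixed element path, as a naive template would."""
    root = ET.fromstring(document.strip())
    return [e.text for e in root.findall("rdf:Description/dc:title", NS)]


found_in_element_form = titles_by_path(AS_ELEMENT)
found_in_attribute_form = titles_by_path(AS_ATTRIBUTE)
```

The fixed path finds the title in the first document and nothing in
the second, even though an RDF parser would read the same triple from
both. A stylesheet only works if the producer promises to stick to one
of these shapes.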

best, Erik

1. http://bibliontology.com/content/article
2. http://www.loc.gov/standards/mods/v3/modsjournal.xml
3. http://openlibrary.org/dev/docs/restful_api
