Re: [RDF Patch] Looking at Talis Changesets and other proposals.

Andy Seaborne Wed, 31 Jul 2013 05:11:48 -0700

It'll be worthwhile expanding on the streaming and scalability points.


This metadata is a bit complicated: I've fallen into some traps here.

The RDF patch format is not an RDF serialization. Blank node labels work differently so embedding N-Triples or Turtle isn't so automatic. An RDF patch parser needing a Turtle parser seems a bit heavy.

It could be considered as a string (block escaping? cf. CDATA), to be sent off to a real RDF language parser. A combined RDF-patch+Turtle is easy for RIOT (same tokenizer so no issues of read ahead grabbing tokens from the other language), but it's not normal for it to be easy to have mixed languages when using parser generators.

And <> isn't a sensible way to refer to "this change" because (common SW issue) it really means "where the copy of this document came from".

The change itself needs a unique name like a UUID so it's the same wherever the copy is obtained from.
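
A minimal sketch of that naming idea, in Python with only the standard library. The helper name new_change_id is hypothetical and not part of any RDF patch proposal; the only assumption is that a urn:uuid IRI would be an acceptable name for a change:

```python
import uuid


def new_change_id() -> str:
    """Return a fresh urn:uuid IRI to name one change (hypothetical helper)."""
    # uuid4() is random, so the name does not depend on where the
    # patch document is stored or fetched from.
    return uuid.uuid4().urn


change_id = new_change_id()
# The IRI in angle brackets could stand where <> is used for "this change".
print(f"<{change_id}>")
```

Unlike <>, a name minted this way stays the same for every copy of the patch.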

We could have link headers, rather than inline metadata, except they can get broken and not be accessible at the time of access.

If this is an area where there is doubt ("thinking to do, choices to be made"), then I think putting that more speculative stuff in a separate section and keeping the core document simple and stable is the way to go. But it's worthwhile putting in the doc as something is needed.


        Andy


On 30/07/13 16:56, Rob Vesse wrote:

Andy

I am familiar with Talis Changesets having used them heavily in my PhD
research.

My concerns are much the same as yours in that Changesets really don't
scale well.  The other big problem is that since they are RDF graphs they
are unordered, since one cannot rely on a serializer/parser producing the
data in the same order as was originally intended, especially if you start
crossing boundaries between different toolkits/APIs.  This makes them
effectively useless as a streaming patch format unless you send a stream
of small changesets; this, however, adds copious overhead to a format
intended for speed and simplicity.

Perhaps more simply you can do the following:

#METADATA
<> rp:create [ foaf:name "Andy" ; foaf:orgURL <http://jena.apache.org> ] ;
    rp:createdDate "2013-07-30"^^xsd:date ;
    rdfs:comment "A valid Turtle graph" .
#METADATA

The #METADATA is used to denote the start/end of a metadata block (which
ideally we permit only at the start of the patch).  This can then be
easily discarded by line oriented processors since if you see #METADATA
you just throw away all subsequent lines until you see #METADATA again.
Within the metadata block you could allow full blown Turtle or restrict to
a simpler tuple format if preferable?
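
A minimal sketch of that discard rule for a line-oriented processor, in Python. The function name strip_metadata and the sample `A` row are assumptions made for illustration only; the sketch treats any line consisting solely of #METADATA as a toggle and does not parse the block contents:

```python
from typing import Iterable, Iterator

MARKER = "#METADATA"


def strip_metadata(lines: Iterable[str]) -> Iterator[str]:
    """Yield patch lines, dropping everything between paired #METADATA lines."""
    in_metadata = False
    for line in lines:
        if line.strip() == MARKER:
            # The same marker opens and closes the block, so just toggle.
            in_metadata = not in_metadata
            continue
        if not in_metadata:
            yield line


patch = [
    "#METADATA",
    '<> rdfs:comment "A valid Turtle graph" .',
    "#METADATA",
    # Assumed example of an ordinary patch row, not taken from the thread.
    "A <http://example/s> <http://example/p> <http://example/o> .",
]
kept = list(strip_metadata(patch))
print(kept)
```

Because the function is a generator, it works on a stream of lines without holding the whole patch in memory.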

Is it worth adding a comparison to alternative approaches as an Appendix
to the RDF patch proposal?

Rob


On 7/30/13 7:49 AM, "Andy Seaborne" <a...@apache.org> wrote:

Rob, all,

Leigh Dodds expressed a preference for Talis Changesets for patches.  I
have tried to analyse their pros and cons.

For me, the scale issue alone makes changesets the wrong starting point.
  They really solve a different problem of managing some remote data
with small, incremental changes.

It would be useful to add to RDF patch the ability to have metadata
about the change itself.

One way is to introduce a new marker M, which permits, effectively,
N-Triples.  (Maybe required to be at the front.)

Not Turtle but I see RDF patch as machine oriented, not human readable.

M <> rp:create _:a .
M _:a foaf:name "Andy" .
M _:a foaf:orgURL <http://jena.apache.org/> .
M <> rp:createdDate "2013-07-30"^^xsd:date .
M <> rdfs:comment "Seems like a good idea" .
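
A rough sketch of how a processor might separate such M rows from the rest of a patch, in Python. The function name split_metadata_rows and the sample `A` row are hypothetical; the sketch only assumes that a metadata row starts with "M" followed by a single space:

```python
from typing import Iterable, List, Tuple


def split_metadata_rows(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Separate 'M' rows (with the marker removed) from all other rows."""
    metadata: List[str] = []
    changes: List[str] = []
    for line in lines:
        if line.startswith("M "):
            # Drop the marker and the space, keeping the triple text as-is.
            metadata.append(line[2:])
        else:
            changes.append(line)
    return metadata, changes


rows = [
    'M <> rdfs:comment "Seems like a good idea" .',
    # Assumed example of an ordinary patch row, not taken from the thread.
    "A <http://example/s> <http://example/p> <http://example/o> .",
]
metadata, changes = split_metadata_rows(rows)
print(metadata)
print(changes)
```

The collected metadata lines could then be handed to a triple parser, or ignored entirely by a processor that only applies changes.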

        Andy
