I had a quick go, and the gzip penalty for using expanded forms without "R" was negligible (~0.1%, a bit higher with no prefixes). Using "R" also means you can't process the RDF Patch in parallel without preprocessing, since each row then depends on the row before it. (The same goes for prefixes.)
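A rough sketch of how such a comparison could be reproduced (an illustration only, not the measurement behind the figure above). It assumes "R" means "repeat the term in the same position of the previous row"; the synthetic data and the helper names `with_repeats`, `expand_repeats`, `serialize` and `gzip_size` are made up for this example, and real patches will give different numbers.

```python
import gzip

RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"

# Synthetic patch rows (operation, subject, predicate, object); purely illustrative.
rows = [
    ("A", f"<http://example.com/thingie{i}>", RDF_TYPE, "<http://schema.org/Person>")
    for i in range(1000)
]

def with_repeats(rows):
    """Replace a term by "R" when it equals the term in the same
    position of the previous row (assumed meaning of "R")."""
    out, prev = [], None
    for row in rows:
        if prev is None:
            out.append(row)
        else:
            terms = tuple("R" if t == p else t for t, p in zip(row[1:], prev[1:]))
            out.append((row[0],) + terms)
        prev = row  # always compare against the expanded previous row
    return out

def expand_repeats(rows):
    """Inverse of with_repeats. Note that it has to walk the rows in order,
    which is why "R" gets in the way of parallel processing."""
    out, prev = [], None
    for row in rows:
        if prev is not None:
            terms = tuple(p if t == "R" else t for t, p in zip(row[1:], prev[1:]))
            row = (row[0],) + terms
        out.append(row)
        prev = row
    return out

def serialize(rows):
    """One row per line, terms separated by spaces, terminated by " ."."""
    return "".join(" ".join(row) + " .\n" for row in rows)

def gzip_size(text):
    """Size in bytes of the gzip-compressed UTF-8 text."""
    return len(gzip.compress(text.encode("utf-8")))

expanded = gzip_size(serialize(rows))
repeated = gzip_size(serialize(with_repeats(rows)))
print(f"expanded: {expanded} bytes, with R: {repeated} bytes")
print(f"relative difference: {(expanded - repeated) / repeated:.2%}")
```

The round trip `expand_repeats(with_repeats(rows)) == rows` holds, which is the lossless property of REPEAT mentioned in the quoted message.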
Using "R" could also restrict the possible compression patterns, for instance in:

A <http://example.com/thingie15> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Person> .
A <http://example.com/thingie15> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Person> .

a good compression algorithm might recognize patterns in here like:

.\nA <http://example.com/thingie > <http://www.w3.org/ #type> <http://schema.org/

Using "R" would restrict the possible patterns, betting on the compressor recognizing "> .\nA R R" (which sometimes would work well).

Can RDF Patch items within a transaction be considered in any order (first all the DELETEs, then all the ADDs), or do they have to be played back linearly?

On 19 October 2016 at 10:57, Rob Vesse <[email protected]> wrote:
> Yes but ANY is a form of lossy compression. You lost the actual details of
> what was removed. Also it can only be used for removals and yields no benefit
> for additions.
>
> On the other hand REPEAT is lossless compression.
>
> However if you apply a general-purpose compression like gzip on top of the
> patch you probably get just as good compression without needing any special
> tokens. In my experience repeat is more useful in compact binary formats
> where you can use fewer bytes to encode it then either the term itself or a
> reference to the term in some lookup table.
>
> On 14/10/2016 17:09, "Andy Seaborne" <[email protected]> wrote:
>
>     These two together seem a bit contradictory. The advantage of ANY, with
>     versions, is that it is form of compression.
>

--
Stian Soiland-Reyes
http://orcid.org/0000-0001-9842-9718
