I had a quick go, and the gzip penalty for using expanded forms
without "R" was negligible (~0.1%, a bit higher with no prefixes).
Using "R" also means you can't process the RDF Patch in parallel without
first preprocessing it to expand the repeats.  (The same goes for prefixes.)
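
A rough sketch of how such a comparison could be set up in Python. This
is not the measurement quoted above: the synthetic rows, the function
names, and the reading of "R" as "same term as in that position of the
previous row" are all assumptions made for illustration.

```python
import gzip

RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
PERSON = "<http://schema.org/Person>"


def make_rows(n):
    """Synthetic 'A' rows: distinct subjects, same predicate and object."""
    return [("A", f"<http://example.com/thingie{i}>", RDF_TYPE, PERSON)
            for i in range(n)]


def with_repeat(rows):
    """Write "R" where a term equals the term in the same position of the
    previous row (assumed meaning of "R")."""
    out, prev = [], None
    for row in rows:
        op, terms = row[0], row[1:]
        if prev is None:
            out.append(row)
        else:
            out.append((op,) + tuple("R" if t == p else t
                                     for t, p in zip(terms, prev)))
        prev = terms
    return out


def expand_repeat(rows):
    """Undo with_repeat. Each row needs the expanded previous row, so this
    pass is inherently sequential."""
    out, prev = [], None
    for row in rows:
        op, terms = row[0], row[1:]
        if prev is not None:
            terms = tuple(p if t == "R" else t for t, p in zip(terms, prev))
        out.append((op,) + terms)
        prev = terms
    return out


def serialize(rows):
    """One row per line, terms separated by spaces, terminated by ' .'."""
    return "".join(" ".join(r) + " .\n" for r in rows).encode("utf-8")


def gzip_size(data):
    return len(gzip.compress(data))


rows = make_rows(10000)
full = serialize(rows)
short = serialize(with_repeat(rows))
print("expanded:", len(full), "bytes raw,", gzip_size(full), "bytes gzipped")
print('with "R":', len(short), "bytes raw,", gzip_size(short), "bytes gzipped")
```

The numbers will depend heavily on the data, so real patches are needed
for a fair comparison. Note that expand_repeat is exactly the kind of
sequential preprocessing step that has to run before rows can be handed
out to parallel workers.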

Using "R" could also restrict the patterns a compressor can find, for instance in:

A <http://example.com/thingie15>
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
<http://schema.org/Person> .
A <http://example.com/thingie15>
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
<http://schema.org/Person> .

a good compression algorithm might recognize patterns in here like:

 .\nA <http://example.com/thingie
> <http://www.w3.org/
#type> <http://schema.org/


Using "R" would restrict the possible patterns: you would be betting on the
compressor recognizing "> .\nA R R" (which would sometimes work well).



Can RDF Patch items within a transaction be considered in any order
(first all the DELETEs, then all the ADDs), or do they have to be
played back linearly?
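
To make the question concrete, here is a minimal sketch in Python. It
assumes plain set semantics for a graph and represents each patch item as
a hypothetical (op, triple) pair with op "A" or "D"; none of this is
taken from the RDF Patch spec.

```python
def apply_linear(graph, ops):
    """Play the items back in the order they appear."""
    g = set(graph)
    for op, triple in ops:
        if op == "A":
            g.add(triple)
        elif op == "D":
            g.discard(triple)
    return g


def apply_deletes_first(graph, ops):
    """Apply all the deletes, then all the adds."""
    g = set(graph)
    for op, triple in ops:
        if op == "D":
            g.discard(triple)
    for op, triple in ops:
        if op == "A":
            g.add(triple)
    return g


t = ("<http://example.com/thingie15>",
     "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>",
     "<http://schema.org/Person>")

# The same triple is added and then deleted within one transaction.
ops = [("A", t), ("D", t)]

print(apply_linear(set(), ops) == apply_deletes_first(set(), ops))
```

With this input the two strategies disagree: linear playback leaves the
triple out, deletes-first leaves it in. So reordering is only safe if a
transaction never adds and deletes the same triple, or if the format
defines which one wins.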


On 19 October 2016 at 10:57, Rob Vesse <[email protected]> wrote:
> Yes but ANY is a form of lossy compression. You lost the actual details of 
> what was removed. Also it can only be used for removals and yields no benefit 
> for additions.
>
>  On the other hand REPEAT is lossless compression.
>
>  However if you apply a general-purpose compression like gzip on top of the 
> patch you probably get just as good compression without needing any special 
> tokens. In my experience repeat is more useful in compact binary formats 
> where you can use fewer bytes to encode it than either the term itself or a 
> reference to the term in some lookup table.
>
> On 14/10/2016 17:09, "Andy Seaborne" <[email protected]> wrote:
>
>     These two together seem a bit contradictory.  The advantage of ANY, with
>     versions, is that it is a form of compression.
>
>
>
>



-- 
Stian Soiland-Reyes
http://orcid.org/0000-0001-9842-9718
