Cool - my first attempt at write speed testing suggested it was about the same as N-triples.

Write performance testing is harder (!!!) because you need a big enough source of data to run against without the source itself affecting the numbers.
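
For anyone who wants to reproduce the numbers, a rough harness along these lines keeps the source out of the measurement: parse the data into memory once, then time repeated writes to a sink that throws the bytes away.  Sketch only - the package names and Lang constants here are the current ones and may not match the code under discussion exactly.

import java.io.OutputStream;

import org.apache.jena.rdf.model.Model;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFDataMgr;

public class WriteBench {
    // Discard all output so the sink doesn't affect the timing.
    static final OutputStream SINK = new OutputStream() {
        @Override public void write(int b) {}
        @Override public void write(byte[] b, int off, int len) {}
    };

    static long timeWriteMillis(Model model, Lang lang, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++)
            RDFDataMgr.write(SINK, model, lang);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Parse once, up front, so parsing never appears in the write numbers.
        // (No warm-up here - good enough for orders of magnitude, not for a paper.)
        Model model = RDFDataMgr.loadModel(args[0]);
        System.out.println("N-Triples : " + timeWriteMillis(model, Lang.NTRIPLES, 5) + " ms");
        System.out.println("RDF Thrift: " + timeWriteMillis(model, Lang.RDFTHRIFT, 5) + " ms");
    }
}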

N-Triples writing has always been faster than reading - it's much closer to "push strings straight into the output", with no per-character mangling most of the time.
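
To make that concrete, here's a sketch (not the actual RIOT writer code) of what the fast path looks like for a literal's lexical form: scan for anything that needs escaping and, in the common case where there is nothing, write the whole string in one go.

import java.io.IOException;
import java.io.Writer;

public class LexicalOut {
    public static void writeLexical(Writer out, String lex) throws IOException {
        out.write('"');
        if (needsEscaping(lex)) {
            for (int i = 0; i < lex.length(); i++)
                writeEscaped(out, lex.charAt(i));   // slow path: per-character work
        } else {
            out.write(lex);                         // fast path: one bulk write
        }
        out.write('"');
    }

    private static boolean needsEscaping(String lex) {
        for (int i = 0; i < lex.length(); i++) {
            char c = lex.charAt(i);
            if (c == '"' || c == '\\' || c == '\n' || c == '\r' || c == '\t')
                return true;
        }
        return false;
    }

    private static void writeEscaped(Writer out, char c) throws IOException {
        switch (c) {
            case '"':  out.write("\\\""); break;
            case '\\': out.write("\\\\"); break;
            case '\n': out.write("\\n");  break;
            case '\r': out.write("\\r");  break;
            case '\t': out.write("\\t");  break;
            default:   out.write(c);
        }
    }
}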

From looking at the Thrift implementation, it has to do many small char->byte conversions.

It may be faster not to use Java's native converter (which involves a copy) but to write chars directly to the output stream using BlockUTF8.

When I last tested, BlockUTF8 was faster for strings of fewer than ~100 characters, but the JDK converter was faster for longer strings.
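
If anyone wants to re-check that crossover, something like this compares the two routes in isolation: String.getBytes, which allocates a fresh byte[] on every call, against a reused CharsetEncoder writing into a preallocated buffer (roughly the copy that a direct chars->bytes path such as BlockUTF8 avoids).  JDK classes only - this isn't the BlockUTF8 code, and a proper run would use JMH with warm-up.

import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class Utf8Bench {
    static volatile long blackhole;   // stop the JIT discarding the work

    public static void main(String[] args) {
        for (int len : new int[] { 10, 50, 100, 500, 1000 }) {
            String s = "abcdefghij".repeat(len / 10);   // String.repeat is Java 11+
            System.out.printf("len=%5d  getBytes=%5d ms  encoder=%5d ms%n",
                              s.length(), viaGetBytes(s), viaEncoder(s));
        }
    }

    // JDK route: String.getBytes allocates and copies into a fresh byte[] per call.
    static long viaGetBytes(String s) {
        long start = System.nanoTime();
        for (int i = 0; i < 1_000_000; i++)
            blackhole += s.getBytes(StandardCharsets.UTF_8).length;
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Direct route: reuse one encoder and one output buffer, no intermediate byte[].
    static long viaEncoder(String s) {
        CharsetEncoder enc = StandardCharsets.UTF_8.newEncoder();
        ByteBuffer out = ByteBuffer.allocate(s.length() * 4);
        long start = System.nanoTime();
        for (int i = 0; i < 1_000_000; i++) {
            out.clear();
            enc.reset();
            enc.encode(CharBuffer.wrap(s), out, true);
            blackhole += out.position();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }
}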

        Andy

On 04/09/14 10:05, Rob Vesse wrote:
Thanks Andy,

I have started experimenting, more on that to follow

Rob

On 31/08/2014 15:36, "Andy Seaborne" <a...@apache.org> wrote:

On 26/08/14 21:20, Andy Seaborne wrote:
I've been working on a binary format for RDF and SPARQL result sets:

http://afs.github.io/rdf-thrift/

This is now ready to go if everyone is OK with that.

I'm flagging this up for passive consensus because it adds a new
dependency (for Apache Thrift).

And of course any questions or comments.

Summary, as an RDF syntax:

+ 3x faster to parse than N-Triples
+ same size as N-Triples, and the same compression effect with gzip
(a compression ratio of roughly 8-10).
+ Not much additional work to add because Thrift does most of the work.

      Andy

Migration done (JENA-774).  Some cleaning up to do (putting classes in
more logical places mostly) but tests in and passing.

        Andy





