Cool - my first attempt at write speed testing suggested it was about
the same as N-Triples.
Write performance testing is harder (!!!) because you need a big enough
source of data to run without the source itself affecting the numbers.
N-Triples writing has always been faster than reading - it's much closer
to "push strings straight into the output", with no per-character
mangling most of the time.
From looking at the Thrift implementation, it has to do small
char->byte conversions.
It may be faster to skip Java's built-in converter (which involves a
copy) and write chars directly to the output stream using BlockUTF8.
When I last tested, BlockUTF8 was faster for strings up to roughly 100
characters, but beyond that the JDK converter was faster.
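The direct chars->output-stream idea can be sketched roughly as below. This is a hand-rolled char->UTF-8 loop, not Jena's actual BlockUTF8 API; the point is that it writes bytes straight into the OutputStream and avoids the intermediate byte[] that String.getBytes() allocates:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Sketch of direct char->UTF-8 encoding; illustrative only, not Jena's BlockUTF8. */
public class DirectUtf8 {
    // Encode a CharSequence straight into the stream, one char at a time,
    // with no intermediate byte[] allocation.
    static void write(OutputStream out, CharSequence s) throws IOException {
        int n = s.length();
        for (int i = 0; i < n; i++) {
            char c = s.charAt(i);
            if (c < 0x80) {                            // 1 byte: ASCII
                out.write(c);
            } else if (c < 0x800) {                    // 2 bytes
                out.write(0xC0 | (c >> 6));
                out.write(0x80 | (c & 0x3F));
            } else if (Character.isHighSurrogate(c)) { // 4 bytes: surrogate pair
                int cp = Character.toCodePoint(c, s.charAt(++i));
                out.write(0xF0 | (cp >> 18));
                out.write(0x80 | ((cp >> 12) & 0x3F));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            } else {                                   // 3 bytes: rest of the BMP
                out.write(0xE0 | (c >> 12));
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        String s = "café ✓";
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        write(bos, s);
        // Sanity check: must agree byte-for-byte with the JDK converter.
        if (!Arrays.equals(bos.toByteArray(), s.getBytes(StandardCharsets.UTF_8)))
            throw new AssertionError("encodings differ");
        System.out.println("ok");
    }
}
```

The per-call cost (branching on every char) is why a hand-rolled loop can lose to the JDK's bulk converter on longer strings, matching the crossover behaviour described above.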
Andy
On 04/09/14 10:05, Rob Vesse wrote:
Thanks Andy,
I have started experimenting, more on that to follow
Rob
On 31/08/2014 15:36, "Andy Seaborne" <a...@apache.org> wrote:
On 26/08/14 21:20, Andy Seaborne wrote:
I've been working on a binary format for RDF and SPARQL result sets:
http://afs.github.io/rdf-thrift/
This is now ready to go if everyone is OK with that.
I'm flagging this up for passive consensus because it adds a new
dependency (for Apache Thrift).
And of course any questions or comments.
Summary, as an RDF syntax:
+ x3 faster to parse than N-Triples
+ same size as N-Triples, and same compression effects with gzip (8-10x
compression).
+ Not much additional work to add because Thrift does most of the work.
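To give a flavour of why Thrift does most of the work: the wire format falls out of an IDL description, from which the Thrift compiler generates all the (de)serialization code. The fragment below is a hypothetical sketch only, not the actual rdf-thrift schema (see http://afs.github.io/rdf-thrift/ for the real definitions):

```thrift
// Hypothetical sketch -- names and field layout are illustrative only.
struct RDF_Literal {
  1: required string lexical
  2: optional string langtag
  3: optional string datatype
}

union RDF_Term {
  1: string      iri
  2: string      bnodeLabel
  3: RDF_Literal literal
}

struct RDF_Triple {
  1: required RDF_Term s
  2: required RDF_Term p
  3: required RDF_Term o
}
```

Given a schema like this, reading and writing a stream of triples is a loop over generated read()/write() calls; no custom parser or serializer needs to be written by hand.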
Andy
Migration done (JENA-774). Some cleaning up to do (putting classes in
more logical places mostly) but tests in and passing.
Andy