[
https://issues.apache.org/jira/browse/JENA-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17418752#comment-17418752
]
Andy Seaborne commented on JENA-2167:
-------------------------------------
Some initial figures.
Parsing BSBM 25 million (large enough to give stable timing figures
after warm-up):
Thrift: 1000 kTPS (1 million triples per second)
Protobuf: 918 kTPS
N-Triples: 245 kTPS
The Thrift rate is faster than the last time I ran it: same hardware, same
code, newer Java (this is Java 17-ea).
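To put the rates above in terms of elapsed time, here is a small sketch of the arithmetic. It assumes that "kTPS" means thousand triples per second, that the data is exactly 25 million triples (the real BSBM file may differ slightly from its nominal size), and that each rate holds for the whole run. The class and method names are made up for illustration.

```java
public class ParseRateSketch {

    /** Seconds to parse `triples` triples at a steady rate (assumption: the rate is constant). */
    public static double seconds(long triples, double triplesPerSecond) {
        return triples / triplesPerSecond;
    }

    /** Relative throughput of one parser against another. */
    public static double ratio(double rate, double baselineRate) {
        return rate / baselineRate;
    }

    public static void main(String[] args) {
        long triples = 25_000_000L;      // nominal size of "BSBM 25 million" (assumed exact)
        double thrift = 1_000_000.0;     // triples per second, from the figures above
        double protobuf = 918_000.0;
        double ntriples = 245_000.0;

        System.out.printf("Thrift:    %.1f s%n", seconds(triples, thrift));
        System.out.printf("Protobuf:  %.1f s%n", seconds(triples, protobuf));
        System.out.printf("N-Triples: %.1f s%n", seconds(triples, ntriples));

        System.out.printf("Protobuf / Thrift:    %.3f%n", ratio(protobuf, thrift));
        System.out.printf("Thrift / N-Triples:   %.2f%n", ratio(thrift, ntriples));
        System.out.printf("Protobuf / N-Triples: %.2f%n", ratio(protobuf, ntriples));
    }
}
```

Under those assumptions this works out to roughly 25 s for Thrift, 27 s for Protobuf and 102 s for N-Triples, so Protobuf runs about 8% below Thrift and both binary formats are around four times faster than N-Triples.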
Suspicion: Protobuf is slightly slower because it does not provide
length-delimited objects, whereas the Thrift encoding is self-contained. A
graph is encoded by writing triples in a streaming fashion, each triple a
Protobuf message. The Protobuf way is to add a block length to the stream,
and the extra decoding of this is slightly inefficient (it creates two Java
objects per triple, rather than reusing existing objects).
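As an illustration of what "add a block length into the stream" involves, here is a minimal sketch of varint length-prefixed framing using only java.io. It is not the Jena or Protobuf code: the class name, the method names and the use of strings as stand-ins for encoded triples are all assumptions made for the example. The per-frame byte[] allocation in readFrame only hints at the kind of extra work a delimited read can involve; it is not a claim about which objects Protobuf creates.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class DelimitedFramingSketch {

    /** Write a non-negative int as a base-128 varint: 7 bits per byte, low bits first. */
    public static void writeVarint(OutputStream out, int value) throws IOException {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);   // high bit set: more bytes follow
            value >>>= 7;
        }
        out.write(value);
    }

    /** Read a varint length; returns -1 if the stream ends cleanly before the first byte. */
    public static int readVarint(InputStream in) throws IOException {
        int result = 0;
        for (int shift = 0; shift < 32; shift += 7) {
            int b = in.read();
            if (b == -1) {
                if (shift == 0) return -1;
                throw new EOFException("truncated length prefix");
            }
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                if (result < 0) throw new IOException("length too large");
                return result;
            }
        }
        throw new IOException("malformed length prefix");
    }

    /** One frame = length prefix + payload bytes. */
    public static void writeFrame(OutputStream out, byte[] payload) throws IOException {
        writeVarint(out, payload.length);
        out.write(payload);
    }

    /** Returns the next payload, or null at end of stream. Allocates a new array per frame. */
    public static byte[] readFrame(InputStream in) throws IOException {
        int length = readVarint(in);
        if (length < 0) return null;
        byte[] payload = in.readNBytes(length);   // InputStream.readNBytes(int), Java 11+
        if (payload.length != length) throw new EOFException("truncated frame");
        return payload;
    }

    /** Write every record as a frame, then read the frames back in order. */
    public static List<String> roundTrip(List<String> records) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String r : records) {
            writeFrame(out, r.getBytes(StandardCharsets.UTF_8));
        }
        InputStream in = new ByteArrayInputStream(out.toByteArray());
        List<String> decoded = new ArrayList<>();
        for (byte[] frame; (frame = readFrame(in)) != null; ) {
            decoded.add(new String(frame, StandardCharsets.UTF_8));
        }
        return decoded;
    }

    public static void main(String[] args) throws IOException {
        // Strings stand in for encoded triples (hypothetical data).
        List<String> triples = List.of("<s1> <p> <o1> .", "<s2> <p> <o2> .");
        System.out.println(roundTrip(triples).equals(triples));
    }
}
```

With the Protobuf Java library itself, the equivalent framing is available through writeDelimitedTo on a message and parseDelimitedFrom on its parser, which write and read a varint size before each message.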
> Provide an RDF Binary format using Protobuf
> -------------------------------------------
>
> Key: JENA-2167
> URL: https://issues.apache.org/jira/browse/JENA-2167
> Project: Apache Jena
> Issue Type: New Feature
> Affects Versions: Jena 4.2.0
> Reporter: Andy Seaborne
> Assignee: Andy Seaborne
> Priority: Major
>
> To go alongside the RDF Thrift encoding.
> Sometimes, apps want protobuf encoded RDF, e.g. for use with gRPC.