On 06/06/2019 02:12, Dan Davis wrote:
Jason,

I would argue that you should exchange a Set of triples, so you can take
advantage of Spark's distributed nature. Your logic can materialize that
set into a Graph or Model when it needs to operate on it. Andy is right
about being careful with the size: you may want to build a specialized
set that throws if it grows too large, and you may want to experiment
with it.
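
A minimal sketch of such a size-capped set, in plain Java. The class name
BoundedSet and the maxSize parameter are hypothetical, not part of Jena or
Spark. The element type is left generic (in this thread it would be a Jena
Triple), and Spark serialization is not addressed here.

```java
import java.util.AbstractSet;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

/** Hypothetical set that refuses to grow beyond a fixed number of elements. */
class BoundedSet<E> extends AbstractSet<E> {
    private final Set<E> delegate = new HashSet<>();
    private final int maxSize;

    BoundedSet(int maxSize) {
        if (maxSize < 0) {
            throw new IllegalArgumentException("maxSize must be >= 0");
        }
        this.maxSize = maxSize;
    }

    /** Adds e, or throws if e is new and the set is already at capacity. */
    @Override
    public boolean add(E e) {
        if (!delegate.contains(e) && delegate.size() >= maxSize) {
            throw new IllegalStateException(
                "BoundedSet limit of " + maxSize + " elements exceeded");
        }
        return delegate.add(e);
    }

    @Override
    public Iterator<E> iterator() {
        return delegate.iterator();
    }

    @Override
    public int size() {
        return delegate.size();
    }

    @Override
    public boolean contains(Object o) {
        return delegate.contains(o);
    }
}
```

Because addAll in AbstractSet goes through add, bulk insertion hits the
same limit. Re-adding an element that is already present never throws.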

Andy,

Does Jena Riot (or contrib) provide a binary syntax for RDF that is optimal
for fast parse?

https://jena.apache.org/documentation/io/rdf-binary.html

It's about twice as fast as N-Triples to parse, and takes about the same time to write.

    Andy
