Re: [rdflib-dev] Graph.parse with RDFXML parser performance issues

2022-01-07 Thread Nicholas Car
True, but producing an HDT file will likely require a lot more effort than an n-triples file. Brendan, what's the thing generating the RDF/XML file in the first place? On Sat, Jan 8, 2022 at 11:58 AM Wes Turner wrote: > RDFHDT is fast for *reads*; probably faster than n-triples > https://github

Re: [rdflib-dev] Graph.parse with RDFXML parser performance issues

2022-01-07 Thread Wes Turner
RDFHDT is fast for *reads*; probably faster than n-triples https://github.com/RDFLib/rdflib-hdt On Fri, Jan 7, 2022 at 8:55 PM Nicholas Car <nicholas@surroundaustralia.com> wrote: > I guess it depends on how you are producing the RDF/XML file in the first > place. If you do have control ove

Re: [rdflib-dev] Graph.parse with RDFXML parser performance issues

2022-01-07 Thread Nicholas Car
I guess it depends on how you are producing the RDF/XML file in the first place. If you do have control over that, and loading times are really an issue, produce an n-triples file as this will load the fastest! On Sat, Jan 8, 2022 at 5:06 AM Wes Turner wrote: > Out of curiosity, does performance
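One reason n-triples tends to load quickly is that it is line-oriented: each statement is a single self-contained line, with no nesting or prefix state to track. The following is only a rough illustrative sketch of that idea using the Python standard library; it is not rdflib's parser. The names `TRIPLE_RE` and `iter_simple_ntriples` and the example.org data are hypothetical, and the sketch handles only IRIs and plain quoted literals (no blank nodes, language tags, datatypes, or escapes).

```python
import re

# Matches one simple N-Triples statement: subject and predicate as IRIs,
# object as either an IRI or a plain double-quoted literal.
# Assumption: blank nodes, language tags, datatypes, and escape sequences
# are deliberately out of scope for this sketch.
TRIPLE_RE = re.compile(
    r'^<([^>]*)>\s+<([^>]*)>\s+(?:<([^>]*)>|"([^"\\]*)")\s*\.$'
)


def iter_simple_ntriples(lines):
    """Yield (subject, predicate, object) string tuples, one per input line."""
    for line in lines:
        stripped = line.strip()
        # Skip blank lines and comment lines.
        if not stripped or stripped.startswith("#"):
            continue
        match = TRIPLE_RE.match(stripped)
        if match is None:
            raise ValueError(f"unsupported or malformed line: {line!r}")
        subj, pred, obj_iri, obj_lit = match.groups()
        yield subj, pred, obj_iri if obj_iri is not None else obj_lit


# Hypothetical sample data for illustration only.
sample = [
    "# a comment line",
    "<http://example.org/a> <http://example.org/knows> <http://example.org/b> .",
    '<http://example.org/a> <http://example.org/name> "Alice" .',
]
triples = list(iter_simple_ntriples(sample))
```

Because every line can be processed independently, a reader like this needs no XML event handling or element-by-element state, which is the kind of overhead the thread is discussing for RDF/XML.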

Re: [rdflib-dev] Graph.parse with RDFXML parser performance issues

2022-01-07 Thread Wes Turner
Out of curiosity, does performance differ with defusedxml in there? (RDF)XML parser complexity really is unnecessary compared to e.g. N3, JSONLD, or RDFHDT. Does performance differ after transforming to a non-XML format? defusedxml should probably be an install_requires dependency because of the

Re: [rdflib-dev] Graph.parse with RDFXML parser performance issues

2022-01-07 Thread Brendan McMahon
Hi Nick, thanks for the response! Yes, the idea you mention is what I was considering trying next, but I thought I'd ask in here to see if there were any other ideas about handling this with what the library has built in. I will report back here with what I do if it works out! Also, the file is

Re: [rdflib-dev] Graph.parse with RDFXML parser performance issues

2022-01-07 Thread Nicholas Car
Hi Brendan, This is an interesting issue! No, I haven't encountered it, but then I never use large RDF/XML graphs. How large is your graph, by the way? If you really think the issue is the getting or testing of elements in the RDF DefinedNamespace, couldn't you just clone rdfxml.py and replace all