I think I may have tracked down what is causing the slow GET performance with the new Fuseki 0.2.8 snapshot. Comparing the output of s-get for the same data from the latest Fuseki 0.2.8 snapshot and from the 0.2.6 release, I discovered that the 0.2.8 snapshot writes the XML in hierarchical form, with nested elements (RDF/XML-ABBREV), whereas Fuseki 0.2.6 would output the RDF in the regular flattened RDF/XML format. Obviously, creating the flattened form is much more efficient.
While I understand that RDF/XML-ABBREV is more human readable, there's a big price to pay in efficiency, at least for my data. In my case, I'm accessing my Fuseki endpoint via datasetAccessor.getModel(), and as far as I know, there's no way for me to tell Fuseki through this API that I want the data serialized as N-TRIPLES (since it's just going to be loaded into a Jena model anyway and not read by a human). Is there a way I can control how Fuseki serializes by default? And why was the default serialization format changed to RDF/XML-ABBREV - is anyone really using RDF/XML as a human-readable format anymore anyway? ;-)

I'd really appreciate any advice, workarounds, or fixes for this issue. I can't really switch back to the earlier Fuseki versions, since the new jena-text makes my life so much easier: I no longer have to worry about manually reindexing after SPARQL Update, like I did with Fuseki and LARQ. Thanks for incorporating jena-text!

Thank you,
Elli

>________________________________
> From: Elli Schwarz <[email protected]>
>To: "[email protected]" <[email protected]>
>Sent: Wednesday, June 26, 2013 9:48 AM
>Subject: Problem with Fuseki generating RDF/XML
>
>Rob,
>
>(This email previously had the subject JENA-378 Redux)
>
>I think I tracked down the problem with getModel() a bit more. Using s-get, I
>can get data back as TTL immediately:
>./s-get http://localhost:3131/ds/data http://192.168.6.37/graph/uri_data
>
>If I modify the s-get script to get results as RDF/XML, then it takes several
>minutes for Fuseki 0.2.8-SNAPSHOT to respond.
>
>I start Fuseki 0.2.8 with this command (Fuseki 0.2.6 is started similarly, but
>with the config-tdb.ttl assembler):
>/usr/bin/java -Dlog4j.configuration=log4j.properties -Xmx3200M -jar
>/opt/jena-2.10/jena-fuseki-0.2.8-SNAPSHOT/fuseki-server.jar --update
>--config=config-tdb-text.ttl --port=3131
>
>If I point the same modified s-get script to the Fuseki 0.2.6 release, the
>RDF/XML comes back immediately. My guess is that the
>DatasetAccessorFactory.createHTTP("http://localhost:3131/ds/data").getModel(modelName)
>command I use gets data back as RDF/XML, and for some reason Fuseki 0.2.8
>takes a long time to generate RDF/XML. Any ideas as to what changed in the
>latest version of Fuseki that would cause this problem? Is there any way I can
>set Fuseki (or the client DatasetAccessor) to use TTL serialization?
>
>(BTW, I created JENA-479 for the other bug I discovered with SPARQL Insert
>scripts.)
>
>Thank you very much for your help,
>Elli
>
>>________________________________
>> From: Rob Vesse <[email protected]>
>>To: "[email protected]" <[email protected]>; Elli Schwarz
>><[email protected]>
>>Sent: Tuesday, June 25, 2013 4:40 PM
>>Subject: Re: JENA-378 Redux
>>
>>> I use the older stable jena-core and jena-arq 2.10.0 and jena-fuseki
>>>0.2.6
>>
>>The current stable releases are jena-core and jena-arq 2.10.1 and
>>jena-fuseki 0.2.7
>>
>>Do you experience the problem with those versions?
>>
>>Fuseki config file or arguments used to start would be useful.
>>
>>Rob
>>
>>On 6/25/13 1:35 PM, "Elli Schwarz" <[email protected]> wrote:
>>
>>>This past January, I reported a bug to this list which was recorded as
>>>JENA-378. I'm now experiencing what appears to be the same problem, where
>>>[ ] syntax in an Insert script doesn't work when using
>>>UpdateExecutionFactory:
>>>
>>>    String updateString = "INSERT {} WHERE { ?x ?p [ ?a ?b ] }";
>>>    UpdateRequest update = UpdateFactory.create(updateString);
>>>
>>>    UpdateProcessor up = UpdateExecutionFactory.createRemote(update,
>>>        "http://localhost:3131/ds/update");
>>>    up.execute();
>>>
>>>The error is: 400 Encountered " "?" "? ""
>>>caused by the client generating incorrect SPARQL with an extra ? (as
>>>viewed from the Fuseki log): INSERT { } WHERE { ?x ?p ??0 . ??0 ?a ?b }
>>>
>>>This is with jena-core & jena-arq 2.10.2-SNAPSHOT, and with jena-fuseki
>>>0.2.8-SNAPSHOT (compiled today).
>>>--
>>>Another problem I'm having which I can't track down is that the following
>>>code takes a VERY long time to execute (10 minutes):
>>>DatasetAccessorFactory.createHTTP("http://localhost:3131/ds/update").getModel(modelName);
>>>
>>>With earlier versions of Fuseki, it would take seconds, with the same
>>>data. The problem seems to be related to my Fuseki server instance
>>>itself, which is 0.2.8-SNAPSHOT (r1496513), and not to my client code,
>>>since even if I use the older stable jena-core and jena-arq 2.10.0 and
>>>jena-fuseki 0.2.6, I also have the problem (but not if I connect it to an
>>>earlier Fuseki release). Upon debugging, it appears that for some reason
>>>the HTTP request itself is taking a long time to complete. In fact, I'm
>>>not even getting anything in the Fuseki log for about a minute after the
>>>request is made, but once the request is made I immediately see a spike
>>>in CPU usage on the server. This doesn't appear to be a network latency
>>>issue, since other access to the server isn't affected; it appears to be
>>>just this call. It would seem that Fuseki is spinning its wheels on
>>>something.
>>>
>>>I realize this may not be enough info for you to determine what is
>>>causing the problem, but I don't know how else to track down the issue.
>>>Using s-get I can get back the data quickly, which is strange since I
>>>thought it would be doing the same thing as the getModel().
>>>
>>>Thank you,
>>>Elli
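
On the question in the top message about getting a serialization other than RDF/XML from the client side: one possible workaround is to bypass getModel() and issue the HTTP GET directly, naming the wanted media type in the Accept header. The following is only a minimal sketch. It assumes the /ds/data endpoint accepts the `?graph=<IRI>` form of the SPARQL Graph Store HTTP Protocol and honors the Accept header; the s-get result in the thread suggests the server can return Turtle, but this has not been verified against 0.2.8-SNAPSHOT. The class name GraphFetch and both method names are hypothetical and not part of Jena.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

// Hypothetical helper, not part of Jena: fetches a named graph over plain HTTP.
class GraphFetch {

    // Builds <service>?graph=<percent-encoded graph IRI>.
    // Assumption: the endpoint uses the Graph Store Protocol query form.
    static String graphUrl(String serviceUrl, String graphUri) {
        try {
            return serviceUrl + "?graph=" + URLEncoder.encode(graphUri, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Sends a GET with an explicit Accept header (e.g. "text/turtle").
    // The caller is responsible for closing the returned stream.
    static InputStream fetch(String serviceUrl, String graphUri, String mediaType)
            throws IOException {
        URL url = new URL(graphUrl(serviceUrl, graphUri));
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Accept", mediaType);
        int status = conn.getResponseCode();
        if (status != HttpURLConnection.HTTP_OK) {
            conn.disconnect();
            throw new IOException("Unexpected HTTP status " + status);
        }
        return conn.getInputStream();
    }
}
```

With the endpoint and graph from the thread, the call would be `GraphFetch.fetch("http://localhost:3131/ds/data", "http://192.168.6.37/graph/uri_data", "text/turtle")`, and the resulting stream could then be loaded with Jena's `model.read(in, null, "TURTLE")`. This does not change what Fuseki serializes by default; it only avoids asking for RDF/XML in the first place.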
