Re: Problem with Fuseki generating RDF/XML

Elli Schwarz Thu, 27 Jun 2013 11:39:27 -0700

I think I may have tracked down what is causing the slow GET performance with 
the new Fuseki 0.2.8 snapshot. Comparing the s-get output for the same data 
from the latest 0.2.8 snapshot and from the 0.2.6 release, I discovered that 
the 0.2.8 snapshot writes the XML in hierarchical form, with nested elements 
(RDF/XML-ABBREV), whereas Fuseki 0.2.6 writes the regular flattened RDF/XML 
format. Creating the flattened form is evidently much more efficient.


While I understand that RDF/XML-ABBREV is more human readable, there's a big 
price to pay in efficiency, at least for my data. In my case, I'm accessing my 
Fuseki endpoint via datasetAccessor.getModel(), and as far as I know, there's 
no way for me to tell Fuseki through this API that I want the data to be 
serialized as N-TRIPLES (since it's just going to be loaded in a Jena model 
anyway and not read by a human). Is there a way I can control how Fuseki 
serializes by default? And why was the default serialization format changed to 
RDF/XML-ABBREV - is anyone really using RDF/XML anymore as a human-readable 
format anyway? ;-)
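
One way to sidestep the default serialization is to issue the graph GET directly and ask for Turtle in the Accept header, which is what the unmodified s-get appears to do. The following is only a minimal sketch, assuming Java 11+ and its java.net.http client (not something used in this thread); the class name GraphStoreGet and the method buildRequest are hypothetical, and the endpoint and graph URI are the ones quoted below. It does not go through DatasetAccessor at all.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: fetch one named graph over the SPARQL Graph Store
// Protocol while choosing the serialization with an explicit Accept header.
final class GraphStoreGet {

    // Builds GET <serviceUrl>?graph=<encoded graphUri> with the given media type.
    public static HttpRequest buildRequest(String serviceUrl, String graphUri, String mediaType) {
        String query = "?graph=" + URLEncoder.encode(graphUri, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(serviceUrl + query))
                .header("Accept", mediaType)
                .GET()
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Endpoint and graph name are taken from the thread; adjust as needed.
        HttpRequest request = buildRequest(
                "http://localhost:3131/ds/data",
                "http://192.168.6.37/graph/uri_data",
                "text/turtle");
        System.out.println(request.method() + " " + request.uri());

        // Only contact the server when asked to, so the sketch can be run
        // without a Fuseki instance listening on the port.
        if (args.length > 0 && args[0].equals("--send")) {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("HTTP " + response.statusCode());
            System.out.println(response.body().length() + " characters received");
        }
    }
}
```

The Turtle body could then be loaded into a Jena model with Model.read(in, null, "TTL"). Whether this is actually faster on the server depends on the Turtle writer staying as quick as it was with s-get.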

I would really appreciate any advice, workarounds, or fixes for this issue. I 
can't really switch back to an earlier Fuseki version, because the new 
jena-text makes my life so much easier: I no longer have to worry about 
manually reindexing after a SPARQL Update, as I did with Fuseki and LARQ. 
Thanks for incorporating jena-text!

Thank you,
Elli



>________________________________
> From: Elli Schwarz <[email protected]>
>To: "[email protected]" <[email protected]> 
>Sent: Wednesday, June 26, 2013 9:48 AM
>Subject: Problem with Fuseki generating RDF/XML
> 
>
>Rob,
>
>(This email previously had the subject JENA-378 Redux) 
>
>I think I tracked down the problem with getModel() a bit more. Using s-get, I 
>can get data back as TTL immediately:
>./s-get http://localhost:3131/ds/data http://192.168.6.37/graph/uri_data
>
>
>If I modify the s-get script to get results as RDF/XML, then it takes several 
>minutes for Fuseki 0.2.8-SNAPSHOT to respond.
>
>I start Fuseki 0.2.8 with this command (Fuseki 0.2.6 is started similarly, but 
>with the config-tdb.ttl assembler):
>/usr/bin/java -Dlog4j.configuration=log4j.properties -Xmx3200M -jar 
>/opt/jena-2.10/jena-fuseki-0.2.8-SNAPSHOT/fuseki-server.jar --update 
>--config=config-tdb-text.ttl --port=3131
>
>
>If I point the same modified s-get script to the Fuseki 0.2.6 release, the 
>RDF/XML comes back immediately. My guess is that the 
>DatasetAccessorFactory.createHTTP("http://localhost:3131/ds/data").getModel(modelName)
>command I use gets data back as RDF/XML, and for some reason Fuseki 0.2.8 
>takes a long time to generate RDF/XML. Any ideas as to what changed in the 
>latest version of Fuseki that would cause this problem? Is there any way I can 
>set Fuseki (or the client DatasetAccessor) to use TTL serialization?
>
>(BTW, I created JENA-479 for the other bug I discovered with SPARQL Insert 
>scripts.)
>
>Thank you very much for your help,
>Elli
>
>
>
>>________________________________
>> From: Rob Vesse <[email protected]>
>>To: "[email protected]" <[email protected]>; Elli Schwarz 
>><[email protected]> 
>>Sent: Tuesday, June 25, 2013 4:40 PM
>>Subject: Re: JENA-378 Redux
>> 
>>
>>> I use the older stable jena-core and jena-arq 2.10.0 and jena-fuseki
>>>0.2.6
>>
>>The current stable releases are jena-core and jena-arq 2.10.1 and
>>jena-fuseki 0.2.7
>>
>>Do you experience the problem with those versions?
>>
>>Fuseki config file or arguments used to start would be useful.
>>
>>Rob
>>
>>
>>On 6/25/13 1:35 PM, "Elli Schwarz" <[email protected]> wrote:
>>
>>>This past January, I reported a bug to this list which was recorded as
>>>JENA-378. I'm now experiencing what appears to be the same problem, where
>>>[ ] syntax in an Insert script doesn't work when using
>>>UpdateExecutionFactory:
>>>
>>>  String updateString = "INSERT {} WHERE { ?x ?p [ ?a  ?b ] }";
>>>  UpdateRequest update = UpdateFactory.create(updateString);
>>>
>>>  UpdateProcessor up = UpdateExecutionFactory.createRemote(update,
>>>      "http://localhost:3131/ds/update");
>>>  up.execute();
>>>
>>>The error is: 400 Encountered " "?" "? ""
>>>caused by the client generating incorrect SPARQL with an extra ? (as
>>>viewed from the Fuseki log):  INSERT { } WHERE   { ?x ?p ??0 . ??0 ?a ?b
>>> } 
>>>
>>>This is with jena-core & jena-arq 2.10.2-SNAPSHOT, and with jena-fuseki
>>>0.2.8-SNAPSHOT (compiled today).
>>>--
>>>Another problem I'm having which I can't track down is that the following
>>>code takes a VERY long time to execute (10 minutes):
>>>DatasetAccessorFactory.createHTTP("http://localhost:3131/ds/update").getModel(modelName);
>>>
>>>With earlier versions of Fuseki, it would take seconds, with the same
>>>data. The problem seems to be related to my Fuseki server instance
>>>itself, which is 0.2.8-SNAPSHOT (r1496513), and not to my client code,
>>>since even if I use the older stable jena-core and jena-arq 2.10.0 and
>>>jena-fuseki 0.2.6, I also have the problem (but not if I connect it to an
>>>earlier Fuseki release). Upon debugging, it appears that for some reason
>>>the HTTP request itself is taking a long time to complete. In fact, I'm
>>>not even getting anything in the Fuseki log for about a minute after the
>>>request is made, but once the request is made I immediately see a spike
>>>in CPU usage on the server. This doesn't appear to be a network latency
>>>issue since other access to the server isn't affected, it appears to be
>>>just this call. It would seem that Fuseki is spinning its wheels on
>>>something. 
>>>
>>>I realize this may not be enough info for you to determine what is
>>>causing the problem, but I don't know how else to track down the issue.
>>>Using s-get I can get back the data quickly, which is strange since I
>>>thought it would be doing the same thing as the getModel().
>>>
>>>Thank you,
>>>Elli
>>
>>
>>
>
>
