RDF/XML pretty much has to be loaded in full before it can be processed (any given assertion in the XML can depend on context from elsewhere in the document), so when Fuseki receives the stream of XML, it has to assemble a complete graph before it can act on the data. You will likely find that a streamable format (e.g. N-Triples) gets the data through before Fuseki hiccups. You can use Jena's command-line tools to convert formats client-side.
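To illustrate why a line-based format helps: in N-Triples every line is a complete triple, so a converted file can be cut at any line boundary and sent in pieces. The following is only a rough sketch under assumptions, not a tested loader. The function names and the batch size are made up for illustration, the endpoint and graph URI would be the ones from your s-put command, and it assumes the server accepts POST with Content-Type application/n-triples on the Graph Store Protocol endpoint.

```python
import urllib.parse
import urllib.request
from typing import Iterable, Iterator, List


def ntriples_batches(lines: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Group N-Triples lines into batches; each line is one complete triple."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    for line in lines:
        stripped = line.strip()
        # Blank lines and comment lines carry no triples, so skip them.
        if not stripped or stripped.startswith("#"):
            continue
        batch.append(stripped)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def post_batch(endpoint: str, graph_uri: str, batch: List[str]) -> int:
    """POST one batch to the named graph (POST adds to it, PUT would replace it)."""
    url = endpoint + "?" + urllib.parse.urlencode({"graph": graph_uri})
    body = ("\n".join(batch) + "\n").encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/n-triples"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def upload(path: str, endpoint: str, graph_uri: str, batch_size: int = 50000) -> None:
    """Read the file lazily so only one batch is held in memory at a time."""
    with open(path, encoding="utf-8") as handle:
        for batch in ntriples_batches(handle, batch_size):
            post_batch(endpoint, graph_uri, batch)
```

One caveat with this approach: blank node labels are scoped to a single request, so a file that uses blank nodes should not be split this way without extra care.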
---
A. Soroka
The University of Virginia Library

> On Mar 29, 2016, at 10:19 AM, Mikael Pesonen <mikael.peso...@lingsoft.fi> wrote:
>
> Hi Osma!
>
> Does this mean that Fuseki loads the entire data into memory first? The dataset
> is bigger than the installed memory on the server. I tried 3 GB, which got it further
> but not to the finish line, so I am trying a bigger value tomorrow.
>
> Br,
> Mikael
>
> On 24.3.2016 17:19, Osma Suominen wrote:
>> Hi Mikael!
>>
>> Please check the Fuseki log, you're likely to find the problem over there.
>> s-put/SOH is just reporting that the connection to Fuseki was broken for
>> some reason.
>>
>> One probable cause is that Fuseki has run out of memory while processing the
>> upload. In that case you will need to increase the memory allocated to
>> Fuseki by setting JAVA_OPTS to a value such as "-Xmx4G". The default setting
>> of 1.2 GB doesn't allow for particularly large files to be uploaded.
>>
>> -Osma
>>
>> On 24/03/16 15:08, Mikael Pesonen wrote:
>>>
>>> Hi again!
>>>
>>> I got further now, but at some point when loading a fairly large XML file
>>> I get this:
>>>
>>> /usr/lib/ruby/1.9.1/net/protocol.rb:199:in `write': Broken pipe (Errno::EPIPE)
>>> from /usr/lib/ruby/1.9.1/net/protocol.rb:199:in `write0'
>>> from /usr/lib/ruby/1.9.1/net/protocol.rb:173:in `block in write'
>>> from /usr/lib/ruby/1.9.1/net/protocol.rb:190:in `writing'
>>> from /usr/lib/ruby/1.9.1/net/protocol.rb:172:in `write'
>>> from /usr/lib/ruby/1.9.1/net/http.rb:1956:in `send_request_with_body_stream'
>>> from /usr/lib/ruby/1.9.1/net/http.rb:1922:in `exec'
>>> from /usr/lib/ruby/1.9.1/net/http.rb:1318:in `block in transport_request'
>>> from /usr/lib/ruby/1.9.1/net/http.rb:1317:in `catch'
>>> from /usr/lib/ruby/1.9.1/net/http.rb:1317:in `transport_request'
>>> from /usr/lib/ruby/1.9.1/net/http.rb:1294:in `request'
>>> from bin/s-put:221:in `response_no_body'
>>> from bin/s-put:208:in `send_body'
>>> from bin/s-put:164:in `PUT'
>>> from bin/s-put:424:in `cmd_soh'
>>> from bin/s-put:703:in `<main>'
>>>
>>> Do you have any idea whether it is because of big file or malformed XML
>>> or something else?
>>>
>>> Br,
>>> Mikael
>>>
>>> On 24.3.2016 13:25, Osma Suominen wrote:
>>>> Hi Mikael!
>>>>
>>>> Try renaming the file to .rdf instead of .xml. It's likely that s-put
>>>> doesn't recognize the file extension .xml - after all, it could be any
>>>> kind of XML, not just RDF/XML.
>>>>
>>>> -Osma
>>>>
>>>> On 24/03/16 11:31, Mikael Pesonen wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> sorry for missing info.
>>>>> So I'm trying to:
>>>>>
>>>>> /apache-jena-fuseki-2.3.1$ bin/s-put http://localhost:3030/ds/data http://www.lingsoft.fi/geonames/ ./tmp.xml
>>>>>
>>>>> tmp.xml is a geonames entry:
>>>>>
>>>>> <?xml version="1.0" encoding="UTF-8" standalone="no"?>
>>>>> <rdf:RDF xmlns:cc="http://creativecommons.org/ns#"
>>>>> xmlns:dcterms="http://purl.org/dc/terms/"
>>>>> xmlns:foaf="http://xmlns.com/foaf/0.1/"
>>>>> xmlns:gn="http://www.geonames.org/ontology#"
>>>>> xmlns:owl="http://www.w3.org/2002/07/owl#"
>>>>> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>> xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
>>>>> xmlns:wgs84_pos="http://www.w3.org/2003/01/geo/wgs84_pos#">
>>>>> <gn:Feature rdf:about="http://sws.geonames.org/3/"><rdfs:isDefinedBy
>>>>> rdf:resource="http://sws.geonames.org/3/about.rdf"/>
>>>>> <gn:name>Zamīn Sūkhteh</gn:name><gn:alternateName xml:lang="fa">زمين
>>>>> سوخته</gn:alternateName>
>>>>> <gn:alternateName xml:lang="fa">Zamīn
>>>>> Sūkhteh</gn:alternateName><gn:featureClass
>>>>> rdf:resource="http://www.geonames.org/ontology#S"/>
>>>>> <gn:featureCode rdf:resource="http://www.geonames.org/ontology#S.CRRL"/>
>>>>> <gn:countryCode>IR</gn:countryCode>
>>>>> <wgs84_pos:lat>32.45831</wgs84_pos:lat>
>>>>> <wgs84_pos:long>48.96335</wgs84_pos:long>
>>>>> <gn:parentFeature rdf:resource="http://sws.geonames.org/127082/"/>
>>>>> <gn:parentCountry rdf:resource="http://sws.geonames.org/130758/"/>
>>>>> <gn:parentADM1 rdf:resource="http://sws.geonames.org/127082/"/>
>>>>> <gn:nearbyFeatures rdf:resource="http://sws.geonames.org/3/nearby.rdf"/>
>>>>> <gn:locationMap
>>>>> rdf:resource="http://www.geonames.org/3/zamin-sukhteh.html"/>
>>>>> </gn:Feature>
>>>>> </rdf:RDF>
>>>>>
>>>>> And error comes from Ruby:
>>>>>
>>>>> /usr/lib/ruby/1.9.1/net/http.rb:1436:in `block in initialize_http_header': undefined method `strip' for nil:NilClass (NoMethodError)
>>>>> from /usr/lib/ruby/1.9.1/net/http.rb:1434:in `each'
>>>>> from /usr/lib/ruby/1.9.1/net/http.rb:1434:in `initialize_http_header'
>>>>> from bin/s-put:205:in `send_body'
>>>>> from bin/s-put:164:in `PUT'
>>>>> from bin/s-put:424:in `cmd_soh'
>>>>> from bin/s-put:703:in `<main>'
>>>>>
>>>>> The XML looks valid, but is s-put missing some info on what that XML
>>>>> is?
>>>>>
>>>>> Br,
>>>>> Mikael
>>>>>
>>>>> On 18.3.2016 13:32, Andy Seaborne wrote:
>>>>>> > No idea? I need to update data to either running database or make
>>>>>> > a new db.
>>>>>>
>>>>>> (to a message 3 days ago ...)
>>>>>>
>>>>>> "this does not work" is a bit minimal.
>>>>>>
>>>>>> What does work? Other s-* commands? Other files?
>>>>>>
>>>>>> I'd guess that ".xml" is not recognized as RDF. It's not the right
>>>>>> file extension. The MIME type must be for the request. There's some
>>>>>> kind of determination in the soh script.
>>>>>>
>>>>>>> When trying to start another server to port 3031 server complains
>>>>>>>
>>>>>>> org.apache.jena.tdb.TDBException: Can't open database at location
>>>>>>> /home/text/tools/apache-jena-fuseki-2.3.1/run/system/
>>>>>>
>>>>>> You can't have two servers running on the same TDB files at the same
>>>>>> time. (For that matter, you can't do that with MySQL either - you need
>>>>>> a server process to mediate requests).
>>>>>>
>>>>>> Andy
>>>>>>
>>>>>>>
>>>>>>> Mikael
>>>>>>>
>>>>>>> On 15.3.2016 13:40, Mikael Pesonen wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> okay that's good to know.
>>>>>>>> I tried with s-put
>>>>>>>>
>>>>>>>> apache-jena-fuseki-2.3.1$ bin/s-put http://localhost:3030/ds/data http://www.lingsoft.fi/geonames/ tmp.xml
>>>>>>>>
>>>>>>>> /usr/lib/ruby/1.9.1/net/http.rb:1436:in `block in initialize_http_header': undefined method `strip' for nil:NilClass (NoMethodError)
>>>>>>>> from /usr/lib/ruby/1.9.1/net/http.rb:1434:in `each'
>>>>>>>> from /usr/lib/ruby/1.9.1/net/http.rb:1434:in `initialize_http_header'
>>>>>>>> from bin/s-put:205:in `send_body'
>>>>>>>> from bin/s-put:164:in `PUT'
>>>>>>>> from bin/s-put:424:in `cmd_soh'
>>>>>>>> from bin/s-put:703:in `<main>'
>>>>>>>>
>>>>>>>> tmp.xml contains one entry from Geonames dump:
>>>>>>>>
>>>>>>>> <?xml version="1.0" encoding="UTF-8" standalone="no"?><rdf:RDF
>>>>>>>> xmlns:cc="http://creativecommons.org/ns#"
>>>>>>>> xmlns:dcterms="http://purl.org/dc/terms/"
>>>>>>>> xmlns:foaf="http://xmlns.com/foaf/0.1/"
>>>>>>>> xmlns:gn="http://www.geonames.org/ontology#"
>>>>>>>> xmlns:owl="http://www.w3.org/2002/07/owl#"
>>>>>>>> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>>>>> xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
>>>>>>>> xmlns:wgs84_pos="http://www.w3.org/2003/01/geo/wgs84_pos#"><gn:Feature
>>>>>>>> rdf:about="http://sws.geonames.org/3/"><rdfs:isDefinedBy
>>>>>>>> rdf:resource="http://sws.geonames.org/3/about.rdf"/><gn:name>Zamīn
>>>>>>>> Sūkhteh</gn:name><gn:alternateName xml:lang="fa">زمين
>>>>>>>> سوخته</gn:alternateName><gn:alternateName xml:lang="fa">Zamīn
>>>>>>>> Sūkhteh</gn:alternateName><gn:featureClass
>>>>>>>> rdf:resource="http://www.geonames.org/ontology#S"/><gn:featureCode
>>>>>>>> rdf:resource="http://www.geonames.org/ontology#S.CRRL"/><gn:countryCode>IR</gn:countryCode><wgs84_pos:lat>32.45831</wgs84_pos:lat><wgs84_pos:long>48.96335</wgs84_pos:long><gn:parentFeature
>>>>>>>> rdf:resource="http://sws.geonames.org/127082/"/><gn:parentCountry
>>>>>>>> rdf:resource="http://sws.geonames.org/130758/"/><gn:parentADM1
>>>>>>>> rdf:resource="http://sws.geonames.org/127082/"/><gn:nearbyFeatures
>>>>>>>> rdf:resource="http://sws.geonames.org/3/nearby.rdf"/><gn:locationMap
>>>>>>>> rdf:resource="http://www.geonames.org/3/zamin-sukhteh.html"/></gn:Feature></rdf:RDF>
>>>>>>>>
>>>>>>>> Br,
>>>>>>>> Mikael
>>>>>>>>
>>>>>>>> On 15.3.2016 13:21, Andy Seaborne wrote:
>>>>>>>>> On 15/03/16 10:40, Mikael Pesonen wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> is it possible to add content to graph from RDF XML with command
>>>>>>>>>> line
>>>>>>>>>> tools? s-put requires SPARQL and tdbloader says
>>>>>>>>>>
>>>>>>>>>> org.apache.jena.tdb.TDBException: Can't open database at location
>>>>>>>>>> /home/text/tools/apache-jena-3.0.1/DB/ as it is already locked
>>>>>>>>>> by the
>>>>>>>>>> process with PID 7672. TDB databases do not permit concurrent
>>>>>>>>>> usage
>>>>>>>>>> across JVMs so in order to prevent possible data corruption you
>>>>>>>>>> cannot
>>>>>>>>>> open this location from the JVM that does not own the lock for the
>>>>>>>>>> dataset
>>>>>>>>>>
>>>>>>>>>> Br,
>>>>>>>>>> Mikael
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yes - use s-put or s-post.
>>>>>>>>>
>>>>>>>>> These are the SPARQL Graph Store Protocol - no query language, no
>>>>>>>>> update language.
>>>>>>>>>
>>>>>>>>> All they do is HTTP PUT or POST to the right graph name and the
>>>>>>>>> right
>>>>>>>>> content type.
>>>>>>>>>
>>>>>>>>> You can PUT and POST to the dataset itself as well using curl or
>>>>>>>>> wget
>>>>>>>>> or any HTTP tool. You need to set the Content-type header.
>>>>>>>>>
>>>>>>>>> Andy
>>>>>>>>>
>
> --
> www.lingsoft.fi
>
> Speech Applications - Language Management - Translation - Reader's and
> Writer's Tools - Text Tools - E-books and M-books
>
> Mikael Pesonen
> System Engineer
>
> e-mail: mikael.peso...@lingsoft.fi
> Tel. +358 2 279 3300
>
> Time zone: GMT+2
>
> Helsinki Office
> Eteläranta 10
> FI-00130 Helsinki
> FINLAND
>
> Turku Office
> Linnankatu 10 A
> FI-20100 Turku
> FINLAND
>