The problem is not specific to the BOM character; it seems to affect
many non-ASCII characters encoded as UTF-8.

I consistently get a return code of 204 when the Fuseki server is
running without -v, and 500 when it is running with -v, if any of the
literals contains a "strange" (non-ASCII?) character. The current
problem is the character ä (code point 228, a with diaeresis, the
German umlaut). If I remove the character, all triples of the request
are stored; if it is in a literal, none is stored.

I understand that a request sent as application/sparql-update must be
encoded as UTF-8, which my literal is. Or is some special encoding
necessary for the German a umlaut? I do not think the triples should
be encoded as Latin-1 or similar.

I tried to POST with curl and wget but did not succeed (I have little
experience with them beyond the simplest cases).

In any case, it is likely a bug that starting Fuseki with or without
-v changes the response.

Thank you for the help!

andrew

-- 
em.o.Univ.Prof. Dr. sc.techn. Dr. h.c. Andrew U. Frank
                                 +43 1 58801 12710 direct
Geoinformation, TU Wien          +43 1 58801 12700 office
Gusshausstr. 27-29               +43 1 55801 12799 fax
1040 Wien Austria                +43 676 419 25 72 mobil

On 03/28/2017 03:35 PM, Andy Seaborne wrote:
> What storage is the Fuseki server using? I can't reproduce the
> restart effect.
>
> The BOM is not the bytes xFE xFF (65279) in a SPARQL Update request;
> it's the bytes xEF xBB xBF.
>
> We are talking about what is on-the-wire, which means UTF-8 encoded
> Unicode, and codepoint 65279, U+FEFF, is 3 bytes in UTF-8: xEF xBB xBF
>
> http://unicode.org/faq/utf_bom.html#bom4
>
> The bytes xFE xFF are illegal as UTF-8, hence the message you see.
>
> $ echo -n $'\uFEFF' | od -t x1
> ==>
> 0000000 ef bb bf
> 0000003
>
> $ echo -n $'\xFE\xFF' | od -t x1
> ==>
> 0000000 fe ff
> 0000002
>
> The fact that the 500 does not say where the error in the input stream
> occurs is an unfortunate effect of efficient decoding by Java and by
> JavaCC. It processes large blocks of bytes and does not say where in
> the block the error occurred.
> This is a nuisance.
>
> What is legal is to put the Unicode escape "\uFEFF" into the SPARQL
> Update.
>
>     Andy
>
> On 28/03/17 12:07, Andrew U Frank wrote:
>> Thank you for your information. Starting Fuseki with -v does indeed
>> give more information. In this case I get
>>
>> [2017-03-28 12:45:07] Fuseki INFO [49] POST http://127.0.0.1:3030/memDB/update
>> [2017-03-28 12:45:07] Fuseki INFO [49] => Connection: close
>> [2017-03-28 12:45:07] Fuseki INFO [49] => User-Agent: haskell-HTTP/4000.3.5
>> [2017-03-28 12:45:07] Fuseki INFO [49] => Host: 127.0.0.1:3030
>> [2017-03-28 12:45:07] Fuseki INFO [49] => Accept: */*
>> [2017-03-28 12:45:07] Fuseki INFO [49] => Content-Length: 1062
>> [2017-03-28 12:45:07] Fuseki INFO [49] => Content-Type: application/sparql-update
>> [2017-03-28 12:45:07] Fuseki INFO [49] POST /memDB :: 'update' :: [application/sparql-update] ?
>> [2017-03-28 12:45:07] Fuseki WARN [49] Runtime IO Exception (client left?) RC = 500 : java.nio.charset.MalformedInputException: Input length = 1
>> org.apache.jena.atlas.RuntimeIOException: java.nio.charset.MalformedInputException: Input length = 1
>>     at org.apache.jena.atlas.io.IO.exception(IO.java:233)
>>     at org.apache.jena.fuseki.servlets.SPARQL_Update.executeBody(SPARQL_Update.java:183)
>>     at org.apache.jena.fuseki.servlets.SPARQL_Update.perform(SPARQL_Update.java:108)
>>     at org.apache.jena.fuseki.servlets.ActionSPARQL.executeLifecycle(ActionSPARQL.java:134)
>>     at org.apache.jena.fuseki.servlets.SPARQL_UberServlet.executeRequest(SPARQL_UberServlet.java:356)
>>     at org.apache.jena.fuseki.servlets.SPARQL_UberServlet.serviceDispatch(SPARQL_UberServlet.java:317)
>>     at org.apache.jena.fuseki.servlets.SPARQL_UberServlet.executeAction(SPARQL_UberServlet.java:272)
>>     at org.apache.jena.fuseki.servlets.ActionSPARQL.execCommonWorker(ActionSPARQL.java:85)
>>     at org.apache.jena.fuseki.servlets.ActionBase.doCommon(ActionBase.java:81)
>>     at org.apache.jena.fuseki.servlets.FusekiFilter.doFilter(FusekiFilter.java:73)
>>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
>>     at org.apache.shiro.web.servlet.ProxiedFilterChain.doFilter(ProxiedFilterChain.java:61)
>>     at org.apache.shiro.web.servlet.AdviceFilter.executeChain(AdviceFilter.java:108)
>>     at org.apache.shiro.web.servlet.AdviceFilter.doFilterInternal(AdviceFilter.java:137)
>>     at org.apache.shiro.web.servlet.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:125)
>>     at org.apache.shiro.web.servlet.ProxiedFilterChain.doFilter(ProxiedFilterChain.java:66)
>>     at org.apache.shiro.web.servlet.AbstractShiroFilter.executeChain(AbstractShiroFilter.java:449)
>>     at org.apache.shiro.web.servlet.AbstractShiroFilter$1.call(AbstractShiroFilter.java:365)
>>     at org.apache.shiro.subject.support.SubjectCallable.doCall(SubjectCallable.java:90)
>>     at org.apache.shiro.subject.support.SubjectCallable.call(SubjectCallable.java:83)
>>     at org.apache.shiro.subject.support.DelegatingSubject.execute(DelegatingSubject.java:383)
>>     at org.apache.shiro.web.servlet.AbstractShiroFilter.doFilterInternal(AbstractShiroFilter.java:362)
>>     at org.apache.shiro.web.servlet.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:125)
>>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
>>     at org.apache.jena.fuseki.servlets.CrossOriginFilter.handle(CrossOriginFilter.java:285)
>>     at org.apache.jena.fuseki.servlets.CrossOriginFilter.doFilter(CrossOriginFilter.java:248)
>>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
>>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
>>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1156)
>>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
>>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1088)
>>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>>     at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:374)
>>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:119)
>>     at org.eclipse.jetty.server.Server.handle(Server.java:517)
>>     at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:306)
>>     at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:242)
>>     at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:245)
>>     at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
>>     at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:75)
>>     at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:213)
>>     at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:147)
>>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
>>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
>>     at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.nio.charset.MalformedInputException: Input length = 1
>>     at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
>>     at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
>>     at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
>>     at java.io.InputStreamReader.read(InputStreamReader.java:184)
>>     at java.io.Reader.read(Reader.java:140)
>>     at org.apache.jena.atlas.io.IO.readWholeFileAsUTF8(IO.java:316)
>>     at org.apache.jena.atlas.io.IO.readWholeFileAsUTF8(IO.java:298)
>>     at org.apache.jena.fuseki.servlets.SPARQL_Update.executeBody(SPARQL_Update.java:182)
>>     ... 47 more
>> [2017-03-28 12:45:07] Fuseki INFO [49] 500 java.nio.charset.MalformedInputException: Input length = 1 (4 ms)
>>
>> The version was downloaded recently (2.5.0; is there a better one?).
>>
>> The transfer uses HTTP (I think Haskell/GHC uses libcurl) and the
>> request is now:
>>
>> callHTTP5 :
>> request POST http://127.0.0.1:3030/memDB/update HTTP/1.1
>> Accept: */*
>> Content-Length: 1062
>> Content-Type: application/sparql-update
>>
>> requestbody INSERT DATA { GRAPH <http://gerastree.at/fn2b>
>> {<http://gerastree.at/waterhouse-kw#>
>> <http://gerastree.at/lit_2014#titel> "(Krieg und Welt)"@de .
>> <http://gerastree.at/waterhouse-kw#P003>
>> <http://gerastree.at/lit_2014#hl1> "with bomUnsere Namen werden lebendig"@de .
>> <http://gerastree.at/waterhouse-kw#P003>
>> <http://gerastree.at/lit_2014#inBuch>
>> <http://gerastree.at/waterhouse-kw#> .
>> <http://gerastree.at/waterhouse-kw#P003>
>> <http://gerastree.at/lit_2014#inPart>
>> <http://gerastree.at/lit_2014#P000> .
>> <http://gerastree.at/waterhouse-kw#P003>
>> <http://gerastree.at/lit_2014#aufSeite> "L009" .
>> <http://gerastree.at/waterhouse-kw#P004>
>> <http://gerastree.at/lit_2014#paragraph> "Was ist ihm fremd und was sein eigen?\n"@de .
>> <http://gerastree.at/waterhouse-kw#P004>
>> <http://gerastree.at/lit_2014#inBuch>
>> <http://gerastree.at/waterhouse-kw#P004> .
>> <http://gerastree.at/waterhouse-kw#P004>
>> <http://gerastree.at/lit_2014#inPart>
>> <http://gerastree.at/lit_2014#P003> .
>> <http://gerastree.at/waterhouse-kw#P004>
>> <http://gerastree.at/lit_2014#aufSeite> "L011" .
>> } }
>> callHTTP5 result is is Right HTTP/1.1 500 java.nio.charset.MalformedInputException: Input length = 1
>> Date: Tue, 28 Mar 2017 10:45:07 GMT
>> Fuseki-Request-ID: 49
>> Content-Type: text/plain;charset=utf-8
>> Cache-Control: must-revalidate,no-cache,no-store
>> Pragma: no-cache
>> Content-Length: 134
>> Connection: close
>>
>> which is a "not ok" response and corresponds to the fact that nothing
>> is stored.
>>
>> I thought this could be closed and assumed I had some other problem.
>> But then I restarted Fuseki (exactly the same configuration as before,
>> --mem, but without -v) and got a different response for the same
>> request (the program producing it was not changed): this time a 204
>> answer (and no triples stored, as for the 500 response), which is
>> clearly not to be expected.
>>
>> callHTTP5 :
>> request POST http://127.0.0.1:3030/memDB/update HTTP/1.1
>> Accept: */*
>> Content-Length: 1062
>> Content-Type: application/sparql-update
>>
>> requestbody INSERT DATA { GRAPH <http://gerastree.at/fn2d>
>> {<http://gerastree.at/waterhouse-kw#>
>> <http://gerastree.at/lit_2014#titel> "with bom(Krieg und Welt)"@de .
>> <http://gerastree.at/waterhouse-kw#P003>
>> <http://gerastree.at/lit_2014#hl1> "Unsere Namen werden lebendig"@de .
>> <http://gerastree.at/waterhouse-kw#P003>
>> <http://gerastree.at/lit_2014#inBuch>
>> <http://gerastree.at/waterhouse-kw#> .
>> <http://gerastree.at/waterhouse-kw#P003>
>> <http://gerastree.at/lit_2014#inPart>
>> <http://gerastree.at/lit_2014#P000> .
>> <http://gerastree.at/waterhouse-kw#P003>
>> <http://gerastree.at/lit_2014#aufSeite> "L009" .
>> <http://gerastree.at/waterhouse-kw#P004>
>> <http://gerastree.at/lit_2014#paragraph> "Was ist ihm fremd und was sein eigen?\n"@de .
>> <http://gerastree.at/waterhouse-kw#P004>
>> <http://gerastree.at/lit_2014#inBuch>
>> <http://gerastree.at/waterhouse-kw#P004> .
>> <http://gerastree.at/waterhouse-kw#P004>
>> <http://gerastree.at/lit_2014#inPart>
>> <http://gerastree.at/lit_2014#P003> .
>> <http://gerastree.at/waterhouse-kw#P004>
>> <http://gerastree.at/lit_2014#aufSeite> "L011" .
>> } }
>> callHTTP5 result is is Right HTTP/1.1 204 No Content
>> Date: Tue, 28 Mar 2017 10:58:33 GMT
>> Fuseki-Request-ID: 28
>> Connection: close
>>
>> I hope this is enough information for you to identify a fix that
>> allows the 500 response to pass through.
>>
>> To reproduce the problem it seems to be enough to have a BOM "\65279"
>> character in a triple with a literal (perhaps at the front position,
>> but seemingly any triple in the request triggers the error response).
>>
>> Thank you for your effort. I like Fuseki a lot!
>>
>> andrew
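
One possible way to check the ä case with the same od approach used for the BOM above, as a minimal sketch. It assumes od, tr and iconv are installed; the word "Mädchen" is only an illustrative literal and is not taken from the requests above. The idea is to compare the two-byte UTF-8 form of ä with the one-byte Latin-1 form and see which of them decodes as UTF-8.

```shell
# Octal escapes keep this independent of the script file's own encoding.
# \303\244 is U+00E4 (ä) in UTF-8; \344 is the same character in Latin-1.
utf8_hex=$(printf 'M\303\244dchen' | od -An -t x1 | tr -d ' \n')
latin1_hex=$(printf 'M\344dchen' | od -An -t x1 | tr -d ' \n')
echo "UTF-8 bytes:   $utf8_hex"
echo "Latin-1 bytes: $latin1_hex"

# iconv exits non-zero when the input is not valid in the source encoding.
if printf 'M\303\244dchen' | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
  utf8_ok=yes
else
  utf8_ok=no
fi
if printf 'M\344dchen' | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
  latin1_ok=yes
else
  latin1_ok=no
fi
echo "valid as UTF-8: two-byte form=$utf8_ok, one-byte form=$latin1_ok"
```

If the client turns out to put the single byte e4 on the wire for ä, that byte sequence is not valid UTF-8, so a decoding error like the one reported for the bytes xFE xFF would be the expected result. Whether the Haskell client actually does this is an assumption that would need to be confirmed by capturing the request bytes.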