I've got a PR in now for 1499 (TIM remembers graph names even after they are empty):
https://github.com/apache/jena/pull/374

Whether or not that's the whole story here, that's a different question!

ajs6f

> On Mar 7, 2018, at 12:23 PM, Andy Seaborne <a...@apache.org> wrote:
>
>
>
> On 07/03/18 16:51, Dave Reynolds wrote:
>> On 07/03/18 16:29, Andy Seaborne wrote:
>>> JENA-1499 may have knock on effects.
>>>
>>> Adding a quad and deleting a quad and still listing the graph name would be
>>> OK but "contains graph" returns false in TIM and true in general. That
>>> might make a difference - I haven't traced that part of the code.
>> Understood.
>>> As for the :3030, ":" is one of the nasty characters in URIs but the Fuseki
>>> log looks like it is %-ed correctly.
>>>
>>> https://gist.github.com/afs/c5623911293dd19da7f6cc1b2cf64c17
>>>
>>> is an updated script to use curl and not Jena tools.
>>>
>>> There is a variable to set the graph name.
>>>
>>> Does this script illustrate the effect?
>> Yes with one correction. Line 18 should be:
>> G5X="$(wwwenc $G5)"
>> rather than:
>> G5X="$(wwwenc $G)"
>
> My bad - gist corrected.
>
>> With that correction this illustrates the effect for me.
>
> I always get:
>
> -------------------------------------------
> | s   | p          | o              | G   |
> ===========================================
> | :r4 | rdf:type   | :Resource      | :r4 |
> | :r4 | rdfs:label | "r 4 modified" | :r4 |
> -------------------------------------------
>
> 3.4.0, 3.6.0 (I do delete run/ each start-up or use the 3.6.0 basic server to
> ensure no funny stuff).
>
> This is bizarre. No blank nodes either.
>
> openjdk version "1.8.0_151"
>
>
> Andy
>
>> Dave
>>> Andy
>>>
>>>
>>> On 07/03/18 14:14, Dave Reynolds wrote:
>>>> On 07/03/18 12:59, Andy Seaborne wrote:
>>>>>
>>>>>
>>>>> On 06/03/18 22:43, Dave Reynolds wrote:
>>>>>> Hi Andy,
>>>>>>
>>>>>> Thanks for confirming you're seeing something similar, glad I wasn't
>>>>>> hallucinating!
>>>>>
>>>>> Not sure any more that I am :-|
>>>>>
>>>>> There are ghost graphs in a TIM dataset after deletion, JENA-1499, but
>>>>> I'm not seeing an empty store and do see the added triples.
>>>>
>>>> OK so this just got weirder. I just tried the script from your gist and
>>>> that works for me. Whereas my own was still failing. The only difference
>>>> is that in your initial s-put you have a slightly different graph name
>>>> (includes the port).
>>>>
>>>> So for me on 3.4.0 and 3.6.0 if I run a trimmed version of your test [1],
>>>> with the same update.ttl [2] and U1.ru [3] scripts, I see correct results:
>>>>
>>>> == Step 1a
>>>> ----
>>>> == Step 1b
>>>> <http://localhost:3030/graph5> {
>>>>     <http://localhost/test/r4>
>>>>         a <http://localhost/test/Resource> ;
>>>>         <http://www.w3.org/2000/01/rdf-schema#label>
>>>>             "r 4" .
>>>> }
>>>> ----
>>>> == Step 2a
>>>> ----
>>>> <http://localhost/test/r4> {
>>>>     <http://localhost/test/r4>
>>>>         a <http://localhost/test/Resource> ;
>>>>         <http://www.w3.org/2000/01/rdf-schema#label>
>>>>             "r 4 modified" .
>>>> }
>>>>
>>>> <http://localhost:3030/graph5> {
>>>> }
>>>> ----
>>>> == Step 3a
>>>> --------------------------------------------------------------------------------------------------------------------------------------------------
>>>> | s                          | p                                                 | o                                | G                          |
>>>> ==================================================================================================================================================
>>>> | <http://localhost/test/r4> | <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> | <http://localhost/test/Resource> | <http://localhost/test/r4> |
>>>> | <http://localhost/test/r4> | <http://www.w3.org/2000/01/rdf-schema#label>      | "r 4 modified"                   | <http://localhost/test/r4> |
>>>> --------------------------------------------------------------------------------------------------------------------------------------------------
>>>> ----
>>>>
>>>> If I run the same script but without the ":3030" in the graph name [4] I
>>>> see incorrect results from the SPARQL query but correct TRIG:
>>>>
>>>> == Step 1a
>>>> ----
>>>> == Step 1b
>>>> <http://localhost/graph5> {
>>>>     <http://localhost/test/r4>
>>>>         a <http://localhost/test/Resource> ;
>>>>         <http://www.w3.org/2000/01/rdf-schema#label>
>>>>             "r 4" .
>>>> }
>>>> ----
>>>> == Step 2a
>>>> ----
>>>> <http://localhost/graph5> {
>>>> }
>>>>
>>>> <http://localhost/test/r4> {
>>>>     <http://localhost/test/r4>
>>>>         a <http://localhost/test/Resource> ;
>>>>         <http://www.w3.org/2000/01/rdf-schema#label>
>>>>             "r 4 modified" .
>>>> }
>>>> ----
>>>> == Step 3a
>>>> -----------------
>>>> | s | p | o | G |
>>>> =================
>>>> -----------------
>>>> ----
>>>>
>>>> The only significant difference I can see is that the change in graph
>>>> names means the graphs list in a different order in the TRIG, which
>>>> suggests they hash in a different order in the store. Why that should
>>>> matter though ... ?
>>>>
>>>> This at least explains why it was so hard to isolate the problem and that
>>>> it seemed non-deterministic. Any attempt to simplify the names made the
>>>> problem go away.
>>>>
>>>> Dave
>>>>
>>>> [1] https://gist.github.com/der/d47602ffbd294b2ed572cbccb02c7778
>>>> [2] https://gist.github.com/der/26a2147d19ad495a925374e56428f7e3
>>>> [3] https://gist.github.com/der/704363855a7cca35b3f123dff136c56d
>>>> [4] https://gist.github.com/der/7e1e711c6341df417e9ff18416bd1a5c
>>>>
>>>>
>>>>>
>>>>> Andy
>>>>>
>>>>>>
>>>>>> I've tried with '' instead of "" in the shell script version of the test
>>>>>> with identical results.
>>>>>>
>>>>>> Confirmed that using --memTDB the test passes for me.
>>>>>>
>>>>>> Dave
>>>>>>
>>>>>> On 06/03/18 17:29, Andy Seaborne wrote:
>>>>>>> Weird.
>>>>>>>
>>>>>>> I ran this script with Fuseki v3.6.0 "--mem" and also "--memTDB" for
>>>>>>> steps up to and including 6.
>>>>>>>
>>>>>>> https://gist.github.com/afs/cd6953b06985dde37a9581134ec13165
>>>>>>>
>>>>>>> There's something going on with TIM because I'm seeing empty graph5 with
>>>>>>> TIM but not with TDB.
>>>>>>>
>>>>>>> I may have seen no results once and then I changed:
>>>>>>>
>>>>>>> > 6. Check the contents of the store:
>>>>>>> >
>>>>>>> > rsparql --service http://localhost:3030/ds/query "SELECT * WHERE
>>>>>>> > {Graph ?G {?s ?p ?o}} ORDER BY ?G"
>>>>>>>
>>>>>>> There are "" quotes around a "*" and it's a script. Could you try
>>>>>>> ''-quotes please?
>>>>>>>
>>>>>>> I'll try step 7.
>>>>>>>
>>>>>>> Andy
>>>>>>>
>>>>>>> On 05/03/18 23:56, Dave Reynolds wrote:
>>>>>>>> I've been trying to debug some weird behaviour in my test cases and
>>>>>>>> think they are due to a bug in memory-backed fuseki stores.
>>>>>>>>
>>>>>>>> However, the behaviour is so odd and hard to reproduce I'd like some
>>>>>>>> confirmation that someone else sees the same effect before opening a
>>>>>>>> JIRA.
>>>>>>>>
>>>>>>>> # Steps to reproduce
>>>>>>>>
>>>>>>>> [Sorry this is convoluted but all my attempts to simplify fail to show
>>>>>>>> the suspected bug.]
>>>>>>>>
>>>>>>>> 1. Fresh download of fuseki 3.6.0.
>>>>>>>>
>>>>>>>> 2. Start an in memory server in one shell:
>>>>>>>>
>>>>>>>>     fuseki-server --mem --update /ds
>>>>>>>>
>>>>>>>> 3. Create test data with two statements about one resource:
>>>>>>>>
>>>>>>>>     echo 'prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>>>>>>>>     prefix eg: <http://localhost/test/>
>>>>>>>>     eg:r4 a eg:Resource; rdfs:label "r 4" .' > update.ttl
>>>>>>>>
>>>>>>>> 4. Use the graph REST API to put the data into a named graph:
>>>>>>>>
>>>>>>>>     s-put http://localhost:3030/ds/data http://localhost/graph5 update.ttl
>>>>>>>>
>>>>>>>> 5. Run a sparql update which will delete the original statements from
>>>>>>>> the graph and add some replacement statements to a new graph:
>>>>>>>>
>>>>>>>>     rupdate --service=http://localhost:3030/ds/update '
>>>>>>>>     DELETE {GRAPH ?G {<http://localhost/test/r4> ?p ?o}}
>>>>>>>>     WHERE {GRAPH ?G {<http://localhost/test/r4> ?p ?o}};
>>>>>>>>     INSERT DATA { GRAPH <http://localhost/test/r4> {
>>>>>>>>         <http://localhost/test/r4>
>>>>>>>>             <http://www.w3.org/2000/01/rdf-schema#label> "r 4 modified" .
>>>>>>>>         <http://localhost/test/r4>
>>>>>>>>             <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
>>>>>>>>             <http://localhost/test/Resource> .
>>>>>>>>     } }'
>>>>>>>>
>>>>>>>> 6. Check the contents of the store:
>>>>>>>>
>>>>>>>>     rsparql --service http://localhost:3030/ds/query "SELECT * WHERE
>>>>>>>>     {Graph ?G {?s ?p ?o}} ORDER BY ?G"
>>>>>>>>
>>>>>>>> At this point the store *should* contain two statements in graph
>>>>>>>> http://localhost/test/r4. With a TDB-backed fuseki that's what I see.
>>>>>>>> With the memory backed fuseki I see an apparently empty store.
>>>>>>>>
>>>>>>>> If other named graphs are populated with other unrelated data they
>>>>>>>> will seem to have disappeared as well.
>>>>>>>>
>>>>>>>> 7. Now reinsert the original data, running step 4 again and check by
>>>>>>>> running step 6 again. At this point both the "missing" statements from
>>>>>>>> graph http://localhost/test/r4 reappear, as do the reinserted
>>>>>>>> original statements in http://localhost/graph5.
>>>>>>>>
>>>>>>>> # Simplifying the test case
>>>>>>>>
>>>>>>>> So far ...
>>>>>>>>
>>>>>>>> - I've failed to reproduce this outside of fuseki.
>>>>>>>>
>>>>>>>> - I've failed to reproduce this without mixing graph operations and
>>>>>>>> update operations.
>>>>>>>>
>>>>>>>> - If I reduce the inserted/updated data to a single statement
>>>>>>>> instead of a pair of statements it passes.
>>>>>>>>
>>>>>>>> - If I try with an empty TDB store it passes.
>>>>>>>>
>>>>>>>> - I get the same behaviour from 3.4.0 as from 3.6.0.
>>>>>>>>
>>>>>>>> Any of these failures may be user error on my part, it has been so
>>>>>>>> hard turning an apparently non-deterministic error into something
>>>>>>>> reproducible that I'm no longer sure of anything :(
>>>>>>>>
>>>>>>>> Am I going mad or does anyone else see the same behaviour?
>>>>>>>>
>>>>>>>> Dave
>>>>>>>>
>>>>>>>>
>>>>>>>>
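
For anyone who wants to rerun the quoted steps 4 to 6 without the Jena command-line tools, here is a rough Python sketch that builds the same three HTTP requests. It is not the script from either gist. It assumes the server from step 2 is running at http://localhost:3030/ds, that the /ds/data endpoint takes the standard `?graph=` parameter of the SPARQL 1.1 Graph Store HTTP Protocol, and that /ds/update and /ds/query accept standard SPARQL protocol requests. The function names (`put_graph_request`, `update_request`, `query_request`, `run`) are made up for illustration.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

# Assumption: dataset started as in step 2 (fuseki-server --mem --update /ds).
DATASET = "http://localhost:3030/ds"

# Test data from step 3.
TURTLE = b"""prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix eg: <http://localhost/test/>
eg:r4 a eg:Resource; rdfs:label "r 4" .
"""

# Update from step 5.
UPDATE = """
DELETE {GRAPH ?G {<http://localhost/test/r4> ?p ?o}}
WHERE  {GRAPH ?G {<http://localhost/test/r4> ?p ?o}};
INSERT DATA { GRAPH <http://localhost/test/r4> {
  <http://localhost/test/r4>
    <http://www.w3.org/2000/01/rdf-schema#label> "r 4 modified" .
  <http://localhost/test/r4>
    <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
    <http://localhost/test/Resource> .
} }
"""

# Query from step 6.
QUERY = "SELECT * WHERE {GRAPH ?G {?s ?p ?o}} ORDER BY ?G"


def put_graph_request(graph_name):
    """Step 4: PUT the Turtle data into a named graph.

    The graph name is percent-encoded in full (safe=""), so ":" and "/"
    become %3A and %2F in the query string.
    """
    url = DATASET + "/data?graph=" + quote(graph_name, safe="")
    return Request(url, data=TURTLE, method="PUT",
                   headers={"Content-Type": "text/turtle"})


def update_request():
    """Step 5: POST the SPARQL update directly as the request body."""
    return Request(DATASET + "/update", data=UPDATE.encode("utf-8"),
                   method="POST",
                   headers={"Content-Type": "application/sparql-update"})


def query_request():
    """Step 6: GET the SELECT query, asking for CSV results."""
    url = DATASET + "/query?" + urlencode({"query": QUERY})
    return Request(url, headers={"Accept": "text/csv"})


def run(graph_name):
    """Send steps 4, 5 and 6 in order; return the query result as text."""
    body = ""
    for req in (put_graph_request(graph_name), update_request(),
                query_request()):
        with urlopen(req) as resp:
            body = resp.read().decode("utf-8")
    return body
```

With a freshly started server, comparing `print(run("http://localhost/graph5"))` against `print(run("http://localhost:3030/graph5"))` (restarting the server in between) should show whether the difference reported in the thread between the two graph names also appears over plain HTTP.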