Addshore closed this task as "Invalid". Addshore added a comment.
Run an empty Blazegraph container.

  docker run -d -p 9999:9999 \
    --env WIKIBASE_SCHEME=https \
    --env WIKIBASE_HOST=intentionally-empty.wiki.opencura.com \
    --env WDQS_HOST=localhost \
    --env WDQS_PORT=9999 \
    --name demo-wdqs wikibase/wdqs:0.3.40 /runBlazegraph.sh

Wait for the service to come up, and make sure it is empty:

  curl "localhost:9999/bigdata/sparql?query=SELECT%20%2A%20WHERE%20%7B%3Fa%20%3Fb%20%3Fc%7D"

You should see something like this:

  <?xml version='1.0' encoding='UTF-8'?>
  <sparql xmlns='http://www.w3.org/2005/sparql-results#'>
    <head>
      <variable name='a'/>
      <variable name='b'/>
      <variable name='c'/>
    </head>
    <results>
    </results>
  </sparql>

Run the updater once, pointing it at some Wikibase and at the query service we just made:

  docker exec demo-wdqs /runUpdate.sh

You should see something like this, and you can kill / stop it after a few loops (Ctrl+C):

  wait-for-it.sh: waiting 300 seconds for intentionally-empty.wiki.opencura.com:80
  wait-for-it.sh: intentionally-empty.wiki.opencura.com:80 is available after 0 seconds
  wait-for-it.sh: waiting 300 seconds for localhost:9999
  wait-for-it.sh: localhost:9999 is available after 0 seconds
  Updating via http://localhost:9999/bigdata/namespace/wdq/sparql
  #logback.classic pattern: %d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n
  18:00:17.284 [main] INFO org.wikidata.query.rdf.tool.Update - Starting Updater 0.3.40 (a115a80eec974454d140389e1f52aad0e54913f9)
  18:00:18.959 [main] INFO o.w.q.r.t.change.ChangeSourceContext - Checking where we left off
  18:00:18.960 [main] INFO o.w.query.rdf.tool.rdf.RdfRepository - Checking for left off time from the updater
  18:00:19.267 [main] INFO o.w.query.rdf.tool.rdf.RdfRepository - Checking for left off time from the dump
  18:00:19.333 [main] INFO o.w.q.r.t.change.ChangeSourceContext - Defaulting start time to 30 days ago: 2021-02-15T18:00:19.333Z
  18:00:20.452 [main] INFO o.w.q.r.t.change.RecentChangesPoller - Got no real changes
  18:00:20.780 [main] INFO org.wikidata.query.rdf.tool.Updater - Polled up to 2021-02-15T18:00:19.333Z at (0.0, 0.0, 0.0) updates per second and (0.0, 0.0, 0.0) milliseconds per second
  18:00:21.066 [main] INFO o.w.q.r.t.change.RecentChangesPoller - Got no real changes
  18:00:21.067 [main] INFO org.wikidata.query.rdf.tool.Updater - Sleeping for 10 secs
  18:00:31.661 [main] INFO o.w.q.r.t.change.RecentChangesPoller - Got no real changes
  18:00:31.662 [main] INFO org.wikidata.query.rdf.tool.Updater - Sleeping for 10 secs

Run the updater again.

  docker exec demo-wdqs /runUpdate.sh

This time you should see the error:

  wait-for-it.sh: waiting 300 seconds for intentionally-empty.wiki.opencura.com:80
  wait-for-it.sh: intentionally-empty.wiki.opencura.com:80 is available after 0 seconds
  wait-for-it.sh: waiting 300 seconds for localhost:9999
  wait-for-it.sh: localhost:9999 is available after 0 seconds
  Updating via http://localhost:9999/bigdata/namespace/wdq/sparql
  #logback.classic pattern: %d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n
  18:00:55.545 [main] INFO org.wikidata.query.rdf.tool.Update - Starting Updater 0.3.40 (a115a80eec974454d140389e1f52aad0e54913f9)
  18:00:57.495 [main] INFO o.w.q.r.t.change.ChangeSourceContext - Checking where we left off
  18:00:57.496 [main] INFO o.w.query.rdf.tool.rdf.RdfRepository - Checking for left off time from the updater
  18:00:57.996 [main] INFO o.w.query.rdf.tool.rdf.RdfRepository - Found left off time from the updater
  18:00:58.000 [main] ERROR org.wikidata.query.rdf.tool.Update - Error during initialization.
  java.lang.IllegalStateException: RDF store reports the last update time is before the minimum safe poll time. You will have to reload from scratch or you might have missing data.
      at org.wikidata.query.rdf.tool.change.ChangeSourceContext.getStartTime(ChangeSourceContext.java:100)
      at org.wikidata.query.rdf.tool.Update.initialize(Update.java:145)
      at org.wikidata.query.rdf.tool.Update.main(Update.java:98)
  Exception in thread "main" java.lang.IllegalStateException: RDF store reports the last update time is before the minimum safe poll time. You will have to reload from scratch or you might have missing data.
      at org.wikidata.query.rdf.tool.change.ChangeSourceContext.getStartTime(ChangeSourceContext.java:100)
      at org.wikidata.query.rdf.tool.Update.initialize(Update.java:145)
      at org.wikidata.query.rdf.tool.Update.main(Update.java:98)

This is because the first run stored the timestamp that records how far the updater has got, and that timestamp is no longer considered "safe". It is stored as a triple, and by default is set to 30 days ago.

  curl "localhost:9999/bigdata/sparql?query=SELECT%20%2A%20WHERE%20%7B%3Fa%20%3Fb%20%3Fc%7D"

  <?xml version='1.0' encoding='UTF-8'?>
  <sparql xmlns='http://www.w3.org/2005/sparql-results#'>
    <head>
      <variable name='a'/>
      <variable name='b'/>
      <variable name='c'/>
    </head>
    <results>
      <result>
        <binding name='a'>
          <uri>https://intentionally-empty.wiki.opencura.com</uri>
        </binding>
        <binding name='b'>
          <uri>http://schema.org/dateModified</uri>
        </binding>
        <binding name='c'>
          <literal datatype='http://www.w3.org/2001/XMLSchema#dateTime'>2021-02-15T18:00:18Z</literal>
        </binding>
      </result>
    </results>
  </sparql>

If everything is safe to update, and you're not going to end up missing data, you can reset this time to a date in the last 30 days, overriding what is normally done in runUpdate.sh (https://github.com/wmde/wikibase-docker/blob/0c561dd6c17a918323b44c7282b5e5acccfd4e45/wdqs/0.3.40/runUpdate.sh#L9):

  docker exec demo-wdqs bash -c '/wdqs/runUpdate.sh -h http://${WDQS_HOST}:${WDQS_PORT} -- --wikibaseUrl ${WIKIBASE_SCHEME}://${WIKIBASE_HOST} --conceptUri ${WIKIBASE_SCHEME}://${WIKIBASE_HOST} --entityNamespaces ${WDQS_ENTITY_NAMESPACES} --init --start 20210301010101'

The date is now updated. Running the query again:

  curl "localhost:9999/bigdata/sparql?query=SELECT%20%2A%20WHERE%20%7B%3Fa%20%3Fb%20%3Fc%7D"

should show something like:

  <?xml version='1.0' encoding='UTF-8'?>
  <sparql xmlns='http://www.w3.org/2005/sparql-results#'>
    <head>
      <variable name='a'/>
      <variable name='b'/>
      <variable name='c'/>
    </head>
    <results>
      <result>
        <binding name='a'>
          <uri>https://intentionally-empty.wiki.opencura.com</uri>
        </binding>
        <binding name='b'>
          <uri>http://schema.org/dateModified</uri>
        </binding>
        <binding name='c'>
          <literal datatype='http://www.w3.org/2001/XMLSchema#dateTime'>2021-03-01T01:01:00Z</literal>
        </binding>
      </result>
    </results>
  </sparql>

I'm going to close this ticket now as its scope is rather unclear.

The case described above should not really happen during regular operation of a Wikibase, but perhaps we need to make the last step here (resetting the timestamp) more resilient, and perhaps make the default behaviour when using an empty Wikibase a bit better. This would need some collaboration between WMDE and the Wikidata Query Service team.

If people have individual bugs or feature requests then new tickets are welcome!

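For anyone repeating the reset step, the --start value has to be a date inside the last 30 days, written in the same 14-digit form as 20210301010101 in the command above. A minimal sketch of generating one follows. The function name start_timestamp is made up for illustration and is not part of the wikibase/wdqs image, and the sketch assumes GNU date (the -d option) and that the value is interpreted as UTC.

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT part of the wikibase/wdqs image.
# Prints a UTC timestamp N days in the past as 14 digits (yyyyMMddHHmmss),
# the same shape as the --start value used in the reset command.
# Assumes GNU date; BSD/macOS date does not support -d.
start_timestamp() {
    local days_ago="${1:-7}"
    # Accept only whole numbers from 0 to 29, so the result stays inside
    # the 30-day window described in the comment above.
    if ! [[ "$days_ago" =~ ^[0-9]+$ ]] || (( 10#$days_ago >= 30 )); then
        echo "days_ago must be an integer from 0 to 29" >&2
        return 1
    fi
    date -u -d "$((10#$days_ago)) days ago" +%Y%m%d%H%M%S
}

# Example: a start time one week back.
start_timestamp 7
```

The printed value could then be substituted for 20210301010101 in the docker exec command. Whether a given start time is actually safe still depends on no edits having been missed before it.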
TASK DETAIL
  https://phabricator.wikimedia.org/T186161

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Addshore
Cc: RShigapov, danshick-wmde, Samantha_Alipio_WMDE, darthmon_wmde, WMDE-leszek, Superraptor123, Tinyttt, Louperivois, Jsamwrites, Considering.Different.Routes, DarTar, Addshore, Andrawaag, Aklapper, maantietaja, Akuckartz, Jelabra, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Asahiko, abian, despens, Wikidata-bugs, aude, Mbch331