[Wikidata-bugs] [Maniphest] T290438: wikidata query results are not stable / reliable

2021-09-14 Thread Herzi.Pinki
Herzi.Pinki added a comment.


  T291006: please reload https://www.wikidata.org/wiki/Q37986974

TASK DETAIL
  https://phabricator.wikimedia.org/T290438

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Herzi.Pinki
Cc: Gehel, Krd, Aklapper, Herzi.Pinki, Invadibot, MPhamWMF, maantietaja, 
CBogen, Akuckartz, Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, 
GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, 
Manybubbles, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T290438: wikidata query results are not stable / reliable

2021-09-14 Thread Gehel
Gehel closed this task as "Declined".
Gehel added a comment.


  Data drift is a known issue. The update process is imperfect and some level 
of data drift is expected. We're planning a full data reload as part of the 
move to our new Updater process later this month (T244590), which should fix 
the current issues. We expect the new updater to be more resilient to these 
kinds of issues, but it is still going to be imperfect.
  
  I've added some minimal documentation: 
https://wikitech.wikimedia.org/wiki/Wikidata_Query_Service#Known_limitations
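
  One way to tell drift in the query service apart from a problem in the item 
itself is to read the statement straight from Wikidata and compare it with what 
the query returned. The following is only a sketch under assumptions: it uses 
the public Special:EntityData JSON export, the names `fetch_entity`, 
`claim_values` and `USER_AGENT` are made up for this example, and the network 
call is shown but not exercised.

```python
import json
from urllib.request import Request, urlopen

# Hypothetical identifier for this sketch; replace with a descriptive one for a real bot.
USER_AGENT = "drift-check-sketch/0.1 (hypothetical example)"


def fetch_entity(qid: str) -> dict:
    """Download the item's JSON document from Special:EntityData."""
    url = f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"
    with urlopen(Request(url, headers={"User-Agent": USER_AGENT})) as resp:
        return json.load(resp)


def claim_values(entity_doc: dict, qid: str, pid: str) -> list:
    """Collect the main values of all statements the item has for one property."""
    claims = entity_doc["entities"][qid].get("claims", {})
    values = []
    for statement in claims.get(pid, []):
        datavalue = statement["mainsnak"].get("datavalue")
        if datavalue is not None:  # "novalue"/"somevalue" snaks carry no datavalue
            values.append(datavalue["value"])
    return values
```

  If `claim_values(fetch_entity("Q37986974"), "Q37986974", "P9154")` contains 
the HERIS ID while a query run omits the item, the discrepancy is on the query 
service side rather than in the item.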



[Wikidata-bugs] [Maniphest] T290438: wikidata query results are not stable / reliable

2021-09-07 Thread Bugreporter
Bugreporter added a project: Wikidata-Query-Service.



[Wikidata-bugs] [Maniphest] T290438: wikidata query results are not stable / reliable

2021-09-06 Thread Herzi.Pinki
Herzi.Pinki created this task.
Herzi.Pinki added a project: Wikidata.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  **List of steps to reproduce** (step by step, including full links if 
applicable):
  
  - run
  
  
https://query.wikidata.org/#%23%20Objekte%20mit%20ObjektID%2C%20HERIS-ID%2C%20WienerWohnenIds%20oder%20TirolerKunstkatasterIds%2C%20Wiener%20Kulturgut%0ASELECT%20DISTINCT%20%3Fitem%20%3Fqid5%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%28GROUP_CONCAT%28DISTINCT%20%3FObjektID%3B%20SEPARATOR%3D%27%2C%20%27%29%20AS%20%3FObjektIDs%29%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%28GROUP_CONCAT%28DISTINCT%20%3FHERISId%3B%20SEPARATOR%3D%27%2C%20%27%29%20AS%20%3FHERISIds%29%0A%20%20%20WHERE%20%7B%0A%20%20%20%3Fitem%20wdt%3AP17%20wd%3AQ40%20.%20%23%20%C3%B6sterreich%0A%20%20%20%7B%20%3Fitem%20wdt%3AP2951%20%3FObjektID.%0A%20%20%20%7DUNION%7B%20%3Fitem%20wdt%3AP9154%20%3FHERISId.%0A%20%20%20%7D%0A%20%20%20%23%3Fitem%20wdt%3AP9154%20%2239086%22%0A%20%20%20bind%20%28replace%28xsd%3Astring%28%3Fitem%29%2C%22http%3A%2F%2Fwww.wikidata.org%2Fentity%2F%22%2C%22%22%29%20as%20%3Fqid5%29%0A%7D%20GROUP%20BY%20%3Fitem%20%3Fqid5%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%3FObjektIDs%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%3FHERISIds%0A
  (sorry, no simpler statement found)
  
  - run multiple times
  - this will yield either 39043 or 39044 matches (by chance)
  - if only 39043 matches are returned, the missing item is always the one 
with ?item wdt:P9154 "39086" 
(https://www.wikidata.org/wiki/Q37986974, column HERISIds in the result). 
The missing item was last changed in May 2021.
  - sorry, but as it happens by chance, I cannot give a clearer step-by-step 
procedure.
  - I didn't get the same unstable behaviour when checking only item Q37986974 
by property value:
  
  
https://query.wikidata.org/#%23%20Objekte%20mit%20ObjektID%2C%20HERIS-ID%2C%20WienerWohnenIds%20oder%20TirolerKunstkatasterIds%2C%20Wiener%20Kulturgut%0ASELECT%20DISTINCT%20%3Fitem%20%3Fqid5%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%28GROUP_CONCAT%28DISTINCT%20%3FObjektID%3B%20SEPARATOR%3D%27%2C%20%27%29%20AS%20%3FObjektIDs%29%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%28GROUP_CONCAT%28DISTINCT%20%3FHERISId%3B%20SEPARATOR%3D%27%2C%20%27%29%20AS%20%3FHERISIds%29%0A%20%20%20WHERE%20%7B%0A%20%20%20%3Fitem%20wdt%3AP17%20wd%3AQ40%20.%20%23%20%C3%B6sterreich%0A%20%20%20%7B%20%3Fitem%20wdt%3AP2951%20%3FObjektID.%0A%20%20%20%7DUNION%7B%20%3Fitem%20wdt%3AP9154%20%3FHERISId.%0A%20%20%20%7D%0A%20%20%20%3Fitem%20wdt%3AP9154%20%2239086%22%0A%20%20%20bind%20%28replace%28xsd%3Astring%28%3Fitem%29%2C%22http%3A%2F%2Fwww.wikidata.org%2Fentity%2F%22%2C%22%22%29%20as%20%3Fqid5%29%0A%7D%20GROUP%20BY%20%3Fitem%20%3Fqid5%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%3FObjektIDs%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%3FHERISIds%0A
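
  The links above carry the SPARQL text percent-encoded in the URL fragment, 
and the reproduction comes down to comparing the item sets of repeated runs. A 
minimal sketch of both steps, assuming each run has already been collected as a 
set of item IDs; the helper names `decode_query` and `unstable_items` are 
invented for this example and are not part of any Wikidata tooling.

```python
from urllib.parse import unquote, urlsplit


def decode_query(url: str) -> str:
    """Return the SPARQL text stored in the fragment of a query.wikidata.org link."""
    return unquote(urlsplit(url).fragment)


def unstable_items(runs: list[set[str]]) -> set[str]:
    """Return the items that appear in at least one run but not in every run."""
    if not runs:
        return set()
    return set.union(*runs) - set.intersection(*runs)


if __name__ == "__main__":
    # Toy data standing in for two runs of the same query.
    run_a = {"Q1", "Q2", "Q37986974"}
    run_b = {"Q1", "Q2"}
    print(unstable_items([run_a, run_b]))
```

  With the two runs described in this task, the set returned by 
`unstable_items` would be expected to contain only Q37986974.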
  
  **What happens?**:
  It seems to be a caching problem / cache invalidation problem / cache refresh 
problem. Is there more than one front-end server doing the caching? Could it 
depend on round-robin load balancing etc.? We have a bot process based on the 
query above, and it is essential that the bot gets all the results every time.
  I got the same weird behaviour for another item, which I purged (I don't know 
whether purging helps on Wikidata) and then added / removed a property to be 
sure. But purging is not the solution, as in most cases you will have no 
indication that an item is missing from the results. I did not observe more 
than one missing item at the same time.
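
  Until the underlying cause is fixed, a bot could at least detect the 
situation by running the query several times and merging the results. This is a 
sketch of a workaround, not a fix: it assumes the missing item shows up in at 
least one of the runs, and `run_query` is a placeholder for whatever function 
the bot already uses to fetch item IDs.

```python
from typing import Callable, Iterable


def merged_results(
    run_query: Callable[[], Iterable[str]], attempts: int = 3
) -> tuple[set[str], bool]:
    """Run the query `attempts` times; return the union and whether all runs agreed."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    runs = [set(run_query()) for _ in range(attempts)]
    union = set.union(*runs)
    # The runs agree exactly when every single run already equals the union.
    stable = all(run == union for run in runs)
    return union, stable
```

  When the second return value is `False`, the bot can log the disagreement and 
decide whether to proceed with the union or to stop.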
  
  **What should have happened instead?**:
  Query results should be guaranteed to be complete and correct, or the query 
should fail with an error message / error status (except for replication 
delays etc.).
  
  **Software version (if not a Wikimedia wiki), browser information, 
screenshots, other information, etc**:
