[Wikidata-bugs] [Maniphest] [Commented On] T177275: Add ordinal variable to MWAPI service calls
gerritbot added a comment.

Change 388287 merged by jenkins-bot:
[wikidata/query/rdf@master] Add option to fetch ordinal of the result in MWAPI query:
https://gerrit.wikimedia.org/r/388287

TASK DETAIL
https://phabricator.wikimedia.org/T177275

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: gerritbot
Cc: gerritbot, Eloquence, Aklapper, Smalyshev, Lahi, Lordiis, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Adik2382, Th3d3v1ls, Ramalepe, Liugev6, QZanden, EBjune, merbst, Avner, Lewizho99, Maathavan, debt, Gehel, Jonas, FloNight, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
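With the change above merged, an MWAPI service call can bind each API result's position to a variable. A minimal sketch of such a query (the endpoint, search term, and output variable names are illustrative; the exact spelling `wikibase:apiOrdinal` is taken from the patch title and the usual WDQS MWAPI conventions, so treat it as an assumption):

```sparql
# Sketch: full-text search via the MWAPI service, keeping each hit's rank.
SELECT ?title ?ordinal WHERE {
  SERVICE wikibase:mwapi {
    bd:serviceParam wikibase:endpoint "en.wikipedia.org" ;
                    wikibase:api "Search" ;
                    mwapi:srsearch "Berlin" .
    ?title wikibase:apiOutput mwapi:title .
    ?ordinal wikibase:apiOrdinal true .
  }
}
ORDER BY ?ordinal
```

Ordering by the ordinal preserves the API's own result ranking, which is otherwise lost once the results enter the SPARQL solution set.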
[Wikidata-bugs] [Maniphest] [Updated] T114904: Migrate wb_items_per_site to using prefixed entity IDs instead of numeric IDs
hoo added a comment.

Note: while looking at this I also found T179793: Consider dropping the "wb_items_per_site.wb_ips_site_page" index… maybe the two can be done at once?!

TASK DETAIL
https://phabricator.wikimedia.org/T114904
[Wikidata-bugs] [Maniphest] [Created] T179793: Consider dropping the "wb_items_per_site.wb_ips_site_page" index
hoo created this task.
hoo added projects: Wikidata, MediaWiki-extensions-WikibaseRepository, DBA.
Herald added a subscriber: Aklapper.

TASK DESCRIPTION
From db1070:

  KEY `wb_ips_site_page` (`ips_site_page`),

This index is useful for queries that look up a given linked page by title (like "Berlin") without knowing the site id (like "enwiki"). We don't run these kinds of queries within the software, and I can barely imagine a purpose for them. The only way to use this via Wikibase is SiteLinkLookup::getLinks, which allows such queries to be crafted (but no one does this currently). The method states:

  Note: if the conditions are not very selective the result set can be very big.
  Thus the caller is responsible for not executing too expensive queries in its context.

If we want, we could make the implementations of getLinks throw if $siteIds is not set, but I'm not sure that's even needed here (as per the above comment).

TASK DETAIL
https://phabricator.wikimedia.org/T179793
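For concreteness, dropping the index discussed above is a single statement (a sketch only; an actual production change would be scheduled and reviewed with the DBAs rather than run as raw DDL):

```sql
-- Sketch: remove the ips_site_page-only index from wb_items_per_site.
ALTER TABLE wb_items_per_site DROP INDEX wb_ips_site_page;
```

Queries that filter on both site id and page title would be unaffected, since they are served by a composite index rather than this single-column one.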
[Wikidata-bugs] [Maniphest] [Commented On] T114904: Migrate wb_items_per_site to using prefixed entity IDs instead of numeric IDs
hoo added a comment.

Given the size of the table, changing this shouldn't be overly horrible. It's a fair bit of migration work… but I assume doing this for maintenance queries and consistency is worth it.

TASK DETAIL
https://phabricator.wikimedia.org/T114904
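The value transformation at the heart of the migration is mechanical: a Wikidata item's prefixed entity ID is its numeric ID with a "Q" prefix. A minimal sketch in JavaScript (the helper names are hypothetical; the real migration would be a batched maintenance script, not per-row application code):

```javascript
// Sketch: convert between numeric item IDs and prefixed entity IDs.
// Wikidata items use the "Q" prefix; helper names here are hypothetical.
function numericToPrefixed(numericId) {
  return 'Q' + numericId;
}

function prefixedToNumeric(prefixedId) {
  const match = /^Q(\d+)$/.exec(prefixedId);
  if (!match) {
    throw new Error('Not a prefixed item ID: ' + prefixedId);
  }
  return Number(match[1]);
}

console.log(numericToPrefixed(64));     // → "Q64" (Berlin)
console.log(prefixedToNumeric('Q64'));  // → 64
```

The round trip is lossless for items, which is what makes the migration "a fair bit of work" only in volume, not in logic.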
[Wikidata-bugs] [Maniphest] [Created] T179792: Watching more pages in Wikidata
ChristianKl created this task.
ChristianKl added a project: Wikidata.
Herald added a subscriber: Aklapper.

TASK DESCRIPTION
One of Wikidata's core problems at the moment is that it doesn't get enough review of edits, so vandalism often takes a while to get reverted. This is partly because information is spread over more pages than on Wikipedia. Imagine ten Wikidata items, each edited by 3 people for 10 edits apiece, versus one Wikipedia article with 100 edits by 30 people: the total number of edits is the same, but the Wikipedia article is on the watchlists of many people, while each Wikidata item is watched by only three.

What could be done to change this? Currently a page is added to the watchlist when a new statement is created, but only the page on which the statement is made is added. We need more events where pages get added to watchlists:

- Adding properties to the watchlist when they are used
- Adding the linked property to the watchlist

That means when I add the statement "hand" "anatomical location" "free upper limb", all the involved pages plus the relevant talk pages are added to my watchlist. At the time of this writing the talk page for "anatomical location" has a "Number of page watchers who visited recent edits" of 4. That makes it effectively impossible to discuss how the property should be used with the other people who use it. If the property were automatically added to the watchlist whenever it is used, a discussion about how "anatomical location" should be used would actually be seen by the people who use the property. That's valuable for standardizing its usage, and it brings new users into contact with discussions sooner. The second issue this solves is being able to quickly clean up vandalism to the labels of highly used properties and items.

As Wikidata gets used more, it becomes increasingly unacceptable that vandalism on the item of a country takes hours to revert. This change would increase the number of watchers of such highly used pages enough that vandalism gets reverted quickly.

Does this create watchlist overload? There should be an option under "Preferences/Watchlist" to turn the feature off, in line with the current ability to manage which pages get added to the watchlist; that lets high-activity users solve the problem for themselves. On the other hand, this feature might increase the need to filter the watchlist by language: if 5,000 people follow the item for Germany and labels are added in 20 seldom-used languages, it's likely not efficient for 5,000 people who can't read those languages to see the edits. The work required for filtering seems doable and not directly urgent.

TASK DETAIL
https://phabricator.wikimedia.org/T179792
[Wikidata-bugs] [Maniphest] [Commented On] T170779: Wikidata search suggestions do not display on screen if character whose decomposition contains nukta is present in search query
hoo added a comment.

> In T170779#3734809, @Smalyshev wrote:
> @Snaterlicious, @hoo, @thiemowmde Do you know why the check is there and what it is meant to be doing?
>
> @tstarling raised the following concern:
>
>> The search term is normalized by the server using $wgContLang->normalize(), which potentially includes transformations beyond NFC, especially if the content language is Arabic or Malayalam. So even if you do client-side NFC using the same version of Unicode as the server, there is at least a hypothetical possibility of a hang.

Replied on gerrit.

TASK DETAIL
https://phabricator.wikimedia.org/T170779
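To illustrate the normalization subtlety behind this bug: the precomposed Devanagari nukta letters (U+0958 through U+095F) are Unicode composition exclusions, so NFC actually converts them to their decomposed base-plus-nukta form rather than leaving them alone. Even a plain client-side String.prototype.normalize('NFC') therefore changes such input, which is exactly the class of character this task is about (an illustration only, not the actual suggester code):

```javascript
// DEVANAGARI LETTER QA (U+0958) canonically decomposes to
// KA (U+0915) + NUKTA (U+093C). Because U+0958 is a composition
// exclusion, NFC maps the precomposed form TO the decomposed pair.
const precomposed = '\u0958';
const decomposed = '\u0915\u093C';

console.log(precomposed.normalize('NFD') === decomposed); // true
console.log(precomposed.normalize('NFC') === decomposed); // true: NFC decomposes it
console.log(decomposed.normalize('NFC') === decomposed);  // true: never re-composed
```

So a client and server that disagree on normalization behavior for these characters will never converge on the same string, which is how a comparison loop can fail to terminate.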
[Wikidata-bugs] [Maniphest] [Commented On] T175230: Wikidata identifier links don't respect nofollow configuration
ChristianKl added a comment.

Quora links to us with do-follow and we link to them with do-follow. In this case, I don't see a problem.

TASK DETAIL
https://phabricator.wikimedia.org/T175230
[Wikidata-bugs] [Maniphest] [Commented On] T179681: Add HDT dump of Wikidata
Addshore added a comment.

@Smalyshev we discussed dumping the JNL files used by Blazegraph directly at points during WikidataCon. I'm aware that isn't an HDT dump, but I'm wondering if this would help in any way.

TASK DETAIL
https://phabricator.wikimedia.org/T179681
[Wikidata-bugs] [Maniphest] [Unassigned] T143424: [Task] Explore the Entity Relevancy Scoring for Wikidata
thalhamm removed thalhamm as the assignee of this task.

TASK DETAIL
https://phabricator.wikimedia.org/T143424
[Wikidata-bugs] [Maniphest] [Commented On] T175230: Wikidata identifier links don't respect nofollow configuration
Nemo_bis added a comment.

This is still happening.

TASK DETAIL
https://phabricator.wikimedia.org/T175230
[Wikidata-bugs] [Maniphest] [Commented On] T179681: Add HDT dump of Wikidata
Arkanosis added a comment.

FWIW, I've just tried to convert the ttl dump of the 1st of November 2017 on a machine with 378 GiB of RAM and 0 GiB of swap and… well… it failed with std::bad_alloc after more than 21 hours of runtime. Granted, there was another process eating ~100 GiB of memory, but I thought it would be okay — so I'm proved wrong. As I was optimistic, I ran the conversion directly from the ttl.gz file, maybe preventing some memory-mapping optimization, and also added the -i flag to generate the index at the same time. I'll re-run the conversion without these in the hope of finally getting the hdt file.

So, here are the statistics I got:

  $ /usr/bin/time -v rdf2hdt -f ttl -i -p wikidata-20171101-all.ttl.gz wikidata-20171101-all.hdt
  Catch exception load: std::bad_alloc
  ERROR: std::bad_alloc
  Command exited with non-zero status 1
  Command being timed: "rdf2hdt -f ttl -i -p wikidata-20171101-all.ttl.gz wikidata-20171101-all.hdt"
  User time (seconds): 64999.77
  System time (seconds): 10906.79
  Percent of CPU this job got: 99%
  Elapsed (wall clock) time (h:mm:ss or m:ss): 21:13:25
  Average shared text size (kbytes): 0
  Average unshared data size (kbytes): 0
  Average stack size (kbytes): 0
  Average total size (kbytes): 0
  Maximum resident set size (kbytes): 200475524
  Average resident set size (kbytes): 0
  Major (requiring I/O) page faults: 703
  Minor (reclaiming a frame) page faults: 8821385485
  Voluntary context switches: 36774
  Involuntary context switches: 4514261
  Swaps: 0
  File system inputs: 81915000
  File system outputs: 2767696
  Socket messages sent: 0
  Socket messages received: 0
  Signals delivered: 0
  Page size (bytes): 4096
  Exit status: 1
  /usr/bin/time -v rdf2hdt -f ttl -i -p wikidata-20171101-all.ttl.gz  64999,77s user 10906,80s system 99% cpu 21:13:25,50 total

NB: the exceptionally long runtime is the result of the conversion being single-threaded while the machine has a lot of threads but relatively low per-thread performance (2.3 GHz). The process wasn't under memory pressure until it crashed (no swap anyway) and wasn't waiting much for I/O — so it was all CPU-bound.

TASK DETAIL
https://phabricator.wikimedia.org/T179681