tfmorris added a comment.
I'm surprised that this hasn't received any attention in 15 months. As an
update to @Nikki's numbers <https://phabricator.wikimedia.org/T303677#7789434>
there are now on the order of 2.5 **BILLION** of these bot-generated
descriptions. The top 5 alone rep
tfmorris added a comment.
@Manuel when you write:
> A new feature that would solve this problem is already planned, but it does
not exist yet (see T303677 <https://phabricator.wikimedia.org/T303677>).
Thanks for the pointer! What does "planned" mean in this cont
tfmorris added a comment.
I have a theory as to where a big chunk of the machine-generated descriptions
are from. They are the phrase "Wikimedia category" in hundreds of languages as
a textual transcription of the triple `instanceOf Q4167836`. For example,
Catégorie:Naissance à Se
tfmorris added a comment.
Is triple count the only important parameter? It seems likely that the
descriptions could be larger, on average, than labels.
It seems odd that there are more descriptions (19% of total) than labels
(5%), although that agrees with what the previous study found
tfmorris added a comment.
How does one discover what the resolution was? (Apologies if this should be
obvious, but I'm used to bug trackers which link the commits back to the issue.)
TASK DETAIL
https://phabricator.wikimedia.org/T329093
EMAIL PREFERENCES
https
tfmorris added a comment.
I vote for full URLs. Also, HTTPS URLs should probably be used throughout in
preference to HTTP URLs to save naive clients from the extra latency of a
redirect.
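As a rough illustration of the HTTPS preference, here is a minimal sketch in
Python that rewrites an http:// URL to https:// before it is emitted, so a
client never has to follow the redirect. The helper name `prefer_https` and the
example URL are hypothetical, and the sketch assumes the host serves the same
resource over HTTPS.

```python
from urllib.parse import urlsplit, urlunsplit


def prefer_https(url: str) -> str:
    """Return url with an http scheme rewritten to https.

    Hypothetical helper: it assumes the host also answers on HTTPS.
    URLs with any other scheme are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.scheme == "http":
        # SplitResult is a named tuple, so _replace returns a modified copy.
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


if __name__ == "__main__":
    print(prefer_https("http://www.wikidata.org/entity/Q42"))
```

Only the scheme is changed; the host, path, query, and fragment pass through
as they were.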
TASK DETAIL
https://phabricator.wikimedia.org/T329093
tfmorris added a comment.
https://isa.toolforge.org/ and https://wikishootme.toolforge.org/ were also
down about the same time (11:03 Eastern US).
TASK DETAIL
https://phabricator.wikimedia.org/T262550
tfmorris added a comment.
In T240442#5851541 <https://phabricator.wikimedia.org/T240442#5851541>,
@Addshore wrote:
> In T240442#5834866 <https://phabricator.wikimedia.org/T240442#5834866>,
@Ladsgroup wrote:
>
>> Very broad idea, feel free to discard, I t
tfmorris added a comment.
If the manifest has to be constructed by hand, it seems like YAML would be a
better format than JSON. They are equivalent from a structural and
informational point of view, but YAML is **much** easier to edit without
creating invalid documents.
TASK DETAIL
https
tfmorris added a comment.
The so-called "Freebase" dataset is actually a mix of data from Freebase and
a bunch of URLs that were pulled from Google web crawls by an intern as
potential "evidence." They don't have anything to do with the provenance of the
data that wa
tfmorris added a comment.
It would seem like the 2018-03-13 spreadsheet should be adequate to call this
task complete. I would recommend including some qualitative understanding of
the source of the Freebase data in addition to just pure curation ratio when
making judgements about how
tfmorris updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T237925
To: tfmorris
Cc: tfmorris, Matlin, Lucas_Werkmeister_WMDE, Michael, So9q, Hjfocs,
ChristianKl, Tpt, Pintoch
tfmorris added a comment.
It seems bizarre that the utility of this is debated. The solution suggested
by Bene sounds simple, straightforward, and useful.
TASK DETAIL
https://phabricator.wikimedia.org/T126510
tfmorris added a comment.
I think these are likely two different bugs. Has anyone looked at either of
them in the last 4 months?
Here are some other examples of aude's bug:
https://www.wikidata.org/wiki/Q5260247?debug=1
https://www.wikidata.org/wiki/Q4636?debug=1
It's