Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
Yes. This should be a client feature, not a Wikidata feature (so something that lives on Wikipedia and Commons).

On Fri, Aug 21, 2015 at 10:54 PM, Jan Ainali jan.ain...@wikimedia.se wrote:

I am with Ryan here, and I believe that is Magnus's idea too: the autodescription should not be a field in the database; it should be queried on the fly from the statements.
___ Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
I am with Ryan here, and I believe that is Magnus's idea too: the autodescription should not be a field in the database; it should be queried on the fly from the statements.

Kind regards,
Jan Ainali
Executive Director, Wikimedia Sverige http://wikimedia.se
0729 - 67 29 48

*Imagine a world in which every single human being has free access to the sum of all human knowledge. That's what we do.*
Become a member: http://blimedlem.wikimedia.se
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
If the way to 'edit' the autodescription is by changing the claims for the item, I support the idea. I would oppose, however, the autodescription being another text field you can edit directly, as I think this would be very confusing for Wikidata editors: each item would effectively just have 2 interchangeable description fields.
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
On Thu, Aug 20, 2015 at 9:24 AM Magnus Manske magnusman...@googlemail.com wrote:

"This is why the automatic description cache and the manual description need to be kept separate; just pasting the autodesc into the manual description field would mean it could never be updated automatically. That would be very bad indeed."

+1000 Exactly! I was operating under the assumption we were talking about the existing description field. Separate auto and manual description fields completely avoid *all* of the issues/concerns I raised :)
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
This is a really interesting discussion and it seems that there is near-consensus that an automated description for entities without a manual description is not a bad idea, particularly if they are kept in a separate field. Speak now if you feel that is not correct.

To S's suggestion: what steps do we need to take to put autodesc into wikis?
- establish consensus with stakeholders outside this thread?
- create new field?
- rule out/protect against edge cases (are there length limits, for instance?)
- ways to edit (explaining to a user how they can edit or override is going to be important)

Who should own it and create an epic to track? Wikidata, Search, Reading?
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
On Fri, Aug 21, 2015 at 11:21 AM, Jon Katz jk...@wikimedia.org wrote:

"To S's suggestion: what steps do we need to take to put autodesc into wikis?"

No! Saying "put" or "store" produces resistance. This is about when and where to _display_ an AutoDesc that's generated on-the-fly from Wikidata. Caching it is an optimization detail. The second message in this thread said "Rather, cache [auto] descriptions separately, and update them as required", yet we keep reviving a dead horse.

"- establish consensus with stakeholders outside this thread?"

I think the Reading team can decide to show the AutoDesc on lead images and in mobile search results when there's no Wikidata description.

"- create new field?"

Never. Cache it in RESTBase.

"- rule out/protect against edge cases (are there length limits, for instance?)
- ways to edit (explaining to a user how they can edit or override is going to be important)"

I think Monte's excellent prototype of editing descriptions on Mobile (T90765) should show the AutoDesc, as in "Try to write something better than this." However, Lydia Pintscher declined my T109772, "present the short AutoDesc of an item when editing its description", giving some cogent blockers.

If the AutoDesc is inaccurate solely because a fact in Wikidata is wrong, then the user should update the item in Wikidata rather than add a manual description. As Dmitry wrote, "IMO, allowing the user to edit the description is a missed opportunity to make the user edit the actual *data*, such that the description is generated correctly." I don't know if AutoDesc could link every piece of the description to the fact generating it.

"Who should own it and create an epic to track? Wikidata, Search, Reading?"

The CTO, i.e. bring it up at some Engineering management meeting.

Magnus Manske wrote:

"So it turns out that ValterVBot alone has created over 1.8 MILLION manual descriptions. And there are other bots that do this. We already HAVE automatic descriptions, we just store them in the manual field. The worst of both worlds."

The longer we go without a productized AutoDesc that's shown whenever there isn't a manual description, the more people will do this.

Regards,
--
=S Page WMF Tech writer
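The display rule argued for above (show the manual Wikidata description when one exists, otherwise fall back to an AutoDesc generated on demand) can be sketched in a few lines. This is only an illustration: `description_to_display` and its parameters are hypothetical names, not part of any Wikimedia client or of autodesc itself.

```python
from typing import Callable, Optional


def description_to_display(
    manual: Optional[str],
    autodesc: Callable[[], Optional[str]],
) -> Optional[str]:
    """Pick the description a client should show for one item.

    `manual` is the human-written description, if any.
    `autodesc` is a hypothetical callable that produces (or fetches from a
    cache) the automatic description; it is only invoked when needed.
    """
    # A non-blank manual description always wins.
    if manual is not None and manual.strip():
        return manual.strip()
    # Otherwise fall back to the generated text, which may itself be missing.
    return autodesc()
```

Nothing is ever written back to the manual field here, which is the point made in the thread: the AutoDesc is displayed, not stored as a description.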
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
On Thu, Aug 20, 2015 at 1:43 AM Monte Hurd mh...@wikimedia.org wrote:

"True about algorithms never being finished, but aren't we essentially stuck with the first run output, unless I misunderstand how you envision this working? (assuming you don't want to over-write non-blank descriptions the next time you improve and re-run the process)"

Of course we're not stuck with the initial automatic descriptions! Whatever gave you that idea? Ideally, each description would be computed on-the-fly, but that won't scale; output needs to be cached, and invalidated when necessary. Possible reasons for cache invalidation:
* The item statements have changed
* Items referenced in the description (e.g. country for nationality) have changed
* The algorithm has been improved
* The cache entry has reached a certain age, just to make sure

This is why the automatic description cache and the manual description need to be kept separate; just pasting the autodesc into the manual description field would mean it could never be updated automatically. That would be very bad indeed.
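A minimal sketch of such a cache, covering the four invalidation reasons listed above. Everything here is an assumption made for illustration: the class and parameter names are invented, `generate` stands in for the description algorithm, and `current_revision` stands in for whatever lookup reports an item's latest revision. It is not how autodesc or any Wikimedia service is actually implemented.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Tuple


@dataclass
class CacheEntry:
    text: str
    revisions: Dict[str, int]   # revision of the item and of each referenced item
    algorithm_version: int
    created_at: float


class AutoDescCache:
    def __init__(
        self,
        generate: Callable[[str], Tuple[str, Iterable[str]]],
        current_revision: Callable[[str], int],
        algorithm_version: int = 1,
        max_age_seconds: float = 30 * 24 * 3600,
        now: Callable[[], float] = time.time,
    ) -> None:
        # generate(item_id) returns (description, ids of items it referenced).
        self._generate = generate
        self._current_revision = current_revision
        self.algorithm_version = algorithm_version
        self.max_age_seconds = max_age_seconds
        self._now = now
        self._entries: Dict[str, CacheEntry] = {}

    def get(self, item_id: str) -> str:
        entry = self._entries.get(item_id)
        if entry is None or self._is_stale(entry):
            entry = self._rebuild(item_id)
        return entry.text

    def _is_stale(self, entry: CacheEntry) -> bool:
        # The algorithm has been improved.
        if entry.algorithm_version != self.algorithm_version:
            return True
        # The entry has reached a certain age.
        if self._now() - entry.created_at > self.max_age_seconds:
            return True
        # The item itself, or an item referenced in the description, changed.
        return any(
            self._current_revision(watched_id) != revision
            for watched_id, revision in entry.revisions.items()
        )

    def _rebuild(self, item_id: str) -> CacheEntry:
        text, referenced = self._generate(item_id)
        watched = {item_id, *referenced}
        entry = CacheEntry(
            text=text,
            revisions={i: self._current_revision(i) for i in watched},
            algorithm_version=self.algorithm_version,
            created_at=self._now(),
        )
        self._entries[item_id] = entry
        return entry
```

Because the cache holds only generated text and never touches the manual description, improving the generator is just a matter of bumping `algorithm_version`; stale entries are then rebuilt the next time they are requested.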
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
My hero Magnus Manske noted:

"The situation, for most languages, is this: No manual descriptions, on basically any item. And that will remain so for the (near) future. Automatic descriptions can change that, literally overnight, with a little programming and linguistic effort. ... This is a force multiplier of volunteer effort with a factor of 250. And we ignore that ... why, exactly?"

The potential of AutoDesc to attain "a world in which every single person on the planet is given free access to the sum of all human knowledge" is so enormous that it should be the entire movement's top project. I nearly wrote a career-limiting e-mail rant to WMF-all on that subject last night.

In this e-mail thread we're talking about it in the limited scope of "Wikidata descriptions in search on mobile web beta", where the mobile client presents a useful signpost for *existing* articles, in an emblem on lead images and in search results. That's important, but we're missing the forest for a single tree when discussing such a transformative technology. If only WMF had a CTO for such things [1].

Anyway, returning to this specific use case:
* Nobody is saying "store the AutoDesc in the Wikidata per-language description field."
* Nobody is saying "show the AutoDesc if there is an existing Wikidata description."
* Is anybody against showing AutoDesc, after some refinement and productization [2], in these mobile use cases when there is no Wikidata description?
* I propose the AutoDesc as a quality bar that any edit to a Wikidata description needs to improve on (but again that's a topic beyond this mail thread).

Yours, excitedly,
=S Page

[1] http://grnh.se/30f54b , apply today!
[2] https://bitbucket.org/magnusmanske/autodesc/src/HEAD/www/js/?at=master and https://github.com/dbrant/wikidata-autodesc . It's already a nodejs service, can we append "oid" and declare victory? :-)
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
On Wed, Aug 19, 2015 at 11:19 PM Monte Hurd mh...@wikimedia.org wrote:

"'No manual descriptions, on basically any item. And that will remain so for the (near) future. Automatic descriptions can change that, literally overnight, with a little programming and linguistic effort. ... This is a force multiplier of volunteer effort with a factor of 250. And we ignore that ... why, exactly?'

"Not ignoring. In fact, if the auto-generated descriptions near the quality of human curated descriptions, I'm totally and wholeheartedly onboard that their use should be strongly considered. I just disagree that closing the quality gap will involve 'little programming and linguistic effort'. I lean more toward the 'massive programming and linguistic effort' end of the spectrum. Specifically, I think it will take massive effort to make the auto-generated descriptions so good that an average person would say, hey, these auto-generated descriptions are better than the human curated descriptions in the examples I posted."

You are confusing (in the literal meaning of the word, fusing together) several issues into one here, which you then call "better". I see at least five distinct types of "better":

1. A description exists, vs. it does not. In that aspect, automatic descriptions will always be better than manual ones.

2. One description is more complete than the other. From what I see in random examples, this is already the case for many biographical items that have a lot of statements. I have actually considered cutting them back a little, because even these short descriptions can get quite extensive.

3. Context-aware, specifically, the context where the description is shown. This one goes to the automatic descriptions. AutoDesc already can generate plain text, links to Wikidata, links to a specific Wikipedia where there are articles, and use plain text/redlinks/Wikidata links otherwise. It can generate Wikitext, with some infoboxes. It could easily generate HTML blurbs with a thumbnail if there is an image, and so on. This is contrasted with plain text for manual descriptions.

4. Linguistic/style. Manual descriptions CAN be better phrased than automatic ones, but can also be worse. Automatic descriptions are unimaginative, but consistent. Here is where I probably beg to differ from most other people on this thread: I firmly believe that a description, even if it is slightly wrong grammatically, is preferable to no description, as long as humans still can understand what is meant. If the German description gets the gender of "moon" wrong, so what? (I don't think it does, but just for the sake of argument.) Eventually, someone will implement a fix for that. Maybe we'll have "gender for things per language" as statements at some point, which would be useful beyond autodesc.

5. To the point. That is where manual descriptions have their only advantage in the long run. Even from a lot of statements, it is hard for an algorithm to figure out why exactly that person, that thing, that event are important. Sometimes it is something obscure, something that does not fit well into statements, or is hidden among them. And there, and only there, do manual descriptions make sense, as I have always maintained.

I am well aware of the limitations of automatic descriptions. I can also see that perfection will never be reached, that the algorithms will never be finished. Like Wikipedia.
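To make point 3 concrete, here is a rough sketch of rendering one entity mentioned in a description differently depending on where it is shown: plain text, a link to Wikidata, or a link to a Wikipedia article with a Wikidata fallback. The function, its parameters, and the mode names are hypothetical and are not taken from autodesc; only the two URL patterns are the standard public ones.

```python
from html import escape
from typing import Optional
from urllib.parse import quote


def render_entity(
    label: str,
    qid: str,
    mode: str = "text",
    wiki_title: Optional[str] = None,
    wiki_lang: str = "en",
) -> str:
    """Render one entity of a generated description for a given context."""
    if mode not in ("text", "wikidata", "wikipedia"):
        raise ValueError("unknown mode: %s" % mode)
    if mode == "text":
        return label
    if mode == "wikipedia" and wiki_title:
        # Link to the article when this Wikipedia has one.
        title = quote(wiki_title.replace(" ", "_"))
        url = "https://%s.wikipedia.org/wiki/%s" % (wiki_lang, title)
    else:
        # No article (or Wikidata requested): link to the item instead.
        url = "https://www.wikidata.org/wiki/%s" % quote(qid)
    return '<a href="%s">%s</a>' % (escape(url), escape(label))
```

A manual description is a single fixed string, so it cannot be re-rendered per context in this way.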
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
Those were literally the first 10 random articles I encountered which didn't have descriptions.

"The tool that generates the descriptions deserves a lot more development. Magnus' tool is very much a prototype, and represents a tiny glimpse of what's possible. Looking at its current output is a straw man."

It's not a straw man at all - it's a baseline to move the discussion away from the abstract. We need to start looking at real examples. One of my main concerns is that "a lot more development" is actually an understatement, as many of the optimizations will be language dependent.

On Wed, Aug 19, 2015 at 2:57 AM, Magnus Manske magnusman...@googlemail.com wrote:

Oh, and as for examples, random-paging just got me this: https://en.wikipedia.org/wiki/Jules_Malou

Manual description: Belgian politician
Automatic description: Belgian politician and lawyer, Prime Minister of Belgium, and member of the Chamber of Representatives of Belgium (1810–1886) ♂

I know which one I'd prefer...

On Wed, Aug 19, 2015 at 10:50 AM Magnus Manske magnusman...@googlemail.com wrote:

Thank you Dmitry! Well phrased and to the point! As for templating, that might be the worst of both worlds; without the flexibility and over-time improvement of automatic descriptions, but making it harder for people to enter (compared to free-style text). We have a Visual Editor on Wikipedia for a reason :-)

On Wed, Aug 19, 2015 at 4:07 AM Dmitry Brant dbr...@wikimedia.org wrote:

My thoughts, as ever(!), are as follows:
- The tool that generates the descriptions deserves a lot more development. Magnus' tool is very much a prototype, and represents a tiny glimpse of what's possible. Looking at its current output is a straw man.
- Auto-generated descriptions work for current articles, and *all future articles*. They automatically adapt to updated data. They automatically become more accurate as new data is added.
- When you edit the descriptions yourself, you're not really making a meaningful contribution to the *data* that underpins the given Wikidata entry; i.e. you're not contributing any new information. You're simply paraphrasing the first sentence or two of the Wikipedia article. That can't possibly be a productive use of contributors' time.

As for Brian's suggestion: It would be a step forward; we can even invent a whole template-type syntax for transcluding bits of actual data into the description. But IMO, that kind of effort would still be better spent on fully-automatic descriptions, because that's the ideal that semi-automatic descriptions can only approach.

On Tue, Aug 18, 2015 at 10:36 PM, Brian Gerstle bgers...@wikimedia.org wrote:

Could there be a way to have our nicely curated description cake and eat it too? For example, interpolating data into the description and/or marking data points which are referenced in the description (so as to mark it as outdated when they change)? I appreciate the potential benefits of generated descriptions (and other things), but Monte's examples might have swayed me towards human curated, when available.

On Tuesday, August 18, 2015, Monte Hurd mh...@wikimedia.org wrote:

Ok, so I just did what I proposed. I went to random enwiki articles and described the first ten I found which didn't already have descriptions:
- Courage Under Fire, *1996 film about a Gulf War friendly-fire incident*
- Pebasiconcha immanis, *largest known species of land snail, extinct*
- List of Kenyan writers, *notable Kenyan authors*
- Solar eclipse of December 14, 1917, *annular eclipse which lasted 77 seconds*
- Natchaug Forest Lumber Shed, *historic Civilian Conservation Corps post-and-beam building*
- Sun of Jamaica (album), *debut 1980 studio album by Goombay Dance Band*
- E-1027, *modernist villa in France by architect Eileen Gray*
- Daingerfield State Park, *park in Morris County, Texas, USA, bordering Lake Daingerfield*
- Todo Lo Que Soy-En Vivo, *2014 Live album by Mexican pop singer Fey*
- 2009 UEFA Regions' Cup, *6th UEFA Regions' Cup, won by Castile and Leon*

And here are the respective descriptions from Magnus' (quite excellent) autodesc.js:
- Courage Under Fire, *1996 film by Edward Zwick, produced by John Davis and David T. Friendly from United States of America*
- Pebasiconcha immanis, *species of Mollusca*
- List of Kenyan writers, *Wikimedia list article*
- Solar eclipse of December 14, 1917, *solar eclipse*
- Natchaug Forest Lumber Shed, *Construction in Connecticut, United States of America*
- Sun of Jamaica (album), *album*
- E-1027, *villa in Roquebrune-Cap-Martin, France*
- Daingerfield State Park, *state park and state park of a state of the United States in Texas, United States of America*
- Todo Lo Que Soy-En Vivo, *live album by Fey*
- 2009 UEFA Regions' Cup, *none*

Thoughts? Just trying to make my own bold assertions falsifiable :)

On Tue, Aug 18, 2015 at 6:32 PM, Monte Hurd
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
No manual descriptions, on basically any item. And that will remain so for the (near) future. Automatic descriptions can change that, literally over night, with a little programming and linguistic effort. ... This is a force multiplier of volunteer effort with a factor of 250. And we ignore that ... why, exactly? Not ignoring. In fact, if the auto-generated descriptions near the quality of human curated descriptions, I'm totally and wholeheartedly onboard that their use should be strongly considered. I just disagree that closing the quality gap will involve little programming and linguistic effort. I lean more toward massive programming and linguistic effort end of the spectrum. Specifically, I think it will take massive effort to make the auto-generated descriptions so good that an average person would say, hey these auto generated descriptions are better than the human curated descriptions in the examples I posted. But I may, of course, be wrong! On Wed, Aug 19, 2015 at 1:27 PM, S Page sp...@wikimedia.org wrote: My hero Magnus Manske noted The situation, for most languages, is this: No manual descriptions, on basically any item. And that will remain so for the (near) future. Automatic descriptions can change that, literally over night, with a little programming and linguistic effort. ... This is a force multiplier of volunteer effort with a factor of 250. And we ignore that ... why, exactly? The potential of AutoDesc is so enormous to attain a world in which every single person on the planet is given free access to the sum of all human knowledge that it should be the entire movement's top project. I nearly wrote a career-limiting e-mail rant to WMF-all on that subject last night. In this e-mail thread we're talking about it in the limited scope of Wikidata descriptions in search on mobile web beta, where the mobile client presents a useful signpost for *existing* articles, in an emblem on lead images and in search results. 
That's important but we're missing the forest for a single tree when discussing such a transformative technology. If only WMF had a CTO for such things [1]. Anyway, returning to this specific use case: * Nobody is saying store the AutoDesc in the Wikidata per-language description field. * Nobody is saying show the AutoDesc if there is an existing Wikidata description. * Is anybody against showing AutoDesc, after some refinement and productization [2], in these mobile use cases when there is no Wikidata description? * I propose the AutoDesc as a quality bar that any edit to a Wikidata description needs to improve on (but again that's a topic beyond this mail thread). Yours, excitedly, =S Page [1] http://grnh.se/30f54b , apply today! [2] https://bitbucket.org/magnusmanske/autodesc/src/HEAD/www/js/?at=master and https://github.com/dbrant/wikidata-autodesc . It's already a nodejs service, can we append oid and declare victory ? :-) On Wed, Aug 19, 2015 at 2:57 AM, Magnus Manske magnusman...@googlemail.com wrote: Oh, and as for examples, random-paging just got me this: https://en.wikipedia.org/wiki/Jules_Malou Manual description: Belgian politician Automatic description: Belgian politician and lawyer, Prime Minister of Belgium, and member of the Chamber of Representatives of Belgium (1810–1886) ♂ I know which one I'd prefer... On Wed, Aug 19, 2015 at 10:50 AM Magnus Manske magnusman...@googlemail.com wrote: Thank you Dmitry! Well phrased and to the point! As for templating, that might be the worst of both worlds; without the flexibility and over-time improvement of automatic descriptions, but making it harder for people to enter (compared to free-style text). We have a Visual Editor on Wikipedia for a reason :-) On Wed, Aug 19, 2015 at 4:07 AM Dmitry Brant dbr...@wikimedia.org wrote: My thoughts, as ever(!), are as follows: - The tool that generates the descriptions deserves a lot more development. 
Magnus' tool is very much a prototype, and represents a tiny glimpse of what's possible. Looking at its current output is a straw man. - Auto-generated descriptions work for current articles, and *all future articles*. They automatically adapt to updated data. They automatically become more accurate as new data is added. - When you edit the descriptions yourself, you're not really making a meaningful contribution to the *data* that underpins the given Wikidata entry; i.e. you're not contributing any new information. You're simply paraphrasing the first sentence or two of the Wikipedia article. That can't possibly be a productive use of contributors' time. As for Brian's suggestion: It would be a step forward; we can even invent a whole template-type syntax for transcluding bits of actual data into the description. But IMO, that kind of effort would still be better spent on fully-automatic descriptions, because that's the ideal that semi-automatic descriptions can only approach. On Tue,
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
True about algorithms never being finished, but aren't we essentially stuck with the first run output, unless I misunderstand how you envision this working? (assuming you don't want to over-write non-blank descriptions the next time you improve and re-run the process) ___ Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l
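One way to picture the answer given elsewhere in this thread (keep generated text in its own cache, separate from the manual description field, and invalidate it when needed): a minimal sketch, assuming a per-item revision number and a generator version counter. All names here (`ALGORITHM_VERSION`, `MAX_AGE_MS`, `getAutoDescription`) are hypothetical and are not part of autodesc.js or Wikidata. The "referenced item changed" case is left out for brevity.

```javascript
// Hypothetical sketch: a separate cache for generated descriptions.
// Nothing here touches the manual description field.
const ALGORITHM_VERSION = 2;                  // assumed: bumped whenever the generator improves
const MAX_AGE_MS = 30 * 24 * 60 * 60 * 1000;  // assumed: regenerate after 30 days regardless

// An entry is stale if the item changed, the algorithm changed, or it is too old.
function isStale(entry, item, now) {
  if (!entry) return true;
  return entry.itemRevision !== item.revision ||
    entry.algorithmVersion !== ALGORITHM_VERSION ||
    now - entry.generatedAt > MAX_AGE_MS;
}

// Returns the cached text when still valid, otherwise regenerates and stores it.
function getAutoDescription(cache, item, generate, now = Date.now()) {
  const entry = cache.get(item.id);
  if (!isStale(entry, item, now)) return entry.text;
  const text = generate(item);
  cache.set(item.id, {
    text,
    itemRevision: item.revision,
    algorithmVersion: ALGORITHM_VERSION,
    generatedAt: now,
  });
  return text;
}
```

With this shape, improving the generator and bumping the version is enough to refresh every cached description on its next read, so nobody is stuck with first-run output.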
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
Oh, and as for examples, random-paging just got me this: https://en.wikipedia.org/wiki/Jules_Malou Manual description: Belgian politician Automatic description: Belgian politician and lawyer, Prime Minister of Belgium, and member of the Chamber of Representatives of Belgium (1810–1886) ♂ I know which one I'd prefer... On Wed, Aug 19, 2015 at 10:50 AM Magnus Manske magnusman...@googlemail.com wrote: Thank you Dmitry! Well phrased and to the point! As for templating, that might be the worst of both worlds; without the flexibility and over-time improvement of automatic descriptions, but making it harder for people to enter (compared to free-style text). We have a Visual Editor on Wikipedia for a reason :-) On Wed, Aug 19, 2015 at 4:07 AM Dmitry Brant dbr...@wikimedia.org wrote: My thoughts, as ever(!), are as follows: - The tool that generates the descriptions deserves a lot more development. Magnus' tool is very much a prototype, and represents a tiny glimpse of what's possible. Looking at its current output is a straw man. - Auto-generated descriptions work for current articles, and *all future articles*. They automatically adapt to updated data. They automatically become more accurate as new data is added. - When you edit the descriptions yourself, you're not really making a meaningful contribution to the *data* that underpins the given Wikidata entry; i.e. you're not contributing any new information. You're simply paraphrasing the first sentence or two of the Wikipedia article. That can't possibly be a productive use of contributors' time. As for Brian's suggestion: It would be a step forward; we can even invent a whole template-type syntax for transcluding bits of actual data into the description. But IMO, that kind of effort would still be better spent on fully-automatic descriptions, because that's the ideal that semi-automatic descriptions can only approach. 
On Tue, Aug 18, 2015 at 10:36 PM, Brian Gerstle bgers...@wikimedia.org wrote: Could there be a way to have our nicely curated description cake and eat it too? For example, interpolating data into the description and/or marking data points which are referenced in the description (so as to mark it as outdated when they change)? I appreciate the potential benefits of generated descriptions (and other things), but Monte's examples might have swayed me towards human curated—when available. On Tuesday, August 18, 2015, Monte Hurd mh...@wikimedia.org wrote: Ok, so I just did what I proposed. I went to random enwiki articles and described the first ten I found which didn't already have descriptions: - Courage Under Fire, *1996 film about a Gulf War friendly-fire incident* - Pebasiconcha immanis, *largest known species of land snail, extinct* - List of Kenyan writers, *notable Kenyan authors* - Solar eclipse of December 14, 1917, *annular eclipse which lasted 77 seconds* - Natchaug Forest Lumber Shed, *historic Civilian Conservation Corps post-and-beam building* - Sun of Jamaica (album), *debut 1980 studio album by Goombay Dance Band* - E-1027, *modernist villa in France by architect Eileen Gray* - Daingerfield State Park, *park in Morris County, Texas, USA, bordering Lake Daingerfield* - Todo Lo Que Soy-En Vivo, *2014 Live album by Mexican pop singer Fey* - 2009 UEFA Regions' Cup, *6th UEFA Regions' Cup, won by Castile and Leon* And here are the respective descriptions from Magnus' (quite excellent) autodesc.js: - Courage Under Fire, *1996 film by Edward Zwick, produced by John Davis and David T. 
Friendly from United States of America* - Pebasiconcha immanis, *species of Mollusca* - List of Kenyan writers, *Wikimedia list article* - Solar eclipse of December 14, 1917, *solar eclipse* - Natchaug Forest Lumber Shed, *Construction in Connecticut, United States of America* - Sun of Jamaica (album), *album* - E-1027, *villa in Roquebrune-Cap-Martin, France* - Daingerfield State Park, *state park and state park of a state of the United States in Texas, United States of America* - Todo Lo Que Soy-En Vivo, *live album by Fey* - 2009 UEFA Regions' Cup, *none* Thoughts? Just trying to make my own bold assertions falsifiable :) On Tue, Aug 18, 2015 at 6:32 PM, Monte Hurd mh...@wikimedia.org wrote: The whole human-vs-extracted descriptions quality question could be fairly easy to test I think: - Pick, some number of articles at random. - Run them through a description extraction script. - Have a human describe the same articles with, say, the app interface I demo'ed. If nothing else this exercise could perhaps make what's thus far been a wildly abstract discussion more concrete. On Tue, Aug 18, 2015 at 6:17 PM, Monte Hurd mh...@wikimedia.org wrote: If having the most elegant description extraction mechanism was the goal I would totally agree ;) On Tue, Aug 18, 2015 at 5:19 PM, Dmitry Brant dbr...@wikimedia.org wrote:
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
It would be even better if this (short: 3 field max) pipe-separated list was available as a gadget to wikidatans on Wikipedia (like me). I can't see if a page I am on has an instance of (though it should) and I can see the description thanks to another gadget (sorry no idea which one that is). Often I will update empty descriptions, but if I was served basic fields (so for a painting, the creator field), I would click through to update that too. On Tue, Aug 18, 2015 at 9:58 AM, Federico Leva (Nemo) nemow...@gmail.com wrote: Jane Darnell, 15/08/2015 08:53: Yes but even if the descriptions were just the contents of fields separated by a pipe it would be better than nothing. +1, item descriptions are mostly useless in my experience. As for get into production on Wikipedia I don't know what it means, I certainly don't like 1) mobile-specific features, 2) overriding existing manually curated content; but it's good to 3) fill gaps. Mobile folks often do (1) and (2), if they *instead* did (3) I'd be very happy. :) Nemo
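The "short, 3 field max, pipe-separated" idea above can be sketched in a few lines. This is only an illustration: `pipeDescription` is a hypothetical helper, and the caller is assumed to have already pulled ordered label/value pairs out of the item's statements.

```javascript
// Hypothetical helper: join the first few non-empty field values with a pipe.
// `fields` is an ordered array of [label, value] pairs; labels are not shown.
function pipeDescription(fields, maxFields = 3) {
  return fields
    .filter(([, value]) => value !== undefined && value !== null && value !== '')
    .slice(0, maxFields)
    .map(([, value]) => String(value))
    .join(' | ');
}

// Made-up values for a painting item, purely for illustration.
const example = pipeDescription([
  ['instance of', 'painting'],
  ['creator', 'Example Artist'],
  ['inception', ''],            // empty fields are skipped
  ['collection', 'Example Museum'],
  ['material', 'oil paint'],    // dropped: over the 3-field limit
]);
console.log(example);
```

Empty fields are skipped rather than shown as blanks, so a gadget built on this would make it visible at a glance which basic statements (such as the creator) are still missing.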
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
Jane Darnell, 15/08/2015 08:53: Yes but even if the descriptions were just the contents of fields separated by a pipe it would be better than nothing. +1, item descriptions are mostly useless in my experience. As for get into production on Wikipedia I don't know what it means, I certainly don't like 1) mobile-specific features, 2) overriding existing manually curated content; but it's good to 3) fill gaps. Mobile folks often do (1) and (2), if they *instead* did (3) I'd be very happy. :) Nemo
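The "fill gaps, don't override" rule is easy to state in code. A minimal sketch, assuming the client already has both strings in hand; `descriptionToShow` is a hypothetical name, not an existing API.

```javascript
// Hypothetical sketch: a manual description always wins; the automatic one
// is used only when no manual description exists.
function descriptionToShow(manualDescription, autoDescription) {
  const manual = (manualDescription || '').trim();
  if (manual) return { text: manual, source: 'manual' };
  const auto = (autoDescription || '').trim();
  if (auto) return { text: auto, source: 'auto' };
  return null; // nothing to show
}

// Example strings taken from this thread.
console.log(descriptionToShow('Belgian politician', 'Belgian politician and lawyer'));
console.log(descriptionToShow('', 'villa in Roquebrune-Cap-Martin, France'));
```

Returning the source alongside the text lets the client label or style generated descriptions differently, if that turns out to be wanted.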
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
On Tue, Aug 18, 2015 at 9:58 AM, Federico Leva (Nemo) nemow...@gmail.com wrote: Jane Darnell, 15/08/2015 08:53: Yes but even if the descriptions were just the contents of fields separated by a pipe it would be better than nothing. +1, item descriptions are mostly useless in my experience. Anecdotal. I know plenty of readers that find them useful for knowing if they should click something, and it is also an anecdotal useless opinion. As for get into production on Wikipedia I don't know what it means, I certainly don't like 1) mobile-specific features, 2) overriding existing manually curated content; but it's good to 3) fill gaps. Mobile folks often do (1) and (2), if they *instead* did (3) I'd be very happy. :) Nemo Luckily there's no mobile teams any more after the reorg.
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
Thanks - is that the one I have (can't tell)? Here's mine: https://en.wikipedia.org/wiki/User:Jane023/common.js On Tue, Aug 18, 2015 at 5:23 PM, Magnus Manske magnusman...@googlemail.com wrote: Show automatic description underneath From Wikipedia...: https://en.wikipedia.org/wiki/User:Magnus_Manske/autodesc.js To use, add: importScript ( 'User:Magnus_Manske/autodesc.js' ) ; to your common.js On Tue, Aug 18, 2015 at 9:47 AM Jane Darnell jane...@gmail.com wrote: It would be even better if this (short: 3 field max) pipe-separated list was available as a gadget to wikidatans on Wikipedia (like me). I can't see if a page I am on has an instance of (though it should) and I can see the description thanks to another gadget (sorry no idea which one that is). Often I will update empty descriptions, but if I was served basic fields (so for a painting, the creator field), I would click through to update that too. On Tue, Aug 18, 2015 at 9:58 AM, Federico Leva (Nemo) nemow...@gmail.com wrote: Jane Darnell, 15/08/2015 08:53: Yes but even if the descriptions were just the contents of fields separated by a pipe it would be better than nothing. +1, item descriptions are mostly useless in my experience. As for get into production on Wikipedia I don't know what it means, I certainly don't like 1) mobile-specific features, 2) overriding existing manually curated content; but it's good to 3) fill gaps. Mobile folks often do (1) and (2), if they *instead* did (3) I'd be very happy. :) Nemo
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
That's the one, just add importScript ( 'User:Magnus_Manske/autodesc.js' ) ; to that page. On Tue, Aug 18, 2015 at 6:39 PM Jane Darnell jane...@gmail.com wrote: Thanks - is that the one I have (can't tell)? Here's mine: https://en.wikipedia.org/wiki/User:Jane023/common.js On Tue, Aug 18, 2015 at 5:23 PM, Magnus Manske magnusman...@googlemail.com wrote: Show automatic description underneath From Wikipedia...: https://en.wikipedia.org/wiki/User:Magnus_Manske/autodesc.js To use, add: importScript ( 'User:Magnus_Manske/autodesc.js' ) ; to your common.js On Tue, Aug 18, 2015 at 9:47 AM Jane Darnell jane...@gmail.com wrote: It would be even better if this (short: 3 field max) pipe-separated list was available as a gadget to wikidatans on Wikipedia (like me). I can't see if a page I am on has an instance of (though it should) and I can see the description thanks to another gadget (sorry no idea which one that is). Often I will update empty descriptions, but if I was served basic fields (so for a painting, the creator field), I would click through to update that too. On Tue, Aug 18, 2015 at 9:58 AM, Federico Leva (Nemo) nemow...@gmail.com wrote: Jane Darnell, 15/08/2015 08:53: Yes but even if the descriptions were just the contents of fields separated by a pipe it would be better than nothing. +1, item descriptions are mostly useless in my experience. As for get into production on Wikipedia I don't know what it means, I certainly don't like 1) mobile-specific features, 2) overriding existing manually curated content; but it's good to 3) fill gaps. Mobile folks often do (1) and (2), if they *instead* did (3) I'd be very happy. :) Nemo
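For anyone following along, this is roughly what the addition to a personal common.js page looks like. The `importScript` call and the script title are the ones given above; the `loadAutodesc` wrapper and its guard are only an assumption added here so the snippet does nothing (rather than throwing) if it is ever evaluated somewhere `importScript` is not defined.

```javascript
// Goes in your own common.js page on the wiki (e.g. User:<YourName>/common.js).
// `loadAutodesc` is a hypothetical wrapper; only the importScript call itself
// comes from the instructions in this thread.
function loadAutodesc(scope) {
  if (typeof scope.importScript === 'function') {
    scope.importScript('User:Magnus_Manske/autodesc.js');
    return true;  // script load was requested
  }
  return false;   // importScript not available here; nothing done
}

loadAutodesc(typeof window !== 'undefined' ? window : globalThis);
```

Without the guard, the whole thing reduces to the single `importScript` line quoted above, which is all that is strictly needed on en.wikipedia.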
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
Ok, so I just did what I proposed. I went to random enwiki articles and described the first ten I found which didn't already have descriptions:

- Courage Under Fire, *1996 film about a Gulf War friendly-fire incident*
- Pebasiconcha immanis, *largest known species of land snail, extinct*
- List of Kenyan writers, *notable Kenyan authors*
- Solar eclipse of December 14, 1917, *annular eclipse which lasted 77 seconds*
- Natchaug Forest Lumber Shed, *historic Civilian Conservation Corps post-and-beam building*
- Sun of Jamaica (album), *debut 1980 studio album by Goombay Dance Band*
- E-1027, *modernist villa in France by architect Eileen Gray*
- Daingerfield State Park, *park in Morris County, Texas, USA, bordering Lake Daingerfield*
- Todo Lo Que Soy-En Vivo, *2014 Live album by Mexican pop singer Fey*
- 2009 UEFA Regions' Cup, *6th UEFA Regions' Cup, won by Castile and Leon*

And here are the respective descriptions from Magnus' (quite excellent) autodesc.js:

- Courage Under Fire, *1996 film by Edward Zwick, produced by John Davis and David T. Friendly from United States of America*
- Pebasiconcha immanis, *species of Mollusca*
- List of Kenyan writers, *Wikimedia list article*
- Solar eclipse of December 14, 1917, *solar eclipse*
- Natchaug Forest Lumber Shed, *Construction in Connecticut, United States of America*
- Sun of Jamaica (album), *album*
- E-1027, *villa in Roquebrune-Cap-Martin, France*
- Daingerfield State Park, *state park and state park of a state of the United States in Texas, United States of America*
- Todo Lo Que Soy-En Vivo, *live album by Fey*
- 2009 UEFA Regions' Cup, *none*

Thoughts? Just trying to make my own bold assertions falsifiable :)

On Tue, Aug 18, 2015 at 6:32 PM, Monte Hurd mh...@wikimedia.org wrote:

The whole human-vs-extracted descriptions quality question could be fairly easy to test I think:

- Pick some number of articles at random.
- Run them through a description extraction script.
- Have a human describe the same articles with, say, the app interface I demo'ed. If nothing else this exercise could perhaps make what's thus far been a wildly abstract discussion more concrete. On Tue, Aug 18, 2015 at 6:17 PM, Monte Hurd mh...@wikimedia.org wrote: If having the most elegant description extraction mechanism was the goal I would totally agree ;) On Tue, Aug 18, 2015 at 5:19 PM, Dmitry Brant dbr...@wikimedia.org wrote: IMO, allowing the user to edit the description is a missed opportunity to make the user edit the actual *data*, such that the description is generated correctly. On Tue, Aug 18, 2015 at 8:02 PM, Monte Hurd mh...@wikimedia.org wrote: IMO, if the goal is quality, then human curated descriptions are superior until such time as the auto-generation script passes the Turing test ;) I see these empty descriptions as an amazing opportunity to give *everyone* an easy new way to edit. I whipped an app editing interface up at the Lyon hackathon: https://www.youtube.com/watch?v=6VblyGhf_c8 I used it to add a couple hundred descriptions in a single day just by hitting random then adding descriptions for articles which didn't have them. I'd love to try a limited test of this in production to get a sense for how effective human curation can be if the interface is easy to use... On Tue, Aug 18, 2015 at 1:25 PM, Jan Ainali jan.ain...@wikimedia.se wrote: Nice one! Does not appear to work on svwiki though. Does it have something to do with that the wiki in question does not display that tagline? *Med vänliga hälsningar,Jan Ainali* Verksamhetschef, Wikimedia Sverige http://wikimedia.se 0729 - 67 29 48 *Tänk dig en värld där varje människa har fri tillgång till mänsklighetens samlade kunskap. Det är det vi gör.* Bli medlem. 
http://blimedlem.wikimedia.se 2015-08-18 17:23 GMT+02:00 Magnus Manske magnusman...@googlemail.com : Show automatic description underneath From Wikipedia...: https://en.wikipedia.org/wiki/User:Magnus_Manske/autodesc.js To use, add: importScript ( 'User:Magnus_Manske/autodesc.js' ) ; to your common.js On Tue, Aug 18, 2015 at 9:47 AM Jane Darnell jane...@gmail.com wrote: It would be even better if this (short: 3 field max) pipe-separated list was available as a gadget to wikidatans on Wikipedia (like me). I can't see if a page I am on has an instance of (though it should) and I can see the description thanks to another gadget (sorry no idea which one that is). Often I will update empty descriptions, but if I was served basic fields (so for a painting, the creator field), I would click through to update that too. On Tue, Aug 18, 2015 at 9:58 AM, Federico Leva (Nemo) nemow...@gmail.com wrote: Jane Darnell, 15/08/2015 08:53: Yes but even if the descriptions were just the contents of fields separated by a pipe it would be better than nothing. +1, item descriptions are mostly useless in my experience. As for get into
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
IMO, allowing the user to edit the description is a missed opportunity to make the user edit the actual *data*, such that the description is generated correctly. On Tue, Aug 18, 2015 at 8:02 PM, Monte Hurd mh...@wikimedia.org wrote: IMO, if the goal is quality, then human curated descriptions are superior until such time as the auto-generation script passes the Turing test ;) I see these empty descriptions as an amazing opportunity to give *everyone* an easy new way to edit. I whipped an app editing interface up at the Lyon hackathon: https://www.youtube.com/watch?v=6VblyGhf_c8 I used it to add a couple hundred descriptions in a single day just by hitting random then adding descriptions for articles which didn't have them. I'd love to try a limited test of this in production to get a sense for how effective human curation can be if the interface is easy to use... On Tue, Aug 18, 2015 at 1:25 PM, Jan Ainali jan.ain...@wikimedia.se wrote: Nice one! Does not appear to work on svwiki though. Does it have something to do with that the wiki in question does not display that tagline? *Med vänliga hälsningar,Jan Ainali* Verksamhetschef, Wikimedia Sverige http://wikimedia.se 0729 - 67 29 48 *Tänk dig en värld där varje människa har fri tillgång till mänsklighetens samlade kunskap. Det är det vi gör.* Bli medlem. http://blimedlem.wikimedia.se 2015-08-18 17:23 GMT+02:00 Magnus Manske magnusman...@googlemail.com: Show automatic description underneath From Wikipedia...: https://en.wikipedia.org/wiki/User:Magnus_Manske/autodesc.js To use, add: importScript ( 'User:Magnus_Manske/autodesc.js' ) ; to your common.js On Tue, Aug 18, 2015 at 9:47 AM Jane Darnell jane...@gmail.com wrote: It would be even better if this (short: 3 field max) pipe-separated list was available as a gadget to wikidatans on Wikipedia (like me). I can't see if a page I am on has an instance of (though it should) and I can see the description thanks to another gadget (sorry no idea which one that is). 
Often I will update empty descriptions, but if I was served basic fields (so for a painting, the creator field), I would click through to update that too. On Tue, Aug 18, 2015 at 9:58 AM, Federico Leva (Nemo) nemow...@gmail.com wrote: Jane Darnell, 15/08/2015 08:53: Yes but even if the descriptions were just the contents of fields separated by a pipe it would be better than nothing. +1, item descriptions are mostly useless in my experience. As for get into production on Wikipedia I don't know what it means, I certainly don't like 1) mobile-specific features, 2) overriding existing manually curated content; but it's good to 3) fill gaps. Mobile folks often do (1) and (2), if they *instead* did (3) I'd be very happy. :) Nemo -- Dmitry Brant Mobile Apps Team (Android) Wikimedia Foundation https://www.mediawiki.org/wiki/Wikimedia_mobile_engineering
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
The whole human-vs-extracted descriptions quality question could be fairly easy to test I think: - Pick, some number of articles at random. - Run them through a description extraction script. - Have a human describe the same articles with, say, the app interface I demo'ed. If nothing else this exercise could perhaps make what's thus far been a wildly abstract discussion more concrete. On Tue, Aug 18, 2015 at 6:17 PM, Monte Hurd mh...@wikimedia.org wrote: If having the most elegant description extraction mechanism was the goal I would totally agree ;) On Tue, Aug 18, 2015 at 5:19 PM, Dmitry Brant dbr...@wikimedia.org wrote: IMO, allowing the user to edit the description is a missed opportunity to make the user edit the actual *data*, such that the description is generated correctly. On Tue, Aug 18, 2015 at 8:02 PM, Monte Hurd mh...@wikimedia.org wrote: IMO, if the goal is quality, then human curated descriptions are superior until such time as the auto-generation script passes the Turing test ;) I see these empty descriptions as an amazing opportunity to give *everyone* an easy new way to edit. I whipped an app editing interface up at the Lyon hackathon: https://www.youtube.com/watch?v=6VblyGhf_c8 I used it to add a couple hundred descriptions in a single day just by hitting random then adding descriptions for articles which didn't have them. I'd love to try a limited test of this in production to get a sense for how effective human curation can be if the interface is easy to use... On Tue, Aug 18, 2015 at 1:25 PM, Jan Ainali jan.ain...@wikimedia.se wrote: Nice one! Does not appear to work on svwiki though. Does it have something to do with that the wiki in question does not display that tagline? *Med vänliga hälsningar,Jan Ainali* Verksamhetschef, Wikimedia Sverige http://wikimedia.se 0729 - 67 29 48 *Tänk dig en värld där varje människa har fri tillgång till mänsklighetens samlade kunskap. Det är det vi gör.* Bli medlem. 
http://blimedlem.wikimedia.se 2015-08-18 17:23 GMT+02:00 Magnus Manske magnusman...@googlemail.com: Show automatic description underneath From Wikipedia...: https://en.wikipedia.org/wiki/User:Magnus_Manske/autodesc.js To use, add: importScript ( 'User:Magnus_Manske/autodesc.js' ) ; to your common.js On Tue, Aug 18, 2015 at 9:47 AM Jane Darnell jane...@gmail.com wrote: It would be even better if this (short: 3 field max) pipe-separated list was available as a gadget to wikidatans on Wikipedia (like me). I can't see if a page I am on has an instance of (though it should) and I can see the description thanks to another gadget (sorry no idea which one that is). Often I will update empty descriptions, but if I was served basic fields (so for a painting, the creator field), I would click through to update that too. On Tue, Aug 18, 2015 at 9:58 AM, Federico Leva (Nemo) nemow...@gmail.com wrote: Jane Darnell, 15/08/2015 08:53: Yes but even if the descriptions were just the contents of fields separated by a pipe it would be better than nothing. +1, item descriptions are mostly useless in my experience. As for get into production on Wikipedia I don't know what it means, I certainly don't like 1) mobile-specific features, 2) overriding existing manually curated content; but it's good to 3) fill gaps. Mobile folks often do (1) and (2), if they *instead* did (3) I'd be very happy. 
:) Nemo -- Dmitry Brant Mobile Apps Team (Android) Wikimedia Foundation https://www.mediawiki.org/wiki/Wikimedia_mobile_engineering
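The three-step test proposed at the top of this message (pick articles at random, run them through the script, have a human describe the same ones) could be wired up along these lines. This is a sketch under stated assumptions: `sampleTitles` and `compareDescriptions` are hypothetical helpers, and the two describer functions stand in for the app interface and the extraction script, neither of which is modeled here.

```javascript
// Hypothetical sketch of the proposed comparison.
// Step 1: pick n titles at random (Fisher-Yates shuffle on a copy).
// `rng` is injectable so a run can be reproduced.
function sampleTitles(titles, n, rng = Math.random) {
  const pool = titles.slice();
  for (let i = pool.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, n);
}

// Steps 2 and 3: collect both descriptions side by side for each title.
function compareDescriptions(titles, describeByHuman, describeByScript) {
  return titles.map((title) => ({
    title,
    human: describeByHuman(title),
    auto: describeByScript(title),
  }));
}

// Two of the pairs reported in this thread, used as stand-in describers.
const human = { 'E-1027': 'modernist villa in France by architect Eileen Gray',
                'Sun of Jamaica (album)': 'debut 1980 studio album by Goombay Dance Band' };
const auto = { 'E-1027': 'villa in Roquebrune-Cap-Martin, France',
               'Sun of Jamaica (album)': 'album' };
const rows = compareDescriptions(sampleTitles(Object.keys(human), 2),
                                 (t) => human[t], (t) => auto[t]);
console.log(rows);
```

The output is just a list of title/human/auto rows; judging which description is better is still left to people, which is the point of the exercise.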
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
If having the most elegant description extraction mechanism was the goal I would totally agree ;) On Tue, Aug 18, 2015 at 5:19 PM, Dmitry Brant dbr...@wikimedia.org wrote: IMO, allowing the user to edit the description is a missed opportunity to make the user edit the actual *data*, such that the description is generated correctly. On Tue, Aug 18, 2015 at 8:02 PM, Monte Hurd mh...@wikimedia.org wrote: IMO, if the goal is quality, then human curated descriptions are superior until such time as the auto-generation script passes the Turing test ;) I see these empty descriptions as an amazing opportunity to give *everyone* an easy new way to edit. I whipped an app editing interface up at the Lyon hackathon: https://www.youtube.com/watch?v=6VblyGhf_c8 I used it to add a couple hundred descriptions in a single day just by hitting random then adding descriptions for articles which didn't have them. I'd love to try a limited test of this in production to get a sense for how effective human curation can be if the interface is easy to use... On Tue, Aug 18, 2015 at 1:25 PM, Jan Ainali jan.ain...@wikimedia.se wrote: Nice one! Does not appear to work on svwiki though. Does it have something to do with that the wiki in question does not display that tagline? *Med vänliga hälsningar,Jan Ainali* Verksamhetschef, Wikimedia Sverige http://wikimedia.se 0729 - 67 29 48 *Tänk dig en värld där varje människa har fri tillgång till mänsklighetens samlade kunskap. Det är det vi gör.* Bli medlem. 
http://blimedlem.wikimedia.se 2015-08-18 17:23 GMT+02:00 Magnus Manske magnusman...@googlemail.com: Show automatic description underneath From Wikipedia...: https://en.wikipedia.org/wiki/User:Magnus_Manske/autodesc.js To use, add: importScript ( 'User:Magnus_Manske/autodesc.js' ) ; to your common.js On Tue, Aug 18, 2015 at 9:47 AM Jane Darnell jane...@gmail.com wrote: It would be even better if this (short: 3 field max) pipe-separated list was available as a gadget to wikidatans on Wikipedia (like me). I can't see if a page I am on has an instance of (though it should) and I can see the description thanks to another gadget (sorry no idea which one that is). Often I will update empty descriptions, but if I was served basic fields (so for a painting, the creator field), I would click through to update that too. On Tue, Aug 18, 2015 at 9:58 AM, Federico Leva (Nemo) nemow...@gmail.com wrote: Jane Darnell, 15/08/2015 08:53: Yes but even if the descriptions were just the contents of fields separated by a pipe it would be better than nothing. +1, item descriptions are mostly useless in my experience. As for get into production on Wikipedia I don't know what it means, I certainly don't like 1) mobile-specific features, 2) overriding existing manually curated content; but it's good to 3) fill gaps. Mobile folks often do (1) and (2), if they *instead* did (3) I'd be very happy. 
:) Nemo -- Dmitry Brant Mobile Apps Team (Android) Wikimedia Foundation https://www.mediawiki.org/wiki/Wikimedia_mobile_engineering ___ Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
IMO, if the goal is quality, then human curated descriptions are superior until such time as the auto-generation script passes the Turing test ;) I see these empty descriptions as an amazing opportunity to give *everyone* an easy new way to edit. I whipped an app editing interface up at the Lyon hackathon: https://www.youtube.com/watch?v=6VblyGhf_c8 I used it to add a couple hundred descriptions in a single day just by hitting random then adding descriptions for articles which didn't have them. I'd love to try a limited test of this in production to get a sense for how effective human curation can be if the interface is easy to use... On Tue, Aug 18, 2015 at 1:25 PM, Jan Ainali jan.ain...@wikimedia.se wrote: Nice one! Does not appear to work on svwiki though. Does it have something to do with that the wiki in question does not display that tagline? *Med vänliga hälsningar,Jan Ainali* Verksamhetschef, Wikimedia Sverige http://wikimedia.se 0729 - 67 29 48 *Tänk dig en värld där varje människa har fri tillgång till mänsklighetens samlade kunskap. Det är det vi gör.* Bli medlem. http://blimedlem.wikimedia.se 2015-08-18 17:23 GMT+02:00 Magnus Manske magnusman...@googlemail.com: Show automatic description underneath From Wikipedia...: https://en.wikipedia.org/wiki/User:Magnus_Manske/autodesc.js To use, add: importScript ( 'User:Magnus_Manske/autodesc.js' ) ; to your common.js On Tue, Aug 18, 2015 at 9:47 AM Jane Darnell jane...@gmail.com wrote: It would be even better if this (short: 3 field max) pipe-separated list was available as a gadget to wikidatans on Wikipedia (like me). I can't see if a page I am on has an instance of (though it should) and I can see the description thanks to another gadget (sorry no idea which one that is). Often I will update empty descriptions, but if I was served basic fields (so for a painting, the creator field), I would click through to update that too. 
On Tue, Aug 18, 2015 at 9:58 AM, Federico Leva (Nemo) nemow...@gmail.com wrote: Jane Darnell, 15/08/2015 08:53: Yes but even if the descriptions were just the contents of fields separated by a pipe it would be better than nothing. +1, item descriptions are mostly useless in my experience. As for get into production on Wikipedia I don't know what it means, I certainly don't like 1) mobile-specific features, 2) overriding existing manually curated content; but it's good to 3) fill gaps. Mobile folks often do (1) and (2), if they *instead* did (3) I'd be very happy. :) Nemo ___ Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
Kunal, I believe what Joaquin was referring to is the notion that the web team in Reading will be entering into work on the desktop-oriented experience. As you rightly note, the Android and iOS teams are focused squarely on experiences for mobile devices. For the edification of the list, the web engineers in Reading were in the mobile web team, with a focus on experiences for users on mobile form factor devices like phones and tablets. It's going to take some time to ramp up practices for tackling code and architecture historically oriented to the desktop form factor. We'll need to ensure continued stability in the platform and work with the community as we propose and introduce changes to the desktop form factor user experience. -Adam On Tue, Aug 18, 2015 at 11:22 AM, Legoktm legoktm.wikipe...@gmail.com wrote: On 08/18/2015 10:29 AM, Joaquin Oltra Hernandez wrote: Luckily there's no mobile teams any more after the reorg. Are you sure? [1] In any case, it would be interesting to look at how many commits and contributions the Desktop Mobile Web team has made to the desktop interface. [1] https://wikimediafoundation.org/wiki/Staff_and_contractors#Mobile_Apps -- Legoktm ___ Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l ___ Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
Nice one! Does not appear to work on svwiki though. Does it have something to do with that the wiki in question does not display that tagline? *Med vänliga hälsningar,Jan Ainali* Verksamhetschef, Wikimedia Sverige http://wikimedia.se 0729 - 67 29 48 *Tänk dig en värld där varje människa har fri tillgång till mänsklighetens samlade kunskap. Det är det vi gör.* Bli medlem. http://blimedlem.wikimedia.se 2015-08-18 17:23 GMT+02:00 Magnus Manske magnusman...@googlemail.com: Show automatic description underneath From Wikipedia...: https://en.wikipedia.org/wiki/User:Magnus_Manske/autodesc.js To use, add: importScript ( 'User:Magnus_Manske/autodesc.js' ) ; to your common.js On Tue, Aug 18, 2015 at 9:47 AM Jane Darnell jane...@gmail.com wrote: It would be even better if this (short: 3 field max) pipe-separated list was available as a gadget to wikidatans on Wikipedia (like me). I can't see if a page I am on has an instance of (though it should) and I can see the description thanks to another gadget (sorry no idea which one that is). Often I will update empty descriptions, but if I was served basic fields (so for a painting, the creator field), I would click through to update that too. On Tue, Aug 18, 2015 at 9:58 AM, Federico Leva (Nemo) nemow...@gmail.com wrote: Jane Darnell, 15/08/2015 08:53: Yes but even if the descriptions were just the contents of fields separated by a pipe it would be better than nothing. +1, item descriptions are mostly useless in my experience. As for get into production on Wikipedia I don't know what it means, I certainly don't like 1) mobile-specific features, 2) overriding existing manually curated content; but it's good to 3) fill gaps. Mobile folks often do (1) and (2), if they *instead* did (3) I'd be very happy. 
:) Nemo ___ Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
Show automatic description underneath From Wikipedia...: https://en.wikipedia.org/wiki/User:Magnus_Manske/autodesc.js To use, add: importScript ( 'User:Magnus_Manske/autodesc.js' ) ; to your common.js On Tue, Aug 18, 2015 at 9:47 AM Jane Darnell jane...@gmail.com wrote: It would be even better if this (short: 3 field max) pipe-separated list was available as a gadget to wikidatans on Wikipedia (like me). I can't see if a page I am on has an instance of (though it should) and I can see the description thanks to another gadget (sorry no idea which one that is). Often I will update empty descriptions, but if I was served basic fields (so for a painting, the creator field), I would click through to update that too. On Tue, Aug 18, 2015 at 9:58 AM, Federico Leva (Nemo) nemow...@gmail.com wrote: Jane Darnell, 15/08/2015 08:53: Yes but even if the descriptions were just the contents of fields separated by a pipe it would be better than nothing. +1, item descriptions are mostly useless in my experience. As for get into production on Wikipedia I don't know what it means, I certainly don't like 1) mobile-specific features, 2) overriding existing manually curated content; but it's good to 3) fill gaps. Mobile folks often do (1) and (2), if they *instead* did (3) I'd be very happy. :) Nemo ___ Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l ___ Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
S, No, the RESTBase mobileapps service[1] doesn't do this currently. That should be possible, though. The service currently uses action=mobileview under the hood. This means it gets the description first from the WP instances, and if that one doesn't have it, it would go to Wikidata. In the future we'll likely switch to Parsoid for the backend requests but I don't know when that will happen. We then might have to request the description using something like action=query&prop=pageterms&wbptterms=description[2] if that's not included in Parsoid. [1] https://www.mediawiki.org/wiki/Wikimedia_Apps/Team/RESTBase_services_for_apps [2] https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&prop=pageterms&format=json&wbptterms=description&titles=Cat Bernd On Mon, Aug 17, 2015 at 4:56 PM, S Page sp...@wikimedia.org wrote: On Sat, Aug 15, 2015 at 5:08 AM, Magnus Manske magnusman...@googlemail.com wrote: Ah, but when auto-descriptions get better, how do we know which should be updated, and which have been improved by humans? Because people will scream bloody murder if we replace their descriptions with automatic ones, even if those are better. Would it be acceptable to *generate* a description on the fly if there isn't a description in the user's language, but never *replace* an existing description in Wikidata? AIUI this is what RESTBase is good at: in response to API requests for information about a page, some backend generates information, RESTBase caches it for future requests but RESTBase doesn't update the content databases. If I'm right (unlikely :-) ), then the upcoming MobileApps service could do this without anyone screaming. Maybe the MobileApps service already does this, I'm not sure what https://restbase.wikimedia.org/en.wikipedia.org/v1/page/mobile-text/Cat puts in the description field if Wikidata's description is empty. figuring out which descriptions we can overwrite is next-to-impossible. So don't try. 
The game becomes: present the generated description next to the manual Wikidata description, and if enough users prefer the former, blank out the Wikidata description. Cheers, -- =S Page WMF Tech writer ___ Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l
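For readers who want to try the query Bernd mentions, here is a minimal Python sketch. It only builds the request URL from the parameters named in the message (action=query, prop=pageterms, wbptterms=description) and pulls a description out of a response. The endpoint constant, the function names, and the response shape in the sample are assumptions for illustration; the sample is hand-written, not real API output.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumed endpoint for English Wikipedia; other wikis would use their own host.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_description_query(title: str) -> str:
    """Build a pageterms query URL restricted to descriptions (hypothetical helper)."""
    params = {
        "action": "query",
        "prop": "pageterms",
        "wbptterms": "description",
        "titles": title,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


def extract_description(response: dict) -> Optional[str]:
    """Return the first description in a response, or None if there is none.

    Assumes the response nests terms under query -> pages -> <id> -> terms.
    """
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        descriptions = page.get("terms", {}).get("description", [])
        if descriptions:
            return descriptions[0]
    return None


# Hand-written sample in the assumed shape; the description text is a placeholder.
SAMPLE_RESPONSE = {
    "query": {
        "pages": {
            "1": {
                "pageid": 1,
                "title": "Cat",
                "terms": {"description": ["placeholder description"]},
            }
        }
    }
}

if __name__ == "__main__":
    print(build_description_query("Cat"))
    print(extract_description(SAMPLE_RESPONSE))
```

Returning None for a missing description keeps the "empty" case explicit, which is the case the rest of the thread is concerned with.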
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
On Sat, Aug 15, 2015 at 5:08 AM, Magnus Manske magnusman...@googlemail.com wrote: Ah, but when auto-descriptions get better, how do we know which should be updated, and which have been improved by humans? Because people will scream bloody murder if we replace their descriptions with automatic ones, even if those are better. Would it be acceptable to *generate* a description on the fly if there isn't a description in the user's language, but never *replace* an existing description in Wikidata? AIUI this is what RESTBase is good at: in response to API requests for information about a page, some backend generates information, RESTBase caches it for future requests but RESTBase doesn't update the content databases. If I'm right (unlikely :-) ), then the upcoming MobileApps service could do this without anyone screaming. Maybe the MobileApps service already does this, I'm not sure what https://restbase.wikimedia.org/en.wikipedia.org/v1/page/mobile-text/Cat puts in the description field if Wikidata's description is empty. figuring out which descriptions we can overwrite is next-to-impossible. So don't try. The game becomes: present the generated description next to the manual Wikidata description, and if enough users prefer the former, blank out the Wikidata description. Cheers, -- =S Page WMF Tech writer ___ Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l
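The "generate on the fly, never replace" idea above can be sketched roughly as follows. This is an illustration under stated assumptions, not how RESTBase or the MobileApps service works: the function name, the in-memory dict standing in for a cache, and the generator callback are all hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical cache keyed by (item id, language); a real service would use
# its own storage and invalidation rules.
Cache = Dict[Tuple[str, str], str]


def describe(
    item_id: str,
    lang: str,
    manual: Dict[str, str],
    generate: Callable[[str, str], Optional[str]],
    cache: Cache,
) -> Tuple[Optional[str], str]:
    """Return (description, source) without ever writing to the manual field.

    The manual description always wins. A generated one is only used to fill
    a gap, and it is stored in the separate cache, not in `manual`.
    """
    text = manual.get(lang, "").strip()
    if text:
        return text, "manual"
    key = (item_id, lang)
    if key in cache:
        return cache[key], "cached"
    generated = generate(item_id, lang)
    if generated:
        cache[key] = generated
        return generated, "generated"
    return None, "none"


if __name__ == "__main__":
    # Toy generator and data, invented for this example.
    def toy_generator(item_id: str, lang: str) -> Optional[str]:
        return "auto text" if lang == "en" else None

    cache: Cache = {}
    print(describe("Q1", "en", {"en": "hand-written"}, toy_generator, cache))
    print(describe("Q1", "en", {}, toy_generator, cache))
    print(describe("Q1", "en", {}, toy_generator, cache))
    print(describe("Q1", "de", {}, toy_generator, cache))
```

Because generated text lives only in the cache, improving the generator later just means clearing the cache; no human-written description is touched.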
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
On Sat, Aug 15, 2015 at 3:43 AM, Dan Garry dga...@wikimedia.org wrote: I've seen arguments on both sides here. Some say automatically generated descriptions are not good enough. Some say they are. Why don't we gather some data on this and use that to decide what's right? :-) Please do. Especially pay attention to languages other than English though. Because even if we get algorithms to write good descriptions for English are we going to do the same for all the other languages? Especially those where grammar is tricky and Wikidata doesn't even have the necessary information to make the grammar right? The other tricky side is determining why something is actually notable. That's not a trivial thing to determine based on the data we have. Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Product Manager for Wikidata Wikimedia Deutschland e.V. Tempelhofer Ufer 23-24 10963 Berlin www.wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
Ah, but when auto-descriptions get better, how do we know which should be updated, and which have been improved by humans? Because people will scream bloody murder if we replace their descriptions with automatic ones, even if those are better. On Sat, Aug 15, 2015 at 7:53 AM Jane Darnell jane...@gmail.com wrote: Yes but even if the descriptions were just the contents of fields separated by a pipe it would be better than nothing. This could be a prompt to make a game that offers to update the description with an auto-generated text. So for a Monet painting, the description could be creator Monet|instance painting. We have over 100,000 paintings on Wikidata thanks to the Sum of all Paintings project (yay!) and most museums only have titles in the language it was created in and the language of the museum, so we are a long way from creating meaningful titles for all of these and meaningful short descriptions would be a real benefit to the project. On Sat, Aug 15, 2015 at 8:38 AM, Lydia Pintscher lydia.pintsc...@wikimedia.de wrote: On Sat, Aug 15, 2015 at 3:43 AM, Dan Garry dga...@wikimedia.org wrote: I've seen arguments on both sides here. Some say automatically generated descriptions are not good enough. Some say they are. Why don't we gather some data on this and use that to decide what's right? :-) Please do. Especially pay attention to languages other than English though. Because even if we get algorithms to write good descriptions for English are we going to do the same for all the other languages? Especially those where grammar is tricky and Wikidata doesn't even have the necessary information to make the grammar right? The other tricky side is determining why something is actually notable. That's not a trivial thing to determine based on the data we have. Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Product Manager for Wikidata Wikimedia Deutschland e.V. 
Tempelhofer Ufer 23-24 10963 Berlin www.wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
Well we should start by filling blank descriptions of course. I should have gone on to explain that in my experience of using Listeria lists in my userspace on the Dutch Wikipedia, I have noticed lots of Wikidata infrastructure that hasn't been translated yet. So in my example of the Monet painting, in some languages it would not look like creator Monet|instance painting but like Qxyz Monet|Qefg Qklm (worst case scenario where only the item for Monet has been propagated to all 200+ languages). Having a game where such auto descriptions can be served to people who are able to fill in labels and descriptions could be useful for more than just the one item. On Sat, Aug 15, 2015 at 2:08 PM, Magnus Manske magnusman...@googlemail.com wrote: Ah, but when auto-descriptions get better, how do we know which should be updated, and which have been improved by humans? Because people will scream bloody murder if we replace their descriptions with automatic ones, even if those are better. On Sat, Aug 15, 2015 at 7:53 AM Jane Darnell jane...@gmail.com wrote: Yes but even if the descriptions were just the contents of fields separated by a pipe it would be better than nothing. This could be a prompt to make a game that offers to update the description with an auto-generated text. So for a Monet painting, the description could be creator Monet|instance painting. We have over 100,000 paintings on Wikidata thanks to the Sum of all Paintings project (yay!) and most museums only have titles in the language it was created in and the language of the museum, so we are a long way from creating meaningful titles for all of these and meaningful short descriptions would be a real benefit to the project. On Sat, Aug 15, 2015 at 8:38 AM, Lydia Pintscher lydia.pintsc...@wikimedia.de wrote: On Sat, Aug 15, 2015 at 3:43 AM, Dan Garry dga...@wikimedia.org wrote: I've seen arguments on both sides here. Some say automatically generated descriptions are not good enough. Some say they are. 
Why don't we gather some data on this and use that to decide what's right? :-) Please do. Especially pay attention to languages other than English though. Because even if we get algorithms to write good descriptions for English are we going to do the same for all the other languages? Especially those where grammar is tricky and Wikidata doesn't even have the necessary information to make the grammar right? The other tricky side is determining why something is actually notable. That's not a trivial thing to determine based on the data we have. Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Product Manager for Wikidata Wikimedia Deutschland e.V. Tempelhofer Ufer 23-24 10963 Berlin www.wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l ___ Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l ___ Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l
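To make the pipe-separated idea and its failure mode concrete, here is a small Python sketch. It joins up to three "property value" pairs with a pipe and falls back to the raw identifier when a label is missing in the requested language, which is the worst case described above. The function name, the data layout, and the label table are assumptions made up for this example; the identifiers are used only as illustrative keys and the labels are hand-written, not fetched from Wikidata.

```python
from typing import Dict, List, Tuple

# labels[entity_id][language] -> label; hand-written sample data.
Labels = Dict[str, Dict[str, str]]


def pipe_description(
    statements: List[Tuple[str, str]],
    labels: Labels,
    lang: str,
    max_fields: int = 3,
) -> str:
    """Join up to `max_fields` (property, value) pairs as 'prop value|prop value'.

    Any entity without a label in `lang` is shown by its raw id.
    """
    parts = []
    for prop_id, value_id in statements[:max_fields]:
        prop = labels.get(prop_id, {}).get(lang, prop_id)
        value = labels.get(value_id, {}).get(lang, value_id)
        parts.append(f"{prop} {value}")
    return "|".join(parts)


# Illustrative statements for a painting: (creator, Monet), (instance, painting).
STATEMENTS = [("P170", "Q296"), ("P31", "Q3305213")]

# Only the item for Monet has a label in the second language ("xx").
LABELS: Labels = {
    "P170": {"en": "creator"},
    "P31": {"en": "instance"},
    "Q296": {"en": "Monet", "xx": "Monet"},
    "Q3305213": {"en": "painting"},
}

if __name__ == "__main__":
    print(pipe_description(STATEMENTS, LABELS, "en"))
    print(pipe_description(STATEMENTS, LABELS, "xx"))
```

Counting how many raw ids survive in the output for a given language would be one simple way to show translators where a single new label improves many descriptions at once.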
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
On Aug 15, 2015 14:06, Magnus Manske magnusman...@googlemail.com wrote: On Sat, Aug 15, 2015 at 7:38 AM Lydia Pintscher lydia.pintsc...@wikimedia.de wrote: On Sat, Aug 15, 2015 at 3:43 AM, Dan Garry dga...@wikimedia.org wrote: I've seen arguments on both sides here. Some say automatically generated descriptions are not good enough. Some say they are. Why don't we gather some data on this and use that to decide what's right? :-) Please do. Especially pay attention to languages other than English though. Because even if we get algorithms to write good descriptions for English are we going to do the same for all the other languages? Especially those where grammar is tricky and Wikidata doesn't even have the necessary information to make the grammar right? The other tricky side is determining why something is actually notable. That's not a trivial thing to determine based on the data we have. And you know very well that (AFAIK) I am the only one who actually worked on this, in a tiny fraction of my spare time, and I only speak German and English. The /real/ questions here are: 1. The language that are actually implemented, are they returning descriptions that are good/OK/bad/plain wrong 2. What could be achieved, on the existing or similar infrastructure, in a short period of time, if we drive to get code snippets (or equivalent) for other languages from volunteers? 3. What could be achieved, medium/long term, if we had a proper linguist to work on the problem? Or someone who has worked with multi-language text generation before? I've just been winging it so far. Current auto-descriptions are not the best we can do. They are, frankly, the WORST we can do. This is a starting point, not the end product. Yeah I understand. And this is not a criticism of your work. I think it is actually rather cool. It is questioning if it is a good idea to continue to push it to get into production on Wikipedia on a large scale. 
Cheers Lydia ___ Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
That sounds like a very good idea for labels. Not quite a game, but I have something along those lines running for a while. Example: Monet. http://tools.wmflabs.org/wikidata-todo/cloudy_concept.php?q=Q296&lang=en Every time a label is set in a language, it would potentially improve dozens or hundreds of auto-descriptions. Which is one reason why we should NOT flood the manual description field with one-off text generation; they need to be updated, and figuring out which descriptions we can overwrite is next-to-impossible. On Sat, Aug 15, 2015 at 1:17 PM Jane Darnell jane...@gmail.com wrote: Well we should start by filling blank descriptions of course. I should have gone on to explain that in my experience of using Listeria lists in my userspace on the Dutch Wikipedia, I have noticed lots of Wikidata infrastructure that hasn't been translated yet. So in my example of the Monet painting, in some languages it would not look like creator Monet|instance painting but like Qxyz Monet|Qefg Qklm (worst case scenario where only the item for Monet has been propagated to all 200+ languages). Having a game where such auto descriptions can be served to people who are able to fill in labels and descriptions could be useful for more than just the one item. On Sat, Aug 15, 2015 at 2:08 PM, Magnus Manske magnusman...@googlemail.com wrote: Ah, but when auto-descriptions get better, how do we know which should be updated, and which have been improved by humans? Because people will scream bloody murder if we replace their descriptions with automatic ones, even if those are better. On Sat, Aug 15, 2015 at 7:53 AM Jane Darnell jane...@gmail.com wrote: Yes but even if the descriptions were just the contents of fields separated by a pipe it would be better than nothing. This could be a prompt to make a game that offers to update the description with an auto-generated text. So for a Monet painting, the description could be creator Monet|instance painting. 
We have over 100,000 paintings on Wikidata thanks to the Sum of all Paintings project (yay!) and most museums only have titles in the language it was created in and the language of the museum, so we are a long way from creating meaningful titles for all of these and meaningful short descriptions would be a real benefit to the project. On Sat, Aug 15, 2015 at 8:38 AM, Lydia Pintscher lydia.pintsc...@wikimedia.de wrote: On Sat, Aug 15, 2015 at 3:43 AM, Dan Garry dga...@wikimedia.org wrote: I've seen arguments on both sides here. Some say automatically generated descriptions are not good enough. Some say they are. Why don't we gather some data on this and use that to decide what's right? :-) Please do. Especially pay attention to languages other than English though. Because even if we get algorithms to write good descriptions for English are we going to do the same for all the other languages? Especially those where grammar is tricky and Wikidata doesn't even have the necessary information to make the grammar right? The other tricky side is determining why something is actually notable. That's not a trivial thing to determine based on the data we have. Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Product Manager for Wikidata Wikimedia Deutschland e.V. Tempelhofer Ufer 23-24 10963 Berlin www.wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l ___ Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l ___ Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
On Sat, Aug 15, 2015 at 1:17 PM Lydia Pintscher lydia.pintsc...@wikimedia.de wrote:

> On Aug 15, 2015 14:06, Magnus Manske magnusman...@googlemail.com wrote:
>
>> On Sat, Aug 15, 2015 at 7:38 AM Lydia Pintscher lydia.pintsc...@wikimedia.de wrote:
>>
>>> On Sat, Aug 15, 2015 at 3:43 AM, Dan Garry dga...@wikimedia.org wrote:
>>>
>>>> I've seen arguments on both sides here. Some say automatically generated descriptions are not good enough. Some say they are. Why don't we gather some data on this and use that to decide what's right? :-)
>>>
>>> Please do. Especially pay attention to languages other than English though. Because even if we get algorithms to write good descriptions for English, are we going to do the same for all the other languages? Especially those where grammar is tricky and Wikidata doesn't even have the necessary information to make the grammar right? The other tricky side is determining why something is actually notable. That's not a trivial thing to determine based on the data we have.
>>
>> And you know very well that (AFAIK) I am the only one who actually worked on this, in a tiny fraction of my spare time, and I only speak German and English. The /real/ questions here are:
>>
>> 1. For the languages that are actually implemented, are the returned descriptions good/OK/bad/plain wrong?
>> 2. What could be achieved, on the existing or similar infrastructure, in a short period of time, if we make a drive to get code snippets (or equivalent) for other languages from volunteers?
>> 3. What could be achieved, medium/long term, if we had a proper linguist to work on the problem? Or someone who has worked with multi-language text generation before? I've just been winging it so far.
>>
>> Current auto-descriptions are not the best we can do. They are, frankly, the WORST we can do. This is a starting point, not the end product.
>
> Yeah I understand. And this is not a criticism of your work. I think it is actually rather cool. It is questioning whether it is a good idea to keep pushing to get it into production on Wikipedia on a large scale.

With that, I agree wholeheartedly. There might be a point in doing an extended prototype though, before going to production (as much as I'd like that).

* What languages would be easy, hard, impossible?
* Would this work as a stand-alone project (e.g. dedicated VM), or as an extension of Wikibase (flexibility vs. convenient integration)?
* What open source code is already out there that we could use?
* Anyone in WMF/chapters who has experience in text generation?
* Anyone in WMF/chapters who speaks a small language and could help set up an example generator for it?
* What are the major item classes on Wikidata to be covered with special code, beyond the obvious human bio?

And we'd need someone to run this. As much as I'd like to, I'm stretched too thin as it is...

___
Mobile-l mailing list
Mobile-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mobile-l
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
Yes, something like that (although if you look at a list that long in a small language like Macedonian, it could be pretty overwhelming). I think we need something that shows people why it is worth their time to translate property or item labels for things that don't have Wikipedia articles attached to them (yet).

On Sat, Aug 15, 2015 at 2:33 PM, Magnus Manske magnusman...@googlemail.com wrote:

> That sounds like a very good idea for labels. Not quite a game, but I have had something along those lines running for a while. Example: Monet. http://tools.wmflabs.org/wikidata-todo/cloudy_concept.php?q=Q296&lang=en
>
> Every time a label is set in a language, it would potentially improve dozens or hundreds of auto-descriptions. Which is one reason why we should NOT flood the manual description field with one-off text generation; they need to be updated, and figuring out which descriptions we can overwrite is next-to-impossible.
>
> On Sat, Aug 15, 2015 at 1:17 PM Jane Darnell jane...@gmail.com wrote:
>
>> Well, we should start by filling blank descriptions of course. I should have gone on to explain that in my experience of using Listeria lists in my userspace on the Dutch Wikipedia, I have noticed lots of Wikidata infrastructure that hasn't been translated yet. So in my example of the Monet painting, in some languages it would not look like "creator Monet|instance painting" but like "Qxyz Monet|Qefg Qklm" (worst case scenario where only the item for Monet has been propagated to all 200+ languages). Having a game where such auto descriptions can be served to people who are able to fill in labels and descriptions could be useful for more than just the one item.
>>
>> On Sat, Aug 15, 2015 at 2:08 PM, Magnus Manske magnusman...@googlemail.com wrote:
>>
>>> Ah, but when auto-descriptions get better, how do we know which should be updated, and which have been improved by humans? Because people will scream bloody murder if we replace their descriptions with automatic ones, even if those are better.
>>>
>>> On Sat, Aug 15, 2015 at 7:53 AM Jane Darnell jane...@gmail.com wrote:
>>>
>>>> Yes, but even if the descriptions were just the contents of fields separated by a pipe it would be better than nothing. This could be a prompt to make a game that offers to update the description with an auto-generated text. So for a Monet painting, the description could be "creator Monet|instance painting". We have over 100,000 paintings on Wikidata thanks to the Sum of all Paintings project (yay!) and most museums only have titles in the language the painting was created in and the language of the museum, so we are a long way from creating meaningful titles for all of these, and meaningful short descriptions would be a real benefit to the project.
>>>>
>>>> On Sat, Aug 15, 2015 at 8:38 AM, Lydia Pintscher lydia.pintsc...@wikimedia.de wrote:
>>>>
>>>>> On Sat, Aug 15, 2015 at 3:43 AM, Dan Garry dga...@wikimedia.org wrote:
>>>>>
>>>>>> I've seen arguments on both sides here. Some say automatically generated descriptions are not good enough. Some say they are. Why don't we gather some data on this and use that to decide what's right? :-)
>>>>>
>>>>> Please do. Especially pay attention to languages other than English though. Because even if we get algorithms to write good descriptions for English, are we going to do the same for all the other languages? Especially those where grammar is tricky and Wikidata doesn't even have the necessary information to make the grammar right? The other tricky side is determining why something is actually notable. That's not a trivial thing to determine based on the data we have.
>>>>>
>>>>> Cheers
>>>>> Lydia
>>>>>
>>>>> --
>>>>> Lydia Pintscher - http://about.me/lydia.pintscher
>>>>> Product Manager for Wikidata
>>>>> Wikimedia Deutschland e.V.
>>>>> Tempelhofer Ufer 23-24
>>>>> 10963 Berlin
>>>>> www.wikimedia.de
>>>>>
>>>>> Wikimedia Deutschland - Society for the Promotion of Free Knowledge e. V. Registered in the register of associations of the Berlin-Charlottenburg district court under number 23855 Nz. Recognized as a non-profit organization by the Berlin tax office for corporations I, tax number 27/681/51985.
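To make the pipe-separated idea from the message above concrete, here is a minimal sketch of how such a fallback description could be assembled, and why untranslated labels hurt it. The IDs are the placeholders from the thread plus a made-up "Qmonet"; the function name and the data layout are assumptions for illustration only, not anything Wikidata or Wikibase provides.

```python
def pipe_description(claims, labels, lang):
    """Join 'property value' pairs with '|'.

    When an entity has no label in the requested language, its raw ID is
    shown instead, which is the worst case described in the thread.
    """
    def label(entity_id):
        return labels.get(entity_id, {}).get(lang, entity_id)

    return "|".join(f"{label(prop)} {label(value)}" for prop, value in claims)


# Hypothetical IDs and labels; only the Monet item is labelled in Macedonian.
claims = [("Qxyz", "Qmonet"), ("Qefg", "Qklm")]
labels = {
    "Qxyz": {"en": "creator"},
    "Qmonet": {"en": "Monet", "mk": "Monet"},
    "Qefg": {"en": "instance"},
    "Qklm": {"en": "painting"},
}

print(pipe_description(claims, labels, "en"))  # creator Monet|instance painting
print(pipe_description(claims, labels, "mk"))  # Qxyz Monet|Qefg Qklm
```

Adding a single Macedonian label for "Qxyz" would improve the output for every item that uses that property, which is the argument for translating labels rather than writing one-off descriptions.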
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
Magnus, were you thinking that if there *is* a description field for the knowledge item then that should override the computed description?

On Fri, Aug 14, 2015 at 10:31 AM, Magnus Manske magnusman...@googlemail.com wrote:

> As to Wikidata descriptions, I think it's a good first step. As someone mentioned, it's pretty useless for most languages, as there are no descriptions on Wikidata.
>
> IMHO the next step is auto-generating short descriptions from the item statements, which will be perfectly fine for the vast majority of cases. If this is done, I suggest to NOT put the auto-generated text in the manual description field, as descriptions will improve over time, through both new statements and better algorithms. Rather, cache descriptions separately, and update them as required.
>
> On Fri, Aug 14, 2015 at 4:52 PM Adam Baso ab...@wikimedia.org wrote:
>
>> We recently resumed tweeting with the @WikimediaMobile handle, and I wanted to share one tweet with you: https://twitter.com/WikimediaMobile/status/631178379501285376
>>
>> It looks like people are pretty keen on it. There was one person who said outside of top Wikipedias it doesn't seem quite as useful.
>>
>> I was wondering, what role might https://www.wikidata.org/wiki/Wikidata:Arbitrary_access play in helping to enrich results? https://phabricator.wikimedia.org/T100786 and https://phabricator.wikimedia.org/T100787 are recent examples of implementation of this sort of thing, as mentioned on https://wikitech.wikimedia.org/wiki/Deployments.
>>
>> -Adam
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
Absolutely. However, IMO there should only be a manual description if the automatic one is not sufficient. "Austrian biologist (1900-1990), winner of the 1980 Whatnot award" is not something humans need to write in 250 languages. In fact, I'd be in favor of removing trivial manual descriptions, as automatic ones would likely be better (as in, more up-to-date with the statements).

But yes, the manual description, if present, should take precedence.

On Fri, Aug 14, 2015 at 6:36 PM Adam Baso ab...@wikimedia.org wrote:

> Magnus, were you thinking that if there *is* a description field for the knowledge item then that should override the computed description?
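The precedence rule agreed on in this exchange (manual description wins when present, otherwise fall back to the separately cached automatic one) is small enough to sketch. The function below is purely illustrative; its name and signature are assumptions and do not correspond to any existing Wikibase or MobileFrontend API. The sample strings are the ones quoted elsewhere in this thread for Q6256189.

```python
from typing import Optional


def description_for_display(manual: Optional[str], automatic: Optional[str]) -> Optional[str]:
    """Return the manual description if it is non-blank, else the automatic one.

    The two values are assumed to live in separate fields, so the automatic
    text can be regenerated at any time without touching human edits.
    """
    if manual and manual.strip():
        return manual.strip()
    if automatic and automatic.strip():
        return automatic.strip()
    return None


# English: a manual description exists, so it takes precedence.
print(description_for_display("American politician", "US-American politician (*1968) ♂"))
# German: no manual description, so the automatic one is shown.
print(description_for_display(None, "Vereinigte Staaten Politiker (*1968) ♂"))
```

Because nothing is ever written into the manual field, improving the algorithm or the statements only changes the second argument.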
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
First example that loaded on a random item: https://www.wikidata.org/wiki/Q6256189

English:
  Manual description: American politician.
  Automatic description: US-American politician (*1968) ♂

German:
  Manual description: None.
  Automatic description: Vereinigte Staaten Politiker (*1968) ♂
  (yes, would need some work on the algorithm, but understandable)

https://tools.wmflabs.org/autodesc/?q=Q6256189&lang=de&mode=short&links=text&redlinks=&format=jsonfm
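For anyone wanting to try other items or languages, the autodesc link in the message above can be rebuilt programmatically. This is only a sketch: the parameter names (q, lang, mode, links, redlinks, format) are read off that link, and whether the tool accepts other values, or is still reachable at this address, is not verified here.

```python
from urllib.parse import urlencode

# Base address as it appears in the thread.
AUTODESC_BASE = "https://tools.wmflabs.org/autodesc/"


def autodesc_url(item_id: str, lang: str) -> str:
    """Build a query URL with the same parameters as the link in the thread."""
    params = {
        "q": item_id,
        "lang": lang,
        "mode": "short",
        "links": "text",
        "redlinks": "",   # present but empty in the original link
        "format": "jsonfm",
    }
    return AUTODESC_BASE + "?" + urlencode(params)


print(autodesc_url("Q6256189", "de"))
```

Swapping the second argument for another language code is the quickest way to compare how far the per-language generators have come.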
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
On Fri, Aug 14, 2015 at 9:54 PM Gergo Tisza gti...@wikimedia.org wrote:

> On Fri, Aug 14, 2015 at 10:31 AM, Magnus Manske magnusman...@googlemail.com wrote:
>
>> IMHO the next step is auto-generating short descriptions from the item statements, which will be perfectly fine for the vast majority of cases.
>
> The Wikidata team is not a fan of that idea: T91981 https://phabricator.wikimedia.org/T91981

Yes, sadly. The argument "not good enough" is a fail IMHO, though. If it's bad, improve the algorithm and/or add statements. If it's still bad, THEN add a manual description.

I think the worst possible description is the one that's missing. Back-of-the-envelope calculation:

* We have ~45 million manual descriptions at the moment on Wikidata
* We have ~18 million items
* We have ~250 languages

That means that, as of this moment, only about 1% of all possible descriptions are filled in. And the quality of these manual descriptions is anyone's guess; I've seen plenty of "disambiguation page" and "category page", EVEN IF THAT IS NOT TRUE. Some crappy bot filled those in. No chance of quickly fixing this.

So, 99% of descriptions are missing, with little chance of them getting filled in at all (think: small languages), and a rather dubious track record for the ones that are. It's like letting people drown in the Mediterranean because the tents to house them temporarily are not good enough. Frustrating, seriously.
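The back-of-the-envelope figure above is easy to check. The snippet below just redoes the arithmetic with the three rounded numbers from the message; the variable names are mine, and the inputs are the approximate 2015 values quoted in the thread, not fresh statistics.

```python
# Rounded figures as quoted in the message (all approximate).
manual_descriptions = 45_000_000
items = 18_000_000
languages = 250

# One description slot per item per language.
possible_descriptions = items * languages
coverage = manual_descriptions / possible_descriptions

print(f"possible descriptions: {possible_descriptions:,}")  # 4,500,000,000
print(f"filled in: {coverage:.1%}")                          # 1.0%
print(f"missing: {1 - coverage:.1%}")                        # 99.0%
```

With these inputs the filled-in share lands right at the 1% mark, so roughly 99 out of every 100 item/language slots have no description at all.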
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
I added some thoughts on the task. I do think it's something we should explore, even if only on a small group of articles, to measure the impact.

--
Jon Robson
* http://jonrobson.me.uk
* https://www.facebook.com/jonrobson
* @rakugojon
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
> The argument "not good enough" is a fail IMHO, though. If it's bad, improve the algorithm and/or add statements. If it's still bad, THEN add a manual description.

+10^100

--
Dmitry Brant
Mobile Apps Team (Android)
Wikimedia Foundation
https://www.mediawiki.org/wiki/Wikimedia_mobile_engineering
Re: [WikimediaMobile] What people think about Wikidata descriptions in search on mobile web beta, and a question about arbitrary access of Wikidata data
I've seen arguments on both sides here. Some say automatically generated descriptions are not good enough. Some say they are. Why don't we gather some data on this and use that to decide what's right? :-)

Dan