[Wikidata] Re: State of the (Wiki)data

2022-11-02 Thread Markus Krötzsch

Dear all,

Thanks, Romaine, for this detailed and careful analysis of the 
situation. I think much of this is spot-on. I think one of the main 
insights here is that we need more uniformity. Wikidata in many places 
is still used like some exotic "structured" format for entering plain 
texts, which make sense to human readers but prevent or confuse 
automated usage. The key is to "see" collections of items rather than 
single pages.


It seems Wikidata would need more stakeholder communities for specific
areas (say sports events) to oversee and guide the modeling of the items
of this kind. We need more WikiProjects.


Regarding the question whether solutions need to be technical or social, 
I'd say both must go together. I also have often been disheartened by 
the sheer effort that it would require to add even the most obvious 
statements to a larger set of items. Geography is a good example: there 
are so many nearby places that share the same geo-administrative history 
(take a look at the country, P17, of Dresden, Q1731), yet it is 
practically impossible to add this to any significant number of the
thousands of German cities ... Here, like in many of the cases Romaine
has described, the technical limitations may smother necessary community 
activity. (The specific case might also be an example of something where 
an approach of "data sharing" is needed, i.e. a modeling paradigm that 
simply allows us to say "this place has the same history of P17 
statements as this other place"; but that's not the main topic of this 
post).
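
To make the "data sharing" idea a bit more concrete, here is a minimal
sketch in Python. It is purely illustrative: the `same_history_as` key,
the `resolve_statements` function, and the placeholder IDs and values are
all assumptions invented for this example, not an existing Wikidata
feature or API.

```python
def resolve_statements(items, item_id, prop, _seen=None):
    """Return the statements of `prop` for `item_id`.

    Hypothetical model: an item either carries its own statements for a
    property, or declares `same_history_as` for that property and points
    to another item whose statements it shares.
    """
    seen = _seen if _seen is not None else set()
    if item_id in seen:
        # Guard against items that point at each other in a loop.
        raise ValueError(f"cyclic same_history_as reference at {item_id}")
    seen.add(item_id)

    item = items[item_id]
    own = item.get("statements", {}).get(prop)
    if own:
        return own

    target = item.get("same_history_as", {}).get(prop)
    if target is None:
        return []
    return resolve_statements(items, target, prop, seen)


# Placeholder data (not real Wikidata content).
items = {
    "PLACE_A": {
        "statements": {
            "P17": ["COUNTRY_1 (until YEAR_X)", "COUNTRY_2 (from YEAR_X)"],
        },
    },
    # PLACE_B stores no P17 history itself; it shares PLACE_A's.
    "PLACE_B": {"same_history_as": {"P17": "PLACE_A"}},
}

history_b = resolve_statements(items, "PLACE_B", "P17")
```

In this sketch the history is entered once on PLACE_A, and any correction
there is automatically visible for PLACE_B; whether such indirection would
be acceptable for queries and tools is a separate question.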


New tools may also enable and encourage communities to grow that have 
not formed in the past decade. One aspect here might be that it is 
difficult for communities to appreciate the result of their efforts. For 
example, it is very difficult to create a uniform appearance for a group 
of pages, if only because the order of statements (in a group of the same
property) is so hard to change, and also because the pages are already
very long. Even if one can achieve complete semantic uniformity, one
will not currently have much opportunity to "see" this success. There 
are unsolved challenges here that cannot be compared with the relatively 
simple and small data that one can find in a typical Wikipedia Infobox. 
External developers and maybe even researchers could contribute here, 
but they would also benefit from the input and concrete ideas from
WikiProjects (Romaine's email already had quite a number of directly
implementable ideas in it ... this kind of constructive input is already 
half of the solution).


Cheers,

Markus


On 31/10/2022 23:40, Romaine Wiki wrote:
Yesterday marked 10 years since Wikidata was founded, and two weeks
ago Wikidata reached 100 million items. This is a good
moment to see what we have (and don't have), to look back a bit, and
also to express some hope for the future.


The idea to describe this already started in September and since then I
have done various analyses to get a picture. This, however, will not be
a complete overview as there are too many factors involved, just a 
general picture of what I came across.


(Spoiler: This e-mail gets more structure further below. :-p)

== Structured? ==

Wikidata, it is said, contains structured data. I think we need to be
more precise about this: it is how the data is stored that is structured.
And this structured data is _only_ present on an individual item. If we
zoom out a little bit and view multiple items of a series, across items
the data is often missing, fragmented, differently organised, and
sometimes even problematic. On a multi-item level (series level) it
depends heavily on whether a user has done all the work to synchronise
the various items or not.


*Example:* I came across a series of items about a certain sports
tournament with an edition organised each year for 50 years in a row.
For P31 (instance of), on 5 items it was called an event, on 25 items it
was called a sporting event, on 13 items a tournament, on some others
a competition, and a few had no P31. To be clear, each edition had the
same setup, was for the same sport, everything the same. The articles on 
Wikipedia are better structured!
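
A check of this kind can be automated. The following is a minimal sketch
in Python, assuming the P31 values for each item of a series have already
been fetched (for example via the Wikidata Query Service). The item IDs
and class labels are invented placeholders, not real data, and
`p31_report` and `is_uniform` are hypothetical helper names.

```python
from collections import Counter


def p31_report(series):
    """Tally P31 (instance of) values across a series of items.

    `series` maps an item ID to the list of its P31 values; an empty
    list means the item has no P31 statement. Returns a Counter of the
    P31 values and the list of item IDs that lack P31 entirely.
    """
    counts = Counter()
    missing = []
    for item_id, classes in series.items():
        if not classes:
            missing.append(item_id)
        counts.update(classes)
    return counts, missing


def is_uniform(series):
    """True if every item has the same, non-empty set of P31 values."""
    value_sets = {frozenset(classes) for classes in series.values()}
    return len(value_sets) == 1 and frozenset() not in value_sets


# Placeholder data standing in for the editions of one tournament.
example = {
    "Q_EDITION_1": ["sporting event"],
    "Q_EDITION_2": ["tournament"],
    "Q_EDITION_3": ["sporting event"],
    "Q_EDITION_4": [],
}

counts, missing = p31_report(example)
uniform = is_uniform(example)
```

For the placeholder data, `counts` records two different classes, `missing`
lists the one edition without P31, and `uniform` is False, which is the
signal that the series needs to be synchronised.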


This is just a simple series of items. Zooming out another level, the
differences between series are huge, which makes the quality low.


How is a new item added? In the past ten years many items have been 
added with bots/tools based on the articles on Wikipedia. (Yes, I
ignore other additions here.) In future, many items will still be created
when an article on Wikipedia has been created. In the worst case, the
user adds the sitelink and the item stays empty (practically useless!).
A little bit better, the user adds P31/P279 (instance of/subclass of)
(not useful, but it helps). Better still, other statements are also
added (the item becomes useful). Better when a user checks one/two other
items in a series. Much better when a user checks all items 

[Wikidata] Re: [Small wiki toolkits] Final feedback session on Friday, October 28th, at 16:00 UTC

2022-11-02 Thread Srishti Sethi
Hello,

If you missed attending the last feedback session on the small wiki
toolkits workshops, here is a short survey that you could fill out to let
us know your technical learning needs, ideas, and suggestions for
improving the workshop format for next year:
<https://docs.google.com/forms/d/e/1FAIpQLSeVhqgZYXQG8Fgw25xRS0n7eKMleOKy5i1tM5Ty3mGACtcTag/viewform>
[1]. If you attended the feedback session, you can still fill out the survey
to share, more anonymously, ideas that you couldn't raise during our meeting
last week.


We look forward to learning from your suggestions and including them in the
planning for next year.


Cheers,

Srishti


[1]
https://docs.google.com/forms/d/e/1FAIpQLSeVhqgZYXQG8Fgw25xRS0n7eKMleOKy5i1tM5Ty3mGACtcTag/viewform


*Srishti Sethi*
Senior Developer Advocate
Wikimedia Foundation 



On Fri, Oct 28, 2022 at 8:01 AM Seyram Komla Sapaty 
wrote:

> Hello!
>
> Reminder that this workshop starts in an hour (16:00 UTC).
>
> See you there!
>
>
>
> On Thu, Oct 20, 2022 at 8:18 PM Srishti Sethi 
> wrote:
>
>> Hello everyone,
>>
>>
>> The last & final feedback session on the "Small wiki toolkits" (SWT)
>> workshop series is coming up - it will take place on Friday, October 28th,
>> at 16:00 UTC. You can find more details on the workshop and a link to join
>> here: <
>> https://meta.wikimedia.org/wiki/Small_wiki_toolkits/Workshops#Upcoming:_Final_feedback_session_on_the_workshop_series>
>> [1].
>>
>>
>> This workshop will gather feedback on the SWT workshop series around bots
>> and scripts development, ongoing since January 2022. There will be a
>> discussion around the following:
>>
>>    - Overall feedback on the workshop series
>>    - Technical topics you would like to see the SWT team focus on by
>>      running workshops or developing resources in 2023
>>    - Your preferred learning formats
>>
>>
>> You do not need to have attended previous workshops to participate in
>> this session. We look forward to your participation!
>>
>>
>> Best,
>>
>> Srishti
>>
>>
>> On behalf of the SWT Workshops Organization team
>>
>>
>> [1]
>> https://meta.wikimedia.org/wiki/Small_wiki_toolkits/Workshops#Upcoming:_Final_feedback_session_on_the_workshop_series
>>
>>
>>
>> *Srishti Sethi*
>> Senior Developer Advocate
>> Wikimedia Foundation 
>>
>>
___
Wikidata mailing list -- wikidata@lists.wikimedia.org
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/5YOFVJH3SWE7W7L6UMPONET6HHLTQ4YM/
To unsubscribe send an email to wikidata-le...@lists.wikimedia.org


[Wikidata] Re: Talk to the Search Platform / Query Service Team—November 2nd, 2022

2022-11-02 Thread Guillaume Lederrey
The Search Platform Office Hours are starting in about 1h. Feel free to
join if you want to talk to us!

On Tue, 1 Nov 2022 at 11:37, Guillaume Lederrey 
wrote:

> Hello all!
>
> The Search Platform Team usually holds an open meeting on the first
> Wednesday of each month. Come talk to us about anything related to
> Wikimedia search, Wikidata Query Service (WDQS), Wikimedia Commons Query
> Service (WCQS), etc.!
>
> Feel free to add your items to the Etherpad Agenda for the next meeting.
>
> Details for our next meeting:
> Date: Wednesday, November 2nd, 2022
> Time: 15:00-16:00 UTC / 08:00 PDT / 11:00 EDT / 16:00 CET / 19:00 GST
> Etherpad:
> https://etherpad.wikimedia.org/p/Search_Platform_Office_Hours
> Google Meet link: https://meet.google.com/vgj-bbeb-uyi
> Join by phone: https://tel.meet/vgj-bbeb-uyi?pin=8118110806927
>
> Have fun and see you soon!
>
> Guillaume
>
> --
> *Guillaume Lederrey* (he/him)
> Engineering Manager
> Wikimedia Foundation 
>


-- 
*Guillaume Lederrey* (he/him)
Engineering Manager
Wikimedia Foundation 
___
Wikidata mailing list -- wikidata@lists.wikimedia.org
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/HMEC432IXJCOEJ7HUT7JX3AZRNFQOPRP/
To unsubscribe send an email to wikidata-le...@lists.wikimedia.org