Thanks for the message, Houcemeddine. I admire your optimism! I'm happy that you're investigating biomedical taxonomic statements in Wikidata, but I'm not interested in writing a paper right now. You can find examples of working with the JSON dumps in the qwikidata Python package I wrote, here:

https://qwikidata.readthedocs.io/en/stable/readme.html#json-dump

You can contact me off the mailing list if you would like help using the package.

best,
-Gabriel

On Sat, Jun 15, 2019 at 8:30 PM Houcemeddine A. Turki <turkiabdelwa...@hotmail.fr> wrote:

> Dear Sir,
>
> I thank you for your efforts. When dealing with biomedical taxonomic
> statements in Wikidata, we found similar deficiencies. I have already
> decided to write a paper about the biomedical taxonomy of Wikidata and how
> to adjust it. I will be honoured if you can be the first author of the
> work. You have already extracted the taxonomic statements. So, you can
> easily filter the biomedical ones. This work has already been done for
> other taxonomies such as SNOMED-CT (
> https://scholar.google.ca/citations?user=UsG8QFwAAAAJ&hl=fr&oi=sra,
> https://scholar.google.ca/citations?user=c4LlYxsAAAAJ&hl=fr&oi=sra,
> https://scholar.google.ca/citations?user=jVLGHGQAAAAJ&hl=fr&oi=sra,
> https://scholar.google.ca/citations?user=fBAvwi4AAAAJ&hl=fr&oi=sra). I
> will be available online for further discussion if you agree to work with
> our team. This will be simple.
>
> Yours Sincerely,
>
> Houcemeddine Turki (he/him)
> Medical Student, Faculty of Medicine of Sfax, University of Sfax, Tunisia
> Undergraduate Researcher, UR12SP36
> GLAM and Education Coordinator, Wikimedia TN User Group
> Member, WikiResearch Tunisia
> Member, Wiki Project Med
> Member, WikiIndaba Steering Committee
> Member, Wikimedia and Library User Group Steering Committee
> Co-Founder, WikiLingua Maghreb
> Founder, TunSci
> ____________________
> +21629499418
>
> -------- Original message --------
> From: Gabriel Altay <gabriel.al...@gmail.com>
> Date: 2019/06/15 23:05 (GMT+01:00)
> To: Discussion list for the Wikidata project <wikidata@lists.wikimedia.org>
> Subject: Re: [Wikidata] instance of, subclass of, oh my
>
> Thanks Jan, I will pursue the badminton discussion on the talk page.
>
> On Sat, Jun 15, 2019 at 5:49 PM Jan Ainali <j...@aina.li> wrote:
>
>> Hello Gabriel,
>>
>> I agree with you about the badminton tournaments, that seems odd.
>> It appears that there is already a discussion about that on the talk page
>> of the only participant in the badminton project:
>> https://www.wikidata.org/wiki/User_talk:Florentyna#subclass_of:_badminton_tournament
>>
>> Perhaps it is best to continue the discussion there?
>>
>> /Jan Ainali
>> http://ainali.com
>>
>> On Sat, 15 June 2019 at 23:11, Gabriel Altay <gabriel.al...@gmail.com> wrote:
>>
>>> Hello everyone,
>>>
>>> I was playing around with a recent Wikidata dump and extracted the items
>>> that "looked" like classes based on the definition here,
>>>
>>> https://www.wikidata.org/wiki/Wikidata:WikiProject_Ontology/Classes
>>>
>>> Specifically, an item is a class-item if any of the following are true,
>>> * the item is the value of a P31 ("instance of") statement
>>> * the item has a P279 ("subclass of") statement (subclass)
>>> * the item is the value of a P279 ("subclass of") statement (superclass)
>>>
>>> Once I extracted all items that met these criteria (2,399,621 items
>>> from wikidata-20190603-all.json.bz2) I started examining the results. One
>>> of the things I found slightly surprising is that there are about 23k
>>> badminton events that are classes because they have "subclass of
>>> https://www.wikidata.org/wiki/Q13357858" statements. SPARQL query
>>> below.
>>>
>>> https://query.wikidata.org/#SELECT%20%3Fitem%20%3FitemLabel%20%0AWHERE%20%0A%7B%0A%20%20%3Fitem%20wdt%3AP31%20wd%3AQ57733494.%0A%20%20%3Fitem%20wdt%3AP279%20wd%3AQ13357858.%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%7D
>>>
>>> It also looks like there is a badminton project page,
>>> https://www.wikidata.org/wiki/Category:WikiProject_Badminton
>>> https://www.wikidata.org/wiki/Wikidata:WikiProject_Badminton/Subclass
>>>
>>> I'd like to remove these statements as it seems that a particular
>>> instance of a badminton tournament
>>> https://www.wikidata.org/wiki/Q121940
>>> is not a class.
>>>
>>> It seems that this pattern is also in place for about 1,000,000 items
>>> which are instance of gene (e.g. https://www.wikidata.org/wiki/Q40108).
>>>
>>> I had a couple of questions for the mailing list,
>>>
>>> 1) do folks know if there is an active group working on the Wikidata
>>> ontology?
>>> 2) I've read a few messages about shape expressions. Would it be
>>> worthwhile to set up a shape expression that prevents most items from
>>> having both "instance of" and "subclass of" statements?
>>> 3) if these entries are generated by bots, what is the best way to get
>>> in touch with the owner, their user talk page?
>>>
>>> I am probably missing a lot of information about what has been done so
>>> far in the community, but I'm happy to read anything someone points me
>>> towards.
>>>
>>> best,
>>> -Gabriel
>>> _______________________________________________
>>> Wikidata mailing list
>>> Wikidata@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/wikidata
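
The three class-item criteria quoted above, and the "both instance of and subclass of" pattern from question 2, can be sketched in plain Python. This is only an illustration under stated assumptions, not the script that produced the 2,399,621-item count and not the qwikidata API. It assumes the usual Wikidata JSON dump layout (a JSON array with one entity per line, statements stored under "claims"), it counts statements of every rank, and it only looks at entities of type "item". The names iter_dump_entities, item_values, find_class_items, find_instance_and_subclass and make_item are hypothetical helpers made up for this sketch.

```python
import bz2
import json
from typing import Iterable, Iterator, Set

INSTANCE_OF = "P31"
SUBCLASS_OF = "P279"


def iter_dump_entities(path: str) -> Iterator[dict]:
    """Stream entities from a .json.bz2 dump (assumed: one entity per line)."""
    with bz2.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip().rstrip(",")
            if line in ("[", "]", ""):
                continue  # array brackets and blank lines carry no entity
            yield json.loads(line)


def item_values(entity: dict, prop: str) -> Set[str]:
    """Return the QIDs that `entity` points to through property `prop`."""
    qids = set()
    for claim in entity.get("claims", {}).get(prop, []):
        snak = claim.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            continue  # "novalue"/"somevalue" snaks have no datavalue
        value = snak.get("datavalue", {}).get("value", {})
        if isinstance(value, dict) and "id" in value:
            qids.add(value["id"])
    return qids


def find_class_items(entities: Iterable[dict]) -> Set[str]:
    """Collect QIDs that meet any of the three class-item criteria."""
    classes = set()
    for entity in entities:
        if entity.get("type") != "item":
            continue  # assumption of this sketch: ignore properties, lexemes
        classes |= item_values(entity, INSTANCE_OF)  # value of a P31 statement
        if entity.get("claims", {}).get(SUBCLASS_OF):
            classes.add(entity["id"])  # has a P279 statement (subclass)
        classes |= item_values(entity, SUBCLASS_OF)  # value of P279 (superclass)
    return classes


def find_instance_and_subclass(entities: Iterable[dict]) -> Set[str]:
    """Collect items that carry both P31 and P279 statements."""
    return {
        entity["id"]
        for entity in entities
        if entity.get("type") == "item"
        and entity.get("claims", {}).get(INSTANCE_OF)
        and entity.get("claims", {}).get(SUBCLASS_OF)
    }


def make_item(qid: str, p31=(), p279=()) -> dict:
    """Build a toy entity shaped like a dump record (for illustration only)."""
    def statements(prop, targets):
        return [
            {"mainsnak": {"snaktype": "value", "property": prop,
                          "datavalue": {"type": "wikibase-entityid",
                                        "value": {"entity-type": "item",
                                                  "id": target}}}}
            for target in targets
        ]
    claims = {}
    if p31:
        claims[INSTANCE_OF] = statements(INSTANCE_OF, p31)
    if p279:
        claims[SUBCLASS_OF] = statements(SUBCLASS_OF, p279)
    return {"type": "item", "id": qid, "claims": claims}


# Toy data: Q5 mimics the badminton pattern (both P31 and P279).
toy = [
    make_item("Q1", p31=["Q2"]),
    make_item("Q3", p279=["Q4"]),
    make_item("Q5", p31=["Q2"], p279=["Q4"]),
    make_item("Q6"),
]
class_items = find_class_items(toy)
mixed_items = find_instance_and_subclass(toy)
```

On a real dump, find_class_items(iter_dump_entities("wikidata-20190603-all.json.bz2")) would run the same logic in one streaming pass. The set of class QIDs has to fit in memory, which seems plausible for a few million short strings.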