> We may also want to consider if Wikidata is actually the best store for
> all kinds of data. Let's consider example:
>
> https://www.wikidata.org/w/index.php?title=Q57009452
>
> This is an entity that is almost 2M in size, almost 3000 statements ...

A paper with 2884 authors! arxiv.org deals with it by calling them the
"ATLAS Collaboration": https://arxiv.org/abs/1403.0489
The actual paper does the same (with the full list of names and
affiliations in the Appendix).

The nice thing about graph databases is that we should be able to set
author to point to an "ATLAS Collaboration" node, and then have that node
point to the 2884 individual author nodes (and have each of those nodes
point to its affiliation).

What are the reasons not to re-organize it that way?

My first thought was that the membership of the collaboration changes
over time. But does it change day to day, or only once each academic
year?

Either way, maybe we need to point the author field to something like
"ATLAS Collaboration 2014a", and clone-and-modify that node each time we
come to a paper that describes a different membership?

Or is it better to record each person's membership of such a group with a
start and end date?

(BTW, arxiv.org tells me there are 1059 results for ATLAS Collaboration;
I don't know whether one "result" corresponds to one "paper", though.)

> While I am not against storing this as such, I do wonder if it's
> sustainable to keep such kind of data together with other Wikidata data
> in a single database.

It feels like it belongs in "core" Wikidata. Being able to ask "which
papers has this researcher written?" seems like a good example of a
Wikidata query. Similarly, "which papers has the ATLAS Collaboration
worked on?"

But, also, are queries like "Which authors of Physics papers went to a
high school that had more than 1000 students?" part of the goal of
Wikidata? If so, Wikidata needs optimizing in a way that makes such
queries both possible and tractable.

Darren
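
P.S. To make the "start and end date" option concrete, here is a rough
Python sketch. It is only an illustration: the class names, the sample
people and the dates are all made up, it is not how Wikidata stores
anything, and it assumes (perhaps wrongly) that a paper's author list
is simply whoever was a member on the publication date.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Set


@dataclass(frozen=True)
class Membership:
    """One person's time in a collaboration (hypothetical model)."""
    person: str
    start: date
    end: Optional[date] = None  # None means "still a member"

    def active_on(self, day: date) -> bool:
        return self.start <= day and (self.end is None or day <= self.end)


@dataclass
class Collaboration:
    name: str
    memberships: List[Membership] = field(default_factory=list)

    def members_on(self, day: date) -> Set[str]:
        # Resolve the membership as of a given day.
        return {m.person for m in self.memberships if m.active_on(day)}


@dataclass
class Paper:
    title: str
    published: date
    author: Collaboration  # one edge instead of thousands of author statements

    def individual_authors(self) -> Set[str]:
        # Assumption: authors == members on the publication date.
        return self.author.members_on(self.published)


# Made-up sample data, not real ATLAS members.
atlas = Collaboration("ATLAS Collaboration", [
    Membership("Person A", date(2010, 1, 1)),
    Membership("Person B", date(2010, 1, 1), date(2013, 12, 31)),
    Membership("Person C", date(2014, 6, 1)),
])
paper = Paper("Example paper", date(2014, 3, 3), atlas)

print(sorted(paper.individual_authors()))  # → ['Person A']
```

With this shape the paper carries a single author link, and "which
papers has this researcher written?" becomes a lookup over memberships
and publication dates. The clone-and-modify alternative ("ATLAS
Collaboration 2014a") would instead freeze one member list per node.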