Hi everybody,
Sorry to open up an old thread again after ten days, but there were some
things in Lydia's reply below that I wanted to come back to.
So, first, a couple of examples of the kind of Commons Categories I had
in mind:
https://commons.wikimedia.org/wiki/Category:Images_released_by_British_Library_Images_Online
https://commons.wikimedia.org/wiki/Category:Metropolitan_Improvements_%281828%29_Thomas_Hosmer_Shepherd
Despite their names, both these cats effectively identify images from
particular photosets on Flickr. The first category relates to a
particular set of images released by a particular institution on a
particular date. The second relates to a particular set of scans from a
particular edition of a particular book. Both (IMO) would (and,
moreover, *should*) currently fail Wikidata:Notability.
The book, and even the edition, might be notable. But a particular set
of scans surely would not. Similarly, the first category is really just
a photoset from Flickr, again something that wouldn't currently get a
Wikidata Q-number.
Now in the email below, Lydia effectively said: no problem, just give
each Commons Category a Wikidata Q-number anyway. ("Imho they should be
on Wikidata. I fear if we introduce another layer it'll be considerably
harder to use and maintain.")
GerardM, in sessions at Wikimania, also argued strongly for simply
putting everything in Wikidata.
But I think this would be a mistake, because IMO Wikidata:Notability is
a positive virtue, which should be defended. It is *useful* to people
that they can download a dump of Wikidata for their own purposes, and
get real-world relevant items, rather than the dump being bloated with
wiki junk.
So in my opinion, Commons categories should generally *not* get
Q-numbers on Wikidata (unless they pass WD:N), but should instead get
items on the Commons Wikibase which is being created expressly for the
purpose of holding structured data on things which really only have a
commonswiki significance, and are not real-world notable.
A second point relates to Magnus's issue about how much of this could be
replaced by queries.
Yes, if one were progressively building up a topic search on images from
books in the 1-million image BL Mechanical Curator release, one might
ask for books about London, then books published in a particular date
range. But within that, the natural query to specify scans from this
particular copy of 'Metropolitan Improvements' is the image's membership
of this particular set -- membership of the set in itself is something
that should be queryable, and such a query is the kind of query that, at
the right stage, should be offerable to the user trying to refine their
search.
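
To make that concrete, here is a minimal sketch in Python of set
membership being offered as just one more refinement step. Everything in
it is an assumption for illustration: the FileRecord fields, the sample
records and the refine() helper are made up, and are not any existing
Commons or Wikibase API.

```python
from dataclasses import dataclass, field

@dataclass
class FileRecord:
    title: str
    book_topic: str
    book_year: int
    categories: set = field(default_factory=set)

SHEPHERD = "Metropolitan Improvements (1828) Thomas Hosmer Shepherd"

# Made-up records, purely illustrative
FILES = [
    FileRecord("Scan_A.jpg", "London", 1828, {SHEPHERD}),
    FileRecord("Scan_B.jpg", "London", 1828),
    FileRecord("Scan_C.jpg", "Paris", 1850),
]

def refine(files, topic=None, years=None, category=None):
    """Apply whichever facets are given; category membership is one more facet."""
    result = []
    for f in files:
        if topic is not None and f.book_topic != topic:
            continue
        if years is not None and not (years[0] <= f.book_year <= years[1]):
            continue
        if category is not None and category not in f.categories:
            continue
        result.append(f)
    return result

# Progressive refinement: topic, then date range, then this particular set
about_london = refine(FILES, topic="London")
in_range = refine(about_london, years=(1820, 1830))
this_copy = refine(in_range, category=SHEPHERD)
```

The point of the sketch is only that the last step cannot be expressed
by topic or date alone: two files can match on both and still differ in
whether they belong to this particular set of scans.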
In fact, most current Commons categories will not be WD-notable. But
even for the most egregious of Commons intersection categories, IMO it
will still be worth the Commons Wikibase tracking category membership
for an image, not least because that makes it easy to present the
category's files in different ways -- eg sorted by filename; or by
original creation date; or by upload date; or by uploader; or by
geographical proximity... etc. Holding the category membership in the
wikibase then allows people to write gadgets to sort or filter or
re-present the category in multiple ways. So it's useful to have the
category as an entity that can be a target for a property.
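
As a rough sketch of the kind of gadget logic this would enable, assuming
the membership records for a category have already been fetched somehow
(the record layout, the sample data and the present() helper below are
all hypothetical, not an existing interface):

```python
from datetime import date

# Made-up membership records for one category; in practice these would
# come from the Commons Wikibase (that lookup is assumed, not shown).
MEMBERS = [
    {"filename": "Plate_02.jpg", "created": date(1828, 1, 1),
     "uploaded": date(2014, 3, 5), "uploader": "ExampleUserB"},
    {"filename": "Plate_01.jpg", "created": date(1827, 6, 1),
     "uploaded": date(2014, 3, 7), "uploader": "ExampleUserA"},
    {"filename": "Plate_03.jpg", "created": date(1828, 9, 1),
     "uploaded": date(2014, 3, 1), "uploader": "ExampleUserA"},
]

def present(members, sort_by="filename", uploader=None):
    """Return the filenames, optionally filtered by uploader, sorted by one field."""
    rows = [m for m in members if uploader is None or m["uploader"] == uploader]
    return [m["filename"] for m in sorted(rows, key=lambda m: m[sort_by])]

by_name = present(MEMBERS)
by_upload = present(MEMBERS, sort_by="uploaded")
one_uploader = present(MEMBERS, sort_by="created", uploader="ExampleUserA")
```

The same membership list is re-presented three ways without touching the
category itself, which is all the gadget would need.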
But there are also reasons for a category to have an item in its own
right -- because there is structured data that one may wish to associate
with the category: one example would be access stats for members of the
category (eg which categories in the Mechanical Curator collection have
had the most file views?) -- the kind of thing of great interest to GLAMs.
Many categories also contain information defining them -- for example,
for the book scans category, one would want a property stating that this
category contains scans of the particular book (pointed to by its
Q-number), probably of a particular edition (probably a qualifier). One
might also want to associate linked data -- pointers to entries for the
book in (possibly multiple) catalogues of its original host institution.
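
Purely as an illustration of the shape such a record might take, here is
a small Python sketch. The identifier, the property names and the
placeholder values are all assumptions; none of them are real Wikidata or
Commons properties or items.

```python
# Hypothetical structured-data record for the book scans category.
# Every identifier and property name here is a placeholder.
category_item = {
    "id": "<category-id>",
    "label": "Metropolitan Improvements (1828) Thomas Hosmer Shepherd",
    "statements": [
        {
            "property": "contains scans of",
            "value": "<Q-number of the book>",
            "qualifiers": {"edition": "<Q-number of the edition>"},
        },
        {
            "property": "catalogue entry",
            "value": "<URL of a catalogue record>",
            "qualifiers": {},
        },
    ],
}

def values_of(item, prop):
    """Collect the values of every statement using the given property."""
    return [s["value"] for s in item["statements"] if s["property"] == prop]

scanned_books = values_of(category_item, "contains scans of")
```

A category with several catalogue records would simply carry several
"catalogue entry" statements.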
So for all these reasons it may well be useful, as a matter of course,
to have a container for structured information associated with each
commonscat.
This is why I think each and every category on Commons should have its
own Commons Wikibase item, with an associated C-number.
Queries are important, but I'd suggest they are best seen as an
*addition* to the present category system, rather than a *replacement*
for it.
A particular way forward, it seems to me, might be to allow categories
to be *augmented* with specific queries -- i.e. to allow rules to be
specified for particular categories, so that files whose structured-data
topic information matched the rules would automatically be added to the
categories, alongside