On Wed, May 8, 2013 at 10:47 PM, James Forrester
<jforres...@wikimedia.org> wrote:
> * Pages are implicitly in the parent categories of their explicit
>   categories
>   * -> Pages in <Politicians from the Netherlands> are in <People from
>     the Netherlands by profession> (its first parent) and <People from
>     the Netherlands> (its first parent's parent) and <Politicians> (its
>     second parent) and <People> (its second parent's parent) and …
>   * -> Yes, this poses issues given the sometimes cyclic nature of
>     categories' hierarchies, but this is relatively trivial to code
>     around

Category cycles are the least of it. The fact that the existing category
hierarchy isn't based on any sensible-for-inference ontology is a bigger
problem.

Let's consider what would happen to one of my favorite examples on
enwiki:

* The article for Romania is in <Black Sea countries>. Ok.
* And that category is in <Black Sea>, so Romania is in that too. Which
  is a little strange, but not too bad.
* And <Black Sea> is in <Seas of Russia> and <Landforms of Ukraine>.
  Huh? Romania doesn't belong in either of those, despite that being
  equivalent to your example where pages in <Politicians from the
  Netherlands> also end up in <People> via <Politicians>.

And it gets worse the further up you go. You would have Romania in
<Liquids> a few more levels up.

For this to work, each wiki would have to redo its category hierarchy as
a real ontology based on is-a relationships, rather than the current
is-somehow-related-to. Or we would have to introduce some magic word or
something to tell MediaWiki that <Politicians> is-a <People> is a valid
inference while <Black Sea countries> is-a <Black Sea> isn't.

In other words, code-wise adding "tags" to an article is the same as
categories with inference and querying. But trying to use the existing
category setup as it exists on something like enwiki as "tags" for
inference (or querying, to a lesser extent) seems like GIGO.
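
To make the inference problem concrete, here is a minimal Python sketch
of implicit-category computation over the small slice of the hierarchy
described above. Everything in it is an assumption for illustration: the
PARENTS table is hand-written rather than pulled from enwiki, the
function name implied_categories is hypothetical, and the is_a flag
merely stands in for the "magic word" idea (MediaWiki has no such
marker). Which links count as is-a is a judgment call made here only to
match the examples in the text.

```python
from collections import deque

# Hand-written slice of the category graph discussed above (not real
# enwiki data). Each key maps to (parent, is_a) pairs; the is_a flag is
# a hypothetical marker saying "this link is safe to infer through".
PARENTS = {
    "Romania": [("Black Sea countries", True)],
    "Black Sea countries": [("Black Sea", False)],  # related-to, not is-a
    "Black Sea": [("Seas of Russia", True), ("Landforms of Ukraine", True)],
    "Politicians from the Netherlands": [
        ("People from the Netherlands by profession", True),
        ("Politicians", True),
    ],
    "People from the Netherlands by profession": [
        ("People from the Netherlands", True),
    ],
    "People from the Netherlands": [("People", True)],
    "Politicians": [("People", True)],
}


def implied_categories(page, parents, is_a_only=False):
    """Return every category reachable from `page` by following parents.

    The `seen` set makes the walk terminate on cyclic hierarchies.
    With is_a_only=True, links not flagged is-a are not followed.
    """
    seen = set()
    queue = deque([page])
    while queue:
        node = queue.popleft()
        for parent, is_a in parents.get(node, []):
            if is_a_only and not is_a:
                continue
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    seen.discard(page)  # a cycle may lead back to the starting page
    return seen


if __name__ == "__main__":
    print(sorted(implied_categories("Romania", PARENTS)))
    print(sorted(implied_categories("Romania", PARENTS, is_a_only=True)))
```

With the naive walk, Romania lands in <Seas of Russia> and <Landforms of
Ukraine>; restricting the walk to is-a links leaves it in <Black Sea
countries> only, while <Politicians from the Netherlands> still reaches
<People> either way. Handling cycles takes one set, which is the easy
part; deciding which links deserve the flag is the real work.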

> * Readers can search, querying across categories regardless of whether
>   they're implicit or explicit
>   * -> A search for the intersection of <People from the Netherlands>
>     with <Politicians> will effectively return results for <Politicians
>     from the Netherlands> (and the user doesn't need to know or care
>     that this is an extant or non-extant category)

A person who is originally from the Netherlands but moved to Germany and
became a politician there would be in <People from the Netherlands> and
<Politicians>, but maybe should not be in <Politicians from the
Netherlands> depending on how exactly you define that category.

-- 
Brad Jorsch
Software Engineer
Wikimedia Foundation