https://bugzilla.wikimedia.org/show_bug.cgi?id=1211
--- Comment #51 from Tim Starling <tstarl...@wikimedia.org> 2010-06-02 01:35:04 UTC --- (In reply to comment #41) > First of all, this implementation of a fix would depend on bug 16012. All > schema changes to large tables are suspended until there's a reasonable way of > executing them, apparently, and categorylinks is certainly a large table. Schema changes are in no way blocked by bug 16012, that was a misconception. I did a couple of schema changes during the deployment of 1.16wmf4, and I'm happy to do more in the next deployment. There are some conditions which need to be met for easy migration, but those won't change when bug 16012 is implemented. The table can be any size, as long as it fits into the free disk space, since MySQL needs to make a copy when it does an ALTER TABLE. [...] > Thirdly, this would probably *require* bug 450 to be fixed to be reasonably > efficient. In the current setup, we'd need something like 'SELECT ... WHERE > cl_from=xxx AND cl_namespace NOT IN (' . NS_FILE . ',' . NS_CATEGORY . ') > ORDER > BY cl_sortkey', which MySQL (at least older versions like 4.0) is too stupid > to > execute efficiently: it can't use the index for ordering if there's a range > condition like IN in the WHERE clause. (I think it would be smart enough in > 5.0, or at least 5.1.) Do you mean the category page query? Presumably that would be cl_to=xxx not cl_from=xxx. You don't need bug 450 to be fixed. You can keep the current (cl_to, cl_sortkey, cl_from) index and add an additional (cl_to, cl_namespace, cl_sortkey, cl_from) index. Then we can support both UIs. When namespace breakdowns are enabled, there would be some complexity involved in making the main category page query skip subcategories and files, but I think it would be possible to do it efficiently, even in MySQL 4.0, by using ranges and unions. For instance, we could have a union with the first subquery having a condition like (cl_sortkey > $offset AND cl_namespace < NS_CATEGORY), and the second subquery having a condition like (cl_namespace > NS_CATEGORY). > And as for the proposal in bug 450, I see two issues. First of all, you'd > surely need one query per namespace on each category view, which could be a > couple dozen on a lot of wikis. That seems kind of excessive, if it's > avoidable. The category page view with the namespace breakdown would have to page through the results one namespace at a time, it wouldn't display, say, pages in all namespaces that start with the letter "L". As in the current scheme, the UI has to reflect the indexes. The query would have ORDER BY cl_namespace, cl_sortkey. > Second of all, the current structure of the category table reflects > the pages/subcats/files breakdown. If counts are given separately for all > namespaces, you'd need to store all the counts in the category table. This > would either require an ALTER TABLE for every namespace added or removed (not > happening), or else breaking off a new categorycount table like (cc_cat, > cc_namespace, cc_count) to store the counts and deleting the cat_pages, > cat_subcats, cat_files from the category table. Breaking namespaces out into a new table would be the only decent solution, assuming we need that feature. -- Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug. You are the assignee for the bug. _______________________________________________ Wikibugs-l mailing list Wikibugs-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikibugs-l