https://bugzilla.wikimedia.org/show_bug.cgi?id=164
--- Comment #174 from Philippe Verdy <verd...@wanadoo.fr> 2010-07-22 09:13:30 UTC ---

> If we're gonna discuss anything, let's discuss the current implementation plan: it is the only relevant plan to discuss at this point.

This is EXACTLY what I was discussing: proposing an implementation design, one that also takes into account that collations will need to evolve over time. For example, the UCS repertoire is evolving (so the DUCET table gets modified), and collation rules are being corrected for some languages in the CLDR project. Each such change will generate a new internal localeid to support it, and different keys may coexist during the transition, until the old collation rule is finally deleted (along with its old sortkeys) once the new sortkeys have been fully recomputed.

So this is clearly not "blah blah". And it is certainly relevant, given that you're considering implementing some or all of the suggestions (and you'll have to test your solutions, including their performance impact).

I propose a simple model that can be very space-efficient, and that also avoids reparsing all pages whenever a collation rule is changed, or when additional collation rules are added to a category to support multiple sort orders (notably within Chinese categories, which could support different orders).

My proposal does not even depend on the capabilities of the backend SQL server: all it has to support is a binary order on ASCII-only VARCHAR(n) columns, which store the computed and truncated sortkeys. These will be generated by the PHP front-end (using ICU) and then serialized as ASCII-only. This means that the simplest "ORDER BY" queries to retrieve correctly ordered lists of pages will work independently of the backend.

The PHP function that generates the binary-ordered sortkey (the one that will actually be stored) should also be accessible in MediaWiki as a builtin parser function taking two parameters: the locale code and the text.
For example, it could be written as {{SORTKEY:text|locale}}, where the ''locale'' parameter is optional and defaults to the {{CONTENTLANGUAGE}} of the project.

This builtin parser function could also be used to correctly sort the sortable MediaWiki tables inserted in articles, using the sortkeys it generates, provided that the generated sortkey is readable and actually serialized as ASCII-only. The function itself does not necessarily have to truncate its result, even though the result will be truncated when it is used to store sortkeys in the schema.

This parser function should even be the first thing to develop, before changing the category-page indexes, because it can be applied and tested immediately in existing categories (for example by using categorizing templates in Wiktionary), without even upgrading the SQL schema.
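
As a rough illustration of the storage model proposed above, here is a minimal sketch in Python (not PHP, and not MediaWiki code). Several things in it are assumptions made only for the example: `raw_sort_key` is merely a stand-in for a real ICU collation key (it strips accents and case and ignores the locale entirely); hex encoding is just one possible order-preserving ASCII serialization, picked for simplicity; and the table `cat_sort`, its columns and the 40-character limit are hypothetical.

```python
import sqlite3
import unicodedata

MAX_KEY_CHARS = 40  # hypothetical VARCHAR(n) limit; even, so whole bytes are kept


def raw_sort_key(text: str, locale: str) -> bytes:
    """Stand-in for an ICU collation key. NOT real locale-aware collation:
    it only removes accents and case, and the locale argument is unused."""
    decomposed = unicodedata.normalize("NFKD", text)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return base.casefold().encode("utf-8")


def ascii_sort_key(text: str, locale: str, limit: int = MAX_KEY_CHARS) -> str:
    """Serialize the binary key as lowercase hex and truncate it.
    Comparing the hex strings as plain ASCII gives the same order as
    comparing the underlying bytes, so any binary ORDER BY works."""
    return raw_sort_key(text, locale).hex()[:limit]


def demo() -> list:
    """Store truncated ASCII sortkeys and read the titles back in key order."""
    titles = ["zebra", "Émile", "apple", "Eclair", "Zoo"]
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cat_sort (title TEXT, locale TEXT, sortkey VARCHAR(40))"
    )
    conn.executemany(
        "INSERT INTO cat_sort (title, locale, sortkey) VALUES (?, ?, ?)",
        [(t, "fr", ascii_sort_key(t, "fr")) for t in titles],
    )
    # The locale column plays the role of the internal localeid: several
    # sets of keys can coexist, and each query selects the one it wants.
    rows = conn.execute(
        "SELECT title FROM cat_sort WHERE locale = ? ORDER BY sortkey", ("fr",)
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

Calling `demo()` returns the titles in sortkey order rather than in raw code-point order, using nothing more than a plain binary `ORDER BY` on an ASCII column. A real implementation would replace `raw_sort_key` with keys produced by ICU for the requested locale.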