On Mon, Nov 30, 2009 at 08:56:48PM +0000, Simon Slavin wrote: > On 30 Nov 2009, at 6:49pm, Nicolas Williams wrote: > > See my previous message: it would make no sense to have a column with > > data-dependent collations. But perhaps I'm missing something. Can you > > describe the semantics of data-dependent collations? > > Suppose I am preparing a sales system. It can be used by users in > many countries, but a user in each country sees only products sold in > that country, and they see those products sorted into alphabetic order > -- however their own country sorts its alphabet. So in one table I > have > > Product ID Market Product name > ---------- ------ ------------ > 1 US Box > 2 US Bible > 3 US Barrel > 4 France BoƮte > 5 France Bible > [...] > > So in the same column I have words in three different languages. If > German had a different alphabet to English, depending on which > customer I am SELECTing for I will need to use a different collation > to sort the results. Of course, I expect this is relatively rare, and > can be perhaps be handled using VIEWs or some other trick, but I > wanted to demonstrate that it is sometimes needed.
This falls into case (b): localization. Here it'd be nice if SQLite3 allowed collations to be named via parameters, so you could compile a query once. But it does not, so you need to compile a query for each language. But that's as far as it goes: you're still specifying the collation rather than letting the data specify a collation. I.e., it'd be nice if you could compile SELECT product_name FROM products WHERE market = :country ORDER BY product_name COLLATE :lang ASC; bind in the country and language, and then execute, but as it is SQLite3 doesn't let you parametrize collations. So you have to mprintf and compile a separate query for each collation. Oh, I suppose one might want to allow a collation to be the result of another query, so that you could lookup collations by country code, or what have you SELECT product_name FROM products WHERE market = :country ORDER BY (SELECT lang FROM country_to_lang WHERE country = :country) But that's still not a collation that is derived from the data being sorted, but from other data. This: SELECT product_name FROM products WHERE market = :country ORDER BY lang_of(product_name); makes no sense! That's because nothing guarantees that lang_of(product_name) will be singular here. (Also, not only does it make no sense, it doesn't help you since SQLite3 will not compile such a query.) > >>> Note too that Unicode has codepoints for specifying the language that > >>> the subsequent text is written in. > > > > http://unicode.org/unicode/faq/languagetagging.html#2 > > http://www.unicode.org/versions/Unicode5.0.0/ch16.pdf#G17521 > > (section 16.9) > > Many thanks. I did not know this and it saves me from having to > invent my own system in something else I am doing. I do understand > why it should not normally be used and I will not over-use it. I don't think you should use Unicode language tags though. See above and previous messages. > On 30 Nov 2009, at 7:16pm, Nuno Lucas wrote: > > Note that some countries have different collations depending on the > > objective of the output. For example, dictionary order may be different > > from phone-book order and different from other general listing order. > > Arghhhhhhhh ! Yes, it all sucks. At least some language academies are now working to reduce such problems. For example, it used to be that case folding is Spanish did preserve accents in the to-upper-case direction, and Spanish used to have special sort orders for 'ch' and 'll' (which were as though a single character). Nico -- _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users