Hi, I think I now roughly understand the constructs and the underlying concept.

At Thu, 22 Sep 2011 12:35:56 -0400, Tom Lane <t...@sss.pgh.pa.us> wrote in <23159.1316709...@sss.pgh.pa.us>
tgl> Sure, if the "increment the top byte" strategy proves to not accomplish
tgl> that effectively.  But I'd prefer not to design a complex strategy until
tgl> it's been proven that a simpler one doesn't work.

OK, I vaguely understand that thought, but I have not grasped how you measure the complexity.

The current make_greater_string increments the last byte of the bare byte sequence, checks the whole byte sequence for validity in the database encoding, and tries the next value if the result is invalid. The patch, on the other hand (although its style is broken), searches for the last CHARACTER, tries to increment that CHARACTER, and gives up if that fails.

Looking only within the scope of make_greater_string, I feel the former is the more complex because of its check-and-loop construct. Yes, `incrementing the character' has complexity of its own, but that complexity is encapsulated in a single function.

From the performance perspective, for a small but not negligible set of about 100 code points (counting Japanese characters only), the current implementation always slips 64 times (0xff - 0xbf, for UTF8), checks the WHOLE pattern string on every slip, and eventually gives up. The check-and-retry loop cannot help in these cases. Even when it lands successfully, it still checks the whole pattern string at least once.

The algorithm in the patch, in contrast, scans the whole pattern string to find the last character, but it never slips anywhere in the code space and gives up only at the points where the character length would change. (Of course it is possible to handle those points too, but that is `too expensive' for the gain, to me :). With this method, neither checking the whole string nor checking the character after the increment is essentially necessary.
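
To make the comparison above concrete, here is a rough Python model of the two strategies. It is only a sketch under stated assumptions: it is not the PostgreSQL C code, the names bytewise_greater and charwise_greater are made up for illustration, and the real make_greater_string has more details than are modeled here. The first function counts how many whole-string validity checks the byte-wise approach performs; the second increments the last character as a code point and gives up when the encoded length would change.

```python
def is_valid_utf8(b: bytes) -> bool:
    """Whole-string validity check (stands in for the encoding verifier)."""
    try:
        b.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def bytewise_greater(s: bytes):
    """Hypothetical model of the byte-wise strategy.

    Bump the last byte, validate the WHOLE string, retry on failure.
    When the byte is exhausted, drop the last character and try again
    on the shorter prefix. Returns (result or None, validation count).
    """
    buf = bytearray(s)
    checks = 0
    while buf:
        saved = buf[-1]
        while buf[-1] < 0xFF:
            buf[-1] += 1
            checks += 1
            if is_valid_utf8(bytes(buf)):
                return bytes(buf), checks
        buf[-1] = saved
        # Drop the last character: trailing continuation bytes, then the lead byte.
        while buf and (buf[-1] & 0xC0) == 0x80:
            buf.pop()
        if buf:
            buf.pop()
    return None, checks


def charwise_greater(s: bytes):
    """Hypothetical model of the character-wise strategy.

    Increment the last character as a code point; give up (return None)
    when the result is not encodable or its byte length would change.
    """
    text = s.decode("utf-8")
    if not text:
        return None
    last = text[-1]
    nxt = ord(last) + 1
    if nxt > 0x10FFFF or 0xD800 <= nxt <= 0xDFFF:
        return None  # past the code space, or a surrogate: not encodable
    new_last = chr(nxt).encode("utf-8")
    if len(new_last) != len(last.encode("utf-8")):
        return None  # character length would change
    return text[:-1].encode("utf-8") + new_last


# U+30BF (KATAKANA LETTER TA) encodes as E3 82 BF, so its last byte is 0xbf.
ta = "\u30bf".encode("utf-8")
print(bytewise_greater(ta))          # 64 failed whole-string checks, then gives up
print(charwise_greater(ta))          # U+30C0, same length, no retries
print(charwise_greater(b"\x7f"))     # None: U+0080 would need two bytes
```

In this model, a character whose last byte is 0xbf costs the byte-wise loop 64 failed validations of the whole string, while the character-wise version succeeds in one step and declines only at the length boundaries (for example U+007F or U+07FF).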

To summarise from my point of view, the two methods do not seem very different in performance for the characters that the current method can increment, and the latter seems rather more efficient and also works for most of the characters that the current method cannot increment.

The patch currently checks the validity of the result as a last resort, to guard against an inconsistency introduced by a careless change to the validity check function (in other words, a change of the character set, which is very unlikely). But that check is unnecessary in itself, provided that the consistency between the incrementer and the validator has been verified after the last modification. The four-byte memcpy could be removed by changing the rewinding method. These modifications would make the operation cheap (I think..) for UTF8, and would leave the behavior for the other encodings untouched.

As I wrote in my previous message, the modification to pg_wchar_table can be reverted until another incrementer comes along.

May I have another chance to show a new version of the patch along the lines above?

Regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center