Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases

Kyotaro HORIGUCHI Fri, 23 Sep 2011 02:17:50 -0700

Hi, I think I now roughly understand the constructs and the
underlying concept.


At Thu, 22 Sep 2011 12:35:56 -0400, Tom Lane <[email protected]> wrote in 
<[email protected]>
tgl> Sure, if the "increment the top byte" strategy proves to not accomplish
tgl> that effectively.  But I'd prefer not to design a complex strategy until
tgl> it's been proven that a simpler one doesn't work.

OK, I vaguely understand that thought, but I have not grasped
how you measure the complexity.

The current make_greater_string increments the last byte of the
bare byte sequence, checks whether the whole byte sequence is
valid in the database encoding, and tries the next byte value if
it is not. The patch, on the other hand (although its style is
mangled..), searches for the last CHARACTER, tries to increment
that CHARACTER, and gives up if that fails.

Looking only at the function make_greater_string itself, I feel
the former is more complex because of its check-and-loop
construct.

Yes, incrementing a character has its own internal complexity,
but that complexity is encapsulated within a single function.


From the performance perspective, for a small but not negligible
set of about 100 code points (counting Japanese characters only),
the current implementation always slips 64 times (0xff - 0xbf,
for UTF8), checks the WHOLE pattern string on every slip, and
eventually gives up. The check-and-retry loop cannot help in
these cases. And even when the increment succeeds, it still
checks the whole pattern string at least once.
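
To make the count of 64 concrete, here is a minimal sketch of the
byte-wise strategy. It is not the actual PostgreSQL code:
utf8_valid() is a hypothetical, structure-only stand-in for the
real encoding verifier, and bump_last_byte() is a name made up
for this illustration. What the real function does after giving
up on the last byte is outside this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the database's verifier: checks only the
 * lead/continuation byte structure of UTF-8 (overlong forms and
 * surrogates are NOT rejected here). */
static bool utf8_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t n, k;

        if (c < 0x80)                n = 1;
        else if ((c & 0xe0) == 0xc0) n = 2;
        else if ((c & 0xf0) == 0xe0) n = 3;
        else if ((c & 0xf8) == 0xf0) n = 4;
        else return false;           /* stray continuation or bad lead */

        if (i + n > len)
            return false;            /* truncated character */
        for (k = 1; k < n; k++)
            if ((s[i + k] & 0xc0) != 0x80)
                return false;        /* not a continuation byte */
        i += n;
    }
    return true;
}

/* Byte-wise strategy: bump the last byte and re-validate the WHOLE
 * string after every bump. *slips receives the number of failed
 * attempts. On failure the last byte is restored. */
static bool bump_last_byte(unsigned char *s, size_t len, int *slips)
{
    unsigned char saved;

    assert(len > 0);
    saved = s[len - 1];
    *slips = 0;

    while (s[len - 1] < 0xff) {
        s[len - 1]++;
        if (utf8_valid(s, len))
            return true;
        (*slips)++;
    }
    s[len - 1] = saved;
    return false;
}
```

For a character whose last byte is already 0xbf, every candidate
from 0xc0 to 0xff fails the continuation-byte test, so the loop
validates the whole string 64 times before giving up.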

The algorithm of the patch, by contrast, scans the whole pattern
string to find the last character, but it never slips anywhere in
the code space and gives up only at the points where the
character length would change. (Of course it is possible to
handle these points too, but that is `too expensive' for the
gain, to me :). With this method, not only checking the whole
string but also checking the character after the increment
operation is essentially unnecessary.
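
As a rough illustration of a character-level incrementer that
gives up only where the character length would change, here is a
sketch for UTF8. It is not the code from the patch: the name
utf8_increment_char() is hypothetical, it assumes the caller has
already located the last character and passes a valid sequence of
1 to 4 bytes, and it works through the code point rather than
directly on the bytes. Skipping the surrogate block is my own
addition so that the result stays valid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Increment the single UTF-8 character stored in s[0..len) in place.
 * Returns false, leaving the bytes untouched, when the next code
 * point would need a different number of bytes. */
static bool utf8_increment_char(unsigned char *s, size_t len)
{
    /* Largest code point representable in 1, 2, 3 and 4 bytes. */
    static const uint32_t max_cp[5] = {0, 0x7f, 0x7ff, 0xffff, 0x10ffff};
    /* Fixed high bits of the lead byte for each length. */
    static const unsigned char lead_bits[5] = {0, 0x00, 0xc0, 0xe0, 0xf0};
    uint32_t cp;
    size_t i;

    assert(s != NULL);
    if (len < 1 || len > 4)
        return false;

    /* Decode: payload bits of the lead byte, then 6 bits per byte. */
    cp = (len == 1) ? s[0] : (uint32_t) (s[0] & (0xff >> (len + 1)));
    for (i = 1; i < len; i++)
        cp = (cp << 6) | (uint32_t) (s[i] & 0x3f);

    if (cp >= max_cp[len])
        return false;           /* length would change: decline */

    cp++;
    if (cp == 0xd800)
        cp = 0xe000;            /* skip surrogates; still 3 bytes */

    /* Re-encode with the same length, last byte first. */
    for (i = len - 1; i > 0; i--) {
        s[i] = (unsigned char) (0x80 | (cp & 0x3f));
        cp >>= 6;
    }
    s[0] = (unsigned char) (lead_bits[len] | cp);
    return true;
}
```

With this sketch the sequence e3 81 bf (U+307F) becomes e3 82 80
(U+3080) in a single step, with no retries, while ef bf bf
(U+FFFF) is declined because the next code point needs four
bytes.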

To summarise my view: the two methods do not seem very different
in performance for the characters that the current method can
increment (the `incrementable's), while the latter seems rather
more efficient and also works for most of the `unincrementable's.


The patch currently checks the validity of the result as a last
resort, to guard against an inconsistency caused by a careless
change to the validity check function (in other words, a change
to the character set, which is very unlikely). But the check
itself is unnecessary if the consistency between the incrementer
and the validator has been verified after the last modification.
The four-byte memcpy could also be removed by changing the
rewinding method. These modifications would make the operation
cheap (I think..) for UTF8, and would leave the behavior for the
other encodings untouched.
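
One way to verify that consistency offline, instead of at run
time, is an exhaustive test over the character space. The sketch
below shows the idea only: all names are hypothetical, and the
incrementer and validator are toy single-byte (7-bit ASCII)
stand-ins, not the real functions for any database encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef bool (*char_incrementer) (unsigned char *s, size_t len);
typedef bool (*char_validator) (const unsigned char *s, size_t len);

/* Toy validator: a character is one byte in the range 0x00..0x7f. */
static bool ascii_valid(const unsigned char *s, size_t len)
{
    return len == 1 && s[0] < 0x80;
}

/* Toy incrementer that agrees with ascii_valid(). */
static bool ascii_increment(unsigned char *s, size_t len)
{
    if (len != 1 || s[0] >= 0x7f)
        return false;
    s[0]++;
    return true;
}

/* Deliberately careless incrementer: it does not know the upper
 * bound of the character set. */
static bool sloppy_increment(unsigned char *s, size_t len)
{
    if (len != 1 || s[0] == 0xff)
        return false;
    s[0]++;
    return true;
}

/* Feed every valid single-byte character to the incrementer and
 * count the cases where it reports success but the validator
 * rejects the result. Zero means the pair is consistent. */
static int count_inconsistencies(char_incrementer inc, char_validator valid)
{
    int bad = 0;
    int b;

    assert(inc != NULL && valid != NULL);
    for (b = 0; b <= 0xff; b++) {
        unsigned char c = (unsigned char) b;

        if (!valid(&c, 1))
            continue;           /* not a valid starting character */
        if (inc(&c, 1) && !valid(&c, 1))
            bad++;
    }
    return bad;
}
```

Here count_inconsistencies(ascii_increment, ascii_valid) returns
0, while the careless incrementer is caught once, at 0x7f. If
such a test passes after every change to either function, the
run-time check of the result adds nothing.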

As I wrote in my previous message, the modification to
pg_wchar_table can be reverted until another incrementer comes
along.

May I have another chance to show another version of the patch
along the lines above?


Regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

