On Fri, Jan 25, 2019 at 04:33:54PM +0200, Heikki Linnakangas wrote:
> On 15/01/2019 02:52, John Naylor wrote:
> >The majority of cases are measurably faster, and the best case is at
> >least 20x faster. On the whole I'd say this patch is a performance win
> >even without further optimization. I'm marking it ready for committer.
>
> I read through the patch one more time, tweaked the comments a little bit,
> and committed. Thanks for the review!
>
> I did a little profiling of the worst case, where this is slower than the
> old approach. There's a lot of function call overhead coming from walking
> the string with pg_mblen(). That could be improved. If we inlined pg_mblen()
> into loop, it becomes much faster, and I think this code would be faster
> even in the worst case. (Except for the very worst cases, where hash table
> with the new code happens to have a collision at a different point than
> before, but that doesn't seem like a fair comparison.)
>
> I think this is good enough as it is, but if I have the time, I'm going to
> try optimizing the pg_mblen() loop, as well as similar loops e.g. in
> pg_mbstrlen(). Or if someone else wants to give that a go, feel free.

It might be valuable to just inline the UTF8 case.

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
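
For illustration only, here is a rough sketch of what "inline the UTF8 case" could mean for a loop that walks a string one character at a time. The names utf8_mblen_inline() and count_chars_utf8() are hypothetical and do not exist in the PostgreSQL tree; the lead-byte tests follow the standard UTF-8 encoding rules, and the input is assumed to be valid UTF-8. This is not the committed code, just the general shape of the idea.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a PostgreSQL function): byte length of one
 * UTF-8 character, decided from its lead byte alone. An invalid lead
 * byte is treated as length 1 so that a scan always makes progress.
 */
static inline int
utf8_mblen_inline(const unsigned char *s)
{
	if ((*s & 0x80) == 0)
		return 1;				/* 0xxxxxxx: ASCII */
	if ((*s & 0xe0) == 0xc0)
		return 2;				/* 110xxxxx */
	if ((*s & 0xf0) == 0xe0)
		return 3;				/* 1110xxxx */
	if ((*s & 0xf8) == 0xf0)
		return 4;				/* 11110xxx */
	return 1;					/* stray continuation or invalid byte */
}

/*
 * Hypothetical helper: count the characters in the first nbytes bytes
 * of a UTF-8 buffer. The length test is inlined into the loop, so no
 * function call is made per character.
 */
static size_t
count_chars_utf8(const char *str, size_t nbytes)
{
	const unsigned char *p = (const unsigned char *) str;
	size_t		i = 0;
	size_t		nchars = 0;

	assert(str != NULL);

	while (i < nbytes)
	{
		i += (size_t) utf8_mblen_inline(p + i);
		nchars++;
	}
	return nchars;
}
```

A caller would presumably check the database encoding once, take a path like this for UTF8, and fall back to the generic pg_mblen() loop for every other encoding.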