Re: Rework of collation code, extensibility

2023-02-23 Thread Jeff Davis
On Wed, 2023-02-22 at 20:49 +0100, Peter Eisentraut wrote: > > On 14.02.23 00:45, Jeff Davis wrote: > > > > Now the patches are: > > > > > > > > 0001: pg_strcoll/pg_strxfrm > > > > 0002: pg_locale_deterministic() > > > > 0003: cleanup a USE_ICU special case > > > > 0004: GUCs

Re: Rework of collation code, extensibility

2023-02-22 Thread Peter Eisentraut
On 14.02.23 00:45, Jeff Davis wrote: Now the patches are: 0001: pg_strcoll/pg_strxfrm 0002: pg_locale_deterministic() 0003: cleanup a USE_ICU special case 0004: GUCs (only for testing, not for commit) I haven't read the whole thing again, but this arrangement looks good to

Re: Rework of collation code, extensibility

2023-02-20 Thread Jeff Davis
On Mon, 2023-02-13 at 15:45 -0800, Jeff Davis wrote: > New version attached. Changes: These patches, especially 0001, have been around for a while, and they've received some review attention with no outstanding TODOs that I'm aware of. I plan to commit v10 (or something close to it) soon unless

Re: Rework of collation code, extensibility

2023-02-13 Thread Jeff Davis
New version attached. Changes: * I moved the GUC patch to the end (so you can ignore it if it's not useful for review) * I cut out the pg_locale_internal.h rearrangement (at least for now, it might seem useful after the dust settles on the other changes). * I added a separate patch for

Re: Rework of collation code, extensibility

2023-02-13 Thread Peter Eisentraut
On 01.02.23 00:33, Jeff Davis wrote: On Tue, 2023-01-31 at 11:40 +0100, Peter Eisentraut wrote: I don't know to what extent this depends on the abbreviated key GUC discussion.  Does the rest of this patch set depend on this? The overall refactoring is not dependent logically on the GUC patch.

Re: Rework of collation code, extensibility

2023-01-31 Thread Jeff Davis
On Tue, 2023-01-31 at 11:40 +0100, Peter Eisentraut wrote: > I don't know to what extent this depends on the abbreviated key GUC > discussion.  Does the rest of this patch set depend on this? The overall refactoring is not dependent logically on the GUC patch. It may require some trivial fixup

Re: Rework of collation code, extensibility

2023-01-31 Thread Peter Eisentraut
On 27.01.23 00:47, Jeff Davis wrote: I'm hoping to commit 0002 and 0003 soon-ish, maybe a week or two, please let me know if you want me to hold off. (I won't commit the GUCs unless others find them generally useful; they are included here to more easily reproduce my performance tests.) I have

Re: Rework of collation code, extensibility

2023-01-26 Thread Jeff Davis
Attached v9 and added perf numbers below. I'm hoping to commit 0002 and 0003 soon-ish, maybe a week or two, please let me know if you want me to hold off. (I won't commit the GUCs unless others find them generally useful; they are included here to more easily reproduce my performance tests.) My

Re: Rework of collation code, extensibility

2023-01-20 Thread Jeff Davis
On Fri, 2023-01-20 at 12:54 -0800, Jeff Davis wrote: > Both of these are surprising, and I haven't investigated deeply yet. It's just because autoconf defaults to -O2 and meson to -O3, at least on my machine. It turns out that, at -O2, master and the refactoring branch are even; but at -O3, both

Re: Rework of collation code, extensibility

2023-01-20 Thread Jeff Davis
On Tue, 2023-01-17 at 14:18 -0800, Peter Geoghegan wrote: > The second goal is a perfectly good enough goal on its own, and one > that I am totally supportive of. Making the code clearer is icing on > the cake. Attached v8, which is just a rebase. To reiterate: commitfest entry

Re: Rework of collation code, extensibility

2023-01-17 Thread Peter Geoghegan
On Sat, Jan 14, 2023 at 12:03 PM Jeff Davis wrote: > The first goal I had was simply that the code was really hard to > understand and work on, and refactoring was justified to improve the > situation. > > The second goal, which is somewhat dependent on the first goal, is that > we really need an

Re: Rework of collation code, extensibility

2023-01-14 Thread Jeff Davis
On Fri, 2023-01-13 at 11:57 -0800, Peter Geoghegan wrote: > You're adding a layer of indirection that's going to set things up > for > later patches that add a layer of indirection for version ICU > libraries (and even libc itself), and some of the details only make > sense in that context. This

Re: Rework of collation code, extensibility

2023-01-13 Thread Peter Geoghegan
On Wed, Jan 11, 2023 at 3:44 PM Jeff Davis wrote: > Attached trivial rebase as v6. Some review comments for this v6. Comments on 0001-*: * I think that 0002-* can be squashed with 0001-*, since there isn't any functional reason why you'd want to commit the strcoll() and strxfrm() changes

Re: Rework of collation code, extensibility

2023-01-11 Thread Jeff Davis
On Wed, 2023-01-11 at 15:08 +0100, Peter Eisentraut wrote: > I think the refactoring that you proposed in the thread "Refactor to > introduce pg_strcoll()." was on a sensible track.  Maybe we should > try > to get that done. Those should be patches 0001-0003 in this thread (now at v6), which

Re: Rework of collation code, extensibility

2023-01-11 Thread vignesh C
On Thu, 22 Dec 2022 at 11:11, Jeff Davis wrote: > > On Sat, 2022-12-17 at 19:14 -0800, Jeff Davis wrote: > > Attached is a new patch series. I think there are enough changes that > > this has become more of a "rework" of the collation code rather than > > just a refactoring. This is a

Re: Rework of collation code, extensibility

2023-01-11 Thread Peter Eisentraut
On 06.01.23 08:04, Jeff Davis wrote: The existing code is not great, in my opinion: it doesn't have clear API boundaries, the comments are insufficient, and lots of special cases need to be handled awkwardly by callers. That style is hard to beat when it comes to the raw line count; but it's

Re: Rework of collation code, extensibility

2023-01-05 Thread Jeff Davis
On Wed, 2023-01-04 at 22:46 +0100, Peter Eisentraut wrote: > It combines some refactoring that was previously posted with partial > support for multiple ICU libraries with partial support for some new > hooks.  Shouldn't those be three separate threads? Originally they felt more separate to me,

Re: Rework of collation code, extensibility

2023-01-04 Thread Peter Eisentraut
On 22.12.22 06:40, Jeff Davis wrote: On Sat, 2022-12-17 at 19:14 -0800, Jeff Davis wrote: Attached is a new patch series. I think there are enough changes that this has become more of a "rework" of the collation code rather than just a refactoring. This is a continuation of some prior

Re: Rework of collation code, extensibility

2022-12-17 Thread Ted Yu
On Sat, Dec 17, 2022 at 8:54 PM John Naylor wrote: > > > nul-terminate -> null-terminate > > NUL is a common abbreviation for the zero byte (but not for zero > pointers). See the ascii manpage. > > -- > John Naylor > EDB: http://www.enterprisedb.com > > Ah. `nul-terminated` does appear in the

Re: Rework of collation code, extensibility

2022-12-17 Thread John Naylor
On Sun, Dec 18, 2022 at 10:28 AM Ted Yu wrote: > It seems the `else` is not needed (since when the if branch is taken, we return from the func). By that same logic, this review comment is not needed, since compiler vendors don't charge license fees by the number of keywords. ;-) Joking aside,

Re: Rework of collation code, extensibility

2022-12-17 Thread Ted Yu
On Sat, Dec 17, 2022 at 7:14 PM Jeff Davis wrote: > Attached is a new patch series. I think there are enough changes that > this has become more of a "rework" of the collation code rather than > just a refactoring. This is a continuation of some prior work[1][2] in > a new thread given its new