On Wed, 2025-07-30 at 12:21 -0500, Nathan Bossart wrote:
> Here is what I have staged for commit.
That's more clear to me. I also like that it shows that the options
work well together, because that was not obvious before.
Regards,
Jeff Davis
"? Because you currently can't do "--data-only --schema-only". So that
would make it not quite an alias.
If we go in this direction, it might be easier to just say that
--include conflicts with --schema-only and --data-only.
Regards,
Jeff Davis
On Tue, 2025-07-29 at 20:22 +0200, Álvaro Herrera wrote:
> Please move the switches themselves out of the translatable message,
> otherwise there are too many of them. For instance,
Thank you for looking, v2 attached.
Regards,
Jeff Davis
From 61b0239f17a1c7220de32699e95c6b365a
be builtin in that case, I suppose.
Another annoyance is that, if INITDB_LOCALE_PROVIDER=builtin, and
LC_CTYPE is not UTF-8-compatible, then we need to force LC_CTYPE=C.
That affects fewer things than it would with the libc provider, but it
still affects some things.
Regards,
Jeff Davis
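The rule described above (builtin provider plus a non-UTF-8 LC_CTYPE means forcing LC_CTYPE=C) can be sketched roughly as follows. This is only an illustration: the function name and the way the codeset is read from the locale name are assumptions made for the example, not how initdb actually decides.

```python
def effective_lc_ctype(provider: str, lc_ctype: str) -> str:
    """Return the LC_CTYPE to use under the rule sketched in the message.

    Hypothetical helper: the codeset is guessed from the text after the
    "." in the locale name, which is an assumption of this example only.
    """
    # "en_US.UTF-8@euro" -> "UTF-8"; a name with no "." has no codeset.
    codeset = lc_ctype.partition(".")[2].partition("@")[0]
    is_utf8 = codeset.replace("-", "").lower() == "utf8"

    # With the builtin provider, a non-UTF-8 LC_CTYPE is replaced by "C".
    if provider == "builtin" and not is_utf8:
        return "C"
    return lc_ctype
```

With these assumptions, effective_lc_ctype("builtin", "en_US.ISO-8859-1") yields "C", while a UTF-8 locale name, or any locale under the libc provider, is passed through unchanged.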
On Wed, 2025-06-18 at 10:21 -0700, Jeff Davis wrote:
> On Wed, 2025-06-18 at 10:43 -0500, Nathan Bossart wrote:
> > IIUC the current proposal is to:
> >
> > * Dump/restore stats by default.
We don't have a consensus for that, so unless a few people make an
abrupt turnar
On Thu, 2025-07-10 at 10:42 -0700, Jeff Davis wrote:
> On Wed, 2025-06-18 at 10:21 -0700, Jeff Davis wrote:
> > * reject the combination of an "only" option and a "with" option
>
> There seems to be a rough consensus on this point.
Patch attached.
lt
> behavior about statistics in pg_dump, though.
I don't see a consensus to make stats the default.
Regards,
Jeff Davis
On Wed, 2025-07-23 at 19:11 -0700, Jeff Davis wrote:
> The patch feels a bit over-engineered, but I'd like to know what you
> think. It would be great if you could test/debug the windows
> NLS-enabled paths.
Let me explain how it ended up looking over-engineered, and perhaps
On Fri, 2025-07-11 at 11:48 +1200, Thomas Munro wrote:
> On Fri, Jul 11, 2025 at 6:22 AM Jeff Davis wrote:
> > I don't have a great windows development environment, and it
> > appears CI
> > and the buildfarm don't offer great coverage either. Can I ask for
> >
ocale. The current proposal doesn't attempt that kind of
cleverness.
Comments?
Regards,
Jeff Davis
From 8ba8f74d28a64bfb006a76fbec64638f55f3660c Mon Sep 17 00:00:00 2001
From: Jeff Davis
Date: Thu, 17 Jul 2025 13:07:50 -0700
Subject: [PATCH] initdb: default to builtin C.UTF-8
Disc
much milk if we only convert ASCII correctly.
>
> But perhaps I am just being paranoid.
That's a reasonable concern, and I don't mean to dismiss it. But I
believe that problem is two orders of magnitude smaller than the
problems we have with the status quo.
Regards,
Jeff Davis
hem when either --statistics-only or
> > --no-schema is used.
Thank you.
>
> +1, pending resolution of the defaults issue.
I went ahead and committed this as it clearly needs to be fixed. We can
continue the options discussion.
Regards,
Jeff Davis
SQL standard seems to require Unicode Full Case Mapping.
Regards,
Jeff Davis
[1] https://www.postgresql.org/docs/devel/locale.html#LOCALE-PROVIDERS
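For readers unfamiliar with the term, the difference between full and simple case mapping can be illustrated with a short sketch. Python's str.upper() applies Unicode full case mapping; the simple_upper() helper below is a hypothetical, crude approximation of one-to-one mapping written only for this example, and it is not PostgreSQL code.

```python
def simple_upper(s: str) -> str:
    """Crude approximation of simple (one-to-one) uppercasing.

    Hypothetical helper: characters whose full uppercase mapping expands
    to more than one character are left unchanged. Real simple case
    mapping differs for a few characters; this is only an illustration.
    """
    return "".join(ch.upper() if len(ch.upper()) == 1 else ch for ch in s)


# Full case mapping may change the string length: "ß" becomes "SS".
full = "straße".upper()

# A one-to-one mapping cannot expand "ß", so it is left as-is here.
simple = simple_upper("straße")
```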
elease of the provider, it seems less likely to cause a problem
for equality searches, and therefore carries a lower risk for PKs. The
downside is that the keys will be larger and there are still some
risks, including bugs in the implementation (which is not just a
theoretical concern).
Othe
7b25c86f).
The revert seems to be related to pgport_shlib. At least for my current
work, I'm focused on removing setlocale() dependencies in the backend,
and a PG_C_LOCALE should work fine there.
Regards,
Jeff Davis
On Thu, 2025-07-10 at 11:53 +1200, Thomas Munro wrote:
> On Thu, Jul 10, 2025 at 10:52 AM Jeff Davis
> wrote:
> > The first problem -- how to affect the encoding of strings returned
> > by
> > strerror() on windows -- may be solvable as well. It looks like
> > LC_ME
o-statistics and reject --statistics.
Other options are mostly the same between them, so I'm not sure it's a
good idea for them to diverge.
Regards,
Jeff Davis
On Wed, 2025-06-18 at 10:21 -0700, Jeff Davis wrote:
> * reject the combination of an "only" option and a "with" option
There seems to be a rough consensus on this point. Should we move ahead
with this small change and see if we can get consensus to go further?
Regards,
Jeff Davis
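A minimal sketch of what rejecting the combination of an "only" option and a "with" option could look like. The option spellings are the ones mentioned in this thread; the function name, the error text, and the list-based argument handling are assumptions for illustration and do not reflect pg_dump's actual option parsing.

```python
# Option spellings taken from the thread; everything else is hypothetical.
ONLY_OPTIONS = {"--data-only", "--schema-only", "--statistics-only"}
WITH_OPTIONS = {"--with-data", "--with-schema", "--with-statistics"}


def check_options(args):
    """Raise ValueError if an "only" option is combined with a "with" option."""
    only = [a for a in args if a in ONLY_OPTIONS]
    with_ = [a for a in args if a in WITH_OPTIONS]
    if only and with_:
        raise ValueError(
            f"options {only[0]} and {with_[0]} cannot be used together"
        )
```

Under this sketch, check_options(["--schema-only", "--with-statistics"]) raises an error, while either option on its own is accepted.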
On Mon, 2025-07-07 at 17:56 -0700, Jeff Davis wrote:
> I looked into this a bit, and if I understand correctly, the only
> problem is with strerror() and strerror_r(), which depend on
> LC_MESSAGES for the language but LC_CTYPE to find the right encoding.
...
> Windows would be a dif
I was trying to exercise the function IsoLocaleName(), which is
surrounded by:
#if defined(WIN32) && defined(LC_MESSAGES)
but, at least in CI, that combination never seems to be true, which
surprised me. What platforms exercise this code path?
Regards,
Jeff Davis
On Tue, 2025-07-01 at 08:06 -0700, Jeff Davis wrote:
> Attached rebased v3.
And here's v4.
I changed the global variable to only hold the LC_CTYPE (not
LC_COLLATE), because windows doesn't support a _locale_t that
represents multiple categories with different locales.
This pa
On Wed, 2025-06-11 at 12:15 -0700, Jeff Davis wrote:
> > v1-0008-Set-process-LC_COLLATE-C-and-LC_CTYPE-C.patch
> >
> > As I mentioned earlier in the thread, I don't think we can do this
> > for
> > LC_CTYPE, because otherwise system error messages would not
On Wed, 2025-06-11 at 12:15 -0700, Jeff Davis wrote:
> I changed this to a global_libc_locale that includes both LC_COLLATE
> and LC_CTYPE (from datcollate and datctype), in case an extension is
> relying on strcoll for some reason.
..
> This patch series, at least so far, is desi
d be confusing, but maybe it's fine.
Regards,
Jeff Davis
one.
Regards,
Jeff Davis
s from pg_locale.h but instead put
> them in the .c files as needed, and explain why this is possible or
> suitable now.
It goes with v16-0003, so I will hold this back for now as well.
Regards,
Jeff Davis
pen-source Unicode normalization? If so, that would be very
cool.
The reason I'm asking is because, if there are multiple open source
implementations, we should either have the best one, or just borrow
another one as long as it has a suitable license (perhaps translating
to C as necessary).
Regards,
Jeff Davis
ize _or_ use form-
> insensitive string comparison, but nothing did that 20 years ago.
> Thus doing the form-insensitivity in the filesystem seemed best, and
> if you do that you can be form-preserving to enable the optimization
> described above.
Databases have similar concerns as a filesystem in this respect.
Regards,
Jeff Davis
ities
for optimization as well, such as:
* reducing the need for palloc and extra buffers, perhaps by using
  buffers on the stack for small strings
* operating more directly on UTF-8 data rather than decoding and
  re-encoding the entire string
Regards,
Jeff Davis
>
> Works for me.
Sounds good. We can document compatibility notes around this point.
If normalization becomes important, we can take the time to work out
the performance implications more carefully, and potentially introduce
an NCASEFOLD() if needed.
Regards,
Jeff Davis
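To make the normalization concern concrete, here is a small sketch of how case folding alone can fail to match canonically equivalent strings. The ncasefold() helper is hypothetical: it follows the shape of Unicode's canonical caseless matching (NFD, case fold, NFD) and is not a description of how CASEFOLD() or any future NCASEFOLD() in PostgreSQL would behave.

```python
import unicodedata


def ncasefold(s: str) -> str:
    """Hypothetical normalizing case fold: NFD, then case fold, then NFD."""
    decomposed = unicodedata.normalize("NFD", s)
    return unicodedata.normalize("NFD", decomposed.casefold())


precomposed = "\u00c9"   # "É" as a single code point
decomposed = "e\u0301"   # "e" followed by a combining acute accent

# Case folding alone leaves the two forms different.
plain_equal = precomposed.casefold() == decomposed.casefold()

# Normalizing around the fold makes them compare equal.
normalized_equal = ncasefold(precomposed) == ncasefold(decomposed)
```

The extra normalization passes are where the performance cost mentioned above would come from.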
type.
> I guess I don't feel strongly about it either
> way.
Are you a user of citext? I'm genuinely interested in the use cases,
and whether the separate-data-type approach has merits that are missing
in the other approaches.
Regards,
Jeff Davis
he entry for EXCLUDE? I also merged your wording with
some similar wording from the entry about UNIQUE. Attached.
Regards,
Jeff Davis
From 0988ec1bac79055899fb555ac0c0441333888c83 Mon Sep 17 00:00:00 2001
From: "Paul A. Jungwirth"
Date: Tue, 17 Jun 2025 20:48:56 -0700
Subject:
R(), so that
sounds like a good idea. I'd be interested to hear from users of
citext.
Regards,
Jeff Davis
ot sure whether we'd want to standardize one or both of
those functions.
And if you think there's likely to be a collision with the standard
that's hard to anticipate and fix now, then we should consider
reverting CASEFOLD() for 18 and wait for more progress on the
standardization. W
tisfy Robert's concern about
the --help output. But Robert also wants stats off by default for
pg_dump and on by default for pg_restore, which I think means we need
both --with-statistics and --no-statistics anyway. Robert, comments?
Regards,
Jeff Davis
override that and I'm not sure we have one right now.
Regards,
Jeff Davis
On Thu, 2025-06-12 at 08:58 -0700, Jeff Davis wrote:
> On Thu, 2025-06-12 at 09:52 -0500, Nathan Bossart wrote:
> > If the idea is to remove all options for default behavior, we'd be
> > removing
> > --no-statistics, --with-data, and --with-schema at this point.
>
>
folding would also
want normalization, but it's hard to weigh that against the performance
cost. It might not matter outside of a few edge cases, though I'm not
sure exactly how many.
Regards,
Jeff Davis
but the "--x-only" options
also put us in a tough spot.
If --data-only had always been spelled "--no-schema" (or
"--without-data" or whatever), and --schema-only had always been spelled
"--no-data", then I think it would be a lot easier to add statistics into
the mix.
Regards,
Jeff Davis
On Mon, 2025-06-16 at 16:09 -0500, Nathan Bossart wrote:
> So perhaps there's not as strong of a
> consensus as we thought. Maybe we should ask for any new/updated
> votes.
Does it make any sense to be off by default in 18 and on in some later
release?
Regards,
Jeff Davis
Fixed.
Regards,
Jeff Davis
isible changes in the past, and
> regenerating tsvectors because of that were merely a suggestion.
Interesting, thank you for looking into the history here. It would
certainly be simpler to just make FTS fully collation-aware.
Regards,
Jeff Davis
ther options,
we don't need to worry about consistency with them, and I think we
should just use "--statistics".
Regards,
Jeff Davis
y.
To me, "last option wins" means that you don't raise an error; the
latter option simply overrides the earlier one.
Given that the pg_dump options are not order-sensitive now (unless I'm
missing something), I'm worried about the consequences of trying to
make them so now.
Regards,
Jeff Davis
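The two behaviors being contrasted can be sketched as follows, using the --statistics and --no-statistics spellings from this thread. Both functions are hypothetical illustrations of the semantics, not pg_dump code.

```python
def last_option_wins(args):
    """Order-sensitive: a later option silently overrides an earlier one."""
    dump_stats = None  # None means "use the default"
    for arg in args:
        if arg == "--statistics":
            dump_stats = True
        elif arg == "--no-statistics":
            dump_stats = False
    return dump_stats


def reject_conflicts(args):
    """Order-insensitive: contradictory options are an error."""
    if "--statistics" in args and "--no-statistics" in args:
        raise ValueError(
            "options --statistics and --no-statistics cannot be used together"
        )
    if "--statistics" in args:
        return True
    if "--no-statistics" in args:
        return False
    return None
```

With the first function, swapping the order of the two options changes the result; with the second, either order raises the same error.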
simple to start using "last option wins" behavior
now. There are probably some combinations of options where it's not
clear whether a later option is an extra constraint or will override a
previous option.
Regards,
Jeff Davis
On Thu, 2025-06-12 at 15:57 -0500, Nathan Bossart wrote:
> FWIW I don't have a tremendously strong opinion about
> --statistics-only.
Same here. I won't cast a vote on this particular issue, as long as the
functionality is available.
Regards,
Jeff Davis
rip out --statistics-only (in favor
> of
> --no-schema --no-data --with-statistics).
I'd probably keep --statistics-only.
Regards,
Jeff Davis
On Thu, 2025-06-12 at 10:18 -0400, Robert Haas wrote:
> Am I too late to propose ripping this out?
As long as we keep the functionality, I'm fine changing the
options/names around at this point.
Regards,
Jeff Davis
ndexes,
which are in SECTION_POST_DATA).
Regards,
Jeff Davis
On Fri, 2025-02-07 at 11:19 -0800, Jeff Davis wrote:
>
> Attached v15. Just a rebase.
Attached v16.
> * commit this on the grounds that it's a desirable code improvement
> and
> the worst-case regression isn't a major concern; or
I plan to commit this soon after bra
NCTION
statements that come from other places (e.g. direct from applications,
or migration scripts, or extension scripts).
>
Regards,
Jeff Davis
ger of accidentally depending on that setting. Can the encoding be
controlled with LC_MESSAGES instead of LC_CTYPE?
Do you have an example of how things can go wrong?
> For the LC_COLLATE settings, I think we could just
> do the setting in main(), where the other non-database-speci
We could try to create a GUC to control this behavior, but
behavior-changing GUCs don't have a great history, and it would probably
last quite some time before we could really turn off libc for good.
There would be similar challenges for downcase_identifier() and maybe
pg_strcasecmp().
Regards,
Jeff Davis
o.
I guess "CTYPE" works, but it's too technical and feels libc-specific.
Regards,
Jeff Davis
we need is the right encoding, do
we need a proper locale?
Regards,
Jeff Davis
On Fri, 2025-06-06 at 15:47 -0700, Jeff Davis wrote:
> > > * Force the environment variables LC_COLLATE=C and LC_CTYPE=C
> > > unconditionally, and pg_perm_setlocale() them
> >
> > Currently that would be a regression for some people, because
> > when
On Thu, 2025-06-05 at 22:15 -0700, Jeff Davis wrote:
> To continue this thread, I did a symbol search in the meson build
> directory like (patterns.txt attached):
Attached a rough patch series which does what everyone seemed to agree
on:
* Change some trivial ASCII cases to use pg_
on datctype, and I could have offered a more clear reply to
the user.
Regards,
Jeff Davis
/ comments. Another caller is
get_iso_localename().
There are also a couple false positives where mbstowcs_l/wcstombs_l are
emulated with uselocale() and mbstowcs/wcstombs. In that case, it's not
actually sensitive to the global setting.
---
copyfromparse.c - the input is
, then ignore LC_COLLATE/LC_CTYPE and emit a
WARNING, rather than trying to set it based on LOCALE and getting an
error.
Regards,
Jeff Davis
[1]
https://www.postgresql.org/message-id/cd3517c7-ddb8-454e-9dd5-70e3d84ff6a2%40eisentraut.org
From fea7ab4f0495330fae56f069520de374d75ae0b8 Mon Sep 17
On Tue, 2025-06-03 at 20:22 -0700, Jeff Davis wrote:
> EQUALITY marker: indicates that the function or index AM depends on
> CollOid for the equality semantics of the input expression. Examples:
> texteq(), btree AM, hash AM. (Note: EQUALITY is only important for
> non-
> determini
a strong opinion on which route to
take, but I chose the above names from existing keywords so we wouldn't
have to add any.
Regards,
Jeff Davis
ted behavior.
If we make the opposite assumption, that none are ordering-sensitive
unless we mark them so, that would allow properly-marked functions to
fail at parse time, and the rest to fail at runtime. But this
assumption doesn't work as well for recording dependencies, because
we'd miss the dependencies for UDFs that aren't properly marked.
Thoughts?
Regards,
Jeff Davis
that a UDF with collatable inputs depends on
all of the behaviors.
Regards,
Jeff Davis
ct users to create their own functions which depend on our
normalization tables, we can add a fourth marker UNICODE. Otherwise, we
can just special case the few builtin functions we have to create those
dependency entries.
Regards,
Jeff Davis
t execute any non-superuser-owned code"
would be very useful at a practical level, e.g. for pg_dump.
Regards,
Jeff Davis
e database, and we've had plenty of fixes involving
> the startup process and a different process, mostly the checkpointer.
> That's an annoying limitation.
If you have in mind some other ways to use it then I like it a lot
more. And I don't have a better idea.
Regards,
Jeff Davis
ficant performance overhead
to wrapping the function as is done for SECURITY DEFINER, so if the
function is obviously safe, it would be nice to avoid that. And it
would be another tool to help us mitigate the various related problems
we have with selecting from views, etc.
Regards,
Jeff Davis
as SECURITY DEFINER and then someone changes it
later?
Regards,
Jeff Davis
to an
"upgrade_warnings" directory sounds like a reasonable way to go.
Regards,
Jeff Davis
g infrastructure is a lot less of a
problem than other kinds of complexity, so it might be OK. But it would
be nice if there were a couple cases that would benefit rather than
one.
Regards,
Jeff Davis
ile.
Should we automatically retain files associated with warnings, or copy
them to a different location?
Regards,
Jeff Davis
itly specifies --with-statistics.
Regards,
Jeff Davis
From 5b73253f8848638f1754f4b9da82e90e8814b4b1 Mon Sep 17 00:00:00 2001
From: Jeff Davis
Date: Thu, 22 May 2025 11:03:03 -0700
Subject: [PATCH v2] Change defaults for statistics export.
Set the default behavior of pg_dump, pg_dumpall, and
low for most call sites? Which
call sites are the most interesting ones that need special attention?
Regards,
Jeff Davis
dn't want to change the test results as a part of this
commit.
Regards,
Jeff Davis
From b76cb91441e2eefe278249e23fcd703d27a85a06 Mon Sep 17 00:00:00 2001
From: Jeff Davis
Date: Thu, 22 May 2025 11:03:03 -0700
Subject: [PATCH v1] Change defaults for statistics export.
Set the default be
e
> import side here.
That's fine with me. Perhaps we should just say that pre-18 behavior
differences can be fixed up during export, and post-18 behavior
differences are fixed up during import?
Regards,
Jeff Davis
That might be fine, but it would be good
to understand where the line is between things we should reinterpret
during export vs things we should reinterpret during import.
Regards,
Jeff Davis
> passed, so I think this is a reasonable alternative to that design.
I'd have to see the patch to see whether I liked the end result. But
I'm guessing that involves a lot of non-mechanical changes in the call
sites, and also relies on test coverage for all of them.
Regards,
Jeff Davis
e recordDependencyOn() take a LOCKMODE
parameter, which would both inform the caller that a lock will be
taken, and allow the caller to do it their own way and specify NoLock
if necessary. That still results in a huge diff, but the end result
would not be any more complex than the current code.
Regards,
Jeff Davis
surprising to
me. Assuming that heavyweight locks are the right approach, the locks
need to be taken somewhere. And expecting all the callers to get it
right seems error-prone.
This is a long thread so I must be missing some problem or complication
here.
Regards,
Jeff Davis
t that
still doesn't quite capture ICU's more complex definition of word
boundaries.
Or, we could remove those unused functions for now, and figure out if
there's a reason to add them back later. They are probably adding more
confusion than anything.
Regards,
Jeff Davis
From ff
ge.
I tried that in v2-0003, but I think it ended up worse. Most
pg_wc_xyz() functions don't care if it's the default collation or not,
so there are a lot of duplicate cases.
The previous approach is still there as v2-0002.
Regards,
Jeff Davis
From 9724181f715ce3468e9342763fad
) = 'I';
 ?column? | ?column? | ?column? | ?column?
----------+----------+----------+----------
 t        | t        | f        | f
That behavior goes back a long way, so I'm not suggesting that we
change it.
Regards,
Jeff Davis
From e8a68f42f5802d138ba04043b25b7d
On Wed, 2025-04-02 at 17:58 +0530, Shlok Kyal wrote:
> I reviewed the patch and I have a comment:
Thank you and vignesh for the feedback. This patch didn't quite make it
for v18, but I will address it for the next CF.
Regards,
Jeff Davis
On Wed, 2025-03-19 at 15:17 -0700, Jeff Davis wrote:
> On Sat, 2025-03-15 at 21:37 -0400, Corey Huinker wrote:
> > > 0001 - no changes, but the longer I go the more I'm certain this
> > > is
> > > something we want to do.
>
> This replaces regclassin
to
fetch the next batch), and have a single static variable that points to
that.
Also in 0003, the "next_te" variable is a bit confusing, because it's
actually the last TocEntry, until it's advanced to point to the current
one.
Other than that, looks good to me.
Regards,
Jeff Davis
der parallelism, which might
defeat the batching work that we're trying to do.
Regards,
Jeff Davis
uld
use the same $src_dump for both restoration and comparison, but it
looks like you wanted coverage of the --create option. (Aside: why
parallel restore there? Is that just for test coverage or was there a
performance reason?)
Regards,
Jeff Davis
by
> a
> previous call). Does that sound like a strong enough check?
Again, I'd just be practical here and do the check if it feels natural,
and if not, improve the comments so that someone modifying the code
would know where to look.
Regards,
Jeff Davis
"? Isn't
that already implied by "JOIN unnest($1, $2) ... s.tablename =
u.tablename"?
Regards,
Jeff Davis
ke it in, or
waiting for beta reports, may yield some new information that could
change minds.
Mid-beta might be too long, but let's wait for the final CF to settle
and give people the chance to respond to a top-level thread?
Regards,
Jeff Davis
make the decision now for some reason?
Regards,
Jeff Davis
ore and after dumps, and if the
"before" version is 17, then it will not have the relallfrozen argument
to pg_restore_relation_stats. We might need a filtering step in
adjust_new_dumpfile?
Attached new v11j-0001
Regards,
Jeff Davis
From 154b8b5c10ec330c26ccd9006c434a7db1feef04
to unblock
your work.
Regards,
Jeff Davis
suite for me.
Are you saying that the tests don't work for you even when v2j-0003 is
applied? Or are you saying that your tests are failing on master, and
that v2j-0002 should be committed?
Regards,
Jeff Davis
From 6fc3b98dc9a2589b9943e075b492b4c31044c14e Mon Sep 17 00:00:00 2001
Fro
e can wait until beta to see what kinds of
problems people encounter.
Regards,
Jeff Davis
On Sat, 2025-03-22 at 09:39 -0700, Jeff Davis wrote:
> For some reason I'm getting a decline of about 3% in the c.sql test
> that seems to be associated with the accessor functions, even when
> inlined. I'm also not seeing as much benefit from the inlining of the
> MemoryCont
le, you get what you asked for.
> >
>
>
> They *asked for* that because they didn't have the mechanism to say
> "hold the mayo" or "everything except pickles". That's reducing their
> choice, and then blaming them for their choice.
Can we reach a decision here and move forward?
Regards,
Jeff Davis
On Tue, 2025-03-04 at 17:28 -0800, Jeff Davis wrote:
> My results (with wide tables):
>
>                      GROUP BY    EXCEPT
>   master:            2151        1732
>   entire v8 series:  2054        1740
I'm not sure what I did with the EXCEPT test,
less risky than not updating: if you don't update Unicode,
then the code points could end up in the database treated as
unassigned, and then cause a problem for future updates.
Regards,
Jeff Davis
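As an illustration of the risk described above, the following sketch lists the code points in a string that the Unicode tables in use treat as unassigned. It relies on Python's bundled Unicode database, so the result depends on the Unicode version shipped with the interpreter; the function name is hypothetical and this is not how PostgreSQL performs the check.

```python
import unicodedata


def unassigned_code_points(text: str):
    """Return code points whose general category is "Cn".

    "Cn" covers code points that are unassigned in the Unicode version
    this interpreter ships with (it also covers noncharacters).
    """
    return [
        f"U+{ord(ch):04X}"
        for ch in text
        if unicodedata.category(ch) == "Cn"
    ]
```

A string stored while a code point is still reported here could change behavior once a later Unicode version assigns that code point.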