Re: Initcap works differently with different locale providers
On Wed, Aug 6, 2025 at 2:44 PM Peter Eisentraut wrote:
>
> On 04.08.25 22:59, Jeff Davis wrote:
> > On Mon, 2025-08-04 at 12:30 +0700, Oleg Tselebrovskiy wrote:
> >> First patch just adds this warning about not relying on initcap()
> >> exact result. The second one is the same, but removes the part
> >> "what is a word" since it could be moot because we recommend
> >> writing custom functions, so understanding what is a word is not
> >> exactly needed. Still on the fence about which patch is better,
> >> though
> >
> > One more thing: we should also change it to "... to upper case (or
> > title case) and the rest to lower case...". Title case is for scripts
> > that have characters like 'Dž' (U+01C5).
> >
> > Other than that I like the second version, which un-documents the
> > specific word boundary rules. I'll admit I'm not quite sure how people
> > use this function in practice, but I expect that it's mostly convenient
> > (or lazy) display.
>
> It's meant to be an Oracle-compatible function, so maybe someone can
> check there for some details.
>
> https://docs.oracle.com/en/database/oracle/oracle-database/18/sqlrf/INITCAP.html
>
> I think we should try to document the behavior more precisely. But we
> probably first have to agree what it should be.
>
> > Alexander, is there a reason you backported this change? I don't
> > normally backport doc improvements like this, but I'm not sure what
> > standard others use. The fact that it's on 7 branches makes me more
> > reluctant to commit these extra improvements on top. Can you take care
> > of these follow-up patches? Or, just revert the change and I can make
> > the improvements in master.
>
> Yes, I was not in favor of backpatching this, since it was not a bug
> fix. And it turns out it was incomplete. I think we should revert all
> the backpatches and iterate on getting the documentation the way we want
> in master.

Got it. Sorry for the confusion.
I'll revert patches from back branches and then continue to work on the subject for master.

--
Regards,
Alexander Korotkov
Supabase
Re: Initcap works differently with different locale providers
On 04.08.25 22:59, Jeff Davis wrote:
> On Mon, 2025-08-04 at 12:30 +0700, Oleg Tselebrovskiy wrote:
>> First patch just adds this warning about not relying on initcap()
>> exact result. The second one is the same, but removes the part
>> "what is a word" since it could be moot because we recommend
>> writing custom functions, so understanding what is a word is not
>> exactly needed. Still on the fence about which patch is better,
>> though
>
> One more thing: we should also change it to "... to upper case (or
> title case) and the rest to lower case...". Title case is for scripts
> that have characters like 'Dž' (U+01C5).
>
> Other than that I like the second version, which un-documents the
> specific word boundary rules. I'll admit I'm not quite sure how people
> use this function in practice, but I expect that it's mostly convenient
> (or lazy) display.

It's meant to be an Oracle-compatible function, so maybe someone can check there for some details.

https://docs.oracle.com/en/database/oracle/oracle-database/18/sqlrf/INITCAP.html

I think we should try to document the behavior more precisely. But we probably first have to agree what it should be.

> Alexander, is there a reason you backported this change? I don't
> normally backport doc improvements like this, but I'm not sure what
> standard others use. The fact that it's on 7 branches makes me more
> reluctant to commit these extra improvements on top. Can you take care
> of these follow-up patches? Or, just revert the change and I can make
> the improvements in master.

Yes, I was not in favor of backpatching this, since it was not a bug fix. And it turns out it was incomplete. I think we should revert all the backpatches and iterate on getting the documentation the way we want in master.
Re: Initcap works differently with different locale providers
On Wed, 2025-08-06 at 13:44 +0200, Peter Eisentraut wrote:
> It's meant to be an Oracle-compatible function, so maybe someone can
> check there for some details.

If it's purely a compatibility function, then using ICU's sophisticated word break iterator doesn't make sense.

> https://docs.oracle.com/en/database/oracle/oracle-database/18/sqlrf/INITCAP.html
>
> I think we should try to document the behavior more precisely.

I don't think ICU purely follows Unicode on this point (does it?), so we'd have to point to the ICU documentation.

> But we probably first have to agree what it should be.

I still don't fully understand the use case here. I've used the function a few times to assemble a few strings into a page heading, but that was some time ago so I don't even clearly remember my use case. It seems plausible there are quite a few people doing something similar, and they'd benefit from ICU's more sophisticated approach.

But if the primary use case is for compatibility, then we might be trying too hard to make this a provider-specific feature.

> Yes, I was not in favor of backpatching this, since it was not a bug
> fix. And it turns out it was incomplete. I think we should revert all
> the backpatches and iterate on getting the documentation the way we
> want in master.

+1.

Regards,
Jeff Davis
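To make the "what is a word" question in this thread concrete, here is a minimal Python sketch. It is not PostgreSQL's, Oracle's, or ICU's implementation. It assumes a simple rule in which a word is a run of alphanumeric characters, plus a hypothetical variant that keeps apostrophes inside a word, to show how the boundary rule changes the result. It also shows title case versus upper case for U+01C6. The name `initcap_sketch` and the `keep_apostrophes` parameter are illustrative assumptions only.

```python
def initcap_sketch(text: str, keep_apostrophes: bool = False) -> str:
    """Title-case the first character of each word, lower-case the rest.

    Illustration only: a "word" here is a run of alphanumeric characters
    (optionally also apostrophes that follow a word character). Real
    locale providers apply their own, different rules.
    """
    out = []
    in_word = False
    for ch in text:
        is_word_char = ch.isalnum() or (
            keep_apostrophes and in_word and ch == "'"
        )
        if is_word_char:
            # str.title() on a single character applies the Unicode
            # titlecase mapping, which differs from upper case for a
            # few characters such as U+01C6.
            out.append(ch.lower() if in_word else ch.title())
        else:
            out.append(ch)
        in_word = is_word_char
    return "".join(out)


# Two boundary rules, two different answers for the same input.
print(initcap_sketch("o'neil's cat"))                         # O'Neil'S Cat
print(initcap_sketch("o'neil's cat", keep_apostrophes=True))  # O'neil's Cat

# Title case (U+01C5) is not the same as upper case (U+01C4).
print(initcap_sketch("\u01c6ungla") == "\u01c5ungla")  # True
print("\u01c6".upper() == "\u01c4")                    # True
```

Under these assumptions, neither output is "wrong"; they simply follow different definitions of a word, which is why callers who need an exact result may be better served by their own function.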
Re: Make pgoutput documentation easier to find
On Wed, Aug 6, 2025 at 8:36 PM Peter Eisentraut wrote:
>
> On 03.08.25 03:32, Fujii Masao wrote:
> > The current documentation for pgoutput is buried in the logical streaming
> > replication protocol section (in protocol.sgml), and there's no index entry
> > for it. This makes it hard to discover and access, for example, when trying
> > to look up the options it supports.
> >
> > I've often struggled to locate this information myself, so I'd like to
> > propose moving the pgoutput documentation to the logical decoding section
> > and adding an index entry. The attached patch does that. I think this change
> > will make it much easier for users to find the relevant details.
>
> This would move the documentation of pgoutput from "Internals" to
> "Server Programming". So it's a question of whether this is something
> we want to advertise that people can use directly. In the past,
> pgoutput was an implementation detail of logical replication. But I
> gather people are using it for other things now?

I've heard that users of Debezium, a tool for change data capture, can use pgoutput as the logical decoding plugin. I also know users, including some of my colleagues, who use pgoutput with pg_recvlogical to capture messages inserted via pg_logical_emit_message().

So I think making the pgoutput documentation easier to find would be helpful for those users.

Regards,

--
Fujii Masao
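As a rough illustration of the pg_recvlogical use case mentioned above, the following Python sketch only builds the command lines; it does not run them or connect to a server. The database, slot, and publication names are hypothetical, and it assumes a server version whose pgoutput accepts the `messages` option (PostgreSQL 14 or later) and an existing publication. The helper names are illustrative, not part of any PostgreSQL tool.

```python
def create_slot_args(dbname: str, slot: str) -> list[str]:
    """argv for creating a logical slot that uses the pgoutput plugin."""
    return [
        "pg_recvlogical",
        f"--dbname={dbname}",
        f"--slot={slot}",
        "--create-slot",
        "--plugin=pgoutput",
    ]


def start_stream_args(dbname: str, slot: str, publication: str,
                      outfile: str = "-") -> list[str]:
    """argv for streaming the slot; pgoutput options go through --option."""
    return [
        "pg_recvlogical",
        f"--dbname={dbname}",
        f"--slot={slot}",
        "--start",
        f"--file={outfile}",
        # pgoutput requires proto_version and publication_names.
        "--option=proto_version=1",
        f"--option=publication_names={publication}",
        # Ask pgoutput to send logical decoding messages as well.
        "--option=messages=true",
    ]


# Hypothetical names, for illustration only.
print(" ".join(create_slot_args("appdb", "msg_slot")))
print(" ".join(start_stream_args("appdb", "msg_slot", "msg_pub")))

# A message that such a stream would then carry, emitted from SQL:
EMIT_SQL = "SELECT pg_logical_emit_message(true, 'my_prefix', 'hello');"
```

Note that pgoutput emits the binary logical replication protocol, so the captured output needs a decoder; this is part of why the stability of its interface matters to such users.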
Re: Make pgoutput documentation easier to find
On 03.08.25 03:32, Fujii Masao wrote:
> The current documentation for pgoutput is buried in the logical streaming
> replication protocol section (in protocol.sgml), and there's no index entry
> for it. This makes it hard to discover and access, for example, when trying
> to look up the options it supports.
>
> I've often struggled to locate this information myself, so I'd like to
> propose moving the pgoutput documentation to the logical decoding section
> and adding an index entry. The attached patch does that. I think this change
> will make it much easier for users to find the relevant details.

This would move the documentation of pgoutput from "Internals" to "Server Programming". So it's a question of whether this is something we want to advertise that people can use directly. In the past, pgoutput was an implementation detail of logical replication. But I gather people are using it for other things now? Still not clear what kind of guarantees we want to give about its interfaces, for example.
