Re: Initcap works differently with different locale providers

2025-08-06 Thread Alexander Korotkov
On Wed, Aug 6, 2025 at 2:44 PM Peter Eisentraut wrote:
>
> On 04.08.25 22:59, Jeff Davis wrote:
> > On Mon, 2025-08-04 at 12:30 +0700, Oleg Tselebrovskiy wrote:
> >> First patch just adds this warning about not relying on initcap()
> >> exact result. The second one is the same, but removes the part "what
> >> is a word" since it could be moot because we recommend writing custom
> >> functions, so understanding what is a word is not exactly needed.
> >> Still on the fence about which patch is better, though
> >
> > One more thing: we should also change it to "... to upper case (or
> > title case) and the rest to lower case...". Title case is for scripts
> > that have characters like 'Dž' (U+01C5).
> >
> > Other than that I like the second version, which un-documents the
> > specific word boundary rules. I'll admit I'm not quite sure how people
> > use this function in practice, but I expect that it's mostly convenient
> > (or lazy) display.
>
> It's meant to be an Oracle-compatible function, so maybe someone can
> check there for some details.
>
> https://docs.oracle.com/en/database/oracle/oracle-database/18/sqlrf/INITCAP.html
>
> I think we should try to document the behavior more precisely.  But we
> probably first have to agree what it should be.
>
> > Alexander, is there a reason you backported this change? I don't
> > normally backport doc improvements like this, but I'm not sure what
> > standard others use. The fact that it's on 7 branches makes me more
> > reluctant to commit these extra improvements on top. Can you take care
> > of these follow-up patches? Or, just revert the change and I can make
> > the improvements in master.
>
> Yes, I was not in favor of backpatching this, since it was not a bug
> fix.  And it turns out it was incomplete.  I think we should revert all
> the backpatches and iterate on getting the documentation the way we want
> in master.

Got it.  Sorry for the confusion.  I'll revert patches from back
branches and then continue to work on the subject for master.

--
Regards,
Alexander Korotkov
Supabase




Re: Initcap works differently with different locale providers

2025-08-06 Thread Peter Eisentraut

On 04.08.25 22:59, Jeff Davis wrote:
> On Mon, 2025-08-04 at 12:30 +0700, Oleg Tselebrovskiy wrote:
> > First patch just adds this warning about not relying on initcap()
> > exact result. The second one is the same, but removes the part "what
> > is a word" since it could be moot because we recommend writing custom
> > functions, so understanding what is a word is not exactly needed.
> > Still on the fence about which patch is better, though
>
> One more thing: we should also change it to "... to upper case (or
> title case) and the rest to lower case...". Title case is for scripts
> that have characters like 'Dž' (U+01C5).
>
> Other than that I like the second version, which un-documents the
> specific word boundary rules. I'll admit I'm not quite sure how people
> use this function in practice, but I expect that it's mostly convenient
> (or lazy) display.


It's meant to be an Oracle-compatible function, so maybe someone can 
check there for some details.


https://docs.oracle.com/en/database/oracle/oracle-database/18/sqlrf/INITCAP.html

I think we should try to document the behavior more precisely.  But we 
probably first have to agree what it should be.



> Alexander, is there a reason you backported this change? I don't
> normally backport doc improvements like this, but I'm not sure what
> standard others use. The fact that it's on 7 branches makes me more
> reluctant to commit these extra improvements on top. Can you take care
> of these follow-up patches? Or, just revert the change and I can make
> the improvements in master.


Yes, I was not in favor of backpatching this, since it was not a bug 
fix.  And it turns out it was incomplete.  I think we should revert all 
the backpatches and iterate on getting the documentation the way we want 
in master.






Re: Initcap works differently with different locale providers

2025-08-06 Thread Jeff Davis
On Wed, 2025-08-06 at 13:44 +0200, Peter Eisentraut wrote:
> It's meant to be an Oracle-compatible function, so maybe someone can 
> check there for some details.

If it's purely a compatibility function, then using ICU's sophisticated
word break iterator doesn't make sense.
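
To make the contrast concrete, here is a minimal Python sketch of the
simple rule, assuming a "word" is just a run of alphanumeric characters
and everything else is a delimiter. The name simple_initcap is
hypothetical; this is not PostgreSQL's or ICU's implementation, only an
illustration of the non-ICU style of word detection. It also uses the
title-case mapping for the first character, which is what
distinguishes 'ǅ' (U+01C5) from the upper-case 'Ǆ' (U+01C4).

```python
def simple_initcap(text: str) -> str:
    """Illustrative only: title-case the first character of each
    alphanumeric run and lower-case the rest (an assumed rule)."""
    out = []
    at_word_start = True
    for ch in text:
        if ch.isalnum():
            # str.title() on a single character applies the Unicode
            # title-case mapping, which can differ from upper case.
            out.append(ch.title() if at_word_start else ch.lower())
            at_word_start = False
        else:
            # Any non-alphanumeric character ends the current word.
            out.append(ch)
            at_word_start = True
    return "".join(out)


print(simple_initcap("hello wORLD-foo"))  # Hello World-Foo
print(simple_initcap("o'neil"))           # O'Neil
```

Under this rule an apostrophe starts a new word, so "o'neil" and
"don't" are both split at the quote; a more sophisticated word break
iterator may treat such cases differently, which is one way results
can diverge between providers.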

> https://docs.oracle.com/en/database/oracle/oracle-database/18/sqlrf/INITCAP.html
> 
> I think we should try to document the behavior more precisely.

I don't think ICU purely follows Unicode on this point (does it?), so
we'd have to point to the ICU documentation.

> But we probably first have to agree what it should be.

I still don't fully understand the use case here. I've used the
function a few times to assemble a few strings into a page heading, but
that was some time ago so I don't even clearly remember my use case. It
seems plausible there are quite a few people doing something similar,
and they'd benefit from ICU's more sophisticated approach.

But if the primary use case is for compatibility, then we might be
trying too hard to make this a provider-specific feature.

> Yes, I was not in favor of backpatching this, since it was not a bug
> fix.  And it turns out it was incomplete.  I think we should revert
> all the backpatches and iterate on getting the documentation the way
> we want in master.

+1.

Regards,
Jeff Davis





Re: Make pgoutput documentation easier to find

2025-08-06 Thread Fujii Masao
On Wed, Aug 6, 2025 at 8:36 PM Peter Eisentraut wrote:
>
> On 03.08.25 03:32, Fujii Masao wrote:
> > The current documentation for pgoutput is buried in the logical streaming
> > replication protocol section (in protocol.sgml), and there's no index entry
> > for it. This makes it hard to discover and access, for example, when trying
> > to look up the options it supports.
> >
> > I've often struggled to locate this information myself, so I'd like to
> > propose moving the pgoutput documentation to the logical decoding section
> > and adding an index entry. The attached patch does that. I think this change
> > will make it much easier for users to find the relevant details.
>
> This would move the documentation of pgoutput from "Internals" to
> "Server Programming".  So it's a question of whether this is something
> we want to advertise that people can use directly.  In the past,
> pgoutput was an implementation detail of logical replication.  But I
> gather people are using it for other things now?

I've heard that users of Debezium, a tool for change data capture, can
use pgoutput as the logical decoding plugin. I also know users, including
some of my colleagues, who use pgoutput with pg_recvlogical to capture
messages inserted via pg_logical_emit_message().

So I think making the pgoutput documentation easier to find would be
helpful for those users.
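
As a rough illustration of that second use case, the following Python
sketch only builds a pg_recvlogical command line that passes pgoutput
options; it does not connect to a server. The helper name
pgoutput_recv_args and the database, slot, and publication names are
hypothetical, the slot is assumed to already exist with pgoutput as its
plugin, and the "messages" option is assumed to be available in the
server version in use.

```python
def pgoutput_recv_args(dbname: str, slot: str, publication: str,
                       messages: bool = True) -> list[str]:
    """Build an argv list for pg_recvlogical streaming from a pgoutput slot."""
    # proto_version and publication_names are pgoutput options;
    # the values here are placeholders for illustration.
    options = {
        "proto_version": "1",
        "publication_names": publication,
    }
    if messages:
        # Ask pgoutput to also send messages written with
        # pg_logical_emit_message().
        options["messages"] = "true"

    args = ["pg_recvlogical", "--dbname", dbname, "--slot", slot,
            "--start", "--file", "-"]
    for name, value in options.items():
        args += ["--option", f"{name}={value}"]
    return args


print(" ".join(pgoutput_recv_args("postgres", "demo_slot", "demo_pub")))
```

Note that pgoutput emits the binary logical replication protocol, so
the stream written to standard output is not human-readable text.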

Regards,

-- 
Fujii Masao




Re: Make pgoutput documentation easier to find

2025-08-06 Thread Peter Eisentraut

On 03.08.25 03:32, Fujii Masao wrote:
> The current documentation for pgoutput is buried in the logical streaming
> replication protocol section (in protocol.sgml), and there's no index entry
> for it. This makes it hard to discover and access, for example, when trying
> to look up the options it supports.
>
> I've often struggled to locate this information myself, so I'd like to
> propose moving the pgoutput documentation to the logical decoding section
> and adding an index entry. The attached patch does that. I think this change
> will make it much easier for users to find the relevant details.


This would move the documentation of pgoutput from "Internals" to 
"Server Programming".  So it's a question of whether this is something 
we want to advertise that people can use directly.  In the past, 
pgoutput was an implementation detail of logical replication.  But I 
gather people are using it for other things now?  Still not clear what 
kind of guarantees we want to give about its interfaces, for example.