Re: Signed vs. Unsigned (some)

2022-02-13 Thread Ranier Vilela
Em sex., 11 de jun. de 2021 às 23:05, Ranier Vilela 
escreveu:

> Hi,
>
> Removing legitimate warnings can it be worth it?
>
> -1 CAST can be wrong, when there is an invalid value defined
> (InvalidBucket, InvalidBlockNumber).
> I think depending on the compiler -1 CAST may be different from
> InvalidBucket or InvalidBlockNumber.
>
> pg_rewind is one special case.
> All cases of XLogSegNo (uint64) initialization are zero, but in pg_rewind
> was used -1?
> I did not find it InvalidXLogSegNo!
> Not tested.
>
> Trivial patch attached.
>
After a long time, finally a small part is accepted and fixed.
https://github.com/postgres/postgres/commit/302612a6c74fb16f26d094ff47e9c59cf412740c

regards,
Ranier Vilela


Re: Signed vs. Unsigned (some)

2021-07-02 Thread Ranier Vilela
Em sex., 2 de jul. de 2021 às 13:29, Alvaro Herrera 
escreveu:

> On 2021-Jul-02, Justin Pryzby wrote:
>
> > On Fri, Jul 02, 2021 at 12:09:23PM +0200, Peter Eisentraut wrote:
>
> > > The use of InvalidBlockNumber with vac_update_relstats() looks a bit
> fishy
> > > to me.  We are using in the same call 0 as the default for
> > > num_all_visible_pages, and we generally elsewhere also use 0 as the
> starting
> > > value for relpages, so it's not clear to me why it should be -1 or
> > > InvalidBlockNumber here.  I'd rather leave it "slightly wrong" for now
> so it
> > > can be checked again.
>
> > |commit 0e69f705cc1a3df273b38c9883fb5765991e04fe
> > |Author: Alvaro Herrera 
> > |Date:   Fri Apr 9 11:29:08 2021 -0400
> > |
> > |Set pg_class.reltuples for partitioned tables
> >
> > 3d35 also affects partitioned tables, and 0e69 appears to do the right
> thing by
> > preserving relpages=-1 during auto-analyze.
>
> I suppose the question is what is the value used for.  BlockNumber is
> typedef'd uint32, an unsigned variable, so using -1 for it is quite
> fishy.  The weird thing is that in vac_update_relstats we cast it to
> (int32) when storing it in the pg_class tuple, so that's quite fishy
> too.
>

> What we really want is for table_block_relation_estimate_size to work
> properly.  What that does is get the signed-int32 value from pg_class
> and cast it back to BlockNumber.  If that assignment gets -1 again, then
> it's all fine.  I didn't test it.
>
It seems to me that it is happening, but it is risky to make comparisons
between different types.

1)
#define InvalidBlockNumber 0x

int main()
{
unsigned int num_pages;
int rel_pages;

num_pages = -1;
rel_pages = (int) num_pages;
printf("num_pages = %u\n", num_pages);
printf("rel_pages = %d\n", rel_pages);
printf("(num_pages == InvalidBlockNumber) => %u\n", (num_pages ==
InvalidBlockNumber));
printf("(rel_pages == InvalidBlockNumber) => %u\n", (rel_pages ==
InvalidBlockNumber));
}

num_pages = 4294967295
rel_pages = -1
(num_pages == InvalidBlockNumber) => 1
(rel_pages == InvalidBlockNumber) => 1 /* 17:68: warning: comparison
between signed and unsigned integer expressions [-Wsign-compare] */

If num_pages is promoted to uint64 and rel_pages to int64:
2)
#define InvalidBlockNumber 0x

int main()
{
unsigned long int num_pages;
long int rel_pages;

num_pages = -1;
rel_pages = (int) num_pages;
printf("num_pages = %lu\n", num_pages);
printf("rel_pages = %ld\n", rel_pages);
printf("(num_pages == InvalidBlockNumber) => %u\n", (num_pages ==
InvalidBlockNumber));
printf("(rel_pages == InvalidBlockNumber) => %u\n", (rel_pages ==
InvalidBlockNumber));
}

num_pages = 18446744073709551615
rel_pages = -1
(num_pages == InvalidBlockNumber) => 0
(rel_pages == InvalidBlockNumber) => 0 /* 17:68: warning: comparison
between signed and unsigned integer expressions [-Wsign-compare] */

As Kyotaro said:
"they might be different if we forget to widen the constant
when widening the types"

regards,
Ranier Vilela


Re: Signed vs. Unsigned (some)

2021-07-02 Thread Alvaro Herrera
On 2021-Jul-02, Justin Pryzby wrote:

> On Fri, Jul 02, 2021 at 12:09:23PM +0200, Peter Eisentraut wrote:

> > The use of InvalidBlockNumber with vac_update_relstats() looks a bit fishy
> > to me.  We are using in the same call 0 as the default for
> > num_all_visible_pages, and we generally elsewhere also use 0 as the starting
> > value for relpages, so it's not clear to me why it should be -1 or
> > InvalidBlockNumber here.  I'd rather leave it "slightly wrong" for now so it
> > can be checked again.

> |commit 0e69f705cc1a3df273b38c9883fb5765991e04fe
> |Author: Alvaro Herrera 
> |Date:   Fri Apr 9 11:29:08 2021 -0400
> |
> |Set pg_class.reltuples for partitioned tables
> 
> 3d35 also affects partitioned tables, and 0e69 appears to do the right thing 
> by
> preserving relpages=-1 during auto-analyze.

I suppose the question is what is the value used for.  BlockNumber is
typedef'd uint32, an unsigned variable, so using -1 for it is quite
fishy.  The weird thing is that in vac_update_relstats we cast it to
(int32) when storing it in the pg_class tuple, so that's quite fishy
too.

What we really want is for table_block_relation_estimate_size to work
properly.  What that does is get the signed-int32 value from pg_class
and cast it back to BlockNumber.  If that assignment gets -1 again, then
it's all fine.  I didn't test it.

I think changing the vac_update_relstats call I added in 0e69f705cc1a to
InvalidBlockNumber is fine.  I didn't verify any other places.

I think storing BlockNumber values >= 2^31 in an int32 catalog column is
asking for trouble.  We'll have to fix that at some point.

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/




Re: Signed vs. Unsigned (some)

2021-07-02 Thread Justin Pryzby
On Fri, Jul 02, 2021 at 12:09:23PM +0200, Peter Eisentraut wrote:
> On 16.06.21 10:48, Peter Eisentraut wrote:
> > On 15.06.21 10:17, Kyotaro Horiguchi wrote:
> > > The definitions are not ((type) -1) but ((type) 0x) so
> > > actually they might be different if we forget to widen the constant
> > > when widening the types.  Regarding to the compiler behavior, I think
> > > we are assuming C99[1] and C99 defines that -1 is converted to
> > > Uxxx_MAX. (6.3.1.3 Singed and unsigned integers)
> > > 
> > > I'm +0.2 on it.  It might be worthwhile as a matter of style.
> > 
> > I think since we have the constants we should use them.
> 
> I have pushed the InvalidBucket changes.
> 
> The use of InvalidBlockNumber with vac_update_relstats() looks a bit fishy
> to me.  We are using in the same call 0 as the default for
> num_all_visible_pages, and we generally elsewhere also use 0 as the starting
> value for relpages, so it's not clear to me why it should be -1 or
> InvalidBlockNumber here.  I'd rather leave it "slightly wrong" for now so it
> can be checked again.

There's two relevant changes:

|commit 3d351d916b20534f973eda760cde17d96545d4c4
|Author: Tom Lane 
|Date:   Sun Aug 30 12:21:51 2020 -0400
|
|Redefine pg_class.reltuples to be -1 before the first VACUUM or ANALYZE.

|commit 0e69f705cc1a3df273b38c9883fb5765991e04fe
|Author: Alvaro Herrera 
|Date:   Fri Apr 9 11:29:08 2021 -0400
|
|Set pg_class.reltuples for partitioned tables

3d35 also affects partitioned tables, and 0e69 appears to do the right thing by
preserving relpages=-1 during auto-analyze.

Note that Alvaro's commit message and comment refer to relpages, but should
have said reltuples - comment fixed at 7ef8b52cf.

-- 
Justin




Re: Signed vs. Unsigned (some)

2021-07-02 Thread Ranier Vilela
Em sex., 2 de jul. de 2021 às 07:09, Peter Eisentraut <
peter.eisentr...@enterprisedb.com> escreveu:

> On 16.06.21 10:48, Peter Eisentraut wrote:
> > On 15.06.21 10:17, Kyotaro Horiguchi wrote:
> >> The definitions are not ((type) -1) but ((type) 0x) so
> >> actually they might be different if we forget to widen the constant
> >> when widening the types.  Regarding to the compiler behavior, I think
> >> we are assuming C99[1] and C99 defines that -1 is converted to
> >> Uxxx_MAX. (6.3.1.3 Singed and unsigned integers)
> >>
> >> I'm +0.2 on it.  It might be worthwhile as a matter of style.
> >
> > I think since we have the constants we should use them.
>
> I have pushed the InvalidBucket changes.
>
Nice. Thanks.


> The use of InvalidBlockNumber with vac_update_relstats() looks a bit
> fishy to me.  We are using in the same call 0 as the default for
> num_all_visible_pages, and we generally elsewhere also use 0 as the
> starting value for relpages, so it's not clear to me why it should be -1
> or InvalidBlockNumber here.

It seems to me that the only use in vac_update_relstats is to mark relpages
as invalid (dirty = true).


>   I'd rather leave it "slightly wrong" for
> now so it can be checked again.
>
Ideally InvalidBlockNumber should be 0.
Maybe in the long run this will be fixed.

regards,
Ranier Vilela


Re: Signed vs. Unsigned (some)

2021-07-02 Thread Peter Eisentraut

On 16.06.21 10:48, Peter Eisentraut wrote:

On 15.06.21 10:17, Kyotaro Horiguchi wrote:

The definitions are not ((type) -1) but ((type) 0x) so
actually they might be different if we forget to widen the constant
when widening the types.  Regarding to the compiler behavior, I think
we are assuming C99[1] and C99 defines that -1 is converted to
Uxxx_MAX. (6.3.1.3 Singed and unsigned integers)

I'm +0.2 on it.  It might be worthwhile as a matter of style.


I think since we have the constants we should use them.


I have pushed the InvalidBucket changes.

The use of InvalidBlockNumber with vac_update_relstats() looks a bit 
fishy to me.  We are using in the same call 0 as the default for 
num_all_visible_pages, and we generally elsewhere also use 0 as the 
starting value for relpages, so it's not clear to me why it should be -1 
or InvalidBlockNumber here.  I'd rather leave it "slightly wrong" for 
now so it can be checked again.





Re: Signed vs. Unsigned (some)

2021-06-16 Thread Ranier Vilela
Em qua., 16 de jun. de 2021 às 05:48, Peter Eisentraut <
peter.eisentr...@enterprisedb.com> escreveu:

> On 15.06.21 10:17, Kyotaro Horiguchi wrote:
> > The definitions are not ((type) -1) but ((type) 0x) so
> > actually they might be different if we forget to widen the constant
> > when widening the types.  Regarding to the compiler behavior, I think
> > we are assuming C99[1] and C99 defines that -1 is converted to
> > Uxxx_MAX. (6.3.1.3 Singed and unsigned integers)
> >
> > I'm +0.2 on it.  It might be worthwhile as a matter of style.
>
> I think since we have the constants we should use them.
>
> >> pg_rewind is one special case.
> >> All cases of XLogSegNo (uint64) initialization are zero, but in
> pg_rewind
> >> was used -1?
> >> I did not find it InvalidXLogSegNo!
> >
> > I'm not sure whether that is a thinko that the variable is signed or
> > that it is intentional to assign the maximum value.  Anyway, actually
> > there's no need for initializing the variable at all. So I don't think
> > it's worth changing the initial value. If any compiler actually
> > complains about the assignment changing it to zero seems reasonable.
> >
> >> Not tested.
>
> I think this case needs some analysis and explanation what is going on.
> I agree that the existing code looks a bit fishy, but we shouldn't just
> change it to something else without understanding what is going on.
>
Yes, sure.
I think everyone agrees that they have to understand to change something.

I am acting as a firefighter for small fires.
I believe the real contribution to Postgres would be to convince them to
change the default build flags.
Last night I tested a full build on Ubuntu, with clang 10.
Surprise, no warning, all clear, with -Wall enabled (by default).
No wonder these problems end up inside the code, no one sees them.
Everyone is happy to compile Postgres and not see any warnings.
But add -Wpedantinc and -Wextra and you'll get more trouble than rabbits in
a magician's hat.
Of course most are bogus, but they are there, and the new ones, the result
of the new code that has just been modified, will not enter.
Tom once complained that small scissors don't cut the grass.
But small defects piling up lead to big problems.

I believe Postgres will benefit enormously from enabling all the warnings
at compile time, at least the new little bugs will have some chance of not
getting into the codebase.

regards,
Ranier Vilela


Re: Signed vs. Unsigned (some)

2021-06-16 Thread Peter Eisentraut

On 15.06.21 10:17, Kyotaro Horiguchi wrote:

The definitions are not ((type) -1) but ((type) 0x) so
actually they might be different if we forget to widen the constant
when widening the types.  Regarding to the compiler behavior, I think
we are assuming C99[1] and C99 defines that -1 is converted to
Uxxx_MAX. (6.3.1.3 Singed and unsigned integers)

I'm +0.2 on it.  It might be worthwhile as a matter of style.


I think since we have the constants we should use them.


pg_rewind is one special case.
All cases of XLogSegNo (uint64) initialization are zero, but in pg_rewind
was used -1?
I did not find it InvalidXLogSegNo!


I'm not sure whether that is a thinko that the variable is signed or
that it is intentional to assign the maximum value.  Anyway, actually
there's no need for initializing the variable at all. So I don't think
it's worth changing the initial value. If any compiler actually
complains about the assignment changing it to zero seems reasonable.


Not tested.


I think this case needs some analysis and explanation what is going on. 
I agree that the existing code looks a bit fishy, but we shouldn't just 
change it to something else without understanding what is going on.





Re: Signed vs. Unsigned (some)

2021-06-15 Thread Ranier Vilela
Hi Kyotaro,

Thanks for taking a look.

Em ter., 15 de jun. de 2021 às 05:17, Kyotaro Horiguchi <
horikyota@gmail.com> escreveu:

> At Fri, 11 Jun 2021 23:05:29 -0300, Ranier Vilela 
> wrote in
> > Hi,
> >
> > Removing legitimate warnings can it be worth it?
>
> From what the warning comes from? And what is the exact message?
>
msvc 64 bits compiler (Level4)
warning C4245: '=': conversion from 'int' to 'Bucket', signed/unsigned
mismatch


> > -1 CAST can be wrong, when there is an invalid value defined
> > (InvalidBucket, InvalidBlockNumber).
> > I think depending on the compiler -1 CAST may be different from
> > InvalidBucket or InvalidBlockNumber.
>
> The definitions are not ((type) -1) but ((type) 0x) so
> actually they might be different if we forget to widen the constant
> when widening the types.  Regarding to the compiler behavior, I think
> we are assuming C99[1] and C99 defines that -1 is converted to
> Uxxx_MAX. (6.3.1.3 Singed and unsigned integers)
>
> I'm +0.2 on it.  It might be worthwhile as a matter of style.
>
I think about more than style.
This is one of the tricks that should not be used.


> > pg_rewind is one special case.
> > All cases of XLogSegNo (uint64) initialization are zero, but in pg_rewind
> > was used -1?
> > I did not find it InvalidXLogSegNo!
>
> I'm not sure whether that is a thinko that the variable is signed or
> that it is intentional to assign the maximum value.

It is a thinko.

  Anyway, actually
> there's no need for initializing the variable at all. So I don't think
> it's worth changing the initial value.

It is the case of removing the initialization then?


> If any compiler actually
> complains about the assignment changing it to zero seems reasonable.
>
Same case.
msvc 64 bits compiler (Level4)
warning C4245: '=': initialization from 'int' to 'XLogSegNo',
signed/unsigned mismatch


> > Not tested.
> >
> > Trivial patch attached.
>
> Please don't quickly update the patch responding to my comments alone.
> I might be a minority.
>
Ok.

best regards,
Ranier Vilela


Re: Signed vs. Unsigned (some)

2021-06-15 Thread Kyotaro Horiguchi
At Fri, 11 Jun 2021 23:05:29 -0300, Ranier Vilela  wrote 
in 
> Hi,
> 
> Removing legitimate warnings can it be worth it?

>From what the warning comes from? And what is the exact message?

> -1 CAST can be wrong, when there is an invalid value defined
> (InvalidBucket, InvalidBlockNumber).
> I think depending on the compiler -1 CAST may be different from
> InvalidBucket or InvalidBlockNumber.

The definitions are not ((type) -1) but ((type) 0x) so
actually they might be different if we forget to widen the constant
when widening the types.  Regarding to the compiler behavior, I think
we are assuming C99[1] and C99 defines that -1 is converted to
Uxxx_MAX. (6.3.1.3 Singed and unsigned integers)

I'm +0.2 on it.  It might be worthwhile as a matter of style.

> pg_rewind is one special case.
> All cases of XLogSegNo (uint64) initialization are zero, but in pg_rewind
> was used -1?
> I did not find it InvalidXLogSegNo!

I'm not sure whether that is a thinko that the variable is signed or
that it is intentional to assign the maximum value.  Anyway, actually
there's no need for initializing the variable at all. So I don't think
it's worth changing the initial value. If any compiler actually
complains about the assignment changing it to zero seems reasonable.

> Not tested.
> 
> Trivial patch attached.

Please don't quickly update the patch responding to my comments alone.
I might be a minority.

[1] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center