Re: [HACKERS] [COMMITTERS] pgsql: pageinspect: Try to fix some bugs in previous commit.

2017-02-05 Thread Ashutosh Sharma
 I think UInt32GetDatum(metad->hashm_procid) looks fine, the reason
 being 'hashm_procid' is defined as 32-bit unsigned integer but then we
 may have to define procid as int8 in SQL function.
>>>
>>> No, you're wrong.  The GetDatum you choose macro has to match the SQL
>>> type, not the type of the variable that you're passing to it.  For
>>> example, if you've got an "int" in the code and the SQL type is
>>> "int8", you have to use Int64GetDatum, not Int32GetDatum.  Otherwise,
>>> stuff breaks, because on some systems 64-bit integers are passed by
>>> reference, not by value, so the representation that Int32GetDatum
>>> produces isn't valid when interpreted by DatumGetInt64 later on.  The
>>> latter is expecting a pointer, but the former didn't produce one.
>>>
>>
>> Thank you very much for detailed information and explanation. It is
>> really very helpful and understandable. But, As per your explanation,
>> GetDatum we choose need to match the SQL type, not the type of the
>> variable used in code and I do not see any unsigned integer SQL type
>> in PostgreSQL then I am just wondering on why do we have
>> UInt32GetDatum or UInt64GetDatum macros.
>
> That's a pretty good question.  UInt64GetDatum is used in exactly one
> place (exec_stmt_getdiag) and there's really no reason why
> Int64GetDatum wouldn't be more appropriate.  UInt32GetDatum is used in
> a few more places, and some of those are used for hash values which
> are not exposed at the SQL level so they might be legitimate, but
> others like the ones in pageinspect look like fuzzy thinking that has
> only survived because it happens not to break anything.  I suppose if
> we wanted to be really rigorous about this sort of thing, we should
> change UInt32GetDatum to do something tangibly different from
> Int32GetDatum, like negate all the bits.  Then if somebody picked the
> wrong macro it would actually fail.  I'm not sure that's really the
> best place to spend our effort, though.  The moral of this episode is
> that it's important to at least get the right width.  Currently,
> getting the wrong signedness doesn't actually break anything.

Okay, understood. Thank you very much !

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pageinspect: Try to fix some bugs in previous commit.

2017-02-05 Thread Robert Haas
On Fri, Feb 3, 2017 at 9:14 PM, Ashutosh Sharma  wrote:
> On Fri, Feb 3, 2017 at 8:29 PM, Robert Haas  wrote:
>> On Fri, Feb 3, 2017 at 7:41 AM, Ashutosh Sharma  
>> wrote:
>>> I think UInt32GetDatum(metad->hashm_procid) looks fine, the reason
>>> being 'hashm_procid' is defined as 32-bit unsigned integer but then we
>>> may have to define procid as int8 in SQL function.
>>
>> No, you're wrong.  The GetDatum you choose macro has to match the SQL
>> type, not the type of the variable that you're passing to it.  For
>> example, if you've got an "int" in the code and the SQL type is
>> "int8", you have to use Int64GetDatum, not Int32GetDatum.  Otherwise,
>> stuff breaks, because on some systems 64-bit integers are passed by
>> reference, not by value, so the representation that Int32GetDatum
>> produces isn't valid when interpreted by DatumGetInt64 later on.  The
>> latter is expecting a pointer, but the former didn't produce one.
>>
>
> Thank you very much for detailed information and explanation. It is
> really very helpful and understandable. But, As per your explanation,
> GetDatum we choose need to match the SQL type, not the type of the
> variable used in code and I do not see any unsigned integer SQL type
> in PostgreSQL then I am just wondering on why do we have
> UInt32GetDatum or UInt64GetDatum macros.

That's a pretty good question.  UInt64GetDatum is used in exactly one
place (exec_stmt_getdiag) and there's really no reason why
Int64GetDatum wouldn't be more appropriate.  UInt32GetDatum is used in
a few more places, and some of those are used for hash values which
are not exposed at the SQL level so they might be legitimate, but
others like the ones in pageinspect look like fuzzy thinking that has
only survived because it happens not to break anything.  I suppose if
we wanted to be really rigorous about this sort of thing, we should
change UInt32GetDatum to do something tangibly different from
Int32GetDatum, like negate all the bits.  Then if somebody picked the
wrong macro it would actually fail.  I'm not sure that's really the
best place to spend our effort, though.  The moral of this episode is
that it's important to at least get the right width.  Currently,
getting the wrong signedness doesn't actually break anything.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pageinspect: Try to fix some bugs in previous commit.

2017-02-03 Thread Ashutosh Sharma
On Fri, Feb 3, 2017 at 8:29 PM, Robert Haas  wrote:
> On Fri, Feb 3, 2017 at 7:41 AM, Ashutosh Sharma  wrote:
>> I think UInt32GetDatum(metad->hashm_procid) looks fine, the reason
>> being 'hashm_procid' is defined as 32-bit unsigned integer but then we
>> may have to define procid as int8 in SQL function.
>
> No, you're wrong.  The GetDatum you choose macro has to match the SQL
> type, not the type of the variable that you're passing to it.  For
> example, if you've got an "int" in the code and the SQL type is
> "int8", you have to use Int64GetDatum, not Int32GetDatum.  Otherwise,
> stuff breaks, because on some systems 64-bit integers are passed by
> reference, not by value, so the representation that Int32GetDatum
> produces isn't valid when interpreted by DatumGetInt64 later on.  The
> latter is expecting a pointer, but the former didn't produce one.
>

Thank you very much for detailed information and explanation. It is
really very helpful and understandable. But, As per your explanation,
GetDatum we choose need to match the SQL type, not the type of the
variable used in code and I do not see any unsigned integer SQL type
in PostgreSQL then I am just wondering on why do we have
UInt32GetDatum or UInt64GetDatum macros.

>> Note: I am extremely sorry for wrongly choosing some of the SQL types
>> in the patch for pageinspect. I think there were few platform specific
>> things that too should have been addressed by me. Moreover, I feel
>> being the owner of this project I should have participated in this
>> discussion a bit earlier but as I was not subscribed to
>> pgsql-committers list I could not be on time.
>
> It might be a good idea to subscribe to pgsql-committers; that way you
> can follow what's getting committed, whether it is your patch or
> otherwise.  But we also should perhaps have migrated this discussion
> to pgsql-hackers.  Adjusting recipient list.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pageinspect: Try to fix some bugs in previous commit.

2017-02-03 Thread Robert Haas
On Fri, Feb 3, 2017 at 12:04 PM, Tom Lane  wrote:
> One thing to think about is what will happen if someday we want to use
> 64-bit hash codes (a day I think is not that far away).  It sounds like
> you've already chosen bigint for any output field that represents a
> hash code or a related value such as a mask ... but it wouldn't hurt
> to look through the fields with that in mind.

Yeah, I think we're fine on that score.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pageinspect: Try to fix some bugs in previous commit.

2017-02-03 Thread Tom Lane
Robert Haas  writes:
> So based on that theory, here's a patch.
> ...
> In short, this patch makes hashfuncs.c consistent about (1) using the
> next wider signed type to report unsigned values and (2) using the
> GetDatum macro that matches the SQL return type in width and
> signedness.  Objections?

I haven't actually reviewed the patch, but your description of it sounds
sane.

One thing to think about is what will happen if someday we want to use
64-bit hash codes (a day I think is not that far away).  It sounds like
you've already chosen bigint for any output field that represents a
hash code or a related value such as a mask ... but it wouldn't hurt
to look through the fields with that in mind.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pageinspect: Try to fix some bugs in previous commit.

2017-02-03 Thread Tom Lane
Robert Haas  writes:
> Hopefully that will turn the buildfarm green again, but we'll see.

It won't make the unhappy 64-bit machines happy, but I just pushed
a change that should deal with those problems.  With a little luck
we're over the hump now.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pageinspect: Try to fix some bugs in previous commit.

2017-02-03 Thread Robert Haas
On Fri, Feb 3, 2017 at 10:54 AM, Robert Haas  wrote:
>> What needs to be resolved to decide if any of this is actually sane is to
>> figure out which of these values need to be int64 on the SQL side because
>> (a) they could practically exceed the range of signed int32 and (b) it
>> would bother us to show such values as negative rather than large
>> positive.  I suspect that not all the things currently declared as int64
>> really need to be.  I also remain unhappy that we can't manage to be
>> consistent about what a BlockNumber parameter is represented as.
>
> The existing usage is mixed.  For example, gin_metapage_info() returns
> pending_head and pending_tail as bigint, and those are BlockNumber at
> the C level. But bt_metap returns fastroot as int4, and that's also a
> BlockNumber at the C level.  For my money, int8 is a better choice
> because I don't like the idea of the block number rolling over to a
> negative value for a large relation, but I probably wouldn't bother
> breaking compatibility for the existing cases where it's been done
> otherwise, because relations of at least 16TB in size are not yet very
> common.  At some point we may have to bite that bullet but maybe not
> today.

So based on that theory, here's a patch.

- In hash_page_items, the returned columns are declared as itemoffset
int2, ctid tid, and data int8 at the SQL level.  There's existing
precedent for returning itemoffset as either int2 or int4, but since
at the C level it is uint16 it seems best to stick with int4, so the
patch changes that.  The other types seem OK: data is an int8 because
it's reporting the hash code, and that's uint32 at the C level.

- In hash_bitmap_info, the returned columns are declared as
bitmapblkno int8, bitmapbit int4, and bitstatus bool.  Since a block
number is a uint32 at the C level, it seems right to use int8 at the
SQL level per the above, because of signedness.  The other two return
columns are fine also.  The BlahGetDatum macros mostly match the
types, although in the case of bitmapblkno it differs in signedness
(UInt64GetDatum rather than Int64GetDatum).

- hash_metapage_info() returns a whole bunch of columns, all of which
are unsigned quantities, except for hashm_ntuples, which is a double.
With the attached patch, all of those unsigned quantities get promoted
to the next larger signed type at the SQL level (uint32 -> int8,
uint16 -> int4); exceptionally, hashm_procid is reported as an OID,
since it is.

In short, this patch makes hashfuncs.c consistent about (1) using the
next wider signed type to report unsigned values and (2) using the
GetDatum macro that matches the SQL return type in width and
signedness.  Objections?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pageinspect-more-hash-fixes.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pageinspect: Try to fix some bugs in previous commit.

2017-02-03 Thread Robert Haas
On Fri, Feb 3, 2017 at 7:41 AM, Ashutosh Sharma  wrote:
>> I think for now selecting named fields is sufficient.
>
> +1. Attached is the patch that has this changes.

Thanks for the patch, but you only handled one of the two cases Tom
reported upthread.  Added a fix for the other one and committed the
result as 29e312bc1301061ae9f897ff39f3b230c421a5fb.

Hopefully that will turn the buildfarm green again, but we'll see.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pageinspect: Try to fix some bugs in previous commit.

2017-02-03 Thread Robert Haas
On Fri, Feb 3, 2017 at 10:28 AM, Tom Lane  wrote:
> Robert Haas  writes:
>>  On Thu, Feb 2, 2017 at 11:16 PM, Tom Lane  wrote:
>>> I just made the C code agree with what the SQL declarations for the
>>> functions say.
>
>> Doesn't look like it to me.  You changed a bunch of places that say
>> UInt32GetDatum to UInt64GetDatum, but the SQL type certainly isn't
>> unsigned.
>
> The machines don't care about that.  They do care about the width of
> the datum.  Particularly on 32-bit hardware, where one width is
> pass-by-val and the other isn't.  (Also, if your point is that you
> wish we had a uint64 SQL type, I doubt we're going there.)

I know the machines don't care about that, but I still think it'd be a
better idea to be consistent with the SQL types throughout rather than
only in places where failing to do so actually breaks something.  It's
not much of an abstraction layer if we're only kinda-sorta rigorous
about using it correctly.  If nothing else, it obscures best practice
for new patch authors.

> What needs to be resolved to decide if any of this is actually sane is to
> figure out which of these values need to be int64 on the SQL side because
> (a) they could practically exceed the range of signed int32 and (b) it
> would bother us to show such values as negative rather than large
> positive.  I suspect that not all the things currently declared as int64
> really need to be.  I also remain unhappy that we can't manage to be
> consistent about what a BlockNumber parameter is represented as.

The existing usage is mixed.  For example, gin_metapage_info() returns
pending_head and pending_tail as bigint, and those are BlockNumber at
the C level. But bt_metap returns fastroot as int4, and that's also a
BlockNumber at the C level.  For my money, int8 is a better choice
because I don't like the idea of the block number rolling over to a
negative value for a large relation, but I probably wouldn't bother
breaking compatibility for the existing cases where it's been done
otherwise, because relations of at least 16TB in size are not yet very
common.  At some point we may have to bite that bullet but maybe not
today.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pageinspect: Try to fix some bugs in previous commit.

2017-02-03 Thread Robert Haas
On Fri, Feb 3, 2017 at 7:41 AM, Ashutosh Sharma  wrote:
> I think UInt32GetDatum(metad->hashm_procid) looks fine, the reason
> being 'hashm_procid' is defined as 32-bit unsigned integer but then we
> may have to define procid as int8 in SQL function.

No, you're wrong.  The GetDatum you choose macro has to match the SQL
type, not the type of the variable that you're passing to it.  For
example, if you've got an "int" in the code and the SQL type is
"int8", you have to use Int64GetDatum, not Int32GetDatum.  Otherwise,
stuff breaks, because on some systems 64-bit integers are passed by
reference, not by value, so the representation that Int32GetDatum
produces isn't valid when interpreted by DatumGetInt64 later on.  The
latter is expecting a pointer, but the former didn't produce one.

> Note: I am extremely sorry for wrongly choosing some of the SQL types
> in the patch for pageinspect. I think there were few platform specific
> things that too should have been addressed by me. Moreover, I feel
> being the owner of this project I should have participated in this
> discussion a bit earlier but as I was not subscribed to
> pgsql-committers list I could not be on time.

It might be a good idea to subscribe to pgsql-committers; that way you
can follow what's getting committed, whether it is your patch or
otherwise.  But we also should perhaps have migrated this discussion
to pgsql-hackers.  Adjusting recipient list.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers