Re: [HACKERS] Hash id in pg_stat_statements

2012-11-15 Thread Dimitri Fontaine
Peter Geoghegan writes: > should expose the hash. The need to aggregate historical statistics > just doesn't appreciably alter things here, I feel. The number of > discrete queries that an application will execute in a week just isn't > that different from the number that it will ever execute, I s

Re: [HACKERS] Hash id in pg_stat_statements

2012-11-15 Thread Peter Geoghegan
On 15 November 2012 13:10, Magnus Hagander wrote: >> Well, forgive me for pointing this out, but I did propose that the >> hash be a 64-bit value (which would have necessitated adopting >> hash_any() to produce 64-bit values), but you rejected the proposal. I >> arrived at the same probability for

Re: [HACKERS] Hash id in pg_stat_statements

2012-11-15 Thread Magnus Hagander
On Tue, Oct 2, 2012 at 8:22 PM, Peter Geoghegan wrote: > On 2 October 2012 18:16, Tom Lane wrote >> 1. Why isn't something like md5() on the reported query text an equally >> good solution for users who want a query hash? > > Because that does not uniquely identify the entry. The very first > thi

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-15 Thread Daniel Farina
On Mon, Oct 15, 2012 at 7:35 AM, Peter Geoghegan wrote: > On 3 October 2012 19:04, Tom Lane wrote: >> Daniel Farina writes: >>> Instead, I think it makes sense to assign a number -- arbitrarily, but >>> uniquely -- to the generation of a new row in pg_stat_statements, and, >>> on the flip side,

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-15 Thread Peter Geoghegan
On 3 October 2012 19:04, Tom Lane wrote: > Daniel Farina writes: >> Instead, I think it makes sense to assign a number -- arbitrarily, but >> uniquely -- to the generation of a new row in pg_stat_statements, and, >> on the flip side, whenever a row is retired its number should be >> eliminated, p

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-10 Thread Peter Geoghegan
On 3 October 2012 19:54, Peter Geoghegan wrote: > On 3 October 2012 19:04, Tom Lane wrote: >> This argument seems sensible to me. Is there any use-case where the >> proposed counter wouldn't do what people wished to do with an exposed >> hash value? > > Yes. The hash could be used to aggregate q

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-10 Thread Peter Geoghegan
On 3 October 2012 19:54, Peter Geoghegan wrote: > On 3 October 2012 19:04, Tom Lane wrote: >> This argument seems sensible to me. Is there any use-case where the >> proposed counter wouldn't do what people wished to do with an exposed >> hash value? > > Yes. The hash could be used to aggregate q

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-03 Thread Daniel Farina
On Wed, Oct 3, 2012 at 11:04 AM, Tom Lane wrote: > Daniel Farina writes: >> Instead, I think it makes sense to assign a number -- arbitrarily, but >> uniquely -- to the generation of a new row in pg_stat_statements, and, >> on the flip side, whenever a row is retired its number should be >> elimi

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-03 Thread Peter Geoghegan
On 3 October 2012 19:04, Tom Lane wrote: > Daniel Farina writes: >> Instead, I think it makes sense to assign a number -- arbitrarily, but >> uniquely -- to the generation of a new row in pg_stat_statements, and, >> on the flip side, whenever a row is retired its number should be >> eliminated, p

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-03 Thread Tom Lane
Daniel Farina writes: > Instead, I think it makes sense to assign a number -- arbitrarily, but > uniquely -- to the generation of a new row in pg_stat_statements, and, > on the flip side, whenever a row is retired its number should be > eliminated, practically, for-ever. This way re-introductions

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-02 Thread Daniel Farina
On Mon, Oct 1, 2012 at 12:57 AM, Magnus Hagander wrote: > Can we please expose the internal hash id of the statements in > pg_stat_statements? > > I know there was discussions about it earlier, and it wasn't done with > an argument of it not being stable between releases (IIRC). I think we > can l

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-02 Thread Martijn van Oosterhout
On Tue, Oct 02, 2012 at 12:58:15PM -0400, Stephen Frost wrote: > > I simply do not understand objections to the proposal. Have I missed > > something? > > It was my impression that the concern is the stability of the hash value > and ensuring that tools which operate on it don't mistakenly lump t

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-02 Thread Peter Geoghegan
On 2 October 2012 17:58, Stephen Frost wrote: > Right, and that's all I'm trying to address here- how do we provide a > value for a given query which can be relied upon by outside sources, > even in the face of a point release which changes what our internal hash > value for a given query is. I d

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-02 Thread Peter Geoghegan
On 2 October 2012 18:16, Tom Lane wrote > 1. Why isn't something like md5() on the reported query text an equally > good solution for users who want a query hash? Because that does not uniquely identify the entry. The very first thing that the docs say on search_path is "Qualified names are tedio

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-02 Thread Tom Lane
Stephen Frost writes: > * Peter Geoghegan (pe...@2ndquadrant.com) wrote: >> I simply do not understand objections to the proposal. Have I missed >> something? > It was my impression that the concern is the stability of the hash value > and ensuring that tools which operate on it don't mistakenly

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-02 Thread Stephen Frost
* Peter Geoghegan (pe...@2ndquadrant.com) wrote: > On 1 October 2012 18:05, Stephen Frost wrote: > > You're going to have to help me here, 'cause I don't see how there can > > be duplicates if we include the PGSS_FILE_HEADER as part of the hash, > > unless we're planning to keep PGSS_FILE_HEADER c

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-02 Thread Magnus Hagander
P On Oct 2, 2012 5:04 PM, "Euler Taveira" wrote: > > On 02-10-2012 10:15, Peter Geoghegan wrote: > > There are other, similar tools that exist in proprietary databases. > > They expose a hash value, which is subject to exactly the same caveats > > as our own. They explicitly encourage the type of

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-02 Thread Euler Taveira
On 02-10-2012 10:15, Peter Geoghegan wrote: > There are other, similar tools that exist in proprietary databases. > They expose a hash value, which is subject to exactly the same caveats > as our own. They explicitly encourage the type of aggregation by > third-party tools that I anticipate will ha

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-02 Thread Peter Geoghegan
On 1 October 2012 18:05, Stephen Frost wrote: > * Peter Geoghegan (pe...@2ndquadrant.com) wrote: >> That won't really help matters. There'd still be duplicate entries, >> from before and after the change, even if we make it immediately >> obvious which is which. The only reasonable solution in tha

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-01 Thread Stephen Frost
Peter, * Peter Geoghegan (pe...@2ndquadrant.com) wrote: > That won't really help matters. There'd still be duplicate entries, > from before and after the change, even if we make it immediately > obvious which is which. The only reasonable solution in that scenario > is to bump PGSS_FILE_HEADER, wh

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-01 Thread Peter Geoghegan
On 1 October 2012 17:12, Stephen Frost wrote: > Peter, all, > > * Peter Geoghegan (pe...@2ndquadrant.com) wrote: >> Well, I'll point out once again that the argument about its stability >> is invalid, because we serialise the entries to disk. If a point >> release changes the representation of the

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-01 Thread Stephen Frost
Peter, all, * Peter Geoghegan (pe...@2ndquadrant.com) wrote: > Well, I'll point out once again that the argument about its stability > is invalid, because we serialise the entries to disk. If a point > release changes the representation of the query tree such that the > hash values won't match, th

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-01 Thread Peter Geoghegan
On 1 October 2012 15:22, Magnus Hagander wrote: > On Mon, Oct 1, 2012 at 4:10 PM, Tom Lane wrote: >> Worse than that: it could change across a minor version update. These >> are internal data structures we're hashing, and we've been known to >> have to change them for bug-fix purposes. > > As Pe

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-01 Thread Euler Taveira
On 01-10-2012 11:22, Magnus Hagander wrote: > As long as we *tell* them under what conditions it might change, I > think it's perfectly fine. Particularly those who are likely to use > this functionality should certainly be capable of understanding that. > Even if we do that it is too much work fo

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-01 Thread Magnus Hagander
On Mon, Oct 1, 2012 at 4:10 PM, Tom Lane wrote: > Magnus Hagander writes: >> Can we please expose the internal hash id of the statements in >> pg_stat_statements? > >> I know there was discussions about it earlier, and it wasn't done with >> an argument of it not being stable between releases (II

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-01 Thread Tom Lane
Magnus Hagander writes: > Can we please expose the internal hash id of the statements in > pg_stat_statements? > I know there was discussions about it earlier, and it wasn't done with > an argument of it not being stable between releases (IIRC). Worse than that: it could change across a minor ve

Re: [HACKERS] Hash id in pg_stat_statements

2012-10-01 Thread Peter Geoghegan
On 1 October 2012 08:57, Magnus Hagander wrote: > I know there was discussions about it earlier, and it wasn't done with > an argument of it not being stable between releases (IIRC). I think we > can live with that drawback, assuming of course that we document this > properly. Well, I'll point ou

[HACKERS] Hash id in pg_stat_statements

2012-10-01 Thread Magnus Hagander
Can we please expose the internal hash id of the statements in pg_stat_statements? I know there was discussions about it earlier, and it wasn't done with an argument of it not being stable between releases (IIRC). I think we can live with that drawback, assuming of course that we document this pro