Re: [HACKERS] MD5 aggregate

2013-06-27 Thread Dean Rasheed
On 27 June 2013 17:47, Peter Eisentraut wrote: > On 6/27/13 4:19 AM, Dean Rasheed wrote: >> I'd say there are clearly people who want it, and the nature of some >> of those answers suggests to me that we ought to have a better answer >> in core. > > It's not clear what these people wanted this fun

Re: [HACKERS] MD5 aggregate

2013-06-27 Thread Peter Eisentraut
On 6/27/13 4:19 AM, Dean Rasheed wrote: > I'd say there are clearly people who want it, and the nature of some > of those answers suggests to me that we ought to have a better answer > in core. It's not clear what these people wanted this functionality for. They all wanted to analyze a table to c

Re: [HACKERS] MD5 aggregate

2013-06-27 Thread Robert Haas
On Thu, Jun 27, 2013 at 7:29 AM, Marko Kreen wrote: > On Thu, Jun 27, 2013 at 11:28 AM, Dean Rasheed > wrote: >> On 26 June 2013 21:46, Peter Eisentraut wrote: >>> On 6/26/13 4:04 PM, Dean Rasheed wrote: A quick google search reveals several people asking for something like this, and

Re: [HACKERS] MD5 aggregate

2013-06-27 Thread Marko Kreen
On Thu, Jun 27, 2013 at 11:28 AM, Dean Rasheed wrote: > On 26 June 2013 21:46, Peter Eisentraut wrote: >> On 6/26/13 4:04 PM, Dean Rasheed wrote: >>> A quick google search reveals several people asking for something like >>> this, and people recommending md5(string_agg(...)) or >>> md5(string_agg

Re: [HACKERS] MD5 aggregate

2013-06-27 Thread Dean Rasheed
On 26 June 2013 21:46, Peter Eisentraut wrote: > On 6/26/13 4:04 PM, Dean Rasheed wrote: >> A quick google search reveals several people asking for something like >> this, and people recommending md5(string_agg(...)) or >> md5(string_agg(md5(...))) based solutions, which are doomed to failure >> o

Re: [HACKERS] MD5 aggregate

2013-06-27 Thread Dean Rasheed
On 26 June 2013 22:48, Noah Misch wrote: > On Wed, Jun 26, 2013 at 09:04:34PM +0100, Dean Rasheed wrote: >> On 26 June 2013 19:32, Noah Misch wrote: >> > On Mon, Jun 17, 2013 at 11:34:52AM +0100, Dean Rasheed wrote: > >> > md5_agg() is well-defined and not cryptographically novel, and your use >

Re: [HACKERS] MD5 aggregate

2013-06-26 Thread Noah Misch
On Wed, Jun 26, 2013 at 09:04:34PM +0100, Dean Rasheed wrote: > On 26 June 2013 19:32, Noah Misch wrote: > > On Mon, Jun 17, 2013 at 11:34:52AM +0100, Dean Rasheed wrote: > > md5_agg() is well-defined and not cryptographically novel, and your use case > > is credible. However, not every useful-s

Re: [HACKERS] MD5 aggregate

2013-06-26 Thread Peter Eisentraut
On 6/26/13 4:04 PM, Dean Rasheed wrote: > A quick google search reveals several people asking for something like > this, and people recommending md5(string_agg(...)) or > md5(string_agg(md5(...))) based solutions, which are doomed to failure > on larger tables. The thread discussed several other o

Re: [HACKERS] MD5 aggregate

2013-06-26 Thread Dean Rasheed
On 26 June 2013 19:32, Noah Misch wrote: > On Mon, Jun 17, 2013 at 11:34:52AM +0100, Dean Rasheed wrote: >> I've been playing around with the idea of an aggregate that computes >> the sum of the md5 hashes of each of its inputs, which I've called >> md5_total() for now, although I'm not particular

Re: [HACKERS] MD5 aggregate

2013-06-26 Thread Noah Misch
On Mon, Jun 17, 2013 at 11:34:52AM +0100, Dean Rasheed wrote: > I've been playing around with the idea of an aggregate that computes > the sum of the md5 hashes of each of its inputs, which I've called > md5_total() for now, although I'm not particularly wedded to that > name. Comparing it with md5
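
For readers following the thread, the "sum of the md5 hashes of each of its inputs" idea quoted above can be sketched as follows. This is an illustration only, not the code from the patch: the Python function name `md5_total`, the 128-bit wrap-around, and the sample rows are all assumptions made for the example.

```python
import hashlib

# Assumption: the running total wraps at 128 bits, the width of one MD5 digest.
MOD = 1 << 128


def md5_total(rows):
    """Sum the MD5 digests of the rows, treating each digest as a 128-bit integer."""
    total = 0
    for row in rows:
        digest = hashlib.md5(row.encode("utf-8")).digest()
        total = (total + int.from_bytes(digest, "big")) % MOD
    return total


# Hypothetical table contents, one text value per row.
table_a = ["1|alice", "2|bob", "3|carol"]
table_b = ["3|carol", "1|alice", "2|bob"]  # same rows, different scan order
table_c = ["1|alice", "2|bob", "3|carl"]   # one row differs

print(md5_total(table_a) == md5_total(table_b))
print(md5_total(table_a) == md5_total(table_c))
```

Because addition is commutative, the result does not depend on the order in which rows are scanned, which is the property needed to compare two tables without sorting them first.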

Re: Review [was Re: [HACKERS] MD5 aggregate]

2013-06-23 Thread Dean Rasheed
On 21 June 2013 21:04, David Fetter wrote: > On Fri, Jun 21, 2013 at 10:48:35AM -0700, David Fetter wrote: >> On Mon, Jun 17, 2013 at 11:34:52AM +0100, Dean Rasheed wrote: >> > On 15 June 2013 10:22, Dean Rasheed wrote: >> > > There seem to be 2 separate directions that this could go, which >> >

Re: Review [was Re: [HACKERS] MD5 aggregate]

2013-06-21 Thread David Fetter
On Fri, Jun 21, 2013 at 10:48:35AM -0700, David Fetter wrote: > On Mon, Jun 17, 2013 at 11:34:52AM +0100, Dean Rasheed wrote: > > On 15 June 2013 10:22, Dean Rasheed wrote: > > > There seem to be 2 separate directions that this could go, which > > > really meet different requirements: > > > > > >

Review [was Re: [HACKERS] MD5 aggregate]

2013-06-21 Thread David Fetter
On Mon, Jun 17, 2013 at 11:34:52AM +0100, Dean Rasheed wrote: > On 15 June 2013 10:22, Dean Rasheed wrote: > > There seem to be 2 separate directions that this could go, which > > really meet different requirements: > > > > 1). Produce an unordered sum for SQL to compare 2 tables regardless of > >

Re: [HACKERS] MD5 aggregate

2013-06-17 Thread Marko Kreen
On Mon, Jun 17, 2013 at 11:34:52AM +0100, Dean Rasheed wrote: > On 15 June 2013 10:22, Dean Rasheed wrote: > > There seem to be 2 separate directions that this could go, which > > really meet different requirements: > > > > 1). Produce an unordered sum for SQL to compare 2 tables regardless of > >

Re: [HACKERS] MD5 aggregate

2013-06-17 Thread Dean Rasheed
On 15 June 2013 10:22, Dean Rasheed wrote: > There seem to be 2 separate directions that this could go, which > really meet different requirements: > > 1). Produce an unordered sum for SQL to compare 2 tables regardless of > the order in which they are scanned. A possible approach to this might >

Re: [HACKERS] MD5 aggregate

2013-06-15 Thread Dean Rasheed
On 13 June 2013 10:35, Dean Rasheed wrote: > Hi, > > Attached is a patch implementing a new aggregate function md5_agg() to > compute the aggregate MD5 sum across a number of rows. This is > something I've wished for a number of times. I think the primary use > case is to do a quick check that 2 t

Re: [HACKERS] MD5 aggregate

2013-06-14 Thread Craig Ringer
On 06/13/2013 05:35 PM, Dean Rasheed wrote: > Hi, > > Attached is a patch implementing a new aggregate function md5_agg() to > compute the aggregate MD5 sum across a number of rows. This is > something I've wished for a number of times. I think the primary use > case is to do a quick check that 2 t

Re: [HACKERS] MD5 aggregate

2013-06-14 Thread Craig Ringer
On 06/14/2013 09:40 PM, Stephen Frost wrote: > Where I'd take this is actually in a completely different direction.. > I'd like the aggregate to be able to match the results of running the > 'md5sum' unix utility on a file that's been COPY'd out. Unti

Re: [HACKERS] MD5 aggregate

2013-06-14 Thread Dean Rasheed
On 14 June 2013 16:09, Hannu Krosing wrote: > What skytools/pgq/londiste uses for comparing tables on master > and slave is query like this > > select sum(hashtext(t.*::text)) from t; > > This is non-modulo sum and does not use md5 but relies on > whatever the hashtext() du jour is :) > > So it i

Re: [HACKERS] MD5 aggregate

2013-06-14 Thread Hannu Krosing
On 06/14/2013 04:47 PM, Tom Lane wrote: > Dean Rasheed writes: >> On 14 June 2013 14:14, Tom Lane wrote: >>> Personally I'd be a bit inclined to xor the per-row md5's rather than >>> sum them, but that's a small matter. >> But this would be a much riskier thing to do with a single column, >> beca

Re: [HACKERS] MD5 aggregate

2013-06-14 Thread Andres Freund
On 2013-06-14 15:49:31 +0100, Dean Rasheed wrote: > On 14 June 2013 15:19, Stephen Frost wrote: > > * Andrew Dunstan (and...@dunslane.net) wrote: > >> I'd rather go the other way, processing the records without having > >> to process them otherwise at all. Turning things into text must slow > >> t

Re: [HACKERS] MD5 aggregate

2013-06-14 Thread Dean Rasheed
On 14 June 2013 15:19, Stephen Frost wrote: > * Andrew Dunstan (and...@dunslane.net) wrote: >> I'd rather go the other way, processing the records without having >> to process them otherwise at all. Turning things into text must slow >> things down, surely. > > That's certainly an interesting idea

Re: [HACKERS] MD5 aggregate

2013-06-14 Thread Tom Lane
Dean Rasheed writes: > On 14 June 2013 14:14, Tom Lane wrote: >> Personally I'd be a bit inclined to xor the per-row md5's rather than >> sum them, but that's a small matter. > But this would be a much riskier thing to do with a single column, > because if you updated multiple rows in the same w
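
One well-known difference between xor and sum as combining operators is that a value xored with itself cancels to zero. The snippet above is truncated, so this may not be the exact scenario being discussed; the sketch below only demonstrates that general property. The helper names and the sample rows are hypothetical.

```python
import hashlib
from functools import reduce


def row_hash(row):
    """MD5 digest of one row, as a 128-bit integer."""
    return int.from_bytes(hashlib.md5(row.encode("utf-8")).digest(), "big")


def xor_agg(rows):
    """Combine per-row hashes with xor."""
    return reduce(lambda acc, row: acc ^ row_hash(row), rows, 0)


def sum_agg(rows):
    """Combine per-row hashes with addition, wrapping at 128 bits (an assumption)."""
    return sum(row_hash(row) for row in rows) % (1 << 128)


before = ["x", "y"]
after = ["x", "y", "z", "z"]  # two identical rows were added

print(xor_agg(before) == xor_agg(after))  # the duplicate pair cancels under xor
print(sum_agg(before) == sum_agg(after))  # the sum still changes
```

With xor, any even number of identical rows contributes nothing to the result, whereas a modular sum changes unless the added hashes happen to total a multiple of the modulus.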

Re: [HACKERS] MD5 aggregate

2013-06-14 Thread Dean Rasheed
On 14 June 2013 14:14, Tom Lane wrote: > Marko Kreen writes: >> On Thu, Jun 13, 2013 at 12:35 PM, Dean Rasheed >> wrote: >>> Attached is a patch implementing a new aggregate function md5_agg() to >>> compute the aggregate MD5 sum across a number of rows. > >> It's more efficient to calculate pe

Re: [HACKERS] MD5 aggregate

2013-06-14 Thread Stephen Frost
* Andrew Dunstan (and...@dunslane.net) wrote: > I'd rather go the other way, processing the records without having > to process them otherwise at all. Turning things into text must slow > things down, surely. That's certainly an interesting idea also.. Thanks, Stephen

Re: [HACKERS] MD5 aggregate

2013-06-14 Thread Andrew Dunstan
On 06/14/2013 09:40 AM, Stephen Frost wrote: * Tom Lane (t...@sss.pgh.pa.us) wrote: Marko Kreen writes: On Thu, Jun 13, 2013 at 12:35 PM, Dean Rasheed wrote: Attached is a patch implementing a new aggregate function md5_agg() to compute the aggregate MD5 sum across a number of rows. It's m

Re: [HACKERS] MD5 aggregate

2013-06-14 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Marko Kreen writes: > > On Thu, Jun 13, 2013 at 12:35 PM, Dean Rasheed > > wrote: > >> Attached is a patch implementing a new aggregate function md5_agg() to > >> compute the aggregate MD5 sum across a number of rows. > > > It's more efficient to calcula

Re: [HACKERS] MD5 aggregate

2013-06-14 Thread Benedikt Grundmann
On Fri, Jun 14, 2013 at 2:14 PM, Tom Lane wrote: > Marko Kreen writes: > > On Thu, Jun 13, 2013 at 12:35 PM, Dean Rasheed > wrote: > >> Attached is a patch implementing a new aggregate function md5_agg() to > >> compute the aggregate MD5 sum across a number of rows. > > > It's more efficient to

Re: [HACKERS] MD5 aggregate

2013-06-14 Thread Tom Lane
Marko Kreen writes: > On Thu, Jun 13, 2013 at 12:35 PM, Dean Rasheed > wrote: >> Attached is a patch implementing a new aggregate function md5_agg() to >> compute the aggregate MD5 sum across a number of rows. > It's more efficient to calculate per-row md5, and then sum() them. > This avoids th

Re: [HACKERS] MD5 aggregate

2013-06-14 Thread Marko Kreen
On Thu, Jun 13, 2013 at 12:35 PM, Dean Rasheed wrote: > Attached is a patch implementing a new aggregate function md5_agg() to > compute the aggregate MD5 sum across a number of rows. This is > something I've wished for a number of times. I think the primary use > case is to do a quick check that

Re: [HACKERS] MD5 aggregate

2013-06-13 Thread Peter Eisentraut
On 6/13/13 5:35 AM, Dean Rasheed wrote: > Attached is a patch implementing a new aggregate function md5_agg() to > compute the aggregate MD5 sum across a number of rows. That seems somewhat useful. > In passing, I've tidied up and optimised the code in md5.c a bit --- > specifically I've removed

[HACKERS] MD5 aggregate

2013-06-13 Thread Dean Rasheed
Hi, Attached is a patch implementing a new aggregate function md5_agg() to compute the aggregate MD5 sum across a number of rows. This is something I've wished for a number of times. I think the primary use case is to do a quick check that 2 tables, possibly on different servers, contain the same