Re: [SQL] Finding sequential records

2008-09-29 Thread Steve Midgley

At 05:38 PM 9/26/2008, Oliveiros Cristina wrote:

Howdy, Steve.

SELECT id
FROM dummy a
NATURAL JOIN (
    SELECT fkey_id, name
    FROM dummy
    GROUP BY fkey_id, name
    HAVING COUNT(*) > 1
       AND SUM(id) = (MAX(id) + MIN(id)) * (MAX(id) - MIN(id) + 1) / 2
) b
ORDER BY id;

In your table do you just have duplicates? Or might you have 
triplicates? And quadruplicates? And, in general, n-uplicates? At the 
time, I thought you might have n-uplicates, so I designed the query to 
be as general as possible and handle all those cases, of which 
duplicates are a particular case, but now I am wondering whether you 
have anything more than duplicates.


In my specific case it turns out I only had duplicates, but there could 
have been n-plicates, so your code is still correct for my use-case 
(though I didn't say that in my OP).



Well, anyway, the idea is as follows: the sum of a consecutive 
sequence is given by (first + last) / 2 * n, with n = last - first + 1, 
OK?


I *love* your application of that formula. It's rare for me to be able 
to use "real" math in SQL, so this was a pleasure to read (and 
understand!).
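
To make it concrete for anyone following along, here's a quick sketch 
(the column types are my guesses; the thread only establishes that 
dummy has id, fkey_id and name):

CREATE TABLE dummy (
    id      integer PRIMARY KEY,
    fkey_id integer NOT NULL,
    name    text    NOT NULL
);

-- ids 3, 4, 5 form a consecutive run for (1, 'foo');
-- ids 7 and 9 share (2, 'bar') but have a gap in between
INSERT INTO dummy VALUES
    (3, 1, 'foo'),
    (4, 1, 'foo'),
    (5, 1, 'foo'),
    (7, 2, 'bar'),
    (9, 2, 'bar');

For the first group SUM(id) = 12 and (3 + 5) * (5 - 3 + 1) / 2 = 12, so 
ids 3, 4 and 5 come back. For the second group SUM(id) = 16 but 
(7 + 9) * (9 - 7 + 1) / 2 = 24, so the gapped pair is filtered out, 
which is exactly the behavior I needed.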


Thanks again to Richard and Oliveiros for a truly educational 
experience! I hope some others were similarly enlightened.


With gratitude,

Steve


--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] Finding sequential records

2008-09-29 Thread Richard Broersma
On Mon, Sep 29, 2008 at 7:48 PM, Steve Midgley <[EMAIL PROTECTED]> wrote:

> In my specific case it turns out I only had duplicates, but there could have
> been n-plicates, so your code is still correct for my use-case (though I
> didn't say that in my OP).

Yeah, there are a lot of neat queries you can construct.  If you have a
good background in math and set theory (which I don't), you can develop
all sorts of powerful analysis queries.

On a side note, I thought I should mention that unwanted duplicates are
an example of where some people have gotten bitten by a purely
surrogate-key approach.  To make matters worse, sometimes one user
updates part of one duplicate while another user updates a different
field of another duplicate.  Then, once the designer discovers the
duplicate problem, he/she has to figure out some way of merging these
non-exact duplicates.  So even if the designer has no intention of
implementing natural primary/foreign keys, he/she will still benefit
from considering natural keys, in that a strategy can be designed to
prevent getting bitten by duplicated data.

I only mention this because db designers get bitten by this all the
time.  Well, at least the ones who subscribe to www.utteraccess.com get
bitten.  From what I've seen, not one day goes by without someone
posting a question there about how to find and remove all but one of
the duplicates.
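
The usual fix people reach for is some variation on a self-join delete
keyed off the surrogate id.  Very roughly, and assuming the dummy table
from earlier in this thread with (fkey_id, name) defining what counts
as a duplicate:

-- keep the lowest id in each duplicate group, delete the rest
DELETE FROM dummy a
USING dummy b
WHERE a.fkey_id = b.fkey_id
  AND a.name = b.name
  AND a.id > b.id;

Of course that only works cleanly when the duplicates really are exact
on those columns; the merge headache I described above is exactly the
case where it isn't that simple.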

-- 
Regards,
Richard Broersma Jr.

Visit the Los Angeles PostgreSQL Users Group (LAPUG)
http://pugs.postgresql.org/lapug



Re: [SQL] Finding sequential records

2008-09-29 Thread Steve Midgley

At 09:50 PM 9/29/2008, Richard Broersma wrote:

> On Mon, Sep 29, 2008 at 7:48 PM, Steve Midgley <[EMAIL PROTECTED]> wrote:
>
>> In my specific case it turns out I only had duplicates, but there could
>> have been n-plicates, so your code is still correct for my use-case
>> (though I didn't say that in my OP).

> Yeah, there are a lot of neat queries you can construct.  If you have a
> good background in math and set theory (which I don't), you can develop
> all sorts of powerful analysis queries.
>
> On a side note, I thought I should mention that unwanted duplicates are
> an example of where some people have gotten bitten by a purely
> surrogate-key approach.  To make matters worse, sometimes one user
> updates part of one duplicate while another user updates a different
> field of another duplicate.  Then, once the designer discovers the
> duplicate problem, he/she has to figure out some way of merging these
> non-exact duplicates.  So even if the designer has no intention of
> implementing natural primary/foreign keys, he/she will still benefit
> from considering natural keys, in that a strategy can be designed to
> prevent getting bitten by duplicated data.
>
> I only mention this because db designers get bitten by this all the
> time.  Well, at least the ones who subscribe to www.utteraccess.com get
> bitten.  From what I've seen, not one day goes by without someone
> posting a question there about how to find and remove all but one of
> the duplicates.


Truly. I have worked with some school districts around the US and this 
duplicate record problem is more than theoretical. Some of the 
gnarliest, dirtiest, n-plicate data I've ever seen comes out of the US 
public education system.


More generally, where I have seen a need for natural keys, I've always 
taken the "best of both worlds" approach. I always stick an 
integer/serial PK into any table (why not? they're cheap and sometimes 
handy). And then, for tables along the lines of your description, I add 
a compound unique index that enforces the business rule of "no dupes 
along these lines."


Am I following your point? Any reason why using serial PK's with 
"compound natural unique indices" is better/worse than just using 
natural PK's?


Steve

