[SQL] pg_dump not correctly saving schema with partitioned tables?

2012-01-31 Thread chester c young
I have a database with many partitions; each partition table has its own
primary key sequence.

 Column |                          Modifiers
--------+---------------------------------------------------------------
 uno_id | not null default nextval('cmp0004.cmt_uno_id_seq'::regclass)

When dumped and then restored, the pk sequence is changed to the one from
the inherited-from (parent) table:


 Column |                      Modifiers
--------+-------------------------------------------------------
 uno_id | not null default nextval('uno_uno_id_seq'::regclass)

(another error is that the pk sequence default does not specify the schema!)

Is there any way I can reliably back up and restore the db?





[SQL] must appear in the GROUP BY clause or be used in an aggregate function problem

2012-01-31 Thread Edward W. Rouse
Attempting to get a list of items that we want to be grouped by id and date,
with a detail column.

Table a is a typical user table; id, first and last name are all that I am
using

Table b is a tracking table, combining data from 8+ tables into one for ease
of use for reporting purposes; id, userid, company and invoice_number are
all that I am using

Table c is an exception table; it has relations with the other 2 by both the
tracking and user ids but all I need from it is the exception date
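
For reference, the relevant parts of the three tables probably look
something like this (a sketch reconstructed from the descriptions above and
the query below; names and types are approximations, not the real DDL):

CREATE TABLE resources (            -- table a: users
    id          bigint PRIMARY KEY,
    name_first  text,
    name_last   text
);

CREATE TABLE tracking (             -- table b: consolidated reporting data
    id          bigint PRIMARY KEY,
    resource_id bigint REFERENCES resources(id),
    company     text,
    invoicenum  text,
    region      text,
    state       integer,
    role_name   text
);

CREATE TABLE except_detail (        -- table c: exceptions
    tracking_id bigint REFERENCES tracking(id),
    resource_id bigint REFERENCES resources(id),
    report_date date
);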

So the data required is the user's first and last name, the company, the
invoice number and the exception date, for those invoices that are in the
exception table. The expected structure for the report is:

User Name: Company: Date: Invoice Number

The SQL I can get to work is:

SELECT MAX(a.name_first) AS name_first, MAX(a.name_last) AS name_last,
       b.company, MAX(c.report_date) AS rDate, b.invoicenum
FROM resources a
JOIN tracking b ON (a.id = b.resource_id)
JOIN except_detail c ON (b.id = c.tracking_id)
WHERE b.region = 'NE' AND b.state = 1
GROUP BY a.id, b.company, b.invoicenum
ORDER BY name_last, name_first, b.role_name, rDate

And in most cases this works fine. The problem arises when invoices get
added to the exception table due to their not having an invoice number. Even
though we join on the tracking id, the group by on invoicenum lumps the
different blank invoices into a single line if the same user has more than
one. What we want is for each of the individual blank invoicenum entries to
have a separate line in the result. If I remove b.invoicenum from the group
by then I get the error in the subject line. If I try to use an aggregate
function (like I used MAX on the names) it's even worse. MAX works on the
names because they are all the same. MAX on the date doesn't seem to affect
the results that I can see, other than if an invoice went into exception more
than once, and in that case we only want the latest one anyway.

Any hints as to how to get this to not lump all of the empty invoicenums for
a user into a single line? I have already tried putting in a dummy value
(like the date) for an invoicenum, but that works as well as I expected it
would (not at all).

Edward W. Rouse





Re: [SQL] must appear in the GROUP BY clause or be used in an aggregate function problem

2012-01-31 Thread David Johnston
-----Original Message-----
From: pgsql-sql-ow...@postgresql.org [mailto:pgsql-sql-ow...@postgresql.org]
On Behalf Of Edward W. Rouse
Sent: Tuesday, January 31, 2012 3:27 PM
To: pgsql-sql@postgresql.org
Subject: [SQL] must appear in the GROUP BY clause or be used in an aggregate
function problem

[original message quoted in full; snipped]

---

Edward,

I would suggest first processing all exceptions EXCEPT "missing invoice
number" in one query, then constructing a second query that deals only with
missing invoice numbers, and finally UNION-ing the two results.

Also, I dislike the "MAX(varchar)" semantics.  You should be able to write
the query without it, but without more details I cannot readily specify how.
Generally you'd simply operate off of "id" and then join in the
corresponding first/last names at the outer-most layer of the query.  Since
you are grouping on "a.id" anyway, I would suggest you try it.
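
For instance, here is a rough sketch of the UNION approach, reusing the
tables from your query (untested; it assumes blank invoice numbers are NULL
and that except_detail joins on c.tracking_id = b.id):

-- branch 1: invoices with a number; collapse repeat exceptions to the latest
SELECT a.name_first, a.name_last, b.company,
       MAX(c.report_date) AS rDate, b.invoicenum
FROM resources a
JOIN tracking b ON (a.id = b.resource_id)
JOIN except_detail c ON (b.id = c.tracking_id)
WHERE b.region = 'NE' AND b.state = 1 AND b.invoicenum IS NOT NULL
GROUP BY a.name_first, a.name_last, b.company, b.invoicenum
UNION ALL
-- branch 2: missing invoice numbers; group per tracking row instead,
-- so each blank invoice keeps its own line
SELECT a.name_first, a.name_last, b.company,
       MAX(c.report_date), b.invoicenum
FROM resources a
JOIN tracking b ON (a.id = b.resource_id)
JOIN except_detail c ON (b.id = c.tracking_id)
WHERE b.region = 'NE' AND b.state = 1 AND b.invoicenum IS NULL
GROUP BY a.name_first, a.name_last, b.company, b.invoicenum, b.id
ORDER BY name_last, name_first, rDate;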

David J.




Re: [SQL] must appear in the GROUP BY clause or be used in an aggregate function problem

2012-01-31 Thread Samuel Gendler
On Tue, Jan 31, 2012 at 12:27 PM, Edward W. Rouse wrote:

> And in most cases this works fine. The problem arises when invoices get
> added to the exception table due to their not having an invoice number.
> [snip]
> Any hints as to how to get this to not lump all of the empty invoicenums
> for a user into a single line?

It seems like the aggregate functions in your example aren't absolutely
necessary - though perhaps the date one is.  The best solution would be to
get rid of the aggregate columns so that you don't have this issue.

If that isn't possible, there are definitely some kludges you can use to
get it to work if you really don't want to union 2 queries together - one
with and one without valid invoice ids. Assuming invoice ids are generated
via a sequence, you can do the following to guarantee a unique 'id' for
each empty invoice - at the expense of incrementing your sequence
unnecessarily:

COALESCE(invoiceid, nextval('invoiceid_sequence'))

A better version of that kludge would be to create a sequence just for this
purpose and set it to a very negative number.  All of your generated fake
ids will then be negative numbers (so easily identified by whatever is
consuming the query results) and you can reset the sequence back to the
most negative possible value whenever you get concerned about running out
of ids, since you won't be using the same sequence as the invoice table
itself.
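
A sketch of that variant (untested; the sequence name is made up, and it
assumes invoicenum is a bigint):

-- dedicated sequence counting down from -1
CREATE SEQUENCE fake_invoicenum_seq
    INCREMENT BY -1 START WITH -1
    MINVALUE -9223372036854775808 MAXVALUE -1;

-- every NULL invoicenum draws its own negative id, so nothing lumps together
SELECT MAX(c.report_date),
       COALESCE(b.invoicenum, nextval('fake_invoicenum_seq'))
FROM tracking b
JOIN except_detail c ON (b.id = c.tracking_id)
GROUP BY 2;

-- reset once you start worrying about running out of ids
ALTER SEQUENCE fake_invoicenum_seq RESTART WITH -1;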

There are probably lots of variations on that general concept.  You can
write a window function which will remember the ids already seen/generated
for each row and just generate an arbitrary id to be used in place of null
for grouping, though you'd have to worry about using an id that has not yet
appeared but which later does appear.  Assuming you can create a bucket of
space large enough that is guaranteed to not conflict with valid invoice
ids, you could make it work.  I don't know if you can have select max(x),
invoiceid group by func(invoiceid), so you may have to structure it as
"select max(x), func(invoiceid) group by 2" which would require that your
query results can deal with faked invoiceids - negative numbers again
providing a possible way to identify them.

Doesn't postgres allow operator overloading?  Perhaps you could override
the equality operator for that type so that just a single value (the max
value, for example) always compares as unequal to itself and then use
COALESCE(invoiceid, 9223372036854775807).  All rows without a valid value
will have that value, but the overloaded equality operator will cause them
to not group together (I'm guessing.  I haven't tried this). That one makes
me nervous because of the potential for nasty side effects should something
legitimately have that value, but one could probably make the case for
having a bigger problem if a column has a value equal to max bigint.

Restructuring the query to separate valid invoice ids from invalid and/or
getting rid of the aggregation does seem like the best solution, though.


Re: [SQL] pg_dump not correctly saving schema with partitioned tables?

2012-01-31 Thread Tom Lane
chester c young  writes:
> I have a database with many partitions.  each partition table has its own
> primary key sequence.

>  Column |                          Modifiers
> --------+---------------------------------------------------------------
>  uno_id | not null default nextval('cmp0004.cmt_uno_id_seq'::regclass)

> When dumped and then restored, the pk sequence is changed to the one from
> the inherited-from (parent) table:

>  Column |                      Modifiers
> --------+-------------------------------------------------------
>  uno_id | not null default nextval('uno_uno_id_seq'::regclass)

> (another error is that the pk sequence default does not specify the schema!)

I see no reason to think there is a bug here; it's more likely that you
do not understand the display behavior of regclass constants.  They only
print a schema name if your current search_path is such that the
relation wouldn't be found by writing just the unqualified name.
So the most likely explanation for the discrepancy above is that you
executed the two \d commands under different search_path settings.
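
For example, using the sequence from your first \d display:

SET search_path = cmp0004;
SELECT 'cmp0004.cmt_uno_id_seq'::regclass;   -- prints cmt_uno_id_seq

SET search_path = public;
SELECT 'cmp0004.cmt_uno_id_seq'::regclass;   -- prints cmp0004.cmt_uno_id_seq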

It's possible that you have actually found a pg_dump bug, but if so
you'll need to submit a complete test-case exhibiting the bug.

regards, tom lane



Re: [SQL] must appear in the GROUP BY clause or be used in an aggregate function problem

2012-01-31 Thread Edward W. Rouse
I would love to remove most of the aggregate functions. I am trying to
update an existing query to provide better data and started with the
existing one (which is where the MAX for the names came from). I originally
tried to remove those, but I get the group by/aggregate function error if I
do. I guess I don't understand enough to know why these are being forced
into that category. I have written queries in the past that had several
columns that were not considered to require aggregation. Can someone
enlighten me on why these are being marked as requiring group/aggregation?

Something with the way the tables are joined, perhaps? Can I remove the
joins, put the a.id = c.id in the where clause, and get rid of this?

 

From: pgsql-sql-ow...@postgresql.org [mailto:pgsql-sql-ow...@postgresql.org]
On Behalf Of Samuel Gendler
Sent: Tuesday, January 31, 2012 3:57 PM
To: Edward W. Rouse
Cc: pgsql-sql@postgresql.org
Subject: Re: [SQL] must appear in the GROUP BY clause or be used in an
aggregate function problem

[quoted message snipped]


Re: [SQL] must appear in the GROUP BY clause or be used in an aggregate function problem

2012-01-31 Thread Edward W. Rouse
NM, I figured it out. The mere presence of an aggregate function and/or the
group by clause is what's causing all the hate for me. I will take a whack
at getting this to work without them. Thanks all.

 

From: pgsql-sql-ow...@postgresql.org [mailto:pgsql-sql-ow...@postgresql.org]
On Behalf Of Edward W. Rouse
Sent: Tuesday, January 31, 2012 4:27 PM
To: 'Samuel Gendler'
Cc: pgsql-sql@postgresql.org
Subject: Re: [SQL] must appear in the GROUP BY clause or be used in an
aggregate function problem

[quoted message snipped]
Re: [SQL] must appear in the GROUP BY clause or be used in an aggregate function problem

2012-01-31 Thread Samuel Gendler
On Tue, Jan 31, 2012 at 1:43 PM, Edward W. Rouse wrote:

> NM, I figured it out. The mere presence of an aggregate function and/or
> the group by clause is what’s causing all the hate for me. I will take a
> whack at getting this to work without them. Thanks all.
>

All columns that are not in an aggregate function MUST be in the group by
clause if any column is in an aggregate function.  If you can get rid of
all aggregation, then you won't have to have the group by, either.  You
could use a correlated subquery to get the most recent report date, rather
than using max.  That would allow you to ditch all of the other
aggregation, I suspect.  And unless there is some chance of different
versions of the name fields for a given id, it is harmless to drop the
MAX() function call and add them to the group by clause.  You get the same
effect.
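
A sketch of the correlated-subquery version, reusing the tables from the
original query (untested; assumes except_detail joins on tracking_id):

SELECT a.name_first, a.name_last, b.company,
       (SELECT MAX(c.report_date)
        FROM except_detail c
        WHERE c.tracking_id = b.id) AS rDate,
       b.invoicenum
FROM resources a
JOIN tracking b ON (a.id = b.resource_id)
WHERE b.region = 'NE' AND b.state = 1
ORDER BY a.name_last, a.name_first, rDate;

No GROUP BY is needed; if only rows that actually have an exception are
wanted, add an EXISTS test against except_detail to the WHERE clause.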




> From: pgsql-sql-ow...@postgresql.org [mailto:pgsql-sql-ow...@postgresql.org]
> On Behalf Of Edward W. Rouse
> Sent: Tuesday, January 31, 2012 4:27 PM
> To: 'Samuel Gendler'
> Cc: pgsql-sql@postgresql.org
> Subject: Re: [SQL] must appear in the GROUP BY clause or be used in an
> aggregate function problem
>
> [quoted message snipped]

Re: [SQL] must appear in the GROUP BY clause or be used in an aggregate function problem

2012-01-31 Thread Edward W. Rouse
Yeah, that's what I did. No more group by, the MAX is in a subquery. One of
the reasons I hate coming behind someone else and updating their stuff is
that too many people try to get cute with the code. I try to make everything
as plain and simple as possible unless performance issues require otherwise.
My code is boring, but easy to understand and maintain ;)

 

From: pgsql-sql-ow...@postgresql.org [mailto:pgsql-sql-ow...@postgresql.org]
On Behalf Of Samuel Gendler
Sent: Tuesday, January 31, 2012 4:52 PM
To: Edward W. Rouse
Cc: pgsql-sql@postgresql.org
Subject: Re: [SQL] must appear in the GROUP BY clause or be used in an
aggregate function problem

[quoted message snipped]

Fw: Re: [SQL] pg_dump not correctly saving schema with partitioned tables?

2012-01-31 Thread chester c young
> From: chester c young 
> Subject: Re: [SQL] pg_dump not correctly saving schema with partitioned
> tables?
> To: "Tom Lane" 
> Date: Tuesday, January 31, 2012, 5:40 PM
>
> --- On Tue, 1/31/12, Tom Lane wrote:
>
> > It's possible that you have actually found a pg_dump bug, but if so
> > you'll need to submit a complete test-case exhibiting the bug.
>
> here is a test case:
>
> create database test2;
> \c test2
>
> create schema s1;
> create schema s2;
>
> create table s1.t1(
>     c1  bigserial   primary key,
>     c2  text );
>
> create table s2.t1(
>     c1  bigserial   primary key
> ) inherits( s1.t1 );
>
> test2=# \d s2.t1
>                       Table "s2.t1"
>  Column |  Type  |                     Modifiers
> --------+--------+----------------------------------------------------
>  c1     | bigint | not null default nextval('s2.t1_c1_seq'::regclass)
>  c2     | text   |
>
> -- pg_dump test2 > test2.dmp
>
> create database test3;
> \c test3
> \i test2.dmp
>
> test3=# \d s2.t1
>                       Table "s2.t1"
>  Column |  Type  |                     Modifiers
> --------+--------+----------------------------------------------------
>  c1     | bigint | not null default nextval('s1.t1_c1_seq'::regclass)
>  c2     | text   |
>
> ## in the original db, table s2.t1's pk uses s2.t1_c1_seq
> ## in the restored db, table s2.t1's pk uses s1.t1_c1_seq



[SQL] Display Length Between Var & Varchar

2012-01-31 Thread Carlos Mennens
I've noticed when I set a field to char, it takes up lots of space over varchar:

iamunix=# SELECT * FROM music;   -- asin column declared char
 id |       band        |          album           |    date    |              asin              |    label
----+-------------------+--------------------------+------------+--------------------------------+--------------
  1 | Dance Gavin Dance | Downtown Battle Mountain | 2007-05-15 | B000OQF4PQ                     | Rise Records
(1 row)

iamunix=# SELECT * FROM music;   -- asin column declared varchar
 id |       band        |          album           |    date    |    asin    |    label
----+-------------------+--------------------------+------------+------------+--------------
  1 | Dance Gavin Dance | Downtown Battle Mountain | 2007-05-15 | B000OQF4PQ | Rise Records
(1 row)

I don't know how well it will show in plain text via email but does
anyone know why the field display width is wasted with so much white
space when set to char?



Re: [SQL] Display Length Between Var & Varchar

2012-01-31 Thread Adrian Klaver
On Tuesday, January 31, 2012 8:08:06 pm Carlos Mennens wrote:
> I've noticed when I set a field to char, it takes up lots of space over
> varchar:
>
> [query output snipped]
>
> I don't know how well it will show in plain text via email but does
> anyone know why the field display width is wasted with so much white
> space when set to char?

You will find that the Manual is very helpful in this regard :)

http://www.postgresql.org/docs/9.0/interactive/datatype-character.html

character varying(n), varchar(n)   variable-length with limit
character(n), char(n)              fixed-length, blank padded
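
A quick way to see the padding at the SQL level (a minimal illustration;
octet_length() reports the stored length):

SELECT 'abc'::char(20) AS fixed,                  -- displayed padded to 20
       'abc'::varchar(20) AS variable,            -- displayed as 'abc'
       octet_length('abc'::char(20)) AS f_len,    -- 20: blanks are stored
       octet_length('abc'::varchar(20)) AS v_len; -- 3: no padding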

-- 
Adrian Klaver
adrian.kla...@gmail.com



Re: Fw: Re: [SQL] pg_dump not correctly saving schema with partitioned tables?

2012-01-31 Thread Tom Lane
chester c young  writes:
>> here is a test case:

[ slightly less messed-up test case for anyone who wants to duplicate this: ]

create schema s1;
create schema s2;
create table s1.t1(
    c1  bigserial   primary key,
    c2  text );
create table s2.t1(
    c1  bigserial   primary key
) inherits( s1.t1 );

In the original database, s1.t1.c1 has a default referring to sequence
s1.t1_c1_seq, while s2.t1.c1 has a different default referring to
sequence s2.t1_c1_seq.  However, pg_dump examines the database roughly
like this:

1. Set search_path to s1.
2. Examine s1.t1's default expression; it looks like
   nextval('t1_c1_seq'::regclass)
3. Set search_path to s2.
4. Examine s2.t1's default expression; it looks like
   nextval('t1_c1_seq'::regclass)
5. Textually compare these defaults, find that they appear textually
   identical, conclude that s2.t1's default is inherited from s1.t1.
   Which it is not --- but after a dump and restore, it will be,
   because pg_dump set it up that way.

The only near-term fix I can see for that is for pg_dump to stop
trafficking in inherited defaults at all.  That is, always install
defaults with ALTER TABLE ONLY tab ALTER COLUMN col SET DEFAULT, and
do that over again explicitly for each child table.  Since (AFAICT)
the system catalogs don't explicitly record inherited-ness of defaults,
this should produce the correct state even when the default really was
inherited, and it dodges this problem of search-path-sensitive
printouts, or indeed the whole idea of trying to compare text
representations of default expressions at all (which is surely a kluge
from the get-go).  The code in pg_dump is jumping through hoops to
use inherited creation of defaults whenever possible, but I'm not sure
I see much point in that.
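
For the test case above, the dump would then contain something along these
lines (a sketch of the idea, not actual pg_dump output):

CREATE TABLE s2.t1 (
    c1 bigint NOT NULL
) INHERITS (s1.t1);

ALTER TABLE ONLY s1.t1
    ALTER COLUMN c1 SET DEFAULT nextval('s1.t1_c1_seq'::regclass);
ALTER TABLE ONLY s2.t1
    ALTER COLUMN c1 SET DEFAULT nextval('s2.t1_c1_seq'::regclass);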

In the longer term it might be nicer if the system catalogs did record
inherited-ness of defaults (and then pg_dump could rely on that info
instead of guessing); but that would be a far more invasive change.

regards, tom lane
