Re: [GENERAL] [NOVICE] Question About Aggregate Functions

2006-09-13 Thread Brandon Aiken








Ah, I did not know what was in your
fields, so I did not assume they were Boolean values. It looked to me
like you were trying to use IS TRUE to substitute for the lack of a GROUP BY,
so I didnt know what to do.



Yes, count() will include all non-NULL
values. Sorry if I sounded unclear there. 



If you do typecasting the value zero is false
(and non-zero is true). NULL in an _expression_ always returns NULL, and
many programs will interpret that result as false. So Im not sure
of what results you might get with a Boolean test against a non-Boolean field,
especially if its an integer field.



postgres=# select 0::boolean = FALSE;

?column?

--

t

(1 row)



You should just be able to take the
previous query and add in your WHERE clauses:



SELECT count(t1.fielda), count(t2.fielda),
count(t2.fieldb), AVG(t2.fieldc)

FROM t1 JOIN t2 ON t1.item_id = t2.item_id

WHERE t1.fielda = TRUE AND t2.fielda =
TRUE AND t2.fieldb = TRUE

GROUP BY NULL;



Now, the INNER JOIN youre using is
only selecting fields where both t1.item_id and t2.item_id exist and the respective
fields are TRUE. That is, its only going to run the count and
average functions against the results of this query:

SELECT *

FROM t1 JOIN t2 ON ON t1.item_id =
t2.item_id

WHERE t1.fielda = TRUE AND t2.fielda =
TRUE AND t2.fieldb = TRUE;



If thats what you want, thats
great. 



However, you might want a count of each
field where that field is TRUE. In that case, I would use either
temporary tables, compound queries and derived tables, or multiple simple
queries.



Its also possible that you might
want a count of fields where t1.item_id and t2.item_id exist, but where only
each respective field is TRUE. That is, you want a count of t1.fielda
where it is TRUE no matter what t2.fielda and t2.fieldb are as long as
t1.item_id matches t2.item_id. In that case you have to do even more
joins, and that could take a fair bit of time especially if you havent
indexed your item_id fields.



You really have to look at your result
sets. Sometimes it is better to run multiple simple queries instead of
one big complex query to be sure youre getting the data you want and the
query executes in a reasonable amount of time.



Also, consider that NULL values are
generally considered bad to purposefully enter. Logically, It would be
better to create one table for each field and then create a record for each
item_id as you need it so you never have NULLs. The problem with that is
one of performance if you end up doing large number of JOINs. In that
case, it might be better to use integers instead of Boolean fields, since you
have three explicit states of TRUE, FALSE, and NOT YET DETERMINED.







--





Brandon
 Aiken





CS/IT Systems Engineer













From:
[EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Don Parris
Sent: Tuesday, September 12, 2006
9:16 PM
To: [EMAIL PROTECTED]
Subject: Re: [NOVICE] Question
About Aggregate Functions





On 9/12/06, Brandon
 Aiken [EMAIL PROTECTED]
wrote:











First, aggregate functions always have to have a GROUP BY
clause. If you want everything in a table or join, you use GROUP BY
NULL. 












Thanks. I did not realize that. 













Next, IS TRUE statements will select anything that is not
NULL, 0, or FALSE, so I'm not sure what you're trying to get because you're
getting nearly everything, and count() already ignores NULL values.












I didn't see that in the manual's coverage, but could have overlooked it.
But count() will include the FALSE values along with the TRUE values - ignoring
only those that are NULL. At least, I think that's the case. So,
for each column I select, I need to be sure I am counting only the TRUE
values. I do have NULL, FALSE and TRUE values in each column, since I do
not always know for sure whether an attribute is TRUE or FALSE when I record
the item. That may be determined later, but not in all cases. 













Next, count(x, y, z) isn't a valid function. Count()
only has one parameter, so you'll have to call it several times.












I knew my syntax was wrong - but wasn't sure about calling multiple functions
since I hadn't seen any examples of that in my hunting for info. I was
trying to make a little clearer what I wanted to do. 














Depending on what you were hoping count(x, y, z) was
returning, you do this: 



SELECT count(t1.fielda), count(t2.fielda), count(t2.fieldb),
AVG(t2.fieldc)

FROM t1 JOIN t2 ON ON t1.item_id = t2.item_id

GROUP BY NULL;












This one looks more like what I am attempting to do. However, I do need
to be sure my count() functions are counting the values that are
TRUE. Is this a case where I should run a query to select the records
where the values for the desired columns are true, insert that result into a
temp table, and then perform the count() function as above on just those
records? Sure seems like that would be the simple route, now that I think
about it. 












SNIP








Re: [GENERAL] [NOVICE] Question About Aggregate Functions

2006-09-13 Thread Don Parris
On 9/13/06, Brandon Aiken [EMAIL PROTECTED] wrote:















Ah, I did not know what was in your
fields, so I did not assume they were Boolean values. It looked to me
like you were trying to use IS TRUE to substitute for the lack of a GROUP BY,
so I didn't know what to do.That was in the first paragraph of my OP.  How do I create a query that (1) evaluates each boolean field for
TRUE/FALSE and (2) counts the number of rows where each field is TRUE? Maybe you just hadn't had your first cup of coffee? ;-) Seriously, though, I really do appreciate your help.
Yes, count() will include all non-NULL
values. Sorry if I sounded unclear there. 



If you do typecasting the value zero is false
(and non-zero is true). NULL in an _expression_ always returns NULL, and
many programs will interpret that result as false. So I'm not sure
of what results you might get with a Boolean test against a non-Boolean field,
especially if it's an integer field.



postgres=# select 0::boolean = FALSE;

?column?

--

t

(1 row)



You should just be able to take the
previous query and add in your WHERE clauses:



SELECT count(t1.fielda), count(t2.fielda),
count(t2.fieldb), AVG(t2.fieldc)

FROM t1 JOIN t2 ON t1.item_id = t2.item_id

WHERE t1.fielda = TRUE AND t2.fielda =
TRUE AND t2.fieldb = TRUE

GROUP BY NULL;



Now, the INNER JOIN you're using is
only selecting fields where both t1.item_id and t2.item_id exist and the respective
fields are TRUE. That is, it's only going to run the count and
average functions against the results of this query:

SELECT *

FROM t1 JOIN t2 ON ON t1.item_id =
t2.item_id

WHERE t1.fielda = TRUE AND t2.fielda =
TRUE AND t2.fieldb = TRUE;



If that's what you want, that's
great. Can I use OR instead of AND here?
However, you might want a count of each
field where that field is TRUE. In that case, I would use either
temporary tables, compound queries and derived tables, or multiple simple
queries.
It's also possible that you might
want a count of fields where t1.item_id and t2.item_id exist, but where only
each respective field is TRUE. That is, you want a count of t1.fielda
where it is TRUE no matter what t2.fielda and t2.fieldb are as long as
t1.item_id matches t2.item_id. In that case you have to do even more
joins, and that could take a fair bit of time especially if you haven't
indexed your item_id fields. Well this sounds more like what I want. Given t2.fielda, t2.fieldb, t2.fieldc, any one (or all three) could be true, but frequently at least one of the fields is false. Initially, all of the fields might be unknown (thus NULL) for a given item until I am able to investigate the items to determine TRUE/FALSE. I frequently have items that are inactive, and thus unable to determine any of attributes in t2.
My end result needs to be a count of all the values in each field where the value is TRUE, as opposed to FALSE or NULL.


You really have to look at your result
sets. Sometimes it is better to run multiple simple queries instead of
one big complex query to be sure you're getting the data you want and the
query executes in a reasonable amount of time.



Also, consider that NULL values are
generally considered bad to purposefully enter. Logically, It would be
better to create one table for each field and then create a record for each
item_id as you need it so you never have NULLs. The problem with that is
one of performance if you end up doing large number of JOINs. In that
case, it might be better to use integers instead of Boolean fields, since you
have three explicit states of TRUE, FALSE, and NOT YET DETERMINED.Regarding the NULL Values:I have been thinking that I might want to leave my NULLs as they are. However, I will be concentrating mostly on the items that are active. Inactive items are only counted as part of the total number of items. Their attributes are a moot point, but technically FALSE. (If they are inactive, the attributes are no longer TRUE in any case.)
I am counting only those fields that I know (through verification) to be TRUE. I can use FALSE where the actual value is not known, and just change the attribute to TRUE when I discover that to be the case. I just need to be sure in my counts that I note the fact that FALSE values include the unverified values (meaning some of those might actually be true). Does that sound fairly logical to you?
Regarding the Table Layout:The columns in t2 are ordered, essentially according to the category of attributes involved. I had thought about using multiple tables, one for each category of attributes. However, I chose a monolithic table for the attributes to represent a single survey of each item. Each item might be surveyed again in the future to determine any changes, which would introduce a new version of the current table.
I'll tinker around with the queries a bit, and see what I come up with.Thanks for the input.


Re: [GENERAL] [NOVICE] Question About Aggregate Functions

2006-09-13 Thread Brandon Aiken








I think I mistakenly sent this to General
instead of Novice. Oops.



Yeah, I either skipped over or forgot the
bit in the OP about bools. Mea culpa.



You should be able to use OR instead of
AND in any logical _expression_.



 Well this sounds more like what I want.
Given t2.fielda, t2.fieldb, t2.fieldc, any one (or all three) could be true,
but frequently at least one of the fields is false. Initially, all of the
fields might be unknown (thus NULL) for a given item until I am able to
investigate the items to determine TRUE/FALSE. I frequently have items
that are inactive, and thus unable to determine any of attributes in t2. 

My end result needs to be a count of all the values in each field where the
value is TRUE, as opposed to FALSE or NULL.



Yeah, I would probably run 4 separate,
simple queries. That will get you the best performance since youre
doing no JOINs and no composite queries.



If you need to enter the results into
another table, try INSERT  to insert the defaults and any primary key
you have (like timestamp), then four UPDATE  SELECT statements.





The real problem with NULLs is some of the
(in my mind) nonsensical results you get, especially with logical operators:

NULL AND TRUE = NULL

NULL OR TRUE = TRUE

NULL AND FALSE = FALSE

NULL OR FALSE = NULL



Plus you have to use IS instead of = since
any NULL in an = _expression_ makes the result NULL (yes, this is an error in my previous
queries). NULL just has all these special cases. I find it much
nicer to avoid it wherever possible since it has somewhat unpredictable
results.













From:
[EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Don Parris
Sent: Wednesday, September 13,
2006 12:50 PM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] [NOVICE]
Question About Aggregate Functions





On 9/13/06, Brandon
 Aiken [EMAIL PROTECTED]
wrote:











Ah, I did not know what was in your fields, so I did not
assume they were Boolean values. It looked to me like you were trying to
use IS TRUE to substitute for the lack of a GROUP BY, so I didn't know what to
do.












That was in the first paragraph of my OP.  How do I create a query
that (1) evaluates each boolean field for TRUE/FALSE and (2) counts the number of
rows where each field is TRUE? Maybe you just hadn't had your first
cup of coffee? ;-) Seriously, though, I really do appreciate your help.













Yes, count() will include all non-NULL values. Sorry if
I sounded unclear there. 



If you do typecasting the value zero is false (and non-zero
is true). NULL in an _expression_ always returns NULL, and many programs
will interpret that result as false. So I'm not sure of what results you
might get with a Boolean test against a non-Boolean field, especially if it's
an integer field.



postgres=# select 0::boolean = FALSE;

?column?

--

t

(1 row)



You should just be able to take the previous query and add in
your WHERE clauses:







SELECT count(t1.fielda), count(t2.fielda), count(t2.fieldb),
AVG(t2.fieldc)





FROM t1 JOIN t2 ON t1.item_id = t2.item_id

WHERE t1.fielda = TRUE AND t2.fielda = TRUE AND t2.fieldb =
TRUE

GROUP BY NULL;



Now, the INNER JOIN you're using is only selecting fields
where both t1.item_id and t2.item_id exist and the respective fields are TRUE.
That is, it's only going to run the count and average functions against
the results of this query:

SELECT *





FROM t1 JOIN t2 ON ON t1.item_id = t2.item_id





WHERE t1.fielda = TRUE AND t2.fielda = TRUE AND t2.fieldb =
TRUE;



If that's what you want, that's great. 












Can I use OR instead of AND here?













However, you might want a count of each field where that
field is TRUE. In that case, I would use either temporary tables,
compound queries and derived tables, or multiple simple queries.



















It's also possible that you might want a count of fields
where t1.item_id and t2.item_id exist, but where only each respective field is
TRUE. That is, you want a count of t1.fielda where it is TRUE no matter
what t2.fielda and t2.fieldb are as long as t1.item_id matches t2.item_id.
In that case you have to do even more joins, and that could take a fair
bit of time especially if you haven't indexed your item_id fields.











 Well this sounds more like what I want. Given t2.fielda,
t2.fieldb, t2.fieldc, any one (or all three) could be true, but frequently at
least one of the fields is false. Initially, all of the fields might be
unknown (thus NULL) for a given item until I am able to investigate the items
to determine TRUE/FALSE. I frequently have items that are inactive, and
thus unable to determine any of attributes in t2. 

My end result needs to be a count of all the values in each field where the
value is TRUE, as opposed to FALSE or NULL.











You really have to look at your result sets. Sometimes
it is better to run multiple simple queries instead of one big complex query to
be sure you're getting the data you

[GENERAL] Novice Question

2005-03-01 Thread Michael Romagnoli
I am new to postgresql, having previously worked with mysql mostly.
What kind of command would I run if I wanted to copy an entire table 
(along with renaming it, and, of course, all data from the first table - 
some of which is binary)?

Thanks,
-Mike
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [GENERAL] Novice Question

2005-03-01 Thread Michael Romagnoli
Sorry, I meant to ask about copying databases, not tables (including all
data in  the database as per below).
Thanks,
-Mike
Michael Romagnoli wrote:
I am new to postgresql, having previously worked with mysql mostly.
What kind of command would I run if I wanted to copy an entire table 
(along with renaming it, and, of course, all data from the first table 
- some of which is binary)?

Thanks,
-Mike
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [GENERAL] Novice Question

2005-03-01 Thread Sean Davis
On Mar 1, 2005, at 4:23 PM, Michael Romagnoli wrote:
I am new to postgresql, having previously worked with mysql mostly.
What kind of command would I run if I wanted to copy an entire table 
(along with renaming it, and, of course, all data from the first table 
- some of which is binary)?

Thanks,
-Mike
---(end of 
broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly
select * into table new_table from old_table;
That's it.
Sean
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] Novice Question

2005-03-01 Thread Bricklen Anderson
Sean Davis wrote:
On Mar 1, 2005, at 4:23 PM, Michael Romagnoli wrote:
I am new to postgresql, having previously worked with mysql mostly.
What kind of command would I run if I wanted to copy an entire table 
(along with renaming it, and, of course, all data from the first table 
- some of which is binary)?

Thanks,
-Mike
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly

select * into table new_table from old_table;
That's it.
Sean
you sure about that syntax?
How about:
create table new_table as select * from old_table;
--
___
This e-mail may be privileged and/or confidential, and the sender does
not waive any related rights and obligations. Any distribution, use or
copying of this e-mail or the information it contains by other than an
intended recipient is unauthorized. If you received this e-mail in
error, please advise me (by return e-mail or otherwise) immediately.
___
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [GENERAL] Novice Question

2005-03-01 Thread javier wilson
On Tue, 01 Mar 2005 16:30:19 -0500, Michael Romagnoli
[EMAIL PROTECTED] wrote:
 
 Sorry, I meant to ask about copying databases, not tables (including all
 data in  the database as per below).

you can do a pg_dump your_databaseyour_database.dump.sql
and then createdb to create your new database, and finally
pgsql -f your_database.dump.sql new_database

and that's it. you should probably use -Upostgres 
depending on what kind of security you use.

javier

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [GENERAL] Novice Question

2005-03-01 Thread Chris Kratz
create database newdb template olddb; 

works as well.

-Chris

On Tuesday 01 March 2005 05:08 pm, javier wilson wrote:
 On Tue, 01 Mar 2005 16:30:19 -0500, Michael Romagnoli

 [EMAIL PROTECTED] wrote:
  Sorry, I meant to ask about copying databases, not tables (including all
  data in  the database as per below).

 you can do a pg_dump your_databaseyour_database.dump.sql
 and then createdb to create your new database, and finally
 pgsql -f your_database.dump.sql new_database

 and that's it. you should probably use -Upostgres
 depending on what kind of security you use.

 javier

 ---(end of broadcast)---
 TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] Novice Question

2005-03-01 Thread Edmund Bacon
[EMAIL PROTECTED] (Michael Romagnoli) writes:

 What kind of command would I run if I wanted to copy an entire table
 (along with renaming it, and, of course, all data from the first table
 -
 some of which is binary)?

SELECT * INTO newtable FROM oldtable;

Note that this doesn't construct indexes, Foreign keys, constraints,
etc.

If by 'binary data' you mean BLOBs, I'd expect the above to work.
Other than that, AFAIUI you have no reasonable expectation that your data is
stored in any meaningful binary format by the database.  All data
could be internally stored as strings (though that might be very
slow).



-- 
Remove -42 for email

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org