[GENERAL] Re: [pgsql-general] Need Help With a A Simple Query That's Not So Simple

2011-11-01 Thread James B. Byrne

 Date: Mon, 31 Oct 2011 16:51:02 -0600
 From: Bill Thoen bth...@gisnet.com
 To: Postgrresql pgsql-general@postgresql.org
 Subject: Need Help With a A Simple Query That's Not So
 Simple
 Message-ID: 4eaf2656.6020...@gisnet.com

 I think this should be easy, but I can't seem to put the
 SQL together correctly and would appreciate any help.
 (I'm using Pg 8.4 in CentOS 5.5, if that matters.)

 I have a table of Farms and a table of crops in a 1:M
 relationship of Farms : Crops. There are lots of
 different crops to choose form but for now I'm only
 interested in two crops; corn and soybeans.

 Some farms grow only corn and some grow only soybeans,
 and some grow both.  What I'd like to know is, which
 Farms and how many are growing only corn, which and
 how many are growing soybeans and which and how many are
 growing both? I can easily get all the corn growers with:

 SELECT a.*
FROM farms a
JOIN crops b
  ON a.farm_id=b.farm_id
   WHERE crop_cd='0041'

 I can do the same with soybeans (crop_cd= '0081') and
 then I could subtract the sum of these from the total
 of all farms that grow either corn or soybeans to get
 the number of farms growing both, but having to
 do all those queries sounds very time consuming and
 inefficient. Is there a better way to get the farm
 counts or data by categories like farms growing only
 corn, farms growing only soybeans, farms growing
 both? I'm also interested in possibly expanding to a
 general case where I could select more than two crops.
 and get counts of the permutations.

 Here's a sketch of the relevant pieces of the data base.

 *Tables:*
 farms crops
 === ===
 farm_id  bigint (pkey) crop_id   (pkey)
 type farm_idforeign key to farms
 size crop_cd0041 = corn 0081=soybeans
 ...year
 ...

 Any help would be much appreciated.

 TIA,

 - Bill Thoen

I believe that what you are trying to do is called
relational algebra division. Take a look at these
references and see if either fits your needs:

http://en.wikipedia.org/wiki/Relational_algebra#Division_.28.C3.B7.29

http://www.cs.arizona.edu/~mccann/research/divpresentation.pdf





-- 
***  E-Mail is NOT a SECURE channel  ***
James B. Byrnemailto:byrn...@harte-lyne.ca
Harte  Lyne Limited  http://www.harte-lyne.ca
9 Brockley Drive  vox: +1 905 561 1241
Hamilton, Ontario fax: +1 905 561 0757
Canada  L8E 3C3


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Re: [pgsql-general] Need Help With a A Simple Query That's Not So Simple

2011-11-01 Thread Bill Thoen
Thanks! Half the problem searching the 'Net for answers is knowing what it's 
called.

Regards,

Bill Thoen
GISnet
http://gisnet.com
303-786-9961

On Nov 1, 2011, at 10:01 AM, James B. Byrne byrn...@harte-lyne.ca wrote:

 
 Date: Mon, 31 Oct 2011 16:51:02 -0600
 From: Bill Thoen bth...@gisnet.com
 To: Postgrresql pgsql-general@postgresql.org
 Subject: Need Help With a A Simple Query That's Not So
 Simple
 Message-ID: 4eaf2656.6020...@gisnet.com
 
 I think this should be easy, but I can't seem to put the
 SQL together correctly and would appreciate any help.
 (I'm using Pg 8.4 in CentOS 5.5, if that matters.)
 
 I have a table of Farms and a table of crops in a 1:M
 relationship of Farms : Crops. There are lots of
 different crops to choose form but for now I'm only
 interested in two crops; corn and soybeans.
 
 Some farms grow only corn and some grow only soybeans,
 and some grow both.  What I'd like to know is, which
 Farms and how many are growing only corn, which and
 how many are growing soybeans and which and how many are
 growing both? I can easily get all the corn growers with:
 
 SELECT a.*
   FROM farms a
   JOIN crops b
 ON a.farm_id=b.farm_id
  WHERE crop_cd='0041'
 
 I can do the same with soybeans (crop_cd= '0081') and
 then I could subtract the sum of these from the total
 of all farms that grow either corn or soybeans to get
 the number of farms growing both, but having to
 do all those queries sounds very time consuming and
 inefficient. Is there a better way to get the farm
 counts or data by categories like farms growing only
 corn, farms growing only soybeans, farms growing
 both? I'm also interested in possibly expanding to a
 general case where I could select more than two crops.
 and get counts of the permutations.
 
 Here's a sketch of the relevant pieces of the data base.
 
 *Tables:*
 farms crops
 === ===
 farm_id  bigint (pkey) crop_id   (pkey)
 type farm_idforeign key to farms
 size crop_cd0041 = corn 0081=soybeans
 ...year
 ...
 
 Any help would be much appreciated.
 
 TIA,
 
 - Bill Thoen
 
 I believe that what you are trying to do is called
 relational algebra division. Take a look at these
 references and see if either fits your needs:
 
 http://en.wikipedia.org/wiki/Relational_algebra#Division_.28.C3.B7.29
 
 http://www.cs.arizona.edu/~mccann/research/divpresentation.pdf
 
 
 
 
 
 -- 
 ***  E-Mail is NOT a SECURE channel  ***
 James B. Byrnemailto:byrn...@harte-lyne.ca
 Harte  Lyne Limited  http://www.harte-lyne.ca
 9 Brockley Drive  vox: +1 905 561 1241
 Hamilton, Ontario fax: +1 905 561 0757
 Canada  L8E 3C3
 
 
 -- 
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general
 

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Need Help With a A Simple Query That's Not So Simple

2011-10-31 Thread Bill Thoen
I think this should be easy, but I can't seem to put the SQL together 
correctly and would appreciate any help. (I'm using Pg 8.4 in CentOS 
5.5, if that matters.)


I have a table of Farms and a table of crops in a 1:M relationship of 
Farms : Crops. There are lots of different crops to choose form but for 
now I'm only interested in two crops; corn and soybeans. Some farms grow 
only corn and some grow only soybeans, and some grow both. What I'd like 
to know is, which Farms and how many are growing only corn, which and 
how many are growing soybeans and which and how many are growing both? I 
can easily get all the corn growers with:


SELECT a.*
  FROM farms a
  JOIN crops b
ON a.farm_id=b.farm_id
 WHERE crop_cd='0041'

I can do the same with soybeans (crop_cd= '0081') and then I could 
subtract the sum of these from the total of all farms that grow either 
corn or soybeans to get the number of farms growing both, but having to 
do all those queries sounds very time consuming and inefficient. Is 
there a better way to get the farm counts or data by categories like 
farms growing only corn, farms growing only soybeans, farms growing 
both? I'm also interested in possibly expanding to a general case where 
I could select more than two crops. and get counts of the permutations.


Here's a sketch of the relevant pieces of the data base.

*Tables:*
farms crops
=== ===
farm_id  bigint (pkey) crop_id   (pkey)
type farm_idforeign key to farms
size crop_cd0041 = corn 0081=soybeans
...year
...

Any help would be much appreciated.

TIA,

- Bill Thoen





Re: [GENERAL] Need Help With a A Simple Query That's Not So Simple

2011-10-31 Thread David Johnston
From: pgsql-general-ow...@postgresql.org
[mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Bill Thoen
Sent: Monday, October 31, 2011 6:51 PM
To: Postgrresql
Subject: [GENERAL] Need Help With a A Simple Query That's Not So Simple

I think this should be easy, but I can't seem to put the SQL together
correctly and would appreciate any help. (I'm using Pg 8.4 in CentOS 5.5, if
that matters.)

I have a table of Farms and a table of crops in a 1:M relationship of Farms
: Crops. There are lots of different crops to choose form but for now I'm
only interested in two crops; corn and soybeans. Some farms grow only corn
and some grow only soybeans, and some grow both. What I'd like to know is,
which Farms and how many are growing only corn, which and how many are
growing soybeans and which and how many are growing both? I can easily get
all the corn growers with: 

SELECT a.* 
  FROM farms a 
  JOIN crops b 
    ON a.farm_id=b.farm_id 
 WHERE crop_cd='0041'

I can do the same with soybeans (crop_cd= '0081') and then I could subtract
the sum of these from the total of all farms that grow either corn or
soybeans to get the number of farms growing both, but having to do all those
queries sounds very time consuming and inefficient. Is there a better way to
get the farm counts or data by categories like farms growing only corn,
farms growing only soybeans, farms growing both? I'm also interested in
possibly expanding to a general case where I could select more than two
crops. and get counts of the permutations.

Here's a sketch of the relevant pieces of the data base.

Tables:  
farms  crops
===    ===
farm_id  bigint (pkey) crop_id   (pkey)
type   farm_id    foreign key to farms
size   crop_cd    0041 = corn 0081=soybeans
...    year
   ...

Any help would be much appreciated.

TIA,

- Bill Thoen
---

General Idea:

WITH crop_one AS (
SELECT farm_id, crop_cd AS crop_one_cd ...
), crop_two AS (
SELECT farm_id, crop_cd AS crop_two_cd
)
SELECT *
FROM crop_one
FULL OUTER JOIN crop_two USING (farm_id)
;

Records with NULL for crop_one_cd only grow crop 2, records with NULL for
crop_two_cd only grow crop 1, records where neither field is NULL grow
both.

Not sure regarding the general case.  You likely want to use ARRAY_AGG to
get a result like:

Farm_id_100, { 'CROP_CD_1', 'CROP_CD_2' }

You could then probably get a query to output something like:
 (crop_id, farms_exclusive, farms_shared, farms_without)
Where each of the farms_ columns is an array of farm_ids that match the
particular conditional

= ALL (exclusive); != ALL  = ANY (shared); != ANY (without)

David J.





-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Need Help With a A Simple Query That's Not So Simple

2011-10-31 Thread Bill Thoen

On 10/31/2011 5:05 PM, David Johnston wrote:

From: pgsql-general-ow...@postgresql.org
[mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Bill Thoen
Sent: Monday, October 31, 2011 6:51 PM
To: Postgrresql
Subject: [GENERAL] Need Help With a A Simple Query That's Not So Simple

[...]
What I'd like to know is,
which Farms and how many are growing only corn, which and how many are
growing soybeans and which and how many are growing both?
[...]
Is there a better way to
get the farm counts or data by categories like farms growing only corn,
farms growing only soybeans, farms growing both? I'm also interested in
possibly expanding to a general case where I could select more than two
crops. and get counts of the permutations.
[...]
---

General Idea:

WITH crop_one AS (
SELECT farm_id, crop_cd AS crop_one_cd ...
), crop_two AS (
SELECT farm_id, crop_cd AS crop_two_cd
)
SELECT *
FROM crop_one
FULL OUTER JOIN crop_two USING (farm_id)
;

Records with NULL for crop_one_cd only grow crop 2, records with NULL for
crop_two_cd only grow crop 1, records where neither field is NULL grow
both.

Not sure regarding the general case.  You likely want to use ARRAY_AGG to
get a result like:

Farm_id_100, { 'CROP_CD_1', 'CROP_CD_2' }

You could then probably get a query to output something like:
  (crop_id, farms_exclusive, farms_shared, farms_without)
Where each of the farms_ columns is an array of farm_ids that match the
particular conditional

= ALL (exclusive); != ALL  = ANY (shared); != ANY (without)

David J.
Thanks David! That worked great! When I filled in the the query from the 
general idea in your example above like so:


WITH crop_one AS (
SELECT farm_id, crop_cd AS corn FROM gfc_inilmoidia_2007 WHERE 
crop_cd ='0041'

), crop_two AS (
SELECT farm_id, crop_cd AS soybeans FROM gfc_inilmoidia_2007 
WHERE crop_cd = '0081'

)
SELECT *
FROM crop_one
FULL OUTER JOIN crop_two USING (farm_id)
;

It produced the following (which is essentially the base of what I'm 
looking for):


 farm_id | corn | soybeans
-+--+--
1473 | 0041 | 0081
1474 | 0041 | 0081
1474 | 0041 | 0081
1474 | 0041 | 0081
1474 | 0041 | 0081
1475 | 0041 |
1475 | 0041 |
1476 | 0041 | 0081
1476 | 0041 | 0081
1476 | 0041 | 0081
1476 | 0041 | 0081
1476 | 0041 | 0081
1476 | 0041 | 0081
1476 | 0041 | 0081
1476 | 0041 | 0081
1476 | 0041 | 0081
1477 | 0041 |
1478 | 0041 | 0081
1479 | 0041 |
1480 |  | 0081
1480 |  | 0081

Thanks so much for the quick reply. You've also just opened up a whole 
new area of query possibilities for me of which I wasn't aware


- Bill Thoen