Re: [GENERAL] how do i count() similar items

2010-09-08 Thread jackassplus
>
> Use a CASE statement?
>
> something like:
> select  case WHEN os ~* E'^windows' then 'windows'
>   WHEN os ~* E'server' then 'server'
>   WHEN os ~* E'nix$' then '*nix'
>   else 'other' end
>   as osval, count(*) from os_tbl  group by osval order by osval;
>
> The hard part is making sure your regexes cover all the bases, without
> duplication.
>
> It still sounds like the value should be a reference to a unique value in a
> small table of operating system entries, then store the value, rather than
> the string, in the main table.
>
> Susan
>

I'll give this a shot. looks like it will work well. Regexes, I don't mind,
SQL, I'm new at. Much less the pg functions.
thanks to both you and bricklin.


Re: [GENERAL] how do i count() similar items

2010-09-08 Thread Susan Cassidy
Use a CASE statement?

something like:
select  case WHEN os ~* E'^windows' then 'windows'
   WHEN os ~* E'server' then 'server'
   WHEN os ~* E'nix$' then '*nix'
   else 'other' end
   as osval, count(*) from os_tbl  group by osval order by osval;

The hard part is making sure your regexes cover all the bases, without 
duplication.

It still sounds like the value should be a reference to a unique value in a 
small table of operating system entries, then store the value, rather than the 
string, in the main table.

Susan

-Original Message-
From: pgsql-general-ow...@postgresql.org 
[mailto:pgsql-general-ow...@postgresql.org] On Behalf Of jackassplus
Sent: Wednesday, September 08, 2010 12:22 PM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] how do i count() similar items


 
> To ensure data integrity,
> you should probably create a fruit_type table with a unique column that
> lists the possible types, and then foreign key the fruit_type column in
> the fruits table to that to ensure nothing funky is entered.  An enum
> for type is another possibility.

In the real world, this column actiually holds Operating Systems.
I have 7 variants of Windows XP, even more of server, a dozen *nixes,
etc, etc and it is fed from an external app.
So I am looking for a magic query, or even a perl function to wrap up
insde a procedure, whatever.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] how do i count() similar items

2010-09-08 Thread bricklen
On Wed, Sep 8, 2010 at 12:22 PM, jackassplus  wrote:
>
> 
>> To ensure data integrity,
>> you should probably create a fruit_type table with a unique column that
>> lists the possible types, and then foreign key the fruit_type column in
>> the fruits table to that to ensure nothing funky is entered.  An enum
>> for type is another possibility.
>
> In the real world, this column actiually holds Operating Systems.
> I have 7 variants of Windows XP, even more of server, a dozen *nixes,
> etc, etc and it is fed from an external app.
> So I am looking for a magic query, or even a perl function to wrap up
> insde a procedure, whatever.

If your column values can be grouped by regexp, you might be able to
get away with using a CASE statement.
eg.
select
sum(case when val ~* 'windows xp' then 1 else 0 end) as winxp,
sum(case when val ~* 'nix' then 1 else 0 end) as nix,
sum(case when val ~* 'redhat|rhel' then 1 else 0 end) as rh


(I don't have your original mail handy to reference the example values).

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] how do i count() similar items

2010-09-08 Thread jackassplus

 
> To ensure data integrity,
> you should probably create a fruit_type table with a unique column that
> lists the possible types, and then foreign key the fruit_type column in
> the fruits table to that to ensure nothing funky is entered.  An enum
> for type is another possibility.

In the real world, this column actiually holds Operating Systems.
I have 7 variants of Windows XP, even more of server, a dozen *nixes,
etc, etc and it is fed from an external app.
So I am looking for a magic query, or even a perl function to wrap up
insde a procedure, whatever.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] how do i count() similar items

2010-09-08 Thread Darren Duncan

jackassplus wrote:

lets say I hve the following in the 'fruits' table:

Round orange
Sunkist orange
navel orange
strawberry
blueberry
sunkist orange
apple

how would I get something like the following:

count as c | Fruit type
-
4  | orange
2  | berry
1  | apple


Your best solution is to have separate database fields for your 2 levels of 
detail, say call them "fruit type" and "variety", or maybe other fields 
depending on purpose.


This is because the organization of fruit is somewhat arbitrary and isn't easily 
encoded into variety names without kludges.  Also, many fruit with similar names 
are actually very different.  So simple textual analysis of the field you have 
often won't be very useful.


For example, a kiwifruit is a berry, and a strawberry is very different from a 
blueberry, the first not actually being a berry at all.  There are also various 
citrus which are crosses between oranges and other citrus.  There are also stone 
fruits which are crosses between plums and apricots.  Also, "Sunkist" isn't a 
variety of orange but rather is a brand name used for multiple varieties.


(Yes, members of my family cultivate fruit trees, so I have a lot of second-hand 
experience with this.)


If you still want to go by textual analysis as you suggest, it will be 
nontrivial and involve pattern matching for common suffixes where some are 
separate words and some aren't, and you can do this pattern matching in an extra 
select-list item which you then group by.


-- Darren Duncan

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] how do i count() similar items

2010-09-08 Thread Bill Moran
In response to jackassplus :

> lets say I have the following in the 'fruits' table:
> 
> Round orange
> Sunkist orange
> navel orange
> strawberry
> blueberry
> sunkist orange
> apple
> 
> how would I get something like the following:
> 
> count as c | Fruit type
> -
> 4  | orange
> 2  | berry
> 1  | apple

Organize your data better.

If you want to analyze data in the way you describe, then you need to store
the data in a way that makes in analyzable.  I'm 100% sure that someone can
provide you with some query magic that will make the example you provided
produce the results you're wanting.  I'm also 100% sure that a few weeks
or months down the line you find exceptions that will break that solution,
and that said solution will never really be reliable.

If you're going to analyze fruits by type, then you need a type column.
You can then group by type to get count()s.  To ensure data integrity,
you should probably create a fruit_type table with a unique column that
lists the possible types, and then foreign key the fruit_type column in
the fruits table to that to ensure nothing funky is entered.  An enum
for type is another possibility.

-- 
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general