Re: [GENERAL] dynamic crosstab

2008-02-20 Thread Balázs Klein

I always hope that somebody might have something similar but
 generic - eg. create those columns automatically and just treat them all
 as text.

I came up with this amateurish one based on 
http://www.ledscripts.com/tech/article/view/5.html. 
Maybe someone can use it:
takes
- a select statement
- a name for the resulting view
- the column name of the id
- the column name of the attribute
- the column name of the value
- the aggregate function used

It recreates the view of the given name as a crosstab of the sql specified.

CREATE OR REPLACE FUNCTION public.create_crosstab_view (eavsql_inarg 
varchar, resview varchar, rowid varchar, colid varchar, val varchar, agr 
varchar) RETURNS pg_catalog.void AS
$body$
DECLARE
casesql varchar;
dynsql varchar;
r record;
BEGIN   
 dynsql='';
 
 for r in 
  select * from pg_views where lower(viewname) = lower(resview)
  loop
  execute 'DROP VIEW ' || resview;
  end loop;   
 
 casesql='SELECT DISTINCT ' || colid || ' AS v from (' || eavsql_inarg || ') 
eav ORDER BY ' || colid;
 FOR r IN EXECUTE casesql Loop
dynsql = dynsql || ', ' || agr || '(CASE WHEN ' || colid || '=' || r.v || ' 
THEN ' || val || ' ELSE NULL END) AS ' || agr || '_' || r.v;
 END LOOP;
 dynsql = 'CREATE VIEW ' || resview || ' AS SELECT ' || rowid || dynsql || ' 
from (' || eavsql_inarg || ') eav GROUP BY ' || rowid;  
 EXECUTE dynsql;
END
$body$
LANGUAGE 'plpgsql' VOLATILE CALLED ON NULL INPUT SECURITY INVOKER;


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] dynamic crosstab

2008-02-15 Thread Balázs Klein
Joe wrote
 It occurs to me that it shouldn't be terribly difficult to make an
 alternate version of crosstab() that returns an array rather than tuples
 (back when crosstab() was first written, Postgres didn't support NULL
 array elements). Is this worth considering for 8.4?

I think there should be a generic way in Postgres to return from an EAV model. 
Although I have no evidence on that I keep thinking that the db must be more 
effective at that than the application would be.

I was hoping that now with PG supporting plan invalidation it would be possible 
to return a recordset. If there is no generic way to return a recordset than 
being able to return an array is much better than nothing.

B.


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] dynamic crosstab

2008-02-15 Thread Balázs Klein
Erik Jones wrote:
 First, please stop top-posting.  It makes it difficult for both me
 and others to know to whom/what you are replying.

Sorry, I don't know much about mailing list customs - I had to look up what 
top-posting is. I will behave now ... 

I would prefer to keep the complications for when I retrieve the data rather 
then when I store it.

I could imagine something like this though to create a crosstab as an array, 
but I am afraid that there is no assurance that the resulting array would 
contain the values in the same order for each focus:

tbl(eID, aID, value)

Select eID, array_accum(value) from 
(
 (Select Distinct eID from tbl) e
  CROSS JOIN
 (Select Distinct aID from tbl) a
) ea
 LEFT OUTER JOIN
tbl USING (eID, aID)
GROUP BY eID

B.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] dynamic crosstab

2008-02-15 Thread Balázs Klein
 given that answers for a questionnaire are stored as a
 batch 

Not in our setup - for all sorts of reasons (preserving responses on a 
connection failure or restart, monitoring response latency in real time, 
creating adaptive/branching questionnaires) we send each response separately.

 people running reports on will be the ones to notice, i.e. at
 retrieval time.

I am not sure - different responses are aggregated into different attributes in 
different ways - those properties need to be retrieved during scoring/report 
generation, so being able to create a join directly on a response is a good 
thing for me. But report generation - in our case it must be a DTP quality PDF 
- is such a beast anyway that db times dwarf compared to pdf generation.

The problem comes when I need to present the responses themselves in a 
human-friendly way - as an export or display or report. Do you think there is a 
way to ensure that the order of the values in the array below is the same for 
each person?

tbl(eID, aID, value)

Select eID, array_accum(value) from
(
 (Select Distinct eID from tbl) e
  CROSS JOIN
 (Select Distinct aID from tbl) a
) ea
 LEFT OUTER JOIN
tbl USING (eID, aID)
GROUP BY eID


Thx for the help.
B.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] dynamic crosstab

2008-02-15 Thread Balázs Klein

 Balázs Klein wrote:
 
  I was hoping that now with PG supporting plan invalidation it would
  be possible to return a recordset.
 
 Plan invalidation has nothing to do with it. In Postgres a returned
 recordset can be used as a row source in the FROM clause -- this
 requires data type information to be known at parse time.
 
 Joe

I thought that it includes that the return type can be changed/redefined at 
runtime. No luck there than.

Thx.
B.


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] dynamic crosstab

2008-02-15 Thread Balázs Klein


 -Original Message-
  Do youthink there is a way to ensure that the order of the values in the
  array below is the same for each person?
 
  tbl(eID, aID, value)
 
  Select eID, array_accum(value) from
  (
   (Select Distinct eID from tbl) e
CROSS JOIN
   (Select Distinct aID from tbl) a
  ) ea
   LEFT OUTER JOIN
  tbl USING (eID, aID)
  GROUP BY eID
 
 The only way to ever guarantee a particular order is via an ORDER BY
 clause.

Sure. I just didn’t know where to put it - most aggregates don't care about the 
row order, but for this one it is important.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] dynamic crosstab

2008-02-14 Thread Balázs Klein

Hi,
Yes I know that SPSS can do this - in fact that is the only way I could solve 
this so far, but that is a very expensive workaround for anybody not currently 
owning SPSS.

Thanks.
SWK



-Original Message-
From: jr [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, February 13, 2008 1:31 PM
To: SunWuKung
Subject: Re: dynamic crosstab

hi SWK

SunWuKung wrote:
 I know that most db people don't care much about pivot/crosstab in the
 db but imagine this situation:
 I am storing questionnaire results on people. Since the questionnaires
 are created by users I have no other way than using an EAV model like

are you using the right tool for this task?

 Moreover my users can't do anything with this dataformat - they need
 to pivot it offline anyway, which is not easy (Excel cant do it,
 Access cant do it, numberGo cant do it for different reasons).

back at college we used SPSS - the Statistical Package for Social Sciences.

 Please let me know if you know of a good db based way to create a
 dynamic crosstab in Postgres - or why there shouldn't be one.

to be honest I don't; I think that a specialised product (such as SPSS) 
will solve both problems in one stroke.

-- 

regards, jr.  ([EMAIL PROTECTED])


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] dynamic crosstab

2008-02-14 Thread Balázs Klein
Hi,
ye, hundreds of columns - but there is no helping it, that’s the way many 
questionnaire are and the representation of the responses (when not in a 
database) is always one person per row. I would need this for exporting, but 
also to show results online.

Although it’s a good idea I am afraid that an array could only help me when the 
info I store about all the persons in the query are exactly the same (there 
wouldn’t be empty cells in a crosstab) - it’s very useful for some cases but in 
general that sounds like a dangerous presumption for me.

I think this is a generic shortcoming of Postgres - whenever you are forced to 
create an EAV (Entity-Attribute-Value) model you have no generic or way of 
going back to the usual one entity per row model. This is something that Access 
has been able to do (up to 255 columns) as far as I can remember. When I google 
about this topic I find that the majority of people are still referring to that 
solution as the easiest for this purpose. Tablefunc crosstab is so close to a 
good solution for this with the syntax where you could specify the columns with 
a query - the only shortcoming is that you still have to enumerate the columns 
and their datatype. I always hope that somebody might have something similar 
but generic - eg. create those columns automatically and just treat them all as 
text.

Regards,
SWK

-Original Message-
From: Tino Wildenhain [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, February 13, 2008 2:05 PM
To: SunWuKung
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] dynamic crosstab

Hi,

SunWuKung wrote:
 Hi,
 
 I found this to create dynamic crosstabs (where the resulting columns
...
 This could work although for hundreds of columns it looks a bit scary
 for me.

Well I'd say hundreds of columns are always scary, no matter how you do
it :-)

...
 I know that most db people don't care much about pivot/crosstab in the
 db but imagine this situation:
 I am storing questionnaire results on people. Since the questionnaires
 are created by users I have no other way than using an EAV model like
 personID, questionID, responseValue to store responses. Now this table
 gets long 300 question per questionnaire, 3000 people and we have 1m
 row. Now whenever I need to download this data in my case 2/3rd of it
 would be redundant if I could pivot it first - and in a 20MB csv its
 significant (I know its a tradeoff between processing and storage).
 Moreover my users can't do anything with this dataformat - they need
 to pivot it offline anyway, which is not easy (Excel cant do it,
 Access cant do it, numberGo cant do it for different reasons).

What about not pivoting it? You can run your analysis directly
against your database.

 Although the application could do it I think this is a generic
 functionality that the database is more suited for.

Well after all you want a CSV not a table. You could shortcut this
with a generic query which creates array out of your columns
and join them to a CSV line. This would just be outputted as
one single column from database.

 Please let me know if you know of a good db based way to create a
 dynamic crosstab in Postgres - or why there shouldn't be one.

See above :-)

Regards
Tino

Internal Virus Database is out-of-date.
Checked by AVG Free Edition. 
Version: 7.5.503 / Virus Database: 269.16.0/1137 - Release Date: 11/18/2007 
5:15 PM
 


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] dynamic crosstab

2008-02-14 Thread Balázs Klein
Yes, once I have the select outputting it to CSV is not a problem. As you say 
PG handles that nicely.

Thx
SWK

-Original Message-
From: Reece Hart [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, February 13, 2008 9:39 PM
To: Tino Wildenhain
Cc: SunWuKung; pgsql-general@postgresql.org
Subject: Re: [GENERAL] dynamic crosstab

On Wed, 2008-02-13 at 14:04 +0100, Tino Wildenhain wrote:
 Well after all you want a CSV not a table. You could shortcut this
 with a generic query which creates array out of your columns
 and join them to a CSV line. This would just be outputted as
 one single column from database.

Depending on your use case, this may be a better way:

In psql:
= \copy (select col1,col2,col3 from data) TO data.csv CSV HEADER

or on the command line:
$ psql -c '\copy (select col1,col2,col3 from data) TO data.csv CSV 
HEADER'

Strictly speaking, the CSV formatting isn't being done in the database
but rather by psql.

-Reece

-- 
Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] dynamic crosstab

2008-02-14 Thread Balázs Klein

Hi,
the part that I don't know is how to put those NULLs in.
It could well be doable I just can't do it myself.

How does the query look like that produces from this input:
PersonID AttributeID Value
1   1   aaa
1   2   bbb
1   3   ccc
2   1   ddd
2   3   eee

this output, without manually enumerating the attributeids:
1 (aaa,bbb,ccc)
2 (ddd,NULL,eee)

Thx.
B.

-Original Message-
From: Erik Jones [mailto:[EMAIL PROTECTED] 
Sent: Thursday, February 14, 2008 5:15 PM
To: Balázs Klein
Cc: 'Tino Wildenhain'; 'SunWuKung'; pgsql-general@postgresql.org
Subject: Re: [GENERAL] dynamic crosstab


On Feb 14, 2008, at 2:04 AM, Balázs Klein wrote:

 Hi,
 ye, hundreds of columns - but there is no helping it, that’s the  
 way many questionnaire are and the representation of the responses  
 (when not in a database) is always one person per row. I would need  
 this for exporting, but also to show results online.

 Although it’s a good idea I am afraid that an array could only help  
 me when the info I store about all the persons in the query are  
 exactly the same (there wouldn’t be empty cells in a crosstab) -  
 it’s very useful for some cases but in general that sounds like a  
 dangerous presumption for me.

As of versions = 8.2 you can store NULL values in arrays.  Perhaps  
you could have a Question - Index table and then use an array per  
person for their answers.


 I think this is a generic shortcoming of Postgres - whenever you  
 are forced to create an EAV (Entity-Attribute-Value) model you have  
 no generic or way of going back to the usual one entity per row  
 model. This is something that Access has been able to do (up to 255  
 columns) as far as I can remember. When I google about this topic I  
 find that the majority of people are still referring to that  
 solution as the easiest for this purpose. Tablefunc crosstab is so  
 close to a good solution for this with the syntax where you could  
 specify the columns with a query - the only shortcoming is that you  
 still have to enumerate the columns and their datatype. I always  
 hope that somebody might have something similar but generic - eg.  
 create those columns automatically and just treat them all as text.

Have a look at http://www.varlena.com/varlena/GeneralBits/110.php for  
a totally different approach to questionnaires.

Erik Jones

DBA | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate  market in style.
Visit us online at http://www.myemma.com




---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match