On Fri, 12 Nov 2010, Tom Lane wrote: > Generally, wanting more than a few dozen columns is a > good sign that you need to rethink your schema design. > What are you trying to accomplish exactly?
Generally speaking, yes. However, survey instruments are a very legitimate design where this is not only correct, but quite desirable. Sure -- in an ideal world you might be able to turn your instrument into something that is a highly normalized structure. However, it's simply not practical. PostgreSQL absolutely should look into supporting this. While the storage may not be ideal, it is an increasingly common case. Perhaps the storage engine could permit a set of columns to be tagged for ``sparse column storage``? On Fri, 12 Nov 2010, Mark Michell wrote: > Yes I understand that this is "bad design" but what we are doing is > storing each form field in a survey in its own column. For very long > surveys we end up with thousands of elements. It's *not* necessarily bad design. If you've got a survey instrument that has 1000 unique data points, it is a reasonable thing to do. We (Prometheus Research) have these sorts of things all the times. #1 You want to store and query them based on specific fields. #2 You want to enforce the data integrity of each column by using types, enums and other techniques. #3 You want to integrate this structure with the other aspects of your database. > I know storing in an array is possible but it makes it > so much easier to query the data set when each element > is in its own field. comments on why I should not do this Using HSTORE or using an EAV table completely undermines the ability to ensure column-level typing, constraints, and... use SQL. With arrays / EAV table structure, you end up having to re-invent your query language. It is a non-solution. > the possible alternatives The work-around we use is to break our instrument into one table per ``section``. We also save each section as the user completes that part of the instrument. This has the advantage of allowing you to model repeating sections. We typically enforce column uniqueness across all sections. This requires your application and query tool to be a bit smart about making a virtual table with all 1000+ columns. Kinda sucks, but it works. ... I guess the other option is to look at a RDF data store where you use SPARQL to query the data. OpenJena SDB is implemented as a 3-table storage in PostgreSQL if you wish to keep the same database for both your assessment storage and other tables. I'm not lovin' this option, but it's on the table for our company. Instead, I'd absolutely love to see PostgreSQL add a variant scalar type (for when the type of a column changes over the lifetime of the instrument) and a column storage for Nth+ and additional columns. For now, the work-around above sort of works for us. Best, Clark -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general