Lee Hachadoorian wrote:
I work with state labor data which is reported to us in the form

        industry, year, quarter1, quarter2, quarter3, quarter4

where each quarter represents an employment count. Obviously, this can
be normalized to

        industry, year, quarter, employment
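
As a rough sketch of that normalization, the wide rows can be unpivoted into one row per quarter. The example below uses Python's built-in sqlite3 module only so that it runs without a server; the table names employment_wide and employment_long and the sample row are made up for illustration, and the same INSERT ... SELECT ... UNION ALL pattern should carry over to PostgreSQL.

```python
import sqlite3

# In-memory SQLite database, used here as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table matching the reported layout.
    CREATE TABLE employment_wide (
        industry INTEGER, year INTEGER,
        quarter1 INTEGER, quarter2 INTEGER, quarter3 INTEGER, quarter4 INTEGER
    );
    -- Hypothetical normalized table: one row per industry, year and quarter.
    CREATE TABLE employment_long (
        industry INTEGER, year INTEGER, quarter INTEGER, employment INTEGER,
        PRIMARY KEY (industry, year, quarter)
    );
""")

# Made-up sample row, not real labor data.
conn.execute("INSERT INTO employment_wide VALUES (11, 2009, 100, 110, 120, 130)")

# Unpivot: each quarter column becomes its own row.
conn.execute("""
    INSERT INTO employment_long (industry, year, quarter, employment)
    SELECT industry, year, 1, quarter1 FROM employment_wide
    UNION ALL
    SELECT industry, year, 2, quarter2 FROM employment_wide
    UNION ALL
    SELECT industry, year, 3, quarter3 FROM employment_wide
    UNION ALL
    SELECT industry, year, 4, quarter4 FROM employment_wide
""")

long_rows = conn.execute(
    "SELECT industry, year, quarter, employment "
    "FROM employment_long ORDER BY quarter"
).fetchall()
print(long_rows)
```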

Can anyone comment on, or point me to an article or discussion
regarding, why one would use an array column instead of normalizing
the data? That is, would there be any benefit to storing it as

        industry int, year smallint, employment int[ ]

where the last column would be a four element array with data for the
four quarters.

Thanks,
--Lee

--
Lee Hachadoorian
PhD Student, Geography
Program in Earth & Environmental Sciences
CUNY Graduate Center

If you want to do that, I'd recommend:
industry int,
year    smallint,
emp_q1  int,
emp_q2  int,
emp_q3  int,
emp_q4  int

That way it is clearer, easier to query, and uses less space, and you won't end up with employment data for a fifth quarter or something odd like that.
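
A minimal sketch of how that layout behaves, again using Python's sqlite3 module as a stand-in for PostgreSQL so it can be run as is. The table name employment, the primary key and the sample values are assumptions made for illustration; only the column names come from the list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employment (
        industry INTEGER,
        year     INTEGER,
        emp_q1   INTEGER,
        emp_q2   INTEGER,
        emp_q3   INTEGER,
        emp_q4   INTEGER,
        PRIMARY KEY (industry, year)  -- assumed key, not stated in the thread
    )
""")

# Made-up sample row, not real labor data.
conn.execute("INSERT INTO employment VALUES (11, 2009, 100, 110, 120, 130)")

# A single quarter is just a named column.
q3 = conn.execute(
    "SELECT emp_q3 FROM employment WHERE industry = 11 AND year = 2009"
).fetchone()[0]

# An annual total is a plain expression over the four columns.
total = conn.execute(
    "SELECT emp_q1 + emp_q2 + emp_q3 + emp_q4 "
    "FROM employment WHERE industry = 11 AND year = 2009"
).fetchone()[0]

# There is no place to put a fifth quarter: the column does not exist,
# so the statement is rejected before any data is written.
try:
    conn.execute(
        "INSERT INTO employment (industry, year, emp_q5) VALUES (11, 2010, 140)"
    )
    fifth_quarter_accepted = True
except sqlite3.OperationalError:
    fifth_quarter_accepted = False

print(q3, total, fifth_quarter_accepted)
```

With an array column, by contrast, nothing in the column type itself would fix the number of elements at four.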

Arrays are great for working with your data during the query process. But you should generally avoid using them to store your data on disk.

Scott

