On 16 Nov 2001, Jason E. Stewart wrote:

> I'm building a tool that will enable scientists to load their
> experimental data into a database. That data will come as spreadsheets
> from the scientists. Each group of scientists will use slightly
> different technology to generate the data, so those spreadsheets are
> very likely to have different numbers of columns, and they will
> certainly have different data types in the various columns; however,
> within one group, all the data should have the same format (or a small
> set of formats).
> 
> So whatever solution I come up with needs to be flexible and store
> data no matter how many columns it has or what the data types of the
> fields are. One complication is that there are likely to be millions
> of rows from the spreadsheets, so I want it to be reasonably efficient
> (no joins if possible).
> 
> Variable length arrays seemed the obvious way to solve this. 
> 
> I just wanted to avoid having to create a new table for each
> spreadsheet configuration. A small finite number of tables would be
> fine, but I couldn't come up with a way.
That's your problem right here. You _should_ create a new table for each
spreadsheet format, as they are semantically different. Use unions to
provide a unified view of the fields that are indeed common.
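To make that concrete, here is a minimal sketch using SQLite through Python's
sqlite3 module. The table names, columns, and data are hypothetical, invented
purely for illustration: two groups with different spreadsheet formats each get
their own table, and a view built with UNION ALL exposes only the columns the
formats share.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical formats: group A records an expression level,
# group B records a ratio plus a p-value. Different column counts
# and types, so each format gets its own table.
cur.execute("CREATE TABLE group_a (sample_id TEXT, gene TEXT, expression REAL)")
cur.execute("CREATE TABLE group_b (sample_id TEXT, gene TEXT, ratio REAL, pvalue REAL)")

cur.execute("INSERT INTO group_a VALUES ('s1', 'BRCA1', 2.5)")
cur.execute("INSERT INTO group_b VALUES ('s2', 'TP53', 0.8, 0.01)")

# A view unifying only the fields common to both formats.
cur.execute("""
    CREATE VIEW all_measurements AS
    SELECT sample_id, gene FROM group_a
    UNION ALL
    SELECT sample_id, gene FROM group_b
""")

rows = cur.execute(
    "SELECT sample_id, gene FROM all_measurements ORDER BY sample_id"
).fetchall()
print(rows)  # [('s1', 'BRCA1'), ('s2', 'TP53')]
```

Queries that only need the common fields go through the view; queries that need
a group's format-specific columns hit that group's table directly, with no join.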

Skimping on proper relational design in the short term is likely to bite
you in the long term.

Putting dissimilar values into the same field is only inviting trouble.

-alex
