RE: Best design for a table using variant data

Rick James Mon, 20 Aug 2012 17:20:41 -0700

Even if you get past those issues, you will get to other nasty problems...  
Ugly JOINs, terrible performance, huge disk footprint, etc.


Split the attributes into two groups:
* The ones you _really_ need to search on.
* The rest.
Build columns for the first group; use the appropriate datatypes.
Throw the rest into a JSON string, put that into another column.  (I like to 
compress the JSON and put it into a MEDIUMBLOB.)

For searching,...
Use SQL (WHERE ...) to filter on any attributes in the first group.
Then use your programming language (PHP, etc) to finish the filtering, after 
fetching (and uncompressing) the JSON.

> -----Original Message-----
> From: Gaston Gloesener [mailto:[email protected]]
> Sent: Sunday, August 12, 2012 10:58 PM
> To: [email protected]
> Subject: RE: Best design for a table using variant data
> 
> > You don't specify how many different types (including min/max values)
> > you
> expect to be using. If you expect to end up with a few hundred, then
> you should perhaps consider using an ENUM or SET column directly in the
> data table.
> 
> I do not quite understand where the use of enums/sets could be help
> here.
> 
> Regarding the number of basic types there are only expected to be a few
> similar to a programming language: At the beginning there will be 4
> types (integer, 64-bit integer, floating point (probably double) and
> strings with their associated constraints. Each single attribute (i.e.
> table row) would only use one of these types.
> 
> Possibly my explanations where not clear enough, so let me take a
> virtual example which would match my needs: A database which represents
> a second hand shop. This shop would contain a number of things of
> different types.
> Let's take "cars" and "clothes" as example. Those two would have a
> number off attributes which depend on the type. I.e. the car would have
> Brand (String), Model (String), "manufacturing year" (Inetger, max
> 2012), horsepower (integer), ... while some of these are meaningless
> for the clothes which will have "size", ...
> 
> These attributes are not known at development type and are to be
> defined by the users of the database and may be extended over time. As
> said above this is a virtual example which does not match the actual
> much more complex database use.
> 
> / Gaston
> 
> -----Original Message-----
> From: Carsten Pedersen [mailto:[email protected]]
> Sent: Friday, August 10, 2012 15:33
> To: Gaston Gloesener
> Cc: [email protected]
> Subject: Re: Best design for a table using variant data
> 
> You don't specify how many different types (including min/max values)
> you expect to be using. If you expect to end up with a few hundred,
> then you should perhaps consider using an ENUM or SET column directly
> in the data table.
> 
> / Carsten
> 
> On 10.08.2012 10:51, Gaston Gloesener wrote:
> > Hello,
> >
> >
> >
> > I am currently facing a design where a table (virtually) needs to
> > store attributes of a topic (related table). The attributes can be
> > user defined, i.e. not known at development type and depend on other
> > factors. Each attributes value can be one of different types (int,
> > int64, double, string) and may have constraints like min, max or
> > length
> (string).
> >
> >
> >
> > Thus the data type would be modeled as variant in some programming
> > languages, but this is not an option in SQL (beside the MS SQLserver
> > sql_variant extension).
> >
> >
> >
> > So, how to simulate this in SQL.
> >
> >
> >
> > Basically there would be one table describing the attributes type
> > (Type identifier, min/max,.) and one table for the values itself.
> >
> >
> >
> > The design I am currently thinking of would be to make exactly these
> > two tables, with the attributes having a Dataype column and iMin,iMax
> > for integer, i64Min, i64Max, fMin,fMax for double , sMinLen, sMaxLen
> > for strings. The same applies to the value table which will have
> > iValue, i64Value, fValue, string columns to hold the actual data.
> >
> >
> >
> > Now the columns will be filled according to the data type, columns
> not
> > matching the type will be NULL.
> >
> >
> >
> > This means that each row in the table will have virtual space for any
> > data type which violates database normalization. However it seems to
> > me to be the best deal for performance and data space as NULL takes
> > virtually no room
> > (4/8 bytes in total for a number of fields in some circumstances) and
> > requires no complex queries.
> >
> >
> >
> > One could also imagine to have the constraints moved to a separate
> > table and interpreted according to the data type. Also a table for
> > each type could be imagined but this will make the queries very
> > complicated working against performance.
> >
> >
> >
> > Note: The model has to work for huge databases
> >
> >
> >
> > Anybody has a better alternative ?
> >
> >
> >
> >
> >
> >
> 
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:    http://lists.mysql.com/mysql
> 
> 
> 
> 
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:    http://lists.mysql.com/mysql


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql

RE: Best design for a table using variant data

Reply via email to