Do not fall too easilly into hstore :-) while it sometimes good and extremely easy to setup, it has some drawbacks - querying and searching has some limitations (keys/values not easily indexable, horrible syntax) - storage not optimised (one hstore field = several dozens of boolean columns) - only text values, no data type validation
I'd recommend option (4) - normalize! FR 2011/12/5 Mike Christensen <m...@kitchenpc.com> > >> I have a database full of recipes, one recipe per row. I need to > >> store a bunch of arbitrary "flags" for each recipe to mark various > >> properties such as Gluton-Free, No meat, No Red Meat, No Pork, No > >> Animals, Quick, Easy, Low Fat, Low Sugar, Low Calorie, Low Sodium and > >> Low Carb. Users need to be able to search for recipes that contain > >> one or more of those flags by checking checkboxes in the UI. > >> > >> I'm searching for the best way to store these properties in the > >> Recipes table. > > > > I'd use hstore to store them as tags. You can then use hstore's GiST > index > > support to get quick lookups. > >> > >> 1. Have a separate column for each property and create an index on > >> each of those columns. I may have upwards of about 20 of these > >> properties, so I'm wondering if there's any drawbacks with creating a > >> whole bunch of BOOL columns on a single table. > > > > It'll get frustrating as you start adding new categories, and will drive > you > > insane as soon as you want to let the user define their own categories - > > which you will land up wanting to do in your problem space. I'd avoid it. > >> > >> 2. Use a bitmask for all properties and store the whole thing in one > >> numeric column that contains the appropriate number of bits. Create a > >> separate index on each bit so searches will be fast. > > > > Same as above, it'll get annoying to manage when you want user tagging. > >> > >> 3. Create an ENUM with a value for each tag, then create a column that > >> has an ARRAY of that ENUM type. I believe an ANY clause on an array > >> column can use an INDEX, but have never done this. > > > > Same again. > >> > >> 4. Create a separate table that has a one-to-many mapping of recipes > >> to tags. Each tag would be a row in this table. The table would > >> contain a link to the recipe, and an ENUM value for which tag is "on" > >> for that recipe. When querying, I'd have to do a nested SELECT to > >> filter out recipes that didn't contain at least one of these tags. I > >> think this is the more "normal" way of doing this, but it does make > >> certain queries more complicated - If I want to query for 100 recipes > >> and also display all their tags, I'd have to use an INNER JOIN and > >> consolidate the rows, or use a nested SELECT and aggregate on the fly. > > > > That'll get slow. It'll work and is IMO better than all the other options > > you suggested, but I'd probably favour hstore over it. > > The hstore module sounds fantastic! > > I'm curious as to how these columns are serialized back through the > driver, such as Npgsql. Do I get the values as strings, such as a > comma delimited key/value pair list? Or would I need to do some > custom logic to deserialize them? > > Right now, I'm using Npgsql as a driver, and NHibernate/Castle > ActiveRecord as an ORM. > > Mike > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >