Hi, Maybe not completely the wright place to ask but... I have this schema design question (db is postgres of course). I have a couple of classes with attributes. The only goal is to search the object that I want to find (which is stored on the harddrive).
I have hundreds of classes that are similar but not the same. They all have attributes/properties (type is probably String), e.g. (in pseudo code): class A_version_1 { attribute1, attribute2, attribute3, ..., attributeN } class A_version_2 { attribute1, attribute3, ..., attributeN, attributeN+1, attributeN+2 } class B_version_1 { attribute3, attribute4, attribute7, attributeN+3, ..., attributeN+M } Class A will have attributes from class B, class B will have attributes from class C and so on. My initial thought was to use the (sometimes dreaded) EAV model: class_id, object_id, attribute_id and attribute_value. In this way I can make queries like: SELECT CLASS_ID, OBJECT_ID FROM EAV_TABLE EAV WHERE EAV.ATTRIBUTE_ID = X AND EAV.ATTRIBUTE_VALUE = 'searchstring' AND EXISTS (SELECT OBJECT_ID FROM EAV_TABLE EAV2 WHERE EAV.OBJECT_ID = EAV2.OBJECT_ID AND EAV.CLASS_ID = EAV2.CLASS_ID AND EAV2.ATTRIBUTE_ID = Y AND EAV2.ATTRIBUTE_VALUE = 'searchstring2') Results from this query could be entities from multiple classes! The alternative is, as many people say: make a proper table for each class which would lead to hundreds of unions. Is that good/performant? I thought it would not... To put all attributes of all classes (as columns) in one table is impossible. The number of total attributes should be in the thousands. A third alternative I came up with is the entity/value schema design where each attribute would have its own table. A query would look like this: SELECT CLASS_ID, OBJECT_ID FROM EV_X EAV WHERE EAV.ATTRIBUTE_VALUE = 'searchstring' AND EXISTS (SELECT OBJECT_ID FROM EV_Y EAV2 WHERE EAV.OBJECT_ID = EAV2.OBJECT_ID AND EAV.CLASS_ID = EAV2.CLASS_ID AND EAV2.ATTRIBUTE_VALUE = 'searchstring2') Which would be a nice way to partition the otherwise large table (but there would be thousands of smaller tables). The app I'm writing has to scale to about 1 billion attributes/value-pairs in total. A normal search query would imply about 5 search terms (but there could be 20). Any suggestions/remarks (I think the EXISTS should be replaced by an IN, something else)? Did anyone implement such a search method (or did they decide to make a different design)? Did it work/scale? Thanks in advance, Mark O.