Dear Wiki user, You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.
The "ThomasBoose/EERD model components to Cassandra Column family's" page has been changed by ThomasBoose.
http://wiki.apache.org/cassandra/ThomasBoose/EERD%20model%20components%20to%20Cassandra%20Column%20family%27s?action=diff&rev1=4&rev2=5

--------------------------------------------------

If this is not yet making sense, read on.

== Indexing ==
- In order to add an index to a column other than the ColumnFamily key, we should create a second ColumnFamily. On every insert on the original columnfamily (which in Cassandra can be either an insert or an update) we will update the corresponding index.
+ In order to add an index to a column other than the ColumnFamily key, we should create a second ColumnFamily. On every insert on the original ColumnFamily (which in Cassandra can be either an insert or an update) we will update the corresponding index.

Think of a ColumnFamily cf_Person (examples in Python using pycassa)

@@ -38, +38 @@

Sometimes both collections have to use different keys (it is not always up to one designer to select both), even though the number of elements is equal and they are related one to one. In a relational model the designer gets to select either key and insert it into the other collection with a unique and foreign key constraint. In Cassandra modeling you are forced to either cross-link both keys, so that each key is stored as a foreign key in the other ColumnFamily, or create a third ColumnFamily in which you store both keys, each preceded by a token indicating which ColumnFamily it refers to.
Let's focus on the first option. Say we hand out phones to our employees and we agree that every employee will always have one phone, and that phones that are not in use are not stored in our database. The phone has a phone number as key, whereas the employee has a social security number. In order to know which number to dial when looking for employee X, and who is calling given a specific phone number, we need to store each key as a foreign key in the other ColumnFamily.
- ||||||||<tablewidth="400px">'''CF_Employee''' ||
+ ||||||||<tablewidth="400px"style="text-align: center;">'''CF_Employee''' ||
||<style="text-align: center;" |2>123-12-1234 ||name ||phone ||salary ||
||John ||0555-123456 ||10.000 ||
||<style="text-align: center;" |2>321-21-4321 ||name ||phone ||salary ||
||Jane ||0555-654321 ||12.000 ||
- ||||||<tablewidth="400px" tablestyle="text-align: left;"style="text-align: center;">'''CF_Phone''' ||

@@ -55, +54 @@

- Using a static column name, requiring input in the foreign key fields, checking the existence of the key in the other columnfamily, and processing updates and deletes all have to be programmed in the DBMS layer. Cassandra itself does not, and probably will not, provide foreign key logic. One could imagine a process that makes sure the cross references stay consistent:
+ Using a static column name, requiring input in the foreign key fields, checking the existence of the key in the other ColumnFamily, and processing updates and deletes all have to be programmed in the DBMS layer. Cassandra itself does not, and probably will not, provide foreign key logic. One could imagine a process that makes sure the cross references stay consistent:

{{{
cf_Employee.insert('321-21-4321', {'name':'Jane', 'phone':'0555-654321'})

@@ -76, +75 @@

=== 1 to Many ===
In one-to-many relationships we add the key from the "one" side as a foreign key to the "many" side. So if we're modeling students studying at only one school-unit at a time, we would add the unit's key to the student as a foreign key. Since no foreign key logic is provided, you will have to write your own code to check that the unit exists when the unit attribute of a student is set, and to define the behaviour when a unit is deleted. Considering that this kind of relation is very common, this logic is best created in a separate DBMS tier.
- Every student has only one school-unit, so we enforce one static column name that will reference this unit. For instance, this column in the cf_Student columnfamily is called "school-unit". In a Cassandra database this is not sufficient to retrieve all students within this unit. One could find answers to questions like these, but it would require quite a lot of processing power. If a ColumnFamily, the cf_School_unit family in this case, has only one of these relations, then one could choose to add all student keys to that ColumnFamily itself. I would not count on this situation persisting in future releases of your system and therefore suggest that you provide a separate ColumnFamily for each one-to-many relationship that you model.
+ Every student has only one school-unit, so we enforce one static column name that will reference this unit. For instance, this column in the cf_Student ColumnFamily is called "school-unit". In a Cassandra database this is not sufficient to retrieve all students within this unit. One could find answers to questions like these, but it would require quite a lot of processing power. If a ColumnFamily, the cf_School_unit family in this case, has only one of these relations, then one could choose to add all student keys to that ColumnFamily itself. I would not count on this situation persisting in future releases of your system and therefore suggest that you provide a separate ColumnFamily for each one-to-many relationship that you model.

- This would lead to three columnFamilies
+ This would lead to three ColumnFamilies

-
- ||||||||<tablewidth="400px">'''CF_Student''' ||
+ ||||||||<tablewidth="400px"style="text-align: center;">'''CF_Student''' ||
||<style="text-align: center;" |2>123-12-1234 ||name ||unit ||city ||
||John ||SE ||the hague ||
||<style="text-align: center;" |2>321-21-4321 ||name ||unit ||city ||

@@ -97, +95 @@

|| || ||
+
+ No values are actually stored in the columns indicating the student numbers.
These columns only exist to indicate which students are present in this unit.
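
The marker columns without values could be sketched in Python as follows. This is only a sketch under assumptions: plain dicts stand in for the three ColumnFamilies so that it runs without a Cassandra cluster (with pycassa one would use ColumnFamily objects instead), and the names cf_unit_students, add_unit, set_student and students_in, as well as the second unit "BI", are made up for illustration and do not appear on the original page.

```python
# In-memory stand-ins for the three ColumnFamilies (assumption: a real
# setup would use pycassa ColumnFamily objects instead of dicts).
cf_student = {}        # student key -> columns, including the "unit" column
cf_school_unit = {}    # unit key -> columns
cf_unit_students = {}  # hypothetical index family: unit key -> {student key: ""}


def add_unit(unit_key, columns=None):
    """Create a unit row together with its (initially empty) index row."""
    cf_school_unit[unit_key] = dict(columns or {})
    cf_unit_students.setdefault(unit_key, {})


def set_student(student_key, columns):
    """Insert or update a student and keep the unit index consistent."""
    unit = columns["unit"]
    if unit not in cf_school_unit:
        # Hand-written foreign key check: the unit has to exist.
        raise KeyError("unknown school unit: %s" % unit)
    old = cf_student.get(student_key)
    if old is not None and old["unit"] != unit:
        # The student moved: remove the marker column from the old unit row.
        cf_unit_students[old["unit"]].pop(student_key, None)
    cf_student[student_key] = dict(columns)
    # The column name is the student key; the value stays empty.
    cf_unit_students[unit][student_key] = ""


def students_in(unit_key):
    """Return the student keys of one unit without scanning cf_student."""
    return sorted(cf_unit_students.get(unit_key, {}))


add_unit("SE")
add_unit("BI")  # illustrative second unit
set_student("123-12-1234", {"name": "John", "unit": "SE", "city": "the hague"})
set_student("321-21-4321", {"name": "Jane", "unit": "SE"})
set_student("321-21-4321", {"name": "Jane", "unit": "BI"})  # Jane changes unit
```

After these calls, students_in("SE") only contains John's key, because the update of Jane removed her marker column from the old unit row before adding it to the new one.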