I was wondering if I could get a bit more insight as to why we are seeing
different insertion times between regular column families and super columns.

We have a group object (with its name) that may have a series of attributes
(name/value). There can be up to a million group objects, and different groups
can share several attributes. In our first design we used a super column, with
the column path

        ColumnPath ("Index", [attribute value], [group name]), where the row
        key is the attribute name. The value we insert is an empty byte array.

In the second design we simplified our model to

        ColumnPath ("Index", null, [group name]), where the row key is simply
        the attribute name concatenated with the attribute value. The value
        inserted is again an empty byte array.
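
For concreteness, here is a rough Python sketch of how the two layouts above
address the same (group, attribute) pair. It does not talk to Cassandra; the
names Insert, design_one and design_two are hypothetical and exist only for
illustration, and the optional separator is an assumption (the description
above uses plain concatenation).

```python
from typing import NamedTuple, Optional


class Insert(NamedTuple):
    """Addressing of one insert; a stand-in for row key + ColumnPath + value."""
    row_key: str
    column_family: str
    super_column: Optional[str]  # None means a regular (non-super) column family
    column: str
    value: bytes


def design_one(group: str, attr_name: str, attr_value: str) -> Insert:
    # First design: row key = attribute name,
    # super column = attribute value, column = group name.
    return Insert(attr_name, "Index", attr_value, group, b"")


def design_two(group: str, attr_name: str, attr_value: str, sep: str = "") -> Insert:
    # Second design: row key = attribute name + attribute value,
    # no super column, column = group name.
    # sep is a hypothetical knob; the default "" is plain concatenation.
    return Insert(attr_name + sep + attr_value, "Index", None, group, b"")
```

In the first layout every group sharing an attribute name lands in the same
row, spread over one super column per attribute value; in the second, each
(name, value) pair gets its own row.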

In the first case, inserting 250K groups took about 1.5 hours; in the second
case it took about 45 minutes. In both tests we started Cassandra with no
data, using OPP on two nodes (each with 16 cores and 64 GB).
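
As a quick back-of-the-envelope check on those figures (both timings are
approximate, and this counts groups rather than individual column inserts):

```python
def groups_per_second(groups: int, seconds: float) -> float:
    """Average group insertion rate over a whole run."""
    return groups / seconds


# Timings taken from the two test runs described above (both approximate).
super_rate = groups_per_second(250_000, 1.5 * 3600)   # super column design
standard_rate = groups_per_second(250_000, 45 * 60)   # regular column family design

print(f"super columns:  {super_rate:.1f} groups/s")
print(f"regular family: {standard_rate:.1f} groups/s")
print(f"ratio:          {standard_rate / super_rate:.1f}x")
```

So the regular column family run was roughly twice as fast.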

We are wondering why inserts are slower when we use super columns.

Thanks,

Carlos



