Hello devs,

Since CASSANDRA-7423 (Cassandra 3.6), it is possible to declare a non-frozen UDT at the first level and, more importantly, to update individual fields of a UDT atomically, without rewriting the whole UDT.
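For readers who have not tried it yet, here is a minimal sketch of a first-level field update. The address type, the users table and the literal values are hypothetical and only serve as illustration:

```cql
// Hypothetical type and table, for illustration only.
CREATE TYPE address (
    street text,
    city text,
    zip text
);

CREATE TABLE users (
    user_id uuid PRIMARY KEY,
    home address    // declared without frozen<>, allowed since 3.6
);

// Update a single field; the other fields of home are left untouched.
UPDATE users SET home.city = 'Paris'
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;
```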
This JIRA opens tremendous new opportunities in terms of data modeling. It becomes sensible to store hierarchical data (think JSON/documents) using UDTs and collections.

However, updating individual fields at level 2 and deeper is still not supported (a UDT must be frozen from level 2 onward). This limitation makes updating individual fields at deeper levels a nightmare.

Let's take a contrived example:

    CREATE TYPE phone_number (
        international_prefix text,
        local_prefix text,
        suffix text,
        type text // HOME, WORK, MOBILE, LANDLINE ...
    );

    CREATE TYPE contact (
        firstname text,
        lastname text,
        phone frozen<phone_number>,
        email text,
        ...
    );

    CREATE TABLE user_contacts(
        user_id uuid,
        contact_id uuid,
        contact_details contact,
        PRIMARY KEY ((user_id), contact_id)
    );

Now, to update a contact's phone number, one has to:

- perform a SELECT contact_details.phone FROM user_contacts WHERE user_id=.. AND contact_id = ...
- perform an UPDATE user_contacts SET contact_details.phone = ... WHERE user_id=.. AND contact_id = ...

Not only is this update procedure bad (but mandatory) because it implies the read-before-write anti-pattern, it also exposes the end user to horrendous concurrency issues.

If there are 2 concurrent updates on the same nested phone_number UDT, one changing the type field and the other changing the suffix value, we will face data loss, because each update rewrites the whole nested value from the copy it read earlier:

- either the 1st update wins (in terms of LWW) and the type is updated but not the suffix
- or the 2nd update wins and the suffix is updated but not the type

Ideally we would like to have both fields, type and suffix, updated.

The only fix currently available is to rely on LWT, using a version column for optimistic concurrency locking:

    UPDATE user_contacts SET contact_details.phone = ..., version=2
    WHERE user_id=.. AND contact_id = ...
    IF version=1

This guarantees that only 1 concurrent update succeeds at a time and forces the failing updates to re-fetch the fresh data and retry.

Of course, LWT has a huge cost and is overkill to solve such a problem.

Thus my questions are:

- are there any plans to extend UDT field updates to deeper levels?
- is it complicated to do so?

I'm tempted to take a look and attempt a patch, but I would like to know whether it is going to be a gigantic task or not.

Thanks

Regards

Duy Hai DOAN