Level N atomic updates for UDT necessary

2019-02-17 Thread DuyHai Doan
Hello devs

After CASSANDRA-7423 (Cassandra 3.6), it is possible to declare a non-frozen
UDT at the 1st level and, more importantly, to update individual fields of
a UDT atomically (without having to rewrite the entire UDT).

This JIRA opens up tremendous new opportunities in terms of data modeling.
It becomes sensible to store hierarchical data (think JSON/document) using
UDTs and collections.

However, updating individual fields at level 2 and deeper is still not
supported (UDTs must be frozen at level 2 and onward).

This limitation makes updating individual fields at deeper levels a
nightmare.

Let's take a contrived example:

CREATE TYPE phone_number (
international_prefix text,
local_prefix text,
suffix text,
type text // HOME, WORK, MOBILE, LANDLINE ...
);

CREATE TYPE contact (
firstname text,
lastname text,
phone frozen<phone_number>, // nested UDTs must currently be frozen
email text,
...
);

CREATE TABLE user_contacts(
   user_id uuid,
   contact_id uuid,
   contact_details contact,
   PRIMARY KEY ((user_id), contact_id)
);

Now, to update a contact's phone number, one has to:

- perform a SELECT contact_details.phone FROM user_contacts WHERE
user_id=.. AND contact_id = ...
- perform an UPDATE user_contacts SET contact_details.phone = ...
WHERE user_id=.. AND contact_id = ...

This update procedure is bad (but mandatory): not only does it imply the
read-before-write anti-pattern, it also exposes the end user to
horrendous concurrency issues.

If there are 2 concurrent updates on the same nested phone_number UDT, one
changing the type field and the other changing the suffix value, we will
face data loss:
  - either the 1st update wins (in terms of LWW, last write wins) and the
type is updated but not the suffix
  - or the 2nd update wins and the suffix is updated but not the type
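
The race described above can be reproduced with a small in-memory model. This is only a sketch, not Cassandra code: it assumes a frozen value behaves as a single cell with one write timestamp and that reconciliation keeps the cell with the highest timestamp. The names `Phone`, `Cell`, `reconcile` and the sample values are made up for illustration.

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class Phone:
    international_prefix: str
    local_prefix: str
    suffix: str
    type: str


@dataclass(frozen=True)
class Cell:
    value: object
    timestamp: int  # write timestamp used for last-write-wins


def reconcile(a: Cell, b: Cell) -> Cell:
    """Keep the cell with the highest timestamp (ties go to `a`, a simplification)."""
    return b if b.timestamp > a.timestamp else a


initial = Phone("+33", "6", "12345678", "HOME")

# Frozen nested UDT: the whole phone is ONE cell.
# Both clients read the value written at t=1, then write back a full copy.
stored = Cell(initial, 1)
write_a = Cell(replace(stored.value, type="MOBILE"), 2)      # client A changes type
write_b = Cell(replace(stored.value, suffix="87654321"), 3)  # client B changes suffix
frozen_result = reconcile(reconcile(stored, write_a), write_b).value

# Hypothetical per-field storage (what a non-frozen nested UDT would allow):
# one cell per field, so each client only writes the field it changes.
cells = {name: Cell(value, 1) for name, value in asdict(initial).items()}
cells["type"] = reconcile(cells["type"], Cell("MOBILE", 2))
cells["suffix"] = reconcile(cells["suffix"], Cell("87654321", 3))
per_field_result = Phone(**{name: cell.value for name, cell in cells.items()})

print(frozen_result)     # client A's change to `type` is lost
print(per_field_result)  # both changes survive
```

In the frozen case the later write carries a stale copy of `type`, so the earlier change is overwritten. In the per-field case the two writes never touch the same cell.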

Ideally we would like both fields, type and suffix, to be updated. The only
fix currently available is to rely on LWT, using a version column for
optimistic concurrency control:

UPDATE user_contacts SET contact_details.phone = ..., version=2
WHERE user_id=.. AND contact_id = ... IF version=1

This guarantees that only 1 of the concurrent updates succeeds at a time and
forces the failing updates to re-fetch the fresh data and retry.
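
The retry loop implied by this scheme can be sketched as follows. It is a minimal in-memory illustration under stated assumptions, not driver code: `FakeTable` is a hypothetical single-row stand-in for user_contacts, `conditional_update` mimics the `IF version = ...` condition, and the `before_write` hook exists only to inject a competing writer for the demonstration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    phone: dict
    version: int


class FakeTable:
    """Hypothetical single-row table; stands in for user_contacts."""

    def __init__(self, row: Row):
        self._row = row

    def read(self) -> Row:
        return self._row

    def conditional_update(self, new_phone: dict, expected_version: int) -> bool:
        # Mimics: UPDATE ... SET phone = ?, version = ? ... IF version = ?
        if self._row.version != expected_version:
            return False  # condition not met, nothing is written
        self._row = Row(dict(new_phone), expected_version + 1)
        return True


def update_field(table, field, value, max_attempts=5, before_write=None):
    """Read, modify one field, write conditionally; retry on version conflict."""
    for attempt in range(1, max_attempts + 1):
        row = table.read()
        new_phone = {**row.phone, field: value}
        if before_write is not None:
            before_write(attempt)  # test hook: lets another writer sneak in
        if table.conditional_update(new_phone, row.version):
            return attempt
    raise RuntimeError("too many concurrent updates, giving up")


table = FakeTable(Row({"suffix": "12345678", "type": "HOME"}, 1))


def competing_writer(attempt):
    # On the first attempt only, another client changes `type`
    # between our read and our conditional write.
    if attempt == 1:
        update_field(table, "type", "MOBILE")


attempts = update_field(table, "suffix", "87654321", before_write=competing_writer)
final = table.read()
print(attempts, final)
```

The first attempt fails because the version moved from 1 to 2 underneath it. The second attempt re-reads the row, so the competing change to `type` is preserved alongside the new `suffix`.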

Of course, LWT has a huge cost and is overkill to solve such a problem.

Thus my questions are:

- is there any plan to extend UDT field updates to deeper levels?
- is it complicated to do so? I'm tempted to take a look and attempt a
patch, but I would like to know whether it is going to be a gigantic task
or not

Thanks

Regards

Duy Hai DOAN


Re: CASSANDRA-14482

2019-02-17 Thread dinesh.jo...@yahoo.com.INVALID
Thanks all for your input. The consensus is to go forward with this ticket.
Dinesh 

On Friday, February 15, 2019, 12:54:20 PM PST, Sumanth Pasupuleti wrote:
 
 +1

On Fri, Feb 15, 2019 at 12:14 PM Dikang Gu  wrote:

> +1
>
> On Fri, Feb 15, 2019 at 10:27 AM Vinay Chella wrote:
>
> > We have been using Zstd compressor across different products/services here
> > and have seen significant improvements, getting this in 4.0 would be a big
> > win.
> >
> > +1
> >
> > Thanks,
> > Vinay Chella
> >
> >
> > On Fri, Feb 15, 2019 at 10:19 AM Jeff Jirsa  wrote:
> >
> > > +1
> > >
> > > --
> > > Jeff Jirsa
> > >
> > >
> > > > On Feb 15, 2019, at 9:35 AM, Jonathan Ellis wrote:
> > > >
> > > > IMO "add a new compression class that has demonstrable benefits to Sushma
> > > > and Joseph" is sufficiently noninvasive that we should allow it into 4.0.
> > > >
> > > > On Fri, Feb 15, 2019 at 10:48 AM Dinesh Joshi wrote:
> > > >
> > > >> Hey folks,
> > > >>
> > > >> Just wanted to get a pulse on whether we can proceed with ZStd support.
> > > >> The consensus on the ticket was that it’s a very valuable addition without
> > > >> any risk of destabilizing 4.0. It’s ready to go if there aren’t any
> > > >> objections.
> > > >>
> > > >> Dinesh
> > > >>
> > > >>
> > > >> ---------------------------------------------------------------------
> > > >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > > >> For additional commands, e-mail: dev-h...@cassandra.apache.org
> > > >>
> > > >>
> > > >
> > > > --
> > > > Jonathan Ellis
> > > > co-founder, http://www.datastax.com
> > > > @spyced
> > >
> > >
> > >
> >
>
>
> --
> Dikang
>