Re: Versioning in cassandra

Vivek Mishra Tue, 03 Sep 2013 11:07:22 -0700

My bad. I did miss out to read "latest version" part.

-Vivek



On Tue, Sep 3, 2013 at 11:20 PM, dawood abdullah
<muhammed.daw...@gmail.com>wrote:

> I have tried with both the options creating secondary index and also tried
> adding parentid to primary key, but I am getting all the files with
> parentid 'yyy', what I want is the latest version of file with the
> combination of parentid, fileid. Say below are the records inserted in the
> file table:
>
> insert into file (id, parentid, version, contenttype, description, name)
> values ('f1', 'd1', '2011-03-04', 'pdf', 'f1 file', 'file1');
> insert into file (id, parentid, version, contenttype, description, name)
> values ('f1', 'd1', '2011-03-05', 'pdf', 'f1 file', 'file1');
> insert into file (id, parentid, version, contenttype, description, name)
> values ('f2', 'd1', '2011-03-05', 'pdf', 'f1 file', 'file1');
> insert into file (id, parentid, version, contenttype, description, name)
> values ('f2', 'd1', '2011-03-06', 'pdf', 'f1 file', 'file1');
>
> I want to write a query which returns me second and last record and not
> the first and third record, because for the first and third record there
> exists a latest version, for the combination of id and parentid.
>
> I am confused If at all this is achievable, please suggest.
>
> Dawood
>
>
>
> On Tue, Sep 3, 2013 at 10:58 PM, Vivek Mishra <mishra.v...@gmail.com>wrote:
>
>> create secondary index over parentid.
>> OR
>> make it part of clustering key
>>
>> -Vivek
>>
>>
>> On Tue, Sep 3, 2013 at 10:42 PM, dawood abdullah <
>> muhammed.daw...@gmail.com> wrote:
>>
>>> Jan,
>>>
>>> The solution you gave works spot on, but there is one more requirement I
>>> forgot to mention. Following is my table structure
>>>
>>> CREATE TABLE file (
>>>   id text,
>>>   contenttype text,
>>>   createdby text,
>>>   createdtime timestamp,
>>>   description text,
>>>   name text,
>>>   parentid text,
>>>   version timestamp,
>>>   PRIMARY KEY (id, version)
>>>
>>> ) WITH CLUSTERING ORDER BY (version DESC);
>>>
>>>
>>> The query (select * from file where id = 'xxx' limit 1;) provided solves
>>> the problem of finding the latest version file. But I have one more
>>> requirement of finding all the latest version files having parentid say
>>> 'yyy'.
>>>
>>> Please suggest how can this query be achieved.
>>>
>>> Dawood
>>>
>>>
>>>
>>> On Tue, Sep 3, 2013 at 12:43 AM, dawood abdullah <
>>> muhammed.daw...@gmail.com> wrote:
>>>
>>>> In my case version can be timestamp as well. What do you suggest
>>>> version number to be, do you see any problems if I keep version as counter
>>>> / timestamp ?
>>>>
>>>>
>>>> On Tue, Sep 3, 2013 at 12:22 AM, Jan Algermissen <
>>>> jan.algermis...@nordsc.com> wrote:
>>>>
>>>>>
>>>>> On 02.09.2013, at 20:44, dawood abdullah <muhammed.daw...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> > Requirement is like I have a column family say File
>>>>> >
>>>>> > create table file(id text primary key, fname text, version int,
>>>>> mimetype text, content text);
>>>>> >
>>>>> > Say, I have few records inserted, when I modify an existing record
>>>>> (content is modified) a new version needs to be created. As I need to have
>>>>> provision to revert to back any old version whenever required.
>>>>> >
>>>>>
>>>>> So, can version be a timestamp? Or does it need to be an integer?
>>>>>
>>>>> In the former case, make use of C*'s ordering like so:
>>>>>
>>>>> CREATE TABLE file (
>>>>>    file_id text,
>>>>>    version timestamp,
>>>>>    fname text,
>>>>>    ....
>>>>>    PRIMARY KEY (file_id,version)
>>>>> ) WITH CLUSTERING ORDER BY (version DESC);
>>>>>
>>>>> Get the latest file version with
>>>>>
>>>>> select * from file where file_id = 'xxx' limit 1;
>>>>>
>>>>> If it has to be an integer, use counter columns.
>>>>>
>>>>> Jan
>>>>>
>>>>>
>>>>> > Regards,
>>>>> > Dawood
>>>>> >
>>>>> >
>>>>> > On Mon, Sep 2, 2013 at 10:47 PM, Jan Algermissen <
>>>>> jan.algermis...@nordsc.com> wrote:
>>>>> > Hi Dawood,
>>>>> >
>>>>> > On 02.09.2013, at 16:36, dawood abdullah <muhammed.daw...@gmail.com>
>>>>> wrote:
>>>>> >
>>>>> > > Hi
>>>>> > > I have a requirement of versioning to be done in Cassandra.
>>>>> > >
>>>>> > > Following is my column family definition
>>>>> > >
>>>>> > > create table file_details(id text primary key, fname text, version
>>>>> int, mimetype text);
>>>>> > >
>>>>> > > I have a secondary index created on fname column.
>>>>> > >
>>>>> > > Whenever I do an insert for the same 'fname', the version should
>>>>> be incremented. And when I retrieve a row with fname it should return me
>>>>> the latest version row.
>>>>> > >
>>>>> > > Is there a better way to do in Cassandra? Please suggest what
>>>>> approach needs to be taken.
>>>>> >
>>>>> > Can you explain more about your use case?
>>>>> >
>>>>> > If the version need not be a small number, but could be a timestamp,
>>>>> you could make use of C*'s ordering feature , have the database set the 
>>>>> new
>>>>> version as a timestamp and retrieve the latest one with a simple LIMIT 1
>>>>> query. (I'll explain more when this is an option for you).
>>>>> >
>>>>> > Jan
>>>>> >
>>>>> > P.S. Me being a REST/HTTP head, an alarm rings when I see 'version'
>>>>> next to 'mimetype' :-) What exactly are you versioning here? Maybe we can
>>>>> even change the situation from a functional POV?
>>>>> >
>>>>> >
>>>>> > >
>>>>> > > Regards,
>>>>> > >
>>>>> > > Dawood
>>>>> > >
>>>>> > >
>>>>> > >
>>>>> > >
>>>>> >
>>>>> >
>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Versioning in cassandra

Reply via email to