Astyanax is performing the increment using counter columns.

In storm-cassandra, the code for incrementing the column value is here:

AstyanaxClient.java:422
mutation.withRow(columnFamily, rowKey)
.incrementCounterColumn(columnName, incrementAmount);

This uses the counter column mechanisms exposed by Astyanax.  For more
information, go here:
https://github.com/Netflix/astyanax/wiki/Working-with-counter-columns

This should work, except for the caveats mentioned already.  Cassandra is
addressing those under: https://issues.apache.org/jira/browse/CASSANDRA-4775

-brian

---
Brian O'Neill
Chief Architect
Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 <http://www.twitter.com/boneill42> •
healthmarketscience.com


 


From:  Adrian Mocanu <amoc...@verticalscope.com>
Reply-To:  <user@storm.incubator.apache.org>
Date:  Monday, January 6, 2014 at 10:21 AM
To:  "user@storm.incubator.apache.org" <user@storm.incubator.apache.org>
Subject:  RE: Cassandra bolt

Hi
I am actually looking into using CassandraCounterBatchingBolt but atm I'm
not sure how Cassandra handles these eventual consistency issues so I need
to research that. The reason I mention this issue is that I cannot find
anywhere in the code where there is a read before a write .. which bothers
me .. maybe Cassandra does it with counter columns? IDK.
 
The issue I'm talking about is updating the same counter consecutively, but
faster than the updates propagate to other Cassandra nodes.
 
Example:
Say I have 3 cassandra nodes. The counters on each of these nodes are 0.
Node1:0, node2:0, node3:0
 
An increment comes: 5
5 -> Node1:0, node2:0, node3:0
 
Increment 5 starts at node2 - still needs to propagate to node1 and node3
Node1:0, node2:5, node3:0
 
In the meantime, another increment arrives before previous increment is
propagated:
3 -> Node1:0, node2:5, node3:0
 
Assuming 3 starts at a different node than where 5 started we have:
Node1:3, node2:5, node3:0
 
Now if 3 gets propagated to the other nodes AS AN INCREMENT and not as a new
value (and the same for 5) then eventually they would all equal 8 and this
is what I want.
 
If 3 overwrites 5 (because it has a later timestamp) this is problematic -
not what I want.
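
To make the two outcomes above concrete, here is a minimal sketch in plain
Java. It is only a toy in-memory model of the 3-node example: the class and
method names (CounterMergeSketch, replicateAsIncrements,
replicateAsOverwrites) are hypothetical, the timestamps 1 and 2 are assumed,
and it does not reflect how Cassandra actually replicates counters internally.

```java
import java.util.Arrays;
import java.util.List;

// Toy in-memory model of the 3-node example above. All names here are
// hypothetical; this is not Cassandra, Astyanax, or storm-cassandra code.
final class CounterMergeSketch {

    // A single client write: the amount and the (assumed) time it was issued.
    static final class Write {
        final long amount;
        final long timestamp;

        Write(long amount, long timestamp) {
            this.amount = amount;
            this.timestamp = timestamp;
        }
    }

    // Increment semantics: every node eventually adds every delta.
    // Addition is commutative, so arrival order does not matter.
    static long[] replicateAsIncrements(int nodeCount, List<Write> writes) {
        long[] nodes = new long[nodeCount];
        for (Write w : writes) {
            for (int n = 0; n < nodeCount; n++) {
                nodes[n] += w.amount;
            }
        }
        return nodes;
    }

    // Overwrite semantics: every node keeps only the latest-timestamped value.
    static long[] replicateAsOverwrites(int nodeCount, List<Write> writes) {
        long[] nodes = new long[nodeCount];
        long[] seen = new long[nodeCount];
        Arrays.fill(seen, Long.MIN_VALUE);
        for (Write w : writes) {
            for (int n = 0; n < nodeCount; n++) {
                if (w.timestamp > seen[n]) {
                    seen[n] = w.timestamp;
                    nodes[n] = w.amount;
                }
            }
        }
        return nodes;
    }

    public static void main(String[] args) {
        // The increment of 5 is issued first, the increment of 3 second.
        List<Write> writes = Arrays.asList(new Write(5, 1L), new Write(3, 2L));
        System.out.println("increments: " + Arrays.toString(replicateAsIncrements(3, writes)));
        System.out.println("overwrites: " + Arrays.toString(replicateAsOverwrites(3, writes)));
    }
}
```

In this model the increment version ends with 8 on every node no matter which
delta arrives first, while the overwrite version ends with 3 everywhere and
the 5 is lost.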
 
Will see what the Cassandra group says... or if the creators of
CassandraCounterBatchingBolt are on this group please let me know :)
 
Thanks
Adrian 
 
 
From: Vladi Feigin [mailto:vladi...@gmail.com]
Sent: January-04-14 2:00 AM
To: user@storm.incubator.apache.org
Subject: Re: Cassandra bolt
 

Hi Adrian,

 

Why don't you use C* counters? Looks like your scenario fits them. I
think CassandraCounterBatchingBolt provides what you need.

Vladi

 

On Fri, Jan 3, 2014 at 11:00 PM, Adrian Mocanu <amoc...@verticalscope.com>
wrote:
> 
> Happy New Year all!
>  
> I'm working on a solution for the following scenario: I have tuples coming to
> a cassandra bolt. The tuples are of this form: TupleData(String name, Int
> count, Long time) Time field is unique per batch only but not overall because
> some tuples may come in late but have the same name and time but different
> count. 
>  
> For example:
> I can receive these tuples for the same time: (x1,3,1111), (x2,4,1111)
> Then the bolt may receive (x1,5,1111)
> After these are put in cassandra, column family x1 should have value 8 for
> time 1111 and column family x2 should have value 4 for time 1111
>  
> Caching aside, cassandra bolt needs to check if there is a count already in
> the db for the tuple with given name and time. If it does exist then retrieve,
> increment it with newly received value, and update db entry with the new value.
> (At this point I'm not sure if update or delete+reinsert is speedier)
> If no db entry exists, then add the new tuple.
>  
> I've looked at cassandra bolts code from
> https://github.com/hmsonline/storm-cassandra/tree/master/src/main/java/com/hms
> online/storm/cassandra/bolt
> which is the same as cassandra bolt from storm-contrib.
>  
> There is a class CassandraCounterBatchingBolt, but after looking at it I don't
> believe it does the look up in db first before saving the value to db, which
> leads me to believe that this will not work.
>  
> What I'm looking for seems pretty basic and I wonder if there is a cassandra
> bolt to do db lookup before updating db. Does such a bolt exist open-sourced?
> Otherwise I'm thinking of building mine on top of CassandraBatchingBolt.
>  
> -Adrian
>  
 

