Hi Padma,
Have you considered reducing the dataset before writing it to Cassandra? Looks
like this consistency problem could be avoided by cleaning the dataset of
unnecessary records before persisting it:
val onlyMax = rddByPrimaryKey.reduceByKey { case (x, y) => math.max(x, y) } // or your own max function
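To make the idea concrete, here is a minimal self-contained sketch of the same per-key reduction on a local collection (the ticket numbers and envelope values are made up for illustration; in the actual job the pairs would come from the RDD keyed by primary key, and `reduceByKey` would apply the same function per key):

```scala
object KeepMaxEnvelope {
  def main(args: Array[String]): Unit = {
    // Hypothetical 1-sec batch: three envelopes for ticket T1, one for T2
    val tickets = Seq(("T1", 1), ("T1", 2), ("T1", 3), ("T2", 1))

    // The same reduction reduceByKey performs, expressed on a plain Seq:
    // group by the primary key, then keep only the max envelope per key
    val onlyMax = tickets
      .groupBy { case (ticket, _) => ticket }
      .map { case (ticket, rows) => (ticket, rows.map(_._2).max) }

    println(onlyMax) // Map(T1 -> 3, T2 -> 1)
  }
}
```

Only the reduced pairs would then be written to Cassandra, so later envelopes can never be overwritten by earlier ones arriving out of order.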
Hi All,
I have the following scenario in writing rows to Cassandra from Spark
Streaming -
in a 1 sec batch, I have 3 tickets with the same ticket number (primary key)
but with different envelope numbers (i.e., envelope 1, envelope 2, envelope
3). I am writing these messages to Cassandra using