Hi All,

I have a problem when writing streaming data to Cassandra. Our existing
product runs on an Oracle DB, where locks are taken while writing data so
that duplicates in the DB are avoided.
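
For context, the write path looks roughly like this (a minimal sketch using
the DataStax spark-cassandra-connector; the socket source, keyspace, table,
and column names are placeholders, not our real schema):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import com.datastax.spark.connector._            // SomeColumns
    import com.datastax.spark.connector.streaming._  // saveToCassandra on DStreams

    object CassandraStreamWriter {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("CassandraStreamWriter")
          .set("spark.cassandra.connection.host", "127.0.0.1")
        val ssc = new StreamingContext(conf, Seconds(10))

        // Placeholder source: comma-separated "id,value" lines.
        val records = ssc.socketTextStream("localhost", 9999)
          .map { line =>
            val Array(id, value) = line.split(",", 2)
            (id, value)
          }

        // Each partition of each batch is written by a separate
        // executor task, so writes happen in parallel.
        records.saveToCassandra("my_keyspace", "my_table",
          SomeColumns("id", "value"))

        ssc.start()
        ssc.awaitTermination()
      }
    }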

But since Spark has a parallel processing architecture, if more than one
task tries to write the same data, i.e. rows with the same primary key, is
there any scope for creating duplicates? If yes, how can this be addressed,
either on the Spark side or on the Cassandra side?
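
One mitigation I am considering on the Spark side (again just a sketch,
continuing from the example above, and I am not sure it is enough) is to
collapse each batch to a single record per key before the write, so that no
two tasks in the same batch race on the same primary key. My understanding
is that Cassandra treats an INSERT with an existing primary key as an
overwrite rather than a duplicate row, but I would like to confirm that.

    // Keep one record per key within each batch; parallel tasks in a
    // batch then never write the same primary key, and across batches
    // this relies on Cassandra overwriting rows that share a primary
    // key rather than duplicating them.
    val deduped = records.reduceByKey((first, _) => first)
    deduped.saveToCassandra("my_keyspace", "my_table",
      SomeColumns("id", "value"))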

Thanks,
Padma Ch
