Hi,
While working with Spark running on top of Cassandra, I need to filter
some data.
This can be done either on the server side (a where clause on the
cassandraTable query) or on the client side (a filter transformation on
the RDD).
Which of the two is preferred in terms of performance and running time?

I am using the Java API of the Spark Cassandra Connector.


*References:*
*1.*
https://github.com/datastax/spark-cassandra-connector/blob/master/doc/7_java_api.md
Note: See the description of filtering
<https://github.com/datastax/spark-cassandra-connector/blob/master/doc/3_selection.md>
 to understand the limitations of the where method.
*2.*
https://github.com/datastax/spark-cassandra-connector/blob/master/doc/3_selection.md
To filter rows, you can use the filter transformation provided by Spark
.... To avoid this overhead, CassandraRDD offers the where method, which
lets you pass arbitrary CQL condition(s) to filter the row set on the
server.
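
To make the overhead mentioned in that quote concrete, here is a toy model
in plain Java of how I understand the two options. It does not use Spark or
Cassandra at all: ToyTable, scanWhere, scanAll and rowsShipped are
hypothetical names made up for illustration, and the row counts only
describe this model, not a real cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class FilterPlacementSketch {

    /** Hypothetical stand-in for a table; counts rows sent to the client. */
    static class ToyTable {
        private final List<Integer> rows;
        int rowsShipped = 0;

        ToyTable(List<Integer> rows) {
            this.rows = rows;
        }

        /** Models the where-clause path: rows are tested before they are shipped. */
        List<Integer> scanWhere(Predicate<Integer> predicate) {
            List<Integer> out = new ArrayList<>();
            for (Integer row : rows) {
                if (predicate.test(row)) {
                    out.add(row);
                    rowsShipped++;
                }
            }
            return out;
        }

        /** Models a full table scan: every row is shipped to the client. */
        List<Integer> scanAll() {
            rowsShipped += rows.size();
            return new ArrayList<>(rows);
        }
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            data.add(i);
        }
        Predicate<Integer> wanted = v -> v < 10;

        // Option 1: filter on the "server" (the where-clause case).
        ToyTable serverSide = new ToyTable(data);
        List<Integer> fromWhere = serverSide.scanWhere(wanted);

        // Option 2: fetch everything, then filter on the client (the RDD filter case).
        ToyTable clientSide = new ToyTable(data);
        List<Integer> fromFilter = new ArrayList<>();
        for (Integer row : clientSide.scanAll()) {
            if (wanted.test(row)) {
                fromFilter.add(row);
            }
        }

        System.out.println("same result: " + fromWhere.equals(fromFilter));
        System.out.println("rows shipped, server-side filter: " + serverSide.rowsShipped);
        System.out.println("rows shipped, client-side filter: " + clientSide.rowsShipped);
    }
}
```

In this model both options return the same rows, but the client-side one
ships all 1000 rows while the server-side one ships only the 10 that match.
Whether a real where clause behaves this way presumably depends on the
limitations mentioned in reference 1.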

Thanks and Regards

Siddharth Verma

*Software Engineer*

CA2125, 2nd Floor, ASF Centre-A, Jwala Mill Road,
Udyog Vihar Phase - IV, Gurgaon-122016, INDIA
