Hi,

While working with Spark running on top of Cassandra, I wanted to filter some data. This can be done either on the server side (a where clause on the cassandraTable query) or on the client side (a filter transformation on the RDD). Which of the two is preferred, with performance and run time in mind?
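To make the two options concrete, here is a minimal sketch in plain Java. It is a toy model and does not use Spark or the connector: the class and method names (`FilterPlacementSketch`, `filterOnServer`, `filterOnClient`, `sampleTable`) are hypothetical, and the "table" is just an in-memory list. It only counts how many rows would have to leave the server in each case, assuming the condition is one that the where method can accept at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy model only: no Spark or Cassandra involved. All names are hypothetical.
class FilterPlacementSketch {

    /** Rows that reached the client, plus how many rows had to leave the "server". */
    static final class Result {
        final List<Integer> rows;
        final int rowsTransferred;

        Result(List<Integer> rows, int rowsTransferred) {
            this.rows = rows;
            this.rowsTransferred = rowsTransferred;
        }
    }

    /** Models a where clause: the condition is checked before a row is sent. */
    static Result filterOnServer(List<Integer> table, Predicate<Integer> condition) {
        List<Integer> sent = new ArrayList<>();
        for (Integer row : table) {
            if (condition.test(row)) {
                sent.add(row);
            }
        }
        return new Result(sent, sent.size());
    }

    /** Models an RDD filter: every row is sent, then the client drops non-matches. */
    static Result filterOnClient(List<Integer> table, Predicate<Integer> condition) {
        List<Integer> received = new ArrayList<>(table); // the whole table is transferred
        List<Integer> kept = new ArrayList<>();
        for (Integer row : received) {
            if (condition.test(row)) {
                kept.add(row);
            }
        }
        return new Result(kept, received.size());
    }

    /** Builds a stand-in table holding the ids 0 .. size-1. */
    static List<Integer> sampleTable(int size) {
        List<Integer> table = new ArrayList<>();
        for (int id = 0; id < size; id++) {
            table.add(id);
        }
        return table;
    }

    public static void main(String[] args) {
        List<Integer> table = sampleTable(1000);
        Predicate<Integer> condition = id -> id < 10;

        Result server = filterOnServer(table, condition);
        Result client = filterOnClient(table, condition);

        // Both approaches keep the same rows; they differ in how much is transferred.
        System.out.println("server-side: transferred " + server.rowsTransferred
                + ", kept " + server.rows.size());
        System.out.println("client-side: transferred " + client.rowsTransferred
                + ", kept " + client.rows.size());
    }
}
```

In this toy run both variants keep the same 10 rows, but the client-side variant moves all 1000 rows first. How large the difference is on a real cluster depends on the table, the condition, and the data model, so this is an illustration of the trade-off rather than a measurement.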
I am using the Spark Java connector.

*References:*

*1.* https://github.com/datastax/spark-cassandra-connector/blob/master/doc/7_java_api.md
Note: See the description of filtering <https://github.com/datastax/spark-cassandra-connector/blob/master/doc/3_selection.md> to understand the limitations of the where method.

*2.* https://github.com/datastax/spark-cassandra-connector/blob/master/doc/3_selection.md
To filter rows, you can use the filter transformation provided by Spark .... To avoid this overhead, CassandraRDD offers the where method, which lets you pass arbitrary CQL condition(s) to filter the row set on the server.

Thanks and Regards
Siddharth Verma
*Software Engineer*