Hi All,

I am running Spark 3.0.1 on Kubernetes, where Spark fetches data from
Cassandra and stores it in a JavaRDD.

My question is: does the RDDJavaFunctions method *repartitionByCassandraReplica*
work in a Kubernetes environment? I get the expected result when I use it with
Spark Standalone in a virtualized environment, but when I use the same
API (*repartitionByCassandraReplica*) on Kubernetes, the returned RDD is
empty.

*API:*
CassandraJavaUtil.javaFunctions(theJavaRDD).repartitionByCassandraReplica(keyspaceName,
    tableName, partitionsPerHost, partitionkeyMapper, rowWriterFactory)

Please advise whether Spark data locality awareness can be achieved on
Kubernetes as well, since the availability of this feature directly
impacts performance.

Regards,
User
