The GitHub repo is https://github.com/datastax/spark-cassandra-connector.
The talk video and slides should be uploaded soon on the Spark Summit website.

On Wednesday, June 8, 2016, Chanh Le <giaosu...@gmail.com> wrote:
> Thanks, I'll look into it. Any luck getting a link to it?
>
> On Thu, Jun 9, 2016, 12:43 PM Jasleen Kaur <jasleenkaur1...@gmail.com> wrote:
>> Try using the DataStax package. There was a great talk about it at Spark Summit.
>> It will take care of the boilerplate code so you can focus on real business value.
>>
>> On Wednesday, June 8, 2016, Chanh Le <giaosu...@gmail.com> wrote:
>>> Hi everyone,
>>> I tested repartitioning a DataFrame by columns, but the result looks wrong.
>>> I am using Spark 1.6.1 and loading the data from Cassandra.
>>> If I repartition by two fields (date, network_id) I get 200 partitions,
>>> and if I repartition by one field (date) I also get 200 partitions.
>>> But my data covers 90 days, so repartitioning by date should give 90 partitions.
>>>
>>> val daily = sql
>>>   .read
>>>   .format("org.apache.spark.sql.cassandra")
>>>   .options(Map("table" -> dailyDetailTableName, "keyspace" -> reportSpace))
>>>   .load()
>>>   .repartition(col("date"))
>>>
>>> In other words, it doesn't matter which columns I pass to repartition.
>>>
>>> Does anyone have the same problem?
>>>
>>> Thanks in advance.
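
[Editor's note] The behaviour described matches Spark's documented semantics: repartition(partitionExprs: Column*) hash-partitions the data into spark.sql.shuffle.partitions partitions (200 by default); the columns only decide which rows are grouped together, not how many partitions come out. A minimal sketch of the fix, reusing the poster's (unshown) names dailyDetailTableName and reportSpace and passing the partition count explicitly:

    import org.apache.spark.sql.functions.col

    // "sql" is the SQLContext, as in the original post.
    val daily = sql.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("table" -> dailyDetailTableName, "keyspace" -> reportSpace))
      .load()

    // repartition(col("date")) alone always produces spark.sql.shuffle.partitions
    // partitions (200 by default); the column only controls the hash key.
    // Passing the count explicitly gives roughly one partition per day:
    val byDate = daily.repartition(90, col("date"))

Setting spark.sql.shuffle.partitions to 90 before the shuffle would have the same effect, but the explicit count keeps it local to this one DataFrame.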
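[Editor's note] On the "use the DataStax package" suggestion: the connector linked above can be added at submit time and provides a small helper for Cassandra reads. A sketch, assuming a connector build that matches your Spark and Scala versions (the coordinates below are illustrative; check the repo's compatibility table) and that provides the cassandraFormat helper; otherwise the plain format(...) call in the original post is equivalent:

    // Launch with the connector on the classpath, e.g. (illustrative coordinates):
    //   spark-shell --packages com.datastax.spark:spark-cassandra-connector_2.10:1.6.0
    import org.apache.spark.sql.cassandra._  // adds cassandraFormat to DataFrameReader

    val daily = sql.read
      .cassandraFormat(dailyDetailTableName, reportSpace)  // table, keyspace
      .load()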