Hi,
I'm working on a system that has to handle time series data. I've been
happy using Cassandra for time series, and Spark looks promising as a
computational platform.
I consider chunking time series in Cassandra necessary, e.g. into 3-week
chunks as KairosDB does. This allows an 8-byte chunk start…
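To make the chunking concrete, here's a minimal sketch of how such a chunk
start could be computed; the 3-week width and millisecond timestamps are
assumptions for illustration, not necessarily KairosDB's exact scheme:

    from datetime import datetime, timezone

    CHUNK_WIDTH_MS = 3 * 7 * 24 * 60 * 60 * 1000  # assumed: 3 weeks in ms

    def chunk_start(ts_ms: int) -> int:
        """Floor a millisecond timestamp to the start of its chunk.

        The result is a 64-bit integer (8 bytes, a Cassandra bigint),
        usable in the partition key so partitions stay bounded in size.
        """
        return ts_ms - (ts_ms % CHUNK_WIDTH_MS)

    now_ms = int(datetime.now(timezone.utc).timestamp() * 1000)
    print(chunk_start(now_ms))  # same value for all timestamps in the chunk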
Hi all,
I didn't find an *issues* button on
https://github.com/datastax/spark-cassandra-connector/, so I'm posting here.
Anyone have an idea why token ranges are grouped into one partition per
executor? I expected at least one per core. Any suggestions on how to work
around this? Doing a repartition…
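For what it's worth, here's a rough sketch of the repartition workaround in
PySpark; the parallelized RDD is a stand-in assumption for a real
Cassandra-backed RDD, and note that repartition costs a full shuffle:

    from pyspark import SparkContext

    sc = SparkContext(appName="repartition-workaround")

    # Stand-in for a Cassandra-backed RDD; in practice it would come from
    # the connector, which here yielded one partition per executor.
    rdd = sc.parallelize(range(1000), numSlices=2)
    print(rdd.getNumPartitions())   # 2: too coarse to keep every core busy

    # Spread the work across all cores, at the cost of shuffling the data.
    rdd = rdd.repartition(sc.defaultParallelism)
    print(rdd.getNumPartitions())   # e.g. the total number of cores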
Hi all,
Wanted to let you know I've forked PySpark Cassandra at
https://github.com/TargetHolding/pyspark-cassandra. Unfortunately, the
original code didn't work for me, and I couldn't figure out how it could.
But it inspired me! So I rewrote the majority of the project.
The rewrite implements full…
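As a taste, here's a hedged usage sketch along the lines of the project's
README; the entry point (CassandraSparkContext, cassandraTable) and the
keyspace/table names are assumptions here, so check the repository for the
actual API:

    from pyspark import SparkConf
    from pyspark_cassandra import CassandraSparkContext  # assumed entry point

    conf = SparkConf() \
        .setAppName("pyspark-cassandra demo") \
        .set("spark.cassandra.connection.host", "localhost")

    sc = CassandraSparkContext(conf=conf)

    # Read a table as an RDD of rows; keyspace/table names are hypothetical.
    rows = sc.cassandraTable("my_keyspace", "my_table")
    print(rows.first())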
Hi,
I'm trying to connect to Cassandra through PySpark using the
spark-cassandra-connector from DataStax, based on the work of Mike
Sukmanowsky.
I can use Spark and Cassandra through the DataStax connector in Scala just
fine. Where things fail in PySpark is that an exception is raised in
org.apache…