Hi, I am using Spark and the Cassandra connector to save customer events for later batch analysis.
The primary access pattern later on will be by time slice. One way to save the events would be to create one C* row per day, for example, and within that row store the events in decreasing time order. However, that would create a hot spot in the cluster for each day's writes. The other two options I see are sharding (e.g. creating 100 rows per day) or using a new table for every day. I prefer the last option, but I am not sure whether that is a good pattern with the C* connector. Can anyone provide insights to guide that decision? A rough sketch of the sharded variant is in the P.S. below.

Jan
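P.S. For concreteness, here is roughly what I imagine the sharded variant would look like with the connector. This is only a sketch: the keyspace (analytics), table (events_by_day), column names, bucket count, and the way the bucket is derived are all placeholders I made up, not anything I have running.

    import org.apache.spark.{SparkConf, SparkContext}
    import com.datastax.spark.connector._

    // Placeholder event type; the real fields would differ.
    case class Event(day: String, bucket: Int, ts: Long, payload: String)

    object SaveEventsSharded {
      val NumBuckets = 100 // spread each day over 100 partitions

      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("save-events-sharded")
          .set("spark.cassandra.connection.host", "127.0.0.1")
        val sc = new SparkContext(conf)

        // Assumed table, created out of band:
        //   CREATE TABLE analytics.events_by_day (
        //     day text, bucket int, ts timestamp, payload text,
        //     PRIMARY KEY ((day, bucket), ts)
        //   ) WITH CLUSTERING ORDER BY (ts DESC);

        // Dummy input for the sketch; in reality the events come from
        // wherever they are collected.
        val raw = sc.parallelize(Seq(
          ("2014-10-01", 1412121600000L, "login"),
          ("2014-10-01", 1412125200000L, "purchase")
        ))

        // Derive the bucket from the timestamp so writes for one day are
        // spread over NumBuckets partitions; a reader enumerates buckets
        // 0..NumBuckets-1 for the day it wants.
        val events = raw.map { case (day, ts, payload) =>
          Event(day, (ts % NumBuckets).toInt, ts, payload)
        }

        events.saveToCassandra("analytics", "events_by_day",
          SomeColumns("day", "bucket", "ts", "payload"))

        sc.stop()
      }
    }

With that layout, a query for a given time slice fans out over the 100 buckets instead of hammering a single partition, which is why I am weighing it against the table-per-day option.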