Re: cassandra database design

2016-09-01 Thread Stone Fang
Thanks, Carlos. The key point is how to balance spreading the data around the cluster against the number of partitions a query has to read. It is hard to determine which is best. Anyway, thanks for your suggestion. It helps me a lot. stone On Thu, Sep 1, 2016 at 4:54 PM, Carlos Alonso wrote: > I guess there's no easy sol

Re: cassandra database design

2016-09-01 Thread Carlos Alonso
I guess there's no easy solution for this. The bucketing technique you were applying, with the extra publish_pre field making a composite partition key, is probably your best bet, but you're right to be concerned that all your workload will hit the same node during an hour. I'd then suggest adding a
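A minimal sketch of the hourly bucketing discussed above, written in Python. It assumes publish_pre holds the publish timestamp truncated to the start of its hour; the thread does not show how publish_pre is actually defined, and the function names hour_bucket and buckets_for_range are hypothetical. It shows which buckets a time-range query would have to read under that assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import List


def hour_bucket(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its hour.

    Assumption: this is what the publish_pre field would store.
    """
    return ts.replace(minute=0, second=0, microsecond=0)


def buckets_for_range(start: datetime, end: datetime) -> List[datetime]:
    """Return every hour bucket that may hold rows with start < publish < end."""
    if end <= start:
        return []
    buckets = []
    current = hour_bucket(start)
    # Walk forward one hour at a time until the bucket starts at or after `end`.
    while current < end:
        buckets.append(current)
        current += timedelta(hours=1)
    return buckets


if __name__ == "__main__":
    start = datetime(2016, 9, 1, 10, 15, tzinfo=timezone.utc)
    end = datetime(2016, 9, 1, 12, 30, tzinfo=timezone.utc)
    for bucket in buckets_for_range(start, end):
        print(bucket.isoformat())
```

Under this assumption each returned bucket corresponds to one (datacentername, publish_pre) partition, so a range query becomes one query per bucket. Smaller buckets spread writes across more partitions but increase the number of partitions a read must visit, which is the trade-off raised in this thread.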

Re: cassandra database design

2016-08-31 Thread Stone Fang
The access pattern is: SELECT * FROM datacenter WHERE datacentername = '' AND publish > $time AND publish < $time On Wed, Aug 31, 2016 at 8:37 PM, Carlos Alonso wrote: > Maybe a good question could be: > > Which is your access pattern to this data? > > Carlos Alonso | Software Engineer | @calonso

Re: cassandra database design

2016-08-31 Thread Carlos Alonso
Maybe a good question could be: What is your access pattern for this data? Carlos Alonso | Software Engineer | @calonso On 31 August 2016 at 11:47, Stone Fang wrote: > Hi all, > have some questions on how to define clustering key. > > have a table like this > > CR

cassandra database design

2016-08-31 Thread Stone Fang
Hi all, I have some questions on how to define a clustering key. I have a table like this: CREATE TABLE datacenter ( datacentername varchar, publish timestamp, value varchar, PRIMARY KEY (datacentername, publish) ); *Issues:* There are only two datacenters, so the data would only have two partitions. an