A large column slice in my case is tens of thousands of columns, each a few KB in size and processed independently of the others. My plan was to read slices of a few hundred to a thousand columns and process them in a pipeline for reduced overall latency. Regardless of my specific case, though, I thought one of the best ways to get good performance scaling in Cassandra was to distribute reads and writes across multiple nodes. Are there situations where that's not a good idea?
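For what it's worth, here is roughly the sequential paging pattern I had in mind, sketched in plain Python. The `get_slice` below is a hypothetical in-memory stand-in for Cassandra's Thrift `get_slice` with a `SliceRange` (inclusive start, bounded count); the names `row`, `page_row`, and the page size are my own, not anything from the API. The key detail is that because the start column is inclusive, each subsequent request asks for one extra column and drops the duplicate:

```python
import bisect

# Hypothetical stand-in for a wide row: sorted column names -> values.
row = {"col%05d" % i: b"x" * 2048 for i in range(10_000)}
names = sorted(row)

def get_slice(start, count):
    """Mimic Thrift get_slice with SliceRange(start=start, finish='', count=count).
    The start column is inclusive, as in Cassandra's API."""
    i = bisect.bisect_left(names, start)
    return names[i:i + count]

def page_row(page_size=1000):
    """Page through the row sequentially: the last column name of one page
    becomes the (inclusive) start of the next, so we request one extra
    column and drop the duplicate."""
    start = ""
    while True:
        fetch = page_size if start == "" else page_size + 1
        page = get_slice(start, fetch)
        if start:
            page = page[1:]  # first item repeats the previous page's last column
        if not page:
            break
        yield page
        if len(page) < page_size:
            break  # short page means we've reached the end of the row
        start = page[-1]

pages = list(page_row())
total = sum(len(p) for p in pages)
```

Each yielded page could then be handed to a worker while the next page is fetched, which is the pipelining I described; what it can't do is start the Nth page's fetch before page N-1 returns, since the start name isn't known until then.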
CK.

On Mon, Feb 1, 2010 at 6:00 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> No. Why do you want to do multiple parallel reads instead of one
> sequential read?
>
> On Mon, Feb 1, 2010 at 4:45 PM, Cagatay Kavukcuoglu
> <caga...@kavukcuoglu.org> wrote:
>> Hi,
>>
>> What's the recommended way to do parallel reads of a large slice of
>> columns when one doesn't know enough about the column names to divide
>> them for parallel reading in a meaningful way? SliceRange allows
>> setting the start and finish column names, but you wouldn't be able to
>> set the start field of the next read until the previous read
>> completed. An offset field for the SliceRange would have worked, but I
>> don't see it. Is there a way to divide the big read query into
>> multiple *parallel* small read queries without requiring advance
>> knowledge of the column names?
>>
>> CK.