Shenghua,
 
> The problem is the user might only want all the data via a "select *"
> like statement. It seems that 257 connections to query the rows are necessary.
> However, is there any way to prohibit 257 concurrent connections?


Your reasoning is correct.
The number of connections should be tunable via the
"cassandra.input.split.size" property. See
ConfigHelper.setInputSplitSize(..)

The problem is that vnodes completely trash this, since the splits
returned don't span across vnodes, so you get at least one split per
vnode no matter how large the split size is.
There's an issue open for this:
https://issues.apache.org/jira/browse/CASSANDRA-6091
but part of the problem is that the thrift stuff involved here is
getting rewritten¹ to be pure cql.

In the meantime you can override the CqlInputFormat and manually re-merge
splits together where their location sets match, so as to better honour
inputSplitSize and to get back to a more reasonable number of connections.
We do this, using code similar to this patch
https://github.com/michaelsembwever/cassandra/pull/2/files
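
To give a rough idea of what "re-merge where location sets match" means,
here is a minimal standalone sketch. It is not the code from the patch and
it does not touch the real Hadoop/Cassandra classes: Split, MergedSplit and
merge(..) are hypothetical stand-ins, and "length" is assumed to be the
estimated row count of a split. The idea is to group the per-vnode splits
by their replica set, then pack each group up to the target split size.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SplitMerger {
    /** Hypothetical stand-in for one per-vnode input split. */
    static final class Split {
        final String startToken, endToken;
        final long length;              // assumed: estimated rows in the split
        final Set<String> locations;    // replica hosts serving this range
        Split(String startToken, String endToken, long length, Set<String> locations) {
            this.startToken = startToken;
            this.endToken = endToken;
            this.length = length;
            this.locations = locations;
        }
    }

    /** Hypothetical container for several splits read over one connection. */
    static final class MergedSplit {
        final List<Split> parts;
        final long length;
        final Set<String> locations;
        MergedSplit(List<Split> parts, long length, Set<String> locations) {
            this.parts = parts;
            this.length = length;
            this.locations = locations;
        }
    }

    /** Packs splits that share an identical location set, up to targetSize rows each. */
    static List<MergedSplit> merge(List<Split> splits, long targetSize) {
        // Group by location set, keeping the order in which sets first appear.
        Map<Set<String>, List<Split>> byLocation = new LinkedHashMap<>();
        for (Split s : splits) {
            byLocation.computeIfAbsent(s.locations, k -> new ArrayList<>()).add(s);
        }
        List<MergedSplit> merged = new ArrayList<>();
        for (Map.Entry<Set<String>, List<Split>> e : byLocation.entrySet()) {
            List<Split> current = new ArrayList<>();
            long size = 0;
            for (Split s : e.getValue()) {
                // Close the current group if adding this split would overshoot the target.
                if (!current.isEmpty() && size + s.length > targetSize) {
                    merged.add(new MergedSplit(new ArrayList<>(current), size, e.getKey()));
                    current.clear();
                    size = 0;
                }
                current.add(s);
                size += s.length;
            }
            if (!current.isEmpty()) {
                merged.add(new MergedSplit(new ArrayList<>(current), size, e.getKey()));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Set<String> a = new HashSet<>(Arrays.asList("node1", "node2"));
        Set<String> b = new HashSet<>(Arrays.asList("node2", "node3"));
        List<Split> splits = Arrays.asList(
                new Split("0", "10", 40, a),
                new Split("10", "20", 40, b),
                new Split("20", "30", 40, a));
        System.out.println(merge(splits, 100).size() + " merged splits");
    }
}
```

In a real override the merged split would have to carry several token
ranges, since the ranges that share a replica set are generally not
contiguous, and the record reader would need to iterate over them.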

~mck

¹ https://issues.apache.org/jira/browse/CASSANDRA-8358
