Hi everyone, I was playing with a single-node Cassandra installation when I discovered that a query like [SELECT COUNT(*) FROM CF] seems to load the entire dataset of CF into RAM. I am not sure whether this is the expected behaviour. I'd expect it to iterate through the rows one by one rather than collect their values in memory.
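To illustrate what I mean by iterating rather than collecting, here is a minimal Python sketch. It is not Cassandra code and says nothing about how Cassandra actually implements COUNT; the names `make_fetch_page`, `count_streaming`, and `count_collecting`, the row shape, and the page size are all hypothetical, and an in-memory list stands in for the table.

```python
from typing import Callable, List, Optional, Tuple

Row = Tuple[int, str]                    # hypothetical row shape: (k, val)
Page = Tuple[List[Row], Optional[int]]   # rows plus the next offset, or None at the end


def make_fetch_page(rows: List[Row], page_size: int) -> Callable[[int], Page]:
    """Build a hypothetical pager over an in-memory list standing in for a table."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")

    def fetch_page(offset: int) -> Page:
        chunk = rows[offset:offset + page_size]
        next_offset = offset + page_size
        return chunk, (next_offset if next_offset < len(rows) else None)

    return fetch_page


def count_streaming(fetch_page: Callable[[int], Page]) -> int:
    """Count page by page; memory is bounded by the page size, not the table size."""
    total = 0
    offset: Optional[int] = 0
    while offset is not None:
        chunk, offset = fetch_page(offset)
        total += len(chunk)  # only the length is kept, the rows are not retained
    return total


def count_collecting(fetch_page: Callable[[int], Page]) -> int:
    """Count by collecting every row first; memory grows with the whole table."""
    collected: List[Row] = []
    offset: Optional[int] = 0
    while offset is not None:
        chunk, offset = fetch_page(offset)
        collected.extend(chunk)  # every row stays referenced until the end
    return len(collected)
```

Both functions return the same number; the difference is only in how much data is held at once. What I observe looks like the second variant, while I would have expected something closer to the first.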
My steps:

    create table big_table ( k int primary key, idx bigint, val ascii, ts timestamp );
    create index on big_table (idx);

I filled the table above with 400 random rows, where the column 'val' was written with random strings of 10MB each. That gave me roughly 4GB of data. At this point everything is fine: response times are good and memory consumption is adequate.

Things go bad with a counting query like [SELECT COUNT(1) FROM big_table], which makes the database die with an OOM error. However, it is possible to fetch any column except the huge one: [SELECT k FROM big_table] works okay.

As far as I understand, a counting query works roughly the same way as [SELECT * FROM], with the only difference being that it doesn't return any data. Is my reasoning correct?

Thanks in advance,
Pavel.