Re: Performance Of IN Queries On Wide Rows

2018-02-21 Thread Jeff Jirsa
Slight nuance: we don't load the whole row into memory, but the column index (and the result set, and the tombstones in the partition), which can still spike your GC/heap (and potentially overflow the row cache, if you have it on, which is atypical). On Wed, Feb 21, 2018 at 1:35 PM, Carl Mueller

Re: Performance Of IN Queries On Wide Rows

2018-02-21 Thread Carl Mueller
Cassandra 2.1.14 is missing some wide-row optimizations done in later Cassandra releases, IIRC. Speculation: IN won't matter; it will load the entire wide row into memory regardless, which might spike your GC/heap and overflow the row cache. On Wed, Feb 21, 2018 at 2:16 PM, Gareth Collins

Re: Performance Of IN Queries On Wide Rows

2018-02-21 Thread Gareth Collins
Thanks for the response! I could understand that being the case if the Cassandra cluster is not loaded. Splitting the work across multiple nodes would obviously make the query faster. But if this was just a single node, shouldn't one IN query be faster than multiple due to the fact that, if I

Re: Performance Of IN Queries On Wide Rows

2018-02-21 Thread Rahul Singh
That depends on the driver you use, but separate queries issued asynchronously around the cluster would be faster. -- Rahul Singh rahul.si...@anant.us Anant Corporation On Feb 20, 2018, 6:48 PM -0500, Eric Stevens , wrote: > Someone can correct me if I'm wrong, but I believe if you

Re: Performance Of IN Queries On Wide Rows

2018-02-20 Thread Eric Stevens
Someone can correct me if I'm wrong, but I believe if you do a large IN() on a single partition's clustering keys, all the reads are going to be served from a single replica. Compared to many concurrent individual equality statements you can get the performance gain of leaning on several replicas for
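
For illustration, here is a minimal sketch of the two query shapes being compared: one IN() statement versus one equality statement per clustering value, issued concurrently. Everything in it is an assumption made for the example, not something from the thread: the table and column names are placeholders, and `run_query` is a hypothetical callable standing in for whatever your driver uses to execute a prepared statement. It does not talk to Cassandra itself.

```python
from concurrent.futures import ThreadPoolExecutor


def build_in_query(table, partition_col, clustering_col, n_values):
    """Build one statement that asks for all clustering values at once."""
    placeholders = ", ".join(["?"] * n_values)
    return (
        f"SELECT * FROM {table} "
        f"WHERE {partition_col} = ? AND {clustering_col} IN ({placeholders})"
    )


def build_eq_query(table, partition_col, clustering_col):
    """Build the single-value statement that is reused for every value."""
    return (
        f"SELECT * FROM {table} "
        f"WHERE {partition_col} = ? AND {clustering_col} = ?"
    )


def fetch_individually(run_query, table, partition_col, clustering_col,
                       partition_key, clustering_values, max_workers=8):
    """Issue one equality query per clustering value, concurrently.

    run_query is a hypothetical callable (cql, params) -> list of rows;
    swap in your driver's execute call. Results come back in input order.
    """
    cql = build_eq_query(table, partition_col, clustering_col)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_value = pool.map(
            lambda value: run_query(cql, (partition_key, value)),
            clustering_values,
        )
        # Flatten the per-value row lists into one result list.
        return [row for rows in per_value for row in rows]
```

Whether the concurrent version actually wins will depend on the driver, the cluster load, and the replication setup, as the replies above point out.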

Performance Of IN Queries On Wide Rows

2018-02-20 Thread Gareth Collins
Hello, when querying large wide rows for multiple specific values, is it better to do separate queries for each value... or do it with one query and an "IN"? I am using Cassandra 2.1.14. I am asking because I had changed my app to use 'IN' queries and it **appears** to be slower rather than faster.
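
One way to turn "appears to be slower" into a number is to time both approaches against the same keys. The sketch below is only a generic timing harness under stated assumptions: `run_in_query` and `run_separate_queries` are hypothetical zero-argument callables that you would write around your own driver calls, and nothing here is specific to Cassandra.

```python
import time
from statistics import median


def time_call(fn, repeats=5):
    """Return the median wall-clock seconds over `repeats` calls to fn()."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return median(samples)


def compare(run_in_query, run_separate_queries, repeats=5):
    """Time both hypothetical callables and report the median of each."""
    return {
        "in_query_s": time_call(run_in_query, repeats),
        "separate_queries_s": time_call(run_separate_queries, repeats),
    }
```

Taking the median over several runs reduces the effect of a single GC pause or cache miss on the comparison.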