Re: Performance Of IN Queries On Wide Rows

Carl Mueller Wed, 21 Feb 2018 13:35:40 -0800

Cassandra 2.1.14 is missing some wide-row optimizations made in later
Cassandra releases, IIRC.


Speculation: IN won't matter; it will load the entire wide row into memory
regardless, which might spike your GC/heap and overflow the row cache.

On Wed, Feb 21, 2018 at 2:16 PM, Gareth Collins <gareth.o.coll...@gmail.com>
wrote:

> Thanks for the response!
>
> I could understand that being the case if the Cassandra cluster is not
> loaded. Splitting the work across multiple nodes would obviously make
> the query faster.
>
> But if this was just a single node, shouldn't one IN query be faster
> than multiple due to the fact that, if I understand correctly,
> Cassandra should need to do less work?
>
> thanks in advance,
> Gareth
>
> On Wed, Feb 21, 2018 at 7:27 AM, Rahul Singh
> <rahul.xavier.si...@gmail.com> wrote:
> > That depends on the driver you use, but separate queries issued
> > asynchronously around the cluster would be faster.
> >
> >
> > --
> > Rahul Singh
> > rahul.si...@anant.us
> >
> > Anant Corporation
> >
> > On Feb 20, 2018, 6:48 PM -0500, Eric Stevens <migh...@gmail.com>, wrote:
> >
> > Someone can correct me if I'm wrong, but I believe if you do a large IN()
> > on a single partition's clustering keys, all the reads are going to be
> > served from a single replica.  With many concurrent individual equality
> > statements, by comparison, you can get the performance gain of leaning on
> > several replicas for parallelism.
> >
> > On Tue, Feb 20, 2018 at 11:43 AM Gareth Collins
> > <gareth.o.coll...@gmail.com> wrote:
> >>
> >> Hello,
> >>
> >> When querying large wide rows for multiple specific values, is it
> >> better to do separate queries for each value...or do it with one query
> >> and an "IN"? I am using Cassandra 2.1.14.
> >>
> >> I am asking because I had changed my app to use 'IN' queries and it
> >> **appears** to be slower rather than faster. I had assumed that the
> >> "IN" query should be faster...as I assumed it only needs to go down
> >> the read path once (i.e. row cache -> memtable -> key cache -> bloom
> >> filter -> index summary -> index -> compression offset map -> sstable)
> >> rather than
> >> once for each entry? Or are there some additional caveats that I
> >> should be aware of for 'IN' query performance (e.g. ordering of 'IN'
> >> query entries, closeness of 'IN' query values in the SSTable etc.)?
> >>
> >> thanks in advance,
> >> Gareth Collins
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> >> For additional commands, e-mail: user-h...@cassandra.apache.org
> >>
> >
>
>
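
For anyone comparing the two approaches discussed in this thread, here is a
rough sketch of the two query shapes: one IN() request versus one equality
request per clustering key, issued concurrently. Everything named here is an
assumption for illustration and not from the thread: the table
wide_rows(pk, ck, value), the %s placeholder style, and the execute callable,
which stands in for whatever call your driver provides. fake_execute is an
in-memory stand-in so the sketch runs without a cluster. Whether the
concurrent requests really land on different replicas depends on the driver
and its load-balancing policy.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical table: wide_rows(pk, ck, value), where pk is the partition key
# and ck is the clustering key. The names are illustrative only.
IN_QUERY = "SELECT ck, value FROM wide_rows WHERE pk = %s AND ck IN %s"
EQ_QUERY = "SELECT ck, value FROM wide_rows WHERE pk = %s AND ck = %s"


def fetch_with_in(execute, pk, cks):
    """Send one request that asks for every clustering key at once."""
    return list(execute(IN_QUERY, (pk, tuple(cks))))


def fetch_concurrently(execute, pk, cks, max_workers=8):
    """Send one request per clustering key, all in flight concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(execute, EQ_QUERY, (pk, ck)) for ck in cks]
        rows = []
        # Collect in submission order so results line up with cks.
        for future in futures:
            rows.extend(future.result())
    return rows


def fake_execute(cql, params):
    """In-memory stand-in for a driver call (assumption, not a real API)."""
    data = {("p1", 1): "a", ("p1", 2): "b", ("p1", 3): "c"}
    pk, ck = params
    wanted = ck if isinstance(ck, tuple) else (ck,)
    return [(c, data[(pk, c)]) for c in wanted if (pk, c) in data]
```

Both functions return the same rows for the same inputs, so you can swap one
for the other and time them against your own cluster to see which is faster
for your row sizes and version.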
