Re: Select in allow filtering stalls whole cluster. How to prevent such behavior?

shalom sagges Tue, 28 May 2019 00:00:36 -0700

Hi Attila,

I'm definitely no guru, but I've seen several cases where people at
my company used ALLOW FILTERING and caused major performance issues.
The impact grows with data size, and large partitions make it worse.
GC is affected too: if stop-the-world pauses are long and frequent enough,
you will feel it.
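
To illustrate why the whole cluster gets involved, here is a toy Python
sketch (not Cassandra code). It assumes the query does not restrict the
partition key, and it uses a simplified hash ring I made up for the
example; real clusters use Murmur3 tokens and vnodes. The 8 nodes and
replication factor 3 are taken from Vsevolod's original mail.

```python
import hashlib

NODES = 8   # cluster size from the original report
RF = 3      # replication factor from the original report


def owner(partition_key):
    """Toy partitioner (an assumption): hash the key onto one of NODES slots."""
    digest = hashlib.md5(partition_key.encode()).hexdigest()
    return int(digest, 16) % NODES


def replicas(partition_key):
    """Primary owner plus the next RF - 1 nodes on the toy ring."""
    first = owner(partition_key)
    return {(first + i) % NODES for i in range(RF)}


def nodes_for_key_lookup(partition_key):
    """A query that names the partition key only involves its replicas."""
    return replicas(partition_key)


def nodes_for_filtering_scan():
    """Without a partition key any node may hold matching rows,
    so every token range (and therefore every node) has to be read."""
    return set(range(NODES))


if __name__ == "__main__":
    print("key lookup touches:", sorted(nodes_for_key_lookup("user-42")))
    print("filtering scan touches:", sorted(nodes_for_filtering_scan()))
```

In this model a per-partition select involves 3 of the 8 nodes, while the
scan involves all of them, and each node has to read through all the data
it holds.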


I sincerely believe the best way would be to educate the users and remodel
the data. Perhaps you need to denormalize your tables or at least use
secondary indices (I prefer to keep it as simple as possible and
denormalize).
If it's a cluster for analytics, perhaps you need to build a dedicated
cluster just for that, so that if something does break or comes under too
much pressure, normal activity isn't affected. There are pros and cons to
that idea too.
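
On the denormalization point, a minimal Python sketch of the idea, with
plain dictionaries standing in for tables. The table names, columns and
data are hypothetical, made up only for this example. The same row is
written once per query pattern, so the read by the second column becomes a
single-partition lookup instead of a scan.

```python
from collections import defaultdict

# Hypothetical tables: events are written by user_id but also queried by country.
events_by_user = defaultdict(list)     # partition key: user_id
events_by_country = defaultdict(list)  # partition key: country (denormalized copy)


def insert_event(user_id, country, payload):
    """Write the same event to both tables, one per query pattern."""
    row = {"user_id": user_id, "country": country, "payload": payload}
    events_by_user[user_id].append(row)
    events_by_country[country].append(row)


def find_by_country_filtering(country):
    """Roughly what filtering amounts to: read every row, discard most."""
    scanned = 0
    hits = []
    for rows in events_by_user.values():
        for row in rows:
            scanned += 1
            if row["country"] == country:
                hits.append(row)
    return hits, scanned


def find_by_country_denormalized(country):
    """Single-partition read against the table keyed by country."""
    hits = list(events_by_country.get(country, []))
    return hits, len(hits)
```

Both functions return the matching rows plus the number of rows that had
to be read. The filtering variant reads the whole table no matter how few
rows match, which is why it gets worse as the data grows.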

Hope this helps.

Regards,


On Tue, May 28, 2019 at 9:43 AM Attila Wind <attilaw@swf.technology> wrote:

> Hi Gurus,
>
> It looks like this thread has stalled. However, I would be very curious to
> hear answers regarding b) ...
>
> Does anyone have any comments on that?
> I do see this as a potential production outage risk now... Especially as
> we are planning to run analysis queries by hand exactly like that over the
> cluster...
>
> thanks!
> Attila Wind
>
> http://www.linkedin.com/in/attilaw
> Mobile: +36 31 7811355
>
>
> On 2019. 05. 23. 11:42, shalom sagges wrote:
>
> a) Interesting... But only in case you do not provide partitioning key
> right? (so IN() is for partitioning key?)
>
> I think you should ask yourself a different question. Why am I using ALLOW
> FILTERING in the first place? What happens if I remove it from the query?
> I prefer to denormalize the data to multiple tables or at least create an
> index on the requested column (preferably queried together with a known
> partition key).
>
> b) Still does not explain or justify "all 8 nodes to halt and
> unresponsiveness to external requests" behavior... Even if servers are busy
> with the request seriously becoming non-responsive...?
>
> I think it can justify the unresponsiveness. When using ALLOW FILTERING,
> you are doing something like a full table scan in a relational database.
>
> There is a lot of information on the internet regarding this subject such
> as
> https://www.instaclustr.com/apache-cassandra-scalability-allow-filtering-partition-keys/
>
> Hope this helps.
>
> Regards,
>
> On Thu, May 23, 2019 at 7:33 AM Attila Wind <attilaw@swf.technology> wrote:
>
>> Hi,
>>
>> "When you run a query with allow filtering, Cassandra doesn't know where
>> the data is located, so it has to go node by node, searching for the
>> requested data."
>>
>> a) Interesting... But only in case you do not provide partitioning key
>> right? (so IN() is for partitioning key?)
>>
>> b) Still does not explain or justify "all 8 nodes to halt and
>> unresponsiveness to external requests" behavior... Even if servers are busy
>> with the request seriously becoming non-responsive...?
>>
>> cheers
>> Attila Wind
>>
>> http://www.linkedin.com/in/attilaw
>> Mobile: +36 31 7811355
>>
>>
>> On 2019. 05. 23. 0:37, shalom sagges wrote:
>>
>> Hi Vsevolod,
>>
>> 1) Why such behavior? I thought any given SELECT request is handled by a
>> limited subset of C* nodes and not by all of them, as per connection
>> consistency/table replication settings, in case.
>> When you run a query with allow filtering, Cassandra doesn't know where
>> the data is located, so it has to go node by node, searching for the
>> requested data.
>>
>> 2) Is it possible to forbid ALLOW FILTERING flag for given users/groups?
>> I'm not familiar with such a flag. In my case, I just try to educate the
>> R&D teams.
>>
>> Regards,
>>
>> On Wed, May 22, 2019 at 5:01 PM Vsevolod Filaretov <vsfilare...@gmail.com>
>> wrote:
>>
>>> Hello everyone,
>>>
>>> We have an 8 node C* cluster with large volume of unbalanced data. Usual
>>> per-partition selects work somewhat fine, and are processed by limited
>>> number of nodes, but if user issues SELECT WHERE IN () ALLOW FILTERING,
>>> such command stalls all 8 nodes to halt and unresponsiveness to external
>>> requests while disk IO jumps to 100% across whole cluster. In several
>>> minutes all nodes seem to finish processing the request and cluster goes
>>> back to being responsive. Replication level across whole data is 3.
>>>
>>> 1) Why such behavior? I thought any given SELECT request is handled by a
>>> limited subset of C* nodes and not by all of them, as per connection
>>> consistency/table replication settings, in case.
>>>
>>> 2) Is it possible to forbid ALLOW FILTERING flag for given users/groups?
>>>
>>> Thank you all very much in advance,
>>> Vsevolod Filaretov.
>>>
>>
