Hello Ilya,
I agree with you that partitioning based on month was a bad example,
because most partitions will be idle. Country or customer are better
examples of my case: there is a limited number of them, they are
disproportionate in size, and they are all always active. Let's take the
country example. I need to search and aggregate the volume of sales by
city and by country, and I have a couple of hundred countries.
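To make it concrete, the aggregation I keep running looks roughly like
this (just a sketch; the "sales" cache name and the Sale table/columns are
illustrative assumptions, not my real schema):

    import java.util.List;
    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.cache.query.SqlFieldsQuery;

    public class SalesReport {
        // Sum sales per city within each country, across the whole cluster.
        static List<List<?>> salesByCountryAndCity(Ignite ignite) {
            IgniteCache<?, ?> sales = ignite.cache("sales");
            SqlFieldsQuery qry = new SqlFieldsQuery(
                "SELECT countryCode, city, SUM(amount) " +
                "FROM Sale GROUP BY countryCode, city");
            return sales.query(qry).getAll();
        }
    }
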
Let me ask a basic question. If my queries/aggregations are based on
cities and countries, do I need to partition based on countries (or even
cities)? I want to avoid network hops for my searches and aggregations as
much as possible. (I do not want to slow writes either, but I am aware of
the trade-off between reads/writes and replication/partitioning.) What do
I define my affinity key on, and what do I partition on?
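Concretely, is something along these lines the right direction, or is
country too coarse a key? (Again just a sketch; SaleKey is a hypothetical
key class, and I'm assuming the annotation-based affinity mapping here.)

    import org.apache.ignite.cache.affinity.AffinityKeyMapped;

    public class SaleKey {
        // Unique id of the individual sale.
        private final long saleId;

        // Candidate affinity field: every sale for a given country maps to
        // the same partition, so per-country aggregations stay node-local.
        // The flip side is that a few big countries could then dominate a
        // few partitions, which is the skew problem you described.
        @AffinityKeyMapped
        private final String countryCode;

        public SaleKey(long saleId, String countryCode) {
            this.saleId = saleId;
            this.countryCode = countryCode;
        }
    }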

Thanks again for your help.

On Fri, Aug 23, 2019 at 4:03 AM Ilya Kasnacheev <ilya.kasnach...@gmail.com>
wrote:

> Hello!
>
> Partitioning based on, let's say, user id is usually fair, because there
> are usually hundreds of thousands of users and none of them owns a
> disproportionate amount of data.
>
> Partitioning by month is especially bad, since in any given month all of
> the partitions will be basically idle save for one, and there would be a
> lot of contention on it.
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> Thu, 22 Aug 2019 at 19:31, narges saleh <snarges...@gmail.com>:
>
>> I am not sure you can find real-world examples where caches can be evenly
>> partitioned if the partitioning factor is an affinity key. I am comparing
>> this with partitioning in relational databases, say partitioning based on
>> month of the year. I definitely don't have 100s of departments, but I do
>> have 10s of departments, and departments are very disproportionate in size.
>> As for the rebalancing case, pods will be added to the system as the
>> volume increases, so I'd assume that would prompt Ignite to rebalance.
>>
>> On Thu, Aug 22, 2019 at 11:00 AM Ilya Kasnacheev <
>> ilya.kasnach...@gmail.com> wrote:
>>
>>> Hello!
>>>
>>> 1) No. Ignite only rebalances data when nodes are joining or leaving the
>>> cluster.
>>> 2) Ignite's affinity is not really well suited to such detailed manual
>>> assignment. It is assumed that your cache has a large number of partitions
>>> (e.g. 1024) and that data is distributed evenly between all of them. Having
>>> department as the affinity key is suboptimal because there are not many
>>> departments and they usually vary in size. That's exactly the kind of
>>> distribution you want to avoid.
>>>
>>> Regards,
>>> --
>>> Ilya Kasnacheev
>>>
>>>
>>> Thu, 22 Aug 2019 at 18:37, narges saleh <snarges...@gmail.com>:
>>>
>>>> Thanks, Ilya, for the replies.
>>>> 1) Doesn't Ignite rebalance if there are additional nodes available and
>>>> the data doesn't fit on the cache's current Ignite node? Consider a
>>>> scenario where I have 100 pods on a physical node, assuming pod = Ignite
>>>> node.
>>>> 2) I am not sure what you mean by confining half of a cache to one
>>>> cluster group and the other half to another. If my affinity key is
>>>> department id, why can't I have department A in a partitioned cache, with
>>>> one partition on a node in cluster A and another partition on a node in
>>>> another cluster?
>>>>
>>>> I might be misunderstanding the whole thing, and I'd appreciate
>>>> clarification.
>>>>
>>>> On Thu, Aug 22, 2019 at 6:52 AM Ilya Kasnacheev <
>>>> ilya.kasnach...@gmail.com> wrote:
>>>>
>>>>> Hello!
>>>>>
>>>>> 1) When there is an overflow, either page eviction kicks in or, if it
>>>>> is disabled, you get an IgniteOOM, after which the node is no longer
>>>>> usable. Please avoid overflowing any data regions, since there is no
>>>>> graceful handling currently.
>>>>> 2) I don't think so. You can't easily confine half of a cache's data to
>>>>> one cluster group and the other half to another group.
>>>>>
>>>>> Such scenarios are not recommended. We expect all partitions to have
>>>>> roughly the same amount of data, not a few gargantuan partitions that
>>>>> don't fit on a single node.
>>>>>
>>>>> Regards,
>>>>> --
>>>>> Ilya Kasnacheev
>>>>>
>>>>>
>>>>> Tue, 20 Aug 2019 at 06:29, narges saleh <snarges...@gmail.com>:
>>>>>
>>>>>> Hello All,
>>>>>>
>>>>>> I'd appreciate your answers to my questions.
>>>>>>
>>>>>> 1) Assuming I use an affinity key among 4 caches and they all end up
>>>>>> on the same Ignite node: what happens when there is an overflow? Does
>>>>>> the overflow data end up on a newly joined node? How do I keep the
>>>>>> related data from all the caches close to each other when the volume
>>>>>> exceeds a single node?
>>>>>>
>>>>>> 2) Is there a concept of cluster affinity, meaning having a cluster
>>>>>> group defined based on some affinity key? For example, if I have two
>>>>>> departments A and B, can I have a cluster group for department A and
>>>>>> another for department B?
>>>>>>
>>>>>> Thanks,
>>>>>> Narges
>>>>>>
>>>>>
