Re: Designing Affinity Key for more locality

2021-04-29 Thread William.L
I am using a user-centric modeling approach where most of the computations
would be on a per-user basis (joins) before aggregation. The idea is to put
the data (across different tables/caches) for the same user in the same
partition/server. That is why I chose user-id as the affinity key.

Using tenant/group as the affinity key does not scale well. Some
tenant/group datasets might be too large for one partition/server (we are
using persistence mode). Even if a dataset does fit, the computation would
not benefit from load balancing across the servers.
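
To illustrate the skew concern, here is a rough sketch rather than anything Ignite-specific. It assumes a plain hash-modulo mapping as a stand-in for the real affinity function, and the tenant sizes (9000, 500, 500) and the class name AffinitySkewDemo are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class AffinitySkewDemo {
    static final int PARTITIONS = 1024;

    // Simplified stand-in for an affinity function (assumption, not Ignite's
    // actual algorithm): hash the affinity key into one of 1024 partitions.
    static int partitionOf(String affinityKey) {
        return Math.floorMod(affinityKey.hashCode(), PARTITIONS);
    }

    // Returns the number of rows held by the most heavily loaded partition.
    static int maxPartitionLoad(List<String> affinityKeys) {
        int[] load = new int[PARTITIONS];
        int max = 0;
        for (String key : affinityKeys) {
            int p = partitionOf(key);
            load[p]++;
            max = Math.max(max, load[p]);
        }
        return max;
    }

    // Builds one affinity key per user row for three hypothetical tenants,
    // one of which is much larger than the others.
    static List<String> keys(boolean byTenant) {
        int[] usersPerTenant = {9000, 500, 500};
        List<String> keys = new ArrayList<>();
        for (int t = 0; t < usersPerTenant.length; t++) {
            for (int u = 0; u < usersPerTenant[t]; u++) {
                String tenant = "tenant-" + t;
                keys.add(byTenant ? tenant : tenant + "/user-" + u);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println("max load, tenant as affinity key: " + maxPartitionLoad(keys(true)));
        System.out.println("max load, user as affinity key:   " + maxPartitionLoad(keys(false)));
    }
}
```

With the tenant as the affinity key, all 9000 rows of the large tenant end up on a single partition, while keying by user spreads the same rows over many partitions.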

--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: Designing Affinity Key for more locality

2021-04-28 Thread Ilya Kasnacheev
Hello!

The SQL query planner will not understand the locality if you use a surrogate
value as the affinity key.

Maybe you need to define your own affinity function (one that extends
RendezvousAffinityFunction) which will map keys to partitions. I'm not sure
that it will help the query planner, though.

Regards,
-- 
Ilya Kasnacheev


On Tue, Apr 27, 2021 at 9:01 AM Pavel Tupitsyn wrote:

> Hi William,
>
> Can you describe the use case and domain model in more detail?
>
> 1. AffinityKey is used to colocate some data with other data.
>    What do you achieve with user-id being the affinity key?
>
> 2. If you'd like to put all users for a given tenant/group
> to the same node for efficiency, then use tenant-id as the user
> affinity key.
> UUID is fine, no need for extra logic with ints.
>
> On Tue, Apr 27, 2021 at 5:33 AM William.L  wrote:
>
>> Came across this statement in the Data Partitioning documents:
>>
>> "The affinity function determines the mapping between keys and partitions.
>> Each partition is identified by a number from a limited set (0 to 1023 by
>> default)."
>>
>> Looks like there is no point for adding another layer of mapping unless I
>> am going for a smaller number.
>> Are there other ways in ignite to get more locality for subset of the
>> data?
>


Re: Designing Affinity Key for more locality

2021-04-27 Thread Pavel Tupitsyn
Hi William,

Can you describe the use case and domain model in more detail?

1. AffinityKey is used to colocate some data with other data.
   What do you achieve with user-id being the affinity key?

2. If you'd like to put all users for a given tenant/group
to the same node for efficiency, then use tenant-id as the user
affinity key.
UUID is fine, no need for extra logic with ints.
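
A minimal sketch of what point 2 amounts to, under stated assumptions: the UserKey class and the hash-modulo partitionOf helper below are hypothetical stand-ins, not Ignite classes. In Ignite itself the affinity field of a key class is typically marked with the @AffinityKeyMapped annotation; that part is left out here so the example stays self-contained.

```java
import java.util.Objects;
import java.util.UUID;

final class ColocationSketch {
    static final int PARTITIONS = 1024;

    // Hypothetical composite key: identity is (userId, tenantId),
    // but only tenantId decides where the entry is placed.
    static final class UserKey {
        final UUID userId;
        final UUID tenantId; // plays the role of the affinity key

        UserKey(UUID userId, UUID tenantId) {
            this.userId = Objects.requireNonNull(userId);
            this.tenantId = Objects.requireNonNull(tenantId);
        }

        Object affinityKey() {
            return tenantId;
        }
    }

    // Simplified stand-in for an affinity function (assumption): a UUID is
    // hashed directly, with no intermediate integer bucket.
    static int partitionOf(Object affinityKey) {
        return Math.floorMod(affinityKey.hashCode(), PARTITIONS);
    }
}
```

Two UserKey instances with different userId values but the same tenantId yield the same partition number, which is the colocation effect described above.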

On Tue, Apr 27, 2021 at 5:33 AM William.L  wrote:

> Came across this statement in the Data Partitioning documents:
>
> "The affinity function determines the mapping between keys and partitions.
> Each partition is identified by a number from a limited set (0 to 1023 by
> default)."
>
> Looks like there is no point for adding another layer of mapping unless I
> am going for a smaller number.
> Are there other ways in ignite to get more locality for subset of the data?


Re: Designing Affinity Key for more locality

2021-04-26 Thread William.L
Came across this statement in the Data Partitioning documents:

"The affinity function determines the mapping between keys and partitions.
Each partition is identified by a number from a limited set (0 to 1023 by
default)."

It looks like there is no point in adding another layer of mapping unless I
am going for a smaller number of buckets.
Are there other ways in Ignite to get more locality for a subset of the data?


Designing Affinity Key for more locality

2021-04-26 Thread William.L
Hi,

I am currently using user-id as the affinity key, and because it is a
UUID/string, the data is distributed across all partitions in the cluster.
However, in my scenario I am writing and reading users that belong to the
same tenant/group, so it seems better to design for more read/write locality
within a partition.

One approach I am thinking about is to hash the tenant-id + user-id into
1024 integer values (think of it as logical buckets) which I will use as the
affinity key. This way I can still get the colocation of user data while
also getting more locality within a partition.
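
A minimal sketch of the bucketing idea, assuming CRC32 as the hash so that the bucket is stable across clients. The class name AffinityBuckets, the ":" separator, and the choice of CRC32 are illustrative assumptions, not anything Ignite prescribes.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

final class AffinityBuckets {
    static final int BUCKETS = 1024;

    // Maps tenant-id + user-id to a logical bucket in the range [0, 1023].
    // The same pair always yields the same bucket, so all rows for one user
    // share one affinity value.
    static int bucketFor(String tenantId, String userId) {
        CRC32 crc = new CRC32();
        crc.update((tenantId + ":" + userId).getBytes(StandardCharsets.UTF_8));
        // getValue() is an unsigned 32-bit value held in a long, so the
        // remainder is never negative.
        return (int) (crc.getValue() % BUCKETS);
    }
}
```

The resulting integer would then be stored on every table that holds the user's data and declared as the affinity key there.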

The question is whether this approach has any negative trade-offs in Ignite.

Note, I am also using Ignite SQL, so I plan to set this integer as a SQL
field so that I can do colocated joins on the user. Is this even necessary
if distributed joins are disabled?

Thanks.