Re: Resurrect FairAffinityFunction

Vladimir Ozerov Wed, 16 Aug 2017 02:00:07 -0700

I am still not quite understand why we bother so much about partition
migration while we already have RendezvousAffinityFunction? Looks like we
tried to make "fair" be closer to "rendezvous", and then tried to make
"rendezvous" closer to "fair", but without any success:
- we are still not satisfied with distribution of "rendezvous", so we want
to return "fair" back
- and "fair" cannot co-locate caches properly, so things like SQL and
"affinityCall" are broken for it


Can we add a flag to FairAffinityFunction to control whether it should
previous history or not? Is yes - it is not deterministic, if no - it is
deterministic.


On Wed, Aug 16, 2017 at 11:29 AM, Vladimir Ozerov <[email protected]>
wrote:

> I already explained why checking specifically for FairAffintiyFunction in
> SQL is wrong.
>
> On Wed, Aug 16, 2017 at 10:00 AM, Dmitriy Setrakyan <[email protected]
> > wrote:
>
>> Vladimir, I think you should look more carefully at what FairAffinity does
>> and why it exists. There is no way to "fix" it not to accept previous
>> state
>> - it must be provided.
>>
>> I do not understand why we need to change anything in code when there is a
>> simple solution being proposed here. Why not just add a validation to SQL
>> when fair affinity is used with JOINs and be done with it?
>>
>> Alexey G, given that you originally wrote the FairAffinity function, can
>> you please comment?
>>
>> D.
>>
>> On Tue, Aug 15, 2017 at 10:41 PM, Vladimir Ozerov <[email protected]>
>> wrote:
>>
>> > Dima,
>> >
>> > It is not hard to implement. It is hard to reason on whether your query
>> > will fail or not. Moreover, cache groups is an antipattern for SQL,
>> > personally I do not want users to use it unless absolutely needed (large
>> > topologies, large number of caches). Also take in count that the same
>> > problem with different partition distribution holds for any two caches
>> with
>> > different affinity functions, so the problem is not tied to
>> > FairAffinityFunction only.
>> >
>> > IMO correct fix should be as follows:
>> > 1) Add requirement that every affinity function must provide sensible
>> > implementation of equals/hashCode
>> > 2) Add "boolean deterministic()" method to affinity function interface;
>> > "true" means that function doesn't depend on any external things, such
>> as
>> > topology history
>> > 3) Fail SQL if there are at least two PARTITIONED caches with different
>> or
>> > non-deterministic affinity functions, and distributed joins are not
>> enabled
>> > 4) Fix FairAffinityFunction and make it deterministic, it should not
>> depend
>> > on previous state.
>> >
>> > Makes sense?
>> >
>> > On Wed, Aug 16, 2017 at 3:31 AM, Dmitriy Setrakyan <
>> [email protected]>
>> > wrote:
>> >
>> > > On Tue, Aug 15, 2017 at 1:12 PM, Vladimir Ozerov <
>> [email protected]>
>> > > wrote:
>> > >
>> > > > I do not like the idea as it would make it very hard to reason about
>> > > > whether your SQL will fail or not. Let's looks at the problem from
>> the
>> > > > different angle. I have this question for years - why in the world
>> > *fair*
>> > > > affinity function, whose only ultimate goal is to provide equal
>> > partition
>> > > > distribution, depends on it's own previous state? Can we re-design
>> in a
>> > > way
>> > > > that it depends only on partition count and current topology state?
>> > > >
>> > >
>> > > Vladimir, we must know previous state, otherwise the data partitions
>> will
>> > > be randomly moving across the network every time a topology changes.
>> > >
>> > > From the SQL standpoint, you can just fail all queries that have a
>> JOIN
>> > > from different cache groups, if at least one of the groups is using
>> Fair
>> > > Affinity. I am not sure why this would be hard.
>> > >
>> > >
>> > > >
>> > > > On Thu, Aug 10, 2017 at 12:16 AM, Valentin Kulichenko <
>> > > > [email protected]> wrote:
>> > > >
>> > > > > As far as I know, all logical caches with the same affinity
>> function
>> > > and
>> > > > > node filter will end up in the same group. If that's the case, I
>> like
>> > > the
>> > > > > idea. This is exactly what I was looking for.
>> > > > >
>> > > > > -Val
>> > > > >
>> > > > > On Wed, Aug 9, 2017 at 8:18 AM, Evgenii Zhuravlev <
>> > > > > [email protected]>
>> > > > > wrote:
>> > > > >
>> > > > > > Dmitriy,
>> > > > > >
>> > > > > > Yes, you're right. Moreover, it looks like a good practice to
>> > combine
>> > > > > > caches that will be used for collocated JOINs in one group
>> since it
>> > > > > reduces
>> > > > > > overall overhead.
>> > > > > >
>> > > > > > I think it's not a problem to add this restriction to the SQL
>> JOIN
>> > > > level
>> > > > > if
>> > > > > > we will decide to use this solution.
>> > > > > >
>> > > > > > Evgenii
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > 2017-08-09 17:07 GMT+03:00 Dmitriy Setrakyan <
>> > [email protected]
>> > > >:
>> > > > > >
>> > > > > > > On Wed, Aug 9, 2017 at 6:28 AM, ezhuravl <
>> > [email protected]
>> > > >
>> > > > > > wrote:
>> > > > > > >
>> > > > > > > > Folks,
>> > > > > > > >
>> > > > > > > > I've started working on a https://issues.apache.org/
>> > > > > > > > jira/browse/IGNITE-5836
>> > > > > > > > ticket and found that the recently added feature with
>> > cacheGroups
>> > > > > doing
>> > > > > > > > pretty much the same that was described in this issue.
>> > CacheGroup
>> > > > > > > > guarantees
>> > > > > > > > that all caches within a group have same assignments since
>> they
>> > > > > share a
>> > > > > > > > single underlying 'physical' cache.
>> > > > > > > >
>> > > > > > >
>> > > > > > > > I think we can return FairAffinityFunction and add
>> information
>> > to
>> > > > its
>> > > > > > > > Javadoc that all caches with same AffinityFunction and
>> > NodeFilter
>> > > > > > should
>> > > > > > > be
>> > > > > > > > combined in cache group to avoid a problem with inconsistent
>> > > > previous
>> > > > > > > > assignments.
>> > > > > > > >
>> > > > > > > > What do you guys think?
>> > > > > > > >
>> > > > > > >
>> > > > > > > Are you suggesting that we can only reuse the same
>> > > > FairAffinityFunction
>> > > > > > > across the logical caches within the same group? This would
>> mean
>> > > that
>> > > > > > > caches from the different groups cannot participate in JOINs
>> or
>> > > > > > collocated
>> > > > > > > compute.
>> > > > > > >
>> > > > > > > I think I like the idea, however, we need to make sure that we
>> > > > enforce
>> > > > > > this
>> > > > > > > restriction, at least at the SQL JOIN level.
>> > > > > > >
>> > > > > > > Alexey G, Val, would be nice to hear your thoughts on this.
>> > > > > > >
>> > > > > > >
>> > > > > > > >
>> > > > > > > > Evgenii
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > --
>> > > > > > > > View this message in context: http://apache-ignite-
>> > > > > > > > developers.2346864.n4.nabble.com/Resurrect-
>> > FairAffinityFunction-
>> > > > > > > > tp19987p20669.html
>> > > > > > > > Sent from the Apache Ignite Developers mailing list archive
>> at
>> > > > > > > Nabble.com.
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>

Re: Resurrect FairAffinityFunction

Reply via email to