If I understand correctly, the problem is that we are doing the following two transitions at the same time:
1) old master: MASTER->SLAVE
2) new master: OFFLINE->SLAVE
Since OFFLINE->SLAVE is a long transition, we will have a long period without a MASTER, which is bad. Ideally we would first do:
new master: OFFLINE->SLAVE
Once this is done, we do:
old master: MASTER->SLAVE
In this case, during the long OFFLINE->SLAVE transition, the old master is still on.

I think to achieve this, as Kishore mentioned, we might put a partition level constraint of 1 and let OFFLINE->SLAVE have higher priority than MASTER->SLAVE.

Partition level constraint is supported by 0.6.4:
https://issues.apache.org/jira/browse/HELIX-426

You can refer to an example here:
https://github.com/apache/helix/blob/helix-0.6.x/helix-core/src/test/java/org/apache/helix/integration/TestPartitionLevelTransitionConstraint.java

Thanks,
Jason

On Wed, Mar 4, 2015 at 12:57 PM, kishore g <[email protected]> wrote:

> Hi Gavin,
>
> Here is the mail that describes the solution.
> http://apache.markmail.org/message/alhlibbjmm75gog3
>
> I do see that this was not supported in some 0.6.x. Jason, any idea when we
> added the partition constraint support?
>
> Also, if you need to upgrade to the latest 0.6.x, you need to re-deploy only
> the controller.
>
> thanks,
> Kishore G
>
> On Tue, Mar 3, 2015 at 1:41 PM, kishore g <[email protected]> wrote:
>
> > Hi Gavin,
> >
> > That should do the trick. What Vlad said is right. There was an email
> > thread that explained this in detail. Let me see if I can find that.
> >
> > thanks,
> > Kishore G
> >
> > On Tue, Mar 3, 2015 at 11:15 AM, Gavin Li <[email protected]> wrote:
> >
> >> Can I resolve this by state transition priority and constraint?
> >>
> >> On Tue, Mar 3, 2015 at 12:40 AM, Gavin Li <[email protected]> wrote:
> >>
> >> > +dev and commits
> >> >
> >> > ---------- Forwarded message ----------
> >> > From: Gavin Li <[email protected]>
> >> > Date: Tue, Mar 3, 2015 at 12:24 AM
> >> > Subject: question about state transition
> >> > To: [email protected], [email protected]
> >> >
> >> > Hi,
> >> >
> >> > We have each server handle some partitions, and we use the Master Slave
> >> > model.
> >> >
> >> > We need to do some work when transitioning from offline to slave, and
> >> > sometimes it takes a long time. So when the server ranked higher in the
> >> > ideal state is up and doing the work during the offline to slave
> >> > transition, the other server is changed from master to slave.
> >> >
> >> > This causes a period of time when there is no master at all, which is
> >> > problematic. Is it possible to bring the other server down when the
> >> > higher-ranked server is transitioning from slave to master, instead of
> >> > during the transition from offline to slave?
> >> >
> >> > Thanks,
> >> > Gavin Li
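
As a rough illustration of the ordering proposed in the reply at the top of this thread (a partition level constraint of 1, with OFFLINE->SLAVE given higher priority than MASTER->SLAVE), here is a minimal, library-free sketch. It does not use the Helix API: the class `TransitionOrderSketch`, the `Transition` type, the `serialize` method, the instance names, and the convention that a lower number means higher priority are all assumptions made for the example. For the real configuration, see the TestPartitionLevelTransitionConstraint example linked above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical model for illustration only; none of these names are Helix classes.
class TransitionOrderSketch {

    // One pending state transition for a single partition.
    static final class Transition {
        final String instance;
        final String from;
        final String to;
        final int priority; // assumption: lower number means higher priority

        Transition(String instance, String from, String to, int priority) {
            this.instance = instance;
            this.from = from;
            this.to = to;
            this.priority = priority;
        }

        @Override
        public String toString() {
            return instance + ": " + from + "->" + to;
        }
    }

    // With a per-partition constraint of 1, at most one transition is in flight
    // for the partition, so pending transitions run one at a time, highest
    // priority first. The returned list is that execution order.
    static List<Transition> serialize(List<Transition> pending) {
        List<Transition> ordered = new ArrayList<>(pending);
        ordered.sort(Comparator.comparingInt((Transition t) -> t.priority));
        return ordered;
    }

    public static void main(String[] args) {
        List<Transition> pending = new ArrayList<>();
        // Both transitions become pending at the same time, as in the problem.
        pending.add(new Transition("oldMaster", "MASTER", "SLAVE", 2));
        pending.add(new Transition("newMaster", "OFFLINE", "SLAVE", 1));

        // The long OFFLINE->SLAVE runs first, while oldMaster is still MASTER.
        for (Transition t : serialize(pending)) {
            System.out.println(t);
        }
    }
}
```

Running `main` prints the new master's OFFLINE->SLAVE transition before the old master's MASTER->SLAVE transition, which is the order that keeps the old master serving during the long bootstrap.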
