[
https://issues.apache.org/jira/browse/HELIX-659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16030755#comment-16030755
]
Jiajun Wang edited comment on HELIX-659 at 5/31/17 7:17 AM:
------------------------------------------------------------
h1. Proposal
In this document, we propose introducing an additional layer of state into
Helix.
Consider the Pinot case: what Pinot needs is a transition from "ONLINE:V1" to
"ONLINE:V2". Note that the "V1" to "V2" transition runs in parallel with the
existing state transition. It is special in the following ways:
# The state is not pre-defined. New version numbers may appear after the state
transition model is registered.
# Helix does not understand the internal logic of this additional state, so it
cannot compute the ideal state automatically. It relies on the application's
configuration to update this state.
We take the two points above as assumptions.
The expected workflow, again using the Pinot partition version as an example:
# Pinot registers its own logic for version upgrades, which means a new state
model (factory name).
# Helix provides an API to configure resources with an additional state
("VERSION").
# When the resource configuration changes, the controller triggers a state
transition and sends a message to the participants.
# Participants handle the message by calling the corresponding state transition
methods, then update their current state.
# The controller listens for current state changes. On any update, it processes
the change and reflects it in the external view.
h1. Design
h2. Register Associate States Model / Factory
Note that because associate states may not be pre-defined,
defaultTransitionHandler has to be implemented.
h3. State Model Factory:
public abstract class AssociateStateModelFactory
    extends StateModelFactory<AssociateStateModel> {
  ...
}

public abstract class AssociateStateModel extends StateModel {
  static final String DEFAULT_INITIAL_STATE = "UNKNOWN";
  protected String _currentState = DEFAULT_INITIAL_STATE;

  public String getCurrentState() {
    return _currentState;
  }

  // !!!!!!!!!!! Changed part !!!!!!!!!!!! //
  // "from" and "to" are placeholders, since associate states are not pre-defined.
  @Transition(from = "from", to = "to")
  public void defaultTransitionHandler(Message message, NotificationContext context) {
    logger.error("Default transition handler. The idea is to invoke this if no "
        + "transition method is found. To be implemented");
  }

  public boolean updateState(String newState) {
    _currentState = newState;
    return true;
  }

  public void rollbackOnError(Message message, NotificationContext context,
      StateTransitionError error) {
    logger.error("Default rollback method invoked on error. Error Code: "
        + error.getCode());
  }

  public void reset() {
    logger.warn("Default reset method invoked. Either because the process no "
        + "longer owns this resource or the session timed out");
  }

  @Transition(to = "DROPPED", from = "ERROR")
  public void onBecomeDroppedFromError(Message message, NotificationContext context)
      throws Exception {
    logger.info("Default ERROR->DROPPED transition invoked.");
  }
}
h2. Resource Configuration
h3. Resource config with associate state VERSION:
{
"id":"Test_Resource"
,"simpleFields":{
}
,"listFields":{
"ASSOCIATE_STATE_MODEL_DEF_REFS": [
"VERSION"
],
"ASSOCIATE_STATE_MODEL_FACTORY_NAMES": [
"DEFAULT"
],
"ASSOCIATE_STATES": [
"1.0.1"
]
}
,"mapFields":{
}
}
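As a minimal sketch of how the three list fields above could be read together, the following assumes they are positionally aligned (entry i of each list describes the same associate state). That alignment is an assumption of this sketch, and `AssociateStateConfig` is a hypothetical helper, not an existing Helix class.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Helix. */
final class AssociateStateConfig {
    static final String DEF_REFS = "ASSOCIATE_STATE_MODEL_DEF_REFS";
    static final String FACTORY_NAMES = "ASSOCIATE_STATE_MODEL_FACTORY_NAMES";
    static final String STATES = "ASSOCIATE_STATES";

    /**
     * Returns a map from associate state model name (e.g. "VERSION") to its
     * configured target value (e.g. "1.0.1"), assuming the lists are aligned by index.
     */
    static Map<String, String> targetStates(Map<String, List<String>> listFields) {
        List<String> defs = require(listFields, DEF_REFS);
        List<String> factories = require(listFields, FACTORY_NAMES);
        List<String> states = require(listFields, STATES);
        if (defs.size() != factories.size() || defs.size() != states.size()) {
            throw new IllegalArgumentException("Associate state lists must have equal length");
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < defs.size(); i++) {
            result.put(defs.get(i), states.get(i));
        }
        return result;
    }

    private static List<String> require(Map<String, List<String>> listFields, String key) {
        List<String> value = listFields.get(key);
        if (value == null) {
            throw new IllegalArgumentException("Missing list field: " + key);
        }
        return value;
    }
}
```

For the configuration above, this would yield a single entry mapping "VERSION" to "1.0.1".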
h2. Additional APIs to configure associate states
/**
 * Set configuration values
 * @param scope
 * @param listProperties
 */
void setConfig(HelixConfigScope scope, Map<String, List<String>> listProperties);
/**
* Get configuration values
* @param scope
* @param keys
* @return configuration values ordered by the provided keys
*/
Map<String, List<String>> getConfig(HelixConfigScope scope, List<String> keys);
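To illustrate the intended semantics of the two list-valued methods, here is a rough in-memory sketch. `InMemoryListConfig` is hypothetical, and a plain `String` stands in for `HelixConfigScope` so that the example is self-contained; the real methods would read from and write to the cluster's config store.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for the proposed list-valued config API. */
final class InMemoryListConfig {
    // scope name -> (config key -> list value)
    private final Map<String, Map<String, List<String>>> store = new HashMap<>();

    /** Sets (or overwrites) the given list properties under the scope. */
    void setConfig(String scope, Map<String, List<String>> listProperties) {
        Map<String, List<String>> scoped = store.computeIfAbsent(scope, k -> new HashMap<>());
        for (Map.Entry<String, List<String>> entry : listProperties.entrySet()) {
            scoped.put(entry.getKey(), new ArrayList<>(entry.getValue()));
        }
    }

    /** Returns the values for the requested keys, in the order the keys were given. */
    Map<String, List<String>> getConfig(String scope, List<String> keys) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        Map<String, List<String>> scoped = store.get(scope);
        if (scoped == null) {
            return result;
        }
        for (String key : keys) {
            List<String> value = scoped.get(key);
            if (value != null) {
                result.put(key, new ArrayList<>(value));
            }
        }
        return result;
    }
}
```

In this sketch, keys that were never set are simply left out of the result; whether the real API should omit them or return empty lists is an open design choice.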
h2. Partitions with Associate States in the Participant Current State and External View (EV)
h3. Current States:
{
"id":"example_resource"
,"simpleFields":{
"STATE_MODEL_DEF":"MasterSlave"
,"STATE_MODEL_FACTORY_NAME":"DEFAULT"
,"BUCKET_SIZE":"0"
,"SESSION_ID":"25b2ce5dfbde0fa"
}
,"listFields":{
"ASSOCIATE_STATE_MODEL_DEF_REFS": [
"VERSION"
],
"ASSOCIATE_STATE_MODEL_FACTORY_NAMES": [
"DEFAULT"
]
}
,"mapFields":{
"example_resource_0":{
"CURRENT_STATE":"MASTER"
,"ASSOCIATE_STATES":"1.0.1" // Split by ":" if multiple associate states are set
,"INFO":""
}
}
}
h3. Associate state in External View:
{
"id":"example_resource"
,"simpleFields":{
"STATE_MODEL_DEF_REF":"MasterSlave"
}
,"listFields":{
"ASSOCIATE_STATE_MODEL_DEF_REFS": [
"VERSION"
]
}
,"mapFields":{
"example_resource_0":{
// If there is more than one associate state, they are separated by ":".
// The main state is always the first state.
"lca1-app0004.stg.linkedin.com_11932":"MASTER:1.0.1"
,"lca1-app0048.stg.linkedin.com_11932":"SLAVE:1.0.0"
}
}
}
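The ":"-separated encoding used in the external view above (for example "MASTER:1.0.1") can be sketched as a small value type. `CompositeState` is a hypothetical name, not a Helix class, and the sketch assumes that no individual state value contains ":".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Helix. */
final class CompositeState {
    static final String SEPARATOR = ":";

    final String mainState;
    final List<String> associateStates;

    CompositeState(String mainState, List<String> associateStates) {
        this.mainState = mainState;
        this.associateStates = Collections.unmodifiableList(new ArrayList<>(associateStates));
    }

    /** Parses "MAIN:assoc1:assoc2"; the main state is always the first element. */
    static CompositeState parse(String encoded) {
        String[] parts = encoded.split(SEPARATOR, -1);
        return new CompositeState(parts[0], Arrays.asList(parts).subList(1, parts.length));
    }

    /** Encodes the main state followed by the associate states, joined by ":". */
    String encode() {
        List<String> all = new ArrayList<>();
        all.add(mainState);
        all.addAll(associateStates);
        return String.join(SEPARATOR, all);
    }
}
```

A plain main state such as "MASTER" parses to an empty associate state list, so existing external views without associate states would still be readable under this scheme.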
h2. Helix Controller Updates
On resource configuration changes:
* Fill ClusterDataCache with the associate states and the related state models /
factories from the resource configuration.
* Merge the associate states into BestPossibleStateOutput.
* Add the associate states and the related state models / factories to the
message before sending it to participants.
On participant state changes:
* In addition to the existing reads, read the associate states from the current
state, then fill the EV with the complete state information.
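One possible shape for the merge step, sketched under simplifying assumptions: the best possible main state is a map from instance name to main state, the configured associate states are appended with ":", and an instance needs a message whenever its reported state differs from that target. `AssociateStateMerger` and its methods are hypothetical and do not reflect the actual controller pipeline classes.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of the controller-side merge; not the real Helix pipeline. */
final class AssociateStateMerger {
    /** Appends the configured associate states to each instance's main state. */
    static Map<String, String> merge(Map<String, String> mainStateByInstance,
                                     List<String> associateStates) {
        Map<String, String> merged = new TreeMap<>();
        for (Map.Entry<String, String> entry : mainStateByInstance.entrySet()) {
            StringBuilder target = new StringBuilder(entry.getValue());
            for (String state : associateStates) {
                target.append(':').append(state);
            }
            merged.put(entry.getKey(), target.toString());
        }
        return merged;
    }

    /** Returns the instances whose reported state differs from the target, with their target. */
    static Map<String, String> pending(Map<String, String> targetByInstance,
                                       Map<String, String> currentByInstance) {
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, String> entry : targetByInstance.entrySet()) {
            if (!entry.getValue().equals(currentByInstance.get(entry.getKey()))) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }
}
```

With the external view shown earlier and a configured version of "1.0.1", only the instance still reporting "SLAVE:1.0.0" would appear in the pending map.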
h2. Helix Participant Updates
On receiving a state transition message:
* Read the main state and the associate states, then trigger the state
transitions in order.
* Run the main state transition first, then the associate state transitions one
by one.
** If any state transition fails, set an error state that covers all states and
stop processing. The user should fix the problem and reset to the initial states.
** If a state transition succeeds, update the current state.
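The ordering and failure rules above can be sketched as follows. This is an illustration only: `OrderedTransitionRunner` and its `Transition` callback are hypothetical stand-ins for the state model methods, and the lists hold the main state at index 0 followed by the associate states in a fixed order.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the participant-side ordering; not part of Helix. */
final class OrderedTransitionRunner {
    /** Stand-in for a state model's transition method. */
    interface Transition {
        void apply(String from, String to) throws Exception;
    }

    static final String ERROR = "ERROR";

    /**
     * Runs the main transition (index 0) first, then each associate transition.
     * Returns the new current states; on the first failure every state is set
     * to ERROR and the remaining transitions are skipped.
     */
    static List<String> run(List<String> current, List<String> target,
                            List<Transition> handlers) {
        if (current.size() != target.size() || current.size() != handlers.size()) {
            throw new IllegalArgumentException("current, target and handlers must align");
        }
        List<String> result = new ArrayList<>(current);
        for (int i = 0; i < target.size(); i++) {
            String from = result.get(i);
            String to = target.get(i);
            if (from.equals(to)) {
                continue; // nothing to do for this state
            }
            try {
                handlers.get(i).apply(from, to);
                result.set(i, to); // update current state only after success
            } catch (Exception e) {
                for (int j = 0; j < result.size(); j++) {
                    result.set(j, ERROR); // the error state covers all states
                }
                return result;
            }
        }
        return result;
    }
}
```

Recovering from the all-ERROR result is left to the user in this sketch, matching the reset step described above.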
h1. Alternative options
h2. Introducing UPGRADING State for additional state transitions
Add a new internal state, UPGRADING, for partition upgrades.
The upgrade happens when the partition transitions to or from the UPGRADING
state.
Note that the application is free to define whether UPGRADING is a special
online state or not.
In the Pinot case, a partition that is upgrading (even before it is back to
ONLINE) might still be an active partition.
The problem with this new state is that it only works for a single additional
state.
Once there is more than one additional state to manage, the UPGRADING state is
not enough.
h2. Rely on resetting partition to load new states
Whenever a new version is available, the application updates the version for
the resource and then resets all partitions.
During the state transition from offline to online, participants read the new
version and apply it to the related partitions.
The problem with this method is that a change in the additional state affects
the main state. A partition will be offline for a while, and during this period
even the old version is unavailable.
h2. Application registers message handler to handle upgrading message
In this method, the controller is only responsible for sending upgrade requests
to participants. Participants are responsible for reporting their local
versions.
Since the controller does not know how to control the additional state, the
application has to implement all of the logic.
h1. Validation
Add unit tests / integration tests to validate associate states.
Verify the Pinot version use case.
> Support Additional Associate States
> -----------------------------------
>
> Key: HELIX-659
> URL: https://issues.apache.org/jira/browse/HELIX-659
> Project: Apache Helix
> Issue Type: New Feature
> Components: helix-core
> Affects Versions: 0.6.x
> Reporter: Jiajun Wang
>
> Currently, Helix only supports managing a single state for all
> resources/partitions. However, in the real world, cluster management
> requirements may be more complicated than that.
> In Pinot, for example, each partition needs to be assigned a version to
> ensure data consistency.
> When a new version arrives, the system needs to replace the old partition with
> the new one. The replacement is done one partition at a time, so any reads
> during this period will get inconsistent data.
> The Pinot system cannot put the version information directly into the
> section (partition) state field because it is already occupied by the main
> state (offline-online, for instance) used by the Helix controller.
> So the Pinot team relies on workarounds to implement its application
> logic: creating a new resource with the latest version and replacing the old
> one after the new resource is fully loaded. The Helix controller is unaware of
> the version.
> Another option is for the Pinot team to maintain its own config item or
> property store item for recording versions.
> Both approaches require the Pinot team to implement version control itself.
> Another requirement comes from the Ambry team, where a partition can be
> "ONLINE:READ" or "ONLINE:WRITE".
> In both cases, a single-state mechanism is not sufficient for the
> applications' requirements.
> It would be very helpful to provide a framework-level feature that supports
> more than one state for each partition.
> Benefits:
> # The application doesn't need to write additional code for managing
> additional states.
> # Avoids potential conflicts when multiple state transitions happen
> concurrently.