Hi all,

The state machine-based Alpha improves performance by an order of magnitude. I have pushed Alpha's benchmark report [1] and added the stress-test tool module alpha-benchmark [2]. Next, I will merge branch SCB-1321 into the master branch.
If you have any questions, please let me know.

[1] https://github.com/apache/servicecomb-pack/blob/SCB-1321/alpha/alpha-fsm/Benchmark.md
[2] https://github.com/apache/servicecomb-pack/tree/SCB-1321/alpha/alpha-benchmark

Lei Zhang

> On Jul 9, 2019, at 8:50 AM, Willem Jiang <willem.ji...@gmail.com> wrote:
>
> Hi Zhanglei,
>
> I agree with you: in the timeout scenario, it's hard to tell whether the
> transaction finished successfully or not.
> I think we could provide an extension or plugin that lets the user take
> some extra actions to do the compensation work after verifying the
> transaction states.
>
> BTW, as there are some changes happening in master, maybe you can
> consider merging the SCB-1321 branch into the master branch.
>
> Willem Jiang
>
> Twitter: willemjiang
> Weibo: 姜宁willem
>
> On Tue, Jul 9, 2019 at 8:21 AM Zhang Lei <zhang_...@boco.com.cn> wrote:
>>
>> Hi All
>>
>> I have completed the acceptance test for the state machine and pushed it to
>> branch SCB-1321, and CI passes. See more feature progress here [1].
>>
>> In the acceptance test, timeout handling differs from the previous behavior.
>> When a timeout occurs, the transaction enters the suspended state because
>> we cannot be sure whether a sub-transaction has completed, or when it will
>> complete.
>>
>> For example:
>> when booking times out, we are not sure about the execution status of car or
>> hotel. If car or hotel sends TxEndedEvent after compensation, they will not
>> be compensated.
>>
>> Alpha
>> [x] State machine design document
>> [x] State machine prototype
>> [x] State machine prototype unit test
>> [x] Receive saga events using the internal message bus
>> [x] State machine integration test
>> [x] Enable state machine support via parameters
>> [x] Verify Akka persistence
>> [ ] Verify Akka cluster reliability
>> [ ] Save the terminated transaction data to the database
>> [ ] Support for in-process nested global transactions
>> [ ] Support for cross-process nested global transactions
>> [ ] Support for querying terminated transaction data by RESTful API
>> [ ] Support for querying running transaction data by RESTful API
>> [ ] Support for querying suspended global transactions by RESTful API
>> [ ] Support for compensating failed sub-transactions by RESTful API
>>
>> Omega Components
>> [x] Enable state machine support via parameters
>> [x] State machine calls omega side compensation
>> [x] @SagaStart supports thread termination after the timeout
>>
>> Alpha & Omega
>> [x] Acceptance-pack-akka-spring-demo pass
>> [ ] Add sub-transaction timeout exception for akka acceptance test
>> [ ] Add compensation failure for akka acceptance test
>> [ ] Add compensation retry success for akka acceptance test
>> [ ] Alpha single node benchmark performance test
>> [ ] Alpha cluster benchmark performance test
>>
>> Tools
>> [ ] Alpha Benchmark tools
>>
>> Do Next:
>> 1. State machine metrics collection
>> 2. Alpha Benchmark tools
>> 3. Single alpha benchmark performance test
>> 4. Verify Akka cluster reliability
>>
>> [1] https://github.com/apache/servicecomb-pack/tree/SCB-1321/alpha/alpha-fsm
>>
>> Lei Zhang
>>
>>
>>> On Jun 28, 2019, at 5:50 PM, Zhang Lei <zhang_...@boco.com.cn> wrote:
>>>
>>> Hi, All
>>>
>>> alpha-fsm has been pushed to the branch SCB-1321
>>>
>>> Completed:
>>> 1. State machine design document [1]
>>> 2. State machine prototype
>>> 3. State machine test case
>>> 4. Receive saga events using the internal message bus
>>>
>>> Key emphasis of the next stage of work:
>>> In order to carry out the feasibility verification as soon as possible, I
>>> will not consider the reliability issue for the time being.
>>> 1. Refactor Omega components, add SagaAbortedEvent, SagaTimeoutEvent,
>>> TxComponsitedEvent
>>> 2. Save compensation method parameters in the Actor and trigger
>>> compensation in the Actor
>>> 3. Do not use Kafka and only verify a single-node alpha. The Alpha server
>>> receives the saga event and puts it into the internal message bus.
>>>
>>> Planning:
>>> 1. Persist actor data to the database when it terminates
>>> 2. Integrate Kafka
>>> 3. Support WAL [2] recovery mode
>>> 4. Verify Akka cluster reliability
>>>
>>> [1] https://github.com/apache/servicecomb-pack/tree/SCB-1321/alpha/alpha-fsm
>>> [2] https://en.wikipedia.org/wiki/Write-ahead_logging
>>>
>>> If you have other comments, please let us know.
>>>
>>> Thanks,
>>> Lei Zhang
>>>
>>>> On Jun 27, 2019, at 9:50 AM, Willem Jiang <willem.ji...@gmail.com> wrote:
>>>>
>>>> We just leverage the message broker to make sure Alpha gets the
>>>> transaction events from Omega.
>>>> In most cases Alpha doesn't need to talk back to Omega; we just need to
>>>> make sure all the transaction messages are stored (Alpha can process them
>>>> later).
>>>>
>>>> If Omega cannot talk to the message broker, Omega should abort the
>>>> transaction processing with a transport exception.
>>>>
>>>> Willem Jiang
>>>>
>>>> Twitter: willemjiang
>>>> Weibo: 姜宁willem
>>>>
>>>> On Tue, Jun 25, 2019 at 8:42 AM Zhang Lei <zhang_...@boco.com.cn> wrote:
>>>>>
>>>>> Hi, Zhao Jun
>>>>>
>>>>>> I just cared about the recovery scan thread design.
>>>>>> Kafka can ensure that event messages are consumed by alpha exactly, but
>>>>>> recovery needs to know all the participating transactions' responses to
>>>>>> decide whether to roll back or commit, so I think the scan thread is also
>>>>>> necessary.
>>>>>
>>>>> I am not sure, but I think Akka's persistence can solve the problem you
>>>>> care about.
>>>>> Of course, this ability needs to be verified.
>>>>>
>>>>> Thanks,
>>>>> Zhang Lei
>>>>>
>>>>>> On Jun 24, 2019, at 10:46 AM, 赵俊 <zhaoju...@jd.com> wrote:
>>>>>>
>>>>>> Hi, Zhang Lei
>>>>>>
>>>>>>> A1 : I think we only need to ensure that the message can be reliably
>>>>>>> delivered to the state machine. The state machine only records state
>>>>>>> transitions synchronously when the transaction executes normally. At
>>>>>>> present, the compensation method based on table scan is also
>>>>>>> asynchronous. I am not sure if I have answered your question, or you
>>>>>>> can give me more information.
>>>>>>
>>>>>> If we have a mechanism that ensures the main service can collect all the
>>>>>> participating transaction responses from alpha correctly before
>>>>>> commit/rollback, it is OK.
>>>>>>
>>>>>>> Q2 : Also we should consider recovery; it seems that recovery is
>>>>>>> the same as before, based on the database.
>>>>>>> A2 : I think the question you care about is how to recover when
>>>>>>> alpha is down; this is a little different from the current version.
>>>>>>> 1. We can rely on Kafka's reliability and control the offset of the
>>>>>>> topic, one message at a time
>>>>>>> 2. Of course, we can also do some extra design for it, such as writing
>>>>>>> the data to a local log file after receiving the Kafka message, and
>>>>>>> resuming the messages by reading the log file when the alpha machine
>>>>>>> restarts
>>>>>>
>>>>>> I just cared about the recovery scan thread design.
>>>>>> Kafka can ensure that event messages are consumed by alpha exactly, but
>>>>>> recovery needs to know all the participating transactions' responses to
>>>>>> decide whether to roll back or commit, so I think the scan thread is also
>>>>>> necessary.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On Jun 23, 2019, at 1:04 PM, Zhang Lei <zhang_...@boco.com.cn> wrote:
>>>>>>>
>>>>>>> Hi, Zhao Jun
>>>>>>>
>>>>>>> Thank you for your reply!
>>>>>>>
>>>>>>> This design document does not elaborate on reliability aspects.
>>>>>>>
>>>>>>> My initial thoughts are these:
>>>>>>>
>>>>>>> Q1 : It seems that omega should hold on after consuming the event
>>>>>>> message from Kafka instead of completing pushing the message
>>>>>>> A1 : I think we only need to ensure that the message can be reliably
>>>>>>> delivered to the state machine. The state machine only records state
>>>>>>> transitions synchronously when the transaction executes normally. At
>>>>>>> present, the compensation method based on table scan is also
>>>>>>> asynchronous. I am not sure if I have answered your question, or you
>>>>>>> can give me more information.
>>>>>>>
>>>>>>> Q2 : Also we should consider recovery; it seems that recovery is
>>>>>>> the same as before, based on the database.
>>>>>>> A2 : I think the question you care about is how to recover when
>>>>>>> alpha is down; this is a little different from the current version.
>>>>>>> 1. We can rely on Kafka's reliability and control the offset of the
>>>>>>> topic, one message at a time
>>>>>>> 2. Of course, we can also do some extra design for it, such as writing
>>>>>>> the data to a local log file after receiving the Kafka message, and
>>>>>>> resuming the messages by reading the log file when the alpha machine
>>>>>>> restarts
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Lei Zhang
>>>>>>>
>>>>>>>> On Jun 23, 2019, at 7:08 AM, zhaojun <zhaoju...@126.com> wrote:
>>>>>>>>
>>>>>>>> I have some questions about the design.
>>>>>>>> 1. It seems that omega should hold on after consuming the event
>>>>>>>> message from Kafka instead of completing pushing the message.
>>>>>>>> 2. Also we should consider recovery; it seems that recovery is
>>>>>>>> the same as before, based on the database.
>>>>>>>>
>>>>>>>> ------------------
>>>>>>>> Zhao Jun
>>>>>>>> Apache Sharding-Sphere & ServiceComb
>>>>>>>>
>>>>>>>>> On Jun 21, 2019, at 6:41 PM, Zhang Lei <zhang_...@boco.com.cn> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I have created the alpha-fsm module on branch SCB-1321 and submitted
>>>>>>>>> the design documentation, state machine prototype and test cases.
>>>>>>>>> If there is any problem, please let me know.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Lei Zhang
>>>>>>>>>
>>>>>>>>> [1] https://github.com/apache/servicecomb-pack/tree/SCB-1321/alpha/alpha-fsm
>>>>>>>>>
>>>>>>>>>> On Jun 20, 2019, at 3:25 PM, Zheng Feng <zh.f...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Yeah, I think Willem created one [1] before. Do you mind if I
>>>>>>>>>> assign this issue to you?
>>>>>>>>>>
>>>>>>>>>> [1] https://issues.apache.org/jira/browse/SCB-1258
>>>>>>>>>>
>>>>>>>>>> On Thu, Jun 20, 2019 at 2:34 PM Zhang Lei <zhang_...@boco.com.cn> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi, Zheng Feng
>>>>>>>>>>>
>>>>>>>>>>> Thanks for your advice. I will create a JIRA first and start with
>>>>>>>>>>> the design documentation.
>>>>>>>>>>>
>>>>>>>>>>> Lei Zhang
>>>>>>>>>>>
>>>>>>>>>>>> On Jun 19, 2019, at 8:09 PM, Zheng Feng <zh.f...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks a lot for sharing this information! I think this state
>>>>>>>>>>>> machine could be very experimental, so it would be helpful to
>>>>>>>>>>>> create an experimental branch for this module rather than adding
>>>>>>>>>>>> it to the master branch.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Jun 19, 2019 at 5:42 PM Zhang Lei <cool...@qq.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I have completed some of the design and a prototype on my GitHub.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the design document [1], my original idea was that a
>>>>>>>>>>>>> transaction consisted of a SagaActor and several TxActors; TxActor
>>>>>>>>>>>>> was later removed to reduce implementation complexity.
>>>>>>>>>>>>> I haven't had time to update the documentation yet, but the
>>>>>>>>>>>>> SagaActor state machine [2] is up to date.
>>>>>>>>>>>>> Here you can see the test cases of SagaActor [3]
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1] https://github.com/coolbeevip/playground/tree/master/state_machine_demo/saga-akkafsm
>>>>>>>>>>>>> [2] https://github.com/coolbeevip/playground/blob/master/state_machine_demo/saga-akkafsm/assets/saga_state_diagram.png
>>>>>>>>>>>>> [3] https://github.com/coolbeevip/playground/blob/master/state_machine_demo/saga-akkafsm/src/test/java/coolbeevip/playgroud/statemachine/saga/SagaActorTest.java
>>>>>>>>>>>>>
>>>>>>>>>>>>> Lei Zhang
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Jun 19, 2019, at 2:34 PM, zhaojun <zhaoju...@126.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If we use Akka, how can we design the actors, and how can we
>>>>>>>>>>>>>> guarantee that omega will receive the message synchronously?
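For readers following the timeout discussion in this thread, here is a minimal sketch of the behavior described in the acceptance-test message: a saga that times out enters a suspended state, and a TxEndedEvent that arrives afterwards does not trigger compensation. This is not the alpha-fsm code. The class name `SagaFsmSketch`, the reduced set of states, and the enum-based events are assumptions made for illustration only; the real module is built on Akka and has more states and events.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified saga state machine; NOT the alpha-fsm implementation. */
final class SagaFsmSketch {
  // Reduced, illustrative state and event sets (assumptions, not the project's enums).
  enum State { IDLE, ACTIVE, COMMITTED, COMPENSATED, SUSPENDED }
  enum Event { SAGA_STARTED, TX_ENDED, SAGA_ENDED, SAGA_ABORTED, SAGA_TIMEOUT }

  private State state = State.IDLE;
  private final List<String> endedTxs = new ArrayList<>();    // sub-transactions known to have finished
  private final List<String> compensated = new ArrayList<>(); // sub-transactions that were compensated

  State state() { return state; }
  List<String> compensated() { return compensated; }

  void on(Event event, String txId) {
    switch (state) {
      case IDLE:
        if (event == Event.SAGA_STARTED) state = State.ACTIVE;
        break;
      case ACTIVE:
        switch (event) {
          case TX_ENDED:
            endedTxs.add(txId);
            break;
          case SAGA_ENDED:
            state = State.COMMITTED;
            break;
          case SAGA_ABORTED:
            // Compensate finished sub-transactions in reverse order.
            for (int i = endedTxs.size() - 1; i >= 0; i--) compensated.add(endedTxs.get(i));
            state = State.COMPENSATED;
            break;
          case SAGA_TIMEOUT:
            // The outcome of in-flight sub-transactions is unknown, so nothing is compensated here.
            state = State.SUSPENDED;
            break;
          default:
            break;
        }
        break;
      default:
        // COMMITTED, COMPENSATED and SUSPENDED ignore further events in this sketch,
        // so a late TX_ENDED after a timeout is not compensated automatically.
        break;
    }
  }

  public static void main(String[] args) {
    SagaFsmSketch saga = new SagaFsmSketch();
    saga.on(Event.SAGA_STARTED, "booking");
    saga.on(Event.TX_ENDED, "car");
    saga.on(Event.SAGA_TIMEOUT, "booking");
    saga.on(Event.TX_ENDED, "hotel"); // arrives after the timeout
    System.out.println(saga.state() + " " + saga.compensated()); // prints "SUSPENDED []"
  }
}
```

In this sketch a suspended saga simply stops reacting; the extension or plugin suggested above would be the place where a user verifies the real sub-transaction states and decides whether to compensate.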