Hi all.
Do I understand correctly that mistral/ocata doesn't have failover and
recovery?

I found two problems.
The first is in mistral-executor.
1. Run an action that executes for a long time, for example std.sleep.
2. Kill the mistral-executor where the action is executing.
3. Restart mistral-executor.
4. The workflow and the task stay in the RUNNING state forever.

WA:
Set the timeout and retry attributes on all tasks.
But this may affect some cases.
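
To make the workaround concrete, here is a minimal Python sketch that adds
the timeout and retry policies to every task of a workflow definition held as
a dict. The helper name add_policies, the example workflow, and the policy
values are assumptions for illustration only; suitable values depend on the
actions involved.

```python
import copy


def add_policies(workflow, timeout=60, retry_count=3, retry_delay=5):
    """Return a copy of the workflow with timeout/retry set on every task.

    Hypothetical helper: tasks that already define a policy keep their own.
    """
    patched = copy.deepcopy(workflow)
    for task in patched["tasks"].values():
        task.setdefault("timeout", timeout)  # seconds
        task.setdefault("retry", {"count": retry_count, "delay": retry_delay})
    return patched


# Example workflow body (assumed for illustration).
workflow = {
    "type": "direct",
    "tasks": {
        "task1": {"action": "std.sleep seconds=30", "on-success": ["task2"]},
        "task2": {"action": "std.noop", "timeout": 10},
    },
}

patched = add_policies(workflow)
```

The original definition is left untouched, and task2 keeps its own timeout
of 10 seconds.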

The second is in mistral-engine.
1. Run many workflows, each consisting of many tasks, at the same time.
For example, 20 workflows of 20 tasks with the std.noop action.
The tasks are linked sequentially.
2. Kill mistral-engine.
3. Restart mistral-engine.
4. Some workflows and tasks stay in the RUNNING state forever.

I think the problem is here:
https://github.com/openstack/mistral/blob/master/mistral/services/scheduler.py#L115

If mistral-engine dies between the two transactions, the delayed calls stay
marked as processing and are never called by the scheduler.
Do you have a WA for this case?
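
To illustrate the failure mode I mean, here is a simplified in-memory model
in Python. It is not Mistral's actual code: the DelayedCall class and the
functions pick_calls and run_and_delete are hypothetical stand-ins for the
two transactions.

```python
from dataclasses import dataclass


@dataclass
class DelayedCall:
    call_id: int
    processing: bool = False


def pick_calls(store):
    """Transaction 1: select unclaimed calls and mark them as processing."""
    picked = [call for call in store if not call.processing]
    for call in picked:
        call.processing = True
    return picked


def run_and_delete(store, picked):
    """Transaction 2: execute the calls, then delete them from the store."""
    for call in picked:
        store.remove(call)


store = [DelayedCall(1), DelayedCall(2)]

picked = pick_calls(store)  # transaction 1 commits
# The engine is killed here, so run_and_delete(store, picked) never runs.

after_restart = pick_calls(store)  # the restarted engine finds nothing to run
stuck = [call.call_id for call in store if call.processing]
```

In this model both calls remain in the store with processing set, and no
later scheduler pass ever selects them again.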

Is this the expected behavior?
Are there similar cases where a workflow stays in the RUNNING state forever?
When will you implement failover for the executor? Will it be in the new
release
https://blueprints.launchpad.net/mistral/+spec/mistral-fault-tolerance ?
Will you merge it into the ocata release?

Performance.

I found a few problems here too :)

First, when we start many workflows, for example 200, they take a long time
to complete.
Maybe throttling could help here?

Second. Can you help with mistral scaling? What settings should I change in
mistral.conf or the rabbit config?

Here is the speedup relative to 1 mistral-engine and 1 mistral-executor.
Each workflow consists of 20 tasks with the std.noop action.

2 mistral-engine and 2 mistral-executor 1.70
3 mistral-engine and 3 mistral-executor 2.08
5 mistral-engine and 5 mistral-executor 2.22
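
To put the numbers above in perspective, this small Python snippet computes
the scaling efficiency (measured speedup divided by the number of
engine/executor pairs). The variable names are only for illustration.

```python
# Number of engine/executor pairs -> measured speedup (from the table above).
measurements = {2: 1.70, 3: 2.08, 5: 2.22}

# Efficiency of 1.0 would mean perfectly linear scaling.
efficiency = {n: round(speedup / n, 2) for n, speedup in measurements.items()}

for n, value in efficiency.items():
    print(f"{n} engines/executors: efficiency {value}")
```

This gives 0.85, 0.69 and 0.44, so each additional pair helps less and less.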

Third.
Running a process of 20 tasks takes longer than one second.
The last step, the completion of the process, takes 200 milliseconds for
scheduling.
Can I reduce the value here
https://github.com/openstack/mistral/blob/master/mistral/engine/workflow_handler.py#L115
?
Would that make things worse?

Best regards,

Vitalii Solodilov
