Hi,

Currently there are a lot of Thread.sleep calls in the mock IaaS component,
which make it slow and cause unexpected behavior due to concurrency issues.
They also add significant overhead when running integration tests, since
the mock IaaS is used for the test cases. I've been working on improving
this component by making the following changes:

 - Remove *all* Thread sleep calls in mock iaas
 - Introduce a method named 'initialize' to start event receivers and
publishers. This is a synchronous call which guarantees that the receiver
and publisher objects are created successfully. If not, it throws an
exception and the startInstance() method call in CC fails. Earlier this
task was delegated to an executor service, which made it difficult to check
whether the mock instance was created successfully.
 - Create topology receiver in mock instance and listen for member
initialized and member started events. It will publish instance started and
instance activated events based on topology events received rather than
sleeping for some time interval before publishing.
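
To illustrate the idea, here is a minimal sketch of an event-driven mock
instance. It is not the actual MockInstance code: the enum and class names
below are hypothetical stand-ins for the Stratos topology and instance
status events, and the "publisher" is just an in-memory list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-ins for the real topology / instance status events.
enum TopologyEventType { MEMBER_INITIALIZED, MEMBER_STARTED }
enum InstanceEventType { INSTANCE_STARTED, INSTANCE_ACTIVATED }

class MockInstanceSketch {
    // In-memory record of what would be published to the message broker.
    private final List<InstanceEventType> published = new ArrayList<>();
    private boolean initialized;

    // Synchronous: when this returns, the receiver is wired up. Any failure
    // surfaces as an exception, so the caller can fail startInstance().
    synchronized Consumer<TopologyEventType> initialize() {
        if (initialized) {
            throw new IllegalStateException("Mock instance already initialized");
        }
        initialized = true;
        return this::onTopologyEvent;
    }

    // React to topology events instead of sleeping for a fixed interval.
    private synchronized void onTopologyEvent(TopologyEventType event) {
        switch (event) {
            case MEMBER_INITIALIZED:
                published.add(InstanceEventType.INSTANCE_STARTED);
                break;
            case MEMBER_STARTED:
                published.add(InstanceEventType.INSTANCE_ACTIVATED);
                break;
        }
    }

    // Defensive copy so callers cannot modify the internal list.
    synchronized List<InstanceEventType> publishedEvents() {
        return new ArrayList<>(published);
    }
}
```

Since each publish is triggered by the corresponding topology event, the
ordering no longer depends on how long a sleep happens to take.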

After making those changes I faced multiple integration test failures. This
was mainly because integration tests relied heavily on Thread sleep calls
to assert various conditions. With these changes, the time taken for a mock
instance/app to become active came down to milliseconds, hence test cases
could not detect member status or app status correctly. Therefore I had to
introduce a new non-blocking mechanism to check app/member status by using
thread synchronization.
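
One way to wait on an event rather than a sleep interval is a
CountDownLatch that the event listener releases. This is only a sketch
under that assumption; the class and method names are hypothetical and not
taken from TopologyHandler.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for a test that needs to know when an app is active.
class ActiveStatusWaiter {
    private final CountDownLatch activated = new CountDownLatch(1);

    // Called from the event listener thread when the activated event arrives.
    void onApplicationInstanceActivated() {
        activated.countDown();
    }

    // Returns true as soon as the event has been seen, false on timeout.
    boolean awaitActive(long timeoutMillis) {
        try {
            return activated.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and report that the wait did not succeed.
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The test thread returns the moment the event is delivered, so a fast mock
instance makes the test faster instead of making it miss the status change.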

Now the average time taken for a complete integration test run is around 16
mins (earlier it was more than 30 mins). This is almost a 50% reduction in
run time. I have created a JIRA at [1].


Following is a summary of changes.

AutoscalerTopologyEventReceiver:
 - Fix formatting and log messages
 - ClusterInstanceTerminatedEventListener: check whether appMonitor is null
before calling destroy() on the monitor

AutoscalerServiceImpl
 - Fix formatting and log messages

ClusterStatusActiveProcessor
 - Fix formatting and log messages (log cluster-instance-id)

GroupStatusActiveProcessor
  - Fix formatting and log messages (log group-instance-id)
  - Print groups map entries and cluster data holder map entries if debug
enabled

GroupStatusProcessor
  - Fix formatting and log messages (log group-instance-id)

GroupStatusTerminatedProcessor
  - Fix formatting and log messages (log group-instance-id)

CloudControllerServiceComponent
  - Increase THREAD_POOL_SIZE from 10 to 20


TopologyBuilder
  - Fix formatting and log messages
  - Move acquire topology lock call outside of try block
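
The reason for acquiring the lock outside the try block can be sketched as
follows. This uses a plain ReentrantReadWriteLock as an assumption; the
class below is hypothetical and does not reproduce the TopologyBuilder
locking code.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical holder of some state guarded by a write lock.
class TopologyLockSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private int version;

    int update() {
        // Acquire before entering try: if acquisition itself fails, the
        // finally block must not release a lock this thread never held.
        lock.writeLock().lock();
        try {
            return ++version;
        } finally {
            lock.writeLock().unlock();
        }
    }

    boolean isWriteLocked() {
        return lock.isWriteLocked();
    }
}
```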

RestClient
 - Add logs for every method to help troubleshooting integration failures

StratosTestServerManager
 - Call waitForPort method with restart timeout of 600000 ms. This is to
avoid test failures due to slow builder machines.


RestConstants
 - Add entity name: REPO_NOTIFY_NAME = "GitHook"

TopologyHandler
 - Update timeout values:
APPLICATION_ACTIVATION_TIMEOUT = 300000;
APPLICATION_INACTIVATION_TIMEOUT = 120000;
APPLICATION_UNDEPLOYMENT_TIMEOUT = 30000;
MEMBER_TERMINATION_TIMEOUT = 120000;
APPLICATION_TOPOLOGY_INIT_TIMEOUT = 20000;

 - Increase executorService pool size from 10 to 30 to compensate for
additional event receivers
 - Add event receivers that log the events they receive
     - healthStatEventReceiver: MemberFaultEvent

     - applicationsEventReceiver:
ApplicationInstanceActivatedEventListener,
ApplicationInstanceInactivatedEventListener

     - topologyEventReceiver: MemberActivatedEventListener,
MemberTerminatedEventListener, ClusterInstanceActivatedEventListener,
ClusterInstanceInactivateEventListener

 - Added logs for every method to help troubleshoot integration failures
 - Asynchronous mechanism to assertApplicationActiveStatus
 - Asynchronous mechanism to assertApplicationInactiveStatus


Application
 - Added log in getStatus() method to print status of all application
instances


ApplicationCreatedMessageProcessor
  - Fix formatting and log messages

ApplicationDeletedMessageProcessor
  - Fix formatting and log messages

ApplicationInstanceActivatedMessageProcessor
  - Fix formatting and log messages (log app-instance-id)

ApplicationInstanceCreatedMessageProcessor
 - Fix formatting and log messages (log app-instance-id)

ApplicationInstanceInactivatedMessageProcessor
 - Fix formatting and log messages (log app-instance-id)

ApplicationInstanceTerminatedMessageProcessor
 - Fix formatting and log messages (log app-instance-id)

ApplicationInstanceTerminatingMessageProcessor
 - Fix formatting and log messages (log app-instance-id)

ClusterStatusClusterActivatedMessageProcessor
 - Fix formatting and log messages (log cluster-instance-id)

ClusterStatusClusterInactivateMessageProcessor
 - Fix formatting and log messages (log cluster-instance-id)

ClusterStatusClusterInstanceCreatedMessageProcessor
 - Fix formatting and log messages (log cluster-instance-id)

ClusterStatusClusterResetMessageProcessor
 - Fix formatting and log messages (log cluster-instance-id)

ClusterStatusClusterTerminatedMessageProcessor
 - Fix formatting and log messages (log cluster-instance-id)

ClusterStatusClusterTerminatingMessageProcessor
 - Fix formatting and log messages (log cluster-instance-id)

Introduce a removeEventListener method in the message processor chains to
remove a registered event listener object. This is needed since integration
tests register event listeners on demand. Those listeners need to be
removed at the end of the test case.
 - ApplicationSignUpMessageProcessorChain
 - ApplicationsMessageProcessorChain
 - ClusterStatusMessageProcessorChain
 - DomainMappingMessageProcessorChain
 - HealthStatMessageProcessorChain
 - InitializerMessageProcessorChain
 - InstanceNotifierMessageProcessorChain
 - InstanceStatusMessageProcessorChain
 - TenantMessageProcessorChain
 - TopologyMessageProcessorChain
 - MessageProcessorChain
 - ApplicationsEventMessageDelegator
 - ApplicationsEventReceiver
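
A minimal sketch of the add/remove pattern, assuming a simple listener list.
The interface and class names are hypothetical and much simpler than the
real MessageProcessorChain classes.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical listener type; the real listeners are typed per event.
interface SketchEventListener {
    void onEvent(String event);
}

class SketchProcessorChain {
    // Safe to modify while another thread is dispatching events.
    private final List<SketchEventListener> listeners = new CopyOnWriteArrayList<>();

    void addEventListener(SketchEventListener listener) {
        listeners.add(listener);
    }

    // Returns true if the listener was registered and has now been removed.
    boolean removeEventListener(SketchEventListener listener) {
        return listeners.remove(listener);
    }

    void dispatch(String event) {
        for (SketchEventListener listener : listeners) {
            listener.onEvent(event);
        }
    }
}
```

A test case would register its listener at the start, keep the reference,
and pass the same object to removeEventListener in its cleanup step.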

InstanceNotifierEventReceiver
 - synchronized blocks for execute() and terminate()
 - eventSubscriber object creation moved to constructor from execute()
method. This is to avoid possible NPE when calling terminate() method
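
The NPE scenario and the fix can be sketched like this. All names are
hypothetical, and the 'terminated' flag is my own addition for the sketch
rather than something in InstanceNotifierEventReceiver.

```java
// Hypothetical subscriber; stands in for the real event subscriber object.
class SubscriberSketch {
    private boolean running;

    synchronized void start() { running = true; }
    synchronized void stop() { running = false; }
    synchronized boolean isRunning() { return running; }
}

class NotifierReceiverSketch {
    // Created in the constructor, so terminate() never sees a null field
    // even when it is called before execute().
    private final SubscriberSketch subscriber;
    private boolean terminated; // assumption: not part of the original class

    NotifierReceiverSketch() {
        subscriber = new SubscriberSketch();
    }

    synchronized void execute() {
        if (!terminated) {
            subscriber.start();
        }
    }

    synchronized void terminate() {
        terminated = true;
        subscriber.stop();
    }

    synchronized boolean isSubscribed() {
        return subscriber.isRunning();
    }
}
```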

MetadataApi
 - Catch generic Exception instead of RegistryException

DataStore
 - throws MetadataException in addition to RegistryException

MetadataApiRegistry
 - throw new MetadataException instead of RegistryException


MockIaasServiceComponent
 - Move mockIaasServiceUtil.startInstancesPersisted() to
MockIaasServiceImpl()

MockIaasServiceImpl
 - start persisted mock instances in the constructor
 - Remove all thread sleep calls


MockIaasServiceUtil
 - Move startInstancesPersisted() method to MockIaasServiceImpl()


MockInstance
 - Create TopologyEventReceiver object and register:
MemberInitializedEventListener, MemberStartedEventListener,
MemberMaintenanceListener
 - Publish InstanceStartedEvent upon receiving MemberInitializedEvent
 - Publish InstanceActivatedEvent upon receiving MemberStartedEvent
 - Publish InstanceReadyToShutdownEvent upon receiving
MemberMaintenanceModeEvent
 - synchronized terminate() and initialize() methods
 - Introduce an initialize() method to start event receivers and health
stat publisher
 - Introduce field 'memberStatus' of type MemberStatus to track the
life-cycle of mock instance

MockIaasServiceTest
 - Start embedded MB in unit test case since mockIaasService.startInstance
call will start the event receivers as well

GitHookTestCase
 - Make artifactUpdateEventCount an AtomicInteger
 - Replace restClient.doPost call with restClient.addEntity call
 - Terminate instanceNotifierEventReceiver at the end of test case
 - Shutdown eventListenerExecutorService at the end of test case

[1] https://issues.apache.org/jira/browse/STRATOS-1633

Thanks.

-- 
Akila Ravihansa Perera
WSO2 Inc.;  http://wso2.com/

Blog: http://ravihansa3000.blogspot.com
