Hi,

I've fixed most of these cases we discussed here. To summarize;

 - Registry exceptions should always be thrown from persist() method and
should be handled by the methods that are calling it
 - If service name, cluster id or member id is not found in the Topology, a
RuntimeException should be thrown instead of returning from the method.

Please go through the changes in [1] and share your thoughts.

[1]
https://github.com/apache/stratos/commit/8d46fab0f921e37d5f6da11f8006b68b3743226d

Thanks.


On Thu, Oct 1, 2015 at 12:14 PM, Imesh Gunaratne <im...@apache.org> wrote:

> A good finding Dinithi! May be you can fix this for 4.1.4.
>
> Thanks
>
> On Wed, Sep 30, 2015 at 9:22 PM, Dinithi De Silva <dinit...@wso2.com>
> wrote:
>
>> Hi Akila,
>>
>> Thank you for bringing this up. +1 for this as this would lead to an
>> unrecoverable state and we can not even find out where it went wrong.
>>
>> I found another place in CloudControllerServiceImpl class which we need
>> to fix.
>>
>> private void onClusterRemoval(final String clusterId) {
>> ClusterContext ctxt =
>> CloudControllerContext.getInstance().getClusterContext(clusterId);
>> TopologyBuilder.handleClusterRemoved(ctxt);
>> CloudControllerContext.getInstance().removeClusterContext(clusterId);
>>
>> CloudControllerContext.getInstance().removeMemberContextsOfCluster(clusterId);
>> CloudControllerContext.getInstance().persist();
>> }
>>
>> As Imesh suggested it would be better to scan the complete code base to
>> find out the places we need to fix.
>>
>> Thanks.
>>
>> On Tue, Sep 29, 2015 at 8:08 PM, Imesh Gunaratne <im...@apache.org>
>> wrote:
>>
>>> Yes indeed! As I found the problem is in CloudControllerUtil class:
>>>
>>> public static void persistTopology(Topology topology) {
>>>     try {
>>>         
>>> RegistryManager.getInstance().persist(CloudControllerConstants.TOPOLOGY_RESOURCE,
>>>  topology);
>>>     } catch (RegistryException e) {
>>>         String msg = "Failed to persist the Topology in registry. ";
>>>         log.fatal(msg, e);
>>>     }
>>> }
>>>
>>> We might need to scan the entire codebase for such occurrences.
>>>
>>> Thanks
>>>
>>>
>>> On Tue, Sep 29, 2015 at 4:26 PM, Reka Thirunavukkarasu <r...@wso2.com>
>>> wrote:
>>>
>>>> +1 for handing these dependent operations as atomic in order to avoid
>>>> inconsistency. It is a good thought.
>>>>
>>>> Better have the same approach for application model also as it has to
>>>> update monitor hierarchy and application model at the same time.
>>>>
>>>> Thanks,
>>>> Reka
>>>>
>>>> On Tue, Sep 29, 2015 at 1:57 AM, Akila Ravihansa Perera <
>>>> raviha...@wso2.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> This is to bring your attention to possible inconsistency that could
>>>>> arise due to rare edge cases. For an eg: If you follow the piece of code 
>>>>> at
>>>>> [1], handleMemberTerminated can return without successfully removing a
>>>>> member from topology but the next step will get executed which is to 
>>>>> remove
>>>>> the member context from CC's context, thus leading to inconsistencies. 
>>>>> This
>>>>> could result in permanent inconsistent state and system will never 
>>>>> recover.
>>>>>
>>>>> I'm proposing the $subject to resolve this issue. I think simply
>>>>> throwing an exception if an operation is not successful and handling those
>>>>> exceptions gracefully should fix the problem rather than silently 
>>>>> returning
>>>>> from the method call. Above is only one example and we may have to find
>>>>> such occurrences throughout the CC component since that is the only module
>>>>> which writes/updates the topology.
>>>>>
>>>>> However, we need to check whether same problem can occur for
>>>>> Application model maintained by AS and Tenant model maintained by SM.
>>>>> @Reka: Any thoughts?
>>>>>
>>>>> I've created the JIRA [2] to track this issue.
>>>>>
>>>>> [1]
>>>>> https://github.com/ravihansa3000/stratos/blob/stratos-4.1.x/components/org.apache.stratos.cloud.controller/src/main/java/org/apache/stratos/cloud/controller/services/impl/CloudControllerServiceUtil.java#L57
>>>>> [2] https://issues.apache.org/jira/browse/STRATOS-1578
>>>>>
>>>>> Thanks.
>>>>>
>>>>> --
>>>>> Akila Ravihansa Perera
>>>>> WSO2 Inc.;  http://wso2.com/
>>>>>
>>>>> Blog: http://ravihansa3000.blogspot.com
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Reka Thirunavukkarasu
>>>> Senior Software Engineer,
>>>> WSO2, Inc.:http://wso2.com,
>>>> Mobile: +94776442007
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Imesh Gunaratne
>>>
>>> Senior Technical Lead, WSO2
>>> Committer & PMC Member, Apache Stratos
>>>
>>
>>
>>
>> --
>> *Dinithi De Silva*
>> Associate Software Engineer, WSO2 Inc.
>> m:+94716667655 | e:dinit...@wso2.com | w: www.wso2.com
>> | a: #20, Palm Grove, Colombo 03
>>
>
>
>
> --
> Imesh Gunaratne
>
> Senior Technical Lead, WSO2
> Committer & PMC Member, Apache Stratos
>



-- 
Akila Ravihansa Perera
WSO2 Inc.;  http://wso2.com/

Blog: http://ravihansa3000.blogspot.com

Reply via email to