Hi Shaheed,

We have been using member fault detection to test the termination behavior
in beta2 and since then, most recently about a week ago. So I believe that it
will work in the latest master. However, we will also verify this again.

Thanks,
Reka

On Mon, May 18, 2015 at 7:53 PM, Imesh Gunaratne <im...@apache.org> wrote:

> Thanks Shaheed! I will verify the second problem where Stratos is not
> detecting manually terminated members.
>
> Thanks
>
> On Mon, May 18, 2015 at 3:39 PM, Shaheedur Haque (shahhaqu) <
> shahh...@cisco.com> wrote:
>
>>  Ack. We are just in the middle of getting sync’d up again to
>> master, and it sounds like that might fix the persistence issue.
>>
>>
>>
>> I guess that leaves the Cartridge Agent reconnect side of the problem…
>>
>>
>>
>> *From:* Lahiru Sandaruwan [mailto:lahi...@wso2.com]
>> *Sent:* 17 May 2015 03:06
>>
>> *To:* dev
>> *Cc:* Ryan Du Plessis (rdupless); Luca Martini (lmartini)
>> *Subject:* Re: Clustered deployments of Stratos
>>
>>
>>
>> Hi Shaheed,
>>
>>
>>
>> Similarly, it would be a great help if you could verify all these issues in
>> the latest code, since we have been fixing a lot of issues in recent days as a
>> result of RC1 testing.
>>
>>
>>
>> Thanks.
>>
>>
>>
>> On Fri, May 15, 2015 at 9:42 PM, Imesh Gunaratne <im...@apache.org>
>> wrote:
>>
>> Hi Shaheed,
>>
>>
>>
>> Thanks for the quick response. After analyzing the results you have
>> provided again, it looks like only the deployment policies are missing
>> after the failover. We have fixed this issue in commit
>> revision: 0c515aa013850575ddcfa2e299da5f0ec250ebc3
>>
>>
>>
>>
>> http://mail-archives.apache.org/mod_mbox/incubator-stratos-commits/201504.mbox/%3c22eed4e8639c401a8fda637fa6bb4...@git.apache.org%3E
>>
>>
>>
>> Would you mind verifying whether this is there in your runtime?
>>
>>
>>
>> Thanks
>>
>>
>>
>>
>>
>> On Fri, May 15, 2015 at 9:02 PM, Shaheedur Haque (shahhaqu) <
>> shahh...@cisco.com> wrote:
>>
>> The latter; we never have both Stratos instances running.
>>
>>
>>
>> *From:* Imesh Gunaratne [mailto:im...@apache.org]
>> *Sent:* 15 May 2015 16:17
>> *To:* dev
>> *Cc:* Ryan Du Plessis (rdupless); Luca Martini (lmartini)
>>
>>
>> *Subject:* Re: Clustered deployments of Stratos
>>
>>
>>
>> Hi Shaheed,
>>
>>
>>
>> Do you have both active and passive Stratos nodes running at the same
>> time or do you start the passive node once the active node goes down?
>>
>>
>>
>> Thanks
>>
>>
>>
>> On Fri, May 15, 2015 at 6:31 PM, Shaheedur Haque (shahhaqu) <
>> shahh...@cisco.com> wrote:
>>
>> Hi Imesh,
>>
>>
>>
>> I finally got round to a proper series of tests, and here are the
>> conclusions:
>>
>>
>>
>> ·        In Stratos 4.0, after a Pacemaker driven failover, the newly
>> Active Stratos has lost all Cartridge Definitions.
>>
>> ·        In current [1] Stratos 4.1, after a Pacemaker driven failover,
>> the newly Active Stratos:
>>
>> o   Has lost all Deployment Policies.
>>
>> o   Has lost contact with the Cartridge Agents, and all VMs are stuck
>> with whatever state they had before the failover.
>>
>> ·        Note: I have not verified if Cartridge Groups are lost or not.
>>
>>
>>
>> I include the test results below at [2] and [3]. I am concerned as to
>> whether 4.1 is ready for GA on this basis, so though more testing is no
>> doubt possible (e.g. Cartridge Groups) I wanted to get this info to the
>> list ASAP.
>>
>>
>>
>> Thanks, Shaheed
>>
>>
>>
>> [1] A recent build somewhere between beta 1 and beta 2, but I don’t think
>> any relevant fixes have been made in master.
>>
>>
>>
>> [2] Persistence test output from Stratos 4.1. Note:
>>
>>
>>
>> 1.      In the build I have, the CLI is broken for a couple of commands;
>> these are supplemented by direct “curl” commands further down.
>>
>> 2.      I’ve used one of our commands to show the instances and their
>> state for a given application since there is not a compact JSON or
>> convenient Stratos CLI for that.
>>
>>
>>
>> *PERSISTENCE TEST, BEFORE FAILOVER*
>>
>> *================================*
>>
>>
>>
>> stratos> list-tenants
>>
>> Tenants:
>>
>>
>> +-----------------------+-----------+------------------+--------+------------------------------+
>> | Domain                | Tenant ID | Email            | State  | Created Date                 |
>> +-----------------------+-----------+------------------+--------+------------------------------+
>> | cloud1.qmog.cisco.com | 1         | clo...@cisco.com | Active | Fri May 15 04:46:58 MDT 2015 |
>> +-----------------------+-----------+------------------+--------+------------------------------+
>>
>>
>>
>> stratos> list-network-partitions
>>
>> Network partitions found:
>>
>> +----------------------+----------------------+
>>
>> | Network Partition ID | Number of Partitions |
>>
>> +----------------------+----------------------+
>>
>> | RegionOne            | 1                    |
>>
>> +----------------------+----------------------+
>>
>>
>>
>> stratos> list-deployment-policies
>>
>> Deployment policies found:
>>
>> +-------------------+---------------+
>>
>> | ID                | Accessibility |
>>
>> +-------------------+---------------+
>>
>> | static-2-ha       | 1             |
>>
>> +-------------------+---------------+
>>
>> | autoscale-2-10-ha | 1             |
>>
>> +-------------------+---------------+
>>
>> | autoscale-1-5     | 1             |
>>
>> +-------------------+---------------+
>>
>> | static-1          | 1             |
>>
>> +-------------------+---------------+
>>
>>
>>
>> stratos> list-application-policies
>>
>> Error in listing application policies
>>
>> No application policies found
>>
>>
>>
>> stratos> list-autoscaling-policies
>>
>> Error in listing autoscaling policies
>>
>> No autoscaling policies found
>>
>>
>>
>> stratos> list-cartridges
>>
>> Cartridges found:
>>
>>
>> +------------------+-------------+------------------+----------------------------+---------+--------------+
>> | Type             | Category    | Name             | Description                | Version | Multi-Tenant |
>> +------------------+-------------+------------------+----------------------------+---------+--------------+
>> | cartridge-proxy  | Application | cartridge-proxy  | cartridge-proxy Cartridge  | 1       | false        |
>> +------------------+-------------+------------------+----------------------------+---------+--------------+
>> | cisco-sample-vm  | Application | cisco-sample-vm  | cisco-sample-vm Cartridge  | 1       | false        |
>> +------------------+-------------+------------------+----------------------------+---------+--------------+
>> | cisco-qvpc-cf-01 | Application | cisco-qvpc-cf-01 | cisco-qvpc-cf-01 Cartridge | 1       | false        |
>> +------------------+-------------+------------------+----------------------------+---------+--------------+
>> | cisco-qvpc-cf-02 | Application | cisco-qvpc-cf-02 | cisco-qvpc-cf-02 Cartridge | 1       | false        |
>> +------------------+-------------+------------------+----------------------------+---------+--------------+
>> | cisco-qvpc-si    | Application | cisco-qvpc-si    | cisco-qvpc-si Cartridge    | 1       | false        |
>> +------------------+-------------+------------------+----------------------------+---------+--------------+
>> | cisco-qvpc-sf    | Application | cisco-qvpc-sf    | cisco-qvpc-sf Cartridge    | 1       | false        |
>> +------------------+-------------+------------------+----------------------------+---------+--------------+
>>
>>
>>
>> stratos> list-applications
>>
>> Applications found:
>>
>> +-----------------+-----------------+----------+
>>
>> | Application ID  | Alias           | Status   |
>>
>> +-----------------+-----------------+----------+
>>
>> | cartridge-proxy | cartridge-proxy | Deployed |
>>
>> +-----------------+-----------------+----------+
>>
>> | cisco-sample-vm | cisco-sample-vm | Deployed |
>>
>> +-----------------+-----------------+----------+
>>
>>
>>
>> $ curl -uadmin:admin -k -H'Content-type: application/json' https://localhost:9443/api/autoscalingPolicies
>>
>>
>> [{"id":"economyPolicy","instanceRoundingFactor":0,"isPublic":false,"loadThresholds":""}]
>>
>>
>>
>> $ curl -uadmin:admin -k -H'Content-type: application/json' https://localhost:9443/api/applicationPolicies
>>
>>
>> [{"algorithm":"one-after-another","id":"default-iaas","networkPartitions":["RegionOne"],"properties":{"name":"networkPartitionGroups","value":"RegionOne"}}]
>>
>>
>>
>>
>>
>> *PERSISTENCE TEST, AFTER FAILOVER*
>>
>> *===============================*
>>
>>
>>
>> stratos> list-tenants
>>
>> Tenants:
>>
>>
>> +-----------------------+-----------+------------------+--------+------------------------------+
>> | Domain                | Tenant ID | Email            | State  | Created Date                 |
>> +-----------------------+-----------+------------------+--------+------------------------------+
>> | cloud1.qmog.cisco.com | 1         | clo...@cisco.com | Active | Fri May 15 05:26:52 MDT 2015 |
>> +-----------------------+-----------+------------------+--------+------------------------------+
>>
>>
>>
>> stratos> list-network-partitions
>>
>> Network partitions found:
>>
>> +----------------------+----------------------+
>>
>> | Network Partition ID | Number of Partitions |
>>
>> +----------------------+----------------------+
>>
>> | RegionOne            | 1                    |
>>
>> +----------------------+----------------------+
>>
>>
>>
>> stratos> list-deployment-policies
>>
>> No deployment policies found
>>
>>
>>
>> stratos> list-application-policies
>>
>> Error in listing application policies
>>
>> No application policies found
>>
>>
>>
>> stratos> list-autoscaling-policies
>>
>> Error in listing autoscaling policies
>>
>> No autoscaling policies found
>>
>>
>>
>> stratos> list-cartridges
>>
>> Cartridges found:
>>
>>
>> +------------------+-------------+------------------+----------------------------+---------+--------------+
>> | Type             | Category    | Name             | Description                | Version | Multi-Tenant |
>> +------------------+-------------+------------------+----------------------------+---------+--------------+
>> | cartridge-proxy  | Application | cartridge-proxy  | cartridge-proxy Cartridge  | 1       | false        |
>> +------------------+-------------+------------------+----------------------------+---------+--------------+
>> | cisco-sample-vm  | Application | cisco-sample-vm  | cisco-sample-vm Cartridge  | 1       | false        |
>> +------------------+-------------+------------------+----------------------------+---------+--------------+
>> | cisco-qvpc-cf-01 | Application | cisco-qvpc-cf-01 | cisco-qvpc-cf-01 Cartridge | 1       | false        |
>> +------------------+-------------+------------------+----------------------------+---------+--------------+
>> | cisco-qvpc-cf-02 | Application | cisco-qvpc-cf-02 | cisco-qvpc-cf-02 Cartridge | 1       | false        |
>> +------------------+-------------+------------------+----------------------------+---------+--------------+
>> | cisco-qvpc-si    | Application | cisco-qvpc-si    | cisco-qvpc-si Cartridge    | 1       | false        |
>> +------------------+-------------+------------------+----------------------------+---------+--------------+
>> | cisco-qvpc-sf    | Application | cisco-qvpc-sf    | cisco-qvpc-sf Cartridge    | 1       | false        |
>> +------------------+-------------+------------------+----------------------------+---------+--------------+
>>
>>
>>
>> stratos> list-applications
>>
>> Applications found:
>>
>> +-----------------+-----------------+----------+
>>
>> | Application ID  | Alias           | Status   |
>>
>> +-----------------+-----------------+----------+
>>
>> | cartridge-proxy | cartridge-proxy | Deployed |
>>
>> +-----------------+-----------------+----------+
>>
>> | cisco-sample-vm | cisco-sample-vm | Deployed |
>>
>> +-----------------+-----------------+----------+
>>
>>
>>
>> $ curl -uadmin:admin -k -H'Content-type: application/json' https://localhost:9443/api/autoscalingPolicies
>>
>>
>> [{"id":"economyPolicy","instanceRoundingFactor":0,"isPublic":false,"loadThresholds":""}]
>>
>>
>>
>> $ curl -uadmin:admin -k -H'Content-type: application/json' https://localhost:9443/api/applicationPolicies
>>
>>
>> [{"algorithm":"one-after-another","id":"default-iaas","networkPartitions":["RegionOne"],"properties":{"name":"networkPartitionGroups","value":"RegionOne"}}]
>>
>>
>>
>> [3] Cartridge test output from Stratos 4.1. Note:
>>
>>
>>
>> 1.      We do not use a VIP for Stratos, either for 4.0 or 4.1.
>>
>> 2.      We expect the Cartridge Agent to use a DNS lookup when it ends
>> up reconnecting, and this worked just fine in Stratos 4.0.
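
As a rough illustration of the reconnect behaviour expected in note 2, the Python sketch below re-resolves the host name before every connection attempt, so that a DNS record repointed at the newly active node is picked up. This is not the Cartridge Agent's actual code: the function names, the retry count and the delay are all hypothetical.

```python
import socket
import time


def resolve(host, port):
    """Resolve the name afresh and return candidate (ip, port) pairs."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return [info[4][:2] for info in infos]


def reconnect(host, port, attempts=5, delay=1.0,
              resolver=resolve, connector=socket.create_connection,
              sleep=time.sleep):
    """Try to reconnect, repeating the DNS lookup before every attempt.

    resolver, connector and sleep are injectable so the loop can be
    exercised without a network (assumption made for this sketch only).
    """
    for attempt in range(attempts):
        try:
            for address in resolver(host, port):
                try:
                    return connector(address)
                except OSError:
                    continue  # this address is unreachable, try the next one
        except OSError:
            pass  # the lookup itself failed; retry after the delay
        if attempt < attempts - 1:
            sleep(delay)
    raise ConnectionError(
        f"could not reach {host}:{port} after {attempts} attempts")
```

Caching the resolved address across attempts would defeat the purpose here, since without a VIP the name is the only thing that follows the active node.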
>>
>>
>>
>> *CARTRIDGE TEST, BEFORE FAILOVER*
>>
>> *==============================*
>>
>>
>>
>> $ ./bin/orchestration subscription list-instances --admin cisco-sample-vm
>>
>> cisco-sample-vm: applicationInstances 1, groupInstances 0,
>> clusterInstances 1, members 1 (Active 1)
>>
>>      cisco-sample-vm: 172.16.180.30/10.0.0.101: status Active
>>
>>
>>
>> *CARTRIDGE TEST, AFTER FAILOVER*
>>
>> *=============================*
>>
>>
>>
>> $ ./bin/orchestration subscription list-instances --admin cisco-sample-vm
>>
>> cisco-sample-vm: applicationInstances 1, groupInstances 0,
>> clusterInstances 1, members 1 (Active 1)
>>
>>      cisco-sample-vm: 172.16.180.30/10.0.0.101: status Active
>>
>>
>>
>> *CARTRIDGE TEST,  AFTER FAILOVER WAIT 5 MINUTES, THEN KILL INSTANCE, THEN
>> WAIT 2 MINUTES*
>>
>>
>> *===================================================================================*
>>
>>
>>
>> $ ./bin/orchestration subscription list-instances --admin cisco-sample-vm
>>
>> cisco-sample-vm: applicationInstances 1, groupInstances 0,
>> clusterInstances 1, members 1 (Active 1)
>>
>>      cisco-sample-vm: 172.16.180.30/10.0.0.101: status Active
>>
>>
>>
>>
>>
>>
>>
>> *From:* Imesh Gunaratne [mailto:im...@apache.org]
>> *Sent:* 14 May 2015 20:34
>>
>>
>> *To:* dev
>> *Subject:* Re: Clustered deployments of Stratos
>>
>>
>>
>> It would be better to use the REST API to query and see whether the
>> relevant entities are persisted. Since data is stored in binary format in
>> the registry it would be difficult to query the database and verify this.
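
A minimal sketch of such a REST-based check, in Python: take a snapshot of entity IDs before the failover, take another afterwards, and report what went missing. The two paths are the ones used with curl elsewhere in this thread; the helper names are hypothetical, and the sketch assumes each endpoint returns a JSON list of objects with an "id" field, as in the outputs shown in the thread.

```python
import base64
import json
import ssl
import urllib.request

# Paths taken from the curl commands in this thread. Other entity types
# (deployment policies, cartridges) would need their own paths.
PATHS = ["/api/autoscalingPolicies", "/api/applicationPolicies"]


def extract_ids(payload):
    """Return the sorted "id" values from a JSON list of entity objects."""
    return sorted(item["id"] for item in json.loads(payload))


def fetch_snapshot(base_url, user, password, paths=PATHS):
    """Query each path and return {path: [ids]} (hypothetical helper)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    # Equivalent of curl -k: skip certificate checks (test setups only).
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    snapshot = {}
    for path in paths:
        request = urllib.request.Request(
            base_url + path,
            headers={"Authorization": "Basic " + token,
                     "Content-type": "application/json"},
        )
        with urllib.request.urlopen(request, context=context) as response:
            snapshot[path] = extract_ids(response.read().decode("utf-8"))
    return snapshot


def lost_entities(before, after):
    """Return {path: [ids present before the failover but missing after]}."""
    lost = {}
    for path, ids in before.items():
        missing = sorted(set(ids) - set(after.get(path, [])))
        if missing:
            lost[path] = missing
    return lost
```

Usage would be along the lines of calling fetch_snapshot("https://localhost:9443", "admin", "admin") once before and once after the failover, then passing both results to lost_entities; an empty result means nothing was lost for the paths queried.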
>>
>>
>>
>> On Thu, May 14, 2015 at 10:47 PM, Shaheedur Haque (shahhaqu) <
>> shahh...@cisco.com> wrote:
>>
>> I looked at REG_RESOURCEs (as well as a few others) but I’m afraid I am
>> going to need more specifics.
>>
>>
>>
>> For example, what query would you recommend to look at say deployment
>> policies and cartridge definitions?
>>
>>
>>
>> *From:* Imesh Gunaratne [mailto:im...@apache.org]
>> *Sent:* 09 May 2015 09:08
>>
>>
>> *To:* dev
>> *Subject:* Re: Clustered deployments of Stratos
>>
>>
>>
>> Yes, you could refer to the tables that have the prefix "REG_".
>>
>>
>>
>> On Sat, May 9, 2015 at 4:11 AM, Shaheedur Haque (shahhaqu) <
>> shahh...@cisco.com> wrote:
>>
>> Can you suggest what tables to look at?
>>
>>
>>
>> *From:* Imesh Gunaratne [mailto:im...@apache.org]
>> *Sent:* 07 May 2015 18:00
>>
>>
>> *To:* dev
>> *Subject:* Re: Clustered deployments of Stratos
>>
>>
>>
>> Hi Shaheed,
>>
>>
>>
>> Thanks for the clarification! Maybe the problem is with the MySQL
>> active-passive configuration.
>>
>>
>>
>> I understand that you are switching the same OpenStack volume from the active
>> node to the passive node (when the passive node becomes active); therefore,
>> technically it should work. Maybe we need to investigate this problem
>> further by analysing whether data is persisted properly in the active node
>> before the passive node becomes active.
>>
>>
>>
>> Thanks
>>
>>
>>
>> On Tue, May 5, 2015 at 4:22 PM, Shaheedur Haque (shahhaqu) <
>> shahh...@cisco.com> wrote:
>>
>> The data is not synchronised between the active and passive nodes. For
>> clarity, this is the HA model we had, much as described in the blog:
>>
>>
>>
>> ·        2 nodes, with Pacemaker in active-passive mode.
>>
>> ·        Under Pacemaker control:
>>
>> o   We run MySQL in active-passive mode, using a single OpenStack volume
>> which we attach/reattach as the active role moves around nodes.
>>
>> o   As Pacemaker moves the volume, and thus MySQL, around on node
>> failures, ActiveMQ and Stratos are moved around too.
>>
>> o   Thus, everything operates in active-passive mode.
>>
>>
>>
>> Even in this model, as the active Stratos 4.0 is moved around (i.e. the
>> Stratos JVM on the old active node has gone with the node, and Pacemaker
>> starts up a new Stratos JVM on what used to be the passive node), we found
>> that the Cartridge Definition objects were missing and, as a
>> clumsy workaround [1], we had to replay the stored copies of them into
>> Stratos using the REST API.
>>
>>
>>
>> With Stratos 4.1, using the new object names, early indications are that
>> *Deployment Policies* and *Application Deployment* policies are lost as the active
>> fails over to the passive. If anything, these objects are more likely to
>> hit the problems of [1], since Stratos 4.1 expects these to be tweaked on
>> the fly (min/max etc).
>>
>>
>>
>> Thanks, Shaheed
>>
>>
>>
>> [1] Clearly, this loses any changes that were not in the stored copies.
>>
>>
>>
>> *From:* Imesh Gunaratne [mailto:im...@apache.org]
>> *Sent:* 03 May 2015 06:43
>> *To:* dev@stratos.apache.org
>>
>>
>> *Subject:* Re: Clustered deployments of Stratos
>>
>>
>>
>> Hi Shaheed,
>>
>>
>>
>> Thanks for taking time to test this!
>>
>>
>>
>> Just to clarify the exact problem, do you mean that data is not
>> synchronized between the active and passive nodes or that it is not persisted
>> in the active node?
>>
>>
>>
>> Thanks
>>
>>
>> On Sunday, May 3, 2015, Shaheedur Haque (shahhaqu) <shahh...@cisco.com>
>> wrote:
>>
>>
>> I have been looking into our use of Linux HA to set up an Active-Passive
>> configuration. Testing indicates that in 4.1 (beta1), several objects seem
>> not to be persisted properly. This includes at least:
>>
>> - Cartridges
>> - Deployment policies
>>
>> Am I missing something? Is it safe to work around this by replaying those
>> objects?
>>  ------------------------------
>>
>> *From:* Imesh Gunaratne [im...@apache.org]
>> *Sent:* 23 April 2015 10:47
>> *To:* dev
>> *Subject:* Re: Clustered deployments of Stratos
>>
>> Hi Shaheed,
>>
>>
>>
>> Currently N-way clustering is still not possible with CC, AS & SM. We
>> completed the initial phase of this feature; however, the feature itself
>> was not completed. You could refer to the mail thread "[Discuss] Clustering
>> Feature Implementation for 4.1.0-Alpha Release" for details.
>>
>>
>>
>> However at present [1] is valid. We could use Linux HA and deploy CC, AS
>> and SM in Active-Passive mode.
>>
>>
>>
>> Thanks
>>
>>
>>
>>
>>
>>
>>
>> On Thu, Apr 23, 2015 at 2:41 PM, Shaheedur Haque (shahhaqu) <
>> shahh...@cisco.com> wrote:
>>
>> Hi,
>>
>>
>>
>> We currently try to achieve HA with Stratos using something so unpleasant
>> that I won’t even describe it here :-). It has been suggested that Stratos
>> has, for a while now, supported a clustered mode of deployment where, given
>> N servers:
>>
>>
>>
>> ·        The SM, AS and MB operate in an N-way clustered mode
>>
>> ·        The CEP operates in an N-way loadsharing mode
>>
>> ·        The Cartridge Agents can react to a failure in one of the N
>> CEPs by failing over to one of the other N-1 remaining servers
>>
>>
>>
>> In looking for documentation on how to set this up, I came across these
>> two write-ups [1] and [2]. Questions:
>>
>>
>>
>> ·        Both these documents mention only using N=2. Is that still
>> correct?
>>
>> ·        [1] Seems recently written, and [2] is a little older but not
>> much. Are both documents still regarded as current?
>>
>>
>>
>> Also, I’d love to hear any other experiences people have of running
>> configurations like this.
>>
>>
>>
>> Thanks, Shaheed
>>
>>
>>
>> [1]
>> https://cwiki.apache.org/confluence/display/STRATOS/4.1.0+Configuring+HA+Using+Pacemaker+and+Heartbeat
>>
>> [2] http://blog.lasindu.com/2014/08/wso2-private-paas-supporting.html
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> --
>>
>> Imesh Gunaratne
>>
>>
>>
>> Technical Lead, WSO2
>>
>> Committer & PMC Member, Apache Stratos
>>
>>
>>
>> --
>>
>> Imesh Gunaratne
>>
>>
>>
>> Senior Technical Lead, WSO2
>>
>> Committer & PMC Member, Apache Stratos
>>
>>
>>
>>
>>
>>
>>
>> --
>>
>> --
>> Lahiru Sandaruwan
>>
>> Committer and PMC member, Apache Stratos,
>> Senior Software Engineer,
>> WSO2 Inc., http://wso2.com
>>
>> lean.enterprise.middleware
>>
>> phone: +94773325954
>> email: lahi...@wso2.com blog: http://lahiruwrites.blogspot.com/
>> linked-in: http://lk.linkedin.com/pub/lahiru-sandaruwan/16/153/146
>>
>>
>>
>
>
>
> --
> Imesh Gunaratne
>
> Senior Technical Lead, WSO2
> Committer & PMC Member, Apache Stratos
>



-- 
Reka Thirunavukkarasu
Senior Software Engineer,
WSO2, Inc.:http://wso2.com,
Mobile: +94776442007
