Sanjib, after looking at the test results in CI, it seems some of the failures 
are not that sporadic. For example, there is a consistent flow push failure when 
the leader restarts and the switch connects to the just-restarted leader 
(now a follower). I wonder if you could change your test so that, after the 
leader restart, mininet connects to the old leader instead of a fixed follower 
(follower #2). This way we get a consistent error that is easier to reproduce.
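
To make the suggestion concrete, here is a rough Python sketch of the selection 
logic. It is not taken from the suite: the function names and the role 
dictionaries are hypothetical, and the role assigned to each controller IP is 
assumed for illustration only. The idea is to remember which node led the 
inventory shard before the restart and point mininet at that same node 
afterwards, rather than at a fixed follower.

```python
# Hypothetical sketch: pick the controller mininet should reconnect to
# after the leader restart. Role values per IP are assumed sample data.

def find_leader(roles):
    """Return the single IP whose inventory shard role is 'Leader'."""
    leaders = [ip for ip, role in roles.items() if role == "Leader"]
    if len(leaders) != 1:
        raise ValueError("expected exactly one leader, found %d" % len(leaders))
    return leaders[0]


def reconnect_target(roles_before, roles_after):
    """Return the old leader, which should now be a follower."""
    old_leader = find_leader(roles_before)
    if roles_after.get(old_leader) != "Follower":
        raise ValueError("old leader %s did not rejoin as a follower" % old_leader)
    return old_leader


# Assumed shard roles before and after restarting the leader.
roles_before = {
    "10.183.181.51": "Leader",
    "10.183.181.52": "Follower",
    "10.183.181.53": "Follower",
}
roles_after = {
    "10.183.181.51": "Follower",
    "10.183.181.52": "Leader",
    "10.183.181.53": "Follower",
}

target = reconnect_target(roles_before, roles_after)
print(target)
```

In the suite itself this would presumably mean passing the old leader's address 
to the mininet start step instead of follower #2's address.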

BR/Luis


> On Apr 4, 2016, at 10:21 PM, Luis Gomez <[email protected]> wrote:
> 
> The suite looks good to me too. Also, I cannot reproduce this issue using the He 
> plugin in my local setup, so let's observe the behavior in CI.
> 
> BR/Luis
> 
> 
>> On Apr 4, 2016, at 5:06 PM, Jamo Luhrsen <[email protected]> wrote:
>> 
>> Sanjib,
>> 
>> Thanks for the work.  I gave +1 and some comments.  Luis should be
>> the one to take it across the finish line.
>> 
>> As for the random failures, do you know which open bugs we can
>> associate those with?
>> 
>> Thanks,
>> JamO
>> 
>> On 04/04/2016 04:35 AM, Sanjib Mohapatra wrote:
>>> Hi Luis/Jamo
>>> 
>>> I have raised a review request https://git.opendaylight.org/gerrit/37054  
>>> for a new test suite, Cluster HA Data Recovery Leader Follower 
>>> Failover. It covers the following tests.
>>> 
>>>   Description: In a 3-node cluster, the initial inventory shard status is
>>>   verified and the following tests are performed.
>>> 
>>>   - A mininet switch is connected to a follower node and a flow is added
>>>     via another follower node. Disconnect the switch and restart the
>>>     leader. Connect the switch again once the cluster is formed and verify
>>>     the flow is installed in the switch again.
>>> 
>>>   - Disconnect the switch and restart one of the follower nodes. Connect
>>>     the switch and verify the flow is installed in the switch again.
>>> 
>>>   - Disconnect the switch and restart the whole cluster. Connect the switch
>>>     again once the cluster is formed and verify the flow is installed
>>>     successfully.
>>> 
>>> Please review.
>>> 
>>> The result of a local run on a 3-node cluster is given below. The sporadic 
>>> failure needs to be investigated and a bug raised. 
>>> 
>>> root@mininet-vm:/home/mininet/integration/test/csit/suites/openflowplugin/Clustering#
>>>  pybot -L TRACE  -v MININET_USER:mininet -v USER_HOME:/home/mininet -v 
>>> CONTROLLER:10.183.181.51 -v CONTROLLER1:10.183.181.52 -v 
>>> CONTROLLER2:10.183.181.53 -v USER:root -v PASSWORD:rootroot  -v 
>>> DEFAULT_LINUX_PROMPT:\#  -v NUM_ODL_SYSTEM:3 -v MININET_PASSWORD:rootroot 
>>> -v ODL_OF_PLUGIN:helium  -v 
>>> KARAF_HOME:/home/mininet/controller-Be/deploy/current/odl -v 
>>> WORKSPACE:/home/mininet -v BUNDLEFOLDER:controller-Be/deploy/current/odl 
>>> 030__Cluster_HA_Data_Recovery_Leader_Follower_Failover.robot
>>> ==============================================================================
>>> Cluster HA Data Recovery Leader Follower Failover :: Test suite for 
>>> Cluster...
>>> ==============================================================================
>>> Create Original Cluster List :: Create original cluster list.         | 
>>> PASS |
>>> ------------------------------------------------------------------------------
>>> Check Shards Status Before Leader Restart :: Check Status for all ... | 
>>> PASS |
>>> ------------------------------------------------------------------------------
>>> Get inventory Leader Before Leader Restart :: Find leader in the i... | 
>>> PASS |
>>> ------------------------------------------------------------------------------
>>> Start Mininet Connect To Follower Node1 :: Start mininet with conn... | 
>>> PASS |
>>> ------------------------------------------------------------------------------
>>> Add Flows In Follower Node2 and Verify Before Leader Restart :: Ad... | 
>>> PASS |
>>> ------------------------------------------------------------------------------
>>> Stop Mininet Connected To Follower Node1 and Exit :: Stop mininet ... | 
>>> PASS |
>>> ------------------------------------------------------------------------------
>>> Restart Leader From Cluster Node :: Kill Leader Node and Start it ... | 
>>> PASS |
>>> ------------------------------------------------------------------------------
>>> Get inventory Follower After Leader Restart :: Find new Followers ... | 
>>> PASS |
>>> ------------------------------------------------------------------------------
>>> Start Mininet Connect To Follower Node2 :: Start mininet with conn... | 
>>> PASS |
>>> ------------------------------------------------------------------------------
>>> Verify Flows In Switch After Leader Restart :: Verify flows are in... | 
>>> PASS |
>>> ------------------------------------------------------------------------------
>>> Stop Mininet Connected To Follower Node2 and Exit :: Stop mininet ... | 
>>> PASS |
>>> ------------------------------------------------------------------------------
>>> Restart Follower Node2 :: Kill Follower Node2 and Start it Up, Ver... | 
>>> PASS |
>>> ------------------------------------------------------------------------------
>>> Get inventory Follower After Follower Restart :: Find Followers an... | 
>>> PASS |
>>> ------------------------------------------------------------------------------
>>> Start Mininet Connect To Leader :: Start mininet with connection t... | 
>>> PASS |
>>> ------------------------------------------------------------------------------
>>> Verify Flows In Switch After Follower Restart :: Verify flows are ... | 
>>> FAIL |
>>> Keyword 'MininetKeywords.Mininet Sync Status' failed after retrying for 15 
>>> seconds. The last error was: '*** s1 
>>> ------------------------------------------------------------------------
>>> OFPST_AGGREGATE reply (OF1.3) (xid=0x2): packet_count=0 byte_count=0 
>>> flow_count=0
>>> mininet>' contains 'flow_count=1' 0 times, not 1 time.
>>> ------------------------------------------------------------------------------
>>> Stop Mininet Connected To Leader and Exit :: Stop mininet Connecte... | 
>>> PASS |
>>> ------------------------------------------------------------------------------
>>> Restart Full Cluster :: Kill all Cluster Nodes and Start it Up All.   | 
>>> PASS |
>>> ------------------------------------------------------------------------------
>>> Get inventory Status After Cluster Restart :: Find New Followers a... | 
>>> PASS |
>>> ------------------------------------------------------------------------------
>>> Start Mininet Connect To Follower Node2 After Cluster Restart :: S... | 
>>> PASS |
>>> ------------------------------------------------------------------------------
>>> Verify Flows In Switch After Cluster Restart :: Verify flows are i... | 
>>> PASS |
>>> ------------------------------------------------------------------------------
>>> Delete Flows In Follower Node1 and Verify After Leader Restart :: ... | 
>>> PASS |
>>> ------------------------------------------------------------------------------
>>> Stop Mininet Connected To Follower Node2 and Exit After Cluster Re... | 
>>> PASS |
>>> ------------------------------------------------------------------------------
>>> Cluster HA Data Recovery Leader Follower Failover :: Test suite fo... | 
>>> FAIL |
>>> 22 critical tests, 21 passed, 1 failed
>>> 22 tests total, 21 passed, 1 failed
>>> ==============================================================================
>>> Output:  
>>> /home/mininet/integration/test/csit/suites/openflowplugin/Clustering/output.xml
>>> Log:     
>>> /home/mininet/integration/test/csit/suites/openflowplugin/Clustering/log.html
>>> Report:  
>>> /home/mininet/integration/test/csit/suites/openflowplugin/Clustering/report.html
>>> 
>>> Thanks
>>> Sanjib
>>> 
>>> 
>>> -----Original Message-----
>>> From: Gerrit Code Review [mailto:[email protected]] 
>>> Sent: 04 April 2016 16:57
>>> To: Sanjib Mohapatra
>>> Subject: Change in integration/test[master]: Added test suites for cluster 
>>> HA data recovery leader follow...
>>> 
>>> From jenkins-releng <[email protected]>:
>>> 
>>> jenkins-releng has posted comments on this change.
>>> 
>>> Change subject: Added test suites for cluster HA data recovery leader 
>>> follower failover 
>>> ......................................................................
>>> 
>>> 
>>> Patch Set 1: Verified+1
>>> 
>>> Build Unstable 
>>> 
>>> https://jenkins.opendaylight.org/releng/job/openflowplugin-csit-verify-1node-flow-services/98/
>>>  : SUCCESS
>>> 
>>> https://jenkins.opendaylight.org/releng/job/openflowplugin-csit-verify-3node-clustering-helium-design/41/
>>>  : SUCCESS
>>> 
>>> https://jenkins.opendaylight.org/releng/job/integration-csit-verify-1node-library/1460/
>>>  : UNSTABLE
>>> 
>>> https://jenkins.opendaylight.org/releng/job/openflowplugin-csit-verify-3node-clustering/74/
>>>  : UNSTABLE
>>> 
>>> https://jenkins.opendaylight.org/releng/job/integration-test-verify-python-boron/214/
>>>  : SUCCESS
>>> 
>>> --
>>> To view, visit https://git.opendaylight.org/gerrit/37054
>>> To unsubscribe, visit https://git.opendaylight.org/gerrit/settings
>>> 
>>> Gerrit-MessageType: comment
>>> Gerrit-Change-Id: I1cc767712bac3694ba4bf9b052765903bce28ae8
>>> Gerrit-PatchSet: 1
>>> Gerrit-Project: integration/test
>>> Gerrit-Branch: master
>>> Gerrit-Owner: Sanjib Mohapatra <[email protected]>
>>> Gerrit-Reviewer: jenkins-releng <[email protected]>
>>> Gerrit-HasComments: No
>>> _______________________________________________
>>> integration-dev mailing list
>>> [email protected]
>>> https://lists.opendaylight.org/mailman/listinfo/integration-dev
>>> 
> 

_______________________________________________
openflowplugin-dev mailing list
[email protected]
https://lists.opendaylight.org/mailman/listinfo/openflowplugin-dev
