[tickets] [opensaf:tickets] #2415 CKPT node director failed to execute ckpt create request

2019-07-02 Thread Minh Hon Chau via Opensaf-tickets
- Description has changed:

Diff:



--- old
+++ new
@@ -1,5 +1,5 @@
 After the following two patches were removed, based on OpenSAF CS8701, CKPT 
node director failed to execute ckpt create request(Collocated Checkpoints, 
Asynchronous Update).
--ph4_01_headless_escalation_for_osaftest.diff
+-ph4_01_headless_escalation_for_abcdtest.diff
 -mds_log_level.diff
 
 CPND_MAX_REPLICAS =1000



- **Blocker**:  --> False



---

** [tickets:#2415] CKPT node director failed to execute ckpt create request**

**Status:** fixed
**Milestone:** 5.2.0
**Created:** Fri Apr 07, 2017 01:30 AM UTC by David Byrne
**Last Updated:** Mon Apr 10, 2017 06:35 AM UTC
**Owner:** A V Mahesh (AVM)


After the following two patches were removed, based on OpenSAF CS8701, the CKPT 
node director failed to execute ckpt create requests (Collocated Checkpoints, 
Asynchronous Update).
-ph4_01_headless_escalation_for_abcdtest.diff
-mds_log_level.diff

CPND_MAX_REPLICAS = 1000
retention_time is set to 30s

Test procedure
1. Send 34 ckpt requests per second
34*30 = 1020, which is > CPND_MAX_REPLICAS
Failed, as expected
2. Send 32 ckpt requests per second
32*30 = 960, which is < CPND_MAX_REPLICAS
This used to pass, but now fails since the above two patches were removed.
syslog:
Apr  5 01:42:46 SC-2-1 osafckptnd[4958]: ncs_sel_obj_create: socketpair failed - Too many open files
Apr  5 01:42:46 SC-2-1 osafckptnd[4958]: ER cpnd has exceeded the maximum number of allowed replicas (CPND_MAX_REPLICAS)
Test debug info:
Apr 5, 2017 1:46:08 AM INFO ANSWER type: report 
start-time: 1491349366.360 
stop-time: 1491349567.269 
total: send=6428 recv=6407 fail=6407
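
The rate arithmetic in the test procedure can be sketched as follows. This assumes (implied by the figures above, not stated in the ticket) that every created checkpoint replica stays alive for exactly retention_time seconds, so the number of replicas alive at once is the request rate multiplied by the retention time. The function names are illustrative only.

```python
# Illustrative sketch only; the function names are hypothetical.
CPND_MAX_REPLICAS = 1000   # value quoted in the ticket
RETENTION_TIME_S = 30      # retention_time quoted in the ticket


def steady_state_replicas(rate_per_s, retention_s=RETENTION_TIME_S):
    """Replicas alive at once, assuming each lives exactly retention_s seconds."""
    return rate_per_s * retention_s


def exceeds_limit(rate_per_s, limit=CPND_MAX_REPLICAS):
    """True if the assumed steady-state replica count is above the limit."""
    return steady_state_replicas(rate_per_s) > limit


if __name__ == "__main__":
    for rate in (32, 33, 34):
        print(rate, steady_state_replicas(rate), exceeds_limit(rate))
```

Under this assumption only 34 ckpt/s crosses the limit (1020), while 32 and 33 ckpt/s stay below it (960 and 990).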

Changed test procedure, for investigation purposes
1. Start the test at 32 ckpt/s
32*30 = 960, which is < CPND_MAX_REPLICAS
Passed
Apr 6, 2017 2:56:27 AM INFO ANSWER type: report 
start-time: 1491439975.068 
stop-time: 1491440187.347 
total: send=6792 send-failed=0 recv=6780  
2. Then test 34 ckpt/s
Failed
3. Then test 33 ckpt/s
Failed
4. Then go back to 32 ckpt/s
Failed

From this experiment, we can see that once CPND_MAX_REPLICAS is exceeded, the ckpt 
service cannot recover. 
Note: the problem only occurs for Collocated Checkpoints with Asynchronous Update. 
The same test for Non-Collocated Checkpoints with Synchronous Update passes.
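
The `socketpair failed - Too many open files` syslog entry corresponds to the `EMFILE` error, which a process receives once it has reached its per-process file descriptor limit (`RLIMIT_NOFILE`). The sketch below reproduces that error in isolation using Python's standard `resource` and `socket` modules. It is not OpenSAF code, the function name is hypothetical, and the limit of 64 is an arbitrary value chosen to keep the demonstration small.

```python
import errno
import resource
import socket


def socketpairs_until_emfile(soft_limit=64):
    """Lower the soft RLIMIT_NOFILE, then create socket pairs until one fails.

    Returns (number of pairs created, errno of the failure).
    """
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    limit = soft_limit
    if old_soft != resource.RLIM_INFINITY:
        limit = min(soft_limit, old_soft)  # never raise the limit here
    resource.setrlimit(resource.RLIMIT_NOFILE, (limit, hard))
    pairs = []
    failure = None
    try:
        while failure is None:
            try:
                pairs.append(socket.socketpair())  # each pair uses two descriptors
            except OSError as exc:
                failure = exc.errno
    finally:
        # Release the descriptors and restore the original soft limit.
        for a, b in pairs:
            a.close()
            b.close()
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))
    return len(pairs), failure


if __name__ == "__main__":
    created, err = socketpairs_until_emfile()
    print(created, errno.errorcode.get(err))
```

Because each socket pair consumes two descriptors, a process can hold at most half as many pairs as its descriptor limit allows, and fewer once its other open files are counted.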

Test Contact: Li Suo


---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.
___
Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets


[tickets] [opensaf:tickets] #2415 CKPT node director failed to execute ckpt create request

2017-04-09 Thread Anders Widell
- **status**: review --> fixed
- **Comment**:

changeset:   8754:aa1ba7700cf2
user:        Anders Widell
date:        Mon Apr 10 08:32:59 2017 +0200
summary:     ckpt: Increase limit for number of file desciptors in CKPTND [#2415]

[staging:aa1ba7]
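
The changeset summary says only that the file descriptor limit for CKPTND was increased; this thread does not show how. As a generic illustration of raising a process's `RLIMIT_NOFILE` soft limit (not the contents of changeset 8754), a sketch using Python's standard `resource` module might look like this. The function name and the target of 4096 are arbitrary example choices.

```python
import resource


def raise_nofile_limit(target):
    """Raise the soft RLIMIT_NOFILE towards target, capped by the hard limit.

    Never lowers the current soft limit. Returns the resulting soft limit.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited, nothing to do
    if hard == resource.RLIM_INFINITY:
        wanted = target
    else:
        wanted = min(target, hard)  # an unprivileged process cannot exceed hard
    if wanted > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print(raise_nofile_limit(4096))
```

A service would typically do the equivalent at startup, or have the limit set by whatever launches it, before it begins creating descriptors.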




---

** [tickets:#2415] CKPT node director failed to execute ckpt create request**

**Status:** fixed
**Milestone:** 5.2.0
**Created:** Fri Apr 07, 2017 01:30 AM UTC by David Byrne
**Last Updated:** Sat Apr 08, 2017 09:10 AM UTC
**Owner:** A V Mahesh (AVM)






[tickets] [opensaf:tickets] #2415 CKPT node director failed to execute ckpt create request

2017-04-08 Thread Anders Widell
- **status**: assigned --> review
- **Milestone**: next --> 5.2.0



---

** [tickets:#2415] CKPT node director failed to execute ckpt create request**

**Status:** review
**Milestone:** 5.2.0
**Created:** Fri Apr 07, 2017 01:30 AM UTC by David Byrne
**Last Updated:** Fri Apr 07, 2017 03:55 AM UTC
**Owner:** A V Mahesh (AVM)






[tickets] [opensaf:tickets] #2415 CKPT node director failed to execute ckpt create request

2017-04-06 Thread A V Mahesh (AVM)
- **Milestone**: 5.2.0 --> next



---

** [tickets:#2415] CKPT node director failed to execute ckpt create request**

**Status:** assigned
**Milestone:** next
**Created:** Fri Apr 07, 2017 01:30 AM UTC by David Byrne
**Last Updated:** Fri Apr 07, 2017 03:54 AM UTC
**Owner:** A V Mahesh (AVM)






[tickets] [opensaf:tickets] #2415 CKPT node director failed to execute ckpt create request

2017-04-06 Thread A V Mahesh (AVM)
- **status**: unassigned --> assigned
- **assigned_to**: A V Mahesh (AVM)
- **Comment**:

Please provide the following:

- The bug is that, for Collocated Checkpoints, once the CPND_MAX_REPLICAS 
  limit is exceeded the ckpt service cannot recover, while Non-Collocated 
  Checkpoints with Synchronous Update work fine. Is that right?

- I hope you are using the 5.2.RC1 code?

changeset:   8700:654ad1d8c491
tag:         5.2.RC1
summary:     release: Update configure.ac for version 5.2.RC1

- Which two patches (ticket numbers) were removed, and why?



---

** [tickets:#2415] CKPT node director failed to execute ckpt create request**

**Status:** assigned
**Milestone:** 5.2.0
**Created:** Fri Apr 07, 2017 01:30 AM UTC by David Byrne
**Last Updated:** Fri Apr 07, 2017 01:30 AM UTC
**Owner:** A V Mahesh (AVM)






[tickets] [opensaf:tickets] #2415 CKPT node director failed to execute ckpt create request

2017-04-06 Thread David Byrne



---

** [tickets:#2415] CKPT node director failed to execute ckpt create request**

**Status:** unassigned
**Milestone:** 5.2.0
**Created:** Fri Apr 07, 2017 01:30 AM UTC by David Byrne
**Last Updated:** Fri Apr 07, 2017 01:30 AM UTC
**Owner:** nobody



