Re: [lng-odp] [PATCH 3/3] linux-gen: sched: optimize group scheduling

2017-04-24 Thread Savolainen, Petri (Nokia - FI/Espoo)
Ping.

> -----Original Message-----
> From: Wallen, Carl (Nokia - FI/Espoo)
> Sent: Thursday, April 20, 2017 3:11 PM
> To: lng-odp@lists.linaro.org
> Cc: Savolainen, Petri (Nokia - FI/Espoo)
> Subject: RE: [lng-odp] [PATCH 3/3] linux-gen: sched: optimize group
> scheduling
> 
> For the entire patch set:
> Reviewed-and-tested-by: Carl Wallén 

Re: [lng-odp] [PATCH 3/3] linux-gen: sched: optimize group scheduling

2017-04-20 Thread Wallen, Carl (Nokia - FI/Espoo)
For the entire patch set:
Reviewed-and-tested-by: Carl Wallén 

-----Original Message-----
From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of 
Savolainen, Petri (Nokia - FI/Espoo)
Sent: Thursday, April 20, 2017 9:38 AM
To: lng-odp@lists.linaro.org
Subject: Re: [lng-odp] [PATCH 3/3] linux-gen: sched: optimize group scheduling

Ping. Fixes bug https://bugs.linaro.org/show_bug.cgi?id=2945




Re: [lng-odp] [PATCH 3/3] linux-gen: sched: optimize group scheduling

2017-04-19 Thread Savolainen, Petri (Nokia - FI/Espoo)
Ping. Fixes bug https://bugs.linaro.org/show_bug.cgi?id=2945




Re: [lng-odp] [PATCH 3/3] linux-gen: sched: optimize group scheduling

2017-04-12 Thread Savolainen, Petri (Nokia - FI/Espoo)
Ping.

This patch set removes the non-deterministic latency, lower QoS and potential
queue starvation that are caused by this code ...

-			if (grp > ODP_SCHED_GROUP_ALL &&
-			    !odp_thrmask_isset(&sched->sched_grp[grp].mask,
-					       sched_local.thr)) {
-				/* This thread is not eligible for work from
-				 * this queue, so continue scheduling it.
-				 */
-				ring_enq(ring, PRIO_QUEUE_MASK, qi);
-
-				i++;
-				id++;
-				continue;
-			}


... which sends a queue of the "wrong" group back to the tail of the priority
queue. If, e.g., tens of threads keep sending a queue back and only one thread
would accept it, the queue is very likely served much less often than it
should be.
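
To make the push-back effect concrete, here is a toy model (not ODP code). It
assumes that threads poll strictly round-robin, that each poll dequeues one
queue index and puts it back at the tail, and that exactly one thread belongs
to the restricted group. The names toy_ring_t, RESTRICTED_QI and
bounces_until_served() are made up for illustration; only the idea of
enqueueing a rejected queue back to the tail comes from the removed code.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE     64 /* hypothetical size, power of two */
#define RESTRICTED_QI 0  /* queue index that only one thread may serve */

typedef struct {
	uint32_t head;
	uint32_t tail;
	int      data[RING_SIZE];
} toy_ring_t;

static void toy_ring_enq(toy_ring_t *r, int qi)
{
	assert(r->tail - r->head < RING_SIZE);
	r->data[r->tail++ % RING_SIZE] = qi;
}

/* Returns -1 when the ring is empty */
static int toy_ring_deq(toy_ring_t *r)
{
	if (r->head == r->tail)
		return -1;
	return r->data[r->head++ % RING_SIZE];
}

/* Counts how many times the restricted queue is pushed back to the tail
 * before the eligible thread dequeues it. Returns -1 if that does not
 * happen within max_polls polls. */
static int bounces_until_served(int per_group, int num_thr, int eligible_thr,
				int num_other, int max_polls)
{
	/* ring[0]: group "all", ring[1]: restricted group (per_group only) */
	toy_ring_t ring[2] = { { 0 }, { 0 } };
	int bounces = 0;
	int poll, i;

	assert(num_other + 1 <= RING_SIZE);
	toy_ring_enq(&ring[per_group ? 1 : 0], RESTRICTED_QI);
	for (i = 1; i <= num_other; i++)
		toy_ring_enq(&ring[0], i);

	for (poll = 0; poll < max_polls; poll++) {
		int thr = poll % num_thr;
		/* Simplification: with per-group rings the eligible thread
		 * polls only the ring of its own group. */
		int idx = (per_group && thr == eligible_thr) ? 1 : 0;
		int qi = toy_ring_deq(&ring[idx]);

		if (qi < 0)
			continue;
		if (qi == RESTRICTED_QI) {
			if (thr == eligible_thr)
				return bounces;
			bounces++; /* wrong group: back to the tail */
		}
		toy_ring_enq(&ring[idx], qi);
	}
	return -1;
}
```

In this model, bounces_until_served(0, 8, 7, 2, 1000) (shared ring, 8 threads,
thread 7 eligible, two other queues) reports 5 push-backs, and with three
other queues the strict round-robin order never lets thread 7 see the queue
at all. With per_group set to 1 the result is 0, since no ineligible thread
ever touches the restricted ring. Real threads do not poll in lockstep, so
treat the numbers as an illustration only.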

Improved latency can be seen already with the new l2fwd -g option. 

-Petri



[lng-odp] [PATCH 3/3] linux-gen: sched: optimize group scheduling

2017-04-06 Thread Petri Savolainen
Use separate priority queues for different groups. Sharing
the same priority queue over multiple groups caused multiple
issues:
* latency and ordering issues when threads push back
  events (from wrong groups) to the tail of the priority queue
* unnecessary contention (scaling issues) when threads belong
  to different groups

Lowered the maximum number of groups from 256 to 32 (in the default
configuration) to limit memory usage of priority queues. This should
be enough for most users.

Signed-off-by: Petri Savolainen 
---
 platform/linux-generic/odp_schedule.c | 284 +++---
 1 file changed, 195 insertions(+), 89 deletions(-)
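
A back-of-the-envelope check of the memory argument in the commit message,
as a hedged sketch: prio_q gains a leading NUM_SCHED_GRPS dimension, so its
size grows linearly with the number of groups. QUEUES_PER_PRIO and the group
counts are taken from the patch. NUM_PRIO, PRIO_QUEUE_RING_SIZE and the layout
of toy_prio_queue_t are assumptions made for illustration and may differ from
the real odp_schedule.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUES_PER_PRIO      4    /* from the patch */
#define NUM_PRIO             8    /* assumed value */
#define PRIO_QUEUE_RING_SIZE 1024 /* assumed value */

/* Stand-in for prio_queue_t; the real layout is not shown in this patch */
typedef struct {
	uint32_t head;
	uint32_t tail;
	uint32_t queue_index[PRIO_QUEUE_RING_SIZE];
} toy_prio_queue_t;

/* Bytes needed by an array shaped like
 * prio_q[num_grps][NUM_PRIO][QUEUES_PER_PRIO] */
static size_t prio_q_bytes(size_t num_grps)
{
	return num_grps * NUM_PRIO * QUEUES_PER_PRIO *
	       sizeof(toy_prio_queue_t);
}
```

Here prio_q_bytes(1) corresponds to the old two-dimensional array,
prio_q_bytes(32) to the new default, and prio_q_bytes(256) to what keeping 256
groups would have cost. The last is 8 times the new default whatever the
assumed sizes are.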

diff --git a/platform/linux-generic/odp_schedule.c b/platform/linux-generic/odp_schedule.c
index e7079b9..f366e7e 100644
--- a/platform/linux-generic/odp_schedule.c
+++ b/platform/linux-generic/odp_schedule.c
@@ -34,7 +34,7 @@ ODP_STATIC_ASSERT((ODP_SCHED_PRIO_NORMAL > 0) &&
  "normal_prio_is_not_between_highest_and_lowest");
 
 /* Number of scheduling groups */
-#define NUM_SCHED_GRPS 256
+#define NUM_SCHED_GRPS 32
 
 /* Priority queues per priority */
 #define QUEUES_PER_PRIO  4
@@ -163,7 +163,11 @@ typedef struct {
 		ordered_stash_t stash[MAX_ORDERED_STASH];
 	} ordered;
 
+	uint32_t grp_epoch;
+	int num_grp;
+	uint8_t grp[NUM_SCHED_GRPS];
 	uint8_t weight_tbl[WEIGHT_TBL_SIZE];
+	uint8_t grp_weight[WEIGHT_TBL_SIZE];
 
 } sched_local_t;
 
@@ -199,7 +203,7 @@ typedef struct {
 	pri_mask_t pri_mask[NUM_PRIO];
 	odp_spinlock_t mask_lock;
 
-	prio_queue_t   prio_q[NUM_PRIO][QUEUES_PER_PRIO];
+	prio_queue_t   prio_q[NUM_SCHED_GRPS][NUM_PRIO][QUEUES_PER_PRIO];
 
 	odp_spinlock_t poll_cmd_lock;
 	/* Number of commands in a command queue */
@@ -214,8 +218,10 @@ typedef struct {
 	odp_shm_t  shm;
 	uint32_t   pri_count[NUM_PRIO][QUEUES_PER_PRIO];
 
-	odp_spinlock_t grp_lock;
-	odp_thrmask_t mask_all;
+	odp_thrmask_t    mask_all;
+	odp_spinlock_t   grp_lock;
+	odp_atomic_u32_t grp_epoch;
+
 	struct {
 		char   name[ODP_SCHED_GROUP_NAME_LEN];
 		odp_thrmask_t  mask;
@@ -223,6 +229,7 @@ typedef struct {
 	} sched_grp[NUM_SCHED_GRPS];
 
 	struct {
+		int grp;
 		int prio;
 		int queue_per_prio;
 	} queue[ODP_CONFIG_QUEUES];
@@ -273,7 +280,7 @@ static void sched_local_init(void)
 static int schedule_init_global(void)
 {
 	odp_shm_t shm;
-	int i, j;
+	int i, j, grp;
 
 	ODP_DBG("Schedule init ... ");
 
@@ -293,15 +300,20 @@ static int schedule_init_global(void)
 	sched->shm  = shm;
 	odp_spinlock_init(&sched->mask_lock);
 
-	for (i = 0; i < NUM_PRIO; i++) {
-		for (j = 0; j < QUEUES_PER_PRIO; j++) {
-			int k;
+	for (grp = 0; grp < NUM_SCHED_GRPS; grp++) {
+		for (i = 0; i < NUM_PRIO; i++) {
+			for (j = 0; j < QUEUES_PER_PRIO; j++) {
+				prio_queue_t *prio_q;
+				int k;
 
-			ring_init(&sched->prio_q[i][j].ring);
+				prio_q = &sched->prio_q[grp][i][j];
+				ring_init(&prio_q->ring);
 
-			for (k = 0; k < PRIO_QUEUE_RING_SIZE; k++)
-				sched->prio_q[i][j].queue_index[k] =
-				PRIO_QUEUE_EMPTY;
+				for (k = 0; k < PRIO_QUEUE_RING_SIZE; k++) {
+					prio_q->queue_index[k] =
+					PRIO_QUEUE_EMPTY;
+				}
+			}
 		}
 	}
 
@@ -317,12 +329,17 @@ static int schedule_init_global(void)
 		sched->pktio_cmd[i].cmd_index = PKTIO_CMD_FREE;
 
 	odp_spinlock_init(&sched->grp_lock);
+	odp_atomic_init_u32(&sched->grp_epoch, 0);
 
 	for (i = 0; i < NUM_SCHED_GRPS; i++) {
 		memset(sched->sched_grp[i].name, 0, ODP_SCHED_GROUP_NAME_LEN);
 		odp_thrmask_zero(&sched->sched_grp[i].mask);
 	}
 
+	sched->sched_grp[ODP_SCHED_GROUP_ALL].allocated = 1;
+	sched->sched_grp[ODP_SCHED_GROUP_WORKER].allocated = 1;
+	sched->sched_grp[ODP_SCHED_GROUP_CONTROL].allocated = 1;
+
 	odp_thrmask_setall(&sched->mask_all);
 
 	ODP_DBG("done\n");
@@ -330,29 +347,38 @@ static int schedule_init_global(void)
 	return 0;
 }
 
+static inline void queue_destroy_finalize(uint32_t qi)
+{
+	sched_cb_queue_destroy_finalize(qi);
+}
+
 static int schedule_term_global(void)
 {
 	int ret = 0;
 	int rc = 0;
-	int i, j;
+	int i, j, grp;
 
-	for (i = 0; i < NUM_PRIO; i++) {
-		for (j = 0; j < QUEUES_PER_PRIO; j++) {
-