[opnfv-tech-discuss] [doctor] For the issue about the notification time is large than 1S

2016-09-08 Thread dong . wenjuan
Hi doctors,

Regarding the issue of the notification time being larger than 1 s:
I checked the logs and found that most of the time, over 80%, is spent 
between the inspector receiving the event and nova-api beginning to handle 
reset_state. For example, of a total notification time of 2.26 s, the 
inspector's processing takes 1.983 s.
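
As a quick check of the figures quoted above, the inspector's share of the
total can be computed directly. The variable names are illustrative only;
the two numbers are the ones given in the example.

```python
total_s = 2.26       # total notification time from the example
inspector_s = 1.983  # portion spent in the inspector

share = inspector_s / total_s
remaining_s = total_s - inspector_s

print(f"inspector share: {share:.1%}")
print(f"remaining time:  {remaining_s:.3f} s")
```

This gives a share of roughly 88%, consistent with "over 80%", and leaves
about 0.28 s for everything outside the inspector.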

In the test inspector script, we look up all the VMs under all tenants and 
then set the state of every VM to error.
We need to improve this performance, but it cannot be done in a short time.

Shall we extend the limit from 1 s to 3 s, or go back to not failing the 
test on the notification time calculation, so that functest turns green?
We would work on the performance improvement in the meantime and then 
change it back.

Any suggestions will be welcome. Thank you~

BR,
dwj



董文娟   Wenjuan Dong
控制器四部 / 无线产品   Controller Dept Ⅳ. / Wireless Product Operation
 


上海市浦东新区碧波路889号中兴通讯D3
D3, ZTE, No. 889, Bibo Rd.
T: +86 021 85922M: +86 13661996389
E: dong.wenj...@zte.com.cn
www.ztedevice.com


___
opnfv-tech-discuss mailing list
opnfv-tech-discuss@lists.opnfv.org
https://lists.opnfv.org/mailman/listinfo/opnfv-tech-discuss


Re: [opnfv-tech-discuss] [doctor] For the issue about the notification time is large than 1S

2016-09-08 Thread Juvonen, Tomi (Nokia - FI/Espoo)
Hi,

This is why the inspector should be topology aware, meaning here that it should 
already know which VMs belong to the host. Also, in the real world, a received 
event in most cases needs to be analyzed before any action is taken on a VM or 
host, because we need to know what is affected by the event.

As for the failing test, I think it is normal to temporarily make it pass 
until the bug fix is in. Anyhow, this is a sample inspector, so how complex do 
we want its code to be?

Br,
Tomi



Re: [opnfv-tech-discuss] [doctor] For the issue about the notification time is large than 1S

2016-09-09 Thread Yujun Zhang
I agree with modifying the pass criteria in functest. It is reasonable to
consider the test *functionally* OK if the test scripts exit normally.

If functest supports it, it would be nice to have a warning when the
performance is below the bar, but this would not be a test failure.
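
A minimal sketch of the pass criteria proposed here, assuming a purely
hypothetical `evaluate_run` helper (it is not a functest API): the run
passes whenever the scripts exit normally, and a slow notification only
produces a warning.

```python
def evaluate_run(exit_code, notification_time_s, bar_s=1.0):
    """Return (passed, warnings) for one test run.

    `passed` reflects only whether the test scripts exited normally.
    A notification time above `bar_s` adds a warning, not a failure.
    """
    passed = exit_code == 0
    warnings = []
    if passed and notification_time_s > bar_s:
        warnings.append(
            "notification time %.3f s exceeds the %.1f s bar"
            % (notification_time_s, bar_s)
        )
    return passed, warnings


print(evaluate_run(0, 2.26))   # passes, with one warning
print(evaluate_run(0, 0.25))   # passes, no warning
print(evaluate_run(1, 0.25))   # fails: scripts did not exit normally
```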

Meanwhile, I have some questions about the performance deviation between
different pods.

   1. Which inspector is deployed on the pods that achieved the 200ms - 300ms
   result?
   2. If the deviation is linked to enumerating all the hosts, then what
   configuration is applied on these pods?



Re: [opnfv-tech-discuss] [doctor] For the issue about the notification time is large than 1S

2016-09-09 Thread Carlos Goncalves
As I’ve already commented in the patch you submitted to Gerrit [1]: no, we 
should not extend the accepted max notification time from 1s to anything higher 
than that.

Currently our Doctor tests are passing on Apex in all available PODs as well as 
in local environments (e.g. devstack). For Apex, the notification time is 
around 250ms, which is much lower than the 1 second maximum. If we cannot get a 
green light in some other scenario/installer, we simply do not claim support 
for it.

Carlos

[1] https://gerrit.opnfv.org/gerrit/#/c/20627



Re: [opnfv-tech-discuss] [doctor] For the issue about the notification time is large than 1S

2016-09-09 Thread Souville, Bertrand
My understanding is that the Fuel team/experts are now investigating the issue.
Let’s give them a few more days…

 

Bertrand

 



Re: [opnfv-tech-discuss] [doctor] For the issue about the notification time is large than 1S

2016-09-11 Thread Yujun Zhang
Hi, Carlos

According to the data collected by @Wenjuan, it seems that when the test
fails, most of the time is consumed by the inspector API
disable_compute_host [1]:

    if event_type == 'compute.host.down':
        inspector.disable_compute_host(hostname)

I checked the source code, and it looks like it iterates through the server
list to set each server to error state. So I wonder if the delay is related
to the total number of servers in the test environment.
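
A minimal sketch of the pattern described above, to show why the cost would
grow with the number of servers. `FakeCompute` and its methods are
hypothetical stand-ins, not the novaclient API and not the actual code in
inspector.py; the point is only that one listing call plus one call per
server is made in sequence.

```python
class FakeCompute:
    """Hypothetical stand-in for a compute API client (counts calls)."""

    def __init__(self, servers_by_host):
        self.servers_by_host = servers_by_host
        self.calls = 0
        self.states = {}

    def list_servers(self, host):
        self.calls += 1
        return list(self.servers_by_host.get(host, []))

    def reset_state(self, server, state):
        self.calls += 1
        self.states[server] = state


def disable_compute_host(compute, hostname):
    """Set every server on `hostname` to error, one call per server."""
    for server in compute.list_servers(hostname):
        compute.reset_state(server, 'error')


compute = FakeCompute({'host1': ['vm1', 'vm2', 'vm3'], 'host2': ['vm4']})
disable_compute_host(compute, 'host1')
print(compute.calls)   # one list call plus one reset call per server
print(compute.states)
```

With N servers on the host this makes N + 1 sequential round trips, so the
time spent before the last reset is issued scales with N.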

Could you please provide the log from the Apex environment so we can dig
further and find the root cause?

BTW: as @Tomi pointed out, the inspector should be topology aware, knowing
all the VMs on the host. So I think we could create the server list in the
initialization phase and use the saved list when processing the
`compute.host.down` event. This would be a better emulation of a real
inspector.
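
The idea above could be sketched roughly as follows. `FakeCompute` and
`TopologyAwareInspector` are hypothetical names used for illustration, not
existing Doctor or novaclient classes; the sketch assumes the set of hosts
is known at startup.

```python
class FakeCompute:
    """Hypothetical stand-in for a compute API client."""

    def __init__(self, servers_by_host):
        self.servers_by_host = servers_by_host
        self.list_calls = 0
        self.states = {}

    def list_servers(self, host):
        self.list_calls += 1
        return list(self.servers_by_host.get(host, []))

    def reset_state(self, server, state):
        self.states[server] = state


class TopologyAwareInspector:
    def __init__(self, compute, hostnames):
        self.compute = compute
        # Build the host -> servers map once, in the initialization phase.
        self.servers = {h: compute.list_servers(h) for h in hostnames}

    def handle_event(self, event_type, hostname):
        if event_type == 'compute.host.down':
            # No listing call here: use the saved list.
            for server in self.servers.get(hostname, []):
                self.compute.reset_state(server, 'error')


compute = FakeCompute({'host1': ['vm1', 'vm2'], 'host2': ['vm3']})
inspector = TopologyAwareInspector(compute, ['host1', 'host2'])
calls_after_init = compute.list_calls
inspector.handle_event('compute.host.down', 'host1')
print(compute.list_calls == calls_after_init)  # no new listing on the event
print(compute.states)
```

One trade-off of this design is that the saved list goes stale when VMs are
created, deleted or migrated after startup, so a real inspector would need
some way to keep it current.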

[1] https://git.opnfv.org/cgit/doctor/tree/tests/inspector.py#n63





Re: [opnfv-tech-discuss] [doctor] For the issue about the notification time is large than 1S

2016-09-12 Thread Juvonen, Tomi (Nokia - FI/Espoo)
Hi,

So, all in all: something surely needs to be fixed in the Fuel installation, 
but the Inspector also needs to work fast enough at scale. Optimally, this 
means it needs to be aware of the VMs running on each host, so that no extra 
time is spent figuring that out when a failure occurs. Also, currently there 
exists only this very simple test scenario, where you only need to be aware of 
the VMs on a single host. It is something totally different when you need to 
detect something like a switch failure that also causes problems on one or 
more hosts (more time consuming). Anyhow, those cases will come when Vitrage 
is also integrated as the Inspector.
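
To illustrate the difference between the single-host case and a switch
failure, here is a minimal sketch with a made-up in-memory topology. The
maps, the `affected_vms` function and the `switch.down` event name are all
assumptions for illustration; only `compute.host.down` appears in this
thread.

```python
# Hypothetical topology: hosts behind each switch, VMs on each host.
HOSTS_BY_SWITCH = {'switch1': ['host1', 'host2']}
VMS_BY_HOST = {'host1': ['vm1', 'vm2'], 'host2': ['vm3'], 'host3': ['vm4']}


def affected_vms(event_type, resource):
    """Map an event on a resource to the VMs it affects."""
    if event_type == 'compute.host.down':
        hosts = [resource]
    elif event_type == 'switch.down':  # hypothetical event name
        hosts = HOSTS_BY_SWITCH.get(resource, [])
    else:
        hosts = []
    return [vm for host in hosts for vm in VMS_BY_HOST.get(host, [])]


print(affected_vms('compute.host.down', 'host1'))  # ['vm1', 'vm2']
print(affected_vms('switch.down', 'switch1'))      # ['vm1', 'vm2', 'vm3']
```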

Br,
Tomi


Re: [opnfv-tech-discuss] [doctor] For the issue about the notification time is large than 1S

2016-09-12 Thread Yujun Zhang
Following up also on the discussion in
https://gerrit.opnfv.org/gerrit/#/c/20877/

It just reminded me of something. Is there a guideline from the Doctor
project on how the inspector should behave to achieve the targeted
performance? E.g. be topology aware, set all affected hosts to error state
simultaneously instead of one by one...
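
A minimal sketch of the "simultaneously instead of one by one" idea, using a
thread pool from the Python standard library. `FakeCompute` is a hypothetical
client whose calls sleep to simulate an API round trip; the worker count and
delay are arbitrary assumptions.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class FakeCompute:
    """Hypothetical compute client; each call takes a fixed delay."""

    def __init__(self, delay_s=0.05):
        self.delay_s = delay_s
        self.states = {}
        self._lock = threading.Lock()

    def reset_state(self, server, state):
        time.sleep(self.delay_s)  # simulated API round trip
        with self._lock:
            self.states[server] = state


def reset_all_concurrently(compute, servers, state='error', workers=8):
    """Issue reset_state for all servers in parallel, not one by one."""
    if not servers:
        return
    with ThreadPoolExecutor(max_workers=min(workers, len(servers))) as pool:
        # list() waits for completion and re-raises any worker exception.
        list(pool.map(lambda s: compute.reset_state(s, state), servers))


compute = FakeCompute()
servers = ['vm%d' % i for i in range(8)]
reset_all_concurrently(compute, servers)
print(compute.states)
```

With enough workers the wall-clock time approaches that of a single call
rather than the sum of all calls, at the cost of putting more simultaneous
load on the API server.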

Although we should keep the sample inspector a simple model, it would be
good to demonstrate the essential design guidelines that the Doctor project
cares about. I think it could be a topic for release D.

As for release C, I think we just need to get the blocking issues resolved
and keep the rest as it is. What do you think, doctors?

--
Yujun


Re: [opnfv-tech-discuss] [doctor] For the issue about the notification time is large than 1S

2016-09-14 Thread Carlos Goncalves
Hi Yujun,

Sorry for the delay...

I don’t have a local Apex environment to extract and provide you with the 
information you requested, sorry.

Carlos
