Re: [openstack-dev] [ceilometer] about workload partition

2017-12-04 Thread gordon chung


On 2017-12-03 10:30 PM, 李田清 wrote:
> 
> 
> On 2017-12-01 05:03 AM, 李田清 wrote:
>  >> Hello,
>  >>       we tested workload partitioning, and found it is much slower
>  >> than not using it.
>  >>       After some review, we found that after getting samples from
>  >> notifications.sample, ceilometer unpacks them and sends them one by
>  >> one to the ceilometer.pipe.* queues, which makes the consumer slow.
>  >> Right now, rabbit_qos_prefetch_count is set to 1. If we set it to 10,
>  >> the connection will be reset
> 
>  > currently, i believe rabbit_qos_prefetch_count will be set to whatever
>  > value you set batch_size to.
> You mean that in the past it could not be set to an arbitrary value?
> We tested newton, and found that with ~1k VMs, if we enable workload
> partitioning, the connection is reset regularly.
> 
>  >i'll give a two part answer but first i'll start with a question: what
>  >version of oslo.messaging do you have?
> 
> newton 5.10.2
> 

i just checked, oslo.messaging==5.10.2 has the offending patch that 
decreases performance significantly. as newton is EOL, it seems you have 
a few choices:
- revert to 5.10.1 but i'm not sure if that reintroduces old bugs
- manually patch your oslo.messaging with: 
https://review.openstack.org/#/c/524099
- pull in a newer oslo.messaging from another branch (fix is unreleased 
currently)
- manually patch notification agent to use multiple threads[1] (but this 
will possibly add some instability to your transforms)
- disable workload_partitioning

options 2 or 5 are probably the safest choices, depending on your load.
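
if you go the option-5 route, it is a single config switch (a sketch; 
double-check the option name against your newton ceilometer.conf 
reference):

```ini
# ceilometer.conf
[notification]
# turn off coordinated workload partitioning in the notification agent
workload_partitioning = False
```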

[1] 
https://github.com/openstack/ceilometer/blob/newton-eol/ceilometer/notification.py#L305-L307

cheers,

-- 
gord
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [ceilometer] about workload partition

2017-12-03 Thread 李田清
On 2017-12-01 05:03 AM, 李田清 wrote:
>> Hello,
>>   we test workload partition, and find it's much slower than not 
>> using it.
>>   After some review, we find that, after get samples from 
>> notifications.sample
>>   ceilometer unpacks them and sends them one by one to the pipe
>>   ceilometer.pipe.*, this will make the consumer slow. Right now, 
>> the rabbit_qos_prefetch_count to 1. If we sent it to 10, the connection 
>> will be reset

> currently, i believe rabbit_qos_prefetch_count will be set to whatever 
> value you set batch_size to.
You mean that in the past it could not be set to an arbitrary value?
We tested newton, and found that with ~1k VMs, if we enable workload
partitioning, the connection is reset regularly.

>>   regularly. Under this pos, the consumer will be very slow with
>> workload partitioning. If we do not use workload partitioning, the
>> messages can all be consumed. If we use it, the messages in the pipe
>> queues pile up more and more.

> what is "pos"? i'm not sure it means the same thing to both of us... or 
> well i guess it could :)
sorry, that was a typo; it should be qos.

>>  Maybe workload partitioning is not a good choice right now? Or any
>> suggestions?
>> 

>i'll give a two part answer but first i'll start with a question: what 
>version of oslo.messaging do you have?

newton 5.10.2

>i see a performance drop as well, and the reason is an 
>oslo.messaging bug introduced into the master/pike/ocata releases. more 
>details can be found here: 
>https://bugs.launchpad.net/oslo.messaging/+bug/1734788. we're working on 
>backporting it. we've also done some work regarding performance/memory 
>to shrink memory usage of partitioning in master[1].

>with that said, there are only two scenarios where you should have 
>partitioning enabled. if you have multiple notification agents AND:

>1. you have transformations in your pipeline
>2. you want to batch efficiently to gnocchi

>if you don't have workload partitioning on, your transform metrics will 
>probably be wrong or missing values. it also won't batch to gnocchi so 
>you'll see a lot more http requests there.

>so yes, you do have a choice to disable it, but the above is your tradeoff.

>[1] 
>https://review.openstack.org/#/q/topic:plugin+(status:open+OR+status:merged)+project:openstack/ceilometer

>cheers,

>-- 
>gord


Re: [openstack-dev] [ceilometer] about workload partition

2017-12-01 Thread gordon chung


On 2017-12-01 05:03 AM, 李田清 wrote:
> Hello,
>       we tested workload partitioning, and found it is much slower than
> not using it.
>       After some review, we found that after getting samples from
> notifications.sample, ceilometer unpacks them and sends them one by one
> to the ceilometer.pipe.* queues, which makes the consumer slow. Right
> now, rabbit_qos_prefetch_count is set to 1. If we set it to 10, the
> connection will be reset

currently, i believe rabbit_qos_prefetch_count will be set to whatever 
value you set batch_size to.
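
for context, a sketch of the related knobs in ceilometer.conf (option 
names as i recall them for the notification agent; verify against your 
release's config reference):

```ini
# ceilometer.conf
[notification]
# number of samples the agent requests per batch; with recent
# oslo.messaging, rabbit_qos_prefetch_count follows this value
batch_size = 100
# how long (seconds) to wait to fill a batch before processing anyway
batch_timeout = 5
```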

>       regularly. Under this pos, the consumer will be very slow with
> workload partitioning. If we do not use workload partitioning, the
> messages can all be consumed. If we use it, the messages in the pipe
> queues pile up more and more.

what is "pos"? i'm not sure it means the same thing to both of us... or 
well i guess it could :)

>      Maybe workload partitioning is not a good choice right now? Or any
> suggestions?
> 

i'll give a two part answer but first i'll start with a question: what 
version of oslo.messaging do you have?

i see a performance drop as well, and the reason is an 
oslo.messaging bug introduced into the master/pike/ocata releases. more 
details can be found here: 
https://bugs.launchpad.net/oslo.messaging/+bug/1734788. we're working on 
backporting it. we've also done some work regarding performance/memory 
to shrink memory usage of partitioning in master[1].

with that said, there are only two scenarios where you should have 
partitioning enabled. if you have multiple notification agents AND:

1. you have transformations in your pipeline
2. you want to batch efficiently to gnocchi
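
as a reminder, scenario 1 means a pipeline.yaml with a transformers 
section, e.g. the stock cpu rate_of_change sink (a sketch; adapt meters 
and scale to your deployment):

```yaml
sources:
    - name: cpu_source
      interval: 600
      meters:
          - "cpu"
      sinks:
          - cpu_sink
sinks:
    - name: cpu_sink
      transformers:
          - name: "rate_of_change"
            parameters:
                target:
                    name: "cpu_util"
                    unit: "%"
                    type: "gauge"
                    scale: "100.0 / (10**9 * (resource_metadata.cpu_number or 1))"
      publishers:
          - gnocchi://
```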

if you don't have workload partitioning on, your transform metrics will 
probably be wrong or missing values. it also won't batch to gnocchi so 
you'll see a lot more http requests there.
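
to make the batching point concrete, a toy sketch (hypothetical helper 
names, not the actual gnocchi client api) of how batching cuts the 
number of http requests:

```python
# Toy comparison: one request per sample vs. one request per batch.
# `send` stands in for an HTTP POST to gnocchi (hypothetical helper).

def post_per_sample(samples, send):
    """One request per sample -- roughly what you get without batching."""
    for s in samples:
        send([s])

def post_batched(samples, send, batch_size=100):
    """One request per batch of up to batch_size samples."""
    samples = list(samples)
    for i in range(0, len(samples), batch_size):
        send(samples[i:i + batch_size])

requests = []
post_per_sample(range(1000), requests.append)
unbatched = len(requests)   # 1000 requests

requests = []
post_batched(range(1000), requests.append, batch_size=100)
batched = len(requests)     # 10 requests
```

same 1000 samples, two orders of magnitude fewer round trips to the 
gnocchi api.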

so yes, you do have a choice to disable it, but the above is your tradeoff.

[1] 
https://review.openstack.org/#/q/topic:plugin+(status:open+OR+status:merged)+project:openstack/ceilometer

cheers,

-- 
gord


[openstack-dev] [ceilometer] about workload partition

2017-12-01 Thread 李田清
Hello,
 we tested workload partitioning, and found it is much slower than not using it.
 After some review, we found that after getting samples from
notifications.sample, ceilometer unpacks them and sends them one by one to the
ceilometer.pipe.* queues, which makes the consumer slow. Right now,
rabbit_qos_prefetch_count is set to 1. If we set it to 10, the connection will
be reset regularly. Under this pos, the consumer will be very slow with
workload partitioning. If we do not use workload partitioning, the messages
can all be consumed. If we use it, the messages in the pipe queues pile up
more and more.
 Maybe workload partitioning is not a good choice right now? Or any suggestions?
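
The effect of the prefetch setting can be sketched with a toy model (a 
simplification: assume the consumer acknowledges a full window of 
messages before the broker delivers the next one):

```python
# Toy model of rabbit_qos_prefetch_count: the broker delivers at most
# `prefetch_count` unacknowledged messages at a time, so a larger window
# means fewer broker round trips to drain the same queue.

def delivery_rounds(n_messages, prefetch_count):
    """Number of broker->consumer round trips needed to drain the queue."""
    rounds = 0
    while n_messages > 0:
        n_messages -= min(prefetch_count, n_messages)
        rounds += 1
    return rounds
```

With prefetch_count = 1 every message costs a full round trip (1000 
messages, 1000 rounds), which is one reason the pipe consumers fall 
behind.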