Hi,

Thanks for the detailed answer.
Please see my comments inline.

Regards,

On 07/01/2014 04:28 PM, Ihar Hrachyshka wrote:

On 30/06/14 21:34, Alexei Kornienko wrote:
Hello,


My understanding is that your analysis is mostly based on running
a profiler against the code. Network operations can be bottlenecked
in other places.

You compare a 'simple script using kombu' with a 'script using
oslo.messaging'. You don't compare a script using oslo.messaging
before the refactoring with one after it. The latter would show
whether the refactoring was worth the effort. Your test shows that
oslo.messaging performance sucks, but it's not certain that the
hotspots you've revealed, once fixed, will give a huge boost.

My concern is that it may turn out that once all the effort to
refactor the code is done, we won't see a major difference. So we
need base numbers, and performance tests would be a great help
here.


It's really sad for me to see so little faith in what I'm saying.
The test I did using the plain kombu driver was needed exactly to
check that the network is not the bottleneck for messaging
performance. If you don't believe my performance analysis, we
could ask someone else to do their own research and provide the
results.
Technology is not about faith. :)

First, let me make it clear I'm *not* against refactoring or anything
that will improve performance. I'm just a bit skeptical, but hopefully
you'll be able to show everyone I'm wrong, and then the change will
occur. :)

To add more velocity to your effort, strong arguments need to be
presented. To facilitate that, I would start by adding performance
tests that would give us some basis for discussing the changes
proposed later.
Please see below for a detailed answer about the performance tests implementation.
It explains a bit why it's hard to present arguments that would be strong enough for you.
I can run performance tests locally, but that's not enough for the community.

In addition, I've provided some links to the existing implementation pointing at the places that IMHO cause bottlenecks. From my point of view that code is doing obviously stupid things (like closing/opening sockets for each message sent). That alone is enough reason for me to rewrite it, even without additional proof that it's wrong.
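To make the point about per-message sockets concrete, here is a rough sketch of the kind of comparison I mean. It's not oslo.messaging code, just plain kombu against a local RabbitMQ broker; the broker URL, queue name and message count are arbitrary, so treat it as an illustration only:

import time

from kombu import Connection

BROKER_URL = 'amqp://guest:guest@localhost:5672//'
N = 1000
PAYLOAD = {'data': 'x' * 128}


def publish_with_reconnect():
    # Anti-pattern: open and tear down a connection (and its socket)
    # for every single message.
    for _ in range(N):
        with Connection(BROKER_URL) as conn:
            queue = conn.SimpleQueue('perf_test')
            queue.put(PAYLOAD)
            queue.close()


def publish_with_reuse():
    # Keep one connection open for the whole run.
    with Connection(BROKER_URL) as conn:
        queue = conn.SimpleQueue('perf_test')
        for _ in range(N):
            queue.put(PAYLOAD)
        queue.close()


for func in (publish_with_reconnect, publish_with_reuse):
    start = time.time()
    func()
    print('%s: %.1f msg/s' % (func.__name__, N / (time.time() - start)))

Numbers will obviously vary per setup; the point is only to show the shape of the comparison.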

Then, describing the proposed details in a spec will give more exposure to
your ideas. At the moment, I see a general will to enhance the library,
but not enough detail on how to achieve this. A specification can make
us think not about the burden of change, which obviously makes people
skeptical about a rewrite-all approach, but about the specific technical issues.
I agree that we should start with a spec. However, instead of a spec of the needed changes, I would prefer a spec describing the needed functionality of the library (which may differ from the existing functionality). Using such a spec we could decide what is needed and what has to be removed to achieve it.

The problem with the refactoring I'm planning is that it's not a
minor refactoring that can be applied in one patch; it's the
whole library rewritten from scratch.
You can still maintain a long sequence of patches, like we did when we
migrated neutron to oslo.messaging (it was like ~25 separate pieces).
Taking into account possible gate issues, I would like to avoid a long series of patches, since they won't be able to land at the same time and rebasing will become a huge pain. If we decide to start working on a 2.0 API/implementation, I think a 2.0a topic branch would be much better.

The existing messaging code was written a long, long time ago (in a
galaxy far, far away, maybe?) and was copy-pasted directly from nova.
It was not built as a library and was never intended to be used
outside of nova. Some parts of it cannot even work properly because
it was not designed to work with drivers like zeromq (the matchmaker
stuff).
oslo.messaging is NOT the code you can find in the oslo-incubator rpc
module. It was heavily rewritten to expose a new, cleaner API. This is,
btw, one of the reasons migration to this new library is so painful. It
was painful to move to oslo.messaging, so we need a clear need for
change before switching to yet another library.
The API has indeed changed, but the general implementation details and processing flow go way back to 2011 and the nova code (for example, the general Publisher/Consumer implementation in impl_rabbit).
That's the code I'm talking about.

Refactoring as I see it will do the opposite: it will keep as much of the API intact as possible but change the internals to make them more efficient (that's why I call it refactoring). So the 2.0 version might be (partially?) backwards compatible and migration won't be such a pain.

The reason I raised this question on the mailing list was to get
some agreement about the future plans for oslo.messaging development
and to start working on it in coordination with the community. For
now I don't see any action plan emerging from it. I would like to
see us bring more constructive ideas about what should be done.

If you think the first action should be profiling, let's discuss how
it should be implemented (because it works just fine for me on my
local PC). I guess we'll need to define some basic scenarios that
would show us the overall performance of the library.
Let's start with basic send/receive throughput, for tiny and large
messages, multiple consumers, etc.
This would be a great start, but it's quite hard to test basic send/receive since the existing code is written around rpc.
I don't see a way to send a message without the complex rpc code being involved.
That's why I propose to start with a refactoring that would separate the rpc code from the basic messaging code.
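Just to show what I mean: today the smallest benchmark you can write against the public API already drags in the whole rpc machinery. A rough sketch (assuming oslo.messaging 1.x-era imports and a local RabbitMQ broker; the topic, server and method names are invented for the example):

import threading
import time

from oslo.config import cfg    # oslo_config.cfg in newer releases
from oslo import messaging     # oslo_messaging in newer releases


class Endpoint(object):
    def echo(self, ctxt, data):
        return data


transport = messaging.get_transport(
    cfg.CONF, 'rabbit://guest:guest@localhost:5672/')
target = messaging.Target(topic='perf_test', server='perf_server')

# Even a "basic send/receive" measurement needs a full rpc server...
server = messaging.get_rpc_server(transport, target, [Endpoint()],
                                  executor='blocking')
thread = threading.Thread(target=server.start)
thread.daemon = True
thread.start()
time.sleep(1)  # crude: wait for the consumer to be set up

# ...and a full rpc client just to push bytes through the driver.
client = messaging.RPCClient(transport, messaging.Target(topic='perf_test'))
n, payload = 1000, 'x' * 1024
start = time.time()
for _ in range(n):
    client.call({}, 'echo', data=payload)
elapsed = time.time() - start
print('%d round trips, %.1f calls/s' % (n, n / elapsed))

server.stop()

Even this trivial round-trip scenario needs a dispatcher, an executor and a reply queue behind the scenes, which is exactly the coupling I'd like the refactoring to break.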

There are a lot of questions that should be answered to implement
this: where would such tests run (jenkins, a local PC, a devstack VM)?
I would expect them to be exposed to jenkins through 'tox'. We can then
set up a separate job to run them and compare against a baseline [TBD:
what *is* the baseline?] to make sure we don't introduce performance
regressions.
Such tests cannot be exposed through 'tox' since they require some environment setup (rabbitmq-server, a zeromq matchmaker, etc.), and such a setup is way out of scope for tox.
Because of this we need to find some other way to run such tests.
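One possible middle ground: keep such tests in the normal test tree, but have them skip themselves when the required broker isn't reachable, so tox runs stay green on machines without rabbitmq-server. A sketch (testtools-based; the host, port and class/test names are just an example):

import socket

import testtools

RABBIT_HOST, RABBIT_PORT = 'localhost', 5672


def broker_reachable(host, port, timeout=1.0):
    """Return True if something is listening on host:port."""
    try:
        sock = socket.create_connection((host, port), timeout)
    except socket.error:
        return False
    sock.close()
    return True


class RabbitPerfTest(testtools.TestCase):

    def setUp(self):
        super(RabbitPerfTest, self).setUp()
        if not broker_reachable(RABBIT_HOST, RABBIT_PORT):
            self.skipTest('no RabbitMQ broker on %s:%d'
                          % (RABBIT_HOST, RABBIT_PORT))

    def test_publish_rate(self):
        # ... run the actual throughput measurement here ...
        pass

That keeps the tests discoverable by the normal runner without pretending that tox can provision a broker.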

What should such scenarios look like? How do we measure performance
(cProfile, etc.)?
I think we're interested in message rate, not CPU utilization.
The problem here is that it's hard to find the bottleneck behind a message rate without deeper analysis (CPU utilization, etc.).
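That said, the two views are easy to combine: report the message rate as the headline number and capture a profile of the same run so there is something to dig into when the rate regresses. A sketch (send_messages here is a placeholder for whatever send loop a scenario uses):

import cProfile
import pstats
import time


def measure(send_messages, n=10000):
    # Headline number: messages per second.
    profiler = cProfile.Profile()
    start = time.time()
    profiler.enable()
    send_messages(n)
    profiler.disable()
    elapsed = time.time() - start

    print('rate: %.1f msg/s' % (n / elapsed))

    # Secondary output: top CPU hotspots of the same run.
    stats = pstats.Stats(profiler)
    stats.sort_stats('cumulative').print_stats(10)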

How do we collect results? How do we analyze results to find
bottlenecks? etc.

Another option would be to spend some of my free time implementing
the refactoring mentioned above (as I see it) and show you the results
of performance testing compared with the existing code.
This approach generally doesn't work beyond a PoC. OpenStack is a
complex project, and we need to stick to procedures: spec review,
then coding, all upstream, with no private branches outside the
common infrastructure.
I agree with such an approach, but it also has many drawbacks. I don't know a clean way to communicate design drafts and implementation details without actually writing the code. And if you already have working code, all this spec review becomes quite a useless burden. If you know a way to solve this problem (creating a high/mid-level architecture design), please share it with me so we can use it.

The only problem with such an approach is that my code won't be
oslo.messaging and it won't be accepted by the community. It may be
a drop-in base for v2.0, but I'm afraid this won't be acceptable
either.

That's not how things happen here. If you want your work to be
consumed by the community, you need to work with it.
That's what I'm trying to do :)

Regards, Alexei Kornienko


2014-06-30 17:51 GMT+03:00 Gordon Sim <g...@redhat.com>:

On 06/30/2014 12:22 PM, Ihar Hrachyshka wrote:

Alexei Kornienko wrote:

Some performance tests could be introduced, but they would be more
like functional tests, since they require the setup of an actual
messaging server (rabbit, etc.).


Yes. I think we already have some. E.g.
tests/drivers/test_impl_qpid.py attempts to use a local Qpid
server (falling back to a fake server if it's not available).


I always get failures when there is a real qpidd service listening
on the expected port. Does anyone else see this?



_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
