Hi,

Please see some minor comments inline.
Do you think we can schedule some time to discuss this topic at one of the upcoming meetings? We could come up with some kind of summary and an action plan to start working on.

Regards,

On 07/01/2014 05:52 PM, Ihar Hrachyshka wrote:

On 01/07/14 15:55, Alexei Kornienko wrote:
Hi,

Thanks for detailed answer. Please see my comments inline.

Regards,

On 07/01/2014 04:28 PM, Ihar Hrachyshka wrote: On 30/06/14 21:34,
Alexei Kornienko wrote:
Hello,


My understanding is that your analysis is mostly based on
running a profiler against the code. Network operations can
be bottlenecked in other places.

You compare a 'simple script using kombu' with a 'script using
oslo.messaging'. You don't compare a script using oslo.messaging
before the refactoring with one after it. The latter would show
whether the refactoring was worth the effort. Your test shows that
oslo.messaging performance sucks, but it's not certain that the
hotspots you've revealed, once fixed, will yield a huge boost.

My concern is that it may turn out that once all the effort to
refactor the code is done, we won't see a major difference. So we
need baseline numbers, and performance tests would be a great help
here.


It's really sad for me to see so little faith in what I'm saying.
The test I did using the plain kombu driver was needed exactly to
check that the network is not the bottleneck for messaging
performance. If you don't believe my performance analysis, we could
ask someone else to do their own research and provide the results.
Technology is not about faith. :)

First, let me make it clear I'm *not* against refactoring or
anything that will improve performance. I'm just a bit skeptical,
but hopefully you'll be able to show everyone I'm wrong, and then
the change will occur. :)

To add velocity to your effort, strong arguments should be present.
To facilitate that, I would start by adding performance tests that
would give us a basis for discussing the changes proposed later.
Please see below for a detailed answer about the performance tests
implementation. It explains a bit why it's hard to present arguments
that would be strong enough for you. I may run performance tests
locally, but that's not enough for the community.
Yes, that's why shipping some tests ready to run with oslo.messaging
can help. Science is about reproducibility, right? ;)

In addition, I've provided some links to the existing
implementation, pointing at the places that IMHO cause bottlenecks.
From my point of view that code is doing obviously stupid things
(like closing/opening sockets for each message sent).
That indeed sounds bad.

That is enough for me to rewrite it even without additional
proofs that it's wrong.
[Full disclosure: I'm not as involved into oslo.messaging internals as
you probably are, so I may speak out dumb things.]

I wonder whether there are easier ways to fix that particular issue
without rewriting everything from scratch. Like, provide a pool of
connections and make send() functions use it instead of creating new
connections (?)
I've tried to find a way to fix that without big changes, but unfortunately I've failed to do so. The problem I see is that the connection pool is defined and used in one layer of the library, while the problem is in another. To fix these issues we need to change several layers of code, and that code is shared between two drivers (rabbit and qpid). Because of this, it seems really hard to produce logically complete, working patches that would let us move in the proper direction without a big refactoring of the driver structure.
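A rough sketch of the pooling idea under discussion (all names here are hypothetical, not the actual oslo.messaging API): a driver-agnostic pool hands out long-lived connections so that send() stops opening and closing a socket per message.

```python
import queue


class ConnectionPool:
    """Hand out reusable connections instead of opening one per send().

    `connection_factory` is a hypothetical callable that creates a real
    broker connection (e.g. a kombu Connection); it is injected so the
    pool itself stays driver-agnostic.
    """

    def __init__(self, connection_factory, size=4):
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(connection_factory())

    def acquire(self):
        # Blocks until a connection is free, so the pool size caps
        # concurrent broker connections.
        return self._pool.get()

    def release(self, conn):
        self._pool.put(conn)


def send(pool, message):
    """Borrow a pooled connection for the duration of one send."""
    conn = pool.acquire()
    try:
        conn.publish(message)  # stand-in for the driver's real publish call
    finally:
        pool.release(conn)
```

The point is only that the pool would have to live at the same layer as send(), which is exactly the cross-layer change described above.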

Then, describing proposed details in a spec will give more exposure
to your ideas. At the moment, I see general will to enhance the
library, but not enough details on how to achieve this.
Specification can make us think not about the burden of change that
obviously makes people skeptic about rewrite-all approach, but
about specific technical issues.
I agree that we should start with a spec. However, instead of a
spec of the needed changes, I would prefer a spec describing the
needed functionality of the library (which may differ from the
existing functionality).
Meaning, breaking API, again?
It's not about breaking the API; it's about making it more logical and independent. Right now it's not clear to me which API classes are used and how they are used. A lot of driver details leak outside the API, which makes it hard to improve a driver without changing the API. What I would like to see is a clear definition of what the library should provide, and an API interface that it should implement. It may be a little bit Java-like: the API should be defined and frozen, and anyone could propose their own driver implementation using kombu/qpid/zeromq, or pigeons and trained dolphins, to deliver messages.

This would allow us to change drivers without touching the API and test their performance separately.
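The "frozen API, pluggable drivers" idea might be sketched like this (the class and method names are illustrative only, not the actual oslo.messaging interfaces):

```python
import abc


class BaseDriver(abc.ABC):
    """Frozen transport interface; kombu/qpid/zeromq (or carrier
    pigeons) would each provide their own implementation."""

    @abc.abstractmethod
    def send(self, target, message):
        """Deliver one message to a target."""

    @abc.abstractmethod
    def listen(self, target):
        """Return an iterable of messages received for a target."""


class InMemoryDriver(BaseDriver):
    """Trivial driver, here only to show that implementations can be
    swapped (and benchmarked) without touching the public API."""

    def __init__(self):
        self._queues = {}

    def send(self, target, message):
        self._queues.setdefault(target, []).append(message)

    def listen(self, target):
        return iter(self._queues.get(target, []))
```

With the interface frozen, each driver could then be performance-tested in isolation behind the same contract.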

Using such a spec we could decide what is needed and what should
be removed to achieve what we need.
The problem with the refactoring I'm planning is that it's not
a minor refactoring that can be applied in one patch; it's the
whole library rewritten from scratch.
You can still maintain a long sequence of patches, like we did when
we migrated neutron to oslo.messaging (it was like ~25 separate
pieces).
Taking into account possible gate issues, I would like to avoid a
long series of patches, since they won't be able to land at the
same time and rebasing will become a huge pain.
But you're the one proposing the change, so you need to carry that burden.
Having a branch for an everything-rewritten version of the library
means that each bug fix or improvement to the library will have to
be tracked by every developer in two branches with significantly
different code. I think it's more honest to put the rebase pain on the
people who rework the code than on everyone else.

If we decide to start working on 2.0 API/implementation I think a
topic branch 2.0a would be much better.
I respectfully disagree. See above.

The existing messaging code was written a long, long time ago (in a
galaxy far, far away, maybe?) and was copy-pasted directly
from nova. It was not built as a library and was never
intended to be used outside of nova. Some parts of it cannot
even work normally, because it was not designed to work with
drivers like zeromq (the matchmaker stuff).
oslo.messaging is NOT the code you can find in the oslo-incubator rpc
module. It was hugely rewritten to expose a new, cleaner API. This
is, btw, one of the reasons migration to the new library is so
painful. It was painful to move to oslo.messaging, so we need a clear
need for change before switching to yet another library.
The API has indeed changed, but the general implementation details and
processing flow go way back to 2011 and the nova code (for example,
the general Publisher/Consumer implementation in impl_rabbit).
That's the code I'm talking about.
Roger.

The refactoring as I see it will do the opposite: it will keep as
much of the API intact as possible but change the internals to make
them more efficient (that's why I call it refactoring). So the 2.0
version might be (partially?) backwards compatible, and migration
won't be such a pain.
That sounds promising. Though see my concern on your suggestion to
revisit the scope of the library above.

The reason I raised this question on the mailing list was
to get some agreement about future plans for oslo.messaging
development and to start working on it in coordination with the
community. For now I don't see any action plan emerging from
it. I would like to see us bring more constructive ideas
about what should be done.

If you think the first action should be profiling, let's
discuss how it should be implemented (since it works just fine
for me on my local PC). I guess we'll need to define some
basic scenarios that would show us the overall performance of the
library.
Let's start from basic send/receive throughput, for tiny and large
messages, multiple consumers etc.
This would be a great start, but it's quite hard to test basic
send/receive since the existing code is written around rpc. I don't
see a way to send a message without complex rpc code being
involved. That's why I propose starting with a refactoring that
would separate the rpc code from the basic messaging code.
Again, removing RPC code from your tests won't mean the library as a
whole will get higher performance. That said, refactoring that would
result in clear separation of layers can be beneficial even without
major performance boost. But that means that we probably should not
put performance concerns as the main reason for rework. I would set
'clean code' as the primary goal.
Yes, we can consider clean code the primary goal. At the same time, I hope we'll achieve both goals at once, since clean code will also run faster.

There are a lot of questions that should be answered to
implement this: Where would such tests run (jenkins, a local
PC, a devstack VM)?
I would expect it to be exposed to jenkins thru 'tox'. We can then
set up a separate job to run them and compare against a baseline
[TBD: what *is* the baseline?] to make sure we don't introduce
performance regressions.
Such tests cannot be exposed thru 'tox', since they require some
environment setup (rabbitmq-server, a zeromq matchmaker, etc.).
Such setup is way out of scope for tox. Because of this, we should
find some other way to run these tests.
You may just assume the server is already set up and available thru a
common socket.
Assuming that something is already set up is not an option. If we add a new env to tox, I assume anyone should be able to run it locally and get the same results they would get from jenkins.
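One possible compromise, sketched as a hypothetical tox.ini fragment (the env name, test path, and variable name are all made up): keep the env in tox for reproducibility, but require whoever runs it to supply the broker explicitly.

```ini
[testenv:perf]
# Hypothetical perf env. The broker is supplied by the caller,
# locally or in CI, e.g.:
#   TRANSPORT_URL=rabbit://guest:guest@localhost:5672// tox -e perf
passenv = TRANSPORT_URL
commands = python -m pytest perf_tests/ {posargs}
```

That way jenkins and a local PC would share the same entry point, and the only difference is where the broker comes from.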

What should such scenarios look like? How do we measure
performance (cProfile, etc.)?
I think we're interested in message rate, not CPU utilization.
The problem here is that it's hard to find a bottleneck in message
rate without deeper analysis (CPU utilization, etc.).
Tests are not meant to find hotspots; they can be used to avoid
performance regressions, or to support claims about alleged
performance gains added by a patch (or refactoring).

How do we collect results? How do we analyze results to find
bottlenecks? etc.

Another option would be to spend some of my free time
implementing the mentioned refactoring (as I see it) and show you
the results of performance testing compared with the existing
code.
This approach generally doesn't work beyond a PoC. OpenStack is a
complex project, and we need to stick to procedures: spec review,
then coding, all in upstream, with no private branches outside the
common infrastructure.
I agree with such an approach, but it also has many drawbacks. I
don't know a clean way to communicate design drafts and
implementation details without actually writing the code.
You may still have a PoC; you just should not consider it final
code. It's there to support the spec's case.

And if you already have working code, all this spec review
becomes quite a useless burden.
It adds context to your spec. And be prepared that lots of the code
you write before or after doing the spec work *will* be rewritten. :)

If you know a way to solve this problem (creating a high/mid-level
architecture design), please share it with me so we can
use it.
The only problem with such an approach is that my code won't be
oslo.messaging, and it won't be accepted by the community. It may
be a drop-in base for v2.0, but I'm afraid that won't be
acceptable either.

Things don't happen that way here. If you want your work to be
consumed by the community, you need to work with the community.
That's what I'm trying to do :)
OK. BTW you can also join Oslo team at #openstack-oslo to discuss your
case and whatnot.

Cheers,
/Ihar


_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

