Github user revans2 commented on the issue:
https://github.com/apache/storm/pull/2723
@danny0405 I have spent some time looking at your patch. I have not found
any issues with the code itself, but I could easily have missed something. My
biggest problem is that I just cannot get past the backwards incompatibility
imposed by NeedsFullTopologiesScheduler. I also don't want to merge in a
performance improvement without any actual numbers to back it up.
To get the performance numbers, you really only want to know how long it
takes to schedule. You don't actually need to run a full cluster. The
simplest way to make that happen is to fake out the heartbeats for the
supervisors and the workers. You could do it as a standalone application, but
it might be nice to have a bit more control over it so you can simulate workers
that don't come up, or workers that crash.
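To make the idea concrete, here is a rough sketch of what faking the heartbeats and timing a single scheduling pass could look like. Nothing in it is a Storm class: FakeHeartbeatBench, Heartbeat, fakeHeartbeats and timeSchedulingNanos are hypothetical names, and the scheduling pass is a callback you would wire to the real scheduling path.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.Consumer;

/** Hypothetical benchmark harness; these types are illustrative, not Storm APIs. */
public class FakeHeartbeatBench {

    /** A fabricated worker heartbeat: the slot it claims and whether it reports as alive. */
    public static final class Heartbeat {
        public final String supervisorId;
        public final int port;
        public final boolean alive;

        public Heartbeat(String supervisorId, int port, boolean alive) {
            this.supervisorId = supervisorId;
            this.port = port;
            this.alive = alive;
        }
    }

    /**
     * Builds one heartbeat per slot. Each worker is independently marked dead with
     * probability failureRate, standing in for workers that never came up or crashed.
     * The seed keeps runs repeatable so timings can be compared.
     */
    public static List<Heartbeat> fakeHeartbeats(int supervisors, int slotsPerSupervisor,
                                                 double failureRate, long seed) {
        Random rng = new Random(seed);
        List<Heartbeat> beats = new ArrayList<>();
        for (int s = 0; s < supervisors; s++) {
            for (int p = 0; p < slotsPerSupervisor; p++) {
                boolean alive = rng.nextDouble() >= failureRate;
                // Port numbers are arbitrary illustrative values.
                beats.add(new Heartbeat("sup-" + s, 6700 + p, alive));
            }
        }
        return beats;
    }

    /** Times one scheduling pass (supplied by the caller) and returns elapsed nanoseconds. */
    public static long timeSchedulingNanos(List<Heartbeat> beats,
                                           Consumer<List<Heartbeat>> schedulePass) {
        long start = System.nanoTime();
        schedulePass.accept(beats);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        List<Heartbeat> beats = fakeHeartbeats(100, 4, 0.05, 42L);
        // Placeholder pass: a real run would invoke the scheduler here instead.
        long nanos = timeSchedulingNanos(beats, b -> b.stream().filter(h -> h.alive).count());
        System.out.println(beats.size() + " fake heartbeats, pass took " + nanos + " ns");
    }
}
```

For real numbers you would want warm-up iterations and many repetitions; a single System.nanoTime() measurement like this is only a starting point.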
Once you have that working I really would like to see a breakdown of how
much time is being spent computing the different parts that go into creating
the new Cluster.
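For the breakdown, something as simple as a named phase timer wrapped around each step would do. The sketch below is only an illustration: PhaseTimer is a hypothetical helper, and the phase names in main are made up rather than taken from the actual code that builds the Cluster.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical helper that accumulates wall-clock time per named phase. */
public class PhaseTimer {
    // LinkedHashMap keeps phases in the order they were first timed.
    private final Map<String, Long> nanosByPhase = new LinkedHashMap<>();

    /** Runs the work, adds its elapsed time to the named phase, and returns its result. */
    public <T> T time(String phase, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            nanosByPhase.merge(phase, System.nanoTime() - start, Long::sum);
        }
    }

    public Map<String, Long> nanosByPhase() {
        return nanosByPhase;
    }

    /** One line per phase with its share of the total measured time. */
    public String report() {
        long total = 0;
        for (long n : nanosByPhase.values()) {
            total += n;
        }
        long divisor = Math.max(1, total); // avoid dividing by zero on very fast runs
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : nanosByPhase.entrySet()) {
            sb.append(String.format("%-24s %12d ns %6.1f%%%n",
                e.getKey(), e.getValue(), 100.0 * e.getValue() / divisor));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        PhaseTimer timer = new PhaseTimer();
        // Phase names are placeholders for whatever steps feed the new Cluster.
        timer.time("read-topologies", () -> busyWork(200_000));
        timer.time("read-assignments", () -> busyWork(100_000));
        timer.time("build-cluster", () -> busyWork(50_000));
        System.out.print(timer.report());
    }

    private static long busyWork(int iterations) {
        long acc = 0;
        for (int i = 0; i < iterations; i++) {
            acc += i % 7;
        }
        return acc;
    }
}
```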
As for NeedsFullTopologiesScheduler, I would either like to see this switched
so schedulers opt into getting less information, or even better have us cache
the fully computed inputs to Cluster and just update the cache incrementally
instead of leaving things out.
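Roughly what I mean by the caching option, as a sketch only: keep the computed per-topology input around, and recompute an entry only when that topology's version changes. IncrementalInputCache, the String topology id and the long version number are all assumptions for illustration; how a change is actually detected in Nimbus is a separate question.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical cache: recomputes a topology's scheduler input only when its version changes. */
public class IncrementalInputCache<V> {

    private static final class Entry<V> {
        final long version;
        final V value;

        Entry(long version, V value) {
            this.version = version;
            this.value = value;
        }
    }

    private final Map<String, Entry<V>> entries = new HashMap<>();
    private final Function<String, V> compute;
    private int recomputations = 0;

    /** compute is the expensive step that derives the full input for one topology id. */
    public IncrementalInputCache(Function<String, V> compute) {
        this.compute = compute;
    }

    /**
     * Takes the current topology id to version mapping and returns the full set of
     * inputs. Removed topologies are evicted, new or changed ones are recomputed,
     * and unchanged ones are served from the cache.
     */
    public Map<String, V> refresh(Map<String, Long> currentVersions) {
        entries.keySet().retainAll(currentVersions.keySet());
        Map<String, V> full = new HashMap<>();
        for (Map.Entry<String, Long> e : currentVersions.entrySet()) {
            Entry<V> cached = entries.get(e.getKey());
            if (cached == null || cached.version != e.getValue()) {
                cached = new Entry<>(e.getValue(), compute.apply(e.getKey()));
                entries.put(e.getKey(), cached);
                recomputations++;
            }
            full.put(e.getKey(), cached.value);
        }
        return full;
    }

    /** Number of times the expensive compute step has run so far. */
    public int recomputations() {
        return recomputations;
    }
}
```

The point is that every scheduler still receives the complete input on each pass, so nothing has to opt in or out, while the cost of producing it scales with what changed rather than with the size of the cluster.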
---