When you have this issue, can you restart storm-manager or storm-nimbus
to see if Storm can load normally? We restarted one of these each time
and it worked as a workaround.
On 26/06/2017 12:12, Rui Abreu wrote:
Don't have the logs anymore, I'm afraid. There wasn't much information
that could help with debugging, other than it appeared to be spending
some time trying to redistribute the existing topologies among the
600 workers (300 supervisors × 2 workers each).
I'll get back to you if I'm able to reproduce the problem.
On 25 June 2017 at 20:42, Erik Weathers <[email protected]> wrote:
300 is not that many supervisors from my perspective. Have any of
you experiencing this issue dug in to see what's slowing it down?
- Erik
On Fri, Jun 23, 2017 at 1:48 AM Rui Abreu <[email protected]> wrote:
On 22 June 2017 at 19:35, Erik Weathers <[email protected]> wrote:
Sounds like you've hit some scaling bottleneck with
Storm. I've never tried running nearly that number of
topologies.
It might be an inefficiency with the number of API calls,
or with the interactions between the Nimbus and ZooKeeper.
A similar situation occurs on Storm 1.1.0 when the number of
supervisors approaches 300 (2 workers per supervisor). Storm
UI fails to load because the /api/v1/{supervisor,topology,nimbus}
calls time out.
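In case it helps with digging in, here is a rough sketch for timing
those calls individually to see which one is the slow one (plain
Python, standard library only). The UI host/port and the exact
/summary paths are guesses on my side, so adjust them for your
deployment:

import time
import urllib.request

# Assumed Storm UI address -- change to your UI host/port.
UI = "http://localhost:8080"

# The /summary suffixes are assumptions; use whatever paths your UI
# is actually requesting.
for path in ("/api/v1/supervisor/summary",
             "/api/v1/topology/summary",
             "/api/v1/nimbus/summary"):
    start = time.time()
    try:
        with urllib.request.urlopen(UI + path, timeout=120) as resp:
            resp.read()
            status = resp.status
        print("%-30s %7.1f s  (HTTP %d)" % (path, time.time() - start, status))
    except Exception as exc:  # timeouts, connection or HTTP errors
        print("%-30s failed after %.1f s: %s" % (path, time.time() - start, exc))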
--
My THALES email is [email protected].
+33 (0)5 62 88 84 40
Thales Services, Toulouse, France