[GitHub] [incubator-druid] jihoonson commented on issue #7828: Supervisor marks succeeded replicas as failed too aggressively

2019-06-04 Thread GitBox
jihoonson commented on issue #7828: Supervisor marks succeeded replicas as 
failed too aggressively
URL: 
https://github.com/apache/incubator-druid/issues/7828#issuecomment-498882002
 
 
   I'm removing this from 0.15.0 since this issue has existed for a while, which 
means it's not a regression. Instead, I will mention this problem in the release 
notes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [incubator-druid] jihoonson commented on issue #7828: Supervisor marks succeeded replicas as failed too aggressively

2019-06-04 Thread GitBox
jihoonson commented on issue #7828: Supervisor marks succeeded replicas as 
failed too aggressively
URL: 
https://github.com/apache/incubator-druid/issues/7828#issuecomment-498826492
 
 
   I have been looking at this more deeply and noticed that this is what's happening.
   
   - The task finished and unregistered its chatHandler.
   - But the process that was running the task was not terminated immediately. 
For example, `unannouncePropagationDelay` can block the process from 
terminating for a while.
   - The task status is updated only after the task process terminates 
(`ForkingTaskRunner`).
   - So, while the task process was waiting to terminate, the task status 
in the metadata store was still `RUNNING` even though the task itself had 
already finished.
   - The supervisor killed the task because the shutdown request returned a 
`Can't find chatHandler` error.
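
The sequence above can be sketched as a small simulation. This is only an illustrative model, not Druid code: `TaskProcess`, `MetadataStore`, `runner_sync`, `supervisor_stop`, and `simulate` are hypothetical names, and the "kill on chat error" rule is an assumption taken from the last bullet.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    RUNNING = "RUNNING"
    SUCCESS = "SUCCESS"
    FAILED = "FAILED"


@dataclass
class TaskProcess:
    """Hypothetical stand-in for a forked task process."""
    chat_handler_registered: bool = True
    work_finished: bool = False
    process_alive: bool = True

    def finish_work(self) -> None:
        # The task finishes and unregisters its chat handler, but the
        # process stays alive (e.g. while an unannounce delay elapses).
        self.work_finished = True
        self.chat_handler_registered = False

    def exit(self) -> None:
        self.process_alive = False

    def handle_shutdown_request(self) -> None:
        if not self.chat_handler_registered:
            raise LookupError("Can't find chatHandler")


class MetadataStore:
    """Hypothetical stand-in for the task status in the metadata store."""

    def __init__(self) -> None:
        self.status = Status.RUNNING


def runner_sync(process: TaskProcess, store: MetadataStore) -> None:
    # The status is updated only after the task process has terminated.
    if not process.process_alive and store.status is Status.RUNNING:
        store.status = Status.SUCCESS if process.work_finished else Status.FAILED


def supervisor_stop(process: TaskProcess, store: MetadataStore) -> Status:
    # The supervisor only contacts tasks it still believes are running.
    if store.status is not Status.RUNNING:
        return store.status
    try:
        process.handle_shutdown_request()
    except LookupError:
        # Assumed rule: a chat error makes the supervisor kill the task.
        process.exit()
        store.status = Status.FAILED
    return store.status


def simulate(supervisor_acts_before_exit: bool) -> Status:
    process, store = TaskProcess(), MetadataStore()
    process.finish_work()
    if not supervisor_acts_before_exit:
        process.exit()
        runner_sync(process, store)
    return supervisor_stop(process, store)


if __name__ == "__main__":
    print(simulate(supervisor_acts_before_exit=True))
    print(simulate(supervisor_acts_before_exit=False))
```

In this model the only difference between the two runs is whether the supervisor sends its shutdown request during the window between "work finished" and "process exited". A longer window makes the first ordering more likely.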





[GitHub] [incubator-druid] jihoonson commented on issue #7828: Supervisor marks succeeded replicas as failed too aggressively

2019-06-04 Thread GitBox
jihoonson commented on issue #7828: Supervisor marks succeeded replicas as 
failed too aggressively
URL: 
https://github.com/apache/incubator-druid/issues/7828#issuecomment-498758398
 
 
   I have the same feeling, but I'm not 100% sure what made this more frequent. 
Maybe https://github.com/apache/incubator-druid/pull/7234 is somewhat related. And 
`unannouncePropagationDelay` is what makes this happen every time.

