Remove black list feature from Chukwa Agent to Chukwa Collector communication
-----------------------------------------------------------------------------

                 Key: HADOOP-4805
                 URL: https://issues.apache.org/jira/browse/HADOOP-4805
             Project: Hadoop Core
          Issue Type: Bug
         Environment: Redhat EL 5, Java 6
            Reporter: Eric Yang
            Assignee: Eric Yang


Recently, a new load-balancing algorithm was added to improve Chukwa agent to 
Chukwa collector communication.  The design was to send one HTTP POST per 
collector and rotate through the list of collectors to balance load across 
them.  When a collector fails to respond, it is blacklisted for 5 minutes.  If 
no collector responds, the agent sleeps for a random 1-5 minutes.  
Unfortunately, this algorithm causes problems on slower machines, which end up 
blacklisting all collectors and sleeping indefinitely.  This ticket restores 
the original design: the agent shuffles the collector list, then makes a best 
effort to keep sending HTTP POSTs to the same collector until an error occurs, 
at which point it moves on through the shuffled list of collectors.  
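
The selection logic described above might be sketched roughly as follows. This 
is an illustration only, not the actual Chukwa code: the class name 
StickyCollectorSelector, its methods, and the collector URLs are hypothetical, 
and wrapping around to the start of the list after the last collector is an 
assumption, since the ticket does not say what happens at the end of the list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Chukwa: picks a collector for the agent.
class StickyCollectorSelector {
    private final List<String> collectors;
    private int index = 0;

    StickyCollectorSelector(List<String> collectorUrls, Random rng) {
        if (collectorUrls.isEmpty()) {
            throw new IllegalArgumentException("at least one collector is required");
        }
        // Copy, then shuffle once so different agents start on different collectors.
        this.collectors = new ArrayList<>(collectorUrls);
        Collections.shuffle(this.collectors, rng);
    }

    // The collector to POST to; unchanged until onError() is called.
    String current() {
        return collectors.get(index);
    }

    // Called when a POST fails: advance to the next collector in the shuffled
    // list. Wrap-around is an assumption; no collector is ever blacklisted.
    String onError() {
        index = (index + 1) % collectors.size();
        return collectors.get(index);
    }
}
```

An agent loop would call current() before every POST and onError() only when 
a POST fails, so a slow machine keeps retrying across the list rather than 
excluding every collector and sleeping.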


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
