Prakash Khemani created HBASE-5860:
--------------------------------------

             Summary: splitlogmanager should not unnecessarily resubmit tasks 
when zk unavailable
                 Key: HBASE-5860
                 URL: https://issues.apache.org/jira/browse/HBASE-5860
             Project: HBase
          Issue Type: Improvement
            Reporter: Prakash Khemani
            Assignee: Prakash Khemani


(Doesn't really impact the run time or correctness of log splitting)

Say the master has lost its connection to zk. splitlogmanager's timeoutmanager 
will see that all the tasks that were submitted are still unassigned, and it 
will resubmit those tasks (i.e. create dummy znodes).

splitlogmanager should realize that the tasks are unassigned only because 
their znodes have not been created yet, and should not resubmit them.
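
A minimal sketch of the idea, not the actual SplitLogManager code: track 
whether a task's znode creation has been confirmed, and let the timeout check 
resubmit only tasks whose znodes exist but that no worker has picked up. The 
class name SplitTaskTracker, the State enum, and all method names here are 
hypothetical and used only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from HBase.
final class SplitTaskTracker {
    // PENDING_CREATE: async znode create was issued but not yet confirmed.
    // UNASSIGNED:     znode exists, no worker has picked the task up.
    // ASSIGNED:       a worker owns the task.
    enum State { PENDING_CREATE, UNASSIGNED, ASSIGNED }

    private final Map<String, State> tasks = new LinkedHashMap<>();

    // Called when the manager enqueues a task and issues the async create.
    void submit(String path) {
        tasks.put(path, State.PENDING_CREATE);
    }

    // Called from the create callback once zk reports success.
    void onCreateSucceeded(String path) {
        tasks.replace(path, State.PENDING_CREATE, State.UNASSIGNED);
    }

    // Called when a worker is seen to own the task.
    void onAssigned(String path) {
        tasks.computeIfPresent(path, (p, s) -> State.ASSIGNED);
    }

    // What the timeout check would resubmit: only tasks whose znode is
    // known to exist. Tasks still waiting on the create (for example
    // while zk is unreachable) are skipped.
    List<String> tasksToResubmit() {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, State> e : tasks.entrySet()) {
            if (e.getValue() == State.UNASSIGNED) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

With this bookkeeping, a loss of the zk connection right after submission 
leaves every task in PENDING_CREATE, so the timeout check finds nothing to 
resubmit until the create callbacks succeed.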


2012-04-20 13:11:20,516 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
dead splitlog worker msgstore295.snc4.facebook.com,60020,1334948757026
2012-04-20 13:11:20,517 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: 
Scheduling batch of logs to split
2012-04-20 13:11:20,517 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
started splitting logs in 
[hdfs://msgstore215.snc4.facebook.com:9000/MSGSTORE215-SNC4-HBASE/.logs/msgstore295.snc4.facebook.com,60020,1334948757026-splitting]
2012-04-20 13:11:20,565 INFO org.apache.zookeeper.ClientCnxn: Opening socket 
connection to server msgstore235.snc4.facebook.com/10.30.222.186:2181
2012-04-20 13:11:20,566 INFO org.apache.zookeeper.ClientCnxn: Socket connection 
established to msgstore235.snc4.facebook.com/10.30.222.186:2181, initiating 
session
2012-04-20 13:11:20,575 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
total tasks = 4 unassigned = 4
2012-04-20 13:11:20,576 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: 
resubmitting unassigned task(s) after timeout
2012-04-20 13:11:21,577 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: 
resubmitting unassigned task(s) after timeout
2012-04-20 13:11:21,683 INFO org.apache.zookeeper.ClientCnxn: Unable to read 
additional data from server sessionid 0x36ccb0f8010002, likely server has 
closed socket, closing socket connection and attempting reconnect
2012-04-20 13:11:21,683 INFO org.apache.zookeeper.ClientCnxn: Unable to read 
additional data from server sessionid 0x136ccb0f4890000, likely server has 
closed socket, closing socket connection and attempting reconnect
2012-04-20 13:11:21,786 WARN 
org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback: create rc 
=CONNECTIONLOSS for 
/hbase/splitlog/hdfs%3A%2F%2Fmsgstore215.snc4.facebook.com%3A9000%2FMSGSTORE215-SNC4-HBASE%2F.logs%2Fmsgstore295.snc4.facebook.com%2C60020%2C1334948757026-splitting%2F10.30.251.186%253A60020.1334951586677
 retry=3
2012-04-20 13:11:21,786 WARN 
org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback: create rc 
=CONNECTIONLOSS for 
/hbase/splitlog/hdfs%3A%2F%2Fmsgstore215.snc4.facebook.com%3A9000%2FMSGSTORE215-SNC4-HBASE%2F.logs%2Fmsgstore295.snc4.facebook.com%2C60020%2C1334948757026-splitting%2F10.30.251.186%253A60020.1334951920332
 retry=3
