[ https://issues.apache.org/jira/browse/HAWQ-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lin Wen resolved HAWQ-979.
--------------------------
    Resolution: Fixed

> Resource Broker Should Reconnect Hadoop Yarn When Failed to Get Cluster Report
> ------------------------------------------------------------------------------
>
>                 Key: HAWQ-979
>                 URL: https://issues.apache.org/jira/browse/HAWQ-979
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Resource Manager
>            Reporter: Lin Wen
>            Assignee: Lin Wen
>             Fix For: 2.0.1.0-incubating
>
>
> While HAWQ is running in YARN mode, the libyarn heartbeat thread may
> sometimes fail (e.g. when the YARN RM restarts) and quit:
> 2016-08-03 18:45:27.913838
> PDT,,,p34645,th-1290610400,,,,0,con4,,seg-10000,,,,,"WARNING","01000","YARN
> mode resource broker failed to get YARN queue report of queue default.
> LibYarnClient::getQueueInfo, Catch the Exception:LibYarnClient::libyarn AM
> heartbeat thread has stopped.",,,,,,,0,,"resourcebroker_LIBYARN_proc.c",1840,
> The resource broker process should re-register HAWQ with YARN in this case,
> but it does not.
> The reason is:
> In function handleRM2RB_GetClusterReport(), when RB2YARN_getQueueReport()
> fails, function sendRBGetClusterReportErrorData() is called, but
> sendRBGetClusterReportErrorData() returns OK (it should return
> RESBROK_ERROR_GRM), so the failure is never reported to the caller.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
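
The failure mode described in the issue can be illustrated with a minimal C sketch. This is not the HAWQ source: the status values, the simplified signatures, and the helper names (get_queue_report, send_error_data_buggy, send_error_data_fixed, handle_get_cluster_report, should_reregister) are all hypothetical stand-ins, and only the name RESBROK_ERROR_GRM comes from the report. It assumes the caller decides whether to re-register with YARN purely from the handler's return code.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical status codes; the real values are defined in the HAWQ sources. */
enum { STATUS_OK = 0, RESBROK_ERROR_GRM = 1 };

/* Stand-in for RB2YARN_getQueueReport(): fails once the heartbeat thread has stopped. */
static int get_queue_report(bool heartbeat_alive)
{
    return heartbeat_alive ? STATUS_OK : RESBROK_ERROR_GRM;
}

/* Stand-in for sendRBGetClusterReportErrorData() before the fix:
 * it reports the error but returns OK, which hides the failure. */
static int send_error_data_buggy(int errcode)
{
    fprintf(stderr, "cluster report failed, code %d\n", errcode);
    return STATUS_OK;
}

/* Same stand-in after the fix: the error is propagated to the caller. */
static int send_error_data_fixed(int errcode)
{
    fprintf(stderr, "cluster report failed, code %d\n", errcode);
    return RESBROK_ERROR_GRM;
}

typedef int (*send_error_fn)(int errcode);

/* Stand-in for handleRM2RB_GetClusterReport(): on failure, its result is
 * whatever the error-sending function returns. */
static int handle_get_cluster_report(bool heartbeat_alive, send_error_fn send_error)
{
    int res = get_queue_report(heartbeat_alive);
    if (res != STATUS_OK) {
        return send_error(res);
    }
    return STATUS_OK;
}

/* Assumed caller logic: re-register with YARN only when the handler reports an error. */
static bool should_reregister(int handler_result)
{
    return handler_result == RESBROK_ERROR_GRM;
}
```

With the buggy variant, should_reregister(handle_get_cluster_report(false, send_error_data_buggy)) is false, so the broker keeps using the dead connection. With the fixed variant the same call yields true, which is the behavior the issue asks for.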