[ https://issues.apache.org/jira/browse/TWILL-220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Albert Shau updated TWILL-220:
------------------------------
    Description:
If the ResourceReportClient is unable to fetch the resource report, it logs an error with a big stack trace. I have seen cluster setups where the hostnames are not set up correctly, so the call always fails. In those cases, you end up with logs like:

{noformat}
2017-02-24 06:45:09,542 - ERROR [reporter-scheduler:o.a.t.y.ResourceReportClient@59] - Exception getting resource report from http://xxxxxx:43931/resources.
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method) ~[na:1.7.0_75]
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) ~[na:1.7.0_75]
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) ~[na:1.7.0_75]
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) ~[na:1.7.0_75]
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[na:1.7.0_75]
        at java.net.Socket.connect(Socket.java:579) ~[na:1.7.0_75]
        at java.net.Socket.connect(Socket.java:528) ~[na:1.7.0_75]
        at sun.net.NetworkClient.doConnect(NetworkClient.java:180) ~[na:1.7.0_75]
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:432) ~[na:1.7.0_75]
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:527) ~[na:1.7.0_75]
        at sun.net.www.http.HttpClient.<init>(HttpClient.java:211) ~[na:1.7.0_75]
        at sun.net.www.http.HttpClient.New(HttpClient.java:308) ~[na:1.7.0_75]
        at sun.net.www.http.HttpClient.New(HttpClient.java:326) ~[na:1.7.0_75]
        at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:997) ~[na:1.7.0_75]
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:933) ~[na:1.7.0_75]
        at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:851) ~[na:1.7.0_75]
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1301) ~[na:1.7.0_75]
        at java.net.URL.openStream(URL.java:1037) ~[na:1.7.0_75]
        at org.apache.twill.yarn.ResourceReportClient.get(ResourceReportClient.java:52) ~[org.apache.twill.twill-yarn-0.9.0.jar:na]
        at org.apache.twill.yarn.YarnTwillController.getResourceReport(YarnTwillController.java:330) [co.cask.cdap.cdap-app-fabric-4.0.1.jar:na]
        at co.cask.cdap.app.guice.ImpersonatedTwillController$1.call(ImpersonatedTwillController.java:86) [na:na]
        at co.cask.cdap.app.guice.ImpersonatedTwillController$1.call(ImpersonatedTwillController.java:82) [na:na]
        at co.cask.cdap.common.security.ImpersonationUtils$1.run(ImpersonationUtils.java:46) [na:na]
        at java.security.AccessController.doPrivileged(Native Method) [na:1.7.0_75]
        at javax.security.auth.Subject.doAs(Subject.java:415) [na:1.7.0_75]
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709) [hadoop-common-2.7.1.2.4.2.0-258.jar:na]
        at co.cask.cdap.common.security.ImpersonationUtils.doAs(ImpersonationUtils.java:43) [na:na]
        at co.cask.cdap.common.security.DefaultImpersonator.doAs(DefaultImpersonator.java:60) [na:na]
        at co.cask.cdap.app.guice.ImpersonatedTwillController.getResourceReport(ImpersonatedTwillController.java:82) [na:na]
        at co.cask.cdap.internal.app.runtime.distributed.DistributedProgramRuntimeService$ClusterResourceReporter.reportResources(DistributedProgramRuntimeService.java:385) [na:na]
        at co.cask.cdap.internal.app.runtime.AbstractResourceReporter.runOneIteration(AbstractResourceReporter.java:72) [na:na]
        at com.google.common.util.concurrent.AbstractScheduledService$1$1.run(AbstractScheduledService.java:170) [com.google.guava.guava-13.0.1.jar:na]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_75]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) [na:1.7.0_75]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) [na:1.7.0_75]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.7.0_75]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_75]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75]
        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
{noformat}

Instead of returning null and logging an error, the class should throw the right type of exception and let the caller decide what to do with it.

  was:
If the ResourceReportClient is unable to fetch the resource report, it logs an error with a big stack trace. I have seen cluster setups where the hostnames are not set up correctly, so the call always fails.

Instead of returning null and logging an error, the class should throw the right type of exception and let the caller decide what to do with it.

> ResourceReportClient should not error log
> -----------------------------------------
>
>                 Key: TWILL-220
>                 URL: https://issues.apache.org/jira/browse/TWILL-220
>             Project: Apache Twill
>          Issue Type: Bug
>            Reporter: Albert Shau
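
The change requested in the description (throw instead of logging and returning null) could look roughly like the sketch below. It is only an illustration, not the actual Twill code: the class name ResourceReportFetchSketch and the methods fetch and describe are hypothetical, and the report is returned as raw text rather than as a parsed ResourceReport. The only behavior taken from the stack trace is that the client reads the report through URL.openStream().

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class ResourceReportFetchSketch {

    // Hypothetical stand-in for the client's get(): reads the report body and
    // lets any IOException (for example ConnectException) propagate to the
    // caller instead of logging it at ERROR level and returning null.
    public static String fetch(URL url) throws IOException {
        try (Reader reader = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                body.append(buf, 0, n);
            }
            return body.toString();
        }
    }

    // Hypothetical caller: it decides that an unreachable endpoint deserves a
    // single-line message rather than a full stack trace on every poll.
    public static String describe(URL url) {
        try {
            return "report: " + fetch(url).length() + " chars";
        } catch (IOException e) {
            return "report unavailable from " + url + ": " + e;
        }
    }

    public static void main(String[] args) throws Exception {
        // A file URL that should not exist, so the sketch runs without a network.
        URL url = URI.create("file:///no/such/twill-resource-report.json").toURL();
        System.out.println(describe(url));
    }
}
```

With this shape, a periodic caller such as the resource reporter in the trace can log at a lower level, rate-limit, or skip the iteration, while other callers can still treat the failure as an error.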