Hi Aravindan,

The solution provided fixed my cluster issue. :-) Thank you so much!

Thanks,
Qin

On Thu, May 4, 2017 at 9:00 AM, Aravindan Vijayan <[email protected]> wrote:

> Hi Qin,
>
> It is probably because the default HBase version that is being used in
> Apache is old.
>
> Do you see this in AMS Hbase logs?
>
> java.io.IOException: java.io.IOException: Unable to load configured store
> engine 'org.apache.hadoop.hbase.regionserver.DateTieredStoreEngine'
>
> If that is the case, please set the following configs in custom ams-site
> and restart Metrics collector.
>
> timeline.metrics.hbase.aggregate.table.compaction.policy.key = hbase.hstore.defaultengine.compactionpolicy.class
> timeline.metrics.hbase.aggregate.table.compaction.policy.class = org.apache.hadoop.hbase.regionserver.compactions.FIFOCompactionPolicy
>
> timeline.metrics.aggregate.table.hbase.hstore.blockingStoreFiles = 1000
>
> --
> Thanks and Regards,
> Aravindan Vijayan
>
> On 5/3/17, 10:15 PM, "Qin Liu" <[email protected]> wrote:
>
> >Hi Aravindan,
> >
> >The collector is up and running and the metrics on ambari-metrics service
> >summary page looks good.
> >
> >The following exception/errors are in ambari-server.log:
> >03 May 2017 20:46:34,451 ERROR [ambari-client-thread-210]
> >MetricsRequestHelper:116 - Error getting timeline metrics : Read timed out
> >03 May 2017 20:46:34,452 DEBUG [ambari-client-thread-210]
> >MetricsRequestHelper:118 - Error getting timeline metrics : Read timed out
> >java.net.SocketTimeoutException: Read timed out
> >    at java.net.SocketInputStream.socketRead0(Native Method)
> >    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> >    at java.net.SocketInputStream.read(SocketInputStream.java:170)
> >    at java.net.SocketInputStream.read(SocketInputStream.java:141)
> >    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
> >    at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
> >    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
> >    at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:704)
> >    at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:647)
> >    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1569)
> >    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474)
> >    at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
> >    at org.apache.ambari.server.controller.internal.URLStreamProvider.processURL(URLStreamProvider.java:218)
> >    at org.apache.ambari.server.controller.internal.URLStreamProvider.processURL(URLStreamProvider.java:142)
> >    at org.apache.ambari.server.controller.metrics.timeline.MetricsRequestHelper.fetchTimelineMetrics(MetricsRequestHelper.java:79)
> >...
> >03 May 2017 20:46:34,452 ERROR [ambari-client-thread-210]
> >MetricsRequestHelper:123 - Cannot connect to collector:
> >SocketTimeoutException for qin1.example.com
> >03 May 2017 20:46:34,453 DEBUG [ambari-client-thread-210]
> >TimelineMetricCacheEntryFactory:88 - Caught IOException on fetching
> >metrics. Read timed out
> >03 May 2017 20:46:34,453 DEBUG [ambari-client-thread-210]
> >MetricsPropertyProvider:537 - Skip populating resources on socket timeout.
> >03 May 2017 20:46:34,453 DEBUG [pool-3-thread-1]
> >MetricsCollectorHAManager:87 - MetricsCollectorHostDownEvent caught, Down
> >collector : qin1.example.com
> >
> >In ambari-metrics collector log, there are tons of Call exceptions, e.g.,
> >2017-05-01 22:42:37,698 INFO
> >org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception,
> >tries=13, retries=35, started=128676 ms ago, cancelled=false, msg=row
> >'cpu_idle^@nodemanager' on table 'METRIC_AGGREGATE' at
> >region=METRIC_AGGREGATE,,1493703270057.c9fc6ac679c986905075ba50cf634531.,
> >hostname=qin1.example.com,52106,1493703207689, seqNum=2
> >2017-05-01 22:42:40,310 INFO org.apache.hadoop.hbase.client.AsyncProcess:
> >#1, waiting for 5054 actions to finish
> >2017-05-01 22:42:41,977 INFO
> >org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception,
> >tries=11, retries=35, started=88292 ms ago, cancelled=false, msg=row
> >'kafka.controller.ControllerStats.LeaderElectionRateAndTimeMs.1MinuteRate^@kafka_broker'
> >on table 'METRIC_AGGREGATE' at
> >region=METRIC_AGGREGATE,,1493703270057.c9fc6ac679c986905075ba50cf634531.,
> >hostname=qin1.example.com,52106,1493703207689, seqNum=2
> >2017-05-01 22:42:44,716 INFO
> >org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception,
> >tries=13, retries=35, started=128792 ms ago, cancelled=false, msg=row
> >'cpu_idle^@hbase' on table 'METRIC_AGGREGATE' at
> >region=METRIC_AGGREGATE,,1493703270057.c9fc6ac679c986905075ba50cf634531.,
> >hostname=qin1.example.com,52106,1493703207689, seqNum=2
> >...
> >
> >Need to mention that
> >YARN/components/NODEMANAGER?fields=metrics/cpu/cpu_idle,
> >HBASE/components/HBASE_REGIONSERVER?fields=metrics/cpu/cpu_nice, and
> >metrics/kafka/controller/ControllerStats/LeaderElectionRateAndTimeMs/1MinuteRate
> >are not available.
> >
> >Thanks,
> >
> >Qin
> >
> >On Wed, May 3, 2017 at 12:16 PM, Aravindan Vijayan <[email protected]> wrote:
> >
> >> Hi Qin,
> >>
> >> Anything from the ambari-server logs that might point to the issue? Is the
> >> metrics collector component up and running?
> >>
> >> Also, AMBARI-20622 looks like the last commit that went into the
> >> Ambari-2.5.0.3 release. This could be off by a commit or two since there is
> >> no git tag for that release.
> >> --
> >> Thanks and Regards,
> >> Aravindan Vijayan
> >>
> >> On 5/2/17, 10:20 AM, "Qin Liu" <[email protected]> wrote:
> >>
> >> >Hi @avijayan <https://reviews.apache.org/users/avijayan/>, @swagle
> >> ><https://reviews.apache.org/users/swagle/>, @dsen
> >> ><https://reviews.apache.org/users/dsen/>, and all,
> >> >
> >> >Can someone also tell me what is the last commit for building
> >> >http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.5.0.3/ambari.repo?
> >> >
> >> >Thanks
> >> >
> >> >On Mon, May 1, 2017 at 11:19 PM, Qin Liu <[email protected]> wrote:
> >> >
> >> >> Hi all,
> >> >>
> >> >> Does anyone have this metrics issue "Most widgets on
> >> >> HDFS,YARN,HBase,Storm,and Kafka Summary pages show NA" with latest trunk
> >> >> and latest branch-2.5?
> >> >>
> >> >> I am having this issue with ambari RPMs I built with latest trunk and
> >> >> latest branch-2.5 using the following command:
> >> >>
> >> >> mvn -B clean install package rpm:rpm -Dbuild-rpm -DskipTests -Dpython.ver="python >= 2.6"
> >> >>
> >> >> But I don't have this issue if I use
> >> >> http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.5.0.3/ambari.repo.
> >> >>
> >> >> Can anyone shed a light on this!
> >> >>
> >> >> Thanks
> >> >> Qin
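
For anyone hitting the same symptom, the check and the fix discussed in this thread can be sketched as a small script. This is only an illustration, not part of Ambari: the function names and the idea of passing a log file path on the command line are assumptions made here. The error signature and the three ams-site properties are taken from Aravindan's reply in this thread.

```python
import re
import sys

# Error signature quoted in the thread. Whitespace is matched loosely
# because the message may be wrapped across lines in a log or an email.
STORE_ENGINE_ERROR = re.compile(
    r"Unable\s+to\s+load\s+configured\s+store\s+engine\s+"
    r"'org\.apache\.hadoop\.hbase\.regionserver\.DateTieredStoreEngine'"
)

# Properties suggested in the thread for custom ams-site.
AMS_SITE_OVERRIDES = {
    "timeline.metrics.hbase.aggregate.table.compaction.policy.key":
        "hbase.hstore.defaultengine.compactionpolicy.class",
    "timeline.metrics.hbase.aggregate.table.compaction.policy.class":
        "org.apache.hadoop.hbase.regionserver.compactions.FIFOCompactionPolicy",
    "timeline.metrics.aggregate.table.hbase.hstore.blockingStoreFiles": "1000",
}


def has_store_engine_error(log_text: str) -> bool:
    """Return True if the DateTieredStoreEngine load failure appears in log_text."""
    return STORE_ENGINE_ERROR.search(log_text) is not None


def format_overrides(overrides: dict = AMS_SITE_OVERRIDES) -> str:
    """Render the suggested properties as key=value lines for copy and paste."""
    return "\n".join(f"{key}={value}" for key, value in overrides.items())


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: pass the path of the AMS HBase log as the argument.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        if has_store_engine_error(handle.read()):
            print("Store engine error found. Suggested custom ams-site settings:")
            print(format_overrides())
        else:
            print("Store engine error not found in this log.")
```

As the reply notes, the Metrics collector needs a restart after the properties are set in custom ams-site.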
