Hi,
I'm testing Ambari 2.1.3-snapshot (from Dec 1st, a830cc0) on the HDP 2.3.0
stack. In this setup the Ambari-metrics-collector dies after a few minutes
with the log output pasted below (note the "FATAL" error, which comes after
many of the exceptions shown at the top).
Possibly related: on startup it fails to load the native libraries. From
the log:
2015-12-03 18:40:44,296 WARN [main] NativeCodeLoader:62 - Unable to
load native-hadoop library for your platform... using builtin-java
classes where applicable
even though they exist in the java.library.path reported a few lines
further down in the log:
2015-12-03 18:40:44,396 INFO [main] ZooKeeper:100 - Client
environment:java.library.path=/usr/lib/ams-hbase/lib/hadoop-native -Xmx3072m
I also tried replacing the path above with a symlink to the
hadoop-client/lib/native dir (which has different content), but this
did not help.
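Since the NoClassDefFoundError in the paste names a Java class
(org.iq80.snappy.CorruptionException), it may be a jar missing from the
classpath rather than a native library. A minimal sketch for checking which
jars, if any, contain that class. The helper name jars_containing and the
default directory /usr/lib/ams-hbase/lib are assumptions for illustration
(the directory is guessed from the java.library.path above), not something
taken from Ambari itself:

```python
import sys
import zipfile
from pathlib import Path


def jars_containing(class_name, lib_dir):
    """Return the jar files under lib_dir that contain class_name.

    Hypothetical helper: class_name is a dotted Java class name,
    lib_dir is a directory that is searched recursively for *.jar.
    """
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(lib_dir).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar)
        except (zipfile.BadZipFile, OSError):
            # Skip unreadable or corrupt archives (e.g. dangling symlinks).
            continue
    return hits


if __name__ == "__main__":
    # The default directory is an assumption; pass the real one as argv[1].
    lib_dir = sys.argv[1] if len(sys.argv) > 1 else "/usr/lib/ams-hbase/lib"
    for jar in jars_containing("org.iq80.snappy.CorruptionException", lib_dir):
        print(jar)
```

If this prints nothing for the directories on the region server's classpath,
that would at least be consistent with the ClassNotFoundException below.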
=========== paste ===============
Thu Dec 03 18:26:25 CET 2015,
RpcRetryingCaller{globalStartTime=1449163034289, pause=100, retries=35},
java.io.IOException: java.io.IOException:
java.lang.NoClassDefFoundError: org/iq80/snappy/CorruptionException
at
org.apache.phoenix.coprocessor.ServerCachingEndpointImpl.addServerCache(ServerCachingEndpointImpl.java:78)
at
org.apache.phoenix.coprocessor.generated.ServerCachingProtos$ServerCachingService.callMethod(ServerCachingProtos.java:3200)
at
org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7390)
at
org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1873)
at
org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1855)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32209)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2112)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)
at
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
at
org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError:
org/iq80/snappy/CorruptionException
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at
org.apache.phoenix.coprocessor.ServerCachingEndpointImpl.addServerCache(ServerCachingEndpointImpl.java:72)
... 10 more
Caused by: java.lang.ClassNotFoundException:
org.iq80.snappy.CorruptionException
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 13 more
at
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:147)
at
org.apache.hadoop.hbase.ipc.RegionCoprocessorRpcChannel.callExecService(RegionCoprocessorRpcChannel.java:95)
at
org.apache.hadoop.hbase.ipc.CoprocessorRpcChannel.callMethod(CoprocessorRpcChannel.java:56)
at
org.apache.phoenix.coprocessor.generated.ServerCachingProtos$ServerCachingService$Stub.addServerCache(ServerCachingProtos.java:3270)
at
org.apache.phoenix.cache.ServerCacheClient$1$1.call(ServerCacheClient.java:204)
at
org.apache.phoenix.cache.ServerCacheClient$1$1.call(ServerCacheClient.java:189)
at org.apache.hadoop.hbase.client.HTable$16.call(HTable.java:1741)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: java.io.IOException:
java.lang.NoClassDefFoundError: org/iq80/snappy/CorruptionException
at
org.apache.phoenix.coprocessor.ServerCachingEndpointImpl.addServerCache(ServerCachingEndpointImpl.java:78)
at
org.apache.phoenix.coprocessor.generated.ServerCachingProtos$ServerCachingService.callMethod(ServerCachingProtos.java:3200)
at
org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7390)
at
org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1873)
at
org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1855)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32209)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2112)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)
at
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
at
org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError:
org/iq80/snappy/CorruptionException
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at
org.apache.phoenix.coprocessor.ServerCachingEndpointImpl.addServerCache(ServerCachingEndpointImpl.java:72)
... 10 more
Caused by: java.lang.ClassNotFoundException:
org.iq80.snappy.CorruptionException
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 13 more
at
sun.reflect.GeneratedConstructorAccessor43.newInstance(Unknown Source)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at
org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at
org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
at
org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:322)
at
org.apache.hadoop.hbase.protobuf.ProtobufUtil.execService(ProtobufUtil.java:1619)
at
org.apache.hadoop.hbase.ipc.RegionCoprocessorRpcChannel$1.call(RegionCoprocessorRpcChannel.java:92)
at
org.apache.hadoop.hbase.ipc.RegionCoprocessorRpcChannel$1.call(RegionCoprocessorRpcChannel.java:89)
at
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
... 10 more
Caused by:
org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException):
java.io.IOException: java.lang.NoClassDefFoundError:
org/iq80/snappy/CorruptionException
at
org.apache.phoenix.coprocessor.ServerCachingEndpointImpl.addServerCache(ServerCachingEndpointImpl.java:78)
at
org.apache.phoenix.coprocessor.generated.ServerCachingProtos$ServerCachingService.callMethod(ServerCachingProtos.java:3200)
at
org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7390)
at
org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1873)
at
org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1855)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32209)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2112)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)
at
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
at
org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError:
org/iq80/snappy/CorruptionException
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at
org.apache.phoenix.coprocessor.ServerCachingEndpointImpl.addServerCache(ServerCachingEndpointImpl.java:72)
... 10 more
Caused by: java.lang.ClassNotFoundException:
org.iq80.snappy.CorruptionException
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 13 more
at
org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1206)
at
org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
at
org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.execService(ClientProtos.java:32675)
at
org.apache.hadoop.hbase.protobuf.ProtobufUtil.execService(ProtobufUtil.java:1615)
... 13 more
2015-12-03 18:26:25,220 INFO
[hconnection-0x33bc72d1-shared--pool2-t265] RpcRetryingCaller:132 - Call
exception, tries=16, retries=35, started=188985 ms ago, cancelled=false,
msg=row
'metricssystem.MetricsSystem.NumActiveSinks^@compute-10-2.local^@^@^@^AMi���datanode'
on table 'METRIC_RECORD' at
region=METRIC_RECORD,metricssystem.MetricsSystem.NumActiveSinks\x00compute-10-2.local\x00\x00\x00\x01Mi\xDD\xE3\xF7datanode,1432015934895.363cbca58c745853100106053690db95.,
hostname=compute-10-1.local,61320,1449162924698, seqNum=34149729
2015-12-03 18:26:25,539 INFO
[hconnection-0x33bc72d1-shared--pool2-t155] RpcRetryingCaller:132 - Call
exception, tries=27, retries=35, started=409895 ms ago, cancelled=false,
msg=row
'' on table 'METRIC_RECORD' at
region=METRIC_RECORD,,1432015934895.0f0a9816ffb93fe65176292b6ad378d1.,
hostname=compute-10-1.local,61320,1449162924698, seqNum=24131209
2015-12-03 18:26:25,597 INFO
[hconnection-0x33bc72d1-shared--pool2-t153] RpcRetryingCaller:132 - Call
exception, tries=27, retries=35, started=409953 ms ago, cancelled=false,
msg=row
'metricssystem.MetricsSystem.NumActiveSinks^@compute-10-2.local^@^@^@^AMi���datanode'
on table 'METRIC_RECORD' at
region=METRIC_RECORD,metricssystem.MetricsSystem.NumActiveSinks\x00compute-10-2.local\x00\x00\x00\x01Mi\xDD\xE3\xF7datanode,1432015934895.363cbca58c745853100106053690db95.,
hostname=compute-10-1.local,61320,1449162924698, seqNum=34149729
2015-12-03 18:26:25,680 INFO
[hconnection-0x33bc72d1-shared--pool2-t215] RpcRetryingCaller:132 - Call
exception, tries=22, retries=35, started=309625 ms ago, cancelled=false,
msg=row
'metricssystem.MetricsSystem.NumActiveSinks^@compute-10-2.local^@^@^@^AMi���datanode'
on table 'METRIC_RECORD' at
region=METRIC_RECORD,metricssystem.MetricsSystem.NumActiveSinks\x00compute-10-2.local\x00\x00\x00\x01Mi\xDD\xE3\xF7datanode,1432015934895.363cbca58c745853100106053690db95.,
hostname=compute-10-1.local,61320,1449162924698, seqNum=34149729
2015-12-03 18:26:26,085 INFO
[hconnection-0x33bc72d1-shared--pool2-t228] RpcRetryingCaller:132 - Call
exception, tries=29, retries=35, started=450123 ms ago, cancelled=false,
msg=row
'metricssystem.MetricsSystem.NumActiveSinks^@compute-10-2.local^@^@^@^AMi���datanode'
on table 'METRIC_RECORD' at
region=METRIC_RECORD,metricssystem.MetricsSystem.NumActiveSinks\x00compute-10-2.local\x00\x00\x00\x01Mi\xDD\xE3\xF7datanode,1432015934895.363cbca58c745853100106053690db95.,
hostname=compute-10-1.local,61320,1449162924698, seqNum=34149729
2015-12-03 18:26:26,276 FATAL [pool-1-thread-1]
TimelineMetricStoreWatcher:79 - Error getting metrics from
TimelineMetricStore. Shutting down by TimelineMetricStoreWatcher.
2015-12-03 18:26:26,279 INFO [pool-1-thread-1] ExitUtil:124 - Exiting
with status -1
2015-12-03 18:26:26,281 INFO [Thread-3]
ConnectionManager$HConnectionImplementation:2068 - Closing master
protocol: MasterService
2015-12-03 18:26:26,426 INFO
[hconnection-0x33bc72d1-shared--pool2-t227] RpcRetryingCaller:132 - Call
exception, tries=29, retries=35, started=450464 ms ago, cancelled=false,
msg=row '' on table 'METRIC_RECORD' at
region=METRIC_RECORD,,1432015934895.0f0a9816ffb93fe65176292b6ad378d1.,
hostname=compute-10-1.local,61320,1449162924698, seqNum=24131209
2015-12-03 18:26:26,442 INFO [Thread-1] log:67 - Stopped
[email protected]:6188
2015-12-03 18:26:26,451 WARN [1705435578@qtp-1802896480-9]
GenericExceptionHandler:98 - INTERNAL_SERVER_ERROR
javax.ws.rs.WebApplicationException: java.sql.SQLException: Sub plan [0]
execution interrupted.
at
org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.TimelineWebServices.getTimelineMetrics(TimelineWebServices.java:387)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
...
--
Eirik Thorsnes