ah.
2016-12-14 14:05:07,855 [WARN] [AMShutdownThread]
|ats.ATSHistoryLoggingService|: ATSService being stopped,
eventQueueBacklog=14820, maxTimeLeftToFlush=-1, waitForever=true
2016-12-14 14:05:37,877 [ERROR] [AMShutdownThread]
|impl.TimelineClientImpl|: Failed to get the response from the timelin
> looking at the stderr of that one container hanging around we have this below.
Look in the syslog for a log line which starts with
ATSService being stopped, eventQueueBacklog=…, waitForever=true
Cheers,
Gopal
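Gopal's tip above can be checked with a plain grep over the container's syslog. A minimal sketch (the `/tmp/ats-demo` path and sample log line are made up for illustration; on a real cluster the syslog sits under the NodeManager's `yarn.nodemanager.log-dirs` for the hanging container):

```shell
# Recreate a sample AM syslog and look for the shutdown marker.
mkdir -p /tmp/ats-demo
cat > /tmp/ats-demo/syslog <<'EOF'
2016-12-14 14:05:07,855 [WARN] [AMShutdownThread] |ats.ATSHistoryLoggingService|: ATSService being stopped, eventQueueBacklog=14820, maxTimeLeftToFlush=-1, waitForever=true
EOF

# A backlog in the thousands together with waitForever=true means the AM
# will hang until every queued event is flushed to the timeline server.
grep -h 'ATSService being stopped' /tmp/ats-demo/syslog
```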
first pass:
1. Changing yarn.timeline-service.ttl-enable to false didn't seem to work.
I restarted the timeline server, HS2, and the RM, and the query still stuck around.
2. Figured I'd try using RollingLevelDbTimelineStore but got a
ClassNotFoundException, so I'll dig around for that later today.
current settings f
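For what it's worth, switching stores is a one-property change in yarn-site.xml. A sketch only: the fully-qualified class name below is my assumption for the Hadoop 2.6-era timeline service (note the capital "DB"), and a ClassNotFoundException at startup usually means the jar carrying that class isn't on the timeline server's classpath:

```xml
<!-- yarn-site.xml: point the timeline server at the rolling LevelDB store.
     Class name is an assumption; verify it against your Hadoop distro. -->
<property>
  <name>yarn.timeline-service.store-class</name>
  <value>org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore</value>
</property>
```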
Thanks Gopal. I'll set the ttl flag to false and see what gives.
Cheers,
Stephen
On Tue, Dec 13, 2016 at 10:48 PM, Gopal Vijayaraghavan
wrote:
> > yarn.timeline-service.ttl-enable=true
>
> Let us validate that this is due to the TTL GC kicking in and disable the
> TTL flag & leave it running f
> yarn.timeline-service.ttl-enable=true
Let us validate that this is due to the TTL GC kicking in and disable the TTL
flag & leave it running for a day.
Better to also verify the Tez logs of sessions hanging around waiting for the
ATS to collect events (look for the last _post log file in the AM
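Concretely, the flag being flipped for this experiment is the one already quoted above; in yarn-site.xml form (restart the timeline server afterwards):

```xml
<!-- yarn-site.xml: disable the timeline store's TTL garbage collection
     while testing whether the TTL GC is what wedges the ATS. -->
<property>
  <name>yarn.timeline-service.ttl-enable</name>
  <value>false</value>
</property>
```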
aha. i sense we're getting closer. here are my settings for
yarn.timeline-service.*
yarn.timeline-service.address=${yarn.timeline-service.hostname}:10200
yarn.timeline-service.client.max-retries=30
yarn.timeline-service.client.retry-interval-ms=1000
yarn.timeline-service.enabled=true
yarn.timelin
> well we are seeing these sessions sitting around for over an hour
This could be one of the causes for this issue - a stuck ATS. Tez won't kill a
session till all the ATS info has been submitted out of the process.
RollingLevelDbTimelineStore & EntityGroupFSTimelineStore were written to fix
t
AFAIK, HS2 uses a pool of AMs and submits queries to any free AM. There should
be configs which control the number of free AMs, timeouts, and so on for the
pool used by HS2.
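The HS2 session pool mentioned above is usually tuned with hive-site.xml properties along these lines. A sketch: the property names are from Hive's Tez session pool and are worth double-checking against your Hive version; the values are illustrative:

```xml
<!-- hive-site.xml: HiveServer2's pre-launched Tez AM pool (values illustrative). -->
<property>
  <name>hive.server2.tez.initialize.default.sessions</name>
  <value>true</value>
</property>
<property>
  <name>hive.server2.tez.sessions.per.default.queue</name>
  <value>2</value>
</property>
<property>
  <name>hive.server2.tez.default.queues</name>
  <value>default</value>
</property>
```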
On 14-Dec-2016, at 7:54 AM, Stephen Sprague
<sprag...@gmail.com> wrote:
interesting thank you. pretty sure they ar
i didn't mean to hit send just yet.
Well, we are seeing these sessions sitting around for over an hour, yet I
don't see that config set, so perhaps the default 5 minutes might not be in
play in my case. Settings I do see are:
set : hive.cli.tez.session.async=true
set
Interesting, thank you. Pretty sure they are being submitted through
the HS2 service.
On Tue, Dec 13, 2016 at 5:21 PM, Harish JP wrote:
> Hi Stephen,
>
> How are you starting these jobs, beeline, hive-cli, ...? It looks like
> they are being started in session mode, which means the AM wait
Hi Stephen,
How are you starting these jobs: beeline, hive-cli, ...? It looks like they
are being started in session mode, which means the AM waits for 5 minutes
(default value) for a new DAG/query to be submitted; if it does not receive a
query, it will time out and shut down. The config for thi
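The property name is cut off above; my guess (an assumption, worth verifying against the Tez configuration reference for your version) is Tez's DAG-submit timeout, which would look like this in tez-site.xml:

```xml
<!-- tez-site.xml: how long a session-mode AM waits for the next DAG before
     shutting itself down. Property name and 300s default are my assumption. -->
<property>
  <name>tez.session.am.dag.submit.timeout.secs</name>
  <value>300</value>
</property>
```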