Re: Tez GC issues perhaps? not sure.

2016-12-13 Thread Gopal Vijayaraghavan
> yarn.timeline-service.ttl-enable=true
Let us validate that this is due to the TTL GC kicking in: disable the TTL flag and leave it running for a day. Better to also verify the Tez logs of sessions hanging around waiting for the ATS to collect events (look for the last _post log file in the AM

Re: Tez GC issues perhaps? not sure.

2016-12-13 Thread Stephen Sprague
aha. i sense we're getting closer. here are my settings for yarn.timeline-service.*
yarn.timeline-service.address=${yarn.timeline-service.hostname}:10200
yarn.timeline-service.client.max-retries=30
yarn.timeline-service.client.retry-interval-ms=1000
yarn.timeline-service.enabled=true
yarn.timelin
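Settings pasted in this form (whitespace-separated key=value pairs) can be checked for a particular flag with a few lines of Python. This is only a sketch: `parse_settings` is a hypothetical helper, not part of Hadoop or Tez, and the sample string contains only values visible in the message above.

```python
def parse_settings(text: str) -> dict[str, str]:
    """Split whitespace-separated key=value tokens into a dict.

    Hypothetical helper for illustration; tokens without '=' are skipped.
    """
    settings = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        if sep and key:
            settings[key] = value
    return settings


# Sample input: a subset of the settings quoted in the message above.
pasted = (
    "yarn.timeline-service.client.max-retries=30 "
    "yarn.timeline-service.client.retry-interval-ms=1000 "
    "yarn.timeline-service.enabled=true"
)
settings = parse_settings(pasted)

# Look up the TTL flag discussed in this thread; it is absent from this sample.
print(settings.get("yarn.timeline-service.ttl-enable", "<not present>"))
```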

Re: Tez GC issues perhaps? not sure.

2016-12-13 Thread Gopal Vijayaraghavan
> well we are seeing these sessions sitting around for over an hour
This could be one of the causes for this issue - a stuck ATS. Tez won't kill a session until all the ATS info has been submitted out of the process. RollingLevelDbTimelineStore & EntityGroupFSTimelineStore were written to fix t

Re: Tez GC issues perhaps? not sure.

2016-12-13 Thread Harish JP
AFAIK, HS2 uses a pool of AMs and submits each query to any free AM. There should be configs which control the number of free AMs, the timeout and so on for the pool used by HS2.
On 14-Dec-2016, at 7:54 AM, Stephen Sprague <sprag...@gmail.com> wrote:
> interesting thank you. pretty sure they ar

Re: Tez GC issues perhaps? not sure.

2016-12-13 Thread Stephen Sprague
i didn't mean to hit send just yet. well we are seeing these sessions sitting around for over an hour - yet i don't see that config set so perhaps the default 5 minutes might not be in play in my case. settings i do see are:
set : hive.cli.tez.session.async=true
set

Re: Tez GC issues perhaps? not sure.

2016-12-13 Thread Stephen Sprague
interesting thank you. pretty sure they are being submitted through the HS2 service.
On Tue, Dec 13, 2016 at 5:21 PM, Harish JP wrote:
> Hi Stephen,
>
> How are you starting these jobs, beeline, hive-cli, ...? It looks like they are being started in session mode, which means the AM wait

Re: Tez GC issues perhaps? not sure.

2016-12-13 Thread Harish JP
Hi Stephen, How are you starting these jobs: beeline, hive-cli, ...? It looks like they are being started in session mode, which means the AM waits for 5 minutes (default value) for a new DAG/query to be submitted; if it does not receive a query, it will time out and shut down. The config for thi
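The idle-timeout behaviour described above can be pictured with a toy model. This is a minimal sketch, not Tez code: the `IdleSession` class and its method names are hypothetical, only the 300-second default is taken from the message, and the real AM's bookkeeping may differ.

```python
import dataclasses


@dataclasses.dataclass
class IdleSession:
    """Toy model of a session-mode AM (hypothetical, for illustration only)."""

    idle_timeout_secs: float = 300.0  # the 5-minute default mentioned above
    last_activity: float = 0.0        # time of the last DAG submission, in seconds

    def is_expired(self, now: float) -> bool:
        # The session times out once it has been idle for the full timeout.
        return now - self.last_activity >= self.idle_timeout_secs

    def submit_dag(self, now: float) -> None:
        # A new DAG/query resets the idle clock, unless the session is already gone.
        if self.is_expired(now):
            raise RuntimeError("session already timed out")
        self.last_activity = now


session = IdleSession()
session.submit_dag(now=100.0)
print(session.is_expired(now=399.0))  # idle for 299 s -> False
print(session.is_expired(now=400.0))  # idle for 300 s -> True
```

In this model a session that lingers for over an hour without new queries would have to be held open by something other than the idle timer, which is the situation the thread is trying to diagnose.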