Thanks for reporting this issue, Mark. I'm pulling Klou, who knows more
about the StreamingFileSink, into this conversation. @Klou, does the
StreamingFileSink rely on DeleteOnExitHooks to clean up files?
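
For context, here is a minimal sketch of the JDK behaviour that would
explain a heap dump like the one below. It is not Flink or Hadoop code;
the class and method names are made up for illustration, and the
assumption (suggested by the dump, not confirmed here) is that each
s3ablock temp file gets registered via File.deleteOnExit(). The JDK keeps
every registered path in a static set until the JVM exits, even if the
file is deleted long before that, so a long-running process that churns
through temp files accumulates one string per file.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration only; not taken from Flink or Hadoop.
public class DeleteOnExitSketch {

    // Creates and deletes `count` temp files, registering each one with
    // deleteOnExit(). The files are gone from disk afterwards, but the JDK
    // still holds each path string until the JVM shuts down.
    public static int churnWithHook(int count) {
        int deleted = 0;
        try {
            for (int i = 0; i < count; i++) {
                File f = File.createTempFile("s3ablock-sketch-", ".tmp");
                f.deleteOnExit(); // path is retained for the life of the JVM
                if (f.delete()) {
                    deleted++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return deleted;
    }

    // Same churn, but with explicit deletion only; nothing is registered,
    // so no per-file state stays on the heap.
    public static int churnWithoutHook(int count) {
        int deleted = 0;
        try {
            for (int i = 0; i < count; i++) {
                File f = File.createTempFile("s3ablock-sketch-", ".tmp");
                if (f.delete()) {
                    deleted++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return deleted;
    }

    public static void main(String[] args) {
        System.out.println("deleted with hook: " + churnWithHook(1000));
        System.out.println("deleted without hook: " + churnWithoutHook(1000));
    }
}
```

Both methods leave the disk clean; only the first one leaves entries
behind in java.io.DeleteOnExitHook, which matches the shape of the dump
(millions of path strings for files that presumably no longer exist).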

Cheers,
Till

On Tue, Jan 21, 2020 at 3:38 PM Mark Harris <mark.har...@hivehome.com>
wrote:

> Hi,
>
> We're using Flink 1.7.2 on an EMR cluster (emr-5.22.0), which runs Hadoop
> "Amazon 2.8.5". We've recently noticed that some TaskManagers fail
> (causing all the jobs running on them to fail) with a
> "java.lang.OutOfMemoryError: GC overhead limit exceeded". The TaskManager
> (and the jobs that should be running on it) remains down until manually
> restarted.
>
> I managed to take and analyze a memory dump from one of the afflicted
> taskmanagers.
>
> It showed that 85% of the heap was made up of
> the java.io.DeleteOnExitHook.files hashset. The majority of the strings in
> that hashset (9041060 out of ~9041100) were paths to files beginning with
> /tmp/hadoop-yarn/s3a/s3ablock
>
> The problem seems to affect jobs that make use of the StreamingFileSink -
> all of the TaskManager crashes have been on a TaskManager running at least
> one job using this sink, and a cluster running only a single TaskManager /
> job that uses the StreamingFileSink crashed with the GC overhead limit
> exceeded error.
>
> I've had a look for advice on handling this error more broadly without
> luck.
>
> Any suggestions or advice gratefully received.
>
> Best regards,
>
> Mark Harris
