Hi Mark,

If you add `fs.s3a.fast.upload.buffer: true` to your Flink configuration,
Flink should forward it to the respective Hadoop configuration when creating
the file system.
Note that I haven't tried it, but all keys with the prefixes "s3.", "s3a.",
and "fs.s3a." should be forwarded.
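
For reference, here is a minimal sketch of what that could look like in
flink-conf.yaml. One caveat: as far as I know from the Hadoop S3A
documentation, `fs.s3a.fast.upload.buffer` takes `disk`, `array`, or
`bytebuffer` rather than a boolean, and on Hadoop 2.8.x the block upload path
is enabled separately through `fs.s3a.fast.upload`. The values below are
assumptions for illustration and have not been tested against this cluster.

```yaml
# flink-conf.yaml (sketch; values are assumptions, not tested)
# Keys prefixed with "s3.", "s3a." or "fs.s3a." should be forwarded
# to the Hadoop configuration of the S3A file system.
fs.s3a.fast.upload: true
# Buffer upload blocks in memory instead of on local disk. "disk" is the
# default and is the mode that writes the s3ablock temp files.
fs.s3a.fast.upload.buffer: bytebuffer
```

Buffering in memory shifts the cost from temp files to heap or direct memory,
so memory use is worth watching when many writers run in parallel.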

-- Arvid

On Mon, Jan 27, 2020 at 5:16 PM Piotr Nowojski <pi...@ververica.com> wrote:

> Hi,
>
> I think reducing the frequency of the checkpoints and decreasing the
> parallelism of whatever uses the S3AOutputStream class would help to
> mitigate the issue.
>
> I don’t know of any other solutions. I would suggest asking this question
> directly to Steve L. in the bug ticket [1], as he is the one who fixed the
> issue. If there is no workaround, maybe it would be possible to put some
> pressure on the Hadoop developers to backport the fix to older versions?
>
> Piotrek
>
> [1] https://issues.apache.org/jira/browse/HADOOP-15658
>
> On 27 Jan 2020, at 15:41, Cliff Resnick <cre...@gmail.com> wrote:
>
> I know from experience that Flink's shaded S3A FileSystem does not
> reference core-site.xml, though I don't remember offhand which file(s) it
> does reference. However, since it's shaded, maybe this could be fixed by
> building a Flink FS that references 3.3.0? Last I checked, I think it
> referenced 3.1.0.
>
> On Mon, Jan 27, 2020, 8:48 AM David Magalhães <speeddra...@gmail.com>
> wrote:
>
>> Does StreamingFileSink use core-site.xml? When I was using it, it didn't
>> load any configuration from core-site.xml.
>>
>> On Mon, Jan 27, 2020 at 12:08 PM Mark Harris <mark.har...@hivehome.com>
>> wrote:
>>
>>> Hi Piotr,
>>>
>>> Thanks for the link to the issue.
>>>
>>> Do you know if there's a workaround? To try to avoid writing the buffer
>>> files, I've set the following in my core-site.xml:
>>>
>>> fs.s3a.fast.upload.buffer=true
>>>
>>> but the taskmanager still breaks with the same problem.
>>>
>>> Best regards,
>>>
>>> Mark
>>> ------------------------------
>>> *From:* Piotr Nowojski <pi...@data-artisans.com> on behalf of Piotr
>>> Nowojski <pi...@ververica.com>
>>> *Sent:* 22 January 2020 13:29
>>> *To:* Till Rohrmann <trohrm...@apache.org>
>>> *Cc:* Mark Harris <mark.har...@hivehome.com>; flink-u...@apache.org <
>>> flink-u...@apache.org>; kkloudas <kklou...@apache.org>
>>> *Subject:* Re: GC overhead limit exceeded, memory full of DeleteOnExit
>>> hooks for S3a files
>>>
>>> Hi,
>>>
>>> This is probably a known issue in Hadoop [1]. Unfortunately, it was only
>>> fixed in 3.3.0.
>>>
>>> Piotrek
>>>
>>> [1] https://issues.apache.org/jira/browse/HADOOP-15658
>>>
>>> On 22 Jan 2020, at 13:56, Till Rohrmann <trohrm...@apache.org> wrote:
>>>
>>> Thanks for reporting this issue, Mark. I'm pulling Klou, who knows more
>>> about the StreamingFileSink, into this conversation. @Klou, does the
>>> StreamingFileSink rely on DeleteOnExitHooks to clean up files?
>>>
>>> Cheers,
>>> Till
>>>
>>> On Tue, Jan 21, 2020 at 3:38 PM Mark Harris <mark.har...@hivehome.com>
>>> wrote:
>>>
>>> Hi,
>>>
>>> We're using Flink 1.7.2 on an EMR cluster (emr-5.22.0), which runs
>>> Hadoop "Amazon 2.8.5". We've recently noticed that some TaskManagers fail
>>> (causing all the jobs running on them to fail) with a
>>> "java.lang.OutOfMemoryError: GC overhead limit exceeded". The taskmanager
>>> (and the jobs that should be running on it) remain down until manually
>>> restarted.
>>>
>>> I managed to take and analyze a memory dump from one of the afflicted
>>> taskmanagers.
>>>
>>> It showed that 85% of the heap was made up of
>>> the java.io.DeleteOnExitHook.files hashset. The majority of the strings in
>>> that hashset (9041060 out of ~9041100) pointed to files whose paths began
>>> with /tmp/hadoop-yarn/s3a/s3ablock
>>>
>>> The problem seems to affect jobs that make use of the StreamingFileSink:
>>> all of the taskmanager crashes have been on a taskmanager running at
>>> least one job using this sink, and a cluster running only a single
>>> taskmanager / job that uses the StreamingFileSink crashed with the GC
>>> overhead limit exceeded error.
>>>
>>> I've had a look for advice on handling this error more broadly without
>>> luck.
>>>
>>> Any suggestions or advice gratefully received.
>>>
>>> Best regards,
>>>
>>> Mark Harris
>>>
>>>
>>>
>>> The information contained in or attached to this email is intended only
>>> for the use of the individual or entity to which it is addressed. If you
>>> are not the intended recipient, or a person responsible for delivering it
>>> to the intended recipient, you are not authorised to and must not disclose,
>>> copy, distribute, or retain this message or any part of it. It may contain
>>> information which is confidential and/or covered by legal professional or
>>> other privilege under applicable law.
>>>
>>> The views expressed in this email are not necessarily the views of
>>> Centrica plc or its subsidiaries, and the company, its directors, officers
>>> or employees make no representation or accept any liability for its
>>> accuracy or completeness unless expressly stated to the contrary.
>>>
>>> Additional regulatory disclosures may be found here:
>>> https://www.centrica.com/privacy-cookies-and-legal-disclaimer#email
>>>
>>> PH Jones is a trading name of British Gas Social Housing Limited.
>>> British Gas Social Housing Limited (company no: 01026007), British Gas
>>> Trading Limited (company no: 03078711), British Gas Services Limited
>>> (company no: 3141243), British Gas Insurance Limited (company no:
>>> 06608316), British Gas New Heating Limited (company no: 06723244), British
>>> Gas Services (Commercial) Limited (company no: 07385984) and Centrica
>>> Energy (Trading) Limited (company no: 02877397) are all wholly owned
>>> subsidiaries of Centrica plc (company no: 3033654). Each company is
>>> registered in England and Wales with a registered office at Millstream,
>>> Maidenhead Road, Windsor, Berkshire SL4 5GD.
>>>
>>> British Gas Insurance Limited is authorised by the Prudential Regulation
>>> Authority and regulated by the Financial Conduct Authority and the
>>> Prudential Regulation Authority. British Gas Services Limited and Centrica
>>> Energy (Trading) Limited are authorised and regulated by the Financial
>>> Conduct Authority. British Gas Trading Limited is an appointed
>>> representative of British Gas Services Limited which is authorised and
>>> regulated by the Financial Conduct Authority.
>>>
>>
>
