Hi Tony,

You could take a look at the documentation for this class [1]. The class
comes from flink-s3-fs-presto.

[1]:
https://ci.apache.org/projects/flink/flink-docs-release-1.6/api/java/org/apache/hadoop/conf/Configuration.html

Thanks, vino.
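
(A minimal sketch of the setting discussed in this thread, not a confirmed
answer. It assumes that the flink-s3-fs-presto factory forwards
flink-conf.yaml keys prefixed with "s3." to the shaded Presto S3 client
under the "presto.s3." prefix; nobody in the thread confirms this, so please
verify it against your Flink version. The value 10 is only a placeholder.)

```yaml
# flink-conf.yaml
# Assumption: keys starting with "s3." are passed through to the Presto S3
# filesystem as "presto.s3.*" keys. Not confirmed in this thread.
# The value below is illustrative, not a recommendation.
s3.max-client-retries: 10
```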

Tony Wei <tony19920...@gmail.com> wrote on Wed, Aug 29, 2018 at 2:18 PM:

> Hi Vino,
>
> I think this config is for the AWS S3 client, but that client is internal
> to flink-s3-fs-presto.
> So I guess I need to find a way to pass this config to that library.
>
> Best,
> Tony Wei
>
> 2018-08-29 14:13 GMT+08:00 vino yang <yanghua1...@gmail.com>:
>
>> Hi Tony,
>>
>> Sorry, I only noticed the timeout and thought the two cases were similar
>> because they both happened on AWS S3.
>> Regarding this setting, isn't "s3.max-client-retries: xxx" set for the
>> client?
>>
>> Thanks, vino.
>>
>> Tony Wei <tony19920...@gmail.com> wrote on Wed, Aug 29, 2018 at 1:17 PM:
>>
>>> Hi Vino,
>>>
>>> Thanks for your quick reply, but I think these two questions are
>>> different. The checkpoint in that question
>>> eventually finished, but my checkpoint failed due to an S3 client timeout.
>>> You can see from my screenshot that
>>> the checkpoint failed within a short time.
>>>
>>> Regarding the configuration, do you mean passing it as the program's
>>> input arguments? I don't
>>> think that will work. At the very least I would need to find a way to
>>> pass it to the S3 filesystem builder in my program. However,
>>> I would still like to ask for a way to pass it via flink-conf.yaml,
>>> because I use that file for the global S3
>>> filesystem settings, and I thought there might be a simple way to
>>> support this setting like the other s3.xxx options.
>>>
>>> I very much appreciate your answer and help.
>>>
>>> Best,
>>> Tony Wei
>>>
>>> 2018-08-29 11:51 GMT+08:00 vino yang <yanghua1...@gmail.com>:
>>>
>>>> Hi Tony,
>>>>
>>>> A while ago, I answered a similar question.[1]
>>>>
>>>> You can try to increase this value appropriately. You can't put this
>>>> configuration in flink-conf.yaml; instead, you can pass it in the job's
>>>> submit command[2] or in the configuration file you specify.
>>>>
>>>> [1]:
>>>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Why-checkpoint-took-so-long-td22364.html#a22375
>>>> [2]:
>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/cli.html
>>>>
>>>> Thanks, vino.
>>>>
>>>> Tony Wei <tony19920...@gmail.com> wrote on Wed, Aug 29, 2018 at 11:36 AM:
>>>>
>>>>> Hi,
>>>>>
>>>>> I ran into a checkpoint failure caused by an S3 exception.
>>>>>
>>>>> org.apache.flink.fs.s3presto.shaded.com.amazonaws.services.s3.model.AmazonS3Exception:
>>>>>> Your socket connection to the server was not read from or written to
>>>>>> within the timeout period. Idle connections will be closed.
>>>>>> (Service: Amazon S3;
>>>>>> Status Code: 400; Error Code: RequestTimeout; Request ID:
>>>>>> B8BE8978D3EFF3F5), S3 Extended Request ID:
>>>>>> ePKce/MjMFPPNYi90rGdYmDw3blfvi0xR2CcJpCISEgxM92/6JZAU4whpfXeV6SfG62cnts0NBw=
>>>>>
>>>>>
>>>>> The full stack trace and a screenshot are provided in the attachment.
>>>>>
>>>>> My setting for flink cluster and job:
>>>>>
>>>>>    - Flink version 1.4.0
>>>>>    - standalone mode
>>>>>    - 4 slots for each TM
>>>>>    - Presto S3 filesystem
>>>>>    - RocksDB state backend
>>>>>    - local SSD
>>>>>    - incremental checkpoints enabled
>>>>>
>>>>> There are no unusual messages in the log file besides the exception,
>>>>> and no high GC ratio during the checkpoint
>>>>> procedure. Also, 3 of the 4 parts were still uploaded successfully on
>>>>> that TM. I couldn't find anything that
>>>>> seems related to this failure. Has anyone run into this problem before?
>>>>>
>>>>> In addition, I found an issue in another AWS SDK[1] that mentions this
>>>>> S3 exception as well. One
>>>>> reply said you can passively avoid the problem by raising the max
>>>>> client retries config, and I found
>>>>> that config in Presto[2]. Can I just add s3.max-client-retries: xxx to
>>>>> flink-conf.yaml to configure
>>>>> it? If not, how should I override the default value of this
>>>>> configuration? Thanks in advance.
>>>>>
>>>>> Best,
>>>>> Tony Wei
>>>>>
>>>>> [1] https://github.com/aws/aws-sdk-php/issues/885
>>>>> [2]
>>>>> https://github.com/prestodb/presto/blob/master/presto-hive/src/main/java/com/facebook/presto/hive/s3/HiveS3Config.java#L218
>>>>>
>>>>
>>>
>
