We are on 1.8 as of now and will give "stop with savepoint" a try once we
upgrade. For now, I am trying to cancel the job with a savepoint and restore
it from that savepoint afterwards.

I think there is an issue with how our S3 lifecycle rules are configured. Thank
you for your help.

On Sun, Aug 18, 2019 at 8:10 AM Stephan Ewen <se...@apache.org> wrote:

> My first guess would also be the same as Rafi's: the lifetime of the MPU
> (multipart upload) part files is too low for that use case.
>
> Maybe this can help:
>
>   - If you want to stop a job with a savepoint and plan to restore from it
> later (possibly much later, so that the MPU part lifetime might be
> exceeded), then I would recommend using Flink 1.9's new "stop with
> savepoint" feature. That should finalize in-flight uploads and make sure no
> lingering part files exist.
>
>   - If you take a savepoint out of a running job to start a new job, you
> probably need to configure the sink differently anyway, so that it does not
> interfere with the running job. In that case, I would suggest changing the
> name of the sink (the operator uid) so that the new job's sink doesn't try
> to resume (and interfere with) the running job's sink. A sketch follows
> below.
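>
> For the 1.9 route, stopping with a savepoint should be a CLI call along the
> lines of "./bin/flink stop -p <savepoint-dir> <job-id>" (instead of "flink
> cancel -s" on 1.8). For the second point, here is a minimal sketch of what I
> mean, assuming a row-format StreamingFileSink; the bucket path and uid
> values are just placeholders:
>
> import org.apache.flink.api.common.serialization.SimpleStringEncoder;
> import org.apache.flink.core.fs.Path;
> import org.apache.flink.streaming.api.datastream.DataStream;
> import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
>
> public class SinkUidExample {
>
>     // Attach the S3 sink under a *new* operator uid so the new job starts
>     // with fresh sink state instead of trying to resume the old sink's
>     // in-progress multipart uploads.
>     public static void attachSink(DataStream<String> aggregated) {
>         StreamingFileSink<String> sink = StreamingFileSink
>             .forRowFormat(new Path("s3://my-bucket/output"),
>                           new SimpleStringEncoder<String>("UTF-8"))
>             .build();
>
>         aggregated.addSink(sink)
>             .uid("s3-sink-v2")    // changed uid, e.g. the old one was "s3-sink"
>             .name("s3-sink-v2");
>     }
> }
>
> If you restore the new job from the old savepoint with a changed sink uid,
> you will likely also need to allow non-restored state ("flink run -s
> <savepoint> -n ..."), since the old sink's state no longer maps to any
> operator.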
>
> Best,
> Stephan
>
>
>
> On Sat, Aug 17, 2019 at 11:23 PM Rafi Aroch <rafi.ar...@gmail.com> wrote:
>
>> Hi,
>>
>> S3 would delete files only if you have 'lifecycle rules' [1] defined on
>> the bucket. Could that be the case? If so, make sure to disable / extend
>> the object expiration period.
>>
>> [1]
>> https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lifecycle-mgmt.html
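>>
>> One way to check is to dump the bucket's lifecycle configuration, for
>> example with the AWS SDK for Java (a rough sketch; the bucket name is a
>> placeholder, and the interesting fields are the object expiration and the
>> abort-incomplete-multipart-upload settings):
>>
>> import com.amazonaws.services.s3.AmazonS3;
>> import com.amazonaws.services.s3.AmazonS3ClientBuilder;
>> import com.amazonaws.services.s3.model.AbortIncompleteMultipartUpload;
>> import com.amazonaws.services.s3.model.BucketLifecycleConfiguration;
>>
>> public class InspectLifecycle {
>>     public static void main(String[] args) {
>>         // Uses the default credentials and region from the environment.
>>         AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
>>         // Returns null when the bucket has no lifecycle configuration.
>>         BucketLifecycleConfiguration cfg =
>>             s3.getBucketLifecycleConfiguration("my-flink-sink-bucket");
>>         if (cfg == null) {
>>             System.out.println("No lifecycle rules configured.");
>>             return;
>>         }
>>         for (BucketLifecycleConfiguration.Rule rule : cfg.getRules()) {
>>             AbortIncompleteMultipartUpload abortMpu =
>>                 rule.getAbortIncompleteMultipartUpload();
>>             System.out.println(rule.getId()
>>                 + " status=" + rule.getStatus()
>>                 + " expirationInDays=" + rule.getExpirationInDays()
>>                 + " abortIncompleteMpuAfterDays="
>>                 + (abortMpu == null ? "none" : abortMpu.getDaysAfterInitiation()));
>>         }
>>     }
>> }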
>>
>> Thanks,
>> Rafi
>>
>>
>> On Sat, Aug 17, 2019 at 1:48 AM Oytun Tez <oy...@motaword.com> wrote:
>>
>>> Hi Swapnil,
>>>
>>> I am not familiar with the StreamingFileSink, however, this sounds like
>>> a checkpointing issue to me. The FileSink should keep its sink state, and
>>> remove from that state the files that it *really successfully* sinks
>>> (perhaps you may want to add a validation here against S3 to check file
>>> integrity). This leaves us in the state with the failed files, partial
>>> files, etc.
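>>>
>>> If it is a checkpointing issue: as far as I understand, the sink only
>>> finalizes its part files when a checkpoint completes, so checkpointing
>>> has to be enabled in the job. A minimal sketch (the interval is
>>> arbitrary):
>>>
>>> import org.apache.flink.streaming.api.CheckpointingMode;
>>> import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
>>>
>>> public class CheckpointingSetup {
>>>     public static StreamExecutionEnvironment create() {
>>>         StreamExecutionEnvironment env =
>>>             StreamExecutionEnvironment.getExecutionEnvironment();
>>>         // StreamingFileSink commits (finalizes) pending part files only
>>>         // when a checkpoint completes, so checkpointing must be enabled.
>>>         env.enableCheckpointing(60_000, CheckpointingMode.EXACTLY_ONCE);
>>>         return env;
>>>     }
>>> }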
>>>
>>>
>>>
>>> ---
>>> Oytun Tez
>>>
>>> *M O T A W O R D*
>>> The World's Fastest Human Translation Platform.
>>> oy...@motaword.com — www.motaword.com
>>>
>>>
>>> On Fri, Aug 16, 2019 at 6:02 PM Swapnil Kumar <swku...@zendesk.com>
>>> wrote:
>>>
>>>> Hello, we are using Flink to process input events, aggregate them, and
>>>> write the output of our streaming job to S3 using StreamingFileSink, but
>>>> whenever we try to restore the job from a savepoint, the restoration fails
>>>> with a "missing part files" error. As per my understanding, S3 deletes
>>>> those intermediate part files, so they can no longer be found on S3. Is
>>>> there a workaround for this, so that we can use S3 as a sink?
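>>>>
>>>> For what it's worth, one way we can check whether the savepoint's
>>>> in-progress uploads still exist is to list the pending multipart uploads
>>>> on the bucket, e.g. with the AWS SDK for Java (a sketch; the bucket name
>>>> is a placeholder):
>>>>
>>>> import com.amazonaws.services.s3.AmazonS3;
>>>> import com.amazonaws.services.s3.AmazonS3ClientBuilder;
>>>> import com.amazonaws.services.s3.model.ListMultipartUploadsRequest;
>>>> import com.amazonaws.services.s3.model.MultipartUpload;
>>>>
>>>> public class ListPendingUploads {
>>>>     public static void main(String[] args) {
>>>>         AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
>>>>         // Pending (not yet completed or aborted) multipart uploads; the
>>>>         // sink's in-flight data is typically held here. First page only,
>>>>         // the listing may be truncated for large buckets.
>>>>         for (MultipartUpload upload : s3
>>>>                 .listMultipartUploads(
>>>>                     new ListMultipartUploadsRequest("my-flink-sink-bucket"))
>>>>                 .getMultipartUploads()) {
>>>>             System.out.println(upload.getKey()
>>>>                 + " uploadId=" + upload.getUploadId()
>>>>                 + " initiated=" + upload.getInitiated());
>>>>         }
>>>>     }
>>>> }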
>>>>
>>>> --
>>>> Thanks,
>>>> Swapnil Kumar
>>>>
>>>

-- 
Thanks,
Swapnil Kumar
