@Ufuk The cleanup bug for file:// checkpoints is not easy to fix IMHO.
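
To illustrate the problem Aljoscha describes in point 3) below, here is a minimal plain-Java sketch. It is not Flink code: the FileStateHandle class and the two temp directories standing in for the TaskManager and JobManager machines are hypothetical, and it only assumes that the handle carries nothing but a local path.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class LocalCheckpointCleanupSketch {

    /** Hypothetical stand-in for a file-backed state handle: it only remembers a path. */
    static final class FileStateHandle {
        final String localPath;

        FileStateHandle(String localPath) {
            this.localPath = localPath;
        }

        /** Deletes the file as seen from the machine whose file-system root is given. */
        boolean discard(Path machineRoot) throws IOException {
            return Files.deleteIfExists(machineRoot.resolve(localPath));
        }
    }

    public static void main(String[] args) throws IOException {
        // Two temp directories simulate the local disks of two different machines.
        Path taskManagerRoot = Files.createTempDirectory("taskmanager");
        Path jobManagerRoot = Files.createTempDirectory("jobmanager");

        // The TaskManager writes the checkpoint to its own local disk.
        Path checkpoint = taskManagerRoot.resolve("tmp/checkpoints/chk-1");
        Files.createDirectories(checkpoint.getParent());
        Files.write(checkpoint, new byte[] {1, 2, 3});

        // Only the handle (i.e. the path) travels to the JobManager.
        FileStateHandle handle = new FileStateHandle("tmp/checkpoints/chk-1");

        // discard() runs on the JobManager, so it looks at the JobManager's disk.
        boolean deleted = handle.discard(jobManagerRoot);
        System.out.println("deleted on JobManager: " + deleted);
        System.out.println("still on TaskManager: " + Files.exists(checkpoint));
    }
}
```

In this sketch the discard call finds nothing to delete and the checkpoint file stays on the simulated TaskManager disk. A fix would have to either run the cleanup where the file lives or require storage that both sides can reach, which is why I don't think it is a quick change.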

On Mon, 15 Jun 2015 at 15:39 Aljoscha Krettek <aljos...@apache.org> wrote:

> Oh yes, on that I agree. I'm just saying that the checkpoint setting
> should maybe be a central setting.
>
> On Mon, 15 Jun 2015 at 15:38 Matthias J. Sax <
> mj...@informatik.hu-berlin.de> wrote:
>
>> Hi,
>>
>> IMHO, it is very common that Workers do have their own config files (e.g.,
>> Storm works the same way). And I think it makes a lot of sense. You
>> might run Flink in a heterogeneous cluster and want to assign
>> different memory and slots to different hardware. This would not be
>> possible using a single config file (specified at the master and
>> distributed to the workers).
>>
>>
>> -Matthias
>>
>> On 06/15/2015 03:30 PM, Aljoscha Krettek wrote:
>> > Regarding 1), that's why I said "bugs and features". :D But I think of it as
>> > a bug, since people will normally set it in the flink-conf.yaml on the
>> > master and assume that it works. That's what I assumed and it took me a
>> > while to figure out that the task managers don't respect this setting.
>> >
>> > Regarding 3), if you think about it, this could never work. The state
>> > handle cleanup logic happens purely on the JobManager. So what happens is
>> > that the TaskManagers create state in some directory, let's say
>> > /tmp/checkpoints, on the TaskManager. For cleanup, the JobManager gets the
>> > state handle and calls discard (on the JobManager). This tries to clean up
>> > the state in /tmp/checkpoints, but of course there is nothing there, since
>> > we are still on the JobManager.
>> >
>> > On Mon, 15 Jun 2015 at 15:23 Márton Balassi <balassi.mar...@gmail.com>
>> > wrote:
>> >
>> >> @Aljoscha:
>> >> 1) I think this just means that you can set the state backend on a
>> >> taskmanager basis.
>> >> 3) This is a serious issue then. Does it work when you set it in the
>> >> flink-conf.yaml?
>> >>
>> >> On Mon, Jun 15, 2015 at 3:17 PM, Aljoscha Krettek <aljos...@apache.org>
>> >> wrote:
>> >>
>> >>> So, during my testing of the state checkpointing on a cluster I discovered
>> >>> several things (bugs and features):
>> >>>
>> >>> - If you have a setup where the configuration is not synced to the workers,
>> >>> they do not pick up the state back-end configuration. The workers do not
>> >>> respect the setting in the flink-conf.yaml on the master.
>> >>> - HDFS checkpointing works fine if you manually set it as the per-job
>> >>> state-backend using setStateHandleProvider()
>> >>> - If you manually set the stateHandleProvider to a "file://" backend, old
>> >>> checkpoints will not be cleaned up; they will also not be cleaned up when a
>> >>> job is finished.
>> >>>
>> >>> On Sun, 14 Jun 2015 at 23:22 Maximilian Michels <m...@apache.org> wrote:
>> >>>
>> >>>> Hi Henry,
>> >>>>
>> >>>> This is just a dry run. The goal is to get everything in shape for a proper
>> >>>> vote.
>> >>>>
>> >>>> Kind regards,
>> >>>> Max
>> >>>>
>> >>>>
>> >>>> On Sun, Jun 14, 2015 at 7:58 PM, Henry Saputra <henry.sapu...@gmail.com>
>> >>>> wrote:
>> >>>>
>> >>>>> Hi Max,
>> >>>>>
>> >>>>> Are you doing an official VOTE on the RC for the 0.9 release, or is this
>> >>>>> just a dry run?
>> >>>>>
>> >>>>>
>> >>>>> - Henry
>> >>>>>
>> >>>>> On Sun, Jun 14, 2015 at 9:11 AM, Maximilian Michels <m...@apache.org>
>> >>>>> wrote:
>> >>>>>> Dear Flink community,
>> >>>>>>
>> >>>>>> Here's the second release candidate for the 0.9.0 release. We haven't had a
>> >>>>>> formal vote on the previous release candidate but it received an implicit
>> >>>>>> -1 because of a couple of issues.
>> >>>>>>
>> >>>>>> Thanks to the hard-working Flink devs these issues should be solved now.
>> >>>>>> The following commits have been added to the second release candidate:
>> >>>>>>
>> >>>>>> f5f0709 [FLINK-2194] [type extractor] Excludes Writable type from WritableTypeInformation to be treated as an interface
>> >>>>>> 40e2df5 [FLINK-2072] [ml] Adds quickstart guide
>> >>>>>> af0fee5 [FLINK-2207] Fix TableAPI conversion documenation and further renamings for consistency.
>> >>>>>> e513be7 [FLINK-2206] Fix incorrect counts of finished, canceled, and failed jobs in webinterface
>> >>>>>> ecfde6d [docs][release] update stable version to 0.9.0
>> >>>>>> 4d8ae1c [docs] remove obsolete YARN link and cleanup download links
>> >>>>>> f27fc81 [FLINK-2195] Configure Configurable Hadoop InputFormats
>> >>>>>> ce3bc9c [streaming] [api-breaking] Minor DataStream cleanups
>> >>>>>> 0edc0c8 [build] [streaming] Streaming parents dependencies pushed to children
>> >>>>>> 6380b95 [streaming] Logging update for checkpointed streaming topologies
>> >>>>>> 5993e28 [FLINK-2199] Escape UTF characters in Scala Shell welcome squirrel.
>> >>>>>> 80dd72d [FLINK-2196] [javaAPI] Moved misplaced SortPartitionOperator class
>> >>>>>> c8c2e2c [hotfix] Bring KMeansDataGenerator and KMeans quickstart in sync
>> >>>>>> 77def9f [FLINK-2183][runtime] fix deadlock for concurrent slot release
>> >>>>>> 87988ae [scripts] remove quickstart scripts
>> >>>>>> f3a96de [streaming] Fixed streaming example jars packaging and termination
>> >>>>>> 255c554 [FLINK-2191] Fix inconsistent use of closure cleaner in Scala Streaming
>> >>>>>> 1343f26 [streaming] Allow force-enabling checkpoints for iterative jobs
>> >>>>>> c59d291 Fixed a few trivial issues:
>> >>>>>> e0e6f59 [streaming] Optional iteration feedback partitioning added
>> >>>>>> 348ac86 [hotfix] Fix YARNSessionFIFOITCase
>> >>>>>> 80cf2c5 [ml] Makes StandardScalers state package private and reduce redundant code. Adjusts flink-ml readme.
>> >>>>>> c83ee8a [FLINK-1844] [ml] Add MinMaxScaler implementation in the proprocessing package, test for the for the corresponding functionality and documentation.
>> >>>>>> ee7c417 [docs] [streaming] Added states and fold to the streaming docs
>> >>>>>> fcca75c [docs] Fix some typos and grammar in the Streaming Programming Guide.
>> >>>>>>
>> >>>>>>
>> >>>>>> Again, we need to test the new release candidate. Therefore, I've created a
>> >>>>>> new document where we keep track of our testing criteria for releases:
>> >>>>>>
>> >>>>>> https://docs.google.com/document/d/162AZEX8lo0Njal10mmt9wzM5GYVL5WME-VfwGmwpBoA/edit
>> >>>>>>
>> >>>>>> Everyone who tested previously could take a different task this time. For
>> >>>>>> some components we probably don't have to test again but, if in doubt,
>> >>>>>> testing twice doesn't hurt.
>> >>>>>>
>> >>>>>> Happy testing :)
>> >>>>>>
>> >>>>>> Cheers,
>> >>>>>> Max
>> >>>>>>
>> >>>>>> Git branch: release-0.9.0-rc2
>> >>>>>> Release binaries: http://people.apache.org/~mxm/flink-0.9.0-rc2/
>> >>>>>> Maven artifacts:
>> >>>>>> https://repository.apache.org/content/repositories/orgapacheflink-1040/
>> >>>>>> PGP public key for verifying the signatures:
>> >>>>>> http://pgp.mit.edu/pks/lookup?op=vindex&search=0xDE976D18C2909CBF
>> >>>>>
>> >>>>
>> >>>
>> >>
>> >
>>
>>
