@Ufuk The cleanup bug for file:// checkpoints is not easy to fix IMHO.

On Mon, 15 Jun 2015 at 15:39 Aljoscha Krettek <[email protected]> wrote:
> Oh yes, on that I agree. I'm just saying that the checkpoint setting
> should maybe be a central setting.
>
> On Mon, 15 Jun 2015 at 15:38 Matthias J. Sax <[email protected]> wrote:
>
>> Hi,
>>
>> IMHO, it is very common that Workers do have their own config files (eg,
>> Storm works the same way). And I think it makes a lot of sense. You
>> might run Flink in a heterogeneous cluster and you want to assign
>> different memory and slots for different hardware. This would not be
>> possible using a single config file (specified at the master and
>> distributed from there).
>>
>>
>> -Matthias
>>
>> On 06/15/2015 03:30 PM, Aljoscha Krettek wrote:
>> > Regarding 1), that's why I said "bugs and features". :D But I think of it as
>> > a bug, since people will normally set it in the flink-conf.yaml on the
>> > master and assume that it works. That's what I assumed and it took me a
>> > while to figure out that the task managers don't respect this setting.
>> >
>> > Regarding 3), if you think about it, this could never work. The state
>> > handle cleanup logic happens purely on the JobManager. So what happens is
>> > that the TaskManagers create state in some directory, let's say
>> > /tmp/checkpoints, on the TaskManager. For cleanup, the JobManager gets the
>> > state handle and calls discard (on the JobManager), this tries to clean up
>> > the state in /tmp/checkpoints, but of course, there is nothing there since
>> > we are still on the JobManager.
>> >
>> > On Mon, 15 Jun 2015 at 15:23 Márton Balassi <[email protected]>
>> > wrote:
>> >
>> >> @Aljoscha:
>> >> 1) I think this just means that you can set the state backend on a
>> >> taskmanager basis.
>> >> 3) This is a serious issue then. Does it work when you set it in the
>> >> flink-conf.yaml?
>> >>
>> >> On Mon, Jun 15, 2015 at 3:17 PM, Aljoscha Krettek <[email protected]>
>> >> wrote:
>> >>
>> >>> So, during my testing of the state checkpointing on a cluster I discovered
>> >>> several things (bugs and features):
>> >>>
>> >>> - If you have a setup where the configuration is not synced to the workers
>> >>> they do not pick up the state back-end configuration. The workers do not
>> >>> respect the setting in the flink-conf.yaml on the master
>> >>> - HDFS checkpointing works fine if you manually set it as the per-job
>> >>> state-backend using setStateHandleProvider()
>> >>> - If you manually set the stateHandleProvider to a "file://" backend, old
>> >>> checkpoints will not be cleaned up, they will also not be cleaned up when a
>> >>> job is finished.
>> >>>
>> >>> On Sun, 14 Jun 2015 at 23:22 Maximilian Michels <[email protected]> wrote:
>> >>>
>> >>>> Hi Henry,
>> >>>>
>> >>>> This is just a dry run. The goal is to get everything in shape for a proper
>> >>>> vote.
>> >>>>
>> >>>> Kind regards,
>> >>>> Max
>> >>>>
>> >>>>
>> >>>> On Sun, Jun 14, 2015 at 7:58 PM, Henry Saputra <[email protected]>
>> >>>> wrote:
>> >>>>
>> >>>>> Hi Max,
>> >>>>>
>> >>>>> Are you doing an official VOTE on the RC for the 0.9 release, or is this
>> >>>>> just a dry run?
>> >>>>>
>> >>>>>
>> >>>>> - Henry
>> >>>>>
>> >>>>> On Sun, Jun 14, 2015 at 9:11 AM, Maximilian Michels <[email protected]>
>> >>>>> wrote:
>> >>>>>> Dear Flink community,
>> >>>>>>
>> >>>>>> Here's the second release candidate for the 0.9.0 release. We haven't had a
>> >>>>>> formal vote on the previous release candidate but it received an implicit
>> >>>>>> -1 because of a couple of issues.
>> >>>>>>
>> >>>>>> Thanks to the hard-working Flink devs these issues should be solved now.
>> >>>>>> The following commits have been added to the second release candidate:
>> >>>>>>
>> >>>>>> f5f0709 [FLINK-2194] [type extractor] Excludes Writable type from
>> >>>>>> WritableTypeInformation to be treated as an interface
>> >>>>>> 40e2df5 [FLINK-2072] [ml] Adds quickstart guide
>> >>>>>> af0fee5 [FLINK-2207] Fix TableAPI conversion documenation and further
>> >>>>>> renamings for consistency.
>> >>>>>> e513be7 [FLINK-2206] Fix incorrect counts of finished, canceled, and failed
>> >>>>>> jobs in webinterface
>> >>>>>> ecfde6d [docs][release] update stable version to 0.9.0
>> >>>>>> 4d8ae1c [docs] remove obsolete YARN link and cleanup download links
>> >>>>>> f27fc81 [FLINK-2195] Configure Configurable Hadoop InputFormats
>> >>>>>> ce3bc9c [streaming] [api-breaking] Minor DataStream cleanups
>> >>>>>> 0edc0c8 [build] [streaming] Streaming parents dependencies pushed to
>> >>>>>> children
>> >>>>>> 6380b95 [streaming] Logging update for checkpointed streaming topologies
>> >>>>>> 5993e28 [FLINK-2199] Escape UTF characters in Scala Shell welcome squirrel.
>> >>>>>> 80dd72d [FLINK-2196] [javaAPI] Moved misplaced SortPartitionOperator class
>> >>>>>> c8c2e2c [hotfix] Bring KMeansDataGenerator and KMeans quickstart in sync
>> >>>>>> 77def9f [FLINK-2183][runtime] fix deadlock for concurrent slot release
>> >>>>>> 87988ae [scripts] remove quickstart scripts
>> >>>>>> f3a96de [streaming] Fixed streaming example jars packaging and termination
>> >>>>>> 255c554 [FLINK-2191] Fix inconsistent use of closure cleaner in Scala
>> >>>>>> Streaming
>> >>>>>> 1343f26 [streaming] Allow force-enabling checkpoints for iterative jobs
>> >>>>>> c59d291 Fixed a few trivial issues:
>> >>>>>> e0e6f59 [streaming] Optional iteration feedback partitioning added
>> >>>>>> 348ac86 [hotfix] Fix YARNSessionFIFOITCase
>> >>>>>> 80cf2c5 [ml] Makes StandardScalers state package private and reduce
>> >>>>>> redundant code. Adjusts flink-ml readme.
>> >>>>>> c83ee8a [FLINK-1844] [ml] Add MinMaxScaler implementation in the
>> >>>>>> proprocessing package, test for the for the corresponding functionality and
>> >>>>>> documentation.
>> >>>>>> ee7c417 [docs] [streaming] Added states and fold to the streaming docs
>> >>>>>> fcca75c [docs] Fix some typos and grammar in the Streaming Programming
>> >>>>>> Guide.
>> >>>>>>
>> >>>>>>
>> >>>>>> Again, we need to test the new release candidate. Therefore, I've created a
>> >>>>>> new document where we keep track of our testing criteria for releases:
>> >>>>>>
>> >>>>>> https://docs.google.com/document/d/162AZEX8lo0Njal10mmt9wzM5GYVL5WME-VfwGmwpBoA/edit
>> >>>>>>
>> >>>>>> Everyone who tested previously, could take a different task this time. For
>> >>>>>> some components we probably don't have to test again but, if in doubt,
>> >>>>>> testing twice doesn't hurt.
>> >>>>>>
>> >>>>>> Happy testing :)
>> >>>>>>
>> >>>>>> Cheers,
>> >>>>>> Max
>> >>>>>>
>> >>>>>> Git branch: release-0.9.0-rc2
>> >>>>>> Release binaries: http://people.apache.org/~mxm/flink-0.9.0-rc2/
>> >>>>>> Maven artifacts:
>> >>>>>> https://repository.apache.org/content/repositories/orgapacheflink-1040/
>> >>>>>> PGP public key for verifying the signatures:
>> >>>>>> http://pgp.mit.edu/pks/lookup?op=vindex&search=0xDE976D18C2909CBF
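
PS, on the file:// cleanup problem discussed in this thread: the following is only a minimal Java sketch of why calling discard on the JobManager cannot remove state that a TaskManager wrote to its own local disk. It is not Flink's code. The names PathStateHandle, writeCheckpoint and runScenario are hypothetical, and two temporary directories stand in for the local filesystems of two separate machines.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class FileCheckpointCleanupSketch {

    /** Hypothetical handle: it carries only a path, like a file:// URI with no host. */
    static final class PathStateHandle {
        private final String path; // e.g. "tmp/checkpoints/chk-1"

        PathStateHandle(String path) {
            this.path = path;
        }

        /**
         * Deletes the file on whichever machine executes this call.
         * Returns true only if a file was actually removed.
         */
        boolean discard(Path localFsRoot) throws IOException {
            return Files.deleteIfExists(localFsRoot.resolve(path));
        }
    }

    /** TaskManager side (assumed): write state to the local disk and hand back a handle. */
    static PathStateHandle writeCheckpoint(Path taskManagerRoot, String path, String state)
            throws IOException {
        Path file = taskManagerRoot.resolve(path);
        Files.createDirectories(file.getParent());
        Files.write(file, state.getBytes(StandardCharsets.UTF_8));
        return new PathStateHandle(path);
    }

    /** Returns { deletedOnJobManager, stillPresentOnTaskManager }. */
    static boolean[] runScenario() throws IOException {
        // Two temp directories simulate the local filesystems of two machines.
        Path taskManagerRoot = Files.createTempDirectory("taskmanager-fs");
        Path jobManagerRoot = Files.createTempDirectory("jobmanager-fs");

        String checkpointPath = "tmp/checkpoints/chk-1";
        PathStateHandle handle = writeCheckpoint(taskManagerRoot, checkpointPath, "operator state");

        // The cleanup runs on the JobManager, so the path resolves against its own disk.
        boolean deletedOnJobManager = handle.discard(jobManagerRoot);
        boolean stillOnTaskManager = Files.exists(taskManagerRoot.resolve(checkpointPath));
        return new boolean[] {deletedOnJobManager, stillOnTaskManager};
    }

    public static void main(String[] args) throws IOException {
        boolean[] result = runScenario();
        System.out.println("deleted on JobManager: " + result[0]);
        System.out.println("still on TaskManager:  " + result[1]);
    }
}
```

In this sketch the discard call finds nothing to delete, while the file on the simulated TaskManager disk stays behind. With a shared filesystem such as HDFS both sides would resolve the same path, which matches the observation in the thread that HDFS checkpointing works.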
