[jira] [Created] (FLINK-7368) MetricStore makes cpu spin at 100%
Nico Chen created FLINK-7368: Summary: MetricStore makes cpu spin at 100% Key: FLINK-7368 URL: https://issues.apache.org/jira/browse/FLINK-7368 Project: Flink Issue Type: Bug Components: Metrics Reporter: Nico Chen Flink's `MetricStore` is not thread-safe. multi-treads may acess java' hashmap inside `MetricStore` and can tirgger hashmap's infinte loop. Recently I met the case that flink jobmanager consumed 100% cpu. A part of stacktrace is shown below. The full jstack is in the attachment. {code:java} "ForkJoinPool-1-worker-19" daemon prio=10 tid=0x7fbdacac9800 nid=0x64c1 runnable [0x7fbd7d1c2000] java.lang.Thread.State: RUNNABLE at java.util.HashMap.put(HashMap.java:494) at org.apache.flink.runtime.webmonitor.metrics.MetricStore.addMetric(MetricStore.java:176) at org.apache.flink.runtime.webmonitor.metrics.MetricStore.add(MetricStore.java:121) at org.apache.flink.runtime.webmonitor.metrics.MetricFetcher.addMetrics(MetricFetcher.java:198) at org.apache.flink.runtime.webmonitor.metrics.MetricFetcher.access$500(MetricFetcher.java:58) at org.apache.flink.runtime.webmonitor.metrics.MetricFetcher$4.onSuccess(MetricFetcher.java:188) at akka.dispatch.OnSuccess.internal(Future.scala:212) at akka.dispatch.japi$CallbackBridge.apply(Future.scala:175) at akka.dispatch.japi$CallbackBridge.apply(Future.scala:172) at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) at scala.runtime.AbstractPartialFunction.applyOrElse(AbstractPartialFunction.scala:28) at scala.concurrent.Future$$anonfun$onSuccess$1.apply(Future.scala:117) at scala.concurrent.Future$$anonfun$onSuccess$1.apply(Future.scala:115) at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32) at java.util.concurrent.ForkJoinTask$AdaptedRunnable.exec(ForkJoinTask.java:1265) at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:334) at java.util.concurrent.ForkJoinWorkerThread.execTask(ForkJoinWorkerThread.java:604) at java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:784) at java.util.concurrent.ForkJoinPool.work(ForkJoinPool.java:646) at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:398) {code} There are 24 threads show same stacktrace as above to indicate they are spining at HashMap.put(HashMap.java:494) (I am using Java 1.7.0_6). Many posts indicate multi-threads accessing hashmap cause this problem and I reproduce the case as well. Even through `MetricFetcher` has a 10 seconds minimum inteverl between each metrics qurey, it still cannot guarntee query responses do not acess `MtricStore`'s hashmap concurrently. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7367) Parameterize FlinkKinesisProducer on RecordMaxBufferedTime, MaxConnections, RequestTimeout, and more
Bowen Li created FLINK-7367: --- Summary: Parameterize FlinkKinesisProducer on RecordMaxBufferedTime, MaxConnections, RequestTimeout, and more Key: FLINK-7367 URL: https://issues.apache.org/jira/browse/FLINK-7367 Project: Flink Issue Type: Improvement Components: Kinesis Connector Affects Versions: 1.3.0 Reporter: Bowen Li Assignee: Bowen Li Fix For: 1.3.3 Right now, FlinkKinesisProducer only expose two configs for the underlying KinesisProducer: - AGGREGATION_MAX_COUNT - COLLECTION_MAX_COUNT Well, according to [AWS doc|http://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-config.html] and [their sample on github|https://github.com/awslabs/amazon-kinesis-producer/blob/master/java/amazon-kinesis-producer-sample/default_config.properties], developers can set more to make the max use of KinesisProducer, and make it fault-tolerant (e.g. by increasing timeout). We need to parameterize FlinkKinesisProducer in order to pass in params listed in AWS doc and sample -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7366) Upgrade kinesis-producer-library in flink-connector-kinesis from 0.10.2 to 0.12.5
Bowen Li created FLINK-7366: --- Summary: Upgrade kinesis-producer-library in flink-connector-kinesis from 0.10.2 to 0.12.5 Key: FLINK-7366 URL: https://issues.apache.org/jira/browse/FLINK-7366 Project: Flink Issue Type: Improvement Components: Kinesis Connector Reporter: Bowen Li Assignee: Bowen Li Fix For: 1.4.0 flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which is released in Nov 2015 by AWS. It's old. It's the fourth release, and thus problematic. Upgrade it to 0.12.5 released in May 2017 to take advantages of all the new features and improvements. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7365) warning of attempt to override final parameter: fs.s3.buffer.dir
Bowen Li created FLINK-7365: --- Summary: warning of attempt to override final parameter: fs.s3.buffer.dir Key: FLINK-7365 URL: https://issues.apache.org/jira/browse/FLINK-7365 Project: Flink Issue Type: Bug Reporter: Bowen Li I'm seeing hundreds of line of the following log in my JobManager log file: {code:java} 2017-08-03 19:48:45,330 WARN org.apache.hadoop.conf.Configuration - /usr/lib/hadoop/etc/hadoop/core-site.xml:an attempt to override final parameter: fs.s3.buffer.dir; Ignoring. 2017-08-03 19:48:45,485 WARN org.apache.hadoop.conf.Configuration - /etc/hadoop/conf/core-site.xml:an attempt to override final parameter: fs.s3.buffer.dir; Ignoring. 2017-08-03 19:48:45,486 WARN org.apache.hadoop.conf.Configuration - /usr/lib/hadoop/etc/hadoop/core-site.xml:an attempt to override final parameter: fs.s3.buffer.dir; Ignoring. 2017-08-03 19:48:45,626 WARN org.apache.hadoop.conf.Configuration - /etc/hadoop/conf/core-site.xml:an attempt to override final parameter: fs.s3.buffer.dir; Ignoring .. {code} Info of my Flink cluster: - Running on EMR with emr-5.6.0 - Using FSStateBackend, writing checkpointing data files to s3 - Configured s3 with S3AFileSystem according to https://ci.apache.org/projects/flink/flink-docs-release-1.4/setup/aws.html#set-s3-filesystem - AWS forbids resetting 'fs.s3.buffer.dir' value (it has a tag on this property in core-site.xml), so I set 'fs.s3a.buffer.dir' as '/tmp' Here's my core-site.xml file: {code:java} fs.s3.buffer.dir /mnt/s3,/mnt1/s3 true fs.s3.impl org.apache.hadoop.fs.s3a.S3AFileSystem fs.s3n.impl com.amazon.ws.emr.hadoop.fs.EmrFileSystem ipc.client.connect.max.retries.on.timeouts 5 hadoop.security.key.default.bitlength 256 hadoop.proxyuser.hadoop.groups * hadoop.tmp.dir /mnt/var/lib/hadoop/tmp hadoop.proxyuser.hadoop.hosts * io.file.buffer.size 65536 fs.AbstractFileSystem.s3.impl org.apache.hadoop.fs.s3.EMRFSDelegate fs.s3a.buffer.dir /tmp fs.s3bfs.impl org.apache.hadoop.fs.s3.S3FileSystem {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[CANCEL] [VOTE] Release 1.3.2, release candidate #2
Thanks for testing everyone. Unfortunately, I have to close this vote because of the problem with the Gelly Examples jar. I'll immediately create another RC (in fact I'm already on it). I'm hoping that we can do an expedited vote on that new RC because the only change in that RC is going to be a fix for the Gelly Examples jar. Best, Aljoscha > On 3. Aug 2017, at 14:10, Fabian Hueskewrote: > > +1 > > - checked hashes and signatures > - checked added and modified dependencies since Flink 1.3.1 > - io.dropwizard.metrics/metrics-core was upgraded to 3.1.5 and is ASL > - org.xerial.snappy/snappy-java/1.0.5 was added and is ASL > > - built from source > > Thanks, Fabian > > 2017-08-03 11:24 GMT+02:00 Stefan Richter : > >> +1 >> >> 1. Cluster tests >> >> Tested with the stateful state machine job on the following settings: >> - Cloud env: AWS >> - Distributions: EMR 5.7.0 >> - Flink deployment method: YARN (Hadoop 2.7.2) >> - HA: enabled >> - Kerberos: disabled >> - Kafka version: 0.10, 0.11 >> - State Backends: Heap (Sync / Async) & RocksDB (incremental / full) >> - Filesystem: S3 and HDFS (Hadoop 2.7.2) >> - Externalized checkpoints: enabled & disabled >> >> 2. Building with Scala 2.11 works >> >> 3. Building against Hadoop version works >> >> >>> Am 30.07.2017 um 09:07 schrieb Aljoscha Krettek : >>> >>> Hi everyone, >>> >>> Please review and vote on the release candidate #2 for the version >> 1.3.2, as follows: >>> [ ] +1, Approve the release >>> [ ] -1, Do not approve the release (please provide specific comments) >>> >>> >>> The complete staging area is available for your review, which includes: >>> * JIRA release notes [1], >>> * the official Apache source release and binary convenience releases to >> be deployed to dist.apache.org [2], which is signed with the key with >> fingerprint 0xA8F4FD97121D7293 [3], >>> * all artifacts to be deployed to the Maven Central Repository [4], >>> * source code tag "release-1.3.2-rc2" [5], >>> * website pull request listing the new release and adding announcement >> blog post [6]. >>> >>> The vote will be open for at least 72 hours (excluding this current >> weekend). It is adopted by majority approval, with at least 3 PMC >> affirmative votes. >>> >>> Please use the provided document, as discussed before, for coordinating >> the testing efforts: [7] >>> >>> Thanks, >>> Aljoscha >>> >>> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa? >> projectId=12315522=12340984 >>> [2] http://people.apache.org/~aljoscha/flink-1.3.2-rc2/ >>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS >>> [4] https://repository.apache.org/content/repositories/ >> orgapacheflink-1133/ >>> [5] https://git-wip-us.apache.org/repos/asf?p=flink.git;a=tag;h= >> e38825d0c8e7fe2191a4c657984d9939ed8dd0ad >>> [6] https://github.com/apache/flink-web/pull/75 >>> [7] https://docs.google.com/document/d/1dN9AM9FUPizIu4hTKAXJSbbAORRdr >> ce-BqQ8AUHlOqE/edit?usp=sharing >> >>
[jira] [Created] (FLINK-7364) Log exceptions from user code in streaming jobs
Elias Levy created FLINK-7364: - Summary: Log exceptions from user code in streaming jobs Key: FLINK-7364 URL: https://issues.apache.org/jira/browse/FLINK-7364 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 1.3.1 Reporter: Elias Levy Currently, if an exception arises in user supplied code within an operator in a streaming job, Flink terminates the job, but it fails to record the reason for the termination. The logs do not record that there was an exception at all, much less recording the type of exception and where it occurred. This makes it difficult to debug jobs without implementing exception recording code on all user supplied operators. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
Re: [DISCUSS] Service Authorization (redux)
Till, with (c) are you suggesting that we'd use Akka 2.3 for Scala 2.10 and Akka 2.4+ for Scala 2.11+? Sounds reasonable but I don't know how feasible it is. I will say I'm optimistic because a) Akka 2.4 is said to be binary compatible, and b) the Flakka fork appears to be subsumed by 2.4. Let us then take (c) as the tentative plan. I agree the community should discuss dropping Scala 2.10 but I don't want to drive that conversation. Thanks On Thu, Aug 3, 2017 at 6:24 AM, Ufuk Celebiwrote: > I haven't followed this discussion in detail nor am I familiar with > the service authorization topic or Flakka, but a) sounds like a lot of > maintenance work to me. > > If possible I would go with c) and maybe start a discussion about > dropping Scala 2.10 support to check whether that is a viable option > or not. > > – Ufuk > > > On Thu, Aug 3, 2017 at 1:59 PM, Till Rohrmann > wrote: > > Alternatively there would also be an option > > > > c) only support mutual auth for Akka 2.4+ if the backport is unrealistic > to > > do > > > > But this of course would break security for Scala 2.10. On the other hand > > people are already using Flink without this feature. > > > > Cheers, > > Till > > > > On Wed, Aug 2, 2017 at 7:21 PM, Eron Wright > wrote: > > > >> Thanks Till and Aljoscha for the feedback. > >> > >> Seems there are two ways to proceed here, if we accept mutual SSL as the > >> basis. > >> > >> a) Backport mutual-auth support from Akka 2.4 to Flakka. > >> b) Drop support for Scala 2.10 (FLINK-?), move to Akka 2.4 (FLINK-3662). > >> > >> Let's assume (a) for now. > >> > >> > >> > >> On Tue, Aug 1, 2017 at 3:05 PM, Till Rohrmann > >> wrote: > >> > >> > Dropping Java 7 alone is not enough to move to Akka 2.4+. For that we > >> need > >> > at least Scala 2.11. > >> > > >> > Cheers, > >> > Till > >> > > >> > On Tue, Aug 1, 2017 at 4:22 PM, Aljoscha Krettek > > >> > wrote: > >> > > >> > > Hi Eron, > >> > > > >> > > I think after Dropping support for Java 7 we will move to Akka > 2.4+, so > >> > we > >> > > should be good there. I think quite some users should find a (more) > >> > secure > >> > > Flink interesting. > >> > > > >> > > Best, > >> > > Aljoscha > >> > > > On 24. Jul 2017, at 03:11, Eron Wright > wrote: > >> > > > > >> > > > Hello, now might be a good time to revisit an important > enhancement > >> to > >> > > > Flink security, so-called service authorization. This means the > >> > > hardening > >> > > > of a Flink cluster against unauthorized use with some sort of > >> > > > authentication and authorization scheme. Today, Flink relies > >> entirely > >> > > on > >> > > > network isolation to protect itself from unauthorized job > submission > >> > and > >> > > > control, and to protect the secrets contained within a Flink > cluster. > >> > > > This is a problem in multi-user environments like YARN/Mesos/K8. > >> > > > > >> > > > Last fall, an effort was made to implement service authorization > but > >> > the > >> > > PR > >> > > > was ultimately rejected. The idea was to add a simple secret > key to > >> > all > >> > > > network communication between the client, JM, and TM. Akka > itself > >> has > >> > > > such a feature which formed the basis of the solution. There are > >> > > usability > >> > > > challenges with this solution, including a dependency on SSL. > >> > > > > >> > > > Since then, the situation has evolved somewhat, and the use of SSL > >> > mutual > >> > > > authentication is more viable. Mutual auth is supported in Akka > >> > 2.4.12+ > >> > > > (or could be backported to Flakka). My proposal is: > >> > > > > >> > > > 1. Upgrade Akka or backport the functionality to Flakka (see > commit > >> > > > 5d03902c5ec3212cd28f26c9b3ef7c3b628b9451). > >> > > > 2. Implement SSL on any endpoint that doesn't yet support it (e.g. > >> > > > queryable state). > >> > > > 3. Enable mutual auth in Akka and implement it on non-Akka > endpoints. > >> > > > 4. Implement a simple authorization layer that accepts any > >> > authenticated > >> > > > connection. > >> > > > 5. (stretch) generate and store a certificate automatically in > YARN > >> > mode. > >> > > > 6. (stretch) Develop an alternate authentication method for the > Web > >> UI. > >> > > > > >> > > > Are folks interested in this capability? Thoughts on the use of > SSL > >> > > mutual > >> > > > auth versus something else? Thanks! > >> > > > > >> > > > -Eron > >> > > > >> > > > >> > > >> >
State Backend
Hello, I would like to know if we have any latency requirements for choosing appropriate state backend? For example, if an HCFS implementation is used as Flink state backend (instead of stock HDFS), are there any implications that one needs to know with respect to the performance? - Frequency of read/write operations, random vs sequential reads- Load/Usage pattern (Frequent small updates vs bulk operation)- RocksDB->HCFS (Is this kind of recommended option to mitigate some of the challenges outlined above)- S3 Vs HDFS any performance numbers? Appreciate any inputs on this. RegardsVijay
Re: [DISCUSS] Service Authorization (redux)
I haven't followed this discussion in detail nor am I familiar with the service authorization topic or Flakka, but a) sounds like a lot of maintenance work to me. If possible I would go with c) and maybe start a discussion about dropping Scala 2.10 support to check whether that is a viable option or not. – Ufuk On Thu, Aug 3, 2017 at 1:59 PM, Till Rohrmannwrote: > Alternatively there would also be an option > > c) only support mutual auth for Akka 2.4+ if the backport is unrealistic to > do > > But this of course would break security for Scala 2.10. On the other hand > people are already using Flink without this feature. > > Cheers, > Till > > On Wed, Aug 2, 2017 at 7:21 PM, Eron Wright wrote: > >> Thanks Till and Aljoscha for the feedback. >> >> Seems there are two ways to proceed here, if we accept mutual SSL as the >> basis. >> >> a) Backport mutual-auth support from Akka 2.4 to Flakka. >> b) Drop support for Scala 2.10 (FLINK-?), move to Akka 2.4 (FLINK-3662). >> >> Let's assume (a) for now. >> >> >> >> On Tue, Aug 1, 2017 at 3:05 PM, Till Rohrmann >> wrote: >> >> > Dropping Java 7 alone is not enough to move to Akka 2.4+. For that we >> need >> > at least Scala 2.11. >> > >> > Cheers, >> > Till >> > >> > On Tue, Aug 1, 2017 at 4:22 PM, Aljoscha Krettek >> > wrote: >> > >> > > Hi Eron, >> > > >> > > I think after Dropping support for Java 7 we will move to Akka 2.4+, so >> > we >> > > should be good there. I think quite some users should find a (more) >> > secure >> > > Flink interesting. >> > > >> > > Best, >> > > Aljoscha >> > > > On 24. Jul 2017, at 03:11, Eron Wright wrote: >> > > > >> > > > Hello, now might be a good time to revisit an important enhancement >> to >> > > > Flink security, so-called service authorization. This means the >> > > hardening >> > > > of a Flink cluster against unauthorized use with some sort of >> > > > authentication and authorization scheme. Today, Flink relies >> entirely >> > > on >> > > > network isolation to protect itself from unauthorized job submission >> > and >> > > > control, and to protect the secrets contained within a Flink cluster. >> > > > This is a problem in multi-user environments like YARN/Mesos/K8. >> > > > >> > > > Last fall, an effort was made to implement service authorization but >> > the >> > > PR >> > > > was ultimately rejected. The idea was to add a simple secret key to >> > all >> > > > network communication between the client, JM, and TM. Akka itself >> has >> > > > such a feature which formed the basis of the solution. There are >> > > usability >> > > > challenges with this solution, including a dependency on SSL. >> > > > >> > > > Since then, the situation has evolved somewhat, and the use of SSL >> > mutual >> > > > authentication is more viable. Mutual auth is supported in Akka >> > 2.4.12+ >> > > > (or could be backported to Flakka). My proposal is: >> > > > >> > > > 1. Upgrade Akka or backport the functionality to Flakka (see commit >> > > > 5d03902c5ec3212cd28f26c9b3ef7c3b628b9451). >> > > > 2. Implement SSL on any endpoint that doesn't yet support it (e.g. >> > > > queryable state). >> > > > 3. Enable mutual auth in Akka and implement it on non-Akka endpoints. >> > > > 4. Implement a simple authorization layer that accepts any >> > authenticated >> > > > connection. >> > > > 5. (stretch) generate and store a certificate automatically in YARN >> > mode. >> > > > 6. (stretch) Develop an alternate authentication method for the Web >> UI. >> > > > >> > > > Are folks interested in this capability? Thoughts on the use of SSL >> > > mutual >> > > > auth versus something else? Thanks! >> > > > >> > > > -Eron >> > > >> > > >> > >>
Re: [VOTE] Release 1.3.2, release candidate #2
+1 - checked hashes and signatures - checked added and modified dependencies since Flink 1.3.1 - io.dropwizard.metrics/metrics-core was upgraded to 3.1.5 and is ASL - org.xerial.snappy/snappy-java/1.0.5 was added and is ASL - built from source Thanks, Fabian 2017-08-03 11:24 GMT+02:00 Stefan Richter: > +1 > > 1. Cluster tests > > Tested with the stateful state machine job on the following settings: > - Cloud env: AWS > - Distributions: EMR 5.7.0 > - Flink deployment method: YARN (Hadoop 2.7.2) > - HA: enabled > - Kerberos: disabled > - Kafka version: 0.10, 0.11 > - State Backends: Heap (Sync / Async) & RocksDB (incremental / full) > - Filesystem: S3 and HDFS (Hadoop 2.7.2) > - Externalized checkpoints: enabled & disabled > > 2. Building with Scala 2.11 works > > 3. Building against Hadoop version works > > > > Am 30.07.2017 um 09:07 schrieb Aljoscha Krettek : > > > > Hi everyone, > > > > Please review and vote on the release candidate #2 for the version > 1.3.2, as follows: > > [ ] +1, Approve the release > > [ ] -1, Do not approve the release (please provide specific comments) > > > > > > The complete staging area is available for your review, which includes: > > * JIRA release notes [1], > > * the official Apache source release and binary convenience releases to > be deployed to dist.apache.org [2], which is signed with the key with > fingerprint 0xA8F4FD97121D7293 [3], > > * all artifacts to be deployed to the Maven Central Repository [4], > > * source code tag "release-1.3.2-rc2" [5], > > * website pull request listing the new release and adding announcement > blog post [6]. > > > > The vote will be open for at least 72 hours (excluding this current > weekend). It is adopted by majority approval, with at least 3 PMC > affirmative votes. > > > > Please use the provided document, as discussed before, for coordinating > the testing efforts: [7] > > > > Thanks, > > Aljoscha > > > > [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa? > projectId=12315522=12340984 > > [2] http://people.apache.org/~aljoscha/flink-1.3.2-rc2/ > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS > > [4] https://repository.apache.org/content/repositories/ > orgapacheflink-1133/ > > [5] https://git-wip-us.apache.org/repos/asf?p=flink.git;a=tag;h= > e38825d0c8e7fe2191a4c657984d9939ed8dd0ad > > [6] https://github.com/apache/flink-web/pull/75 > > [7] https://docs.google.com/document/d/1dN9AM9FUPizIu4hTKAXJSbbAORRdr > ce-BqQ8AUHlOqE/edit?usp=sharing > >
Re: [DISCUSS] Service Authorization (redux)
Alternatively there would also be an option c) only support mutual auth for Akka 2.4+ if the backport is unrealistic to do But this of course would break security for Scala 2.10. On the other hand people are already using Flink without this feature. Cheers, Till On Wed, Aug 2, 2017 at 7:21 PM, Eron Wrightwrote: > Thanks Till and Aljoscha for the feedback. > > Seems there are two ways to proceed here, if we accept mutual SSL as the > basis. > > a) Backport mutual-auth support from Akka 2.4 to Flakka. > b) Drop support for Scala 2.10 (FLINK-?), move to Akka 2.4 (FLINK-3662). > > Let's assume (a) for now. > > > > On Tue, Aug 1, 2017 at 3:05 PM, Till Rohrmann > wrote: > > > Dropping Java 7 alone is not enough to move to Akka 2.4+. For that we > need > > at least Scala 2.11. > > > > Cheers, > > Till > > > > On Tue, Aug 1, 2017 at 4:22 PM, Aljoscha Krettek > > wrote: > > > > > Hi Eron, > > > > > > I think after Dropping support for Java 7 we will move to Akka 2.4+, so > > we > > > should be good there. I think quite some users should find a (more) > > secure > > > Flink interesting. > > > > > > Best, > > > Aljoscha > > > > On 24. Jul 2017, at 03:11, Eron Wright wrote: > > > > > > > > Hello, now might be a good time to revisit an important enhancement > to > > > > Flink security, so-called service authorization. This means the > > > hardening > > > > of a Flink cluster against unauthorized use with some sort of > > > > authentication and authorization scheme. Today, Flink relies > entirely > > > on > > > > network isolation to protect itself from unauthorized job submission > > and > > > > control, and to protect the secrets contained within a Flink cluster. > > > > This is a problem in multi-user environments like YARN/Mesos/K8. > > > > > > > > Last fall, an effort was made to implement service authorization but > > the > > > PR > > > > was ultimately rejected. The idea was to add a simple secret key to > > all > > > > network communication between the client, JM, and TM. Akka itself > has > > > > such a feature which formed the basis of the solution. There are > > > usability > > > > challenges with this solution, including a dependency on SSL. > > > > > > > > Since then, the situation has evolved somewhat, and the use of SSL > > mutual > > > > authentication is more viable. Mutual auth is supported in Akka > > 2.4.12+ > > > > (or could be backported to Flakka). My proposal is: > > > > > > > > 1. Upgrade Akka or backport the functionality to Flakka (see commit > > > > 5d03902c5ec3212cd28f26c9b3ef7c3b628b9451). > > > > 2. Implement SSL on any endpoint that doesn't yet support it (e.g. > > > > queryable state). > > > > 3. Enable mutual auth in Akka and implement it on non-Akka endpoints. > > > > 4. Implement a simple authorization layer that accepts any > > authenticated > > > > connection. > > > > 5. (stretch) generate and store a certificate automatically in YARN > > mode. > > > > 6. (stretch) Develop an alternate authentication method for the Web > UI. > > > > > > > > Are folks interested in this capability? Thoughts on the use of SSL > > > mutual > > > > auth versus something else? Thanks! > > > > > > > > -Eron > > > > > > > > >
[jira] [Created] (FLINK-7363) add hashes and signatures to the download page
Nico Kruber created FLINK-7363: -- Summary: add hashes and signatures to the download page Key: FLINK-7363 URL: https://issues.apache.org/jira/browse/FLINK-7363 Project: Flink Issue Type: Improvement Components: Project Website Reporter: Nico Kruber Assignee: Nico Kruber As part of the releases, we also generate MD5 hashes and cryptographic signatures but neither link to those nor do we explain which keys are valid release-signing keys. This should be added. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7362) CheckpointProperties are not correctly set when restoring savepoint with HA enabled
Aljoscha Krettek created FLINK-7362: --- Summary: CheckpointProperties are not correctly set when restoring savepoint with HA enabled Key: FLINK-7362 URL: https://issues.apache.org/jira/browse/FLINK-7362 Project: Flink Issue Type: Bug Components: State Backends, Checkpointing Affects Versions: 1.4.0, 1.3.2 Reporter: Aljoscha Krettek Fix For: 1.4.0 When restoring a savepoint on a HA setup (with ZooKeeper) the web frontend incorrectly says "Type: Checkpoint" in the information box about the latest restore event. The information that this uses is set here: https://github.com/apache/flink/blob/09caa9ffdc8168610c7d0260360c034ea87f904c/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java#L1101 It seems that the {{CheckpointProperties}} of a restored savepoint somehow get lost, maybe because of the recover step that the {{ZookeeperCompletedCheckpointStore}} is going through. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
Re: [VOTE] Release 1.3.2, release candidate #2
+1 1. Cluster tests Tested with the stateful state machine job on the following settings: - Cloud env: AWS - Distributions: EMR 5.7.0 - Flink deployment method: YARN (Hadoop 2.7.2) - HA: enabled - Kerberos: disabled - Kafka version: 0.10, 0.11 - State Backends: Heap (Sync / Async) & RocksDB (incremental / full) - Filesystem: S3 and HDFS (Hadoop 2.7.2) - Externalized checkpoints: enabled & disabled 2. Building with Scala 2.11 works 3. Building against Hadoop version works > Am 30.07.2017 um 09:07 schrieb Aljoscha Krettek: > > Hi everyone, > > Please review and vote on the release candidate #2 for the version 1.3.2, as > follows: > [ ] +1, Approve the release > [ ] -1, Do not approve the release (please provide specific comments) > > > The complete staging area is available for your review, which includes: > * JIRA release notes [1], > * the official Apache source release and binary convenience releases to be > deployed to dist.apache.org [2], which is signed with the key with > fingerprint 0xA8F4FD97121D7293 [3], > * all artifacts to be deployed to the Maven Central Repository [4], > * source code tag "release-1.3.2-rc2" [5], > * website pull request listing the new release and adding announcement blog > post [6]. > > The vote will be open for at least 72 hours (excluding this current weekend). > It is adopted by majority approval, with at least 3 PMC affirmative votes. > > Please use the provided document, as discussed before, for coordinating the > testing efforts: [7] > > Thanks, > Aljoscha > > [1] > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12340984 > [2] http://people.apache.org/~aljoscha/flink-1.3.2-rc2/ > [3] https://dist.apache.org/repos/dist/release/flink/KEYS > [4] https://repository.apache.org/content/repositories/orgapacheflink-1133/ > [5] > https://git-wip-us.apache.org/repos/asf?p=flink.git;a=tag;h=e38825d0c8e7fe2191a4c657984d9939ed8dd0ad > [6] https://github.com/apache/flink-web/pull/75 > [7] > https://docs.google.com/document/d/1dN9AM9FUPizIu4hTKAXJSbbAORRdrce-BqQ8AUHlOqE/edit?usp=sharing
[jira] [Created] (FLINK-7361) flink-web doesn't build with ruby 2.4
Nico Kruber created FLINK-7361: -- Summary: flink-web doesn't build with ruby 2.4 Key: FLINK-7361 URL: https://issues.apache.org/jira/browse/FLINK-7361 Project: Flink Issue Type: Bug Components: Project Website Reporter: Nico Kruber Assignee: Nico Kruber The dependencies pulled in by the old jekyll version do not build with ruby 2.4 and fail with something like {code} yajl_ext.c:881:22: error: 'rb_cFixnum' undeclared (first use in this function); did you mean 'rb_isalnum'? {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7360) Support Scala map type
Timo Walther created FLINK-7360: --- Summary: Support Scala map type Key: FLINK-7360 URL: https://issues.apache.org/jira/browse/FLINK-7360 Project: Flink Issue Type: New Feature Components: Table API & SQL Reporter: Timo Walther Currently, Flink SQL supports only Java `java.util.Map`. Scala maps are treated as a blackbox with Flink `GenericTypeInfo`/SQL `ANY` data type. Therefore, you can forward these blackboxes and use them within scalar functions but accessing with the `['key']` operator is not supported. We should convert these special collections at the beginning, in order to use in a SQL statement. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7359) flink-storm should update ComponentConfiguration to stormConfig
yf created FLINK-7359: - Summary: flink-storm should update ComponentConfiguration to stormConfig Key: FLINK-7359 URL: https://issues.apache.org/jira/browse/FLINK-7359 Project: Flink Issue Type: Bug Components: flink-contrib Affects Versions: 1.3.1 Reporter: yf We should first to get the config from the storm component config, then update it to storm config. -- This message was sent by Atlassian JIRA (v6.4.14#64029)