[jira] [Created] (FLINK-7368) MetricStore makes cpu spin at 100%

2017-08-03 Thread Nico Chen (JIRA)
Nico Chen created FLINK-7368:


 Summary: MetricStore makes cpu spin at 100%
 Key: FLINK-7368
 URL: https://issues.apache.org/jira/browse/FLINK-7368
 Project: Flink
  Issue Type: Bug
  Components: Metrics
Reporter: Nico Chen


Flink's `MetricStore` is not thread-safe. Multiple threads may access the Java HashMap 
inside `MetricStore` concurrently and trigger HashMap's infinite-loop behavior. 
Recently I hit a case where the Flink JobManager consumed 100% CPU. Part of the 
stack trace is shown below; the full jstack output is in the attachment.
{code:java}
"ForkJoinPool-1-worker-19" daemon prio=10 tid=0x7fbdacac9800 nid=0x64c1 
runnable [0x7fbd7d1c2000]
   java.lang.Thread.State: RUNNABLE
at java.util.HashMap.put(HashMap.java:494)
at 
org.apache.flink.runtime.webmonitor.metrics.MetricStore.addMetric(MetricStore.java:176)
at 
org.apache.flink.runtime.webmonitor.metrics.MetricStore.add(MetricStore.java:121)
at 
org.apache.flink.runtime.webmonitor.metrics.MetricFetcher.addMetrics(MetricFetcher.java:198)
at 
org.apache.flink.runtime.webmonitor.metrics.MetricFetcher.access$500(MetricFetcher.java:58)
at 
org.apache.flink.runtime.webmonitor.metrics.MetricFetcher$4.onSuccess(MetricFetcher.java:188)
at akka.dispatch.OnSuccess.internal(Future.scala:212)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:175)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:172)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
at 
scala.runtime.AbstractPartialFunction.applyOrElse(AbstractPartialFunction.scala:28)
at scala.concurrent.Future$$anonfun$onSuccess$1.apply(Future.scala:117)
at scala.concurrent.Future$$anonfun$onSuccess$1.apply(Future.scala:115)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at 
java.util.concurrent.ForkJoinTask$AdaptedRunnable.exec(ForkJoinTask.java:1265)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:334)
at 
java.util.concurrent.ForkJoinWorkerThread.execTask(ForkJoinWorkerThread.java:604)
at java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:784)
at java.util.concurrent.ForkJoinPool.work(ForkJoinPool.java:646)
at 
java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:398)
{code}

There are 24 threads showing the same stack trace as above, indicating they are spinning 
at HashMap.put(HashMap.java:494) (I am using Java 1.7.0_6). Many posts point out that 
multiple threads accessing a HashMap concurrently can cause this problem, and I reproduced 
the case as well. Even though `MetricFetcher` enforces a 10-second minimum interval between 
metrics queries, it still cannot guarantee that query responses do not access 
`MetricStore`'s HashMap concurrently. 
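
For reference, here is a minimal, standalone sketch of the underlying hazard (plain JDK code, not Flink code; class and thread names are made up). Two threads putting into an unsynchronized HashMap can, on JDK 7, corrupt a bucket's linked list during a resize, after which a later put or get spins forever; the demo may or may not hit the race on any given run. Synchronizing all MetricStore accesses on a common lock, or switching to ConcurrentHashMap, avoids it.

{code:java}
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the race (not Flink code; names are illustrative).
public class HashMapRaceDemo {

    // Shared, non-thread-safe map, analogous to the maps inside MetricStore.
    private static final Map<String, String> metrics = new HashMap<String, String>();

    public static void main(String[] args) throws InterruptedException {
        Runnable writer = new Runnable() {
            @Override
            public void run() {
                for (int i = 0; i < 100000; i++) {
                    // Unsynchronized puts from two threads: on JDK 7 a racy resize
                    // can create a cycle in a bucket's linked list, so a later
                    // put/get loops forever at 100% CPU.
                    metrics.put(Thread.currentThread().getName() + "-" + i, "v");
                }
            }
        };
        Thread t1 = new Thread(writer, "fetcher-1");
        Thread t2 = new Thread(writer, "fetcher-2");
        t1.start();
        t2.start();
        t1.join();
        t2.join();

        // If this line is never reached, the map was corrupted by the race.
        System.out.println("done, size=" + metrics.size());

        // Fix: synchronize all accesses on one lock, or use
        // java.util.concurrent.ConcurrentHashMap instead of HashMap.
    }
}
{code}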
 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7367) Parameterize FlinkKinesisProducer on RecordMaxBufferedTime, MaxConnections, RequestTimeout, and more

2017-08-03 Thread Bowen Li (JIRA)
Bowen Li created FLINK-7367:
---

 Summary: Parameterize FlinkKinesisProducer on 
RecordMaxBufferedTime, MaxConnections, RequestTimeout, and more
 Key: FLINK-7367
 URL: https://issues.apache.org/jira/browse/FLINK-7367
 Project: Flink
  Issue Type: Improvement
  Components: Kinesis Connector
Affects Versions: 1.3.0
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.3.3


Right now, FlinkKinesisProducer only exposes two configs for the underlying 
KinesisProducer:

- AGGREGATION_MAX_COUNT
- COLLECTION_MAX_COUNT

According to the [AWS 
doc|http://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-config.html] and 
[their sample on 
github|https://github.com/awslabs/amazon-kinesis-producer/blob/master/java/amazon-kinesis-producer-sample/default_config.properties],
 developers can set many more options to make the most of KinesisProducer and to 
make it fault-tolerant (e.g. by increasing the timeout).

We need to parameterize FlinkKinesisProducer so that the parameters listed in the 
AWS doc and sample can be passed in.
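
For illustration, usage could end up looking roughly like the sketch below. The first two keys are the constants already exposed today; the bare KPL keys (RecordMaxBufferedTime, MaxConnections, RequestTimeout) are taken from the KPL sample config, and whether they would be forwarded exactly like this is what this ticket has to define.

{code:java}
import java.util.Properties;

import org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer;
import org.apache.flink.streaming.connectors.kinesis.config.ProducerConfigConstants;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class KinesisProducerConfigSketch {

    public static void main(String[] args) {
        Properties producerConfig = new Properties();
        // Region for the producer (credentials config omitted in this sketch).
        producerConfig.put("aws.region", "us-east-1");

        // The two knobs exposed today:
        producerConfig.put(ProducerConfigConstants.COLLECTION_MAX_COUNT, "1000");
        producerConfig.put(ProducerConfigConstants.AGGREGATION_MAX_COUNT, "4294967295");

        // Proposed: also forward KPL settings such as these (key names are
        // illustrative, taken from the KPL default_config.properties):
        producerConfig.put("RecordMaxBufferedTime", "100");
        producerConfig.put("MaxConnections", "24");
        producerConfig.put("RequestTimeout", "6000");

        FlinkKinesisProducer<String> producer =
                new FlinkKinesisProducer<String>(new SimpleStringSchema(), producerConfig);
        // ... attach 'producer' as the sink of a DataStream<String> as usual.
    }
}
{code}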



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7366) Upgrade kinesis-producer-library in flink-connector-kinesis from 0.10.2 to 0.12.5

2017-08-03 Thread Bowen Li (JIRA)
Bowen Li created FLINK-7366:
---

 Summary: Upgrade kinesis-producer-library in 
flink-connector-kinesis from 0.10.2 to 0.12.5
 Key: FLINK-7366
 URL: https://issues.apache.org/jira/browse/FLINK-7366
 Project: Flink
  Issue Type: Improvement
  Components: Kinesis Connector
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.4.0


flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which was 
released by AWS in Nov 2015. It is old, and only the fourth release of the library, 
and thus problematic.

Upgrade it to 0.12.5, released in May 2017, to take advantage of all the new 
features and improvements.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7365) warning of attempt to override final parameter: fs.s3.buffer.dir

2017-08-03 Thread Bowen Li (JIRA)
Bowen Li created FLINK-7365:
---

 Summary: warning of attempt to override final parameter: 
fs.s3.buffer.dir
 Key: FLINK-7365
 URL: https://issues.apache.org/jira/browse/FLINK-7365
 Project: Flink
  Issue Type: Bug
Reporter: Bowen Li


I'm seeing hundreds of lines of the following log message in my JobManager log file:


{code:java}
2017-08-03 19:48:45,330 WARN  org.apache.hadoop.conf.Configuration  
- /usr/lib/hadoop/etc/hadoop/core-site.xml:an attempt to override 
final parameter: fs.s3.buffer.dir;  Ignoring.
2017-08-03 19:48:45,485 WARN  org.apache.hadoop.conf.Configuration  
- /etc/hadoop/conf/core-site.xml:an attempt to override final 
parameter: fs.s3.buffer.dir;  Ignoring.
2017-08-03 19:48:45,486 WARN  org.apache.hadoop.conf.Configuration  
- /usr/lib/hadoop/etc/hadoop/core-site.xml:an attempt to override 
final parameter: fs.s3.buffer.dir;  Ignoring.
2017-08-03 19:48:45,626 WARN  org.apache.hadoop.conf.Configuration  
- /etc/hadoop/conf/core-site.xml:an attempt to override final 
parameter: fs.s3.buffer.dir;  Ignoring
..
{code}

Info of my Flink cluster:

- Running on EMR with emr-5.6.0
- Using FSStateBackend, writing checkpointing data files to s3
- Configured s3 with S3AFileSystem according to 
https://ci.apache.org/projects/flink/flink-docs-release-1.4/setup/aws.html#set-s3-filesystem
- AWS forbids overriding the 'fs.s3.buffer.dir' value (the property is marked final 
in core-site.xml), so I set 'fs.s3a.buffer.dir' to '/tmp'

Here's my core-site.xml file:


{code:xml}
<?xml version="1.0"?>
<configuration>

  <property>
    <name>fs.s3.buffer.dir</name>
    <value>/mnt/s3,/mnt1/s3</value>
    <final>true</final>
  </property>

  <property>
    <name>fs.s3.impl</name>
    <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
  </property>

  <property>
    <name>fs.s3n.impl</name>
    <value>com.amazon.ws.emr.hadoop.fs.EmrFileSystem</value>
  </property>

  <property>
    <name>ipc.client.connect.max.retries.on.timeouts</name>
    <value>5</value>
  </property>

  <property>
    <name>hadoop.security.key.default.bitlength</name>
    <value>256</value>
  </property>

  <property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <value>*</value>
  </property>

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/mnt/var/lib/hadoop/tmp</value>
  </property>

  <property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <value>*</value>
  </property>

  <property>
    <name>io.file.buffer.size</name>
    <value>65536</value>
  </property>

  <property>
    <name>fs.AbstractFileSystem.s3.impl</name>
    <value>org.apache.hadoop.fs.s3.EMRFSDelegate</value>
  </property>

  <property>
    <name>fs.s3a.buffer.dir</name>
    <value>/tmp</value>
  </property>

  <property>
    <name>fs.s3bfs.impl</name>
    <value>org.apache.hadoop.fs.s3.S3FileSystem</value>
  </property>

</configuration>
{code}






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[CANCEL] [VOTE] Release 1.3.2, release candidate #2

2017-08-03 Thread Aljoscha Krettek
Thanks for testing, everyone. Unfortunately, I have to close this vote because 
of the problem with the Gelly Examples jar. I'll immediately create another RC 
(in fact, I'm already on it). I'm hoping that we can do an expedited vote on 
that new RC, because the only change in it is going to be a fix for the 
Gelly Examples jar.

Best,
Aljoscha
> On 3. Aug 2017, at 14:10, Fabian Hueske  wrote:
> 
> +1
> 
> - checked hashes and signatures
> - checked added and modified dependencies since Flink 1.3.1
>  - io.dropwizard.metrics/metrics-core was upgraded to 3.1.5 and is ASL
>  - org.xerial.snappy/snappy-java/1.0.5 was added and is ASL
> 
> - built from source
> 
> Thanks, Fabian
> 
> 2017-08-03 11:24 GMT+02:00 Stefan Richter :
> 
>> +1
>> 
>> 1. Cluster tests
>> 
>> Tested with the stateful state machine job on the following settings:
>> - Cloud env: AWS
>> - Distributions: EMR 5.7.0
>> - Flink deployment method: YARN (Hadoop 2.7.2)
>> - HA: enabled
>> - Kerberos: disabled
>> - Kafka version: 0.10, 0.11
>> - State Backends: Heap (Sync / Async) & RocksDB (incremental / full)
>> - Filesystem: S3 and HDFS (Hadoop 2.7.2)
>> - Externalized checkpoints: enabled & disabled
>> 
>> 2. Building with Scala 2.11 works
>> 
>> 3. Building against Hadoop version works
>> 
>> 
>>> Am 30.07.2017 um 09:07 schrieb Aljoscha Krettek :
>>> 
>>> Hi everyone,
>>> 
>>> Please review and vote on the release candidate #2 for the version
>> 1.3.2, as follows:
>>> [ ] +1, Approve the release
>>> [ ] -1, Do not approve the release (please provide specific comments)
>>> 
>>> 
>>> The complete staging area is available for your review, which includes:
>>> * JIRA release notes [1],
>>> * the official Apache source release and binary convenience releases to
>> be deployed to dist.apache.org [2], which is signed with the key with
>> fingerprint 0xA8F4FD97121D7293 [3],
>>> * all artifacts to be deployed to the Maven Central Repository [4],
>>> * source code tag "release-1.3.2-rc2" [5],
>>> * website pull request listing the new release and adding announcement
>> blog post [6].
>>> 
>>> The vote will be open for at least 72 hours (excluding this current
>> weekend). It is adopted by majority approval, with at least 3 PMC
>> affirmative votes.
>>> 
>>> Please use the provided document, as discussed before, for coordinating
>> the testing efforts: [7]
>>> 
>>> Thanks,
>>> Aljoscha
>>> 
>>> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?
>> projectId=12315522=12340984
>>> [2] http://people.apache.org/~aljoscha/flink-1.3.2-rc2/
>>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>>> [4] https://repository.apache.org/content/repositories/
>> orgapacheflink-1133/
>>> [5] https://git-wip-us.apache.org/repos/asf?p=flink.git;a=tag;h=
>> e38825d0c8e7fe2191a4c657984d9939ed8dd0ad
>>> [6] https://github.com/apache/flink-web/pull/75
>>> [7] https://docs.google.com/document/d/1dN9AM9FUPizIu4hTKAXJSbbAORRdr
>> ce-BqQ8AUHlOqE/edit?usp=sharing
>> 
>> 



[jira] [Created] (FLINK-7364) Log exceptions from user code in streaming jobs

2017-08-03 Thread Elias Levy (JIRA)
Elias Levy created FLINK-7364:
-

 Summary: Log exceptions from user code in streaming jobs
 Key: FLINK-7364
 URL: https://issues.apache.org/jira/browse/FLINK-7364
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.3.1
Reporter: Elias Levy


Currently, if an exception arises in user-supplied code within an operator in a 
streaming job, Flink terminates the job but fails to record the reason for the 
termination. The logs do not record that there was an exception at all, much less 
the type of exception and where it occurred. This makes it difficult to debug jobs 
without adding exception-logging code to every user-supplied operator. 
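
For context, the workaround mentioned above looks roughly like the sketch below (a hypothetical wrapper; the class name and log message are made up): every user function has to catch, log, and rethrow its own exceptions so that the cause at least shows up in the task logs.

{code:java}
import org.apache.flink.api.common.functions.MapFunction;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical wrapper illustrating the per-operator exception-logging workaround.
public class LoggingMap<IN, OUT> implements MapFunction<IN, OUT> {

    private static final Logger LOG = LoggerFactory.getLogger(LoggingMap.class);

    private final MapFunction<IN, OUT> wrapped;

    public LoggingMap(MapFunction<IN, OUT> wrapped) {
        this.wrapped = wrapped;
    }

    @Override
    public OUT map(IN value) throws Exception {
        try {
            return wrapped.map(value);
        } catch (Exception e) {
            // Record the failure ourselves, then rethrow so the job still fails.
            LOG.error("User code threw an exception while processing {}", value, e);
            throw e;
        }
    }
}
{code}

Having to wrap every single operator like this is exactly the burden this ticket asks Flink to remove by logging the exception itself.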



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [DISCUSS] Service Authorization (redux)

2017-08-03 Thread Eron Wright
Till, with (c) are you suggesting that we'd use Akka 2.3 for Scala 2.10 and
Akka 2.4+ for Scala 2.11+?   Sounds reasonable but I don't know how
feasible it is.   I will say I'm optimistic because a) Akka 2.4 is said to
be binary compatible, and b) the Flakka fork appears to be subsumed by 2.4.

Let us then take (c) as the tentative plan.

I agree the community should discuss dropping Scala 2.10 but I don't want
to drive that conversation.

Thanks


On Thu, Aug 3, 2017 at 6:24 AM, Ufuk Celebi  wrote:

> I haven't followed this discussion in detail nor am I familiar with
> the service authorization topic or Flakka, but a) sounds like a lot of
> maintenance work to me.
>
> If possible I would go with c) and maybe start a discussion about
> dropping Scala 2.10 support to check whether that is a viable option
> or not.
>
> – Ufuk
>
>
> On Thu, Aug 3, 2017 at 1:59 PM, Till Rohrmann 
> wrote:
> > Alternatively there would also be an option
> >
> > c) only support mutual auth for Akka 2.4+ if the backport is unrealistic
> to
> > do
> >
> > But this of course would break security for Scala 2.10. On the other hand
> > people are already using Flink without this feature.
> >
> > Cheers,
> > Till
> >
> > On Wed, Aug 2, 2017 at 7:21 PM, Eron Wright 
> wrote:
> >
> >> Thanks Till and Aljoscha for the feedback.
> >>
> >> Seems there are two ways to proceed here, if we accept mutual SSL as the
> >> basis.
> >>
> >> a) Backport mutual-auth support from Akka 2.4 to Flakka.
> >> b) Drop support for Scala 2.10 (FLINK-?), move to Akka 2.4 (FLINK-3662).
> >>
> >> Let's assume (a) for now.
> >>
> >>
> >>
> >> On Tue, Aug 1, 2017 at 3:05 PM, Till Rohrmann 
> >> wrote:
> >>
> >> > Dropping Java 7 alone is not enough to move to Akka 2.4+. For that we
> >> need
> >> > at least Scala 2.11.
> >> >
> >> > Cheers,
> >> > Till
> >> >
> >> > On Tue, Aug 1, 2017 at 4:22 PM, Aljoscha Krettek  >
> >> > wrote:
> >> >
> >> > > Hi Eron,
> >> > >
> >> > > I think after Dropping support for Java 7 we will move to Akka
> 2.4+, so
> >> > we
> >> > > should be good there. I think quite some users should find a (more)
> >> > secure
> >> > > Flink interesting.
> >> > >
> >> > > Best,
> >> > > Aljoscha
> >> > > > On 24. Jul 2017, at 03:11, Eron Wright 
> wrote:
> >> > > >
> >> > > > Hello, now might be a good time to revisit an important
> enhancement
> >> to
> >> > > > Flink security, so-called service authorization.   This means the
> >> > > hardening
> >> > > > of a Flink cluster against unauthorized use with some sort of
> >> > > > authentication and authorization scheme.   Today, Flink relies
> >> entirely
> >> > > on
> >> > > > network isolation to protect itself from unauthorized job
> submission
> >> > and
> >> > > > control, and to protect the secrets contained within a Flink
> cluster.
> >> > > > This is a problem in multi-user environments like YARN/Mesos/K8.
> >> > > >
> >> > > > Last fall, an effort was made to implement service authorization
> but
> >> > the
> >> > > PR
> >> > > > was ultimately rejected.   The idea was to add a simple secret
> key to
> >> > all
> >> > > > network communication between the client, JM, and TM.   Akka
> itself
> >> has
> >> > > > such a feature which formed the basis of the solution.  There are
> >> > > usability
> >> > > > challenges with this solution, including a dependency on SSL.
> >> > > >
> >> > > > Since then, the situation has evolved somewhat, and the use of SSL
> >> > mutual
> >> > > > authentication is more viable.   Mutual auth is supported in Akka
> >> > 2.4.12+
> >> > > > (or could be backported to Flakka).  My proposal is:
> >> > > >
> >> > > > 1. Upgrade Akka or backport the functionality to Flakka (see
> commit
> >> > > > 5d03902c5ec3212cd28f26c9b3ef7c3b628b9451).
> >> > > > 2. Implement SSL on any endpoint that doesn't yet support it (e.g.
> >> > > > queryable state).
> >> > > > 3. Enable mutual auth in Akka and implement it on non-Akka
> endpoints.
> >> > > > 4. Implement a simple authorization layer that accepts any
> >> > authenticated
> >> > > > connection.
> >> > > > 5. (stretch) generate and store a certificate automatically in
> YARN
> >> > mode.
> >> > > > 6. (stretch) Develop an alternate authentication method for the
> Web
> >> UI.
> >> > > >
> >> > > > Are folks interested in this capability?  Thoughts on the use of
> SSL
> >> > > mutual
> >> > > > auth versus something else?  Thanks!
> >> > > >
> >> > > > -Eron
> >> > >
> >> > >
> >> >
> >>
>


State Backend

2017-08-03 Thread Vijay Srinivasaraghavan
Hello,
I would like to know whether there are any latency considerations when choosing an 
appropriate state backend.
For example, if an HCFS implementation is used as the Flink state backend (instead 
of stock HDFS), are there any performance implications that one needs to be aware of?
- Frequency of read/write operations, random vs sequential reads
- Load/usage pattern (frequent small updates vs bulk operations)
- RocksDB -> HCFS: is this a recommended option to mitigate some of the challenges outlined above? (see the sketch below)
- S3 vs HDFS: any performance numbers?
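
Not an authoritative answer, but for reference the "RocksDB -> HCFS" option is usually understood as: working state lives in local RocksDB instances (per-record reads/writes never touch the remote file system), and only checkpoint data is written to the Hadoop-compatible file system. A minimal configuration sketch, with placeholder URIs:

{code:java}
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class StateBackendSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Option 1: heap state, checkpoints written to any Hadoop-compatible FS URI.
        // Per-record state access stays on the heap; only checkpoints hit the remote FS.
        env.setStateBackend(new FsStateBackend("hdfs:///flink/checkpoints"));

        // Option 2: "RocksDB -> HCFS" — state is kept in local RocksDB instances
        // (local disk), and only checkpoint data is uploaded to the remote FS.
        // Requires the flink-statebackend-rocksdb dependency.
        env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints"));
    }
}
{code}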
Appreciate any inputs on this.
Regards,
Vijay



Re: [DISCUSS] Service Authorization (redux)

2017-08-03 Thread Ufuk Celebi
I haven't followed this discussion in detail nor am I familiar with
the service authorization topic or Flakka, but a) sounds like a lot of
maintenance work to me.

If possible I would go with c) and maybe start a discussion about
dropping Scala 2.10 support to check whether that is a viable option
or not.

– Ufuk


On Thu, Aug 3, 2017 at 1:59 PM, Till Rohrmann  wrote:
> Alternatively there would also be an option
>
> c) only support mutual auth for Akka 2.4+ if the backport is unrealistic to
> do
>
> But this of course would break security for Scala 2.10. On the other hand
> people are already using Flink without this feature.
>
> Cheers,
> Till
>
> On Wed, Aug 2, 2017 at 7:21 PM, Eron Wright  wrote:
>
>> Thanks Till and Aljoscha for the feedback.
>>
>> Seems there are two ways to proceed here, if we accept mutual SSL as the
>> basis.
>>
>> a) Backport mutual-auth support from Akka 2.4 to Flakka.
>> b) Drop support for Scala 2.10 (FLINK-?), move to Akka 2.4 (FLINK-3662).
>>
>> Let's assume (a) for now.
>>
>>
>>
>> On Tue, Aug 1, 2017 at 3:05 PM, Till Rohrmann 
>> wrote:
>>
>> > Dropping Java 7 alone is not enough to move to Akka 2.4+. For that we
>> need
>> > at least Scala 2.11.
>> >
>> > Cheers,
>> > Till
>> >
>> > On Tue, Aug 1, 2017 at 4:22 PM, Aljoscha Krettek 
>> > wrote:
>> >
>> > > Hi Eron,
>> > >
>> > > I think after Dropping support for Java 7 we will move to Akka 2.4+, so
>> > we
>> > > should be good there. I think quite some users should find a (more)
>> > secure
>> > > Flink interesting.
>> > >
>> > > Best,
>> > > Aljoscha
>> > > > On 24. Jul 2017, at 03:11, Eron Wright  wrote:
>> > > >
>> > > > Hello, now might be a good time to revisit an important enhancement
>> to
>> > > > Flink security, so-called service authorization.   This means the
>> > > hardening
>> > > > of a Flink cluster against unauthorized use with some sort of
>> > > > authentication and authorization scheme.   Today, Flink relies
>> entirely
>> > > on
>> > > > network isolation to protect itself from unauthorized job submission
>> > and
>> > > > control, and to protect the secrets contained within a Flink cluster.
>> > > > This is a problem in multi-user environments like YARN/Mesos/K8.
>> > > >
>> > > > Last fall, an effort was made to implement service authorization but
>> > the
>> > > PR
>> > > > was ultimately rejected.   The idea was to add a simple secret key to
>> > all
>> > > > network communication between the client, JM, and TM.   Akka itself
>> has
>> > > > such a feature which formed the basis of the solution.  There are
>> > > usability
>> > > > challenges with this solution, including a dependency on SSL.
>> > > >
>> > > > Since then, the situation has evolved somewhat, and the use of SSL
>> > mutual
>> > > > authentication is more viable.   Mutual auth is supported in Akka
>> > 2.4.12+
>> > > > (or could be backported to Flakka).  My proposal is:
>> > > >
>> > > > 1. Upgrade Akka or backport the functionality to Flakka (see commit
>> > > > 5d03902c5ec3212cd28f26c9b3ef7c3b628b9451).
>> > > > 2. Implement SSL on any endpoint that doesn't yet support it (e.g.
>> > > > queryable state).
>> > > > 3. Enable mutual auth in Akka and implement it on non-Akka endpoints.
>> > > > 4. Implement a simple authorization layer that accepts any
>> > authenticated
>> > > > connection.
>> > > > 5. (stretch) generate and store a certificate automatically in YARN
>> > mode.
>> > > > 6. (stretch) Develop an alternate authentication method for the Web
>> UI.
>> > > >
>> > > > Are folks interested in this capability?  Thoughts on the use of SSL
>> > > mutual
>> > > > auth versus something else?  Thanks!
>> > > >
>> > > > -Eron
>> > >
>> > >
>> >
>>


Re: [VOTE] Release 1.3.2, release candidate #2

2017-08-03 Thread Fabian Hueske
+1

- checked hashes and signatures
- checked added and modified dependencies since Flink 1.3.1
  - io.dropwizard.metrics/metrics-core was upgraded to 3.1.5 and is ASL
  - org.xerial.snappy/snappy-java/1.0.5 was added and is ASL

- built from source

Thanks, Fabian

2017-08-03 11:24 GMT+02:00 Stefan Richter :

> +1
>
> 1. Cluster tests
>
> Tested with the stateful state machine job on the following settings:
> - Cloud env: AWS
> - Distributions: EMR 5.7.0
> - Flink deployment method: YARN (Hadoop 2.7.2)
> - HA: enabled
> - Kerberos: disabled
> - Kafka version: 0.10, 0.11
> - State Backends: Heap (Sync / Async) & RocksDB (incremental / full)
> - Filesystem: S3 and HDFS (Hadoop 2.7.2)
> - Externalized checkpoints: enabled & disabled
>
> 2. Building with Scala 2.11 works
>
> 3. Building against Hadoop version works
>
>
> > Am 30.07.2017 um 09:07 schrieb Aljoscha Krettek :
> >
> > Hi everyone,
> >
> > Please review and vote on the release candidate #2 for the version
> 1.3.2, as follows:
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> >
> > The complete staging area is available for your review, which includes:
> > * JIRA release notes [1],
> > * the official Apache source release and binary convenience releases to
> be deployed to dist.apache.org [2], which is signed with the key with
> fingerprint 0xA8F4FD97121D7293 [3],
> > * all artifacts to be deployed to the Maven Central Repository [4],
> > * source code tag "release-1.3.2-rc2" [5],
> > * website pull request listing the new release and adding announcement
> blog post [6].
> >
> > The vote will be open for at least 72 hours (excluding this current
> weekend). It is adopted by majority approval, with at least 3 PMC
> affirmative votes.
> >
> > Please use the provided document, as discussed before, for coordinating
> the testing efforts: [7]
> >
> > Thanks,
> > Aljoscha
> >
> > [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?
> projectId=12315522=12340984
> > [2] http://people.apache.org/~aljoscha/flink-1.3.2-rc2/
> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > [4] https://repository.apache.org/content/repositories/
> orgapacheflink-1133/
> > [5] https://git-wip-us.apache.org/repos/asf?p=flink.git;a=tag;h=
> e38825d0c8e7fe2191a4c657984d9939ed8dd0ad
> > [6] https://github.com/apache/flink-web/pull/75
> > [7] https://docs.google.com/document/d/1dN9AM9FUPizIu4hTKAXJSbbAORRdr
> ce-BqQ8AUHlOqE/edit?usp=sharing
>
>


Re: [DISCUSS] Service Authorization (redux)

2017-08-03 Thread Till Rohrmann
Alternatively there would also be an option

c) only support mutual auth for Akka 2.4+ if the backport is unrealistic to
do

But this of course would break security for Scala 2.10. On the other hand
people are already using Flink without this feature.

Cheers,
Till

On Wed, Aug 2, 2017 at 7:21 PM, Eron Wright  wrote:

> Thanks Till and Aljoscha for the feedback.
>
> Seems there are two ways to proceed here, if we accept mutual SSL as the
> basis.
>
> a) Backport mutual-auth support from Akka 2.4 to Flakka.
> b) Drop support for Scala 2.10 (FLINK-?), move to Akka 2.4 (FLINK-3662).
>
> Let's assume (a) for now.
>
>
>
> On Tue, Aug 1, 2017 at 3:05 PM, Till Rohrmann 
> wrote:
>
> > Dropping Java 7 alone is not enough to move to Akka 2.4+. For that we
> need
> > at least Scala 2.11.
> >
> > Cheers,
> > Till
> >
> > On Tue, Aug 1, 2017 at 4:22 PM, Aljoscha Krettek 
> > wrote:
> >
> > > Hi Eron,
> > >
> > > I think after Dropping support for Java 7 we will move to Akka 2.4+, so
> > we
> > > should be good there. I think quite some users should find a (more)
> > secure
> > > Flink interesting.
> > >
> > > Best,
> > > Aljoscha
> > > > On 24. Jul 2017, at 03:11, Eron Wright  wrote:
> > > >
> > > > Hello, now might be a good time to revisit an important enhancement
> to
> > > > Flink security, so-called service authorization.   This means the
> > > hardening
> > > > of a Flink cluster against unauthorized use with some sort of
> > > > authentication and authorization scheme.   Today, Flink relies
> entirely
> > > on
> > > > network isolation to protect itself from unauthorized job submission
> > and
> > > > control, and to protect the secrets contained within a Flink cluster.
> > > > This is a problem in multi-user environments like YARN/Mesos/K8.
> > > >
> > > > Last fall, an effort was made to implement service authorization but
> > the
> > > PR
> > > > was ultimately rejected.   The idea was to add a simple secret key to
> > all
> > > > network communication between the client, JM, and TM.   Akka itself
> has
> > > > such a feature which formed the basis of the solution.  There are
> > > usability
> > > > challenges with this solution, including a dependency on SSL.
> > > >
> > > > Since then, the situation has evolved somewhat, and the use of SSL
> > mutual
> > > > authentication is more viable.   Mutual auth is supported in Akka
> > 2.4.12+
> > > > (or could be backported to Flakka).  My proposal is:
> > > >
> > > > 1. Upgrade Akka or backport the functionality to Flakka (see commit
> > > > 5d03902c5ec3212cd28f26c9b3ef7c3b628b9451).
> > > > 2. Implement SSL on any endpoint that doesn't yet support it (e.g.
> > > > queryable state).
> > > > 3. Enable mutual auth in Akka and implement it on non-Akka endpoints.
> > > > 4. Implement a simple authorization layer that accepts any
> > authenticated
> > > > connection.
> > > > 5. (stretch) generate and store a certificate automatically in YARN
> > mode.
> > > > 6. (stretch) Develop an alternate authentication method for the Web
> UI.
> > > >
> > > > Are folks interested in this capability?  Thoughts on the use of SSL
> > > mutual
> > > > auth versus something else?  Thanks!
> > > >
> > > > -Eron
> > >
> > >
> >
>


[jira] [Created] (FLINK-7363) add hashes and signatures to the download page

2017-08-03 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-7363:
--

 Summary: add hashes and signatures to the download page
 Key: FLINK-7363
 URL: https://issues.apache.org/jira/browse/FLINK-7363
 Project: Flink
  Issue Type: Improvement
  Components: Project Website
Reporter: Nico Kruber
Assignee: Nico Kruber


As part of the releases, we also generate MD5 hashes and cryptographic 
signatures, but we neither link to those nor explain which keys are valid 
release-signing keys. This should be added.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7362) CheckpointProperties are not correctly set when restoring savepoint with HA enabled

2017-08-03 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-7362:
---

 Summary: CheckpointProperties are not correctly set when restoring 
savepoint with HA enabled
 Key: FLINK-7362
 URL: https://issues.apache.org/jira/browse/FLINK-7362
 Project: Flink
  Issue Type: Bug
  Components: State Backends, Checkpointing
Affects Versions: 1.4.0, 1.3.2
Reporter: Aljoscha Krettek
 Fix For: 1.4.0


When restoring a savepoint on a HA setup (with ZooKeeper) the web frontend 
incorrectly says "Type: Checkpoint" in the information box about the latest 
restore event.

The information that this uses is set here: 
https://github.com/apache/flink/blob/09caa9ffdc8168610c7d0260360c034ea87f904c/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java#L1101

It seems that the {{CheckpointProperties}} of a restored savepoint somehow get 
lost, maybe because of the recover step that the 
{{ZookeeperCompletedCheckpointStore}} is going through.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [VOTE] Release 1.3.2, release candidate #2

2017-08-03 Thread Stefan Richter
+1

1. Cluster tests

Tested with the stateful state machine job on the following settings:
- Cloud env: AWS
- Distributions: EMR 5.7.0
- Flink deployment method: YARN (Hadoop 2.7.2)
- HA: enabled
- Kerberos: disabled
- Kafka version: 0.10, 0.11
- State Backends: Heap (Sync / Async) & RocksDB (incremental / full)
- Filesystem: S3 and HDFS (Hadoop 2.7.2)
- Externalized checkpoints: enabled & disabled

2. Building with Scala 2.11 works

3. Building against Hadoop version works


> Am 30.07.2017 um 09:07 schrieb Aljoscha Krettek :
> 
> Hi everyone,
> 
> Please review and vote on the release candidate #2 for the version 1.3.2, as 
> follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
> 
> 
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release and binary convenience releases to be 
> deployed to dist.apache.org [2], which is signed with the key with 
> fingerprint 0xA8F4FD97121D7293 [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "release-1.3.2-rc2" [5],
> * website pull request listing the new release and adding announcement blog 
> post [6]. 
> 
> The vote will be open for at least 72 hours (excluding this current weekend). 
> It is adopted by majority approval, with at least 3 PMC affirmative votes.
> 
> Please use the provided document, as discussed before, for coordinating the 
> testing efforts: [7]
> 
> Thanks,
> Aljoscha
> 
> [1] 
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12340984
> [2] http://people.apache.org/~aljoscha/flink-1.3.2-rc2/
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4] https://repository.apache.org/content/repositories/orgapacheflink-1133/
> [5] 
> https://git-wip-us.apache.org/repos/asf?p=flink.git;a=tag;h=e38825d0c8e7fe2191a4c657984d9939ed8dd0ad
> [6] https://github.com/apache/flink-web/pull/75
> [7] 
> https://docs.google.com/document/d/1dN9AM9FUPizIu4hTKAXJSbbAORRdrce-BqQ8AUHlOqE/edit?usp=sharing



[jira] [Created] (FLINK-7361) flink-web doesn't build with ruby 2.4

2017-08-03 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-7361:
--

 Summary: flink-web doesn't build with ruby 2.4
 Key: FLINK-7361
 URL: https://issues.apache.org/jira/browse/FLINK-7361
 Project: Flink
  Issue Type: Bug
  Components: Project Website
Reporter: Nico Kruber
Assignee: Nico Kruber


The dependencies pulled in by the old jekyll version do not build with ruby 2.4 
and fail with something like

{code}
yajl_ext.c:881:22: error: 'rb_cFixnum' undeclared (first use in this function); 
did you mean 'rb_isalnum'?
{code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7360) Support Scala map type

2017-08-03 Thread Timo Walther (JIRA)
Timo Walther created FLINK-7360:
---

 Summary: Support Scala map type
 Key: FLINK-7360
 URL: https://issues.apache.org/jira/browse/FLINK-7360
 Project: Flink
  Issue Type: New Feature
  Components: Table API & SQL
Reporter: Timo Walther


Currently, Flink SQL supports only Java `java.util.Map`. Scala maps are treated 
as a black box with Flink `GenericTypeInfo`/SQL `ANY` data type. Therefore, you 
can forward these black boxes and use them within scalar functions, but accessing 
them with the `['key']` operator is not supported.

We should convert these special collections at the beginning, so that they can be 
used in a SQL statement.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7359) flink-storm should update ComponentConfiguration to stormConfig

2017-08-03 Thread yf (JIRA)
yf created FLINK-7359:
-

 Summary: flink-storm should update ComponentConfiguration to 
stormConfig
 Key: FLINK-7359
 URL: https://issues.apache.org/jira/browse/FLINK-7359
 Project: Flink
  Issue Type: Bug
  Components: flink-contrib
Affects Versions: 1.3.1
Reporter: yf


We should first get the config from the Storm component config, and then update 
the Storm config with it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)