Re: [DISCUSS] Upgrade kinesis connector to Apache 2.0 License and include it in official release

2019-08-20 Thread Dyana Rose
ok great,

That's done: the PR is rebased and squashed on top of master and is running
through Travis.

https://github.com/apache/flink/pull/9494

Dyana

On Tue, 20 Aug 2019 at 15:32, Tzu-Li (Gordon) Tai 
wrote:

> Hi Dyana,
>
> Regarding your question on the Chinese docs:
> Since the Chinese counterpart of the Kinesis connector documentation
> isn't translated yet (see docs/dev/connectors/kinesis.zh.md), for now you
> can simply sync whatever changes you make to the English doc to the
> Chinese one as well.
>
> Cheers,
> Gordon
>


Re: [DISCUSS] Upgrade kinesis connector to Apache 2.0 License and include it in official release

2019-08-20 Thread Dyana Rose
Ahh, brilliant, I had myself on notifications for the streams adapter
releases, but must have missed it. That's great news.

I've got the branch prepped for moving over to Apache 2.0, but staying on
KCL 1.x, which requires the least amount of change.

Considering the large amount of change required to update to KCL/SDK 2.x, I
would recommend that be done as a parallel task, making both connectors,
1.x and 2.x, available for use. If that makes sense.

The branch I push will have the English-language documents updated, but not
the Chinese-language ones. Is there a process for this?

Thanks,
Dyana

On Mon, 19 Aug 2019 at 19:08, Bowen Li  wrote:

> Hi all,
>
> A while back we discussed upgrading the flink-connector-kinesis module to
> the Apache 2.0 license so that its jar can be published as part of official
> Flink releases. Given we have a large user base using Flink with
> Kinesis/DynamoDB streams, this would free users from building and maintaining
> the module themselves, and improve the user and developer experience. A ticket
> was created [1] but has been idle, mainly because new releases of some AWS
> libs were not yet available at the time.
>
> As of today I see that all of flink-connector-kinesis's AWS dependencies have
> been updated to the Apache 2.0 license and are released. They include:
>
> - aws-java-sdk-kinesis
> - aws-java-sdk-sts
> - amazon-kinesis-client
> - amazon-kinesis-producer (Apache 2.0 from 0.13.1, released 18 days ago)
> [2]
> - dynamodb-streams-kinesis-adapter (Apache 2.0 from 1.5.0, released 7 days
> ago) [3]
>
> Therefore, I'd suggest we kick off the initiative and aim for release 1.10,
> which is roughly 3 months away, leaving us plenty of time to finish.
> According to @Dyana's comment in the ticket [1], it seems some large chunks
> of changes are needed beyond simply upgrading lib versions, so we can further
> break the JIRA down into sub-tasks to limit the scope of each change for
> easier code review.
>
> @Dyana, would you still be interested in taking on the responsibility and
> driving the effort forward?
>
> Thanks,
> Bowen
>
> [1] https://issues.apache.org/jira/browse/FLINK-12847
> [2] https://github.com/awslabs/amazon-kinesis-producer/releases
> [3] https://github.com/awslabs/dynamodb-streams-kinesis-adapter/releases
>
>
>

-- 

Dyana Rose
Software Engineer




Re: Updating Kinesis Connector to latest Apache licensed libs

2019-06-18 Thread Dyana Rose
I've pushed an early WIP of the code changes to our fork and added a long
comment on the work that's been done, what issues I've come across, and
requests for discussion on those issues.

https://issues.apache.org/jira/browse/FLINK-12847

Thanks,
Dyana

On Fri, 14 Jun 2019 at 10:34, Aljoscha Krettek  wrote:

> +1
>
> Nice! Less special-case handling is always good.
>
> > On 14. Jun 2019, at 10:30, Thomas Weise  wrote:
> >
> > Dyana, thanks for taking this up!
> >
> > The flink-connector-kinesis module is already part of the CI pipeline; it
> > is just excluded when creating the release. So what needs to be done is to
> > remove the -Pinclude-kinesis cruft and make it part of the default modules
> > instead.
> >
> > Thomas
> >
> >
> > On Fri, Jun 14, 2019 at 10:06 AM Dyana Rose 
> > wrote:
> >
> >> Brilliant. That issue is now filed under:
> >> https://issues.apache.org/jira/browse/FLINK-12847
> >>
> >> Thanks,
> >> Dyana
> >>
> >> On Fri, 14 Jun 2019 at 03:07, Tzu-Li (Gordon) Tai 
> >> wrote:
> >>
> >>> Hi,
> >>>
> >>> Thanks Dyana for bringing this up and Bowen for helping to move this
> >>> forward. Very happy to hear about this!
> >>>
> >>> Please feel free to create a new JIRA ticket for this and assign it to
> >>> yourself. +1 to aim this for 1.9.0.
> >>> The subtasks that Bowen mentioned look good to me. Issues such as [1]
> >>> and [2] should also be resolved as part of this effort.
> >>>
> >>> Also cc'ing Thomas, who recently has more coverage on the Flink Kinesis
> >>> connector.
> >>>
> >>> Cheers,
> >>> Gordon
> >>>
> >>> [1]  https://issues.apache.org/jira/browse/FLINK-3924
> >>> [2]  https://issues.apache.org/jira/browse/FLINK-7673
> >>>
> >>> On Fri, Jun 14, 2019 at 2:46 AM Bowen Li  wrote:
> >>>
> >>>> Hi Dyana,
> >>>>
> >>>> Thanks for bringing this up!
> >>>>
> >>>> You are right that ASL is the blocker for us to officially include
> >>>> flink-connector-kinesis as a connector module to build and publish to
> >>>> Maven central. I've been through the mess of building, publishing, and
> >>>> maintaining flink-connector-kinesis via JFrog, and that's a really
> >>>> painful experience... Glad to hear AWS finally pulled the trigger to
> >>>> change KCL/KPL's license. So big +1 on this initiative from me.
> >>>>
> >>>> I'm not aware of any previous discussion on this, so please feel free
> >>>> to create a new JIRA ticket, assign it to yourself, and work on it. As a
> >>>> committer, I'll be happy to help move this effort forward, and we can
> >>>> seek help from other experts in the kinesis connector like @Tzu-Li
> >>>> (Gordon) Tai when needed.
> >>>>
> >>>> The task should include, but not be limited to, upgrading KCL/KPL to
> >>>> new versions under the Apache 2.0 license, changing licenses and NOTICE
> >>>> files in flink-connector-kinesis, and adding flink-connector-kinesis to
> >>>> the build, CI and artifact publishing pipeline. These can be broken into
> >>>> subtasks.
> >>>>
> >>>> If the AWS PR you linked can be finished soon enough, we may be able to
> >>>> sneak this into Flink 1.9 before the feature freeze, which is currently
> >>>> set for the end of June. Otherwise, we may have to wait till the next
> >>>> major release like 1.10, as such a big change may not happen in
> >>>> maintenance releases like 1.9.1.
> >>>>
> >>>> Bowen
> >>>>
> >>>> On Thu, Jun 13, 2019 at 5:38 AM dyana.rose 
> >>>> wrote:
> >>>>
> >>>>>
> >>>>> The Kinesis Client Library v2.x and the AWS Java SDK v2.x both are now
> >>>>> on the Apache 2.0 license.
> >>>>>
> >>>>> https://github.com/awslabs/amazon-kinesis-client/blob/master/LICENSE.txt
> >>>>> https://github.com/aws/aws-sdk-java-v2/blob/master/LICENSE.txt
> >>>>>
> >>>>> There is a PR for the Kinesis Producer Library to update it to the
> >>>>> Apache
> &

Re: Updating Kinesis Connector to latest Apache licensed libs

2019-06-14 Thread Dyana Rose
Brilliant. That issue is now filed under:
https://issues.apache.org/jira/browse/FLINK-12847

Thanks,
Dyana

On Fri, 14 Jun 2019 at 03:07, Tzu-Li (Gordon) Tai 
wrote:

> Hi,
>
> Thanks Dyana for bringing this up and Bowen for helping to move this
> forward. Very happy to hear about this!
>
> Please feel free to create a new JIRA ticket for this and assign it to
> yourself. +1 to aim this for 1.9.0.
> The subtasks that Bowen mentioned look good to me. Issues such as [1] and
> [2] should also be resolved as part of this effort.
>
> Also cc'ing Thomas, who recently has more coverage on the Flink Kinesis
> connector.
>
> Cheers,
> Gordon
>
> [1]  https://issues.apache.org/jira/browse/FLINK-3924
> [2]  https://issues.apache.org/jira/browse/FLINK-7673
>
> On Fri, Jun 14, 2019 at 2:46 AM Bowen Li  wrote:
>
> > Hi Dyana,
> >
> > Thanks for bringing this up!
> >
> > You are right that ASL is the blocker for us to officially include
> > flink-connector-kinesis as a connector module to build and publish to
> > Maven central. I've been through the mess of building, publishing, and
> > maintaining flink-connector-kinesis via JFrog, and that's a really painful
> > experience... Glad to hear AWS finally pulled the trigger to change
> > KCL/KPL's license. So big +1 on this initiative from me.
> >
> > I'm not aware of any previous discussion on this, so please feel free to
> > create a new JIRA ticket, assign it to yourself, and work on it. As a
> > committer, I'll be happy to help move this effort forward, and we can seek
> > help from other experts in the kinesis connector like @Tzu-Li (Gordon) Tai
> > when needed.
> >
> > The task should include, but not be limited to, upgrading KCL/KPL to new
> > versions under the Apache 2.0 license, changing licenses and NOTICE files
> > in flink-connector-kinesis, and adding flink-connector-kinesis to the
> > build, CI and artifact publishing pipeline. These can be broken into
> > subtasks.
> >
> > If the AWS PR you linked can be finished soon enough, we may be able to
> > sneak this into Flink 1.9 before the feature freeze, which is currently set
> > for the end of June. Otherwise, we may have to wait till the next major
> > release like 1.10, as such a big change may not happen in maintenance
> > releases like 1.9.1.
> >
> > Bowen
> >
> > On Thu, Jun 13, 2019 at 5:38 AM dyana.rose 
> > wrote:
> >
> >>
> >> The Kinesis Client Library v2.x and the AWS Java SDK v2.x both are now on
> >> the Apache 2.0 license.
> >>
> >> https://github.com/awslabs/amazon-kinesis-client/blob/master/LICENSE.txt
> >> https://github.com/aws/aws-sdk-java-v2/blob/master/LICENSE.txt
> >>
> >> There is a PR for the Kinesis Producer Library to update it to the Apache
> >> 2.0 license (https://github.com/awslabs/amazon-kinesis-producer/pull/256)
> >>
> >> If I understand the Amazon Software License issue correctly, updating to
> >> these new major versions (and the KPL when it's available under the Apache
> >> license) will allow the Kinesis connectors to be distributed in the core
> >> build (making my life easier).
> >>
> >> I haven't seen a Jira ticket specifically for an upgrade in major
> >> version, but it would solve this one, though not in the way intended!
> >> https://issues.apache.org/jira/browse/FLINK-7673
> >>
> >> Unless there are already discussed reasons not to upgrade, I'll stick a
> >> ticket in for it and cross my fingers that the KPL PR gets merged sometime
> >> in the relatively near future.
> >>
> >> Thanks,
> >> Dyana
> >>
> >
>


-- 

Dyana Rose
Software Engineer




[jira] [Created] (FLINK-12847) Update Kinesis Connectors to latest Apache licensed libraries

2019-06-14 Thread Dyana Rose (JIRA)
Dyana Rose created FLINK-12847:
--

 Summary: Update Kinesis Connectors to latest Apache licensed 
libraries
 Key: FLINK-12847
 URL: https://issues.apache.org/jira/browse/FLINK-12847
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Kinesis
Reporter: Dyana Rose
Assignee: Dyana Rose


Currently the referenced Kinesis Client Library and Kinesis Producer Library 
code in the flink-connector-kinesis is licensed under the Amazon Software 
License, which is not compatible with the Apache License. This then requires a 
fair amount of extra work in the CI pipeline and for users who want to use the 
flink-connector-kinesis.

The Kinesis Client Library v2.x and the AWS Java SDK v2.x both are now on the 
Apache 2.0 license.
[https://github.com/awslabs/amazon-kinesis-client/blob/master/LICENSE.txt]
[https://github.com/aws/aws-sdk-java-v2/blob/master/LICENSE.txt]

There is a PR for the Kinesis Producer Library to update it to the Apache 2.0 
license ([https://github.com/awslabs/amazon-kinesis-producer/pull/256])

The task should include, but not be limited to: upgrading KCL/KPL to new 
versions under the Apache 2.0 license; changing licenses and NOTICE files in 
flink-connector-kinesis; adding flink-connector-kinesis to the build, CI, and 
artifact publishing pipeline; updating the build profiles; and updating 
documentation that references the license incompatibility.

The expected outcome of this issue is that the flink-connector-kinesis will be 
included with the standard build artifacts and will no longer need to be built 
separately by users.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Updating Kinesis Connector to latest Apache licensed libs

2019-06-13 Thread dyana . rose


The Kinesis Client Library v2.x and the AWS Java SDK v2.x both are now on the 
Apache 2.0 license.
https://github.com/awslabs/amazon-kinesis-client/blob/master/LICENSE.txt
https://github.com/aws/aws-sdk-java-v2/blob/master/LICENSE.txt

There is a PR for the Kinesis Producer Library to update it to the Apache 2.0 
license (https://github.com/awslabs/amazon-kinesis-producer/pull/256)

If I understand the Amazon Software License issue correctly, updating to these 
new major versions (and the KPL when it's available under the Apache license) 
will allow the Kinesis connectors to be distributed in the core build (making 
my life easier).

I haven't seen a Jira ticket specifically for an upgrade in major version, but 
it would solve this one, though not in the way intended! 
https://issues.apache.org/jira/browse/FLINK-7673

Unless there are already discussed reasons not to upgrade, I'll stick a ticket 
in for it and cross my fingers that the KPL PR gets merged sometime in the 
relatively near future.

Thanks,
Dyana


Re: HA lock nodes, Checkpoints, and JobGraphs after failure

2019-06-07 Thread dyana . rose
Just wanted to give an update on this.

Our ops team and I independently came to the same conclusion: our ZooKeeper 
quorum was having syncing issues.

After a bit more research, they have updated the initLimit and syncLimit in the 
quorum configs to:
initLimit=10
syncLimit=5
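
For reference, here is a minimal sketch of how those two settings sit in zoo.cfg. It is illustrative only; the tickTime value below is the ZooKeeper default and an assumption, not taken from our configs, and both limits are expressed in ticks:

# zoo.cfg excerpt (illustrative)
tickTime=2000    # one tick = 2000 ms (assumed default)
initLimit=10     # followers get 10 ticks (20 s) to connect to and sync with the leader
syncLimit=5      # followers may lag the leader by at most 5 ticks (10 s)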

After this change we no longer saw any of the issues we were having.

Thanks,
Dyana

On 2019/05/02 08:43:14, Till Rohrmann  wrote: 
> Thanks for the update Dyana. I'm also not an expert in running one's own
> ZooKeeper cluster. It might be related to setting the ZooKeeper cluster up
> properly. Maybe someone else in the community has experience with this.
> Therefore, I'm cross-posting this thread to the user ML again to have a
> wider reach.
> 
> Cheers,
> Till
> 
> On Wed, May 1, 2019 at 10:17 AM dyana.rose  wrote:
> 
> > Like all the best problems, I can't get this to reproduce locally.
> >
> > Everything has worked as expected. I started up a test job with 5 retained
> > checkpoints, let it run and watched the nodes in zookeeper.
> >
> > Then shut down and restarted the Flink cluster.
> >
> > The ephemeral lock nodes in the retained checkpoints transitioned from one
> > lock id to another without a hitch.
> >
> > So that's good.
> >
> > As I understand it, if the Zookeeper cluster is having a sync issue,
> > ephemeral nodes may not get deleted when the session becomes inactive.
> > We're new to running our own zookeeper so it may be down to that.
> >
> 


Re: HA lock nodes, Checkpoints, and JobGraphs after failure

2019-05-01 Thread dyana . rose
Like all the best problems, I can't get this to reproduce locally.

Everything has worked as expected. I started up a test job with 5 retained 
checkpoints, let it run and watched the nodes in zookeeper.

Then shut down and restarted the Flink cluster.

The ephemeral lock nodes in the retained checkpoints transitioned from one lock 
id to another without a hitch.

So that's good.

As I understand it, if the Zookeeper cluster is having a sync issue, ephemeral 
nodes may not get deleted when the session becomes inactive. We're new to 
running our own zookeeper so it may be down to that.


Re: HA lock nodes, Checkpoints, and JobGraphs after failure

2019-04-23 Thread Dyana Rose
It may take me a bit to get the logs as we're not always in a situation where
we've got enough hands free to run through the scenarios for a day.

Is that DEBUG JobManager, DEBUG ZooKeeper, or both you'd be interested in?

Thanks,
Dyana

On Tue, 23 Apr 2019 at 13:23, Till Rohrmann  wrote:

> Hi Dyana,
>
> Your analysis is almost correct. The only part that is missing is that the
> lock nodes are created as ephemeral nodes. This should ensure that if a JM
> process dies, the lock nodes will get removed by ZooKeeper. It depends a bit
> on ZooKeeper's configuration how long it takes until ZK detects a client
> connection as lost and then removes the ephemeral nodes. If the job should
> terminate within this time interval, then it could happen that you cannot
> remove the checkpoint/JobGraph. However, the ZooKeeper session timeout is
> usually configured to be a couple of seconds.
>
> I would actually be interested in better understanding your problem to see
> whether this is still a bug in Flink. Could you maybe share the respective
> logs on DEBUG log level with me? Maybe it would also be possible to run the
> latest version of Flink (1.7.2) to include all possible bug fixes.
>
> FYI: The community is currently discussing reimplementing the ZooKeeper
> based high availability services [1]. One idea is to get rid of the lock
> nodes by replacing them with transactions on the leader node. This could
> prevent this kind of bug in the future.
>
> [1] https://issues.apache.org/jira/browse/FLINK-10333
>
> Cheers,
> Till
>
> On Thu, Apr 18, 2019 at 3:12 PM dyana.rose 
> wrote:
>
> > Flink v1.7.1
> >
> > After a Flink reboot we've been seeing some unexpected issues with excess
> > retained checkpoints not being able to be removed from ZooKeeper after a
> > new checkpoint is created.
> >
> > I believe I've got my head around the role of ZK and lockNodes in
> > Checkpointing after going through the code. Could you check my logic on
> > this and add any insight, especially if I've got it wrong?
> >
> > The situation:
> > 1) Say we run JM1 and JM2 and retain 10 checkpoints and are running in HA
> > with S3 as the backing store.
> >
> > 2) JM1 and JM2 start up and each instance of ZooKeeperStateHandleStore
> has
> > its own lockNode UUID. JM1 is elected leader.
> >
> > 3) We submit a job, that JobGraph lockNode is added to ZK using JM1's
> > JobGraph lockNode.
> >
> > 4) Checkpoints start rolling in, latest 10 are retained in ZK using JM1's
> > checkpoint lockNode. We continue running, and checkpoints are
> successfully
> > being created and excess checkpoints removed.
> >
> > 5) Both JM1 and JM2 now are rebooted.
> >
> > 6) The JobGraph is recovered by the leader, the job restarts from the
> > latest checkpoint.
> >
> > Now after every new checkpoint we see in the ZooKeeper logs:
> > INFO [ProcessThread(sid:3 cport:-1)::PrepRequestProcessor@653] - Got
> > user-level KeeperException when processing sessionid:0x1047715000d
> > type:delete cxid:0x210 zxid:0x71091 txntype:-1 reqpath:n/a Error
> >
> Path:/flink/job-name/checkpoints/2fa0d694e245f5ec1f709630c7c7bf69/0057813
> > Error:KeeperErrorCode = Directory not empty for
> >
> /flink/job-name/checkpoints/2fa0d694e245f5ec1f709630c7c7bf69/005781
> > with an increasing checkpoint id on each subsequent call.
> >
> > When JM1 and JM2 were rebooted the lockNode UUIDs would have rolled,
> > right? As the old checkpoints were created under the old UUID, the new
> JMs
> > will never be able to remove the old retained checkpoints from ZooKeeper.
> >
> > Is that correct?
> >
> > If so, would this also happen with JobGraphs in the following situation
> > (we saw this just recently where we had a JobGraph for a cancelled job
> > still in ZK):
> >
> > Steps 1 through 3 above, then:
> > 4) JM1 fails over to JM2, the job keeps running uninterrupted. JM1
> > restarts.
> >
> > 5) some time later while JM2 is still leader we hard cancel the job and
> > restart the JMs
> >
> > In this case JM2 would successfully remove the job from s3, but because
> > its lockNode is different from JM1 it cannot delete the lock file in the
> > jobgraph folder and so can’t remove the jobgraph. Then Flink restarts and
> > tries to process the JobGraph it has found, but the S3 files have been
> > deleted.
> >
> > Possible related closed issues (fixes went in v1.7.0):
> > https://issues.apache.org/jira/browse/FLINK-10184 and
> > https://issues.apache.org/jira/browse/FLINK-10255
> >
> > Thanks for any insight,
> > Dyana
> >
>


-- 

Dyana Rose
Software Engineer




HA lock nodes, Checkpoints, and JobGraphs after failure

2019-04-18 Thread dyana . rose
Flink v1.7.1

After a Flink reboot we've been seeing some unexpected issues where excess 
retained checkpoints cannot be removed from ZooKeeper after a new checkpoint 
is created.

I believe I've got my head around the role of ZK and lockNodes in Checkpointing 
after going through the code. Could you check my logic on this and add any 
insight, especially if I've got it wrong?

The situation:
1) Say we run JM1 and JM2 and retain 10 checkpoints and are running in HA with 
S3 as the backing store.

2) JM1 and JM2 start up and each instance of ZooKeeperStateHandleStore has its 
own lockNode UUID. JM1 is elected leader.

3) We submit a job, that JobGraph lockNode is added to ZK using JM1's JobGraph 
lockNode.

4) Checkpoints start rolling in, latest 10 are retained in ZK using JM1's 
checkpoint lockNode. We continue running, and checkpoints are successfully 
being created and excess checkpoints removed.

5) Both JM1 and JM2 now are rebooted.

6) The JobGraph is recovered by the leader, the job restarts from the latest 
checkpoint.

Now after every new checkpoint we see in the ZooKeeper logs:
INFO [ProcessThread(sid:3 cport:-1)::PrepRequestProcessor@653] - Got user-level 
KeeperException when processing sessionid:0x1047715000d type:delete 
cxid:0x210 zxid:0x71091 txntype:-1 reqpath:n/a Error 
Path:/flink/job-name/checkpoints/2fa0d694e245f5ec1f709630c7c7bf69/0057813
 Error:KeeperErrorCode = Directory not empty for 
/flink/job-name/checkpoints/2fa0d694e245f5ec1f709630c7c7bf69/005781
with an increasing checkpoint id on each subsequent call.

When JM1 and JM2 were rebooted the lockNode UUIDs would have rolled, right? As 
the old checkpoints were created under the old UUID, the new JMs will never be 
able to remove the old retained checkpoints from ZooKeeper.

Is that correct?

If so, would this also happen with JobGraphs in the following situation (we saw 
this just recently where we had a JobGraph for a cancelled job still in ZK):

Steps 1 through 3 above, then:
4) JM1 fails over to JM2, the job keeps running uninterrupted. JM1 restarts.

5) some time later while JM2 is still leader we hard cancel the job and restart 
the JMs

In this case JM2 would successfully remove the job from S3, but because its 
lockNode is different from JM1's it cannot delete the lock file in the jobgraph 
folder and so can't remove the jobgraph. Then Flink restarts and tries to 
process the JobGraph it has found, but the S3 files have been deleted.

Possible related closed issues (fixes went in v1.7.0): 
https://issues.apache.org/jira/browse/FLINK-10184 and 
https://issues.apache.org/jira/browse/FLINK-10255

Thanks for any insight,
Dyana


Re: [DISCUSS] Change underlying Frontend Architecture for Flink Web Dashboard

2018-10-31 Thread dyana . rose
Re: who's using the web ui

Though many mature teams, with a fair amount of time/resources available, are 
likely running their own front ends, for teams like mine which are smaller and 
aren't focused solely on working with Flink, having the web UI available 
removes a large barrier to getting up and running quickly.

But with the web UI being in core Flink, it is a bit more of a chore to 
contribute to.

I'm not sure how necessary it is for the UI to be deployed with Flink. Is it 
making any calls that aren't through the REST API (and if so, can those 
endpoints be added to the REST API)?

In general I'd support moving it to its own package, making it easier to 
develop with your standard UI dev tools. This also allows the web UI to be 
developed and released independently of core Flink.

Dyana

On 2018/10/31 07:47:50, Fabian Wollert  wrote: 
> Hi Till, I basically agree with all your points. I would stress the
> "dustiness" of the current architecture: the package manager used (Bower)
> has been deprecated for a long time, and the chance of the Flink web
> dashboard builds no longer working is increasing every day.
> 
> About the knowledge in the community: two days is not a lot of time, but
> interest in this topic seems to be minor anyway. Is anyone using the
> Flink Web Dashboard at all, or is everyone running their own custom
> solutions? Because if there is no interest in using the Web UI AND no one
> is interested in developing it, there would be no need to package it as part
> of the official Flink package; it could instead be developed as an
> independent solution (I'm not voting for this right now, just putting it out
> there), if at all. The official package could then just ship with the API,
> which other solutions can build upon. From an agile point of view this could
> also be the best solution (enhanced testing, independent and more effective
> dev workflow, etc.), but it is bad for the usability of the Flink UI, because
> people need to install two things individually (Flink and the web dashboard).
> 
> How was the choice of Angular 1 made back then? Who wrote the Dashboard in
> the first place?
> 
> Cheers
> 
> --
> 
> 
> *Fabian WollertZalando SE*
> 
> E-Mail: fab...@zalando.de
> 
> 
> Am Di., 30. Okt. 2018 um 15:07 Uhr schrieb Till Rohrmann <
> trohrm...@apache.org>:
> 
> > Thanks for starting this discussion Fabian! I think our web UI technology
> > stack is quite dusty by now and it would be beneficial to think about its
> > technological future.
> >
> > On the one hand, our current web UI works more or less reliable and
> > changing the underlying technology has the risk of breaking things.
> > Moreover, there might be the risk that the newly chosen technology will be
> > deprecated at some point in time as well.
> >
> > On the other hand, we don't have much Angular 1 knowledge in the community
> > and extending the web UI is, thus, quite hard at the moment. Maybe by using
> > some newer web technologies we might be able to attract more people with a
> > web technology background to join the community.
> >
> > The lack of people working on the web UI is for me the biggest problem I
> > would like to address. If there is interest in the web UI, then I'm quite
> > sure that we will be able to even migrate to some other technology in the
> > future. The next important issue for me is to do the change incrementally
> > if possible. Ideally we never break the web UI in the process of migrating
> > to a new technology. I'm not an expert here so it might or might not be
> > possible. But if it is, then we should design the implementation steps in
> > such a way.
> >
> > Cheers,
> > Till
> >
> > On Mon, Oct 29, 2018 at 1:06 PM Fabian Wollert  wrote:
> >
> > > Hi everyone,
> > >
> > > in this email thread
> > > <http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Cluster-Overview-Dashboard-Improvement-Proposal-td24531.html>
> > > and the tickets FLINK-10705 and FLINK-10706, the discussion came up
> > > whether to change the underlying architecture of Flink's Web Dashboard
> > > from Angular 1 to something else. This email thread should be solely to
> > > discuss the pros and cons of this, and what the target architecture
> > > could be.
> > >
> > > My choice would be React. Personally I agree with Till's comments in the
> > > ticket, Angular 1 being basically outdated and no longer having a large
> > > following. From my experience the choice between Angular 2-7 and React is
> > > subjective; you can get things done with both. I personally only have
> > > experience with React, so I would be faster to develop with that one. I
> > > currently have no plans to learn Angular as well (being a more
> > > backend-focused developer in general), so if the decision would be to go
> > > with Angular, I would be unfortunately 

Re: CloudWatch Metrics Reporter

2018-05-16 Thread Dyana Rose
I've written a CloudWatch reporter for our own use. It's not pretty to get the
metrics out correctly for CloudWatch, as the current metrics don't all set the
metric names in a good hierarchy, and they aren't all added to the metric
variables either.

If someone opens the Jira I can see about getting our code up as an example
branch of what I had to do. Unless I missed something, I think the current
metrics need a bit of a brush-up.
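
For anyone curious, below is a rough, stripped-down sketch of the shape such a
reporter can take. It is illustrative only (not the code mentioned above); the
class name and CloudWatch namespace are made up, it handles only gauges, and it
assumes the AWS SDK v1 CloudWatch client.

import com.amazonaws.services.cloudwatch.AmazonCloudWatch;
import com.amazonaws.services.cloudwatch.AmazonCloudWatchClientBuilder;
import com.amazonaws.services.cloudwatch.model.Dimension;
import com.amazonaws.services.cloudwatch.model.MetricDatum;
import com.amazonaws.services.cloudwatch.model.PutMetricDataRequest;
import org.apache.flink.metrics.Gauge;
import org.apache.flink.metrics.Metric;
import org.apache.flink.metrics.MetricConfig;
import org.apache.flink.metrics.MetricGroup;
import org.apache.flink.metrics.reporter.MetricReporter;
import org.apache.flink.metrics.reporter.Scheduled;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative skeleton of a CloudWatch reporter (not the code referenced above).
public class CloudWatchReporter implements MetricReporter, Scheduled {

    private final Map<Gauge<?>, MetricDatum> gauges = new ConcurrentHashMap<>();
    private AmazonCloudWatch client;
    private String namespace;

    @Override
    public void open(MetricConfig config) {
        client = AmazonCloudWatchClientBuilder.defaultClient();
        namespace = config.getString("namespace", "Flink");
    }

    @Override
    public void close() {
        client.shutdown();
    }

    @Override
    public void notifyOfAddedMetric(Metric metric, String metricName, MetricGroup group) {
        if (metric instanceof Gauge) {
            // The awkward part: the metric name and the group variables
            // (job, task, operator, ...) have to be flattened into CloudWatch
            // dimensions by hand.
            List<Dimension> dimensions = new ArrayList<>();
            for (Map.Entry<String, String> v : group.getAllVariables().entrySet()) {
                dimensions.add(new Dimension().withName(v.getKey()).withValue(v.getValue()));
            }
            gauges.put((Gauge<?>) metric,
                new MetricDatum().withMetricName(metricName).withDimensions(dimensions));
        }
    }

    @Override
    public void notifyOfRemovedMetric(Metric metric, String metricName, MetricGroup group) {
        gauges.remove(metric);
    }

    @Override
    public void report() {
        // Periodically push the current value of every numeric gauge.
        for (Map.Entry<Gauge<?>, MetricDatum> entry : gauges.entrySet()) {
            Object value = entry.getKey().getValue();
            if (value instanceof Number) {
                client.putMetricData(new PutMetricDataRequest()
                    .withNamespace(namespace)
                    .withMetricData(entry.getValue().withValue(((Number) value).doubleValue())));
            }
        }
    }
}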

Dyana

On 16 May 2018 at 09:23, Chesnay Schepler <ches...@apache.org> wrote:

> Hello,
>
> there was no demand for a CloudWatch reporter so far.
>
> I only quickly skimmed the API docs, but it appears that the data is
> inserted via REST.
> Would the reporter require the use of any AWS library, or could it use
> an arbitrary HTTP client?
> If it is the latter, there shouldn't be a licensing issue as I understand
> it.
>
> Please open a JIRA, let's move the discussion there.
>
>
> On 16.05.2018 10:12, Rafi Aroch wrote:
>
>> Hi,
>>
>> In my team we use CloudWatch as our monitoring & alerting system.
>> I noticed that CloudWatch does not appear in the list of supported
>> Reporters.
>> I was wondering why that is. Was there no demand from the community? Is it
>> related to a licensing issue with AWS? Was it a technical concern?
>>
>> Would you accept this contribution into Flink?
>>
>> Thanks,
>> Rafi
>>
>>
>


-- 

Dyana Rose
Software Engineer




KPL in current stable 1.4.2 and below, upcoming problem

2018-05-10 Thread Dyana Rose
Hello,

We've received notification from AWS that the Kinesis Producer Library
versions < 0.12.6 will stop working after the 12th of June (assuming the
date in the email is in US format...)

Flink v1.5.0 has the KPL version at 0.12.6 so it will be fine when it's
released. However, the Kinesis connector in any previous Flink version looks
like it will have an issue.

I'm not sure how/if you want to communicate this. We build Flink ourselves,
so I plan on having a look at any changes done to the Kinesis sink in
v1.5.0 and then bumping the KPL version in our fork and rebuilding.

Thanks,
Dyana

below is the email we received (note: we're in eu-west-1):


Hello,



Your action is required: please update clients running Kinesis Producer
Library 0.12.5 or older or you will experience a breaking change to your
application.



We've discovered you have one or more clients writing data to Amazon
Kinesis Data Streams running an outdated version of the Kinesis Producer
Library. On 6/12 these clients will be impacted if they are not updated to
Kinesis Producer Library version 0.12.6 or newer. On 06/12 Kinesis Data
Streams will install ATS certificates which will prevent these outdated
clients from writing to a Kinesis Data Stream. The result of this change
will break any producer using KPL 0.12.5 or older.


* How do I update clients and applications to use the latest version of the
Kinesis Producer Library?

You will need to ensure producers leveraging the Kinesis Producer Library
have upgraded to version 0.12.6 or newer. If you operate older versions
your application will break due to untrusted SSL certificates.

Via Maven install Kinesis Producer Library version 0.12.6 or higher [2]

After you've configured your clients to use the new version, you're done.

* What if I have questions or issues?

If you have questions or issues, please contact your AWS Technical Account
Manager or AWS support and file a support ticket [3].

[1] https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-upgrades.html

[2] http://search.maven.org/#artifactdetails|com.amazonaws|amazo
n-kinesis-produ...


[3] https://aws.amazon.com/support



-  Amazon Kinesis Data Streams Team
-


feedback request FLINK-8384 Dynamic Session Gaps

2018-01-24 Thread Dyana Rose
The PR for this has a question that I'd like some feedback on.
https://github.com/apache/flink/pull/5295

The issue is, in order to be able to pass a typed event to the session gap
extractor, the assigner needs to be generic. And if the assigner is
generic, then the triggers need to be generic.

And unfortunately, the two current triggers aren't generic, so I've had
to add exact copies of these triggers that do accept a type.

It's possible to remove all the typing (and therefore the two new
triggers), and use the current triggers, but that causes the extract method
in the SessionWindowTimeGapExtractor to accept Object, and anyone
implementing it will need to cast the event to their expected type.

I don't find making the end user cast from Object to be the most friendly
interface, especially when the type information is otherwise available, but
I'm also not happy to have created exact copies of the trigger classes
simply because they aren't currently generic.
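
To make the trade-off concrete, here is a rough sketch of the two extractor
shapes (illustrative only; the names may not match the PR exactly, and
IotEvent is a hypothetical user type):

// Typed variant: the extractor sees the element as its real type, but the
// assigner (and, transitively, the triggers) must carry the type parameter T.
interface SessionWindowTimeGapExtractor<T> extends java.io.Serializable {
    long extract(T element);
}

// Untyped variant: keeps the existing non-generic assigner and triggers, but
// every implementation has to cast from Object, e.g.
//
//   public long extract(Object element) {
//       IotEvent event = (IotEvent) element;  // cast forced onto the end user
//       return event.getSessionGapMillis();
//   }
interface UntypedSessionWindowTimeGapExtractor extends java.io.Serializable {
    long extract(Object element);
}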

Was a similar discussion had when the current Session Window Assigners were
implemented? They hard type the event to Object as well, which I found a
bit odd.

Thanks,
Dyana


[jira] [Created] (FLINK-8439) Document using a custom AWS Credentials Provider with flink-s3-fs-hadoop

2018-01-16 Thread Dyana Rose (JIRA)
Dyana Rose created FLINK-8439:
-

 Summary: Document using a custom AWS Credentials Provider with 
flink-s3-fs-hadoop
 Key: FLINK-8439
 URL: https://issues.apache.org/jira/browse/FLINK-8439
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Reporter: Dyana Rose


This came up when using S3 for the file system backend and running under 
ECS.

With no credentials in the container, hadoop-aws will default to EC2 
instance-level credentials when accessing S3. However, when running under ECS, 
you will generally want to default to the task definition's IAM role.

In this case you need to set the hadoop property
{code:java}
fs.s3a.aws.credentials.provider{code}
to a fully qualified class name (or a list of names); see the [hadoop-aws 
docs|https://github.com/apache/hadoop/blob/1ba491ff907fc5d2618add980734a3534e2be098/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md]

This works as expected when you add this setting to flink-conf.yaml, but there 
is a further 'gotcha.' Because the AWS SDK is shaded, the actual full class 
name for, in this case, the ContainerCredentialsProvider is
{code:java}
org.apache.flink.fs.s3hadoop.shaded.com.amazonaws.auth.ContainerCredentialsProvider{code}
 

meaning the full setting is:
{code:java}
fs.s3a.aws.credentials.provider: 
org.apache.flink.fs.s3hadoop.shaded.com.amazonaws.auth.ContainerCredentialsProvider{code}
If you instead set it to the unshaded class name you will see a very confusing 
error stating that the ContainerCredentialsProvider doesn't implement 
AWSCredentialsProvider (which it most certainly does.)

Adding this information (how to specify alternate credentials providers, and the 
namespace gotcha) to the [AWS deployment 
docs|https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/deployment/aws.html]
 would be useful to anyone else using S3.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-8384) Session Window Assigner with Dynamic Gaps

2018-01-08 Thread Dyana Rose (JIRA)
Dyana Rose created FLINK-8384:
-

 Summary: Session Window Assigner with Dynamic Gaps
 Key: FLINK-8384
 URL: https://issues.apache.org/jira/browse/FLINK-8384
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Reporter: Dyana Rose
Priority: Minor


*Reason for Improvement*

Currently both Session Window assigners only allow a static inactivity gap. 
Given the following scenario, this is too restrictive:

* Given a stream of IoT events from many device types
* Assume each device type could have a different inactivity gap
* Assume each device type gap could change while sessions are in flight

By allowing dynamic inactivity gaps, the correct gap can be determined in the 
[assignWindows 
function|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/windowing/assigners/EventTimeSessionWindows.java#L59]
 by passing the element currently under consideration, the timestamp, and the 
context to a user-defined function. This eliminates the need to create unwieldy 
workarounds when only static gaps are available.
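
As an illustration (a sketch only, not necessarily the final API), the assigner's 
assignWindows method would delegate the gap lookup to a user-supplied extractor 
instead of reading a fixed sessionTimeout field; the fragment below sits inside a 
generic session window assigner:
{code:java}
@Override
public Collection<TimeWindow> assignWindows(T element, long timestamp, WindowAssignerContext context) {
    // The gap is derived per element instead of being a fixed field on the assigner.
    long sessionTimeout = sessionWindowTimeGapExtractor.extract(element);
    return Collections.singletonList(new TimeWindow(timestamp, timestamp + sessionTimeout));
}
{code}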

Dynamic Session Window gaps should be available for both Event Time and 
Processing Time streams.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Dynamic SessionWindow gaps

2017-12-29 Thread Dyana Rose
I have a use case for non-static Session Window gaps.

For example, given a stream of IoT events, each device type could have a
different gap, and that gap could change while sessions are in flight.

I didn't want to have to run a stream processor for each potential gap
length, not to mention the headache of dealing with changing gaps, so I've
implemented a version of SessionWindows that has one major change: in the
assignWindows method it passes the element to a method that extracts the
correct sessionTimeout. (current Flink method for reference:
https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/windowing/assigners/EventTimeSessionWindows.java#L59
)

Preliminary tests show this working as required and I can't be the only
person with this type of use case for session windows.

Would an issue and PR to add this functionality to the SessionWindow classes
be welcome?

Dyana


[jira] [Created] (FLINK-8267) Kinesis Producer example setting Region key

2017-12-15 Thread Dyana Rose (JIRA)
Dyana Rose created FLINK-8267:
-

 Summary: Kinesis Producer example setting Region key
 Key: FLINK-8267
 URL: https://issues.apache.org/jira/browse/FLINK-8267
 Project: Flink
  Issue Type: Bug
Reporter: Dyana Rose
Priority: Minor


https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/connectors/kinesis.html#kinesis-producer
In the example code for the kinesis producer the region key is set like:

{code:java}
producerConfig.put(AWSConfigConstants.AWS_REGION, "us-east-1");
{code}

However, the AWS Kinesis Producer Library requires that the region key be 
"Region" 
(https://github.com/awslabs/amazon-kinesis-producer/blob/94789ff4bb2f5dfa05aafe2d8437d3889293f264/java/amazon-kinesis-producer-sample/default_config.properties#L269),
 so the setting at this point should be:

{code:java}
producerConfig.put(AWSConfigConstants.AWS_REGION, "us-east-1");
producerConfig.put("Region", "us-east-1");
{code}
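
For context, a fuller sketch of how the producer is typically wired up with this 
workaround (the stream name, credentials, and input stream below are 
placeholders, not taken from the report):
{code:java}
Properties producerConfig = new Properties();
producerConfig.put(AWSConfigConstants.AWS_REGION, "us-east-1");
producerConfig.put("Region", "us-east-1"); // workaround: the KPL reads this key directly
producerConfig.put(AWSConfigConstants.AWS_ACCESS_KEY_ID, "aws_access_key_id");
producerConfig.put(AWSConfigConstants.AWS_SECRET_ACCESS_KEY, "aws_secret_access_key");

FlinkKinesisProducer<String> kinesis =
    new FlinkKinesisProducer<>(new SimpleStringSchema(), producerConfig);
kinesis.setDefaultStream("example-stream");
kinesis.setDefaultPartition("0");

// simpleStringStream stands in for any DataStream<String> in the job.
simpleStringStream.addSink(kinesis);
{code}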

When you run the Kinesis producer you can see the effect of not setting the 
Region key in a log line:

{noformat}
org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer  - Started 
Kinesis producer instance for region ''
{noformat}


The KPL then also assumes it's running on EC2 and attempts to determine its 
own region, which fails.

{noformat}
(EC2MetadataClient)Http request to Ec2MetadataService failed.
[error] [main.cc:266] Could not configure the region. It was not given in the 
config and we were unable to retrieve it from EC2 metadata
{noformat}


At the least I'd say the documentation should mention the difference between 
these two keys and when you are required to also set the Region key.

On the other hand, is this even the intended behaviour of this connector? Was 
it intended that the AWSConfigConstants.AWS_REGION key also set the region of 
the Kinesis stream? The documentation for the example states 

{noformat}
The example demonstrates producing a single Kinesis stream in the AWS region 
“us-east-1”.
{noformat}




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-4026) Fix code, grammar, and link issues in the Streaming documentation

2016-06-06 Thread Dyana Rose (JIRA)
Dyana Rose created FLINK-4026:
-

 Summary: Fix code, grammar, and link issues in the Streaming 
documentation
 Key: FLINK-4026
 URL: https://issues.apache.org/jira/browse/FLINK-4026
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Reporter: Dyana Rose
Priority: Trivial


The streaming API section of the documentation has grammar issues that make it 
hard to follow in places, as well as an incorrect code example, unnecessary 
parentheses in places on the Windows page, and a missing link for Kinesis 
Streams on the Connectors index page.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)