Re: Is there a Slack channel?

2018-06-06 Thread Chesnay Schepler

No, there is no Slack channel.

We had a recent discussion about introducing one but rejected it.

On 07.06.2018 01:58, Pablo Estrada wrote:

Hello there from the Apache Beam world!

I was meaning to start learning and hearing about Flink, so I recently
subscribed to the mailing lists, and I wonder if there's also a Slack
channel that people use?
Regards!
-P.





Re: [1.4.2] mvn clean package command takes too much time

2018-06-06 Thread Chesnay Schepler

First run mvn clean install in the flink-shaded-hadoop module.

Then you only have to run mvn clean package -pl flink-runtime,flink-dist
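Spelled out as a shell session from the root of a Flink 1.4 source checkout (module path assumed; -DskipTests is my addition, purely to speed up the edit-compile cycle):

```shell
# One-time: install the shaded Hadoop artifact into the local repository so
# that later partial builds can resolve flink-shaded-hadoop2-uber.
cd flink-shaded-hadoop
mvn clean install -DskipTests
cd ..

# Afterwards, only the changed module plus flink-dist need rebuilding:
mvn clean package -pl flink-runtime,flink-dist -DskipTests
```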

On 07.06.2018 04:48, Marvin777 wrote:

Thank you for your reply. If I modify the flink-runtime module, then the
following command is executed for compilation:

'mvn clean package -pl flink-runtime,flink-dist -am'

The '-am' parameter is necessary, but it makes the build take a long time;
without it, the build reports an error like 'Failed to execute goal on project
flink-dist_2.11: Could not resolve dependencies for project
org.apache.flink:flink-dist_2.11:jar:1.4.2: Could not find artifact
org.apache.flink:flink-shaded-hadoop2-uber:jar:1.4.2'

Am I missing something? Waiting for your reply.

Best regards.


Chesnay Schepler wrote on Wed, Jun 6, 2018 at 4:29 PM:


you only have to compile the module that you changed along with
flink-dist to test things locally.

On 06.06.2018 10:27, Marvin777 wrote:

Hi, all.
It takes a long time to modify some of the code and recompile it. The
process is painful.
Is there any method by which I can save time?

Thanks!







Re: [1.4.2] mvn clean package command takes too much time

2018-06-06 Thread Marvin777
Thank you for your reply. If I modify the flink-runtime module, then the
following command is executed for compilation:

'mvn clean package -pl flink-runtime,flink-dist -am'

The '-am' parameter is necessary, but it makes the build take a long time;
without it, the build reports an error like 'Failed to execute goal on project
flink-dist_2.11: Could not resolve dependencies for project
org.apache.flink:flink-dist_2.11:jar:1.4.2: Could not find artifact
org.apache.flink:flink-shaded-hadoop2-uber:jar:1.4.2'

Am I missing something? Waiting for your reply.

Best regards.


Chesnay Schepler wrote on Wed, Jun 6, 2018 at 4:29 PM:

> you only have to compile the module that you changed along with
> flink-dist to test things locally.
>
> On 06.06.2018 10:27, Marvin777 wrote:
> > Hi, all.
> > It takes a long time to modify some of the code and recompile it. The
> > process is painful.
> > Is there any method by which I can save time?
> >
> > Thanks!
> >
>
>


Is there a Slack channel?

2018-06-06 Thread Pablo Estrada
Hello there from the Apache Beam world!

I was meaning to start learning and hearing about Flink, so I recently
subscribed to the mailing lists, and I wonder if there's also a Slack
channel that people use?
Regards!
-P.
-- 
Got feedback? go/pabloem-feedback


[jira] [Created] (FLINK-9545) Support read file for N times in Flink stream

2018-06-06 Thread Bowen Li (JIRA)
Bowen Li created FLINK-9545:
---

 Summary: Support read file for N times in Flink stream
 Key: FLINK-9545
 URL: https://issues.apache.org/jira/browse/FLINK-9545
 Project: Flink
  Issue Type: Improvement
  Components: DataStream API
Affects Versions: 1.6.0
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.6.0


We need {{StreamExecutionEnvironment.readFile/readTextFile}} to read each file 
N times, but currently it only supports reading a file once.

Add support for this feature.

Plan:

Add a new processing mode, PROCESSING_N_TIMES, and an additional parameter 
{{numTimes}} for {{StreamExecutionEnvironment.readFile/readTextFile}}
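A Flink-free sketch of the intended semantics — the {{numTimes}} parameter is hypothetical (this proposal), and plain java.nio stands in for the DataStream API:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReadNTimesSketch {

    // Hypothetical semantics of readTextFile(path, numTimes): emit every line
    // of the file numTimes times, one complete pass after another.
    static List<String> readTextFileNTimes(Path path, int numTimes) throws IOException {
        List<String> out = new ArrayList<>();
        for (int pass = 0; pass < numTimes; pass++) {
            out.addAll(Files.readAllLines(path));
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("lines", ".txt");
        Files.write(tmp, Arrays.asList("a", "b"));
        List<String> lines = readTextFileNTimes(tmp, 3);
        assert lines.size() == 6;  // 2 lines x 3 passes
        assert lines.equals(Arrays.asList("a", "b", "a", "b", "a", "b"));
        Files.delete(tmp);
    }
}
```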



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


FINAL REMINDER: Apache EU Roadshow 2018 in Berlin next week!

2018-06-06 Thread sharan

Hello Apache Supporters and Enthusiasts

This is a final reminder that our Apache EU Roadshow will be held in 
Berlin next week on 13th and 14th June 2018. We will have 28 different 
sessions running over 2 days that cover some great topics. So if you are 
interested in Microservices, Internet of Things (IoT), Cloud, Apache 
Tomcat or Apache Http Server then we have something for you.


https://foss-backstage.de/sessions/apache-roadshow

We will be co-located with FOSS Backstage, so if you are interested in 
topics such as incubator, the Apache Way, open source governance, legal, 
trademarks or simply open source communities then there will be 
something there for you too.  You can attend any of talks, presentations 
and workshops from the Apache EU Roadshow or FOSS Backstage.


You can find details of the combined Apache EU Roadshow and FOSS 
Backstage conference schedule below:


https://foss-backstage.de/schedule?day=2018-06-13

Ticket prices go up on 8th June 2018 and we have a last minute discount 
code that anyone can use before the deadline:


15% discount code: ASF15_discount
valid until June 7, 23:55 CET

You can register at the following link:

https://foss-backstage.de/tickets

Our Apache booth and lounge will be open from 11th - 14th June for 
meetups, hacking or to simply relax between sessions. And we will be 
posting regular updates on social media throughout next week so please 
follow us on Twitter @ApacheCon


Thank you for your continued support and we look forward to seeing you 
in Berlin!


Thanks
Sharan Foga, VP Apache Community Development

http://apachecon.com/

PLEASE NOTE: You are receiving this message because you are subscribed 
to a user@ or dev@ list of one or more Apache Software Foundation projects.





Re: [DISCUSS] Flink 1.4 and below STOPS writing to Kinesis after June 12th.

2018-06-06 Thread Thomas Weise

On Wed, Jun 6, 2018 at 1:06 AM, Bowen Li  wrote:

> Hi,
>
> I think the following email thread might have gone lost.
>
> Dyana brought up the attention that AWS has informed users that KPL
> versions
> < 0.12.6 will *stop working* starting from the 12th of June. Flink 1.4 is
> using KPL 0.12.5 and Flink 1.5 uses 0.12.6, *so Flink 1.4 and below (1.3,
> etc) will be impacted.*
>
> I think we probably should try our best to communicate this out to our
> users in user email alias and all other possible channels, in order not to
> disrupt their production pipeline.
>
> A quick solution we can suggest to users is to package and use
> flink-connector-kinesis in Flink 1.5. Given that the public APIs don't
> change, flink-connector-kinesis in Flink 1.5 should work with Flink 1.4 and
> below, but that needs verification.
>

The connector might work, but it is going to require a little extra
juggling for the users to exclude the 1.5 transitive dependencies.

How about just backporting the changes? We internally use the newer SDK
with a number of other consumer patches on top of 1.4.2 and it works fine.

Normally such changes don't qualify for a patch version, but since it is
going to stop working anyway, there is probably no harm in this instance.

Thomas



> What do you think?
>
> Thanks, Bowen
>
>
> -- Forwarded message --
> From: Bowen Li 
> Date: Fri, May 11, 2018 at 10:28 AM
> Subject: Re: KPL in current stable 1.4.2 and below, upcoming problem
> To: dev@flink.apache.org, "Tzu-Li (Gordon) Tai" 
>
>
> Thanks, this is a great heads-up!  Flink 1.4 is using KPL 0.12.5, *so Flink
> version 1.4 or below will be affected.*
>
> Kinesis sink is in flink-kinesis-connector. The good news is that whoever is
> using flink-kinesis-connector right now is building it themselves, because
> Flink doesn't publish that jar to Maven due to a licensing issue. So these
> users, like you Dyana, already have build experience and may try to bump
> KPL themselves.
>
> I think it'll be great if Flink can bump KPL in Flink 1.2/1.3/1.4 and
> release minor versions for them as official support. It also requires
> checking backward compatibility. This can be done after releasing 1.5.
> @Gordon may take the final call of how to eventually do it.
>
> Thanks,
> Bowen
>
>
> On Thu, May 10, 2018 at 1:35 AM, Dyana Rose 
> wrote:
>
> > Hello,
> >
> > We've received notification from AWS that the Kinesis Producer Library
> > versions < 0.12.6 will stop working after the 12th of June (assuming the
> > date in the email is in US format...)
> >
> > Flink v1.5.0 has the KPL version at 0.12.6 so it will be fine when it's
> > released. However, anyone using the kinesis connector in any previous
> > version looks like they'll have an issue.
> >
> > I'm not sure how/if you want to communicate this. We build Flink
> ourselves,
> > so I plan on having a look at any changes done to the Kinesis Sink in
> > v1.5.0 and then bumping the KPL version in our fork and rebuilding.
> >
> > Thanks,
> > Dyana
> >
> > below is the email we received (note: we're in eu-west-1):
> > 
> >
> > Hello,
> >
> >
> >
> > Your action is required: please update clients running Kinesis Producer
> > Library 0.12.5 or older or you will experience a breaking change to your
> > application.
> >
> >
> >
> > We've discovered you have one or more clients writing data to Amazon
> > Kinesis Data Streams running an outdated version of the Kinesis Producer
> > Library. On 6/12 these clients will be impacted if they are not updated
> to
> > Kinesis Producer Library version 0.12.6 or newer. On 06/12 Kinesis Data
> > Streams will install ATS certificates which will prevent these outdated
> > clients from writing to a Kinesis Data Stream. The result of this change
> > will break any producer using KPL 0.12.5 or older.
> >
> >
> > * How do I update clients and applications to use the latest version of
> the
> > Kinesis Producer Library?
> >
> > You will need to ensure producers leveraging the Kinesis Producer Library
> > have upgraded to version 0.12.6 or newer. If you operate older versions,
> > your application will break due to untrusted SSL certificates.
> >
> > Via Maven install Kinesis Producer Library version 0.12.6 or higher [2]
> >
> > After you've configured your clients to use the new version, you're done.
> >
> > * What if I have questions or issues?
> >
> > If you have questions or issues, please contact your AWS Technical
> Account
> > Manager or AWS support and file a support ticket [3].
> >
> > [1] https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-upgrades.html
> >
> > [2] http://search.maven.org/#artifactdetails|com.amazonaws|amazon-kinesis-produ...
> >
> > [3] https://aws.amazon.com/support
> >
> >
> >
> > -  Amazon Kinesis Data Streams Team
> > -
> >
>
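For users rebuilding the connector themselves, the bump boils down to overriding one dependency version. A sketch, with the KPL Maven coordinates as given in link [2] above; the exact property and placement in flink-connector-kinesis's pom may differ:

```xml
<!-- flink-connector-kinesis/pom.xml: bump the Kinesis Producer Library -->
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>amazon-kinesis-producer</artifactId>
    <version>0.12.6</version>
</dependency>
```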


Re: [PROPOSAL] Introduce Elastic Bloom Filter For Flink

2018-06-06 Thread sihua zhou
Hi,


Sorry, but pinging for more feedback on this proposal...
Even negative feedback is highly appreciated!


Best, Sihua






On 05/30/2018 13:19, sihua zhou wrote:
Hi,


I did a survey of the variants of the Bloom filter and the Cuckoo filter these 
days. Finally, I found 3 of them that may be suitable for our purpose.


1. Standard bloom filter (we have implemented one based on this and used it in 
production with good results).
2. Cuckoo filter, also a very good filter: a space-efficient data structure that 
supports fast queries (even faster than a BF, though inserts may be a little 
slower); additionally, it supports a delete() operation.
3. Counting bloom filter, a variant of the BF that supports a delete() operation, 
but costs 4-5x more memory than the standard bloom filter (so I'm not sure 
whether it's practical).
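For concreteness, option 1 can be sketched in a few lines of plain Java — a toy version using Kirsch-Mitzenmacher double hashing, not the production implementation mentioned above. Note it has no delete(), which is exactly the limitation options 2 and 3 address:

```java
import java.util.BitSet;

public class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int numHashes;

    SimpleBloomFilter(int size, int numHashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.numHashes = numHashes;
    }

    // Derive k hash values from two base hashes (double hashing).
    private int index(Object key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.reverse(h1) ^ 0x9e3779b9;
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(Object key) {
        for (int i = 0; i < numHashes; i++) bits.set(index(key, i));
    }

    // May return false positives, never false negatives.
    boolean mightContain(Object key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(key, i))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        SimpleBloomFilter bf = new SimpleBloomFilter(1024, 3);
        bf.add("flink");
        assert bf.mightContain("flink");
    }
}
```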


Anyway, these filters are just the smallest storage unit in this "Elastic Bloom 
Filter"; we can define a general interface and provide different 
implementations of the "storage unit" based on different filters if we want. Maybe I 
should change the PROPOSAL name to "Introduce Elastic Filter For Flink". 
The idea of the approach that I outlined in the doc is very similar to the paper 
"Optimization and Applications of Dynamic Bloom 
Filters" (http://ijarcs.info/index.php/Ijarcs/article/viewFile/826/814) (compared 
to the paper, the approach I outlined could have better query performance and 
also supports the RELAXED TTL); maybe it can help in understanding the design doc. 
Looking forward to any feedback!


Best, Sihua
On 05/24/2018 10:36, sihua zhou wrote:
Hi,
Thanks for your suggestions @Elias! I had a brief look at the "Cuckoo Filter" and 
the "Golomb Compressed Sequence". My first impression is that maybe the "Golomb 
Compressed Sequence" is not a good choice, because it seems to require 
non-constant lookup time, but the Cuckoo filter may be a good choice; I should 
definitely have a deeper look at it.


Besides, to me, all of these filters seem to be "variants" of the bloom 
filter (which is the smallest unit to store data in the current design). The 
main challenge in introducing the BF into Flink is the data skew problem (a common 
phenomenon in production); could you maybe also have a look at the 
solution that I posted in the google doc 
https://docs.google.com/document/d/17UY5RZ1mq--hPzFx-LfBjCAw_kkoIrI9KHovXWkxNYY/edit?usp=sharing
 for this problem? It would be nice if you could give us some advice on that.


Best, Sihua


On 05/24/2018 07:21, Elias Levy wrote:
I would suggest you consider alternative data structures: a Cuckoo
Filter or a Golomb Compressed Sequence.

The GCS data structure was introduced in "Cache-, Hash- and Space-Efficient
Bloom Filters" by F. Putze, P. Sanders, and J. Singler. See section 4.



We should discuss which exact implementation of bloom filters is the best
fit.
@Fabian: There are also implementations of bloom filters that use counting
and therefore support
deletes, but obviously this comes at the cost of a potentially higher
space consumption.

On 23.05.2018 at 11:29, Fabian Hueske wrote:
IMO, such a feature would be very interesting. However, my concern with
Bloom Filters
is that they are insert-only data structures, i.e., it is not possible to
remove keys once
they have been added. This might render the filter useless over time.
In a different thread (see discussion in FLINK-8918 [1]), you mentioned
that the Bloom
Filters would be growing.
If we keep them in memory, how can we prevent them from exceeding memory
boundaries over
time?




Need help in understanding flink ordering of records

2018-06-06 Thread Amol S - iProgrammer
Hello Ezmlm,

I have implemented code to read the MongoDB oplog and stream this oplog into
Flink, but I need all the records in the order they are coming from the oplog.
First of all, please clarify: does Flink maintain the order of insertion? If yes,
please point me to some documentation on how Flink does this.

---
*Amol Suryawanshi*
Java Developer
am...@iprogrammer.com


*iProgrammer Solutions Pvt. Ltd.*



*Office 103, 104, 1st Floor Pride Portal,Shivaji Housing Society,
Bahiratwadi,Near Hotel JW Marriott, Off Senapati Bapat Road, Pune - 411016,
MH, INDIA.**Phone: +91 9689077510 | Skype: amols_iprogrammer*
www.iprogrammer.com 



[jira] [Created] (FLINK-9544) Impossible to downgrade kinesis protocol from CBOR to JSON as required by kinesalite

2018-06-06 Thread Ph.Duveau (JIRA)
Ph.Duveau created FLINK-9544:


 Summary: Impossible to downgrade kinesis protocol from CBOR to 
JSON as required by kinesalite
 Key: FLINK-9544
 URL: https://issues.apache.org/jira/browse/FLINK-9544
 Project: Flink
  Issue Type: Bug
  Components: Kinesis Connector
Affects Versions: 1.4.2, 1.4.1, 1.5.0, 1.4.0
Reporter: Ph.Duveau


The Amazon client does not downgrade from CBOR to JSON when setting the env 
variable AWS_CBOR_DISABLE to true (or 1) and/or defining 
com.amazonaws.sdk.disableCbor=true via JVM options. This bug is due to the Maven 
shade relocation of com.amazon.* classes. As soon as you cancel the relocation 
(by removing it in the kinesis connector or by re-relocating in the final jar), 
it runs again.
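For context, the two switches the report refers to can be sketched as follows; the system property is the JVM-side equivalent of the -D option, and the comment on the failure mode paraphrases the report's explanation rather than a verified mechanism:

```java
public class DisableCborSketch {
    public static void main(String[] args) {
        // Equivalent to passing -Dcom.amazonaws.sdk.disableCbor=true on the
        // command line; must be set before the Kinesis client is created.
        System.setProperty("com.amazonaws.sdk.disableCbor", "true");

        // The environment-variable form (set outside the JVM):
        //   export AWS_CBOR_DISABLE=true

        // Per the report, the shaded connector relocates com.amazon.* classes,
        // which is why neither switch takes effect against the relocated SDK.
        assert "true".equals(System.getProperty("com.amazonaws.sdk.disableCbor"));
    }
}
```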



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-9543) Expose JobMaster IDs to metric system

2018-06-06 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-9543:
---

 Summary: Expose JobMaster IDs to metric system
 Key: FLINK-9543
 URL: https://issues.apache.org/jira/browse/FLINK-9543
 Project: Flink
  Issue Type: New Feature
  Components: Local Runtime, Metrics
Reporter: Chesnay Schepler
 Fix For: 1.6.0


To be able to differentiate between metrics from different JobManagers, we 
should expose the JobManager ID (i.e. the ResourceID) to the metric system, 
like we do for TaskManagers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-9542) ExEnv#registerCachedFile should accept Path

2018-06-06 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-9542:
---

 Summary: ExEnv#registerCachedFile should accept Path
 Key: FLINK-9542
 URL: https://issues.apache.org/jira/browse/FLINK-9542
 Project: Flink
  Issue Type: Improvement
  Components: DataSet API, DataStream API
Reporter: Chesnay Schepler


Currently, {{registerCachedFile}} accepts {{Strings}} as file-paths.

This is undesirable because invalid paths are detected much later than 
necessary; at the moment this happens when we attempt to upload them to the 
blob store, i.e. during job submission, when ideally it should fail right away.

As an intermediate solution we can modify the {{DistributedCacheEntries}} to 
contain {{Paths}} instead of {{Strings}}. This will not require API changes but 
still allow earlier detection.
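A quick illustration of the early-failure benefit. Flink has its own Path class; java.nio.file.Path is used here purely to show that an invalid path fails at parse time rather than later at upload time:

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;

public class EarlyPathValidation {
    public static void main(String[] args) {
        // A valid input parses fine.
        Path ok = Paths.get("/tmp/cache/file.txt");
        assert ok.getFileName().toString().equals("file.txt");

        // An invalid path (NUL byte) fails immediately when parsed as a Path,
        // instead of surfacing later, e.g. during blob-store upload.
        boolean failedEarly = false;
        try {
            Paths.get("bad\0path");
        } catch (InvalidPathException e) {
            failedEarly = true;
        }
        assert failedEarly;
    }
}
```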



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-9541) Add robots.txt and sitemap.xml to Flink website

2018-06-06 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-9541:


 Summary: Add robots.txt and sitemap.xml to Flink website
 Key: FLINK-9541
 URL: https://issues.apache.org/jira/browse/FLINK-9541
 Project: Flink
  Issue Type: Improvement
  Components: Project Website
Reporter: Fabian Hueske


From the [dev mailing 
list|https://lists.apache.org/thread.html/71ce1bfbed1cf5f0069b27a46df1cd4dccbe8abefa75ac85601b088b@%3Cdev.flink.apache.org%3E]:

{quote}
It would help to add a sitemap (and the robots.txt required to reference it) 
for flink.apache.org and ci.apache.org (for /projects/flink)

You can see what Tomcat did along these lines - 
http://tomcat.apache.org/robots.txt references 
http://tomcat.apache.org/sitemap.xml, which is a sitemap index file pointing to 
http://tomcat.apache.org/sitemap-main.xml

By doing this, you can emphasize more recent versions of docs. There are other 
benefits, but reducing poor Google search results (to me) is the biggest win.

E.g. https://www.google.com/search?q=flink+reducingstate (search on flink reducing 
state) shows the 1.3 Javadocs (hit #1), master (1.6-SNAPSHOT) Javadocs (hit 
#2), and then many pages of other results.

Whereas the Javadocs for 1.5 and 1.4 are nowhere to be found.
{quote}
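Concretely, the Tomcat-style setup transplanted to flink.apache.org would start with something like this (contents illustrative, not the actual files):

```
# https://flink.apache.org/robots.txt
User-agent: *
Allow: /
Sitemap: https://flink.apache.org/sitemap.xml
```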



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [WEBSITE] Proposal to rework the Flink website

2018-06-06 Thread Fabian Hueske
Thanks for the feedback so far.

+1 for adding a sitemap.xml and robots.txt to the website.

I think we can make this a separate issue. I'll create a JIRA for that.

Any other thoughts or feedback?

Thanks, Fabian

2018-06-06 9:34 GMT+02:00 Aljoscha Krettek :

> Yes, making sure that google search results point to the most recent doc
> would be very good. :+1
>
> Also +1 to the general effort, of course.
>
> > On 5. Jun 2018, at 20:19, Ken Krugler 
> wrote:
> >
> > Along these lines, it would help to add a sitemap (and the robots.txt
> required to reference it) for flink.apache.org and ci.apache.org (for
> /projects/flink)
> >
> > You can see what Tomcat did along these lines -
> > http://tomcat.apache.org/robots.txt references
> > http://tomcat.apache.org/sitemap.xml, which is a sitemap index file
> > pointing to http://tomcat.apache.org/sitemap-main.xml
> >
> > By doing this, you can emphasize more recent versions of docs. There are
> other benefits, but reducing poor Google search results (to me) is the
> biggest win.
> >
> > E.g. https://www.google.com/search?q=flink+reducingstate (search on flink
> > reducing state) shows the 1.3 Javadocs (hit #1), master (1.6-SNAPSHOT)
> > Javadocs (hit #2), and then many pages of other results.
> >
> > Whereas the Javadocs for 1.5
> > <https://ci.apache.org/projects/flink/flink-docs-release-1.5/api/java/org/apache/flink/api/common/state/ReducingState.html>
> > and 1.4
> > <https://ci.apache.org/projects/flink/flink-docs-release-1.4/api/java/org/apache/flink/api/common/state/ReducingState.html>
> > are nowhere to be found.
> >
> > Thoughts?
> >
> > — Ken
> >
> >> On Jun 5, 2018, at 9:46 AM, Stephan Ewen  wrote:
> >>
> >> Big +1 to this!
> >>
> >> I would like to contribute to this effort and help strengthen the way
> Flink
> >> presents itself.
> >>
> >>
> >> On Tue, Jun 5, 2018 at 11:56 AM, Fabian Hueske 
> wrote:
> >>
> >>> Hi everybody,
> >>>
> >>> I've opened a PR [1] that reworks parts of the Flink website (
> >>> flink.apache.org).
> >>>
> >>> My goal is to improve the structure of the website and provide more
> >>> valuable information about the project and the community.
> >>>
> >>> A visitor (who doesn't know Flink yet) should be able to easily find
> >>> answers to the following questions:
> >>> * What is Apache Flink?
> >>> * Does it address my use case?
> >>> * Is it credible? / Who is using it?
> >>>
> >>> To achieve that, I have:
> >>> * Reworked the menu structure into three sections to address different
> audiences:
> >>> - Potential users (see above)
> >>> - Users
> >>> - Contributors
> >>> * Reworked start page: updated the figure, added a feature grid, moved
> >>> "Powered By" section up
> >>> * Replaced Features page by more detailed "What is Flink?" pages
> >>> * Reworked "Use Cases" page
> >>>
> >>> The PR should also improve the page for users who have questions about
> >>> Flink or need help.
> >>> For that, I have:
> >>> * Added a "Getting Help" page (less content than the detailed community
> >>> page)
> >>> * Removed IRC channel info
> >>>
> >>> Please give feedback, suggest improvements, and proof read the new
> texts.
> >>>
> >>> Thanks, Fabian
> >>>
> >>> [1] https://github.com/apache/flink-web/pull/109
> >>>
> >
> > 
> > http://about.me/kkrugler
> > +1 530-210-6378
> >
>
>


Re: [DISCUSS] FLIP-6 Problems

2018-06-06 Thread Renjie Liu
Our main use case is Mesos; maybe we can start with Mesos support.
On Wed, Jun 6, 2018 at 5:00 PM Stephan Ewen  wrote:

> The FLIP-6 design was specifically such that it allows for separation of
> Dispatcher, ResourceManager, and JobManagers.
> So that could be another extension at some point.
>
> It should be conceptually rather simple, the dispatcher creates per job a
> new container launch context with the "JobManagerRunner" and starts that.
> In practice, it is quite a bit of work still, with all the details of Yarn
> to take care of.
>
>
>
> On Wed, Jun 6, 2018 at 9:45 AM, Renjie Liu 
> wrote:
>
> > That's really great! I'll help to contribute to the process.
> >
> > On Wed, Jun 6, 2018 at 3:17 PM Till Rohrmann 
> wrote:
> >
> > > Hi Renjie,
> > >
> > > there is already an issue for introducing further scheduling
> constraints
> > > (e.g. tags) to achieve TM isolation when using the session mode [1].
> What
> > > it does not cover is the isolation of the JMs which need to be executed
> > in
> > > their own processes. At the moment they share the same process with the
> > > Dispatcher because it was simpler to do it like that as first
> iteration.
> > > Here is the issue for isolating JobManagers [2].
> > >
> > > Concerning the resource specification, the corresponding issue can be
> > found
> > > here [3].
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-8886
> > > [2] https://issues.apache.org/jira/browse/FLINK-9537
> > > [3] https://issues.apache.org/jira/browse/FLINK-5131
> > >
> > > Cheers,
> > > Till
> > >
> > > On Wed, Jun 6, 2018 at 2:13 AM Renjie Liu 
> > wrote:
> > >
> > > > Hi, Stephan:
> > > >
> > > > Yes that's what I mean. In fact the most important thing is to share the
> > > > dispatcher so that we can have *a centralized gateway for flink job
> > > > management and submission. The problem with per job cluster is that
> we
> > > > can't have a centralized gateway.*
> > > >
> > > > I didn't realize that the job manager also needs to run user code
> > > > before, and yes, that means the job manager should also be isolated.
> > > >
> > > > Wouldn't it be better to separate the job manager from the dispatcher
> > > > so that user code doesn't interfere with each other? In fact it seems
> > > > that in most production environments job isolation is required, since
> > > > nobody wants their job to be affected by others.
> > > >
> > > > On Tue, Jun 5, 2018 at 11:34 PM Stephan Ewen 
> wrote:
> > > >
> > > > > Hi Renjie,
> > > > >
> > > > > When you suggest to have TaskManager isolation in session mode, do
> > you
> > > > mean
> > > > > to have a shared JobManager / Dispatcher, but job-specific
> > > TaskManagers?
> > > > > Is this mainly to reduce the overhead of the per-job JobManager?
> > > > >
> > > > > One assumption so far was that if TaskManager isolation is
> required,
> > > > > JobManager isolation is also required, because some user code
> > > potentially
> > > > > also runs on the JobManager, like CheckpointHooks, Input/Output
> > > Formats,
> > > > > ...
> > > > >
> > > > > Best,
> > > > > Stephan
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Jun 5, 2018 at 4:20 PM, Renjie Liu <
> liurenjie2...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi, Till:
> > > > > >
> > > > > >
> > > > > >1. Does the community have any plan to add task manager isolation
> > > > > >into the session mode?
> > > > > >2. Are there any issues tracking this feature? I want to help
> > > > > >contribute.
> > > > > >3. Thanks for the knowledge, but it can't help if task manager
> > > > > >isolation is not present.
> > > > > >
> > > > > >
> > > > > > On Tue, Jun 5, 2018 at 7:28 PM Till Rohrmann <
> trohrm...@apache.org
> > >
> > > > > wrote:
> > > > > >
> > > > > > > Hi Renjie,
> > > > > > >
> > > > > > > 1) you're right that the Flink session mode does not give you
> > > proper
> > > > > job
> > > > > > > isolation. It is the same as with Flink 1.4 session mode. If
> this
> > > is
> > > > a
> > > > > > > strong requirement for you, then I recommend using the per job
> > > mode.
> > > > > > >
> > > > > > > 2) At the moment it is also not possible to define per job
> > resource
> > > > > > > requirements when using the session mode. This is a feature
> which
> > > the
> > > > > > > community has started implementing but it is not yet fully
> done.
> > I
> > > > > assume
> > > > > > > that the community will continue working on it. At the moment,
> > the
> > > > > > solution
> > > > > > > would be to use the per job mode to not waste unnecessary
> > > resources.
> > > > > > >
> > > > > > > 3) I think the assigned ResourceID for a TaskManager is shown
> in
> > > the
> > > > > web
> > > > > > UI
> > > > > > > and when querying the "/taskmanagers" REST endpoint. The
> resource
> > > id
> > > > is
> > > > > > > derived from the Mesos task id. Would that help to identify
> which
> > > TM
> > > > is
> > > > > > > running on which Mesos task?
> > > > > > >
> > > > > > >

[jira] [Created] (FLINK-9540) Apache Flink 1.4.2 S3 Hadoop library for Hadoop 2.7 is built for Hadoop 2.8 and fails

2018-06-06 Thread Razvan (JIRA)
Razvan created FLINK-9540:
-

 Summary: Apache Flink 1.4.2 S3 Hadoop library for Hadoop 2.7 is 
built for Hadoop 2.8 and fails
 Key: FLINK-9540
 URL: https://issues.apache.org/jira/browse/FLINK-9540
 Project: Flink
  Issue Type: Bug
Reporter: Razvan






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-9539) Integrate flink-shaded 4.0

2018-06-06 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-9539:
---

 Summary: Integrate flink-shaded 4.0
 Key: FLINK-9539
 URL: https://issues.apache.org/jira/browse/FLINK-9539
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.6.0


With the recent release of flink-shaded 4.0 we should bump the versions for all 
dependencies (except netty which is handled in FLINK-3952).

We can now remove the exclusions from the jackson dependencies as they are now 
properly hidden.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] FLIP-6 Problems

2018-06-06 Thread Stephan Ewen
The FLIP-6 design was specifically such that it allows for separation of
Dispatcher, ResourceManager, and JobManagers.
So that could be another extension at some point.

It should be conceptually rather simple, the dispatcher creates per job a
new container launch context with the "JobManagerRunner" and starts that.
In practice, it is quite a bit of work still, with all the details of Yarn
to take care of.



On Wed, Jun 6, 2018 at 9:45 AM, Renjie Liu  wrote:

> That's really great! I'll help to contribute to the process.
>
> On Wed, Jun 6, 2018 at 3:17 PM Till Rohrmann  wrote:
>
> > Hi Renjie,
> >
> > there is already an issue for introducing further scheduling constraints
> > (e.g. tags) to achieve TM isolation when using the session mode [1]. What
> > it does not cover is the isolation of the JMs which need to be executed
> in
> > their own processes. At the moment they share the same process with the
> > Dispatcher because it was simpler to do it like that as first iteration.
> > Here is the issue for isolating JobManagers [2].
> >
> > Concerning the resource specification, the corresponding issue can be
> found
> > here [3].
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-8886
> > [2] https://issues.apache.org/jira/browse/FLINK-9537
> > [3] https://issues.apache.org/jira/browse/FLINK-5131
> >
> > Cheers,
> > Till
> >
> > On Wed, Jun 6, 2018 at 2:13 AM Renjie Liu 
> wrote:
> >
> > > Hi, Stephan:
> > >
> > > Yes that's what I mean. In fact the most important thing is to share the
> > > dispatcher so that we can have *a centralized gateway for flink job
> > > management and submission. The problem with per job cluster is that we
> > > can't have a centralized gateway.*
> > >
> > > I didn't realize that the job manager also needs to run user code
> > > before, and yes, that means the job manager should also be isolated.
> > >
> > > Wouldn't it be better to separate the job manager from the dispatcher so
> > > that user code doesn't interfere with each other? In fact it seems that in
> > > most production environments job isolation is required, since nobody wants
> > > their job to be affected by others.
> > >
> > > On Tue, Jun 5, 2018 at 11:34 PM Stephan Ewen  wrote:
> > >
> > > > Hi Renjie,
> > > >
> > > > When you suggest to have TaskManager isolation in session mode, do
> you
> > > mean
> > > > to have a shared JobManager / Dispatcher, but job-specific
> > TaskManagers?
> > > > Is this mainly to reduce the overhead of the per-job JobManager?
> > > >
> > > > One assumption so far was that if TaskManager isolation is required,
> > > > JobManager isolation is also required, because some user code
> > potentially
> > > > also runs on the JobManager, like CheckpointHooks, Input/Output
> > Formats,
> > > > ...
> > > >
> > > > Best,
> > > > Stephan
> > > >
> > > >
> > > >
> > > > On Tue, Jun 5, 2018 at 4:20 PM, Renjie Liu 
> > > > wrote:
> > > >
> > > > > Hi, Till:
> > > > >
> > > > >
> > > > >1. Does the community have any plan to add task manager isolation
> > > > >into the session mode?
> > > > >2. Are there any issues tracking this feature? I want to help
> > > > >contribute.
> > > > >3. Thanks for the knowledge, but it can't help if task manager
> > > > >isolation is not present.
> > > > >
> > > > >
> > > > > On Tue, Jun 5, 2018 at 7:28 PM Till Rohrmann  >
> > > > wrote:
> > > > >
> > > > > > Hi Renjie,
> > > > > >
> > > > > > 1) you're right that the Flink session mode does not give you
> > > > > > proper job isolation. It is the same as with Flink 1.4 session
> > > > > > mode. If this is a strong requirement for you, then I recommend
> > > > > > using the per job mode.
> > > > > >
> > > > > > 2) At the moment it is also not possible to define per job resource
> > > > > > requirements when using the session mode. This is a feature which
> > > > > > the community has started implementing but it is not yet fully
> > > > > > done. I assume that the community will continue working on it. At
> > > > > > the moment, the solution would be to use the per job mode to not
> > > > > > waste unnecessary resources.
> > > > > >
> > > > > > 3) I think the assigned ResourceID for a TaskManager is shown in
> > > > > > the web UI and when querying the "/taskmanagers" REST endpoint.
> > > > > > The resource id is derived from the Mesos task id. Would that help
> > > > > > to identify which TM is running on which Mesos task?
> > > > > >
> > > > > > Cheers,
> > > > > > Till
> > > > > >
> > > > > > On Tue, Jun 5, 2018 at 5:13 AM Renjie Liu <
> liurenjie2...@gmail.com
> > >
> > > > > wrote:
> > > > > >
> > > > > > > -- Forwarded message -
> > > > > > > From: Renjie Liu 
> > > > > > > Date: Tue, Jun 5, 2018 at 10:43 AM
> > > > > > > Subject: [DISCUSS] FLIP-6 Problems
> > > > > > > To: user 
> > > > > > >
> > > > > > >
> > > > > > > Hi:
> > > > > > >
> > > > > > > We've deployed flink 1.5.0 and tested the new

Re: [1.4.2] mvn clean package command takes to much time

2018-06-06 Thread Chesnay Schepler
you only have to compile the module that you changed along with 
flink-dist to test things locally.


On 06.06.2018 10:27, Marvin777 wrote:

Hi, all.
It takes a long time to modify some of the code and recompile it. The
process is painful.
Is there any way that I can save time?

Thanks!





[1.4.2] mvn clean package command takes to much time

2018-06-06 Thread Marvin777
Hi, all.
It takes a long time to modify some of the code and recompile it. The
process is painful.
Is there any way that I can save time?

Thanks!


[DISCUSS] Flink 1.4 and below STOPS writing to Kinesis after June 12th.

2018-06-06 Thread Bowen Li
Hi,

I think the following email thread might have gone lost.

Dyana brought to our attention that AWS has informed users that KPL versions
< 0.12.6 will *stop working* starting from the 12th of June. Flink 1.4 is
using KPL 0.12.5 and Flink 1.5 uses 0.12.6, *so Flink 1.4 and below (1.3,
etc) will be impacted.*

I think we should try our best to communicate this to our users on the user
mailing list and all other possible channels, in order not to disrupt their
production pipelines.

A quick solution we can suggest to users is to package and use
flink-connector-kinesis in Flink 1.5. Given that the public APIs don't
change, flink-connector-kinesis in Flink 1.5 should work with Flink 1.4 and
below, but that needs verification.

What do you think?

Thanks, Bowen


-- Forwarded message --
From: Bowen Li 
Date: Fri, May 11, 2018 at 10:28 AM
Subject: Re: KPL in current stable 1.4.2 and below, upcoming problem
To: dev@flink.apache.org, "Tzu-Li (Gordon) Tai" 


Thanks, this is a great heads-up! Flink 1.4 is using KPL 0.12.5, *so Flink
version 1.4 or below will be affected.*

The Kinesis sink is in flink-connector-kinesis. The good news is that whoever
is using flink-connector-kinesis right now is building it themselves, because
Flink doesn't publish that jar to Maven due to a licensing issue. So these
users, like you Dyana, already have build experience and may try to bump
KPL themselves.

I think it'll be great if Flink can bump KPL in Flink 1.2/1.3/1.4 and
release minor versions for them as official support. It also requires
checking backward compatibility. This can be done after releasing 1.5.
@Gordon may take the final call on how to eventually do it.

Thanks,
Bowen


On Thu, May 10, 2018 at 1:35 AM, Dyana Rose 
wrote:

> Hello,
>
> We've received notification from AWS that the Kinesis Producer Library
> versions < 0.12.6 will stop working after the 12th of June (assuming the
> date in the email is in US format...)
>
> Flink v1.5.0 has the KPL version at 0.12.6 so it will be fine when it's
> released. However using the kinesis connector in any previous version looks
> like they'll have an issue.
>
> I'm not sure how/if you want to communicate this. We build Flink ourselves,
> so I plan on having a look at any changes done to the Kinesis Sink in
> v1.5.0 and then bumping the KPL version in our fork and rebuilding.
>
> Thanks,
> Dyana
>
> below is the email we received (note: we're in eu-west-1):
> 
>
> Hello,
>
>
>
> Your action is required: please update clients running Kinesis Producer
> Library 0.12.5 or older or you will experience a breaking change to your
> application.
>
>
>
> We've discovered you have one or more clients writing data to Amazon
> Kinesis Data Streams running an outdated version of the Kinesis Producer
> Library. On 6/12 these clients will be impacted if they are not updated to
> Kinesis Producer Library version 0.12.6 or newer. On 06/12 Kinesis Data
> Streams will install ATS certificates which will prevent these outdated
> clients from writing to a Kinesis Data Stream. The result of this change
> will break any producer using KPL 0.12.5 or older.
>
>
> * How do I update clients and applications to use the latest version of the
> Kinesis Producer Library?
>
> You will need to ensure producers leveraging the Kinesis Producer Library
> have upgraded to version 0.12.6 or newer. If you operate older versions
> your application will break due to untrusted SSL certificates.
>
> Via Maven install Kinesis Producer Library version 0.12.6 or higher [2]
>
> After you've configured your clients to use the new version, you're done.
>
> * What if I have questions or issues?
>
> If you have questions or issues, please contact your AWS Technical Account
> Manager or AWS support and file a support ticket [3].
>
> [1] https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-upgrades.html
>
> [2] http://search.maven.org/#artifactdetails%7Ccom.amazonaws%7Camazon-kinesis-producer%7C0.12.6%7Cjar
>
> [3] https://aws.amazon.com/support
>
>
>
> -  Amazon Kinesis Data Streams Team
> -
>
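For reference, the Maven step in the notice above amounts to bumping the
producer dependency in your pom. A sketch of the snippet, using the
coordinates from the Maven Central link above:

```xml
<!-- Kinesis Producer Library at or above the ATS-compatible version.
     Coordinates: com.amazonaws:amazon-kinesis-producer on Maven Central. -->
<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>amazon-kinesis-producer</artifactId>
  <version>0.12.6</version>
</dependency>
```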


Re: [DISCUSS] FLIP-6 Problems

2018-06-06 Thread Renjie Liu
That's really great! I'll help to contribute to the process.

On Wed, Jun 6, 2018 at 3:17 PM Till Rohrmann  wrote:

> Hi Renjie,
>
> there is already an issue for introducing further scheduling constraints
> (e.g. tags) to achieve TM isolation when using the session mode [1]. What
> it does not cover is the isolation of the JMs which need to be executed in
> their own processes. At the moment they share the same process with the
> Dispatcher because it was simpler to do it like that as a first iteration.
> Here is the issue for isolating JobManagers [2].
>
> Concerning the resource specification, the corresponding issue can be found
> here [3].
>
> [1] https://issues.apache.org/jira/browse/FLINK-8886
> [2] https://issues.apache.org/jira/browse/FLINK-9537
> [3] https://issues.apache.org/jira/browse/FLINK-5131
>
> Cheers,
> Till
>
> On Wed, Jun 6, 2018 at 2:13 AM Renjie Liu  wrote:
>
> > Hi, Stephan:
> >
> > Yes, that's what I mean. In fact the most important thing is to share the
> > dispatcher so that we can have *a centralized gateway for flink job
> > management and submission. The problem with per job cluster is that we
> > can't have a centralized gateway.*
> >
> > I didn't realize before that the job manager also needs to run user code,
> > and yes, that means the job manager should also be isolated.
> >
> > Wouldn't it be better to separate the job manager from the dispatcher so
> > that different jobs' user code doesn't interfere with each other? In fact
> > it seems that in most production environments job isolation is required,
> > since nobody wants their job to be affected by others.
> >
> > On Tue, Jun 5, 2018 at 11:34 PM Stephan Ewen  wrote:
> >
> > > Hi Renjie,
> > >
> > > When you suggest to have TaskManager isolation in session mode, do you
> > > mean to have a shared JobManager / Dispatcher, but job-specific
> > > TaskManagers? Is this mainly to reduce the overhead of the per-job
> > > JobManager?
> > >
> > > One assumption so far was that if TaskManager isolation is required,
> > > JobManager isolation is also required, because some user code
> > > potentially also runs on the JobManager, like CheckpointHooks,
> > > Input/Output Formats, ...
> > >
> > > Best,
> > > Stephan
> > >
> > >
> > >
> > > On Tue, Jun 5, 2018 at 4:20 PM, Renjie Liu 
> > > wrote:
> > >
> > > > Hi, Till:
> > > >
> > > >
> > > > 1. Does the community have any plans to add task manager isolation
> > > > into the session mode?
> > > > 2. Are there any issues to track this feature? I want to help
> > > > contribute.
> > > > 3. Thanks for the knowledge, but it can't help if task manager
> > > > isolation is not present.
> > > >
> > > >
> > > > On Tue, Jun 5, 2018 at 7:28 PM Till Rohrmann 
> > > wrote:
> > > >
> > > > > Hi Renjie,
> > > > >
> > > > > 1) you're right that the Flink session mode does not give you proper
> > > > > job isolation. It is the same as with Flink 1.4 session mode. If
> > > > > this is a strong requirement for you, then I recommend using the per
> > > > > job mode.
> > > > >
> > > > > 2) At the moment it is also not possible to define per job resource
> > > > > requirements when using the session mode. This is a feature which
> > > > > the community has started implementing but it is not yet fully done.
> > > > > I assume that the community will continue working on it. At the
> > > > > moment, the solution would be to use the per job mode to not waste
> > > > > unnecessary resources.
> > > > >
> > > > > 3) I think the assigned ResourceID for a TaskManager is shown in the
> > > > > web UI and when querying the "/taskmanagers" REST endpoint. The
> > > > > resource id is derived from the Mesos task id. Would that help to
> > > > > identify which TM is running on which Mesos task?
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Tue, Jun 5, 2018 at 5:13 AM Renjie Liu  >
> > > > wrote:
> > > > >
> > > > > > -- Forwarded message -
> > > > > > From: Renjie Liu 
> > > > > > Date: Tue, Jun 5, 2018 at 10:43 AM
> > > > > > Subject: [DISCUSS] FLIP-6 Problems
> > > > > > To: user 
> > > > > >
> > > > > >
> > > > > > Hi:
> > > > > >
> > > > > > We've deployed Flink 1.5.0 and tested the new cluster manager;
> > > > > > it's really great for Flink to be elastic. However, we've also
> > > > > > found some problems that block us from deploying it to a
> > > > > > production environment.
> > > > > >
> > > > > > 1. Task manager isolation. Currently Flink allows different jobs
> > > > > > to execute on the same task managers. This is unacceptable in
> > > > > > production environments, since a badly written job may kill task
> > > > > > managers and affect other jobs.
> > > > > > 2. Per-job resource configuration. Currently a Flink session
> > > > > > cluster can only allocate task managers of the same size and
> > > > > > configuration. This may waste a lot of resources if we have a lot
> > > > > > of jobs with dif

[jira] [Created] (FLINK-9538) Make KeyedStateFunction an interface

2018-06-06 Thread Dawid Wysakowicz (JIRA)
Dawid Wysakowicz created FLINK-9538:
---

 Summary: Make KeyedStateFunction an interface
 Key: FLINK-9538
 URL: https://issues.apache.org/jira/browse/FLINK-9538
 Project: Flink
  Issue Type: Improvement
Reporter: Dawid Wysakowicz


I propose to change the KeyedStateFunction from an abstract class to an
interface (a FunctionalInterface in particular) to enable passing lambdas.
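A sketch of what the proposal could enable — the interface and helper below
are modeled loosely on Flink's `KeyedStateFunction`, but the exact signatures
here are assumptions for illustration, not the real Flink API:

```java
// Hypothetical shape of KeyedStateFunction as a @FunctionalInterface.
// Signature modeled on, but not identical to, Flink's actual class.
@FunctionalInterface
interface KeyedStateFunction<K, S> {
    void process(K key, S state) throws Exception;
}

class KeyedStateFunctionSketch {
    // Stand-in for an API that accepts the function (illustrative only).
    static <K, S> void applyToKey(K key, S state, KeyedStateFunction<K, S> fn)
            throws Exception {
        fn.process(key, state);
    }

    public static void main(String[] args) throws Exception {
        // As an abstract class, this call site would need an anonymous
        // subclass; as a functional interface, a lambda suffices:
        applyToKey("user-42", 3, (key, count) ->
                System.out.println(key + " -> " + count)); // prints "user-42 -> 3"
    }
}
```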



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [WEBSITE] Proposal to rework the Flink website

2018-06-06 Thread Aljoscha Krettek
Yes, making sure that Google search results point to the most recent docs
would be very good. :+1

Also +1 to the general effort, of course.

> On 5. Jun 2018, at 20:19, Ken Krugler  wrote:
> 
> Along these lines, it would help to add a sitemap (and the robots.txt 
> required to reference it) for flink.apache.org and ci.apache.org (for 
> /projects/flink)
> 
> You can see what Tomcat did along these lines - 
> http://tomcat.apache.org/robots.txt references 
> http://tomcat.apache.org/sitemap.xml, which is a sitemap index file pointing 
> to http://tomcat.apache.org/sitemap-main.xml
> 
> By doing this, you can emphasize more recent versions of docs. There are 
> other benefits, but reducing poor Google search results (to me) is the 
> biggest win.
> 
> E.g. https://www.google.com/search?q=flink+reducingstate (a search for
> "flink reducing state") shows the 1.3 Javadocs (hit #1), master
> (1.6-SNAPSHOT) Javadocs (hit #2), and then many pages of other results.
> 
> Whereas the Javadocs for 1.5 and 1.4 are nowhere to be found.
> 
> Thoughts?
> 
> — Ken
> 
>> On Jun 5, 2018, at 9:46 AM, Stephan Ewen  wrote:
>> 
>> Big +1 to this!
>> 
>> I would like to contribute to this effort and help strengthen the way Flink
>> presents itself.
>> 
>> 
>> On Tue, Jun 5, 2018 at 11:56 AM, Fabian Hueske  wrote:
>> 
>>> Hi everybody,
>>> 
>>> I've opened a PR [1] that reworks parts of the Flink website (
>>> flink.apache.org).
>>> 
>>> My goal is to improve the structure of the website and provide more
>>> valuable information about the project and the community.
>>> 
>>> A visitor (who doesn't know Flink yet) should be able to easily find
>>> answers to the following questions:
>>> * What is Apache Flink?
>>> * Does it address my use case?
>>> * Is it credible? / Who is using it?
>>> 
>>> To achieve that, I have:
>>> * Reworked the menu structure into three sections to address different audiences:
>>> - Potential users (see above)
>>> - Users
>>> - Contributors
>>> * Reworked start page: updated the figure, added a feature grid, moved
>>> "Powered By" section up
>>> * Replaced Features page by more detailed "What is Flink?" pages
>>> * Reworked "Use Cases" page
>>> 
>>> The PR should also improve the page for users who have questions about
>>> Flink or need help.
>>> For that, I have:
>>> * Added a "Getting Help" page (less content than the detailed community
>>> page)
>>> * Removed IRC channel info
>>> 
>>> Please give feedback, suggest improvements, and proofread the new texts.
>>> 
>>> Thanks, Fabian
>>> 
>>> [1] https://github.com/apache/flink-web/pull/109
>>> 
> 
> 
> http://about.me/kkrugler
> +1 530-210-6378
> 
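For illustration, the robots.txt + sitemap-index pair Ken describes could
look like the following (file contents and URLs are hypothetical, mirroring
the Tomcat layout linked above):

```
# https://flink.apache.org/robots.txt (hypothetical)
User-agent: *
Sitemap: https://flink.apache.org/sitemap.xml
```

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- https://flink.apache.org/sitemap.xml (hypothetical sitemap index) -->
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://flink.apache.org/sitemap-main.xml</loc>
  </sitemap>
</sitemapindex>
```

The main sitemap could then list only the latest stable docs, which is what
steers crawlers toward current versions.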



Re: [DISCUSS] FLIP-6 Problems

2018-06-06 Thread Till Rohrmann
Hi Renjie,

there is already an issue for introducing further scheduling constraints
(e.g. tags) to achieve TM isolation when using the session mode [1]. What
it does not cover is the isolation of the JMs which need to be executed in
their own processes. At the moment they share the same process with the
Dispatcher because it was simpler to do it like that as a first iteration.
Here is the issue for isolating JobManagers [2].

Concerning the resource specification, the corresponding issue can be found
here [3].

[1] https://issues.apache.org/jira/browse/FLINK-8886
[2] https://issues.apache.org/jira/browse/FLINK-9537
[3] https://issues.apache.org/jira/browse/FLINK-5131

Cheers,
Till

On Wed, Jun 6, 2018 at 2:13 AM Renjie Liu  wrote:

> Hi, Stephan:
>
> Yes, that's what I mean. In fact the most important thing is to share the
> dispatcher so that we can have *a centralized gateway for flink job
> management and submission. The problem with per job cluster is that we
> can't have a centralized gateway.*
>
> I didn't realize before that the job manager also needs to run user code,
> and yes, that means the job manager should also be isolated.
>
> Wouldn't it be better to separate the job manager from the dispatcher so
> that different jobs' user code doesn't interfere with each other? In fact
> it seems that in most production environments job isolation is required,
> since nobody wants their job to be affected by others.
>
> On Tue, Jun 5, 2018 at 11:34 PM Stephan Ewen  wrote:
>
> > Hi Renjie,
> >
> > When you suggest to have TaskManager isolation in session mode, do you
> > mean to have a shared JobManager / Dispatcher, but job-specific
> > TaskManagers? Is this mainly to reduce the overhead of the per-job
> > JobManager?
> >
> > One assumption so far was that if TaskManager isolation is required,
> > JobManager isolation is also required, because some user code potentially
> > also runs on the JobManager, like CheckpointHooks, Input/Output Formats,
> > ...
> >
> > Best,
> > Stephan
> >
> >
> >
> > On Tue, Jun 5, 2018 at 4:20 PM, Renjie Liu 
> > wrote:
> >
> > > Hi, Till:
> > >
> > >
> > > 1. Does the community have any plans to add task manager isolation
> > > into the session mode?
> > > 2. Are there any issues to track this feature? I want to help
> > > contribute.
> > > 3. Thanks for the knowledge, but it can't help if task manager
> > > isolation is not present.
> > >
> > >
> > > On Tue, Jun 5, 2018 at 7:28 PM Till Rohrmann 
> > wrote:
> > >
> > > > Hi Renjie,
> > > >
> > > > 1) you're right that the Flink session mode does not give you proper
> > job
> > > > isolation. It is the same as with Flink 1.4 session mode. If this is
> a
> > > > strong requirement for you, then I recommend using the per job mode.
> > > >
> > > > 2) At the moment it is also not possible to define per job resource
> > > > requirements when using the session mode. This is a feature which the
> > > > community has started implementing but it is not yet fully done. I
> > assume
> > > > that the community will continue working on it. At the moment, the
> > > solution
> > > > would be to use the per job mode to not waste unnecessary resources.
> > > >
> > > > 3) I think the assigned ResourceID for a TaskManager is shown in the
> > > > web UI and when querying the "/taskmanagers" REST endpoint. The
> > > > resource id is derived from the Mesos task id. Would that help to
> > > > identify which TM is running on which Mesos task?
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Tue, Jun 5, 2018 at 5:13 AM Renjie Liu 
> > > wrote:
> > > >
> > > > > -- Forwarded message -
> > > > > From: Renjie Liu 
> > > > > Date: Tue, Jun 5, 2018 at 10:43 AM
> > > > > Subject: [DISCUSS] FLIP-6 Problems
> > > > > To: user 
> > > > >
> > > > >
> > > > > Hi:
> > > > >
> > > > > We've deployed Flink 1.5.0 and tested the new cluster manager; it's
> > > > > really great for Flink to be elastic. However, we've also found some
> > > > > problems that block us from deploying it to a production environment.
> > > > >
> > > > > 1. Task manager isolation. Currently Flink allows different jobs to
> > > > > execute on the same task managers. This is unacceptable in
> > > > > production environments, since a badly written job may kill task
> > > > > managers and affect other jobs.
> > > > > 2. Per-job resource configuration. Currently a Flink session cluster
> > > > > can only allocate task managers of the same size and configuration.
> > > > > This may waste a lot of resources if we have a lot of jobs with
> > > > > different resource requirements.
> > > > > 3. The task manager's name is meaningless. This is a problem since
> > > > > we can't monitor the status of containers in the Mesos environment.
> > > > >
> > > > > One solution to the above problems is to use a per-job cluster, but
> > > > > a centralized cluster manager can help to manage Flink deployments
> > > > > and jobs better.
> > > > >
> > > > > How you guys think about
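As an aside on Till's `/taskmanagers` suggestion quoted above: the resource
ids can be pulled out of the REST response programmatically. The sketch below
parses a hand-written sample payload whose shape is an assumption modeled on
Flink 1.5's REST API, so verify it against your own cluster's output:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TaskManagerIds {
    // Pulls the "id" fields out of a /taskmanagers JSON response with a
    // regex (good enough for a quick check; use a JSON parser in real code).
    static List<String> extractIds(String json) {
        List<String> ids = new ArrayList<>();
        Matcher m = Pattern.compile("\"id\"\\s*:\\s*\"([^\"]+)\"").matcher(json);
        while (m.find()) {
            ids.add(m.group(1));
        }
        return ids;
    }

    public static void main(String[] args) {
        // Hand-written sample shaped like the REST response; in practice
        // fetch http://<jobmanager>:8081/taskmanagers instead.
        String sample = "{\"taskmanagers\":["
                + "{\"id\":\"mesos-task-00001\",\"slotsNumber\":4},"
                + "{\"id\":\"mesos-task-00002\",\"slotsNumber\":4}]}";
        System.out.println(extractIds(sample)); // [mesos-task-00001, mesos-task-00002]
    }
}
```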

[jira] [Created] (FLINK-9537) JobManager isolation in session mode

2018-06-06 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-9537:


 Summary: JobManager isolation in session mode
 Key: FLINK-9537
 URL: https://issues.apache.org/jira/browse/FLINK-9537
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Coordination
Affects Versions: 1.5.0, 1.6.0
Reporter: Till Rohrmann


Currently, all {{JobManagers}} are executed in the same process, which also runs 
the {{Dispatcher}} component when using the session mode. This is problematic, 
since the {{JobManager}} also executes user code. Consequently, a bug in a 
single Flink job can cause the failure of the other {{JobManagers}} running in 
the same process. To avoid this, we should add the functionality to run each 
{{JobManager}} in its own process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)