[jira] [Comment Edited] (FLINK-2392) Instable test in flink-yarn-tests

2016-06-16 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333437#comment-15333437
 ] 

Maximilian Michels edited comment on FLINK-2392 at 6/16/16 9:26 AM:


The builds fail because the following exception is logged:

{noformat}
2016-06-16 04:58:37,218 ERROR org.apache.flink.metrics.reporter.JMXReporter - Metric did not comply with JMX MBean naming rules.
javax.management.NotCompliantMBeanException: Interface is not public: org.apache.flink.metrics.reporter.JMXReporter$JmxGaugeMBean
    at com.sun.jmx.mbeanserver.MBeanAnalyzer.<init>(MBeanAnalyzer.java:114)
    at com.sun.jmx.mbeanserver.MBeanAnalyzer.analyzer(MBeanAnalyzer.java:102)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.getAnalyzer(StandardMBeanIntrospector.java:67)
    at com.sun.jmx.mbeanserver.MBeanIntrospector.getPerInterface(MBeanIntrospector.java:192)
    at com.sun.jmx.mbeanserver.MBeanSupport.<init>(MBeanSupport.java:138)
    at com.sun.jmx.mbeanserver.StandardMBeanSupport.<init>(StandardMBeanSupport.java:60)
    at com.sun.jmx.mbeanserver.Introspector.makeDynamicMBean(Introspector.java:192)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:898)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
    at org.apache.flink.metrics.reporter.JMXReporter.notifyOfAddedMetric(JMXReporter.java:113)
    at org.apache.flink.metrics.MetricRegistry.register(MetricRegistry.java:174)
    at org.apache.flink.metrics.groups.AbstractMetricGroup.addMetric(AbstractMetricGroup.java:206)
    at org.apache.flink.metrics.groups.AbstractMetricGroup.gauge(AbstractMetricGroup.java:162)
    at org.apache.flink.runtime.taskmanager.TaskManager$.instantiateClassLoaderMetrics(TaskManager.scala:2298)
    at org.apache.flink.runtime.taskmanager.TaskManager$.org$apache$flink$runtime$taskmanager$TaskManager$$instantiateStatusMetrics(TaskManager.scala:2287)
    at org.apache.flink.runtime.taskmanager.TaskManager.associateWithJobManager(TaskManager.scala:951)
    at org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$handleRegistrationMessage(TaskManager.scala:631)
    at org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$handleMessage$1.applyOrElse(TaskManager.scala:297)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
    at org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:36)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
    at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
    at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
    at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
    at org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
    at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
    at org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:124)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
    at akka.actor.ActorCell.invoke(ActorCell.scala:487)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
    at akka.dispatch.Mailbox.run(Mailbox.scala:221)
    at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
2016-06-16 04:58:37,221 ERROR org.apache.flink.metrics.reporter.JMXReporter - Metric did not comply with JMX MBean naming rules.
javax.management.NotCompliantMBeanException: Interface is not public: org.apache.flink.metrics.reporter.JMXReporter$JmxGaugeMBean
    at com.sun.jmx.mbeanserver.MBeanAnalyzer.<init>(MBeanAnalyzer.java:114)
    at com.sun.jmx.mbeanserver.MBeanAnalyzer.analyzer(MBeanAnalyzer.java:102)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.getAnalyzer(StandardMBeanIntrospector.java:67)
    at com.sun.jmx.mbeanserver.MBeanIntrospector.getPerInterface(MBeanIntrospector.java:192)
    at com.sun.jmx.mbeanserver.MBeanSupport.<init>(MBeanSupport.java:138)
    at com.sun.jmx.mbeanserver.StandardMBeanSupport.<init>(StandardMBeanSupport.java:60)
    at com.sun.jmx.mbeanserver.Introspector.makeDynamicMBean(Introspector.java:192)
    ...
{noformat}
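For context, the "Interface is not public" failure is JMX's Standard MBean rule: the management interface (here {{JMXReporter$JmxGaugeMBean}}) must be a {{public}} interface, or the MBean server rejects registration with {{NotCompliantMBeanException}}. A minimal, self-contained reproduction (hypothetical class names, not Flink's actual code or fix):

```java
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.NotCompliantMBeanException;
import javax.management.ObjectName;
import java.lang.management.ManagementFactory;

public class MBeanVisibilityDemo {
    // Non-public management interface: mirrors the JmxGaugeMBean problem above.
    interface HiddenGaugeMBean { int getValue(); }
    static class HiddenGauge implements HiddenGaugeMBean {
        public int getValue() { return 42; }
    }

    // Public management interface (and public implementation): compliant.
    public interface VisibleGaugeMBean { int getValue(); }
    public static class VisibleGauge implements VisibleGaugeMBean {
        public int getValue() { return 42; }
    }

    /** Attempts registration; returns false if JMX deems the MBean non-compliant. */
    static boolean tryRegister(Object mbean, String name) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            server.registerMBean(mbean, new ObjectName(name));
            return true;
        } catch (NotCompliantMBeanException e) {
            // e.g. "Interface is not public: ...$HiddenGaugeMBean"
            return false;
        } catch (JMException e) {
            throw new RuntimeException(e);
        }
    }
}
```

Declaring the nested interface (and its implementing class) {{public}} is what makes registration succeed.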


[jira] [Updated] (FLINK-4079) YARN properties file used for per-job cluster

2016-06-15 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-4079:
--
Fix Version/s: 1.1.0

> YARN properties file used for per-job cluster
> -
>
> Key: FLINK-4079
> URL: https://issues.apache.org/jira/browse/FLINK-4079
> Project: Flink
>  Issue Type: Bug
>  Components: Command-line client
>Affects Versions: 1.0.3
>Reporter: Ufuk Celebi
>Assignee: Maximilian Michels
> Fix For: 1.1.0, 1.0.4
>
>
> YARN per job clusters (flink run -m yarn-cluster) rely on the hidden YARN 
> properties file, which defines the container configuration. This can lead to 
> unexpected behaviour, because the per-job-cluster configuration is merged  
> with the YARN properties file (or used as only configuration source).
> A user ran into this as follows:
> - Create a long-lived YARN session with HA (creates a hidden YARN properties 
> file)
> - Submits standalone batch jobs with a per job cluster (flink run -m 
> yarn-cluster). The batch jobs get submitted to the long lived HA cluster, 
> because of the properties file.
> [~mxm] Am I correct in assuming that this is only relevant for the 1.0 branch 
> and will be fixed with the client refactoring you are working on?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-4079) YARN properties file used for per-job cluster

2016-06-15 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-4079:
--
Component/s: Command-line client

> YARN properties file used for per-job cluster
> -
>
> Key: FLINK-4079
> URL: https://issues.apache.org/jira/browse/FLINK-4079
> Project: Flink
>  Issue Type: Bug
>  Components: Command-line client
>Affects Versions: 1.0.3
>Reporter: Ufuk Celebi
>Assignee: Maximilian Michels
> Fix For: 1.1.0, 1.0.4
>
>
> YARN per job clusters (flink run -m yarn-cluster) rely on the hidden YARN 
> properties file, which defines the container configuration. This can lead to 
> unexpected behaviour, because the per-job-cluster configuration is merged  
> with the YARN properties file (or used as only configuration source).
> A user ran into this as follows:
> - Create a long-lived YARN session with HA (creates a hidden YARN properties 
> file)
> - Submits standalone batch jobs with a per job cluster (flink run -m 
> yarn-cluster). The batch jobs get submitted to the long lived HA cluster, 
> because of the properties file.
> [~mxm] Am I correct in assuming that this is only relevant for the 1.0 branch 
> and will be fixed with the client refactoring you are working on?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (FLINK-4079) YARN properties file used for per-job cluster

2016-06-15 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reopened FLINK-4079:
---
  Assignee: Maximilian Michels

I see what you mean now :)

The issue is that the Yarn properties file is loaded regardless of whether "-m 
yarn-cluster" is specified on the command line. This loads the dynamic 
properties from the Yarn properties file and applies all configuration of the 
running (session) cluster to the to-be-created cluster. This is not expected 
behavior.
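The intended decision can be sketched in a few lines (hypothetical names, not the actual client code): consult the hidden Yarn properties file only when the user did not explicitly request a fresh per-job cluster via {{-m yarn-cluster}}.

```java
import java.util.Optional;

public class YarnPropertiesPolicy {
    // Sketch of the fix discussed above: the hidden Yarn properties file
    // (written by yarn-session.sh) should only be consulted when the user
    // did NOT explicitly ask for a new per-job cluster.
    static boolean usePropertiesFile(Optional<String> jobManagerOption) {
        // "-m yarn-cluster" means: start a new per-job cluster, so the
        // session's properties file must be ignored. Any other -m value
        // (or none) may fall back to the properties file.
        return jobManagerOption.map(m -> !m.equals("yarn-cluster")).orElse(true);
    }
}
```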

> YARN properties file used for per-job cluster
> -
>
> Key: FLINK-4079
> URL: https://issues.apache.org/jira/browse/FLINK-4079
> Project: Flink
>  Issue Type: Bug
>  Components: Command-line client
>Affects Versions: 1.0.3
>Reporter: Ufuk Celebi
>Assignee: Maximilian Michels
> Fix For: 1.1.0, 1.0.4
>
>
> YARN per job clusters (flink run -m yarn-cluster) rely on the hidden YARN 
> properties file, which defines the container configuration. This can lead to 
> unexpected behaviour, because the per-job-cluster configuration is merged  
> with the YARN properties file (or used as only configuration source).
> A user ran into this as follows:
> - Create a long-lived YARN session with HA (creates a hidden YARN properties 
> file)
> - Submits standalone batch jobs with a per job cluster (flink run -m 
> yarn-cluster). The batch jobs get submitted to the long lived HA cluster, 
> because of the properties file.
> [~mxm] Am I correct in assuming that this is only relevant for the 1.0 branch 
> and will be fixed with the client refactoring you are working on?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4079) YARN properties file used for per-job cluster

2016-06-15 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15332032#comment-15332032
 ] 

Maximilian Michels commented on FLINK-4079:
---

Exactly, the Yarn properties file is a means to easily submit jobs against a 
long-running Flink cluster (a so-called Yarn session). At the moment it is 
actually the only proper way, short of figuring out the JobManager location 
manually and essentially treating the Yarn cluster as a standalone cluster.

I would also prefer to get rid of the properties file and use only the Yarn 
application id instead. We could probably do that in one of the next releases, 
but it would be a breaking change for the 1.1 release.

> YARN properties file used for per-job cluster
> -
>
> Key: FLINK-4079
> URL: https://issues.apache.org/jira/browse/FLINK-4079
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.0.3
>Reporter: Ufuk Celebi
> Fix For: 1.0.4
>
>
> YARN per job clusters (flink run -m yarn-cluster) rely on the hidden YARN 
> properties file, which defines the container configuration. This can lead to 
> unexpected behaviour, because the per-job-cluster configuration is merged  
> with the YARN properties file (or used as only configuration source).
> A user ran into this as follows:
> - Create a long-lived YARN session with HA (creates a hidden YARN properties 
> file)
> - Submits standalone batch jobs with a per job cluster (flink run -m 
> yarn-cluster). The batch jobs get submitted to the long lived HA cluster, 
> because of the properties file.
> [~mxm] Am I correct in assuming that this is only relevant for the 1.0 branch 
> and will be fixed with the client refactoring you are working on?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-4079) YARN properties file used for per-job cluster

2016-06-15 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved FLINK-4079.
---
Resolution: Not A Problem

This is a feature :) The Yarn properties file is supposed to be picked up. We 
can't break this behavior because users rely on it.

My changes will give an appropriate error message if the properties file 
configuration doesn't correspond to a running application in the Yarn cluster. 

> YARN properties file used for per-job cluster
> -
>
> Key: FLINK-4079
> URL: https://issues.apache.org/jira/browse/FLINK-4079
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.0.3
>Reporter: Ufuk Celebi
> Fix For: 1.0.4
>
>
> YARN per job clusters (flink run -m yarn-cluster) rely on the hidden YARN 
> properties file, which defines the container configuration. This can lead to 
> unexpected behaviour, because the per-job-cluster configuration is merged  
> with the YARN properties file (or used as only configuration source).
> A user ran into this as follows:
> - Create a long-lived YARN session with HA (creates a hidden YARN properties 
> file)
> - Submits standalone batch jobs with a per job cluster (flink run -m 
> yarn-cluster). The batch jobs get submitted to the long lived HA cluster, 
> because of the properties file.
> [~mxm] Am I correct in assuming that this is only relevant for the 1.0 branch 
> and will be fixed with the client refactoring you are working on?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-3105) Submission in per job YARN cluster mode reuses properties file of long-lived session

2016-06-15 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved FLINK-3105.
---
Resolution: Not A Problem

This is actually the expected behavior.

> Submission in per job YARN cluster mode reuses properties file of long-lived 
> session
> 
>
> Key: FLINK-3105
> URL: https://issues.apache.org/jira/browse/FLINK-3105
> Project: Flink
>  Issue Type: Bug
>  Components: YARN Client
>Affects Versions: 0.10.1
>Reporter: Ufuk Celebi
>
> Starting a YARN session with `bin/yarn-session.sh` creates a properties file, 
> which is used to parse job manager information when submitting jobs.
> This properties file is also used when submitting a job with {{bin/flink run 
> -m yarn-cluster}}. The {{yarn-cluster}} mode should actually start a new YARN 
> session.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-4030) ScalaShellITCase gets stuck

2016-06-15 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved FLINK-4030.
---
Resolution: Fixed

Fixed with downgrade to Surefire 2.18.1 in 
c8fed99e3e85a4d27c6134cfa3e07fb3a8e1da2a

> ScalaShellITCase gets stuck
> ---
>
> Key: FLINK-4030
> URL: https://issues.apache.org/jira/browse/FLINK-4030
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> The {{ScalaShellITCase}} fails regularly on Travis:
> {noformat}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-surefire-plugin:2.19.1:test 
> (integration-tests) on project flink-scala-shell_2.10: ExecutionException The 
> forked VM terminated without properly saying goodbye. VM crash or System.exit 
> called?
> [ERROR] Command was /bin/sh -c cd 
> /home/travis/build/apache/flink/flink-scala-shell/target && 
> /usr/lib/jvm/java-8-oracle/jre/bin/java -Xms256m -Xmx800m -Dmvn.forkNumber=1 
> -XX:-UseGCOverheadLimit -jar 
> /home/travis/build/apache/flink/flink-scala-shell/target/surefire/surefirebooter5669599672364114854.jar
>  
> /home/travis/build/apache/flink/flink-scala-shell/target/surefire/surefire854521958557782961tmp
>  
> /home/travis/build/apache/flink/flink-scala-shell/target/surefire/surefire_7186137661441589930637tmp
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3677) FileInputFormat: Allow to specify include/exclude file name patterns

2016-06-13 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15327049#comment-15327049
 ] 

Maximilian Michels commented on FLINK-3677:
---

I assigned the issue to you. Please have a look at FLINK-2314; AFAIK it also 
implements file patterns, but only for streaming.

> FileInputFormat: Allow to specify include/exclude file name patterns
> 
>
> Key: FLINK-3677
> URL: https://issues.apache.org/jira/browse/FLINK-3677
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Maximilian Michels
>Assignee: Ivan Mushketyk
>Priority: Minor
>  Labels: starter
>
> It would be nice to be able to specify a regular expression to filter files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3677) FileInputFormat: Allow to specify include/exclude file name patterns

2016-06-13 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-3677:
--
Assignee: Ivan Mushketyk

> FileInputFormat: Allow to specify include/exclude file name patterns
> 
>
> Key: FLINK-3677
> URL: https://issues.apache.org/jira/browse/FLINK-3677
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Maximilian Michels
>Assignee: Ivan Mushketyk
>Priority: Minor
>  Labels: starter
>
> It would be nice to be able to specify a regular expression to filter files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-4030) ScalaShellITCase gets stuck

2016-06-09 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-4030:
--
Summary: ScalaShellITCase gets stuck  (was: ScalaShellITCase)

> ScalaShellITCase gets stuck
> ---
>
> Key: FLINK-4030
> URL: https://issues.apache.org/jira/browse/FLINK-4030
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> The {{ScalaShellITCase}} fails regularly on Travis:
> {noformat}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-surefire-plugin:2.19.1:test 
> (integration-tests) on project flink-scala-shell_2.10: ExecutionException The 
> forked VM terminated without properly saying goodbye. VM crash or System.exit 
> called?
> [ERROR] Command was /bin/sh -c cd 
> /home/travis/build/apache/flink/flink-scala-shell/target && 
> /usr/lib/jvm/java-8-oracle/jre/bin/java -Xms256m -Xmx800m -Dmvn.forkNumber=1 
> -XX:-UseGCOverheadLimit -jar 
> /home/travis/build/apache/flink/flink-scala-shell/target/surefire/surefirebooter5669599672364114854.jar
>  
> /home/travis/build/apache/flink/flink-scala-shell/target/surefire/surefire854521958557782961tmp
>  
> /home/travis/build/apache/flink/flink-scala-shell/target/surefire/surefire_7186137661441589930637tmp
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-4031) Nightly Jenkins job doesn't deploy sources

2016-06-08 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-4031.
-
Resolution: Fixed

Fixed via fce64e193e32c9f639755f5b57222e6d7e89f150

> Nightly Jenkins job doesn't deploy sources
> --
>
> Key: FLINK-4031
> URL: https://issues.apache.org/jira/browse/FLINK-4031
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 1.1.0
>
>
> We need to adjust the {{deploy_to_maven.sh}} script to enable deployment of 
> the sources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4031) Nightly Jenkins job doesn't deploy sources

2016-06-08 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-4031:
-

 Summary: Nightly Jenkins job doesn't deploy sources
 Key: FLINK-4031
 URL: https://issues.apache.org/jira/browse/FLINK-4031
 Project: Flink
  Issue Type: Bug
  Components: Build System
Reporter: Maximilian Michels
Assignee: Maximilian Michels
Priority: Minor
 Fix For: 1.1.0


We need to adjust the {{deploy_to_maven.sh}} script to enable deployment of the 
sources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4030) ScalaShellITCase

2016-06-08 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-4030:
-

 Summary: ScalaShellITCase
 Key: FLINK-4030
 URL: https://issues.apache.org/jira/browse/FLINK-4030
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.1.0
Reporter: Maximilian Michels
Assignee: Maximilian Michels
Priority: Minor
 Fix For: 1.1.0


The {{ScalaShellITCase}} fails regularly on Travis:

{noformat}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.19.1:test (integration-tests) 
on project flink-scala-shell_2.10: ExecutionException The forked VM terminated 
without properly saying goodbye. VM crash or System.exit called?
[ERROR] Command was /bin/sh -c cd 
/home/travis/build/apache/flink/flink-scala-shell/target && 
/usr/lib/jvm/java-8-oracle/jre/bin/java -Xms256m -Xmx800m -Dmvn.forkNumber=1 
-XX:-UseGCOverheadLimit -jar 
/home/travis/build/apache/flink/flink-scala-shell/target/surefire/surefirebooter5669599672364114854.jar
 
/home/travis/build/apache/flink/flink-scala-shell/target/surefire/surefire854521958557782961tmp
 
/home/travis/build/apache/flink/flink-scala-shell/target/surefire/surefire_7186137661441589930637tmp
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-4010) Scala Shell tests may fail because of a locked STDIN

2016-06-02 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-4010.
-
Resolution: Not A Problem

The Shell doesn't wait for input at STDIN in the tests.

> Scala Shell tests may fail because of a locked STDIN
> 
>
> Key: FLINK-4010
> URL: https://issues.apache.org/jira/browse/FLINK-4010
> Project: Flink
>  Issue Type: Test
>  Components: Scala Shell, Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
> Fix For: 1.1.0
>
>
> The Surefire plugin uses STDIN to communicate with forked processes. When the 
> Surefire plugin and the Scala Shell synchronize on the STDIN this may result 
> in a deadlock.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-4009) Scala Shell fails to find library for inclusion in test

2016-06-02 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-4009:
--
Issue Type: Sub-task  (was: Test)
Parent: FLINK-3454

> Scala Shell fails to find library for inclusion in test
> ---
>
> Key: FLINK-4009
> URL: https://issues.apache.org/jira/browse/FLINK-4009
> Project: Flink
>  Issue Type: Sub-task
>  Components: Scala Shell, Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> The Scala Shell test fails to find the flink-ml library jar in the target 
> folder when executing with Intellij. This is due to its working directory 
> being expected in "flink-scala-shell/target" when it is in fact 
> "flink-scala-shell". When executed with Maven, this works fine because the 
> Shade plugin changes the basedir from the project root to the /target 
> folder*. 
> As per [~till.rohrmann] and [~greghogan] suggestions we could simply add 
> flink-ml as a test dependency and look for the jar path in the classpath.
> \* Because we have the dependencyReducedPomLocation set to /target/.
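The classpath-based lookup suggested in the description could look roughly like this (hypothetical helper, not the actual test code): scan {{java.class.path}} for the flink-ml jar instead of guessing a working-directory-relative {{target/}} path.

```java
import java.io.File;
import java.util.Arrays;
import java.util.Optional;

public class ClasspathJarLocator {
    // Hypothetical helper: find the first classpath entry whose file name
    // starts with the given prefix (e.g. "flink-ml") and is a jar. This is
    // robust to the working directory, unlike a hard-coded target/ path.
    static Optional<String> findJar(String prefix) {
        String cp = System.getProperty("java.class.path");
        return Arrays.stream(cp.split(File.pathSeparator))
                .filter(entry -> entry.endsWith(".jar")
                        && new File(entry).getName().startsWith(prefix))
                .findFirst();
    }
}
```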



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-4009) Scala Shell fails to find library for inclusion in test

2016-06-02 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-4009:
--
Description: 
The Scala Shell test fails to find the flink-ml library jar in the target 
folder when executing with Intellij. This is due to its working directory being 
expected in "flink-scala-shell/target" when it is in fact "flink-scala-shell". 
When executed with Maven, this works fine because the Shade plugin changes the 
basedir from the project root to the /target folder*. 

As per [~till.rohrmann] and [~greghogan] suggestions we could simply add 
flink-ml as a test dependency and look for the jar path in the classpath.

\* Because we have the dependencyReducedPomLocation set to /target/.


  was:
The Scala Shell test fails to find the flink-ml library jar in the target 
folder when executing with Intellij. This is due to its working directory being 
expected in "flink-scala-shell/target" when it is in fact "flink-scala-shell". 
When executed with Maven, this works fine because the Shade plugin changes the 
basedir from the project root to the /target folder*. 

As per [~till.rohrmann] and [~greghogan] suggestions we could simply add 
flink-ml as a test dependency and look for the jar path in the classpath.

* Because we have the dependencyReducedPomLocation set to /target/.



> Scala Shell fails to find library for inclusion in test
> ---
>
> Key: FLINK-4009
> URL: https://issues.apache.org/jira/browse/FLINK-4009
> Project: Flink
>  Issue Type: Test
>  Components: Scala Shell, Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> The Scala Shell test fails to find the flink-ml library jar in the target 
> folder when executing with Intellij. This is due to its working directory 
> being expected in "flink-scala-shell/target" when it is in fact 
> "flink-scala-shell". When executed with Maven, this works fine because the 
> Shade plugin changes the basedir from the project root to the /target 
> folder*. 
> As per [~till.rohrmann] and [~greghogan] suggestions we could simply add 
> flink-ml as a test dependency and look for the jar path in the classpath.
> \* Because we have the dependencyReducedPomLocation set to /target/.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-4009) Scala Shell fails to find library for inclusion in test

2016-06-02 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-4009:
--
Description: 
The Scala Shell test fails to find the flink-ml library jar in the target 
folder when executing with IntelliJ. This is due to its working directory being 
expected in "flink-scala-shell/target" when it is in fact "flink-scala-shell". 
When executed with Maven, this works fine because the Shade plugin changes the 
basedir from the project root to the /target folder*. 

As per [~till.rohrmann]'s and [~greghogan]'s suggestions, we could simply add 
flink-ml as a test dependency and look up the jar path on the classpath.

* Because we have the dependencyReducedPomLocation set to /target/.


  was:The Scala Shell test fails to find the flink-ml library jar in the target 
folder. This is due to its working directory being expected in 
"flink-scala-shell/target" when it is in fact "flink-scala-shell". I'm a bit 
puzzled why that could have changed recently. The last incident I recall where 
we had to change paths was when we introduced shading of all artifacts to 
produce effective poms (via the force-shading module). I'm assuming the change 
of paths has to do with switching from Failsafe to Surefire in FLINK-3909.


> Scala Shell fails to find library for inclusion in test
> ---
>
> Key: FLINK-4009
> URL: https://issues.apache.org/jira/browse/FLINK-4009
> Project: Flink
>  Issue Type: Test
>  Components: Scala Shell, Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> The Scala Shell test fails to find the flink-ml library jar in the target 
> folder when executing with IntelliJ. This is due to its working directory 
> being expected in "flink-scala-shell/target" when it is in fact 
> "flink-scala-shell". When executed with Maven, this works fine because the 
> Shade plugin changes the basedir from the project root to the /target 
> folder*. 
> As per [~till.rohrmann]'s and [~greghogan]'s suggestions, we could simply add 
> flink-ml as a test dependency and look up the jar path on the classpath.
> * Because we have the dependencyReducedPomLocation set to /target/.





[jira] [Commented] (FLINK-4009) Scala Shell fails to find library for inclusion in test

2016-06-02 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15312296#comment-15312296
 ] 

Maximilian Michels commented on FLINK-4009:
---

Yes, this issue could be a sub task of FLINK-3454.

You added the flink-ml dependency to the pom, which ensures the library is on 
the test's classpath. Having the test then locate the jar on that classpath is 
what could be fixed in this issue.

> Scala Shell fails to find library for inclusion in test
> ---
>
> Key: FLINK-4009
> URL: https://issues.apache.org/jira/browse/FLINK-4009
> Project: Flink
>  Issue Type: Test
>  Components: Scala Shell, Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> The Scala Shell test fails to find the flink-ml library jar in the target 
> folder. This is due to its working directory being expected in 
> "flink-scala-shell/target" when it is in fact "flink-scala-shell". I'm a bit 
> puzzled why that could have changed recently. The last incident I recall 
> where we had to change paths was when we introduced shading of all artifacts 
> to produce effective poms (via the force-shading module). I'm assuming the 
> change of paths has to do with switching from Failsafe to Surefire in 
> FLINK-3909.





[jira] [Reopened] (FLINK-4009) Scala Shell fails to find library for inclusion in test

2016-06-02 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reopened FLINK-4009:
---

Reopening and converting to sub task of FLINK-3454

> Scala Shell fails to find library for inclusion in test
> ---
>
> Key: FLINK-4009
> URL: https://issues.apache.org/jira/browse/FLINK-4009
> Project: Flink
>  Issue Type: Test
>  Components: Scala Shell, Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> The Scala Shell test fails to find the flink-ml library jar in the target 
> folder. This is due to its working directory being expected in 
> "flink-scala-shell/target" when it is in fact "flink-scala-shell". I'm a bit 
> puzzled why that could have changed recently. The last incident I recall 
> where we had to change paths was when we introduced shading of all artifacts 
> to produce effective poms (via the force-shading module). I'm assuming the 
> change of paths has to do with switching from Failsafe to Surefire in 
> FLINK-3909.





[jira] [Created] (FLINK-4010) Scala Shell tests may fail because of a locked STDIN

2016-06-02 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-4010:
-

 Summary: Scala Shell tests may fail because of a locked STDIN
 Key: FLINK-4010
 URL: https://issues.apache.org/jira/browse/FLINK-4010
 Project: Flink
  Issue Type: Test
  Components: Scala Shell, Tests
Affects Versions: 1.1.0
Reporter: Maximilian Michels
Assignee: Maximilian Michels
 Fix For: 1.1.0


The Surefire plugin uses STDIN to communicate with forked processes. When the 
Surefire plugin and the Scala Shell both synchronize on STDIN, this may result 
in a deadlock.
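One way to keep the test out of Surefire's way is for the shell under test to read from an injected stream rather than System.in directly; a minimal sketch (class and method names are hypothetical, not the Flink Scala Shell API):

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;

// Hypothetical sketch: a shell-style read loop that takes its input stream as
// a parameter, so tests can feed scripted commands without touching System.in,
// which the Surefire fork communication may already hold.
public class ShellInput {

    public static String readCommand(InputStream in) {
        try {
            return new BufferedReader(new InputStreamReader(in)).readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InputStream scripted = new ByteArrayInputStream("val x = 1\n".getBytes());
        System.out.println(readCommand(scripted)); // prints: val x = 1
    }
}
```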





[jira] [Commented] (FLINK-4009) Scala Shell fails to find library for inclusion in test

2016-06-02 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15312256#comment-15312256
 ] 

Maximilian Michels commented on FLINK-4009:
---

By "using the classpath" I mean extracting the jar location from the classpath 
provided for the tests.

> Scala Shell fails to find library for inclusion in test
> ---
>
> Key: FLINK-4009
> URL: https://issues.apache.org/jira/browse/FLINK-4009
> Project: Flink
>  Issue Type: Test
>  Components: Scala Shell, Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> The Scala Shell test fails to find the flink-ml library jar in the target 
> folder. This is due to its working directory being expected in 
> "flink-scala-shell/target" when it is in fact "flink-scala-shell". I'm a bit 
> puzzled why that could have changed recently. The last incident I recall 
> where we had to change paths was when we introduced shading of all artifacts 
> to produce effective poms (via the force-shading module). I'm assuming the 
> change of paths has to do with switching from Failsafe to Surefire in 
> FLINK-3909.





[jira] [Closed] (FLINK-4009) Scala Shell fails to find library for inclusion in test

2016-06-02 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-4009.
-
Resolution: Duplicate

> Scala Shell fails to find library for inclusion in test
> ---
>
> Key: FLINK-4009
> URL: https://issues.apache.org/jira/browse/FLINK-4009
> Project: Flink
>  Issue Type: Test
>  Components: Scala Shell, Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> The Scala Shell test fails to find the flink-ml library jar in the target 
> folder. This is due to its working directory being expected in 
> "flink-scala-shell/target" when it is in fact "flink-scala-shell". I'm a bit 
> puzzled why that could have changed recently. The last incident I recall 
> where we had to change paths was when we introduced shading of all artifacts 
> to produce effective poms (via the force-shading module). I'm assuming the 
> change of paths has to do with switching from Failsafe to Surefire in 
> FLINK-3909.





[jira] [Commented] (FLINK-4009) Scala Shell fails to find library for inclusion in test

2016-06-02 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15312254#comment-15312254
 ] 

Maximilian Michels commented on FLINK-4009:
---

Yes, it is. I was running the test from IntelliJ, where the baseDir is set to 
the project directory, and the test failed there. However, in the builds the 
Maven Shade plugin changes the baseDir to /target and then the path works 
fine.

I created a patch to look in both variants of the basedir. However, I think 
Till's solution to use the classpath is much better.
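The two-variant lookup from that patch can be sketched as follows (a pure illustration; the class and method names are hypothetical):

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the "look in both variants of the basedir" patch:
// return both candidate jar locations and let the caller pick the first one
// that exists on disk.
public class BasedirVariants {

    public static List<File> candidatePaths(File basedir, String jarName) {
        return Arrays.asList(
            new File(basedir, jarName),                       // Maven build: basedir is already target/
            new File(new File(basedir, "target"), jarName));  // IDE run: basedir is the project root
    }
}
```

The classpath-based approach is still preferable because it removes the basedir guesswork entirely.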

> Scala Shell fails to find library for inclusion in test
> ---
>
> Key: FLINK-4009
> URL: https://issues.apache.org/jira/browse/FLINK-4009
> Project: Flink
>  Issue Type: Test
>  Components: Scala Shell, Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> The Scala Shell test fails to find the flink-ml library jar in the target 
> folder. This is due to its working directory being expected in 
> "flink-scala-shell/target" when it is in fact "flink-scala-shell". I'm a bit 
> puzzled why that could have changed recently. The last incident I recall 
> where we had to change paths was when we introduced shading of all artifacts 
> to produce effective poms (via the force-shading module). I'm assuming the 
> change of paths has to do with switching from Failsafe to Surefire in 
> FLINK-3909.





[jira] [Created] (FLINK-4009) Scala Shell fails to find library for inclusion in test

2016-06-02 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-4009:
-

 Summary: Scala Shell fails to find library for inclusion in test
 Key: FLINK-4009
 URL: https://issues.apache.org/jira/browse/FLINK-4009
 Project: Flink
  Issue Type: Test
  Components: Scala Shell, Tests
Affects Versions: 1.1.0
Reporter: Maximilian Michels
Assignee: Maximilian Michels
 Fix For: 1.1.0


The Scala Shell test fails to find the flink-ml library jar in the target 
folder. This is due to its working directory being expected in 
"flink-scala-shell/target" when it is in fact "flink-scala-shell". I'm a bit 
puzzled why that could have changed recently. The last incident I recall where 
we had to change paths was when we introduced shading of all artifacts to 
produce effective poms (via the force-shading module). I'm assuming the change 
of paths has to do with switching from Failsafe to Surefire in FLINK-3909.





[jira] [Commented] (FLINK-3607) Decrease default forkCount for tests

2016-05-31 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15307824#comment-15307824
 ] 

Maximilian Michels commented on FLINK-3607:
---

Correct.

> Decrease default forkCount for tests
> 
>
> Key: FLINK-3607
> URL: https://issues.apache.org/jira/browse/FLINK-3607
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> I'm seeing many test failures because of timeouts:
> https://builds.apache.org/job/flink-ci/1/testReport/
> The {{forkCount}} is set to the aggressive value {{1.5C}}. We should consider 
> reducing it at least to {{1C}} (1 fork per exposed physical/virtual core). 
> That could improve test stability.
> I did another test using {{1C}} on Jenkins and had only one failed test and a 
> decreased (!) run time: https://builds.apache.org/job/flink-ci/2/testReport/
> 1.5C: 1h 57m
> 1C: 1h 35m
> I'll run some more tests to verify this. 





[jira] [Closed] (FLINK-3607) Decrease default forkCount for tests

2016-05-31 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-3607.
-
   Resolution: Fixed
Fix Version/s: 1.1.0

Defaulting to the saner {{1C}} with bcf5f4648670770b6c5fcfd56fd24660fae5ad44
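For reference, the corresponding Surefire setting looks roughly like the following pom fragment (an illustrative sketch, not the exact Flink pom):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <!-- 1C = one forked test JVM per available core, down from 1.5C -->
    <forkCount>1C</forkCount>
    <reuseForks>false</reuseForks>
  </configuration>
</plugin>
```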

> Decrease default forkCount for tests
> 
>
> Key: FLINK-3607
> URL: https://issues.apache.org/jira/browse/FLINK-3607
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> I'm seeing many test failures because of timeouts:
> https://builds.apache.org/job/flink-ci/1/testReport/
> The {{forkCount}} is set to the aggressive value {{1.5C}}. We should consider 
> reducing it at least to {{1C}} (1 fork per exposed physical/virtual core). 
> That could improve test stability.
> I did another test using {{1C}} on Jenkins and had only one failed test and a 
> decreased (!) run time: https://builds.apache.org/job/flink-ci/2/testReport/
> 1.5C: 1h 57m
> 1C: 1h 35m
> I'll run some more tests to verify this. 





[jira] [Assigned] (FLINK-3937) Make flink cli list, savepoint, cancel and stop work on Flink-on-YARN clusters

2016-05-31 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reassigned FLINK-3937:
-

Assignee: Maximilian Michels

> Make flink cli list, savepoint, cancel and stop work on Flink-on-YARN clusters
> --
>
> Key: FLINK-3937
> URL: https://issues.apache.org/jira/browse/FLINK-3937
> Project: Flink
>  Issue Type: Improvement
>Reporter: Sebastian Klemke
>Assignee: Maximilian Michels
>Priority: Trivial
> Attachments: improve_flink_cli_yarn_integration.patch
>
>
> Currently, flink cli can't figure out JobManager RPC location for 
> Flink-on-YARN clusters. Therefore, list, savepoint, cancel and stop 
> subcommands are hard to invoke if you only know the YARN application ID. As 
> an improvement, I suggest adding a -yid option to the 
> mentioned subcommands that can be used together with -m yarn-cluster. Flink 
> cli would then retrieve JobManager RPC location from YARN ResourceManager.





[jira] [Commented] (FLINK-3994) Instable KNNITSuite

2016-05-31 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15307607#comment-15307607
 ] 

Maximilian Michels commented on FLINK-3994:
---

Thanks [~chiwanpark]!

> Instable KNNITSuite
> ---
>
> Key: FLINK-3994
> URL: https://issues.apache.org/jira/browse/FLINK-3994
> Project: Flink
>  Issue Type: Bug
>  Components: Machine Learning Library, Tests
>Affects Versions: 1.1.0
>Reporter: Chiwan Park
>Assignee: Chiwan Park
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> KNNITSuite fails in Travis-CI with the following error:
> {code}
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:806)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:752)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:752)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
>   at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
>   ...
>   Cause: java.io.IOException: Insufficient number of network buffers: 
> required 32, but only 4 available. The total number of network buffers is 
> currently set to 2048. You can increase this number by setting the 
> configuration key 'taskmanager.network.numberOfBuffers'.
>   at 
> org.apache.flink.runtime.io.network.buffer.NetworkBufferPool.createBufferPool(NetworkBufferPool.java:196)
>   at 
> org.apache.flink.runtime.io.network.NetworkEnvironment.registerTask(NetworkEnvironment.java:327)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:497)
>   at java.lang.Thread.run(Thread.java:745)
>   ...
> {code}
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/134064237/log.txt
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/134064236/log.txt
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/134064235/log.txt
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/134052961/log.txt
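Failures like the above are typically worked around by raising the buffer pool in the test configuration, e.g. in flink-conf.yaml (the value below is illustrative, not the fix applied here):

```yaml
# flink-conf.yaml: enlarge the network buffer pool so that highly parallel
# test jobs can each allocate their required buffers (illustrative value)
taskmanager.network.numberOfBuffers: 4096
```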





[jira] [Updated] (FLINK-3994) Instable KNNITSuite

2016-05-31 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-3994:
--
Fix Version/s: 1.1.0

> Instable KNNITSuite
> ---
>
> Key: FLINK-3994
> URL: https://issues.apache.org/jira/browse/FLINK-3994
> Project: Flink
>  Issue Type: Bug
>  Components: Machine Learning Library, Tests
>Affects Versions: 1.1.0
>Reporter: Chiwan Park
>Assignee: Chiwan Park
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> KNNITSuite fails in Travis-CI with the following error:
> {code}
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:806)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:752)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:752)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
>   at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
>   ...
>   Cause: java.io.IOException: Insufficient number of network buffers: 
> required 32, but only 4 available. The total number of network buffers is 
> currently set to 2048. You can increase this number by setting the 
> configuration key 'taskmanager.network.numberOfBuffers'.
>   at 
> org.apache.flink.runtime.io.network.buffer.NetworkBufferPool.createBufferPool(NetworkBufferPool.java:196)
>   at 
> org.apache.flink.runtime.io.network.NetworkEnvironment.registerTask(NetworkEnvironment.java:327)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:497)
>   at java.lang.Thread.run(Thread.java:745)
>   ...
> {code}
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/134064237/log.txt
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/134064236/log.txt
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/134064235/log.txt
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/134052961/log.txt





[jira] [Closed] (FLINK-3887) Improve dependency management for building docs

2016-05-31 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-3887.
-
   Resolution: Fixed
Fix Version/s: 0.8.0
   0.10.0
   1.0.0
   0.9
   0.7.1-incubating
   0.6.1-incubating

master: aec8b5a729498a8a4c9d4498b3f05671692c3ba6, 
8a6bdf4c18b5b5b8daff0aabe01ffd427773f815
release-1.0: 53dac6363d909ec33a9d0a02cee6136449d47061
release-0.10: 644c27504ad6fb89372e3b39123a4f896013e1ad
release-0.9: 1a1c134ac5e93faf9e04f9b125b330794f8372bd
release-0.8: 96849b5e3689a21690d45decce7b29972b248898
release-0.7: 2a228e206fe22f4cfd6a01769864063c45ad3b65
release-0.6: e48285d5099e829ff2f550663db12ac4f0c83a33

> Improve dependency management for building docs
> ---
>
> Key: FLINK-3887
> URL: https://issues.apache.org/jira/browse/FLINK-3887
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 0.6-incubating, 0.7.0-incubating, 0.8.0, 0.9.0, 0.10.0, 
> 1.0.0, 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
> Fix For: 0.6.1-incubating, 0.7.1-incubating, 0.9, 1.1.0, 1.0.0, 
> 0.10.0, 0.8.0
>
>
> Our nightly docs builds currently fail: 
> https://ci.apache.org/builders/flink-docs-master/
> I will file an issue with JIRA to fix it. The root cause is that we rely on a 
> couple of dependencies being installed. We could circumvent this by providing 
> a Ruby Gemfile that we can then use to load the necessary dependencies.
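Such a Gemfile could look like the following sketch (the gem list is an assumption about what a Jekyll-based docs build needs, not the actual file from the repo):

```ruby
# Gemfile for building the docs; run `bundle install && bundle exec jekyll build`.
# The exact gems and versions are an assumption, not taken from the Flink repo.
source 'https://rubygems.org'

gem 'jekyll'
gem 'kramdown'
```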





[jira] [Updated] (FLINK-3887) Improve dependency management for building docs

2016-05-31 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-3887:
--
Affects Version/s: 0.6-incubating
   0.7.0-incubating
   0.8.0
   0.9.0
   0.10.0
   1.0.0

> Improve dependency management for building docs
> ---
>
> Key: FLINK-3887
> URL: https://issues.apache.org/jira/browse/FLINK-3887
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 0.6-incubating, 0.7.0-incubating, 0.8.0, 0.9.0, 0.10.0, 
> 1.0.0, 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
> Fix For: 1.1.0
>
>
> Our nightly docs builds currently fail: 
> https://ci.apache.org/builders/flink-docs-master/
> I will file an issue with JIRA to fix it. The root cause is that we rely on a 
> couple of dependencies being installed. We could circumvent this by providing 
> a Ruby Gemfile that we can then use to load the necessary dependencies.





[jira] [Updated] (FLINK-3887) Improve dependency management for building docs

2016-05-31 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-3887:
--
Fix Version/s: 1.1.0

> Improve dependency management for building docs
> ---
>
> Key: FLINK-3887
> URL: https://issues.apache.org/jira/browse/FLINK-3887
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
> Fix For: 1.1.0
>
>
> Our nightly docs builds currently fail: 
> https://ci.apache.org/builders/flink-docs-master/
> I will file an issue with JIRA to fix it. The root cause is that we rely on a 
> couple of dependencies being installed. We could circumvent this by providing 
> a Ruby Gemfile that we can then use to load the necessary dependencies.





[jira] [Updated] (FLINK-3887) Improve dependency management for building docs

2016-05-31 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-3887:
--
Affects Version/s: 1.1.0

> Improve dependency management for building docs
> ---
>
> Key: FLINK-3887
> URL: https://issues.apache.org/jira/browse/FLINK-3887
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
> Fix For: 1.1.0
>
>
> Our nightly docs builds currently fail: 
> https://ci.apache.org/builders/flink-docs-master/
> I will file an issue with JIRA to fix it. The root cause is that we rely on a 
> couple of dependencies being installed. We could circumvent this by providing 
> a Ruby Gemfile that we can then use to load the necessary dependencies.





[jira] [Closed] (FLINK-3981) Don't log duplicate TaskManager registrations as exceptions

2016-05-30 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-3981.
-
Resolution: Fixed

Fixed with 93280069256a8afd3f3f5e263fcf2fa07ec14e0b

> Don't log duplicate TaskManager registrations as exceptions
> ---
>
> Key: FLINK-3981
> URL: https://issues.apache.org/jira/browse/FLINK-3981
> Project: Flink
>  Issue Type: Bug
>  Components: ResourceManager
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> Duplicate TaskManager registrations shouldn't be logged as exceptions in the 
> ResourceManager. Duplicate registrations can happen if the TaskManager sends 
> out registration messages too quickly, i.e. when the actual reply is not lost 
> but merely still in transit.
> The ResourceManager should simply acknowledge the duplicate registrations, 
> leaving it up to the JobManager to decide how to treat the duplicate 
> registrations (currently it will send an AlreadyRegistered to the 
> TaskManager).
> This change also affects our test stability because the Yarn tests check the 
> logs for exceptions.





[jira] [Closed] (FLINK-3982) Multiple ResourceManagers register at JobManager in standalone HA mode

2016-05-30 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-3982.
-
Resolution: Fixed

Fixed with 1212b6d3ff676877f84a51e9c849f9e484297947

> Multiple ResourceManagers register at JobManager in standalone HA mode
> --
>
> Key: FLINK-3982
> URL: https://issues.apache.org/jira/browse/FLINK-3982
> Project: Flink
>  Issue Type: Bug
>  Components: ResourceManager
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 1.1.0
>
>
> In HA mode, multiple ResourceManagers may register at the leading JobManager. 
> They register one after another at the JobManager. The last registering 
> ResourceManager stays registered with the JobManager. This only applies to 
> Standalone mode and doesn't affect functionality.
> To prevent duplicate registration for the standalone ResourceManager, the 
> easiest solution is to only start registration when the leading JobManager 
> runs in the same ActorSystem as the ResourceManager. Other ResourceManager 
> implementations may also run independently of the JobManager.





[jira] [Commented] (FLINK-3962) JMXReporter doesn't properly register/deregister metrics

2016-05-30 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15306417#comment-15306417
 ] 

Maximilian Michels commented on FLINK-3962:
---

In the Yarn tests, I'm still seeing these exceptions:
{noformat}
2016-05-30 11:03:47,498 ERROR org.apache.flink.metrics.reporter.JMXReporter 
- A metric with the name 
org.apache.flink.metrics:key0=mxm,key1=taskmanager,key2=25260e5955217acb4a2be8a0846f44f1,key3=WordCount_Example,key4=CHAIN_DataSource_(at_main(WordCount.java-70)_(org.apache.flink.api.java.io.TextInputFormat))_->_FlatMap_(FlatMap_at_main(WordCount.java-80))_->_Combine(SUM(1)-_at_main(WordCount.java-83),name=numBytesIn
 was already registered.
javax.management.InstanceAlreadyExistsException: 
org.apache.flink.metrics:key0=mxm,key1=taskmanager,key2=25260e5955217acb4a2be8a0846f44f1,key3=WordCount_Example,key4=CHAIN_DataSource_(at_main(WordCount.java-70)_(org.apache.flink.api.java.io.TextInputFormat))_->_FlatMap_(FlatMap_at_main(WordCount.java-80))_->_Combine(SUM(1)-_at_main(WordCount.java-83),name=numBytesIn
at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437)
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898)
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966)
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900)
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324)
at 
com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
at 
org.apache.flink.metrics.reporter.JMXReporter.notifyOfAddedMetric(JMXReporter.java:76)
at 
org.apache.flink.metrics.MetricRegistry.register(MetricRegistry.java:177)
at 
org.apache.flink.metrics.groups.AbstractMetricGroup.addMetric(AbstractMetricGroup.java:191)
at 
org.apache.flink.metrics.groups.AbstractMetricGroup.counter(AbstractMetricGroup.java:144)
at 
org.apache.flink.metrics.groups.IOMetricGroup.<init>(IOMetricGroup.java:40)
at 
org.apache.flink.metrics.groups.TaskMetricGroup.<init>(TaskMetricGroup.java:74)
at 
org.apache.flink.metrics.groups.JobMetricGroup.addTask(JobMetricGroup.java:74)
at 
	at org.apache.flink.metrics.groups.TaskManagerMetricGroup.addTaskForJob(TaskManagerMetricGroup.java:86)
	at org.apache.flink.runtime.taskmanager.TaskManager.submitTask(TaskManager.scala:1093)
	at org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$handleTaskMessage(TaskManager.scala:442)
	at org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$handleMessage$1.applyOrElse(TaskManager.scala:284)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
	at org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:36)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
	at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
	at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
	at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
	at org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
	at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
	at org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:124)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
	at akka.actor.ActorCell.invoke(ActorCell.scala:487)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
	at akka.dispatch.Mailbox.run(Mailbox.scala:221)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

{noformat}
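For reference, the JMX standard MBean contract requires the management interface to be public, which is exactly what the log above reports for the nested JmxGaugeMBean interface. A minimal, self-contained sketch (class names are hypothetical, not Flink code) reproducing the rule:

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.NotCompliantMBeanException;
import javax.management.ObjectName;

public class MBeanVisibilityDemo {
    // Public management interface: complies with the standard MBean rules.
    public interface PublicGaugeMBean { int getValue(); }
    public static class PublicGauge implements PublicGaugeMBean {
        public int getValue() { return 42; }
    }

    // Package-private management interface: registration is rejected with
    // NotCompliantMBeanException ("Interface is not public").
    interface HiddenGaugeMBean { int getValue(); }
    public static class HiddenGauge implements HiddenGaugeMBean {
        public int getValue() { return 42; }
    }

    /** Returns true if registration succeeded, false if the MBean is non-compliant. */
    public static boolean register(Object bean, String name) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            server.registerMBean(bean, new ObjectName("demo:name=" + name));
            return true;
        } catch (NotCompliantMBeanException e) {
            return false;
        } catch (JMException e) {
            throw new RuntimeException(e);
        }
    }
}
```

Making the MBean interface public (as the eventual fix did for JmxGaugeMBean) is enough to satisfy the introspector.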

> JMXReporter doesn't properly register/deregister metrics
> 
>
>

[jira] [Created] (FLINK-3982) Multiple ResourceManagers register at JobManager in standalone HA mode

2016-05-27 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3982:
-

 Summary: Multiple ResourceManagers register at JobManager in 
standalone HA mode
 Key: FLINK-3982
 URL: https://issues.apache.org/jira/browse/FLINK-3982
 Project: Flink
  Issue Type: Bug
  Components: ResourceManager
Affects Versions: 1.1.0
Reporter: Maximilian Michels
Assignee: Maximilian Michels
Priority: Minor
 Fix For: 1.1.0


In HA mode, multiple ResourceManagers may register at the leading JobManager, one 
after another; only the last ResourceManager to register stays registered with the 
JobManager. This only applies to standalone mode and doesn't affect functionality.

To prevent duplicate registrations of the standalone ResourceManager, the 
easiest solution is to start registration only when the leading JobManager runs 
in the same ActorSystem as the ResourceManager. Other ResourceManager 
implementations may also run independently of the JobManager.
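A rough illustration of the proposed guard, assuming leader addresses are exchanged as Akka URLs; the method name and the string check are illustrative only, not Flink's actual API. Local actor references print as akka://system/..., while remote ones print as akka.tcp://system@host:port/...:

```java
public class LeaderRegistrationSketch {
    // Hypothetical guard: a standalone ResourceManager only starts registration
    // when the leader's Akka URL points into its own ActorSystem, i.e. when the
    // URL is a local reference without a remote host part.
    public static boolean shouldRegister(String leaderAkkaUrl) {
        return leaderAkkaUrl.startsWith("akka://");
    }
}
```

With this check, ResourceManagers running in other JVMs see a remote (akka.tcp://) leader address and skip registration, so only one ResourceManager registers.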



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3981) Don't log duplicate TaskManager registrations as exceptions

2016-05-27 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3981:
-

 Summary: Don't log duplicate TaskManager registrations as 
exceptions
 Key: FLINK-3981
 URL: https://issues.apache.org/jira/browse/FLINK-3981
 Project: Flink
  Issue Type: Bug
  Components: ResourceManager
Affects Versions: 1.1.0
Reporter: Maximilian Michels
Assignee: Maximilian Michels
Priority: Minor
 Fix For: 1.1.0


Duplicate TaskManager registrations shouldn't be logged as exceptions in the 
ResourceManager. Duplicate registrations can happen when the TaskManager resends 
registration messages too quickly while the actual reply is not lost but still 
in transit.

The ResourceManager should simply acknowledge the duplicate registrations, 
leaving it up to the JobManager to decide how to treat the duplicate 
registrations (currently it will send an AlreadyRegistered to the TaskManager).

This change also affects our test stability because the Yarn tests check the 
logs for exceptions.
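The proposed behavior can be sketched as an idempotent registry; the names here are hypothetical, not the actual ResourceManager code:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class RegistrationSketch {
    public enum Ack { REGISTERED, ALREADY_REGISTERED }

    private final ConcurrentMap<String, Long> workers = new ConcurrentHashMap<>();

    // A duplicate registration (e.g. a retransmit whose first ack is still in
    // transit) is acknowledged instead of being logged as an exception; the
    // caller decides how to treat ALREADY_REGISTERED.
    public Ack register(String resourceId) {
        Long previous = workers.putIfAbsent(resourceId, System.currentTimeMillis());
        return previous == null ? Ack.REGISTERED : Ack.ALREADY_REGISTERED;
    }
}
```

The key point is that a retransmitted message changes no state and produces no error-level log line, which keeps log-scanning tests quiet.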





[jira] [Closed] (FLINK-3972) Subclasses of ResourceID may not be serializable

2016-05-27 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-3972.
-
Resolution: Fixed

Fixed via 4d41bd8fa787315d203bb34973f8608d84c5b6ac

> Subclasses of ResourceID may not be serializable
> ---
>
> Key: FLINK-3972
> URL: https://issues.apache.org/jira/browse/FLINK-3972
> Project: Flink
>  Issue Type: Bug
>  Components: ResourceManager
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 1.1.0
>
>
> WorkerTypes are currently subclasses of ResourceID. ResourceID has to be 
> Serializable, but its subclasses don't have to be. This may lead to problems 
> when these subclasses are used as ResourceIDs, i.e. serialization may fail 
> with NotSerializableExceptions. Currently, subclasses are never sent over the 
> wire, but they might be in the future.
> Instead of relying on subclasses of ResourceID for the WorkerTypes, we can 
> let them implement an interface to retrieve the ResourceID of a WorkerType.





[jira] [Updated] (FLINK-3960) Disable, fix and re-enable EventTimeWindowCheckpointingITCase

2016-05-27 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-3960:
--
Summary: Disable, fix and re-enable EventTimeWindowCheckpointingITCase  
(was: EventTimeWindowCheckpointingITCase fails with a segmentation fault)

> Disable, fix and re-enable EventTimeWindowCheckpointingITCase
> -
>
> Key: FLINK-3960
> URL: https://issues.apache.org/jira/browse/FLINK-3960
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming, Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Aljoscha Krettek
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> As a follow-up issue of FLINK-3909, our tests fail with the following. I 
> believe [~aljoscha] is working on a fix.
> {noformat}
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x7fae0c62a264, pid=72720, tid=140385528268544
> #
> # JRE version: Java(TM) SE Runtime Environment (7.0_76-b13) (build 
> 1.7.0_76-b13)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.76-b04 mixed mode 
> linux-amd64 compressed oops)
> # Problematic frame:
> # C  [librocksdbjni78704726610339516..so+0x13c264]  
> rocksdb_iterator_helper(rocksdb::DB*, rocksdb::ReadOptions, 
> rocksdb::ColumnFamilyHandle*)+0x4
> #
> # Failed to write core dump. Core dumps have been disabled. To enable core 
> dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /home/travis/build/mxm/flink/flink-tests/target/hs_err_pid72720.log
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.java.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #
> Aborted (core dumped)
> {noformat}
> I propose to disable the test case in the meantime because it is blocking our 
> test execution which we need for pull requests.





[jira] [Reopened] (FLINK-3960) EventTimeWindowCheckpointingITCase fails with a segmentation fault

2016-05-27 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reopened FLINK-3960:
---

Keeping this open as a reminder to revert 98a939552e12fc699ff39111bbe877e112460ceb.

> EventTimeWindowCheckpointingITCase fails with a segmentation fault
> --
>
> Key: FLINK-3960
> URL: https://issues.apache.org/jira/browse/FLINK-3960
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming, Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Aljoscha Krettek
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> As a follow-up issue of FLINK-3909, our tests fail with the following. I 
> believe [~aljoscha] is working on a fix.
> {noformat}
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x7fae0c62a264, pid=72720, tid=140385528268544
> #
> # JRE version: Java(TM) SE Runtime Environment (7.0_76-b13) (build 
> 1.7.0_76-b13)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.76-b04 mixed mode 
> linux-amd64 compressed oops)
> # Problematic frame:
> # C  [librocksdbjni78704726610339516..so+0x13c264]  
> rocksdb_iterator_helper(rocksdb::DB*, rocksdb::ReadOptions, 
> rocksdb::ColumnFamilyHandle*)+0x4
> #
> # Failed to write core dump. Core dumps have been disabled. To enable core 
> dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /home/travis/build/mxm/flink/flink-tests/target/hs_err_pid72720.log
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.java.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #
> Aborted (core dumped)
> {noformat}
> I propose to disable the test case in the meantime because it is blocking our 
> test execution which we need for pull requests.





[jira] [Created] (FLINK-3972) Subclasses of ResourceID may not be serializable

2016-05-25 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3972:
-

 Summary: Subclasses of ResourceID may not be serializable
 Key: FLINK-3972
 URL: https://issues.apache.org/jira/browse/FLINK-3972
 Project: Flink
  Issue Type: Bug
  Components: ResourceManager
Affects Versions: 1.1.0
Reporter: Maximilian Michels
Assignee: Maximilian Michels
Priority: Minor
 Fix For: 1.1.0


WorkerTypes are currently subclasses of ResourceID. ResourceID has to be 
Serializable, but its subclasses don't have to be. This may lead to problems when 
these subclasses are used as ResourceIDs, i.e. serialization may fail with 
NotSerializableExceptions. Currently, subclasses are never sent over the wire, 
but they might be in the future.

Instead of relying on subclasses of ResourceID for the WorkerTypes, we can let 
them implement an interface to retrieve the ResourceID of a WorkerType.
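The proposed refactoring could look roughly like this (simplified, illustrative types, not the actual Flink classes):

```java
import java.io.Serializable;

public class ResourceIdSketch {
    // Stand-in for Flink's ResourceID: always Serializable.
    public static final class ResourceID implements Serializable {
        private final String id;
        public ResourceID(String id) { this.id = id; }
        public String getId() { return id; }
    }

    // Worker types implement this interface instead of extending ResourceID,
    // so only the ResourceID itself ever needs to cross the wire.
    public interface ResourceIDRetrievable {
        ResourceID getResourceID();
    }

    // Example worker type: holds a ResourceID rather than being one.
    public static final class YarnWorker implements ResourceIDRetrievable {
        private final ResourceID resourceId;
        public YarnWorker(String containerId) { this.resourceId = new ResourceID(containerId); }
        public ResourceID getResourceID() { return resourceId; }
    }
}
```

This way a worker type that is not itself Serializable can never cause a NotSerializableException when its ID is shipped.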





[jira] [Commented] (FLINK-3962) JMXReporter doesn't properly register/deregister metrics

2016-05-25 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300102#comment-15300102
 ] 

Maximilian Michels commented on FLINK-3962:
---

Thanks for looking into this!

> JMXReporter doesn't properly register/deregister metrics
> 
>
> Key: FLINK-3962
> URL: https://issues.apache.org/jira/browse/FLINK-3962
> Project: Flink
>  Issue Type: Bug
>  Components: TaskManager
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Stephan Ewen
> Fix For: 1.1.0
>
>
> The following fails our Yarn tests because it checks for errors in the 
> jobmanager/taskmanager logs:
> {noformat}
> 2016-05-23 19:20:02,349 ERROR org.apache.flink.metrics.reporter.JMXReporter   
>   - A metric with the name 
> org.apache.flink.metrics:key0=testing-worker-linux-docker-05a6b382-3386-linux-4,key1=taskmanager,key2=9398ca9392af615e9d1896d0bd7ff52a,key3=Flink_Java_Job_at_Mon_May_23_19-20-00_UTC_2016,key4=,name=numBytesIn
>  was already registered.
> javax.management.InstanceAlreadyExistsException: 
> org.apache.flink.metrics:key0=testing-worker-linux-docker-05a6b382-3386-linux-4,key1=taskmanager,key2=9398ca9392af615e9d1896d0bd7ff52a,key3=Flink_Java_Job_at_Mon_May_23_19-20-00_UTC_2016,key4=,name=numBytesIn
>   at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324)
>   at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
>   at 
> org.apache.flink.metrics.reporter.JMXReporter.notifyOfAddedMetric(JMXReporter.java:76)
>   at 
> org.apache.flink.metrics.MetricRegistry.register(MetricRegistry.java:177)
>   at 
> org.apache.flink.metrics.groups.AbstractMetricGroup.addMetric(AbstractMetricGroup.java:191)
>   at 
> org.apache.flink.metrics.groups.AbstractMetricGroup.counter(AbstractMetricGroup.java:144)
>   at 
> org.apache.flink.metrics.groups.IOMetricGroup.(IOMetricGroup.java:40)
>   at 
> org.apache.flink.metrics.groups.TaskMetricGroup.(TaskMetricGroup.java:68)
>   at 
> org.apache.flink.metrics.groups.JobMetricGroup.addTask(JobMetricGroup.java:74)
>   at 
> org.apache.flink.metrics.groups.TaskManagerMetricGroup.addTaskForJob(TaskManagerMetricGroup.java:86)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.submitTask(TaskManager.scala:1092)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$handleTaskMessage(TaskManager.scala:441)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$handleMessage$1.applyOrElse(TaskManager.scala:283)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
>   at 
> org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:36)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
>   at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
>   at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
>   at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
>   at 
> org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
>   at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:124)
>   at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>   at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>   at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:221)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurren

[jira] [Closed] (FLINK-3963) AbstractReporter uses shaded dependency

2016-05-24 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-3963.
-
Resolution: Fixed

Fixed via 5b9872492394026f3e6ac31b9937141ebedb1481 by replacing Netty's 
ConcurrentHashMap with Java's.

> AbstractReporter uses shaded dependency
> ---
>
> Key: FLINK-3963
> URL: https://issues.apache.org/jira/browse/FLINK-3963
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Kostas Kloudas
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> This fails our Hadoop 1 build on Travis.





[jira] [Reopened] (FLINK-3963) AbstractReporter uses shaded dependency

2016-05-24 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reopened FLINK-3963:
---

The import is still not correct and causes the Hadoop 1 builds to fail.

> AbstractReporter uses shaded dependency
> ---
>
> Key: FLINK-3963
> URL: https://issues.apache.org/jira/browse/FLINK-3963
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Kostas Kloudas
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> This fails our Hadoop 1 build on Travis.





[jira] [Updated] (FLINK-2155) Add an additional checkstyle validation for illegal imports

2016-05-24 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-2155:
--
Fix Version/s: 1.1.0

> Add an additional checkstyle validation for illegal imports
> ---
>
> Key: FLINK-2155
> URL: https://issues.apache.org/jira/browse/FLINK-2155
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Affects Versions: 1.1.0
>Reporter: Lokesh Rajaram
>Assignee: Kostas Kloudas
> Fix For: 0.10.0, 1.1.0
>
>
> Add an additional checkstyle validation for illegal imports.
> To begin with, the following two package imports are marked as illegal:
>  1. org.apache.commons.lang3.Validate
>  2. org.apache.flink.shaded.*
> Implementation based on: 
> http://checkstyle.sourceforge.net/config_imports.html#IllegalImport





[jira] [Updated] (FLINK-2155) Add an additional checkstyle validation for illegal imports

2016-05-24 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-2155:
--
Affects Version/s: 1.1.0

> Add an additional checkstyle validation for illegal imports
> ---
>
> Key: FLINK-2155
> URL: https://issues.apache.org/jira/browse/FLINK-2155
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Affects Versions: 1.1.0
>Reporter: Lokesh Rajaram
>Assignee: Kostas Kloudas
> Fix For: 0.10.0, 1.1.0
>
>
> Add an additional checkstyle validation for illegal imports.
> To begin with, the following two package imports are marked as illegal:
>  1. org.apache.commons.lang3.Validate
>  2. org.apache.flink.shaded.*
> Implementation based on: 
> http://checkstyle.sourceforge.net/config_imports.html#IllegalImport





[jira] [Closed] (FLINK-3963) AbstractReporter uses shaded dependency

2016-05-24 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-3963.
-
Resolution: Fixed

Fixed with cbee4ef20431be9d934a25ba89a801b16b4f85dd

> AbstractReporter uses shaded dependency
> ---
>
> Key: FLINK-3963
> URL: https://issues.apache.org/jira/browse/FLINK-3963
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Kostas Kloudas
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> This fails our Hadoop 1 build on Travis.





[jira] [Reopened] (FLINK-2155) Add an additional checkstyle validation for illegal imports

2016-05-24 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reopened FLINK-2155:
---
  Assignee: Kostas Kloudas  (was: Lokesh Rajaram)

The Checkstyle rule for checking for shaded imports doesn't seem to work 
correctly (see FLINK-3963). [~kkl0u] Could you take a look?

> Add an additional checkstyle validation for illegal imports
> ---
>
> Key: FLINK-2155
> URL: https://issues.apache.org/jira/browse/FLINK-2155
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Lokesh Rajaram
>Assignee: Kostas Kloudas
> Fix For: 0.10.0
>
>
> Add an additional checkstyle validation for illegal imports.
> To begin with, the following two package imports are marked as illegal:
>  1. org.apache.commons.lang3.Validate
>  2. org.apache.flink.shaded.*
> Implementation based on: 
> http://checkstyle.sourceforge.net/config_imports.html#IllegalImport





[jira] [Commented] (FLINK-3963) AbstractReporter uses shaded dependency

2016-05-24 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15298035#comment-15298035
 ] 

Maximilian Michels commented on FLINK-3963:
---

Thanks for reporting [~kkl0u]. Do you want to open a pull request?

> AbstractReporter uses shaded dependency
> ---
>
> Key: FLINK-3963
> URL: https://issues.apache.org/jira/browse/FLINK-3963
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Kostas Kloudas
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> This fails our Hadoop 1 build on Travis.





[jira] [Created] (FLINK-3963) AbstractReporter uses shaded dependency

2016-05-24 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3963:
-

 Summary: AbstractReporter uses shaded dependency
 Key: FLINK-3963
 URL: https://issues.apache.org/jira/browse/FLINK-3963
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.1.0
Reporter: Maximilian Michels
Assignee: Kostas Kloudas
 Fix For: 1.1.0


This fails our Hadoop 1 build on Travis.





[jira] [Commented] (FLINK-3960) EventTimeWindowCheckpointingITCase fails with a segmentation fault

2016-05-24 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15298010#comment-15298010
 ] 

Maximilian Michels commented on FLINK-3960:
---

Merged temporary fix which should be reverted once we have fixed this issue: 
98a939552e12fc699ff39111bbe877e112460ceb

> EventTimeWindowCheckpointingITCase fails with a segmentation fault
> --
>
> Key: FLINK-3960
> URL: https://issues.apache.org/jira/browse/FLINK-3960
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming, Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Aljoscha Krettek
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> As a follow-up issue of FLINK-3909, our tests fail with the following. I 
> believe [~aljoscha] is working on a fix.
> {noformat}
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x7fae0c62a264, pid=72720, tid=140385528268544
> #
> # JRE version: Java(TM) SE Runtime Environment (7.0_76-b13) (build 
> 1.7.0_76-b13)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.76-b04 mixed mode 
> linux-amd64 compressed oops)
> # Problematic frame:
> # C  [librocksdbjni78704726610339516..so+0x13c264]  
> rocksdb_iterator_helper(rocksdb::DB*, rocksdb::ReadOptions, 
> rocksdb::ColumnFamilyHandle*)+0x4
> #
> # Failed to write core dump. Core dumps have been disabled. To enable core 
> dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /home/travis/build/mxm/flink/flink-tests/target/hs_err_pid72720.log
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.java.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #
> Aborted (core dumped)
> {noformat}
> I propose to disable the test case in the meantime because it is blocking our 
> test execution which we need for pull requests.





[jira] [Commented] (FLINK-3962) JMXReporter doesn't properly register/deregister metrics

2016-05-24 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15298008#comment-15298008
 ] 

Maximilian Michels commented on FLINK-3962:
---

[~Zentol] I assigned you because you had been working on the metrics reporting.
[~StephanEwen] said he probably knows why this occurs.

> JMXReporter doesn't properly register/deregister metrics
> 
>
> Key: FLINK-3962
> URL: https://issues.apache.org/jira/browse/FLINK-3962
> Project: Flink
>  Issue Type: Bug
>  Components: TaskManager
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Chesnay Schepler
> Fix For: 1.1.0
>
>
> The following fails our Yarn tests because it checks for errors in the 
> jobmanager/taskmanager logs:
> {noformat}
> 2016-05-23 19:20:02,349 ERROR org.apache.flink.metrics.reporter.JMXReporter   
>   - A metric with the name 
> org.apache.flink.metrics:key0=testing-worker-linux-docker-05a6b382-3386-linux-4,key1=taskmanager,key2=9398ca9392af615e9d1896d0bd7ff52a,key3=Flink_Java_Job_at_Mon_May_23_19-20-00_UTC_2016,key4=,name=numBytesIn
>  was already registered.
> javax.management.InstanceAlreadyExistsException: 
> org.apache.flink.metrics:key0=testing-worker-linux-docker-05a6b382-3386-linux-4,key1=taskmanager,key2=9398ca9392af615e9d1896d0bd7ff52a,key3=Flink_Java_Job_at_Mon_May_23_19-20-00_UTC_2016,key4=,name=numBytesIn
>   at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324)
>   at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
>   at 
> org.apache.flink.metrics.reporter.JMXReporter.notifyOfAddedMetric(JMXReporter.java:76)
>   at 
> org.apache.flink.metrics.MetricRegistry.register(MetricRegistry.java:177)
>   at 
> org.apache.flink.metrics.groups.AbstractMetricGroup.addMetric(AbstractMetricGroup.java:191)
>   at 
> org.apache.flink.metrics.groups.AbstractMetricGroup.counter(AbstractMetricGroup.java:144)
>   at 
> org.apache.flink.metrics.groups.IOMetricGroup.(IOMetricGroup.java:40)
>   at 
> org.apache.flink.metrics.groups.TaskMetricGroup.(TaskMetricGroup.java:68)
>   at 
> org.apache.flink.metrics.groups.JobMetricGroup.addTask(JobMetricGroup.java:74)
>   at 
> org.apache.flink.metrics.groups.TaskManagerMetricGroup.addTaskForJob(TaskManagerMetricGroup.java:86)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.submitTask(TaskManager.scala:1092)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$handleTaskMessage(TaskManager.scala:441)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$handleMessage$1.applyOrElse(TaskManager.scala:283)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
>   at 
> org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:36)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
>   at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
>   at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
>   at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
>   at 
> org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
>   at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:124)
>   at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>   at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>   at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:221)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)

[jira] [Created] (FLINK-3962) JMXReporter doesn't properly register/deregister metrics

2016-05-24 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3962:
-

 Summary: JMXReporter doesn't properly register/deregister metrics
 Key: FLINK-3962
 URL: https://issues.apache.org/jira/browse/FLINK-3962
 Project: Flink
  Issue Type: Bug
  Components: TaskManager
Affects Versions: 1.1.0
Reporter: Maximilian Michels
Assignee: Chesnay Schepler
 Fix For: 1.1.0


The following fails our Yarn tests because it checks for errors in the 
jobmanager/taskmanager logs:

{noformat}
2016-05-23 19:20:02,349 ERROR org.apache.flink.metrics.reporter.JMXReporter - A metric with the name org.apache.flink.metrics:key0=testing-worker-linux-docker-05a6b382-3386-linux-4,key1=taskmanager,key2=9398ca9392af615e9d1896d0bd7ff52a,key3=Flink_Java_Job_at_Mon_May_23_19-20-00_UTC_2016,key4=,name=numBytesIn was already registered.
javax.management.InstanceAlreadyExistsException: org.apache.flink.metrics:key0=testing-worker-linux-docker-05a6b382-3386-linux-4,key1=taskmanager,key2=9398ca9392af615e9d1896d0bd7ff52a,key3=Flink_Java_Job_at_Mon_May_23_19-20-00_UTC_2016,key4=,name=numBytesIn
	at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437)
	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898)
	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966)
	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900)
	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324)
	at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
	at org.apache.flink.metrics.reporter.JMXReporter.notifyOfAddedMetric(JMXReporter.java:76)
	at org.apache.flink.metrics.MetricRegistry.register(MetricRegistry.java:177)
	at org.apache.flink.metrics.groups.AbstractMetricGroup.addMetric(AbstractMetricGroup.java:191)
	at org.apache.flink.metrics.groups.AbstractMetricGroup.counter(AbstractMetricGroup.java:144)
	at org.apache.flink.metrics.groups.IOMetricGroup.<init>(IOMetricGroup.java:40)
	at org.apache.flink.metrics.groups.TaskMetricGroup.<init>(TaskMetricGroup.java:68)
	at org.apache.flink.metrics.groups.JobMetricGroup.addTask(JobMetricGroup.java:74)
	at org.apache.flink.metrics.groups.TaskManagerMetricGroup.addTaskForJob(TaskManagerMetricGroup.java:86)
	at org.apache.flink.runtime.taskmanager.TaskManager.submitTask(TaskManager.scala:1092)
	at org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$handleTaskMessage(TaskManager.scala:441)
	at org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$handleMessage$1.applyOrElse(TaskManager.scala:283)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
	at org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:36)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
	at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
	at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
	at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
	at org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
	at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
	at org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:124)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
	at akka.actor.ActorCell.invoke(ActorCell.scala:487)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
	at akka.dispatch.Mailbox.run(Mailbox.scala:221)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
{noformat}
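For context, the {{NotCompliantMBeanException}} in the trace above is the JMX introspector rejecting a Standard MBean whose management interface is not declared {{public}}. A minimal self-contained sketch of the same failure mode (class names are illustrative, not Flink's):

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.NotCompliantMBeanException;
import javax.management.ObjectName;

public class MBeanVisibilityDemo {
    // Package-private management interface: the JMX introspector finds it by the
    // <ClassName>MBean naming convention but rejects it, exactly like
    // JMXReporter$JmxGaugeMBean in the trace above.
    interface HiddenGaugeMBean {
        int getValue();
    }

    static class HiddenGauge implements HiddenGaugeMBean {
        public int getValue() { return 42; }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        try {
            server.registerMBean(new HiddenGauge(), new ObjectName("demo:name=gauge"));
            System.out.println("registered");
        } catch (NotCompliantMBeanException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Declaring {{HiddenGaugeMBean}} as {{public}} makes the registration succeed, which suggests the corresponding fix for {{JMXReporter$JmxGaugeMBean}}.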



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3960) EventTimeWindowCheckpointingITCase fails with a segmentation fault

2016-05-23 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-3960:
--
Summary: EventTimeWindowCheckpointingITCase fails with a segmentation fault 
 (was: EventTimeCheckpointingITCase fails with a segmentation fault)

> EventTimeWindowCheckpointingITCase fails with a segmentation fault
> --
>
> Key: FLINK-3960
> URL: https://issues.apache.org/jira/browse/FLINK-3960
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming, Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Aljoscha Krettek
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> As a follow-up issue of FLINK-3909, our tests fail with the following. I 
> believe [~aljoscha] is working on a fix.
> {noformat}
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x7fae0c62a264, pid=72720, tid=140385528268544
> #
> # JRE version: Java(TM) SE Runtime Environment (7.0_76-b13) (build 1.7.0_76-b13)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.76-b04 mixed mode linux-amd64 compressed oops)
> # Problematic frame:
> # C  [librocksdbjni78704726610339516..so+0x13c264]  rocksdb_iterator_helper(rocksdb::DB*, rocksdb::ReadOptions, rocksdb::ColumnFamilyHandle*)+0x4
> #
> # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /home/travis/build/mxm/flink/flink-tests/target/hs_err_pid72720.log
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.java.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #
> Aborted (core dumped)
> {noformat}
> I propose to disable the test case in the meantime because it is blocking our 
> test execution which we need for pull requests.





[jira] [Created] (FLINK-3960) EventTimeCheckpointingITCase fails with a segmentation fault

2016-05-23 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3960:
-

 Summary: EventTimeCheckpointingITCase fails with a segmentation 
fault
 Key: FLINK-3960
 URL: https://issues.apache.org/jira/browse/FLINK-3960
 Project: Flink
  Issue Type: Bug
  Components: Streaming, Tests
Affects Versions: 1.1.0
Reporter: Maximilian Michels
Assignee: Aljoscha Krettek
 Fix For: 1.1.0


As a follow-up issue of FLINK-3909, our tests fail with the following. I 
believe [~aljoscha] is working on a fix.

{noformat}
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x7fae0c62a264, pid=72720, tid=140385528268544
#
# JRE version: Java(TM) SE Runtime Environment (7.0_76-b13) (build 1.7.0_76-b13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.76-b04 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [librocksdbjni78704726610339516..so+0x13c264]  rocksdb_iterator_helper(rocksdb::DB*, rocksdb::ReadOptions, rocksdb::ColumnFamilyHandle*)+0x4
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /home/travis/build/mxm/flink/flink-tests/target/hs_err_pid72720.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
Aborted (core dumped)
{noformat}

I propose to disable the test case in the meantime because it is blocking our 
test execution which we need for pull requests.





[jira] [Closed] (FLINK-3927) TaskManager registration may fail if Yarn versions don't match

2016-05-23 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-3927.
-
Resolution: Fixed

Fixed via 017106e140f3c17ebaaa0507e1dcbbc445c8f0ac

> TaskManager registration may fail if Yarn versions don't match
> --
>
> Key: FLINK-3927
> URL: https://issues.apache.org/jira/browse/FLINK-3927
> Project: Flink
>  Issue Type: Bug
>  Components: ResourceManager
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
> Fix For: 1.1.0
>
>
> Flink's ResourceManager uses the Yarn container ids to identify connecting 
> task managers. Yarn's stringified container id may not be consistent across 
> different Hadoop versions, e.g. Hadoop 2.3.0 and Hadoop 2.7.1. The 
> ResourceManager gets it from the Yarn reports while the TaskManager infers it 
> from the Yarn environment variables. The ResourceManager may use the Hadoop 
> 2.3.0 client libraries while the cluster runs Hadoop 2.7.1. 
> The solution is to pass the ID through a custom environment variable which is 
> set by the ResourceManager before launching the TaskManager in the container. 
> That way we will always use the Hadoop client's id generation method.





[jira] [Closed] (FLINK-3953) Surefire plugin executes unit tests twice

2016-05-23 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-3953.
-
Resolution: Fixed

Fixed via 5fdf39b1fec032f5816cb188334c129ff9186415

> Surefire plugin executes unit tests twice
> -
>
> Key: FLINK-3953
> URL: https://issues.apache.org/jira/browse/FLINK-3953
> Project: Flink
>  Issue Type: Bug
>  Components: Build System, Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 1.1.0
>
>
> After FLINK-3909 the unit tests are executed twice. There are now two 
> executions defined for the Surefire plugin: {{unit-tests}} and 
> {{integration-tests}}. In addition, there is a default execution called 
> {{default-test}}. This leads to the unit tests being executed twice. Either 
> renaming unit-tests to default-test or skipping default-test would fix the 
> problem.





[jira] [Commented] (FLINK-3953) Surefire plugin executes unit tests twice

2016-05-23 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296185#comment-15296185
 ] 

Maximilian Michels commented on FLINK-3953:
---

Follow-up of FLINK-3909.

> Surefire plugin executes unit tests twice
> -
>
> Key: FLINK-3953
> URL: https://issues.apache.org/jira/browse/FLINK-3953
> Project: Flink
>  Issue Type: Bug
>  Components: Build System, Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 1.1.0
>
>
> After FLINK-3909 the unit tests are executed twice. There are now two 
> executions defined for the Surefire plugin: {{unit-tests}} and 
> {{integration-tests}}. In addition, there is a default execution called 
> {{default-test}}. This leads to the unit tests being executed twice. Either 
> renaming unit-tests to default-test or skipping default-test would fix the 
> problem.





[jira] [Created] (FLINK-3953) Surefire plugin executes unit tests twice

2016-05-23 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3953:
-

 Summary: Surefire plugin executes unit tests twice
 Key: FLINK-3953
 URL: https://issues.apache.org/jira/browse/FLINK-3953
 Project: Flink
  Issue Type: Bug
  Components: Build System, Tests
Affects Versions: 1.1.0
Reporter: Maximilian Michels
Assignee: Maximilian Michels
Priority: Minor
 Fix For: 1.1.0


After FLINK-3909 the unit tests are executed twice. There are now two 
executions defined for the Surefire plugin: {{unit-tests}} and 
{{integration-tests}}. In addition, there is a default execution called 
{{default-test}}. This leads to the unit tests being executed twice. Either 
renaming unit-tests to default-test or skipping default-test would fix the 
problem.
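The second option mentioned above can be sketched as a Surefire configuration like the following; the execution ids {{default-test}}, {{unit-tests}} and {{integration-tests}} come from the issue text, while the phase bindings and goals are assumptions:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <executions>
    <!-- Disable the implicit default execution so unit tests run only once -->
    <execution>
      <id>default-test</id>
      <phase>none</phase>
    </execution>
    <execution>
      <id>unit-tests</id>
      <phase>test</phase>
      <goals><goal>test</goal></goals>
    </execution>
    <execution>
      <id>integration-tests</id>
      <phase>integration-test</phase>
      <goals><goal>test</goal></goals>
    </execution>
  </executions>
</plugin>
```

Binding an execution to the phase {{none}} is the usual way to disable an inherited or implicit plugin execution.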





[jira] [Closed] (FLINK-3893) LeaderChangeStateCleanupTest times out

2016-05-20 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-3893.
-
   Resolution: Fixed
Fix Version/s: 1.1.0

9b8de6a6cb3b1cc9ebeb623371e7fef3a6cb763d

> LeaderChangeStateCleanupTest times out
> --
>
> Key: FLINK-3893
> URL: https://issues.apache.org/jira/browse/FLINK-3893
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> {{cluster.waitForTaskManagersToBeRegistered();}} needs to be replaced by 
> {{cluster.waitForTaskManagersToBeRegistered(timeout);}}
> {noformat}
> testStateCleanupAfterListenerNotification(org.apache.flink.runtime.leaderelection.LeaderChangeStateCleanupTest)  Time elapsed: 10.106 sec  <<< ERROR!
> java.util.concurrent.TimeoutException: Futures timed out after [1 milliseconds]
>   at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>   at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:153)
>   at scala.concurrent.Await$$anonfun$ready$1.apply(package.scala:86)
>   at scala.concurrent.Await$$anonfun$ready$1.apply(package.scala:86)
>   at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>   at scala.concurrent.Await$.ready(package.scala:86)
>   at org.apache.flink.runtime.minicluster.FlinkMiniCluster.waitForTaskManagersToBeRegistered(FlinkMiniCluster.scala:455)
>   at org.apache.flink.runtime.minicluster.FlinkMiniCluster.waitForTaskManagersToBeRegistered(FlinkMiniCluster.scala:439)
>   at org.apache.flink.runtime.leaderelection.LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification(LeaderChangeStateCleanupTest.java:181)
> {noformat}





[jira] [Closed] (FLINK-3938) Yarn tests don't run on the current master

2016-05-20 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-3938.
-
Resolution: Fixed

9a4fdd5fb3a7097035e74b0bf685553bbfdf7f43

> Yarn tests don't run on the current master
> --
>
> Key: FLINK-3938
> URL: https://issues.apache.org/jira/browse/FLINK-3938
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 1.1.0
>
>
> Independently of FLINK-3909, I just discovered that the Yarn tests don't run 
> on the current master (09b428b).
> {noformat}
> [INFO] 
> 
> [INFO] Building flink-yarn-tests 1.1-SNAPSHOT
> [INFO] 
> 
> [INFO] 
> [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ 
> flink-yarn-tests_2.10 ---
> [INFO] 
> [INFO] --- maven-checkstyle-plugin:2.16:check (validate) @ 
> flink-yarn-tests_2.10 ---
> [INFO] 
> [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-maven) @ 
> flink-yarn-tests_2.10 ---
> [INFO] 
> [INFO] --- build-helper-maven-plugin:1.7:add-source (add-source) @ 
> flink-yarn-tests_2.10 ---
> [INFO] Source directory: 
> /home/travis/build/apache/flink/flink-yarn-tests/src/main/scala added.
> [INFO] 
> [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ 
> flink-yarn-tests_2.10 ---
> [INFO] 
> [INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ 
> flink-yarn-tests_2.10 ---
> [INFO] Using 'UTF-8' encoding to copy filtered resources.
> [INFO] skip non existing resourceDirectory 
> /home/travis/build/apache/flink/flink-yarn-tests/src/main/resources
> [INFO] Copying 3 resources
> [INFO] 
> [INFO] --- scala-maven-plugin:3.1.4:compile (scala-compile-first) @ 
> flink-yarn-tests_2.10 ---
> [INFO] No sources to compile
> [INFO] 
> [INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ 
> flink-yarn-tests_2.10 ---
> [INFO] No sources to compile
> [INFO] 
> [INFO] --- build-helper-maven-plugin:1.7:add-test-source (add-test-source) @ 
> flink-yarn-tests_2.10 ---
> [INFO] Test Source directory: 
> /home/travis/build/apache/flink/flink-yarn-tests/src/test/scala added.
> [INFO] 
> [INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ 
> flink-yarn-tests_2.10 ---
> [INFO] Using 'UTF-8' encoding to copy filtered resources.
> [INFO] Copying 1 resource
> [INFO] Copying 3 resources
> [INFO] 
> [INFO] --- scala-maven-plugin:3.1.4:testCompile (scala-test-compile) @ 
> flink-yarn-tests_2.10 ---
> [INFO] /home/travis/build/apache/flink/flink-yarn-tests/src/test/scala:-1: 
> info: compiling
> [INFO] Compiling 2 source files to 
> /home/travis/build/apache/flink/flink-yarn-tests/target/test-classes at 
> 1463615798796
> [INFO] prepare-compile in 0 s
> [INFO] compile in 9 s
> [INFO] 
> [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
> flink-yarn-tests_2.10 ---
> [INFO] Nothing to compile - all classes are up to date
> [INFO] 
> [INFO] --- maven-surefire-plugin:2.18.1:test (default-test) @ 
> flink-yarn-tests_2.10 ---
> [INFO] Surefire report directory: 
> /home/travis/build/apache/flink/flink-yarn-tests/target/surefire-reports
> [WARNING] The system property log4j.configuration is configured twice! The 
> property appears in  and any of , 
>  or user property.
> ---
>  T E S T S
> ---
> Results :
> Tests run: 0, Failures: 0, Errors: 0, Skipped: 0
> {noformat}





[jira] [Closed] (FLINK-3892) ConnectionUtils may die with NullPointerException

2016-05-20 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-3892.
-
Resolution: Fixed

5dfb8a013cbf769bee641c04ebc666a8dc2f3e14

> ConnectionUtils may die with NullPointerException
> -
>
> Key: FLINK-3892
> URL: https://issues.apache.org/jira/browse/FLINK-3892
> Project: Flink
>  Issue Type: Bug
>  Components: YARN Client
>Affects Versions: 1.1.0, 1.0.3
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 1.1.0
>
>
> If an invalid hostname is specified or the hostname can't be resolved from 
> the current interface, {{ConnectionUtils.findAddressUsingStrategy}} may throw 
> a {{NullPointerException}}: when accessing the {{InetAddress}} of an 
> {{InetSocketAddress}}, null is returned if the host could not be resolved.
> The solution is to abort the attempt to find the local address if the host 
> couldn't be resolved.
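The JDK behavior described above is easy to demonstrate; a minimal sketch (the hostname is made up, and {{.invalid}} is a reserved TLD that never resolves):

```java
import java.net.InetSocketAddress;

public class UnresolvedHostDemo {
    public static void main(String[] args) {
        // ".invalid" is reserved (RFC 2606), so resolution always fails here.
        InetSocketAddress target = new InetSocketAddress("jobmanager.invalid", 6123);

        // getAddress() is null for unresolved addresses; dereferencing it
        // blindly is exactly the NullPointerException described above.
        if (target.isUnresolved()) {
            System.out.println("unresolved, getAddress() = " + target.getAddress());
        } else {
            System.out.println("resolved to " + target.getAddress().getHostAddress());
        }
    }
}
```

Guarding on {{isUnresolved()}} before touching {{getAddress()}} is the kind of early abort the fix describes.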





[jira] [Closed] (FLINK-3909) Maven Failsafe plugin may report SUCCESS on failed tests

2016-05-20 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-3909.
-
Resolution: Fixed

38698c0b101cbb48f8c10adf4060983ac07e2f4b

> Maven Failsafe plugin may report SUCCESS on failed tests
> 
>
> Key: FLINK-3909
> URL: https://issues.apache.org/jira/browse/FLINK-3909
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
> Fix For: 1.1.0
>
>
> The following build completed successfully on Travis but there are actually 
> test failures: https://travis-ci.org/apache/flink/jobs/129943398#L5402





[jira] [Created] (FLINK-3938) Yarn tests don't run on the current master

2016-05-19 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3938:
-

 Summary: Yarn tests don't run on the current master
 Key: FLINK-3938
 URL: https://issues.apache.org/jira/browse/FLINK-3938
 Project: Flink
  Issue Type: Bug
  Components: Build System
Affects Versions: 1.1.0
Reporter: Maximilian Michels
Assignee: Maximilian Michels
Priority: Critical
 Fix For: 1.1.0


Independently of FLINK-3909, I just discovered that the Yarn tests don't run on 
the current master (09b428b).

{noformat}
[INFO] 
[INFO] Building flink-yarn-tests 1.1-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ flink-yarn-tests_2.10 
---
[INFO] 
[INFO] --- maven-checkstyle-plugin:2.16:check (validate) @ 
flink-yarn-tests_2.10 ---
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-maven) @ 
flink-yarn-tests_2.10 ---
[INFO] 
[INFO] --- build-helper-maven-plugin:1.7:add-source (add-source) @ 
flink-yarn-tests_2.10 ---
[INFO] Source directory: 
/home/travis/build/apache/flink/flink-yarn-tests/src/main/scala added.
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ 
flink-yarn-tests_2.10 ---
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ 
flink-yarn-tests_2.10 ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 
/home/travis/build/apache/flink/flink-yarn-tests/src/main/resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- scala-maven-plugin:3.1.4:compile (scala-compile-first) @ 
flink-yarn-tests_2.10 ---
[INFO] No sources to compile
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ 
flink-yarn-tests_2.10 ---
[INFO] No sources to compile
[INFO] 
[INFO] --- build-helper-maven-plugin:1.7:add-test-source (add-test-source) @ 
flink-yarn-tests_2.10 ---
[INFO] Test Source directory: 
/home/travis/build/apache/flink/flink-yarn-tests/src/test/scala added.
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ 
flink-yarn-tests_2.10 ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] Copying 3 resources
[INFO] 
[INFO] --- scala-maven-plugin:3.1.4:testCompile (scala-test-compile) @ 
flink-yarn-tests_2.10 ---
[INFO] /home/travis/build/apache/flink/flink-yarn-tests/src/test/scala:-1: 
info: compiling
[INFO] Compiling 2 source files to 
/home/travis/build/apache/flink/flink-yarn-tests/target/test-classes at 
1463615798796
[INFO] prepare-compile in 0 s
[INFO] compile in 9 s
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
flink-yarn-tests_2.10 ---
[INFO] Nothing to compile - all classes are up to date
[INFO] 
[INFO] --- maven-surefire-plugin:2.18.1:test (default-test) @ 
flink-yarn-tests_2.10 ---
[INFO] Surefire report directory: 
/home/travis/build/apache/flink/flink-yarn-tests/target/surefire-reports
[WARNING] The system property log4j.configuration is configured twice! The 
property appears in  and any of , 
 or user property.

---
 T E S T S
---

Results :

Tests run: 0, Failures: 0, Errors: 0, Skipped: 0
{noformat}





[jira] [Commented] (FLINK-3863) Yarn Cluster shutdown may fail if leader changed recently

2016-05-19 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15290870#comment-15290870
 ] 

Maximilian Michels commented on FLINK-3863:
---

Fixing this with FLINK-3667.

> Yarn Cluster shutdown may fail if leader changed recently
> -
>
> Key: FLINK-3863
> URL: https://issues.apache.org/jira/browse/FLINK-3863
> Project: Flink
>  Issue Type: Bug
>  Components: YARN Client
>Affects Versions: 1.1.0, 1.0.2
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 1.1.0
>
>
> The {{ApplicationClient}} sets {{yarnJobManager}} to {{None}} until it has 
> connected to a newly elected JobManager. A shutdown message to the 
> application master is discarded while the ApplicationClient tries to 
> reconnect. The ApplicationClient should retry to shutdown the cluster when it 
> is connected to the new leader. It may also time out (which currently is 
> always the case).





[jira] [Closed] (FLINK-3543) Introduce ResourceManager component

2016-05-19 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-3543.
-
Resolution: Implemented

Closing this because we have the ResourceManager in place.

> Introduce ResourceManager component
> ---
>
> Key: FLINK-3543
> URL: https://issues.apache.org/jira/browse/FLINK-3543
> Project: Flink
>  Issue Type: New Feature
>  Components: JobManager, ResourceManager, TaskManager
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
> Fix For: 1.1.0
>
> Attachments: ResourceManagerSketch.pdf
>
>
> So far the JobManager has been the central instance which is responsible for 
> resource management and allocation.
> While thinking about how to integrate Mesos support in Flink, people from the 
> Flink community realized that it would be nice to delegate resource 
> allocation to a dedicated process. This process may run independently of the 
> JobManager which is a requirement for proper integration of cluster 
> allocation frameworks like Mesos.
> This has led to the idea of creating a new component called the 
> {{ResourceManager}}. Its task is to allocate and maintain resources requested 
> by the {{JobManager}}. The ResourceManager has a very abstract notion of 
> resources.
> Initially, we thought we could make the ResourceManager deal with resource 
> allocation and the registration/supervision of the TaskManagers. However, 
> this approach proved to add unnecessary complexity to the runtime. 
> Registration state of TaskManagers had to be kept in sync at both the 
> JobManager and the ResourceManager.
> That's why [~StephanEwen] and I changed the ResourceManager's role to simply 
> deal with the resource acquisition. The TaskManagers still register with the 
> JobManager which informs the ResourceManager about the successful 
> registration of a TaskManager. The ResourceManager may inform the JobManager 
> of failed TaskManagers. Due to the insight which the ResourceManager has in 
> the resource health, it may detect failed TaskManagers much earlier than the 
> heartbeat-based monitoring of the JobManager.
> At this stage, the ResourceManager is an optional component. That means the 
> JobManager doesn't depend on the ResourceManager as long as it has enough 
> resources to perform the computation. All bookkeeping is performed by the 
> JobManager. When the ResourceManager connects to the JobManager, it receives 
> the current resources, i.e. task manager instances, and allocates more 
> containers if necessary. The JobManager adjusts the number of containers 
> through the {{SetWorkerPoolSize}} method.
> In standalone mode, the ResourceManager may be deactivated or simply use the 
> StandaloneResourceManager which does practically nothing because we don't 
> need to allocate resources in standalone mode.
> In YARN mode, the ResourceManager takes care of communicating with the Yarn 
> resource manager. When containers fail, it informs the JobManager and tries 
> to allocate new containers. The ResourceManager runs as an actor within the 
> same actor system as the JobManager. It could, however, also run 
> independently. The independent mode would be the default behavior for Mesos 
> where the framework master is expected to just deal with resource allocation.
> The attached figures depict the message flow between ResourceManager, 
> JobManager, and TaskManager.
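As a rough sketch of the bookkeeping described above, the reaction to {{SetWorkerPoolSize}} reduces to allocating the difference between the requested pool size and the containers already held; all names except {{SetWorkerPoolSize}} are hypothetical:

```java
public class WorkerPoolSizing {
    /**
     * When the JobManager adjusts the pool via SetWorkerPoolSize, the
     * ResourceManager compares the requested size with the workers it already
     * tracks and allocates only the difference (never a negative count).
     */
    static int containersToAllocate(int requestedPoolSize, int registeredWorkers) {
        return Math.max(0, requestedPoolSize - registeredWorkers);
    }

    public static void main(String[] args) {
        System.out.println(containersToAllocate(10, 7)); // 3 new containers needed
        System.out.println(containersToAllocate(5, 8));  // pool already large enough
    }
}
```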





[jira] [Commented] (FLINK-3543) Introduce ResourceManager component

2016-05-19 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15290858#comment-15290858
 ] 

Maximilian Michels commented on FLINK-3543:
---

Hi! Scheduling is not part of the ResourceManager. The JobManager would have to 
deal with these issues.

> Introduce ResourceManager component
> ---
>
> Key: FLINK-3543
> URL: https://issues.apache.org/jira/browse/FLINK-3543
> Project: Flink
>  Issue Type: New Feature
>  Components: JobManager, ResourceManager, TaskManager
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
> Fix For: 1.1.0
>
> Attachments: ResourceManagerSketch.pdf
>
>
> So far the JobManager has been the central instance which is responsible for 
> resource management and allocation.
> While thinking about how to integrate Mesos support in Flink, people from the 
> Flink community realized that it would be nice to delegate resource 
> allocation to a dedicated process. This process may run independently of the 
> JobManager which is a requirement for proper integration of cluster 
> allocation frameworks like Mesos.
> This has led to the idea of creating a new component called the 
> {{ResourceManager}}. Its task is to allocate and maintain resources requested 
> by the {{JobManager}}. The ResourceManager has a very abstract notion of 
> resources.
> Initially, we thought we could make the ResourceManager deal with resource 
> allocation and the registration/supervision of the TaskManagers. However, 
> this approach proved to add unnecessary complexity to the runtime. 
> Registration state of TaskManagers had to be kept in sync at both the 
> JobManager and the ResourceManager.
> That's why [~StephanEwen] and I changed the ResourceManager's role to simply 
> deal with the resource acquisition. The TaskManagers still register with the 
> JobManager which informs the ResourceManager about the successful 
> registration of a TaskManager. The ResourceManager may inform the JobManager 
> of failed TaskManagers. Due to the insight which the ResourceManager has in 
> the resource health, it may detect failed TaskManagers much earlier than the 
> heartbeat-based monitoring of the JobManager.
> At this stage, the ResourceManager is an optional component. That means the 
> JobManager doesn't depend on the ResourceManager as long as it has enough 
> resources to perform the computation. All bookkeeping is performed by the 
> JobManager. When the ResourceManager connects to the JobManager, it receives 
> the current resources, i.e. task manager instances, and allocates more 
> containers if necessary. The JobManager adjusts the number of containers 
> through the {{SetWorkerPoolSize}} method.
> In standalone mode, the ResourceManager may be deactivated or simply use the 
> StandaloneResourceManager which does practically nothing because we don't 
> need to allocate resources in standalone mode.
> In YARN mode, the ResourceManager takes care of communicating with the Yarn 
> resource manager. When containers fail, it informs the JobManager and tries 
> to allocate new containers. The ResourceManager runs as an actor within the 
> same actor system as the JobManager. It could, however, also run 
> independently. The independent mode would be the default behavior for Mesos 
> where the framework master is expected to just deal with resource allocation.
> The attached figures depict the message flow between ResourceManager, 
> JobManager, and TaskManager.





[jira] [Created] (FLINK-3927) TaskManager registration may fail if Yarn versions don't match

2016-05-18 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3927:
-

 Summary: TaskManager registration may fail if Yarn versions don't 
match
 Key: FLINK-3927
 URL: https://issues.apache.org/jira/browse/FLINK-3927
 Project: Flink
  Issue Type: Bug
  Components: ResourceManager
Affects Versions: 1.1.0
Reporter: Maximilian Michels
Assignee: Maximilian Michels
 Fix For: 1.1.0


Flink's ResourceManager uses the Yarn container ids to identify connecting task 
managers. Yarn's stringified container id may not be consistent across 
different Hadoop versions, e.g. Hadoop 2.3.0 and Hadoop 2.7.1. The 
ResourceManager gets it from the Yarn reports while the TaskManager infers it 
from the Yarn environment variables. The ResourceManager may use the Hadoop 
2.3.0 client libraries while the cluster runs Hadoop 2.7.1. 

The solution is to pass the ID through a custom environment variable which is 
set by the ResourceManager before launching the TaskManager in the container. 
That way we will always use the Hadoop client's id generation method.
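The handoff described above can be sketched as follows; the environment variable name, the method names, and the example container id are hypothetical, and only the pattern (ResourceManager sets the variable at launch, TaskManager reads it back) comes from the issue text:

```java
import java.util.HashMap;
import java.util.Map;

public class ContainerIdHandoff {
    // Hypothetical variable name; the issue does not state the one Flink uses.
    static final String ENV_CONTAINER_ID = "FLINK_YARN_CONTAINER_ID";

    /** ResourceManager side: put its own stringified id into the container's launch env. */
    static Map<String, String> launchEnvironment(String containerId) {
        Map<String, String> env = new HashMap<>();
        env.put(ENV_CONTAINER_ID, containerId);
        return env;
    }

    /** TaskManager side: read the id back instead of re-deriving it from Yarn's variables. */
    static String containerId(Map<String, String> env) {
        String id = env.get(ENV_CONTAINER_ID);
        if (id == null) {
            throw new IllegalStateException(ENV_CONTAINER_ID + " was not set by the ResourceManager");
        }
        return id;
    }

    public static void main(String[] args) {
        Map<String, String> env = launchEnvironment("container_1463000000000_0001_01_000002");
        // Both sides now agree on the id regardless of the Hadoop client version.
        System.out.println(containerId(env));
    }
}
```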





[jira] [Updated] (FLINK-3661) Make Scala 2.11.x the default Scala version

2016-05-18 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-3661:
--
Assignee: (was: Robert Metzger)

> Make Scala 2.11.x the default Scala version
> ---
>
> Key: FLINK-3661
> URL: https://issues.apache.org/jira/browse/FLINK-3661
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System, Distributed Runtime
>Affects Versions: 1.0.0
>Reporter: Maximilian Michels
> Fix For: 1.1.0
>
>
> Flink's default Scala version is 2.10.4. I'd propose to update it to Scala 
> 2.11.8 while still keeping the option to use Scala 2.10.x.
> By now, Scala 2.11 is already the preferred version many people use and Scala 
> 2.12 is around the corner.





[jira] [Updated] (FLINK-2821) Change Akka configuration to allow accessing actors from different URLs

2016-05-18 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-2821:
--
Assignee: (was: Robert Metzger)

> Change Akka configuration to allow accessing actors from different URLs
> ---
>
> Key: FLINK-2821
> URL: https://issues.apache.org/jira/browse/FLINK-2821
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Runtime
>Reporter: Robert Metzger
> Fix For: 1.1.0
>
>
> Akka expects the actor's URL to be exactly matching.
> As pointed out here, cases where users were complaining about this: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Error-trying-to-access-JM-through-proxy-td3018.html
>   - Proxy routing (as described here, send to the proxy URL, receiver 
> recognizes only original URL)
>   - Using hostname / IP interchangeably does not work (we solved this by 
> always putting IP addresses into URLs, never hostnames)
>   - Binding to multiple interfaces (any local 0.0.0.0) does not work. Still 
> no solution to that (but seems not too much of a restriction)
> I am aware that this is not possible due to Akka, so it is actually not a 
> Flink bug. But I think we should track the resolution of the issue here 
> anyway because it's affecting our users' satisfaction.





[jira] [Updated] (FLINK-3662) Bump Akka version to 2.4.x for Scala 2.11.x

2016-05-18 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-3662:
--
Assignee: (was: Robert Metzger)

> Bump Akka version to 2.4.x for Scala 2.11.x
> ---
>
> Key: FLINK-3662
> URL: https://issues.apache.org/jira/browse/FLINK-3662
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System
>Affects Versions: 1.0.0
>Reporter: Maximilian Michels
> Fix For: 1.1.0
>
>
> In order to make use of newer Akka features (FLINK-2821), we need to update 
> Akka to version 2.4.x.
> To maintain backwards-compatibility, we have to adjust the 
> {{change_scala_version}} script to set the Akka version depending on the 
> Scala version:
> Scala 2.10.x => Akka 2.3.x
> Scala 2.11.x => Akka 2.4.x





[jira] [Commented] (FLINK-3909) Maven Failsafe plugin may report SUCCESS on failed tests

2016-05-17 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15286742#comment-15286742
 ] 

Maximilian Michels commented on FLINK-3909:
---

Failing tests are not supposed to fail the entire build in the 
{{integration-test}} phase:

{quote}
The Failsafe Plugin is used during the integration-test and verify phases of 
the build lifecycle to execute the integration tests of an application. The 
Failsafe Plugin will not fail the build during the integration-test phase, thus 
enabling the post-integration-test phase to execute.
{quote}
https://maven.apache.org/surefire/maven-failsafe-plugin/

The verify phase runs afterwards but doesn't receive the failed tests directly 
from the previous phase. It appears the test failures are read from the 
{{failsafe-reports}} files during the verify phase.

I verified on my machine that exceptions thrown in the integration-test phase 
make Maven fail the build on {{mvn verify}}:
{noformat}
[INFO] --- maven-failsafe-plugin:2.18.1:verify (default) @ flink-tests_2.10 ---
[INFO] Failsafe report directory: 
/Users/max/Dev/flink/flink-tests/target/failsafe-reports
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-failsafe-plugin:2.18.1:verify (default) on 
project flink-tests_2.10: There are test failures.
{noformat}

On Travis, all we get is 
{noformat}
[INFO] --- maven-failsafe-plugin:2.18.1:verify (default) @ flink-tests_2.10 ---
[INFO] Failsafe report directory: 
/home/travis/build/apache/flink/flink-tests/target/failsafe-reports
{noformat}

The {{executions}} are set up correctly in the parent pom. I'll try to 
reproduce the error on Travis with activated debug logging.

> Maven Failsafe plugin may report SUCCESS on failed tests
> 
>
> Key: FLINK-3909
> URL: https://issues.apache.org/jira/browse/FLINK-3909
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
> Fix For: 1.1.0
>
>
> The following build completed successfully on Travis but there are actually 
> test failures: https://travis-ci.org/apache/flink/jobs/129943398#L5402





[jira] [Assigned] (FLINK-3909) Maven Failsafe plugin may report SUCCESS on failed tests

2016-05-17 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reassigned FLINK-3909:
-

Assignee: Maximilian Michels

> Maven Failsafe plugin may report SUCCESS on failed tests
> 
>
> Key: FLINK-3909
> URL: https://issues.apache.org/jira/browse/FLINK-3909
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
> Fix For: 1.1.0
>
>
> The following build completed successfully on Travis but there are actually 
> test failures: https://travis-ci.org/apache/flink/jobs/129943398#L5402





[jira] [Resolved] (FLINK-3701) Cant call execute after first execution

2016-05-14 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved FLINK-3701.
---
Resolution: Fixed

Fixed in 48b469ad4f0da466b347071cea82913965645de3.

> Cant call execute after first execution
> ---
>
> Key: FLINK-3701
> URL: https://issues.apache.org/jira/browse/FLINK-3701
> Project: Flink
>  Issue Type: Bug
>  Components: Scala Shell
>Affects Versions: 1.1.0
>Reporter: Nikolaas Steenbergen
>Assignee: Maximilian Michels
> Fix For: 1.1.0
>
>
> in the scala shell, local mode, version 1.0 this works:
> {code}
> Scala-Flink> var b = env.fromElements("a","b")
> Scala-Flink> b.print
> Scala-Flink> var c = env.fromElements("c","d")
> Scala-Flink> c.print
> {code}
> in the current master (after c.print) this leads to :
> {code}
> java.lang.NullPointerException
>   at 
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1031)
>   at 
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:961)
>   at 
> org.apache.flink.api.java.ScalaShellRemoteEnvironment.execute(ScalaShellRemoteEnvironment.java:70)
>   at 
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:855)
>   at org.apache.flink.api.java.DataSet.collect(DataSet.java:410)
>   at org.apache.flink.api.java.DataSet.print(DataSet.java:1605)
>   at org.apache.flink.api.scala.DataSet.print(DataSet.scala:1615)
>   at .<init>(<console>:56)
>   at .<clinit>(<console>)
>   at .<init>(<console>:7)
>   at .<clinit>(<console>)
>   at $print(<console>)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:734)
>   at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:983)
>   at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:573)
>   at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:604)
>   at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:568)
>   at scala.tools.nsc.interpreter.ILoop.reallyInterpret$1(ILoop.scala:760)
>   at 
> scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:805)
>   at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:717)
>   at scala.tools.nsc.interpreter.ILoop.processLine$1(ILoop.scala:581)
>   at scala.tools.nsc.interpreter.ILoop.innerLoop$1(ILoop.scala:588)
>   at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:591)
>   at 
> scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:882)
>   at 
> scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:837)
>   at 
> scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:837)
>   at 
> scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
>   at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:837)
>   at 
> org.apache.flink.api.scala.FlinkShell$.startShell(FlinkShell.scala:199)
>   at org.apache.flink.api.scala.FlinkShell$.main(FlinkShell.scala:127)
>   at org.apache.flink.api.scala.FlinkShell.main(FlinkShell.scala)
> {code}





[jira] [Created] (FLINK-3909) Maven Failsafe plugin may report SUCCESS on failed tests

2016-05-13 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3909:
-

 Summary: Maven Failsafe plugin may report SUCCESS on failed tests
 Key: FLINK-3909
 URL: https://issues.apache.org/jira/browse/FLINK-3909
 Project: Flink
  Issue Type: Bug
  Components: Build System
Affects Versions: 1.1.0
Reporter: Maximilian Michels
 Fix For: 1.1.0


The following build completed successfully on Travis but there are actually 
test failures: https://travis-ci.org/apache/flink/jobs/129943398#L5402





[jira] [Commented] (FLINK-3856) Create types for java.sql.Date/Time/Timestamp

2016-05-13 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15282843#comment-15282843
 ] 

Maximilian Michels commented on FLINK-3856:
---

There is something wrong with our Maven configuration. You can see from the 
test output that the test failed, but Maven still reported SUCCESS: 
https://travis-ci.org/apache/flink/jobs/129943398#L5402

> Create types for java.sql.Date/Time/Timestamp
> -
>
> Key: FLINK-3856
> URL: https://issues.apache.org/jira/browse/FLINK-3856
> Project: Flink
>  Issue Type: New Feature
>  Components: Core
>Reporter: Timo Walther
>Assignee: Timo Walther
> Fix For: 1.1.0
>
>
> At the moment there is only the {{Date}} type which is not sufficient for 
> most use cases about time.
> The Table API would also benefit from having different types as output result.
> I would propose to add the three {{java.sql.}} types either as {{BasicTypes}} 
> or in an additional class {{TimeTypes}}.





[jira] [Commented] (FLINK-3856) Create types for java.sql.Date/Time/Timestamp

2016-05-13 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15282839#comment-15282839
 ] 

Maximilian Michels commented on FLINK-3856:
---

Additional fix with 96b353d98f6b6d441ebedf69ec12cfa333a1d7c9.

> Create types for java.sql.Date/Time/Timestamp
> -
>
> Key: FLINK-3856
> URL: https://issues.apache.org/jira/browse/FLINK-3856
> Project: Flink
>  Issue Type: New Feature
>  Components: Core
>Reporter: Timo Walther
>Assignee: Timo Walther
> Fix For: 1.1.0
>
>
> At the moment there is only the {{Date}} type which is not sufficient for 
> most use cases about time.
> The Table API would also benefit from having different types as output result.
> I would propose to add the three {{java.sql.}} types either as {{BasicTypes}} 
> or in an additional class {{TimeTypes}}.





[jira] [Commented] (FLINK-3856) Create types for java.sql.Date/Time/Timestamp

2016-05-13 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15282696#comment-15282696
 ] 

Maximilian Michels commented on FLINK-3856:
---

This breaks {{GroupReduceITCase.testGroupByGenericType}} because it asserts 
{{Assert.assertTrue(ec.getRegisteredKryoTypes().contains(java.sql.Date.class));}}.

Fixing this while merging FLINK-3701.

> Create types for java.sql.Date/Time/Timestamp
> -
>
> Key: FLINK-3856
> URL: https://issues.apache.org/jira/browse/FLINK-3856
> Project: Flink
>  Issue Type: New Feature
>  Components: Core
>Reporter: Timo Walther
>Assignee: Timo Walther
> Fix For: 1.1.0
>
>
> At the moment there is only the {{Date}} type which is not sufficient for 
> most use cases about time.
> The Table API would also benefit from having different types as output result.
> I would propose to add the three {{java.sql.}} types either as {{BasicTypes}} 
> or in an additional class {{TimeTypes}}.





[jira] [Commented] (FLINK-3887) Improve dependency management for building docs

2016-05-13 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15282554#comment-15282554
 ] 

Maximilian Michels commented on FLINK-3887:
---

The related Infra issue

> Improve dependency management for building docs
> ---
>
> Key: FLINK-3887
> URL: https://issues.apache.org/jira/browse/FLINK-3887
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>
> Our nightly docs builds currently fail: 
> https://ci.apache.org/builders/flink-docs-master/
> I will file an issue with JIRA to fix it. The root cause is that we rely on a 
> couple of dependencies to be installed. We could circumvent this by providing 
> a Ruby Gemfile that we can then use to load necessary dependencies. 





[jira] [Created] (FLINK-3904) GlobalConfiguration doesn't ensure config has been loaded

2016-05-13 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3904:
-

 Summary: GlobalConfiguration doesn't ensure config has been loaded
 Key: FLINK-3904
 URL: https://issues.apache.org/jira/browse/FLINK-3904
 Project: Flink
  Issue Type: Improvement
Reporter: Maximilian Michels
Assignee: Maximilian Michels
Priority: Minor
 Fix For: 1.1.0


By default, {{GlobalConfiguration}} returns an empty Configuration. Instead, a 
call to {{get()}} should fail if the config hasn't been loaded explicitly.
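A minimal sketch of the proposed fail-fast behavior (the class and method names only loosely mirror {{GlobalConfiguration}}; this is not Flink's actual code):

```java
import java.util.Properties;

public class GlobalConfigSketch {
    // Illustrative sketch: get() fails fast if the configuration has not been
    // loaded explicitly, instead of silently returning an empty configuration.
    private static Properties config;

    public static void loadConfiguration(Properties props) {
        config = props;
    }

    public static Properties get() {
        if (config == null) {
            throw new IllegalStateException(
                "Configuration has not been loaded; call loadConfiguration() first.");
        }
        return config;
    }

    public static void main(String[] args) {
        try {
            get(); // fails: nothing loaded yet
        } catch (IllegalStateException e) {
            System.out.println("get() before load: " + e.getMessage());
        }
        loadConfiguration(new Properties());
        System.out.println(get() != null); // prints true
    }
}
```

With this guard, a forgotten {{loadConfiguration()}} call surfaces immediately rather than as a mysteriously empty configuration later.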





[jira] [Resolved] (FLINK-3776) Flink Scala shell does not allow to set configuration for local execution

2016-05-13 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved FLINK-3776.
---
   Resolution: Fixed
 Assignee: Dongwon Kim
Fix Version/s: 1.1.0

Fixed with 099fdfa0c5789f509242f83e8f808d552e63ee8d

> Flink Scala shell does not allow to set configuration for local execution
> -
>
> Key: FLINK-3776
> URL: https://issues.apache.org/jira/browse/FLINK-3776
> Project: Flink
>  Issue Type: Improvement
>  Components: Scala Shell
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Assignee: Dongwon Kim
>Priority: Minor
> Fix For: 1.1.0
>
>
> Flink's Scala shell starts a {{LocalFlinkMiniCluster}} with an empty 
> configuration when the shell is started in local mode. In order to allow the 
> user to configure the mini cluster, e.g., number of slots, size of memory, it 
> would be good to forward a user specified configuration.





[jira] [Resolved] (FLINK-3733) registeredTypesWithKryoSerializers is not assigned in ExecutionConfig#deserializeUserCode()

2016-05-12 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved FLINK-3733.
---
Resolution: Duplicate
  Assignee: Maximilian Michels

> registeredTypesWithKryoSerializers is not assigned in 
> ExecutionConfig#deserializeUserCode()
> ---
>
> Key: FLINK-3733
> URL: https://issues.apache.org/jira/browse/FLINK-3733
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Maximilian Michels
>Priority: Minor
>
> {code}
> if (serializedRegisteredTypesWithKryoSerializers != null) {
>   registeredTypesWithKryoSerializers = 
> serializedRegisteredTypesWithKryoSerializers.deserializeValue(userCodeClassLoader);
> } else {
>   registeredTypesWithKryoSerializerClasses = new LinkedHashMap<>();
> }
> {code}
> When serializedRegisteredTypesWithKryoSerializers is null, 
> registeredTypesWithKryoSerializers is not assigned.
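A simplified, hypothetical sketch of the bug and its fix (maps of Strings stand in for the actual serializer types; only the field names mirror {{ExecutionConfig}}):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DeserializeFix {
    // In the reported bug, the null branch initializes a *different* field
    // (registeredTypesWithKryoSerializerClasses), so this field stays null.
    public static Map<String, String> registeredTypesWithKryoSerializers;

    public static void deserializeUserCode(Map<String, String> serialized) {
        if (serialized != null) {
            registeredTypesWithKryoSerializers = serialized;
        } else {
            // The fix: initialize the same field that the non-null branch
            // assigns, not registeredTypesWithKryoSerializerClasses.
            registeredTypesWithKryoSerializers = new LinkedHashMap<>();
        }
    }

    public static void main(String[] args) {
        deserializeUserCode(null);
        // With the fix, the field is an empty map rather than null.
        System.out.println(registeredTypesWithKryoSerializers != null); // prints true
    }
}
```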





[jira] [Commented] (FLINK-3733) registeredTypesWithKryoSerializers is not assigned in ExecutionConfig#deserializeUserCode()

2016-05-11 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280193#comment-15280193
 ] 

Maximilian Michels commented on FLINK-3733:
---

I think this might be a duplicate of FLINK-3701.

> registeredTypesWithKryoSerializers is not assigned in 
> ExecutionConfig#deserializeUserCode()
> ---
>
> Key: FLINK-3733
> URL: https://issues.apache.org/jira/browse/FLINK-3733
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>
> {code}
> if (serializedRegisteredTypesWithKryoSerializers != null) {
>   registeredTypesWithKryoSerializers = 
> serializedRegisteredTypesWithKryoSerializers.deserializeValue(userCodeClassLoader);
> } else {
>   registeredTypesWithKryoSerializerClasses = new LinkedHashMap<>();
> }
> {code}
> When serializedRegisteredTypesWithKryoSerializers is null, 
> registeredTypesWithKryoSerializers is not assigned.





[jira] [Commented] (FLINK-3893) LeaderChangeStateCleanupTest times out

2016-05-10 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278245#comment-15278245
 ] 

Maximilian Michels commented on FLINK-3893:
---

After fixing this, I get the following:

{noformat}
java.lang.IllegalStateException: The retrieval service has not been started 
properly.
at 
org.apache.flink.runtime.leaderelection.TestingLeaderRetrievalService.notifyListener(TestingLeaderRetrievalService.java:65)
at 
org.apache.flink.runtime.leaderelection.LeaderElectionRetrievalTestingCluster.notifyRetrievalListeners(LeaderElectionRetrievalTestingCluster.java:123)
at 
org.apache.flink.runtime.leaderelection.LeaderChangeStateCleanupTest.testStateCleanupAfterNewLeaderElectionAndListenerNotification(LeaderChangeStateCleanupTest.java:97)
{noformat}

The {{LeaderRetrievalService}} may not have been started yet (because the 
actors are started asynchronously). I propose to remove this check.

> LeaderChangeStateCleanupTest times out
> --
>
> Key: FLINK-3893
> URL: https://issues.apache.org/jira/browse/FLINK-3893
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
>  Labels: test-stability
>
> {{cluster.waitForTaskManagersToBeRegistered();}} needs to be replaced by 
> {{cluster.waitForTaskManagersToBeRegistered(timeout);}}
> {noformat}
> testStateCleanupAfterListenerNotification(org.apache.flink.runtime.leaderelection.LeaderChangeStateCleanupTest)
>   Time elapsed: 10.106 sec  <<< ERROR!
> java.util.concurrent.TimeoutException: Futures timed out after [1 
> milliseconds]
>   at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>   at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:153)
>   at scala.concurrent.Await$$anonfun$ready$1.apply(package.scala:86)
>   at scala.concurrent.Await$$anonfun$ready$1.apply(package.scala:86)
>   at 
> scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>   at scala.concurrent.Await$.ready(package.scala:86)
>   at 
> org.apache.flink.runtime.minicluster.FlinkMiniCluster.waitForTaskManagersToBeRegistered(FlinkMiniCluster.scala:455)
>   at 
> org.apache.flink.runtime.minicluster.FlinkMiniCluster.waitForTaskManagersToBeRegistered(FlinkMiniCluster.scala:439)
>   at 
> org.apache.flink.runtime.leaderelection.LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification(LeaderChangeStateCleanupTest.java:181)
> {noformat}





[jira] [Created] (FLINK-3893) LeaderChangeStateCleanupTest times out

2016-05-10 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3893:
-

 Summary: LeaderChangeStateCleanupTest times out
 Key: FLINK-3893
 URL: https://issues.apache.org/jira/browse/FLINK-3893
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.1.0
Reporter: Maximilian Michels
Assignee: Maximilian Michels
Priority: Minor


{{cluster.waitForTaskManagersToBeRegistered();}} needs to be replaced by 
{{cluster.waitForTaskManagersToBeRegistered(timeout);}}

{noformat}
testStateCleanupAfterListenerNotification(org.apache.flink.runtime.leaderelection.LeaderChangeStateCleanupTest)
  Time elapsed: 10.106 sec  <<< ERROR!
java.util.concurrent.TimeoutException: Futures timed out after [1 
milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:153)
at scala.concurrent.Await$$anonfun$ready$1.apply(package.scala:86)
at scala.concurrent.Await$$anonfun$ready$1.apply(package.scala:86)
at 
scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.ready(package.scala:86)
at 
org.apache.flink.runtime.minicluster.FlinkMiniCluster.waitForTaskManagersToBeRegistered(FlinkMiniCluster.scala:455)
at 
org.apache.flink.runtime.minicluster.FlinkMiniCluster.waitForTaskManagersToBeRegistered(FlinkMiniCluster.scala:439)
at 
org.apache.flink.runtime.leaderelection.LeaderChangeStateCleanupTest.testStateCleanupAfterListenerNotification(LeaderChangeStateCleanupTest.java:181)
{noformat}





[jira] [Created] (FLINK-3892) ConnectionUtils may die with NullPointerException

2016-05-10 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3892:
-

 Summary: ConnectionUtils may die with NullPointerException
 Key: FLINK-3892
 URL: https://issues.apache.org/jira/browse/FLINK-3892
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Affects Versions: 1.1.0, 1.0.3
Reporter: Maximilian Michels
Assignee: Maximilian Michels
Priority: Minor
 Fix For: 1.1.0


If an invalid hostname is specified or the hostname can't be resolved from the 
current interface, {{ConnectionUtils.findAddressUsingStrategy}} may throw a 
{{NullPointerException}}: when accessing the {{InetAddress}} of an 
{{InetSocketAddress}}, null is returned if the host could not be resolved.

The solution is to abort the attempt to find the local address if the host 
couldn't be resolved.
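A hypothetical helper illustrating the proposed check (not the actual {{ConnectionUtils}} code): {{InetSocketAddress.getAddress()}} returns null for unresolved addresses, so the attempt is aborted up front instead of dereferencing null later.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;

public class ResolveCheck {
    public static InetAddress resolveOrNull(String host, int port) {
        InetSocketAddress target = new InetSocketAddress(host, port);
        if (target.isUnresolved()) {
            // Abort this attempt instead of dereferencing a null InetAddress.
            return null;
        }
        return target.getAddress();
    }

    public static void main(String[] args) {
        System.out.println(resolveOrNull("localhost", 6123));
        // ".invalid" is reserved (RFC 2606) and never resolves.
        System.out.println(resolveOrNull("no-such-host.invalid", 6123));
    }
}
```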





[jira] [Created] (FLINK-3890) Deprecate streaming mode flag from Yarn CLI

2016-05-10 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3890:
-

 Summary: Deprecate streaming mode flag from Yarn CLI
 Key: FLINK-3890
 URL: https://issues.apache.org/jira/browse/FLINK-3890
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Maximilian Michels
Assignee: Maximilian Michels
Priority: Minor
 Fix For: 1.1.0


The {{-yst}} and {{-yarnstreaming}} parameters of the YARN command line have 
not been in use anymore since FLINK-3073 and should have been removed before 
the 1.0.0 release. I would suggest marking the parameters as deprecated in the 
code and no longer listing them in the help section. In one of the upcoming 
major releases, we can remove them completely (which would give users an error 
if they still used the flag).
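An illustrative sketch of the suggested deprecation path (not the actual Flink CLI code): the flags are still parsed but only trigger a warning, and they would simply be omitted from the generated help text.

```java
public class DeprecatedFlagSketch {
    // Returns a warning for deprecated flags, null for anything else.
    public static String checkFlag(String arg) {
        if (arg.equals("-yst") || arg.equals("-yarnstreaming")) {
            return "WARNING: " + arg + " is deprecated and has no effect;"
                + " it will be removed in a future major release.";
        }
        return null; // not a deprecated flag
    }

    public static void main(String[] args) {
        System.out.println(checkFlag("-yst"));
    }
}
```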





[jira] [Resolved] (FLINK-3880) Improve performance of Accumulator map

2016-05-10 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved FLINK-3880.
---
   Resolution: Fixed
Fix Version/s: 1.1.0

Fixed with 9fccd6e5d626455174c830ba32eff06e60173020.

> Improve performance of Accumulator map
> --
>
> Key: FLINK-3880
> URL: https://issues.apache.org/jira/browse/FLINK-3880
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Ken Krugler
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 1.1.0
>
>
> I was looking at improving DataSet performance - this is for a job created 
> using the Cascading-Flink planner for Cascading 3.1.
> While doing a quick "poor man's profiler" session with one of the TaskManager 
> processes, I noticed that many (most?) of the threads that were actually 
> running were in this state:
> {code:java}
> "DataSource (/working1/terms) (8/20)" daemon prio=10 tid=0x7f55673e0800 
> nid=0x666a runnable [0x7f556abcf000]
>java.lang.Thread.State: RUNNABLE
> at java.util.Collections$SynchronizedMap.get(Collections.java:2037)
> - locked <0x0006e73fe718> (a java.util.Collections$SynchronizedMap)
> at 
> org.apache.flink.api.common.functions.util.AbstractRuntimeUDFContext.getAccumulator(AbstractRuntimeUDFContext.java:162)
> at 
> org.apache.flink.api.common.functions.util.AbstractRuntimeUDFContext.getLongCounter(AbstractRuntimeUDFContext.java:113)
> at 
> com.dataartisans.flink.cascading.runtime.util.FlinkFlowProcess.getOrInitCounter(FlinkFlowProcess.java:245)
> at 
> com.dataartisans.flink.cascading.runtime.util.FlinkFlowProcess.increment(FlinkFlowProcess.java:128)
> at 
> com.dataartisans.flink.cascading.runtime.util.FlinkFlowProcess.increment(FlinkFlowProcess.java:122)
> at 
> cascading.tap.hadoop.util.MeasuredRecordReader.next(MeasuredRecordReader.java:65)
> at cascading.scheme.hadoop.SequenceFile.source(SequenceFile.java:97)
> at 
> cascading.tuple.TupleEntrySchemeIterator.getNext(TupleEntrySchemeIterator.java:166)
> at 
> cascading.tuple.TupleEntrySchemeIterator.hasNext(TupleEntrySchemeIterator.java:139)
> at 
> com.dataartisans.flink.cascading.runtime.source.TapSourceStage.readNextRecord(TapSourceStage.java:70)
> at 
> com.dataartisans.flink.cascading.runtime.source.TapInputFormat.reachedEnd(TapInputFormat.java:175)
> at 
> org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:173)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
> at java.lang.Thread.run(Thread.java:745)}}}
> {code}
> It looks like Cascading is asking Flink to increment a counter with each 
> Tuple read, and that in turn is often blocked on getting access to the 
> Accumulator object in a map. It looks like this is a SynchronizedMap, but 
> using a ConcurrentHashMap (for example) would reduce this contention.





[jira] [Created] (FLINK-3887) Improve dependency management for building docs

2016-05-09 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3887:
-

 Summary: Improve dependency management for building docs
 Key: FLINK-3887
 URL: https://issues.apache.org/jira/browse/FLINK-3887
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Reporter: Maximilian Michels
Assignee: Maximilian Michels


Our nightly docs builds currently fail: 
https://ci.apache.org/builders/flink-docs-master/

I will file an issue with JIRA to fix it. The root cause is that we rely on a 
couple of dependencies to be installed. We could circumvent this by providing a 
Ruby Gemfile that we can then use to load necessary dependencies. 





[jira] [Updated] (FLINK-3880) Improve performance of Accumulator map

2016-05-09 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-3880:
--
Summary: Improve performance of Accumulator map  (was: Use 
ConcurrentHashMap for Accumulators)

> Improve performance of Accumulator map
> --
>
> Key: FLINK-3880
> URL: https://issues.apache.org/jira/browse/FLINK-3880
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Ken Krugler
>Assignee: Maximilian Michels
>Priority: Minor
>
> I was looking at improving DataSet performance - this is for a job created 
> using the Cascading-Flink planner for Cascading 3.1.
> While doing a quick "poor man's profiler" session with one of the TaskManager 
> processes, I noticed that many (most?) of the threads that were actually 
> running were in this state:
> {code:java}
> "DataSource (/working1/terms) (8/20)" daemon prio=10 tid=0x7f55673e0800 
> nid=0x666a runnable [0x7f556abcf000]
>java.lang.Thread.State: RUNNABLE
> at java.util.Collections$SynchronizedMap.get(Collections.java:2037)
> - locked <0x0006e73fe718> (a java.util.Collections$SynchronizedMap)
> at 
> org.apache.flink.api.common.functions.util.AbstractRuntimeUDFContext.getAccumulator(AbstractRuntimeUDFContext.java:162)
> at 
> org.apache.flink.api.common.functions.util.AbstractRuntimeUDFContext.getLongCounter(AbstractRuntimeUDFContext.java:113)
> at 
> com.dataartisans.flink.cascading.runtime.util.FlinkFlowProcess.getOrInitCounter(FlinkFlowProcess.java:245)
> at 
> com.dataartisans.flink.cascading.runtime.util.FlinkFlowProcess.increment(FlinkFlowProcess.java:128)
> at 
> com.dataartisans.flink.cascading.runtime.util.FlinkFlowProcess.increment(FlinkFlowProcess.java:122)
> at 
> cascading.tap.hadoop.util.MeasuredRecordReader.next(MeasuredRecordReader.java:65)
> at cascading.scheme.hadoop.SequenceFile.source(SequenceFile.java:97)
> at 
> cascading.tuple.TupleEntrySchemeIterator.getNext(TupleEntrySchemeIterator.java:166)
> at 
> cascading.tuple.TupleEntrySchemeIterator.hasNext(TupleEntrySchemeIterator.java:139)
> at 
> com.dataartisans.flink.cascading.runtime.source.TapSourceStage.readNextRecord(TapSourceStage.java:70)
> at 
> com.dataartisans.flink.cascading.runtime.source.TapInputFormat.reachedEnd(TapInputFormat.java:175)
> at 
> org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:173)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
> at java.lang.Thread.run(Thread.java:745)}}}
> {code}
> It looks like Cascading is asking Flink to increment a counter with each 
> Tuple read, and that in turn is often blocked on getting access to the 
> Accumulator object in a map. It looks like this is a SynchronizedMap, but 
> using a ConcurrentHashMap (for example) would reduce this contention.





[jira] [Commented] (FLINK-3880) Use ConcurrentHashMap for Accumulators

2016-05-06 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274298#comment-15274298
 ] 

Maximilian Michels commented on FLINK-3880:
---

You're right, the synchronized map is a bottleneck. Actually, the 
synchronization is not even necessary: in a regular Flink job, the map can 
only be accessed by one task at a time. Only if the user spawned additional 
threads could it be modified concurrently, and in that case the user would 
have to take care of the synchronization (and would otherwise get a 
ConcurrentModificationException). So we can simply make it a normal map.
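A sketch of this reasoning (names are illustrative, not Flink's actual runtime classes): since only the single task thread touches the accumulator map, a plain {{HashMap}} avoids the per-call lock of {{Collections.synchronizedMap}}.

```java
import java.util.HashMap;
import java.util.Map;

public class AccumulatorMapSketch {
    // Plain HashMap: no lock acquisition on each counter access.
    private final Map<String, Long> accumulators = new HashMap<>();

    public long increment(String name) {
        // merge() creates the counter on first use and adds 1 afterwards.
        return accumulators.merge(name, 1L, Long::sum);
    }

    public static void main(String[] args) {
        AccumulatorMapSketch ctx = new AccumulatorMapSketch();
        for (int i = 0; i < 3; i++) {
            ctx.increment("tuples-read");
        }
        System.out.println(ctx.increment("tuples-read")); // prints 4
    }
}
```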

> Use ConcurrentHashMap for Accumulators
> --
>
> Key: FLINK-3880
> URL: https://issues.apache.org/jira/browse/FLINK-3880
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Ken Krugler
>Priority: Minor
>
> I was looking at improving DataSet performance - this is for a job created 
> using the Cascading-Flink planner for Cascading 3.1.
> While doing a quick "poor man's profiler" session with one of the TaskManager 
> processes, I noticed that many (most?) of the threads that were actually 
> running were in this state:
> {code:java}
> "DataSource (/working1/terms) (8/20)" daemon prio=10 tid=0x7f55673e0800 
> nid=0x666a runnable [0x7f556abcf000]
>java.lang.Thread.State: RUNNABLE
> at java.util.Collections$SynchronizedMap.get(Collections.java:2037)
> - locked <0x0006e73fe718> (a java.util.Collections$SynchronizedMap)
> at 
> org.apache.flink.api.common.functions.util.AbstractRuntimeUDFContext.getAccumulator(AbstractRuntimeUDFContext.java:162)
> at 
> org.apache.flink.api.common.functions.util.AbstractRuntimeUDFContext.getLongCounter(AbstractRuntimeUDFContext.java:113)
> at 
> com.dataartisans.flink.cascading.runtime.util.FlinkFlowProcess.getOrInitCounter(FlinkFlowProcess.java:245)
> at 
> com.dataartisans.flink.cascading.runtime.util.FlinkFlowProcess.increment(FlinkFlowProcess.java:128)
> at 
> com.dataartisans.flink.cascading.runtime.util.FlinkFlowProcess.increment(FlinkFlowProcess.java:122)
> at 
> cascading.tap.hadoop.util.MeasuredRecordReader.next(MeasuredRecordReader.java:65)
> at cascading.scheme.hadoop.SequenceFile.source(SequenceFile.java:97)
> at 
> cascading.tuple.TupleEntrySchemeIterator.getNext(TupleEntrySchemeIterator.java:166)
> at 
> cascading.tuple.TupleEntrySchemeIterator.hasNext(TupleEntrySchemeIterator.java:139)
> at 
> com.dataartisans.flink.cascading.runtime.source.TapSourceStage.readNextRecord(TapSourceStage.java:70)
> at 
> com.dataartisans.flink.cascading.runtime.source.TapInputFormat.reachedEnd(TapInputFormat.java:175)
> at 
> org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:173)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> It looks like Cascading is asking Flink to increment a counter with each 
> Tuple read, and that in turn is often blocked on getting access to the 
> Accumulator object in a map. It looks like this is a SynchronizedMap, but 
> using a ConcurrentHashMap (for example) would reduce this contention.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-3880) Use ConcurrentHashMap for Accumulators

2016-05-06 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reassigned FLINK-3880:
-

Assignee: Maximilian Michels

> Use ConcurrentHashMap for Accumulators
> --
>
> Key: FLINK-3880
> URL: https://issues.apache.org/jira/browse/FLINK-3880
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Ken Krugler
>Assignee: Maximilian Michels
>Priority: Minor
>
> I was looking at improving DataSet performance - this is for a job created 
> using the Cascading-Flink planner for Cascading 3.1.
> While doing a quick "poor man's profiler" session with one of the TaskManager 
> processes, I noticed that many (most?) of the threads that were actually 
> running were in this state:
> {code:java}
> "DataSource (/working1/terms) (8/20)" daemon prio=10 tid=0x7f55673e0800 
> nid=0x666a runnable [0x7f556abcf000]
>java.lang.Thread.State: RUNNABLE
> at java.util.Collections$SynchronizedMap.get(Collections.java:2037)
> - locked <0x0006e73fe718> (a java.util.Collections$SynchronizedMap)
> at 
> org.apache.flink.api.common.functions.util.AbstractRuntimeUDFContext.getAccumulator(AbstractRuntimeUDFContext.java:162)
> at 
> org.apache.flink.api.common.functions.util.AbstractRuntimeUDFContext.getLongCounter(AbstractRuntimeUDFContext.java:113)
> at 
> com.dataartisans.flink.cascading.runtime.util.FlinkFlowProcess.getOrInitCounter(FlinkFlowProcess.java:245)
> at 
> com.dataartisans.flink.cascading.runtime.util.FlinkFlowProcess.increment(FlinkFlowProcess.java:128)
> at 
> com.dataartisans.flink.cascading.runtime.util.FlinkFlowProcess.increment(FlinkFlowProcess.java:122)
> at 
> cascading.tap.hadoop.util.MeasuredRecordReader.next(MeasuredRecordReader.java:65)
> at cascading.scheme.hadoop.SequenceFile.source(SequenceFile.java:97)
> at 
> cascading.tuple.TupleEntrySchemeIterator.getNext(TupleEntrySchemeIterator.java:166)
> at 
> cascading.tuple.TupleEntrySchemeIterator.hasNext(TupleEntrySchemeIterator.java:139)
> at 
> com.dataartisans.flink.cascading.runtime.source.TapSourceStage.readNextRecord(TapSourceStage.java:70)
> at 
> com.dataartisans.flink.cascading.runtime.source.TapInputFormat.reachedEnd(TapInputFormat.java:175)
> at 
> org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:173)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> It looks like Cascading is asking Flink to increment a counter with each 
> Tuple read, and that in turn is often blocked on getting access to the 
> Accumulator object in a map. It looks like this is a SynchronizedMap, but 
> using a ConcurrentHashMap (for example) would reduce this contention.





[jira] [Updated] (FLINK-3876) Improve documentation of Scala Shell

2016-05-04 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-3876:
--
Fix Version/s: (was: 1.0.3)

> Improve documentation of Scala Shell
> 
>
> Key: FLINK-3876
> URL: https://issues.apache.org/jira/browse/FLINK-3876
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Scala Shell
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
> Fix For: 1.1.0
>
>
> The Scala shell documentation could use some clarification and a bit of 
> restructuring to make it more appealing and informative to the user.





[jira] [Closed] (FLINK-3876) Improve documentation of Scala Shell

2016-05-04 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-3876.
-
Resolution: Fixed

Fixed in 8ec47f17be0b20e5204d309f72b0bec9b234a7fb

> Improve documentation of Scala Shell
> 
>
> Key: FLINK-3876
> URL: https://issues.apache.org/jira/browse/FLINK-3876
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Scala Shell
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
> Fix For: 1.1.0
>
>
> The Scala shell documentation could use some clarification and a bit of 
> restructuring to make it more appealing and informative to the user.





[jira] [Updated] (FLINK-3876) Improve documentation of Scala Shell

2016-05-04 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-3876:
--
Affects Version/s: (was: 1.0.3)

> Improve documentation of Scala Shell
> 
>
> Key: FLINK-3876
> URL: https://issues.apache.org/jira/browse/FLINK-3876
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Scala Shell
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
> Fix For: 1.1.0
>
>
> The Scala shell documentation could use some clarification and a bit of 
> restructuring to make it more appealing and informative to the user.





[jira] [Created] (FLINK-3876) Improve documentation of Scala Shell

2016-05-04 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3876:
-

 Summary: Improve documentation of Scala Shell
 Key: FLINK-3876
 URL: https://issues.apache.org/jira/browse/FLINK-3876
 Project: Flink
  Issue Type: Improvement
  Components: Documentation, Scala Shell
Affects Versions: 1.1.0, 1.0.3
Reporter: Maximilian Michels
Assignee: Maximilian Michels
 Fix For: 1.1.0, 1.0.3


The Scala shell documentation could use some clarification and a bit of 
restructuring to make it more appealing and informative to the user.





[jira] [Commented] (FLINK-2821) Change Akka configuration to allow accessing actors from different URLs

2016-05-04 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270497#comment-15270497
 ] 

Maximilian Michels commented on FLINK-2821:
---

Awww, that's unfortunate :( I wonder, would it make sense to maintain a custom 
Akka version with a simple patch that makes the Netty bind address configurable? 
IMHO that would even be better than our existing approach of bundling Akka 
versions depending on the Scala version used. I would like to look into this at 
some point.

> Change Akka configuration to allow accessing actors from different URLs
> ---
>
> Key: FLINK-2821
> URL: https://issues.apache.org/jira/browse/FLINK-2821
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Runtime
>Reporter: Robert Metzger
>Assignee: Robert Metzger
> Fix For: 1.1.0
>
>
> Akka expects the actor's URL to be exactly matching.
> As pointed out in this thread, users have been complaining about this: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Error-trying-to-access-JM-through-proxy-td3018.html
>   - Proxy routing (as described there: messages are sent to the proxy URL, but 
> the receiver recognizes only the original URL)
>   - Using hostname / IP interchangeably does not work (we solved this by 
> always putting IP addresses into URLs, never hostnames)
>   - Binding to multiple interfaces (any local 0.0.0.0) does not work. Still 
> no solution for that (but it seems not too much of a restriction)
> I am aware that this is not possible due to Akka, so it is actually not a 
> Flink bug. But I think we should track the resolution of the issue here 
> anyway, because it affects our users' satisfaction.
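For reference, later Akka releases (2.4 and up) address exactly this by decoupling the advertised address from the bind address with separate {{bind-hostname}} / {{bind-port}} settings in the remoting configuration. A sketch, assuming Akka 2.4-style HOCON (host name and port are placeholders):

{code}
akka {
  remote {
    netty.tcp {
      hostname = "public-host.example.com"  # address advertised in actor URLs
      port = 6123
      bind-hostname = "0.0.0.0"             # address Netty actually binds to
      bind-port = 6123
    }
  }
}
{code}

At the time of this discussion, Flink bundled an older Akka line where these bind settings were not available, which is why a custom patch is being considered above.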




