[jira] [Created] (FLINK-5103) Process virtual memory and physical memory used size gauge

2016-11-18 Thread zhuhaifeng (JIRA)
zhuhaifeng created FLINK-5103:
-

 Summary: Process virtual memory and physical memory used size gauge
 Key: FLINK-5103
 URL: https://issues.apache.org/jira/browse/FLINK-5103
 Project: Flink
  Issue Type: Improvement
Reporter: zhuhaifeng
Assignee: zhuhaifeng
Priority: Minor
 Fix For: 1.2.0


Add TaskManger Process virtual memory and physical memory used size gauge 
metrics.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5102) Connection establishment does not react to interrupt

2016-11-18 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-5102:
--

 Summary: Connection establishment does not react to interrupt
 Key: FLINK-5102
 URL: https://issues.apache.org/jira/browse/FLINK-5102
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.1.3
Reporter: Ufuk Celebi
Assignee: Ufuk Celebi
 Fix For: 1.2.0, 1.1.4


Interrupting a connection establishment does not to react to interrupts.

{code}
Task - Task '... (60/120)' did not react to cancelling signal, but is stuck in 
method:
java.lang.Object.$$YJP$$wait(Native Method)
java.lang.Object.wait(Object.java)
org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.waitForChannel(PartitionRequestClientFactory.java:191)
org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.access$000(PartitionRequestClientFactory.java:131)
org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:83)
org.apache.flink.runtime.io.network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:60)
org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:118)
org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.requestPartitions(SingleInputGate.java:395)
org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:414)
org.apache.flink.streaming.runtime.io.BarrierBuffer.getNextNonBlocked(BarrierBuffer.java:152)
org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:195)
org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:67)
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:266)
org.apache.flink.runtime.taskmanager.Task.run(Task.java:638)
java.lang.Thread.run(Thread.java:745)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Flink survey by data Artisans

2016-11-18 Thread Shannon Carey
There's a newline that disrupts the URL.

http://www.surveygizmo.com/s3/3166399/181bdb611f22

Not:

http://www.surveygizmo.com/s3/
3166399/181bdb611f22



Re: Flink survey by data Artisans

2016-11-18 Thread Vishnu Viswanath
Works for me also.

On Fri, Nov 18, 2016 at 12:35 PM, Stephan Ewen  wrote:

> Just checked it, the link works for me.
>
>
> On Fri, Nov 18, 2016 at 7:20 PM, amir bahmanyari 
> wrote:
>
>> [image: Inline image]
>>
>>
>> --
>> *From:* Kostas Tzoumas 
>> *To:* "dev@flink.apache.org" ;
>> u...@flink.apache.org
>> *Sent:* Friday, November 18, 2016 7:28 AM
>> *Subject:* Flink survey by data Artisans
>>
>> Hi everyone!
>>
>> The Apache Flink community has evolved quickly over the past 2+ years, and
>> there are now many production Flink deployments in organizations of all
>> sizes.  This is both exciting and humbling :-)
>>
>> data Artisans is running a brief survey to understand Apache Flink usage
>> and the needs of the community. We are hoping that this survey will help
>> identify common usage patterns, as well as pinpoint what are the most
>> needed features for Flink.
>>
>> We'll share a report with a summary of findings at the conclusion of the
>> survey with the community. All of the responses will remain confidential,
>> and only aggregate statistics will be shared.
>>
>> I expect the survey to take 5-10 minutes, and all questions are
>> optional--we appreciate any feedback that you're willing to provide.
>>
>> As a thank you, respondents will be entered in a drawing to win one of 10
>> tickets to Flink Forward 2017 (your choice of Berlin or the first-ever San
>> Francisco edition).
>>
>> The survey is available here: http://www.surveygizmo.com/s3/
>> 3166399/181bdb611f22
>>
>> Looking forward to hearing back from you!
>>
>> Best,
>> Kostas
>>
>>
>>
>


Re: Flink survey by data Artisans

2016-11-18 Thread amir bahmanyari



  From: Kostas Tzoumas 
 To: "dev@flink.apache.org" ; u...@flink.apache.org 
 Sent: Friday, November 18, 2016 7:28 AM
 Subject: Flink survey by data Artisans
   
Hi everyone!

The Apache Flink community has evolved quickly over the past 2+ years, and
there are now many production Flink deployments in organizations of all
sizes.  This is both exciting and humbling :-)

data Artisans is running a brief survey to understand Apache Flink usage
and the needs of the community. We are hoping that this survey will help
identify common usage patterns, as well as pinpoint what are the most
needed features for Flink.

We'll share a report with a summary of findings at the conclusion of the
survey with the community. All of the responses will remain confidential,
and only aggregate statistics will be shared.

I expect the survey to take 5-10 minutes, and all questions are
optional--we appreciate any feedback that you're willing to provide.

As a thank you, respondents will be entered in a drawing to win one of 10
tickets to Flink Forward 2017 (your choice of Berlin or the first-ever San
Francisco edition).

The survey is available here: http://www.surveygizmo.com/s3/
3166399/181bdb611f22

Looking forward to hearing back from you!

Best,
Kostas


   

[jira] [Created] (FLINK-5101) Test CassandraConnectorITCase instable

2016-11-18 Thread Stefan Richter (JIRA)
Stefan Richter created FLINK-5101:
-

 Summary: Test CassandraConnectorITCase instable
 Key: FLINK-5101
 URL: https://issues.apache.org/jira/browse/FLINK-5101
 Project: Flink
  Issue Type: Bug
  Components: Cassandra Connector
Reporter: Stefan Richter


I observed this test fail on Travis (very rarely):
 
 Running 
org.apache.flink.streaming.connectors.cassandra.CassandraConnectorITCase


Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 80.843 sec <<< 
FAILURE! - in 
org.apache.flink.streaming.connectors.cassandra.CassandraConnectorITCase
testCassandraBatchFormats(org.apache.flink.streaming.connectors.cassandra.CassandraConnectorITCase)
  Time elapsed: 5.82 sec  <<< FAILURE!
java.lang.AssertionError: expected:<40> but was:<20>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.flink.streaming.connectors.cassandra.CassandraConnectorITCase.testCassandraBatchFormats(CassandraConnectorITCase.java:442)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5100) Test testZooKeeperReelection is instable

2016-11-18 Thread Stefan Richter (JIRA)
Stefan Richter created FLINK-5100:
-

 Summary: Test testZooKeeperReelection is instable
 Key: FLINK-5100
 URL: https://issues.apache.org/jira/browse/FLINK-5100
 Project: Flink
  Issue Type: Bug
  Components: Distributed Coordination
Reporter: Stefan Richter


I observed this test failing (very rarely) on Travis:
 
testZooKeeperReelection(org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionTest)
  Time elapsed: 303.321 sec  <<< FAILURE!
java.lang.AssertionError: null
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertFalse(Assert.java:64)
at org.junit.Assert.assertFalse(Assert.java:74)
at 
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionTest.testZooKeeperReelection(ZooKeeperLeaderElectionTest.java:197)


Results :

Failed tests: 
  ZooKeeperLeaderElectionTest.testZooKeeperReelection:197 null



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5099) Test testCancelPartitionRequest is instable

2016-11-18 Thread Stefan Richter (JIRA)
Stefan Richter created FLINK-5099:
-

 Summary: Test testCancelPartitionRequest is instable
 Key: FLINK-5099
 URL: https://issues.apache.org/jira/browse/FLINK-5099
 Project: Flink
  Issue Type: Bug
  Components: Network
Reporter: Stefan Richter


I observed this test fail on Travis (very rarely):

testCancelPartitionRequest(org.apache.flink.runtime.io.network.netty.CancelPartitionRequestTest)
  Time elapsed: 168.756 sec  <<< ERROR!
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2367)
at 
java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
at 
java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
at 
java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
at java.lang.StringBuilder.append(StringBuilder.java:132)
at 
org.apache.flink.runtime.io.network.netty.CancelPartitionRequestTest.testCancelPartitionRequest(CancelPartitionRequestTest.java:94)

Results :

Tests in error: 
  CancelPartitionRequestTest.testCancelPartitionRequest:94 » OutOfMemory Java 
he...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Flink survey by data Artisans

2016-11-18 Thread Kostas Tzoumas
Hi everyone!

The Apache Flink community has evolved quickly over the past 2+ years, and
there are now many production Flink deployments in organizations of all
sizes.  This is both exciting and humbling :-)

data Artisans is running a brief survey to understand Apache Flink usage
and the needs of the community. We are hoping that this survey will help
identify common usage patterns, as well as pinpoint what are the most
needed features for Flink.

We'll share a report with a summary of findings at the conclusion of the
survey with the community. All of the responses will remain confidential,
and only aggregate statistics will be shared.

I expect the survey to take 5-10 minutes, and all questions are
optional--we appreciate any feedback that you're willing to provide.

As a thank you, respondents will be entered in a drawing to win one of 10
tickets to Flink Forward 2017 (your choice of Berlin or the first-ever San
Francisco edition).

The survey is available here: http://www.surveygizmo.com/s3/
3166399/181bdb611f22

Looking forward to hearing back from you!

Best,
Kostas


[jira] [Created] (FLINK-5098) Detect network problems to eagerly time out ask operations

2016-11-18 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-5098:


 Summary: Detect network problems to eagerly time out ask operations
 Key: FLINK-5098
 URL: https://issues.apache.org/jira/browse/FLINK-5098
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Coordination
Affects Versions: 1.2.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 1.2.0


Akka's ask operations are given a timeout after which they should fail with an 
{{AskTimeoutException}}. In some cases, however, it is possible to fail early 
because one has detected that the remote host is not reachable or that the 
actor does not exist on the remote {{ActorSystem}}.

Usually failing early if one cannot hope for a successful message delivery is a 
desirable behaviour since it speeds up recovery. 

I propose to send Akka's {{Identify}} message with each ask request. The 
identify message allows to detect unreachable/non-existing actors and, thus, 
enables us to fail the ask operation early.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5096) Make the RollingSink rescalable.

2016-11-18 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-5096:
-

 Summary: Make the RollingSink rescalable.
 Key: FLINK-5096
 URL: https://issues.apache.org/jira/browse/FLINK-5096
 Project: Flink
  Issue Type: Improvement
  Components: filesystem-connector
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas
 Fix For: 1.2.0


Integrate the RollingSink with the new state abstractions so that its 
parallelism can change after restoring from a savepoint.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5095) Add explicit notifyOfAddedX methods to MetricReporter interface

2016-11-18 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-5095:
---

 Summary: Add explicit notifyOfAddedX methods to MetricReporter 
interface
 Key: FLINK-5095
 URL: https://issues.apache.org/jira/browse/FLINK-5095
 Project: Flink
  Issue Type: Improvement
  Components: Metrics
Affects Versions: 1.1.3
Reporter: Chesnay Schepler


I would like to start a discussion on the MetricReporter interface, 
specifically the methods that notify a reporter of added or removed metrics.

Currently, the methods are defined as follows:
{code}
void notifyOfAddedMetric(Metric metric, String metricName, MetricGroup group);
void notifyOfRemovedMetric(Metric metric, String metricName, MetricGroup group);
{code}

All metrics, regardless of their actual type, are passed to the reporter with 
these methods.

Since the different metric types have to be handled differently we thus force 
every reporter to do something like this:
{code}
if (metric instanceof Counter) {
Counter c = (Counter) metric;
// deal with counter
} else if (metric instanceof Gauge) {
// deal with gauge
} else if (metric instanceof Histogram) {
// deal with histogram
} else if (metric instanceof Meter) {
// deal with meter
} else {
// log something or throw an exception
}
{code}

This has a few issues
* the instanceof checks and castings are unnecessary overhead
* it requires the implementer to be aware of every metric type
* it encourages throwing an exception in the final else block

We could remedy all of these by reworking the interface to contain explicit 
add/remove methods for every metric type. This would however be a breaking 
change and blow up the interface to 12 methods from the current 4. We could 
also add a RichMetricReporter interface with these methods, which would require 
relatively little changes but add additional complexity.

I was wondering what other people think about this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5094) Support RichReduceFunction and RichFoldFunction as incremental window aggregation functions

2016-11-18 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-5094:


 Summary: Support RichReduceFunction and RichFoldFunction as 
incremental window aggregation functions
 Key: FLINK-5094
 URL: https://issues.apache.org/jira/browse/FLINK-5094
 Project: Flink
  Issue Type: Improvement
  Components: Windowing Operators
Affects Versions: 1.1.3, 1.2.0
Reporter: Fabian Hueske


Support {{RichReduceFunction}} and {{RichFoldFunction}} as incremental window 
aggregation functions in order to initialize the functions via {{open()}].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5093) java.util.ConcurrentModificationException is thrown when stopping TimerService

2016-11-18 Thread Biao Liu (JIRA)
Biao Liu created FLINK-5093:
---

 Summary: java.util.ConcurrentModificationException is thrown when 
stopping TimerService
 Key: FLINK-5093
 URL: https://issues.apache.org/jira/browse/FLINK-5093
 Project: Flink
  Issue Type: Bug
  Components: Cluster Management
 Environment: FLIP-6 feature branch
Reporter: Biao Liu
Assignee: Biao Liu
Priority: Minor


In stop method of TimeService, removing Timeout instance while iterating the 
map will cause a java.util.ConcurrentModificationException.

The stack is:
java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextEntry(HashMap.java:922)
at java.util.HashMap$KeyIterator.next(HashMap.java:956)
at 
org.apache.flink.runtime.taskexecutor.slot.TimerService.stop(TimerService.java:63)
at 
org.apache.flink.runtime.taskexecutor.slot.TaskSlotTable.stop(TaskSlotTable.java:129)
at 
org.apache.flink.runtime.taskexecutor.TaskExecutor.shutDown(TaskExecutor.java:224)
at 
org.apache.flink.runtime.taskexecutor.TaskManagerRunner.shutDownInternally(TaskManagerRunner.java:135)
at 
org.apache.flink.runtime.taskexecutor.TaskManagerRunner.shutDown(TaskManagerRunner.java:129)
at 
org.apache.flink.runtime.minicluster.MiniCluster.shutdownInternally(MiniCluster.java:319)
at 
org.apache.flink.runtime.minicluster.MiniCluster.shutdown(MiniCluster.java:274)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)