[jira] [Closed] (FLINK-4598) Support NULLIF in Table API
[ https://issues.apache.org/jira/browse/FLINK-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jark Wu closed FLINK-4598.
--------------------------
    Resolution: Won't Fix

> Support NULLIF in Table API
> ---------------------------
>
>                 Key: FLINK-4598
>                 URL: https://issues.apache.org/jira/browse/FLINK-4598
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API & SQL
>            Reporter: Jark Wu
>            Assignee: Jark Wu
>
> This could be a subtask of [FLINK-4549]. Flink SQL already supports {{NULLIF}}
> implicitly; we should support it in the Table API as well.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
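For context, SQL's NULLIF(a, b) returns NULL when the two arguments are equal and the first argument otherwise. A minimal plain-Java sketch of these semantics (the class and method names are illustrative, not Flink or Calcite API):

```java
// Illustrative sketch of SQL NULLIF semantics in plain Java; not Flink API.
import java.util.Objects;

public class NullIfDemo {
    // NULLIF(a, b): NULL if a equals b, otherwise a.
    static Integer nullIf(Integer a, Integer b) {
        return Objects.equals(a, b) ? null : a;
    }

    public static void main(String[] args) {
        System.out.println(nullIf(5, 5)); // null
        System.out.println(nullIf(5, 3)); // 5
    }
}
```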
[jira] [Comment Edited] (FLINK-4557) Table API Stream Aggregations
[ https://issues.apache.org/jira/browse/FLINK-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15500099#comment-15500099 ]

shijinkui edited comment on FLINK-4557 at 9/18/16 2:52 AM:
-----------------------------------------------------------

Has this plan been started? What's the progress now? Can someone create the nine subtasks mentioned in FLIP-11? Thanks.

was (Author: shijinkui): Has this plan been started? What's the progress now? Thanks.

> Table API Stream Aggregations
> -----------------------------
>
>                 Key: FLINK-4557
>                 URL: https://issues.apache.org/jira/browse/FLINK-4557
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API & SQL
>            Reporter: Timo Walther
>
> The Table API is a declarative API to define queries on static and streaming
> tables. So far, only projection, selection, and union are supported
> operations on streaming tables.
> This issue and the corresponding FLIP propose to add support for different
> types of aggregations on top of streaming tables. In particular, we seek to
> support:
> *Group-window aggregates*, i.e., aggregates which are computed for a group of
> elements. A (time or row-count) window is required to bound the infinite
> input stream into a finite group.
> *Row-window aggregates*, i.e., aggregates which are computed for each row,
> based on a window (range) of preceding and succeeding rows.
> Each type of aggregate shall be supported on keyed/grouped or
> non-keyed/grouped data streams for streaming tables as well as batch tables.
> Since time-windowed aggregates will be the first operation that requires the
> definition of time, we also need to discuss how the Table API handles time
> characteristics, timestamps, and watermarks.
> The FLIP can be found here:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-11%3A+Table+API+Stream+Aggregations
[jira] [Commented] (FLINK-4557) Table API Stream Aggregations
[ https://issues.apache.org/jira/browse/FLINK-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15500099#comment-15500099 ]

shijinkui commented on FLINK-4557:
----------------------------------

Has this plan been started? What's the progress now? Thanks.

> Table API Stream Aggregations
> -----------------------------
>
>                 Key: FLINK-4557
>                 URL: https://issues.apache.org/jira/browse/FLINK-4557
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API & SQL
>            Reporter: Timo Walther
>
> The Table API is a declarative API to define queries on static and streaming
> tables. So far, only projection, selection, and union are supported
> operations on streaming tables.
> This issue and the corresponding FLIP propose to add support for different
> types of aggregations on top of streaming tables. In particular, we seek to
> support:
> *Group-window aggregates*, i.e., aggregates which are computed for a group of
> elements. A (time or row-count) window is required to bound the infinite
> input stream into a finite group.
> *Row-window aggregates*, i.e., aggregates which are computed for each row,
> based on a window (range) of preceding and succeeding rows.
> Each type of aggregate shall be supported on keyed/grouped or
> non-keyed/grouped data streams for streaming tables as well as batch tables.
> Since time-windowed aggregates will be the first operation that requires the
> definition of time, we also need to discuss how the Table API handles time
> characteristics, timestamps, and watermarks.
> The FLIP can be found here:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-11%3A+Table+API+Stream+Aggregations
[jira] [Updated] (FLINK-4499) Introduce findbugs maven plugin
[ https://issues.apache.org/jira/browse/FLINK-4499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated FLINK-4499:
--------------------------
    Description:
As suggested by Stephan in FLINK-4482, this issue is to add findbugs-maven-plugin into the build process so that we can detect lack of proper locking and other defects automatically. We can begin with a small set of rules.

  was: As suggested by Stephan in FLINK-4482, this issue is to add findbugs-maven-plugin into the build process so that we can detect lack of proper locking and other defects automatically.

> Introduce findbugs maven plugin
> -------------------------------
>
>                 Key: FLINK-4499
>                 URL: https://issues.apache.org/jira/browse/FLINK-4499
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: Ted Yu
>
> As suggested by Stephan in FLINK-4482, this issue is to add
> findbugs-maven-plugin into the build process so that we can detect lack of
> proper locking and other defects automatically.
> We can begin with a small set of rules.
[jira] [Updated] (FLINK-4482) numUnsuccessfulCheckpointsTriggers is accessed without holding triggerLock
[ https://issues.apache.org/jira/browse/FLINK-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated FLINK-4482:
--------------------------
    Description:
In CheckpointCoordinator#stopCheckpointScheduler():
{code}
synchronized (lock) {
    ...
    numUnsuccessfulCheckpointsTriggers = 0;
{code}
triggerLock is not held in the above operation. See the comment for triggerLock earlier in triggerCheckpoint():
{code}
// we lock with a special lock to make sure that trigger requests do not overtake each other.
// this is not done with the coordinator-wide lock, because the 'checkpointIdCounter'
// may issue blocking operations. Using a different lock than the coordinator-wide lock,
// we avoid blocking the processing of 'acknowledge/decline' messages during that time.
synchronized (triggerLock) {
{code}

> numUnsuccessfulCheckpointsTriggers is accessed without holding triggerLock
> --------------------------------------------------------------------------
>
>                 Key: FLINK-4482
>                 URL: https://issues.apache.org/jira/browse/FLINK-4482
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Priority: Minor
>
> In CheckpointCoordinator#stopCheckpointScheduler():
> {code}
> synchronized (lock) {
>     ...
>     numUnsuccessfulCheckpointsTriggers = 0;
> {code}
> triggerLock is not held in the above operation.
> See the comment for triggerLock earlier in triggerCheckpoint():
> {code}
> // we lock with a special lock to make sure that trigger requests do not overtake each other.
> // this is not done with the coordinator-wide lock, because the 'checkpointIdCounter'
> // may issue blocking operations. Using a different lock than the coordinator-wide lock,
> // we avoid blocking the processing of 'acknowledge/decline' messages during that time.
> synchronized (triggerLock) {
> {code}
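The two-lock discipline this report describes can be sketched in plain Java. This is an illustrative reduction, not the actual CheckpointCoordinator code: the field and method names mirror the report, and the nested `synchronized` in `stopScheduler()` shows the fix the report implies (also take `triggerLock` before resetting the counter):

```java
// Minimal sketch of the locking issue in FLINK-4482 (names illustrative,
// not the real CheckpointCoordinator).
public class Coordinator {
    private final Object lock = new Object();        // coordinator-wide lock
    private final Object triggerLock = new Object(); // serializes trigger requests only

    private int numUnsuccessfulCheckpointsTriggers = 0;

    void triggerCheckpoint() {
        // Trigger requests serialize on the narrow triggerLock, so potentially
        // blocking work (e.g. fetching a checkpoint id) never holds the wide lock.
        synchronized (triggerLock) {
            numUnsuccessfulCheckpointsTriggers++;
        }
    }

    void stopScheduler() {
        synchronized (lock) {
            // The reported bug: resetting the counter under 'lock' alone can race
            // with a concurrent trigger that holds only 'triggerLock'.
            synchronized (triggerLock) { // also taking triggerLock closes the race
                numUnsuccessfulCheckpointsTriggers = 0;
            }
        }
    }

    int failures() {
        synchronized (triggerLock) {
            return numUnsuccessfulCheckpointsTriggers;
        }
    }

    public static void main(String[] args) {
        Coordinator c = new Coordinator();
        c.triggerCheckpoint();
        c.stopScheduler();
        System.out.println(c.failures()); // 0
    }
}
```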
[GitHub] flink pull request #2509: [FLINK-4280][kafka-connector] Explicit start posit...
GitHub user tzulitai opened a pull request:

    https://github.com/apache/flink/pull/2509

[FLINK-4280][kafka-connector] Explicit start position configuration for Kafka Consumer

This PR adds the following new explicit setter methods to configure the starting position for the Kafka Consumer connector:

```
FlinkKafkaConsumer08 kafka = new FlinkKafkaConsumer08(...) // or 09
kafka.setStartFromEarliest();     // start from earliest, without respecting any committed offsets
kafka.setStartFromLatest();       // start from latest, without respecting any committed offsets
kafka.setStartFromGroupOffsets(); // respect committed offsets in ZK / Kafka as starting points
```

The default is to start from group offsets, so we won't be breaking existing user code.

One thing to note is that this PR also includes some refactoring to consolidate all start-offset assignment logic for partitions within the fetcher. For example, in the 0.8 version, with this change the `SimpleConsumerThread` no longer decides where a partition needs to start from; all partitions should already be assigned starting offsets by the fetcher, and the thread simply starts consuming the partition. This is preparation for transparent partition discovery for the Kafka consumers in [FLINK-4022](https://issues.apache.org/jira/browse/FLINK-4022).

I suggest reviewing this PR after #2369, to reduce the effort of getting the 0.10 Kafka consumer in first. Tests for the new functionality will be added in follow-up commits.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tzulitai/flink FLINK-4280

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2509.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2509

commit f1d24806d902a45f66fc9b42a19a303a031b81b1
Author: Tzu-Li (Gordon) Tai
Date:   2016-09-17T13:41:50Z

    [FLINK-4280][kafka-connector] Explicit start position configuration for Kafka Consumer

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
[jira] [Commented] (FLINK-4280) New Flink-specific option to set starting position of Kafka consumer without respecting external offsets in ZK / Broker
[ https://issues.apache.org/jira/browse/FLINK-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15499135#comment-15499135 ]

ASF GitHub Bot commented on FLINK-4280:
---------------------------------------

GitHub user tzulitai opened a pull request:

    https://github.com/apache/flink/pull/2509

[FLINK-4280][kafka-connector] Explicit start position configuration for Kafka Consumer

This PR adds the following new explicit setter methods to configure the starting position for the Kafka Consumer connector:

```
FlinkKafkaConsumer08 kafka = new FlinkKafkaConsumer08(...) // or 09
kafka.setStartFromEarliest();     // start from earliest, without respecting any committed offsets
kafka.setStartFromLatest();       // start from latest, without respecting any committed offsets
kafka.setStartFromGroupOffsets(); // respect committed offsets in ZK / Kafka as starting points
```

The default is to start from group offsets, so we won't be breaking existing user code.

One thing to note is that this PR also includes some refactoring to consolidate all start-offset assignment logic for partitions within the fetcher. For example, in the 0.8 version, with this change the `SimpleConsumerThread` no longer decides where a partition needs to start from; all partitions should already be assigned starting offsets by the fetcher, and the thread simply starts consuming the partition. This is preparation for transparent partition discovery for the Kafka consumers in [FLINK-4022](https://issues.apache.org/jira/browse/FLINK-4022).

I suggest reviewing this PR after #2369, to reduce the effort of getting the 0.10 Kafka consumer in first. Tests for the new functionality will be added in follow-up commits.

> New Flink-specific option to set starting position of Kafka consumer without
> respecting external offsets in ZK / Broker
> ----------------------------------------------------------------------------
>
>                 Key: FLINK-4280
>                 URL: https://issues.apache.org/jira/browse/FLINK-4280
>             Project: Flink
>          Issue Type: New Feature
>          Components: Kafka Connector
>            Reporter: Tzu-Li (Gordon) Tai
>            Assignee: Tzu-Li (Gordon) Tai
>             Fix For: 1.2.0
>
> Currently, to start reading from the "earliest" or "latest" position in
> topics with the Flink Kafka consumer, users set the Kafka config
> {{auto.offset.reset}} in the provided properties configuration.
> However, the way this config actually works might be a bit misleading if
> users were trying to find a way to "read topics from a starting position".
> The way the {{auto.offset.reset}} config works in the Flink Kafka consumer
> resembles Kafka's original intent for the setting: first, existing external
> offsets committed to the ZK / brokers are checked; only if none exist is
> {{auto.offset.reset}} respected.
> I propose to add Flink-specific ways to define the starting position, without
> taking the external offsets into account. The original behaviour (reference
> external offsets first) can be made a user option, so that the behaviour can
> be retained for frequent Kafka users that may need some collaboration with
> existing non-Flink Kafka consumer applications.
> How users will interact with the Flink Kafka consumer after this is added,
> with a newly introduced {{flink.starting-position}} config:
> {code}
> Properties props = new Properties();
> props.setProperty("flink.starting-position", "earliest/latest");
> props.setProperty("auto.offset.reset", "..."); // this will be ignored (log a warning)
> props.setProperty("group.id", "..."); // this won't affect the starting position anymore (may still be used for external offset committing)
> ...
> {code}
> Or, referencing external offsets in ZK / broker:
> {code}
> Properties props = new Properties();
> props.setProperty("flink.starting-position", "external-offsets");
> props.setProperty("auto.offset.reset", "earliest/latest"); // default will be latest
> props.setProperty("group.id", "..."); // will be used to look up external offsets in ZK / broker on startup
> ...
> {code}
> A thing we would need to decide on is what the default value would be.
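The proposed property lookup above could be resolved along the following lines. This is a hypothetical sketch only: the `flink.starting-position` key comes from the proposal text, but the enum, method names, and the choice of `external-offsets` as the default are illustrative assumptions, not the API that was eventually merged:

```java
// Hypothetical resolution of the proposed "flink.starting-position" property.
// Names and the default value are illustrative, not actual Flink API.
import java.util.Properties;

public class StartPositionDemo {
    enum StartPosition { EARLIEST, LATEST, EXTERNAL_OFFSETS }

    static StartPosition resolve(Properties props) {
        // Assumed default: keep the original behaviour (external offsets first).
        String value = props.getProperty("flink.starting-position", "external-offsets");
        switch (value) {
            case "earliest":         return StartPosition.EARLIEST;
            case "latest":           return StartPosition.LATEST;
            case "external-offsets": return StartPosition.EXTERNAL_OFFSETS;
            default:
                throw new IllegalArgumentException("Unknown starting position: " + value);
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("flink.starting-position", "earliest");
        System.out.println(resolve(props)); // EARLIEST
    }
}
```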
[jira] [Commented] (FLINK-4629) Kafka v 0.10 Support
[ https://issues.apache.org/jira/browse/FLINK-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15498948#comment-15498948 ]

Simone Robutti commented on FLINK-4629:
---------------------------------------

There's already an issue that covers updating the Kafka source and Kafka sink for Flink to Kafka 0.10: https://issues.apache.org/jira/browse/FLINK-4035
From what I've heard in the past weeks, this will probably be included in Flink 1.2.

> Kafka v 0.10 Support
> --------------------
>
>                 Key: FLINK-4629
>                 URL: https://issues.apache.org/jira/browse/FLINK-4629
>             Project: Flink
>          Issue Type: Wish
>            Reporter: Mariano Gonzalez
>            Priority: Minor
>
> I couldn't find any repo or documentation about when Flink will start
> supporting Kafka v 0.10.
> Is there any document you can point me to where I can see Flink's roadmap?
> Thanks in advance
[jira] [Commented] (FLINK-3322) MemoryManager creates too much GC pressure with iterative jobs
[ https://issues.apache.org/jira/browse/FLINK-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15498937#comment-15498937 ]

ASF GitHub Bot commented on FLINK-3322:
---------------------------------------

Github user ggevay commented on the issue:

    https://github.com/apache/flink/pull/2495

    Btw. before submitting a pull request, it is good practice to run `mvn clean verify` locally (or do a Travis build), and make sure that there are no test failures (or, at least, no test failures related to the current change).

> MemoryManager creates too much GC pressure with iterative jobs
> --------------------------------------------------------------
>
>                 Key: FLINK-3322
>                 URL: https://issues.apache.org/jira/browse/FLINK-3322
>             Project: Flink
>          Issue Type: Bug
>          Components: Local Runtime
>    Affects Versions: 1.0.0
>            Reporter: Gabor Gevay
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 1.0.0
>
>         Attachments: FLINK-3322.docx, FLINK-3322_reusingmemoryfordrivers.docx
>
> When taskmanager.memory.preallocate is false (the default), released memory
> segments are not added to a pool, but the GC is expected to take care of
> them. This puts too much pressure on the GC with iterative jobs, where the
> operators reallocate all memory at every superstep.
> See the following discussion on the mailing list:
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Memory-manager-behavior-in-iterative-jobs-tt10066.html
> Reproducing the issue:
> https://github.com/ggevay/flink/tree/MemoryManager-crazy-gc
> The class to start is malom.Solver. If you increase the memory given to the
> JVM from 1 to 50 GB, performance gradually degrades by more than 10 times.
> (It will generate some lookup tables to /tmp on first run, for a few minutes.)
> (I think the slowdown might also depend somewhat on
> taskmanager.memory.fraction, because more unused non-managed memory results
> in rarer GCs.)
[GitHub] flink issue #2495: FLINK-3322 - Make sorters to reuse the memory pages alloc...
Github user ggevay commented on the issue:

    https://github.com/apache/flink/pull/2495

    Btw. before submitting a pull request, it is good practice to run `mvn clean verify` locally (or do a Travis build), and make sure that there are no test failures (or, at least, no test failures related to the current change).
[GitHub] flink issue #2495: FLINK-3322 - Make sorters to reuse the memory pages alloc...
Github user ggevay commented on the issue:

    https://github.com/apache/flink/pull/2495

    The tests that are failing are in `flink-gelly-examples`, for example `SingleSourceShortestPathsITCase`. They have error messages that indicate memory management errors, e.g. `Caused by: java.lang.RuntimeException: Error obtaining the sorted input: Thread 'SortMerger Reading Thread' terminated due to an exception: segment has been freed`. But I would say that before fixing this pull request, let's concentrate on https://github.com/apache/flink/pull/2496.
[jira] [Commented] (FLINK-3322) MemoryManager creates too much GC pressure with iterative jobs
[ https://issues.apache.org/jira/browse/FLINK-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15498928#comment-15498928 ]

ASF GitHub Bot commented on FLINK-3322:
---------------------------------------

Github user ggevay commented on the issue:

    https://github.com/apache/flink/pull/2495

    The tests that are failing are in `flink-gelly-examples`, for example `SingleSourceShortestPathsITCase`. They have error messages that indicate memory management errors, e.g. `Caused by: java.lang.RuntimeException: Error obtaining the sorted input: Thread 'SortMerger Reading Thread' terminated due to an exception: segment has been freed`. But I would say that before fixing this pull request, let's concentrate on https://github.com/apache/flink/pull/2496.

> MemoryManager creates too much GC pressure with iterative jobs
> --------------------------------------------------------------
>
>                 Key: FLINK-3322
>                 URL: https://issues.apache.org/jira/browse/FLINK-3322
>             Project: Flink
>          Issue Type: Bug
>          Components: Local Runtime
>    Affects Versions: 1.0.0
>            Reporter: Gabor Gevay
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 1.0.0
>
>         Attachments: FLINK-3322.docx, FLINK-3322_reusingmemoryfordrivers.docx
>
> When taskmanager.memory.preallocate is false (the default), released memory
> segments are not added to a pool, but the GC is expected to take care of
> them. This puts too much pressure on the GC with iterative jobs, where the
> operators reallocate all memory at every superstep.
> See the following discussion on the mailing list:
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Memory-manager-behavior-in-iterative-jobs-tt10066.html
> Reproducing the issue:
> https://github.com/ggevay/flink/tree/MemoryManager-crazy-gc
> The class to start is malom.Solver. If you increase the memory given to the
> JVM from 1 to 50 GB, performance gradually degrades by more than 10 times.
> (It will generate some lookup tables to /tmp on first run, for a few minutes.)
> (I think the slowdown might also depend somewhat on
> taskmanager.memory.fraction, because more unused non-managed memory results
> in rarer GCs.)
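The pooling idea behind taskmanager.memory.preallocate=true can be illustrated with a few lines of plain Java. This is a simplified sketch, not Flink's actual MemoryManager: released segments go back into a pool and are handed out again, so iterative jobs that release and reacquire all memory each superstep create no garbage for the GC:

```java
// Simplified illustration of segment pooling vs. reallocation; not the
// real Flink MemoryManager. A released segment is kept in a pool and
// reused by the next allocate() call instead of becoming GC garbage.
import java.util.ArrayDeque;

public class SegmentPoolDemo {
    static final int SEGMENT_SIZE = 4096; // illustrative segment size
    private final ArrayDeque<byte[]> pool = new ArrayDeque<>();

    byte[] allocate() {
        byte[] seg = pool.poll();
        // Reuse a pooled segment if one exists; only allocate when the pool is empty.
        return seg != null ? seg : new byte[SEGMENT_SIZE];
    }

    void release(byte[] seg) {
        pool.push(seg); // keep for reuse instead of leaving it to the GC
    }

    public static void main(String[] args) {
        SegmentPoolDemo mm = new SegmentPoolDemo();
        byte[] a = mm.allocate();
        mm.release(a);
        byte[] b = mm.allocate();
        System.out.println(a == b); // true: same segment reused, no new allocation
    }
}
```

With preallocate=false (the default described above), `release` would instead drop the reference, and every superstep's `allocate` would create fresh arrays for the GC to collect.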
[jira] [Resolved] (FLINK-4589) Fix Merging of Covering Window in MergingWindowSet
[ https://issues.apache.org/jira/browse/FLINK-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aljoscha Krettek resolved FLINK-4589.
-------------------------------------
    Resolution: Fixed

Fixed on the master and release-1.1 branches.

> Fix Merging of Covering Window in MergingWindowSet
> --------------------------------------------------
>
>                 Key: FLINK-4589
>                 URL: https://issues.apache.org/jira/browse/FLINK-4589
>             Project: Flink
>          Issue Type: Bug
>          Components: Windowing Operators
>            Reporter: Aljoscha Krettek
>            Assignee: Aljoscha Krettek
>            Priority: Blocker
>             Fix For: 1.0.4, 1.2.0, 1.1.3
>
> Right now, when a new window gets merged in that covers all of the existing
> windows, {{MergingWindowSet}} does not correctly set the state window.
[jira] [Commented] (FLINK-4589) Fix Merging of Covering Window in MergingWindowSet
[ https://issues.apache.org/jira/browse/FLINK-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15498512#comment-15498512 ]

Aljoscha Krettek commented on FLINK-4589:
-----------------------------------------

Ah good, I saw that you also put this on the 1.1 branch. Closing this issue now.

> Fix Merging of Covering Window in MergingWindowSet
> --------------------------------------------------
>
>                 Key: FLINK-4589
>                 URL: https://issues.apache.org/jira/browse/FLINK-4589
>             Project: Flink
>          Issue Type: Bug
>          Components: Windowing Operators
>            Reporter: Aljoscha Krettek
>            Assignee: Aljoscha Krettek
>            Priority: Blocker
>             Fix For: 1.0.4, 1.2.0, 1.1.3
>
> Right now, when a new window gets merged in that covers all of the existing
> windows, {{MergingWindowSet}} does not correctly set the state window.