[jira] [Commented] (FLINK-7038) Several misused "KeyedDataStream" term in docs and Javadocs
[ https://issues.apache.org/jira/browse/FLINK-7038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16071033#comment-16071033 ]

ASF GitHub Bot commented on FLINK-7038:
---------------------------------------

Github user tzulitai commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4229#discussion_r125155746

    --- Diff: flink-core/src/main/java/org/apache/flink/api/common/state/AggregatingState.java ---
    @@ -33,7 +33,7 @@
      * The state is accessed and modified by user functions, and checkpointed consistently
      * by the system as part of the distributed snapshots.
      *
     - * The state is only accessible by functions applied on a KeyedDataStream. The key is
     + * The state is only accessible by functions applied on a KeyedStream. The key is

    --- End diff --

    Could you also change these to be actual Javadoc links? i.e. `{@link KeyedStream}`.

> Several misused "KeyedDataStream" term in docs and Javadocs
> -----------------------------------------------------------
>
>                 Key: FLINK-7038
>                 URL: https://issues.apache.org/jira/browse/FLINK-7038
>             Project: Flink
>          Issue Type: Bug
>          Components: Documentation
>    Affects Versions: 1.3.1, 1.4.0
>            Reporter: Tzu-Li (Gordon) Tai
>            Assignee: mingleizhang
>            Priority: Trivial
>              Labels: starter
>
> The correct term should be {{KeyedStream}}.
> For example, in
> https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/datastream_api.html,
> it says "See keys on how to specify keys. This transformation returns a
> *KeyedDataStream*."

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (FLINK-7038) Several misused "KeyedDataStream" term in docs and Javadocs
[ https://issues.apache.org/jira/browse/FLINK-7038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16071034#comment-16071034 ]

ASF GitHub Bot commented on FLINK-7038:
---------------------------------------

Github user tzulitai commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4229#discussion_r125155753

    --- Diff: flink-core/src/main/java/org/apache/flink/api/common/state/State.java ---
    @@ -23,7 +23,7 @@
     /**
      * Interface that different types of partitioned state must implement.
      *
     - * The state is only accessible by functions applied on a KeyedDataStream. The key is
     + * The state is only accessible by functions applied on a KeyedStream. The key is

    --- End diff --

    Could you also change these to be actual Javadoc links? i.e. `{@link KeyedStream}`.
[GitHub] flink pull request #4229: [FLINK-7038] [docs] Correct misused term to KeyedS...
Github user tzulitai commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4229#discussion_r125155752

    --- Diff: flink-core/src/main/java/org/apache/flink/api/common/state/FoldingState.java ---
    @@ -28,7 +28,7 @@
      * The state is accessed and modified by user functions, and checkpointed consistently
      * by the system as part of the distributed snapshots.
      *
     - * The state is only accessible by functions applied on a KeyedDataStream. The key is
     + * The state is only accessible by functions applied on a KeyedStream. The key is

    --- End diff --

    Could you also change these to be actual Javadoc links? i.e. `{@link KeyedStream}`.

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
[jira] [Commented] (FLINK-7038) Several misused "KeyedDataStream" term in docs and Javadocs
[ https://issues.apache.org/jira/browse/FLINK-7038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16071035#comment-16071035 ]

ASF GitHub Bot commented on FLINK-7038:
---------------------------------------

Github user tzulitai commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4229#discussion_r125155750

    --- Diff: flink-core/src/main/java/org/apache/flink/api/common/state/AppendingState.java ---
    @@ -29,7 +29,7 @@
      * The state is accessed and modified by user functions, and checkpointed consistently
      * by the system as part of the distributed snapshots.
      *
     - * The state is only accessible by functions applied on a KeyedDataStream. The key is
     + * The state is only accessible by functions applied on a KeyedStream. The key is

    --- End diff --

    Could you also change these to be actual Javadoc links? i.e. `{@link KeyedStream}`.
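The fix suggested in the review comments above replaces the plain-text class name with a `{@link}` tag, so the javadoc tool checks the reference and renders it as a link. A minimal sketch (not Flink's actual source; `KeyedStream` here is a local stand-in for `org.apache.flink.streaming.api.datastream.KeyedStream`):

```java
/** Local stand-in for the real KeyedStream class, so the snippet compiles on its own. */
class KeyedStream {}

/**
 * Interface that different types of partitioned state must implement.
 *
 * <p>The state is only accessible by functions applied on a {@link KeyedStream}.
 * Unlike the old plain-text "KeyedDataStream", the {@link} reference is verified
 * by the javadoc tool and rendered as a hyperlink in the generated docs.
 */
interface State {
    /** Removes the value mapped under the key of the current input element. */
    void clear();
}
```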
[jira] [Commented] (FLINK-6674) Update release 1.3 docs
[ https://issues.apache.org/jira/browse/FLINK-6674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16071032#comment-16071032 ]

ASF GitHub Bot commented on FLINK-6674:
---------------------------------------

Github user tzulitai commented on the issue:

    https://github.com/apache/flink/pull/4211

    Merging ...

> Update release 1.3 docs
> -----------------------
>
>                 Key: FLINK-6674
>                 URL: https://issues.apache.org/jira/browse/FLINK-6674
>             Project: Flink
>          Issue Type: Improvement
>          Components: Documentation
>    Affects Versions: 1.3.0
>            Reporter: Nico Kruber
>             Fix For: 1.3.0
>
> Umbrella issue to track required updates to the documentation for the 1.3
> release.
[GitHub] flink issue #4211: [FLINK-6674] [FLINK-6680] [docs] Update migration docs fo...
Github user tzulitai commented on the issue:

    https://github.com/apache/flink/pull/4211

    Merging ...
[jira] [Commented] (FLINK-7060) Change annotation in TypeInformation subclasses
[ https://issues.apache.org/jira/browse/FLINK-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070908#comment-16070908 ]

Greg Hogan commented on FLINK-7060:
-----------------------------------

The {{Public}} annotation denotes stable inheritance, so this looks to be as intended and not something we can change without breaking the API.

> Change annotation in TypeInformation subclasses
> -----------------------------------------------
>
>                 Key: FLINK-7060
>                 URL: https://issues.apache.org/jira/browse/FLINK-7060
>             Project: Flink
>          Issue Type: Improvement
>          Components: Type Serialization System
>            Reporter: Kostas Kloudas
>            Priority: Minor
>
> Currently many subclasses of {{TypeInformation}} are annotated as
> {{Public}} but all their methods are {{PublicEvolving}}. As an example, you
> can check out {{GenericTypeInfo}}.
> It would be clearer if the whole class was annotated with {{PublicEvolving}}
> instead of {{Public}} and have all the methods without annotations.
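The shape being discussed (class annotated `@Public`, i.e. stable including for inheritance, while the individual methods are only `@PublicEvolving`) can be illustrated with minimal stand-in annotations. The real annotations live in `org.apache.flink.annotation`; simplified copies are defined here so the sketch compiles on its own.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Simplified stand-ins for org.apache.flink.annotation.Public / PublicEvolving.
@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
@interface Public {}

@Retention(RetentionPolicy.RUNTIME) @Target({ElementType.TYPE, ElementType.METHOD})
@interface PublicEvolving {}

// The pattern the issue describes, using a hypothetical class modeled on
// GenericTypeInfo: the class itself is @Public, every method is @PublicEvolving.
@Public
class GenericTypeInfoLike {
    @PublicEvolving
    public boolean isBasicType() { return false; }
}
```

Changing the class-level annotation from `@Public` to `@PublicEvolving` would relax the stability guarantee of the type itself, which is why the comment above treats it as an API-breaking change.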
[jira] [Commented] (FLINK-6877) Activate checkstyle for runtime/security
[ https://issues.apache.org/jira/browse/FLINK-6877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070869#comment-16070869 ]

ASF GitHub Bot commented on FLINK-6877:
---------------------------------------

Github user hsaputra commented on the issue:

    https://github.com/apache/flink/pull/4095

    +1 for merging. The longer it waits, the more conflicts it will cause.

> Activate checkstyle for runtime/security
> ----------------------------------------
>
>                 Key: FLINK-6877
>                 URL: https://issues.apache.org/jira/browse/FLINK-6877
>             Project: Flink
>          Issue Type: Improvement
>          Components: Local Runtime
>    Affects Versions: 1.4.0
>            Reporter: Greg Hogan
>            Assignee: Greg Hogan
>            Priority: Trivial
>             Fix For: 1.4.0
[jira] [Commented] (FLINK-6075) Support Limit/Top(Sort) for Stream SQL
[ https://issues.apache.org/jira/browse/FLINK-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070721#comment-16070721 ]

ASF GitHub Bot commented on FLINK-6075:
---------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/3889

> Support Limit/Top(Sort) for Stream SQL
> --------------------------------------
>
>                 Key: FLINK-6075
>                 URL: https://issues.apache.org/jira/browse/FLINK-6075
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API & SQL
>            Reporter: radu
>              Labels: features
>         Attachments: sort.png
>
> These will be split into 3 separate JIRA issues. However, the design is the
> same; only the processing function differs in terms of the output. Hence, the
> design is the same for all of them.
>
> Time target: Proc Time
>
> **SQL targeted query examples:**
>
> *Sort example*
> Q1) `SELECT a FROM stream1 GROUP BY HOP(proctime, INTERVAL '1' HOUR, INTERVAL '3' HOUR) ORDER BY b`
> Comment: window is defined using GROUP BY
> Comment: ASC or DESC keywords can be placed to mark the ordering type
>
> *Limit example*
> Q2) `SELECT a FROM stream1 WHERE rowtime BETWEEN current_timestamp - INTERVAL '1' HOUR AND current_timestamp ORDER BY b LIMIT 10`
> Comment: window is defined using time ranges in the WHERE clause
> Comment: window is row triggered
>
> *Top example*
> Q3) `SELECT sum(a) OVER (ORDER BY proctime RANGE INTERVAL '1' HOUR PRECEDING LIMIT 10) FROM stream1`
> Comment: limit over the contents of the sliding window
>
> General comments:
> - All these SQL clauses are supported only over windows (bounded collections of data).
> - Each of the 3 operators will be supported with each of the types of expressing the windows.
>
> **Description**
> The 3 operations (limit, top and sort) are similar in behavior, as they all
> require a sorted collection of the data on which the logic will be applied
> (i.e., select a subset of the items or the entire sorted set). These
> functions would make sense in the streaming context only within a window.
> Without defining a window, the functions could never emit, as the sort
> operation would never trigger. If an SQL query is provided without limits,
> an error will be thrown (`SELECT a FROM stream1 TOP 10` -> ERROR).
> Although not targeted by this JIRA, in the case of working based on event
> time order, the retraction mechanisms of windows and the lateness mechanisms
> can be used to deal with out-of-order events and retraction/updates of
> results.
>
> **Functionality example**
> We exemplify with the query below for all 3 types of operators (sorting,
> limit and top). Rowtime indicates when the HOP window will trigger, which
> can be observed in the fact that outputs are generated only at those moments.
> The HOP windows will trigger at every hour (fixed hour) and each event will
> contribute to / be duplicated for 2 consecutive hour intervals. Proctime
> indicates the processing time when a new event arrives in the system. Events
> are of the type (a,b) with the ordering being applied on the b field.
>
> `SELECT a FROM stream1 HOP(proctime, INTERVAL '1' HOUR, INTERVAL '2' HOUR) ORDER BY b (LIMIT 2 / TOP 2 / [ASC/DESC])`
>
> ||Rowtime||Proctime||Stream1||Limit 2||Top 2||Sort [ASC]||
> | |10:00:00|(aaa, 11)| | | |
> | |10:05:00|(aab, 7)| | | |
> |10-11|11:00:00| |aab,aaa|aab,aaa|aab,aaa|
> | |11:03:00|(aac,21)| | | |
> |11-12|12:00:00| |aab,aaa|aab,aaa|aab,aaa,aac|
> | |12:10:00|(abb,12)| | | |
> | |12:15:00|(abb,12)| | | |
> |12-13|13:00:00| |abb,abb|abb,abb|abb,abb,aac|
> |...|
>
> **Implementation option**
> Considering that the SQL operators will be associated with window boundaries,
> the functionality will be implemented within the logic of the window as follows.
> * Window assigner - selected based on the type of window used in SQL (TUMBLING, SLIDING...)
> * Evictor/Trigger - time or count evictor based on the definition of the window boundaries
> * Apply - window function that sorts data and selects the output to trigger (based on LIMIT/TOP parameters). All data will be sorted at once and the result outputted when the window is triggered.
>
> An alternative implementation can be to use a fold window function to sort
> the elements as they arrive, one at a time, followed by a flatMap to filter
> the number of outputs.
>
> !sort.png!
>
> **General logic of Join**
> ```
> inputDataStream.
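The "sort the buffered window contents at once, then cut to the LIMIT/TOP parameter" step described in the implementation option above can be sketched with plain collections. This is illustrative only; the names below are not Flink API, and records are represented as `(a, b)` string pairs with the sort on the integer field `b`, as in the example table.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class WindowSortSketch {
    /**
     * Emulates the apply step for one triggered window: sort the buffered
     * (a, b) records ascending on field b, then keep at most 'limit' of them.
     */
    static List<String[]> sortAndLimit(List<String[]> window, int limit) {
        List<String[]> sorted = new ArrayList<>(window);
        sorted.sort(Comparator.comparingInt((String[] r) -> Integer.parseInt(r[1])));
        return new ArrayList<>(sorted.subList(0, Math.min(limit, sorted.size())));
    }
}
```

With the window contents from the table at rowtime 11-12, namely (aaa,11), (aab,7), (aac,21), a limit of 2 yields aab then aaa, matching the "Limit 2" column.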
[jira] [Commented] (FLINK-7044) Add methods to the client API that take the stateDescriptor.
[ https://issues.apache.org/jira/browse/FLINK-7044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070478#comment-16070478 ]

ASF GitHub Bot commented on FLINK-7044:
---------------------------------------

Github user kl0u commented on the issue:

    https://github.com/apache/flink/pull/4225

    @aljoscha thanks for the review, I updated the PR.

> Add methods to the client API that take the stateDescriptor.
> ------------------------------------------------------------
>
>                 Key: FLINK-7044
>                 URL: https://issues.apache.org/jira/browse/FLINK-7044
>             Project: Flink
>          Issue Type: Improvement
>          Components: Queryable State
>    Affects Versions: 1.3.0, 1.3.1
>            Reporter: Kostas Kloudas
>            Assignee: Kostas Kloudas
>             Fix For: 1.4.0
[GitHub] flink issue #4225: [FLINK-7044] [queryable-st] Allow to specify namespace an...
Github user kl0u commented on the issue:

    https://github.com/apache/flink/pull/4225

    @aljoscha thanks for the review, I updated the PR.
[jira] [Created] (FLINK-7060) Change annotation in TypeInformation subclasses
Kostas Kloudas created FLINK-7060:
-------------------------------------

             Summary: Change annotation in TypeInformation subclasses
                 Key: FLINK-7060
                 URL: https://issues.apache.org/jira/browse/FLINK-7060
             Project: Flink
          Issue Type: Improvement
          Components: Type Serialization System
            Reporter: Kostas Kloudas
            Priority: Minor

Currently many subclasses of {{TypeInformation}} are annotated as {{Public}} but all their methods are {{PublicEvolving}}. As an example, you can check out {{GenericTypeInfo}}.

It would be clearer if the whole class was annotated with {{PublicEvolving}} instead of {{Public}} and have all the methods without annotations.
[GitHub] flink issue #4209: [FLINK-7030] Build with scala-2.11 by default
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/4209

    I'm aware of that. I will merge it over the weekend with a bunch of other PRs, when the Flink Travis isn't so busy.
[jira] [Commented] (FLINK-7030) Build with scala-2.11 by default
[ https://issues.apache.org/jira/browse/FLINK-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070451#comment-16070451 ]

ASF GitHub Bot commented on FLINK-7030:
---------------------------------------

Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/4209

    I'm aware of that. I will merge it over the weekend with a bunch of other PRs, when the Flink Travis isn't so busy.

> Build with scala-2.11 by default
> --------------------------------
>
>                 Key: FLINK-7030
>                 URL: https://issues.apache.org/jira/browse/FLINK-7030
>             Project: Flink
>          Issue Type: Improvement
>          Components: Build System
>            Reporter: Piotr Nowojski
>            Assignee: Piotr Nowojski
>
> As proposed recently on the dev mailing list, I propose to switch to Scala
> 2.11 as the default and to have a Scala 2.10 build profile; now it is the
> other way around. The reason for this is poor support for build profiles in
> IntelliJ: I was unable to make it work after I added the Kafka 0.11
> dependency (Kafka 0.11 dropped support for Scala 2.10).
[jira] [Commented] (FLINK-6043) Display time when exceptions/root cause of failure happened
[ https://issues.apache.org/jira/browse/FLINK-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070449#comment-16070449 ]

ASF GitHub Bot commented on FLINK-6043:
---------------------------------------

Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/3583#discussion_r125088693

    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java ---
    @@ -1070,9 +1088,9 @@ public void suspend(Throwable suspensionCause) {
      * exceptions that indicate a bug or an unexpected call race), and where a full restart is the
      * safe way to get consistency back.
      *
     - * @param t The exception that caused the failure.
     + * @param errorInfo ErrorInfo containing the exception that caused the failure.
      */
     - public void failGlobal(Throwable t) {
     + public void failGlobal(ErrorInfo errorInfo) {

    --- End diff --

    Why don't we do a null check for `errorInfo` similar to the one in `suspend`? I think it should be fine to assume that this parameter is not null.

> Display time when exceptions/root cause of failure happened
> -----------------------------------------------------------
>
>                 Key: FLINK-6043
>                 URL: https://issues.apache.org/jira/browse/FLINK-6043
>             Project: Flink
>          Issue Type: Improvement
>          Components: Webfrontend
>    Affects Versions: 1.3.0
>            Reporter: Till Rohrmann
>            Assignee: Chesnay Schepler
>            Priority: Minor
>
> In order to better understand the behaviour of Flink jobs, it would be nice
> to add timestamp information to exceptions causing the job to restart or to
> fail. This information could then be displayed in the web UI, making it easier
> for the user to understand what happened when.
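The fail-fast null check suggested in the review above can be sketched as follows. Flink itself has `Preconditions.checkNotNull` in `org.apache.flink.util`; the JDK's `Objects.requireNonNull` stands in here, and both `ErrorInfo` and `ExecutionGraphSketch` are minimal hypothetical stand-ins, not the real classes.

```java
import java.util.Objects;

/** Minimal stand-in for the ErrorInfo type referenced in the diff. */
class ErrorInfo {
    final Throwable exception;
    final long timestamp;

    ErrorInfo(Throwable exception, long timestamp) {
        this.exception = exception;
        this.timestamp = timestamp;
    }
}

class ExecutionGraphSketch {
    private ErrorInfo failureInfo;

    /** Fails the graph globally; errorInfo must not be null (fail-fast check). */
    void failGlobal(ErrorInfo errorInfo) {
        this.failureInfo = Objects.requireNonNull(errorInfo, "errorInfo must not be null");
    }

    ErrorInfo getFailureInfo() {
        return failureInfo;
    }
}
```

Validating at the entry point means a null `errorInfo` fails immediately at the caller, rather than surfacing later as a confusing NullPointerException deep inside failure handling.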
[jira] [Updated] (FLINK-7003) "select * from" in Flink SQL should not flatten all fields in the table by default
[ https://issues.apache.org/jira/browse/FLINK-7003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shuyi Chen updated FLINK-7003:
------------------------------
    Description:
Currently, CompositeRelDataType is extended from RelRecordType(StructKind.PEEK_FIELDS, ...). In Calcite, StructKind.PEEK_FIELDS allows us to peek fields for nested types. However, when we use "select * from", Calcite will flatten all nested fields that are marked as StructKind.PEEK_FIELDS in the table.

For example, if the table structure *T* is as follows:

{code:java}
VARCHAR K0,
VARCHAR C1,
RecordType:peek_no_flattening(INTEGER C0, INTEGER C1) F0,
RecordType:peek_no_flattening(INTEGER C0, INTEGER C2) F1
{code}

the following query

{code:java}
Select * from T
{code}

should return (K0, C1, F0, F1) instead of (K0, C1, F0.C0, F0.C1, F1.C0, F1.C1), which is the current behavior.

After upgrading to Calcite 1.14, this issue should change the type of {{CompositeRelDataType}} to {{StructKind.PEEK_FIELDS_NO_EXPAND}}.

  was:
Currently, CompositeRelDataType is extended from RelRecordType(StructKind.PEEK_FIELDS, ...). In Calcite, StructKind.PEEK_FIELDS allows us to peek fields for nested types. However, when we use "select * from", Calcite will flatten all nested fields that are marked as StructKind.PEEK_FIELDS in the table.

For example, if the table structure *T* is as follows:

{code:java}
VARCHAR K0,
VARCHAR C1,
RecordType:peek_no_flattening(INTEGER C0, INTEGER C1) F0,
RecordType:peek_no_flattening(INTEGER C0, INTEGER C2) F1
{code}

the following query

{code:java}
Select * from T
{code}

should return (K0, C1, F0, F1) instead of (K0, C1, F0.C0, F0.C1, F1.C0, F1.C1), which is the current behavior.

After upgrading to Calcite 1.14, this issue should change the type of {{CompositeRelDataType}} to {{StructKind.PEEK_FIELDS_NO_FLATTENING}}.

> "select * from" in Flink SQL should not flatten all fields in the table by default
> ----------------------------------------------------------------------------------
>
>                 Key: FLINK-7003
>                 URL: https://issues.apache.org/jira/browse/FLINK-7003
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API & SQL
>            Reporter: Shuyi Chen
[jira] [Created] (FLINK-7059) Queryable state does not work with ListState
Kostas Kloudas created FLINK-7059:
-------------------------------------

             Summary: Queryable state does not work with ListState
                 Key: FLINK-7059
                 URL: https://issues.apache.org/jira/browse/FLINK-7059
             Project: Flink
          Issue Type: Bug
          Components: Queryable State
            Reporter: Kostas Kloudas
             Fix For: 1.4.0

The serialization format of the list state follows that of RocksDB (comma-separated binaries without the size of the list), which is incompatible with that of our ListSerializer. For reference, you can look up {{HeapListState.getSerializedValue()}}.
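The incompatibility described above can be sketched with two toy encodings of the same list: one joining the raw element bytes with a ',' delimiter and no lengths (the RocksDB-style format the issue mentions), and one carrying explicit lengths for contrast. Both are illustrative stand-ins, not Flink's actual serializer code; a reader expecting one format cannot decode the other.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.List;

class ListFormatSketch {
    /** Delimiter-based: element bytes joined with ',', no per-element length. */
    static byte[] commaSeparated(List<byte[]> elems) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < elems.size(); i++) {
            if (i > 0) {
                out.write(',');            // single delimiter byte between elements
            }
            out.write(elems.get(i), 0, elems.get(i).length);
        }
        return out.toByteArray();
    }

    /** Length-aware (for contrast): each element preceded by a 4-byte length. */
    static byte[] lengthPrefixed(List<byte[]> elems) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] e : elems) {
            byte[] len = ByteBuffer.allocate(4).putInt(e.length).array();
            out.write(len, 0, len.length);
            out.write(e, 0, e.length);
        }
        return out.toByteArray();
    }
}
```

The delimiter format is also ambiguous when an element's own bytes contain the delimiter value, which is one reason the two sides must agree on a single format.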
[jira] [Commented] (FLINK-7058) flink-scala-shell unintended dependencies for scala 2.11
[ https://issues.apache.org/jira/browse/FLINK-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070419#comment-16070419 ]

ASF GitHub Bot commented on FLINK-7058:
---------------------------------------

GitHub user pnowojski opened a pull request:

    https://github.com/apache/flink/pull/4240

    [FLINK-7058] Fix scala-2.10 dependencies

    First commit is from #4209 and should be ignored in this PR.

    Before the fix:
    ```
    $ mvn dependency:tree -pl flink-scala | grep quasi
    [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
    $ mvn dependency:tree -pl flink-scala -Pscala-2.11 | grep quasi
    [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
    $ mvn dependency:tree -pl flink-scala -Pscala-2.10 | grep quasi
    [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
    ```

    After the fix:
    ```
    $ mvn dependency:tree -pl flink-scala | grep quasi
    $ mvn dependency:tree -pl flink-scala -Pscala-2.11 | grep quasi
    $ mvn dependency:tree -pl flink-scala -Pscala-2.10 | grep quasi
    [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
    ```

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/pnowojski/flink scala210

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4240.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4240

commit a17b0d4aee3c116761871513b2ef073bf8a98750
Author: Piotr Nowojski
Date:   2017-06-23T11:41:55Z

    [FLINK-7030] Build with scala-2.11 by default

commit 58a3b7b0a936da0148de4ddb5b9a6b2c3bccc335
Author: Piotr Nowojski
Date:   2017-06-30T16:19:59Z

    [FLINK-7058] Fix scala-2.10 dependencies

> flink-scala-shell unintended dependencies for scala 2.11
> --------------------------------------------------------
>
>                 Key: FLINK-7058
>                 URL: https://issues.apache.org/jira/browse/FLINK-7058
>             Project: Flink
>          Issue Type: Bug
>    Affects Versions: 1.3.0, 1.3.1
>            Reporter: Piotr Nowojski
>            Assignee: Piotr Nowojski
>            Priority: Minor
>
> Activation of profile scala-2.10 in `flink-scala-shell` and `flink-scala` does
> not work as intended:
>
> {code:xml}
> <profile>
>   <id>scala-2.10</id>
>   <activation>
>     <property>
>       <name>!scala-2.11</name>
>     </property>
>   </activation>
>   <dependencies>
>     <dependency>
>       <groupId>org.scalamacros</groupId>
>       <artifactId>quasiquotes_2.10</artifactId>
>       <version>${scala.macros.version}</version>
>     </dependency>
>     <dependency>
>       <groupId>org.scala-lang</groupId>
>       <artifactId>jline</artifactId>
>       <version>2.10.4</version>
>     </dependency>
>   </dependencies>
> </profile>
> {code}
>
> This activation IMO has nothing to do with the `-Pscala-2.11` profile switch
> used in our build; "properties" are defined by `-Dproperty` switches. As far
> as I understand it, those additional dependencies would be added only if
> nobody defined a property named `scala-2.11`, which means they would be added
> only if the switch `-Dscala-2.11` was not used, so it seems those dependencies
> were basically added always. This quick test proves that I'm correct:
>
> {code:bash}
> $ mvn dependency:tree -pl flink-scala | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> $ mvn dependency:tree -pl flink-scala -Pscala-2.11 | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> $ mvn dependency:tree -pl flink-scala -Pscala-2.10 | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> {code}
>
> Regardless of the selected profile, those dependencies are always there.
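Because Maven property activation keys on `-D` flags rather than `-P` profile selection, one hypothetical way to make these dependencies appear only in a 2.10 build is to activate the profile on a property the build explicitly sets. This is a sketch of the activation idea only, not necessarily the change the PR implements:

{code:xml}
<profile>
  <id>scala-2.10</id>
  <activation>
    <property>
      <!-- activates only when -Dscala-2.10 is passed on the command line,
           instead of keying on the absence of a "scala-2.11" property that
           -Pscala-2.11 never defines -->
      <name>scala-2.10</name>
    </property>
  </activation>
  ...
</profile>
{code}

Alternatively, dropping the `<activation>` block entirely and selecting the profile explicitly with `-Pscala-2.10` avoids the property/profile mismatch altogether.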
[GitHub] flink pull request #4240: [FLINK-7058] Fix scala-2.10 dependencies
GitHub user pnowojski opened a pull request: https://github.com/apache/flink/pull/4240 [FLINK-7058] Fix scala-2.10 dependencies First commit is from #4209 and should be ignored in this PR Before fixup: ``` $ mvn dependency:tree -pl flink-scala | grep quasi [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile $ mvn dependency:tree -pl flink-scala -Pscala-2.11 | grep quasi [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile $ mvn dependency:tree -pl flink-scala -Pscala-2.10 | grep quasi [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile ``` After fixup: ``` $ mvn dependency:tree -pl flink-scala | grep quasi $ mvn dependency:tree -pl flink-scala -Pscala-2.11 | grep quasi $ mvn dependency:tree -pl flink-scala -Pscala-2.10 | grep quasi [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile ``` You can merge this pull request into a Git repository by running: $ git pull https://github.com/pnowojski/flink scala210 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/4240.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4240 commit a17b0d4aee3c116761871513b2ef073bf8a98750 Author: Piotr Nowojski Date: 2017-06-23T11:41:55Z [FLINK-7030] Build with scala-2.11 by default commit 58a3b7b0a936da0148de4ddb5b9a6b2c3bccc335 Author: Piotr Nowojski Date: 2017-06-30T16:19:59Z [FLINK-7058] Fix scala-2.10 dependencies --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Updated] (FLINK-7058) flink-scala-shell unintended dependencies for scala 2.11
[ https://issues.apache.org/jira/browse/FLINK-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-7058: -- Affects Version/s: 1.3.0 1.3.1

> flink-scala-shell unintended dependencies for scala 2.11
[jira] [Assigned] (FLINK-7058) flink-scala-shell unintended dependencies for scala 2.11
[ https://issues.apache.org/jira/browse/FLINK-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski reassigned FLINK-7058: - Assignee: Piotr Nowojski

> flink-scala-shell unintended dependencies for scala 2.11
[jira] [Created] (FLINK-7058) flink-scala-shell unintended dependencies for scala 2.11
Piotr Nowojski created FLINK-7058: - Summary: flink-scala-shell unintended dependencies for scala 2.11 Key: FLINK-7058 URL: https://issues.apache.org/jira/browse/FLINK-7058 Project: Flink Issue Type: Bug Reporter: Piotr Nowojski Priority: Minor

Activation of profile scala-2.10 in `flink-scala-shell` and `flink-scala` does not work as intended.

{code:xml}
<profile>
  <id>scala-2.10</id>
  <activation>
    <property>
      <name>!scala-2.11</name>
    </property>
  </activation>
  <dependencies>
    <dependency>
      <groupId>org.scalamacros</groupId>
      <artifactId>quasiquotes_2.10</artifactId>
      <version>${scala.macros.version}</version>
    </dependency>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>jline</artifactId>
      <version>2.10.4</version>
    </dependency>
  </dependencies>
</profile>
{code}

This activation IMO has nothing to do with the `-Pscala-2.11` profile switch used in our build. "properties" are defined by `-Dproperty` switches. As far as I understand it, those additional dependencies would be added only if nobody defined a property named `scala-2.11`, which means they would be added only if the switch `-Dscala-2.11` was not used, so it seems those dependencies were basically always added. This quick test proves that I'm correct:

{code:bash}
$ mvn dependency:tree -pl flink-scala | grep quasi
[INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
$ mvn dependency:tree -pl flink-scala -Pscala-2.11 | grep quasi
[INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
$ mvn dependency:tree -pl flink-scala -Pscala-2.10 | grep quasi
[INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
{code}

Regardless of the selected profile, those dependencies are always there.
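For contrast, a profile meant to be opt-in only can be activated on the presence of its own property rather than on the absence of an unrelated one. The following is an illustrative sketch (not necessarily the exact fix applied in the PR) of a `scala-2.10` profile that activates only when `-Dscala-2.10` is passed:

```xml
<!-- Hypothetical sketch: this profile activates only when the build is run
     with `mvn ... -Dscala-2.10`, instead of activating whenever the unrelated
     property `scala-2.11` happens to be absent. -->
<profile>
  <id>scala-2.10</id>
  <activation>
    <property>
      <!-- presence-based activation: matches `-Dscala-2.10` -->
      <name>scala-2.10</name>
    </property>
  </activation>
  <dependencies>
    <dependency>
      <groupId>org.scalamacros</groupId>
      <artifactId>quasiquotes_2.10</artifactId>
      <version>${scala.macros.version}</version>
    </dependency>
  </dependencies>
</profile>
```

With this shape, the default build and `-Pscala-2.11` builds never pull in the 2.10-only dependencies, matching the "after fixup" `dependency:tree` output shown above.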
[jira] [Commented] (FLINK-7030) Build with scala-2.11 by default
[ https://issues.apache.org/jira/browse/FLINK-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070388#comment-16070388 ] ASF GitHub Bot commented on FLINK-7030: --- Github user pnowojski commented on the issue: https://github.com/apache/flink/pull/4209 @zentol build is passing :)

> Build with scala-2.11 by default
> -
>
> Key: FLINK-7030
> URL: https://issues.apache.org/jira/browse/FLINK-7030
> Project: Flink
> Issue Type: Improvement
> Components: Build System
>Reporter: Piotr Nowojski
>Assignee: Piotr Nowojski
>
> As proposed recently on the dev mailing list.
> I propose to switch to Scala 2.11 as the default and to have a Scala 2.10 build
> profile. Now it is the other way around. The reason for that is poor support
> for build profiles in IntelliJ; I was unable to make it work after I added the
> Kafka 0.11 dependency (Kafka 0.11 dropped support for Scala 2.10).
[GitHub] flink issue #4209: [FLINK-7030] Build with scala-2.11 by default
Github user pnowojski commented on the issue: https://github.com/apache/flink/pull/4209 @zentol build is passing :)
[GitHub] flink pull request #4228: Flink-7035 Automatically identify AWS Region, simp...
Github user mpouttuclarke commented on a diff in the pull request: https://github.com/apache/flink/pull/4228#discussion_r125076287 --- Diff: docs/dev/connectors/kinesis.md --- @@ -72,12 +72,80 @@ Before consuming data from Kinesis streams, make sure that all streams are creat {% highlight java %} -Properties consumerConfig = new Properties(); -consumerConfig.put(ConsumerConfigConstants.AWS_REGION, "us-east-1"); +StreamExecutionEnvironment env = StreamExecutionEnvironment.getEnvironment(); + +DataStream kinesis = env.addSource(new FlinkKinesisConsumer<>( +"kinesis_stream_name", new SimpleStringSchema(), ConsumerConfigConstants.InitialPosition.LATEST)); +{% endhighlight %} + + +{% highlight scala %} +val env = StreamExecutionEnvironment.getEnvironment + +val kinesis = env.addSource(new FlinkKinesisConsumer[String]( +"kinesis_stream_name", new SimpleStringSchema, ConsumerConfigConstants.InitialPosition.LATEST)) +{% endhighlight %} + + + +The above is a simple example of using the Kinesis consumer when running on an Amazon Linux node (such as in EMR or AWS Lambda). +The AWS APIs automatically provide the authentication credentials and region when available. For unit testing, the ability to +set test configuration is provided using KinesisConfigUtil. 
+
+{% highlight java %}
+Properties testConfig = new Properties();
+testConfig.put(ConsumerConfigConstants.AWS_REGION, "us-east-1");
+testConfig.put(ConsumerConfigConstants.AWS_ACCESS_KEY_ID, "aws_access_key_id");
+testConfig.put(ConsumerConfigConstants.AWS_SECRET_ACCESS_KEY, "aws_secret_access_key");
+testConfig.put(ConsumerConfigConstants.STREAM_INITIAL_POSITION, "LATEST");
+
+KinesisConfigUtil.setDefaultTestProperties(testConfig);
+
+// Automatically uses testConfig without having to modify job flow
+StreamExecutionEnvironment env = StreamExecutionEnvironment.getEnvironment();
+DataStream kinesis = env.addSource(new FlinkKinesisConsumer<>(
+"kinesis_stream_name", new SimpleStringSchema()));
+{% endhighlight %}
+
+{% highlight scala %}
+val testConfig = new Properties();
+testConfig.put(ConsumerConfigConstants.AWS_REGION, "us-east-1");
+testConfig.put(ConsumerConfigConstants.AWS_ACCESS_KEY_ID, "aws_access_key_id");
+testConfig.put(ConsumerConfigConstants.AWS_SECRET_ACCESS_KEY, "aws_secret_access_key");
+testConfig.put(ConsumerConfigConstants.STREAM_INITIAL_POSITION, "LATEST");
+
+KinesisConfigUtil.setDefaultTestProperties(testConfig);
+
+// Automatically uses testConfig without having to modify job flow
+val env = StreamExecutionEnvironment.getEnvironment
+val kinesis = env.addSource(new FlinkKinesisConsumer[String](
+"kinesis_stream_name", new SimpleStringSchema))
+{% endhighlight %}
+
+Configuration for the consumer can also be supplied with `java.util.Properties` for use on non-Amazon Linux hardware,
+or in the case that other stream consumer properties need to be tuned.
+
+Please note it is strongly recommended to use Kinesis streams within the same availability zone they originate in.
--- End diff --
We completely restrict cross-region network traffic except in special circumstances within Amazon because of these reasons.
These are simply lessons learned from scaling our systems globally, where situations arise not only due to performance concerns but also regulatory issues, such as EU data laws restricting data egress. Customers should be aware of these issues and discuss them with support before going down this path.
[GitHub] flink pull request #4228: Flink-7035 Automatically identify AWS Region, simp...
Github user mpouttuclarke commented on a diff in the pull request: https://github.com/apache/flink/pull/4228#discussion_r125074194 --- Diff: flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/util/KinesisConfigUtil.java --- @@ -171,7 +172,14 @@ public static void validateAwsConfiguration(Properties config) { } if (!config.containsKey(AWSConfigConstants.AWS_REGION)) { - throw new IllegalArgumentException("The AWS region ('" + AWSConfigConstants.AWS_REGION + "') must be set in the config."); + final Region currentRegion = Regions.getCurrentRegion(); + if (currentRegion != null) { + config.setProperty(AWSConfigConstants.AWS_REGION, currentRegion.getName()); + } else { + throw new IllegalArgumentException("The AWS region could not be identified automatically from the AWS API. " + --- End diff -- Yes I like that wording.
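The fallback being reviewed ("use the configured region if present, otherwise ask the environment, and fail only if both are absent") can be sketched in plain Java. This is an illustrative model, not the connector's actual code: the `regionProbe` supplier stands in for the AWS SDK's `Regions.getCurrentRegion()` metadata lookup, and `AWS_REGION` is just a stand-in property key.

```java
import java.util.Properties;
import java.util.function.Supplier;

public class RegionFallbackSketch {
    // Stand-in for AWSConfigConstants.AWS_REGION (illustrative key name).
    static final String AWS_REGION = "aws.region";

    /**
     * Uses the explicitly configured region if present; otherwise probes the
     * environment (e.g. EC2 instance metadata), failing only if both are absent.
     */
    static String resolveRegion(Properties config, Supplier<String> regionProbe) {
        if (!config.containsKey(AWS_REGION)) {
            String current = regionProbe.get(); // environment lookup; may be null off-EC2
            if (current != null) {
                config.setProperty(AWS_REGION, current);
            } else {
                throw new IllegalArgumentException(
                    "The AWS region could not be identified automatically; "
                        + "please set '" + AWS_REGION + "' in the config.");
            }
        }
        return config.getProperty(AWS_REGION);
    }
}
```

An explicitly configured region always wins; the probe is consulted only when the key is missing, which mirrors the diff above.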
[GitHub] flink pull request #4228: Flink-7035 Automatically identify AWS Region, simp...
Github user mpouttuclarke commented on a diff in the pull request: https://github.com/apache/flink/pull/4228#discussion_r125073534 --- Diff: flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/util/KinesisConfigUtil.java --- @@ -171,7 +172,14 @@ public static void validateAwsConfiguration(Properties config) { } if (!config.containsKey(AWSConfigConstants.AWS_REGION)) { --- End diff -- The new constructors make the easy path the right path. We go through a lot of trouble at Amazon to make sure that the default constructors do the right thing with the minimal amount of effort. Yet people still set things like region and auth manually when it is not only unnecessary but also a security, performance, and compliance risk. Wherever we can we should try to follow the example of the AWS SDK and provide for using it correctly. Overall, I would make the argument that using property files and statics isn't a best practice. There really should be type safe POJOs and dependency injection in place for configuration of the consumer but that is a larger issue than I can take on right now. The new constructors attempt to add some type safety while improving ease of use when operating in an Amazon environment.
[jira] [Commented] (FLINK-6988) Add Apache Kafka 0.11 connector
[ https://issues.apache.org/jira/browse/FLINK-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070319#comment-16070319 ] ASF GitHub Bot commented on FLINK-6988: --- GitHub user pnowojski opened a pull request: https://github.com/apache/flink/pull/4239 [FLINK-6988] Initial flink-connector-kafka-0.11 with at-least-once semantic Couple of first commits are from other PRs #4206 #4209 #4213 You can merge this pull request into a Git repository by running: $ git pull https://github.com/pnowojski/flink kafka011 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/4239.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4239 commit 5191d5b4b78620cfc5ecfc9088afba0d611eaacb Author: Piotr Nowojski Date: 2017-06-26T09:28:51Z [FLINK-6996] Refactor and automaticall inherit KafkaProducer integration tests commit 1c7d349ce425ec0213059e062f10c90773cc780d Author: Piotr Nowojski Date: 2017-06-26T10:20:36Z [FLINK-6996] Fix formatting in KafkaConsumerTestBase and KafkaProducerTestBase commit 5b849f98191439e69ca2357a4767f47957ee0250 Author: Piotr Nowojski Date: 2017-06-23T11:41:55Z [FLINK-7030] Build with scala-2.11 by default commit 3f62aecb57cea9d43611ecfa24e2233a63197341 Author: Piotr Nowojski Date: 2017-06-26T10:36:40Z [FLINK-6996] Fix at-least-once semantic for FlinkKafkaProducer010 Add tests coverage for Kafka 0.10 and 0.9 commit 4b78626df474a8d49a406714a7142ad44d8a8faf Author: Piotr Nowojski Date: 2017-06-28T18:30:08Z [FLINK-7032] Overwrite inherited values of compiler version from parent pom Default values were 1.6 and were causing Intellij to constantly switch language level to 1.6, which in turn was causing compilation errors. It worked fine for compiling from console using maven, because those values are separetly set in maven-compiler-plugin configuration. 
commit 2c2556e72dd73c5e470e5afd6dab4a11cb41772d Author: Piotr Nowojski Date: 2017-06-23T07:14:28Z [FLINK-6988] Initial flink-connector-kafka-0.11 with at-least-once semantic Code of 0.11 connector is based on 0.10 version

> Add Apache Kafka 0.11 connector
> ---
>
> Key: FLINK-6988
> URL: https://issues.apache.org/jira/browse/FLINK-6988
> Project: Flink
> Issue Type: Improvement
> Components: Kafka Connector
>Affects Versions: 1.3.1
>Reporter: Piotr Nowojski
>Assignee: Piotr Nowojski
>
> Kafka 0.11 (to be released very soon) adds support for transactions.
> Thanks to that, Flink might be able to implement a Kafka sink supporting
> an "exactly-once" semantic. API changes and the whole transaction support are
> described in
> [KIP-98|https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging].
> The goal is to mimic the implementation of the existing BucketingSink. The new
> FlinkKafkaProducer011 would
> * upon creation, begin a transaction, store the transaction identifiers into the
> state, and write all incoming data to an output Kafka topic using that
> transaction
> * on a `snapshotState` call, flush the data and record in state that the
> current transaction is pending to be committed
> * on `notifyCheckpointComplete`, commit this pending transaction
> * in case of a crash between `snapshotState` and `notifyCheckpointComplete`,
> either abort this pending transaction (if not every participant successfully
> saved the snapshot) or restore and commit it.
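The checkpoint-aligned transaction lifecycle described above (begin on creation, mark pending on `snapshotState`, commit on `notifyCheckpointComplete`, abort or restore-and-commit on recovery) can be modeled as a tiny state machine. This is only an illustrative sketch of the protocol; the class and method names mirror the description and are not the actual FlinkKafkaProducer011 implementation.

```java
/**
 * Toy model of the described two-phase protocol: snapshots move the current
 * transaction into a "pending" set, and checkpoint completion commits it.
 */
public class TwoPhaseSinkSketch {
    private int pendingTxns = 0;   // snapshotted but not yet committed
    private int committedTxns = 0; // committed on checkpoint completion

    /** snapshotState: flush the current transaction, mark it pending, begin a new one. */
    public void snapshotState() {
        pendingTxns++;
    }

    /** notifyCheckpointComplete: commit every transaction the checkpoint covered. */
    public void notifyCheckpointComplete() {
        committedTxns += pendingTxns;
        pendingTxns = 0;
    }

    /** On recovery, a pending transaction is either restored and committed, or aborted. */
    public void restore(boolean everyParticipantSnapshotted) {
        if (everyParticipantSnapshotted) {
            notifyCheckpointComplete(); // restore and commit
        } else {
            pendingTxns = 0;            // abort the pending transactions
        }
    }

    public int pending() { return pendingTxns; }
    public int committed() { return committedTxns; }
}
```

The crash window the issue calls out is exactly the gap where `pending() > 0`: data is flushed but not yet visible as committed.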
[GitHub] flink pull request #4239: [FLINK-6988] Initial flink-connector-kafka-0.11 wi...
GitHub user pnowojski opened a pull request: https://github.com/apache/flink/pull/4239 [FLINK-6988] Initial flink-connector-kafka-0.11 with at-least-once semantic Couple of first commits are from other PRs #4206 #4209 #4213 You can merge this pull request into a Git repository by running: $ git pull https://github.com/pnowojski/flink kafka011 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/4239.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4239 commit 5191d5b4b78620cfc5ecfc9088afba0d611eaacb Author: Piotr Nowojski Date: 2017-06-26T09:28:51Z [FLINK-6996] Refactor and automaticall inherit KafkaProducer integration tests commit 1c7d349ce425ec0213059e062f10c90773cc780d Author: Piotr Nowojski Date: 2017-06-26T10:20:36Z [FLINK-6996] Fix formatting in KafkaConsumerTestBase and KafkaProducerTestBase commit 5b849f98191439e69ca2357a4767f47957ee0250 Author: Piotr Nowojski Date: 2017-06-23T11:41:55Z [FLINK-7030] Build with scala-2.11 by default commit 3f62aecb57cea9d43611ecfa24e2233a63197341 Author: Piotr Nowojski Date: 2017-06-26T10:36:40Z [FLINK-6996] Fix at-least-once semantic for FlinkKafkaProducer010 Add tests coverage for Kafka 0.10 and 0.9 commit 4b78626df474a8d49a406714a7142ad44d8a8faf Author: Piotr Nowojski Date: 2017-06-28T18:30:08Z [FLINK-7032] Overwrite inherited values of compiler version from parent pom Default values were 1.6 and were causing Intellij to constantly switch language level to 1.6, which in turn was causing compilation errors. It worked fine for compiling from console using maven, because those values are separetly set in maven-compiler-plugin configuration. 
commit 2c2556e72dd73c5e470e5afd6dab4a11cb41772d Author: Piotr Nowojski Date: 2017-06-23T07:14:28Z [FLINK-6988] Initial flink-connector-kafka-0.11 with at-least-once semantic Code of 0.11 connector is based on 0.10 version
[jira] [Commented] (FLINK-6075) Support Limit/Top(Sort) for Stream SQL
[ https://issues.apache.org/jira/browse/FLINK-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070310#comment-16070310 ] ASF GitHub Bot commented on FLINK-6075: --- Github user fhueske commented on the issue: https://github.com/apache/flink/pull/3714 @rtudoran can you please close this PR? Thank you

> Support Limit/Top(Sort) for Stream SQL
> --
>
> Key: FLINK-6075
> URL: https://issues.apache.org/jira/browse/FLINK-6075
> Project: Flink
> Issue Type: New Feature
> Components: Table API & SQL
>Reporter: radu
> Labels: features
> Attachments: sort.png
>
> This will be split into 3 separate JIRA issues. However, the design is the
> same; only the processing function differs in terms of the output. Hence, the
> design is the same for all of them.
> Time target: Proc Time
> **SQL targeted query examples:**
> *Sort example*
> Q1) `SELECT a FROM stream1 GROUP BY HOP(proctime, INTERVAL '1' HOUR, INTERVAL '3' HOUR) ORDER BY b`
> Comment: window is defined using GROUP BY
> Comment: ASC or DESC keywords can be placed to mark the ordering type
> *Limit example*
> Q2) `SELECT a FROM stream1 WHERE rowtime BETWEEN current_timestamp - INTERVAL '1' HOUR AND current_timestamp ORDER BY b LIMIT 10`
> Comment: window is defined using time ranges in the WHERE clause
> Comment: window is row triggered
> *Top example*
> Q3) `SELECT sum(a) OVER (ORDER BY proctime RANGE INTERVAL '1' HOUR PRECEDING LIMIT 10) FROM stream1`
> Comment: limit over the contents of the sliding window
> General Comments:
> - All these SQL clauses are supported only over windows (bounded collections of data).
> - Each of the 3 operators will be supported with each of the types of expressing the windows.
> **Description**
> The 3 operations (limit, top and sort) are similar in behavior, as they all
> require a sorted collection of the data on which the logic will be applied
> (i.e., select a subset of the items or the entire sorted set).
> These functions would make sense in the streaming context only in the context of a
> window. Without defining a window the functions could never emit, as the sort
> operation would never trigger. If an SQL query is provided without
> limits, an error will be thrown (`SELECT a FROM stream1 TOP 10` -> ERROR).
> Although not targeted by this JIRA, in the case of working based on event
> time order, the retraction mechanisms of windows and the lateness mechanisms
> can be used to deal with out-of-order events and retractions/updates of
> results.
> **Functionality example**
> We exemplify with the query below for all 3 types of operators (sorting,
> limit and top). Rowtime indicates when the HOP window will trigger, which
> can be observed in the fact that outputs are generated only at those moments.
> The HOP windows will trigger at every hour (fixed hour) and each event will
> contribute to / be duplicated for 2 consecutive hour intervals. Proctime
> indicates the processing time when a new event arrives in the system. Events
> are of the type (a,b) with the ordering being applied on the b field.
> `SELECT a FROM stream1 HOP(proctime, INTERVAL '1' HOUR, INTERVAL '2' HOUR) ORDER BY b (LIMIT 2 / TOP 2 / [ASC/DESC])`
> ||Rowtime||Proctime||Stream1||Limit 2||Top 2||Sort [ASC]||
> | |10:00:00|(aaa, 11)| | | |
> | |10:05:00|(aab, 7)| | | |
> |10-11|11:00:00| |aab,aaa|aab,aaa|aab,aaa|
> | |11:03:00|(aac,21)| | | |
> |11-12|12:00:00| |aab,aaa|aab,aaa|aab,aaa,aac|
> | |12:10:00|(abb,12)| | | |
> | |12:15:00|(abb,12)| | | |
> |12-13|13:00:00| |abb,abb|abb,abb|abb,abb,aac|
> |...|
> **Implementation option**
> Considering that the SQL operators will be associated with window boundaries,
> the functionality will be implemented within the logic of the window as
> follows.
> * Window assigner – selected based on the type of window used in SQL > (TUMBLING, SLIDING…) > * Evictor/ Trigger – time or count evictor based on the definition of the > window boundaries > * Apply – window function that sorts data and selects the output to trigger > (based on LIMIT/TOP parameters). All data will be sorted at once and result > outputted when the window is triggered > An alternative implementation can be to use a fold window function to sort > the elements as they arrive, one at a time followed by a flatMap to filter > the number of outputs. > !sort.png
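The "Apply" step described above (sort all buffered elements when the window triggers, then cut the output to the LIMIT/TOP size) can be sketched in plain Java. Names here are illustrative only, not Flink API; the buffered elements are the (a, b) pairs from the example, ordered on b.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class SortedLimitWindowSketch {
    /**
     * Models the window-apply step: sort the buffered (a, b) events on b
     * ascending and emit at most `limit` of the a values when the window fires.
     */
    public static List<String> fireWindow(List<Map.Entry<String, Integer>> buffered, int limit) {
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(buffered);
        // Order on the b field, ascending (the "Sort [ASC]" column above).
        sorted.sort((e1, e2) -> Integer.compare(e1.getValue(), e2.getValue()));
        List<String> out = new ArrayList<>();
        for (int i = 0; i < Math.min(limit, sorted.size()); i++) {
            out.add(sorted.get(i).getKey());
        }
        return out;
    }
}
```

With the 10-11 window contents from the table, (aaa, 11) and (aab, 7), a limit of 2 emits aab then aaa, matching the "Limit 2" column. The alternative mentioned in the issue (incremental fold-based sorting plus a flatMap filter) would trade this one-shot sort for per-element work.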
[GitHub] flink issue #3714: [FLINK-6075] - Support Limit/Top(Sort) for Stream SQL
Github user fhueske commented on the issue: https://github.com/apache/flink/pull/3714 @rtudoran can you please close this PR? Thank you
[jira] [Commented] (FLINK-7050) RFC Compliant CSV Parser for Table Source
[ https://issues.apache.org/jira/browse/FLINK-7050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070301#comment-16070301 ] Fabian Hueske commented on FLINK-7050: -- Thanks for opening this JIRA [~uybhatti]. An RFC-compliant TableSource would be a great addition.

> RFC Compliant CSV Parser for Table Source
> -
>
> Key: FLINK-7050
> URL: https://issues.apache.org/jira/browse/FLINK-7050
> Project: Flink
> Issue Type: Improvement
> Components: Table API & SQL
>Affects Versions: 1.3.1
>Reporter: Usman Younas
> Labels: csv, parsing
> Fix For: 1.4.0
>
> Currently, Flink's CSV parser is not compliant with RFC 4180. Due to this
> issue, it is not able to parse standard CSV files that include double quotes,
> delimiters within fields, etc.
> In order to reproduce this bug, we can take a CSV file with double quotes
> included in a field of the records and parse it using the Flink CSV parser. One of
> the issues is mentioned in
> [FLINK-4785|https://issues.apache.org/jira/browse/FLINK-4785].
> The CSV-related issues will be solved by making the CSV parser compliant with the RFC
> 4180 standard for the Table Source.
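To make concrete what RFC 4180 compliance requires, here is a minimal single-record field splitter in Java. It is an illustration of the quoting rules only (quoted fields may contain delimiters, and `""` inside a quoted field is an escaped quote), not Flink's parser and not a full RFC 4180 implementation (it ignores multi-line quoted fields, for instance).

```java
import java.util.ArrayList;
import java.util.List;

public class Rfc4180Sketch {
    /**
     * Splits one CSV record per RFC 4180 quoting rules: double quotes delimit
     * fields, "" inside a quoted field is a literal quote, and commas inside
     * quotes are plain data rather than field separators.
     */
    public static List<String> parseRecord(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        field.append('"'); // "" is an escaped quote
                        i++;
                    } else {
                        inQuotes = false;  // closing quote of the field
                    }
                } else {
                    field.append(c);       // commas etc. are data inside quotes
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                fields.add(field.toString());
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        fields.add(field.toString());
        return fields;
    }
}
```

A naive split on commas would break `a,"b,c","d""e"` into five fields; the rules above yield the three fields the RFC intends.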
[jira] [Commented] (FLINK-7057) move BLOB ref-counting from LibraryCacheManager to BlobCache
[ https://issues.apache.org/jira/browse/FLINK-7057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070255#comment-16070255 ] ASF GitHub Bot commented on FLINK-7057: --- GitHub user NicoK opened a pull request: https://github.com/apache/flink/pull/4238 [FLINK-7057][blob] move BLOB ref-counting from LibraryCacheManager to BlobCache Currently, the `LibraryCacheManager` is doing some ref-counting for JAR files managed by it. Instead, we want the `BlobCache` to do that itself for **all** job-related BLOBs. Also, we do not want to operate on a per-BlobKey level but rather per job. Job-unrelated BLOBs should be cleaned manually as done for the Web-UI logs. A future API change will reflect the different use cases in a better way. For now, we need to also adapt the cleanup appropriately. On the `BlobServer`, the JAR files should remain locally as well as in the HA store until the job enters a final state. Then they can be deleted. With this intermediate state, job-unrelated BLOBs will remain in the file system until deleted manually. This is the same as the previous API use when working with a `BlobService` directly instead of going through the `LibraryCacheManager`. The aforementioned API extension will include TTL fields for those BLOBs in order to have a proper cleanup, too. This PR is based upon #4237 in a series to implement FLINK-6916. 
You can merge this pull request into a Git repository by running: $ git pull https://github.com/NicoK/flink flink-6916-7057 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/4238.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4238 commit d54a316cfffd8243980df561fd4fcbd99934a40b Author: Nico Kruber Date: 2016-12-20T15:49:57Z [FLINK-6008][docs] minor improvements in the BlobService docs commit b215515fa14d3f6af218e86b67bc2c27ae9d4f4f Author: Nico Kruber Date: 2016-12-20T17:27:13Z [FLINK-6008] refactor BlobCache#getURL() for cleaner code commit bbcde52b3105fcf379c852b568f3893cc6052ce6 Author: Nico Kruber Date: 2016-12-21T15:23:29Z [FLINK-6008] do not fail the BlobServer if delete fails also extend the delete tests and remove one code duplication commit dda1a12e40027724efb0e50005e5b57058a220f0 Author: Nico Kruber Date: 2017-01-06T17:42:58Z [FLINK-6008][docs] update some config options to the new, non-deprecated ones commit e12c2348b237207a50649d515a0fbbd19f92e6a0 Author: Nico Kruber Date: 2017-03-09T17:14:02Z [FLINK-6008] use Preconditions.checkArgument in BlobClient commit 24060e01332c6df9fd01f1dc5f321c3fda9301c1 Author: Nico Kruber Date: 2017-03-17T15:21:40Z [FLINK-6008] fix concurrent job directory creation also add according unit tests commit 2e0d16ab8bf8a48a2d028602a3a7693fc4b76039 Author: Nico Kruber Date: 2017-06-14T16:01:47Z [FLINK-6008] do not guard a delete() call with a check for existence commit 7ba911d7ecb4861261dff8509996be0bd64d6d27 Author: Nico Kruber Date: 2017-04-18T14:37:37Z [FLINK-6008] some comments about BlobLibraryCacheManager cleanup commit d3f50d595f85356ae6ed0a85e1f8b8e8ac630bde Author: Nico Kruber Date: 2017-04-19T13:39:03Z [hotfix] minor typos commit 79b6ce35a9e246b35415a388295f9ee2fc19a82e Author: Nico Kruber Date: 2017-04-19T14:10:16Z [FLINK-6008] further cleanup tests for 
BlobLibraryCacheManager commit 23fb6ecd6c43c86d762503339c67953290236dca Author: Nico Kruber Date: 2017-06-30T14:03:16Z [FLINK-6008] address PR comments commit 794764ceeed6b9bbbac08662f5754b218ff86c9c Author: Nico Kruber Date: 2017-06-16T08:51:04Z [FLINK-7052][blob] remove (unused) NAME_ADDRESSABLE mode commit 774bafa85f242110a2ce7907c1150f8c62d73b3f Author: Nico Kruber Date: 2017-06-21T15:05:57Z [FLINK-7052][blob] remove further unused code due to the NAME_ADDRESSABLE removal commit 4da3b3f6269e43bf1c66621099528824cad9373f Author: Nico Kruber Date: 2017-06-22T15:31:17Z [FLINK-7053][blob] remove code duplication in BlobClientSslTest This lets BlobClientSslTest extend BlobClientTest as most of its implementation came from there and was simply copied. commit aa9cdc820f9ca1a38a19708bf45a2099e42eaf48 Author: Nico Kruber Date: 2017-06-23T09:40:34Z [FLINK-7053][blob] verify some of the buffers returned by GET commit c9b693a46053b55b3939ff471184796f12d36a72 Author: Nico Kruber Date: 2017-06-23T10:04:10Z [FLINK-7053][blob] use TemporaryFolder for local BLOB dir in unit tests This replaces the use of some temporary directory where it is not guaranteed that it will be deleted after the test. commit 11db399d5103d9ffe9083c9b6029a7e81afa9abe Author: Nico Kruber Date: 2017-06-21T12:45:31Z
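The per-job ref-counting the PR moves into the `BlobCache` can be sketched as a small bookkeeping class. This is a toy model of the idea only (hypothetical names, string job IDs, no file I/O), not the BlobCache code: registrations bump a per-job counter, releases decrement it, and a job's BLOBs become eligible for cleanup once the count reaches zero.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of per-job BLOB ref-counting, tracking counts per job rather than per BlobKey. */
public class JobBlobRefCounter {
    private final Map<String, Integer> refCounts = new HashMap<>();

    /** A task (or cache user) of the given job registers its interest in the job's BLOBs. */
    public void register(String jobId) {
        refCounts.merge(jobId, 1, Integer::sum);
    }

    /**
     * Releases one reference.
     * @return true if the job dropped to zero references, i.e. its BLOBs may be cleaned up
     */
    public boolean release(String jobId) {
        Integer count = refCounts.get(jobId);
        if (count == null) {
            throw new IllegalStateException("unknown job: " + jobId);
        }
        if (count == 1) {
            refCounts.remove(jobId);
            return true;
        }
        refCounts.put(jobId, count - 1);
        return false;
    }

    public int refCount(String jobId) {
        return refCounts.getOrDefault(jobId, 0);
    }
}
```

Job-unrelated BLOBs have no entry here at all, which matches the description: they stay until deleted manually (or, with the planned API extension, until a TTL expires).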
[GitHub] flink pull request #4238: [FLINK-7057][blob] move BLOB ref-counting from Lib...
GitHub user NicoK opened a pull request: https://github.com/apache/flink/pull/4238 [FLINK-7057][blob] move BLOB ref-counting from LibraryCacheManager to BlobCache Currently, the `LibraryCacheManager` is doing some ref-counting for JAR files managed by it. Instead, we want the `BlobCache` to do that itself for **all** job-related BLOBs. Also, we do not want to operate on a per-BlobKey level but rather per job. Job-unrelated BLOBs should be cleaned manually as done for the Web-UI logs. A future API change will reflect the different use cases in a better way. For now, we need to also adapt the cleanup appropriately. On the `BlobServer`, the JAR files should remain locally as well as in the HA store until the job enters a final state. Then they can be deleted. With this intermediate state, job-unrelated BLOBs will remain in the file system until deleted manually. This is the same as the previous API use when working with a `BlobService` directly instead of going through the `LibraryCacheManager`. The aforementioned API extension will include TTL fields for those BLOBs in order to have a proper cleanup, too. This PR is based upon #4237 in a series to implement FLINK-6916. 
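The per-job ref-counting proposed here can be sketched with plain collections (a hypothetical, much-simplified model; the class and method names below are invented for illustration and are not the actual BlobCache API). The point is that a single counter is kept per job rather than per BlobKey, and all of a job's BLOBs become eligible for cleanup once that counter drops to zero:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of job-level BLOB ref-counting. Each registration
// of a job bumps one counter for the whole job; releasing decrements it,
// and when it reaches zero all BLOBs of that job are cleaned up together.
public class JobBlobRefCounter {
    private final Map<String, Integer> refCounts = new HashMap<>();
    private final Set<String> storedJobs = new HashSet<>();

    public void registerJob(String jobId) {
        refCounts.merge(jobId, 1, Integer::sum);
        storedJobs.add(jobId);
    }

    public void releaseJob(String jobId) {
        Integer count = refCounts.get(jobId);
        if (count == null) {
            return; // unknown job, nothing to release
        }
        if (count <= 1) {
            refCounts.remove(jobId);
            storedJobs.remove(jobId); // stand-in for deleting all BLOBs of the job
        } else {
            refCounts.put(jobId, count - 1);
        }
    }

    public boolean hasJobData(String jobId) {
        return storedJobs.contains(jobId);
    }
}
```

Job-unrelated BLOBs have no counter in this model, which mirrors the PR's intermediate state where such BLOBs stay in the file system until deleted manually.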
You can merge this pull request into a Git repository by running: $ git pull https://github.com/NicoK/flink flink-6916-7057 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/4238.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4238 commit d54a316cfffd8243980df561fd4fcbd99934a40b Author: Nico Kruber Date: 2016-12-20T15:49:57Z [FLINK-6008][docs] minor improvements in the BlobService docs commit b215515fa14d3f6af218e86b67bc2c27ae9d4f4f Author: Nico Kruber Date: 2016-12-20T17:27:13Z [FLINK-6008] refactor BlobCache#getURL() for cleaner code commit bbcde52b3105fcf379c852b568f3893cc6052ce6 Author: Nico Kruber Date: 2016-12-21T15:23:29Z [FLINK-6008] do not fail the BlobServer if delete fails also extend the delete tests and remove one code duplication commit dda1a12e40027724efb0e50005e5b57058a220f0 Author: Nico Kruber Date: 2017-01-06T17:42:58Z [FLINK-6008][docs] update some config options to the new, non-deprecated ones commit e12c2348b237207a50649d515a0fbbd19f92e6a0 Author: Nico Kruber Date: 2017-03-09T17:14:02Z [FLINK-6008] use Preconditions.checkArgument in BlobClient commit 24060e01332c6df9fd01f1dc5f321c3fda9301c1 Author: Nico Kruber Date: 2017-03-17T15:21:40Z [FLINK-6008] fix concurrent job directory creation also add according unit tests commit 2e0d16ab8bf8a48a2d028602a3a7693fc4b76039 Author: Nico Kruber Date: 2017-06-14T16:01:47Z [FLINK-6008] do not guard a delete() call with a check for existence commit 7ba911d7ecb4861261dff8509996be0bd64d6d27 Author: Nico Kruber Date: 2017-04-18T14:37:37Z [FLINK-6008] some comments about BlobLibraryCacheManager cleanup commit d3f50d595f85356ae6ed0a85e1f8b8e8ac630bde Author: Nico Kruber Date: 2017-04-19T13:39:03Z [hotfix] minor typos commit 79b6ce35a9e246b35415a388295f9ee2fc19a82e Author: Nico Kruber Date: 2017-04-19T14:10:16Z [FLINK-6008] further cleanup tests for 
BlobLibraryCacheManager commit 23fb6ecd6c43c86d762503339c67953290236dca Author: Nico Kruber Date: 2017-06-30T14:03:16Z [FLINK-6008] address PR comments commit 794764ceeed6b9bbbac08662f5754b218ff86c9c Author: Nico Kruber Date: 2017-06-16T08:51:04Z [FLINK-7052][blob] remove (unused) NAME_ADDRESSABLE mode commit 774bafa85f242110a2ce7907c1150f8c62d73b3f Author: Nico Kruber Date: 2017-06-21T15:05:57Z [FLINK-7052][blob] remove further unused code due to the NAME_ADDRESSABLE removal commit 4da3b3f6269e43bf1c66621099528824cad9373f Author: Nico Kruber Date: 2017-06-22T15:31:17Z [FLINK-7053][blob] remove code duplication in BlobClientSslTest This lets BlobClientSslTest extend BlobClientTest as most of its implementation came from there and was simply copied. commit aa9cdc820f9ca1a38a19708bf45a2099e42eaf48 Author: Nico Kruber Date: 2017-06-23T09:40:34Z [FLINK-7053][blob] verify some of the buffers returned by GET commit c9b693a46053b55b3939ff471184796f12d36a72 Author: Nico Kruber Date: 2017-06-23T10:04:10Z [FLINK-7053][blob] use TemporaryFolder for local BLOB dir in unit tests This replaces the use of some temporary directory where it is not guaranteed that it will be deleted after the test. commit 11db399d5103d9ffe9083c9b6029a7e81afa9abe Author: Nico Kruber Date: 2017-06-21T12:45:31Z [FLINK-7054][blob] remove LibraryCacheManager#getFile() This was only used in tests where it is avoidable but if used anywhere else, it may have caused cleanup issues. commit 4ae04b68453d4b099f752d6c6fd3c09335ede33a Author: Nico Kruber Dat
[GitHub] flink pull request #3889: [FLINK-6075] - Support Limit/Top(Sort) for Stream ...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/3889#discussion_r125006340 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/calcite/RelTimeIndicatorConverter.scala --- @@ -92,8 +92,19 @@ class RelTimeIndicatorConverter(rexBuilder: RexBuilder) extends RelShuttle { override def visit(minus: LogicalMinus): RelNode = throw new TableException("Logical minus in a stream environment is not supported yet.") - override def visit(sort: LogicalSort): RelNode = -throw new TableException("Logical sort in a stream environment is not supported yet.") + override def visit(sort: LogicalSort): RelNode = { + +val input = sort.getInput.accept(this) + +val materializer = new RexTimeIndicatorMaterializer( --- End diff -- unused. Should be removed --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #3889: [FLINK-6075] - Support Limit/Top(Sort) for Stream ...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/3889#discussion_r125058559 --- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/aggregate/TimeSortProcessFunctionTest.scala --- @@ -122,6 +130,11 @@ class TimeSortProcessFunctionTest{ Row.of(1: JInt, 12L: JLong, 1: JInt, "aaa", 11L: JLong),true), 4)) expectedOutput.add(new StreamRecord(new CRow( Row.of(1: JInt, 12L: JLong, 0: JInt, "aaa", 11L: JLong),true), 4)) + +expectedOutput.add(new StreamRecord(new CRow( + Row.of(1: JInt, 1L: JLong, 0: JInt, "aaa", 11L: JLong),true), 1006)) +expectedOutput.add(new StreamRecord(new CRow( + Row.of(1: JInt, 2L: JLong, 0: JInt, "aaa", 11L: JLong),true), 1006)) TestHarnessUtil.assertOutputEqualsSorted("Output was not correct.", --- End diff -- The test should check that the ProcessFunction emits the rows in the correct order. `assertOutputEqualsSorted` sorts the result and expected data before comparing them. We have to use `assertOutputEquals` instead.
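The reviewer's point, that an order-insensitive comparison cannot catch ordering bugs, can be illustrated with plain lists standing in for the harness output (a simplified sketch; `equalsSorted` and `equalsExact` are invented stand-ins for `assertOutputEqualsSorted` and `assertOutputEquals`):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrates why a sorted comparison cannot catch ordering bugs:
// comparing sorted copies accepts any permutation of the expected output.
public class OrderCheck {
    // In the spirit of assertOutputEqualsSorted: order-insensitive.
    public static boolean equalsSorted(List<String> expected, List<String> actual) {
        List<String> e = new ArrayList<>(expected);
        List<String> a = new ArrayList<>(actual);
        Collections.sort(e);
        Collections.sort(a);
        return e.equals(a);
    }

    // In the spirit of assertOutputEquals: order-sensitive.
    public static boolean equalsExact(List<String> expected, List<String> actual) {
        return expected.equals(actual);
    }
}
```

With expected output `[a, b, c]` and actual output `[c, a, b]`, the sorted check passes even though the function emitted the rows in the wrong order; only the exact check fails.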
[GitHub] flink pull request #3889: [FLINK-6075] - Support Limit/Top(Sort) for Stream ...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/3889#discussion_r125058944 --- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/aggregate/TimeSortProcessFunctionTest.scala --- @@ -0,0 +1,256 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.flink.table.runtime.aggregate + +import java.util.Comparator +import java.util.concurrent.ConcurrentLinkedQueue +import java.lang.{Integer => JInt, Long => JLong} + +import org.apache.flink.api.common.typeinfo.BasicTypeInfo._ +import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation} +import org.apache.flink.api.java.functions.KeySelector +import org.apache.flink.api.java.typeutils.RowTypeInfo +import org.apache.flink.streaming.api.operators.KeyedProcessOperator +import org.apache.flink.streaming.api.watermark.Watermark +import org.apache.flink.streaming.runtime.streamrecord.StreamRecord +import org.apache.flink.streaming.util.{KeyedOneInputStreamOperatorTestHarness, TestHarnessUtil} +import org.apache.flink.types.Row +import org.junit.Test +import org.apache.flink.table.runtime.aggregate.ProcTimeSortProcessFunction +import org.apache.flink.table.runtime.aggregate.RowTimeSortProcessFunction +import org.apache.flink.table.runtime.aggregate.TimeSortProcessFunctionTest._ +import org.apache.flink.api.java.typeutils.runtime.RowComparator +import org.apache.flink.api.common.typeutils.TypeSerializer +import org.apache.flink.api.common.typeutils.TypeComparator +import org.apache.flink.table.runtime.types.{CRow, CRowTypeInfo} +import org.apache.flink.streaming.api.TimeCharacteristic + +class TimeSortProcessFunctionTest{ + + + @Test + def testSortProcTimeHarnessPartitioned(): Unit = { + +val rT = new RowTypeInfo(Array[TypeInformation[_]]( + INT_TYPE_INFO, + LONG_TYPE_INFO, + INT_TYPE_INFO, + STRING_TYPE_INFO, + LONG_TYPE_INFO), + Array("a","b","c","d","e")) + +val rTA = new RowTypeInfo(Array[TypeInformation[_]]( + LONG_TYPE_INFO), Array("count")) +val indexes = Array(1,2) + +val fieldComps = Array[TypeComparator[AnyRef]]( + LONG_TYPE_INFO.createComparator(true, null).asInstanceOf[TypeComparator[AnyRef]], + INT_TYPE_INFO.createComparator(false, null).asInstanceOf[TypeComparator[AnyRef]] ) +val booleanOrders = Array(true, false) + + 
+val rowComp = new RowComparator( + rT.getTotalFields, + indexes, + fieldComps, + new Array[TypeSerializer[AnyRef]](0), //used only for serialized comparisons + booleanOrders) + +val collectionRowComparator = new CollectionRowComparator(rowComp) + +val inputCRowType = CRowTypeInfo(rT) + +val processFunction = new KeyedProcessOperator[Integer,CRow,CRow]( + new ProcTimeSortProcessFunction( +inputCRowType, +collectionRowComparator)) + + val testHarness = new KeyedOneInputStreamOperatorTestHarness[Integer,CRow,CRow]( + processFunction, + new TupleRowSelector(0), + BasicTypeInfo.INT_TYPE_INFO) + + testHarness.open(); + + testHarness.setProcessingTime(3) + + // timestamp is ignored in processing time +testHarness.processElement(new StreamRecord(new CRow( + Row.of(1: JInt, 11L: JLong, 1: JInt, "aaa", 11L: JLong), true), 1001)) +testHarness.processElement(new StreamRecord(new CRow( +Row.of(1: JInt, 12L: JLong, 1: JInt, "aaa", 11L: JLong), true), 2002)) +testHarness.processElement(new StreamRecord(new CRow( +Row.of(1: JInt, 12L: JLong, 2: JInt, "aaa", 11L: JLong), true), 2003)) +testHarness.processElement(new StreamRecord(new CRow( +Row.of(1: JInt, 12L: JLong, 0: JInt, "aaa", 11L: JLong), true), 2004)) +testHarnes
[jira] [Commented] (FLINK-6075) Support Limit/Top(Sort) for Stream SQL
[ https://issues.apache.org/jira/browse/FLINK-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070252#comment-16070252 ] ASF GitHub Bot commented on FLINK-6075: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/3889#discussion_r125006376 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/calcite/RelTimeIndicatorConverter.scala --- @@ -92,8 +92,19 @@ class RelTimeIndicatorConverter(rexBuilder: RexBuilder) extends RelShuttle { override def visit(minus: LogicalMinus): RelNode = throw new TableException("Logical minus in a stream environment is not supported yet.") - override def visit(sort: LogicalSort): RelNode = -throw new TableException("Logical sort in a stream environment is not supported yet.") + override def visit(sort: LogicalSort): RelNode = { + +val input = sort.getInput.accept(this) + +val materializer = new RexTimeIndicatorMaterializer( + rexBuilder, + input.getRowType.getFieldList.map(_.getType)) + +//val offset = if(sort.offset != null) sort.offset.accept(materializer) else null --- End diff -- Should be removed
> Support Limit/Top(Sort) for Stream SQL
> --
>
> Key: FLINK-6075
> URL: https://issues.apache.org/jira/browse/FLINK-6075
> Project: Flink
> Issue Type: New Feature
> Components: Table API & SQL
> Reporter: radu
> Labels: features
> Attachments: sort.png
>
> These will be split in 3 separate JIRA issues. However, the design is the
> same; only the processing function differs in terms of the output. Hence, the
> design is the same for all of them.
> Time target: Proc Time
> **SQL targeted query examples:**
> *Sort example*
> Q1) `SELECT a FROM stream1 GROUP BY HOP(proctime, INTERVAL '1' HOUR, INTERVAL '3' HOUR) ORDER BY b`
> Comment: window is defined using GROUP BY
> Comment: ASC or DESC keywords can be placed to mark the ordering type
> *Limit example*
> Q2) `SELECT a FROM stream1 WHERE rowtime BETWEEN current_timestamp - INTERVAL '1' HOUR AND current_timestamp ORDER BY b LIMIT 10`
> Comment: window is defined using time ranges in the WHERE clause
> Comment: window is row triggered
> *Top example*
> Q3) `SELECT sum(a) OVER (ORDER BY proctime RANGE INTERVAL '1' HOUR PRECEDING LIMIT 10) FROM stream1`
> Comment: limit over the contents of the sliding window
> General Comments:
> - All these SQL clauses are supported only over windows (bounded collections of data).
> - Each of the 3 operators will be supported with each of the types of expressing the windows.
> **Description**
> The 3 operations (limit, top and sort) are similar in behavior as they all require a sorted collection of the data on which the logic will be applied (i.e., select a subset of the items or the entire sorted set). These functions would make sense in the streaming context only in the context of a window. Without defining a window the functions could never emit, as the sort operation would never trigger. If an SQL query is provided without limits, an error will be thrown (`SELECT a FROM stream1 TOP 10` -> ERROR). Although not targeted by this JIRA, in the case of working based on event time order, the retraction mechanisms of windows and the lateness mechanisms can be used to deal with out-of-order events and retraction/updates of results.
> **Functionality example**
> We exemplify with the query below for all the 3 types of operators (sorting, limit and top). Rowtime indicates when the HOP window will trigger, which can be observed in the fact that outputs are generated only at those moments.
> The HOP windows will trigger at every hour (fixed hour) and each event will contribute / be duplicated for 2 consecutive hour intervals. Proctime indicates the processing time when a new event arrives in the system. Events are of the type (a,b) with the ordering being applied on the b field.
> `SELECT a FROM stream1 HOP(proctime, INTERVAL '1' HOUR, INTERVAL '2' HOUR) ORDER BY b (LIMIT 2 / TOP 2 / [ASC/DESC])`
> ||Rowtime||Proctime||Stream1||Limit 2||Top 2||Sort [ASC]||
> | |10:00:00|(aaa, 11)| | | |
> | |10:05:00|(aab, 7)| | | |
> |10-11|11:00:00| |aab,aaa|aab,aaa|aab,aaa|
> | |11:03:00|(aac,21)| | | |
> |11-12|12:00:00| |aab,aaa|aab,aaa|aab,aaa,aac|
> | |12:10:00|(abb,12)| | | |
> | |12:15:00|(abb
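The windowed semantics described above (sort emits the whole ordered set of a window pane, limit/top emit only its first N rows) can be sketched with plain collections. This is a simplified illustration of the intended behavior, not the Flink implementation; `WindowedSort`, `sort`, and `limit` are invented names:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Simplified model of the three windowed operators applied to the rows
// collected in one window pane. Rows are (a, b) pairs ordered on the b
// field, as in the example query.
public class WindowedSort {
    public static final class Row {
        final String a;
        final int b;
        public Row(String a, int b) { this.a = a; this.b = b; }
    }

    // ORDER BY b ASC over the full window contents.
    public static List<String> sort(List<Row> window) {
        List<Row> copy = new ArrayList<>(window);
        copy.sort(Comparator.comparingInt(r -> r.b));
        List<String> out = new ArrayList<>();
        for (Row r : copy) {
            out.add(r.a);
        }
        return out;
    }

    // LIMIT n / TOP n: the first n rows of the sorted window.
    public static List<String> limit(List<Row> window, int n) {
        List<String> sorted = sort(window);
        return sorted.subList(0, Math.min(n, sorted.size()));
    }
}
```

Feeding it the example data reproduces the table: the 10-11 pane {(aaa,11), (aab,7)} sorts to [aab, aaa], and the 11-12 pane {(aaa,11), (aab,7), (aac,21)} sorts to [aab, aaa, aac] with Limit 2 / Top 2 giving [aab, aaa].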
[jira] [Commented] (FLINK-6075) Support Limit/Top(Sort) for Stream SQL
[ https://issues.apache.org/jira/browse/FLINK-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070250#comment-16070250 ] ASF GitHub Bot commented on FLINK-6075: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/3889#discussion_r125063072 --- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/scala/stream/sql/SortITCase.scala --- @@ -0,0 +1,127 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.table.api.scala.stream.sql + +import org.apache.flink.api.scala._ +import org.apache.flink.table.api.scala.stream.sql.SortITCase.StringRowSelectorSink +import org.apache.flink.table.api.scala.stream.sql.TimeTestUtil.EventTimeSourceFunction +import org.apache.flink.streaming.api.functions.source.SourceFunction +import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment +import org.apache.flink.table.api.TableEnvironment +import org.apache.flink.table.api.scala._ +import org.apache.flink.table.api.scala.stream.utils.{StreamITCase, StreamTestData, StreamingWithStateTestBase} +import org.apache.flink.api.common.typeinfo.BasicTypeInfo +import org.apache.flink.api.java.typeutils.RowTypeInfo +import org.apache.flink.api.common.typeinfo.TypeInformation +import org.apache.flink.types.Row +import org.junit.Assert._ +import org.junit._ +import org.apache.flink.streaming.api.TimeCharacteristic +import org.apache.flink.streaming.api.functions.source.SourceFunction.SourceContext +import org.apache.flink.streaming.api.watermark.Watermark +import scala.collection.mutable +import org.apache.flink.streaming.api.functions.sink.RichSinkFunction + +class SortITCase extends StreamingWithStateTestBase { + + @Test + def testEventTimeOrderBy(): Unit = { +val data = Seq( + Left((1500L, (1L, 15, "Hello"))), + Left((1600L, (1L, 16, "Hello"))), + Left((1000L, (1L, 1, "Hello"))), + Left((2000L, (2L, 2, "Hello"))), + Right(1000L), + Left((2000L, (2L, 2, "Hello"))), + Left((2000L, (2L, 3, "Hello"))), + Left((3000L, (3L, 3, "Hello"))), + Left((2000L, (3L, 1, "Hello"))), + Right(2000L), + Left((4000L, (4L, 4, "Hello"))), + Right(3000L), + Left((5000L, (5L, 5, "Hello"))), + Right(5000L), + Left((6000L, (6L, 65, "Hello"))), + Left((6000L, (6L, 6, "Hello"))), + Left((6000L, (6L, 67, "Hello"))), + Left((6000L, (6L, -1, "Hello"))), + Left((6000L, (6L, 6, "Hello"))), + Right(7000L), + Left((9000L, (6L, 9, "Hello"))), + Left((8500L, (6L, 18, "Hello"))), + 
Left((9000L, (6L, 7, "Hello"))), + Right(1L), + Left((1L, (7L, 7, "Hello World"))), + Left((11000L, (7L, 77, "Hello World"))), + Left((11000L, (7L, 17, "Hello World"))), + Right(12000L), + Left((14000L, (7L, 18, "Hello World"))), + Right(14000L), + Left((15000L, (8L, 8, "Hello World"))), + Right(17000L), + Left((2L, (20L, 20, "Hello World"))), + Right(19000L)) + +val env = StreamExecutionEnvironment.getExecutionEnvironment +env.setStateBackend(getStateBackend) +env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime) +val tEnv = TableEnvironment.getTableEnvironment(env) +StreamITCase.clear + +val t1 = env.addSource(new EventTimeSourceFunction[(Long, Int, String)](data)) + .toTable(tEnv, 'a, 'b, 'c, 'rowtime.rowtime) + +tEnv.registerTable("T1", t1) + +val sqlQuery = "SELECT b FROM T1 " + + "ORDER BY rowtime, b ASC "; + + +val result = tEnv.sql(sqlQuery).toDataStream[Row] --- End diff -- OK, will do that before merging > Support Limit/Top(Sort) for Stream SQL > ---
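The event-time ORDER BY behavior exercised by this test can be sketched as buffering rows until a watermark guarantees that no earlier row can still arrive, then emitting the covered rows in order. This is a much-simplified single-threaded model; the actual RowTimeSortProcessFunction works on keyed state with event-time timers, and `EventTimeSorter` is an invented name:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

// Simplified event-time sort: rows are buffered until a watermark
// covers them, then emitted ordered by (timestamp, value), matching
// "ORDER BY rowtime, b ASC" in the test above.
public class EventTimeSorter {
    public static final class Row {
        final long timestamp;
        final int value;
        public Row(long timestamp, int value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    private final List<Row> buffer = new ArrayList<>();

    public void processElement(long timestamp, int value) {
        buffer.add(new Row(timestamp, value));
    }

    // On a watermark, remove and emit all buffered rows whose timestamp
    // is covered, sorted by timestamp first and value second.
    public List<Integer> onWatermark(long watermark) {
        List<Row> ready = new ArrayList<>();
        Iterator<Row> it = buffer.iterator();
        while (it.hasNext()) {
            Row r = it.next();
            if (r.timestamp <= watermark) {
                ready.add(r);
                it.remove();
            }
        }
        ready.sort(Comparator.<Row>comparingLong(r -> r.timestamp)
                .thenComparingInt(r -> r.value));
        List<Integer> out = new ArrayList<>();
        for (Row r : ready) {
            out.add(r.value);
        }
        return out;
    }
}
```

With the first four elements of the test data, the watermark at 1000 releases only the row with timestamp 1000, and the watermark at 2000 releases the remaining rows in timestamp order, which is why out-of-order arrivals still come out sorted.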
[jira] [Commented] (FLINK-6075) Support Limit/Top(Sort) for Stream SQL
[ https://issues.apache.org/jira/browse/FLINK-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070248#comment-16070248 ] ASF GitHub Bot commented on FLINK-6075: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/3889#discussion_r125058559 --- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/aggregate/TimeSortProcessFunctionTest.scala --- @@ -122,6 +130,11 @@ class TimeSortProcessFunctionTest{ Row.of(1: JInt, 12L: JLong, 1: JInt, "aaa", 11L: JLong),true), 4)) expectedOutput.add(new StreamRecord(new CRow( Row.of(1: JInt, 12L: JLong, 0: JInt, "aaa", 11L: JLong),true), 4)) + +expectedOutput.add(new StreamRecord(new CRow( + Row.of(1: JInt, 1L: JLong, 0: JInt, "aaa", 11L: JLong),true), 1006)) +expectedOutput.add(new StreamRecord(new CRow( + Row.of(1: JInt, 2L: JLong, 0: JInt, "aaa", 11L: JLong),true), 1006)) TestHarnessUtil.assertOutputEqualsSorted("Output was not correct.", --- End diff -- The test should check that the ProcessFunction emits the rows in the correct order. `assertOutputEqualsSorted` sorts the result and expected data before comparing them. We have to use `assertOutputEquals` instead.
[jira] [Commented] (FLINK-6075) Support Limit/Top(Sort) for Stream SQL
[ https://issues.apache.org/jira/browse/FLINK-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070253#comment-16070253 ] ASF GitHub Bot commented on FLINK-6075: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/3889#discussion_r125006340 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/calcite/RelTimeIndicatorConverter.scala --- @@ -92,8 +92,19 @@ class RelTimeIndicatorConverter(rexBuilder: RexBuilder) extends RelShuttle { override def visit(minus: LogicalMinus): RelNode = throw new TableException("Logical minus in a stream environment is not supported yet.") - override def visit(sort: LogicalSort): RelNode = -throw new TableException("Logical sort in a stream environment is not supported yet.") + override def visit(sort: LogicalSort): RelNode = { + +val input = sort.getInput.accept(this) + +val materializer = new RexTimeIndicatorMaterializer( --- End diff -- unused. Should be removed
[jira] [Commented] (FLINK-6075) Support Limit/Top(Sort) for Stream SQL
[ https://issues.apache.org/jira/browse/FLINK-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070249#comment-16070249 ] ASF GitHub Bot commented on FLINK-6075: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/3889#discussion_r125058944 --- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/aggregate/TimeSortProcessFunctionTest.scala --- @@ -0,0 +1,256 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */
+package org.apache.flink.table.runtime.aggregate
+
+import java.util.Comparator
+import java.util.concurrent.ConcurrentLinkedQueue
+import java.lang.{Integer => JInt, Long => JLong}
+
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation}
+import org.apache.flink.api.java.functions.KeySelector
+import org.apache.flink.api.java.typeutils.RowTypeInfo
+import org.apache.flink.streaming.api.operators.KeyedProcessOperator
+import org.apache.flink.streaming.api.watermark.Watermark
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord
+import org.apache.flink.streaming.util.{KeyedOneInputStreamOperatorTestHarness, TestHarnessUtil}
+import org.apache.flink.types.Row
+import org.junit.Test
+import org.apache.flink.table.runtime.aggregate.ProcTimeSortProcessFunction
+import org.apache.flink.table.runtime.aggregate.RowTimeSortProcessFunction
+import org.apache.flink.table.runtime.aggregate.TimeSortProcessFunctionTest._
+import org.apache.flink.api.java.typeutils.runtime.RowComparator
+import org.apache.flink.api.common.typeutils.TypeSerializer
+import org.apache.flink.api.common.typeutils.TypeComparator
+import org.apache.flink.table.runtime.types.{CRow, CRowTypeInfo}
+import org.apache.flink.streaming.api.TimeCharacteristic
+
+class TimeSortProcessFunctionTest {
+
+  @Test
+  def testSortProcTimeHarnessPartitioned(): Unit = {
+
+    val rT = new RowTypeInfo(Array[TypeInformation[_]](
+      INT_TYPE_INFO,
+      LONG_TYPE_INFO,
+      INT_TYPE_INFO,
+      STRING_TYPE_INFO,
+      LONG_TYPE_INFO),
+      Array("a", "b", "c", "d", "e"))
+
+    val rTA = new RowTypeInfo(Array[TypeInformation[_]](
+      LONG_TYPE_INFO), Array("count"))
+    val indexes = Array(1, 2)
+
+    val fieldComps = Array[TypeComparator[AnyRef]](
+      LONG_TYPE_INFO.createComparator(true, null).asInstanceOf[TypeComparator[AnyRef]],
+      INT_TYPE_INFO.createComparator(false, null).asInstanceOf[TypeComparator[AnyRef]])
+    val booleanOrders = Array(true, false)
+
+    val rowComp = new RowComparator(
+      rT.getTotalFields,
+      indexes,
+      fieldComps,
+      new Array[TypeSerializer[AnyRef]](0), // used only for serialized comparisons
+      booleanOrders)
+
+    val collectionRowComparator = new CollectionRowComparator(rowComp)
+
+    val inputCRowType = CRowTypeInfo(rT)
+
+    val processFunction = new KeyedProcessOperator[Integer, CRow, CRow](
+      new ProcTimeSortProcessFunction(
+        inputCRowType,
+        collectionRowComparator))
+
+    val testHarness = new KeyedOneInputStreamOperatorTestHarness[Integer, CRow, CRow](
+      processFunction,
+      new TupleRowSelector(0),
+      BasicTypeInfo.INT_TYPE_INFO)
+
+    testHarness.open()
+
+    testHarness.setProcessingTime(3)
+
+    // timestamp is ignored in processing time
+    testHarness.processElement(new StreamRecord(new CRow(
+      Row.of(1: JInt, 11L: JLong, 1: JInt, "aaa", 11L: JLong), true), 1001))
+    testHarness.processElement(new StreamRecord(new CRow(
+      Row.of(1: JInt, 12L: JLong, 1: JInt, "aaa", 11L: JLong), true), 2002))
+    testHarness.processElement(new StreamRecord(new CR
[jira] [Commented] (FLINK-6075) Support Limit/Top(Sort) for Stream SQL
[ https://issues.apache.org/jira/browse/FLINK-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070251#comment-16070251 ] ASF GitHub Bot commented on FLINK-6075: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/3889#discussion_r125040180 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamSort.scala --- @@ -0,0 +1,212 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.plan.nodes.datastream
+
+import org.apache.calcite.plan.{ RelOptCluster, RelTraitSet }
--- End diff -- About half of the imports are unused. Other classes have unused imports as well. > Support Limit/Top(Sort) for Stream SQL > -- > > Key: FLINK-6075 > URL: https://issues.apache.org/jira/browse/FLINK-6075 > Project: Flink > Issue Type: New Feature > Components: Table API & SQL > Reporter: radu > Labels: features > Attachments: sort.png > > > These will be split into 3 separate JIRA issues. However, the design is the same; only the processing function differs in terms of the output. Hence, the design is the same for all of them.
> Time target: Proc Time
> **SQL targeted query examples:**
> *Sort example*
> Q1) `SELECT a FROM stream1 GROUP BY HOP(proctime, INTERVAL '1' HOUR, INTERVAL '3' HOUR) ORDER BY b`
> Comment: window is defined using GROUP BY
> Comment: ASC or DESC keywords can be placed to mark the ordering type
> *Limit example*
> Q2) `SELECT a FROM stream1 WHERE rowtime BETWEEN current_timestamp - INTERVAL '1' HOUR AND current_timestamp ORDER BY b LIMIT 10`
> Comment: window is defined using time ranges in the WHERE clause
> Comment: window is row triggered
> *Top example*
> Q3) `SELECT sum(a) OVER (ORDER BY proctime RANGE INTERVAL '1' HOUR PRECEDING LIMIT 10) FROM stream1`
> Comment: limit over the contents of the sliding window
> General Comments:
> - All these SQL clauses are supported only over windows (bounded collections of data).
> - Each of the 3 operators will be supported with each of the ways of expressing the windows.
> **Description**
> The 3 operations (limit, top and sort) are similar in behavior, as they all require a sorted collection of the data to which the logic is applied (i.e., select a subset of the items or the entire sorted set). In a streaming context these functions only make sense over a window: without a window the functions could never emit, as the sort operation would never trigger. If a SQL query is provided without a window, an error will be thrown (`SELECT a FROM stream1 TOP 10` -> ERROR).
> Although not targeted by this JIRA, when working in event-time order, the retraction mechanisms of windows and the lateness mechanisms can be used to deal with out-of-order events and retractions/updates of results.
> **Functionality example**
> We exemplify with the query below for all 3 types of operators (sort, limit and top). Rowtime indicates when the HOP window will trigger, which can be observed in the fact that outputs are generated only at those moments.
> The HOP windows will trigger at every hour (fixed hour) and each event will contribute to / be duplicated into 2 consecutive hour intervals. Proctime indicates the processing time when a new event arrives in the system. Events are of the type (a,b) with the ordering being applied on the b field.
> `SELECT a FROM stream1 HOP(proctime, INTERVAL '1' HOUR, INTERVAL '2' HOUR) ORDER BY b (LIMIT 2 / TOP 2 / [ASC/DESC])`
> ||Rowtime||Proctime||Stream1||Limit 2||Top 2||Sort [ASC]||
> | |10:00:00|(aaa, 11)| | | |
> | |10:05:00|(aab, 7)| | | |
> |10-11|11:00:00| |
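The sort/limit/top-over-window semantics described above can be sketched independently of Flink: buffer the (a, b) events of one window, sort on b, then optionally truncate to the first n rows. The following stand-alone Java sketch illustrates only the semantics; the names (`Event`, `emitWindow`) are hypothetical and not Flink API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Stand-alone illustration of sorting the buffered contents of one
// (closed) window and applying LIMIT / TOP n. Not Flink code.
public class WindowSortSketch {
    static final class Event {
        final String a;
        final int b;
        Event(String a, int b) { this.a = a; this.b = b; }
    }

    // Sort on field `b` (ASC or DESC), then keep only the first `limit`
    // rows; limit < 0 means "no limit" (a plain ORDER BY).
    static List<Event> emitWindow(List<Event> buffer, boolean ascending, int limit) {
        List<Event> sorted = new ArrayList<>(buffer);
        Comparator<Event> byB = Comparator.comparingInt(e -> e.b);
        sorted.sort(ascending ? byB : byB.reversed());
        return limit < 0 ? sorted : sorted.subList(0, Math.min(limit, sorted.size()));
    }

    public static void main(String[] args) {
        // Events collected during one hop of the window, as in the table above:
        List<Event> window = List.of(
            new Event("aaa", 11), new Event("aab", 7), new Event("aac", 9));
        for (Event e : emitWindow(window, true, 2)) {
            System.out.println(e.a + " " + e.b); // aab 7, then aac 9
        }
    }
}
```

As in the JIRA description, nothing is emitted until the window closes; the sort only triggers on the buffered, bounded collection.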
[GitHub] flink pull request #3889: [FLINK-6075] - Support Limit/Top(Sort) for Stream ...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/3889#discussion_r125006376 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/calcite/RelTimeIndicatorConverter.scala --- @@ -92,8 +92,19 @@ class RelTimeIndicatorConverter(rexBuilder: RexBuilder) extends RelShuttle {

   override def visit(minus: LogicalMinus): RelNode =
     throw new TableException("Logical minus in a stream environment is not supported yet.")

-  override def visit(sort: LogicalSort): RelNode =
-    throw new TableException("Logical sort in a stream environment is not supported yet.")
+  override def visit(sort: LogicalSort): RelNode = {
+
+    val input = sort.getInput.accept(this)
+
+    val materializer = new RexTimeIndicatorMaterializer(
+      rexBuilder,
+      input.getRowType.getFieldList.map(_.getType))
+
+    //val offset = if(sort.offset != null) sort.offset.accept(materializer) else null
--- End diff -- Should be removed --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #3889: [FLINK-6075] - Support Limit/Top(Sort) for Stream ...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/3889#discussion_r125063072 --- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/scala/stream/sql/SortITCase.scala --- @@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.api.scala.stream.sql
+
+import org.apache.flink.api.scala._
+import org.apache.flink.table.api.scala.stream.sql.SortITCase.StringRowSelectorSink
+import org.apache.flink.table.api.scala.stream.sql.TimeTestUtil.EventTimeSourceFunction
+import org.apache.flink.streaming.api.functions.source.SourceFunction
+import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
+import org.apache.flink.table.api.TableEnvironment
+import org.apache.flink.table.api.scala._
+import org.apache.flink.table.api.scala.stream.utils.{StreamITCase, StreamTestData, StreamingWithStateTestBase}
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo
+import org.apache.flink.api.java.typeutils.RowTypeInfo
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.types.Row
+import org.junit.Assert._
+import org.junit._
+import org.apache.flink.streaming.api.TimeCharacteristic
+import org.apache.flink.streaming.api.functions.source.SourceFunction.SourceContext
+import org.apache.flink.streaming.api.watermark.Watermark
+import scala.collection.mutable
+import org.apache.flink.streaming.api.functions.sink.RichSinkFunction
+
+class SortITCase extends StreamingWithStateTestBase {
+
+  @Test
+  def testEventTimeOrderBy(): Unit = {
+    val data = Seq(
+      Left((1500L, (1L, 15, "Hello"))),
+      Left((1600L, (1L, 16, "Hello"))),
+      Left((1000L, (1L, 1, "Hello"))),
+      Left((2000L, (2L, 2, "Hello"))),
+      Right(1000L),
+      Left((2000L, (2L, 2, "Hello"))),
+      Left((2000L, (2L, 3, "Hello"))),
+      Left((3000L, (3L, 3, "Hello"))),
+      Left((2000L, (3L, 1, "Hello"))),
+      Right(2000L),
+      Left((4000L, (4L, 4, "Hello"))),
+      Right(3000L),
+      Left((5000L, (5L, 5, "Hello"))),
+      Right(5000L),
+      Left((6000L, (6L, 65, "Hello"))),
+      Left((6000L, (6L, 6, "Hello"))),
+      Left((6000L, (6L, 67, "Hello"))),
+      Left((6000L, (6L, -1, "Hello"))),
+      Left((6000L, (6L, 6, "Hello"))),
+      Right(7000L),
+      Left((9000L, (6L, 9, "Hello"))),
+      Left((8500L, (6L, 18, "Hello"))),
+      Left((9000L, (6L, 7, "Hello"))),
+      Right(1L),
+      Left((1L, (7L, 7, "Hello World"))),
+      Left((11000L, (7L, 77, "Hello World"))),
+      Left((11000L, (7L, 17, "Hello World"))),
+      Right(12000L),
+      Left((14000L, (7L, 18, "Hello World"))),
+      Right(14000L),
+      Left((15000L, (8L, 8, "Hello World"))),
+      Right(17000L),
+      Left((2L, (20L, 20, "Hello World"))),
+      Right(19000L))
+
+    val env = StreamExecutionEnvironment.getExecutionEnvironment
+    env.setStateBackend(getStateBackend)
+    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
+    val tEnv = TableEnvironment.getTableEnvironment(env)
+    StreamITCase.clear
+
+    val t1 = env.addSource(new EventTimeSourceFunction[(Long, Int, String)](data))
+      .toTable(tEnv, 'a, 'b, 'c, 'rowtime.rowtime)
+
+    tEnv.registerTable("T1", t1)
+
+    val sqlQuery = "SELECT b FROM T1 " +
+      "ORDER BY rowtime, b ASC "
+
+    val result = tEnv.sql(sqlQuery).toDataStream[Row]
--- End diff -- OK, will do that before merging
[GitHub] flink pull request #3889: [FLINK-6075] - Support Limit/Top(Sort) for Stream ...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/3889#discussion_r125040180 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamSort.scala --- @@ -0,0 +1,212 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.plan.nodes.datastream
+
+import org.apache.calcite.plan.{ RelOptCluster, RelTraitSet }
--- End diff -- About half of the imports are unused. Other classes have unused imports as well.
[jira] [Commented] (FLINK-6916) FLIP-19: Improved BLOB storage architecture
[ https://issues.apache.org/jira/browse/FLINK-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070243#comment-16070243 ] ASF GitHub Bot commented on FLINK-6916: --- Github user NicoK closed the pull request at: https://github.com/apache/flink/pull/4176 > FLIP-19: Improved BLOB storage architecture > --- > > Key: FLINK-6916 > URL: https://issues.apache.org/jira/browse/FLINK-6916 > Project: Flink > Issue Type: Improvement > Components: Distributed Coordination, Network >Affects Versions: 1.4.0 >Reporter: Nico Kruber >Assignee: Nico Kruber > > The current architecture around the BLOB server and cache components seems > rather patched up and has some issues regarding concurrency ([FLINK-6380]), > cleanup, API inconsistencies / currently unused API ([FLINK-6329], > [FLINK-6008]). These make future integration with FLIP-6 or extensions like > offloading oversized RPC messages ([FLINK-6046]) difficult. We therefore > propose an improvement on the current architecture as described below which > tackles these issues, provides some cleanup, and enables further BLOB server > use cases. > Please refer to > https://cwiki.apache.org/confluence/display/FLINK/FLIP-19%3A+Improved+BLOB+storage+architecture > for a full overview on the proposed changes. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6916) FLIP-19: Improved BLOB storage architecture
[ https://issues.apache.org/jira/browse/FLINK-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070242#comment-16070242 ] ASF GitHub Bot commented on FLINK-6916: --- Github user NicoK commented on the issue: https://github.com/apache/flink/pull/4176 superseded by #4237 > FLIP-19: Improved BLOB storage architecture > --- > > Key: FLINK-6916 > URL: https://issues.apache.org/jira/browse/FLINK-6916 > Project: Flink > Issue Type: Improvement > Components: Distributed Coordination, Network >Affects Versions: 1.4.0 >Reporter: Nico Kruber >Assignee: Nico Kruber > > The current architecture around the BLOB server and cache components seems > rather patched up and has some issues regarding concurrency ([FLINK-6380]), > cleanup, API inconsistencies / currently unused API ([FLINK-6329], > [FLINK-6008]). These make future integration with FLIP-6 or extensions like > offloading oversized RPC messages ([FLINK-6046]) difficult. We therefore > propose an improvement on the current architecture as described below which > tackles these issues, provides some cleanup, and enables further BLOB server > use cases. > Please refer to > https://cwiki.apache.org/confluence/display/FLINK/FLIP-19%3A+Improved+BLOB+storage+architecture > for a full overview on the proposed changes. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] flink issue #4176: [FLINK-6916][blob] add API to allow job-related BLOBs to ...
Github user NicoK commented on the issue: https://github.com/apache/flink/pull/4176 superseded by #4237
[GitHub] flink pull request #4176: [FLINK-6916][blob] add API to allow job-related BL...
Github user NicoK closed the pull request at: https://github.com/apache/flink/pull/4176
[GitHub] flink pull request #4237: [FLINK-7056][blob] add API to allow job-related BL...
GitHub user NicoK opened a pull request: https://github.com/apache/flink/pull/4237 [FLINK-7056][blob] add API to allow job-related BLOBs to be stored To ease cleanup, we will make job-related BLOBs be reflected in the blob storage so that they may be removed along with the job. This adds the `jobId` to many methods similar to the previous code from the `NAME_ADDRESSABLE` mode. This PR is based upon #4236 in a series to implement FLINK-6916. You can merge this pull request into a Git repository by running: $ git pull https://github.com/NicoK/flink flink-6916-7056 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/4237.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4237 commit d54a316cfffd8243980df561fd4fcbd99934a40b Author: Nico Kruber Date: 2016-12-20T15:49:57Z [FLINK-6008][docs] minor improvements in the BlobService docs commit b215515fa14d3f6af218e86b67bc2c27ae9d4f4f Author: Nico Kruber Date: 2016-12-20T17:27:13Z [FLINK-6008] refactor BlobCache#getURL() for cleaner code commit bbcde52b3105fcf379c852b568f3893cc6052ce6 Author: Nico Kruber Date: 2016-12-21T15:23:29Z [FLINK-6008] do not fail the BlobServer if delete fails also extend the delete tests and remove one code duplication commit dda1a12e40027724efb0e50005e5b57058a220f0 Author: Nico Kruber Date: 2017-01-06T17:42:58Z [FLINK-6008][docs] update some config options to the new, non-deprecated ones commit e12c2348b237207a50649d515a0fbbd19f92e6a0 Author: Nico Kruber Date: 2017-03-09T17:14:02Z [FLINK-6008] use Preconditions.checkArgument in BlobClient commit 24060e01332c6df9fd01f1dc5f321c3fda9301c1 Author: Nico Kruber Date: 2017-03-17T15:21:40Z [FLINK-6008] fix concurrent job directory creation also add according unit tests commit 2e0d16ab8bf8a48a2d028602a3a7693fc4b76039 Author: Nico Kruber Date: 2017-06-14T16:01:47Z [FLINK-6008] do not guard a delete() call with a 
check for existence commit 7ba911d7ecb4861261dff8509996be0bd64d6d27 Author: Nico Kruber Date: 2017-04-18T14:37:37Z [FLINK-6008] some comments about BlobLibraryCacheManager cleanup commit d3f50d595f85356ae6ed0a85e1f8b8e8ac630bde Author: Nico Kruber Date: 2017-04-19T13:39:03Z [hotfix] minor typos commit 79b6ce35a9e246b35415a388295f9ee2fc19a82e Author: Nico Kruber Date: 2017-04-19T14:10:16Z [FLINK-6008] further cleanup tests for BlobLibraryCacheManager commit 23fb6ecd6c43c86d762503339c67953290236dca Author: Nico Kruber Date: 2017-06-30T14:03:16Z [FLINK-6008] address PR comments commit 794764ceeed6b9bbbac08662f5754b218ff86c9c Author: Nico Kruber Date: 2017-06-16T08:51:04Z [FLINK-7052][blob] remove (unused) NAME_ADDRESSABLE mode commit 774bafa85f242110a2ce7907c1150f8c62d73b3f Author: Nico Kruber Date: 2017-06-21T15:05:57Z [FLINK-7052][blob] remove further unused code due to the NAME_ADDRESSABLE removal commit 4da3b3f6269e43bf1c66621099528824cad9373f Author: Nico Kruber Date: 2017-06-22T15:31:17Z [FLINK-7053][blob] remove code duplication in BlobClientSslTest This lets BlobClientSslTest extend BlobClientTest as most of its implementation came from there and was simply copied. commit aa9cdc820f9ca1a38a19708bf45a2099e42eaf48 Author: Nico Kruber Date: 2017-06-23T09:40:34Z [FLINK-7053][blob] verify some of the buffers returned by GET commit c9b693a46053b55b3939ff471184796f12d36a72 Author: Nico Kruber Date: 2017-06-23T10:04:10Z [FLINK-7053][blob] use TemporaryFolder for local BLOB dir in unit tests This replaces the use of some temporary directory where it is not guaranteed that it will be deleted after the test. commit 11db399d5103d9ffe9083c9b6029a7e81afa9abe Author: Nico Kruber Date: 2017-06-21T12:45:31Z [FLINK-7054][blob] remove LibraryCacheManager#getFile() This was only used in tests where it is avoidable but if used anywhere else, it may have caused cleanup issues. 
commit 4ae04b68453d4b099f752d6c6fd3c09335ede33a Author: Nico Kruber Date: 2017-06-21T14:14:15Z [FLINK-7055][blob] refactor getURL() to the more generic getFile() The fact that we always returned URL objects is a relic of the BlobServer's only use for URLClassLoader. Since we'd like to extend its use, returning File objects instead is more generic. commit 8397d6aa5dc0aac07626d0af9ee3d8623dd7b60c Author: Nico Kruber Date: 2017-06-21T16:04:43Z [FLINK-7056][blob] add API to allow job-related BLOBs to be stored commit 0a4c4e9bc483e4f1f885ef1e3b8feba40c057204 Author: Nico Kruber Date: 2017-06-23T17:17:07Z [FLINK-7056][blob] refactor the new API for job-related BLOBs For a cleaner API, instead of having a nullable jobId parameter, use two methods: one for job-related BLOBs, anot
[jira] [Commented] (FLINK-7056) add API to allow job-related BLOBs to be stored
[ https://issues.apache.org/jira/browse/FLINK-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070239#comment-16070239 ] ASF GitHub Bot commented on FLINK-7056: --- GitHub user NicoK opened a pull request: https://github.com/apache/flink/pull/4237 [FLINK-7056][blob] add API to allow job-related BLOBs to be stored To ease cleanup, we will make job-related BLOBs be reflected in the blob storage so that they may be removed along with the job. This adds the `jobId` to many methods similar to the previous code from the `NAME_ADDRESSABLE` mode. This PR is based upon #4236 in a series to implement FLINK-6916. You can merge this pull request into a Git repository by running: $ git pull https://github.com/NicoK/flink flink-6916-7056 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/4237.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4237 commit d54a316cfffd8243980df561fd4fcbd99934a40b Author: Nico Kruber Date: 2016-12-20T15:49:57Z [FLINK-6008][docs] minor improvements in the BlobService docs commit b215515fa14d3f6af218e86b67bc2c27ae9d4f4f Author: Nico Kruber Date: 2016-12-20T17:27:13Z [FLINK-6008] refactor BlobCache#getURL() for cleaner code commit bbcde52b3105fcf379c852b568f3893cc6052ce6 Author: Nico Kruber Date: 2016-12-21T15:23:29Z [FLINK-6008] do not fail the BlobServer if delete fails also extend the delete tests and remove one code duplication commit dda1a12e40027724efb0e50005e5b57058a220f0 Author: Nico Kruber Date: 2017-01-06T17:42:58Z [FLINK-6008][docs] update some config options to the new, non-deprecated ones commit e12c2348b237207a50649d515a0fbbd19f92e6a0 Author: Nico Kruber Date: 2017-03-09T17:14:02Z [FLINK-6008] use Preconditions.checkArgument in BlobClient commit 24060e01332c6df9fd01f1dc5f321c3fda9301c1 Author: Nico Kruber Date: 2017-03-17T15:21:40Z [FLINK-6008] fix 
concurrent job directory creation also add according unit tests commit 2e0d16ab8bf8a48a2d028602a3a7693fc4b76039 Author: Nico Kruber Date: 2017-06-14T16:01:47Z [FLINK-6008] do not guard a delete() call with a check for existence commit 7ba911d7ecb4861261dff8509996be0bd64d6d27 Author: Nico Kruber Date: 2017-04-18T14:37:37Z [FLINK-6008] some comments about BlobLibraryCacheManager cleanup commit d3f50d595f85356ae6ed0a85e1f8b8e8ac630bde Author: Nico Kruber Date: 2017-04-19T13:39:03Z [hotfix] minor typos commit 79b6ce35a9e246b35415a388295f9ee2fc19a82e Author: Nico Kruber Date: 2017-04-19T14:10:16Z [FLINK-6008] further cleanup tests for BlobLibraryCacheManager commit 23fb6ecd6c43c86d762503339c67953290236dca Author: Nico Kruber Date: 2017-06-30T14:03:16Z [FLINK-6008] address PR comments commit 794764ceeed6b9bbbac08662f5754b218ff86c9c Author: Nico Kruber Date: 2017-06-16T08:51:04Z [FLINK-7052][blob] remove (unused) NAME_ADDRESSABLE mode commit 774bafa85f242110a2ce7907c1150f8c62d73b3f Author: Nico Kruber Date: 2017-06-21T15:05:57Z [FLINK-7052][blob] remove further unused code due to the NAME_ADDRESSABLE removal commit 4da3b3f6269e43bf1c66621099528824cad9373f Author: Nico Kruber Date: 2017-06-22T15:31:17Z [FLINK-7053][blob] remove code duplication in BlobClientSslTest This lets BlobClientSslTest extend BlobClientTest as most of its implementation came from there and was simply copied. commit aa9cdc820f9ca1a38a19708bf45a2099e42eaf48 Author: Nico Kruber Date: 2017-06-23T09:40:34Z [FLINK-7053][blob] verify some of the buffers returned by GET commit c9b693a46053b55b3939ff471184796f12d36a72 Author: Nico Kruber Date: 2017-06-23T10:04:10Z [FLINK-7053][blob] use TemporaryFolder for local BLOB dir in unit tests This replaces the use of some temporary directory where it is not guaranteed that it will be deleted after the test. 
commit 11db399d5103d9ffe9083c9b6029a7e81afa9abe Author: Nico Kruber Date: 2017-06-21T12:45:31Z [FLINK-7054][blob] remove LibraryCacheManager#getFile() This was only used in tests where it is avoidable but if used anywhere else, it may have caused cleanup issues. commit 4ae04b68453d4b099f752d6c6fd3c09335ede33a Author: Nico Kruber Date: 2017-06-21T14:14:15Z [FLINK-7055][blob] refactor getURL() to the more generic getFile() The fact that we always returned URL objects is a relic of the BlobServer's only use for URLClassLoader. Since we'd like to extend its use, returning File objects instead is more generic. commit 8397d6aa5dc0aac07626d0af9ee3d8623dd7b60c Author: Nico Kruber Date: 2017-06-21T16:04:43Z [FLINK-7056][blob] add API to allow job-related BLOBs to be stored commit 0a4c4e9bc483e4f1f885ef1e3b8fe
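The cleanup idea behind FLINK-7056, "job-related BLOBs be reflected in the blob storage so that they may be removed along with the job", can be sketched in a few lines: key every stored blob by an optional jobId so that all blobs of a job can be dropped in one sweep when the job ends. This is a hypothetical stand-alone sketch of that bookkeeping, not the actual Flink BlobServer code; all names are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: blobs keyed by (jobId, blobKey) so a whole job's
// blobs can be cleaned up together. Not the actual Flink implementation.
public class JobBlobStoreSketch {
    // jobId == null marks a job-unrelated (shared) blob. The "<shared>"
    // sentinel is an assumption of this sketch; real job IDs never contain '<'.
    private final Map<String, byte[]> store = new HashMap<>();

    private static String key(String jobId, String blobKey) {
        return (jobId == null ? "<shared>" : jobId) + "/" + blobKey;
    }

    public void put(String jobId, String blobKey, byte[] data) {
        store.put(key(jobId, blobKey), data);
    }

    public byte[] get(String jobId, String blobKey) {
        return store.get(key(jobId, blobKey));
    }

    // Cleanup along with the job: drop every blob stored under this jobId,
    // leaving shared blobs and other jobs' blobs untouched.
    public void removeJob(String jobId) {
        store.keySet().removeIf(k -> k.startsWith(jobId + "/"));
    }

    public static void main(String[] args) {
        JobBlobStoreSketch s = new JobBlobStoreSketch();
        s.put("job1", "jarBlob", new byte[]{1});
        s.put(null, "sharedBlob", new byte[]{2});
        s.removeJob("job1"); // job finished: its blobs disappear
        System.out.println(s.get("job1", "jarBlob") == null);   // true
        System.out.println(s.get(null, "sharedBlob") != null);  // true
    }
}
```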
[jira] [Updated] (FLINK-7055) refactor BlobService#getURL() methods to return a File object
[ https://issues.apache.org/jira/browse/FLINK-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nico Kruber updated FLINK-7055: --- Description: As a relic from its use by the {{URLClassLoader}}, {{BlobService#getURL()}} methods always returned {{URL}} objects although they were always pointing to locally cached files. As a step towards a better architecture and API, these should return a File object instead. (was: As a relic from its use by the {{UrlClassLoader}}, {{BlobService#getURL()}} methods always returned {{URL}} objects although they were always pointing to locally cached files. As a step towards a better architecture and API, these should return a File object instead.) > refactor BlobService#getURL() methods to return a File object > - > > Key: FLINK-7055 > URL: https://issues.apache.org/jira/browse/FLINK-7055 > Project: Flink > Issue Type: Sub-task > Components: Distributed Coordination, Network >Affects Versions: 1.4.0 >Reporter: Nico Kruber >Assignee: Nico Kruber > > As a relic from its use by the {{URLClassLoader}}, {{BlobService#getURL()}} > methods always returned {{URL}} objects although they were always pointing to > locally cached files. As a step towards a better architecture and API, these > should return a File object instead. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
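The interface change described above (return a locally cached `File` instead of a `URL`) can be illustrated with a minimal before/after sketch. The interface and method names below mirror the issue description but are simplified assumptions, not the actual Flink `BlobService` signatures.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

// Before/after sketch of the FLINK-7055 refactoring idea (simplified).
public class BlobServiceRefactorSketch {

    interface BlobServiceBefore {
        // Relic of the URLClassLoader use case: always a file:// URL anyway.
        URL getURL(String key) throws MalformedURLException;
    }

    interface BlobServiceAfter {
        // More generic: callers get the locally cached file directly.
        File getFile(String key);
    }

    public static void main(String[] args) throws MalformedURLException {
        // A File can still be turned into a URL wherever a class loader needs one:
        File cached = new File("/tmp/blob_cache/someKey");
        URL asUrl = cached.toURI().toURL();
        System.out.println(asUrl.getProtocol()); // file
    }
}
```

Returning `File` is the more general contract: the URL view is recoverable via `File#toURI().toURL()`, while the reverse (treating every consumer as a `URLClassLoader`) is not.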
[jira] [Commented] (FLINK-7055) refactor BlobService#getURL() methods to return a File object
[ https://issues.apache.org/jira/browse/FLINK-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070237#comment-16070237 ] ASF GitHub Bot commented on FLINK-7055: --- GitHub user NicoK opened a pull request: https://github.com/apache/flink/pull/4236 [FLINK-7055][blob] refactor BlobService#getURL() methods to return a File object As a relic from its use by the `URLClassLoader`, `BlobService#getURL()` methods always returned URL objects although they were always pointing to locally cached files. As a step towards a better architecture and API, these should be renamed and return a File object instead. This PR is based upon #4235 in a series to implement FLINK-6916. You can merge this pull request into a Git repository by running: $ git pull https://github.com/NicoK/flink flink-6916-7055 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/4236.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4236 commit d54a316cfffd8243980df561fd4fcbd99934a40b Author: Nico Kruber Date: 2016-12-20T15:49:57Z [FLINK-6008][docs] minor improvements in the BlobService docs commit b215515fa14d3f6af218e86b67bc2c27ae9d4f4f Author: Nico Kruber Date: 2016-12-20T17:27:13Z [FLINK-6008] refactor BlobCache#getURL() for cleaner code commit bbcde52b3105fcf379c852b568f3893cc6052ce6 Author: Nico Kruber Date: 2016-12-21T15:23:29Z [FLINK-6008] do not fail the BlobServer if delete fails also extend the delete tests and remove one code duplication commit dda1a12e40027724efb0e50005e5b57058a220f0 Author: Nico Kruber Date: 2017-01-06T17:42:58Z [FLINK-6008][docs] update some config options to the new, non-deprecated ones commit e12c2348b237207a50649d515a0fbbd19f92e6a0 Author: Nico Kruber Date: 2017-03-09T17:14:02Z [FLINK-6008] use Preconditions.checkArgument in BlobClient commit 24060e01332c6df9fd01f1dc5f321c3fda9301c1 
Author: Nico Kruber Date: 2017-03-17T15:21:40Z [FLINK-6008] fix concurrent job directory creation also add according unit tests commit 2e0d16ab8bf8a48a2d028602a3a7693fc4b76039 Author: Nico Kruber Date: 2017-06-14T16:01:47Z [FLINK-6008] do not guard a delete() call with a check for existence commit 7ba911d7ecb4861261dff8509996be0bd64d6d27 Author: Nico Kruber Date: 2017-04-18T14:37:37Z [FLINK-6008] some comments about BlobLibraryCacheManager cleanup commit d3f50d595f85356ae6ed0a85e1f8b8e8ac630bde Author: Nico Kruber Date: 2017-04-19T13:39:03Z [hotfix] minor typos commit 79b6ce35a9e246b35415a388295f9ee2fc19a82e Author: Nico Kruber Date: 2017-04-19T14:10:16Z [FLINK-6008] further cleanup tests for BlobLibraryCacheManager commit 23fb6ecd6c43c86d762503339c67953290236dca Author: Nico Kruber Date: 2017-06-30T14:03:16Z [FLINK-6008] address PR comments commit 794764ceeed6b9bbbac08662f5754b218ff86c9c Author: Nico Kruber Date: 2017-06-16T08:51:04Z [FLINK-7052][blob] remove (unused) NAME_ADDRESSABLE mode commit 774bafa85f242110a2ce7907c1150f8c62d73b3f Author: Nico Kruber Date: 2017-06-21T15:05:57Z [FLINK-7052][blob] remove further unused code due to the NAME_ADDRESSABLE removal commit 4da3b3f6269e43bf1c66621099528824cad9373f Author: Nico Kruber Date: 2017-06-22T15:31:17Z [FLINK-7053][blob] remove code duplication in BlobClientSslTest This lets BlobClientSslTest extend BlobClientTest as most of its implementation came from there and was simply copied. commit aa9cdc820f9ca1a38a19708bf45a2099e42eaf48 Author: Nico Kruber Date: 2017-06-23T09:40:34Z [FLINK-7053][blob] verify some of the buffers returned by GET commit c9b693a46053b55b3939ff471184796f12d36a72 Author: Nico Kruber Date: 2017-06-23T10:04:10Z [FLINK-7053][blob] use TemporaryFolder for local BLOB dir in unit tests This replaces the use of some temporary directory where it is not guaranteed that it will be deleted after the test. 
commit 11db399d5103d9ffe9083c9b6029a7e81afa9abe Author: Nico Kruber Date: 2017-06-21T12:45:31Z [FLINK-7054][blob] remove LibraryCacheManager#getFile() This was only used in tests where it is avoidable but if used anywhere else, it may have caused cleanup issues. commit 4ae04b68453d4b099f752d6c6fd3c09335ede33a Author: Nico Kruber Date: 2017-06-21T14:14:15Z [FLINK-7055][blob] refactor getURL() to the more generic getFile() The fact that we always returned URL objects is a relic of the BlobServer's only use for URLClassLoader. Since we'd like to extend its use, returning File objects instead is more generic. > refactor BlobService#getURL() methods to return a File object > - > >
[GitHub] flink pull request #4236: [FLINK-7055][blob] refactor BlobService#getURL() m...
GitHub user NicoK opened a pull request:

    https://github.com/apache/flink/pull/4236

[FLINK-7055][blob] refactor BlobService#getURL() methods to return a File object

As a relic from its use by the `URLClassLoader`, `BlobService#getURL()` methods always returned URL objects although they were always pointing to locally cached files. As a step towards a better architecture and API, these should be renamed and return a File object instead.

This PR is based upon #4235 in a series to implement FLINK-6916.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/NicoK/flink flink-6916-7055

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4236.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #4236

commit d54a316cfffd8243980df561fd4fcbd99934a40b (Nico Kruber, 2016-12-20T15:49:57Z): [FLINK-6008][docs] minor improvements in the BlobService docs
commit b215515fa14d3f6af218e86b67bc2c27ae9d4f4f (Nico Kruber, 2016-12-20T17:27:13Z): [FLINK-6008] refactor BlobCache#getURL() for cleaner code
commit bbcde52b3105fcf379c852b568f3893cc6052ce6 (Nico Kruber, 2016-12-21T15:23:29Z): [FLINK-6008] do not fail the BlobServer if delete fails (also extend the delete tests and remove one code duplication)
commit dda1a12e40027724efb0e50005e5b57058a220f0 (Nico Kruber, 2017-01-06T17:42:58Z): [FLINK-6008][docs] update some config options to the new, non-deprecated ones
commit e12c2348b237207a50649d515a0fbbd19f92e6a0 (Nico Kruber, 2017-03-09T17:14:02Z): [FLINK-6008] use Preconditions.checkArgument in BlobClient
commit 24060e01332c6df9fd01f1dc5f321c3fda9301c1 (Nico Kruber, 2017-03-17T15:21:40Z): [FLINK-6008] fix concurrent job directory creation (also add according unit tests)
commit 2e0d16ab8bf8a48a2d028602a3a7693fc4b76039 (Nico Kruber, 2017-06-14T16:01:47Z): [FLINK-6008] do not guard a delete() call with a check for existence
commit 7ba911d7ecb4861261dff8509996be0bd64d6d27 (Nico Kruber, 2017-04-18T14:37:37Z): [FLINK-6008] some comments about BlobLibraryCacheManager cleanup
commit d3f50d595f85356ae6ed0a85e1f8b8e8ac630bde (Nico Kruber, 2017-04-19T13:39:03Z): [hotfix] minor typos
commit 79b6ce35a9e246b35415a388295f9ee2fc19a82e (Nico Kruber, 2017-04-19T14:10:16Z): [FLINK-6008] further cleanup tests for BlobLibraryCacheManager
commit 23fb6ecd6c43c86d762503339c67953290236dca (Nico Kruber, 2017-06-30T14:03:16Z): [FLINK-6008] address PR comments
commit 794764ceeed6b9bbbac08662f5754b218ff86c9c (Nico Kruber, 2017-06-16T08:51:04Z): [FLINK-7052][blob] remove (unused) NAME_ADDRESSABLE mode
commit 774bafa85f242110a2ce7907c1150f8c62d73b3f (Nico Kruber, 2017-06-21T15:05:57Z): [FLINK-7052][blob] remove further unused code due to the NAME_ADDRESSABLE removal
commit 4da3b3f6269e43bf1c66621099528824cad9373f (Nico Kruber, 2017-06-22T15:31:17Z): [FLINK-7053][blob] remove code duplication in BlobClientSslTest (lets BlobClientSslTest extend BlobClientTest, as most of its implementation came from there and was simply copied)
commit aa9cdc820f9ca1a38a19708bf45a2099e42eaf48 (Nico Kruber, 2017-06-23T09:40:34Z): [FLINK-7053][blob] verify some of the buffers returned by GET
commit c9b693a46053b55b3939ff471184796f12d36a72 (Nico Kruber, 2017-06-23T10:04:10Z): [FLINK-7053][blob] use TemporaryFolder for local BLOB dir in unit tests (replaces a temporary directory that was not guaranteed to be deleted after the test)
commit 11db399d5103d9ffe9083c9b6029a7e81afa9abe (Nico Kruber, 2017-06-21T12:45:31Z): [FLINK-7054][blob] remove LibraryCacheManager#getFile() (only used in tests where it is avoidable; if used anywhere else, it may have caused cleanup issues)
commit 4ae04b68453d4b099f752d6c6fd3c09335ede33a (Nico Kruber, 2017-06-21T14:14:15Z): [FLINK-7055][blob] refactor getURL() to the more generic getFile() (always returning URL objects is a relic of the BlobServer's only use for URLClassLoader; since we'd like to extend its use, returning File objects instead is more generic)

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
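The getURL()-to-getFile() change this PR describes can be pictured with a minimal sketch. The interface names below (`BlobServiceBefore`, `BlobServiceAfter`) are invented for illustration and are not Flink's actual API; the point is that a `File` is the more general handle, since it can still be converted to a URL wherever a `URLClassLoader` needs one:

```java
import java.io.File;
import java.io.IOException;
import java.net.URL;

public class BlobFileSketch {
    // Hypothetical "before" shape: always a URL, even though it pointed at a local file.
    interface BlobServiceBefore {
        URL getURL(String key) throws IOException;
    }

    // Hypothetical "after" shape: hand out the locally cached file directly.
    interface BlobServiceAfter {
        File getFile(String key) throws IOException;
    }

    public static void main(String[] args) throws Exception {
        // A File converts back to a URL whenever a URLClassLoader still needs one:
        File cached = new File("/tmp/blob_cache/somekey");
        URL asUrl = cached.toURI().toURL();
        System.out.println(asUrl.getProtocol()); // prints "file"
    }
}
```

Callers that never build a class loader simply use the `File`, which is what makes the refactored signature the more generic of the two.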
[jira] [Commented] (FLINK-7054) remove LibraryCacheManager#getFile()
[ https://issues.apache.org/jira/browse/FLINK-7054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070232#comment-16070232 ]

ASF GitHub Bot commented on FLINK-7054:
---
GitHub user NicoK opened a pull request:

    https://github.com/apache/flink/pull/4235

[FLINK-7054] [blob] remove LibraryCacheManager#getFile()

`LibraryCacheManager#getFile()` was only used in tests where it is avoidable but if used anywhere else, it may have caused cleanup issues.

This PR is based upon #4234 in a series to implement FLINK-6916.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/NicoK/flink flink-6916-7054

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4235.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #4235

commit d54a316cfffd8243980df561fd4fcbd99934a40b (Nico Kruber, 2016-12-20T15:49:57Z): [FLINK-6008][docs] minor improvements in the BlobService docs
commit b215515fa14d3f6af218e86b67bc2c27ae9d4f4f (Nico Kruber, 2016-12-20T17:27:13Z): [FLINK-6008] refactor BlobCache#getURL() for cleaner code
commit bbcde52b3105fcf379c852b568f3893cc6052ce6 (Nico Kruber, 2016-12-21T15:23:29Z): [FLINK-6008] do not fail the BlobServer if delete fails (also extend the delete tests and remove one code duplication)
commit dda1a12e40027724efb0e50005e5b57058a220f0 (Nico Kruber, 2017-01-06T17:42:58Z): [FLINK-6008][docs] update some config options to the new, non-deprecated ones
commit e12c2348b237207a50649d515a0fbbd19f92e6a0 (Nico Kruber, 2017-03-09T17:14:02Z): [FLINK-6008] use Preconditions.checkArgument in BlobClient
commit 24060e01332c6df9fd01f1dc5f321c3fda9301c1 (Nico Kruber, 2017-03-17T15:21:40Z): [FLINK-6008] fix concurrent job directory creation (also add according unit tests)
commit 2e0d16ab8bf8a48a2d028602a3a7693fc4b76039 (Nico Kruber, 2017-06-14T16:01:47Z): [FLINK-6008] do not guard a delete() call with a check for existence
commit 7ba911d7ecb4861261dff8509996be0bd64d6d27 (Nico Kruber, 2017-04-18T14:37:37Z): [FLINK-6008] some comments about BlobLibraryCacheManager cleanup
commit d3f50d595f85356ae6ed0a85e1f8b8e8ac630bde (Nico Kruber, 2017-04-19T13:39:03Z): [hotfix] minor typos
commit 79b6ce35a9e246b35415a388295f9ee2fc19a82e (Nico Kruber, 2017-04-19T14:10:16Z): [FLINK-6008] further cleanup tests for BlobLibraryCacheManager
commit 23fb6ecd6c43c86d762503339c67953290236dca (Nico Kruber, 2017-06-30T14:03:16Z): [FLINK-6008] address PR comments
commit 794764ceeed6b9bbbac08662f5754b218ff86c9c (Nico Kruber, 2017-06-16T08:51:04Z): [FLINK-7052][blob] remove (unused) NAME_ADDRESSABLE mode
commit 774bafa85f242110a2ce7907c1150f8c62d73b3f (Nico Kruber, 2017-06-21T15:05:57Z): [FLINK-7052][blob] remove further unused code due to the NAME_ADDRESSABLE removal
commit 4da3b3f6269e43bf1c66621099528824cad9373f (Nico Kruber, 2017-06-22T15:31:17Z): [FLINK-7053][blob] remove code duplication in BlobClientSslTest (lets BlobClientSslTest extend BlobClientTest, as most of its implementation came from there and was simply copied)
commit aa9cdc820f9ca1a38a19708bf45a2099e42eaf48 (Nico Kruber, 2017-06-23T09:40:34Z): [FLINK-7053][blob] verify some of the buffers returned by GET
commit c9b693a46053b55b3939ff471184796f12d36a72 (Nico Kruber, 2017-06-23T10:04:10Z): [FLINK-7053][blob] use TemporaryFolder for local BLOB dir in unit tests (replaces a temporary directory that was not guaranteed to be deleted after the test)
commit 11db399d5103d9ffe9083c9b6029a7e81afa9abe (Nico Kruber, 2017-06-21T12:45:31Z): [FLINK-7054][blob] remove LibraryCacheManager#getFile() (only used in tests where it is avoidable; if used anywhere else, it may have caused cleanup issues)

> remove LibraryCacheManager#getFile()
>
> Key: FLINK-7054
> URL: https://issues.apache.org/jira/browse/FLINK-7054
> Project: Flink
> Issue Type: Sub-task
> Components: Distributed Coordination, Network
> Affects Versions: 1.4.0
> Reporter: Nico Kruber
> Assignee: Nico Kruber
>
> {{LibraryCacheManager#getFile()}} was only used in tests where it is avoidable but if used anywhere else, it may have caused cleanup issues.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
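The cleanup hazard behind removing `LibraryCacheManager#getFile()` can be sketched with illustrative, stdlib-only code (this is not Flink's implementation): once a cache hands out a raw `File` reference, the cache's later cleanup silently invalidates it for the caller.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class StaleFileSketch {
    public static void main(String[] args) throws IOException {
        // Stand-in for a file managed by a library cache (a throwaway temp file here).
        File cached = Files.createTempFile("blob", ".jar").toFile();

        // A getFile()-style accessor hands the raw reference to a caller ...
        File handedOut = cached;

        // ... and the cache's cleanup later deletes the underlying file.
        cached.delete();

        // The caller's reference is now stale:
        System.out.println(handedOut.exists()); // prints "false"
    }
}
```

Keeping such accessors out of the public surface means callers cannot end up holding paths that cleanup may pull out from under them.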
[jira] [Commented] (FLINK-7053) improve code quality in some tests
[ https://issues.apache.org/jira/browse/FLINK-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070228#comment-16070228 ]

ASF GitHub Bot commented on FLINK-7053:
---
GitHub user NicoK opened a pull request:

    https://github.com/apache/flink/pull/4234

[FLINK-7053] improve code quality in some tests

* `BlobClientTest` and `BlobClientSslTest` share a lot of common code
* the received buffers there are currently not verified for being equal to the expected one
* `TemporaryFolder` should be used throughout blob store tests

This PR is based upon #4158 in a series of PRs to implement FLINK-6916.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/NicoK/flink flink-6916-7053

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4234.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #4234

commit d54a316cfffd8243980df561fd4fcbd99934a40b (Nico Kruber, 2016-12-20T15:49:57Z): [FLINK-6008][docs] minor improvements in the BlobService docs
commit b215515fa14d3f6af218e86b67bc2c27ae9d4f4f (Nico Kruber, 2016-12-20T17:27:13Z): [FLINK-6008] refactor BlobCache#getURL() for cleaner code
commit bbcde52b3105fcf379c852b568f3893cc6052ce6 (Nico Kruber, 2016-12-21T15:23:29Z): [FLINK-6008] do not fail the BlobServer if delete fails (also extend the delete tests and remove one code duplication)
commit dda1a12e40027724efb0e50005e5b57058a220f0 (Nico Kruber, 2017-01-06T17:42:58Z): [FLINK-6008][docs] update some config options to the new, non-deprecated ones
commit e12c2348b237207a50649d515a0fbbd19f92e6a0 (Nico Kruber, 2017-03-09T17:14:02Z): [FLINK-6008] use Preconditions.checkArgument in BlobClient
commit 24060e01332c6df9fd01f1dc5f321c3fda9301c1 (Nico Kruber, 2017-03-17T15:21:40Z): [FLINK-6008] fix concurrent job directory creation (also add according unit tests)
commit 2e0d16ab8bf8a48a2d028602a3a7693fc4b76039 (Nico Kruber, 2017-06-14T16:01:47Z): [FLINK-6008] do not guard a delete() call with a check for existence
commit 7ba911d7ecb4861261dff8509996be0bd64d6d27 (Nico Kruber, 2017-04-18T14:37:37Z): [FLINK-6008] some comments about BlobLibraryCacheManager cleanup
commit d3f50d595f85356ae6ed0a85e1f8b8e8ac630bde (Nico Kruber, 2017-04-19T13:39:03Z): [hotfix] minor typos
commit 79b6ce35a9e246b35415a388295f9ee2fc19a82e (Nico Kruber, 2017-04-19T14:10:16Z): [FLINK-6008] further cleanup tests for BlobLibraryCacheManager
commit 23fb6ecd6c43c86d762503339c67953290236dca (Nico Kruber, 2017-06-30T14:03:16Z): [FLINK-6008] address PR comments
commit 794764ceeed6b9bbbac08662f5754b218ff86c9c (Nico Kruber, 2017-06-16T08:51:04Z): [FLINK-7052][blob] remove (unused) NAME_ADDRESSABLE mode
commit 774bafa85f242110a2ce7907c1150f8c62d73b3f (Nico Kruber, 2017-06-21T15:05:57Z): [FLINK-7052][blob] remove further unused code due to the NAME_ADDRESSABLE removal
commit 4da3b3f6269e43bf1c66621099528824cad9373f (Nico Kruber, 2017-06-22T15:31:17Z): [FLINK-7053][blob] remove code duplication in BlobClientSslTest (lets BlobClientSslTest extend BlobClientTest, as most of its implementation came from there and was simply copied)
commit aa9cdc820f9ca1a38a19708bf45a2099e42eaf48 (Nico Kruber, 2017-06-23T09:40:34Z): [FLINK-7053][blob] verify some of the buffers returned by GET
commit c9b693a46053b55b3939ff471184796f12d36a72 (Nico Kruber, 2017-06-23T10:04:10Z): [FLINK-7053][blob] use TemporaryFolder for local BLOB dir in unit tests (replaces a temporary directory that was not guaranteed to be deleted after the test)
> improve code quality in some tests
>
> Key: FLINK-7053
> URL: https://issues.apache.org/jira/browse/FLINK-7053
> Project: Flink
> Issue Type: Sub-task
> Components: Distributed Coordination, Network
> Affects Versions: 1.4.0
> Reporter: Nico Kruber
> Assignee: Nico Kruber
>
> * {{BlobClientTest}} and {{BlobClientSslTest}} share a lot of common code
> * the received buffers there are currently not verified for being equal to the expected one
> * {{TemporaryFolder}} should be used throughout blob store tests

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
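The `TemporaryFolder` point in the ticket above is about guaranteed cleanup. As a rough stdlib-only sketch (JUnit's `TemporaryFolder` rule itself is not used here, so the example stays self-contained), an ad-hoc temp directory must be deleted manually, which is exactly the step that is easy to forget:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class TempDirSketch {
    public static void main(String[] args) throws IOException {
        // Ad-hoc temp directory, as the pre-change tests used:
        File blobDir = Files.createTempDirectory("blob-test").toFile();
        try {
            // ... a test would write local BLOBs under blobDir here ...
            System.out.println(blobDir.isDirectory()); // prints "true"
        } finally {
            // Manual cleanup: forgetting this leaves directories behind.
            // JUnit's TemporaryFolder rule performs this step automatically
            // after each test, which is why the ticket prefers it.
            blobDir.delete();
        }
    }
}
```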
[jira] [Updated] (FLINK-7053) improve code quality in some tests
[ https://issues.apache.org/jira/browse/FLINK-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nico Kruber updated FLINK-7053: --- Description: * {{BlobClientTest}} and {{BlobClientSslTest}} share a lot of common code * the received buffers there are currently not verified for being equal to the expected one * {{TemporaryFolder}} should be used throughout blob store tests was: * {{BlobClientTest}} and {{BlobClientSslTest}} share a lot of common code * the received buffers there are currently not verified for being equal to the expected one * {{TemporarFolder}} should be used throughout blob store tests > improve code quality in some tests > -- > > Key: FLINK-7053 > URL: https://issues.apache.org/jira/browse/FLINK-7053 > Project: Flink > Issue Type: Sub-task > Components: Distributed Coordination, Network >Affects Versions: 1.4.0 >Reporter: Nico Kruber >Assignee: Nico Kruber > > * {{BlobClientTest}} and {{BlobClientSslTest}} share a lot of common code > * the received buffers there are currently not verified for being equal to > the expected one > * {{TemporaryFolder}} should be used throughout blob store tests -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7057) move BLOB ref-counting from LibraryCacheManager to BlobCache
Nico Kruber created FLINK-7057: -- Summary: move BLOB ref-counting from LibraryCacheManager to BlobCache Key: FLINK-7057 URL: https://issues.apache.org/jira/browse/FLINK-7057 Project: Flink Issue Type: Sub-task Components: Distributed Coordination, Network Affects Versions: 1.4.0 Reporter: Nico Kruber Assignee: Nico Kruber Currently, the {{LibraryCacheManager}} is doing some ref-counting for JAR files managed by it. Instead, we want the {{BlobCache}} to do that itself for all job-related BLOBs. Also, we do not want to operate on a per-{{BlobKey}} level but rather per job. Therefore, the cleanup process should be adapted, too. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
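The per-job ref-counting the ticket proposes could look roughly like the following sketch. Class and method names here are invented for illustration (the actual design lives in the FLIP-19 wiki page); it only shows the core idea of counting references per job rather than per {{BlobKey}}:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative per-job reference counter; not Flink's BlobCache implementation.
public class JobRefCountSketch {
    private final Map<String, Integer> refCounts = new HashMap<>();

    /** A component (e.g. a task) starts using the job's BLOBs. */
    void registerJob(String jobId) {
        refCounts.merge(jobId, 1, Integer::sum);
    }

    /** Returns true when the last reference is gone and the job's BLOBs may be cleaned up. */
    boolean releaseJob(String jobId) {
        Integer left = refCounts.computeIfPresent(jobId, (id, c) -> c > 1 ? c - 1 : null);
        return left == null;
    }

    public static void main(String[] args) {
        JobRefCountSketch cache = new JobRefCountSketch();
        cache.registerJob("job-1");
        cache.registerJob("job-1");
        System.out.println(cache.releaseJob("job-1")); // false: one reference left
        System.out.println(cache.releaseJob("job-1")); // true: cleanup may run
    }
}
```

Counting at job granularity is what lets the adapted cleanup process delete all of a job's BLOBs in one step instead of tracking each key separately.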
[jira] [Updated] (FLINK-6916) FLIP-19: Improved BLOB storage architecture
[ https://issues.apache.org/jira/browse/FLINK-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nico Kruber updated FLINK-6916: --- Component/s: Distributed Coordination > FLIP-19: Improved BLOB storage architecture > --- > > Key: FLINK-6916 > URL: https://issues.apache.org/jira/browse/FLINK-6916 > Project: Flink > Issue Type: Improvement > Components: Distributed Coordination, Network >Affects Versions: 1.4.0 >Reporter: Nico Kruber >Assignee: Nico Kruber > > The current architecture around the BLOB server and cache components seems > rather patched up and has some issues regarding concurrency ([FLINK-6380]), > cleanup, API inconsistencies / currently unused API ([FLINK-6329], > [FLINK-6008]). These make future integration with FLIP-6 or extensions like > offloading oversized RPC messages ([FLINK-6046]) difficult. We therefore > propose an improvement on the current architecture as described below which > tackles these issues, provides some cleanup, and enables further BLOB server > use cases. > Please refer to > https://cwiki.apache.org/confluence/display/FLINK/FLIP-19%3A+Improved+BLOB+storage+architecture > for a full overview on the proposed changes. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (FLINK-7052) remove NAME_ADDRESSABLE mode
[ https://issues.apache.org/jira/browse/FLINK-7052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nico Kruber updated FLINK-7052: --- Component/s: Distributed Coordination > remove NAME_ADDRESSABLE mode > > > Key: FLINK-7052 > URL: https://issues.apache.org/jira/browse/FLINK-7052 > Project: Flink > Issue Type: Sub-task > Components: Distributed Coordination, Network >Affects Versions: 1.4.0 >Reporter: Nico Kruber >Assignee: Nico Kruber > > Remove the BLOB store's {{NAME_ADDRESSABLE}} mode as it is currently not used > and partly broken. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6789) Remove duplicated test utility reducer in optimizer
[ https://issues.apache.org/jira/browse/FLINK-6789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070210#comment-16070210 ] ASF GitHub Bot commented on FLINK-6789: --- Github user zhangminglei commented on the issue: https://github.com/apache/flink/pull/4216 Thanks for review. > Remove duplicated test utility reducer in optimizer > --- > > Key: FLINK-6789 > URL: https://issues.apache.org/jira/browse/FLINK-6789 > Project: Flink > Issue Type: Improvement > Components: Optimizer, Tests >Affects Versions: 1.4.0 >Reporter: Chesnay Schepler >Assignee: mingleizhang > > The {{DummyReducer}} and {{SelectOneReducer}} in > {{org.apache.flink.optimizer.testfunctions}} are identical; we could remove > one of them. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7056) add API to allow job-related BLOBs to be stored
Nico Kruber created FLINK-7056: -- Summary: add API to allow job-related BLOBs to be stored Key: FLINK-7056 URL: https://issues.apache.org/jira/browse/FLINK-7056 Project: Flink Issue Type: Sub-task Components: Distributed Coordination, Network Affects Versions: 1.4.0 Reporter: Nico Kruber Assignee: Nico Kruber To ease cleanup, we will make job-related BLOBs be reflected in the blob storage so that they may be removed along with the job. This adds the jobId to many methods similar to the previous code from the {{NAME_ADDRESSABLE}} mode. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
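One way to picture the jobId-scoped storage this ticket describes: if each BLOB lives under a per-job directory, deleting that directory removes all of the job's BLOBs at once. The layout below is an assumption for illustration only, not Flink's actual on-disk format:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class JobScopedBlobPath {
    // Hypothetical layout: <storage root>/job_<jobId>/blob_<key>
    static Path blobPath(Path storageRoot, String jobId, String blobKey) {
        return storageRoot.resolve("job_" + jobId).resolve("blob_" + blobKey);
    }

    public static void main(String[] args) {
        Path p = blobPath(Paths.get("/tmp/blobStore"), "abc123", "k1");
        // On a POSIX system this prints /tmp/blobStore/job_abc123/blob_k1;
        // removing the job_abc123 directory then drops every BLOB of that job.
        System.out.println(p);
    }
}
```

This is why the ticket adds a jobId to many methods: the storage layer needs to know which job a BLOB belongs to at write time so it can be removed along with the job later.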
[jira] [Created] (FLINK-7055) refactor BlobService#getURL() methods to return a File object
Nico Kruber created FLINK-7055: -- Summary: refactor BlobService#getURL() methods to return a File object Key: FLINK-7055 URL: https://issues.apache.org/jira/browse/FLINK-7055 Project: Flink Issue Type: Sub-task Components: Distributed Coordination, Network Affects Versions: 1.4.0 Reporter: Nico Kruber Assignee: Nico Kruber As a relic from its use by the {{URLClassLoader}}, {{BlobService#getURL()}} methods always returned {{URL}} objects although they were always pointing to locally cached files. As a step towards a better architecture and API, these should return a File object instead. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7054) remove LibraryCacheManager#getFile()
Nico Kruber created FLINK-7054: -- Summary: remove LibraryCacheManager#getFile() Key: FLINK-7054 URL: https://issues.apache.org/jira/browse/FLINK-7054 Project: Flink Issue Type: Sub-task Components: Distributed Coordination, Network Affects Versions: 1.4.0 Reporter: Nico Kruber Assignee: Nico Kruber {{LibraryCacheManager#getFile()}} was only used in tests where it is avoidable but if used anywhere else, it may have caused cleanup issues. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7053) improve code quality in some tests
Nico Kruber created FLINK-7053: -- Summary: improve code quality in some tests Key: FLINK-7053 URL: https://issues.apache.org/jira/browse/FLINK-7053 Project: Flink Issue Type: Sub-task Components: Distributed Coordination, Network Affects Versions: 1.4.0 Reporter: Nico Kruber Assignee: Nico Kruber * {{BlobClientTest}} and {{BlobClientSslTest}} share a lot of common code * the received buffers there are currently not verified for being equal to the expected one * {{TemporarFolder}} should be used throughout blob store tests -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] flink issue #4174: [FLINK-6916][blob] more code style and test improvements
Github user NicoK commented on the issue: https://github.com/apache/flink/pull/4174 let's split this up into two parts... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-6916) FLIP-19: Improved BLOB storage architecture
[ https://issues.apache.org/jira/browse/FLINK-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070173#comment-16070173 ] ASF GitHub Bot commented on FLINK-6916: --- Github user NicoK commented on the issue: https://github.com/apache/flink/pull/4174 let's split this up into two parts... > FLIP-19: Improved BLOB storage architecture > --- > > Key: FLINK-6916 > URL: https://issues.apache.org/jira/browse/FLINK-6916 > Project: Flink > Issue Type: Improvement > Components: Network >Affects Versions: 1.4.0 >Reporter: Nico Kruber >Assignee: Nico Kruber > > The current architecture around the BLOB server and cache components seems > rather patched up and has some issues regarding concurrency ([FLINK-6380]), > cleanup, API inconsistencies / currently unused API ([FLINK-6329], > [FLINK-6008]). These make future integration with FLIP-6 or extensions like > offloading oversized RPC messages ([FLINK-6046]) difficult. We therefore > propose an improvement on the current architecture as described below which > tackles these issues, provides some cleanup, and enables further BLOB server > use cases. > Please refer to > https://cwiki.apache.org/confluence/display/FLINK/FLIP-19%3A+Improved+BLOB+storage+architecture > for a full overview on the proposed changes. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6916) FLIP-19: Improved BLOB storage architecture
[ https://issues.apache.org/jira/browse/FLINK-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070174#comment-16070174 ] ASF GitHub Bot commented on FLINK-6916: --- Github user NicoK closed the pull request at: https://github.com/apache/flink/pull/4174 > FLIP-19: Improved BLOB storage architecture > --- > > Key: FLINK-6916 > URL: https://issues.apache.org/jira/browse/FLINK-6916 > Project: Flink > Issue Type: Improvement > Components: Network >Affects Versions: 1.4.0 >Reporter: Nico Kruber >Assignee: Nico Kruber > > The current architecture around the BLOB server and cache components seems > rather patched up and has some issues regarding concurrency ([FLINK-6380]), > cleanup, API inconsistencies / currently unused API ([FLINK-6329], > [FLINK-6008]). These make future integration with FLIP-6 or extensions like > offloading oversized RPC messages ([FLINK-6046]) difficult. We therefore > propose an improvement on the current architecture as described below which > tackles these issues, provides some cleanup, and enables further BLOB server > use cases. > Please refer to > https://cwiki.apache.org/confluence/display/FLINK/FLIP-19%3A+Improved+BLOB+storage+architecture > for a full overview on the proposed changes. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] flink pull request #4174: [FLINK-6916][blob] more code style and test improv...
Github user NicoK closed the pull request at: https://github.com/apache/flink/pull/4174
[jira] [Commented] (FLINK-7047) Reorganize build profiles
[ https://issues.apache.org/jira/browse/FLINK-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070164#comment-16070164 ] ASF GitHub Bot commented on FLINK-7047: --- Github user zentol commented on the issue: https://github.com/apache/flink/pull/4233 I would merge this PR in any case. It's a nice short-term option that is more stable than the previous approach; regardless of how many times we add in flink-runtime we can still test connectors/libs and vice versa. > Reorganize build profiles > - > > Key: FLINK-7047 > URL: https://issues.apache.org/jira/browse/FLINK-7047 > Project: Flink > Issue Type: Improvement > Components: Tests, Travis >Affects Versions: 1.4.0 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler > Fix For: 1.4.0 > > > With the current build times once again hitting the timeout it is time to > revisit our approach. > The current approach of splitting all tests by name, while easy to maintain > or extend, has the big disadvantage that it's fairly binary in regards to the > timeout: either we're below the timeout and all builds pass, or we're above > and the entire merging process stalls. Furthermore, it requires all modules > to be compiled. > I propose a different approach by which we bundle several modules, only > execute the tests of these modules and skip the compilation of some modules > that are not required for these tests. > 5 groups are my current suggestion, which will result in 10 build profiles > total. 
> The groups are: > # *core* - core flink modules like core,runtime,streaming-java,metrics,rocksdb > # *libraries* - flink-libraries and flink-storm > # *connectors* - flink-connectors, flink-connector-wikiedits, > flink-tweet-inputformat > # *tests* - flink-tests > # *misc* - flink-yarn, flink-yarn-tests, flink-mesos, flink-examples, > flink-dist > To not increase the total number of profiles to ridiculous numbers I also > propose to only test against 2 combinations of jdk+hadoop+scala: > # oraclejdk8 + hadoop 2.8.0 + scala 2.11 > # openjdk7 + hadoop 2.4.1 + scala 2.10 > My current estimate is that this will cause profiles to take at most 40 > minutes.
[GitHub] flink issue #4233: [FLINK-7047] [travis] Reorganize build profiles
Github user zentol commented on the issue: https://github.com/apache/flink/pull/4233 I would merge this PR in any case. It's a nice short-term option that is more stable than the previous approach; regardless of how many times we add in flink-runtime we can still test connectors/libs and vice versa.
[jira] [Created] (FLINK-7052) remove NAME_ADDRESSABLE mode
Nico Kruber created FLINK-7052: -- Summary: remove NAME_ADDRESSABLE mode Key: FLINK-7052 URL: https://issues.apache.org/jira/browse/FLINK-7052 Project: Flink Issue Type: Sub-task Components: Network Affects Versions: 1.4.0 Reporter: Nico Kruber Assignee: Nico Kruber Remove the BLOB store's {{NAME_ADDRESSABLE}} mode as it is currently not used and partly broken.
[jira] [Updated] (FLINK-6008) collection of BlobServer improvements
[ https://issues.apache.org/jira/browse/FLINK-6008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nico Kruber updated FLINK-6008: --- Description: The following things should be improved around the BlobServer/BlobCache: * update config options with non-deprecated ones, e.g. {{high-availability.cluster-id}} and {{high-availability.storageDir}} * -promote {{BlobStore#deleteAll(JobID)}} to the {{BlobService}}- * -extend the {{BlobService}} to work with {{NAME_ADDRESSABLE}} blobs (prepares FLINK-4399)- * -remove {{NAME_ADDRESSABLE}} blobs after job/task termination- * do not fail the {{BlobServer}} when a delete operation fails * code style, like using {{Preconditions.checkArgument}} was: The following things should be improved around the BlobServer/BlobCache: * update config options with non-deprecated ones, e.g. {{high-availability.cluster-id}} and {{high-availability.storageDir}} * promote {{BlobStore#deleteAll(JobID)}} to the {{BlobService}} * extend the {{BlobService}} to work with {{NAME_ADDRESSABLE}} blobs (prepares FLINK-4399) * remove {{NAME_ADDRESSABLE}} blobs after job/task termination * do not fail the {{BlobServer}} when a delete operation fails * code style, like using {{Preconditions.checkArgument}} > collection of BlobServer improvements > - > > Key: FLINK-6008 > URL: https://issues.apache.org/jira/browse/FLINK-6008 > Project: Flink > Issue Type: Improvement > Components: Network >Affects Versions: 1.3.0 >Reporter: Nico Kruber >Assignee: Nico Kruber > > The following things should be improved around the BlobServer/BlobCache: > * update config options with non-deprecated ones, e.g. 
> {{high-availability.cluster-id}} and {{high-availability.storageDir}} > * -promote {{BlobStore#deleteAll(JobID)}} to the {{BlobService}}- > * -extend the {{BlobService}} to work with {{NAME_ADDRESSABLE}} blobs > (prepares FLINK-4399)- > * -remove {{NAME_ADDRESSABLE}} blobs after job/task termination- > * do not fail the {{BlobServer}} when a delete operation fails > * code style, like using {{Preconditions.checkArgument}}
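The last bullet ("code style, like using {{Preconditions.checkArgument}}") refers to Flink's guava-style argument checks ({{org.apache.flink.util.Preconditions}}). A minimal stdlib-only stand-in showing the pattern; the {{parseBlobPort}} helper and its valid range are hypothetical, purely for illustration:

```java
public class PreconditionsSketch {
    // Minimal stand-in for org.apache.flink.util.Preconditions#checkArgument.
    static void checkArgument(boolean condition, String message) {
        if (!condition) {
            throw new IllegalArgumentException(message);
        }
    }

    // Hypothetical validation helper: fail fast with a descriptive
    // message instead of a bare if/throw at each call site.
    static int parseBlobPort(int port) {
        checkArgument(port > 0 && port < 65536, "port must be in (0, 65536): " + port);
        return port;
    }

    public static void main(String[] args) {
        System.out.println(parseBlobPort(6124));
        try {
            parseBlobPort(-1);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```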
[GitHub] flink issue #4233: [FLINK-7047] [travis] Reorganize build profiles
Github user greghogan commented on the issue: https://github.com/apache/flink/pull/4233 @zentol could you start a mailing list discussion and describe your preference for splitting the tests in this manner or splitting the repo? Despite heroic efforts by you and Robert keeping the test times under the timeout continues to be a Sisyphean task. The number of tests never truly drops so the test times will only decrease on the off chance that TravisCI bumps the number of cores (or Google increases CPU performance, which they have never done).
[jira] [Commented] (FLINK-7047) Reorganize build profiles
[ https://issues.apache.org/jira/browse/FLINK-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070104#comment-16070104 ] ASF GitHub Bot commented on FLINK-7047: --- Github user greghogan commented on the issue: https://github.com/apache/flink/pull/4233 @zentol could you start a mailing list discussion and describe your preference for splitting the tests in this manner or splitting the repo? Despite heroic efforts by you and Robert keeping the test times under the timeout continues to be a Sisyphean task. The number of tests never truly drops so the test times will only decrease on the off chance that TravisCI bumps the number of cores (or Google increases CPU performance, which they have never done). > Reorganize build profiles > - > > Key: FLINK-7047 > URL: https://issues.apache.org/jira/browse/FLINK-7047 > Project: Flink > Issue Type: Improvement > Components: Tests, Travis >Affects Versions: 1.4.0 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler > Fix For: 1.4.0 > > > With the current build times once again hitting the timeout it is time to > revisit our approach. > The current approach of splitting all tests by name, while easy to maintain > or extend, has the big disadvantage that it's fairly binary in regards to the > timeout: either we're below the timeout and all builds pass, or we're above > and the entire merging process stalls. Furthermore, it requires all modules > to be compiled. > I propose a different approach by which we bundle several modules, only > execute the tests of these modules and skip the compilation of some modules > that are not required for these tests. > 5 groups are my current suggestion, which will result in 10 build profiles > total. 
> The groups are: > # *core* - core flink modules like core,runtime,streaming-java,metrics,rocksdb > # *libraries* - flink-libraries and flink-storm > # *connectors* - flink-connectors, flink-connector-wikiedits, > flink-tweet-inputformat > # *tests* - flink-tests > # *misc* - flink-yarn, flink-yarn-tests, flink-mesos, flink-examples, > flink-dist > To not increase the total number of profiles to ridiculous numbers I also > propose to only test against 2 combinations of jdk+hadoop+scala: > # oraclejdk8 + hadoop 2.8.0 + scala 2.11 > # openjdk7 + hadoop 2.4.1 + scala 2.10 > My current estimate is that this will cause profiles to take at most 40 > minutes.
[jira] [Commented] (FLINK-7044) Add methods to the client API that take the stateDescriptor.
[ https://issues.apache.org/jira/browse/FLINK-7044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070096#comment-16070096 ] ASF GitHub Bot commented on FLINK-7044: --- Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/4225#discussion_r125039522 --- Diff: flink-tests/src/test/java/org/apache/flink/test/query/AbstractQueryableStateITCase.java --- @@ -534,33 +537,66 @@ public Integer getKey(Tuple2 value) throws Exception { * Retry a query for state for keys between 0 and {@link #NUM_SLOTS} until * expected equals the value of the result tuple's second field. */ - private void executeValueQuery(final Deadline deadline, - final QueryableStateClient client, final JobID jobId, - final QueryableStateStream> queryableState, - final long expected) throws Exception { + private void executeValueQuery( --- End diff -- 👍 > Add methods to the client API that take the stateDescriptor. > > > Key: FLINK-7044 > URL: https://issues.apache.org/jira/browse/FLINK-7044 > Project: Flink > Issue Type: Improvement > Components: Queryable State >Affects Versions: 1.3.0, 1.3.1 >Reporter: Kostas Kloudas >Assignee: Kostas Kloudas > Fix For: 1.4.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] flink pull request #4225: [FLINK-7044] [queryable-st] Allow to specify names...
Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/4225#discussion_r125039522 --- Diff: flink-tests/src/test/java/org/apache/flink/test/query/AbstractQueryableStateITCase.java --- @@ -534,33 +537,66 @@ public Integer getKey(Tuple2 value) throws Exception { * Retry a query for state for keys between 0 and {@link #NUM_SLOTS} until * expected equals the value of the result tuple's second field. */ - private void executeValueQuery(final Deadline deadline, - final QueryableStateClient client, final JobID jobId, - final QueryableStateStream> queryableState, - final long expected) throws Exception { + private void executeValueQuery( --- End diff -- 👍
[jira] [Commented] (FLINK-7044) Add methods to the client API that take the stateDescriptor.
[ https://issues.apache.org/jira/browse/FLINK-7044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070094#comment-16070094 ] ASF GitHub Bot commented on FLINK-7044: --- Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/4225#discussion_r125039482 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/state/VoidNamespaceTypeInfo.java --- @@ -0,0 +1,102 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.runtime.state; + +import org.apache.flink.annotation.Public; +import org.apache.flink.annotation.PublicEvolving; +import org.apache.flink.api.common.ExecutionConfig; +import org.apache.flink.api.common.typeinfo.TypeInformation; +import org.apache.flink.api.common.typeutils.TypeSerializer; + +/** + * {@link TypeInformation} for {@link VoidNamespace}. + */ +@Public +public class VoidNamespaceTypeInfo extends TypeInformation { + + private static final long serialVersionUID = 5453679706408610586L; + + public static final VoidNamespaceTypeInfo INSTANCE = new VoidNamespaceTypeInfo(); + + @Override + @PublicEvolving --- End diff -- sounds good! 
> Add methods to the client API that take the stateDescriptor. > > > Key: FLINK-7044 > URL: https://issues.apache.org/jira/browse/FLINK-7044 > Project: Flink > Issue Type: Improvement > Components: Queryable State >Affects Versions: 1.3.0, 1.3.1 >Reporter: Kostas Kloudas >Assignee: Kostas Kloudas > Fix For: 1.4.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] flink pull request #4225: [FLINK-7044] [queryable-st] Allow to specify names...
Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/4225#discussion_r125039482 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/state/VoidNamespaceTypeInfo.java --- @@ -0,0 +1,102 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.runtime.state; + +import org.apache.flink.annotation.Public; +import org.apache.flink.annotation.PublicEvolving; +import org.apache.flink.api.common.ExecutionConfig; +import org.apache.flink.api.common.typeinfo.TypeInformation; +import org.apache.flink.api.common.typeutils.TypeSerializer; + +/** + * {@link TypeInformation} for {@link VoidNamespace}. + */ +@Public +public class VoidNamespaceTypeInfo extends TypeInformation { + + private static final long serialVersionUID = 5453679706408610586L; + + public static final VoidNamespaceTypeInfo INSTANCE = new VoidNamespaceTypeInfo(); + + @Override + @PublicEvolving --- End diff -- sounds good! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. 
[jira] [Updated] (FLINK-7003) "select * from" in Flink SQL should not flatten all fields in the table by default
[ https://issues.apache.org/jira/browse/FLINK-7003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timo Walther updated FLINK-7003: Issue Type: Sub-task (was: Bug) Parent: FLINK-7051 > "select * from" in Flink SQL should not flatten all fields in the table by > default > -- > > Key: FLINK-7003 > URL: https://issues.apache.org/jira/browse/FLINK-7003 > Project: Flink > Issue Type: Sub-task > Components: Table API & SQL >Reporter: Shuyi Chen > > Currently, CompositeRelDataType is extended from > RelRecordType(StructKind.PEEK_FIELDS, ...). In Calcite, > StructKind.PEEK_FIELDS would allow us to peek fields for nested types. > However, when we use "select * from", calcite will flatten all nested fields > that is marked as StructKind.PEEK_FIELDS in the table. > For example, if the table structure *T* is as follows: > {code:java} > VARCHAR K0, > VARCHAR C1, > RecordType:peek_no_flattening(INTEGER C0, INTEGER C1) F0, > RecordType:peek_no_flattening(INTEGER C0, INTEGER C2) F1 > {code} > The following query > {code:java} > Select * from T > {code} should return (K0, C1, F0, F1) instead of (K0, C1, F0.C0, F0.C1, > F1.C0, F1.C1), which is the current behavior. > After upgrading to Calcite 1.14, this issue should change the type of > {{CompositeRelDataType}} to {{StructKind.PEEK_FIELDS_NO_FLATTENING}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-5031) Consecutive DataStream.split() ignored
[ https://issues.apache.org/jira/browse/FLINK-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070089#comment-16070089 ] ASF GitHub Bot commented on FLINK-5031: --- Github user aljoscha commented on the issue: https://github.com/apache/flink/pull/2847 I wrote a somewhat longer comment on the issue explaining some stuff: https://issues.apache.org/jira/browse/FLINK-5031?focusedCommentId=16070088&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16070088 > Consecutive DataStream.split() ignored > -- > > Key: FLINK-5031 > URL: https://issues.apache.org/jira/browse/FLINK-5031 > Project: Flink > Issue Type: Bug > Components: DataStream API >Affects Versions: 1.2.0, 1.1.3 >Reporter: Fabian Hueske >Assignee: Renkai Ge > > The output of the following program > {code} > static final class ThresholdSelector implements OutputSelector { > long threshold; > public ThresholdSelector(long threshold) { > this.threshold = threshold; > } > @Override > public Iterable select(Long value) { > if (value < threshold) { > return Collections.singletonList("Less"); > } else { > return Collections.singletonList("GreaterEqual"); > } > } > } > public static void main(String[] args) throws Exception { > StreamExecutionEnvironment env = > StreamExecutionEnvironment.getExecutionEnvironment(); > env.setParallelism(1); > SplitStream split1 = env.generateSequence(1, 11) > .split(new ThresholdSelector(6)); > // stream11 should be [1,2,3,4,5] > DataStream stream11 = split1.select("Less"); > SplitStream split2 = stream11 > //.map(new MapFunction() { > //@Override > //public Long map(Long value) throws Exception { > //return value; > //} > //}) > .split(new ThresholdSelector(3)); > DataStream stream21 = split2.select("Less"); > // stream21 should be [1,2] > stream21.print(); > env.execute(); > } > {code} > should be {{1, 2}}, however it is {{1, 2, 3, 4, 5}}. It seems that the second > {{split}} operation is ignored. 
> The program is correctly evaluated if the identity {{MapFunction}} is added to > the program.
[jira] [Commented] (FLINK-5031) Consecutive DataStream.split() ignored
[ https://issues.apache.org/jira/browse/FLINK-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070088#comment-16070088 ] Aljoscha Krettek commented on FLINK-5031: - Actually, the current behaviour seems a bit more complicated than "union". In this example (from [~RenkaiGe]'s PR): {code} StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); env.setParallelism(1); env.setBufferTimeout(1); DataStream ds = env.generateSequence(0,11); SplitStream consecutiveSplit = ds.split(new OutputSelector() { @Override public Iterable select(Long value) { List s = new ArrayList(); if (value <= 5) { s.add("Less"); } else { s.add("GreaterEqual"); } return s; } }).select("Less") .split(new OutputSelector() { @Override public Iterable select(Long value) { List s = new ArrayList(); if (value % 2 == 0) { s.add("Even"); } else { s.add("Odd"); } return s; } }); consecutiveSplit.select("Even").addSink(smallEvenSink); consecutiveSplit.select("Odd").addSink(smallOddSink); env.execute(); {code} the output with the current master is {{0, 2, 4, 6, 8, 10}}. It works like this: {{split()}} operations "attach" tags to elements, then only the last {{select()}} is taken into account when deciding what to forward. The reason why the example with "Less" seems to output the union is that they both attach the same tag name. In the "Less"/"Even" example you can see that only the "Even" elements are selected and not the "Less" ones. It would be easy to change the behaviour to truly union, i.e. {{0, 1, 2, 3, 4, 5, 6, 8, 10}}, in fact I have a branch that does that: https://github.com/aljoscha/flink/tree/finish-pr-2847-split-select-intersection (The names/titles are misleading because I tried doing "intersection" but it's not quite possible currently). 
Providing "intersection" behaviour without introducing an identity operator (as in [~RenkaiGe]'s PR) is not possible because of how the output is sent along from one operator to the next in a chain and I would be very hesitant about adding a dummy operator for this case. To conclude, I would actually favour not fixing this at all and instead deprecate split/select because it is superseded by the strictly more powerful side outputs. What do you think [~fhueske] as the original reporter of the issue? > Consecutive DataStream.split() ignored > -- > > Key: FLINK-5031 > URL: https://issues.apache.org/jira/browse/FLINK-5031 > Project: Flink > Issue Type: Bug > Components: DataStream API >Affects Versions: 1.2.0, 1.1.3 >Reporter: Fabian Hueske >Assignee: Renkai Ge > > The output of the following program > {code} > static final class ThresholdSelector implements OutputSelector { > long threshold; > public ThresholdSelector(long threshold) { > this.threshold = threshold; > } > @Override > public Iterable select(Long value) { > if (value < threshold) { > return Collections.singletonList("Less"); > } else { > return Collections.singletonList("GreaterEqual"); > } > } > } > public static void main(String[] args) throws Exception { > StreamExecutionEnvironment env = > StreamExecutionEnvironment.getExecutionEnvironment(); > env.setParallelism(1); > SplitStream split1 = env.generateSequence(1, 11) > .split(new ThresholdSelector(6)); > // stream11 should be [1,2,3,4,5] > DataStream stream11 = split1.select("Less"); > SplitStream split2 = stream11 > //.map(new MapFunction() { > //@Override > //public Long map(Long value) throws Exception { > //return value; > //} > //}) > .split(new ThresholdSelector(3)); > DataStream stream21 = split2.select("Less"); > // stream21 should be [1,2] > stream21.print(); > env.execute(); > } > {code} > should be {{1, 2}}, however it is {{1, 2, 3, 4, 5}}. It seems that the second > {{split}} operation is ignored. 
> The program is correctly evaluated if the identity {{MapFunction}} is added to > the program.
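Aljoscha's "only the last {{select()}} is taken into account" behaviour can be modelled with plain collections, outside Flink entirely. This sketch reproduces the tagging logic of the Less/Even example above and contrasts it with the intersection semantics the reporter expected; none of this is Flink API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongFunction;

public class SplitSelectModel {
    public static void main(String[] args) {
        // Tagging functions corresponding to the two chained split() calls.
        LongFunction<String> first = v -> v <= 5 ? "Less" : "GreaterEqual";
        LongFunction<String> second = v -> v % 2 == 0 ? "Even" : "Odd";

        // Current master behaviour: only the tag attached by the LAST split()
        // is consulted when select("Even") decides what to forward, so the
        // earlier select("Less") has no effect on the result.
        List<Long> lastSelectWins = new ArrayList<>();
        for (long v = 0; v <= 11; v++) {
            if (second.apply(v).equals("Even")) {
                lastSelectWins.add(v);
            }
        }
        System.out.println(lastSelectWins); // the observed 0, 2, 4, 6, 8, 10

        // Intersection semantics (what the reporter expected): an element
        // must pass every select() in the chain to be forwarded.
        List<Long> intersection = new ArrayList<>();
        for (long v = 0; v <= 11; v++) {
            if (first.apply(v).equals("Less") && second.apply(v).equals("Even")) {
                intersection.add(v);
            }
        }
        System.out.println(intersection);
    }
}
```

Side outputs, which Aljoscha suggests as the replacement for split/select, avoid the ambiguity because each element is emitted to explicitly named outputs rather than tagged and re-filtered.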
[GitHub] flink issue #2847: [FLINK-5031]Consecutive DataStream.split() ignored
Github user aljoscha commented on the issue: https://github.com/apache/flink/pull/2847 I wrote a somewhat longer comment on the issue explaining some stuff: https://issues.apache.org/jira/browse/FLINK-5031?focusedCommentId=16070088&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16070088
[jira] [Created] (FLINK-7051) Bump up Calcite version to 1.14
Timo Walther created FLINK-7051: --- Summary: Bump up Calcite version to 1.14 Key: FLINK-7051 URL: https://issues.apache.org/jira/browse/FLINK-7051 Project: Flink Issue Type: New Feature Components: Table API & SQL Reporter: Timo Walther This is an umbrella issue for all tasks that need to be done once Apache Calcite 1.14 is released.
[jira] [Updated] (FLINK-7003) "select * from" in Flink SQL should not flatten all fields in the table by default
[ https://issues.apache.org/jira/browse/FLINK-7003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timo Walther updated FLINK-7003: Description: Currently, CompositeRelDataType is extended from RelRecordType(StructKind.PEEK_FIELDS, ...). In Calcite, StructKind.PEEK_FIELDS would allow us to peek fields for nested types. However, when we use "select * from", calcite will flatten all nested fields that is marked as StructKind.PEEK_FIELDS in the table. For example, if the table structure *T* is as follows: {code:java} VARCHAR K0, VARCHAR C1, RecordType:peek_no_flattening(INTEGER C0, INTEGER C1) F0, RecordType:peek_no_flattening(INTEGER C0, INTEGER C2) F1 {code} The following query {code:java} Select * from T {code} should return (K0, C1, F0, F1) instead of (K0, C1, F0.C0, F0.C1, F1.C0, F1.C1), which is the current behavior. After upgrading to Calcite 1.14, this issue should change the type of {{CompositeRelDataType}} to {{StructKind.PEEK_FIELDS_NO_FLATTENING}}. was: Currently, CompositeRelDataType is extended from RelRecordType(StructKind.PEEK_FIELDS, ...). In Calcite, StructKind.PEEK_FIELDS would allow us to peek fields for nested types. However, when we use "select * from", calcite will flatten all nested fields that is marked as StructKind.PEEK_FIELDS in the table. For example, if the table structure *T* is as follows: {code:java} VARCHAR K0, VARCHAR C1, RecordType:peek_no_flattening(INTEGER C0, INTEGER C1) F0, RecordType:peek_no_flattening(INTEGER C0, INTEGER C2) F1 {code} The following query {code:java} Select * from T {code} should return (K0, C1, F0, F1) instead of (K0, C1, F0.C0, F0.C1, F1.C0, F1.C1), which is the current behavior. 
> "select * from" in Flink SQL should not flatten all fields in the table by > default > -- > > Key: FLINK-7003 > URL: https://issues.apache.org/jira/browse/FLINK-7003 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Reporter: Shuyi Chen > > Currently, CompositeRelDataType is extended from > RelRecordType(StructKind.PEEK_FIELDS, ...). In Calcite, > StructKind.PEEK_FIELDS would allow us to peek fields for nested types. > However, when we use "select * from", Calcite will flatten all nested fields > that are marked as StructKind.PEEK_FIELDS in the table. > For example, if the table structure *T* is as follows: > {code:java} > VARCHAR K0, > VARCHAR C1, > RecordType:peek_no_flattening(INTEGER C0, INTEGER C1) F0, > RecordType:peek_no_flattening(INTEGER C0, INTEGER C2) F1 > {code} > The following query > {code:java} > Select * from T > {code} should return (K0, C1, F0, F1) instead of (K0, C1, F0.C0, F0.C1, > F1.C0, F1.C2), which is the current behavior. > After upgrading to Calcite 1.14, this issue should change the type of > {{CompositeRelDataType}} to {{StructKind.PEEK_FIELDS_NO_FLATTENING}}.
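The flattening difference is easy to see with a plain-Java model of the row type *T* from the example (nothing here is Calcite API; note that flattening F1, declared with fields C0 and C2, yields F1.C0 and F1.C2):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StarExpansionModel {
    public static void main(String[] args) {
        // Table T from the example: two scalar fields, two nested records.
        // null marks a scalar; a list holds a nested record's field names.
        Map<String, List<String>> t = new LinkedHashMap<>();
        t.put("K0", null);                          // VARCHAR
        t.put("C1", null);                          // VARCHAR
        t.put("F0", Arrays.asList("C0", "C1"));     // nested record
        t.put("F1", Arrays.asList("C0", "C2"));     // nested record

        // PEEK_FIELDS behaviour: "select *" expands nested records.
        List<String> flattened = new ArrayList<>();
        t.forEach((name, nested) -> {
            if (nested == null) {
                flattened.add(name);
            } else {
                nested.forEach(sub -> flattened.add(name + "." + sub));
            }
        });
        System.out.println(flattened);

        // PEEK_FIELDS_NO_FLATTENING behaviour: nested records stay whole.
        System.out.println(new ArrayList<>(t.keySet()));
    }
}
```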
[jira] [Commented] (FLINK-7048) Define javadoc skipping in travis watchdog script
[ https://issues.apache.org/jira/browse/FLINK-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070084#comment-16070084 ] ASF GitHub Bot commented on FLINK-7048: --- Github user greghogan commented on the issue: https://github.com/apache/flink/pull/4227 +1 > Define javadoc skipping in travis watchdog script > - > > Key: FLINK-7048 > URL: https://issues.apache.org/jira/browse/FLINK-7048 > Project: Flink > Issue Type: Improvement > Components: Travis >Affects Versions: 1.4.0 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Trivial > Fix For: 1.4.0 > > > For all builds on travis we currently skip building the javadocs to save some > time. For this purpose we added {{-Dmaven.skip.javadocs=true}} to each build > profile in {{.travis.yml}} > I would like to move the declaration of this into the travis watchdog script, > as it is obfuscating the profile view for travis builds. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] flink issue #4227: [FLINK-7048] [travis] Define javadoc skipping in travis w...
Github user greghogan commented on the issue: https://github.com/apache/flink/pull/4227 +1
[jira] [Commented] (FLINK-7046) Hide logging about downloaded artifacts on travis
[ https://issues.apache.org/jira/browse/FLINK-7046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070082#comment-16070082 ] ASF GitHub Bot commented on FLINK-7046: --- Github user greghogan commented on the issue: https://github.com/apache/flink/pull/4226 +1! > Hide logging about downloaded artifacts on travis > - > > Key: FLINK-7046 > URL: https://issues.apache.org/jira/browse/FLINK-7046 > Project: Flink > Issue Type: Improvement > Components: Travis >Affects Versions: 1.4.0 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Trivial > Fix For: 1.4.0 > > > We can reduce the verbosity of the travis logs by hiding messages about > downloaded artifacts, such as this: > {code} > [INFO] Downloading: > https://repo.maven.apache.org/maven2/org/eclipse/tycho/tycho-compiler-jdt/0.21.0/tycho-compiler-jdt-0.21.0.pom > [INFO] Downloaded: > https://repo.maven.apache.org/maven2/org/eclipse/tycho/tycho-compiler-jdt/0.21.0/tycho-compiler-jdt-0.21.0.pom > (2 KB at 62.0 KB/sec) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] flink issue #4226: [FLINK-7046] [travis] Hide download logging messages
Github user greghogan commented on the issue: https://github.com/apache/flink/pull/4226 +1!