[jira] [Created] (FLINK-34477) support capture groups in REGEXP_REPLACE
David Anderson created FLINK-34477: -- Summary: support capture groups in REGEXP_REPLACE Key: FLINK-34477 URL: https://issues.apache.org/jira/browse/FLINK-34477 Project: Flink Issue Type: Improvement Components: Table SQL / API Reporter: David Anderson For example, I would expect this query {{select REGEXP_REPLACE('ERR1,ERR2', '([^,]+)', 'AA$1AA');}} to produce {code:java} AAERR1AA,AAERR2AA{code} but instead it produces {code:java} AA$1AA,AA$1AA{code} With FLINK-9990 support was added for REGEXP_EXTRACT, which does provide access to the capture groups, but for many use cases supporting this in the way that users expect in REGEXP_REPLACE would be more natural and convenient. -- This message was sent by Atlassian Jira (v8.20.10#820010)
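To illustrate the expectation in FLINK-34477, here is a minimal sketch using plain {{java.util.regex}} rather than Flink itself. The class and method names are hypothetical. Quoting the replacement string is used only to reproduce the reported output and is an assumption for illustration, not a statement about how Flink implements REGEXP_REPLACE.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; it does not use any Flink API.
public class CaptureGroupReplaceDemo {

    // Replacement in which "$1" refers to the first capture group.
    static String replaceWithGroups(String input, String regex, String replacement) {
        return Pattern.compile(regex).matcher(input).replaceAll(replacement);
    }

    // Replacement treated as a literal string, so "$1" is not expanded.
    static String replaceLiterally(String input, String regex, String replacement) {
        return Pattern.compile(regex)
                .matcher(input)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // Prints AAERR1AA,AAERR2AA
        System.out.println(replaceWithGroups("ERR1,ERR2", "([^,]+)", "AA$1AA"));
        // Prints AA$1AA,AA$1AA
        System.out.println(replaceLiterally("ERR1,ERR2", "([^,]+)", "AA$1AA"));
    }
}
```

The first call matches what the report expects from the SQL function; the second matches the output the report describes.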
[jira] [Created] (FLINK-32099) create flink_data volume for operations playground
David Anderson created FLINK-32099: -- Summary: create flink_data volume for operations playground Key: FLINK-32099 URL: https://issues.apache.org/jira/browse/FLINK-32099 Project: Flink Issue Type: Improvement Components: Documentation / Training / Exercises Affects Versions: 1.17.0 Reporter: David Anderson The Docker-based operations playground instructs the user to create temp directories on the host machine for checkpoints and savepoints, which are then mounted in the containers. This can be problematic on Windows machines. It would be better to use a Docker volume. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31388) restart from savepoint fails with "userVisibleTail should not be larger than offset. This is a bug."
David Anderson created FLINK-31388: -- Summary: restart from savepoint fails with "userVisibleTail should not be larger than offset. This is a bug." Key: FLINK-31388 URL: https://issues.apache.org/jira/browse/FLINK-31388 Project: Flink Issue Type: Bug Components: Table SQL / Client Affects Versions: 1.16.1 Reporter: David Anderson I took a savepoint, then used {code:java} SET 'execution.savepoint.path' = ... {code} to set the savepoint path, and then re-executed the query that had been running before the stop-with-savepoint. It was not an INSERT INTO job, but rather a "collect" job running a SELECT query. It then failed with {code:java} userVisibleTail should not be larger than offset. This is a bug. {code} Perhaps there is an unstated requirement that using the sql-client to restart from a savepoint only works with INSERT INTO jobs? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31361) job created by sql-client can't authenticate to kafka, can't find org.apache.kafka.common.security.plain.PlainLoginModule
David Anderson created FLINK-31361: -- Summary: job created by sql-client can't authenticate to kafka, can't find org.apache.kafka.common.security.plain.PlainLoginModule Key: FLINK-31361 URL: https://issues.apache.org/jira/browse/FLINK-31361 Project: Flink Issue Type: Bug Components: Connectors / Kafka Affects Versions: 1.16.1 Reporter: David Anderson I'm working with this SQL DDL: {noformat} CREATE TABLE pageviews_sink ( `url` STRING, `user_id` STRING, `browser` STRING, `ts` TIMESTAMP_LTZ(3) ) WITH ( 'connector' = 'kafka', 'topic' = 'pageviews', 'properties.bootstrap.servers' = 'xxx.confluent.cloud:9092', 'properties.security.protocol'='SASL_SSL', 'properties.sasl.mechanism'='PLAIN', 'properties.sasl.jaas.config'='org.apache.kafka.common.security.plain.PlainLoginModule required username="xxx" password="xxx";', 'key.format' = 'json', 'key.fields' = 'url', 'value.format' = 'json' ); {noformat} With {{flink-sql-connector-kafka-1.16.1.jar}} in the lib directory, this fails with {noformat} Caused by: javax.security.auth.login.LoginException: No LoginModule found for org.apache.kafka.common.security.plain.PlainLoginModule{noformat} As a workaround I've found that it does work if I provide both {{flink-connector-kafka-1.16.1.jar}} {{kafka-clients-3.2.3.jar}} in the lib directory. It seems like the relocation applied in the SQL connector isn't working properly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30563) Update training exercises to use Flink 1.16
David Anderson created FLINK-30563: -- Summary: Update training exercises to use Flink 1.16 Key: FLINK-30563 URL: https://issues.apache.org/jira/browse/FLINK-30563 Project: Flink Issue Type: Improvement Components: Documentation / Training / Exercises Affects Versions: 1.16.0 Reporter: David Anderson The training exercises in the [flink-training|https://github.com/apache/flink-training] repo need to be updated to use Flink 1.16. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30442) Update table walkthrough playground for 1.12
David Anderson created FLINK-30442: -- Summary: Update table walkthrough playground for 1.12 Key: FLINK-30442 URL: https://issues.apache.org/jira/browse/FLINK-30442 Project: Flink Issue Type: Sub-task Reporter: David Anderson -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30440) Update operations playground for 1.16
David Anderson created FLINK-30440: -- Summary: Update operations playground for 1.16 Key: FLINK-30440 URL: https://issues.apache.org/jira/browse/FLINK-30440 Project: Flink Issue Type: Sub-task Reporter: David Anderson -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30441) Update pyflink walkthrough playground for 1.16
David Anderson created FLINK-30441: -- Summary: Update pyflink walkthrough playground for 1.16 Key: FLINK-30441 URL: https://issues.apache.org/jira/browse/FLINK-30441 Project: Flink Issue Type: Sub-task Reporter: David Anderson -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30439) Update playgrounds for 1.16
David Anderson created FLINK-30439: -- Summary: Update playgrounds for 1.16 Key: FLINK-30439 URL: https://issues.apache.org/jira/browse/FLINK-30439 Project: Flink Issue Type: Improvement Components: Documentation / Training Reporter: David Anderson Fix For: 1.16.0 All of the playgrounds should be updated for Flink 1.16. This should include reworking the code as necessary to avoid using anything that has been deprecated. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28975) withIdleness marks all streams from FLIP-27 sources as idle
David Anderson created FLINK-28975: -- Summary: withIdleness marks all streams from FLIP-27 sources as idle Key: FLINK-28975 URL: https://issues.apache.org/jira/browse/FLINK-28975 Project: Flink Issue Type: Bug Components: API / DataStream Affects Versions: 1.15.1 Reporter: David Anderson Fix For: 1.16.0 Using withIdleness with a FLIP-27 source leads to all of the streams from the source being marked idle, which in turn leads to incorrect results, e.g., from joins that rely on watermarks. Quoting from the user ML thread: In org.apache.flink.streaming.api.operators.SourceOperator, there are separate instances of WatermarksWithIdleness created for each split output and the main output. There is multiplexing of watermarks between split outputs but no multiplexing between split output and main output. For a source such as org.apache.flink.connector.kafka.source.KafkaSource, there is only output from splits and no output from main. Hence the main output will (after an initial timeout) be marked as idle. The implementation of WatermarksWithIdleness is such that once an output is idle, it will periodically re-mark the output as idle. Since there is no multiplexing between split outputs and main output, the idle marks coming from main output will repeatedly set the output to idle even though there are events from the splits. Result is that the entire source is repeatedly marked as idle. See this ML thread for more details: [https://lists.apache.org/thread/bbokccohs16tzkdtybqtv1vx76gqkqj4] This probably affects older versions of Flink as well, but that needs to be verified. -- This message was sent by Atlassian Jira (v8.20.10#820010)
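The mechanism quoted in FLINK-28975 can be pictured with a toy model. This is a sketch only: it is not the SourceOperator code, every name in it is hypothetical, and the "combined" variant is just one plausible way to multiplex status across outputs, assumed here for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of idle/active status updates; not Flink code.
public class IdlenessMultiplexingSketch {

    // One status update: which output reported, and whether it reported idle.
    static final class StatusUpdate {
        final String output;
        final boolean idle;

        StatusUpdate(String output, boolean idle) {
            this.output = output;
            this.idle = idle;
        }
    }

    // No multiplexing: every update overwrites the source-wide status.
    static boolean sourceIdleWithoutMultiplexing(List<StatusUpdate> updates) {
        boolean idle = false;
        for (StatusUpdate u : updates) {
            idle = u.idle;
        }
        return idle;
    }

    // Multiplexing (assumed): the source is idle only if no output is active.
    static boolean sourceIdleWithMultiplexing(List<StatusUpdate> updates) {
        Map<String, Boolean> idleByOutput = new HashMap<>();
        for (StatusUpdate u : updates) {
            idleByOutput.put(u.output, u.idle);
        }
        return !idleByOutput.containsValue(false);
    }

    public static void main(String[] args) {
        // The split output is active, then the main output re-marks itself idle.
        List<StatusUpdate> updates = Arrays.asList(
                new StatusUpdate("split-0", false),
                new StatusUpdate("main", true));
        System.out.println(sourceIdleWithoutMultiplexing(updates)); // true
        System.out.println(sourceIdleWithMultiplexing(updates));    // false
    }
}
```

In the first variant the periodic idle mark from the main output wins even though a split is producing events, which is the symptom described in the quoted thread.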
[jira] [Created] (FLINK-28754) document that Java 8 is required to build table store
David Anderson created FLINK-28754: -- Summary: document that Java 8 is required to build table store Key: FLINK-28754 URL: https://issues.apache.org/jira/browse/FLINK-28754 Project: Flink Issue Type: Improvement Components: Documentation, Table Store Reporter: David Anderson The table store cannot be built with Java 11, but the "build from source" instructions don't mention this restriction. https://nightlies.apache.org/flink/flink-table-store-docs-master/docs/engines/build/ -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-27513) Update table walkthrough playground for 1.15
David Anderson created FLINK-27513: -- Summary: Update table walkthrough playground for 1.15 Key: FLINK-27513 URL: https://issues.apache.org/jira/browse/FLINK-27513 Project: Flink Issue Type: Sub-task Reporter: David Anderson -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (FLINK-27512) Update pyflink walkthrough playground for 1.15
David Anderson created FLINK-27512: -- Summary: Update pyflink walkthrough playground for 1.15 Key: FLINK-27512 URL: https://issues.apache.org/jira/browse/FLINK-27512 Project: Flink Issue Type: Improvement Components: Documentation / Training / Exercises Reporter: David Anderson -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (FLINK-27511) Update operations playground for 1.15
David Anderson created FLINK-27511: -- Summary: Update operations playground for 1.15 Key: FLINK-27511 URL: https://issues.apache.org/jira/browse/FLINK-27511 Project: Flink Issue Type: Sub-task Components: Documentation / Training / Exercises Reporter: David Anderson -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (FLINK-27510) update playgrounds for Flink 1.15
David Anderson created FLINK-27510: -- Summary: update playgrounds for Flink 1.15 Key: FLINK-27510 URL: https://issues.apache.org/jira/browse/FLINK-27510 Project: Flink Issue Type: Improvement Components: Documentation / Training / Exercises Affects Versions: 1.15.0 Reporter: David Anderson All of the playgrounds should be updated for Flink 1.15. This should include reworking the code as necessary to avoid using anything that has been deprecated. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (FLINK-27509) Update table walkthrough playground for 1.14
David Anderson created FLINK-27509: -- Summary: Update table walkthrough playground for 1.14 Key: FLINK-27509 URL: https://issues.apache.org/jira/browse/FLINK-27509 Project: Flink Issue Type: Sub-task Reporter: David Anderson -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (FLINK-27508) Update pyflink walkthrough playground for 1.14
David Anderson created FLINK-27508: -- Summary: Update pyflink walkthrough playground for 1.14 Key: FLINK-27508 URL: https://issues.apache.org/jira/browse/FLINK-27508 Project: Flink Issue Type: Sub-task Components: Documentation / Training / Exercises Reporter: David Anderson -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (FLINK-27507) Update operations playground for 1.14
David Anderson created FLINK-27507: -- Summary: Update operations playground for 1.14 Key: FLINK-27507 URL: https://issues.apache.org/jira/browse/FLINK-27507 Project: Flink Issue Type: Sub-task Affects Versions: 1.14.4 Reporter: David Anderson The operations playground has yet to be updated for 1.14. At this point, it may as well be configured to use the latest 1.14.x release. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (FLINK-27506) update playgrounds for Flink 1.14
David Anderson created FLINK-27506: -- Summary: update playgrounds for Flink 1.14 Key: FLINK-27506 URL: https://issues.apache.org/jira/browse/FLINK-27506 Project: Flink Issue Type: Improvement Components: Documentation / Training / Exercises Affects Versions: 1.14.4 Reporter: David Anderson All of the flink-playgrounds need to be updated for 1.14. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (FLINK-27456) mistake and confusion with CEP example in docs
David Anderson created FLINK-27456: -- Summary: mistake and confusion with CEP example in docs Key: FLINK-27456 URL: https://issues.apache.org/jira/browse/FLINK-27456 Project: Flink Issue Type: Bug Components: Documentation, Library / CEP Affects Versions: 1.14.4 Reporter: David Anderson [https://nightlies.apache.org/flink/flink-docs-master/docs/libs/cep/#contiguity-within-looping-patterns] In the section of the docs on contiguity within looping patterns, what it says about strict contiguity for the given example is either incorrect or very confusing (or both). It doesn't help that the example code doesn't precisely match the scenario described in the text. To study this, I implemented the example in the text and found that it produces no output for strict contiguity (as I expected), which contradicts what the text says.
{code:java}
import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternSelectFunction;
import org.apache.flink.cep.PatternStream;
import org.apache.flink.cep.nfa.aftermatch.AfterMatchSkipStrategy;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.PrintSinkFunction;

import java.util.List;
import java.util.Map;

public class StreamingJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // "d1" and "d2" interrupt the run of b events
        DataStream<String> events = env.fromElements("a", "b1", "d1", "b2", "d2", "b3", "c");

        AfterMatchSkipStrategy skipStrategy = AfterMatchSkipStrategy.skipPastLastEvent();

        // a, then one or more strictly contiguous b's, then c
        Pattern<String, ?> pattern =
                Pattern.<String>begin("a", skipStrategy)
                        .where(
                                new SimpleCondition<String>() {
                                    @Override
                                    public boolean filter(String element) throws Exception {
                                        return element.startsWith("a");
                                    }
                                })
                        .next("b+")
                        .where(
                                new SimpleCondition<String>() {
                                    @Override
                                    public boolean filter(String element) throws Exception {
                                        return element.startsWith("b");
                                    }
                                })
                        .oneOrMore()
                        .consecutive()
                        .next("c")
                        .where(
                                new SimpleCondition<String>() {
                                    @Override
                                    public boolean filter(String element) throws Exception {
                                        return element.startsWith("c");
                                    }
                                });

        PatternStream<String> patternStream = CEP.pattern(events, pattern).inProcessingTime();
        patternStream.select(new SelectSegment()).addSink(new PrintSinkFunction<>(true));

        env.execute();
    }

    // Concatenates the events matched by each stage of the pattern
    public static class SelectSegment implements PatternSelectFunction<String, String> {
        @Override
        public String select(Map<String, List<String>> pattern) {
            return String.join("", pattern.get("a"))
                    + String.join("", pattern.get("b+"))
                    + String.join("", pattern.get("c"));
        }
    }
}
{code}
-- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (FLINK-27184) Optimize IntervalJoinOperator by using temporal state
David Anderson created FLINK-27184: -- Summary: Optimize IntervalJoinOperator by using temporal state Key: FLINK-27184 URL: https://issues.apache.org/jira/browse/FLINK-27184 Project: Flink Issue Type: Sub-task Reporter: David Anderson The performance of interval joins on RocksDB can be optimized by using temporal state. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-27183) Optimize CepOperator by using temporal state
David Anderson created FLINK-27183: -- Summary: Optimize CepOperator by using temporal state Key: FLINK-27183 URL: https://issues.apache.org/jira/browse/FLINK-27183 Project: Flink Issue Type: Sub-task Components: Library / CEP Reporter: David Anderson The performance of CEP on RocksDB can be significantly improved by having it use temporal state. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-27182) Optimize RowTimeSortOperator by using temporal state
David Anderson created FLINK-27182: -- Summary: Optimize RowTimeSortOperator by using temporal state Key: FLINK-27182 URL: https://issues.apache.org/jira/browse/FLINK-27182 Project: Flink Issue Type: Sub-task Components: Table SQL / Runtime Reporter: David Anderson The performance of the RowTimeSortOperator can be significantly improved by using temporal state. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-27181) Optimize TemporalRowTimeJoinOperator by using temporal state
David Anderson created FLINK-27181: -- Summary: Optimize TemporalRowTimeJoinOperator by using temporal state Key: FLINK-27181 URL: https://issues.apache.org/jira/browse/FLINK-27181 Project: Flink Issue Type: Sub-task Components: Table SQL / Runtime Reporter: David Anderson The throughput of the TemporalRowTimeJoinOperator can be significantly improved by using temporal state in its implementation. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-27180) Docs for temporal state
David Anderson created FLINK-27180: -- Summary: Docs for temporal state Key: FLINK-27180 URL: https://issues.apache.org/jira/browse/FLINK-27180 Project: Flink Issue Type: Sub-task Components: Documentation Reporter: David Anderson Update [https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/state/#using-keyed-state] to include temporal state. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-27179) Update training example to use temporal state
David Anderson created FLINK-27179: -- Summary: Update training example to use temporal state Key: FLINK-27179 URL: https://issues.apache.org/jira/browse/FLINK-27179 Project: Flink Issue Type: Sub-task Components: Documentation / Training Reporter: David Anderson [https://nightlies.apache.org/flink/flink-docs-master/docs/learn-flink/event_driven/#example] is a good use case for temporal state (this example is doing windowing in a keyed process function). -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-27178) create examples that use temporal state
David Anderson created FLINK-27178: -- Summary: create examples that use temporal state Key: FLINK-27178 URL: https://issues.apache.org/jira/browse/FLINK-27178 Project: Flink Issue Type: Sub-task Components: Examples Reporter: David Anderson Add examples showing how to use temporal state. E.g., sorting and/or a temporal join. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-27177) Implement Temporal State
David Anderson created FLINK-27177: -- Summary: Implement Temporal State Key: FLINK-27177 URL: https://issues.apache.org/jira/browse/FLINK-27177 Project: Flink Issue Type: Sub-task Components: Runtime / State Backends Reporter: David Anderson Following the plan in [FLIP-220|https://cwiki.apache.org/confluence/x/Xo_FD]: * add methods to the RuntimeContext and KeyedStateStore interfaces for registering TemporalValueState and TemporalListState * implement TemporalValueState and TemporalListState -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-27176) FLIP-220: Temporal State
David Anderson created FLINK-27176: -- Summary: FLIP-220: Temporal State Key: FLINK-27176 URL: https://issues.apache.org/jira/browse/FLINK-27176 Project: Flink Issue Type: Improvement Components: Runtime / State Backends Reporter: David Anderson Task for implementing [FLIP-220: Temporal State|https://cwiki.apache.org/confluence/x/Xo_FD] -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-24478) gradle quickstart is out-of-date
David Anderson created FLINK-24478: -- Summary: gradle quickstart is out-of-date Key: FLINK-24478 URL: https://issues.apache.org/jira/browse/FLINK-24478 Project: Flink Issue Type: Improvement Affects Versions: 1.14.0 Reporter: David Anderson Assignee: Nico Kruber The gradle quickstart, as described in the docs, and produced by {{bash -c "$(curl https://flink.apache.org/q/gradle-quickstart.sh)" -- 1.14.0 _2.11}} is out of date, and it has some obvious errors. E.g., it defines scalaBinaryVersion as '_2.11', and then has {{flinkShadowJar "org.apache.flink:flink-connector-kafka-0.11_${scalaBinaryVersion}:${flinkVersion}"}} which is both ancient and includes the _ again. (I realize now that the extra _ actually comes from the bash command I copied from the docs, so the docs need to be fixed as well.) The quickstart also doesn't produce a gradlew script, and if I try {{gradle build}} I get this output: {{$ gradle build Starting a Gradle Daemon (subsequent builds will be faster) FAILURE: Build failed with an exception. * Where: Build file '/Users/david/stuff/quickstart/build.gradle' line: 41 * What went wrong: A problem occurred evaluating root project 'quickstart'. > Cannot add task 'wrapper' as a task with that name already exists}} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-24118) enable TaxiFareGenerator to produce a bounded stream
David Anderson created FLINK-24118: -- Summary: enable TaxiFareGenerator to produce a bounded stream Key: FLINK-24118 URL: https://issues.apache.org/jira/browse/FLINK-24118 Project: Flink Issue Type: Improvement Components: Documentation / Training / Exercises Reporter: David Anderson Assignee: David Anderson I would like to use the TaxiFareGenerator in tests for the training exercises. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-23926) change TaxiRide data model to have a single timestamp
David Anderson created FLINK-23926: -- Summary: change TaxiRide data model to have a single timestamp Key: FLINK-23926 URL: https://issues.apache.org/jira/browse/FLINK-23926 Project: Flink Issue Type: Improvement Components: Documentation / Training / Exercises Reporter: David Anderson Assignee: David Anderson The current TaxiRide events have two timestamps – the startTime and endTime. Which timestamp applies to a given event depends on the value of the isStart field. This is awkward, and unnecessary. It would be better to have a single eventTime field. This will make the exercises better examples, and allow for more straightforward conversion from DataStream to Table. -- This message was sent by Atlassian Jira (v8.3.4#803005)
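As a rough illustration of the change proposed in FLINK-23926, the sketch below contrasts the two shapes of the event. The class names and the {{Instant}} field type are assumptions made for this example; they are not taken from the flink-training code, and only the field names startTime, endTime, isStart and eventTime come from the issue text.

```java
import java.time.Instant;

// Hypothetical data model sketch; not the flink-training classes.
public class TaxiRideTimestampSketch {

    // Shape described in the issue: two timestamps, selected by isStart.
    static final class TwoTimestampRide {
        final boolean isStart;
        final Instant startTime;
        final Instant endTime;

        TwoTimestampRide(boolean isStart, Instant startTime, Instant endTime) {
            this.isStart = isStart;
            this.startTime = startTime;
            this.endTime = endTime;
        }
    }

    // Proposed shape: a single timestamp that always applies to the event.
    static final class SingleTimestampRide {
        final boolean isStart;
        final Instant eventTime;

        SingleTimestampRide(boolean isStart, Instant eventTime) {
            this.isStart = isStart;
            this.eventTime = eventTime;
        }
    }

    // Picks whichever timestamp applies to the event and stores it as eventTime.
    static SingleTimestampRide convert(TwoTimestampRide ride) {
        Instant eventTime = ride.isStart ? ride.startTime : ride.endTime;
        return new SingleTimestampRide(ride.isStart, eventTime);
    }
}
```

With the single-field shape, code that needs the event's timestamp reads one field instead of branching on isStart.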
[jira] [Created] (FLINK-23840) Confusing message from MemCheckpointStreamFactory#checkSize
David Anderson created FLINK-23840: -- Summary: Confusing message from MemCheckpointStreamFactory#checkSize Key: FLINK-23840 URL: https://issues.apache.org/jira/browse/FLINK-23840 Project: Flink Issue Type: Technical Debt Components: Runtime / State Backends Affects Versions: 1.13.2 Reporter: David Anderson Fix For: 1.14.0 After the refactoring of the state backends and checkpoint storage done in 1.13, some folks who were using either the filesystem state backend or the rocksdb state backend find themselves accidentally using JobManagerCheckpointStorage (because it is the default), and then are very confused by this error message: {{throw new IOException(}} {{ "Size of the state is larger than the maximum permitted memory-backed state. Size="}} {{ + size}} {{ + " , maxSize="}} {{ + maxSize}} {{ + " . Consider using a different state backend, like the File System State backend.");}} This should instead say something like {quote}Consider using FileSystemCheckpointStorage instead. {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-23653) improve training exercises and tests so they are better examples
David Anderson created FLINK-23653: -- Summary: improve training exercises and tests so they are better examples Key: FLINK-23653 URL: https://issues.apache.org/jira/browse/FLINK-23653 Project: Flink Issue Type: Improvement Components: Documentation / Training / Exercises Reporter: David Anderson The tests for the training exercises are implemented in a way that permits the same tests to be used for both the exercises and the solutions, and for both the Java and Scala implementations. The way that this was done is a bit awkward. It would be better to * eliminate the ExerciseBase class and its mechanisms for setting the source(s) and sink and parallelism * have tests that run with parallelism > 1 * speed up the tests by using MiniClusterWithClientResource It's also the case that the watermarking is done by calling emitWatermark in the sources. This is confusing; the watermarking should be visibly implemented in the exercises and solutions. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-23128) Translate update to operations playground docs to Chinese
David Anderson created FLINK-23128: -- Summary: Translate update to operations playground docs to Chinese Key: FLINK-23128 URL: https://issues.apache.org/jira/browse/FLINK-23128 Project: Flink Issue Type: Sub-task Components: Documentation / Training Affects Versions: 1.13.1 Reporter: David Anderson -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-23100) Update pyflink walkthrough playground for 1.13
David Anderson created FLINK-23100: -- Summary: Update pyflink walkthrough playground for 1.13 Key: FLINK-23100 URL: https://issues.apache.org/jira/browse/FLINK-23100 Project: Flink Issue Type: Sub-task Components: Documentation / Training / Exercises Reporter: David Anderson -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-23099) Update table walkthrough playground for 1.13
David Anderson created FLINK-23099: -- Summary: Update table walkthrough playground for 1.13 Key: FLINK-23099 URL: https://issues.apache.org/jira/browse/FLINK-23099 Project: Flink Issue Type: Sub-task Components: Documentation / Training / Exercises Reporter: David Anderson Assignee: David Anderson -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-23098) Update operations playground for 1.13
David Anderson created FLINK-23098: -- Summary: Update operations playground for 1.13 Key: FLINK-23098 URL: https://issues.apache.org/jira/browse/FLINK-23098 Project: Flink Issue Type: Sub-task Components: Documentation / Training / Exercises Affects Versions: 1.13.0 Reporter: David Anderson Assignee: David Anderson -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-23059) Update playgrounds for Flink 1.13
David Anderson created FLINK-23059: -- Summary: Update playgrounds for Flink 1.13 Key: FLINK-23059 URL: https://issues.apache.org/jira/browse/FLINK-23059 Project: Flink Issue Type: Improvement Components: Documentation / Training / Exercises Affects Versions: 1.13.0 Reporter: David Anderson Assignee: David Anderson The various playgrounds in apache/flink-playgrounds all need an update for the 1.13 release. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-22948) Scala example for toDataStream does not compile
David Anderson created FLINK-22948: -- Summary: Scala example for toDataStream does not compile Key: FLINK-22948 URL: https://issues.apache.org/jira/browse/FLINK-22948 Project: Flink Issue Type: Bug Components: Documentation, Table SQL / API Affects Versions: 1.13.1 Reporter: David Anderson The scala example at [https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/data_stream_api/#examples-for-todatastream] does not compile – {{User.class}} should be {{classOf[User]}}. It would also be better to show the table DDL as {{tableEnv.executeSql(}} {{ """}} {{ CREATE TABLE GeneratedTable (}} {{ name STRING,}} {{ score INT,}} {{ event_time TIMESTAMP_LTZ(3),}} {{ WATERMARK FOR event_time AS event_time - INTERVAL '10' SECOND}} {{ )}} {{ WITH ('connector'='datagen')}} {{ """}} {{)}} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-22894) Window Top-N should allow n=1
David Anderson created FLINK-22894: -- Summary: Window Top-N should allow n=1 Key: FLINK-22894 URL: https://issues.apache.org/jira/browse/FLINK-22894 Project: Flink Issue Type: Bug Components: Table SQL / Runtime Affects Versions: 1.13.1 Reporter: David Anderson I tried to reimplement the Hourly Tips exercise from the DataStream training using Flink SQL. The objective of this exercise is to find the one taxi driver who earned the most in tips during each hour, and report that driver's driverId and the sum of their tips. This can be expressed as a window top-n query, where n=1, as in {{FROM (}} {{ SELECT *, ROW_NUMBER() OVER (PARTITION BY window_start, window_end ORDER BY sumOfTips DESC) as rownum}} {{ FROM ( }} {{ SELECT driverId, window_start, window_end, sum(tip) as sumOfTips}} {{ FROM TABLE( }} {{ TUMBLE(TABLE fares, DESCRIPTOR(startTime), INTERVAL '1' HOUR))}} {{ GROUP BY driverId, window_start, window_end}} {{ )}} {{) WHERE rownum = 1;}} This fails because the {{WindowRankOperatorBuilder}} insists on {{rankEnd > 1}}. So, in other words, while it is possible to report the top 2 drivers, or the driver in 2nd place, it's not possible to report only the top driver. This appears to be an off-by-one error in the range checking. -- This message was sent by Atlassian Jira (v8.3.4#803005)
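The off-by-one described in FLINK-22894 can be sketched in isolation. The issue only states that the builder insists on rankEnd > 1; the actual Flink source is not shown there, so the class and methods below are hypothetical stand-ins for such a range check.

```java
// Hypothetical stand-in for a rank range check; not the Flink source.
public class RankRangeCheckSketch {

    // Check as reported in the issue: rejects rankEnd == 1.
    static boolean acceptsAsReported(long rankEnd) {
        return rankEnd > 1;
    }

    // Check that also admits a top-1 query (rownum = 1).
    static boolean acceptsTopOne(long rankEnd) {
        return rankEnd >= 1;
    }

    public static void main(String[] args) {
        System.out.println(acceptsAsReported(1)); // false
        System.out.println(acceptsTopOne(1));     // true
        System.out.println(acceptsTopOne(0));     // false
    }
}
```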
[jira] [Created] (FLINK-22868) Update training exercises for 1.13
David Anderson created FLINK-22868: -- Summary: Update training exercises for 1.13 Key: FLINK-22868 URL: https://issues.apache.org/jira/browse/FLINK-22868 Project: Flink Issue Type: Improvement Components: Documentation / Training / Exercises Affects Versions: 1.13.1, 1.13.0 Reporter: David Anderson Assignee: David Anderson The exercises in the flink-training repo need to be updated for 1.13. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-22737) Add support for CURRENT_WATERMARK to SQL
David Anderson created FLINK-22737: -- Summary: Add support for CURRENT_WATERMARK to SQL Key: FLINK-22737 URL: https://issues.apache.org/jira/browse/FLINK-22737 Project: Flink Issue Type: Sub-task Components: Table SQL / API Reporter: David Anderson With a built-in function returning the current watermark, one could operate on late events without resorting to using the DataStream API. Called with zero parameters, this function returns the current watermark for the current row – if there is an event time attribute. Otherwise, it returns NULL. -- This message was sent by Atlassian Jira (v8.3.4#803005)
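The use case in FLINK-22737, operating on late events, might look roughly like the following when modeled in plain Java. This is a sketch under assumptions: the class and method names are hypothetical, a null watermark stands in for the SQL NULL mentioned above, and an event is treated as late when its timestamp is at or before the current watermark.

```java
// Hypothetical model of a late-event check; not the Flink SQL function.
public class CurrentWatermarkSketch {

    // currentWatermark is null when no watermark is available (SQL NULL).
    static boolean isLate(long eventTimeMillis, Long currentWatermark) {
        if (currentWatermark == null) {
            // Without a watermark there is nothing to be late against.
            return false;
        }
        // Assumption: late means at or before the current watermark.
        return eventTimeMillis <= currentWatermark;
    }

    public static void main(String[] args) {
        System.out.println(isLate(1_000L, 2_000L)); // true
        System.out.println(isLate(3_000L, 2_000L)); // false
        System.out.println(isLate(1_000L, null));   // false
    }
}
```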
[jira] [Created] (FLINK-22543) layout of exception history tab isn't very usable with Flink SQL
David Anderson created FLINK-22543: -- Summary: layout of exception history tab isn't very usable with Flink SQL Key: FLINK-22543 URL: https://issues.apache.org/jira/browse/FLINK-22543 Project: Flink Issue Type: Improvement Components: Runtime / Web Frontend Affects Versions: 1.13.0 Reporter: David Anderson Attachments: image-2021-05-01-12-38-38-178.png With Flink SQL, the name field can be very long, in which case the Time and Exception columns of the Exception History view become very narrow and hard to read. Also, the Cancel Job button is covered over with other text. !image-2021-05-01-12-38-38-178.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-22489) subtask backpressure indicator shows value for entire job
David Anderson created FLINK-22489: -- Summary: subtask backpressure indicator shows value for entire job Key: FLINK-22489 URL: https://issues.apache.org/jira/browse/FLINK-22489 Project: Flink Issue Type: Bug Components: Runtime / Web Frontend Affects Versions: 1.13.0 Reporter: David Anderson Attachments: backPressureTab.png In the backpressure tab of the web UI, the OK/LOW/HIGH indication is displaying the job-level backpressure for every subtask, rather than the individual subtask values. !backPressureTab.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-21639) docs still state that AsyncWaitOperator is not chainable
David Anderson created FLINK-21639: -- Summary: docs still state that AsyncWaitOperator is not chainable Key: FLINK-21639 URL: https://issues.apache.org/jira/browse/FLINK-21639 Project: Flink Issue Type: Improvement Components: Documentation Affects Versions: 1.12.2, 1.11.3 Reporter: David Anderson Fix For: 1.13.0 The documentation for asyncio wasn't updated after FLINK-16219 resolved the issue first reported in FLINK-13063. The last paragraph of dev/stream/operators/asyncio.html can be dropped now. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20603) Update pyflink walkthrough playground for 1.12 release
David Anderson created FLINK-20603: -- Summary: Update pyflink walkthrough playground for 1.12 release Key: FLINK-20603 URL: https://issues.apache.org/jira/browse/FLINK-20603 Project: Flink Issue Type: Sub-task Components: Documentation / Training Affects Versions: 1.12.0 Reporter: David Anderson Fix For: 1.12.0 The pyflink walkthrough needs to be updated for 1.12. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20604) Update table walkthrough playground for 1.12
David Anderson created FLINK-20604: -- Summary: Update table walkthrough playground for 1.12 Key: FLINK-20604 URL: https://issues.apache.org/jira/browse/FLINK-20604 Project: Flink Issue Type: Sub-task Components: Documentation / Training Affects Versions: 1.12.0 Reporter: David Anderson Fix For: 1.12.0 The table walkthrough playground needs to be updated for the 1.12 release. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20599) Update operations playground for 1.12
David Anderson created FLINK-20599: -- Summary: Update operations playground for 1.12 Key: FLINK-20599 URL: https://issues.apache.org/jira/browse/FLINK-20599 Project: Flink Issue Type: Sub-task Components: Documentation / Training Affects Versions: 1.12.0 Reporter: David Anderson Fix For: 1.12.0 The operations playground needs an update for 1.12. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20598) Update playgrounds for Flink 1.12
David Anderson created FLINK-20598: -- Summary: Update playgrounds for Flink 1.12 Key: FLINK-20598 URL: https://issues.apache.org/jira/browse/FLINK-20598 Project: Flink Issue Type: Improvement Components: Documentation / Training Affects Versions: 1.12.0 Reporter: David Anderson Fix For: 1.12.0 The various playgrounds all need to be updated for the new 1.12.0 release. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20597) Update flink training exercises to 1.12
David Anderson created FLINK-20597: -- Summary: Update flink training exercises to 1.12 Key: FLINK-20597 URL: https://issues.apache.org/jira/browse/FLINK-20597 Project: Flink Issue Type: Improvement Components: Documentation / Training / Exercises Affects Versions: 1.12.0 Reporter: David Anderson Assignee: David Anderson Fix For: 1.12.0 The flink-training repo needs to be updated for 1.12. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20296) Explanation of keyBy was broken by find/replace of deprecated forms of keyBy
David Anderson created FLINK-20296: -- Summary: Explanation of keyBy was broken by find/replace of deprecated forms of keyBy Key: FLINK-20296 URL: https://issues.apache.org/jira/browse/FLINK-20296 Project: Flink Issue Type: Improvement Components: Documentation / Training Affects Versions: 1.11.0 Reporter: David Anderson Assignee: David Anderson Fix For: 1.12.0, 1.11.3 The code example showing what not to do (using the now deprecated form of keyBy that uses reflection on field names) was replaced with an example using a lambda -- but without changing the explanatory text, which now makes no sense. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-19109) Split Reader eats chained periodic watermarks
David Anderson created FLINK-19109: -- Summary: Split Reader eats chained periodic watermarks Key: FLINK-19109 URL: https://issues.apache.org/jira/browse/FLINK-19109 Project: Flink Issue Type: Bug Affects Versions: 1.11.1, 1.10.2, 1.11.0, 1.10.1, 1.10.0 Reporter: David Anderson

Attempting to generate watermarks chained to the Split Reader / ContinuousFileReaderOperator, as in

{{SingleOutputStreamOperator results = env
    .readTextFile(...)
    .map(...)
    .assignTimestampsAndWatermarks(bounded)
    .keyBy(...)
    .process(...);}}

results in no watermarks being produced. Breaking the chain, via {{disableOperatorChaining()}} or a {{rebalance}}, works around the bug. Using punctuated watermarks also avoids the issue.

Looking at this in the debugger reveals that the timer service is being prematurely quiesced. In many respects this is FLINK-7666 brought back to life. The problem is not present in 1.9.3.

There's a minimal reproducible example in https://github.com/alpinegizmo/flink-question-001/tree/bug.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
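Why a quiesced timer service would hurt periodic but not punctuated watermarks can be illustrated with a minimal, stdlib-only Java sketch. It is a simplification under stated assumptions, not Flink's actual classes: the class name, method names, and the 5-second bound are hypothetical. The assumption is that a periodic generator only records timestamps per event and emits a watermark from a timer callback, so if that callback never runs, nothing is emitted.

```java
import java.util.ArrayList;
import java.util.List;

/** Stdlib-only sketch of a periodic, bounded-out-of-orderness generator (hypothetical names). */
class BoundedWatermarkSketch {
    private final long maxOutOfOrdernessMs;
    private long maxTimestampSeen = Long.MIN_VALUE;

    /** Watermarks emitted so far, in emission order. */
    final List<Long> emitted = new ArrayList<>();

    BoundedWatermarkSketch(long maxOutOfOrdernessMs) {
        this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
    }

    /** Called for every event: records the largest timestamp, emits nothing. */
    void onEvent(long eventTimestampMs) {
        maxTimestampSeen = Math.max(maxTimestampSeen, eventTimestampMs);
    }

    /** Called from a periodic timer: the only place a watermark is emitted. */
    void onPeriodicEmit() {
        if (maxTimestampSeen != Long.MIN_VALUE) {
            emitted.add(maxTimestampSeen - maxOutOfOrdernessMs);
        }
    }

    public static void main(String[] args) {
        BoundedWatermarkSketch generator = new BoundedWatermarkSketch(5_000L);
        generator.onEvent(10_000L);
        generator.onEvent(8_000L);
        System.out.println(generator.emitted); // prints [] because no timer has fired yet
        generator.onPeriodicEmit();
        System.out.println(generator.emitted); // prints [5000]
    }
}
```

A punctuated generator would instead emit from inside the per-event callback, which is consistent with the observation that punctuated watermarks avoid the issue.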
[jira] [Created] (FLINK-19104) how to run Fraud Detection walkthrough in Eclipse
David Anderson created FLINK-19104: -- Summary: how to run Fraud Detection walkthrough in Eclipse Key: FLINK-19104 URL: https://issues.apache.org/jira/browse/FLINK-19104 Project: Flink Issue Type: Improvement Components: Documentation Affects Versions: 1.11.1 Reporter: David Anderson Getting the DataStream API walkthrough running in Eclipse is challenging. This walkthrough is in the docs at https://ci.apache.org/projects/flink/flink-docs-release-1.11/try-flink/datastream_api.html A user couldn't figure it out -- stackoverflow.com/questions/63659566/apache-flink-not-able-to-resolve-imports/63667067 -- and neither can I. Perhaps our maven archetype needs to be adjusted to make things easier for Eclipse users. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-18797) docs and examples use deprecated forms of keyBy
David Anderson created FLINK-18797: -- Summary: docs and examples use deprecated forms of keyBy Key: FLINK-18797 URL: https://issues.apache.org/jira/browse/FLINK-18797 Project: Flink Issue Type: Improvement Components: Documentation Affects Versions: 1.11.1, 1.11.0 Reporter: David Anderson

The DataStream example at https://ci.apache.org/projects/flink/flink-docs-stable/dev/datastream_api.html#example-program uses {{keyBy(0)}} which has been deprecated. There are many other cases of this throughout the docs:

* dev/connectors/cassandra.md
* dev/parallel.md
* dev/stream/operators/index.md
* dev/stream/operators/process_function.md
* dev/stream/state/queryable_state.md
* dev/stream/state/state.md
* dev/types_serialization.md
* learn-flink/etl.md
* ops/scala_shell.md

and also in a number of examples:

* AsyncIOExample.java
* SideOutputExample.java
* TwitterExample.java
* GroupedProcessingTimeWindowExample.java
* SessionWindowing.java
* TopSpeedWindowing.java
* WindowWordCount.java
* WordCount.java
* TwitterExample.scala
* GroupedProcessingTimeWindowExample.scala
* SessionWindowing.scala
* WindowWordCount.scala
* WordCount.scala

There are also some uses of keyBy("string"), which has also been deprecated:

* dev/connectors/cassandra.md
* dev/stream/operators/index.md
* dev/types_serialization.md
* learn-flink/etl.md
* SocketWindowWordCount.java
* SocketWindowWordCount.scala
* TopSpeedWindowing.scala

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-18630) Improve solution to Long Rides training exercise
David Anderson created FLINK-18630: -- Summary: Improve solution to Long Rides training exercise Key: FLINK-18630 URL: https://issues.apache.org/jira/browse/FLINK-18630 Project: Flink Issue Type: Improvement Components: Documentation / Training / Exercises Reporter: David Anderson Assignee: David Anderson The current solution to the Long Rides exercise will incorrectly generate an alert in the case where the END event arrives more than two hours before the corresponding START event. -- This message was sent by Atlassian Jira (v8.3.4#803005)
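The ticket does not show the solution's code, so the following is only a stdlib-only Java sketch of one way to avoid the false alert, under the assumption that the problem comes from arming a timer for whichever event arrives first. All names here are hypothetical and nothing below is taken from the flink-training repo. The idea is to arm the two-hour deadline only for START events and to remember an early END without any deadline.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Stdlib-only sketch of out-of-order START/END handling (hypothetical names). */
class LongRideSketch {
    static final long TWO_HOURS_MS = 2L * 60 * 60 * 1000;

    // rideId -> START timestamp, for rides still waiting for their END
    private final Map<Long, Long> openStarts = new HashMap<>();
    // rideIds whose END was seen before their START
    private final Set<Long> endsSeenFirst = new HashSet<>();

    /** rideIds for which an alert was raised, in order. */
    final List<Long> alerts = new ArrayList<>();

    void onStart(long rideId, long startMs) {
        if (endsSeenFirst.remove(rideId)) {
            return; // the ride has already ended: never arm a deadline for it
        }
        openStarts.put(rideId, startMs);
    }

    void onEnd(long rideId) {
        if (openStarts.remove(rideId) == null) {
            endsSeenFirst.add(rideId); // remember the END, but arm no deadline
        }
    }

    /** Event time has advanced: alert on rides open for two hours or more. */
    void onWatermark(long watermarkMs) {
        Iterator<Map.Entry<Long, Long>> it = openStarts.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Long> ride = it.next();
            if (ride.getValue() + TWO_HOURS_MS <= watermarkMs) {
                alerts.add(ride.getKey());
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        LongRideSketch rides = new LongRideSketch();
        rides.onEnd(7L);                      // END arrives first
        rides.onWatermark(3 * TWO_HOURS_MS);  // much later in event time
        rides.onStart(7L, 0L);                // START finally arrives
        rides.onStart(8L, 0L);                // a ride that never ends
        rides.onWatermark(4 * TWO_HOURS_MS);
        System.out.println(rides.alerts);     // prints [8]
    }
}
```

In this sketch an END whose START never arrives stays in memory indefinitely; a real job would need some way to expire that state.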
[jira] [Created] (FLINK-18499) Update Flink Exercises to 1.11
David Anderson created FLINK-18499: -- Summary: Update Flink Exercises to 1.11 Key: FLINK-18499 URL: https://issues.apache.org/jira/browse/FLINK-18499 Project: Flink Issue Type: Improvement Components: Documentation / Training / Exercises Affects Versions: 1.11.0 Reporter: David Anderson Assignee: David Anderson The training exercises need to be updated for Flink 1.11. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-18498) Update Flink Playgrounds to 1.11
David Anderson created FLINK-18498: -- Summary: Update Flink Playgrounds to 1.11 Key: FLINK-18498 URL: https://issues.apache.org/jira/browse/FLINK-18498 Project: Flink Issue Type: Improvement Components: Documentation / Training / Exercises Affects Versions: 1.11.0 Reporter: David Anderson Assignee: David Anderson The Flink Operations Playground needs to be updated to Flink 1.11. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-18482) Replace flink-training datasets with data generators
David Anderson created FLINK-18482: -- Summary: Replace flink-training datasets with data generators Key: FLINK-18482 URL: https://issues.apache.org/jira/browse/FLINK-18482 Project: Flink Issue Type: Improvement Components: Documentation / Training / Exercises Reporter: David Anderson Assignee: David Anderson It will improve the experience for those doing the flink-training exercises if they don't have to download and configure the taxi ride and taxi fare datasets, and it will allow us to delete some rather ugly code. This will also remove the dependency on these external datasets. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-18297) SQL client: setting execution.type to invalid value shuts down the session
David Anderson created FLINK-18297: -- Summary: SQL client: setting execution.type to invalid value shuts down the session Key: FLINK-18297 URL: https://issues.apache.org/jira/browse/FLINK-18297 Project: Flink Issue Type: Bug Components: Table SQL / Client Affects Versions: 1.10.0 Reporter: David Anderson

Flink SQL> SET execution.type=foo;

Exception in thread "main" org.apache.flink.table.client.SqlClientException: Invalid configuration entry.
    at org.apache.flink.table.client.config.entries.ConfigEntry.<init>(ConfigEntry.java:41)
    at org.apache.flink.table.client.config.entries.ExecutionEntry.<init>(ExecutionEntry.java:112)
    at org.apache.flink.table.client.config.entries.ExecutionEntry.enrich(ExecutionEntry.java:375)
    at org.apache.flink.table.client.config.Environment.enrich(Environment.java:295)
    at org.apache.flink.table.client.gateway.local.LocalExecutor.setSessionProperty(LocalExecutor.java:284)
    at org.apache.flink.table.client.cli.CliClient.callSet(CliClient.java:370)
    at org.apache.flink.table.client.cli.CliClient.callCommand(CliClient.java:262)
    at java.util.Optional.ifPresent(Optional.java:159)
    at org.apache.flink.table.client.cli.CliClient.open(CliClient.java:200)
    at org.apache.flink.table.client.SqlClient.openCli(SqlClient.java:125)
    at org.apache.flink.table.client.SqlClient.start(SqlClient.java:104)
    at org.apache.flink.table.client.SqlClient.main(SqlClient.java:178)
Caused by: org.apache.flink.table.api.ValidationException: Unknown value for property 'type'.
Supported values are [streaming, batch] but was: foo
    at org.apache.flink.table.descriptors.DescriptorProperties.lambda$validateEnum$34(DescriptorProperties.java:1254)
    at org.apache.flink.table.descriptors.DescriptorProperties.validateOptional(DescriptorProperties.java:1520)
    at org.apache.flink.table.descriptors.DescriptorProperties.validateEnum(DescriptorProperties.java:1247)
    at org.apache.flink.table.descriptors.DescriptorProperties.validateEnumValues(DescriptorProperties.java:1266)
    at org.apache.flink.table.client.config.entries.ExecutionEntry.validate(ExecutionEntry.java:123)
    at org.apache.flink.table.client.config.entries.ConfigEntry.<init>(ConfigEntry.java:39)
    ... 11 more

Shutting down the session...
done.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-18282) retranslate the documentation home page
David Anderson created FLINK-18282: -- Summary: retranslate the documentation home page Key: FLINK-18282 URL: https://issues.apache.org/jira/browse/FLINK-18282 Project: Flink Issue Type: Improvement Components: chinese-translation, Documentation Reporter: David Anderson Fix For: 1.11.0 FLINK-17981 was a complete rewrite of the documentation home page. The chinese translation should be updated along the same lines. docs/index.zh.md -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-18178) flink-training exercises don't build with Eclipse
David Anderson created FLINK-18178: -- Summary: flink-training exercises don't build with Eclipse Key: FLINK-18178 URL: https://issues.apache.org/jira/browse/FLINK-18178 Project: Flink Issue Type: Bug Components: Documentation / Training / Exercises Reporter: David Anderson Assignee: David Anderson The joda-time dependency can't be found, and main classes being referenced in the per-exercise build.gradle files do not exist. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-18014) JSONDeserializationSchema: removed in Flink 1.8, but still in the docs
David Anderson created FLINK-18014: -- Summary: JSONDeserializationSchema: removed in Flink 1.8, but still in the docs Key: FLINK-18014 URL: https://issues.apache.org/jira/browse/FLINK-18014 Project: Flink Issue Type: Improvement Components: Documentation Affects Versions: 1.9.3, 1.8.4, 1.11.0, 1.10.2 Reporter: David Anderson It seems that this section of the docs -- [https://ci.apache.org/projects/flink/flink-docs-master/dev/connectors/kafka.html#the-deserializationschema] – was not updated when https://jira.apache.org/jira/browse/FLINK-11015 was implemented. Not sure if the problem is limited to just this one class that should no longer be mentioned, or if this section needs a rewrite. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17507) Training figure program_dataflow.svg should use preferred parts of the API
David Anderson created FLINK-17507: -- Summary: Training figure program_dataflow.svg should use preferred parts of the API Key: FLINK-17507 URL: https://issues.apache.org/jira/browse/FLINK-17507 Project: Flink Issue Type: Improvement Components: Documentation / Training Reporter: David Anderson

It would be better if fig/program_dataflow.svg used a {{ProcessWindowFunction}}, rather than a {{WindowFunction}}. It also uses a {{BucketingSink}}, which sets a bad example.

Note that this is not a trivial edit, since it doesn't work to simply replace {{new BucketingSink}} with {{new StreamingFileSink}}. Something like this would be better:

{{final StreamingFileSink sink = StreamingFileSink}}
{{    .forBulkFormat(...)}}
{{    .build();}}

{{stats.addSink(sink);}}

Note: This figure is only used once, in the Training Overview page.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17504) Update translation of Getting Started / Overview
David Anderson created FLINK-17504: -- Summary: Update translation of Getting Started / Overview Key: FLINK-17504 URL: https://issues.apache.org/jira/browse/FLINK-17504 Project: Flink Issue Type: Improvement Components: chinese-translation Reporter: David Anderson getting-started/index.zh.md is out-of-date -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17491) Translate Training page on project website
David Anderson created FLINK-17491: -- Summary: Translate Training page on project website Key: FLINK-17491 URL: https://issues.apache.org/jira/browse/FLINK-17491 Project: Flink Issue Type: Improvement Components: chinese-translation, Project Website Reporter: David Anderson Translate the training page for the project website to Chinese. The file is training.zh.md. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17490) Add Training page to project website
David Anderson created FLINK-17490: -- Summary: Add Training page to project website Key: FLINK-17490 URL: https://issues.apache.org/jira/browse/FLINK-17490 Project: Flink Issue Type: Improvement Components: Project Website Reporter: David Anderson Assignee: David Anderson Now that the documentation has a training section, it would be good to help folks find it by promoting it from the project website. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17432) Rename Tutorials to Training
David Anderson created FLINK-17432: -- Summary: Rename Tutorials to Training Key: FLINK-17432 URL: https://issues.apache.org/jira/browse/FLINK-17432 Project: Flink Issue Type: Improvement Components: Documentation / Training Affects Versions: 1.11.0 Reporter: David Anderson Assignee: David Anderson Fix For: 1.11.0 Change Tutorials to Training in the sidebar navigation and headings, and change the URL path as well. The motivation for this change is SEO – folks looking for this kind of content are more likely to be searching for training. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17352) All doc links w/ site.baseurl & link tag are broken
David Anderson created FLINK-17352: -- Summary: All doc links w/ site.baseurl & link tag are broken Key: FLINK-17352 URL: https://issues.apache.org/jira/browse/FLINK-17352 Project: Flink Issue Type: Improvement Components: Documentation Reporter: David Anderson Assignee: David Anderson Using {{ site.baseurl }}{% link foo.md %} creates a link containing something like https://ci.apache.org/projects/flink/flink-docs-master//ci.apache.org/projects/flink/flink-docs-master/concepts/stateful-stream-processing.html The link tag includes site.baseurl, so no need to include it again. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17316) Have HourlyTips solutions use TumblingEventTimeWindows.of
David Anderson created FLINK-17316: -- Summary: Have HourlyTips solutions use TumblingEventTimeWindows.of Key: FLINK-17316 URL: https://issues.apache.org/jira/browse/FLINK-17316 Project: Flink Issue Type: Improvement Components: Documentation / Training / Exercises Reporter: David Anderson Assignee: David Anderson In an educational context I think it's better to use {{.window(TumblingEventTimeWindows.of(Time.hours(1)))}} rather than {{.timeWindow(Time.hours(1))}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
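For readers new to windowing, what a one-hour tumbling event-time window does with a record can be sketched in stdlib-only Java. This is an illustration of the concept and not Flink's implementation: the class and method names are hypothetical, and it assumes windows aligned to the epoch with no offset.

```java
import java.time.Duration;

/** Stdlib-only sketch of tumbling-window assignment (hypothetical names). */
class TumblingWindowSketch {

    /** Start (inclusive) of the tumbling window that contains the timestamp. */
    static long windowStart(long timestampMs, long sizeMs) {
        // floorMod keeps the result correct for negative timestamps as well
        return timestampMs - Math.floorMod(timestampMs, sizeMs);
    }

    public static void main(String[] args) {
        long hour = Duration.ofHours(1).toMillis();
        long timestamp = Duration.ofMinutes(90).toMillis(); // 01:30 after the epoch
        long start = windowStart(timestamp, hour);
        // Each event belongs to exactly one window [start, start + size)
        System.out.println("[" + start + ", " + (start + hour) + ")"); // prints [3600000, 7200000)
    }
}
```

Every event timestamp maps to exactly one window, which is the property the explicit {{TumblingEventTimeWindows}} form names directly.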
[jira] [Created] (FLINK-17292) Translate Fault Tolerance tutorial to Chinese
David Anderson created FLINK-17292: -- Summary: Translate Fault Tolerance tutorial to Chinese Key: FLINK-17292 URL: https://issues.apache.org/jira/browse/FLINK-17292 Project: Flink Issue Type: Improvement Components: chinese-translation, Documentation / Training Reporter: David Anderson docs/tutorials/fault-tolerance.zh.md does not yet exist. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17291) Translate tutorial on event-driven applications to chinese
David Anderson created FLINK-17291: -- Summary: Translate tutorial on event-driven applications to chinese Key: FLINK-17291 URL: https://issues.apache.org/jira/browse/FLINK-17291 Project: Flink Issue Type: Improvement Components: chinese-translation, Documentation / Training Reporter: David Anderson Translate docs/tutorials/event-driven.md to Chinese. The .zh.md file does not exist yet. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17290) Translate Streaming Analytics tutorial to chinese
David Anderson created FLINK-17290: -- Summary: Translate Streaming Analytics tutorial to chinese Key: FLINK-17290 URL: https://issues.apache.org/jira/browse/FLINK-17290 Project: Flink Issue Type: Improvement Components: chinese-translation, Documentation / Training Reporter: David Anderson docs/tutorials/streaming-analytics.zh.md does not exist yet. The content covers event time, watermarks, and windowing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17289) Translate tutorials/etl.md to chinese
David Anderson created FLINK-17289: -- Summary: Translate tutorials/etl.md to chinese Key: FLINK-17289 URL: https://issues.apache.org/jira/browse/FLINK-17289 Project: Flink Issue Type: Improvement Components: chinese-translation, Documentation / Training Reporter: David Anderson This is one of the new tutorials, and it needs translation. docs/tutorials/etl.zh.md does not exist yet. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17283) Improve and explain the solution to Long Rides training exercise
David Anderson created FLINK-17283: -- Summary: Improve and explain the solution to Long Rides training exercise Key: FLINK-17283 URL: https://issues.apache.org/jira/browse/FLINK-17283 Project: Flink Issue Type: Improvement Components: Training Exercises Reporter: David Anderson Assignee: David Anderson The Long Rides Alerts exercise for flink-training is missing a DISCUSSION.md. And the solution can be made a bit easier to explain. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17271) Translate new DataStream API tutorial
David Anderson created FLINK-17271: -- Summary: Translate new DataStream API tutorial Key: FLINK-17271 URL: https://issues.apache.org/jira/browse/FLINK-17271 Project: Flink Issue Type: Improvement Components: chinese-translation Reporter: David Anderson tutorials/datastream_api.md needs to be translated. The zh file doesn't exist yet. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17269) Translate new Tutorials Overview
David Anderson created FLINK-17269: -- Summary: Translate new Tutorials Overview Key: FLINK-17269 URL: https://issues.apache.org/jira/browse/FLINK-17269 Project: Flink Issue Type: Improvement Components: chinese-translation Reporter: David Anderson The training materials being added to the documentation need to be translated to Chinese. This ticket is for translating tutorials/index.zh.md and concepts/index.zh.md. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17247) Rework Concepts / Timely Stream Processing to depend on Tutorials
David Anderson created FLINK-17247: -- Summary: Rework Concepts / Timely Stream Processing to depend on Tutorials Key: FLINK-17247 URL: https://issues.apache.org/jira/browse/FLINK-17247 Project: Flink Issue Type: Sub-task Components: Documentation Reporter: David Anderson Assignee: David Anderson

Topics that should remain:
* Watermarks in Parallel Streams

Topics to add:
* Idle sources
* Per-partition watermarking?

Some of the introductory content on Event time vs processing time can be merged into the tutorial on this topic, and the rest of this section can be dropped.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17246) Rework Concepts / Stateful Stream Processing to depend on Tutorials
David Anderson created FLINK-17246: -- Summary: Rework Concepts / Stateful Stream Processing to depend on Tutorials Key: FLINK-17246 URL: https://issues.apache.org/jira/browse/FLINK-17246 Project: Flink Issue Type: Sub-task Components: Documentation Reporter: David Anderson Assignee: David Anderson -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17245) Merge concepts/stream-processing into tutorials overview
David Anderson created FLINK-17245: -- Summary: Merge concepts/stream-processing into tutorials overview Key: FLINK-17245 URL: https://issues.apache.org/jira/browse/FLINK-17245 Project: Flink Issue Type: Sub-task Components: Documentation Reporter: David Anderson Assignee: David Anderson The content of the stream-processing concepts page overlaps with the tutorials overview page. Where the stream-processing concepts page has better content, those bits should be merged in. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17244) Update Getting Started / Overview to mention Tutorials
David Anderson created FLINK-17244: -- Summary: Update Getting Started / Overview to mention Tutorials Key: FLINK-17244 URL: https://issues.apache.org/jira/browse/FLINK-17244 Project: Flink Issue Type: Sub-task Components: Documentation Reporter: David Anderson Assignee: David Anderson -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17243) Update Getting Started / Overview to mention Tutorials
David Anderson created FLINK-17243: -- Summary: Update Getting Started / Overview to mention Tutorials Key: FLINK-17243 URL: https://issues.apache.org/jira/browse/FLINK-17243 Project: Flink Issue Type: Bug Components: Documentation Affects Versions: 1.11.0 Reporter: David Anderson -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17242) Update docs Home page to mention Tutorials
David Anderson created FLINK-17242: -- Summary: Update docs Home page to mention Tutorials Key: FLINK-17242 URL: https://issues.apache.org/jira/browse/FLINK-17242 Project: Flink Issue Type: Sub-task Components: Documentation Affects Versions: 1.11.0 Reporter: David Anderson The First Steps section of the docs home page should mention and link to the Tutorials Overview. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17241) Add page on Fault Tolerance to Tutorials
David Anderson created FLINK-17241: -- Summary: Add page on Fault Tolerance to Tutorials Key: FLINK-17241 URL: https://issues.apache.org/jira/browse/FLINK-17241 Project: Flink Issue Type: Sub-task Components: Documentation Affects Versions: 1.11.0 Reporter: David Anderson Assignee: David Anderson Fix For: 1.11.0

Topics:
* State backends
* State snapshots

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17240) Add page on Event-driven Applications to Tutorials section
David Anderson created FLINK-17240: -- Summary: Add page on Event-driven Applications to Tutorials section Key: FLINK-17240 URL: https://issues.apache.org/jira/browse/FLINK-17240 Project: Flink Issue Type: Sub-task Components: Documentation Affects Versions: 1.11.0 Reporter: David Anderson Assignee: David Anderson Fix For: 1.11.0

Topics:
* Process Functions
* Side Outputs

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17239) Add streaming analytics page to Tutorials section
David Anderson created FLINK-17239: -- Summary: Add streaming analytics page to Tutorials section Key: FLINK-17239 URL: https://issues.apache.org/jira/browse/FLINK-17239 Project: Flink Issue Type: Sub-task Components: Documentation Affects Versions: 1.11.0 Reporter: David Anderson Assignee: David Anderson Fix For: 1.11.0

Topics:
* event time
* watermarks
* windows

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17238) Add ETL page to Tutorials section
David Anderson created FLINK-17238: -- Summary: Add ETL page to Tutorials section Key: FLINK-17238 URL: https://issues.apache.org/jira/browse/FLINK-17238 Project: Flink Issue Type: Sub-task Components: Documentation Affects Versions: 1.11.0 Reporter: David Anderson Assignee: David Anderson Fix For: 1.11.0 This page should cover enough of the DataStream API to permit building simple data pipelines and ETL jobs, including keyed state and connected streams. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17237) Add Intro to DataStream API page to Tutorials section
David Anderson created FLINK-17237: -- Summary: Add Intro to DataStream API page to Tutorials section Key: FLINK-17237 URL: https://issues.apache.org/jira/browse/FLINK-17237 Project: Flink Issue Type: Sub-task Components: Documentation Affects Versions: 1.11.0 Reporter: David Anderson Assignee: David Anderson Fix For: 1.11.0 This page should contain a basic introduction, a complete example, and a pointer to the RideCleansing exercise. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17236) Add new Tutorials section to Documentation
David Anderson created FLINK-17236: -- Summary: Add new Tutorials section to Documentation Key: FLINK-17236 URL: https://issues.apache.org/jira/browse/FLINK-17236 Project: Flink Issue Type: Sub-task Components: Documentation Affects Versions: 1.11.0 Reporter: David Anderson Assignee: David Anderson This section will contain pages of content contributed from Ververica's Flink training website. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-16925) remove Custom Search Engine files from statefun docs
David Anderson created FLINK-16925: -- Summary: remove Custom Search Engine files from statefun docs Key: FLINK-16925 URL: https://issues.apache.org/jira/browse/FLINK-16925 Project: Flink Issue Type: Improvement Components: Documentation, Stateful Functions Reporter: David Anderson Fix For: statefun-2.0.1 cse.xml and annotations.xml were copied over from the Flink docs, but aren't used. Better to remove them. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-14270) new web ui should display more than 4 metrics
David Anderson created FLINK-14270: -- Summary: new web ui should display more than 4 metrics Key: FLINK-14270 URL: https://issues.apache.org/jira/browse/FLINK-14270 Project: Flink Issue Type: New Feature Components: Runtime / Web Frontend Affects Versions: 1.9.0 Reporter: David Anderson Attachments: input-metrics-all-zero.png The old web UI can display at least 9 metrics at once, and this can be valuable. The new interface is limited to 4 metrics, which is not enough. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-14160) Extend Operations Playground with --backpressure option
David Anderson created FLINK-14160: -- Summary: Extend Operations Playground with --backpressure option Key: FLINK-14160 URL: https://issues.apache.org/jira/browse/FLINK-14160 Project: Flink Issue Type: New Feature Components: Documentation Reporter: David Anderson Add a --backpressure option to the ClickEventCount job used in the operations playground. This will insert an optional operator into the job that causes severe, periodic backpressure that can be observed in the metrics and web UI. -- This message was sent by Atlassian Jira (v8.3.4#803005)
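One way such an operator could behave is sketched below in stdlib-only Java. This is not the playground's actual code: the class name, constructor parameters, and the alternating slow/fast phases are assumptions chosen for illustration. The operator passes records through unchanged but sleeps per record during every other period, which would throttle throughput and push backpressure upstream.

```java
import java.util.function.LongSupplier;
import java.util.function.UnaryOperator;

/** Stdlib-only sketch of a pass-through operator with periodic slowdowns (hypothetical names). */
class BackpressureSketch<T> implements UnaryOperator<T> {
    private final LongSupplier clockMs; // injectable clock so the phase logic is testable
    private final long periodMs;        // length of each slow or fast phase
    private final long delayMs;         // per-record delay during a slow phase

    BackpressureSketch(LongSupplier clockMs, long periodMs, long delayMs) {
        this.clockMs = clockMs;
        this.periodMs = periodMs;
        this.delayMs = delayMs;
    }

    /** True during even-numbered periods, false during odd-numbered ones. */
    boolean inSlowPhase() {
        return (clockMs.getAsLong() / periodMs) % 2 == 0;
    }

    @Override
    public T apply(T record) {
        if (inSlowPhase()) {
            try {
                Thread.sleep(delayMs); // slow consumption is what creates backpressure
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return record; // records are never modified or dropped
    }

    public static void main(String[] args) {
        BackpressureSketch<String> operator =
                new BackpressureSketch<>(System::currentTimeMillis, 60_000L, 100L);
        System.out.println(operator.apply("click") + " slowPhase=" + operator.inSlowPhase());
    }
}
```

Keeping the operator a pure pass-through means enabling the option would change only the job's timing, not its results.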
[jira] [Created] (FLINK-13816) Long job names result in a very ugly table listing the completed jobs in the web UI
David Anderson created FLINK-13816: -- Summary: Long job names result in a very ugly table listing the completed jobs in the web UI Key: FLINK-13816 URL: https://issues.apache.org/jira/browse/FLINK-13816 Project: Flink Issue Type: Bug Components: Runtime / Web Frontend Affects Versions: 1.9.0 Reporter: David Anderson Attachments: Screen Shot 2019-08-21 at 1.20.45 PM.png Although this is a UI flaw, it's bad enough I've classified it as a bug. The horizontal space used for the list of jobs in the new, angular-based web frontend needs to be distributed more fairly (see the attached image). Some min-width for each of the columns would be one solution. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (FLINK-11997) ConcurrentModificationException: ZooKeeper unexpectedly modified
David Anderson created FLINK-11997: -- Summary: ConcurrentModificationException: ZooKeeper unexpectedly modified Key: FLINK-11997 URL: https://issues.apache.org/jira/browse/FLINK-11997 Project: Flink Issue Type: Bug Components: Runtime / Checkpointing Affects Versions: 1.8.0 Environment: Flink 1.8.0-rc4, running in a k8s job cluster with checkpointing and savepointing in minio. ZooKeeper enabled, also saving to minio.

jobmanager.rpc.address: localhost
jobmanager.rpc.port: 6123
jobmanager.heap.size: 1024m
taskmanager.heap.size: 1024m
taskmanager.numberOfTaskSlots: 4
parallelism.default: 4
high-availability: zookeeper
high-availability.jobmanager.port: 6123
high-availability.storageDir: s3://highavailability/storage
high-availability.zookeeper.quorum: zoo1:2181
state.backend: filesystem
state.checkpoints.dir: s3://state/checkpoints
state.savepoints.dir: s3://state/savepoints
rest.port: 8081
zookeeper.sasl.disable: true
s3.access-key: minio
s3.secret-key: minio123
s3.path-style-access: true
s3.endpoint: http://minio-service:9000

Reporter: David Anderson

Trying to rescale a job running in a k8s job cluster via

flink modify -p 2 -m localhost:30081

Rescaling works fine if HA is off. Taking a savepoint and restarting from one also works fine, even with HA turned on. But rescaling by modifying the job always fails, as shown below:

Caused by: org.apache.flink.util.FlinkException: Failed to rescale the job .
    ... 21 more
Caused by: java.util.concurrent.CompletionException: org.apache.flink.runtime.jobmaster.exceptions.JobModificationException: Could not restore from temporary rescaling savepoint. This might indicate that the savepoint s3://state/savepoints/savepoint-00-2fa7fd5dabb2 got corrupted. Deleting this savepoint as a precaution.
    at org.apache.flink.runtime.jobmaster.JobMaster.lambda$rescaleOperators$4(JobMaster.java:470)
    at java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
    at java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
    ... 18 more
Caused by: org.apache.flink.runtime.jobmaster.exceptions.JobModificationException: Could not restore from temporary rescaling savepoint. This might indicate that the savepoint s3://state/savepoints/savepoint-00-2fa7fd5dabb2 got corrupted. Deleting this savepoint as a precaution.
    at org.apache.flink.runtime.jobmaster.JobMaster.lambda$restoreExecutionGraphFromRescalingSavepoint$18(JobMaster.java:1433)
    at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
    at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.ConcurrentModificationException: ZooKeeper unexpectedly modified
    at org.apache.flink.runtime.zookeeper.ZooKeeperStateHandleStore.addAndLock(ZooKeeperStateHandleStore.java:159)
    at org.apache.flink.runtime.checkpoint.ZooKeeperCompletedCheckpointStore.addCheckpoint(ZooKeeperCompletedCheckpointStore.java:216)
    at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1106)
    at org.apache.flink.runtime.jobmaster.JobMaster.tryRestoreExecutionGraphFromSavepoint(JobMaster.java:1251)
    at org.apache.flink.runtime.jobmaster.JobMaster.lambda$restoreExecutionGraphFromRescalingSavepoint$18(JobMaster.java:1413)
    ... 10 more
Caused by: org.apache.flink.shaded.zookeeper.org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists
    at org.apache.flink.shaded.zookeeper.org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
    at org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:1006)
    at org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:910)
    at org.apache.flink.shaded.curator.org.apache.curator.framework.imps.CuratorTransactionImpl.doOperat
[jira] [Created] (FLINK-9980) wiki-edits quickstart example fails when run outside of IDE
David Anderson created FLINK-9980: - Summary: wiki-edits quickstart example fails when run outside of IDE Key: FLINK-9980 URL: https://issues.apache.org/jira/browse/FLINK-9980 Project: Flink Issue Type: Bug Components: Examples Affects Versions: 1.5.1 Reporter: David Anderson Fix For: 1.6.0 Following the instructions in the docs, I find that this example runs in intellij, but when run from the command line as instructed {{mvn exec:java -Dexec.mainClass=wikiedits.WikipediaAnalysis}} it fails with {{java.lang.NoClassDefFoundError: org/apache/flink/streaming/api/functions/source/SourceFunction}} I discovered this when trying to reproduce a problem reported on stack overflow: [https://stackoverflow.com/questions/51550479/caused-by-java-io-ioexception-unable-to-serialize-default-value-of-type-tuple2] That user is getting a different runtime error. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-9359) Update quickstart docs to only mention Java 8
David Anderson created FLINK-9359: - Summary: Update quickstart docs to only mention Java 8 Key: FLINK-9359 URL: https://issues.apache.org/jira/browse/FLINK-9359 Project: Flink Issue Type: Bug Components: Documentation Affects Versions: 1.4.2, 1.5.0, 1.6.0 Reporter: David Anderson Assignee: David Anderson Fix For: 1.5.0, 1.6.0, 1.4.2 Java 7 support was dropped from Flink 1.4, and Java 9 and 10 aren't yet supported, but the quickstart docs still say "the only requirement is to have a working *Java 7.x* (or higher) installation". -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-9288) clarify a few points in the event time / watermark docs
David Anderson created FLINK-9288: - Summary: clarify a few points in the event time / watermark docs Key: FLINK-9288 URL: https://issues.apache.org/jira/browse/FLINK-9288 Project: Flink Issue Type: Improvement Components: Documentation Reporter: David Anderson Assignee: David Anderson Fix For: 1.5.0, 1.6.0 There are a few things that folks often seem to miss when reading the event time and watermark docs. Adding a couple of sentences and a couple of links should help. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-8914) CEP's greedy() modifier doesn't work
David Anderson created FLINK-8914: - Summary: CEP's greedy() modifier doesn't work Key: FLINK-8914 URL: https://issues.apache.org/jira/browse/FLINK-8914 Project: Flink Issue Type: Bug Components: CEP Affects Versions: 1.4.1, 1.4.0 Reporter: David Anderson

When applied to the first or last component of a CEP Pattern, greedy() doesn't work correctly. Here's an example:

{code:java}
package com.dataartisans.flinktraining.exercises.datastream_java.cep;

import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternSelectFunction;
import org.apache.flink.cep.PatternStream;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

import java.util.List;
import java.util.Map;

public class RunLength {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1);

        DataStream<Integer> input = env.fromElements(1, 1, 1, 1, 1, 0, 1, 1, 1, 0);

        Pattern<Integer, ?> onesThenZero = Pattern.<Integer>begin("ones")
            .where(new SimpleCondition<Integer>() {
                @Override
                public boolean filter(Integer value) throws Exception {
                    return value == 1;
                }
            })
            .oneOrMore()
            .greedy()
            .consecutive()
            .next("zero")
            .where(new SimpleCondition<Integer>() {
                @Override
                public boolean filter(Integer value) throws Exception {
                    return value == 0;
                }
            });

        PatternStream<Integer> patternStream = CEP.pattern(input, onesThenZero);

        // Expected: 5 3
        // Actual: 5 4 3 2 1 3 2 1
        patternStream.select(new LengthOfRun()).print();

        env.execute();
    }

    public static class LengthOfRun implements PatternSelectFunction<Integer, Integer> {
        public Integer select(Map<String, List<Integer>> pattern) {
            return pattern.get("ones").size();
        }
    }
}
{code}

The only workaround for now seems to be to rewrite the pattern so that greedy() isn't needed, i.e., by bracketing the greedy section with a prefix and suffix that both have to be matched.
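To make the expected result concrete, here is a minimal plain-Java sketch (no Flink involved) of the output the report asks for: one length per maximal run of 1s that is immediately followed by a 0. The class and method names are hypothetical and exist only for illustration; this is not how CEP evaluates the pattern, just a reference for the "Expected: 5 3" comment in the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java reference for the expected output; names here are hypothetical.
public class ExpectedRunLengths {

    // Returns the length of every maximal run of 1s that is immediately followed by a 0.
    public static List<Integer> runLengths(List<Integer> input) {
        List<Integer> lengths = new ArrayList<>();
        int run = 0; // number of consecutive 1s seen so far
        for (int value : input) {
            if (value == 1) {
                run++;
            } else {
                // Only a 0 directly after at least one 1 completes a match.
                if (value == 0 && run > 0) {
                    lengths.add(run);
                }
                run = 0;
            }
        }
        return lengths;
    }

    public static void main(String[] args) {
        // Same input as the RunLength example in the report.
        System.out.println(runLengths(Arrays.asList(1, 1, 1, 1, 1, 0, 1, 1, 1, 0))); // prints [5, 3]
    }
}
```

The reported actual output (5 4 3 2 1 3 2 1) has the shape one would get if every suffix of each run were also emitted as a match, which is what greedy() is expected to suppress.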
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-7136) Docs search can be customized to be more useful
David Anderson created FLINK-7136: - Summary: Docs search can be customized to be more useful Key: FLINK-7136 URL: https://issues.apache.org/jira/browse/FLINK-7136 Project: Flink Issue Type: Improvement Components: Documentation Affects Versions: 1.4.0 Reporter: David Anderson Assignee: David Anderson

The Google custom search engine we're using for search can be customized to make it more useful. I propose to

* turn off ads (since this site belongs to a non-profit org)
* add additional sources of information
  * mailing lists
  * JIRA
  * FLIPs
  * Stack Overflow
  * Flink Forward talks
* use refinements (tabs) to make it easy to navigate between these sources

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-6797) building docs fails with bundler 1.15
David Anderson created FLINK-6797: - Summary: building docs fails with bundler 1.15 Key: FLINK-6797 URL: https://issues.apache.org/jira/browse/FLINK-6797 Project: Flink Issue Type: Bug Components: Documentation Reporter: David Anderson Assignee: David Anderson Priority: Critical Fix For: 1.3.1, 1.4.0 The script for building the docs installs the latest version of the bundler ruby gem (if it can't find the bundle command, which is always the case on the build-bots, for example). Since the release of bundler 1.15 this fails because it is now pickier about dependency checking, and we somehow ended up with an invalid dependency rule in Gemfile.lock. -- This message was sent by Atlassian JIRA (v6.3.15#6346)