[jira] [Created] (FLINK-18535) Elasticsearch connector hangs for threads deadlocked
DeFOX created FLINK-18535:
------------------------------

             Summary: Elasticsearch connector hangs for threads deadlocked
                 Key: FLINK-18535
                 URL: https://issues.apache.org/jira/browse/FLINK-18535
             Project: Flink
          Issue Type: Bug
          Components: Connectors / ElasticSearch
    Affects Versions: 1.10.0
            Reporter: DeFOX

When sinking data to Elasticsearch (tested against ES 7.7.1 and 6.8.5) with a connector created by org.apache.flink.streaming.connectors.elasticsearch7.ElasticsearchSink.Builder, the Flink job eventually hangs because its threads deadlock. A connector created by org.apache.flink.streaming.connectors.elasticsearch6.ElasticsearchSink.Builder does not run into this deadlock. The deadlock may come from this Elasticsearch PR: [https://github.com/elastic/elasticsearch/pull/41451]. So upgrading the version of the Elasticsearch dependencies may fix this bug.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
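For reference, a minimal sketch of how such a sink is typically built with the ES7 builder. The host, index name, and String element type are illustrative assumptions, not taken from the report:

{code:java}
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.connectors.elasticsearch.ElasticsearchSinkFunction;
import org.apache.flink.streaming.connectors.elasticsearch7.ElasticsearchSink;
import org.apache.http.HttpHost;
import org.elasticsearch.client.Requests;

public class Es7SinkSketch {
    public static void attachSink(DataStream<String> stream) {
        List<HttpHost> hosts = Collections.singletonList(new HttpHost("localhost", 9200, "http"));

        ElasticsearchSink.Builder<String> builder = new ElasticsearchSink.Builder<>(
                hosts,
                (ElasticsearchSinkFunction<String>) (element, ctx, indexer) -> {
                    Map<String, String> json = new HashMap<>();
                    json.put("data", element);
                    // Queue an index request; the sink flushes it through the bulk processor.
                    indexer.add(Requests.indexRequest().index("my-index").source(json));
                });

        // Flush after every element, as in the documented example.
        builder.setBulkFlushMaxActions(1);

        stream.addSink(builder.build());
    }
}
{code}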
Re: [DISCUSS] Releasing Flink 1.11.1 soon?
Besides, it would be great if we could figure out the performance regression Thomas reported earlier. Do you know what the status is now? @zhijiang @Thomas

Best,
Jark

On Thu, 9 Jul 2020 at 11:10, Jark Wu wrote:

> Hi everyone,
>
> As discussed in the voting thread of 1.11.0-RC4 [1], we found a blocker
> issue in the CDC feature [2].
> Considering this is a new kind of connector, we don't want to block the
> ready-to-publish RC4 and prefer to have an immediate 1.11.1 release.
> Therefore, I would like to start the discussion about releasing 1.11.1
> soon, to deliver a complete CDC feature.
> We can also include some notable bug fixes found in recent days. But I
> suggest we not wait too long to collect/fix bugs;
> otherwise it will delay the feature delivery again. We can launch another
> patch release after that if needed.
>
> The most notable bug fixes so far are:
>
> - FLINK-18461 Changelog source can't be insert into upsert sink
> - FLINK-18426 Incompatible deprecated key type for registration cluster
>   options
>
> Furthermore, I think it would be better if we can fix these issues in
> 1.11.1:
>
> - FLINK-18434 Can not select fields with JdbcCatalog (PR opened)
> - FLINK-18520 New Table Function type inference fails (PR opened)
> - FLINK-18477 ChangelogSocketExample does not work
>
> I'd like to suggest creating the first RC next Monday. What do you think?
>
> If there are any concerns or missing blocker issues that need to be fixed in
> 1.11.1, please let me know. Thanks.
>
> Best,
> Jark
>
> [1]: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Release-1-11-0-release-candidate-4-tp42829p42858.html
>
> [2]: https://issues.apache.org/jira/browse/FLINK-18461
[DISCUSS] Releasing Flink 1.11.1 soon?
Hi everyone,

As discussed in the voting thread of 1.11.0-RC4 [1], we found a blocker issue in the CDC feature [2]. Considering this is a new kind of connector, we don't want to block the ready-to-publish RC4 and prefer to have an immediate 1.11.1 release. Therefore, I would like to start the discussion about releasing 1.11.1 soon, to deliver a complete CDC feature. We can also include some notable bug fixes found in recent days. But I suggest we not wait too long to collect/fix bugs; otherwise it will delay the feature delivery again. We can launch another patch release after that if needed.

The most notable bug fixes so far are:

- FLINK-18461 Changelog source can't be insert into upsert sink
- FLINK-18426 Incompatible deprecated key type for registration cluster options

Furthermore, I think it would be better if we can fix these issues in 1.11.1:

- FLINK-18434 Can not select fields with JdbcCatalog (PR opened)
- FLINK-18520 New Table Function type inference fails (PR opened)
- FLINK-18477 ChangelogSocketExample does not work

I'd like to suggest creating the first RC next Monday. What do you think?

If there are any concerns or missing blocker issues that need to be fixed in 1.11.1, please let me know. Thanks.

Best,
Jark

[1]: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Release-1-11-0-release-candidate-4-tp42829p42858.html
[2]: https://issues.apache.org/jira/browse/FLINK-18461
Re: [DISCUSS] FLIP-130: Support for Python DataStream API (Stateless Part)
Hi Shuiqiang,

Thanks a lot for driving this discussion. Big +1 for supporting Python DataStream. In many ML scenarios, operating on objects is more natural than operating on tables.

Best,
Xingbo

Wei Zhong wrote on Thu, Jul 9, 2020 at 10:35 AM:

> Hi Shuiqiang,
>
> Thanks for driving this. Big +1 for supporting DataStream API in PyFlink!
>
> Best,
> Wei
>
> > On Jul 9, 2020, at 10:29, Hequn Cheng wrote:
> >
> > +1 for adding the Python DataStream API and starting with the stateless
> > part.
> > There are already some users who have expressed their wish to have the Python
> > DataStream APIs. Once we have the APIs in PyFlink, we can cover more use
> > cases for our users.
> >
> > Best, Hequn
> >
> > On Wed, Jul 8, 2020 at 11:45 AM Shuiqiang Chen wrote:
> >
> >> Sorry, the 3rd link is broken, please refer to this one: Support Python
> >> DataStream API
> >> <https://docs.google.com/document/d/1H3hz8wuk22-8cDBhQmQKNw3m1q5gDAMkwTDEwnj3FBI/edit>
> >>
> >> Shuiqiang Chen wrote on Wed, Jul 8, 2020 at 11:13 AM:
> >>
> >>> Hi everyone,
> >>>
> >>> As we all know, Flink provides three layered APIs: the ProcessFunctions,
> >>> the DataStream API and the SQL & Table API. Each API offers a different
> >>> trade-off between conciseness and expressiveness and targets different
> >>> use cases [1].
> >>>
> >>> Currently, the SQL & Table API is already supported in PyFlink. The
> >>> API provides relational operations as well as user-defined functions,
> >>> offering convenience for users who are familiar with Python and relational
> >>> programming.
> >>>
> >>> Meanwhile, the DataStream API and ProcessFunctions provide more generic
> >>> APIs for implementing stream processing applications. The ProcessFunctions
> >>> expose time and state, which are the fundamental building blocks for any
> >>> kind of streaming application.
> >>> To cover more use cases, we are planning to support all of these APIs in
> >>> PyFlink.
> >>>
> >>> In this discussion (FLIP-130), we propose to support the Python DataStream
> >>> API for the stateless part. For more detail, please refer to the FLIP wiki
> >>> page here [2]. If you are interested in the stateful part, you can also take a
> >>> look at the design doc here [3], which we are going to discuss in a separate
> >>> FLIP.
> >>>
> >>> Any comments will be highly appreciated!
> >>>
> >>> [1] https://flink.apache.org/flink-applications.html#layered-apis
> >>> [2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158866298
> >>> [3] https://docs.google.com/document/d/1H3hz8wuk228cDBhQmQKNw3m1q5gDAMkwTDEwnj3FBI/edit?usp=sharing
> >>>
> >>> Best,
> >>> Shuiqiang
Re: [DISCUSS] FLIP-130: Support for Python DataStream API (Stateless Part)
Hi Shuiqiang,

Thanks for driving this. Big +1 for supporting DataStream API in PyFlink!

Best,
Wei

> On Jul 9, 2020, at 10:29, Hequn Cheng wrote:
>
> +1 for adding the Python DataStream API and starting with the stateless
> part.
> There are already some users who have expressed their wish to have the Python
> DataStream APIs. Once we have the APIs in PyFlink, we can cover more use
> cases for our users.
>
> Best, Hequn
>
> On Wed, Jul 8, 2020 at 11:45 AM Shuiqiang Chen wrote:
>
>> Sorry, the 3rd link is broken, please refer to this one: Support Python
>> DataStream API
>> <https://docs.google.com/document/d/1H3hz8wuk22-8cDBhQmQKNw3m1q5gDAMkwTDEwnj3FBI/edit>
>>
>> Shuiqiang Chen wrote on Wed, Jul 8, 2020 at 11:13 AM:
>>
>>> Hi everyone,
>>>
>>> As we all know, Flink provides three layered APIs: the ProcessFunctions,
>>> the DataStream API and the SQL & Table API. Each API offers a different
>>> trade-off between conciseness and expressiveness and targets different
>>> use cases [1].
>>>
>>> Currently, the SQL & Table API is already supported in PyFlink. The
>>> API provides relational operations as well as user-defined functions,
>>> offering convenience for users who are familiar with Python and relational
>>> programming.
>>>
>>> Meanwhile, the DataStream API and ProcessFunctions provide more generic
>>> APIs for implementing stream processing applications. The ProcessFunctions
>>> expose time and state, which are the fundamental building blocks for any
>>> kind of streaming application.
>>> To cover more use cases, we are planning to support all of these APIs in
>>> PyFlink.
>>>
>>> In this discussion (FLIP-130), we propose to support the Python DataStream
>>> API for the stateless part. For more detail, please refer to the FLIP wiki
>>> page here [2]. If you are interested in the stateful part, you can also take a
>>> look at the design doc here [3], which we are going to discuss in a separate
>>> FLIP.
>>>
>>> Any comments will be highly appreciated!
>>>
>>> [1] https://flink.apache.org/flink-applications.html#layered-apis
>>> [2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158866298
>>> [3] https://docs.google.com/document/d/1H3hz8wuk228cDBhQmQKNw3m1q5gDAMkwTDEwnj3FBI/edit?usp=sharing
>>>
>>> Best,
>>> Shuiqiang
Re: [DISCUSS] FLIP-130: Support for Python DataStream API (Stateless Part)
+1 for adding the Python DataStream API and starting with the stateless part.
There are already some users who have expressed their wish to have the Python DataStream APIs. Once we have the APIs in PyFlink, we can cover more use cases for our users.

Best, Hequn

On Wed, Jul 8, 2020 at 11:45 AM Shuiqiang Chen wrote:

> Sorry, the 3rd link is broken, please refer to this one: Support Python
> DataStream API
> <https://docs.google.com/document/d/1H3hz8wuk22-8cDBhQmQKNw3m1q5gDAMkwTDEwnj3FBI/edit>
>
> Shuiqiang Chen wrote on Wed, Jul 8, 2020 at 11:13 AM:
>
> > Hi everyone,
> >
> > As we all know, Flink provides three layered APIs: the ProcessFunctions,
> > the DataStream API and the SQL & Table API. Each API offers a different
> > trade-off between conciseness and expressiveness and targets different
> > use cases [1].
> >
> > Currently, the SQL & Table API is already supported in PyFlink. The
> > API provides relational operations as well as user-defined functions,
> > offering convenience for users who are familiar with Python and relational
> > programming.
> >
> > Meanwhile, the DataStream API and ProcessFunctions provide more generic
> > APIs for implementing stream processing applications. The ProcessFunctions
> > expose time and state, which are the fundamental building blocks for any
> > kind of streaming application.
> > To cover more use cases, we are planning to support all of these APIs in
> > PyFlink.
> >
> > In this discussion (FLIP-130), we propose to support the Python DataStream
> > API for the stateless part. For more detail, please refer to the FLIP wiki
> > page here [2]. If you are interested in the stateful part, you can also take a
> > look at the design doc here [3], which we are going to discuss in a separate
> > FLIP.
> >
> > Any comments will be highly appreciated!
> >
> > [1] https://flink.apache.org/flink-applications.html#layered-apis
> > [2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158866298
> > [3] https://docs.google.com/document/d/1H3hz8wuk228cDBhQmQKNw3m1q5gDAMkwTDEwnj3FBI/edit?usp=sharing
> >
> > Best,
> > Shuiqiang
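As a point of reference for the scope, here is a minimal stateless pipeline in the existing Java DataStream API — the kind of program FLIP-130 proposes to make expressible in Python. The element values and transformations are illustrative only, not taken from the FLIP:

{code:java}
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class StatelessPipelineSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Stateless transformations only: no timers, no keyed state.
        DataStream<String> words = env.fromElements("flip", "130", "pyflink");
        words.map(s -> s.toUpperCase()).returns(Types.STRING)
             .filter(s -> !s.isEmpty())
             .print();

        env.execute("stateless-sketch");
    }
}
{code}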
[jira] [Created] (FLINK-18534) KafkaTableITCase.testKafkaDebeziumChangelogSource failed with "Topic 'changelog_topic' already exists"
Dian Fu created FLINK-18534:
----------------------------

             Summary: KafkaTableITCase.testKafkaDebeziumChangelogSource failed with "Topic 'changelog_topic' already exists"
                 Key: FLINK-18534
                 URL: https://issues.apache.org/jira/browse/FLINK-18534
             Project: Flink
          Issue Type: Test
          Components: Connectors / Kafka, Table SQL / API, Tests
    Affects Versions: 1.12.0
            Reporter: Dian Fu

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=4350&view=logs&j=4be4ed2b-549a-533d-aa33-09e28e360cc8&t=0db94045-2aa0-53fa-f444-0130d6933518

{code}
2020-07-08T21:14:04.1626423Z [ERROR] Failures:
2020-07-08T21:14:04.1629804Z [ERROR]   KafkaTableITCase.testKafkaDebeziumChangelogSource:66->KafkaTestBase.createTestTopic:197 Create test topic : changelog_topic failed, org.apache.kafka.common.errors.TopicExistsException: Topic 'changelog_topic' already exists.
2020-07-08T21:14:04.1630642Z [ERROR] Errors:
2020-07-08T21:14:04.1630986Z [ERROR]   KafkaTableITCase.testKafkaDebeziumChangelogSource:83 Failed to write debezium...
{code}

-- This message was sent by Atlassian Jira (v8.3.4#803005)
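The failure suggests the test topic survived an earlier run. A hedged sketch of the usual remedy — deleting a leftover topic (or picking a unique name) before creating it with Kafka's AdminClient. The bootstrap address and topic name are placeholders, and this is not the actual KafkaTestBase code:

{code:java}
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class TopicCleanupSketch {
    public static void recreateTopic(String bootstrap, String topic) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrap);

        try (AdminClient admin = AdminClient.create(props)) {
            // Drop a leftover topic from a previous run, if any.
            // Note: deletion completes asynchronously on the broker,
            // so a flaky test may additionally need a retry loop here.
            if (admin.listTopics().names().get().contains(topic)) {
                admin.deleteTopics(Collections.singleton(topic)).all().get();
            }
            // Recreate it fresh: 1 partition, replication factor 1.
            admin.createTopics(Collections.singleton(new NewTopic(topic, 1, (short) 1))).all().get();
        }
    }
}
{code}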
Re: [ANNOUNCE] New PMC member: Piotr Nowojski
Congratulations, Piotr!

Best,
Xingcan

On Wed, Jul 8, 2020, 21:53 Yang Wang wrote:

> Congratulations Piotr!
>
> Best,
> Yang
>
> Dan Zou wrote on Wed, Jul 8, 2020 at 10:36 PM:
>
> > Congratulations!
> >
> > Best,
> > Dan Zou
> >
> > > On Jul 8, 2020, at 5:25 PM, godfrey he wrote:
> > >
> > > Congratulations
[jira] [Created] (FLINK-18533) AccumulatorLiveITCase.testStreaming hangs
Dian Fu created FLINK-18533:
----------------------------

             Summary: AccumulatorLiveITCase.testStreaming hangs
                 Key: FLINK-18533
                 URL: https://issues.apache.org/jira/browse/FLINK-18533
             Project: Flink
          Issue Type: Test
          Components: Tests
    Affects Versions: 1.12.0
            Reporter: Dian Fu

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=4350&view=logs&j=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3&t=a99e99c7-21cd-5a1f-7274-585e62b72f56

{code}
Printing stack trace of Java process 40159
==========================================
Picked up JAVA_TOOL_OPTIONS: -XX:+HeapDumpOnOutOfMemoryError
2020-07-08 21:46:15
Full thread dump OpenJDK 64-Bit Server VM (25.242-b08 mixed mode):

"Attach Listener" #86 daemon prio=9 os_prio=0 tid=0x7fef8c025800 nid=0x1b231 runnable [0x]
   java.lang.Thread.State: RUNNABLE

"flink-taskexecutor-io-thread-2" #85 daemon prio=5 os_prio=0 tid=0x7fef9c02 nid=0xb03a waiting on condition [0x7fefac1f3000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for <0x87180a20> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
	at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

"Flink-DispatcherRestEndpoint-thread-4" #84 daemon prio=5 os_prio=0 tid=0x7fef90431800 nid=0x9e49 waiting on condition [0x7fef7df4a000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for <0x87180cc0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
	at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1088)
	at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

"Flink-DispatcherRestEndpoint-thread-3" #83 daemon prio=5 os_prio=0 tid=0x7fefa01e2800 nid=0x9dc9 waiting on condition [0x7fef7f1f4000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for <0x87180cc0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
	at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1088)
	at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
{code}
Re: [ANNOUNCE] New PMC member: Piotr Nowojski
Congratulations Piotr!

Best,
Yang

Dan Zou wrote on Wed, Jul 8, 2020 at 10:36 PM:

> Congratulations!
>
> Best,
> Dan Zou
>
> > On Jul 8, 2020, at 5:25 PM, godfrey he wrote:
> >
> > Congratulations
[jira] [Created] (FLINK-18532) Remove Beta tag from MATCH_RECOGNIZE docs
Seth Wiesman created FLINK-18532:
---------------------------------

             Summary: Remove Beta tag from MATCH_RECOGNIZE docs
                 Key: FLINK-18532
                 URL: https://issues.apache.org/jira/browse/FLINK-18532
             Project: Flink
          Issue Type: Improvement
          Components: Documentation
    Affects Versions: 1.12.0
            Reporter: Seth Wiesman
            Assignee: Seth Wiesman

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-18531) Minicluster option to reuse same port if 0 is applied
David Chen created FLINK-18531:
-------------------------------

             Summary: Minicluster option to reuse same port if 0 is applied
                 Key: FLINK-18531
                 URL: https://issues.apache.org/jira/browse/FLINK-18531
             Project: Flink
          Issue Type: Test
            Reporter: David Chen

If the MiniCluster port is set to 0, can we have an option to reuse the same port after the cluster has been closed and started again? The use case is simply that we have the port set to 0, but we'd like to restart the MiniCluster to get rid of completed/expired jobs.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
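This is not Flink API — just a minimal plain-Java sketch of the requested behavior, assuming the bound port can be read back after the first start: bind to port 0 once, record the ephemeral port the OS assigned, then rebind to that fixed port on restart:

{code:java}
import java.io.IOException;
import java.net.ServerSocket;

public class PortReuseSketch {
    public static void main(String[] args) throws IOException {
        int recordedPort;

        // First start: port 0 asks the OS for any free ephemeral port.
        try (ServerSocket first = new ServerSocket(0)) {
            recordedPort = first.getLocalPort();
            System.out.println("first start bound to " + recordedPort);
        }

        // Restart: reuse the recorded port instead of asking for a new one.
        // (This can fail if another process grabbed the port in between.)
        try (ServerSocket second = new ServerSocket(recordedPort)) {
            System.out.println("restart rebound to " + second.getLocalPort());
        }
    }
}
{code}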
Re: [ANNOUNCE] New PMC member: Piotr Nowojski
Congratulations!

Best,
Dan Zou

> On Jul 8, 2020, at 5:25 PM, godfrey he wrote:
>
> Congratulations
[jira] [Created] (FLINK-18530) ParquetAvroWriters can not write data to hdfs
humengyu created FLINK-18530:
-----------------------------

             Summary: ParquetAvroWriters can not write data to hdfs
                 Key: FLINK-18530
                 URL: https://issues.apache.org/jira/browse/FLINK-18530
             Project: Flink
          Issue Type: Bug
          Components: Connectors / FileSystem
    Affects Versions: 1.11.0
            Reporter: humengyu

I read data from Kafka and write it to HDFS by StreamingFileSink:
# In version 1.11.0, ParquetAvroWriters does not work, but it works well in version 1.10.1.
# AvroWriters works well in 1.11.0.

{code:java}
public class TestParquetAvroSink {

  @Test
  public void testParquet() throws Exception {
    EnvironmentSettings settings = EnvironmentSettings.newInstance().useBlinkPlanner()
        .inStreamingMode().build();
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    StreamTableEnvironment tEnv = StreamTableEnvironment.create(env, settings);
    env.enableCheckpointing(2L);

    TableSchema tableSchema = TableSchema.builder().fields(
        new String[]{"id", "name", "sex"},
        new DataType[]{DataTypes.STRING(), DataTypes.STRING(), DataTypes.STRING()})
        .build();

    // build a kafka source
    DataStream<Row> rowDataStream = ...;

    Schema schema = SchemaBuilder
        .record("xxx")
        .namespace("")
        .fields()
        .optionalString("id")
        .optionalString("name")
        .optionalString("sex")
        .endRecord();

    OutputFileConfig config = OutputFileConfig
        .builder()
        .withPartPrefix("prefix")
        .withPartSuffix(".ext")
        .build();

    StreamingFileSink<GenericRecord> sink = StreamingFileSink
        .forBulkFormat(
            new Path("hdfs://host:port/xxx/xxx/xxx"),
            ParquetAvroWriters.forGenericRecord(schema))
        .withOutputFileConfig(config)
        .withBucketAssigner(new DateTimeBucketAssigner<>("'pdate='yyyy-MM-dd"))
        .build();

    SingleOutputStreamOperator<GenericRecord> recordDateStream = rowDataStream
        .map(new RecordMapFunction());

    recordDateStream.print();
    recordDateStream.addSink(sink);

    env.execute("test");
  }

  @Test
  public void testAvro() throws Exception {
    EnvironmentSettings settings = EnvironmentSettings.newInstance().useBlinkPlanner()
        .inStreamingMode().build();
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    StreamTableEnvironment tEnv = StreamTableEnvironment.create(env, settings);
    env.enableCheckpointing(2L);

    TableSchema tableSchema = TableSchema.builder().fields(
        new String[]{"id", "name", "sex"},
        new DataType[]{DataTypes.STRING(), DataTypes.STRING(), DataTypes.STRING()})
        .build();

    // build a kafka source
    DataStream<Row> rowDataStream = ...;

    Schema schema = SchemaBuilder
        .record("xxx")
        .namespace("")
        .fields()
        .optionalString("id")
        .optionalString("name")
        .optionalString("sex")
        .endRecord();

    OutputFileConfig config = OutputFileConfig
        .builder()
        .withPartPrefix("prefix")
        .withPartSuffix(".ext")
        .build();

    StreamingFileSink<GenericRecord> sink = StreamingFileSink
        .forBulkFormat(
            new Path("hdfs://host:port/xxx/xxx/xxx"),
            AvroWriters.forGenericRecord(schema))
        .withOutputFileConfig(config)
        .withBucketAssigner(new DateTimeBucketAssigner<>("'pdate='yyyy-MM-dd"))
        .build();

    SingleOutputStreamOperator<GenericRecord> recordDateStream = rowDataStream
        .map(new RecordMapFunction());

    recordDateStream.print();
    recordDateStream.addSink(sink);

    env.execute("test");
  }

  public static class RecordMapFunction implements MapFunction<Row, GenericRecord> {

    private transient Schema schema;

    @Override
    public GenericRecord map(Row row) throws Exception {
      if (schema == null) {
        schema = SchemaBuilder
            .record("xxx")
            .namespace("xxx")
            .fields()
            .optionalString("id")
            .optionalString("name")
            .optionalString("sex")
            .endRecord();
      }
      Record record = new Record(schema);
      record.put("id", row.getField(0));
      record.put("name", row.getField(1));
      record.put("sex", row.getField(2));
      return record;
    }
  }
}
{code}

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-18529) Query Hive table and filter by timestamp partition doesn't work
Rui Li created FLINK-18529:
---------------------------

             Summary: Query Hive table and filter by timestamp partition doesn't work
                 Key: FLINK-18529
                 URL: https://issues.apache.org/jira/browse/FLINK-18529
             Project: Flink
          Issue Type: Bug
          Components: Connectors / Hive
            Reporter: Rui Li

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-18528) Update UNNEST to new type system
Timo Walther created FLINK-18528:
---------------------------------

             Summary: Update UNNEST to new type system
                 Key: FLINK-18528
                 URL: https://issues.apache.org/jira/browse/FLINK-18528
             Project: Flink
          Issue Type: Sub-task
          Components: Table SQL / Planner
            Reporter: Timo Walther
            Assignee: Timo Walther

Updates the built-in UNNEST function to the new type system.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
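For context, a sketch of what the built-in UNNEST function does in Flink SQL: it expands an array column into one row per element. The table name `orders` and columns `id`/`tags` are made up for illustration:

{code:java}
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;

public class UnnestSketch {
    // Assumes a table `orders` with an ARRAY<STRING> column `tags` is registered.
    public static Table unnestTags(TableEnvironment tableEnv) {
        return tableEnv.sqlQuery(
            "SELECT o.id, t.tag "
          + "FROM orders AS o "
          + "CROSS JOIN UNNEST(o.tags) AS t (tag)");
    }
}
{code}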
[jira] [Created] (FLINK-18527) Task Managers fail to start on HDP 2.6.5 due to commons-cli conflict
Truong Duc Kien created FLINK-18527:
------------------------------------

             Summary: Task Managers fail to start on HDP 2.6.5 due to commons-cli conflict
                 Key: FLINK-18527
                 URL: https://issues.apache.org/jira/browse/FLINK-18527
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Task
    Affects Versions: 1.11.0
            Reporter: Truong Duc Kien

When launching a new job on HDP 2.6.5, we encounter these exceptions:

{code:java}
2020-07-08 16:10:36 E [default-dispatcher-4] [ o.a.f.y.YarnResourceManager] Could not start TaskManager in container container_xxx
java.lang.NoSuchMethodError: org.apache.commons.cli.Option.builder(Ljava/lang/String;)Lorg/apache/commons/cli/Option$Builder;
	at org.apache.flink.runtime.entrypoint.parser.CommandLineOptions.<clinit>(CommandLineOptions.java:28) ~[flink-dist_2.12-1.11.0.jar:1.11.0]

2020-07-08 16:12:46 E [default-dispatcher-4] [ o.a.f.y.YarnResourceManager] Could not start TaskManager in container container_xxx {} []
java.lang.NoClassDefFoundError: Could not initialize class org.apache.flink.runtime.entrypoint.parser.CommandLineOptions
{code}

We figure this is because HDP 2.6.5 puts commons-cli version 1.2 on the classpath, while Flink expects version 1.3. Maybe commons-cli should also be shaded to avoid such conflicts.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
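The failing call is easy to see in isolation: Option.builder(String) only exists since commons-cli 1.3, so the snippet below — a hedged minimal repro, not Flink code — throws the same NoSuchMethodError when commons-cli 1.2 wins on the classpath. The option name is a made-up example:

{code:java}
import org.apache.commons.cli.Option;

public class CommonsCliProbe {
    public static void main(String[] args) {
        // Compiles against commons-cli 1.3+; at runtime against 1.2 this line
        // throws NoSuchMethodError because Option.builder(String) does not exist there.
        Option opt = Option.builder("D")
                .longOpt("dynamic-property") // hypothetical option name, for illustration
                .hasArg()
                .build();
        System.out.println("commons-cli >= 1.3 is on the classpath: " + opt.getLongOpt());
    }
}
{code}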
Re: [ANNOUNCE] New PMC member: Piotr Nowojski
Congratulations Piotr!

Best,
Godfrey

Yun Tang wrote on Wed, Jul 8, 2020 at 5:22 PM:

> Congratulations, Piotr!
>
> Best
> Yun Tang
>
> From: Danny Chan
> Sent: Wednesday, July 8, 2020 15:46
> To: dev@flink.apache.org
> Subject: Re: [ANNOUNCE] New PMC member: Piotr Nowojski
>
> Congratulations and nice work Piotr ~
>
> Best,
> Danny Chan
> On Jul 7, 2020, 10:00 PM +0800, dev@flink.apache.org wrote:
> >
> > Congratulations!
Re: [ANNOUNCE] New PMC member: Piotr Nowojski
Congratulations, Piotr!

Best
Yun Tang

From: Danny Chan
Sent: Wednesday, July 8, 2020 15:46
To: dev@flink.apache.org
Subject: Re: [ANNOUNCE] New PMC member: Piotr Nowojski

Congratulations and nice work Piotr ~

Best,
Danny Chan

On Jul 7, 2020, 10:00 PM +0800, dev@flink.apache.org wrote:
>
> Congratulations!
[jira] [Created] (FLINK-18526) Add the configuration of Python UDF using Managed Memory in the doc of Pyflink
Huang Xingbo created FLINK-18526:
---------------------------------

             Summary: Add the configuration of Python UDF using Managed Memory in the doc of Pyflink
                 Key: FLINK-18526
                 URL: https://issues.apache.org/jira/browse/FLINK-18526
             Project: Flink
          Issue Type: Improvement
          Components: API / Python, Documentation
            Reporter: Huang Xingbo
             Fix For: 1.12.0, 1.11.1

Document how to configure Python UDFs to use managed memory in the PyFlink docs.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-18525) Running yarn-session.sh occurs error
zhangyunyun created FLINK-18525:
--------------------------------

             Summary: Running yarn-session.sh occurs error
                 Key: FLINK-18525
                 URL: https://issues.apache.org/jira/browse/FLINK-18525
             Project: Flink
          Issue Type: Bug
          Components: Deployment / YARN
    Affects Versions: 1.11.0
         Environment: hadoop-2.8.5
            Reporter: zhangyunyun

org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster
	at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:382)
	at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:514)
	at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$4(FlinkYarnSessionCli.java:751)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
	at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
	at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:751)
Caused by: java.io.FileNotFoundException: File does not exist: /tmp/application_1594196612035_0008-flink-conf.yaml3951184480005887817.tmp
	at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1444)
	at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1437)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1437)
	at org.apache.flink.yarn.YarnApplicationFileUploader.registerSingleLocalResource(YarnApplicationFileUploader.java:163)
	at org.apache.flink.yarn.YarnClusterDescriptor.startAppMaster(YarnClusterDescriptor.java:839)
	at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:524)
	at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:375)
	... 7 more

-- This message was sent by Atlassian Jira (v8.3.4#803005)
Why ApplicationMode for k8s doesn't support using remote jar?
Hi,

When I use Application Mode on Kubernetes to submit a new Flink application, I cannot use a remote jar the way YARN does, and I have to pack the jar into the image beforehand. So I want to know why Application Mode for Kubernetes doesn't support reading a remote jar.

--
Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
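For reference, a Flink 1.11 native Kubernetes application-mode submission looks roughly like the sketch below; the job jar must use the local:// scheme, i.e. it has to be present inside the image. The cluster id, image name, and jar path are placeholders:

{code}
./bin/flink run-application \
    -t kubernetes-application \
    -Dkubernetes.cluster-id=my-application-cluster \
    -Dkubernetes.container.image=my-custom-flink-image \
    local:///opt/flink/usrlib/my-job.jar
{code}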
Re: [ANNOUNCE] Apache Flink 1.11.0 released
Congratulations! Thanks Zhijiang and Piotr for the great work, and thanks to everyone for their contributions!

Best,
Godfrey

Benchao Li wrote on Wed, Jul 8, 2020 at 12:39 PM:

> Congratulations! Thanks Zhijiang & Piotr for the great work as release
> managers.
>
> Rui Li wrote on Wed, Jul 8, 2020 at 11:38 AM:
>
>> Congratulations! Thanks Zhijiang & Piotr for the hard work.
>>
>> On Tue, Jul 7, 2020 at 10:06 PM Zhijiang wrote:
>>
>>> The Apache Flink community is very happy to announce the release of
>>> Apache Flink 1.11.0, which is the latest major release.
>>>
>>> Apache Flink® is an open-source stream processing framework for
>>> distributed, high-performing, always-available, and accurate data
>>> streaming applications.
>>>
>>> The release is available for download at:
>>> https://flink.apache.org/downloads.html
>>>
>>> Please check out the release blog post for an overview of the
>>> improvements in this new major release:
>>> https://flink.apache.org/news/2020/07/06/release-1.11.0.html
>>>
>>> The full release notes are available in Jira:
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346364
>>>
>>> We would like to thank all contributors of the Apache Flink community
>>> who made this release possible!
>>>
>>> Cheers,
>>> Piotr & Zhijiang
[jira] [Created] (FLINK-18524) Scala varargs cause exception for new inference
Timo Walther created FLINK-18524:
---------------------------------

             Summary: Scala varargs cause exception for new inference
                 Key: FLINK-18524
                 URL: https://issues.apache.org/jira/browse/FLINK-18524
             Project: Flink
          Issue Type: Sub-task
          Components: Table SQL / API
            Reporter: Timo Walther
            Assignee: Timo Walther

Scala varargs are supported but currently cause an exception, because the class file contains two signatures (a valid and an invalid one).

-- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [ANNOUNCE] New PMC member: Piotr Nowojski
Congratulations and nice work Piotr ~

Best,
Danny Chan

On Jul 7, 2020, 10:00 PM +0800, dev@flink.apache.org wrote:
>
> Congratulations!
Re: [ANNOUNCE] Apache Flink 1.11.0 released
Thanks Zhijiang and Piotr for the great work as release managers, and thanks to everyone who made the release possible!

Best,
Congxian

Benchao Li wrote on Wed, Jul 8, 2020 at 12:39 PM:

> Congratulations! Thanks Zhijiang & Piotr for the great work as release
> managers.
>
> Rui Li wrote on Wed, Jul 8, 2020 at 11:38 AM:
>
>> Congratulations! Thanks Zhijiang & Piotr for the hard work.
>>
>> On Tue, Jul 7, 2020 at 10:06 PM Zhijiang wrote:
>>
>>> The Apache Flink community is very happy to announce the release of
>>> Apache Flink 1.11.0, which is the latest major release.
>>>
>>> Apache Flink® is an open-source stream processing framework for
>>> distributed, high-performing, always-available, and accurate data
>>> streaming applications.
>>>
>>> The release is available for download at:
>>> https://flink.apache.org/downloads.html
>>>
>>> Please check out the release blog post for an overview of the
>>> improvements in this new major release:
>>> https://flink.apache.org/news/2020/07/06/release-1.11.0.html
>>>
>>> The full release notes are available in Jira:
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346364
>>>
>>> We would like to thank all contributors of the Apache Flink community
>>> who made this release possible!
>>>
>>> Cheers,
>>> Piotr & Zhijiang
[jira] [Created] (FLINK-18523) The watermark could be updated without data for some time
chen yong created FLINK-18523:
------------------------------

             Summary: The watermark could be updated without data for some time
                 Key: FLINK-18523
                 URL: https://issues.apache.org/jira/browse/FLINK-18523
             Project: Flink
          Issue Type: Improvement
    Affects Versions: 1.11.0
            Reporter: chen yong

In window calculations with event time, if the source has no data for some reason, the watermark cannot advance, and the last windows never trigger their calculations. The existing parameter table.exec.source.idle-timeout only solves the alignment problem of ignoring idle parallel instances when emitting the watermark; when none of the parallel instances has a watermark, the watermark still cannot advance.

Is it possible to add a lock-timeout parameter (which should be larger than maxOutOfOrderness, with a default of "-1 ms")? If the watermark has not been updated within this time (i.e., there is no data), the current time would be taken and sent downstream as the watermark.

Thanks!

-- This message was sent by Atlassian Jira (v8.3.4#803005)
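For comparison, the idleness mechanism that already exists in Flink 1.11 is attached to the watermark strategy; a minimal sketch follows (the event type and durations are illustrative). It marks an idle source as such so it stops holding back watermark alignment, but, as the reporter notes, it does not advance the watermark when all inputs are silent:

{code:java}
import java.time.Duration;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;

public class IdlenessSketch {
    // A hypothetical event carrying its own millisecond timestamp.
    public static class Event {
        public long timestampMillis;
    }

    public static WatermarkStrategy<Event> strategy() {
        return WatermarkStrategy
                .<Event>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                .withTimestampAssigner((event, recordTs) -> event.timestampMillis)
                // After 1 minute without records, the source is flagged idle so the
                // downstream watermark can advance based on the remaining active inputs.
                .withIdleness(Duration.ofMinutes(1));
    }
}
{code}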