Re: [ANNOUNCE] Apache Flink 1.11.0, release candidate #2
Hi Piotr and Zhijiang,

I merged three commits about an hour ago:

one blocker: 0dafcf8792220dbe5b77544261f726a566b054f5
two critical issues: 1830c1c47b8a985ec328a7332e92d21433c0a4df and 80fa0f5c5b8600f4b386487f267bde80b882bd07

I have synced with Zhijiang; they were merged after the RC2 branch was cut, so RC2 does not include them. I hope this does not disturb your RC testing. If there are any problems, I will revert them at any time.

Best,
Jingsong

On Wed, Jun 17, 2020 at 9:29 PM Piotr Nowojski wrote:
> Hi all,
>
> I would like to give an update about the RC2 status. We are now waiting for
> a green azure build on one final bug fix before creating RC2. This bug fix
> should be merged late afternoon/early evening Berlin time, so RC2 will be
> hopefully created tomorrow morning. Until then I would ask to not
> merge/backport commits to release-1.11 branch, including bug fixes. If you
> have something that's truly essential and should be treated as a release
> blocker, please reach out to me or Zhijiang.
>
> Best,
> Piotr Nowojski

--
Best, Jingsong Lee
[jira] [Created] (FLINK-18355) Simplify tests of SlotPoolImpl
Zhu Zhu created FLINK-18355:
---
Summary: Simplify tests of SlotPoolImpl
Key: FLINK-18355
URL: https://issues.apache.org/jira/browse/FLINK-18355
Project: Flink
Issue Type: Sub-task
Components: Runtime / Coordination, Tests
Reporter: Zhu Zhu
Fix For: 1.12.0

The tests of SlotPoolImpl, including SlotPoolImplTest, SlotPoolInteractionsTest and SlotPoolSlotSharingTest, are unnecessarily complicated in how much other code they involve. E.g. a SchedulerImpl built on top of SlotPoolImpl is used to allocate slots from the SlotPoolImpl, which can be simplified by invoking slot allocation on the SlotPoolImpl directly. Besides that, there is quite a lot of duplication between the test classes of SlotPoolImpl; this further includes SlotPoolPendingRequestFailureTest, SlotPoolRequestCompletionTest and SlotPoolBatchSlotRequestTest. It would ease future development and maintenance a lot if we cleaned up these tests by
1. introducing a common test base for reusing fields and methods
2. removing the usages of SchedulerImpl for slot pool testing
3. other possible simplifications

--
This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [ANNOUNCE] Yu Li is now part of the Flink PMC
Congratulations , Yu! Best, Guowei Yang Wang 于2020年6月18日周四 上午10:36写道: > Congratulations , Yu! > > Best, > Yang > > Piotr Nowojski 于2020年6月17日周三 下午9:21写道: > > > Congratulations :) > > > > > On 17 Jun 2020, at 14:53, Yun Tang wrote: > > > > > > Congratulations , Yu! well deserved. > > > > > > Best > > > Yun Tang > > > > > > From: Yu Li > > > Sent: Wednesday, June 17, 2020 20:03 > > > To: dev > > > Subject: Re: [ANNOUNCE] Yu Li is now part of the Flink PMC > > > > > > Thanks everyone! Really happy to work in such a great and encouraging > > > community! > > > > > > Best Regards, > > > Yu > > > > > > > > > On Wed, 17 Jun 2020 at 19:59, Congxian Qiu > > wrote: > > > > > >> Congratulations Yu ! > > >> Best, > > >> Congxian > > >> > > >> > > >> Thomas Weise 于2020年6月17日周三 下午6:23写道: > > >> > > >>> Congratulations! > > >>> > > >>> > > >>> On Wed, Jun 17, 2020, 2:59 AM Fabian Hueske > wrote: > > >>> > > Congrats Yu! > > > > Cheers, Fabian > > > > Am Mi., 17. Juni 2020 um 10:20 Uhr schrieb Till Rohrmann < > > trohrm...@apache.org>: > > > > > Congratulations Yu! > > > > > > Cheers, > > > Till > > > > > > On Wed, Jun 17, 2020 at 7:53 AM Jingsong Li < > jingsongl...@gmail.com> > > > wrote: > > > > > >> Congratulations Yu, well deserved! > > >> > > >> Best, > > >> Jingsong > > >> > > >> On Wed, Jun 17, 2020 at 1:42 PM Yuan Mei > > wrote: > > >> > > >>> Congrats, Yu! > > >>> > > >>> GXGX & well deserved!! > > >>> > > >>> Best Regards, > > >>> > > >>> Yuan > > >>> > > >>> On Wed, Jun 17, 2020 at 9:15 AM jincheng sun < > > sunjincheng...@gmail.com> > > >>> wrote: > > >>> > > Hi all, > > > > On behalf of the Flink PMC, I'm happy to announce that Yu Li is > > >> now > > part of the Apache Flink Project Management Committee (PMC). > > > > Yu Li has been very active on Flink's Statebackend component, > > >>> working > > > on > > various improvements, for example the RocksDB memory management > > >> for > > > 1.10. 
> > and keeps checking and voting for our releases, and also has > > > successfully > > produced two releases(1.10.0&1.10.1) as RM. > > > > Congratulations & Welcome Yu Li! > > > > Best, > > Jincheng (on behalf of the Flink PMC) > > > > >>> > > >> > > >> -- > > >> Best, Jingsong Lee > > >> > > > > > > > >>> > > >> > > > > >
Re: [ANNOUNCE] Yu Li is now part of the Flink PMC
Congratulations , Yu! Best, Yang Piotr Nowojski 于2020年6月17日周三 下午9:21写道: > Congratulations :) > > > On 17 Jun 2020, at 14:53, Yun Tang wrote: > > > > Congratulations , Yu! well deserved. > > > > Best > > Yun Tang > > > > From: Yu Li > > Sent: Wednesday, June 17, 2020 20:03 > > To: dev > > Subject: Re: [ANNOUNCE] Yu Li is now part of the Flink PMC > > > > Thanks everyone! Really happy to work in such a great and encouraging > > community! > > > > Best Regards, > > Yu > > > > > > On Wed, 17 Jun 2020 at 19:59, Congxian Qiu > wrote: > > > >> Congratulations Yu ! > >> Best, > >> Congxian > >> > >> > >> Thomas Weise 于2020年6月17日周三 下午6:23写道: > >> > >>> Congratulations! > >>> > >>> > >>> On Wed, Jun 17, 2020, 2:59 AM Fabian Hueske wrote: > >>> > Congrats Yu! > > Cheers, Fabian > > Am Mi., 17. Juni 2020 um 10:20 Uhr schrieb Till Rohrmann < > trohrm...@apache.org>: > > > Congratulations Yu! > > > > Cheers, > > Till > > > > On Wed, Jun 17, 2020 at 7:53 AM Jingsong Li > > wrote: > > > >> Congratulations Yu, well deserved! > >> > >> Best, > >> Jingsong > >> > >> On Wed, Jun 17, 2020 at 1:42 PM Yuan Mei > wrote: > >> > >>> Congrats, Yu! > >>> > >>> GXGX & well deserved!! > >>> > >>> Best Regards, > >>> > >>> Yuan > >>> > >>> On Wed, Jun 17, 2020 at 9:15 AM jincheng sun < > sunjincheng...@gmail.com> > >>> wrote: > >>> > Hi all, > > On behalf of the Flink PMC, I'm happy to announce that Yu Li is > >> now > part of the Apache Flink Project Management Committee (PMC). > > Yu Li has been very active on Flink's Statebackend component, > >>> working > > on > various improvements, for example the RocksDB memory management > >> for > > 1.10. > and keeps checking and voting for our releases, and also has > > successfully > produced two releases(1.10.0&1.10.1) as RM. > > Congratulations & Welcome Yu Li! > > Best, > Jincheng (on behalf of the Flink PMC) > > >>> > >> > >> -- > >> Best, Jingsong Lee > >> > > > > >>> > >> > >
Re: Flink Table program cannot be compiled when enable checkpoint of StreamExecutionEnvironment
Glad to hear that! On Thu, 18 Jun 2020 at 09:12, 杜斌 wrote: > Sure, it helps a lot. After update to Flink 1.10.1, problem solved! > > Thanks, > Bin > > Fabian Hueske 于2020年6月17日周三 下午4:32写道: > > > The exception is thrown when the StreamGraph is translated to a JobGraph. > > The translation logic has a switch. If checkpointing is enabled, the Java > > code generated by the optimizer is directly compiled in the client (in > > contrast to compiling it on the TaskManagers when the operators are > > started). > > The compiler needs access to the UDF classes but the classloader that's > > being used doesn't know about the UDF classes. > > > > The classloader that's used for the compilation is generated by > > PackagedProgram. > > You can configure the PackagedProgram with the right user code JAR URLs > > which contain the UDF classes. > > Alternatively, you can try to inject another classloader into > > PackagedProgram using Reflection (but that's a rather hacky approach). > > > > Hope this helps. > > > > Cheers, Fabian > > > > Am Mi., 17. Juni 2020 um 06:48 Uhr schrieb Jark Wu : > > > > > Why compile JobGraph yourself? This is really an internal API and may > > cause > > > problems. > > > Could you try to use `flink run` command [1] to submit your user jar > > > instead? > > > > > > Btw, what's your Flink version? If you are using Flink 1.10.0, could > you > > > try to use 1.10.1? > > > > > > Best, > > > Jark > > > > > > On Wed, 17 Jun 2020 at 12:41, 杜斌 wrote: > > > > > > > Thanks for the reply, > > > > Here is the simple java program that re-produce the problem: > > > > 1. 
code for the application: > > > > > > > > import org.apache.flink.api.java.tuple.Tuple2; > > > > import org.apache.flink.streaming.api.datastream.DataStream; > > > > import > > > > > org.apache.flink.streaming.api.environment.StreamExecutionEnvironment; > > > > import org.apache.flink.table.api.EnvironmentSettings; > > > > import org.apache.flink.table.api.Table; > > > > import org.apache.flink.table.api.java.StreamTableEnvironment; > > > > import org.apache.flink.types.Row; > > > > > > > > > > > > import java.util.Arrays; > > > > > > > > public class Test { > > > > public static void main(String[] args) throws Exception { > > > > > > > > StreamExecutionEnvironment env = > > > > StreamExecutionEnvironment.getExecutionEnvironment(); > > > > /** > > > > * If enable checkpoint, blink planner will failed > > > > */ > > > > env.enableCheckpointing(1000); > > > > > > > > EnvironmentSettings envSettings = > > > EnvironmentSettings.newInstance() > > > > //.useBlinkPlanner() // compile fail > > > > .useOldPlanner() // compile success > > > > .inStreamingMode() > > > > .build(); > > > > StreamTableEnvironment tEnv = > > > > StreamTableEnvironment.create(env, envSettings); > > > > DataStream orderA = env.fromCollection(Arrays.asList( > > > > new Order(1L, "beer", 3), > > > > new Order(1L, "diaper", 4), > > > > new Order(3L, "beer", 2) > > > > )); > > > > > > > > //Table table = tEnv.fromDataStream(orderA); > > > > tEnv.createTemporaryView("orderA", orderA); > > > > Table res = tEnv.sqlQuery("SELECT * FROM orderA"); > > > > DataStream> ds = > > > > tEnv.toRetractStream(res, Row.class); > > > > ds.print(); > > > > env.execute(); > > > > > > > > } > > > > > > > > public static class Order { > > > > public long user; > > > > public String product; > > > > public int amount; > > > > > > > > public Order(long user, String product, int amount) { > > > > this.user = user; > > > > this.product = product; > > > > this.amount = amount; > > > > } > > > > > > > > public long getUser() { > 
> > > return user; > > > > } > > > > > > > > public void setUser(long user) { > > > > this.user = user; > > > > } > > > > > > > > public String getProduct() { > > > > return product; > > > > } > > > > > > > > public void setProduct(String product) { > > > > this.product = product; > > > > } > > > > > > > > public int getAmount() { > > > > return amount; > > > > } > > > > > > > > public void setAmount(int amount) { > > > > this.amount = amount; > > > > } > > > > } > > > > } > > > > > > > > 2. mvn clean package to a jar file > > > > 3. then we use the following code to produce a jobgraph: > > > > > > > > PackagedProgram program = > > > > PackagedProgram.newBuilder() > > > > .setJarFile(userJarPath) > > > > .setUserClassPaths(classpaths) > > > > .setEntryPointClassName(userMainClass) > > > > .setConfiguration(configuration)
[jira] [Created] (FLINK-18354) when use ParquetAvroWriters.forGenericRecord(Schema schema) error java.lang.ClassCastException: org.apache.flink.api.java.tuple.Tuple2 cannot be cast to org.apache.avro
Yangyingbo created FLINK-18354:
--
Summary: when use ParquetAvroWriters.forGenericRecord(Schema schema) error java.lang.ClassCastException: org.apache.flink.api.java.tuple.Tuple2 cannot be cast to org.apache.avro.generic.IndexedRecord
Key: FLINK-18354
URL: https://issues.apache.org/jira/browse/FLINK-18354
Project: Flink
Issue Type: Bug
Components: API / DataStream, Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 1.10.0
Reporter: Yangyingbo

When I use ParquetAvroWriters.forGenericRecord(Schema schema) to write data to Parquet, the following error occurs.

My code:
```
//transfor 2 dataStream
// TupleTypeInfo tupleTypeInfo = new TupleTypeInfo(GenericData.Record.class, BasicTypeInfo.STRING_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO);
TupleTypeInfo tupleTypeInfo = new TupleTypeInfo(BasicTypeInfo.STRING_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO);
DataStream testDataStream = flinkTableEnv.toAppendStream(test, tupleTypeInfo);
testDataStream.print().setParallelism(1);

ArrayList fields = new ArrayList();
fields.add(new org.apache.avro.Schema.Field("id", org.apache.avro.Schema.create(org.apache.avro.Schema.Type.STRING), "id", JsonProperties.NULL_VALUE));
fields.add(new org.apache.avro.Schema.Field("time", org.apache.avro.Schema.create(org.apache.avro.Schema.Type.STRING), "time", JsonProperties.NULL_VALUE));
org.apache.avro.Schema parquetSinkSchema = org.apache.avro.Schema.createRecord("pi", "flinkParquetSink", "flink.parquet", true, fields);
String fileSinkPath = "./xxx.text/rs6/";
StreamingFileSink parquetSink = StreamingFileSink
    .forBulkFormat(new Path(fileSinkPath), ParquetAvroWriters.forGenericRecord(parquetSinkSchema))
    .withRollingPolicy(OnCheckpointRollingPolicy.build())
    .build();
testDataStream.addSink(parquetSink).setParallelism(1);
flinkTableEnv.execute("ReadFromKafkaConnectorWriteToLocalFileJava");
```

And this error:
```
09:29:50,283 INFO org.apache.flink.runtime.taskmanager.Task - Sink: Unnamed (1/1) (79505cb6ab2df38886663fd99461315a) switched from RUNNING to FAILED.
09:29:50,283 INFO org.apache.flink.runtime.taskmanager.Task - Sink: Unnamed (1/1) (79505cb6ab2df38886663fd99461315a) switched from RUNNING to FAILED.
java.lang.ClassCastException: org.apache.flink.api.java.tuple.Tuple2 cannot be cast to org.apache.avro.generic.IndexedRecord
    at org.apache.avro.generic.GenericData.getField(GenericData.java:697)
    at org.apache.parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:188)
    at org.apache.parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:165)
    at org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:128)
    at org.apache.parquet.hadoop.ParquetWriter.write(ParquetWriter.java:299)
    at org.apache.flink.formats.parquet.ParquetBulkWriter.addElement(ParquetBulkWriter.java:52)
    at org.apache.flink.streaming.api.functions.sink.filesystem.BulkPartWriter.write(BulkPartWriter.java:50)
    at org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.write(Bucket.java:214)
    at org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.onElement(Buckets.java:274)
    at org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink.invoke(StreamingFileSink.java:445)
    at org.apache.flink.streaming.api.operators.StreamSink.processElement(StreamSink.java:56)
    at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:173)
    at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:151)
    at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:128)
    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:69)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:311)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:187)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:487)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:470)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:707)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:532)
    at java.lang.Thread.run(Thread.java:748)
09:29:50,284 INFO org.apache.flink.runtime.taskmanager.Task - Freeing task resources for Sink: Unnamed (1/1) (79505cb6ab2df38886663fd99461315a).
09:29:50,285 INFO org.apache.flink.runtime.taskmanager.Task - Ensuring all FileSystem streams are closed for task Sink: Unnamed (1/1) (79505cb6ab2df38886663fd99461315a) [FAILED]
09:29:50,289 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor -
```
Re: Flink Table program cannot be compiled when enable checkpoint of StreamExecutionEnvironment
Sure, it helps a lot. After update to Flink 1.10.1, problem solved! Thanks, Bin Fabian Hueske 于2020年6月17日周三 下午4:32写道: > The exception is thrown when the StreamGraph is translated to a JobGraph. > The translation logic has a switch. If checkpointing is enabled, the Java > code generated by the optimizer is directly compiled in the client (in > contrast to compiling it on the TaskManagers when the operators are > started). > The compiler needs access to the UDF classes but the classloader that's > being used doesn't know about the UDF classes. > > The classloader that's used for the compilation is generated by > PackagedProgram. > You can configure the PackagedProgram with the right user code JAR URLs > which contain the UDF classes. > Alternatively, you can try to inject another classloader into > PackagedProgram using Reflection (but that's a rather hacky approach). > > Hope this helps. > > Cheers, Fabian > > Am Mi., 17. Juni 2020 um 06:48 Uhr schrieb Jark Wu : > > > Why compile JobGraph yourself? This is really an internal API and may > cause > > problems. > > Could you try to use `flink run` command [1] to submit your user jar > > instead? > > > > Btw, what's your Flink version? If you are using Flink 1.10.0, could you > > try to use 1.10.1? > > > > Best, > > Jark > > > > On Wed, 17 Jun 2020 at 12:41, 杜斌 wrote: > > > > > Thanks for the reply, > > > Here is the simple java program that re-produce the problem: > > > 1. 
code for the application: > > > > > > import org.apache.flink.api.java.tuple.Tuple2; > > > import org.apache.flink.streaming.api.datastream.DataStream; > > > import > > > org.apache.flink.streaming.api.environment.StreamExecutionEnvironment; > > > import org.apache.flink.table.api.EnvironmentSettings; > > > import org.apache.flink.table.api.Table; > > > import org.apache.flink.table.api.java.StreamTableEnvironment; > > > import org.apache.flink.types.Row; > > > > > > > > > import java.util.Arrays; > > > > > > public class Test { > > > public static void main(String[] args) throws Exception { > > > > > > StreamExecutionEnvironment env = > > > StreamExecutionEnvironment.getExecutionEnvironment(); > > > /** > > > * If enable checkpoint, blink planner will failed > > > */ > > > env.enableCheckpointing(1000); > > > > > > EnvironmentSettings envSettings = > > EnvironmentSettings.newInstance() > > > //.useBlinkPlanner() // compile fail > > > .useOldPlanner() // compile success > > > .inStreamingMode() > > > .build(); > > > StreamTableEnvironment tEnv = > > > StreamTableEnvironment.create(env, envSettings); > > > DataStream orderA = env.fromCollection(Arrays.asList( > > > new Order(1L, "beer", 3), > > > new Order(1L, "diaper", 4), > > > new Order(3L, "beer", 2) > > > )); > > > > > > //Table table = tEnv.fromDataStream(orderA); > > > tEnv.createTemporaryView("orderA", orderA); > > > Table res = tEnv.sqlQuery("SELECT * FROM orderA"); > > > DataStream> ds = > > > tEnv.toRetractStream(res, Row.class); > > > ds.print(); > > > env.execute(); > > > > > > } > > > > > > public static class Order { > > > public long user; > > > public String product; > > > public int amount; > > > > > > public Order(long user, String product, int amount) { > > > this.user = user; > > > this.product = product; > > > this.amount = amount; > > > } > > > > > > public long getUser() { > > > return user; > > > } > > > > > > public void setUser(long user) { > > > this.user = user; > > > } > > > > > > 
public String getProduct() { > > > return product; > > > } > > > > > > public void setProduct(String product) { > > > this.product = product; > > > } > > > > > > public int getAmount() { > > > return amount; > > > } > > > > > > public void setAmount(int amount) { > > > this.amount = amount; > > > } > > > } > > > } > > > > > > 2. mvn clean package to a jar file > > > 3. then we use the following code to produce a jobgraph: > > > > > > PackagedProgram program = > > > PackagedProgram.newBuilder() > > > .setJarFile(userJarPath) > > > .setUserClassPaths(classpaths) > > > .setEntryPointClassName(userMainClass) > > > .setConfiguration(configuration) > > > > > > .setSavepointRestoreSettings((descriptor.isRecoverFromSavepoint() && > > > descriptor.getSavepointPath() != null && > > > !descriptor.getSavepointPath().equals("")) ? > > > > > > SavepointRestoreSettings.forPath(descriptor.getSavepointPath(), > > > descriptor.isAllowNonRestoredState()) : > > > SavepointRestoreSettings.none())
[jira] [Created] (FLINK-18353) [1.11.0] maybe document jobmanager behavior change regarding -XX:MaxDirectMemorySize
Steven Zhen Wu created FLINK-18353:
--
Summary: [1.11.0] maybe document jobmanager behavior change regarding -XX:MaxDirectMemorySize
Key: FLINK-18353
URL: https://issues.apache.org/jira/browse/FLINK-18353
Project: Flink
Issue Type: Improvement
Components: Runtime / Configuration
Affects Versions: 1.11.0
Reporter: Steven Zhen Wu

I know FLIP-116 (Unified Memory Configuration for Job Managers) was introduced in 1.11. It causes a small behavior change regarding `-XX:MaxDirectMemorySize`. Previously, the jobmanager JVM args did not set `-XX:MaxDirectMemorySize`, which means the JVM could use up to the `-Xmx` size. Now `-XX:MaxDirectMemorySize` is always set to the [jobmanager.memory.off-heap.size|https://ci.apache.org/projects/flink/flink-docs-master/ops/config.html#jobmanager-memory-off-heap-size] config (default 128 mb).

It is possible to get "java.lang.OutOfMemoryError: Direct buffer memory" without tuning [jobmanager.memory.off-heap.size|https://ci.apache.org/projects/flink/flink-docs-master/ops/config.html#jobmanager-memory-off-heap-size], especially for high-parallelism jobs. Previously, no tuning was needed.

Maybe we should point out the behavior change in the migration guide? [https://ci.apache.org/projects/flink/flink-docs-master/ops/memory/mem_migration.html#migrate-job-manager-memory-configuration]

--
This message was sent by Atlassian Jira (v8.3.4#803005)
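To make the tuning described above concrete, the mitigation amounts to raising the jobmanager's off-heap budget. A minimal flink-conf.yaml sketch follows; the 512m value is only an illustrative guess for a high-parallelism job, not a recommendation from this thread:

```yaml
# flink-conf.yaml (Flink 1.11): since FLIP-116, the jobmanager's
# -XX:MaxDirectMemorySize is derived from this option (default 128m).
# Jobs hitting "OutOfMemoryError: Direct buffer memory" on the jobmanager
# may need a larger value than the default.
jobmanager.memory.off-heap.size: 512m
```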
Re: Flink Table program cannot be compiled when enable checkpoint of StreamExecutionEnvironment
I use Flink 1.10.0. After I updated the version to 1.10.1, the code is fully functional. Big Thanks!! We compile JobGraph mainly for building up a streaming service, and get some meta data of the job, save the meta data, do some monitoring on the jobs and so on. Thanks, Bin Jark Wu 于2020年6月17日周三 下午12:48写道: > Why compile JobGraph yourself? This is really an internal API and may cause > problems. > Could you try to use `flink run` command [1] to submit your user jar > instead? > > Btw, what's your Flink version? If you are using Flink 1.10.0, could you > try to use 1.10.1? > > Best, > Jark > > On Wed, 17 Jun 2020 at 12:41, 杜斌 wrote: > > > Thanks for the reply, > > Here is the simple java program that re-produce the problem: > > 1. code for the application: > > > > import org.apache.flink.api.java.tuple.Tuple2; > > import org.apache.flink.streaming.api.datastream.DataStream; > > import > > org.apache.flink.streaming.api.environment.StreamExecutionEnvironment; > > import org.apache.flink.table.api.EnvironmentSettings; > > import org.apache.flink.table.api.Table; > > import org.apache.flink.table.api.java.StreamTableEnvironment; > > import org.apache.flink.types.Row; > > > > > > import java.util.Arrays; > > > > public class Test { > > public static void main(String[] args) throws Exception { > > > > StreamExecutionEnvironment env = > > StreamExecutionEnvironment.getExecutionEnvironment(); > > /** > > * If enable checkpoint, blink planner will failed > > */ > > env.enableCheckpointing(1000); > > > > EnvironmentSettings envSettings = > EnvironmentSettings.newInstance() > > //.useBlinkPlanner() // compile fail > > .useOldPlanner() // compile success > > .inStreamingMode() > > .build(); > > StreamTableEnvironment tEnv = > > StreamTableEnvironment.create(env, envSettings); > > DataStream orderA = env.fromCollection(Arrays.asList( > > new Order(1L, "beer", 3), > > new Order(1L, "diaper", 4), > > new Order(3L, "beer", 2) > > )); > > > > //Table table = 
tEnv.fromDataStream(orderA); > > tEnv.createTemporaryView("orderA", orderA); > > Table res = tEnv.sqlQuery("SELECT * FROM orderA"); > > DataStream> ds = > > tEnv.toRetractStream(res, Row.class); > > ds.print(); > > env.execute(); > > > > } > > > > public static class Order { > > public long user; > > public String product; > > public int amount; > > > > public Order(long user, String product, int amount) { > > this.user = user; > > this.product = product; > > this.amount = amount; > > } > > > > public long getUser() { > > return user; > > } > > > > public void setUser(long user) { > > this.user = user; > > } > > > > public String getProduct() { > > return product; > > } > > > > public void setProduct(String product) { > > this.product = product; > > } > > > > public int getAmount() { > > return amount; > > } > > > > public void setAmount(int amount) { > > this.amount = amount; > > } > > } > > } > > > > 2. mvn clean package to a jar file > > 3. then we use the following code to produce a jobgraph: > > > > PackagedProgram program = > > PackagedProgram.newBuilder() > > .setJarFile(userJarPath) > > .setUserClassPaths(classpaths) > > .setEntryPointClassName(userMainClass) > > .setConfiguration(configuration) > > > > .setSavepointRestoreSettings((descriptor.isRecoverFromSavepoint() && > > descriptor.getSavepointPath() != null && > > !descriptor.getSavepointPath().equals("")) ? > > > > SavepointRestoreSettings.forPath(descriptor.getSavepointPath(), > > descriptor.isAllowNonRestoredState()) : > > SavepointRestoreSettings.none()) > > .setArguments(userArgs) > > .build(); > > > > > > JobGraph jobGraph = PackagedProgramUtils.createJobGraph(program, > > configuration, 4, true); > > > > 4. If we use blink planner & enable checkpoint, the compile will failed. > > For the others, the compile success. > > > > Thanks, > > Bin > > > > Jark Wu 于2020年6月17日周三 上午10:42写道: > > > > > Hi, > > > > > > Which Flink version are you using? Are you using SQL CLI? 
Could you > share > > > your table/sql program? > > > We did fix some classloading problems around SQL CLI, e.g. FLINK-18302 > > > > > > Best, > > > Jark > > > > > > On Wed, 17 Jun 2020 at 10:31, 杜斌 wrote: > > > > > > > add the full stack trace here: > > > > > > > > > > > > Caused by: > > > > > > > > > > > > > > org.apache.flink.shaded.guava18.com.google.common.util.concurrent.UncheckedExecutionException: > > > >
[jira] [Created] (FLINK-18352) org.apache.flink.core.execution.DefaultExecutorServiceLoader not thread safe
Marcos Klein created FLINK-18352:
Summary: org.apache.flink.core.execution.DefaultExecutorServiceLoader not thread safe
Key: FLINK-18352
URL: https://issues.apache.org/jira/browse/FLINK-18352
Project: Flink
Issue Type: Bug
Affects Versions: 1.10.0
Reporter: Marcos Klein

The singleton *org.apache.flink.core.execution.DefaultExecutorServiceLoader* class is not thread-safe, because the *java.util.ServiceLoader* class it relies on is not thread-safe. [https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/ServiceLoader.html#Concurrency]

This can leave the *ServiceLoader* in an inconsistent state in processes that attempt to self-heal, which then requires bouncing the process/container in the hope that the race condition does not re-occur. [https://stackoverflow.com/questions/60391499/apache-flink-cannot-find-compatible-factory-for-specified-execution-target-lo]

Additionally, the following stack traces have been seen when using *org.apache.flink.streaming.api.environment.RemoteStreamEnvironment* instances.
{code:java}
java.lang.ArrayIndexOutOfBoundsException: 2
    at sun.misc.CompoundEnumeration.nextElement(CompoundEnumeration.java:61)
    at java.util.ServiceLoader$LazyIterator.hasNextService(ServiceLoader.java:357)
    at java.util.ServiceLoader$LazyIterator.hasNext(ServiceLoader.java:393)
    at java.util.ServiceLoader$1.hasNext(ServiceLoader.java:474)
    at org.apache.flink.core.execution.DefaultExecutorServiceLoader.getExecutorFactory(DefaultExecutorServiceLoader.java:60)
    at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:1724)
    at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:1706)
{code}
{code:java}
java.util.NoSuchElementException: null
    at sun.misc.CompoundEnumeration.nextElement(CompoundEnumeration.java:59)
    at java.util.ServiceLoader$LazyIterator.hasNextService(ServiceLoader.java:357)
    at java.util.ServiceLoader$LazyIterator.hasNext(ServiceLoader.java:393)
    at java.util.ServiceLoader$1.hasNext(ServiceLoader.java:474)
    at org.apache.flink.core.execution.DefaultExecutorServiceLoader.getExecutorFactory(DefaultExecutorServiceLoader.java:60)
    at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:1724)
    at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:1706)
{code}

The workaround when using the *StreamExecutionEnvironment* implementations is to write a custom, thread-safe implementation of *DefaultExecutorServiceLoader* and pass that to the *StreamExecutionEnvironment* constructors.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
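The core of the workaround described above can be sketched generically in plain Java: guard every ServiceLoader iteration with a single lock, since java.util.ServiceLoader's lazy provider iteration is not thread-safe. The service interface and class names below are invented for illustration; this is not Flink's actual loader implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class SynchronizedServiceLoading {

    // Hypothetical service interface standing in for an executor factory SPI.
    public interface ExecutorFactory {
        String name();
    }

    private static final ServiceLoader<ExecutorFactory> LOADER =
            ServiceLoader.load(ExecutorFactory.class);

    // All callers synchronize on the shared loader, so two threads can never
    // interleave the lazy provider iteration that corrupts its state.
    public static List<ExecutorFactory> loadAll() {
        synchronized (LOADER) {
            List<ExecutorFactory> found = new ArrayList<>();
            for (ExecutorFactory factory : LOADER) {
                found.add(factory);
            }
            return found;
        }
    }

    public static void main(String[] args) {
        // No providers are registered in this sketch, so the list is empty.
        System.out.println(loadAll().size()); // prints 0
    }
}
```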
[jira] [Created] (FLINK-18351) ModuleManager creates a lot of duplicate/similar log messages
Robert Metzger created FLINK-18351: -- Summary: ModuleManager creates a lot of duplicate/similar log messages Key: FLINK-18351 URL: https://issues.apache.org/jira/browse/FLINK-18351 Project: Flink Issue Type: Improvement Components: Table SQL / API Affects Versions: 1.12.0 Reporter: Robert Metzger This is a follow up to FLINK-17977: {code} 2020-06-03 15:02:11,982 INFO org.apache.flink.table.module.ModuleManager [] - Got FunctionDefinition 'as' from 'core' module. 2020-06-03 15:02:11,988 INFO org.apache.flink.table.module.ModuleManager [] - Got FunctionDefinition 'sum' from 'core' module. 2020-06-03 15:02:12,139 INFO org.apache.flink.table.module.ModuleManager [] - Got FunctionDefinition 'as' from 'core' module. 2020-06-03 15:02:12,159 INFO org.apache.flink.table.module.ModuleManager [] - Got FunctionDefinition 'equals' from 'core' module. {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
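Until the logging itself is reduced, one possible workaround is to demote the chatty logger in the logging configuration. A sketch, assuming a log4j 1.x style properties file (a log4j2-based setup would need the equivalent logger entry in its own configuration format):

```properties
# Suppress the per-lookup INFO lines from ModuleManager shown above.
log4j.logger.org.apache.flink.table.module.ModuleManager=WARN
```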
[jira] [Created] (FLINK-18350) [1.11.0] jobmanager complains `taskmanager.memory.process.size` missing
Steven Zhen Wu created FLINK-18350: -- Summary: [1.11.0] jobmanager complains `taskmanager.memory.process.size` missing Key: FLINK-18350 URL: https://issues.apache.org/jira/browse/FLINK-18350 Project: Flink Issue Type: Bug Components: Runtime / Configuration Affects Versions: 1.11.0 Reporter: Steven Zhen Wu Saw this failure during jobmanager startup. I know the exception says that `taskmanager.memory.process.size` is missing; we set it on the taskmanager side in `flink-conf.yaml`. But I am wondering why this is required by the jobmanager in session cluster mode. When a taskmanager registers with the jobmanager, it reports its resources (like CPU, memory, etc.).
{code:java}
2020-06-17 18:06:25,079 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint[main] - Could not start cluster entrypoint TitusSessionClusterEntrypoint.
org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to initialize the cluster entrypoint TitusSessionClusterEntrypoint.
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:187)
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:516)
	at com.netflix.spaas.runtime.TitusSessionClusterEntrypoint.main(TitusSessionClusterEntrypoint.java:103)
Caused by: org.apache.flink.util.FlinkException: Could not create the DispatcherResourceManagerComponent.
	at org.apache.flink.runtime.entrypoint.component.DefaultDispatcherResourceManagerComponentFactory.create(DefaultDispatcherResourceManagerComponentFactory.java:255)
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:216)
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:169)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
	at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:168)
	... 2 more
Caused by: org.apache.flink.configuration.IllegalConfigurationException: Cannot read memory size from config option 'taskmanager.memory.process.size'.
	at org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils.getMemorySizeFromConfig(ProcessMemoryUtils.java:234)
	at org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils.deriveProcessSpecWithTotalProcessMemory(ProcessMemoryUtils.java:100)
	at org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils.memoryProcessSpecFromConfig(ProcessMemoryUtils.java:79)
	at org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils.processSpecFromConfig(TaskExecutorProcessUtils.java:109)
	at org.apache.flink.runtime.clusterframework.TaskExecutorProcessSpecBuilder.build(TaskExecutorProcessSpecBuilder.java:58)
	at org.apache.flink.runtime.resourcemanager.WorkerResourceSpecFactory.workerResourceSpecFromConfigAndCpu(WorkerResourceSpecFactory.java:37)
	at com.netflix.spaas.runtime.resourcemanager.TitusWorkerResourceSpecFactory.createDefaultWorkerResourceSpec(TitusWorkerResourceSpecFactory.java:17)
	at org.apache.flink.runtime.resourcemanager.ResourceManagerRuntimeServicesConfiguration.fromConfiguration(ResourceManagerRuntimeServicesConfiguration.java:67)
	at com.netflix.spaas.runtime.resourcemanager.TitusResourceManagerFactory.createResourceManager(TitusResourceManagerFactory.java:53)
	at org.apache.flink.runtime.entrypoint.component.DefaultDispatcherResourceManagerComponentFactory.create(DefaultDispatcherResourceManagerComponentFactory.java:167)
	... 9 more
Caused by: java.lang.IllegalArgumentException: Could not parse value '7500}' for key 'taskmanager.memory.process.size'.
	at org.apache.flink.configuration.Configuration.getOptional(Configuration.java:753)
	at org.apache.flink.configuration.Configuration.get(Configuration.java:738)
	at org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils.getMemorySizeFromConfig(ProcessMemoryUtils.java:232)
	... 18 more
Caused by: java.lang.IllegalArgumentException: Memory size unit '}' does not match any of the recognized units: (b | bytes) / (k | kb | kibibytes) / (m | mb | mebibytes) / (g | gb | gibibytes) / (t | tb | tebibytes)
	at org.apache.flink.configuration.MemorySize.parseUnit(MemorySize.java:331)
	at org.apache.flink.configuration.MemorySize.parseBytes(MemorySize.java:306)
	at org.apache.flink.configuration.MemorySize.parse(MemorySize.java:247)
	at
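The innermost cause in the trace above is telling: the configured value arrived as `7500}` — a stray closing brace (possibly a broken template substitution in `flink-conf.yaml`) and no recognized memory unit. A minimal sketch of that unit check follows; `MemorySizeSketch` is a hypothetical illustration, not Flink's actual `MemorySize` implementation, but it uses the unit list printed in the error message:

```java
import java.util.Arrays;
import java.util.List;

class MemorySizeSketch {
    // The units listed in the error message from the trace above.
    private static final List<String> UNITS = Arrays.asList(
            "b", "bytes", "k", "kb", "kibibytes", "m", "mb", "mebibytes",
            "g", "gb", "gibibytes", "t", "tb", "tebibytes");

    /** Parses e.g. "7500m" into bytes; rejects values with unknown units such as "7500}". */
    static long parseMemorySize(String text) {
        String trimmed = text.trim().toLowerCase();
        int i = 0;
        while (i < trimmed.length() && Character.isDigit(trimmed.charAt(i))) {
            i++;
        }
        long number = Long.parseLong(trimmed.substring(0, i));
        String unit = trimmed.substring(i).trim();
        if (unit.isEmpty()) {
            return number; // no unit: interpret as plain bytes
        }
        if (!UNITS.contains(unit)) {
            // This is the branch that "7500}" hits in the stack trace.
            throw new IllegalArgumentException(
                    "Memory size unit '" + unit + "' does not match any of the recognized units");
        }
        char c = unit.charAt(0);
        long multiplier;
        if (c == 'k') {
            multiplier = 1024L;
        } else if (c == 'm') {
            multiplier = 1024L * 1024;
        } else if (c == 'g') {
            multiplier = 1024L * 1024 * 1024;
        } else if (c == 't') {
            multiplier = 1024L * 1024 * 1024 * 1024;
        } else {
            multiplier = 1L; // 'b' / "bytes"
        }
        return number * multiplier;
    }
}
```

With this check, `7500}` fails exactly as in the log, while a value with a unit, e.g. `taskmanager.memory.process.size: 7500m` in `flink-conf.yaml`, parses cleanly.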
Re: Any good comparison on FSM of Flink vs Akka?
Hi Igal, I read that. To solve the following use case, what would be the best option: Flink, Flink Stateful Functions, or Akka FSM? http://thatgamesguy.co.uk/articles/modular-finite-state-machine-in-unity/ or these images: https://imgur.com/a/MV62BlO I was looking for an in-depth comparison of Akka FSM vs Flink Stateful Functions. Thank You On Wed, Jun 17, 2020 at 7:58 PM Igal Shilman wrote: > Hi Rohit, > > Stateful functions fit well into domains that require many (billions) state > machines that are able to communicate with each other by message passing. > In stateful functions world, a state machine can be represented by a > stateful function - a uniquely addressable entity, that can keep state and > be invoked with > messages. > > Here is a summary of some of the capabilities of stateful functions: > > - built on a scalable battle tested, stateful stream processor > - scales to many millions state machines per node (bounded by disk size) > idle state machines do not occupy any RAM. > - exactly once processing of the messages and state modifications across > all of the state machines > - Globally consists of point in time snapshots to durable storage, like S3 > or HDFS. > - interop with FaaS via remote functions - being able to seamlessly scale > the compute part when needed. > - no need for service discovery and complex failure-recovery logic around > message sending, de-duping, retrying etc' > > I would encourage you to visit [1] for more information, and take a look at > some of the recording of the previous Flink forward conferences to > understand more > about what kind of applications you can build with that. > > [1] https://flink.apache.org/stateful-functions.html > > Good luck, > Igal. 
> > > On Wed, Jun 17, 2020 at 3:18 PM Rohit R wrote: > > > Hi Till Rohrmann, > > > > Consider any Finite State Machine which involves many states and need > > timers (wait for this much time and then timeout) or for example, > consider > > the Movie Booking ticketing system > > > http://thatgamesguy.co.uk/articles/modular-finite-state-machine-in-unity/ > > or these images: > > https://imgur.com/a/MV62BlO > > > > The current use case is based on FSM, but in future can consider HFSM as > > well: > > https://web.stanford.edu/class/cs123/lectures/CS123_lec08_HFSM_BT.pdf > > > > Thank You > > > > On Wed, Jun 17, 2020 at 5:20 PM Till Rohrmann > > wrote: > > > > > Hi Rohit, > > > > > > image attachments are filtered out and not visible to others. Hence, I > > > would suggest that you upload the image and then share the link. > > > > > > Maybe you can share a bit more details about the use case and your > > current > > > analysis of the problem. > > > > > > Cheers, > > > Till > > > > > > On Wed, Jun 17, 2020 at 12:15 PM Rohit R > > wrote: > > > > > > > Hello, > > > > > > > > To solve the following use case, What will be the best option between > > > > Flink or Flink Stateful Functions or Akka FSM? > > > > > > > > Use Case: > > > > [image: image.png] > > > > > > > > Can I get the analysis, pros, and cons of each? For example, why > > choosing > > > > Flink Stateful function will better option. > > > > > > > > Thank You > > > > > > > > > >
[jira] [Created] (FLINK-18349) Add Flink 1.11 release notes to documentation
Piotr Nowojski created FLINK-18349: -- Summary: Add Flink 1.11 release notes to documentation Key: FLINK-18349 URL: https://issues.apache.org/jira/browse/FLINK-18349 Project: Flink Issue Type: Task Components: Documentation Affects Versions: 1.10.0 Reporter: Piotr Nowojski Assignee: Gary Yao Fix For: 1.11.0 Gather, edit, and add Flink 1.11 release notes to documentation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [DISCUSS] Re-renaming "Flink Master" back to JobManager
Thanks a lot for looking into this! +1 to your proposal On Wed, Jun 17, 2020 at 10:55 AM David Anderson wrote: > Aljoscha, > > I think this is a step in the right direction. > > In some cases it may be difficult to talk concretely about the > differences between different deployment models (e.g., comparing a k8s > per-job cluster to a YARN-based session cluster, which is something I > typically present during training) without giving names to the internal > components. I'm not convinced we can completely avoid mentioning the > JobMaster (and Dispatcher and ResourceManagers) in some (rare) contexts -- > but I don't see this as an argument against the proposed change. > > David > > On Mon, Jun 15, 2020 at 2:32 PM Konstantin Knauf > wrote: > > > Hi Aljoscha, > > > > sounds good to me. Let’s also make sure we don’t refer to the JobMaster > as > > Jobmanager anywhere then (code, config). > > > > I am not sure we can avoid mentioning the Flink ResourceManagers in user > > facing docs completely. For JobMaster and Dispatcher this seems doable. > > > > Best, > > > > Konstantin > > > > On Mon 15. Jun 2020 at 12:56, Aljoscha Krettek > > wrote: > > > > > Hi All, > > > > > > This came to my mind because of the master/slave discussion in [1] and > > > the larger discussions about inequality/civil rights happening right > now > > > in the world. I think for this reason alone we should use a name that > > > does not include "master". > > > > > > We could rename it back to JobManager, which was the name mostly used > > > before 2019. Since the beginning of Flink, TaskManager was the term > used > > > for the worker component/node and JobManager was the term used for the > > > orchestrating component/node. 
> > > > > > Currently our glossary [2] defines these terms (paraphrased by me): > > > > - "Flink Master": it's the orchestrating component that consists of > > > resource manager, dispatcher, and JobManager > > > > - JobManager: it's the thing that manages a single job and runs as > > > part of a "Flink Master" > > > > - TaskManager: it's the worker process > > > > Prior to the introduction of the glossary the definition of JobManager > > > would have been: > > > > - It's the orchestrating component that manages execution of jobs and > > > schedules work on TaskManagers. > > > > Quite some parts in the code and documentation/configuration options > > > still use that older meaning of JobManager. Newer parts of the > > > documentation use "Flink Master" instead. > > > > I'm proposing to go back to calling the orchestrating component > > > JobManager, which would mean that we have to touch up the documentation > > > to remove mentions of "Flink Master". I'm also proposing not to mention > > > the internal components such as resource manager and dispatcher in the > > > glossary because they are transparent to users. > > > > I'm proposing to go back to JobManager instead of an alternative name > > > also because switching to yet another name would mean many more changes > > > to code/documentation/people's minds. > > > > What do you all think? > > > > Best, > > > Aljoscha > > > > > > [1] https://issues.apache.org/jira/browse/FLINK-18209 > > > [2] > > > > > > > > > https://ci.apache.org/projects/flink/flink-docs-master/concepts/glossary.html > > > > > -- > > > > Konstantin Knauf > > > > https://twitter.com/snntrable > > > > https://github.com/knaufk > > >
Re: Any good comparison on FSM of Flink vs Akka?
Hi Rohit, Stateful functions fit well into domains that require many (billions of) state machines that are able to communicate with each other by message passing. In the stateful functions world, a state machine can be represented by a stateful function - a uniquely addressable entity that can keep state and be invoked with messages. Here is a summary of some of the capabilities of stateful functions: - built on a scalable, battle-tested, stateful stream processor - scales to many millions of state machines per node (bounded by disk size); idle state machines do not occupy any RAM. - exactly-once processing of the messages and state modifications across all of the state machines - globally consistent point-in-time snapshots to durable storage, like S3 or HDFS. - interop with FaaS via remote functions - being able to seamlessly scale the compute part when needed. - no need for service discovery and complex failure-recovery logic around message sending, de-duping, retrying, etc. I would encourage you to visit [1] for more information, and take a look at some of the recordings of previous Flink Forward conferences to understand more about what kind of applications you can build with that. [1] https://flink.apache.org/stateful-functions.html Good luck, Igal. On Wed, Jun 17, 2020 at 3:18 PM Rohit R wrote: > Hi Till Rohrmann, > > Consider any Finite State Machine which involves many states and need > timers (wait for this much time and then timeout) or for example, consider > the Movie Booking ticketing system > http://thatgamesguy.co.uk/articles/modular-finite-state-machine-in-unity/ > or these images: > https://imgur.com/a/MV62BlO > > The current use case is based on FSM, but in future can consider HFSM as > well: > https://web.stanford.edu/class/cs123/lectures/CS123_lec08_HFSM_BT.pdf > > Thank You > > On Wed, Jun 17, 2020 at 5:20 PM Till Rohrmann > wrote: > > > Hi Rohit, > > > > image attachments are filtered out and not visible to others. 
Hence, I > > would suggest that you upload the image and then share the link. > > > > Maybe you can share a bit more details about the use case and your > current > > analysis of the problem. > > > > Cheers, > > Till > > > > On Wed, Jun 17, 2020 at 12:15 PM Rohit R > wrote: > > > > > Hello, > > > > > > To solve the following use case, What will be the best option between > > > Flink or Flink Stateful Functions or Akka FSM? > > > > > > Use Case: > > > [image: image.png] > > > > > > Can I get the analysis, pros, and cons of each? For example, why > choosing > > > Flink Stateful function will better option. > > > > > > Thank You > > > > > >
[ANNOUNCE] Apache Flink 1.11.0, release candidate #2
Hi all, I would like to give an update about the RC2 status. We are now waiting for a green azure build on one final bug fix before creating RC2. This bug fix should be merged late afternoon/early evening Berlin time, so RC2 will be hopefully created tomorrow morning. Until then I would ask to not merge/backport commits to release-1.11 branch, including bug fixes. If you have something that's truly essential and should be treated as a release blocker, please reach out to me or Zhijiang. Best, Piotr Nowojski
Re: [DISCUSS] SQL Syntax for Table API StatementSet
Hi Fabian, Jark, Timo, Thanks for the suggestions. Regarding the SQL syntax, BEGIN is more popular than START. I'm fine with the syntax Timo suggested. Regarding whether this should be implemented in Flink's SQL core, I think there are three things to consider: First, do we need to unify the default behavior of the API and sql files? The execution of the `TableEnvironment#executeSql` method and the `StatementSet#execute` method is asynchronous for both batch and streaming, which means these methods just submit the job and then return a `TableResult`. For batch processing (e.g. Hive, traditional databases), however, the default behavior is sync mode. So this behavior is different from the APIs. I think it would be better if we can unify the default behavior. Second, how to determine the execution behavior of each statement in a file which contains both batch sql and streaming sql. Currently, we have a flag to tell the planner whether the TableEnvironment is a batch env or a stream env, which determines the default behavior. We want to remove the flag and unify the TableEnvironment in the future. Then the TableEnvironment can execute both batch sql and streaming sql. Timo and I had a discussion about this on Slack: for DML & DQL, if a statement has keywords like `EMIT STREAM`, it's streaming sql and will be executed in async mode; otherwise it's batch sql and will be executed in sync mode. Third, how to flexibly support execution mode switching for batch sql. For streaming sql, all DMLs & DQLs should be in async mode because the job may never finish. For batch sql, I think both modes are needed. I know some platforms execute batch sql in async mode, and then continuously monitor the job status. Do we need to introduce a `set execute-mode=xx` command or new sql syntax like `START SYNC EXECUTION`? For sql-client or other projects, we can easily decide what behavior an app can support. 
Just as Jark said, many downstream projects have the same requirement for multiple statement support, but they may have different execution behaviors. It would be great if Flink could support flexible execution modes. Alternatively, Flink core could just define the syntax, provide the parser, and support a default execution mode; the downstream projects can then use the APIs and parsed results to decide how to execute a statement. Best, Godfrey Timo Walther 于2020年6月17日周三 下午6:32写道: > Hi Fabian, > > thanks for the proposal. I agree that we should have consensus on the > SQL syntax as well and thus finalize the concepts introduced in FLIP-84. > > I would favor Jark's proposal. I would like to propose the following > syntax: > > BEGIN STATEMENT SET; >    INSERT INTO ...; >    INSERT INTO ...; > END; > > 1) BEGIN and END are commonly used for blocks in SQL. > > 2) We should not start mixing START/BEGIN for different kind of blocks. > Because that can also be confusing for users. There is no additional > helpful semantic in using START over BEGIN. > > 3) Instead, we should rather parameterize the block statament with > `STATEMENT SET` and keep the END of the block simple (also similar to > CASE ... WHEN ... END). > > 4) If we look at Jark's example in SQL Server, the BEGIN is also > parameterized by `BEGIN { TRAN | TRANSACTION }`. > > 5) Also in Java curly braces are used for both classes, methods, and > loops for different purposes parameterized by the preceding code. > > Regards, > Timo > > > On 17.06.20 11:36, Fabian Hueske wrote: > > Thanks for joining this discussion Jark! > > > > This feature is a bit different from BEGIN TRANSACTION / COMMIT and > BEGIN / > > END. > > > > The only commonality is that all three group multiple statements. > > * BEGIN TRANSACTION / COMMIT creates a transactional context that > > guarantees atomicity, consistency, and isolation. Statements and queries > > are sequentially executed. > > * BEGIN / END defines a block of statements just like curly braces ({ and > > }) do in Java. 
The statements (which can also include variable > definitions > > and printing) are sequentially executed. > > * A statement set defines a group of statements that are optimized > together > > and jointly executed at the same time, i.e., there is no sequence or > order. > > > > A statement set (consisting of multiple INSERT INTO statements) behaves > > just like a single INSERT INTO statement. > > Everywhere where an INSERT INTO statement can be executed, it should be > > possible to execute a statement set consisting of multiple INSERT INTO > > statements. > > That's also why I think that statement sets are orthogonal to > > multi-statement execution. > > > > As I said before, I'm happy to discuss syntax proposals for statement > sets. > > However, I think a BEGIN / END syntax for statement sets would confuse > > users who know this syntax from MySQL, SQL Server, or another DBMS. > > > > Thanks, > > Fabian > > > > > > Am Di., 16. Juni 2020 um 05:07 Uhr schrieb Jark Wu : > > > >> Hi Fabian, > >> > >> Thanks for starting this discussion. I think this is a very
Re: [ANNOUNCE] Yu Li is now part of the Flink PMC
Congratulations :) > On 17 Jun 2020, at 14:53, Yun Tang wrote: > > Congratulations , Yu! well deserved. > > Best > Yun Tang > > From: Yu Li > Sent: Wednesday, June 17, 2020 20:03 > To: dev > Subject: Re: [ANNOUNCE] Yu Li is now part of the Flink PMC > > Thanks everyone! Really happy to work in such a great and encouraging > community! > > Best Regards, > Yu > > > On Wed, 17 Jun 2020 at 19:59, Congxian Qiu wrote: > >> Congratulations Yu ! >> Best, >> Congxian >> >> >> Thomas Weise 于2020年6月17日周三 下午6:23写道: >> >>> Congratulations! >>> >>> >>> On Wed, Jun 17, 2020, 2:59 AM Fabian Hueske wrote: >>> Congrats Yu! Cheers, Fabian Am Mi., 17. Juni 2020 um 10:20 Uhr schrieb Till Rohrmann < trohrm...@apache.org>: > Congratulations Yu! > > Cheers, > Till > > On Wed, Jun 17, 2020 at 7:53 AM Jingsong Li > wrote: > >> Congratulations Yu, well deserved! >> >> Best, >> Jingsong >> >> On Wed, Jun 17, 2020 at 1:42 PM Yuan Mei wrote: >> >>> Congrats, Yu! >>> >>> GXGX & well deserved!! >>> >>> Best Regards, >>> >>> Yuan >>> >>> On Wed, Jun 17, 2020 at 9:15 AM jincheng sun < sunjincheng...@gmail.com> >>> wrote: >>> Hi all, On behalf of the Flink PMC, I'm happy to announce that Yu Li is >> now part of the Apache Flink Project Management Committee (PMC). Yu Li has been very active on Flink's Statebackend component, >>> working > on various improvements, for example the RocksDB memory management >> for > 1.10. and keeps checking and voting for our releases, and also has > successfully produced two releases(1.10.0&1.10.1) as RM. Congratulations & Welcome Yu Li! Best, Jincheng (on behalf of the Flink PMC) >>> >> >> -- >> Best, Jingsong Lee >> > >>> >>
Re: Any good comparison on FSM of Flink vs Akka?
Hi Till Rohrmann, Consider any Finite State Machine which involves many states and needs timers (wait for a given amount of time and then time out), or for example consider the Movie Booking ticketing system: http://thatgamesguy.co.uk/articles/modular-finite-state-machine-in-unity/ or these images: https://imgur.com/a/MV62BlO The current use case is based on FSM, but in the future HFSM may be considered as well: https://web.stanford.edu/class/cs123/lectures/CS123_lec08_HFSM_BT.pdf Thank You On Wed, Jun 17, 2020 at 5:20 PM Till Rohrmann wrote: > Hi Rohit, > > image attachments are filtered out and not visible to others. Hence, I > would suggest that you upload the image and then share the link. > > Maybe you can share a bit more details about the use case and your current > analysis of the problem. > > Cheers, > Till > > On Wed, Jun 17, 2020 at 12:15 PM Rohit R wrote: > > > Hello, > > > > To solve the following use case, What will be the best option between > > Flink or Flink Stateful Functions or Akka FSM? > > > > Use Case: > > [image: image.png] > > > > Can I get the analysis, pros, and cons of each? For example, why > choosing > > Flink Stateful function will better option. > > > > Thank You > > >
Re: [ANNOUNCE] Yu Li is now part of the Flink PMC
Congratulations , Yu! well deserved. Best Yun Tang From: Yu Li Sent: Wednesday, June 17, 2020 20:03 To: dev Subject: Re: [ANNOUNCE] Yu Li is now part of the Flink PMC Thanks everyone! Really happy to work in such a great and encouraging community! Best Regards, Yu On Wed, 17 Jun 2020 at 19:59, Congxian Qiu wrote: > Congratulations Yu ! > Best, > Congxian > > > Thomas Weise 于2020年6月17日周三 下午6:23写道: > > > Congratulations! > > > > > > On Wed, Jun 17, 2020, 2:59 AM Fabian Hueske wrote: > > > > > Congrats Yu! > > > > > > Cheers, Fabian > > > > > > Am Mi., 17. Juni 2020 um 10:20 Uhr schrieb Till Rohrmann < > > > trohrm...@apache.org>: > > > > > > > Congratulations Yu! > > > > > > > > Cheers, > > > > Till > > > > > > > > On Wed, Jun 17, 2020 at 7:53 AM Jingsong Li > > > > wrote: > > > > > > > > > Congratulations Yu, well deserved! > > > > > > > > > > Best, > > > > > Jingsong > > > > > > > > > > On Wed, Jun 17, 2020 at 1:42 PM Yuan Mei > > > wrote: > > > > > > > > > >> Congrats, Yu! > > > > >> > > > > >> GXGX & well deserved!! > > > > >> > > > > >> Best Regards, > > > > >> > > > > >> Yuan > > > > >> > > > > >> On Wed, Jun 17, 2020 at 9:15 AM jincheng sun < > > > sunjincheng...@gmail.com> > > > > >> wrote: > > > > >> > > > > >>> Hi all, > > > > >>> > > > > >>> On behalf of the Flink PMC, I'm happy to announce that Yu Li is > now > > > > >>> part of the Apache Flink Project Management Committee (PMC). > > > > >>> > > > > >>> Yu Li has been very active on Flink's Statebackend component, > > working > > > > on > > > > >>> various improvements, for example the RocksDB memory management > for > > > > 1.10. > > > > >>> and keeps checking and voting for our releases, and also has > > > > successfully > > > > >>> produced two releases(1.10.0&1.10.1) as RM. > > > > >>> > > > > >>> Congratulations & Welcome Yu Li! 
> > > > >>> > > > > >>> Best, > > > > >>> Jincheng (on behalf of the Flink PMC) > > > > >>> > > > > >> > > > > > > > > > > -- > > > > > Best, Jingsong Lee > > > > > > > > > > > > > > >
[jira] [Created] (FLINK-18348) RemoteInputChannel should checkError before checking partitionRequestClient
Jiayi Liao created FLINK-18348: -- Summary: RemoteInputChannel should checkError before checking partitionRequestClient Key: FLINK-18348 URL: https://issues.apache.org/jira/browse/FLINK-18348 Project: Flink Issue Type: Improvement Components: Runtime / Network Affects Versions: 1.10.1 Reporter: Jiayi Liao The error will be set and {{partitionRequestClient}} will be null if a remote channel fails to request the partition at the beginning. The task will then fail [here](https://github.com/apache/flink/blob/2150533ac0b2a6cc00238041853bbb6ebf22cee9/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/consumer/RemoteInputChannel.java#L172) when the task thread tries to fetch data from the channel. -- This message was sent by Atlassian Jira (v8.3.4#803005)
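The proposed check ordering can be sketched as follows. `RemoteChannelSketch` is a simplified, hypothetical stand-in for the real `RemoteInputChannel` (field and method names are illustrative): by checking the recorded failure cause before the null `partitionRequestClient`, the task fails with the original partition-request error instead of an unrelated state-check failure.

```java
class RemoteChannelSketch {
    private volatile Throwable cause;       // set asynchronously when the partition request fails
    private Object partitionRequestClient;  // stays null if the request never succeeded

    void onRequestFailure(Throwable t) {
        this.cause = t;                     // record the root cause
    }

    /** Called by the task thread when it tries to fetch data from the channel. */
    Object getNextBuffer() throws Exception {
        // Proposed order: surface the recorded error first ...
        if (cause != null) {
            throw new Exception("Channel failed", cause);
        }
        // ... and only then validate that the partition request has happened.
        if (partitionRequestClient == null) {
            throw new IllegalStateException(
                    "Queried for a buffer before requesting the subpartition.");
        }
        return new Object();                // placeholder for a real buffer
    }
}
```

With the checks reversed, a channel whose partition request failed would surface only the misleading "queried before requesting" state error; with this order, the user sees the actual request failure as the cause.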
[jira] [Created] (FLINK-18347) Error java.lang.NoSuchFieldError: NO_INTS
lining created FLINK-18347: -- Summary: Error java.lang.NoSuchFieldError: NO_INTS Key: FLINK-18347 URL: https://issues.apache.org/jira/browse/FLINK-18347 Project: Flink Issue Type: Bug Components: Connectors / Kinesis Affects Versions: 1.10.1 Reporter: lining
java.lang.NoSuchFieldError: NO_INTS
	at com.fasterxml.jackson.dataformat.cbor.CBORParser.(CBORParser.java:285) ~[usercode.jar:?]
	at com.fasterxml.jackson.dataformat.cbor.CBORParserBootstrapper.constructParser(CBORParserBootstrapper.java:91) ~[usercode.jar:?]
	at com.fasterxml.jackson.dataformat.cbor.CBORFactory._createParser(CBORFactory.java:399) ~[usercode.jar:?]
	at com.fasterxml.jackson.dataformat.cbor.CBORFactory.createParser(CBORFactory.java:324) ~[usercode.jar:?]
	at com.fasterxml.jackson.dataformat.cbor.CBORFactory.createParser(CBORFactory.java:26) ~[usercode.jar:?]
	at org.apache.flink.kinesis.shaded.com.amazonaws.http.JsonResponseHandler.handle(JsonResponseHandler.java:109) ~[usercode.jar:?]
	at org.apache.flink.kinesis.shaded.com.amazonaws.http.JsonResponseHandler.handle(JsonResponseHandler.java:43) ~[usercode.jar:?]
	at org.apache.flink.kinesis.shaded.com.amazonaws.http.response.AwsResponseHandlerAdapter.handle(AwsResponseHandlerAdapter.java:70) ~[usercode.jar:?]
	at org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleResponse(AmazonHttpClient.java:1627) ~[usercode.jar:?]
	at org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1336) ~[usercode.jar:?]
	at org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1113) ~[usercode.jar:?]
	at org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:770) ~[usercode.jar:?]
	at org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:744) ~[usercode.jar:?]
	at org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:726) ~[usercode.jar:?]
	at org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:686) ~[usercode.jar:?]
	at org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:668) ~[usercode.jar:?]
	at org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:532) ~[usercode.jar:?]
	at org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:512) ~[usercode.jar:?]
	at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.AmazonKinesisClient.doInvoke(AmazonKinesisClient.java:2809) ~[usercode.jar:?]
	at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.AmazonKinesisClient.invoke(AmazonKinesisClient.java:2776) ~[usercode.jar:?]
	at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.AmazonKinesisClient.invoke(AmazonKinesisClient.java:2765) ~[usercode.jar:?]
	at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.AmazonKinesisClient.executeListShards(AmazonKinesisClient.java:1557) ~[usercode.jar:?]
	at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.AmazonKinesisClient.listShards(AmazonKinesisClient.java:1528) ~[usercode.jar:?]
	at org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.listShards(KinesisProxy.java:439) ~[usercode.jar:?]
	at org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.getShardsOfStream(KinesisProxy.java:389) ~[usercode.jar:?]
	at org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.getShardList(KinesisProxy.java:279) ~[usercode.jar:?]
	at org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher.discoverNewShardsToSubscribe(KinesisDataFetcher.java:686) ~[usercode.jar:?]
	at org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.run(FlinkKinesisConsumer.java:287) ~[usercode.jar:?]
	at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:100) ~[flink-dist_2.11-1.10-vvr-1.0.2-SNAPSHOT.jar:1.10-vvr-1.0.2-SNAPSHOT]
	at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:63) ~[flink-dist_2.11-1.10-vvr-1.0.2-SNAPSHOT.jar:1.10-vvr-1.0.2-SNAPSHOT]
	at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:200) ~[flink-dist_2.11-1.10-vvr-1.0.2-SNAPSHOT.jar:1.10-vvr-1.0.2-SNAPSHOT]
-- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [ANNOUNCE] Yu Li is now part of the Flink PMC
Thanks everyone! Really happy to work in such a great and encouraging community! Best Regards, Yu On Wed, 17 Jun 2020 at 19:59, Congxian Qiu wrote: > Congratulations Yu ! > Best, > Congxian > > > Thomas Weise 于2020年6月17日周三 下午6:23写道: > > > Congratulations! > > > > > > On Wed, Jun 17, 2020, 2:59 AM Fabian Hueske wrote: > > > > > Congrats Yu! > > > > > > Cheers, Fabian > > > > > > Am Mi., 17. Juni 2020 um 10:20 Uhr schrieb Till Rohrmann < > > > trohrm...@apache.org>: > > > > > > > Congratulations Yu! > > > > > > > > Cheers, > > > > Till > > > > > > > > On Wed, Jun 17, 2020 at 7:53 AM Jingsong Li > > > > wrote: > > > > > > > > > Congratulations Yu, well deserved! > > > > > > > > > > Best, > > > > > Jingsong > > > > > > > > > > On Wed, Jun 17, 2020 at 1:42 PM Yuan Mei > > > wrote: > > > > > > > > > >> Congrats, Yu! > > > > >> > > > > >> GXGX & well deserved!! > > > > >> > > > > >> Best Regards, > > > > >> > > > > >> Yuan > > > > >> > > > > >> On Wed, Jun 17, 2020 at 9:15 AM jincheng sun < > > > sunjincheng...@gmail.com> > > > > >> wrote: > > > > >> > > > > >>> Hi all, > > > > >>> > > > > >>> On behalf of the Flink PMC, I'm happy to announce that Yu Li is > now > > > > >>> part of the Apache Flink Project Management Committee (PMC). > > > > >>> > > > > >>> Yu Li has been very active on Flink's Statebackend component, > > working > > > > on > > > > >>> various improvements, for example the RocksDB memory management > for > > > > 1.10. > > > > >>> and keeps checking and voting for our releases, and also has > > > > successfully > > > > >>> produced two releases(1.10.0&1.10.1) as RM. > > > > >>> > > > > >>> Congratulations & Welcome Yu Li! > > > > >>> > > > > >>> Best, > > > > >>> Jincheng (on behalf of the Flink PMC) > > > > >>> > > > > >> > > > > > > > > > > -- > > > > > Best, Jingsong Lee > > > > > > > > > > > > > > >
Re: [ANNOUNCE] Yu Li is now part of the Flink PMC
Congratulations Yu ! Best, Congxian Thomas Weise 于2020年6月17日周三 下午6:23写道: > Congratulations! > > > On Wed, Jun 17, 2020, 2:59 AM Fabian Hueske wrote: > > > Congrats Yu! > > > > Cheers, Fabian > > > > Am Mi., 17. Juni 2020 um 10:20 Uhr schrieb Till Rohrmann < > > trohrm...@apache.org>: > > > > > Congratulations Yu! > > > > > > Cheers, > > > Till > > > > > > On Wed, Jun 17, 2020 at 7:53 AM Jingsong Li > > > wrote: > > > > > > > Congratulations Yu, well deserved! > > > > > > > > Best, > > > > Jingsong > > > > > > > > On Wed, Jun 17, 2020 at 1:42 PM Yuan Mei > > wrote: > > > > > > > >> Congrats, Yu! > > > >> > > > >> GXGX & well deserved!! > > > >> > > > >> Best Regards, > > > >> > > > >> Yuan > > > >> > > > >> On Wed, Jun 17, 2020 at 9:15 AM jincheng sun < > > sunjincheng...@gmail.com> > > > >> wrote: > > > >> > > > >>> Hi all, > > > >>> > > > >>> On behalf of the Flink PMC, I'm happy to announce that Yu Li is now > > > >>> part of the Apache Flink Project Management Committee (PMC). > > > >>> > > > >>> Yu Li has been very active on Flink's Statebackend component, > working > > > on > > > >>> various improvements, for example the RocksDB memory management for > > > 1.10. > > > >>> and keeps checking and voting for our releases, and also has > > > successfully > > > >>> produced two releases(1.10.0&1.10.1) as RM. > > > >>> > > > >>> Congratulations & Welcome Yu Li! > > > >>> > > > >>> Best, > > > >>> Jincheng (on behalf of the Flink PMC) > > > >>> > > > >> > > > > > > > > -- > > > > Best, Jingsong Lee > > > > > > > > > >
Re: Any good comparison on FSM of Flink vs Akka?
Hi Rohit, image attachments are filtered out and not visible to others. Hence, I would suggest that you upload the image and then share the link. Maybe you can share a bit more details about the use case and your current analysis of the problem. Cheers, Till On Wed, Jun 17, 2020 at 12:15 PM Rohit R wrote: > Hello, > > To solve the following use case, What will be the best option between > Flink or Flink Stateful Functions or Akka FSM? > > Use Case: > [image: image.png] > > Can I get the analysis, pros, and cons of each? For example, why choosing > Flink Stateful function will better option. > > Thank You >
Which is Best option between Flink or Flink Stateful Functions or Akka FSM for State Use-cases?
Hello all, To solve the following use case, what would be the best option: Flink, Flink Stateful Functions, or Akka FSM? Use Case: ![SimpleBookingStateExample](https://i.stack.imgur.com/BMDFP.png) Can I get an analysis of the pros and cons of each? For example, why would choosing Flink Stateful Functions be the better option? Thank You
Re: [DISCUSS] SQL Syntax for Table API StatementSet
Hi Fabian, thanks for the proposal. I agree that we should have consensus on the SQL syntax as well and thus finalize the concepts introduced in FLIP-84. I would favor Jark's proposal. I would like to propose the following syntax: BEGIN STATEMENT SET; INSERT INTO ...; INSERT INTO ...; END; 1) BEGIN and END are commonly used for blocks in SQL. 2) We should not start mixing START/BEGIN for different kinds of blocks, because that can also be confusing for users. There is no additional helpful semantic in using START over BEGIN. 3) Instead, we should rather parameterize the block statement with `STATEMENT SET` and keep the END of the block simple (also similar to CASE ... WHEN ... END). 4) If we look at Jark's example in SQL Server, the BEGIN is also parameterized by `BEGIN { TRAN | TRANSACTION }`. 5) Also in Java, curly braces are used for classes, methods, and loops for different purposes, parameterized by the preceding code. Regards, Timo On 17.06.20 11:36, Fabian Hueske wrote: Thanks for joining this discussion Jark! This feature is a bit different from BEGIN TRANSACTION / COMMIT and BEGIN / END. The only commonality is that all three group multiple statements. * BEGIN TRANSACTION / COMMIT creates a transactional context that guarantees atomicity, consistency, and isolation. Statements and queries are sequentially executed. * BEGIN / END defines a block of statements just like curly braces ({ and }) do in Java. The statements (which can also include variable definitions and printing) are sequentially executed. * A statement set defines a group of statements that are optimized together and jointly executed at the same time, i.e., there is no sequence or order. A statement set (consisting of multiple INSERT INTO statements) behaves just like a single INSERT INTO statement. Everywhere where an INSERT INTO statement can be executed, it should be possible to execute a statement set consisting of multiple INSERT INTO statements. 
That's also why I think that statement sets are orthogonal to multi-statement execution. As I said before, I'm happy to discuss syntax proposals for statement sets. However, I think a BEGIN / END syntax for statement sets would confuse users who know this syntax from MySQL, SQL Server, or another DBMS. Thanks, Fabian Am Di., 16. Juni 2020 um 05:07 Uhr schrieb Jark Wu : Hi Fabian, Thanks for starting this discussion. I think this is a very important syntax to support file mode and multi-statement for SQL Client. I'm +1 to introduce a syntax to group SQL statements to execute together. As a reference, traditional database systems also have similar syntax, such as "START/BEGIN TRANSACTION ... COMMIT" to group statements as a transaction [1], and also "BEGIN ... END" [2] [3] to group a set of SQL statements that execute together. Maybe we can also use "BEGIN ... END" syntax which is much simpler? Regarding where to implement, I also prefer to have it in Flink SQL core, here are some reasons from my side: 1) I think many downstream projects (e.g Zeppelin) will have the same requirement. It would be better to have it in core instead of reinventing the wheel by users. 2) Having it in SQL CLI means it is a standard syntax to support statement set in Flink. So I think it makes sense to have it in core too, otherwise, it looks like a broken feature. In 1.10, CREATE VIEW is only supported in SQL CLI, not supported in TableEnvironment, which confuses many users. 3) Currently, we are moving statement parsing to use sql-parser (FLINK-17728). Calcite has a good support for parsing multi-statements. It will be tricky to parse multi-statements only in SQL Client. 
Best, Jark [1]: https://docs.microsoft.com/en-us/sql/t-sql/language-elements/begin-transaction-transact-sql?view=sql-server-ver15 [2]: https://www.sqlservertutorial.net/sql-server-stored-procedures/sql-server-begin-end/ [3]: https://dev.mysql.com/doc/refman/8.0/en/begin-end.html On Mon, 15 Jun 2020 at 20:50, Fabian Hueske wrote: Hi everyone, FLIP-84 [1] added the concept of a "statement set" to group multiple INSERT INTO statements (SQL or Table API) together. The statements in a statement set are jointly optimized and executed as a single Flink job. I would like to start a discussion about a SQL syntax to group multiple INSERT INTO statements in a statement set. The use case would be to expose the statement set feature to a solely text based client for Flink SQL such as Flink's SQL CLI [1]. During the discussion of FLIP-84, we had briefly talked about such a syntax [3]. START STATEMENT SET; INSERT INTO ... SELECT ...; INSERT INTO ... SELECT ...; ... END STATEMENT SET; We didn't follow up on this proposal, to keep the focus on the FLIP-84 Table API changes and to not dive into a discussion about multiline SQL query support [4]. While this feature is clearly based on multiple SQL queries, I think it is a bit different from what we usually understand as multiline SQL
Re: [ANNOUNCE] Yu Li is now part of the Flink PMC
Congratulations! On Wed, Jun 17, 2020, 2:59 AM Fabian Hueske wrote: > Congrats Yu! > > Cheers, Fabian > > Am Mi., 17. Juni 2020 um 10:20 Uhr schrieb Till Rohrmann < > trohrm...@apache.org>: > > > Congratulations Yu! > > > > Cheers, > > Till > > > > On Wed, Jun 17, 2020 at 7:53 AM Jingsong Li > > wrote: > > > > > Congratulations Yu, well deserved! > > > > > > Best, > > > Jingsong > > > > > > On Wed, Jun 17, 2020 at 1:42 PM Yuan Mei > wrote: > > > > > >> Congrats, Yu! > > >> > > >> GXGX & well deserved!! > > >> > > >> Best Regards, > > >> > > >> Yuan > > >> > > >> On Wed, Jun 17, 2020 at 9:15 AM jincheng sun < > sunjincheng...@gmail.com> > > >> wrote: > > >> > > >>> Hi all, > > >>> > > >>> On behalf of the Flink PMC, I'm happy to announce that Yu Li is now > > >>> part of the Apache Flink Project Management Committee (PMC). > > >>> > > >>> Yu Li has been very active on Flink's Statebackend component, working > > on > > >>> various improvements, for example the RocksDB memory management for > > 1.10. > > >>> and keeps checking and voting for our releases, and also has > > successfully > > >>> produced two releases(1.10.0&1.10.1) as RM. > > >>> > > >>> Congratulations & Welcome Yu Li! > > >>> > > >>> Best, > > >>> Jincheng (on behalf of the Flink PMC) > > >>> > > >> > > > > > > -- > > > Best, Jingsong Lee > > > > > >
Any good comparison on FSM of Flink vs Akka?
Hello, To solve the following use case, what would be the best option among Flink, Flink Stateful Functions, and Akka FSM? Use Case: [image: image.png] Can I get an analysis of the pros and cons of each? For example, why would Flink Stateful Functions be the better option? Thank You
Re: [ANNOUNCE] Yu Li is now part of the Flink PMC
Congrats Yu! Cheers, Fabian Am Mi., 17. Juni 2020 um 10:20 Uhr schrieb Till Rohrmann < trohrm...@apache.org>: > Congratulations Yu! > > Cheers, > Till > > On Wed, Jun 17, 2020 at 7:53 AM Jingsong Li > wrote: > > > Congratulations Yu, well deserved! > > > > Best, > > Jingsong > > > > On Wed, Jun 17, 2020 at 1:42 PM Yuan Mei wrote: > > > >> Congrats, Yu! > >> > >> GXGX & well deserved!! > >> > >> Best Regards, > >> > >> Yuan > >> > >> On Wed, Jun 17, 2020 at 9:15 AM jincheng sun > >> wrote: > >> > >>> Hi all, > >>> > >>> On behalf of the Flink PMC, I'm happy to announce that Yu Li is now > >>> part of the Apache Flink Project Management Committee (PMC). > >>> > >>> Yu Li has been very active on Flink's Statebackend component, working > on > >>> various improvements, for example the RocksDB memory management for > 1.10. > >>> and keeps checking and voting for our releases, and also has > successfully > >>> produced two releases(1.10.0&1.10.1) as RM. > >>> > >>> Congratulations & Welcome Yu Li! > >>> > >>> Best, > >>> Jincheng (on behalf of the Flink PMC) > >>> > >> > > > > -- > > Best, Jingsong Lee > > >
[jira] [Created] (FLINK-18346) Support partition pruning for lookup table source
Jingsong Lee created FLINK-18346: Summary: Support partition pruning for lookup table source Key: FLINK-18346 URL: https://issues.apache.org/jira/browse/FLINK-18346 Project: Flink Issue Type: Sub-task Components: Table SQL / Planner Reporter: Jingsong Lee Fix For: 1.12.0 The filesystem lookup table source in particular stores all records in memory; with partition pruning, a partitioned lookup table source can reduce its memory footprint effectively. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-18345) Support filter push down for lookup table source
Jingsong Lee created FLINK-18345: Summary: Support filter push down for lookup table source Key: FLINK-18345 URL: https://issues.apache.org/jira/browse/FLINK-18345 Project: Flink Issue Type: Sub-task Components: Table SQL / Planner Reporter: Jingsong Lee Fix For: 1.12.0
Re: [DISCUSS] SQL Syntax for Table API StatementSet
Thanks for joining this discussion Jark! This feature is a bit different from BEGIN TRANSACTION / COMMIT and BEGIN / END. The only commonality is that all three group multiple statements. * BEGIN TRANSACTION / COMMIT creates a transactional context that guarantees atomicity, consistency, and isolation. Statements and queries are sequentially executed. * BEGIN / END defines a block of statements just like curly braces ({ and }) do in Java. The statements (which can also include variable definitions and printing) are sequentially executed. * A statement set defines a group of statements that are optimized together and jointly executed at the same time, i.e., there is no sequence or order. A statement set (consisting of multiple INSERT INTO statements) behaves just like a single INSERT INTO statement. Everywhere where an INSERT INTO statement can be executed, it should be possible to execute a statement set consisting of multiple INSERT INTO statements. That's also why I think that statement sets are orthogonal to multi-statement execution. As I said before, I'm happy to discuss syntax proposals for statement sets. However, I think a BEGIN / END syntax for statement sets would confuse users who know this syntax from MySQL, SQL Server, or another DBMS. Thanks, Fabian Am Di., 16. Juni 2020 um 05:07 Uhr schrieb Jark Wu : > Hi Fabian, > > Thanks for starting this discussion. I think this is a very important > syntax to support file mode and multi-statement for SQL Client. > I'm +1 to introduce a syntax to group SQL statements to execute together. > > As a reference, traditional database systems also have similar syntax, such > as "START/BEGIN TRANSACTION ... COMMIT" to group statements as a > transaction [1], > and also "BEGIN ... END" [2] [3] to group a set of SQL statements that > execute together. > > Maybe we can also use "BEGIN ... END" syntax which is much simpler? 
> > Regarding where to implement, I also prefer to have it in Flink SQL core, > here are some reasons from my side: > 1) I think many downstream projects (e.g Zeppelin) will have the same > requirement. It would be better to have it in core instead of reinventing > the wheel by users. > 2) Having it in SQL CLI means it is a standard syntax to support statement > set in Flink. So I think it makes sense to have it in core too, otherwise, > it looks like a broken feature. > In 1.10, CREATE VIEW is only supported in SQL CLI, not supported in > TableEnvironment, which confuses many users. > 3) Currently, we are moving statement parsing to use sql-parser > (FLINK-17728). Calcite has a good support for parsing multi-statements. > It will be tricky to parse multi-statements only in SQL Client. > > Best, > Jark > > [1]: > > https://docs.microsoft.com/en-us/sql/t-sql/language-elements/begin-transaction-transact-sql?view=sql-server-ver15 > [2]: > > https://www.sqlservertutorial.net/sql-server-stored-procedures/sql-server-begin-end/ > [3]: https://dev.mysql.com/doc/refman/8.0/en/begin-end.html > > On Mon, 15 Jun 2020 at 20:50, Fabian Hueske wrote: > > > Hi everyone, > > > > FLIP-84 [1] added the concept of a "statement set" to group multiple > INSERT > > INTO statements (SQL or Table API) together. The statements in a > statement > > set are jointly optimized and executed as a single Flink job. > > > > I would like to start a discussion about a SQL syntax to group multiple > > INSERT INTO statements in a statement set. The use case would be to > expose > > the statement set feature to a solely text based client for Flink SQL > such > > as Flink's SQL CLI [1]. > > > > During the discussion of FLIP-84, we had briefly talked about such a > syntax > > [3]. > > > > START STATEMENT SET; > > INSERT INTO ... SELECT ...; > > INSERT INTO ... SELECT ...; > > ... 
> > END STATEMENT SET; > > > > We didn't follow up on this proposal, to keep the focus on the FLIP-84 > > Table API changes and to not dive into a discussion about multiline SQL > > query support [4]. > > > > While this feature is clearly based on multiple SQL queries, I think it > is > > a bit different from what we usually understand as multiline SQL support. > > That's because a statement set ends up to be a single Flink job. Hence, > > there is no need on the Flink side to coordinate the execution of > multiple > > jobs (incl. the discussion about blocking or async execution of queries). > > Flink would treat the queries in a STATEMENT SET as a single query. > > > > I would like to start a discussion about supporting the [START|END] > > STATEMENT SET syntax (or a different syntax with equivalent semantics) in > > Flink. > > I don't have a strong preference whether this should be implemented in > > Flink's SQL core or be a purely client side implementation in the CLI > > client. It would be good though to have parser support in Flink for this. > > > > What do others think? > > > > [1] > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=134745878 > > [2] > > > > >
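For reference, a grouped multi-INSERT job under the proposed [START|END] STATEMENT SET syntax would look like the following sketch (table and column names are made up for illustration; only the START/END keywords come from the proposal above):

```sql
START STATEMENT SET;

-- Both inserts are optimized together and run as one Flink job.
INSERT INTO enriched_orders
SELECT o.id, o.amount, c.name
FROM orders o JOIN customers c ON o.customer_id = c.id;

INSERT INTO order_counts
SELECT customer_id, COUNT(*)
FROM orders
GROUP BY customer_id;

END STATEMENT SET;
```

As Fabian notes, the set behaves like a single INSERT INTO statement: there is no ordering between the two inserts, and Flink submits one job for the whole block.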
[jira] [Created] (FLINK-18344) Avoid wrong check with shardId in DynamoDB Streams
Javier Garcia created FLINK-18344: - Summary: Avoid wrong check with shardId in DynamoDB Streams Key: FLINK-18344 URL: https://issues.apache.org/jira/browse/FLINK-18344 Project: Flink Issue Type: Wish Components: Connectors / Kinesis Affects Versions: 1.10.1 Reporter: Javier Garcia Change the isValidShardId method of the DynamoDBStreamsShardHandle class so that it does not perform checks based on assumptions about the shardId format. When using localstack as a local test environment, the shardIds follow a different format, which is still valid according to the AWS documentation (https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_streams_Shard.html). Can you change the validation to just check the length?
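A minimal sketch of the length-only validation the ticket asks for. The class and method names mirror the connector's DynamoDBStreamsShardHandle#isValidShardId but this is not the actual Flink code; the 28–65 character bounds are the length constraints stated in the linked AWS Shard API reference.

```java
// Hypothetical sketch: validate a shardId only by the documented length
// constraints, instead of assuming a specific "shardId-<timestamp>-<suffix>"
// format, so that localstack-generated shard ids also pass.
public class ShardIdCheck {
    // Length bounds from the AWS DynamoDB Streams Shard API reference.
    static final int MIN_LENGTH = 28;
    static final int MAX_LENGTH = 65;

    public static boolean isValidShardId(String shardId) {
        return shardId != null
                && shardId.length() >= MIN_LENGTH
                && shardId.length() <= MAX_LENGTH;
    }

    public static void main(String[] args) {
        // AWS-style id (37 chars) passes; a short localstack-unrelated id fails.
        System.out.println(isValidShardId("shardId-00000001536019433840-dcad9aac"));
        System.out.println(isValidShardId("shard-1"));
    }
}
```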
[jira] [Created] (FLINK-18343) Enable DEBUG logging for java e2e tests
Chesnay Schepler created FLINK-18343: Summary: Enable DEBUG logging for java e2e tests Key: FLINK-18343 URL: https://issues.apache.org/jira/browse/FLINK-18343 Project: Flink Issue Type: Improvement Components: Test Infrastructure Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: 1.11.0 Java e2e tests run with the default logging configuration, which only logs on INFO.
Re: [DISCUSS] Re-renaming "Flink Master" back to JobManager
Aljoscha, I think this is a step in the right direction. In some cases it may be difficult to talk concretely about the differences between different deployment models (e.g., comparing a k8s per-job cluster to a YARN-based session cluster, which is something I typically present during training) without giving names to the internal components. I'm not convinced we can completely avoid mentioning the JobMaster (and Dispatcher and ResourceManagers) in some (rare) contexts -- but I don't see this as an argument against the proposed change. David On Mon, Jun 15, 2020 at 2:32 PM Konstantin Knauf wrote: > Hi Aljoscha, > > sounds good to me. Let’s also make sure we don’t refer to the JobMaster as > Jobmanager anywhere then (code, config). > > I am not sure we can avoid mentioning the Flink ResourceManagers in user > facing docs completely. For JobMaster and Dispatcher this seems doable. > > Best, > > Konstantin > > On Mon 15. Jun 2020 at 12:56, Aljoscha Krettek > wrote: > > > Hi All, > > > > This came to my mind because of the master/slave discussion in [1] and > > the larger discussions about inequality/civil rights happening right now > > in the world. I think for this reason alone we should use a name that > > does not include "master". > > > > We could rename it back to JobManager, which was the name mostly used > > before 2019. Since the beginning of Flink, TaskManager was the term used > > for the worker component/node and JobManager was the term used for the > > orchestrating component/node. 
> > > > Currently our glossary [2] defines these terms (paraphrased by me): > > > > - "Flink Master": it's the orchestrating component that consists of > > resource manager, dispatcher, and JobManager > > > > - JobManager: it's the thing that manages a single job and runs as > > part of a "Flink Master" > > > > - TaskManager: it's the worker process > > > > Prior to the introduction of the glossary the definition of JobManager > > would have been: > > > > - It's the orchestrating component that manages execution of jobs and > > schedules work on TaskManagers. > > > > Quite some parts in the code and documentation/configuration options > > still use that older meaning of JobManager. Newer parts of the > > documentation use "Flink Master" instead. > > > > I'm proposing to go back to calling the orchestrating component > > JobManager, which would mean that we have to touch up the documentation > > to remove mentions of "Flink Master". I'm also proposing not to mention > > the internal components such as resource manager and dispatcher in the > > glossary because there are transparent to users. > > > > I'm proposing to go back to JobManager instead of an alternative name > > also because switching to yet another name would mean many more changes > > to code/documentation/peoples minds. > > > > What do you all think? > > > > Best, > > Aljoscha > > > > > > [1] https://issues.apache.org/jira/browse/FLINK-18209 > > [2] > > > > > https://ci.apache.org/projects/flink/flink-docs-master/concepts/glossary.html > > > -- > > Konstantin Knauf > > https://twitter.com/snntrable > > https://github.com/knaufk >
Re: Flink Table program cannot be compiled when enable checkpoint of StreamExecutionEnvironment
The exception is thrown when the StreamGraph is translated to a JobGraph. The translation logic has a switch. If checkpointing is enabled, the Java code generated by the optimizer is directly compiled in the client (in contrast to compiling it on the TaskManagers when the operators are started). The compiler needs access to the UDF classes but the classloader that's being used doesn't know about the UDF classes. The classloader that's used for the compilation is generated by PackagedProgram. You can configure the PackagedProgram with the right user code JAR URLs which contain the UDF classes. Alternatively, you can try to inject another classloader into PackagedProgram using Reflection (but that's a rather hacky approach). Hope this helps. Cheers, Fabian Am Mi., 17. Juni 2020 um 06:48 Uhr schrieb Jark Wu : > Why compile JobGraph yourself? This is really an internal API and may cause > problems. > Could you try to use `flink run` command [1] to submit your user jar > instead? > > Btw, what's your Flink version? If you are using Flink 1.10.0, could you > try to use 1.10.1? > > Best, > Jark > > On Wed, 17 Jun 2020 at 12:41, 杜斌 wrote: > > > Thanks for the reply, > > Here is the simple java program that re-produce the problem: > > 1. 
code for the application: > > > > import org.apache.flink.api.java.tuple.Tuple2; > > import org.apache.flink.streaming.api.datastream.DataStream; > > import > > org.apache.flink.streaming.api.environment.StreamExecutionEnvironment; > > import org.apache.flink.table.api.EnvironmentSettings; > > import org.apache.flink.table.api.Table; > > import org.apache.flink.table.api.java.StreamTableEnvironment; > > import org.apache.flink.types.Row; > > > > > > import java.util.Arrays; > > > > public class Test { > > public static void main(String[] args) throws Exception { > > > > StreamExecutionEnvironment env = > > StreamExecutionEnvironment.getExecutionEnvironment(); > > /** > > * If enable checkpoint, blink planner will failed > > */ > > env.enableCheckpointing(1000); > > > > EnvironmentSettings envSettings = > EnvironmentSettings.newInstance() > > //.useBlinkPlanner() // compile fail > > .useOldPlanner() // compile success > > .inStreamingMode() > > .build(); > > StreamTableEnvironment tEnv = > > StreamTableEnvironment.create(env, envSettings); > > DataStream<Order> orderA = env.fromCollection(Arrays.asList( > > new Order(1L, "beer", 3), > > new Order(1L, "diaper", 4), > > new Order(3L, "beer", 2) > > )); > > > > //Table table = tEnv.fromDataStream(orderA); > > tEnv.createTemporaryView("orderA", orderA); > > Table res = tEnv.sqlQuery("SELECT * FROM orderA"); > > DataStream<Tuple2<Boolean, Row>> ds = > > tEnv.toRetractStream(res, Row.class); > > ds.print(); > > env.execute(); > > > > } > > > > public static class Order { > > public long user; > > public String product; > > public int amount; > > > > public Order(long user, String product, int amount) { > > this.user = user; > > this.product = product; > > this.amount = amount; > > } > > > > public long getUser() { > > return user; > > } > > > > public void setUser(long user) { > > this.user = user; > > } > > > > public String getProduct() { > > return product; > > } > > > > public void setProduct(String product) { > > this.product = product; > > } 
> > > public int getAmount() { > > return amount; > > } > > > > public void setAmount(int amount) { > > this.amount = amount; > > } > > } > > } > > > > 2. mvn clean package to a jar file > > 3. then we use the following code to produce a jobgraph: > > > > PackagedProgram program = > > PackagedProgram.newBuilder() > > .setJarFile(userJarPath) > > .setUserClassPaths(classpaths) > > .setEntryPointClassName(userMainClass) > > .setConfiguration(configuration) > > > > .setSavepointRestoreSettings((descriptor.isRecoverFromSavepoint() && > > descriptor.getSavepointPath() != null && > > !descriptor.getSavepointPath().equals("")) ? > > > > SavepointRestoreSettings.forPath(descriptor.getSavepointPath(), > > descriptor.isAllowNonRestoredState()) : > > SavepointRestoreSettings.none()) > > .setArguments(userArgs) > > .build(); > > > > > > JobGraph jobGraph = PackagedProgramUtils.createJobGraph(program, > > configuration, 4, true); > > > > 4. If we use blink planner & enable checkpoint, the compile will failed. > > For the others, the compile success. > > > > Thanks, > > Bin > > > > Jark Wu 于2020年6月17日周三 上午10:42写道: > > > > > Hi, > > > > > >
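Fabian's point about the compilation classloader can be illustrated with a generic, non-Flink sketch: a classloader that does not know about the application's jars cannot resolve the application's classes, which is analogous to the client-side compiler failing to see the UDF classes when PackagedProgram is not configured with the user-code jar URLs. The class name below is purely illustrative.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ClassLoaderDemo {
    public static void main(String[] args) throws Exception {
        // An empty URLClassLoader with the bootstrap loader as parent (null):
        // it sees only JDK core classes, not the application classpath --
        // analogous to compiling generated code without the user-jar URLs.
        URLClassLoader empty = new URLClassLoader(new URL[0], null);
        empty.loadClass("java.lang.String"); // resolvable: core JDK class
        System.out.println("core class: ok");
        try {
            empty.loadClass("ClassLoaderDemo"); // our own class: not visible
            System.out.println("app class: ok");
        } catch (ClassNotFoundException e) {
            System.out.println("app class: not found");
        }
    }
}
```

Configuring PackagedProgram with the jar that contains the UDF classes (as Fabian suggests) is the equivalent of adding the right URLs to this loader.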
Re: [ANNOUNCE] Yu Li is now part of the Flink PMC
Congratulations Yu! Cheers, Till On Wed, Jun 17, 2020 at 7:53 AM Jingsong Li wrote: > Congratulations Yu, well deserved! > > Best, > Jingsong > > On Wed, Jun 17, 2020 at 1:42 PM Yuan Mei wrote: > >> Congrats, Yu! >> >> GXGX & well deserved!! >> >> Best Regards, >> >> Yuan >> >> On Wed, Jun 17, 2020 at 9:15 AM jincheng sun >> wrote: >> >>> Hi all, >>> >>> On behalf of the Flink PMC, I'm happy to announce that Yu Li is now >>> part of the Apache Flink Project Management Committee (PMC). >>> >>> Yu Li has been very active on Flink's Statebackend component, working on >>> various improvements, for example the RocksDB memory management for 1.10. >>> and keeps checking and voting for our releases, and also has successfully >>> produced two releases(1.10.0&1.10.1) as RM. >>> >>> Congratulations & Welcome Yu Li! >>> >>> Best, >>> Jincheng (on behalf of the Flink PMC) >>> >> > > -- > Best, Jingsong Lee >
[jira] [Created] (FLINK-18342) SQLParser can not parse ROW() if it contains UDF
Leonard Xu created FLINK-18342: -- Summary: SQLParser can not parse ROW() if it contains UDF Key: FLINK-18342 URL: https://issues.apache.org/jira/browse/FLINK-18342 Project: Flink Issue Type: Bug Components: Table SQL / Planner Reporter: Leonard Xu {code:java} // code to reproduce CREATE TABLE MyHBaseSource ( rowkey STRING, family1 ROW, family2 ROW ) WITH ( 'connector' = 'hbase-1.4', 'table-name' = 'source', 'zookeeper.quorum' = 'localhost:2181', 'zookeeper.znode.parent' = '/hbase' ); CREATE FUNCTION RegReplace AS 'org.apache.flink.table.toolbox.StringRegexReplaceFunction'; INSERT INTO MyHBaseSink SELECT rowkey, ROW(RegReplace(family1.f1c1, 'v', 'value')) FROM MyHBaseSource; //exception [ERROR] Could not execute SQL statement. Reason: org.apache.flink.sql.parser.impl.ParseException: Encountered "(" at line 4, column 19. Was expecting one of: ")" ... "," ... {code}