Re: [DISCUSS] Change some default config values of blocking shuffle
Hi Yingjie, Thanks for your explanation. I have no more questions. +1 On Tue, Dec 14, 2021 at 3:31 PM Yingjie Cao wrote: > > Hi Jingsong, > > Thanks for your feedback. > > >>> My question is, what is the maximum parallelism a job can have with the > >>> default configuration? (Does this break out of the box) > > Yes, you are right, these two options are related to network memory and > framework off-heap memory. Generally, these changes will not break the out of the > box experience, but for some extreme cases, for example, there are too many > ResultPartitions per task, users may need to increase network memory to avoid > an "insufficient network buffer" error. For framework off-heap, I believe that > users do not need to change the default value. > > In fact, I have a basic goal when changing these config values: when running > TPCDS of medium parallelism with the default values, all queries must pass > without any error and at the same time, the performance can be improved. I > think if we achieve this goal, most common use cases can be covered. > > Currently, for the default configuration, the exclusive buffers required at > the input gate side are still parallelism-relevant (though since 1.14, we can > decouple the network buffer consumption from parallelism by setting a config > value, it has a slight performance influence on streaming jobs), which means > that very large parallelism cannot be supported by the default configuration. > Roughly, I would say the default value can support jobs of several hundreds > of parallelism. > > >>> I do feel that this correspondence is a bit difficult to control at the > >>> moment, and it would be best if a rough table could be provided. > > I think this is a good suggestion, we can provide those suggestions in the > document. > > Best, > Yingjie > > Jingsong Li 于2021年12月14日周二 14:39写道: >> >> Hi Yingjie, >> >> +1 for this FLIP. I'm pretty sure this will greatly improve the ease >> of batch jobs. 
>> >> Looks like "taskmanager.memory.framework.off-heap.batch-shuffle.size" >> and "taskmanager.network.sort-shuffle.min-buffers" are related to >> network memory and framework.off-heap.size. >> >> My question is, what is the maximum parallelism a job can have with >> the default configuration? (Does this break out of the box) >> >> How much network memory and framework.off-heap.size are required for >> how much parallelism in the default configuration? >> >> I do feel that this correspondence is a bit difficult to control at >> the moment, and it would be best if a rough table could be provided. >> >> Best, >> Jingsong >> >> On Tue, Dec 14, 2021 at 2:16 PM Yingjie Cao wrote: >> > >> > Hi Jiangang, >> > >> > Thanks for your suggestion. >> > >> > >>> The config can affect the memory usage. Will the related memory >> > >>> configs be changed? >> > >> > I think we will not change the default network memory settings. My best >> > expectation is that the default value can work for most cases (though may >> > not the best) and for other cases, user may need to tune the memory >> > settings. >> > >> > >>> Can you share the tpcds results for different configs? Although we >> > >>> change the default values, it is helpful to change them for different >> > >>> users. In this case, the experience can help a lot. >> > >> > I did not keep all previous TPCDS results, but from the results, I can >> > tell that on HDD, always using the sort-shuffle is a good choice. For >> > small jobs, using sort-shuffle may not bring much performance gain, this >> > may because that all shuffle data can be cached in memory (page cache), >> > this is the case if the cluster have enough resources. However, if the >> > whole cluster is under heavy burden or you are running large scale jobs, >> > the performance of those small jobs can also be influenced. 
For >> > large-scale jobs, the configurations suggested to be tuned are >> > taskmanager.network.sort-shuffle.min-buffers and >> > taskmanager.memory.framework.off-heap.batch-shuffle.size, you can increase >> > these values for large-scale batch jobs. >> > >> > BTW, I am still running TPCDS tests these days and I can share these >> > results soon. >> > >> > Best, >> > Yingjie >> > >> > 刘建刚 于2021年12月10日周五 18:30写道: >> >> >> >> Glad to see the suggestion. In our test, we found that small jobs with >> >> the changing configs can not improve the performance much just as your >> >> test. I have some suggestions: >> >> >> >> The config can affect the memory usage. Will the related memory configs >> >> be changed? >> >> Can you share the tpcds results for different configs? Although we change >> >> the default values, it is helpful to change them for different users. In >> >> this case, the experience can help a lot. >> >> >> >> Best, >> >> Liu Jiangang >> >> >> >> Yun Gao 于2021年12月10日周五 17:20写道: >> >>> >> >>> Hi Yingjie, >> >>> >> >>> Very thanks for drafting the FLIP and initiating the discussion! >> >>> >> >>> May I have a double
Re: [DISCUSS] FLIP-200: Support Multiple Rule and Dynamic Rule Changing (Flink CEP)
Hi Nicholas, I understand that a Rule contains more than the Pattern. Still, I favor DynamicPattern[Holder] over Rule, because the term "Rule" does not exist in Flink's CEP implementation so far and "dynamic" seems to be the important bit here. Cheers, Konstantin On Tue, Dec 14, 2021 at 4:46 AM Nicholas Jiang wrote: > Hi DianFu, > > Thanks for your feedback on the FLIP. > > About the mentioned question for `getLatestRules`: IMO, this > doesn't need to be renamed to `getRuleChanges`, because this method is used > for getting the complete set of the latest rules after they have been > updated. > > About the CEP.rule method, the CEP.dynamicPattern renaming is > confusing for users. The dynamic pattern only creates the PatternStream, not > the DataStream. Conceptually, a dynamic pattern is still a pattern and does > not contain the PatternProcessFunction. If CEP.rule were renamed to > CEP.dynamicPattern, the return value of the method couldn't include the > PatternProcessFunction and would only return the PatternStream. I think the > difference between the Rule and the Pattern is that a Rule contains the > PatternProcessFunction, but the Pattern or DynamicPattern doesn't contain > the function. > > Best > Nicholas Jiang > -- Konstantin Knauf https://twitter.com/snntrable https://github.com/knaufk
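For readers skimming the thread, the distinction Nicholas draws — a Rule bundling a Pattern with its PatternProcessFunction, versus a (dynamic) Pattern alone — can be sketched roughly. The classes below are illustrative stand-ins only, not the FLIP-200 API or Flink CEP interfaces:

```python
# Illustrative sketch: mimics the conceptual difference discussed in the
# thread; these are NOT actual Flink CEP classes.
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Pattern:
    """What to match (analogous to a CEP Pattern definition)."""
    name: str
    condition: Callable[[Any], bool]


@dataclass
class Rule:
    """A Pattern plus how to process its matches -- the extra piece that,
    per the discussion, a plain (dynamic) Pattern does not carry."""
    pattern: Pattern
    process_match: Callable[[Any], Any]  # stands in for PatternProcessFunction


# A "dynamic pattern" rename would cover only the Pattern part; a Rule
# additionally fixes the processing logic for matches:
rule = Rule(Pattern("high-temp", lambda e: e > 100), lambda m: f"alert:{m}")
```

Under this reading, `CEP.dynamicPattern` returning only a PatternStream corresponds to exposing just the `pattern` field, while `CEP.rule` also binds `process_match`.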
Re: [DISCUSS] Change some default config values of blocking shuffle
Hi Jingsong, Thanks for your feedback. >>> My question is, what is the maximum parallelism a job can have with the default configuration? (Does this break out of the box) Yes, you are right, these two options are related to network memory and framework off-heap memory. Generally, these changes will not break the out of the box experience, but for some extreme cases, for example, there are too many ResultPartitions per task, users may need to increase network memory to avoid an "insufficient network buffer" error. For framework off-heap, I believe that users do not need to change the default value. In fact, I have a basic goal when changing these config values: when running TPCDS of medium parallelism with the default values, all queries must pass without any error and at the same time, the performance can be improved. I think if we achieve this goal, most common use cases can be covered. Currently, for the default configuration, the exclusive buffers required at the input gate side are still parallelism-relevant (though since 1.14, we can decouple the network buffer consumption from parallelism by setting a config value, it has a slight performance influence on streaming jobs), which means that very large parallelism cannot be supported by the default configuration. Roughly, I would say the default value can support jobs of several hundreds of parallelism. >>> I do feel that this correspondence is a bit difficult to control at the moment, and it would be best if a rough table could be provided. I think this is a good suggestion, we can provide those suggestions in the document. Best, Yingjie Jingsong Li 于2021年12月14日周二 14:39写道: > Hi Yingjie, > > +1 for this FLIP. I'm pretty sure this will greatly improve the ease > of batch jobs. > > Looks like "taskmanager.memory.framework.off-heap.batch-shuffle.size" > and "taskmanager.network.sort-shuffle.min-buffers" are related to > network memory and framework.off-heap.size. 
> > My question is, what is the maximum parallelism a job can have with > the default configuration? (Does this break out of the box) > > How much network memory and framework.off-heap.size are required for > how much parallelism in the default configuration? > > I do feel that this correspondence is a bit difficult to control at > the moment, and it would be best if a rough table could be provided. > > Best, > Jingsong > > On Tue, Dec 14, 2021 at 2:16 PM Yingjie Cao > wrote: > > > > Hi Jiangang, > > > > Thanks for your suggestion. > > > > >>> The config can affect the memory usage. Will the related memory > configs be changed? > > > > I think we will not change the default network memory settings. My best > expectation is that the default value can work for most cases (though may > not the best) and for other cases, user may need to tune the memory > settings. > > > > >>> Can you share the tpcds results for different configs? Although we > change the default values, it is helpful to change them for different > users. In this case, the experience can help a lot. > > > > I did not keep all previous TPCDS results, but from the results, I can > tell that on HDD, always using the sort-shuffle is a good choice. For small > jobs, using sort-shuffle may not bring much performance gain, this may > because that all shuffle data can be cached in memory (page cache), this is > the case if the cluster have enough resources. However, if the whole > cluster is under heavy burden or you are running large scale jobs, the > performance of those small jobs can also be influenced. For large-scale > jobs, the configurations suggested to be tuned are > taskmanager.network.sort-shuffle.min-buffers and > taskmanager.memory.framework.off-heap.batch-shuffle.size, you can increase > these values for large-scale batch jobs. > > > > BTW, I am still running TPCDS tests these days and I can share these > results soon. 
> > > > Best, > > Yingjie > > > > 刘建刚 于2021年12月10日周五 18:30写道: > >> > >> Glad to see the suggestion. In our test, we found that small jobs with > the changing configs can not improve the performance much just as your > test. I have some suggestions: > >> > >> The config can affect the memory usage. Will the related memory configs > be changed? > >> Can you share the tpcds results for different configs? Although we > change the default values, it is helpful to change them for different > users. In this case, the experience can help a lot. > >> > >> Best, > >> Liu Jiangang > >> > >> Yun Gao 于2021年12月10日周五 17:20写道: > >>> > >>> Hi Yingjie, > >>> > >>> Very thanks for drafting the FLIP and initiating the discussion! > >>> > >>> May I have a double confirmation for > taskmanager.network.sort-shuffle.min-parallelism that > >>> since other frameworks like Spark have used sort-based shuffle for all > the cases, does our > >>> current circumstance still have difference with them? > >>> > >>> Best, > >>> Yun > >>> > >>> > >>> > >>> > >>> -- > >>> From:Yingjie Cao >
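The parallelism-to-network-memory correspondence Jingsong asks about can be roughed out with back-of-the-envelope arithmetic. The figures below assume what I believe are Flink's documented defaults (32 KiB network buffers, 2 exclusive buffers per input channel, 8 floating buffers per gate); treat this as an illustration of the scaling, not a sizing guide:

```python
# Rough estimate of network buffers needed by ONE input gate, assuming
# (claimed, not verified here) Flink defaults:
SEGMENT_SIZE = 32 * 1024       # taskmanager.memory.segment-size: 32 KiB
EXCLUSIVE_PER_CHANNEL = 2      # exclusive buffers per input channel
FLOATING_PER_GATE = 8          # floating buffers per input gate


def input_gate_buffers(parallelism: int) -> int:
    """Buffer count for one gate reading from `parallelism` upstream tasks."""
    return parallelism * EXCLUSIVE_PER_CHANNEL + FLOATING_PER_GATE


def input_gate_memory_mib(parallelism: int) -> float:
    """Network memory (MiB) that buffer count translates to."""
    return input_gate_buffers(parallelism) * SEGMENT_SIZE / (1024 * 1024)


# A few hundred of parallelism already needs tens of MiB per gate:
# 500 channels -> 500*2 + 8 = 1008 buffers = 31.5 MiB
print(input_gate_memory_mib(500))
```

This is why the exclusive-buffer requirement being "parallelism relevant" caps the parallelism the default network memory can support, as described above.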
[jira] [Created] (FLINK-25294) Incorrect cloudpickle import
arya created FLINK-25294: Summary: Incorrect cloudpickle import Key: FLINK-25294 URL: https://issues.apache.org/jira/browse/FLINK-25294 Project: Flink Issue Type: Bug Components: API / Python Affects Versions: 1.14.0 Reporter: arya In flink-python/pyflink/fn_execution/coder_impl_fast.pyx line 30 {code:python} from cloudpickle import cloudpickle {code} should simply be {code:python} import cloudpickle {code} or else I get {{AttributeError: module 'cloudpickle.cloudpickle' has no attribute 'dumps'}} when using keyed state. I assume this is left over from when cloudpickle was incorrectly packaged, then fixed in [FLINK-14556|https://issues.apache.org/jira/browse/FLINK-14556], so this might have been a problem since 1.10.0 -- This message was sent by Atlassian Jira (v8.20.1#820001)
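The failure mode reported here — `from pkg import pkg` binding a *submodule* that need not expose the package-level API — can be reproduced with a stdlib analogue. The `logging` package is used below purely for illustration of the same import pattern:

```python
# Stdlib analogue of the bug: binding a submodule where the package was
# intended. `logging` (the package) exposes getLogger; `logging.config`
# (a submodule) does not, so calling sub.getLogger raises AttributeError --
# the same shape of error as the cloudpickle report above.
import logging                     # the package: exposes getLogger
from logging import config as sub  # a submodule: does NOT expose getLogger

print(hasattr(logging, "getLogger"))  # the package-level API is here
print(hasattr(sub, "getLogger"))      # ...but not on the submodule
```

A plain `import cloudpickle` binds the package itself, so `cloudpickle.dumps` resolves regardless of how the package re-exports its internals.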
Re: [DISCUSS] Change some default config values of blocking shuffle
Hi Yingjie, +1 for this FLIP. I'm pretty sure this will greatly improve the ease of batch jobs. Looks like "taskmanager.memory.framework.off-heap.batch-shuffle.size" and "taskmanager.network.sort-shuffle.min-buffers" are related to network memory and framework.off-heap.size. My question is, what is the maximum parallelism a job can have with the default configuration? (Does this break out of the box) How much network memory and framework.off-heap.size are required for how much parallelism in the default configuration? I do feel that this correspondence is a bit difficult to control at the moment, and it would be best if a rough table could be provided. Best, Jingsong On Tue, Dec 14, 2021 at 2:16 PM Yingjie Cao wrote: > > Hi Jiangang, > > Thanks for your suggestion. > > >>> The config can affect the memory usage. Will the related memory configs > >>> be changed? > > I think we will not change the default network memory settings. My best > expectation is that the default value can work for most cases (though may not > the best) and for other cases, user may need to tune the memory settings. > > >>> Can you share the tpcds results for different configs? Although we change > >>> the default values, it is helpful to change them for different users. In > >>> this case, the experience can help a lot. > > I did not keep all previous TPCDS results, but from the results, I can tell > that on HDD, always using the sort-shuffle is a good choice. For small jobs, > using sort-shuffle may not bring much performance gain, this may because that > all shuffle data can be cached in memory (page cache), this is the case if > the cluster have enough resources. However, if the whole cluster is under > heavy burden or you are running large scale jobs, the performance of those > small jobs can also be influenced. 
For large-scale jobs, the configurations > suggested to be tuned are taskmanager.network.sort-shuffle.min-buffers and > taskmanager.memory.framework.off-heap.batch-shuffle.size, you can increase > these values for large-scale batch jobs. > > BTW, I am still running TPCDS tests these days and I can share these results > soon. > > Best, > Yingjie > > 刘建刚 于2021年12月10日周五 18:30写道: >> >> Glad to see the suggestion. In our test, we found that small jobs with the >> changing configs can not improve the performance much just as your test. I >> have some suggestions: >> >> The config can affect the memory usage. Will the related memory configs be >> changed? >> Can you share the tpcds results for different configs? Although we change >> the default values, it is helpful to change them for different users. In >> this case, the experience can help a lot. >> >> Best, >> Liu Jiangang >> >> Yun Gao 于2021年12月10日周五 17:20写道: >>> >>> Hi Yingjie, >>> >>> Very thanks for drafting the FLIP and initiating the discussion! >>> >>> May I have a double confirmation for >>> taskmanager.network.sort-shuffle.min-parallelism that >>> since other frameworks like Spark have used sort-based shuffle for all the >>> cases, does our >>> current circumstance still have difference with them? >>> >>> Best, >>> Yun >>> >>> >>> >>> >>> -- >>> From:Yingjie Cao >>> Send Time:2021 Dec. 10 (Fri.) 16:17 >>> To:dev ; user ; user-zh >>> >>> Subject:Re: [DISCUSS] Change some default config values of blocking shuffle >>> >>> Hi dev & users: >>> >>> I have created a FLIP [1] for it, feedbacks are highly appreciated. >>> >>> Best, >>> Yingjie >>> >>> [1] >>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-199%3A+Change+some+default+config+values+of+blocking+shuffle+for+better+usability >>> Yingjie Cao 于2021年12月3日周五 17:02写道: >>> >>> Hi dev & users, >>> >>> We propose to change some default values of blocking shuffle to improve the >>> user out-of-box experience (not influence streaming). 
The default values we >>> want to change are as follows: >>> >>> 1. Data compression >>> (taskmanager.network.blocking-shuffle.compression.enabled): Currently, the >>> default value is 'false'. Usually, data compression can reduce both disk >>> and network IO which is good for performance. At the same time, it can save >>> storage space. We propose to change the default value to true. >>> >>> 2. Default shuffle implementation >>> (taskmanager.network.sort-shuffle.min-parallelism): Currently, the default >>> value is 'Integer.MAX', which means by default, Flink jobs will always use >>> hash-shuffle. In fact, for high parallelism, sort-shuffle is better for >>> both stability and performance. So we propose to reduce the default value >>> to a proper smaller one, for example, 128. (We tested 128, 256, 512 and >>> 1024 with a tpc-ds and 128 is the best one.) >>> >>> 3. Read buffer of sort-shuffle >>> (taskmanager.memory.framework.off-heap.batch-shuffle.size): Currently, the >>> default value is '32M'. Previously, when choosing the default value, both >>> ‘32M' and '64M'
[jira] [Created] (FLINK-25293) Option to let fail if KafkaSource keeps failing to commit offset
rerorero created FLINK-25293: Summary: Option to let fail if KafkaSource keeps failing to commit offset Key: FLINK-25293 URL: https://issues.apache.org/jira/browse/FLINK-25293 Project: Flink Issue Type: Improvement Components: Connectors / Kafka Affects Versions: 1.14.0 Environment: Flink 1.14.0 Reporter: rerorero Is it possible to let KafkaSource fail if it keeps failing to commit offsets? I faced an issue where KafkaSource keeps failing and never recovers, while it's logging like these logs: {code:java} 2021-12-08 22:18:34,155 INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator [] - [Consumer clientId=dbz-mercari-contact-tool-jp-cg-1, groupId=dbz-mercari-contact-tool-jp-cg] Group coordinator b4-pkc-xmj7g.asia-northeast1.gcp.confluent.cloud:9092 (id: 2147483643 rack: null) is unavailable or invalid due to cause: null.isDisconnected: true. Rediscovery will be attempted. 2021-12-08 22:18:34,157 WARN org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed to commit consumer offsets for checkpoint 13 {code} This is happening not just once, but a couple of times a week. I found [other people reporting the same thing|https://lists.apache.org/thread/8l4f2yb4qwysdn1cj1wjk99tfb79kgs2]. This could possibly be a problem with the Kafka client. It can be resolved by restarting the Flink job. However, the Flink Kafka connector doesn't provide an automatic way to save this situation. KafkaSource [keeps retrying forever when a retriable error occurs|https://github.com/apache/flink/blob/afb29d92c4e76ec6a453459c3d8a08304efec549/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/reader/KafkaSourceReader.java#L144-L148], even if it is actually not retriable. Since it exposes a metric for the number of times a commit fails, recovery could be automated by monitoring that metric and restarting the job, but that would mean we need a new process to manage. 
Does it make sense to have KafkaSource have the option like, let the source task fail if it keeps failing to commit an offset more than X times? -- This message was sent by Atlassian Jira (v8.20.1#820001)
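The option requested — fail the source after X consecutive commit failures instead of retrying forever — amounts to a small piece of bookkeeping. A minimal sketch in plain Python (class and method names are hypothetical, not Flink's KafkaSourceReader API):

```python
class CommitFailureTracker:
    """Hypothetical helper: tracks consecutive offset-commit failures and
    signals when the configured threshold (the 'X times' from the proposal)
    is exceeded, at which point the caller would fail the source task."""

    def __init__(self, max_consecutive_failures: int):
        self.max_consecutive_failures = max_consecutive_failures
        self.consecutive_failures = 0

    def on_commit_success(self) -> None:
        # Any successful commit resets the streak: only *consecutive*
        # failures indicate the stuck-coordinator situation described above.
        self.consecutive_failures = 0

    def on_commit_failure(self) -> bool:
        """Returns True when the job should be failed rather than retried."""
        self.consecutive_failures += 1
        return self.consecutive_failures > self.max_consecutive_failures


tracker = CommitFailureTracker(max_consecutive_failures=2)
```

Counting only consecutive failures (rather than a lifetime total) matters here: occasional transient commit failures are normal and, per checkpoint-based recovery, harmless.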
[jira] [Created] (FLINK-25292) Azure failed due to unable to fetch some archives
Yun Gao created FLINK-25292: --- Summary: Azure failed due to unable to fetch some archives Key: FLINK-25292 URL: https://issues.apache.org/jira/browse/FLINK-25292 Project: Flink Issue Type: Bug Components: Build System / Azure Pipelines Affects Versions: 1.13.3 Reporter: Yun Gao {code:java} /bin/bash --noprofile --norc /__w/_temp/ba0f8961-8595-4ace-b13f-d60e17df8803.sh Reading package lists... Building dependency tree... Reading state information... The following additional packages will be installed: libio-pty-perl libipc-run-perl Suggested packages: libtime-duration-perl libtimedate-perl The following NEW packages will be installed: libio-pty-perl libipc-run-perl moreutils 0 upgraded, 3 newly installed, 0 to remove and 0 not upgraded. Need to get 177 kB of archives. After this operation, 573 kB of additional disk space will be used. Err:1 http://archive.ubuntu.com/ubuntu xenial/main amd64 libio-pty-perl amd64 1:1.08-1.1build1 Could not connect to archive.ubuntu.com:80 (91.189.88.152), connection timed out [IP: 91.189.88.152 80] Err:2 http://archive.ubuntu.com/ubuntu xenial/main amd64 libipc-run-perl all 0.94-1 Unable to connect to archive.ubuntu.com:http: [IP: 91.189.88.152 80] Err:3 http://archive.ubuntu.com/ubuntu xenial/universe amd64 moreutils amd64 0.57-1 Unable to connect to archive.ubuntu.com:http: [IP: 91.189.88.152 80] E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/libi/libio-pty-perl/libio-pty-perl_1.08-1.1build1_amd64.deb Could not connect to archive.ubuntu.com:80 (91.189.88.152), connection timed out [IP: 91.189.88.152 80] E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/libi/libipc-run-perl/libipc-run-perl_0.94-1_all.deb Unable to connect to archive.ubuntu.com:http: [IP: 91.189.88.152 80] E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/universe/m/moreutils/moreutils_0.57-1_amd64.deb Unable to connect to archive.ubuntu.com:http: [IP: 91.189.88.152 80] E: Unable to fetch some archives, maybe run apt-get update or 
try with --fix-missing? Running command './tools/ci/test_controller.sh kafka/gelly' with a timeout of 234 minutes. ./tools/azure-pipelines/uploading_watchdog.sh: line 76: ts: command not found The STDIO streams did not close within 10 seconds of the exit event from process '/bin/bash'. This may indicate a child process inherited the STDIO streams and has not yet exited. ##[error]Bash exited with code '141'. {code} https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=28064&view=logs&j=72d4811f-9f0d-5fd0-014a-0bc26b72b642&t=e424005a-b16e-540f-196d-da062cc19bdf&l=13 -- This message was sent by Atlassian Jira (v8.20.1#820001)
Re: [DISCUSS] Change some default config values of blocking shuffle
Hi Jiangang, Thanks for your suggestion. >>> The config can affect the memory usage. Will the related memory configs be changed? I think we will not change the default network memory settings. My best expectation is that the default values can work for most cases (though maybe not the best) and for other cases, users may need to tune the memory settings. >>> Can you share the tpcds results for different configs? Although we change the default values, it is helpful to change them for different users. In this case, the experience can help a lot. I did not keep all previous TPCDS results, but from the results, I can tell that on HDD, always using the sort-shuffle is a good choice. For small jobs, using sort-shuffle may not bring much performance gain; this may be because all shuffle data can be cached in memory (page cache), which is the case if the cluster has enough resources. However, if the whole cluster is under a heavy load or you are running large-scale jobs, the performance of those small jobs can also be influenced. For large-scale jobs, the configurations suggested to be tuned are taskmanager.network.sort-shuffle.min-buffers and taskmanager.memory.framework.off-heap.batch-shuffle.size; you can increase these values for large-scale batch jobs. BTW, I am still running TPCDS tests these days and I can share these results soon. Best, Yingjie 刘建刚 于2021年12月10日周五 18:30写道: > Glad to see the suggestion. In our test, we found that small jobs with the > changing configs can not improve the performance much just as your test. I > have some suggestions: > >- The config can affect the memory usage. Will the related memory >configs be changed? >- Can you share the tpcds results for different configs? Although we >change the default values, it is helpful to change them for different >users. In this case, the experience can help a lot. 
> > Best, > Liu Jiangang > > Yun Gao 于2021年12月10日周五 17:20写道: > >> Hi Yingjie, >> >> Very thanks for drafting the FLIP and initiating the discussion! >> >> May I have a double confirmation for >> taskmanager.network.sort-shuffle.min-parallelism that >> since other frameworks like Spark have used sort-based shuffle for all >> the cases, does our >> current circumstance still have difference with them? >> >> Best, >> Yun >> >> >> >> >> -- >> From:Yingjie Cao >> Send Time:2021 Dec. 10 (Fri.) 16:17 >> To:dev ; user ; user-zh < >> user...@flink.apache.org> >> Subject:Re: [DISCUSS] Change some default config values of blocking >> shuffle >> >> Hi dev & users: >> >> I have created a FLIP [1] for it, feedbacks are highly appreciated. >> >> Best, >> Yingjie >> >> [1] >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-199%3A+Change+some+default+config+values+of+blocking+shuffle+for+better+usability >> Yingjie Cao 于2021年12月3日周五 17:02写道: >> >> Hi dev & users, >> >> We propose to change some default values of blocking shuffle to improve >> the user out-of-box experience (not influence streaming). The default >> values we want to change are as follows: >> >> 1. Data compression >> (taskmanager.network.blocking-shuffle.compression.enabled): Currently, the >> default value is 'false'. Usually, data compression can reduce both disk >> and network IO which is good for performance. At the same time, it can save >> storage space. We propose to change the default value to true. >> >> 2. Default shuffle implementation >> (taskmanager.network.sort-shuffle.min-parallelism): Currently, the default >> value is 'Integer.MAX', which means by default, Flink jobs will always use >> hash-shuffle. In fact, for high parallelism, sort-shuffle is better for >> both stability and performance. So we propose to reduce the default value >> to a proper smaller one, for example, 128. (We tested 128, 256, 512 and >> 1024 with a tpc-ds and 128 is the best one.) >> >> 3. 
Read buffer of sort-shuffle >> (taskmanager.memory.framework.off-heap.batch-shuffle.size): Currently, the >> default value is '32M'. Previously, when choosing the default value, both >> ‘32M' and '64M' are OK for tests and we chose the smaller one in a cautious >> way. However, recently, it is reported in the mailing list that the default >> value is not enough which caused a buffer request timeout issue. We already >> created a ticket to improve the behavior. At the same time, we propose to >> increase this default value to '64M' which can also help. >> >> 4. Sort buffer size of sort-shuffle >> (taskmanager.network.sort-shuffle.min-buffers): Currently, the default >> value is '64' which means '64' network buffers (32k per buffer by default). >> This default value is quite modest and the performance can be influenced. >> We propose to increase this value to a larger one, for example, 512 (the >> default TM and network buffer configuration can serve more than 10 result >> partitions concurrently). >> >> We already tested these default values together with
Re: [DISCUSS] Change some default config values of blocking shuffle
Hi Yun, Thanks for your feedback. I think setting taskmanager.network.sort-shuffle.min-parallelism to 1 and using sort-shuffle for all cases by default is a good suggestion. I am not choosing this value mainly for two reasons: 1. The first one is that it increases the usage of network memory, which may cause an "insufficient network buffer" exception, and users may have to increase the total network buffers. 2. There are several (not many) TPCDS queries that suffer some performance regression on SSD. For the first issue, I will test more settings on tpcds and see the influence. For the second issue, I will try to find the cause and solve it in 1.15. I am open to your suggestion, but I still need some more tests and analysis to guarantee that it works well. Best, Yingjie Yun Gao 于2021年12月10日周五 17:19写道: > Hi Yingjie, > > Very thanks for drafting the FLIP and initiating the discussion! > > May I have a double confirmation for > taskmanager.network.sort-shuffle.min-parallelism > that > since other frameworks like Spark have used sort-based shuffle for all the > cases, does our > current circumstance still have difference with them? > > Best, > Yun > > > > -- > From:Yingjie Cao > Send Time:2021 Dec. 10 (Fri.) 16:17 > To:dev ; user ; user-zh < > user...@flink.apache.org> > Subject:Re: [DISCUSS] Change some default config values of blocking shuffle > > Hi dev & users: > > I have created a FLIP [1] for it, feedbacks are highly appreciated. > > Best, > Yingjie > > [1] > https://cwiki.apache.org/confluence/display/FLINK/FLIP-199%3A+Change+some+default+config+values+of+blocking+shuffle+for+better+usability > > Yingjie Cao 于2021年12月3日周五 17:02写道: > Hi dev & users, > > We propose to change some default values of blocking shuffle to improve > the user out-of-box experience (not influence streaming). The default > values we want to change are as follows: > > 1. Data compression > (taskmanager.network.blocking-shuffle.compression.enabled): Currently, the > default value is 'false'. 
Usually, data compression can reduce both disk > and network IO which is good for performance. At the same time, it can save > storage space. We propose to change the default value to true. > > 2. Default shuffle implementation > (taskmanager.network.sort-shuffle.min-parallelism): Currently, the default > value is 'Integer.MAX', which means by default, Flink jobs will always use > hash-shuffle. In fact, for high parallelism, sort-shuffle is better for > both stability and performance. So we propose to reduce the default value > to a proper smaller one, for example, 128. (We tested 128, 256, 512 and > 1024 with a tpc-ds and 128 is the best one.) > > 3. Read buffer of sort-shuffle > (taskmanager.memory.framework.off-heap.batch-shuffle.size): Currently, the > default value is '32M'. Previously, when choosing the default value, both > ‘32M' and '64M' are OK for tests and we chose the smaller one in a cautious > way. However, recently, it is reported in the mailing list that the default > value is not enough which caused a buffer request timeout issue. We already > created a ticket to improve the behavior. At the same time, we propose to > increase this default value to '64M' which can also help. > > 4. Sort buffer size of sort-shuffle > (taskmanager.network.sort-shuffle.min-buffers): Currently, the default > value is '64' which means '64' network buffers (32k per buffer by default). > This default value is quite modest and the performance can be influenced. > We propose to increase this value to a larger one, for example, 512 (the > default TM and network buffer configuration can serve more than 10 > result partitions concurrently). > > We already tested these default values together with tpc-ds benchmark in a > cluster and both the performance and stability improved a lot. These > changes can help to improve the out-of-box experience of blocking shuffle. > What do you think about these changes? Is there any concern? 
If there are > no objections, I will make these changes soon. > > Best, > Yingjie > > >
[jira] [Created] (FLINK-25291) Add failure cases in DataStream source and sink suite of connector testing framework
Qingsheng Ren created FLINK-25291: - Summary: Add failure cases in DataStream source and sink suite of connector testing framework Key: FLINK-25291 URL: https://issues.apache.org/jira/browse/FLINK-25291 Project: Flink Issue Type: Sub-task Components: Test Infrastructure Reporter: Qingsheng Ren Fix For: 1.15.0 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-25290) Add table source and sink test suite in connector testing framework
Qingsheng Ren created FLINK-25290: - Summary: Add table source and sink test suite in connector testing framework Key: FLINK-25290 URL: https://issues.apache.org/jira/browse/FLINK-25290 Project: Flink Issue Type: Sub-task Reporter: Qingsheng Ren Fix For: 1.15.0 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-25288) Add savepoint and metric cases in DataStream source suite of connector testing framework
Qingsheng Ren created FLINK-25288: - Summary: Add savepoint and metric cases in DataStream source suite of connector testing framework Key: FLINK-25288 URL: https://issues.apache.org/jira/browse/FLINK-25288 Project: Flink Issue Type: Sub-task Components: Test Infrastructure Reporter: Qingsheng Ren Fix For: 1.15.0 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-25289) Add DataStream sink test suite in connector testing framework
Qingsheng Ren created FLINK-25289: - Summary: Add DataStream sink test suite in connector testing framework Key: FLINK-25289 URL: https://issues.apache.org/jira/browse/FLINK-25289 Project: Flink Issue Type: Sub-task Components: Test Infrastructure Reporter: Qingsheng Ren -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-25287) Refactor connector testing framework interfaces
Qingsheng Ren created FLINK-25287: - Summary: Refactor connector testing framework interfaces Key: FLINK-25287 URL: https://issues.apache.org/jira/browse/FLINK-25287 Project: Flink Issue Type: Sub-task Components: Test Infrastructure Reporter: Qingsheng Ren Fix For: 1.15.0 A refactor in connector testing framework interfaces is required to support more test scenarios and cases such as sinks and Table / SQL API. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-25286) Improve connector testing framework to support more scenarios
Qingsheng Ren created FLINK-25286: - Summary: Improve connector testing framework to support more scenarios Key: FLINK-25286 URL: https://issues.apache.org/jira/browse/FLINK-25286 Project: Flink Issue Type: Improvement Components: Test Infrastructure Reporter: Qingsheng Ren Fix For: 1.15.0 Currently the connector testing framework only supports tests for DataStream sources, and the available scenarios are quite limited by the current interface design. This ticket proposes to make improvements to the connector testing framework to support more test scenarios, and to add test suites for sinks and the Table/SQL API. -- This message was sent by Atlassian Jira (v8.20.1#820001)
Re: [Proposal] It is hoped that Apache APISIX and Apache Flink will carry out diversified community cooperation
Hi, yeliang, Apache ML does not support pictures and attachments, so we can not see the architecture diagram. I think it will work between Flink and APISIX from your description. Thanks, Ming Wen, Apache APISIX PMC Chair Twitter: _WenMing
[Proposal] It is hoped that Apache APISIX and Apache Flink will carry out diversified community cooperation
Hi, community, My name is Yeliang Wang, and I am an Apache APISIX Committer. To my knowledge, Apache Flink, as a distributed real-time processing engine for stream and batch data with high throughput and low latency, has very rich application scenarios in enterprises. For example, in combination with OpenResty or Nginx, there are two application scenarios: 1. client -> gateway -> kafka -> flink. This is the mainstream application scenario (data analysis) in the industry. 2. client -> gateway -> flink (REST API). Flink has some REST APIs, which can be used for rate limiting, authentication and other scenarios. [image: 飞书20211214-105403.png] Compared with OpenResty or Nginx, Apache APISIX provides rich traffic management features like Load Balancing, Dynamic Upstream, Canary Release, Circuit Breaking, Authentication, Observability, and more, and Apache APISIX already supports Kafka via a plugin ( https://apisix.apache.org/docs/apisix/plugins/kafka-logger/ ). So we hope to develop more community cooperation with Apache Flink: 1. Hold a technical meetup together 2. Collaborate on technical blog posts to share with more people 3. Carry out publicity activities on the official websites and media channels together 4. Jointly develop a Flink plugin for APISIX I believe that doing so can not only meet the diversified needs of users, but also enrich the surrounding ecosystems of Apache Flink and Apache APISIX. Looking forward to more discussion. Thanks, Github: wang-yeliang, Twitter: @WYeliang
Re: [VOTE] Release 1.11.5/1.12.6/1.13.4/1.14.1, release candidate #1
Hi Chesnay, Thanks a lot for driving these emergency patch releases! I just noticed that current flink-1.11.4 offers python files on mac os [1]. Is it okay to release Flink-1.11.5 and flink-1.12.6 without those python binaries on mac os? [1] https://pypi.org/project/apache-flink/1.11.4/#files Best Yun Tang From: Zhu Zhu Sent: Tuesday, December 14, 2021 11:00 To: dev Subject: Re: [VOTE] Release 1.11.5/1.12.6/1.13.4/1.14.1, release candidate #1 +1 (binding) - verified the differences of source releases to the corresponding latest releases, there are only dependency updates and release version update commits - verified versions of log4j dependencies in the all binary releases are 2.15.0 - ran example jobs against all the binary releases, logs look good - release notes and blogpost look good Thanks, Zhu Xintong Song 于2021年12月14日周二 10:23写道: > +1 (binding) > > - verified checksum and signature > - verified that release candidates only contain the log4j dependency > changes compared to previous releases. > - release notes and blogpost LGTM > > Thanks a lot for driving these emergency patch releases, Chesnay! > > Thank you~ > > Xintong Song > > > > On Tue, Dec 14, 2021 at 7:45 AM Chesnay Schepler > wrote: > > > I forgot to mention something important: > > > > The 1.11/1.12 releases do *NOT* contain flink-python releases for *mac* > > due to compile problems. > > > > On 13/12/2021 20:28, Chesnay Schepler wrote: > > > Hi everyone, > > > > > > This vote is for the emergency patch releases for 1.11, 1.12, 1.13 and > > > 1.14 to address CVE-2021-44228. > > > It covers all 4 releases as they contain the same changes (upgrading > > > Log4j to 2.15.0) and were prepared simultaneously by the same person. 
> > > (Hence, if something is broken, it likely applies to all releases) > > > > > > Please review and vote on the release candidate #1 for the versions > > > 1.11.5, 1.12.6, 1.13.4 and 1.14.1, as follows: > > > [ ] +1, Approve the releases > > > [ ] -1, Do not approve the releases (please provide specific comments) > > > > > > The complete staging area is available for your review, which includes: > > > * JIRA release notes [1], > > > * the official Apache source releases and binary convenience releases > > > to be deployed to dist.apache.org [2], which are signed with the key > > > with fingerprint C2EED7B111D464BA [3], > > > * all artifacts to be deployed to the Maven Central Repository [4], > > > * *the jars for 1.13/1.14 are still being built* > > > * source code tags [5], > > > * website pull request listing the new releases and adding > > > announcement blog post [6]. > > > > > > The vote will be open for at least 24 hours. The minimum vote time has > > > been shortened as the changes are minimal and the matter is urgent. > > > It is adopted by majority approval, with at least 3 PMC affirmative > > > votes. 
> > > > > > Thanks, > > > Chesnay > > > > > > [1] > > > 1.11: > > > > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12350327 > > > 1.12: > > > > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12350328 > > > 1.13: > > > > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12350686 > > > 1.14: > > > > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12350512 > > > [2] > > > 1.11: https://dist.apache.org/repos/dist/dev/flink/flink-1.11.5-rc1/ > > > 1.12: https://dist.apache.org/repos/dist/dev/flink/flink-1.12.6-rc1/ > > > 1.13: https://dist.apache.org/repos/dist/dev/flink/flink-1.13.4-rc1/ > > > 1.14: https://dist.apache.org/repos/dist/dev/flink/flink-1.14.1-rc1/ > > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS > > > [4] > > > 1.11/1.12: > > > https://repository.apache.org/content/repositories/orgapacheflink-1455 > > > 1.13: > > > https://repository.apache.org/content/repositories/orgapacheflink-1457 > > > 1.14: > > > https://repository.apache.org/content/repositories/orgapacheflink-1456 > > > [5] > > > 1.11: https://github.com/apache/flink/releases/tag/release-1.11.5-rc1 > > > 1.12: https://github.com/apache/flink/releases/tag/release-1.12.6-rc1 > > > 1.13: https://github.com/apache/flink/releases/tag/release-1.13.4-rc1 > > > 1.14: https://github.com/apache/flink/releases/tag/release-1.14.1-rc1 > > > [6] https://github.com/apache/flink-web/pull/489 > > > > > >
Re: [DISCUSS] FLIP-200: Support Multiple Rule and Dynamic Rule Changing (Flink CEP)
Hi Dian Fu, Thanks for your feedback on the FLIP. Regarding your question about `getLatestRules`: IMO it does not need to be renamed to `getRuleChanges`, because this method is used to get the full set of the latest rules that have been updated at least once. Regarding the CEP.rule method: renaming it to CEP.dynamicPattern would be confusing for users. A dynamic pattern only creates the PatternStream, not the DataStream. Conceptually, a dynamic pattern is still a pattern and does not contain a PatternProcessFunction. If CEP.rule were renamed to CEP.dynamicPattern, the return value of the method could not include the PatternProcessFunction and would only return the PatternStream. I think the difference between a Rule and a Pattern is that a Rule contains the PatternProcessFunction, while a Pattern (or dynamic pattern) does not. Best, Nicholas Jiang
Re: [DISCUSS] Immediate dedicated Flink releases for log4j vulnerability
It would be good if docker images are released too . Prasanna. On Mon, 13 Dec 2021, 16:16 Jing Zhang, wrote: > +1 for the quick release. > > Till Rohrmann 于2021年12月13日周一 17:54写道: > > > +1 > > > > Cheers, > > Till > > > > On Mon, Dec 13, 2021 at 10:42 AM Jing Ge wrote: > > > > > +1 > > > > > > As I suggested to publish the blog post last week asap, users have been > > > keen to have such urgent releases. Many thanks for it. > > > > > > > > > > > > On Mon, Dec 13, 2021 at 8:29 AM Konstantin Knauf > > > wrote: > > > > > > > +1 > > > > > > > > I didn't think this was necessary when I published the blog post on > > > Friday, > > > > but this has made higher waves than I expected over the weekend. > > > > > > > > > > > > > > > > On Mon, Dec 13, 2021 at 8:23 AM Yuan Mei > > wrote: > > > > > > > > > +1 for quick release. > > > > > > > > > > On Mon, Dec 13, 2021 at 2:55 PM Martijn Visser < > > mart...@ververica.com> > > > > > wrote: > > > > > > > > > > > +1 to address the issue like this > > > > > > > > > > > > On Mon, 13 Dec 2021 at 07:46, Jingsong Li < > jingsongl...@gmail.com> > > > > > wrote: > > > > > > > > > > > > > +1 for fixing it in these versions and doing quick releases. > > Looks > > > > good > > > > > > to > > > > > > > me. > > > > > > > > > > > > > > Best, > > > > > > > Jingsong > > > > > > > > > > > > > > On Mon, Dec 13, 2021 at 2:18 PM Becket Qin < > becket@gmail.com > > > > > > > > wrote: > > > > > > > > > > > > > > > > +1. The solution sounds good to me. There have been a lot of > > > > > inquiries > > > > > > > > about how to react to this. > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin > > > > > > > > > > > > > > > > On Mon, Dec 13, 2021 at 12:40 PM Prasanna kumar < > > > > > > > > prasannakumarram...@gmail.com> wrote: > > > > > > > > > > > > > > > > > 1+ for making Updates for 1.12.5 . > > > > > > > > > We are looking for fix in 1.12 version. > > > > > > > > > Please notify once the fix is done. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Dec 13, 2021 at 9:45 AM Leonard Xu < > > xbjt...@gmail.com> > > > > > > wrote: > > > > > > > > > > > > > > > > > > > +1 for the quick release and the special vote period 24h. > > > > > > > > > > > > > > > > > > > > > 2021年12月13日 上午11:49,Dian Fu > 写道: > > > > > > > > > > > > > > > > > > > > > > +1 for the proposal and creating a quick release. > > > > > > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > > > Dian > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Dec 13, 2021 at 11:15 AM Kyle Bendickson < > > > > > > k...@tabular.io> > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > >> +1 to doing a release for this widely publicized > > > > > vulnerability. > > > > > > > > > > >> > > > > > > > > > > >> In my experience, users will often update to the > latest > > > > minor > > > > > > > patch > > > > > > > > > > version > > > > > > > > > > >> without much fuss. Plus, users have also likely heard > > > about > > > > > this > > > > > > > and > > > > > > > > > > will > > > > > > > > > > >> appreciate a simple fix (updating their version where > > > > > possible). > > > > > > > > > > >> > > > > > > > > > > >> The work-around will need to still be noted for users > > who > > > > > can’t > > > > > > > > > upgrade > > > > > > > > > > for > > > > > > > > > > >> whatever reason (EMR hasn’t caught up, etc). > > > > > > > > > > >> > > > > > > > > > > >> I also agree with your assessment to apply a patch on > > each > > > > of > > > > > > > those > > > > > > > > > > >> previous versions with only the log4j commit, so that > > they > > > > > don’t > > > > > > > need > > > > > > > > > > to be > > > > > > > > > > >> as rigorously tested. 
> > > > > > > > > > >> > > > > > > > > > > >> Best, > > > > > > > > > > >> Kyle (GitHub @kbendick) > > > > > > > > > > >> > > > > > > > > > > >> On Sun, Dec 12, 2021 at 2:23 PM Stephan Ewen < > > > > > se...@apache.org> > > > > > > > > > wrote: > > > > > > > > > > >> > > > > > > > > > > >>> Hi all! > > > > > > > > > > >>> > > > > > > > > > > >>> Without doubt, you heard about the log4j > vulnerability > > > [1]. > > > > > > > > > > >>> > > > > > > > > > > >>> There is an advisory blog post on how to mitigate > this > > in > > > > > > Apache > > > > > > > > > Flink > > > > > > > > > > >> [2], > > > > > > > > > > >>> which involves setting a config option and restarting > > the > > > > > > > processes. > > > > > > > > > > That > > > > > > > > > > >>> is fortunately a relatively simple fix. > > > > > > > > > > >>> > > > > > > > > > > >>> Despite this workaround, I think we should do an > > > immediate > > > > > > > release > > > > > > > > > with > > > > > > > > > > >> the > > > > > > > > > > >>> updated dependency. Meaning not waiting for the next > > bug > > > > fix > > > > > > > releases > > > > > > > > > > >>> coming in a few weeks, but releasing asap. > > > > > > > > > > >>> The mood I perceive in the industry is
Re: [VOTE] Release 1.11.5/1.12.6/1.13.4/1.14.1, release candidate #1
+1 (binding) - verified the differences of source releases to the corresponding latest releases, there are only dependency updates and release version update commits - verified versions of log4j dependencies in all the binary releases are 2.15.0 - ran example jobs against all the binary releases, logs look good - release notes and blogpost look good Thanks, Zhu
Re: [VOTE] Release 1.11.5/1.12.6/1.13.4/1.14.1, release candidate #1
+1 (binding) - verified checksum and signature - verified that release candidates only contain the log4j dependency changes compared to previous releases. - release notes and blogpost LGTM Thanks a lot for driving these emergency patch releases, Chesnay! Thank you~ Xintong Song
[jira] [Created] (FLINK-25285) CoGroupedStreams has inner Maps without easy ways to set uid
Daniel Bosnic Hill created FLINK-25285: -- Summary: CoGroupedStreams has inner Maps without easy ways to set uid Key: FLINK-25285 URL: https://issues.apache.org/jira/browse/FLINK-25285 Project: Flink Issue Type: Bug Components: API / Core Affects Versions: 1.14.0 Reporter: Daniel Bosnic Hill I tried to use CoGroupedStreams w/ disableAutoGeneratedUIDs. CoGroupedStreams creates two map operators without the ability to set uids on them. These appear as "Map" in my operator graph. I noticed that the CoGroupedStreams.apply function has two map calls without setting uids. If I try to run with disableAutoGeneratedUIDs, I get the following error: "java.lang.IllegalStateException: Auto generated UIDs have been disabled but no UID or hash has been assigned to operator Map". From the Flink user group email thread "CoGroupedStreams and disableAutoGeneratedUIDs". https://github.com/apache/flink/blob/release-1.14/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/CoGroupedStreams.java#L379 -- This message was sent by Atlassian Jira (v8.20.1#820001)
Re: [VOTE] Release 1.11.5/1.12.6/1.13.4/1.14.1, release candidate #1
I forgot to mention something important: The 1.11/1.12 releases do *NOT* contain flink-python releases for *mac* due to compile problems.
Re: [VOTE] Release 1.11.5/1.12.6/1.13.4/1.14.1, release candidate #1
Update: All jars are now available.
[jira] [Created] (FLINK-25284) Support nulls in DataGen
Sergey Nuyanzin created FLINK-25284: --- Summary: Support nulls in DataGen Key: FLINK-25284 URL: https://issues.apache.org/jira/browse/FLINK-25284 Project: Flink Issue Type: Bug Components: Table SQL / API Reporter: Sergey Nuyanzin Currently it is impossible to specify that some generated values should occasionally be null. It would be nice to have a property, something like {{null-rate}}, telling how often a {{null}} value should be generated. -- This message was sent by Atlassian Jira (v8.20.1#820001)
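To make the proposal in FLINK-25284 concrete, a DataGen table using such a property might look like the sketch below. Note that 'fields.f1.null-rate' is the *hypothetical* option the ticket proposes; it does not exist in the current DataGen connector:

```sql
CREATE TABLE gen_source (
  f0 INT,
  f1 STRING
) WITH (
  'connector' = 'datagen',
  -- hypothetical per-field option proposed by the ticket, not an existing one:
  'fields.f1.null-rate' = '0.1'  -- roughly 10% of generated f1 values would be NULL
);
```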
Re: [VOTE] Release 1.11.5/1.12.6/1.13.4/1.14.1, release candidate #1
+1 (binding) - Verified that commit history is identical to previous release (except dependency upgrade and release version commit) - Verified that the source releases reference updated dependency and binary releases contain updated dependency - Blog post looks good - ran bundled examples against 1.14.1 binary release, worked as expected.
The minimum vote time has > > > been shortened as the changes are minimal and the matter is urgent. > > > It is adopted by majority approval, with at least 3 PMC affirmative > > votes. > > > > > > Thanks, > > > Chesnay > > > > > > [1] > > > 1.11: > > > > > > > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12350327 > > > 1.12: > > > > > > > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12350328 > > > 1.13: > > > > > > > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12350686 > > > 1.14: > > > > > > > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12350512 > > > [2] > > > 1.11: https://dist.apache.org/repos/dist/dev/flink/flink-1.11.5-rc1/ > > > 1.12: https://dist.apache.org/repos/dist/dev/flink/flink-1.12.6-rc1/ > > > 1.13: https://dist.apache.org/repos/dist/dev/flink/flink-1.13.4-rc1/ > > > 1.14: https://dist.apache.org/repos/dist/dev/flink/flink-1.14.1-rc1/ > > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS > > > [4] > > > 1.11/1.12: > > > https://repository.apache.org/content/repositories/orgapacheflink-1455 > > > 1.13: > > > https://repository.apache.org/content/repositories/orgapacheflink-1457 > > > 1.14: > > > https://repository.apache.org/content/repositories/orgapacheflink-1456 > > > [5] > > > 1.11: https://github.com/apache/flink/releases/tag/release-1.11.5-rc1 > > > 1.12: https://github.com/apache/flink/releases/tag/release-1.12.6-rc1 > > > 1.13: https://github.com/apache/flink/releases/tag/release-1.13.4-rc1 > > > 1.14: https://github.com/apache/flink/releases/tag/release-1.14.1-rc1 > > > [6] https://github.com/apache/flink-web/pull/489 > > > > > >
Re: [VOTE] Release 1.11.5/1.12.6/1.13.4/1.14.1, release candidate #1
+1 (non-binding) - Checked Log4J version and updated license preambles on all releases - Verified signatures on sources - Reviewed blog post Seth On Mon, Dec 13, 2021 at 1:42 PM Jing Ge wrote: > +1 LGTM. Many thanks for your effort! > > On Mon, Dec 13, 2021 at 8:28 PM Chesnay Schepler > wrote: > > > Hi everyone, > > > > This vote is for the emergency patch releases for 1.11, 1.12, 1.13 and > > 1.14 to address CVE-2021-44228. > > It covers all 4 releases as they contain the same changes (upgrading > > Log4j to 2.15.0) and were prepared simultaneously by the same person. > > (Hence, if something is broken, it likely applies to all releases) > > > > Please review and vote on the release candidate #1 for the versions > > 1.11.5, 1.12.6, 1.13.4 and 1.14.1, as follows: > > [ ] +1, Approve the releases > > [ ] -1, Do not approve the releases (please provide specific comments) > > > > The complete staging area is available for your review, which includes: > > * JIRA release notes [1], > > * the official Apache source releases and binary convenience releases to > > be deployed to dist.apache.org [2], which are signed with the key with > > fingerprint C2EED7B111D464BA [3], > > * all artifacts to be deployed to the Maven Central Repository [4], > > * *the jars for 1.13/1.14 are still being built* > > * source code tags [5], > > * website pull request listing the new releases and adding announcement > > blog post [6]. > > > > The vote will be open for at least 24 hours. The minimum vote time has > > been shortened as the changes are minimal and the matter is urgent. > > It is adopted by majority approval, with at least 3 PMC affirmative > votes. 
> > > > Thanks, > > Chesnay > > > > [1] > > 1.11: > > > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12350327 > > 1.12: > > > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12350328 > > 1.13: > > > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12350686 > > 1.14: > > > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12350512 > > [2] > > 1.11: https://dist.apache.org/repos/dist/dev/flink/flink-1.11.5-rc1/ > > 1.12: https://dist.apache.org/repos/dist/dev/flink/flink-1.12.6-rc1/ > > 1.13: https://dist.apache.org/repos/dist/dev/flink/flink-1.13.4-rc1/ > > 1.14: https://dist.apache.org/repos/dist/dev/flink/flink-1.14.1-rc1/ > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS > > [4] > > 1.11/1.12: > > https://repository.apache.org/content/repositories/orgapacheflink-1455 > > 1.13: > > https://repository.apache.org/content/repositories/orgapacheflink-1457 > > 1.14: > > https://repository.apache.org/content/repositories/orgapacheflink-1456 > > [5] > > 1.11: https://github.com/apache/flink/releases/tag/release-1.11.5-rc1 > > 1.12: https://github.com/apache/flink/releases/tag/release-1.12.6-rc1 > > 1.13: https://github.com/apache/flink/releases/tag/release-1.13.4-rc1 > > 1.14: https://github.com/apache/flink/releases/tag/release-1.14.1-rc1 > > [6] https://github.com/apache/flink-web/pull/489 > > >
[jira] [Created] (FLINK-25283) End-to-end application modules create oversized jars
Chesnay Schepler created FLINK-25283: Summary: End-to-end application modules create oversized jars Key: FLINK-25283 URL: https://issues.apache.org/jira/browse/FLINK-25283 Project: Flink Issue Type: Technical Debt Components: Tests Affects Versions: 1.13.0 Reporter: Chesnay Schepler Fix For: 1.15.0 Various modules that create jars for e2e tests (e.g., flink-streaming-kinesis-test) create oversized jars (100mb+) because they bundle their entire dependency tree, including many parts of Flink and even test resources. -- This message was sent by Atlassian Jira (v8.20.1#820001)
Re: [VOTE] Release 1.11.5/1.12.6/1.13.4/1.14.1, release candidate #1
+1 LGTM. Many thanks for your effort! On Mon, Dec 13, 2021 at 8:28 PM Chesnay Schepler wrote: > Hi everyone, > > This vote is for the emergency patch releases for 1.11, 1.12, 1.13 and > 1.14 to address CVE-2021-44228. > It covers all 4 releases as they contain the same changes (upgrading > Log4j to 2.15.0) and were prepared simultaneously by the same person. > (Hence, if something is broken, it likely applies to all releases) > > Please review and vote on the release candidate #1 for the versions > 1.11.5, 1.12.6, 1.13.4 and 1.14.1, as follows: > [ ] +1, Approve the releases > [ ] -1, Do not approve the releases (please provide specific comments) > > The complete staging area is available for your review, which includes: > * JIRA release notes [1], > * the official Apache source releases and binary convenience releases to > be deployed to dist.apache.org [2], which are signed with the key with > fingerprint C2EED7B111D464BA [3], > * all artifacts to be deployed to the Maven Central Repository [4], > * *the jars for 1.13/1.14 are still being built* > * source code tags [5], > * website pull request listing the new releases and adding announcement > blog post [6]. > > The vote will be open for at least 24 hours. The minimum vote time has > been shortened as the changes are minimal and the matter is urgent. > It is adopted by majority approval, with at least 3 PMC affirmative votes. 
> > Thanks, > Chesnay > > [1] > 1.11: > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12350327 > 1.12: > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12350328 > 1.13: > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12350686 > 1.14: > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12350512 > [2] > 1.11: https://dist.apache.org/repos/dist/dev/flink/flink-1.11.5-rc1/ > 1.12: https://dist.apache.org/repos/dist/dev/flink/flink-1.12.6-rc1/ > 1.13: https://dist.apache.org/repos/dist/dev/flink/flink-1.13.4-rc1/ > 1.14: https://dist.apache.org/repos/dist/dev/flink/flink-1.14.1-rc1/ > [3] https://dist.apache.org/repos/dist/release/flink/KEYS > [4] > 1.11/1.12: > https://repository.apache.org/content/repositories/orgapacheflink-1455 > 1.13: > https://repository.apache.org/content/repositories/orgapacheflink-1457 > 1.14: > https://repository.apache.org/content/repositories/orgapacheflink-1456 > [5] > 1.11: https://github.com/apache/flink/releases/tag/release-1.11.5-rc1 > 1.12: https://github.com/apache/flink/releases/tag/release-1.12.6-rc1 > 1.13: https://github.com/apache/flink/releases/tag/release-1.13.4-rc1 > 1.14: https://github.com/apache/flink/releases/tag/release-1.14.1-rc1 > [6] https://github.com/apache/flink-web/pull/489 >
[VOTE] Release 1.11.5/1.12.6/1.13.4/1.14.1, release candidate #1
Hi everyone, This vote is for the emergency patch releases for 1.11, 1.12, 1.13 and 1.14 to address CVE-2021-44228. It covers all 4 releases as they contain the same changes (upgrading Log4j to 2.15.0) and were prepared simultaneously by the same person. (Hence, if something is broken, it likely applies to all releases) Please review and vote on the release candidate #1 for the versions 1.11.5, 1.12.6, 1.13.4 and 1.14.1, as follows: [ ] +1, Approve the releases [ ] -1, Do not approve the releases (please provide specific comments) The complete staging area is available for your review, which includes: * JIRA release notes [1], * the official Apache source releases and binary convenience releases to be deployed to dist.apache.org [2], which are signed with the key with fingerprint C2EED7B111D464BA [3], * all artifacts to be deployed to the Maven Central Repository [4], * *the jars for 1.13/1.14 are still being built* * source code tags [5], * website pull request listing the new releases and adding announcement blog post [6]. The vote will be open for at least 24 hours. The minimum vote time has been shortened as the changes are minimal and the matter is urgent. It is adopted by majority approval, with at least 3 PMC affirmative votes. 
Thanks, Chesnay [1] 1.11: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12350327 1.12: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12350328 1.13: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12350686 1.14: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12350512 [2] 1.11: https://dist.apache.org/repos/dist/dev/flink/flink-1.11.5-rc1/ 1.12: https://dist.apache.org/repos/dist/dev/flink/flink-1.12.6-rc1/ 1.13: https://dist.apache.org/repos/dist/dev/flink/flink-1.13.4-rc1/ 1.14: https://dist.apache.org/repos/dist/dev/flink/flink-1.14.1-rc1/ [3] https://dist.apache.org/repos/dist/release/flink/KEYS [4] 1.11/1.12: https://repository.apache.org/content/repositories/orgapacheflink-1455 1.13: https://repository.apache.org/content/repositories/orgapacheflink-1457 1.14: https://repository.apache.org/content/repositories/orgapacheflink-1456 [5] 1.11: https://github.com/apache/flink/releases/tag/release-1.11.5-rc1 1.12: https://github.com/apache/flink/releases/tag/release-1.12.6-rc1 1.13: https://github.com/apache/flink/releases/tag/release-1.13.4-rc1 1.14: https://github.com/apache/flink/releases/tag/release-1.14.1-rc1 [6] https://github.com/apache/flink-web/pull/489
Re: [DISCUSS] FLIP-190: Support Version Upgrades for Table API & SQL Programs
@Timo Walther > But the question is why a user wants to run COMPILE multiple times. If > it is during development, then running EXECUTE (or just the statement > itself) without calling COMPILE should be sufficient. The file can also > manually be deleted if necessary. Sorry for the delayed response, yep its sounds like not necessary, and if one, for any reason needs to force compile only specific plans, the config option can be used. Thank you, Marios On Thu, Dec 2, 2021 at 5:42 PM Timo Walther wrote: > Response to Marios's feedback: > > > there should be some good logging in place when the upgrade is taking > place > > Yes, I agree. I added this part to the FLIP. > > > config option instead that doesn't provide the flexibility to > overwrite certain plans > > One can set the config option also around sections of the > multi-statement SQL script. > > SET 'table.plan.force-recompile'='true'; > > COMPILE ... > > SET 'table.plan.force-recompile'='false'; > > But the question is why a user wants to run COMPILE multiple times. If > it is during development, then running EXECUTE (or just the statement > itself) without calling COMPILE should be sufficient. The file can also > manually be deleted if necessary. > > What do you think? > > Regards, > Timo > > > > On 02.12.21 16:09, Timo Walther wrote: > > Hi Till, > > > > Yes, you might have to. But not a new plan from the SQL query but a > > migration from the old plan to the new plan. This will not happen often. > > But we need a way to evolve the format of the JSON plan itself. > > > > Maybe this confuses a bit, so let me clarify it again: Mostly ExecNode > > versions and operator state layouts will evolve. Not the plan files, > > those will be pretty stable. But also not infinitely. > > > > Regards, > > Timo > > > > > > On 02.12.21 16:01, Till Rohrmann wrote: > >> Then for migrating from Flink 1.10 to 1.12, I might have to create a new > >> plan using Flink 1.11 in order to migrate from Flink 1.11 to 1.12, > right? 
> >> > >> Cheers, > >> Till > >> > >> On Thu, Dec 2, 2021 at 3:39 PM Timo Walther wrote: > >> > >>> Response to Till's feedback: > >>> > >>> > compiled plan won't be changed after being written initially > >>> > >>> This is not entirely correct. We give guarantees for keeping the query > >>> up and running. We reserve us the right to force plan migrations. In > >>> this case, the plan might not be created from the SQL statement but > from > >>> the old plan. I have added an example in section 10.1.1. In general, > >>> both persisted entities "plan" and "savepoint" can evolve independently > >>> from each other. > >>> > >>> Thanks, > >>> Timo > >>> > >>> On 02.12.21 15:10, Timo Walther wrote: > Response to Godfrey's feedback: > > > "EXPLAIN PLAN EXECUTE STATEMENT SET BEGIN ... END" is missing. > > Thanks for the hint. I added a dedicated section 7.1.3. > > > > it's hard to maintain the supported versions for > "supportedPlanChanges" and "supportedSavepointChanges" > > Actually, I think we are mostly on the same page. > > The annotation does not need to be updated for every Flink version. As > the name suggests it is about "Changes" (in other words: > incompatibilities) that require some kind of migration. Either plan > migration (= PlanChanges) or savepoint migration (=SavepointChanges, > using operator migration or savepoint migration). > > Let's assume we introduced two ExecNodes A and B in Flink 1.15. > > The annotations are: > > @ExecNodeMetadata(name=A, supportedPlanChanges=1.15, > supportedSavepointChanges=1.15) > > @ExecNodeMetadata(name=B, supportedPlanChanges=1.15, > supportedSavepointChanges=1.15) > > We change an operator state of B in Flink 1.16. > > We perform the change in the operator of B in a way to support both > state layouts. Thus, no need for a new ExecNode version. 
> > The annotations in 1.16 are: > > @ExecNodeMetadata(name=A, supportedPlanChanges=1.15, > supportedSavepointChanges=1.15) > > @ExecNodeMetadata(name=B, supportedPlanChanges=1.15, > supportedSavepointChanges=1.15, 1.16) > > So the versions in the annotations are "start version"s. > > I don't think we need end versions? End version would mean that we > drop > the ExecNode from the code base? > > Please check the section 10.1.1 again. I added a more complex example. > > > Thanks, > Timo > > > > On 01.12.21 16:29, Timo Walther wrote: > > Response to Francesco's feedback: > > > > > *Proposed changes #6*: Other than defining this rule of thumb, we > > must also make sure that compiling plans with these objects that > > cannot be serialized in the plan must fail hard > > > > Yes, I totally agree. We will fail hard with a helpful exception. Any > > mistake
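The "start version" annotation scheme Timo describes above can be sketched in plain Java. The attribute names and types below are assumptions for illustration only, not the final FLIP-190 API:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Arrays;

// Hypothetical shape of the ExecNode metadata annotation discussed above.
// Attribute names and types are illustrative assumptions.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface ExecNodeMetadata {
    String name();
    // "Start versions" of incompatible changes; no end version is needed
    // unless the ExecNode is dropped from the code base entirely.
    String[] supportedPlanChanges();
    String[] supportedSavepointChanges();
}

// ExecNode B after the 1.16 state-layout change: the operator supports both
// layouts, so only a new savepoint-change start version is appended.
@ExecNodeMetadata(
        name = "B",
        supportedPlanChanges = {"1.15"},
        supportedSavepointChanges = {"1.15", "1.16"})
class ExecNodeB {}

class Main {
    public static void main(String[] args) {
        ExecNodeMetadata meta = ExecNodeB.class.getAnnotation(ExecNodeMetadata.class);
        System.out.println(meta.name()
                + ": plan=" + Arrays.toString(meta.supportedPlanChanges())
                + ", savepoint=" + Arrays.toString(meta.supportedSavepointChanges()));
    }
}
```

The point of the sketch is that upgrading Flink does not require touching the annotation; a new entry is appended only when a change actually breaks plan or savepoint compatibility.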
Re: [DISCUSS] FLIP-191: Extend unified Sink interface to support small file compaction
Hi all, After a lot of discussions, we received very fruitful feedback and reworked the ideas behind this FLIP. Initially, we had the impression that the compaction problem could be solved by a single topology reusable across different sinks. We now have a better understanding that different external systems require different compaction mechanisms, e.g., Hive requires compaction before the file is finally registered in the metastore, while Iceberg registers the files first and compacts them lazily afterwards. Considering all these different views, we came up with a design that builds upon what @guowei@gmail.com and @yungao...@aliyun.com proposed at the beginning: we allow inserting custom topologies before and after the SinkWriters and Committers. We do not see it as a downside that the Sink interfaces exposing the DataStream reside in flink-streaming-java, in contrast to the basic Sink interfaces that reside in flink-core, since we deem these custom topologies to be used only by expert users. Moreover, we also wanted to remove the global committer from the unified Sink interfaces and replace it with a custom post-commit topology. Unfortunately, we cannot do that without breaking the Sink interface, since the GlobalCommittables are part of the parameterized Sink interface. Thus, we propose building a new Sink V2 interface consisting of composable interfaces that no longer offer the GlobalCommitter. We will implement a utility to extend a Sink with a post-commit topology that mimics the behavior of the GlobalCommitter. The new Sink V2 provides the same sort of methods as the Sink V1 interface, so migrating sinks that do not use the GlobalCommitter should be very easy. We plan to keep the existing Sink V1 interfaces so as not to break externally built sinks. As part of this FLIP, we migrate all the connectors inside the main repository to the new Sink V2 API. The FLIP document is also updated and includes the proposed changes. 
Looking forward to your feedback, Fabian https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction On Thu, Dec 2, 2021 at 10:15 AM Roman Khachatryan wrote: > > Thanks for clarifying (I was initially confused by merging state files > rather than output files). > > > At some point, Flink will definitely have some WAL adapter that can turn > > any sink into an exactly-once sink (with some caveats). For now, we keep > > that as an orthogonal solution as it has a rather high price (bursty > > workload with high latency). Ideally, we can keep the compaction > > asynchronously... > > Yes, that would be something like a WAL. I agree that it would have a > different set of trade-offs. > > > Regards, > Roman > > On Mon, Nov 29, 2021 at 3:33 PM Arvid Heise wrote: > >> > >> > One way to avoid write-read-merge is by wrapping SinkWriter with > >> > another one, which would buffer input elements in a temporary storage > >> > (e.g. local file) until a threshold is reached; after that, it would > >> > invoke the original SinkWriter. And if a checkpoint barrier comes in > >> > earlier, it would send written data to some aggregator. > >> > >> I think perhaps this seems to be a kind of WAL method? Namely we first > >> write the elements to some WAL logs and persist them on checkpoint > >> (in snapshot or remote FS), or we directly write WAL logs to the remote > >> FS eagerly. > >> > > At some point, Flink will definitely have some WAL adapter that can turn > > any sink into an exactly-once sink (with some caveats). For now, we keep > > that as an orthogonal solution as it has a rather high price (bursty > > workload with high latency). Ideally, we can keep the compaction > > asynchronously... 
> > > > On Mon, Nov 29, 2021 at 8:52 AM Yun Gao > > wrote: > >> > >> Hi, > >> > >> @Roman very sorry for the late response for a long time, > >> > >> > Merging artifacts from multiple checkpoints would apparently > >> require multiple concurrent checkpoints > >> > >> I think it might not need concurrent checkpoints: suppose some > >> operators (like the committer aggregator in the option 2) maintains > >> the list of files to merge, it could stores the lists of files to merge > >> in the states, then after several checkpoints are done and we have > >> enough files, we could merge all the files in the list. > >> > >> > Asynchronous merging in an aggregator would require some resolution > >> > logic on recovery, so that a merged artifact can be used if the > >> > original one was deleted. Otherwise, wouldn't recovery fail because > >> > some artifacts are missing? > >> > We could also defer deletion until the "compacted" checkpoint is > >> > subsumed - but isn't it too late, as it will be deleted anyways once > >> > subsumed? > >> > >> I think logically we could delete the original files once the "compacted" > >> checkpoint > >> (which finish merging the compacted files and record it in the
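A minimal sketch of the composable Sink V2 idea Fabian describes above: a plain Sink is only a writer factory, and committing becomes an optional mix-in interface rather than a GlobalCommitter type parameter. All interface and method names here are illustrative assumptions, not the actual FLIP-191 / Flink API:

```java
import java.util.ArrayList;
import java.util.List;

// Committables are handed over at checkpoint time via prepareCommit().
interface SinkWriter<InputT, CommT> {
    void write(InputT element);
    List<CommT> prepareCommit();
}

interface Committer<CommT> {
    void commit(List<CommT> committables);
}

interface Sink<InputT> {
    SinkWriter<InputT, ?> createWriter();
}

// Sinks that need a commit phase implement this mix-in as well.
interface TwoPhaseCommittingSink<InputT, CommT> extends Sink<InputT> {
    @Override
    SinkWriter<InputT, CommT> createWriter();
    Committer<CommT> createCommitter();
}

// Toy in-memory sink demonstrating the write -> prepareCommit -> commit flow.
class InMemorySink implements TwoPhaseCommittingSink<String, String> {
    final List<String> committed = new ArrayList<>();

    @Override
    public SinkWriter<String, String> createWriter() {
        return new SinkWriter<String, String>() {
            private final List<String> buffer = new ArrayList<>();
            @Override
            public void write(String element) { buffer.add(element); }
            @Override
            public List<String> prepareCommit() {
                List<String> out = new ArrayList<>(buffer);
                buffer.clear();
                return out;
            }
        };
    }

    @Override
    public Committer<String> createCommitter() {
        return committed::addAll;
    }
}

class Main {
    public static void main(String[] args) {
        InMemorySink sink = new InMemorySink();
        SinkWriter<String, String> writer = sink.createWriter();
        writer.write("part-0");
        writer.write("part-1");
        sink.createCommitter().commit(writer.prepareCommit());
        System.out.println(sink.committed);
    }
}
```

Because the commit phase is a separate, optional interface, a GlobalCommitter-style step can be appended as a custom post-commit topology instead of being baked into every sink's type signature.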
[jira] [Created] (FLINK-25282) Move runtime dependencies from table-planner to table-runtime
Francesco Guardiani created FLINK-25282: --- Summary: Move runtime dependencies from table-planner to table-runtime Key: FLINK-25282 URL: https://issues.apache.org/jira/browse/FLINK-25282 Project: Flink Issue Type: Sub-task Reporter: Francesco Guardiani Assignee: Francesco Guardiani There are several runtime dependencies (e.g. functions used in codegen) that are shipped by table-planner and calcite-core. We should move these dependencies to table-runtime. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-25281) Azure failed due to python tests "tox check failed"
Yun Gao created FLINK-25281: --- Summary: Azure failed due to python tests "tox check failed" Key: FLINK-25281 URL: https://issues.apache.org/jira/browse/FLINK-25281 Project: Flink Issue Type: Bug Components: API / Python, Build System / Azure Pipelines Affects Versions: 1.14.0 Reporter: Yun Gao {code:java} Dec 13 03:03:08 pip_test_code.py success! Dec 13 03:03:09 ___ summary Dec 13 03:03:09 py36-cython: commands succeeded Dec 13 03:03:09 ERROR: py37-cython: commands failed Dec 13 03:03:09 py38-cython: commands succeeded Dec 13 03:03:09 tox checks... [FAILED] Dec 13 03:03:09 Process exited with EXIT CODE: 1. Dec 13 03:03:09 Trying to KILL watchdog (2760). /__w/1/s/tools/ci/watchdog.sh: line 100: 2760 Terminated watchdog Dec 13 03:03:09 Searching for .dump, .dumpstream and related files in '/__w/1/s' The STDIO streams did not close within 10 seconds of the exit event from process '/bin/bash'. This may indicate a child process inherited the STDIO streams and has not yet exited. ##[error]Bash exited with code '1'. Finishing: Test - python {code} https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=28010=logs=bdd9ea51-4de2-506a-d4d9-f3930e4d2355=dd50312f-73b5-56b5-c172-4d81d03e2ef1=24236 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-25280) KafkaPartitionSplitReaderTest failed on azure due to Offsets out of range with no configured reset policy for partitions
Yun Gao created FLINK-25280: --- Summary: KafkaPartitionSplitReaderTest failed on azure due to Offsets out of range with no configured reset policy for partitions Key: FLINK-25280 URL: https://issues.apache.org/jira/browse/FLINK-25280 Project: Flink Issue Type: Bug Components: Connectors / Kafka Affects Versions: 1.14.0 Reporter: Yun Gao {code:java} Dec 13 03:30:12 at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:84) Dec 13 03:30:12 at java.util.ArrayList.forEach(ArrayList.java:1259) Dec 13 03:30:12 at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:38) Dec 13 03:30:12 at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$5(NodeTestTask.java:143) Dec 13 03:30:12 at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) Dec 13 03:30:12 at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$7(NodeTestTask.java:129) Dec 13 03:30:12 at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137) Dec 13 03:30:12 at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:127) Dec 13 03:30:12 at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) Dec 13 03:30:12 at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:126) Dec 13 03:30:12 at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:84) Dec 13 03:30:12 at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.submit(SameThreadHierarchicalTestExecutorService.java:32) Dec 13 03:30:12 at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:57) Dec 13 03:30:12 at 
org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:51) Dec 13 03:30:12 at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:220) Dec 13 03:30:12 at org.junit.platform.launcher.core.DefaultLauncher.lambda$execute$6(DefaultLauncher.java:188) Dec 13 03:30:12 at org.junit.platform.launcher.core.DefaultLauncher.withInterceptedStreams(DefaultLauncher.java:202) Dec 13 03:30:12 at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:181) Dec 13 03:30:12 at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:128) Dec 13 03:30:12 at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invokeAllTests(JUnitPlatformProvider.java:150) Dec 13 03:30:12 at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invoke(JUnitPlatformProvider.java:124) Dec 13 03:30:12 at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) Dec 13 03:30:12 at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) Dec 13 03:30:12 at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) Dec 13 03:30:12 at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) Dec 13 03:30:12 {code} [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=28010=logs=c5612577-f1f7-5977-6ff6-7432788526f7=ffa8837a-b445-534e-cdf4-db364cf8235d=7010] testNumBytesInCounter and testPendingRecordsGauge also failed. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-25279) KafkaSourceLegacyITCase.testKeyValueSupport failed on azure due to topic already exists
Yun Gao created FLINK-25279: --- Summary: KafkaSourceLegacyITCase.testKeyValueSupport failed on azure due to topic already exists Key: FLINK-25279 URL: https://issues.apache.org/jira/browse/FLINK-25279 Project: Flink Issue Type: Bug Components: Connectors / Kafka Affects Versions: 1.15.0 Reporter: Yun Gao {code:java} Dec 12 03:00:12 [ERROR] Failures: Dec 12 03:00:12 [ERROR] KafkaSourceLegacyITCase.testKeyValueSupport:58->KafkaConsumerTestBase.runKeyValueTest:1528->KafkaTestBase.createTestTopic:222 Create test topic : keyvaluetest failed, org.apache.kafka.common.errors.TopicExistsException: Topic 'keyvaluetest' already exists. Dec 12 03:00:12 [INFO] Dec 12 03:00:12 [ERROR] Tests run: 177, Failures: 1, Errors: 0, Skipped: 0 {code} https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=27995&view=logs&j=72d4811f-9f0d-5fd0-014a-0bc26b72b642&t=e424005a-b16e-540f-196d-da062cc19bdf&l=35697 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-25278) Azure failed due to unable to transfer jar from confluent maven repo
Yun Gao created FLINK-25278: --- Summary: Azure failed due to unable to transfer jar from confluent maven repo Key: FLINK-25278 URL: https://issues.apache.org/jira/browse/FLINK-25278 Project: Flink Issue Type: Bug Components: Build System / Azure Pipelines Affects Versions: 1.13.3 Reporter: Yun Gao {code:java} Dec 12 00:46:45 [ERROR] Failed to execute goal on project flink-avro-confluent-registry: Could not resolve dependencies for project org.apache.flink:flink-avro-confluent-registry:jar:1.13-SNAPSHOT: Could not transfer artifact io.confluent:common-utils:jar:5.5.2 from/to confluent (https://packages.confluent.io/maven/): transfer failed for https://packages.confluent.io/maven/io/confluent/common-utils/5.5.2/common-utils-5.5.2.jar: Connection reset -> [Help 1] Dec 12 00:46:45 [ERROR] Dec 12 00:46:45 [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. Dec 12 00:46:45 [ERROR] Re-run Maven using the -X switch to enable full debug logging. Dec 12 00:46:45 [ERROR] Dec 12 00:46:45 [ERROR] For more information about the errors and possible solutions, please read the following articles: Dec 12 00:46:45 [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException Dec 12 00:46:45 [ERROR] Dec 12 00:46:45 [ERROR] After correcting the problems, you can resume the build with the command Dec 12 00:46:45 [ERROR] mvn -rf :flink-avro-confluent-registry {code} https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=27994=logs=a5ef94ef-68c2-57fd-3794-dc108ed1c495=9c1ddabe-d186-5a2c-5fcc-f3cafb3ec699=8812 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-25277) Introduce explicit shutdown signalling between TaskManager and JobManager
Niklas Semmler created FLINK-25277: -- Summary: Introduce explicit shutdown signalling between TaskManager and JobManager Key: FLINK-25277 URL: https://issues.apache.org/jira/browse/FLINK-25277 Project: Flink Issue Type: Improvement Components: Runtime / Coordination Affects Versions: 1.14.0, 1.13.0 Reporter: Niklas Semmler Fix For: 1.15.0 We need to introduce shutdown signalling between TaskManager and JobManager for a fast & graceful shutdown in reactive scheduler mode. In Flink 1.14 and earlier versions, the JobManager tracks the availability of a TaskManager using a heartbeat. This heartbeat is by default configured with an interval of 10 seconds and a timeout of 50 seconds [1]. Hence, the shutdown of a TaskManager is recognized only after about 50-60 seconds. This works fine for the static scheduling mode, where a TaskManager only disappears as part of a cluster shutdown or a job failure. However, in the reactive scheduler mode (FLINK-10407), TaskManagers are regularly added to and removed from a running job. Here, the heartbeat mechanism incurs additional delays. To remove these delays, we add an explicit shutdown signal from the TaskManager to the JobManager. Additionally, to avoid data loss in a running job, the TaskManager will wait for a shutdown confirmation from the JobManager before shutting down. [1] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#heartbeat-timeout -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-25276) Support native and incremental savepoints
Piotr Nowojski created FLINK-25276: -- Summary: Support native and incremental savepoints Key: FLINK-25276 URL: https://issues.apache.org/jira/browse/FLINK-25276 Project: Flink Issue Type: New Feature Reporter: Piotr Nowojski Motivation. Currently with non incremental canonical format savepoints, with very large state, both taking and recovery from savepoints can take very long time. Providing options to take native format and incremental savepoint would alleviate this problem. In the past the main challenge lied in the ownership semantic and files clean up of such incremental savepoints. However with FLINK-25154 implemented some of those concerns can be solved. Incremental savepoint could leverage "force full snapshot" mode provided by FLINK-25192, to duplicate/copy all of the savepoint files out of the Flink's ownership scope. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-25275) Weighted KeyGroup assignment
Piotr Nowojski created FLINK-25275: -- Summary: Weighted KeyGroup assignment Key: FLINK-25275 URL: https://issues.apache.org/jira/browse/FLINK-25275 Project: Flink Issue Type: New Feature Components: Runtime / Network Affects Versions: 1.14.0 Reporter: Piotr Nowojski Currently key groups are split into key group ranges naively in the simplest way. Key groups are split into equally sized continuous ranges (number of ranges = parallelism = number of keygroups / size of single keygroup). Flink could avoid data skew between key groups, by assigning them to tasks based on their "weight". "Weight" could be defined as frequency of an access for the given key group. Arbitrary, non-continuous, key group assignment (for example TM1 is processing kg1 and kg3 while TM2 is processing only kg2) would require extensive changes to the state backends for example. However the data skew could be mitigated to some extent by creating key group ranges in a more clever way, while keeping the key group range continuity. For example TM1 processes range [kg1, kg9], while TM2 just [kg10, kg11]. [This branch shows a PoC of such approach.|https://github.com/pnowojski/flink/commits/antiskew] -- This message was sent by Atlassian Jira (v8.20.1#820001)
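The "more clever but still contiguous" range splitting described in the ticket can be sketched as a greedy heuristic: each range grows until it reaches the average remaining weight, and the last range takes whatever is left. This is an illustrative PoC-style sketch under those assumptions, not Flink's actual key-group assignment code:

```java
import java.util.Arrays;

class Main {
    // Split `weights.length` contiguous key groups into `parallelism`
    // contiguous ranges [start, end] (inclusive) with roughly equal weight.
    static int[][] assignRanges(double[] weights, int parallelism) {
        int n = weights.length;
        double remaining = Arrays.stream(weights).sum();
        int[][] ranges = new int[parallelism][2];
        int start = 0;
        for (int r = 0; r < parallelism; r++) {
            double target = remaining / (parallelism - r);
            int end = start;
            double acc = weights[start];
            // Grow while under target, but leave >= 1 key group per later range.
            while (acc < target && end < n - (parallelism - r)) {
                end++;
                acc += weights[end];
            }
            if (r == parallelism - 1) {
                end = n - 1; // last range takes the rest
            }
            ranges[r] = new int[]{start, end};
            remaining -= acc;
            start = end + 1;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // 12 key groups; the last three are "hot" (weight 9 vs. 1).
        double[] w = {1, 1, 1, 1, 1, 1, 1, 1, 1, 9, 9, 9};
        for (int[] range : assignRanges(w, 2)) {
            System.out.println(Arrays.toString(range));
        }
    }
}
```

With uniform weights this degenerates to today's equal-sized ranges; with skewed weights it produces the kind of uneven split the ticket mentions (e.g. one task gets key groups 0-9, the other only the two hot ones), while keeping range continuity so state backends need no changes.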
Re: [DISCUSS] FLIP-200: Support Multiple Rule and Dynamic Rule Changing (Flink CEP)
Thanks Yunfeng for bringing up this discussion. I have seen many users asking for this feature in the past, so a big +1 for this proposal. Regarding this FLIP, I have the following question: - Does it make sense to rename the method `getLatestRules` to `getRuleChanges`? I assume we want to know the change history of the rules, not only the latest status. Regarding Table API & SQL support: - Generally I agree that we should consider both Table API and DataStream API for new features. It would avoid introducing changes which then have to be changed again immediately. However, I think it depends. For this feature, I doubt that it could realistically be supported in SQL. There are so many things which may be dynamically changed, e.g. the rule definitions, the partitioning columns, etc. If we want to change them dynamically, we may need to make a lot of extensions to the MATCH_RECOGNIZE statement, which comes from the SQL standard. Personally, I tend to limit the scope of dynamically changing patterns to the DataStream API only, at least for now. Regarding CEP.rule vs CEP.dynamicPatterns: - I'm also slightly in favor of CEP.dynamicPatterns, which is more descriptive. Personally, I think it's difficult to tell the difference between CEP.rule and CEP.pattern unless users read the documentation carefully. Regards, Dian On Mon, Dec 13, 2021 at 5:41 PM Nicholas Jiang wrote: > Hi Konstantin, > >About the renaming for the Rule, I mean that the difference between the > Rule and Pattern is that the Rule not only contains the Pattern, but also > how to match the Pattern, and how to process after matched. If renaming > DynamicPattern, I'm concerned that this name couldn't represent how to > match the Pattern, and how to process after matched but the Rule could > explain this. Therefore I prefer to rename the Rule not the DynamicPattern. 
> > Best, > Nicholas Jiang > > > On 2021/12/13 08:23:23 Konstantin Knauf wrote: > > Hi Nicholas, > > > > I am not sure I understand your question about renaming. I think the most > > important member of the current Rule class is the Pattern, the > KeySelector > > and PatternProcessFunction are more auxiliary if you will. That's why, I > > think, it would be ok to rename Rule to DynamicPatternHolder although it > > contains more than just a Pattern. > > > > Cheers, > > > > Konstantin > > > > On Mon, Dec 13, 2021 at 9:16 AM Nicholas Jiang > > > wrote: > > > > > Hi Konstantin, > > > > > >Thanks for your feedback. The point that add a timestamp to each > rule > > > that determines the start time from which the rule makes sense to me. > At > > > present, The timestamp is current time at default, so no timestamp > field > > > represents the start time from which the rule. And about the renaming > rule, > > > your suggestion looks good to me and no any new concept introduces. But > > > does this introduce Rule concept or reuse the Pattern concept for the > > > DynamicPattern renaming? > > > > > > Best, > > > Nicholas Jiang > > > > > > On 2021/12/13 07:45:04 Konstantin Knauf wrote: > > > > Thanks, Yufeng, for starting this discussion. I think this will be a > very > > > > popular feature. I've seen a lot of users asking for this in the > past. > > > So, > > > > generally big +1. > > > > > > > > I think we should have a rough idea on how to expose this feature in > the > > > > other APIs. > > > > > > > > Two ideas: > > > > > > > > 1. In order to make this more deterministic in case of reprocessing > and > > > > out-of-orderness, I am wondering if we can add a timestamp to each > rule > > > > that determines the start time from which the rule should be in > effect. > > > > This can be an event or a processing time depending on the > > > characteristics > > > > of the pipeline. 
The timestamp would default to Long.MIN_TIMESTAMP > if not > > > > provided, which means effectively immediately. This could also be a > > > follow > > > > up, if you think it will make the implementation too complicated > > > initially. > > > > > > > > 2. I am wondering, if we should name Rule->DynamicPatternHolder or > so and > > > > CEP.rule-> CEP.dynamicPatterns instead (other classes > correspondingly)? > > > > Rule is quite ambiguous and DynamicPattern seems more descriptive to > me. > > > > > > > > On Mon, Dec 13, 2021 at 4:30 AM Nicholas Jiang < > nicholasji...@apache.org > > > > > > > > wrote: > > > > > > > > > Hi Martijn, > > > > > > > > > >IMO, in this FLIP, we only need to introduce the general design > of > > > the > > > > > Table API/SQL level. As for the design details, you can create a > new > > > FLIP. > > > > > And do we need to take into account the support for Batch mode if > you > > > > > expand the MATCH_RECOGNIZE function? About the dynamic rule engine > > > design, > > > > > do you have any comments? This core of the FLIP is about the > multiple > > > rule > > > > > and dynamic rule changing mechanism. > > > > > > > > > > Best, > > > > > Nicholas Jiang > > > > > > > > > > > > > > > > >
Re: [DISCUSS] FLIP-200: Support Multiple Rule and Dynamic Rule Changing (Flink CEP)
Hi Konstantin, Thanks for your suggestions. For the first idea, I agree that adding a timestamp field so that users can schedule a rule is a useful feature. This should not require too much implementation work, and I believe it can be achieved in this FLIP. As for the second idea, a Rule is a concept that contains a Pattern, and the two are different. Consider the following example use cases: - while one pattern requires input data to be grouped by user id, another pattern might require data to be grouped by item id - users might want to process matched results differently, according to which pattern the result matches. In these use cases, simply making patterns dynamically changeable is not enough. Therefore I introduced the concept "rule" to include the pattern and all other functions around it, like the key selector and the pattern process function; by dynamically changing rules, the example cases above can be supported. Thus "rule" may be the better name for what this FLIP proposes to achieve. Best regards, Yunfeng On Mon, Dec 13, 2021 at 3:45 PM Konstantin Knauf wrote: > Thanks, Yufeng, for starting this discussion. I think this will be a very > popular feature. I've seen a lot of users asking for this in the past. So, > generally big +1. > > I think we should have a rough idea on how to expose this feature in the > other APIs. > > Two ideas: > > 1. In order to make this more deterministic in case of reprocessing and > out-of-orderness, I am wondering if we can add a timestamp to each rule > that determines the start time from which the rule should be in effect. > This can be an event or a processing time depending on the characteristics > of the pipeline. The timestamp would default to Long.MIN_TIMESTAMP if not > provided, which means effectively immediately. This could also be a follow > up, if you think it will make the implementation too complicated initially. > > 2. 
I am wondering, if we should name Rule->DynamicPatternHolder or so and > CEP.rule-> CEP.dynamicPatterns instead (other classes correspondingly)? > Rule is quite ambiguous and DynamicPattern seems more descriptive to me. > > On Mon, Dec 13, 2021 at 4:30 AM Nicholas Jiang > wrote: > > > Hi Martijn, > > > >IMO, in this FLIP, we only need to introduce the general design of the > > Table API/SQL level. As for the design details, you can create a new > FLIP. > > And do we need to take into account the support for Batch mode if you > > expand the MATCH_RECOGNIZE function? About the dynamic rule engine > design, > > do you have any comments? This core of the FLIP is about the multiple > rule > > and dynamic rule changing mechanism. > > > > Best, > > Nicholas Jiang > > > > > -- > > Konstantin Knauf > > https://twitter.com/snntrable > > https://github.com/knaufk >
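The distinction drawn in this thread — a rule as pattern plus key selector plus pattern process function, combined with the proposed effective-from timestamp defaulting to "immediately" — could look roughly like the sketch below. This is a hedged illustration with stand-in types (a `String` pattern definition and plain `Function`s); the actual FLIP-200 classes (`Pattern`, `KeySelector`, `PatternProcessFunction`) and their final shape may differ.

```java
import java.util.function.Function;

/**
 * Hypothetical sketch of the "Rule" concept: a pattern plus how the input is
 * keyed and how matches are processed, with a timestamp from which the rule
 * takes effect (Long.MIN_VALUE meaning "effective immediately").
 */
final class Rule<IN, KEY, OUT> {
    final String patternDefinition;           // stand-in for a CEP Pattern
    final Function<IN, KEY> keySelector;      // stand-in for KeySelector<IN, KEY>
    final Function<IN, OUT> processFunction;  // stand-in for PatternProcessFunction
    final long effectiveFromTimestamp;

    Rule(String patternDefinition,
         Function<IN, KEY> keySelector,
         Function<IN, OUT> processFunction,
         long effectiveFromTimestamp) {
        this.patternDefinition = patternDefinition;
        this.keySelector = keySelector;
        this.processFunction = processFunction;
        this.effectiveFromTimestamp = effectiveFromTimestamp;
    }

    /** True if the rule is in effect for an event with the given timestamp. */
    boolean isEffectiveAt(long eventTimestamp) {
        return eventTimestamp >= effectiveFromTimestamp;
    }
}
```

With this shape, two rules can key the same input differently (one by user id, another by item id) and process their matches independently — the use cases a bare dynamic Pattern alone could not cover.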
[jira] [Created] (FLINK-25274) Use ResolvedSchema in DataGen instead of TableSchema
Sergey Nuyanzin created FLINK-25274: --- Summary: Use ResolvedSchema in DataGen instead of TableSchema Key: FLINK-25274 URL: https://issues.apache.org/jira/browse/FLINK-25274 Project: Flink Issue Type: Bug Components: Table SQL / API Reporter: Sergey Nuyanzin {{TableSchema}} is deprecated; its javadoc recommends using {{ResolvedSchema}} and {{Schema}} instead. -- This message was sent by Atlassian Jira (v8.20.1#820001)
Re: [DISCUSS] Immediate dedicated Flink releases for log4j vulnerability
+1 for the quick release. Till Rohrmann 于2021年12月13日周一 17:54写道: > +1 > > Cheers, > Till > > On Mon, Dec 13, 2021 at 10:42 AM Jing Ge wrote: > > > +1 > > > > As I suggested to publish the blog post last week asap, users have been > > keen to have such urgent releases. Many thanks for it. > > > > > > > > On Mon, Dec 13, 2021 at 8:29 AM Konstantin Knauf > > wrote: > > > > > +1 > > > > > > I didn't think this was necessary when I published the blog post on > > Friday, > > > but this has made higher waves than I expected over the weekend. > > > > > > > > > > > > On Mon, Dec 13, 2021 at 8:23 AM Yuan Mei > wrote: > > > > > > > +1 for quick release. > > > > > > > > On Mon, Dec 13, 2021 at 2:55 PM Martijn Visser < > mart...@ververica.com> > > > > wrote: > > > > > > > > > +1 to address the issue like this > > > > > > > > > > On Mon, 13 Dec 2021 at 07:46, Jingsong Li > > > > wrote: > > > > > > > > > > > +1 for fixing it in these versions and doing quick releases. > Looks > > > good > > > > > to > > > > > > me. > > > > > > > > > > > > Best, > > > > > > Jingsong > > > > > > > > > > > > On Mon, Dec 13, 2021 at 2:18 PM Becket Qin > > > > > wrote: > > > > > > > > > > > > > > +1. The solution sounds good to me. There have been a lot of > > > > inquiries > > > > > > > about how to react to this. > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > Jiangjie (Becket) Qin > > > > > > > > > > > > > > On Mon, Dec 13, 2021 at 12:40 PM Prasanna kumar < > > > > > > > prasannakumarram...@gmail.com> wrote: > > > > > > > > > > > > > > > 1+ for making Updates for 1.12.5 . > > > > > > > > We are looking for fix in 1.12 version. > > > > > > > > Please notify once the fix is done. > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Dec 13, 2021 at 9:45 AM Leonard Xu < > xbjt...@gmail.com> > > > > > wrote: > > > > > > > > > > > > > > > > > +1 for the quick release and the special vote period 24h. 
> > > > > > > > > > > > > > > > > > > 2021年12月13日 上午11:49,Dian Fu 写道: > > > > > > > > > > > > > > > > > > > > +1 for the proposal and creating a quick release. > > > > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > > Dian > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Dec 13, 2021 at 11:15 AM Kyle Bendickson < > > > > > k...@tabular.io> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > >> +1 to doing a release for this widely publicized > > > > vulnerability. > > > > > > > > > >> > > > > > > > > > >> In my experience, users will often update to the latest > > > minor > > > > > > patch > > > > > > > > > version > > > > > > > > > >> without much fuss. Plus, users have also likely heard > > about > > > > this > > > > > > and > > > > > > > > > will > > > > > > > > > >> appreciate a simple fix (updating their version where > > > > possible). > > > > > > > > > >> > > > > > > > > > >> The work-around will need to still be noted for users > who > > > > can’t > > > > > > > > upgrade > > > > > > > > > for > > > > > > > > > >> whatever reason (EMR hasn’t caught up, etc). > > > > > > > > > >> > > > > > > > > > >> I also agree with your assessment to apply a patch on > each > > > of > > > > > > those > > > > > > > > > >> previous versions with only the log4j commit, so that > they > > > > don’t > > > > > > need > > > > > > > > > to be > > > > > > > > > >> as rigorously tested. > > > > > > > > > >> > > > > > > > > > >> Best, > > > > > > > > > >> Kyle (GitHub @kbendick) > > > > > > > > > >> > > > > > > > > > >> On Sun, Dec 12, 2021 at 2:23 PM Stephan Ewen < > > > > se...@apache.org> > > > > > > > > wrote: > > > > > > > > > >> > > > > > > > > > >>> Hi all! > > > > > > > > > >>> > > > > > > > > > >>> Without doubt, you heard about the log4j vulnerability > > [1]. 
> > > > > > > > > >>> > > > > > > > > > >>> There is an advisory blog post on how to mitigate this > in > > > > > Apache > > > > > > > > Flink > > > > > > > > > >> [2], > > > > > > > > > >>> which involves setting a config option and restarting > the > > > > > > processes. > > > > > > > > > That > > > > > > > > > >>> is fortunately a relatively simple fix. > > > > > > > > > >>> > > > > > > > > > >>> Despite this workaround, I think we should do an > > immediate > > > > > > release > > > > > > > > with > > > > > > > > > >> the > > > > > > > > > >>> updated dependency. Meaning not waiting for the next > bug > > > fix > > > > > > releases > > > > > > > > > >>> coming in a few weeks, but releasing asap. > > > > > > > > > >>> The mood I perceive in the industry is pretty much > > panicky > > > > over > > > > > > this, > > > > > > > > > >> and I > > > > > > > > > >>> expect we will see many requests for a patched release > > and > > > > many > > > > > > > > > >> discussions > > > > > > > > > >>> why the workaround alone would not be enough due to > > certain > > > > > > > > guidelines. > > > > > > > > > >>> > > > > > > > > > >>> I suggest that we preempt those discussions and create > > > > releases > > > > > > the > > > > > > > > > >>>
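For readers of the archive: the workaround referenced in this thread — "setting a config option and restarting the processes" — is the widely documented log4j 2.x mitigation of disabling message lookups via a JVM system property. A sketch of what that looks like in flink-conf.yaml follows; the exact option name and placement should be taken from the official advisory blog post [2] referenced above.

```yaml
# flink-conf.yaml — pass the log4j 2.x mitigation flag to all Flink JVMs;
# JobManager and TaskManager processes must be restarted afterwards.
env.java.opts: -Dlog4j2.formatMsgNoLookups=true
```

This only mitigates CVE-2021-44228 on log4j 2.10+; the releases discussed in the thread replace the vulnerable dependency outright, which is the preferred fix.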
Re: [DISCUSS] Immediate dedicated Flink releases for log4j vulnerability
+1 I need 1.12.6, thanks Till Rohrmann 于2021年12月13日周一 17:54写道: > > +1 > > Cheers, > Till > > On Mon, Dec 13, 2021 at 10:42 AM Jing Ge wrote: > > > +1 > > > > As I suggested to publish the blog post last week asap, users have been > > keen to have such urgent releases. Many thanks for it. > > > > > > > > On Mon, Dec 13, 2021 at 8:29 AM Konstantin Knauf > > wrote: > > > > > +1 > > > > > > I didn't think this was necessary when I published the blog post on > > Friday, > > > but this has made higher waves than I expected over the weekend. > > > > > > > > > > > > On Mon, Dec 13, 2021 at 8:23 AM Yuan Mei wrote: > > > > > > > +1 for quick release. > > > > > > > > On Mon, Dec 13, 2021 at 2:55 PM Martijn Visser > > > > wrote: > > > > > > > > > +1 to address the issue like this > > > > > > > > > > On Mon, 13 Dec 2021 at 07:46, Jingsong Li > > > > wrote: > > > > > > > > > > > +1 for fixing it in these versions and doing quick releases. Looks > > > good > > > > > to > > > > > > me. > > > > > > > > > > > > Best, > > > > > > Jingsong > > > > > > > > > > > > On Mon, Dec 13, 2021 at 2:18 PM Becket Qin > > > > wrote: > > > > > > > > > > > > > > +1. The solution sounds good to me. There have been a lot of > > > > inquiries > > > > > > > about how to react to this. > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > Jiangjie (Becket) Qin > > > > > > > > > > > > > > On Mon, Dec 13, 2021 at 12:40 PM Prasanna kumar < > > > > > > > prasannakumarram...@gmail.com> wrote: > > > > > > > > > > > > > > > 1+ for making Updates for 1.12.5 . > > > > > > > > We are looking for fix in 1.12 version. > > > > > > > > Please notify once the fix is done. > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Dec 13, 2021 at 9:45 AM Leonard Xu > > > > > wrote: > > > > > > > > > > > > > > > > > +1 for the quick release and the special vote period 24h. 
> > > > > > > > > > > > > > > > > > > 2021年12月13日 上午11:49,Dian Fu 写道: > > > > > > > > > > > > > > > > > > > > +1 for the proposal and creating a quick release. > > > > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > > Dian > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Dec 13, 2021 at 11:15 AM Kyle Bendickson < > > > > > k...@tabular.io> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > >> +1 to doing a release for this widely publicized > > > > vulnerability. > > > > > > > > > >> > > > > > > > > > >> In my experience, users will often update to the latest > > > minor > > > > > > patch > > > > > > > > > version > > > > > > > > > >> without much fuss. Plus, users have also likely heard > > about > > > > this > > > > > > and > > > > > > > > > will > > > > > > > > > >> appreciate a simple fix (updating their version where > > > > possible). > > > > > > > > > >> > > > > > > > > > >> The work-around will need to still be noted for users who > > > > can’t > > > > > > > > upgrade > > > > > > > > > for > > > > > > > > > >> whatever reason (EMR hasn’t caught up, etc). > > > > > > > > > >> > > > > > > > > > >> I also agree with your assessment to apply a patch on each > > > of > > > > > > those > > > > > > > > > >> previous versions with only the log4j commit, so that they > > > > don’t > > > > > > need > > > > > > > > > to be > > > > > > > > > >> as rigorously tested. > > > > > > > > > >> > > > > > > > > > >> Best, > > > > > > > > > >> Kyle (GitHub @kbendick) > > > > > > > > > >> > > > > > > > > > >> On Sun, Dec 12, 2021 at 2:23 PM Stephan Ewen < > > > > se...@apache.org> > > > > > > > > wrote: > > > > > > > > > >> > > > > > > > > > >>> Hi all! > > > > > > > > > >>> > > > > > > > > > >>> Without doubt, you heard about the log4j vulnerability > > [1]. 
> > > > > > > > > >>> > > > > > > > > > >>> There is an advisory blog post on how to mitigate this in > > > > > Apache > > > > > > > > Flink > > > > > > > > > >> [2], > > > > > > > > > >>> which involves setting a config option and restarting the > > > > > > processes. > > > > > > > > > That > > > > > > > > > >>> is fortunately a relatively simple fix. > > > > > > > > > >>> > > > > > > > > > >>> Despite this workaround, I think we should do an > > immediate > > > > > > release > > > > > > > > with > > > > > > > > > >> the > > > > > > > > > >>> updated dependency. Meaning not waiting for the next bug > > > fix > > > > > > releases > > > > > > > > > >>> coming in a few weeks, but releasing asap. > > > > > > > > > >>> The mood I perceive in the industry is pretty much > > panicky > > > > over > > > > > > this, > > > > > > > > > >> and I > > > > > > > > > >>> expect we will see many requests for a patched release > > and > > > > many > > > > > > > > > >> discussions > > > > > > > > > >>> why the workaround alone would not be enough due to > > certain > > > > > > > > guidelines. > > > > > > > > > >>> > > > > > > > > > >>> I suggest that we preempt those discussions and create > > > > releases > > > > > > the > > > > > > > > > >>> following way: > > > > > > > > > >>> > > > > > > > > > >>> - we take
Re: [DISCUSS] FLIP-190: Support Version Upgrades for Table API & SQL Programs
Hi Timo, +1 for the improvement too. Please count me in when assign subtasks. Best, Jing Zhang Timo Walther 于2021年12月13日周一 17:00写道: > Hi everyone, > > *last call for feedback* on this FLIP. Otherwise I would start a VOTE by > tomorrow. > > @Wenlong: Thanks for offering your help. Once the FLIP has been > accepted. I will create a list of subtasks that we can split among > contributors. Many can be implemented in parallel. > > Regards, > Timo > > > On 13.12.21 09:20, wenlong.lwl wrote: > > Hi, Timo, +1 for the improvement too. Thanks for the great job. > > > > Looking forward to the next progress of the FLIP, I could help on the > > development of some of the specific improvements. > > > > Best, > > Wenlong > > > > On Mon, 13 Dec 2021 at 14:43, godfrey he wrote: > > > >> Hi Timo, > >> > >> +1 for the improvement. > >> > >> Best, > >> Godfrey > >> > >> Timo Walther 于2021年12月10日周五 20:37写道: > >>> > >>> Hi Wenlong, > >>> > >>> yes it will. Sorry for the confusion. This is a logical consequence of > >>> the assumption: > >>> > >>> The JSON plan contains no implementation details (i.e. no classes) and > >>> is fully declarative. > >>> > >>> I will add a remark. > >>> > >>> Thanks, > >>> Timo > >>> > >>> > >>> On 10.12.21 11:43, wenlong.lwl wrote: > hi, Timo, thanks for the explanation. I totally agree with what you > >> said. > My actual question is: Will the version of an exec node be serialised > >> in > the Json Plan? In my understanding, it is not in the former design. If > >> it > is yes, my question is solved already. > > > Best, > Wenlong > > > On Fri, 10 Dec 2021 at 18:15, Timo Walther > wrote: > > > Hi Wenlong, > > > > also thought about adding a `flinkVersion` field per ExecNode. But > >> this > > is not necessary, because the `version` of the ExecNode has the same > > purpose. 
> > > > The plan version just encodes that: > > "plan has been updated in Flink 1.17" / "plan is entirely valid for > > Flink 1.17" > > > > The ExecNode version maps to `minStateVersion` to verify state > > compatibility. > > > > So even if the plan version is 1.17, some ExecNodes use state layout > >> of > > 1.15. > > > > It is totally fine to only update the ExecNode to version 2 and not 3 > >> in > > your example. > > > > Regards, > > Timo > > > > > > > > On 10.12.21 06:02, wenlong.lwl wrote: > >> Hi, Timo, thanks for updating the doc. > >> > >> I have a comment on plan migration: > >> I think we may need to add a version field for every exec node when > >> serialising. In earlier discussions, I think we have a conclusion > >> that > >> treating the version of plan as the version of node, but in this > >> case it > >> would be broken. > >> Take the following example in FLIP into consideration, there is a > bad > > case: > >> when in 1.17, we introduced an incompatible version 3 and dropped > >> version > >> 1, we can only update the version to 2, so the version should be per > >> exec > >> node. > >> > >> ExecNode version *1* is not supported anymore. Even though the state > >> is > >> actually compatible. The plan restore will fail with a helpful > >> exception > >> that forces users to perform plan migration. > >> > >> COMPILE PLAN '/mydir/plan_new.json' FROM '/mydir/plan_old.json'; > >> > >> The plan migration will safely replace the old version *1* with *2. > >> The > >> JSON plan flinkVersion changes to 1.17.* > >> > >> > >> Best, > >> > >> Wenlong > >> > >> On Thu, 9 Dec 2021 at 18:36, Timo Walther > >> wrote: > >> > >>> Hi Jing and Godfrey, > >>> > >>> I had another iteration over the document. There are two major > >> changes: > >>> > >>> 1. Supported Flink Upgrade Versions > >>> > >>> I got the feedback via various channels that a step size of one > >> minor > >>> version is not very convenient. 
As you said, "because upgrading to > >> a new > >>> version is a time-consuming process". I rephrased this section: > >>> > >>> Upgrading usually involves work which is why many users perform > this > >>> task rarely (e.g. only once per year). Also skipping a versions is > >>> common until a new feature has been introduced for which is it > >> worth to > >>> upgrade. We will support the upgrade to the most recent Flink > >> version > >>> from a set of previous versions. We aim to support upgrades from > the > >>> last 2-3 releases on a best-effort basis; maybe even more depending > >> on > >>> the maintenance overhead. However, in order to not grow the testing > >>> matrix infinitely and to perform important refactoring if > >> necessary, we > >>> only guarantee upgrades with a step size of a single minor version > >> (i.e. > >>> a cascade of upgrades). > >>> >
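The interplay described in this thread — a per-plan flinkVersion plus a per-ExecNode version checked for state compatibility on restore — can be illustrated with a toy compatibility check. The names here (`SUPPORTED_VERSIONS`, `canRestore`, the node name) are invented for the example; the FLIP defines the real annotations and `minStateVersion` semantics.

```java
import java.util.Map;
import java.util.Set;

/**
 * Toy illustration of per-ExecNode versioning: restoring a persisted plan
 * checks each node's own serialized version against what the current runtime
 * still supports, independently of the overall plan's flinkVersion.
 */
public class ExecNodeVersionCheck {

    // Hypothetical registry: mirroring the 1.17 example from the thread,
    // version 1 of this node has been dropped, versions 2 and 3 remain.
    static final Map<String, Set<Integer>> SUPPORTED_VERSIONS =
            Map.of("stream-exec-sink", Set.of(2, 3));

    /** Returns true if a node serialized with `persistedVersion` can be restored. */
    static boolean canRestore(String nodeName, int persistedVersion) {
        return SUPPORTED_VERSIONS
                .getOrDefault(nodeName, Set.of())
                .contains(persistedVersion);
    }

    public static void main(String[] args) {
        // A version-1 node fails the check and forces plan migration
        // (COMPILE PLAN ... FROM ...), which rewrites the node to version 2.
        System.out.println(canRestore("stream-exec-sink", 1)); // false
        System.out.println(canRestore("stream-exec-sink", 2)); // true
    }
}
```

This is why the plan version alone is not enough: even in a plan stamped 1.17, individual ExecNodes may still carry the state layout (and version) of 1.15.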
Re: [DISCUSS] Immediate dedicated Flink releases for log4j vulnerability
+1 Cheers, Till On Mon, Dec 13, 2021 at 10:42 AM Jing Ge wrote: > +1 > > As I suggested to publish the blog post last week asap, users have been > keen to have such urgent releases. Many thanks for it. > > > > On Mon, Dec 13, 2021 at 8:29 AM Konstantin Knauf > wrote: > > > +1 > > > > I didn't think this was necessary when I published the blog post on > Friday, > > but this has made higher waves than I expected over the weekend. > > > > > > > > On Mon, Dec 13, 2021 at 8:23 AM Yuan Mei wrote: > > > > > +1 for quick release. > > > > > > On Mon, Dec 13, 2021 at 2:55 PM Martijn Visser > > > wrote: > > > > > > > +1 to address the issue like this > > > > > > > > On Mon, 13 Dec 2021 at 07:46, Jingsong Li > > > wrote: > > > > > > > > > +1 for fixing it in these versions and doing quick releases. Looks > > good > > > > to > > > > > me. > > > > > > > > > > Best, > > > > > Jingsong > > > > > > > > > > On Mon, Dec 13, 2021 at 2:18 PM Becket Qin > > > wrote: > > > > > > > > > > > > +1. The solution sounds good to me. There have been a lot of > > > inquiries > > > > > > about how to react to this. > > > > > > > > > > > > Thanks, > > > > > > > > > > > > Jiangjie (Becket) Qin > > > > > > > > > > > > On Mon, Dec 13, 2021 at 12:40 PM Prasanna kumar < > > > > > > prasannakumarram...@gmail.com> wrote: > > > > > > > > > > > > > 1+ for making Updates for 1.12.5 . > > > > > > > We are looking for fix in 1.12 version. > > > > > > > Please notify once the fix is done. > > > > > > > > > > > > > > > > > > > > > On Mon, Dec 13, 2021 at 9:45 AM Leonard Xu > > > > wrote: > > > > > > > > > > > > > > > +1 for the quick release and the special vote period 24h. > > > > > > > > > > > > > > > > > 2021年12月13日 上午11:49,Dian Fu 写道: > > > > > > > > > > > > > > > > > > +1 for the proposal and creating a quick release. 
> > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > Dian > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Dec 13, 2021 at 11:15 AM Kyle Bendickson < > > > > k...@tabular.io> > > > > > > > > wrote: > > > > > > > > > > > > > > > > > >> +1 to doing a release for this widely publicized > > > vulnerability. > > > > > > > > >> > > > > > > > > >> In my experience, users will often update to the latest > > minor > > > > > patch > > > > > > > > version > > > > > > > > >> without much fuss. Plus, users have also likely heard > about > > > this > > > > > and > > > > > > > > will > > > > > > > > >> appreciate a simple fix (updating their version where > > > possible). > > > > > > > > >> > > > > > > > > >> The work-around will need to still be noted for users who > > > can’t > > > > > > > upgrade > > > > > > > > for > > > > > > > > >> whatever reason (EMR hasn’t caught up, etc). > > > > > > > > >> > > > > > > > > >> I also agree with your assessment to apply a patch on each > > of > > > > > those > > > > > > > > >> previous versions with only the log4j commit, so that they > > > don’t > > > > > need > > > > > > > > to be > > > > > > > > >> as rigorously tested. > > > > > > > > >> > > > > > > > > >> Best, > > > > > > > > >> Kyle (GitHub @kbendick) > > > > > > > > >> > > > > > > > > >> On Sun, Dec 12, 2021 at 2:23 PM Stephan Ewen < > > > se...@apache.org> > > > > > > > wrote: > > > > > > > > >> > > > > > > > > >>> Hi all! > > > > > > > > >>> > > > > > > > > >>> Without doubt, you heard about the log4j vulnerability > [1]. > > > > > > > > >>> > > > > > > > > >>> There is an advisory blog post on how to mitigate this in > > > > Apache > > > > > > > Flink > > > > > > > > >> [2], > > > > > > > > >>> which involves setting a config option and restarting the > > > > > processes. > > > > > > > > That > > > > > > > > >>> is fortunately a relatively simple fix. 
> > > > > > > > >>> > > > > > > > > >>> Despite this workaround, I think we should do an > immediate > > > > > release > > > > > > > with > > > > > > > > >> the > > > > > > > > >>> updated dependency. Meaning not waiting for the next bug > > fix > > > > > releases > > > > > > > > >>> coming in a few weeks, but releasing asap. > > > > > > > > >>> The mood I perceive in the industry is pretty much > panicky > > > over > > > > > this, > > > > > > > > >> and I > > > > > > > > >>> expect we will see many requests for a patched release > and > > > many > > > > > > > > >> discussions > > > > > > > > >>> why the workaround alone would not be enough due to > certain > > > > > > > guidelines. > > > > > > > > >>> > > > > > > > > >>> I suggest that we preempt those discussions and create > > > releases > > > > > the > > > > > > > > >>> following way: > > > > > > > > >>> > > > > > > > > >>> - we take the latest already released versions from each > > > > release > > > > > > > > >> branch: > > > > > > > > >>> ==> 1.14.0, 1.13.3, 1.12.5, 1.11.4 > > > > > > > > >>> - we add a single commit to those that just updates the > > > log4j > > > > > > > > >> dependency > > > > > > > > >>> - we release those as 1.14.1, 1.13.4, 1.12.6, 1.11.5, > etc. > > > > > > > > >>> - that way we
Re: [DISCUSS] FLIP-200: Support Multiple Rule and Dynamic Rule Changing (Flink CEP)
Hi Konstantin, About the renaming for the Rule, I mean that the difference between the Rule and Pattern is that the Rule not only contains the Pattern, but also how to match the Pattern, and how to process after matched. If renaming DynamicPattern, I'm concerned that this name couldn't represent how to match the Pattern, and how to process after matched but the Rule could explain this. Therefore I prefer to rename the Rule not the DynamicPattern. Best, Nicholas Jiang On 2021/12/13 08:23:23 Konstantin Knauf wrote: > Hi Nicholas, > > I am not sure I understand your question about renaming. I think the most > important member of the current Rule class is the Pattern, the KeySelector > and PatternProcessFunction are more auxiliary if you will. That's why, I > think, it would be ok to rename Rule to DynamicPatternHolder although it > contains more than just a Pattern. > > Cheers, > > Konstantin > > On Mon, Dec 13, 2021 at 9:16 AM Nicholas Jiang > wrote: > > > Hi Konstantin, > > > >Thanks for your feedback. The point that add a timestamp to each rule > > that determines the start time from which the rule makes sense to me. At > > present, The timestamp is current time at default, so no timestamp field > > represents the start time from which the rule. And about the renaming rule, > > your suggestion looks good to me and no any new concept introduces. But > > does this introduce Rule concept or reuse the Pattern concept for the > > DynamicPattern renaming? > > > > Best, > > Nicholas Jiang > > > > On 2021/12/13 07:45:04 Konstantin Knauf wrote: > > > Thanks, Yufeng, for starting this discussion. I think this will be a very > > > popular feature. I've seen a lot of users asking for this in the past. > > So, > > > generally big +1. > > > > > > I think we should have a rough idea on how to expose this feature in the > > > other APIs. > > > > > > Two ideas: > > > > > > 1. 
In order to make this more deterministic in case of reprocessing and > > > out-of-orderness, I am wondering if we can add a timestamp to each rule > > > that determines the start time from which the rule should be in effect. > > > This can be an event or a processing time depending on the > > characteristics > > > of the pipeline. The timestamp would default to Long.MIN_TIMESTAMP if not > > > provided, which means effectively immediately. This could also be a > > follow > > > up, if you think it will make the implementation too complicated > > initially. > > > > > > 2. I am wondering, if we should name Rule->DynamicPatternHolder or so and > > > CEP.rule-> CEP.dynamicPatterns instead (other classes correspondingly)? > > > Rule is quite ambiguous and DynamicPattern seems more descriptive to me. > > > > > > On Mon, Dec 13, 2021 at 4:30 AM Nicholas Jiang > > > > > wrote: > > > > > > > Hi Martijn, > > > > > > > >IMO, in this FLIP, we only need to introduce the general design of > > the > > > > Table API/SQL level. As for the design details, you can create a new > > FLIP. > > > > And do we need to take into account the support for Batch mode if you > > > > expand the MATCH_RECOGNIZE function? About the dynamic rule engine > > design, > > > > do you have any comments? This core of the FLIP is about the multiple > > rule > > > > and dynamic rule changing mechanism. > > > > > > > > Best, > > > > Nicholas Jiang > > > > > > > > > > > > > -- > > > > > > Konstantin Knauf > > > > > > https://twitter.com/snntrable > > > > > > https://github.com/knaufk > > > > > > > > -- > > Konstantin Knauf > > https://twitter.com/snntrable > > https://github.com/knaufk >
Re: [DISCUSS] Immediate dedicated Flink releases for log4j vulnerability
+1. As I suggested last week that the blog post be published asap, users have been keen to have such urgent releases. Many thanks for it. On Mon, Dec 13, 2021 at 8:29 AM Konstantin Knauf wrote: > +1 > > I didn't think this was necessary when I published the blog post on Friday, > but this has made higher waves than I expected over the weekend. > > > > On Mon, Dec 13, 2021 at 8:23 AM Yuan Mei wrote: > > > +1 for quick release. > > > > On Mon, Dec 13, 2021 at 2:55 PM Martijn Visser > > wrote: > > > > > +1 to address the issue like this > > > > > > On Mon, 13 Dec 2021 at 07:46, Jingsong Li > > wrote: > > > > > > > +1 for fixing it in these versions and doing quick releases. Looks > good > > > to > > > > me. > > > > > > > > Best, > > > > Jingsong > > > > > > > > On Mon, Dec 13, 2021 at 2:18 PM Becket Qin > > wrote: > > > > > > > > > > +1. The solution sounds good to me. There have been a lot of > > inquiries > > > > > about how to react to this. > > > > > > > > > > Thanks, > > > > > > > > > > Jiangjie (Becket) Qin > > > > > > > > > > On Mon, Dec 13, 2021 at 12:40 PM Prasanna kumar < > > > > > prasannakumarram...@gmail.com> wrote: > > > > > > > > > > > 1+ for making Updates for 1.12.5 . > > > > > > We are looking for fix in 1.12 version. > > > > > > Please notify once the fix is done. > > > > > > > > > > > > > > > > > > On Mon, Dec 13, 2021 at 9:45 AM Leonard Xu > > > wrote: > > > > > > > > > > > > > +1 for the quick release and the special vote period 24h. > > > > > > > > > > > > > > > 2021年12月13日 上午11:49,Dian Fu 写道: > > > > > > > > > > > > > > > > +1 for the proposal and creating a quick release. > > > > > > > > > > > > > > > > Regards, > > > > > > > > Dian > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Dec 13, 2021 at 11:15 AM Kyle Bendickson < > > > k...@tabular.io> > > > > > > > wrote: > > > > > > > > > > > > > > > >> +1 to doing a release for this widely publicized > > vulnerability.
> > > > > > > >> > > > > > > > >> In my experience, users will often update to the latest > minor > > > > patch > > > > > > > version > > > > > > > >> without much fuss. Plus, users have also likely heard about > > this > > > > and > > > > > > > will > > > > > > > >> appreciate a simple fix (updating their version where > > possible). > > > > > > > >> > > > > > > > >> The work-around will need to still be noted for users who > > can’t > > > > > > upgrade > > > > > > > for > > > > > > > >> whatever reason (EMR hasn’t caught up, etc). > > > > > > > >> > > > > > > > >> I also agree with your assessment to apply a patch on each > of > > > > those > > > > > > > >> previous versions with only the log4j commit, so that they > > don’t > > > > need > > > > > > > to be > > > > > > > >> as rigorously tested. > > > > > > > >> > > > > > > > >> Best, > > > > > > > >> Kyle (GitHub @kbendick) > > > > > > > >> > > > > > > > >> On Sun, Dec 12, 2021 at 2:23 PM Stephan Ewen < > > se...@apache.org> > > > > > > wrote: > > > > > > > >> > > > > > > > >>> Hi all! > > > > > > > >>> > > > > > > > >>> Without doubt, you heard about the log4j vulnerability [1]. > > > > > > > >>> > > > > > > > >>> There is an advisory blog post on how to mitigate this in > > > Apache > > > > > > Flink > > > > > > > >> [2], > > > > > > > >>> which involves setting a config option and restarting the > > > > processes. > > > > > > > That > > > > > > > >>> is fortunately a relatively simple fix. > > > > > > > >>> > > > > > > > >>> Despite this workaround, I think we should do an immediate > > > > release > > > > > > with > > > > > > > >> the > > > > > > > >>> updated dependency. Meaning not waiting for the next bug > fix > > > > releases > > > > > > > >>> coming in a few weeks, but releasing asap. 
> > > > > > > >>> The mood I perceive in the industry is pretty much panicky > > over > > > > this, > > > > > > > >> and I > > > > > > > >>> expect we will see many requests for a patched release and > > many > > > > > > > >> discussions > > > > > > > >>> why the workaround alone would not be enough due to certain > > > > > > guidelines. > > > > > > > >>> > > > > > > > >>> I suggest that we preempt those discussions and create > > releases > > > > the > > > > > > > >>> following way: > > > > > > > >>> > > > > > > > >>> - we take the latest already released versions from each > > > release > > > > > > > >> branch: > > > > > > > >>> ==> 1.14.0, 1.13.3, 1.12.5, 1.11.4 > > > > > > > >>> - we add a single commit to those that just updates the > > log4j > > > > > > > >> dependency > > > > > > > >>> - we release those as 1.14.1, 1.13.4, 1.12.6, 1.11.5, etc. > > > > > > > >>> - that way we don't need to do functional release tests, > > > > because the > > > > > > > >>> released code is identical to the previous release, except > > for > > > > the > > > > > > > log4j > > > > > > > >>> dependency > > > > > > > >>> - we can then continue the work on the upcoming bugfix > > > releases > > > > as > > > > > > > >>> planned, without high pressure > > > > > > > >>> > > > > >
Re: [DISCUSS] Immediate dedicated Flink releases for log4j vulnerability
I will start preparing the release candidates. On 12/12/2021 23:23, Stephan Ewen wrote: Hi all! Without doubt, you heard about the log4j vulnerability [1]. There is an advisory blog post on how to mitigate this in Apache Flink [2], which involves setting a config option and restarting the processes. That is fortunately a relatively simple fix. Despite this workaround, I think we should do an immediate release with the updated dependency. Meaning not waiting for the next bug fix releases coming in a few weeks, but releasing asap. The mood I perceive in the industry is pretty much panicky over this, and I expect we will see many requests for a patched release and many discussions why the workaround alone would not be enough due to certain guidelines. I suggest that we preempt those discussions and create releases the following way: - we take the latest already released versions from each release branch: ==> 1.14.0, 1.13.3, 1.12.5, 1.11.4 - we add a single commit to those that just updates the log4j dependency - we release those as 1.14.1, 1.13.4, 1.12.6, 1.11.5, etc. - that way we don't need to do functional release tests, because the released code is identical to the previous release, except for the log4j dependency - we can then continue the work on the upcoming bugfix releases as planned, without high pressure I would suggest creating those RCs immediately and release them with a special voting period (24h or so). WDYT? Best, Stephan [1] https://nvd.nist.gov/vuln/detail/CVE-2021-44228 [2] https://flink.apache.org/2021/12/10/log4j-cve.html
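For readers landing on this thread later: the workaround referenced in the advisory [2] amounts to disabling message lookups in Log4j via a JVM system property before restarting the processes. The sketch below shows the general shape of such a change in flink-conf.yaml; treat it as an illustration only and follow the blog post [2] for the authoritative instructions.

```yaml
# Sketch of the advisory's workaround — consult [2] before applying.
# Passes the Log4j mitigation property to all Flink JVM processes.
env.java.opts: -Dlog4j2.formatMsgNoLookups=true
```

After changing the configuration, all JobManager and TaskManager processes need to be restarted for the property to take effect.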
Re: [DISCUSS] FLIP-190: Support Version Upgrades for Table API & SQL Programs
Hi everyone, *last call for feedback* on this FLIP. Otherwise I would start a VOTE by tomorrow. @Wenlong: Thanks for offering your help. Once the FLIP has been accepted, I will create a list of subtasks that we can split among contributors. Many can be implemented in parallel. Regards, Timo On 13.12.21 09:20, wenlong.lwl wrote: Hi, Timo, +1 for the improvement too. Thanks for the great job. Looking forward to the next progress of the FLIP, I could help on the development of some of the specific improvements. Best, Wenlong On Mon, 13 Dec 2021 at 14:43, godfrey he wrote: Hi Timo, +1 for the improvement. Best, Godfrey Timo Walther 于2021年12月10日周五 20:37写道: Hi Wenlong, yes it will. Sorry for the confusion. This is a logical consequence of the assumption: The JSON plan contains no implementation details (i.e. no classes) and is fully declarative. I will add a remark. Thanks, Timo On 10.12.21 11:43, wenlong.lwl wrote: hi, Timo, thanks for the explanation. I totally agree with what you said. My actual question is: Will the version of an exec node be serialised in the Json Plan? In my understanding, it is not in the former design. If it is yes, my question is solved already. Best, Wenlong On Fri, 10 Dec 2021 at 18:15, Timo Walther wrote: Hi Wenlong, also thought about adding a `flinkVersion` field per ExecNode. But this is not necessary, because the `version` of the ExecNode has the same purpose. The plan version just encodes that: "plan has been updated in Flink 1.17" / "plan is entirely valid for Flink 1.17" The ExecNode version maps to `minStateVersion` to verify state compatibility. So even if the plan version is 1.17, some ExecNodes use state layout of 1.15. It is totally fine to only update the ExecNode to version 2 and not 3 in your example. Regards, Timo On 10.12.21 06:02, wenlong.lwl wrote: Hi, Timo, thanks for updating the doc. I have a comment on plan migration: I think we may need to add a version field for every exec node when serialising.
In earlier discussions, I think we have a conclusion that treating the version of plan as the version of node, but in this case it would be broken. Take the following example in FLIP into consideration, there is a bad case: when in 1.17, we introduced an incompatible version 3 and dropped version 1, we can only update the version to 2, so the version should be per exec node. ExecNode version *1* is not supported anymore. Even though the state is actually compatible. The plan restore will fail with a helpful exception that forces users to perform plan migration. COMPILE PLAN '/mydir/plan_new.json' FROM '/mydir/plan_old.json'; The plan migration will safely replace the old version *1* with *2. The JSON plan flinkVersion changes to 1.17.* Best, Wenlong On Thu, 9 Dec 2021 at 18:36, Timo Walther wrote: Hi Jing and Godfrey, I had another iteration over the document. There are two major changes: 1. Supported Flink Upgrade Versions I got the feedback via various channels that a step size of one minor version is not very convenient. As you said, "because upgrading to a new version is a time-consuming process". I rephrased this section: Upgrading usually involves work which is why many users perform this task rarely (e.g. only once per year). Also skipping a versions is common until a new feature has been introduced for which is it worth to upgrade. We will support the upgrade to the most recent Flink version from a set of previous versions. We aim to support upgrades from the last 2-3 releases on a best-effort basis; maybe even more depending on the maintenance overhead. However, in order to not grow the testing matrix infinitely and to perform important refactoring if necessary, we only guarantee upgrades with a step size of a single minor version (i.e. a cascade of upgrades). 2. Annotation Design I also adopted the multiple annotations design for the previous supportPlanFormat. So no array of versions anymore. 
I reworked the section, please have a look with updated examples: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336489#FLIP190:SupportVersionUpgradesforTableAPI I also got the feedback offline that `savepoint` might not be the right terminology for the annotation. I changed that to minPlanVersion and minStateVersion. Let me know what you think. Regards, Timo On 09.12.21 08:44, Jing Zhang wrote: Hi Timo, Thanks a lot for driving this discussion. I believe it could solve many problems what we are suffering in upgrading. I only have a little complain on the following point. For simplification of the design, we assume that upgrades use a step size of a single minor version. We don't guarantee skipping minor versions (e.g. 1.11 to 1.14). In our internal production environment, we follow up with the community's latest stable release version almost once a year because upgrading to a new version is a
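To make the plan-version vs. per-ExecNode-version distinction discussed above concrete, here is a small self-contained model of the restore and migration checks. This is an illustration under assumptions, not the FLIP's actual classes: the names (canRestore, migrate) are invented, real plan migration involves more than remapping version numbers, and the annotations in the FLIP (minPlanVersion, minStateVersion) are reduced here to plain maps.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model of per-ExecNode plan versioning, as in Wenlong's example:
// a plan with a dropped node version fails to restore until migrated.
class PlanMigrationSketch {

    /** A plan can be restored iff every ExecNode version in it is still supported. */
    static boolean canRestore(Map<String, Integer> planNodeVersions,
                              Map<String, Set<Integer>> supportedVersions) {
        return planNodeVersions.entrySet().stream()
                .allMatch(e -> supportedVersions
                        .getOrDefault(e.getKey(), Set.of())
                        .contains(e.getValue()));
    }

    /**
     * "COMPILE PLAN ... FROM ..." conceptually rewrites each dropped node version
     * to the lowest still-supported one (e.g. version 1 -> 2 when only {2, 3}
     * remain and 3 is state-incompatible).
     */
    static Map<String, Integer> migrate(Map<String, Integer> planNodeVersions,
                                        Map<String, Set<Integer>> supportedVersions) {
        Map<String, Integer> migrated = new HashMap<>();
        planNodeVersions.forEach((node, v) -> {
            Set<Integer> supported = supportedVersions.getOrDefault(node, Set.of());
            migrated.put(node, supported.contains(v)
                    ? v
                    : supported.stream().min(Integer::compare).orElseThrow());
        });
        return migrated;
    }
}
```

In this model the plan's overall flinkVersion can advance (e.g. to 1.17) while individual nodes keep older versions, which is exactly why the version must be serialized per ExecNode.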
Re: [VOTE] FLIP-196: Source API stability guarantees
+1 On 10/12/2021 18:07, Till Rohrmann wrote: Hi everyone, I'd like to start a vote on FLIP-196: Source API stability guarantees [1] which has been discussed in this thread [2]. The vote will be open for at least 72 hours unless there is an objection or not enough votes. [1] https://cwiki.apache.org/confluence/x/IJeqCw [2] https://lists.apache.org/thread/gkczh583ovlo1fpj7l61cnr2zl695xkp Cheers, Till
Re: [VOTE] FLIP-197: API stability graduation process
+1 On 10/12/2021 18:09, Till Rohrmann wrote: Hi everyone, I'd like to start a vote on FLIP-197: API stability graduation process [1] which has been discussed in this thread [2]. The vote will be open for at least 72 hours unless there is an objection or not enough votes. [1] https://cwiki.apache.org/confluence/x/J5eqCw [2] https://lists.apache.org/thread/gjgr3b3w2c379ny7on3khjgwyjp2gyq1 Cheers, Till
Re: [DISCUSS] FLIP-200: Support Multiple Rule and Dynamic Rule Changing (Flink CEP)
Hi Nicholas, I am not sure I understand your question about renaming. I think the most important member of the current Rule class is the Pattern, the KeySelector and PatternProcessFunction are more auxiliary if you will. That's why, I think, it would be ok to rename Rule to DynamicPatternHolder although it contains more than just a Pattern. Cheers, Konstantin On Mon, Dec 13, 2021 at 9:16 AM Nicholas Jiang wrote: > Hi Konstantin, > >Thanks for your feedback. The point that add a timestamp to each rule > that determines the start time from which the rule makes sense to me. At > present, The timestamp is current time at default, so no timestamp field > represents the start time from which the rule. And about the renaming rule, > your suggestion looks good to me and no any new concept introduces. But > does this introduce Rule concept or reuse the Pattern concept for the > DynamicPattern renaming? > > Best, > Nicholas Jiang > > On 2021/12/13 07:45:04 Konstantin Knauf wrote: > > Thanks, Yufeng, for starting this discussion. I think this will be a very > > popular feature. I've seen a lot of users asking for this in the past. > So, > > generally big +1. > > > > I think we should have a rough idea on how to expose this feature in the > > other APIs. > > > > Two ideas: > > > > 1. In order to make this more deterministic in case of reprocessing and > > out-of-orderness, I am wondering if we can add a timestamp to each rule > > that determines the start time from which the rule should be in effect. > > This can be an event or a processing time depending on the > characteristics > > of the pipeline. The timestamp would default to Long.MIN_TIMESTAMP if not > > provided, which means effectively immediately. This could also be a > follow > > up, if you think it will make the implementation too complicated > initially. > > > > 2. I am wondering, if we should name Rule->DynamicPatternHolder or so and > > CEP.rule-> CEP.dynamicPatterns instead (other classes correspondingly)? 
> > Rule is quite ambiguous and DynamicPattern seems more descriptive to me. > > > > On Mon, Dec 13, 2021 at 4:30 AM Nicholas Jiang > > > wrote: > > > > > Hi Martijn, > > > > > >IMO, in this FLIP, we only need to introduce the general design of > the > > > Table API/SQL level. As for the design details, you can create a new > FLIP. > > > And do we need to take into account the support for Batch mode if you > > > expand the MATCH_RECOGNIZE function? About the dynamic rule engine > design, > > > do you have any comments? This core of the FLIP is about the multiple > rule > > > and dynamic rule changing mechanism. > > > > > > Best, > > > Nicholas Jiang > > > > > > > > > -- > > > > Konstantin Knauf > > > > https://twitter.com/snntrable > > > > https://github.com/knaufk > > > -- Konstantin Knauf https://twitter.com/snntrable https://github.com/knaufk
Re: [DISCUSS] FLIP-190: Support Version Upgrades for Table API & SQL Programs
Hi, Timo, +1 for the improvement too. Thanks for the great job. Looking forward to the next progress of the FLIP, I could help on the development of some of the specific improvements. Best, Wenlong On Mon, 13 Dec 2021 at 14:43, godfrey he wrote: > Hi Timo, > > +1 for the improvement. > > Best, > Godfrey > > Timo Walther 于2021年12月10日周五 20:37写道: > > > > Hi Wenlong, > > > > yes it will. Sorry for the confusion. This is a logical consequence of > > the assumption: > > > > The JSON plan contains no implementation details (i.e. no classes) and > > is fully declarative. > > > > I will add a remark. > > > > Thanks, > > Timo > > > > > > On 10.12.21 11:43, wenlong.lwl wrote: > > > hi, Timo, thanks for the explanation. I totally agree with what you > said. > > > My actual question is: Will the version of an exec node be serialised > in > > > the Json Plan? In my understanding, it is not in the former design. If > it > > > is yes, my question is solved already. > > > > > > > > > Best, > > > Wenlong > > > > > > > > > On Fri, 10 Dec 2021 at 18:15, Timo Walther wrote: > > > > > >> Hi Wenlong, > > >> > > >> also thought about adding a `flinkVersion` field per ExecNode. But > this > > >> is not necessary, because the `version` of the ExecNode has the same > > >> purpose. > > >> > > >> The plan version just encodes that: > > >> "plan has been updated in Flink 1.17" / "plan is entirely valid for > > >> Flink 1.17" > > >> > > >> The ExecNode version maps to `minStateVersion` to verify state > > >> compatibility. > > >> > > >> So even if the plan version is 1.17, some ExecNodes use state layout > of > > >> 1.15. > > >> > > >> It is totally fine to only update the ExecNode to version 2 and not 3 > in > > >> your example. > > >> > > >> Regards, > > >> Timo > > >> > > >> > > >> > > >> On 10.12.21 06:02, wenlong.lwl wrote: > > >>> Hi, Timo, thanks for updating the doc. 
> > >>> > > >>> I have a comment on plan migration: > > >>> I think we may need to add a version field for every exec node when > > >>> serialising. In earlier discussions, I think we have a conclusion > that > > >>> treating the version of plan as the version of node, but in this > case it > > >>> would be broken. > > >>> Take the following example in FLIP into consideration, there is a bad > > >> case: > > >>> when in 1.17, we introduced an incompatible version 3 and dropped > version > > >>> 1, we can only update the version to 2, so the version should be per > exec > > >>> node. > > >>> > > >>> ExecNode version *1* is not supported anymore. Even though the state > is > > >>> actually compatible. The plan restore will fail with a helpful > exception > > >>> that forces users to perform plan migration. > > >>> > > >>> COMPILE PLAN '/mydir/plan_new.json' FROM '/mydir/plan_old.json'; > > >>> > > >>> The plan migration will safely replace the old version *1* with *2. > The > > >>> JSON plan flinkVersion changes to 1.17.* > > >>> > > >>> > > >>> Best, > > >>> > > >>> Wenlong > > >>> > > >>> On Thu, 9 Dec 2021 at 18:36, Timo Walther > wrote: > > >>> > > Hi Jing and Godfrey, > > > > I had another iteration over the document. There are two major > changes: > > > > 1. Supported Flink Upgrade Versions > > > > I got the feedback via various channels that a step size of one > minor > > version is not very convenient. As you said, "because upgrading to > a new > > version is a time-consuming process". I rephrased this section: > > > > Upgrading usually involves work which is why many users perform this > > task rarely (e.g. only once per year). Also skipping a versions is > > common until a new feature has been introduced for which is it > worth to > > upgrade. We will support the upgrade to the most recent Flink > version > > from a set of previous versions. 
We aim to support upgrades from the > > last 2-3 releases on a best-effort basis; maybe even more depending > on > > the maintenance overhead. However, in order to not grow the testing > > matrix infinitely and to perform important refactoring if > necessary, we > > only guarantee upgrades with a step size of a single minor version > (i.e. > > a cascade of upgrades). > > > > 2. Annotation Design > > > > I also adopted the multiple annotations design for the previous > > supportPlanFormat. So no array of versions anymore. I reworked the > > section, please have a look with updated examples: > > > > > > > > >> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336489#FLIP190:SupportVersionUpgradesforTableAPI > > > > I also got the feedback offline that `savepoint` might not be the > right > > terminology for the annotation. I changed that to minPlanVersion and > > minStateVersion. > > > > Let me know what you think. > > > > Regards, > > Timo > > > > > > > > On 09.12.21 08:44, Jing Zhang
Re: [DISCUSS] FLIP-200: Support Multiple Rule and Dynamic Rule Changing (Flink CEP)
Hi Konstantin, Thanks for your feedback. The point of adding a timestamp to each rule that determines the start time from which the rule should be in effect makes sense to me. At present, the timestamp defaults to the current time, so there is no timestamp field representing that start time. Regarding the renaming, your suggestion looks good to me and introduces no new concepts. But does the DynamicPattern renaming introduce a new Rule concept, or reuse the existing Pattern concept? Best, Nicholas Jiang On 2021/12/13 07:45:04 Konstantin Knauf wrote: > Thanks, Yufeng, for starting this discussion. I think this will be a very > popular feature. I've seen a lot of users asking for this in the past. So, > generally big +1. > > I think we should have a rough idea on how to expose this feature in the > other APIs. > > Two ideas: > > 1. In order to make this more deterministic in case of reprocessing and > out-of-orderness, I am wondering if we can add a timestamp to each rule > that determines the start time from which the rule should be in effect. > This can be an event or a processing time depending on the characteristics > of the pipeline. The timestamp would default to Long.MIN_TIMESTAMP if not > provided, which means effectively immediately. This could also be a follow > up, if you think it will make the implementation too complicated initially. > > 2. I am wondering, if we should name Rule->DynamicPatternHolder or so and > CEP.rule-> CEP.dynamicPatterns instead (other classes correspondingly)? > Rule is quite ambiguous and DynamicPattern seems more descriptive to me. > > On Mon, Dec 13, 2021 at 4:30 AM Nicholas Jiang > wrote: > > > Hi Martijn, > > > >IMO, in this FLIP, we only need to introduce the general design of the > > Table API/SQL level. As for the design details, you can create a new FLIP. > > And do we need to take into account the support for Batch mode if you > > expand the MATCH_RECOGNIZE function? About the dynamic rule engine design, > > do you have any comments?
This core of the FLIP is about the multiple rule > > and dynamic rule changing mechanism. > > > > Best, > > Nicholas Jiang > > > > > -- > > Konstantin Knauf > > https://twitter.com/snntrable > > https://github.com/knaufk >