[jira] [Created] (FLINK-21505) Enforce common savepoint format at the operator level

2021-02-25 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-21505: Summary: Enforce common savepoint format at the operator level Key: FLINK-21505 URL: https://issues.apache.org/jira/browse/FLINK-21505 Project: Flink

Re: [DISCUSS] Removal of flink-swift-fs-hadoop module

2021-01-27 Thread Aljoscha Krettek
+1 I'm almost always in favour of removing old code instead of continuing to let it rot. Best, Aljoscha On 2021/01/26 14:11, Robert Metzger wrote: Hi all, during a security maintenance PR [1], Chesnay noticed that the flink-swift-fs-hadoop module is lacking test coverage [2]. Also, there

[jira] [Created] (FLINK-21151) Extract common full-snapshot writer from RocksDB full-snapshot strategy

2021-01-26 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-21151: Summary: Extract common full-snapshot writer from RocksDB full-snapshot strategy Key: FLINK-21151 URL: https://issues.apache.org/jira/browse/FLINK-21151

Re: [ANNOUNCE] New formatting rules are now in effect

2021-01-25 Thread Aljoscha Krettek
I've always been using the most recent IntelliJ plugin version and it's fine for all of my code so far and it was never a problem when I worked on the initial reformatting. For the rare case where more recent versions of the plugin would produce formatting that is incompatible with 1.7.5 our

Re: [VOTE] FLIP-147: Support Checkpoint After Tasks Finished

2021-01-20 Thread Aljoscha Krettek
+1 (binding) Best, Aljoscha On 2021/01/15 22:55, Yun Gao wrote: Hi all, I would like to start the vote for FLIP-147[1], which proposes to support checkpoints after tasks finished and is discussed in [2]. The vote will last at least 72 hours (Jan 20th due to weekend), following the

Re: [Vote] FLIP-157 Migrate Flink Documentation from Jekyll to Hugo

2021-01-20 Thread Aljoscha Krettek
+1 Aljoscha On 2021/01/18 08:29, Seth Wiesman wrote: Hi devs, The discussion of the FLIP-157 [1] seems to have reached a consensus through the mailing thread [2]. I would like to start a vote for it. The vote will be open until 20th January (72h), unless there is an objection or not enough

Re: [DISCUSS] FLIP-147: Support Checkpoints After Tasks Finished

2021-01-15 Thread Aljoscha Krettek
Thanks for the summary! I think we can now move towards a [VOTE] thread, right? On 2021/01/15 13:43, Yun Gao wrote: 1) For the problem that the "new" root task coincidentally finished before getting triggered successfully, we have listed two options in the FLIP-147[1], for the first version,

Re: [DISCUSS] Planning Flink 1.13

2021-01-14 Thread Aljoscha Krettek
On 2021/01/14 10:28, Till Rohrmann wrote: I've created a 1.13 wiki page [1] where we can collect the features we want to complete for the 1.13 release. [1] https://cwiki.apache.org/confluence/display/FLINK/1.13+Release Thanks! I've started adding things to this.

Re: [DISCUSS] Planning Flink 1.13

2021-01-14 Thread Aljoscha Krettek
+1 to the rough feature freeze date and the proposed release managers Thanks for taking care of this! Best, Aljoscha On 2021/01/13 15:47, Dawid Wysakowicz wrote: Hi all, With the 1.12 being released some time ago already I thought it would be a good time to kickstart the 1.13 release cycle.

Re: [DISCUSS] FLIP-157 Migrate Flink Documentation from Jekyll to Hugo

2021-01-14 Thread Aljoscha Krettek
+1 The build times on Jekyll have just become too annoying for me. I realize that that is also a function of how we structure our documentation, and especially how we construct the nav sidebar, but I think overall moving to Hugo is still a benefit. Aljoscha On 2021/01/13 10:14, Seth Wiesman

Re: Roadmap for Execution Mode (Batch/Streaming) and interaction with Table/SQL APIs

2021-01-11 Thread Aljoscha Krettek
Also cc'ing dev@flink.apache.org On 2021/01/06 09:19, burkaygur wrote: 1) How do these changes impact the Table and SQL APIs? Are they completely orthogonal or can we get the benefits of the new Batch Mode with Flink SQL as well? The answer here is a bit complicated. The Table API/SQL already

Re: [DISCUSS] Support registering custom JobStatusListeners when scheduling a job

2021-01-08 Thread Aljoscha Krettek
/09/28 11:14, 季文昊 wrote: Hi Aljoscha, Yes, that is not enough, since the `JobListener`s are called only once when `execute()` or `executeAsync()` is called. And in order to sync the status, I also have to call `JobClient#getJobStatus` periodically. On Fri, Sep 25, 2020 at 8:12 PM Aljoscha Krettek

Re: [DISCUSS]FLIP-150: Introduce Hybrid Source

2021-01-08 Thread Aljoscha Krettek
Hi Nicholas, Thanks for starting the discussion! I think we might be able to simplify this a bit and re-use existing functionality. There is already `Source.restoreEnumerator()` and `SplitEnumerator.snapshotState()`. This seems to be roughly what the Hybrid Source needs. When the initial

Re: Kafka producer exactly once

2021-01-08 Thread Aljoscha Krettek
On 2021/01/08 10:00, Piotr Nowojski wrote: Moreover I don't think there is a way to implement exactly once producer without some use of transactions one way or another. There are some ways I can think of. If messages have consistent IDs, we could check whether a message is already in Kafka
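The ID-based idea mentioned in this message can be sketched in plain Java. This is only an illustration under stated assumptions: the class `IdempotentLog` and its methods are hypothetical, an in-memory set stands in for the "is this message already in Kafka" lookup, and nothing here uses or reflects the Kafka client API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a log that rejects records whose ID was already written.
// A real system would have to look the ID up in Kafka itself; that lookup is assumed here.
class IdempotentLog {
    private final Set<String> seenIds = new HashSet<>();
    private final List<String> records = new ArrayList<>();

    // Appends the payload only if no record with the same ID exists yet.
    // Returns true if the record was written, false if it was a duplicate.
    boolean appendIfAbsent(String id, String payload) {
        if (!seenIds.add(id)) {
            return false; // ID already present: a replay after failure is skipped
        }
        records.add(payload);
        return true;
    }

    int size() {
        return records.size();
    }
}
```

With consistent IDs, replaying the same message after a failure leaves the log unchanged, which is the property the message alludes to.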

Re: Kafka producer exactly once

2021-01-08 Thread Aljoscha Krettek
On 2021/01/07 14:17, Pramod Immaneni wrote: Is there a Kafka producer that can do exactly once semantic without the use of transactions? I'm afraid not right now. There are some ideas about using a WAL (write ahead log) and then periodically "shipping" that to Kafka but nothing concrete.

[DISCUSS] Backport broadcast operations in BATCH mode to Flink 1.12.x

2021-01-07 Thread Aljoscha Krettek
Hi, what do you think about backporting FLINK-20491 [1] to Flink 1.12.x? I (we, including Dawid and Kostas) are a bit torn on this. a) It's a limitation of Flink 1.12.0 and fixing this seems very good for users that would otherwise have to wait until Flink 1.13.0. b) It's

Re: [DISCUSS] FLIP-147: Support Checkpoints After Tasks Finished

2021-01-07 Thread Aljoscha Krettek
This is somewhat unrelated to the discussion about how to actually do the triggering when sources shut down, I'll write on that separately. I just wanted to get this quick thought out. For letting operators decide whether they actually want to wait for a final checkpoint, which is relevant at

Re: [DISCUSS] FLIP-147: Support Checkpoints After Tasks Finished

2021-01-06 Thread Aljoscha Krettek
On 2021/01/06 16:05, Arvid Heise wrote: thanks for the detailed example. It feels like Aljoscha and you are also not fully aligned yet. For me, it sounded as if Aljoscha would like to avoid sending RPC to non-source subtasks. No, I think we need the triggering of intermediate operators. I was

Re: [DISCUSS] FLIP-147: Support Checkpoints After Tasks Finished

2021-01-06 Thread Aljoscha Krettek
On 2021/01/06 13:35, Arvid Heise wrote: I was actually not thinking about concurrent checkpoints (and actually want to get rid of them once UC is established, since they are addressing the same thing). I would give a yuge +1 to that. I don't see why we would need concurrent checkpoints in

Re: Apache Pinot Sink

2021-01-06 Thread Aljoscha Krettek
, Aljoscha Krettek wrote: It's great to see interest in this. Were you planning to use the new Sink interface that we recently introduced? [1] Best, Aljoscha [1] https://s.apache.org/FLIP-143 On 2021/01/05 12:21, Poerschke, Mats wrote: Hi all, we want to contribute a sink connector for Apache Pinot

Re: [DISCUSS] FLIP-147: Support Checkpoints After Tasks Finished

2021-01-06 Thread Aljoscha Krettek
On 2021/01/06 11:30, Arvid Heise wrote: I'm assuming that this is the normal case. In a A->B graph, as soon as A finishes, B still has a couple of input buffers to process. If you add backpressure or longer pipelines into the mix, it's quite likely that a checkpoint may occur with B being the

Re: Apache Pinot Sink

2021-01-06 Thread Aljoscha Krettek
It's great to see interest in this. Were you planning to use the new Sink interface that we recently introduced? [1] Best, Aljoscha [1] https://s.apache.org/FLIP-143 On 2021/01/05 12:21, Poerschke, Mats wrote: Hi all, we want to contribute a sink connector for Apache Pinot. The following

Re: [DISCUSS] FLIP-147: Support Checkpoints After Tasks Finished

2021-01-06 Thread Aljoscha Krettek
On 2021/01/05 17:27, Arvid Heise wrote: For your question: will there ever be intermediate operators that should be running that are not connected to at least one source? I think there are plenty of examples if you go beyond chained operators and fully connected exchanges. Think of any fan-in,

Re: [DISCUSS] Drop Scala 2.11

2021-01-05 Thread Aljoscha Krettek
2.12.7. It is binary incompatible with version 2.12 above (https://issues.apache.org/jira/browse/FLINK-12461). That would be great to at least move to a more recent 2.12 version, and ideally to 2.13.

Re: [DISCUSS] FLIP-147: Support Checkpoints After Tasks Finished

2021-01-05 Thread Aljoscha Krettek
On 2021/01/05 10:16, Arvid Heise wrote: 1. I'd think that this is an orthogonal issue, which I'd solve separately. My gut feeling says that this is something we should only address for new sinks where we decouple the semantics of commits and checkpoints anyways. @Aljoscha Krettek any idea

[PSA] Configure "Save Actions" only for Java files

2021-01-05 Thread Aljoscha Krettek
If you're using "Save Actions" to auto-format your Java code, as recommended in [1], you should add a regex in the settings to make sure that this only formats Java code. Otherwise you will get weird results when IntelliJ also formats XML, Markdown or Scala files for you. Best, Aljoscha [1]
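To illustrate the kind of filter this note describes, the Java sketch below checks a candidate pattern against a few file names. The pattern `.*\.java` and the class name `SaveActionsFilter` are assumptions made for illustration; the message does not state the exact regex or the settings field it goes into.

```java
import java.util.regex.Pattern;

// Hypothetical inclusion filter: only paths ending in ".java" are formatted.
// The exact regex to enter in the Save Actions settings is an assumption.
class SaveActionsFilter {
    static final Pattern JAVA_ONLY = Pattern.compile(".*\\.java");

    // Returns true if the whole path matches the Java-only pattern.
    static boolean shouldFormat(String path) {
        return JAVA_ONLY.matcher(path).matches();
    }

    public static void main(String[] args) {
        String[] paths = {"src/Foo.java", "pom.xml", "README.md", "src/Bar.scala"};
        for (String p : paths) {
            System.out.println(p + " -> " + shouldFormat(p));
        }
    }
}
```

Because `matches()` requires the whole path to match, XML, Markdown and Scala files fall outside the pattern and would be left alone.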

Re: [DISCUSS] Allow streaming operators to use managed memory

2021-01-04 Thread Aljoscha Krettek
I agree, we should allow streaming operators to use managed memory for other use cases. Do you think we need an additional "consumer" setting or that they would just use `DATAPROC` and decide by themselves what to use the memory for? Best, Aljoscha On 2020/12/22 17:14, Jark Wu wrote: Hi

[jira] [Created] (FLINK-20843) UnalignedCheckpointITCase is unstable

2021-01-04 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-20843: Summary: UnalignedCheckpointITCase is unstable Key: FLINK-20843 URL: https://issues.apache.org/jira/browse/FLINK-20843 Project: Flink Issue Type

Re: Changing application configuration when restoring from checkpoint/savepoint

2020-12-28 Thread Aljoscha Krettek
On Wed Dec 16, 2020 at 6:41 PM CET, vishalovercome wrote: 1. Is there any way to restore from a checkpoint as well as recreate client using newer configuration? I think that would only work if you somehow read the configuration from an external system 2. If we take a savepoint (drain

Re: [DISCUSS] Enforce common opinionated coding style using Spotless

2020-12-17 Thread Aljoscha Krettek
cle having started and christmas/vacations being around the corner. On 12/16/2020 7:20 PM, Aljoscha Krettek wrote: Let's try and conclude this discussion! I've prepared a PoC that uses Spotless with google-java-format to do the formatting: https://github.com/aljoscha/flink/commits/

[jira] [Created] (FLINK-20651) Use Spotless/google-java-format for code formatting/enforcement

2020-12-17 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-20651: Summary: Use Spotless/google-java-format for code formatting/enforcement Key: FLINK-20651 URL: https://issues.apache.org/jira/browse/FLINK-20651 Project

Re: [DISCUSS] Enforce common opinionated coding style using Spotless

2020-12-16 Thread Aljoscha Krettek
a On 19.10.20 12:36, Aljoscha Krettek wrote: I don't like checkstyle because it cannot be easily applied from the commandline. I'm happy to learn otherwise, though. And I'd also be very happy about alternative suggestions that can do that. Right now, I think Spotless is the most straightforward to

Re: Kafka Consumer Deserializer Exception on application mode

2020-12-16 Thread Aljoscha Krettek
I believe this is caused by dependency conflicts/mismatch. I also commented on this in the Jira issue. Best, Aljoscha On 16.12.20 07:39, han guoguo wrote: Hi, Kafka source may have some issues in application mode: when I run it in application mode on Flink 1.11.2 it can't start up. The detail

Re: [DISCUSS] FLIP-147: Support Checkpoints After Tasks Finished

2020-12-15 Thread Aljoscha Krettek
Thanks for the thorough update! I'll answer inline. On 14.12.20 16:33, Yun Gao wrote: 1. To include EndOfPartition into consideration for barrier alignment at the TM side, we now tend to decouple the logic for EndOfPartition with the normal alignment behaviors to avoid the complex

Re: [DISCUSS] Make SQL docs Blink only

2020-12-08 Thread Aljoscha Krettek
+1 Yes, please! On 08.12.20 16:52, David Anderson wrote: I agree -- I think separating out the legacy planner info should make things clearer for everyone, and then some day we can simply drop it. Plus, doing it now will make it easier to make improvements to the docs going forward. David On

[jira] [Created] (FLINK-20491) Support Broadcast State in BATCH execution mode

2020-12-04 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-20491: Summary: Support Broadcast State in BATCH execution mode Key: FLINK-20491 URL: https://issues.apache.org/jira/browse/FLINK-20491 Project: Flink

[jira] [Created] (FLINK-20302) Suggest DataStream API with BATCH execution mode in DataSet docs

2020-11-23 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-20302: Summary: Suggest DataStream API with BATCH execution mode in DataSet docs Key: FLINK-20302 URL: https://issues.apache.org/jira/browse/FLINK-20302 Project

Re: [DISCUSS] Stop adding new bash-based e2e tests to Flink

2020-11-18 Thread Aljoscha Krettek
+1 And I want to second Arvid's mention of testcontainers [1]. [1] https://www.testcontainers.org/ On 18.11.20 10:43, Yang Wang wrote: Thanks Till and Jark for sharing the information. I am also +1 for this proposal and glad to wire the newly introduced K8s HA e2e tests to Java-based

[jira] [Created] (FLINK-20153) Add documentation for BATCH execution mode

2020-11-13 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-20153: Summary: Add documentation for BATCH execution mode Key: FLINK-20153 URL: https://issues.apache.org/jira/browse/FLINK-20153 Project: Flink Issue

[jira] [Created] (FLINK-20098) Don't add flink-connector-files to flink-dist

2020-11-11 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-20098: Summary: Don't add flink-connector-files to flink-dist Key: FLINK-20098 URL: https://issues.apache.org/jira/browse/FLINK-20098 Project: Flink Issue

Re: Register processing time timers when Operator.close() is called

2020-11-11 Thread Aljoscha Krettek
Hi! This is an interesting topic and we recently created a Jira issue about this: https://issues.apache.org/jira/browse/FLINK-18647. In Beam we even have a workaround for this:

Re: [DISCUSS] Planning the 1.12 release testing

2020-11-10 Thread Aljoscha Krettek
On 04.11.20 20:05, Robert Metzger wrote: For testing Flink in a coordinated way, and to allow the broader community to participate in the testing more easily, I've created a wiki page to collect testing tasks: https://cwiki.apache.org/confluence/display/FLINK/1.12+Release+-+Community+Testing

[jira] [Created] (FLINK-20001) Don't use setAllVerticesInSameSlotSharingGroupByDefault in StreamGraphGenerator

2020-11-05 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-20001: Summary: Don't use setAllVerticesInSameSlotSharingGroupByDefault in StreamGraphGenerator Key: FLINK-20001 URL: https://issues.apache.org/jira/browse/FLINK-20001

[jira] [Created] (FLINK-19932) Add integration test for BATCH execution on DataStream API

2020-11-02 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-19932: Summary: Add integration test for BATCH execution on DataStream API Key: FLINK-19932 URL: https://issues.apache.org/jira/browse/FLINK-19932 Project: Flink

Re: [VOTE] Remove flink-connector-filesystem module.

2020-10-30 Thread Aljoscha Krettek
+1 Aljoscha On 29.10.20 09:18, Kostas Kloudas wrote: Hi all, Following the discussion in [1], I would like to start a vote on removing the flink-connector-filesystem module which includes the BucketingSink. The vote will be open till November 3rd (72h, excluding the weekend) unless there is

[jira] [Created] (FLINK-19837) Don't emit intermediate watermarks from watermark operators in BATCH execution mode

2020-10-27 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-19837: Summary: Don't emit intermediate watermarks from watermark operators in BATCH execution mode Key: FLINK-19837 URL: https://issues.apache.org/jira/browse/FLINK-19837

[jira] [Created] (FLINK-19835) Don't emit intermediate watermarks in BATCH execution mode

2020-10-27 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-19835: Summary: Don't emit intermediate watermarks in BATCH execution mode Key: FLINK-19835 URL: https://issues.apache.org/jira/browse/FLINK-19835 Project: Flink

[jira] [Created] (FLINK-19833) Rename Sink API Writer interface to SinkWriter

2020-10-27 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-19833: Summary: Rename Sink API Writer interface to SinkWriter Key: FLINK-19833 URL: https://issues.apache.org/jira/browse/FLINK-19833 Project: Flink Issue

Re: [DISCUSS] Support KeyedSortedMapState in DataStream API

2020-10-27 Thread Aljoscha Krettek
Do we think this will be useful for users or do we first want to introduce this for internal use cases, such as the Table API/SQL runner? Aljoscha On 15.10.20 10:35, Sean Z wrote: Hi Jark, Thanks for the reply and sharing thoughts. Yes, negative long will make things complicated. We had the

Re: [DISCUSS] Release 1.12 Feature Freeze

2020-10-19 Thread Aljoscha Krettek
@Robert Your (and Dian's) suggestions sound good to me! I like keeping master frozen for a while since it will prevent a lot of duplicate merging efforts. Regarding the date: I'm fine with the proposed date but I can also see that extending it to the end of the week could be helpful.

Re: [DISCUSS] Enforce common opinionated coding style using Spotless

2020-10-19 Thread Aljoscha Krettek
as convenient to use. [1] https://www.moxio.com/blog/43/ignoring-bulk-change-commits-with-git-blame [2] https://github.community/t/support-ignore-revs-file-in-githubs-blame-view/3256 On Tue, Oct 6, 2020 at 6:00 PM Aljoscha Krettek wrote: Maybe I wasn't very clear on how the ratche

[jira] [Created] (FLINK-19671) Update EditorConfig file to be useful

2020-10-16 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-19671: Summary: Update EditorConfig file to be useful Key: FLINK-19671 URL: https://issues.apache.org/jira/browse/FLINK-19671 Project: Flink Issue Type

Re: [DISCUSS] FLIP-146: Improve new TableSource and TableSink interfaces

2020-10-14 Thread Aljoscha Krettek
think? Best, Jingsong On Tue, Oct 13, 2020 at 5:01 PM Aljoscha Krettek wrote: Hi Jingsong, I'm sorry, I didn't want to block you for so long on this. I thought about it again. I think it's fine to add a DataStream Provider if this really unblocks users from migrating to newer Flink versions

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-13 Thread Aljoscha Krettek
On 13.10.20 14:01, David Anderson wrote: I thought this was waiting on FLIP-46 -- Graceful Shutdown Handling -- and in fact, the StreamingFileSink is mentioned in that FLIP as a motivating use case. Ah yes, I see FLIP-147 as a more general replacement for FLIP-46. Thanks for the reminder, we

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-13 Thread Aljoscha Krettek
On 13.10.20 11:18, David Anderson wrote: I think the pertinent question is whether there are interesting cases where the BucketingSink is still a better choice. One case I'm not sure about is the situation described in docs for the StreamingFileSink under Important Note 2 [1]: ... upon

Re: [DISCUSS] FLIP-146: Improve new TableSource and TableSink interfaces

2020-10-13 Thread Aljoscha Krettek
? Best, Jingsong On Tue, Sep 29, 2020 at 8:48 PM Aljoscha Krettek wrote: Hi, I'll only respond regarding the parallelism for now because I need to think some more about DataStream. What I'm saying is that exposing a parallelism only for Table Connectors is not the right thing. If we want to allow

Re: Wrapping a Flink Function

2020-10-12 Thread Aljoscha Krettek
Could you maybe outline how you want to extend the wrapped sink functionality? A better approach might be to add an operation "in front" of the sink. Best, Aljoscha On 08.10.20 11:32, Lorenzo Pirazzini wrote: Hello, I'm having trouble finding a way to add logic to an existing SinkFunction.

[jira] [Created] (FLINK-19521) Support dynamic properties on DefaultCLI

2020-10-07 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-19521: Summary: Support dynamic properties on DefaultCLI Key: FLINK-19521 URL: https://issues.apache.org/jira/browse/FLINK-19521 Project: Flink Issue Type

Re: [DISCUSS] Enforce common opinionated coding style using Spotless

2020-10-06 Thread Aljoscha Krettek
ation scheme because they _just don't work_. On 10/6/2020 2:15 PM, Aljoscha Krettek wrote: Hi All, I know I know, but please keep reading because I recently learned about some new developments in the area of coding-style automation. The tool I would propose we use is Spotless (https://github.c

[DISCUSS] Enforce common opinionated coding style using Spotless

2020-10-06 Thread Aljoscha Krettek
Hi All, I know I know, but please keep reading because I recently learned about some new developments in the area of coding-style automation. The tool I would propose we use is Spotless (https://github.com/diffplug/spotless). This doesn't come with a formatter but allows using other popular

[jira] [Created] (FLINK-19508) Add collect() operation on DataStream

2020-10-06 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-19508: Summary: Add collect() operation on DataStream Key: FLINK-19508 URL: https://issues.apache.org/jira/browse/FLINK-19508 Project: Flink Issue Type

[jira] [Created] (FLINK-19493) In CliFrontend, make flow of Configuration more obvious

2020-10-02 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-19493: Summary: In CliFrontend, make flow of Configuration more obvious Key: FLINK-19493 URL: https://issues.apache.org/jira/browse/FLINK-19493 Project: Flink

Re: How to clean up resources in a UDF?

2020-10-02 Thread Aljoscha Krettek
, 2020 at 2:55 AM Aljoscha Krettek wrote: Hi! Yes, AbstractRichFunction.close() would be the right place to do cleanup. This method is called both in case of successful finishing and also in the case of failures. For BATCH execution, Flink will do backtracking upwards from the failed task(s

Re: How to clean up resources in a UDF?

2020-10-01 Thread Aljoscha Krettek
Hi! Yes, AbstractRichFunction.close() would be the right place to do cleanup. This method is called both in case of successful finishing and also in the case of failures. For BATCH execution, Flink will do backtracking upwards from the failed task(s) to see if intermediate results from

[jira] [Created] (FLINK-19479) Allow explicitly configuring time behaviour on KeyedStream.intervalJoin()

2020-09-30 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-19479: Summary: Allow explicitly configuring time behaviour on KeyedStream.intervalJoin() Key: FLINK-19479 URL: https://issues.apache.org/jira/browse/FLINK-19479

Re: CheckpointedFunction initialization during checkpoint

2020-09-29 Thread Aljoscha Krettek
Hi Teng, I think if the system is slowed down enough it can happen that some parts of the graph are still restoring while others are already taking a checkpoint. By virtue of how checkpointing works (by sending barriers along the network connections between tasks) this should not be a

Re: [DISCUSS] FLIP-146: Improve new TableSource and TableSink interfaces

2020-09-29 Thread Aljoscha Krettek
ad for db and poor performance because of too many small requests if the optimizer didn't know such information, and set a large parallelism for sink when matching the parallelism of its input. On Thu, 24 Sep 2020 at 22:41, Aljoscha Krettek wrote: Thanks for the proposal! I think the use cas

Re: [VOTE] FLIP-143: Unified Sink API

2020-09-25 Thread Aljoscha Krettek
+1 (binding) Aljoscha On 25.09.20 14:26, Guowei Ma wrote: From the discussion[1] we could find that FLIP focuses on providing a unified transactional sink API. So I updated the FLIP's title to "Unified Transactional Sink API". But I found that the old link could not be opened again. I would

Re: [DISCUSS] Support registering custom JobStatusListeners when scheduling a job

2020-09-25 Thread Aljoscha Krettek
Hi, I understand from your email that `StreamExecutionEnvironment.registerJobListener()` would not be enough for you because you want to be notified of changes on the cluster side, correct? That is when the job status changes on the master. Best, Aljoscha On 23.09.20 14:31, 季文昊 wrote: Hi

Re: [DISCUSS] FLIP-146: Improve new TableSource and TableSink interfaces

2020-09-24 Thread Aljoscha Krettek
Thanks for the proposal! I think the use cases that we are trying to solve are indeed valid. However, I think we might have to take a step back to look at what we're trying to solve and how we can solve it. The FLIP seems to have two broader topics: 1) add "get parallelism" to sinks/sources

Re: [VOTE] Apache Flink Stateful Functions 2.2.0, release candidate #2

2020-09-24 Thread Aljoscha Krettek
+1 (binding) - built from source - built docker image - verified Rust SDK works with the 2.2.0 docker image Aljoscha On 24.09.20 10:32, Tzu-Li (Gordon) Tai wrote: FYI - the PR for the release announcement has just been drafted: https://github.com/apache/flink-web/pull/379 Any comments

[jira] [Created] (FLINK-19377) Task can swallow test exceptions which hides test failures

2020-09-23 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-19377: Summary: Task can swallow test exceptions which hides test failures Key: FLINK-19377 URL: https://issues.apache.org/jira/browse/FLINK-19377 Project: Flink

Re: [DISCUSS] FLIP-142: Disentangle StateBackends from Checkpointing

2020-09-23 Thread Aljoscha Krettek
On 23.09.20 04:40, Yu Li wrote: To be specific, with the old API users don't need to set checkpoint storage, instead they only need to pass the checkpoint path w/o caring about the storage. The new APIs are forcing users to set the storage so they have to know the difference between different

Re: [DISCUSS] FLIP-143: Unified Sink API

2020-09-23 Thread Aljoscha Krettek
Yes, that sounds good! I'll probably have some comments on the FLIP about the names of generic parameters and the Javadoc but we can address them later or during implementation. I also think that we probably need the FAIL,RETRY,SUCCESS result for globalCommit() but we can also do that as a

Re: [DISCUSS] FLIP-143: Unified Sink API

2020-09-22 Thread Aljoscha Krettek
kpoint won't include the committed GlobalCommT. Maybe GlobalCommitter can have an API like this? List snapshotState(); But then we still need the recover API if we don't let sink directly manage the state. List recoverCommittables(List) Thanks, Steven On Tue, Sep 22, 2020 at 6:33 AM Al

Re: [DISCUSS] FLIP-143: Unified Sink API

2020-09-22 Thread Aljoscha Krettek
On 22.09.20 13:26, Guowei Ma wrote: Actually I am not sure adding `isAvailable` is enough. Maybe it is not. But for the initial version I hope we could make the sink api sync because there is already a lot of stuff that has to finish. :--) I agree, for the first version we should stick to a

Re: [DISCUSS] FLIP-143: Unified Sink API

2020-09-22 Thread Aljoscha Krettek
On 22.09.20 11:10, Guowei Ma wrote: 1. I think maybe we could add a EOI interface to the `GlobalCommit`. So that we could make `write success file` be available in both batch and stream execution mode. We could, yes. I'm now hesitant because we're adding more things but I think it should be

Re: [DISCUSS] FLIP-143: Unified Sink API

2020-09-22 Thread Aljoscha Krettek
Ah sorry, I think I now see what you mean. I think it's ok to add a `List recoverCommittables(List)` method. On 22.09.20 09:42, Aljoscha Krettek wrote: On 22.09.20 06:06, Steven Wu wrote: In addition, it is undesirable to do the committed-or-not check in the commit method, which happens

Re: [DISCUSS] FLIP-143: Unified Sink API

2020-09-22 Thread Aljoscha Krettek
On 22.09.20 06:06, Steven Wu wrote: In addition, it is undesirable to do the committed-or-not check in the commit method, which happens for each checkpoint cycle. CommitResult already indicates SUCCESS or not. when framework calls commit with a list of GlobalCommittableT, it should be certain

Re: [DISCUSS] FLIP-143: Unified Sink API

2020-09-21 Thread Aljoscha Krettek
can check each individual GlobalCommT (ManifestFile) with Iceberg snapshot metadata. Thanks, Steven [1] https://github.com/Netflix-Skunkworks/nfflink-connector-iceberg/blob/master/nfflink-connector-iceberg/src/main/java/com/netflix/spaas/nfflink/connector/iceberg/sink/IcebergCommitter.java#L56

[jira] [Created] (FLINK-19326) Allow explicitly configuring time behaviour on CEP PatternStream

2020-09-21 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-19326: Summary: Allow explicitly configuring time behaviour on CEP PatternStream Key: FLINK-19326 URL: https://issues.apache.org/jira/browse/FLINK-19326 Project

[jira] [Created] (FLINK-19319) Deprecate StreamExecutionEnvironment.setStreamTimeCharacteristic()

2020-09-21 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-19319: Summary: Deprecate StreamExecutionEnvironment.setStreamTimeCharacteristic() Key: FLINK-19319 URL: https://issues.apache.org/jira/browse/FLINK-19319 Project

[jira] [Created] (FLINK-19318) Deprecate timeWindow() operations in DataStream API

2020-09-21 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-19318: Summary: Deprecate timeWindow() operations in DataStream API Key: FLINK-19318 URL: https://issues.apache.org/jira/browse/FLINK-19318 Project: Flink

[jira] [Created] (FLINK-19317) Make EventTime the default StreamTimeCharacteristic

2020-09-21 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-19317: Summary: Make EventTime the default StreamTimeCharacteristic Key: FLINK-19317 URL: https://issues.apache.org/jira/browse/FLINK-19317 Project: Flink

[jira] [Created] (FLINK-19316) FLIP-134: Batch execution for the DataStream API

2020-09-21 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-19316: Summary: FLIP-134: Batch execution for the DataStream API Key: FLINK-19316 URL: https://issues.apache.org/jira/browse/FLINK-19316 Project: Flink

[RESULT][VOTE] FLIP-134: Batch execution for the DataStream API

2020-09-21 Thread Aljoscha Krettek
Hi all, The voting time for FLIP-134 [1] has passed. I'm closing the vote now. Including my implicit vote, there were 7 + 1 votes, 4 of which are binding: - Dawid Wysakowicz (binding) - Gao Yun - Ma Guowei - David Anderson (binding) - Kostas Kloudas (binding) - Peter Huang - Aljoscha Krettek

Re: [VOTE] FLIP-36 - Support Interactive Programming in Flink Table API

2020-09-18 Thread Aljoscha Krettek
+1 (binding) Best, Aljoscha

Re: [DISCUSS] Deprecate and remove UnionList OperatorState

2020-09-18 Thread Aljoscha Krettek
On 12.09.20 17:20, Alexey Trenikhun wrote: We use union state to generate sequences, each operator generates offset0 + number-of-tasks - task-index + task-specific-counter * number-of-tasks (e.g. for 2 instances of the operator, one instance produces even numbers, the other odd). Last generated
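The sequence formula quoted in this message can be worked through with a small Java sketch. The method and parameter names are hypothetical, and the sketch assumes task indices start at 0 and counters start at 0; neither is stated in the message.

```java
// Illustrates the quoted formula:
// offset0 + number-of-tasks - task-index + task-specific-counter * number-of-tasks
class SequenceSketch {
    // Assumes a 0-based taskIndex and a per-task counter that starts at 0.
    static long sequenceValue(long offset0, int numberOfTasks, int taskIndex, long counter) {
        return offset0 + numberOfTasks - taskIndex + counter * numberOfTasks;
    }

    public static void main(String[] args) {
        int numberOfTasks = 2;
        for (int taskIndex = 0; taskIndex < numberOfTasks; taskIndex++) {
            StringBuilder line = new StringBuilder("task " + taskIndex + ":");
            for (long counter = 0; counter < 3; counter++) {
                line.append(' ').append(sequenceValue(0, numberOfTasks, taskIndex, counter));
            }
            // With offset0 = 0: task 0 yields 2 4 6, task 1 yields 1 3 5.
            System.out.println(line);
        }
    }
}
```

Under these assumptions the two tasks never produce the same value, since each task's values share one residue modulo the number of tasks.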

Re: [DISCUSS] FLIP-143: Unified Sink API

2020-09-18 Thread Aljoscha Krettek
Steven, we were also wondering if it is a strict requirement that "later" updates to Iceberg subsume earlier updates. In the current version, you only check whether checkpoint X made it to Iceberg and then discard all committable state from Flink state for checkpoints smaller than X. If we go
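The "check checkpoint X, then discard older committables" behaviour described here can be sketched with a sorted map. This is a plain-Java illustration, not the Iceberg or Flink sink code: the class `PendingCommittables` and its methods are hypothetical, and dropping the entry for X itself (not only smaller checkpoints) is an assumption of the sketch.

```java
import java.util.TreeMap;

// Hypothetical holder for committables keyed by checkpoint ID.
class PendingCommittables {
    private final TreeMap<Long, String> pending = new TreeMap<>();

    void add(long checkpointId, String committable) {
        pending.put(checkpointId, committable);
    }

    // Once checkpoint X is known to be committed, drop every entry up to X.
    // The message only mentions checkpoints smaller than X; including X is assumed.
    void discardUpTo(long committedCheckpointId) {
        pending.headMap(committedCheckpointId, true).clear();
    }

    int size() {
        return pending.size();
    }

    boolean contains(long checkpointId) {
        return pending.containsKey(checkpointId);
    }
}
```

This only works if a later commit really subsumes the earlier ones, which is the requirement the message asks about.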

Re: [DISCUSS] FLIP-143: Unified Sink API

2020-09-17 Thread Aljoscha Krettek
Thanks for the summary! On 16.09.20 06:29, Guowei Ma wrote: ## Consensus 1. The motivation of the unified sink API is to decouple the sink implementation from the different runtime execution mode. 2. The initial scope of the unified sink API only covers the file system type, which supports the

Re: [DISCUSS] FLIP-142: Disentangle StateBackends from Checkpointing

2020-09-16 Thread Aljoscha Krettek
/09/2020 16:48, Aljoscha Krettek wrote: I could try and come up with a longer name if you need it ... ;-) Aljoscha On 11.09.20 16:25, Seth Wiesman wrote: Having thought about it more, HashMapStateBackend has won me over. I'll update the FLIP. If there aren't any more comments I'll open it

[jira] [Created] (FLINK-19264) MiniCluster is flaky with concurrent job execution

2020-09-16 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-19264: Summary: MiniCluster is flaky with concurrent job execution Key: FLINK-19264 URL: https://issues.apache.org/jira/browse/FLINK-19264 Project: Flink

Re: [VOTE] FLIP-36 - Support Interactive Programming in Flink Table API

2020-09-15 Thread Aljoscha Krettek
+1 (binding) Nice work! :-) Aljoscha On 16.09.20 06:00, Xuannan Su wrote: Hi all, I'd like to start the vote for FLIP-36[1], which has been discussed in thread[2]. The vote will be open for 72h, until September 19, 2020, 04:00 AM UTC, unless there's an objection or not enough votes.

[jira] [Created] (FLINK-19247) Update Chinese documentation after removal of Kafka 0.10 and 0.11

2020-09-15 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-19247: Summary: Update Chinese documentation after removal of Kafka 0.10 and 0.11 Key: FLINK-19247 URL: https://issues.apache.org/jira/browse/FLINK-19247 Project

Re: [DISCUSS] FLIP-36 - Support Interactive Programming in Flink Table API

2020-09-15 Thread Aljoscha Krettek
On 15.09.20 10:54, Xuannan Su wrote: One way of solving this is to let the CatalogManager probe the existence of the IntermediateResult so that the planner can decide if the cache table should be used. That could be a reasonable solution, yes. Best, Aljoscha

Re: [DISCUSS] FLIP-143: Unified Sink API

2020-09-15 Thread Aljoscha Krettek
On 15.09.20 09:55, Dawid Wysakowicz wrote: BTW Let's not forget about Piotr's comment. I think we could add the isAvailable or similar method to the Writer interface in the FLIP. I'm not so sure about this, the sinks I'm aware of would not be able to implement this method: Kafka doesn't have

Re: [DISCUSS] FLIP-143: Unified Sink API

2020-09-15 Thread Aljoscha Krettek
On 15.09.20 06:05, Guowei Ma wrote: ## Using checkpointId In the batch execution mode there would be no normal checkpoint any more. That is why we do not introduce the checkpoint id in the API. So it is a great thing that sink decouples its implementation from checkpointid. :) Yes, this is a

Re: [DISCUSS] FLIP-143: Unified Sink API

2020-09-15 Thread Aljoscha Krettek
On 15.09.20 01:33, Steven Wu wrote: ## concurrent checkpoints @Aljoscha Krettek regarding the concurrent checkpoints, let me illustrate with a simple DAG below. [image: image.png] Hi Steven, images don't make it through to the mailing lists. You would need to host the file somewhere

Re: [DISCUSS] FLIP-36 - Support Interactive Programming in Flink Table API

2020-09-15 Thread Aljoscha Krettek
On 15.09.20 07:00, Xuannan Su wrote: Thanks for your comment. I agree that we should not introduce tight coupling with PipelineExecutor to the execution environment. With that in mind, to distinguish the per-job and session mode, we can introduce a new method, named isPerJobModeExecutor, in

Re: [VOTE] FLIP-142: Disentangle StateBackends from Checkpointing

2020-09-15 Thread Aljoscha Krettek
+1 (binding) Aljoscha
