AppClassLoader leakage to MiniCluster

2024-05-20 Thread Jan Lukavský
Hi, I was investigating a problem in Apache Beam test infrastructure after introduction of FlinkRunner for Flink 1.18 (see [1]). There was a persistent failure of integration test which uses `mvn exec:java` to run a word count example. The test (for FlinkRunner) uses MiniCluster and works

Chained rebalance() guarantees

2024-03-15 Thread Jan Lukavský
Hi, while implementing new transform for Apache Beam, we hit some questions about what guarantees a DataSet#rebalance() method has in terms of chaining. According to [1] there was some suboptimality if execution of chained rebalance with intermediate mapping operation. As a result the

Re: [DISCUSS] FLIP-327: Support stream-batch unified operator to improve job throughput when processing backlog data

2023-08-31 Thread Jan Lukavský
Hi, some keywords in this triggered my attention, so sorry for late jumping in, but I'd like to comprehend the nature of the proposal. I'll try to summarize my understanding: The goal of the FLIP is to support automatic switching between streaming and batch processing, leveraging the fact

Re: [ANNOUNCE] New Apache Flink Committer - David Morávek

2022-03-07 Thread Jan Lukavský
Congratulations David!  Jan On 3/7/22 09:44, Etienne Chauchot wrote: Congrats David ! Well deserved ! Etienne Le 07/03/2022 à 08:47, David Morávek a écrit : Thanks everyone! Best, D. On Sun 6. 3. 2022 at 9:07, Yuan Mei wrote: Congratulations, David! Best Regards, Yuan On Sat, Mar 5,

Re: Flink support for OrderedListState

2021-11-15 Thread Jan Lukavský
Hi Reuven, cc dev@flink.apache.org.  Jan On 11/12/21 04:35, Reuven Lax wrote: OrderedListState  was added to Beam over a year ago. To date it is only supported by the

Re: [SURVEY] What is the most subtle/hard to catch bug that people have seen?

2019-10-01 Thread Jan Lukavský
Hi, I'd add another one regarding Java hashCode() and its practical usability for distributed systems [1], although practically all (Java based) data processing systems rely on it. One bug directly related to this I once saw was, that using an Enum inside other object used as partitioning

Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-19 Thread Jan Lukavský
One comment concerning your proposed class loader resolution. I think it adds a bit too much magic which is hard to understand for the user. It would be better if the system would fail with a descriptive error message instead. Cheers, Till On Thu, Sep 5, 2019 at 12:55 PM Jan Lukavský wrote: Hi T

Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-05 Thread Jan Lukavský
be in parent hierarchy of FlinkUserCodeClassLoaders. Would that solve the issues you see? It works for me. Jan On 9/3/19 4:52 PM, Jan Lukavský wrote: Answers inline. On 9/3/19 4:01 PM, Till Rohrmann wrote: How so? Does your REPL add the generated classes to the system class loader? I assume

Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-03 Thread Jan Lukavský
Sep 3, 2019 at 3:47 PM Jan Lukavský wrote: On the other hand, if you say, that the contract of LocalEnvironment is to execute as if it had all classes on its class loader, then it currently breaks this contract. :-) Jan On 9/3/19 3:45 PM, Jan Lukavský wrote: Hi Till, hmm, that sounds it mi

Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-03 Thread Jan Lukavský
On the other hand, if you say, that the contract of LocalEnvironment is to execute as if it had all classes on its class loader, then it currently breaks this contract. :-) Jan On 9/3/19 3:45 PM, Jan Lukavský wrote: Hi Till, hmm, that sounds it might work. I would have to incorporate

Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-03 Thread Jan Lukavský
Loader created by BlobLibraryCacheManager is not using context classloader guaishushu1...@163.com From: guaishushu1...@163.com Date: 2019-09-03 20:23 To: dev Subject: Re: Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader guaishushu1...@163.com From: Jan Lukavsk

Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-03 Thread Jan Lukavský
t be able to address differently. Cheers, Till On Mon, Sep 2, 2019 at 11:32 AM Aljoscha Krettek wrote: I’m not saying we can’t change that code to use the context class loader. I’m just not sure whether this might break other things. Best, Aljoscha On 2. Sep 2019, at 11:24, Jan Lukavský wrote

Re: Please add me as contributor

2019-09-03 Thread Jan Lukavský
are interested in working on? Best, Dawid [1] https://flink.apache.org/contributing/contribute-code.html On 03/09/2019 10:18, Jan Lukavský wrote: Hi, I'd like to be able to assign JIRAs to myself, can I be added as contributor, please? My JIRA ID is 'janl'. Thanks,  Jan

Please add me as contributor

2019-09-03 Thread Jan Lukavský
Hi, I'd like to be able to assign JIRAs to myself, can I be added as contributor, please? My JIRA ID is 'janl'. Thanks,  Jan

Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-02 Thread Jan Lukavský
19, at 16:28, Till Rohrmann wrote: Hi Jan, this looks to me like a bug for which you could create a JIRA and PR to fix it. Just to make sure, I've pulled in Aljoscha who is the author of this change to check with him whether we are forgetting something. Cheers, Till On Fri, Aug 30, 2019 at 3:

ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-08-30 Thread Jan Lukavský
Hi, I have come across an issue with classloading in Flink's MiniCluster. The issue is that when I run local flink job from a thread, that has a non-default context classloader (for whatever reason), this classloader is not taken into account when classloading user defined functions. This is

Re: Watermarks not propagated to WebUI?

2019-08-26 Thread Jan Lukavský
actually work and >> can be >> observed using the metrics. >> >> On Wed, Aug 14, 2019 at 7:36 AM Jan Lukavský mailto:je...@seznam.cz>> wrote: >> >>> Hi, >>> >>> is it possible, that watermarks are sometimes n

Re: Watermarks not propagated to WebUI?

2019-08-15 Thread Jan Lukavský
this issue (Flink 1.5, Flink 1.8), and it appears with higher parallelism. This can be confusing to the user when watermarks actually work and can be observed using the metrics. On Wed, Aug 14, 2019 at 7:36 AM Jan Lukavský wrote: Hi, is it possible, that watermarks are sometimes not propagated

Watermarks not propagated to WebUI?

2019-08-14 Thread Jan Lukavský
Hi, is it possible, that watermarks are sometimes not propagated to WebUI, although they are internally moving as normal? I see in WebUI every operator showing "No Watermark", but outputs seem to be propagated to sink (and there are watermark sensitive operations involved - e.g. reductions

Re: Suspicious watermark of operators after restore from checkpoint

2019-08-07 Thread Jan Lukavský
ices:stepLength..." (Op. B). I am wondering if it can be that the WebUi is not consistent across different operators. For example, the watermark of Op B was simply not updated in the WebUI. I also cc Chesnay who may have a better insight about the WebUi. Cheers, Kostas On Wed, Aug 7, 2019 at

Re: Suspicious watermark of operators after restore from checkpoint

2019-08-07 Thread Jan Lukavský
provide the code from your job, at least until operator A? On Wed, Aug 7, 2019 at 3:03 PM Jan Lukavský wrote: Actually, operator A is intermediate, source is preceding it. On 8/7/19 2:44 PM, Kostas Kloudas wrote: Hi Jan, After looking at the code, my point 1) is false for *intermediate* tasks

Re: Suspicious watermark of operators after restore from checkpoint

2019-08-07 Thread Jan Lukavský
t can explain your observation. Cheers, Kostas On Wed, Aug 7, 2019 at 2:30 PM Jan Lukavský wrote: Hi Kostas, thanks for reaction, comments inline. On 8/7/19 1:59 PM, Kostas Kloudas wrote: Hi Jan, Two pointers that may help you explain the behavior are the following: 1) If you have a cu

Re: Suspicious watermark of operators after restore from checkpoint

2019-08-07 Thread Jan Lukavský
as they might point out to some bugs somewhere. :-) Or that my mental model it not aligned with reality. Jan I hope this helps, Kostas On Wed, Aug 7, 2019 at 12:11 PM Jan Lukavský wrote: Hi all, I have just come across a weird state of operators after restore from checkpoint. After the re

Suspicious watermark of operators after restore from checkpoint

2019-08-07 Thread Jan Lukavský
Hi all, I have just come across a weird state of operators after restore from checkpoint. After the restore, two operators that are connected (i.e. operator A is input of operator B) ended up with watermark of operator A being less than watermark of operator B. I don't know how to explain

Re: Sort streams in windows

2019-06-17 Thread Jan Lukavský
Hi Eugene, I'd say that what you want essentially is not "sort in windows", because (as you mention), you want to emit elements from windows as soon as watermark passes some timestamp. Maybe a better approach would be to implement this using stateful processing, where you keep a buffer of

Re: [Discuss] FLIP-43: Savepoint Connector

2019-05-31 Thread Jan Lukavský
res are added (think the recent addition of TTL) we will get support automatically. I do not believe anything else is maintainable. Seth On May 31, 2019, at 5:56 AM, Jan Lukavský wrote: Hi, this is awesome, and really useful feature. If I might ask for one thing to consider - would it

Re: [Discuss] FLIP-43: Savepoint Connector

2019-05-31 Thread Jan Lukavský
the recent addition of TTL) we will get support automatically. I do not believe anything else is maintainable. Seth On May 31, 2019, at 5:56 AM, Jan Lukavský wrote: Hi, this is awesome, and really useful feature. If I might ask for one thing to consider - would it be possible to make the

Re: [Discuss] FLIP-43: Savepoint Connector

2019-05-31 Thread Jan Lukavský
Hi, this is awesome, and really useful feature. If I might ask for one thing to consider - would it be possible to make the Savepoint manipulation API (at least writing the Savepoint) less dependent on other parts of Flink internals (e.g. |KeyedStateBootstrapFunction|) and provide something

Re: Storing large lists into state per key

2017-12-19 Thread Jan Lukavský
a Key for each new event. Best case you can do some incremental processing unless your non-combining means non-associative operations per Key. Best, Ovidiu On 12 Dec 2017, at 11:54, Jan Lukavský <je...@seznam.cz> wrote: Hi Fabian, thanks for quick reply, what you suggest seems to wo

Re: Storing large lists into state per key

2017-12-14 Thread Jan Lukavský
can do some incremental processing unless your non-combining means non-associative operations per Key. Best, Ovidiu On 12 Dec 2017, at 11:54, Jan Lukavský <je...@seznam.cz> wrote: Hi Fabian, thanks for quick reply, what you suggest seems to work at first sight, I will try it. Is the

Re: Storing large lists into state per key

2017-12-13 Thread Jan Lukavský
small rate of new events per Key, if needed to process all values of a Key for each new event. Best case you can do some incremental processing unless your non-combining means non-associative operations per Key. Best, Ovidiu On 12 Dec 2017, at 11:54, Jan Lukavský <je...@seznam.cz> wr

Re: Storing large lists into state per key

2017-12-12 Thread Jan Lukavský
pState allows to fetch and traverse the key, value, or entry set of the Map without loading it completely into memory. The sets are traversed in sort order of the key, so should be in insertion order (given that you properly increment the list index). Best, Fabian 2017-12-12 10:23 GMT+01:00 Jan Lu

Storing large lists into state per key

2017-12-12 Thread Jan Lukavský
Hi all, I have a question that appears as a user@ question, but brought me into the dev@ mailing list while I was browsing through the Flink's source codes. First I'll try to briefly describe my use case. I'm trying to do a group-by-key operation with a limited number of distinct keys (which