from:"Bob Tiernay"

Missing support for `TestStreamEnvironment#executeAsync`

2021-03-08 Thread Bob Tiernay

Hi all,

I have been trying to test a Flink 1.11 streaming job using the
`DataStreamUtils#collect` utility against a `MiniCluster` based test.
However, I noticed an issue when doing so.

`TestStreamEnvironment` does not implement `executeAsync`. Thus
when `DataStreamUtils#collect` is called, it invokes
`env.executeAsync("Data Stream Collect");` which will instead use
`StreamExecutionEnvironment#executeAsync`'s implementation. This is
problematic since it will create a brand new `MiniCluster` when the
following lines are hit:

CompletableFuture jobClientFuture = executorFactory
   .getExecutor(configuration)
   .execute(streamGraph, configuration);


Any configurations that were applied during the test won't be respected. Is
this expected behavior?

Thanks in advance,

Bob

Re: Missing support for `TestStreamEnvironment#executeAsync`

2021-03-09 Thread Bob Tiernay

Great, thank you so much!

On Tue, Mar 9, 2021 at 1:08 PM Till Rohrmann  wrote:

> *This message originated outside your organization.*
>
> --
>
> Hi Bob,
>
> Thanks for reporting this issue. I believe that this has been an
> oversight. I have filed a JIRA issue for fixing this problem [1].
>
> [1] https://issues.apache.org/jira/browse/FLINK-21693
> <https://urldefense.com/v3/__https://issues.apache.org/jira/browse/FLINK-21693__;!!PwKahg!vmZiCgmovUl2NR7HnLKGIDXkpYpfIEt2D1cL9RBAEByJiPoqwjDPC2CThEDnTyzx$>
>
> Cheers,
> Till
>
> On Mon, Mar 8, 2021 at 4:15 PM Bob Tiernay  wrote:
>
>> Hi all,
>>
>> I have been trying to test a Flink 1.11 streaming job using the
>> `DataStreamUtils#collect` utility against a `MiniCluster` based test.
>> However, I noticed an issue when doing so.
>>
>> `TestStreamEnvironment` does not implement `executeAsync`. Thus
>> when `DataStreamUtils#collect` is called, it invokes
>> `env.executeAsync("Data Stream Collect");` which will instead use
>> `StreamExecutionEnvironment#executeAsync`'s implementation. This is
>> problematic since it will create a brand new `MiniCluster` when the
>> following lines are hit:
>>
>> CompletableFuture jobClientFuture = executorFactory
>>.getExecutor(configuration)
>>.execute(streamGraph, configuration);
>>
>>
>> Any configurations that were applied during the test won't be respected.
>> Is this expected behavior?
>>
>> Thanks in advance,
>>
>> Bob
>>
>

Re: entrypoint for executing job in task manager

2021-03-10 Thread Bob Tiernay

Hi Steven, 

Curious how you solved this for you use case. Did you ever find a
satisfactory approach?

Thanks in advance,

Bob



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: User metrics outside tasks

2021-03-11 Thread Bob Tiernay

I too think this would be a useful capability for the job manager to be able
to send metrics easily. Sometimes additional compute responsibilities are
placed in the job manager and having a convenient way to add telemetry data
into a metrics stream would be very useful.



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: entrypoint for executing job in task manager

2021-04-07 Thread Bob Tiernay

Please see  FLINK-14184   
which should fully support such use cases in the future. Feel free to vote
for it if you believe it would help your use case.



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: How to verify what maxParallelism is set to?

2021-04-29 Thread Bob Tiernay

I agree that a way to introspect the effective current value would be a great
observability tool for sanity checking. 

Fabian, do you know if a ticket was ever created?



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Choice of time characteristic and performance

2021-05-14 Thread Bob Tiernay

I was wondering if the choice of time characteristic (ingestion, processing
or event time) makes a difference to the performance of a job that isn't
using windowing or process functions. For example, in such a job is it
advisable to disable auto wartermarking and use the default? Or is this in
combination an explicit choice of one characteristic more optimal?

More generally it would be good to know how this choice effects a job. 

Anyone have any details or empirical evidence about this?



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Choice of time characteristic and performance

2021-05-25 Thread Bob Tiernay

Thanks for you guidance Robert!

Do you think disabling watermarking would help in terms of a slight
performance boost in such scenarios? 

Bob



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Missing support for `TestStreamEnvironment#executeAsync`

Re: Missing support for `TestStreamEnvironment#executeAsync`

Re: entrypoint for executing job in task manager

Re: User metrics outside tasks

Re: entrypoint for executing job in task manager

Re: How to verify what maxParallelism is set to?

Choice of time characteristic and performance

Re: Choice of time characteristic and performance

8 matches

Site Navigation

Mail list logo

Footer information