Re: Flink operator stuck on created

2021-09-20 Thread Dave Maughan
I should note - this job is being run in batch mode. Could there be a
deadlock related to FLINK-16430?

On Mon, 20 Sept 2021 at 11:26, Dave Maughan  wrote:

> Hi,
>
> I have a Flink job on EMR with an operator stuck on CREATED. The subtasks
> are not being assigned to task manager slots. The previous operator is
> running and has non-zero Bytes Sent and Records Sent. When the job started
> the Job manager requested new workers to start a bunch of the operators but
> it's not requesting any more so the available slots is 0 and the job just
> seems to have stalled. Any pointers on what I might be doing wrong?
>
> I'm specifying parallelism 24 and that's how many task slots are being
> requested by the job manager but also how many subtasks are being created
> each operator. Should I (if so how) specify these two numbers separately?
>
> Thanks,
> Dave
>


Flink operator stuck on created

2021-09-20 Thread Dave Maughan
Hi,

I have a Flink job on EMR with an operator stuck on CREATED. The subtasks
are not being assigned to task manager slots. The previous operator is
running and has non-zero Bytes Sent and Records Sent. When the job started
the Job manager requested new workers to start a bunch of the operators but
it's not requesting any more so the available slots is 0 and the job just
seems to have stalled. Any pointers on what I might be doing wrong?

I'm specifying parallelism 24 and that's how many task slots are being
requested by the job manager but also how many subtasks are being created
each operator. Should I (if so how) specify these two numbers separately?

Thanks,
Dave


Re: Data type serialization and testing

2021-04-30 Thread Dave Maughan
Hi Timo,

Thanks for your suggestions. I did notice the chaining option. I'll give
them a try.

Is there an established pattern for running tests against a local cluster?
Or any examples you could point me to? I did notice a FlinkContainer
(testcontainers) but it appears to be in a module that is not published.

Thanks,
Dave

On Fri, 30 Apr 2021 at 13:11, Timo Walther  wrote:

> Hi Dave,
>
> maybe it would be better to execute your tests against a local cluster
> instead of the mini cluster. Also object reuse should be disabled and
> chaining should be disabled to force serialization.
>
> Maybe others have better ideas.
>
> Regards,
> Timo
>
> On 30.04.21 10:25, Dave Maughan wrote:
> > Hi,
> >
> > I recently encountered a scenario where the data type being passed
> > between operators in my streaming job was modified such that it broke
> > serialization. This was due to a non-Avro top-level data type containing
> > an Avro field. The existing integration test (mini cluster) continued to
> > work and unit tests that attempted to cover Kryo serialization continued
> > to work, but when deployed to a real cluster it failed. The problem was
> > easily solved but in future I'd like to catch problems like this in my
> > testing.
> >
> > Is there a way to force serialization always between all operators in
> > the mini-cluster? Or is there another strategy I can apply to exercise
> > the serialization of my data types?
> >
> > Thanks,
> > Dave
>
>


Data type serialization and testing

2021-04-30 Thread Dave Maughan
Hi,

I recently encountered a scenario where the data type being passed between
operators in my streaming job was modified such that it broke
serialization. This was due to a non-Avro top-level data type containing an
Avro field. The existing integration test (mini cluster) continued to work
and unit tests that attempted to cover Kryo serialization continued to
work, but when deployed to a real cluster it failed. The problem was easily
solved but in future I'd like to catch problems like this in my testing.

Is there a way to force serialization always between all operators in the
mini-cluster? Or is there another strategy I can apply to exercise the
serialization of my data types?

Thanks,
Dave