Re: Flink operator stuck on created
I should note - this job is being run in batch mode. Could there be a deadlock related to FLINK-16430? On Mon, 20 Sept 2021 at 11:26, Dave Maughan wrote: > Hi, > > I have a Flink job on EMR with an operator stuck on CREATED. The subtasks > are not being assigned to task manager slots. The previous operator is > running and has non-zero Bytes Sent and Records Sent. When the job started > the Job manager requested new workers to start a bunch of the operators but > it's not requesting any more so the available slots is 0 and the job just > seems to have stalled. Any pointers on what I might be doing wrong? > > I'm specifying parallelism 24 and that's how many task slots are being > requested by the job manager but also how many subtasks are being created > each operator. Should I (if so how) specify these two numbers separately? > > Thanks, > Dave >
Flink operator stuck on created
Hi, I have a Flink job on EMR with an operator stuck on CREATED. The subtasks are not being assigned to task manager slots. The previous operator is running and has non-zero Bytes Sent and Records Sent. When the job started the Job manager requested new workers to start a bunch of the operators but it's not requesting any more so the available slots is 0 and the job just seems to have stalled. Any pointers on what I might be doing wrong? I'm specifying parallelism 24 and that's how many task slots are being requested by the job manager but also how many subtasks are being created each operator. Should I (if so how) specify these two numbers separately? Thanks, Dave
Re: Data type serialization and testing
Hi Timo, Thanks for your suggestions. I did notice the chaining option. I'll give them a try. Is there an established pattern for running tests against a local cluster? Or any examples you could point me to? I did notice a FlinkContainer (testcontainers) but it appears to be in a module that is not published. Thanks, Dave On Fri, 30 Apr 2021 at 13:11, Timo Walther wrote: > Hi Dave, > > maybe it would be better to execute your tests against a local cluster > instead of the mini cluster. Also object reuse should be disabled and > chaining should be disabled to force serialization. > > Maybe others have better ideas. > > Regards, > Timo > > On 30.04.21 10:25, Dave Maughan wrote: > > Hi, > > > > I recently encountered a scenario where the data type being passed > > between operators in my streaming job was modified such that it broke > > serialization. This was due to a non-Avro top-level data type containing > > an Avro field. The existing integration test (mini cluster) continued to > > work and unit tests that attempted to cover Kryo serialization continued > > to work, but when deployed to a real cluster it failed. The problem was > > easily solved but in future I'd like to catch problems like this in my > > testing. > > > > Is there a way to force serialization always between all operators in > > the mini-cluster? Or is there another strategy I can apply to exercise > > the serialization of my data types? > > > > Thanks, > > Dave > >
Data type serialization and testing
Hi, I recently encountered a scenario where the data type being passed between operators in my streaming job was modified such that it broke serialization. This was due to a non-Avro top-level data type containing an Avro field. The existing integration test (mini cluster) continued to work and unit tests that attempted to cover Kryo serialization continued to work, but when deployed to a real cluster it failed. The problem was easily solved but in future I'd like to catch problems like this in my testing. Is there a way to force serialization always between all operators in the mini-cluster? Or is there another strategy I can apply to exercise the serialization of my data types? Thanks, Dave