[jira] [Created] (FLINK-35351) Restore from unaligned checkpoint with custom partitioner fails.

2024-05-14 Thread Dmitriy Linevich (Jira)
Dmitriy Linevich created FLINK-35351:


 Summary: Restore from unaligned checkpoint with custom partitioner 
fails.
 Key: FLINK-35351
 URL: https://issues.apache.org/jira/browse/FLINK-35351
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Reporter: Dmitriy Linevich


Restore from unaligned checkpoint from job with custom partitioner (
exactly because of using SubtaskStateMapper.FULL), with a change in parallelism 
of one vertex failed:
 
{code:java}
[db13789c52b80aad852c53a0afa26247] Task [Sink: sink (3/3)#0] WARN  Sink: sink 
(3/3)#0 (be1d158c2e77fc9ed9e3e5d9a8431dc2_0a448493b4782967b150582570326227_2_0) 
switched from RUNNING to FAILED with failure cause:
java.io.IOException: Can't get next record for channel 
InputChannelInfo{gateIdx=0, inputChannelIdx=0}
    at 
org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:106)
 ~[classes/:?]
    at 
org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
 ~[classes/:?]
    at 
org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:600)
 ~[classes/:?]
    at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
 ~[classes/:?]
    at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:930)
 ~[classes/:?]
    at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:879) 
~[classes/:?]
    at 
org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:960)
 ~[classes/:?]
    at 
org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:939) 
[classes/:?]
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:753) 
[classes/:?]
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:568) [classes/:?]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
Caused by: java.io.IOException: Corrupt stream, found tag: -1
    at 
org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.deserialize(StreamElementSerializer.java:222)
 ~[classes/:?]
    at 
org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.deserialize(StreamElementSerializer.java:44)
 ~[classes/:?]
    at 
org.apache.flink.runtime.plugable.NonReusingDeserializationDelegate.read(NonReusingDeserializationDelegate.java:53)
 ~[classes/:?]
    at 
org.apache.flink.runtime.io.network.api.serialization.NonSpanningWrapper.readInto(NonSpanningWrapper.java:337)
 ~[classes/:?]
    at 
org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.readNonSpanningRecord(SpillingAdaptiveSpanningRecordDeserializer.java:128)
 ~[classes/:?]
    at 
org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.readNextRecord(SpillingAdaptiveSpanningRecordDeserializer.java:103)
 ~[classes/:?]
    at 
org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.getNextRecord(SpillingAdaptiveSpanningRecordDeserializer.java:93)
 ~[classes/:?]
    at 
org.apache.flink.streaming.runtime.io.recovery.DemultiplexingRecordDeserializer$VirtualChannel.getNextRecord(DemultiplexingRecordDeserializer.java:79)
 ~[classes/:?]
    at 
org.apache.flink.streaming.runtime.io.recovery.DemultiplexingRecordDeserializer.getNextRecord(DemultiplexingRecordDeserializer.java:154)
 ~[classes/:?]
    at 
org.apache.flink.streaming.runtime.io.recovery.DemultiplexingRecordDeserializer.getNextRecord(DemultiplexingRecordDeserializer.java:54)
 ~[classes/:?]
    at 
org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:103)
 ~[classes/:?]
    ... 10 more {code}
 

Example: Source[2] -> Sink[3]   restore for Source[1] -> Sink[3]

This fail happens because outputs of source and inputs of sink can consists 
only 1 part of stream record during checkpoint.  After restore 2 parts of one 
record can be sent to different inputs of the sink. Because of parallelism of 
sink not changed, inputs of sink don't know about other channels, only about 
yours.

I think for fix need rescale inputs for sink, if source outputs was rescaled .

For fix need to add 
[here|[https://github.com/apache/flink/blob/4165bac27bda4457e5940a994d923242d4a271dc/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperation.java#L424]:]
{code:java}
boolean noNeedRescale = stateAssignment.executionJobVertex
.getJobVertex()
.getInputs()
.stream()
.map(JobEdge::getDownstreamSubtaskStateMapper)
.anyMatch(m -> !m.equals(SubtaskStateMapper.FULL))
&& stateAssignment.executionJobVertex
.getInputs()
.stream()
.map(IntermediateResult::getProducer)

[jira] [Created] (FLINK-35154) Javadoc aggregate fails

2024-04-18 Thread Dmitriy Linevich (Jira)
Dmitriy Linevich created FLINK-35154:


 Summary: Javadoc aggregate fails
 Key: FLINK-35154
 URL: https://issues.apache.org/jira/browse/FLINK-35154
 Project: Flink
  Issue Type: Bug
Reporter: Dmitriy Linevich


Javadoc plugin fails with error cannot find symbol. Using
{code:java}
javadoc:aggregate{code}
ERROR:

!image-2024-04-18-15-20-56-467.png!

 

Need to increase version of javadoc plugin from 2.9.1 to 2.10.4 (In current 
version exists bug same with 
[this|https://github.com/checkstyle/checkstyle/issues/291])



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33034) Incorrect StateBackendTestBase#testGetKeysAndNamespaces

2023-09-04 Thread Dmitriy Linevich (Jira)
Dmitriy Linevich created FLINK-33034:


 Summary: Incorrect StateBackendTestBase#testGetKeysAndNamespaces
 Key: FLINK-33034
 URL: https://issues.apache.org/jira/browse/FLINK-33034
 Project: Flink
  Issue Type: Bug
  Components: Runtime / State Backends
Affects Versions: 1.17.1, 1.15.0, 1.12.2
Reporter: Dmitriy Linevich
 Fix For: 1.17.1, 1.15.0
 Attachments: image-2023-09-05-12-51-28-203.png

In this test first namespace 'ns1' doesn't exist in state, because creating 
ValueState is incorrect for test. Need ti fix it to change creating ValueState 
or to change process of updating this state.

 

If to add following code for checking count of adding namespaces to state 
[here|https://github.com/apache/flink/blob/3e6a1aab0712acec3e9fcc955a28f2598f019377/flink-runtime/src/test/java/org/apache/flink/runtime/state/StateBackendTestBase.java#L501C28-L501C28]
{code:java}
assertThat(keysByNamespace.size(), is(2)); {code}
then

!image-2023-09-05-12-51-28-203.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)