zhijiangW commented on issue #8242: [FLINK-6227][network] Introduce the DataConsumptionException for downstream task failure
URL: https://github.com/apache/flink/pull/8242#issuecomment-490529494

Thanks for the further review @tillrohrmann. Let me try to explain the two concerns above.

- Whether to always transform `IOException` into `PartitionNotFoundException`?

  Actually, I would prefer to do this wrapping in the constructor of `SpilledSubpartitionView`, which might throw an `IOException` while opening the disk file. Based on the current code, this is the only step that might throw an `IOException` while creating a `ResultSubpartitionView`. But considering that `SpillableSubpartition` and its reader view will soon be replaced by Stephan's new `BoundedBlockingSubpartition`, I did the wrapping the way it is in the PR, which might also suit the new `BoundedBlockingSubpartition`. Strictly speaking, it would be better to transform the exception in the specific step instead of covering the whole process. If you have concerns about this, I could move the wrapping into the internal step even though that code will be removed soon. Or we do not focus on case `c` until `BoundedBlockingSubpartition` is merged.

- Whether to check that the file exists on the producer side and throw `PartitionNotFoundException` there?

  I think the check can only be done on the producer side while opening the file. Maybe there are other options, but I have not thought of any so far. However, the producer does not have to throw `PartitionNotFoundException` after finding that the file does not exist. Another option is that the producer throws another special exception, called `PartitionOpenException`, which would then be sent to the consumer via `ErrorResponse` over the network. The consumer would fail directly and wrap it into a `PartitionNotFoundException` for the JM. That way we could avoid checking the partition state on the consumer side when receiving `PartitionNotFoundException`. But we would need to define another new exception.
My current approach in the PR is simple, and for a blocking partition the consumer could in fact also fail directly after receiving `PartitionNotFoundException`, as I mentioned before. I could create a JIRA ticket for the connection issue later.

I have not focused on the unit tests yet, because they depend on the approach we decide to implement. Once you approve the current approach, I will fix the existing tests and add new tests to cover these cases.
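
To make the two options easier to compare, here is a minimal, self-contained sketch. It is not the code in the PR and does not use Flink's real classes: `PartitionOpenSketch`, its nested exception types, and the methods `openAndWrapDirectly`, `openOnProducer` and `translateOnConsumer` are hypothetical stand-ins, and the network transfer via `ErrorResponse` is reduced to passing a `Throwable` around.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PartitionOpenSketch {

    /** Hypothetical stand-in for the exception the consumer reports to the JM. */
    public static class PartitionNotFoundException extends IOException {
        public PartitionNotFoundException(String partitionId, Throwable cause) {
            super("Partition " + partitionId + " not found.", cause);
        }
    }

    /** Hypothetical producer-side exception from the second option. */
    public static class PartitionOpenException extends IOException {
        public PartitionOpenException(String partitionId, Throwable cause) {
            super("Cannot open the file of partition " + partitionId + ".", cause);
        }
    }

    /**
     * Option 1: wrap only the step that opens the spilled file, so that
     * unrelated IOExceptions elsewhere are not turned into "not found".
     */
    public static FileChannel openAndWrapDirectly(String partitionId, Path file)
            throws PartitionNotFoundException {
        try {
            return FileChannel.open(file, StandardOpenOption.READ);
        } catch (IOException e) {
            throw new PartitionNotFoundException(partitionId, e);
        }
    }

    /** Option 2, producer side: signal the open failure with a dedicated type. */
    public static FileChannel openOnProducer(String partitionId, Path file)
            throws PartitionOpenException {
        try {
            return FileChannel.open(file, StandardOpenOption.READ);
        } catch (IOException e) {
            throw new PartitionOpenException(partitionId, e);
        }
    }

    /**
     * Option 2, consumer side: an error received from the producer is wrapped
     * into PartitionNotFoundException only if it is a PartitionOpenException;
     * any other error is passed through unchanged.
     */
    public static Throwable translateOnConsumer(String partitionId, Throwable remoteError) {
        if (remoteError instanceof PartitionOpenException) {
            return new PartitionNotFoundException(partitionId, remoteError);
        }
        return remoteError;
    }
}
```

In this sketch both options end with a `PartitionNotFoundException` on the consumer side. The difference is only where the decision is made: in option 1 the code that opens the file decides, while in option 2 the consumer decides based on the type of the error it receives, at the cost of one more exception class.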