zhijiangW commented on issue #8242: [FLINK-6227][network] Introduce the 
DataConsumptionException for downstream task failure
URL: https://github.com/apache/flink/pull/8242#issuecomment-490529494
 
 
   Thanks for the further review, @tillrohrmann. Let me try to address the two 
concerns above.
   
   - Whether to always transform `IOException` into `PartitionNotFoundException`? 
Actually, I would prefer to do this wrapping in the constructor of 
`SpilledSubpartitionView`, which might throw an `IOException` while opening the 
disk file. Based on the current code, this is the only step that might cause an 
`IOException` while creating a `ResultSubpartitionView`. However, since 
`SpillableSubpartition` and its reader view will soon be replaced by Stephan's 
new `BoundedBlockingSubpartition`, I did the wrapping the way it is in the PR, 
which might also suit the new `BoundedBlockingSubpartition`. Strictly speaking, 
it would be better to do the transformation in the specific step instead of 
around the whole process. If you have concerns about this, I could move it into 
the internal step even though that code will be removed soon. Alternatively, we 
could leave case `c` aside until `BoundedBlockingSubpartition` is merged.
   
   - Whether to check that the file exists on the producer side and throw 
`PartitionNotFoundException` there? I think the check can only be done on the 
producer side, when opening the file. There may be other options, but I have 
not thought of any yet. However, the producer does not have to throw 
`PartitionNotFoundException` itself after finding that the file does not exist. 
Another option is for the producer to throw a separate, special exception, 
e.g. `PartitionOpenException`, which would be sent to the consumer via 
`ErrorResponse` over the network. The consumer would then fail directly and 
wrap it into a `PartitionNotFoundException` for the JM. That way we could avoid 
the partition state check that the consumer performs when it receives a 
`PartitionNotFoundException`, at the cost of defining a new exception. My 
current approach in the PR is simpler, and for a blocking partition the 
consumer could actually also fail directly after receiving 
`PartitionNotFoundException`, as I mentioned before.
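
   The two options in the list above could be sketched roughly as follows. This 
is only an illustration under stated assumptions, not the code in the PR: both 
exception classes are simplified stand-ins (Flink's real 
`PartitionNotFoundException` has a different constructor), 
`PartitionOpenException` is just the name proposed above, and 
`PartitionOpenSketch`, `openSpillFile`, `openOnProducer` and `onErrorResponse` 
are hypothetical helpers.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Simplified stand-in for illustration only, NOT Flink's real class.
class PartitionNotFoundException extends IOException {
    PartitionNotFoundException(String partitionId, Throwable cause) {
        super("Partition " + partitionId + " not found.", cause);
    }
}

// Hypothetical exception; only the name comes from the discussion.
class PartitionOpenException extends IOException {
    PartitionOpenException(String partitionId, Throwable cause) {
        super("Could not open partition " + partitionId + ".", cause);
    }
}

class PartitionOpenSketch {

    // Option 1: translate the IOException only at the step that opens the
    // spill file, instead of wrapping the whole view creation.
    static FileChannel openSpillFile(String partitionId, Path spillFile)
            throws PartitionNotFoundException {
        try {
            return FileChannel.open(spillFile, StandardOpenOption.READ);
        } catch (IOException e) {
            throw new PartitionNotFoundException(partitionId, e);
        }
    }

    // Option 2, producer side: signal the failed open with a dedicated
    // exception that would travel to the consumer in an error response.
    static FileChannel openOnProducer(String partitionId, Path spillFile)
            throws PartitionOpenException {
        try {
            return FileChannel.open(spillFile, StandardOpenOption.READ);
        } catch (IOException e) {
            throw new PartitionOpenException(partitionId, e);
        }
    }

    // Option 2, consumer side: fail directly and wrap the remote error for
    // the JM, without checking the partition state first.
    static IOException onErrorResponse(String partitionId, Throwable remoteError) {
        if (remoteError instanceof PartitionOpenException) {
            return new PartitionNotFoundException(partitionId, remoteError);
        }
        return new IOException("Unexpected remote error.", remoteError);
    }

    public static void main(String[] args) {
        // A file name that should not exist in the working directory.
        Path missing = Paths.get("missing-" + System.nanoTime() + ".spill");

        try {
            openSpillFile("p-1", missing).close();
        } catch (IOException e) {
            System.out.println("Option 1: " + e.getClass().getSimpleName());
        }

        try {
            openOnProducer("p-1", missing).close();
        } catch (PartitionOpenException e) {
            IOException forJm = onErrorResponse("p-1", e);
            System.out.println("Option 2: " + forJm.getClass().getSimpleName());
        } catch (IOException e) {
            System.out.println("Unexpected: " + e);
        }
    }
}
```

   In both variants the JM ends up seeing a `PartitionNotFoundException`; the 
difference is only where the translation happens and whether the consumer needs 
the extra partition state check.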
   
   I could create a JIRA ticket for the connection issue later.
   
   I have not focused on the unit tests yet, because they depend on the 
approach we decide to implement. Once you approve the current approach, I will 
fix the existing tests and add new tests to cover these cases.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
