Shanthoosh,
Thank you for suggesting and submitting this SEP:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75957309
A couple of things I would like to point out so far:
1. Kudos on cleaning up the interface and introducing new ones
(LocalityInfo and LocalityManager). I think we also need a MetadataStorage
interface (details can be worked out later) to hide the locality storage
implementation details.
2. Instead of using the physical hostname, we should stick to the LocationId,
since some VMs may be running multiple processors on a single physical host.
3. Thank you for adding the diagrams. I think we can improve them a little
bit.
- The first diagram describes how local storage works. Please label it as
such.
- The second diagram describes the flow of JobModel generation. I am not
sure an actual picture helps here. Consider writing it as a list.
- The third diagram describes the host affinity implementation flow. This is
very helpful. I think, though, that function names alone don't make it
clear what is going on. Maybe we should add more explanation. For example:
  group(InputSSP) -> generate the list of SSPs from the list of input
  streams/partitions.
  readTaskLocalityInfo() -> read the locality mapping from the
  MetadataStorage.
Also, we should add another step there: each processor will update the
locality information based on its mapping in the current JobModel.
4. Sometimes a perfect mapping to the same locality is not possible
(especially when a processor dies and its tasks are redistributed among the
remaining processors). What should we do in this case?
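
To make points 1 to 4 a bit more concrete, here is a rough Java sketch of the flow I have in mind. Please treat it as an illustration only: MetadataStorage, InMemoryMetadataStorage, assign() and updateLocality() are hypothetical names of mine, not the SEP's interfaces or existing Samza API, and the SSP grouping step is left out (task names are passed in directly). The "least-loaded processor" fallback is just one possible answer to the question in point 4, not a proposal I am set on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: every name below is hypothetical, not the SEP's or Samza's API.
class HostAffinitySketch {

    /** Hypothetical abstraction hiding where locality is stored (point 1). */
    interface MetadataStorage {
        Map<String, String> readTaskLocality();                      // taskName -> locationId
        void writeTaskLocality(String taskName, String locationId);
    }

    /** In-memory stand-in so the sketch runs without any external store. */
    static class InMemoryMetadataStorage implements MetadataStorage {
        private final Map<String, String> locality = new LinkedHashMap<>();

        @Override
        public Map<String, String> readTaskLocality() {
            return new LinkedHashMap<>(locality);
        }

        @Override
        public void writeTaskLocality(String taskName, String locationId) {
            locality.put(taskName, locationId);
        }
    }

    /**
     * Returns taskName -> processorId. A task stays with the processor registered at its
     * previous locationId (point 2) when one is live; otherwise it goes to the least-loaded
     * processor, which is an assumed fallback policy for point 4.
     */
    static Map<String, String> assign(List<String> tasks,
                                      Map<String, String> processorLocations, // processorId -> locationId
                                      MetadataStorage storage) {
        if (processorLocations.isEmpty()) {
            throw new IllegalArgumentException("no live processors");
        }
        Map<String, String> previous = storage.readTaskLocality();
        Map<String, String> processorAt = new LinkedHashMap<>();   // locationId -> processorId
        Map<String, Integer> load = new LinkedHashMap<>();         // processorId -> task count
        for (Map.Entry<String, String> e : processorLocations.entrySet()) {
            processorAt.putIfAbsent(e.getValue(), e.getKey());
            load.put(e.getKey(), 0);
        }

        Map<String, String> assignment = new LinkedHashMap<>();
        List<String> unplaced = new ArrayList<>();
        // Pass 1: honor the stored locality wherever a live processor matches it.
        for (String task : tasks) {
            String processor = processorAt.get(previous.get(task));
            if (processor != null) {
                assignment.put(task, processor);
                load.merge(processor, 1, Integer::sum);
            } else {
                unplaced.add(task);
            }
        }
        // Pass 2: give each remaining task to the currently least-loaded processor.
        for (String task : unplaced) {
            String target = null;
            for (Map.Entry<String, Integer> e : load.entrySet()) {
                if (target == null || e.getValue() < load.get(target)) {
                    target = e.getKey();
                }
            }
            assignment.put(task, target);
            load.merge(target, 1, Integer::sum);
        }
        return assignment;
    }

    /**
     * The extra step from point 3: persist locality based on the current assignment.
     * In a real deployment each processor would write only its own entries.
     */
    static void updateLocality(Map<String, String> assignment,
                               Map<String, String> processorLocations,
                               MetadataStorage storage) {
        for (Map.Entry<String, String> e : assignment.entrySet()) {
            storage.writeTaskLocality(e.getKey(), processorLocations.get(e.getValue()));
        }
    }

    public static void main(String[] args) {
        MetadataStorage storage = new InMemoryMetadataStorage();
        storage.writeTaskLocality("task-0", "loc-A");
        storage.writeTaskLocality("task-1", "loc-C"); // loc-C has no live processor

        Map<String, String> processors = new LinkedHashMap<>();
        processors.put("p1", "loc-A");
        processors.put("p2", "loc-B");

        Map<String, String> assignment =
                assign(Arrays.asList("task-0", "task-1", "task-2"), processors, storage);
        updateLocality(assignment, processors, storage);
        System.out.println(assignment);
        System.out.println(storage.readTaskLocality());
    }
}
```

In this sketch task-0 keeps its previous location, while task-1 (whose old location is gone) and task-2 (which has no stored locality) fall through to the second pass. Whether we want that policy, or something smarter such as waiting for the old location to come back, is exactly what I would like the SEP to spell out.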
Thanks again. I will keep reading the document.