Hi All, I am seeing some classes with identical names, albeit in different packages, which could confuse new contributors. One of the attractions of Spark is that its code structure is simpler to follow than Hadoop's (or Hive's, for that matter).
For example, we have IntermediateResultPartition in both the partition and executiongraph packages, both under the runtime parent package. To make matters worse, some of these duplicate classes have no Javadoc or comment explaining why the class exists and how it relates to other existing code, so one has to trace the code to figure out where each class is used and how it impacts or differs from the others. I would like to propose a "no duplicate class names if possible" rule (and I know it is possible) in the how-to-contribute code guide. - Henry
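P.S. To make the confusion concrete, here is a minimal sketch of the problem. The nested classes below only simulate two packages in one file; all names here are hypothetical and chosen just to mirror the collision, not taken from the actual codebase. Once two classes share a simple name, call sites have to carry the qualifier everywhere, and a reader cannot tell from the name alone which one is meant:

```java
public class DuplicateNameDemo {

    // Simulates something like runtime.partition.IntermediateResultPartition.
    static class PartitionNS {
        static class IntermediateResultPartition {
            String describe() { return "partition-side view"; }
        }
    }

    // Simulates something like runtime.executiongraph.IntermediateResultPartition.
    static class ExecutionGraphNS {
        static class IntermediateResultPartition {
            String describe() { return "execution-graph-side view"; }
        }
    }

    public static void main(String[] args) {
        // The simple name "IntermediateResultPartition" is ambiguous,
        // so every use site must spell out the enclosing namespace.
        System.out.println(new PartitionNS.IntermediateResultPartition().describe());
        System.out.println(new ExecutionGraphNS.IntermediateResultPartition().describe());
    }
}
```

In real packages this is even worse: Java allows only one of the two to be imported, so the other must be fully qualified at every use, which is exactly the kind of friction a "no duplicate class names" guideline would avoid.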