I spent the past week going through most of the older jiras that have a patch attached, and turned up some really good stuff that could help improve HDFS performance.
The jiras are listed in the following spreadsheet. If you are interested in reviewing any of them, please update the spreadsheet and add yourself as a reviewer. A reviewer does not need to be a Hadoop committer, but a review still helps by giving the author feedback.

https://docs.google.com/spreadsheets/d/1dvLoZ039ZirdZF9p0wWKhFCtD91jfbdkPg4XZ-AnMNg/edit?usp=sharing

I am doing this exercise to identify known performance limitations and fixes that were submitted but never committed. There are cases where a patch was reviewed, or even blessed with a +1, but was never pushed to the repo; there are also cases where good ideas never got reviewed. I think this is low-hanging fruit that we as a community should pick.

I use this filter to search for Hadoop/HDFS patches, if you are interested:
https://issues.apache.org/jira/issues/?filter=12311124&jql=project%20in%20(HADOOP%2C%20HDFS)%20AND%20status%20%3D%20%22Patch%20Available%22%20ORDER%20BY%20updated%20DESC%2C%20key%20DESC

Best,
Wei-Chiu