All,
I've been crawling around the HBase codebase for a little while now,
and I have a proposal that I think would make it easier to find your
way around it.
I think that we should make three new packages below
org.apache.hadoop.hbase: client, master, and regionserver. The client
package would contain HTable, client-side scanning stuff, HBaseAdmin,
the MapReduce-related stuff, the shell, REST and Thrift. The master
package would contain HMaster, maybe Leases, and any other classes that
belong to the master. The regionserver package would contain
HRegionServer, HRegion, HStore and all its subclasses (HStoreFile,
etc). Whatever is left over should be stuff that's pretty common to
all the sub-packages, so we can either leave that in the hbase
package, or push it down into a common subpackage.
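To make the split concrete, here is a rough sketch of where the classes named above would land. The ProposedLayout class and its packageFor method are made up purely for illustration and are not anything in the codebase. Leases is left out because its placement is still a "maybe", and anything not listed is assumed to stay in the root hbase package for now.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the proposed package split; not part of HBase.
public class ProposedLayout {
    static final String ROOT = "org.apache.hadoop.hbase";

    // Class name -> proposed subpackage, taken from the proposal text.
    static final Map<String, String> SUBPACKAGE = new LinkedHashMap<>();
    static {
        SUBPACKAGE.put("HTable", "client");
        SUBPACKAGE.put("HBaseAdmin", "client");
        SUBPACKAGE.put("HMaster", "master");
        SUBPACKAGE.put("HRegionServer", "regionserver");
        SUBPACKAGE.put("HRegion", "regionserver");
        SUBPACKAGE.put("HStore", "regionserver");
        SUBPACKAGE.put("HStoreFile", "regionserver");
    }

    // Classes without an entry are treated as common and stay in the root package.
    static String packageFor(String className) {
        String sub = SUBPACKAGE.get(className);
        return sub == null ? ROOT : ROOT + "." + sub;
    }

    public static void main(String[] args) {
        // Print the fully qualified name each listed class would have.
        for (String name : SUBPACKAGE.keySet()) {
            System.out.println(packageFor(name) + "." + name);
        }
    }
}
```

If we went with a common subpackage instead, the only change in this sketch would be the fallback in packageFor.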
This would make it much easier for new contributors to decide where
to look for stuff, as well as make it more obvious what the
architectural divisions of the system are. To boot, it would allow us
to reorganize our tests into similar subpackages, which has the
advantage of allowing us to think about, for instance, client tests
passing/failing as a group, rather than scattered alphabetically
throughout the entire suite.
This idea would probably supersede HADOOP-2518, or at least change its
goal to factoring HStore down into o.a.h.h.regionserver.store.
Comments?
-Bryan