Hi everyone,

In light of all the conversation on compatibility, I wanted to float the idea of documenting which Java packages, classes, and methods we want to declare as API-compatible in 0.94.x. I'd like your input on:

1. JavaDoc vs. using AudienceInterface
2. What the JavaDoc notation should look like
3. Which pieces of code should be tagged
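To make the first question concrete, here is a minimal sketch of what the two options might look like side by side. The `Public94Api` annotation below is a hypothetical stand-in I made up for illustration (the real AudienceInterface classes ship with hadoop); `@custom.94_api` is the JavaDoc tag suggested further down.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Hypothetical stand-in for an AudienceInterface-style annotation; if we
// bundled something like this with 0.94, tagging would not depend on which
// hadoop version is on the classpath.
@Documented
@Retention(RetentionPolicy.RUNTIME)
@interface Public94Api {}

/**
 * Option 1: a plain JavaDoc tag -- no classpath dependency at all,
 * but invisible to the compiler and to tooling.
 *
 * @custom.94_api Part of the stable 0.94 client API.
 */
@Public94Api // Option 2: a class-level annotation, checkable by tools
public class ApiTagExample {
    public static void main(String[] args) {
        // Unlike a JavaDoc tag, an annotation can be verified at runtime
        // or by an automated compatibility checker.
        System.out.println(ApiTagExample.class.isAnnotationPresent(Public94Api.class));
    }
}
```

The practical difference: the JavaDoc tag works on any hadoop version with zero dependencies, while the annotation is machine-checkable.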
*What do I mean by documenting API compatibility?*
It means we recommend that anyone building applications use a specific set of methods, because those methods will remain both binary- and RPC-compatible going forward. Any application written against them, whether running on a node of the cluster or on a remote machine, would continue to work properly, without recompilation, against every 0.94.x version running on the cluster.

*Benefits:*
It would prevent developers from using calls that are subject to change. This would give developers more confidence in the platform, which will encourage more development on it. 0.94 will still be with us for some time, and I think the better-late-than-never approach will save us pain down the road. Finally, it would let us more easily verify that we are in fact API-compatible.

*Can we use AudienceInterface?*
HBase 0.94 can be compiled against hadoop 0.2x, 1.x, and 2.0.x. In the 0.2x case, the AudienceInterface classes were not bundled, so we cannot expect HBase 0.94 to support them. For that reason, I think JavaDoc might be better. On the other hand, perhaps we should just bundle AudienceInterface with 0.94 going forward? Then we could have consistent annotations in 0.94, 0.95, and 0.96 without worrying about the hadoop version. Please correct me if I'm wrong about any of the above.

*Clarification of RPC compatibility:*
We care about RPC compatibility when we create clients that bundle their dependency jars with them. Those jars are used to form requests that are executed on a remote machine (i.e. the cluster). If the cluster is upgraded and no longer recognizes the request, RPC compatibility is broken.

*Clarification of binary compatibility:*
We care about binary compatibility when a client is created and compiled, and the jars on which it depends change. It should still be able to form requests using those jars.
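A minimal sketch of what a binary break looks like at the JVM level (the two classes below are hypothetical stand-ins for a dependency before and after an upgrade, not real HBase code). Compiled client bytecode references a method by its full descriptor, return type included, so changing only the return type yields a different method as far as the linker is concerned:

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for the old shape of a dependency jar.
class OldJarVersion {
    public void setMaxVersions(int maxVersions) {} // descriptor: (I)V
}

// Hypothetical stand-in for the new shape after an upgrade.
class NewJarVersion {
    // Source-compatible change: callers can still write setMaxVersions(3),
    // and a recompile succeeds. But the JVM descriptor changed from (I)V to
    // (I)LNewJarVersion; -- a different method at link time, so client
    // bytecode compiled against OldJarVersion fails with NoSuchMethodError.
    public NewJarVersion setMaxVersions(int maxVersions) { return this; }
}

public class BinaryCompatSketch {
    public static void main(String[] args) throws Exception {
        Method before = OldJarVersion.class.getMethod("setMaxVersions", int.class);
        Method after = NewJarVersion.class.getMethod("setMaxVersions", int.class);
        // Same name, same parameters -- only the return type differs.
        System.out.println("old return type: " + before.getReturnType());
        System.out.println("new return type: " + after.getReturnType());
    }
}
```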
If the cluster is upgraded and the compiled client code cannot find a method it was depending on, we break binary compatibility. A recent example is 0.94.2, where the return type of HColumnDescriptor.setMaxVersions was changed and those who upgraded received this error:

java.lang.NoSuchMethodError: org.apache.hadoop.hbase.HColumnDescriptor.setMaxVersions(I)V

*What we currently have:*
We have an @audience annotation set up in 0.95/0.96. In 0.94, I suggest either adding JavaDoc or pulling in the AudienceInterface annotation.

*Suggested JavaDoc language:*
@custom.94_api

*Granularity:*
Class level only. The native Java access level (e.g. public, protected) should indicate what must be kept compatible.

*Suggested classes:*
Here is a first cut of things that should be declared and documented as public APIs. This list was obtained by looking at some MapReduce-over-HBase example code.

*Java API:*
org.apache.hadoop.hbase (selected classes, see below)
org.apache.hadoop.hbase.client.*
org.apache.hadoop.hbase.filter.*
org.apache.hadoop.hbase.io.hfile.Compression.Algorithm
org.apache.hadoop.hbase.util.*
org.apache.hadoop.hbase.mapreduce.*

*REST API:*
org.apache.hadoop.hbase.rest.client.*

*Thrift API:*
All methods defined in /hbase/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift

*Selected classes in org.apache.hadoop.hbase:*
org.apache.hadoop.hbase.ClusterStatus
org.apache.hadoop.hbase.HBaseConfiguration
org.apache.hadoop.hbase.HColumnDescriptor
org.apache.hadoop.hbase.HRegionInfo
org.apache.hadoop.hbase.HRegionLocation
org.apache.hadoop.hbase.HServerAddress
org.apache.hadoop.hbase.HTableDescriptor
org.apache.hadoop.hbase.KeyValue

-- 
Best Regards,
Aleks Shulman
847.814.5804
Cloudera