User: this is really a core dev technical treatise :) Some things to think about before we collaborate next Tuesday. I think some questions need to be really clearly understood before 1.0...
What does 1.0 mean?
- Feature complete for the majority of use cases? Ultra stable? Something that is long lasting? I'm guessing all of the above, emphasis on lasting.

Are we feature complete for the 80% case?
- I know that we need & are still developing hourly snapshots.
- How confident are we about cross-DC replication? We're currently working on master-master, which is a requirement for many of our future use cases.
- Do we have confidence about a finalized CoProcessor API? Would it be nice to have one rev of iteration on this once people use it en masse? I know we've gone over this on the public JIRA, but it makes a big difference once everyone feels like it's safe enough to touch. APIs are always tricky & iterative.
- What about HBCK -Fix? This is a requirement for us. Are there other scripts that we should write to repair a broken system? What about repair of the various ZK uses?
- When can we deprecate the old 'mapred' user API? That's confusing. (A sketch of the newer 'mapreduce' usage is at the end of this note.)
- How do we feel about the Thrift server? It seems like everyone has their own customizations here. Performant & stable multi-language support seems critical for a 1.0.

How confident are we in telling people to use HBase?
- When users come to us with questions, do we normally point them to the HBase book or some other known material?
- Do we understand how to design an optimal schema?
- Do we understand the options for server partitioning and hardware setup?
- What is the optimal way to create a table? (See the pre-split sketch at the end of this note.)
- What are the recommended config settings to look at? Why are they recommended? (See the config sketch at the end of this note.)
- What configs does a novice user look at? What configs does a power user (not a developer) look at?

+ I think Lars' HBase book has been a huge help here. Aligning our 1.0 goals & recommendations with the 2nd edition of his book should be critical.

In general, I think that announcing a 1.0 will mean that we attract more people, but also more finicky users who will be upset if they have to look at the debug logs much & won't understand why it doesn't "just work". I think that's where consulting companies will come in, help, & be happy.

I'm a little worried about the fact that there are region off-lining issues from time to time, but I don't have new-master experience. In general, I wonder if it would be better to wait until 96 (1 more FF) before announcing 1.0. I also wonder if it would be better to stabilize on an RC and label it 1.0 post-release once everything is smooth. As an example, HDFS 0.20.205 is really the best HDFS 1.0. Then again, maybe acting like HBase 0.94 will inevitably be 1.0 is a way to get various groups to focus.
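
To make the 'mapred' vs. 'mapreduce' point concrete, here's a rough sketch of what we'd point users at once the old package is deprecated. It's written against the 0.92-era org.apache.hadoop.hbase.mapreduce classes as I remember them; the table name "usertable" and the counter name are made-up placeholders, so treat it as an illustration, not a recipe.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

// Counts rows in a table using the newer org.apache.hadoop.hbase.mapreduce API
// (not the old 'mapred' package). Table name is a placeholder.
public class RowCountSketch {

  // TableMapper fixes the input types to (row key, Result); we emit nothing
  // and just bump a counter per row.
  static class CountMapper extends TableMapper<ImmutableBytesWritable, Result> {
    @Override
    protected void map(ImmutableBytesWritable rowKey, Result columns, Context context)
        throws IOException, InterruptedException {
      context.getCounter("sketch", "rows").increment(1);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "row-count-sketch");
    job.setJarByClass(RowCountSketch.class);

    Scan scan = new Scan();
    scan.setCaching(500);        // fewer RPC round trips per mapper
    scan.setCacheBlocks(false);  // a full scan shouldn't churn the block cache

    // Wires the table, scan, and mapper into the job.
    TableMapReduceUtil.initTableMapperJob(
        "usertable", scan, CountMapper.class,
        ImmutableBytesWritable.class, Result.class, job);

    job.setNumReduceTasks(0);
    job.setOutputFormatClass(NullOutputFormat.class);  // nothing to write out
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```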
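
On "what is the optimal way to create a table": the answer we usually end up giving is to pre-split at creation time so the initial write load doesn't pile onto a single region. A minimal sketch with the current client API, assuming a made-up table, one column family, and roughly evenly distributed row keys:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class CreatePreSplitTable {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);

    // Hypothetical table with one column family; keep the family name short
    // since it is stored with every KeyValue.
    HTableDescriptor desc = new HTableDescriptor("usertable");
    HColumnDescriptor family = new HColumnDescriptor("d");
    family.setMaxVersions(1);  // most schemas only need the latest cell
    desc.addFamily(family);

    // Pre-split so early writes spread across region servers instead of
    // hammering one region. These split points assume evenly spread keys.
    byte[][] splits = new byte[][] {
        Bytes.toBytes("2"), Bytes.toBytes("4"),
        Bytes.toBytes("6"), Bytes.toBytes("8")
    };
    admin.createTable(desc, splits);
    admin.close();
  }
}
```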
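
And for the novice-vs-power-user config question, a sketch of the kind of knobs we keep re-explaining. The keys are real config names, but the values are placeholders for discussion, not recommendations, and they would normally live in hbase-site.xml rather than client code:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class TuningSketch {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();

    // Novice-facing: how many concurrent RPC handlers each region server runs.
    conf.setInt("hbase.regionserver.handler.count", 30);

    // Power-user-facing: fraction of heap given to the block cache, and the
    // store file size at which a region splits.
    conf.setFloat("hfile.block.cache.size", 0.25f);
    conf.setLong("hbase.hregion.max.filesize", 4L * 1024 * 1024 * 1024);

    // Often raised on busy clusters so GC pauses don't expire the ZK session.
    conf.setInt("zookeeper.session.timeout", 60000);

    System.out.println("region server handlers = "
        + conf.getInt("hbase.regionserver.handler.count", -1));
  }
}
```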