Below are rough minutes from the developer meetup held last Thursday week at Hortonworks: http://www.meetup.com/hackathon/events/144366512/

The agenda was fast moving and the notes I kept were sparse (pardon me). Hopefully the below at least conveys some flavor of what transpired. Below is the proposed agenda, with discussion filled in between the items.

- Git/Gerrit <https://docs.google.com/document/d/1pEFqKKxnJWLyOc_jsqmpWcBHl-kC98a969JJ9tVVZEE/edit#heading=h.bi2n1jnf8wr9>
- Lieutenants <https://docs.google.com/document/d/1pEFqKKxnJWLyOc_jsqmpWcBHl-kC98a969JJ9tVVZEE/edit#heading=h.ts3zsiollp5s>
- MTTR <https://docs.google.com/document/d/1pEFqKKxnJWLyOc_jsqmpWcBHl-kC98a969JJ9tVVZEE/edit#heading=h.ngm4d0ee18u2>
- Distributed log replay <https://docs.google.com/document/d/1pEFqKKxnJWLyOc_jsqmpWcBHl-kC98a969JJ9tVVZEE/edit#heading=h.5nxjf2azyr63>
- Region online for writes <https://docs.google.com/document/d/1pEFqKKxnJWLyOc_jsqmpWcBHl-kC98a969JJ9tVVZEE/edit#heading=h.7h4u62q0yu1t>
- Client <https://docs.google.com/document/d/1pEFqKKxnJWLyOc_jsqmpWcBHl-kC98a969JJ9tVVZEE/edit#heading=h.dxyzoaydy7up>
- 0.98 <https://docs.google.com/document/d/1pEFqKKxnJWLyOc_jsqmpWcBHl-kC98a969JJ9tVVZEE/edit#heading=h.jny6o6f3j3x8>
- 1.0 <https://docs.google.com/document/d/1pEFqKKxnJWLyOc_jsqmpWcBHl-kC98a969JJ9tVVZEE/edit#heading=h.4sjuoukolha1>
- Release passes IT tests for time period before release? <https://docs.google.com/document/d/1pEFqKKxnJWLyOc_jsqmpWcBHl-kC98a969JJ9tVVZEE/edit#heading=h.ijcofohq0rk1>
- Sequenceid <https://docs.google.com/document/d/1pEFqKKxnJWLyOc_jsqmpWcBHl-kC98a969JJ9tVVZEE/edit#heading=h.14eyjkb3glf1>
- MVCC <https://docs.google.com/document/d/1pEFqKKxnJWLyOc_jsqmpWcBHl-kC98a969JJ9tVVZEE/edit#heading=h.hbv260gur4ui>
- Multiwal <https://docs.google.com/document/d/1pEFqKKxnJWLyOc_jsqmpWcBHl-kC98a969JJ9tVVZEE/edit#heading=h.tab1edfjmpgq>
- Speculative Read <https://docs.google.com/document/d/1pEFqKKxnJWLyOc_jsqmpWcBHl-kC98a969JJ9tVVZEE/edit#heading=h.3xr7mrnpmpby>
- Favored node finish up <https://docs.google.com/document/d/1pEFqKKxnJWLyOc_jsqmpWcBHl-kC98a969JJ9tVVZEE/edit#heading=h.2s9y1srcix0z>
- Compactions <https://docs.google.com/document/d/1pEFqKKxnJWLyOc_jsqmpWcBHl-kC98a969JJ9tVVZEE/edit#heading=h.23pauxcj0ux8>
- Stripe? <https://docs.google.com/document/d/1pEFqKKxnJWLyOc_jsqmpWcBHl-kC98a969JJ9tVVZEE/edit#heading=h.xsncn8yoed07>
- Master Redo <https://docs.google.com/document/d/1pEFqKKxnJWLyOc_jsqmpWcBHl-kC98a969JJ9tVVZEE/edit#heading=h.ngm4d0ee18u2>
- Secondary indices from phoenix <https://docs.google.com/document/d/1pEFqKKxnJWLyOc_jsqmpWcBHl-kC98a969JJ9tVVZEE/edit#heading=h.mk56oao2p3vj>

Git/Gerrit

Move to this? Or do Git without Gerrit?

All were +1 on going to Git. An Accumulo fellow who was present volunteered to pass us the script they used to make the transition. DISCUSSION already started on the dev list.

Lieutenants/Component Owners

Review. How is this going? What can we do to improve?

Not enough reviews going on. The list of Component Owners needs an edit. It is stale. We need to revive/renew/refresh this initiative.

More Friction Committing

Not enough reviews of stuff going in and this is a db after all?

+1 (eclark)
+1 from Dave

Generally agreed. Pointed at the Lieutenants/Component Owners role. It was thought there should be some sort of automated performance test as part of hadoopqa. A benchmark proving no degradation of a claimed improvement is required. Better would be adding a set of general microbenchmarks, then running them on your machine before and after the change and pasting the findings.

Compat testing

Now that 0.96.0 is out the door, we should be careful about changes that are compat-breaking. Any ideas on how to best avoid/detect such changes? Also maybe have a discussion about how wire compat/protobufs changes how we do things, how to best utilize it, pitfalls to avoid, what it can’t do? Formalize compat matrix? Go over the matrix we had 1 year ago. Do we want to support all combinations, or only between two major releases with some rolling upgrade model?
How to avoid the current situation where the client does not scale -- found at the last minute running on an 80 node cluster! Could a test framework do this? But it would have to be in a non-Apache context because there are no resources there to do it. There are the jdiff and compare. We have to do rolling upgrade tests. Deprecations in .protos too. Remove after a release. Just go version numbers. Write up the matrix. How to avoid issues like the recent client not scaling. Need perf tests. Need microbenchmarks. Modular so we can pull out and simulate rpc. Tests like that. Stand up a simple server done in proto.

MTTR

Distributed log replay

Relax semantics; is it ok to allow out-of-order edits? Or fix? Distributed log replay is going to be done in 0.98 using tags.

Region online for writes

Let's just do this.

Client

- Short term perf/scalability regression (0.96) time frame
- Long term (0.98/1.x/2.x) time frame.
- Asynchbase on 0.96 - almost there.

Should we remove support for setting timestamp? Lars says set it in HTableDescriptor. Sequence number and ts. Sequenceid: perhaps add a new coordinate. Out-of-order deletes. Ts w/ the value. Make it configurable and do optimization.

*Aditya brought up new c-client effort.*

0.98

- tags.
- reverse scan
- issue with cells? (intel guys at mtg?)

Where is the branch?

1.0

is 0.98 == 1.0?

- when to drop hadoop1 support? next year?

0.98.x becomes 1.0. Security not as a CP but in the codepath. Do the check in the code. Make security first class. But security was always good for dev'ing CP. Make it a required CP. Integrated w/ simple testing. Enable Authorization by default. Permissive mode needs to be added. Small perf price. We need better defaults -- especially for 1.0. Revisit the docs, the refguide. An edit. Questions, the docbook.... Formatting is a pain. Docathon? ...but no one would show up. Drop hadoop1 post 1.0.

New release adoption

- What can we do for 0.96 adoption
- People are still using 0.90

Within versions, update... You are in the middle of a rolling upgrade.
Can we make it easier for folks to update? Advertising that upgrade to 0.96 is successful. Rolling upgrade needs to be in the master, not out in a script.

Release passes IT tests for time period before release?

48 hours. Real honest, regionserver, datanodes, zookeepers all being killed. Committers releasing it for a time. Failing a 48 hour test is enough to sink an RC. We have a bar and we should keep raising it. Elliott will put up the bar for a test release.

Sequenceid

MVCC

Unify sequenceid and mvcc. Do we need locks at all?

Multiwal

Speculative Read

Raise awareness with hdfs brothers and sisters.

Favored node finish up

Add consideration to the balancer. Need to fix balancer. Not AM job. It's in AM and in Balancer.

Compactions

Stripe?

Get some empirical data to see if different types of compactions are better.

Master Redo

Jimmy says evolve instead of rewrite.

Secondary indices from phoenix?
