Any comments? Is this simply not a concern?

-- Lars

From: lars hofhansl <la...@apache.org>
To: Dev <dev@phoenix.apache.org>
Sent: Wednesday, September 13, 2017 10:22 AM
Subject: Fw: Phoenix code quality

Hi all Phoenix developers,

here's a thread that I had started on the private PMC list, and we agreed to have this as a public discussion.
I'd like to solicit feedback on the 6 steps/recommendations below and on how we can ingrain them into the development process.

Comments and concerns are - as always - welcome!

-- Lars

----- Forwarded Message -----
From: lars hofhansl <la...@apache.org>
To: Private <priv...@phoenix.apache.org>
Sent: Tuesday, September 5, 2017 9:59 PM
Subject: Phoenix code quality

Hi all,

I realize this might be a difficult topic, and let me prefix this by saying that this is my opinion only.

Phoenix is coming to a point where big organizations are relying on it. At Salesforce we do billions of Phoenix queries per day... And we had a bunch of recent production issues - only in part caused by Phoenix.

If there is a patch here and there that lacks quality, tests, comments, or proper documentation, then it's the fault of the person who created the patch. If, however, this happens with some frequency, then it is a problem that should involve the PMC and the committers who review and commit the patches in question.

I'd like to suggest the following:

1. Comments in the code should be considered when judging a patch for its merit. No need to go overboard, but there should be enough comments so that someone new to the code can get an idea of what it is doing.

2. Eyeball each patch for how it would scale. Will it all work on 1000 machines? With 1bn rows? With 1000 indexes? Etc. If it's not obvious, ask the creator of the patch. Agree on what the scaling goals should be. (For anything that works only for a few million rows or on a dozen machines, nobody in their right mind would accept the complexity of running Phoenix - and HBase, HDFS, ZK, etc. - folks would and should simply use Postgres.)

3. Check how a patch will behave under failure. Machine failures are common. Regions may not be reachable for a bit, etc. Are there good timeouts? Everything should gracefully continue to work. (See the first sketch below.)

4. Double-check that tests cover corner conditions. (See the test sketch below.)

5. Err on the side of stability, rather than committing a patch as beta. If it's in the code, people _will_ use it.

6. Are all config options properly explained, and do they make sense? It's better to err on the side of fewer config options. (See the config sketch below.)

7. Probably more stuff...

Again, just MHO. Many of these things are already done. But I still thought it might be good to have a quick discussion around this.

Comments?

Thanks.

-- Lars
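To make point 3 a bit more concrete, here is a minimal sketch of the kind of bounded, retrying query path a reviewer might look for. It uses only the standard JDBC API against a Phoenix connection URL; the timeout, retry limits, and table name are hypothetical values for illustration, not Phoenix defaults.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class BoundedQueryExample {

    // Hypothetical limits for illustration only; real values belong in reviewed config.
    private static final int QUERY_TIMEOUT_SECONDS = 30;
    private static final int MAX_ATTEMPTS = 3;
    private static final long RETRY_BACKOFF_MS = 1000L;

    public static long countRows(String jdbcUrl, String tableName)
            throws SQLException, InterruptedException {
        SQLException lastFailure = null;
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            try (Connection conn = DriverManager.getConnection(jdbcUrl);
                 Statement stmt = conn.createStatement()) {
                // Bound the query so a slow or unreachable region does not hang the caller forever.
                stmt.setQueryTimeout(QUERY_TIMEOUT_SECONDS);
                try (ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM " + tableName)) {
                    rs.next();
                    return rs.getLong(1);
                }
            } catch (SQLException e) {
                lastFailure = e;
                // Back off and retry; transient failures (region moves, brief outages)
                // should not surface to the caller on the first hiccup.
                Thread.sleep(RETRY_BACKOFF_MS * attempt);
            }
        }
        throw lastFailure;
    }
}

A caller would invoke something like countRows("jdbc:phoenix:localhost", "MY_TABLE"); the point is that the failure mode is a bounded wait plus a clear exception, never an indefinite hang.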
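For point 4, a sketch of what corner-condition tests can look like with plain JUnit and JDBC: empty tables and NULL column values are the kind of boundaries that tend to be missed. The connection URL and table names here are assumptions for illustration; in the real test suite the connection would come from the shared mini-cluster setup rather than a hard-coded URL.

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

import org.junit.Test;

public class CornerCaseQueryTest {

    // Hypothetical connection string for illustration only.
    private static final String URL = "jdbc:phoenix:localhost";

    @Test
    public void aggregateOverEmptyTableReturnsZero() throws Exception {
        try (Connection conn = DriverManager.getConnection(URL);
             Statement stmt = conn.createStatement()) {
            stmt.execute("CREATE TABLE IF NOT EXISTS EMPTY_T (ID BIGINT PRIMARY KEY, V VARCHAR)");
            try (ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM EMPTY_T")) {
                // An empty table should yield a single row with 0, not NULL and not an error.
                assertTrue(rs.next());
                assertEquals(0L, rs.getLong(1));
                assertFalse(rs.next());
            }
        }
    }

    @Test
    public void nullColumnValuesAreHandledInFilters() throws Exception {
        try (Connection conn = DriverManager.getConnection(URL);
             Statement stmt = conn.createStatement()) {
            stmt.execute("CREATE TABLE IF NOT EXISTS NULL_T (ID BIGINT PRIMARY KEY, V VARCHAR)");
            stmt.execute("UPSERT INTO NULL_T (ID, V) VALUES (1, NULL)");
            conn.commit();
            // A row whose column is NULL must not match an equality filter, and must not blow up.
            try (ResultSet rs = stmt.executeQuery("SELECT ID FROM NULL_T WHERE V = 'x'")) {
                assertFalse(rs.next());
            }
        }
    }
}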
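And for point 6, one way to keep config options explained and easy to review is to declare each key next to its documented default and read it through a single accessor. The property name and class below are hypothetical, not actual Phoenix settings; only the Hadoop Configuration API is assumed.

import org.apache.hadoop.conf.Configuration;

public final class ScanRetryConfig {

    /**
     * Maximum number of times a scan is retried after a transient region failure.
     * Defaults to 3. Values below 1 disable retries entirely.
     * (Hypothetical key for illustration; not an actual Phoenix property.)
     */
    public static final String SCAN_RETRY_ATTEMPTS_ATTRIB = "example.scan.retry.attempts";
    public static final int DEFAULT_SCAN_RETRY_ATTEMPTS = 3;

    private final int scanRetryAttempts;

    public ScanRetryConfig(Configuration conf) {
        // Reading through one documented accessor keeps the default in a single place
        // and makes the option easy to find and reason about during review.
        this.scanRetryAttempts = conf.getInt(SCAN_RETRY_ATTEMPTS_ATTRIB, DEFAULT_SCAN_RETRY_ATTEMPTS);
    }

    public int getScanRetryAttempts() {
        return scanRetryAttempts;
    }
}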