I've read through the thread and have a few comments and an idea.

1) I can understand a preference for opt-in.
2) As a user, I would probably have opted in every time I hit a performance issue.
3) Opt-in data may well be skewed towards poorer use cases or hardware issues.
4) There is a trust gap that needs to be bridged before opt-out is acceptable.

Now for the idea: perhaps a report tool in nodetool that generates a human-readable profile, with a manual submission process in the short term and perhaps full automation down the line.

So basically there are two good plans in your email:

1) Standard reporting (+1)
2) Automated feedback (opt-in, +1)

p

________________________________
From: Jonathan Ellis <jbel...@gmail.com>
To: dev <dev@cassandra.apache.org>
Sent: Tuesday, 15 November 2011, 23:23
Subject: How is Cassandra being used?

I started a "users survey" thread over on the users list (replies are still trickling in), but as useful as that is, I'd like to get feedback that is more quantitative and with a broader base. This will let us prioritize our development efforts to better address what people are actually using it for, with less guesswork.

For instance: we put a lot of effort into compression for 1.0.0; if it turned out that only 1% of 1.0.x users actually enable compression, then it means that we should spend less effort fine-tuning that moving forward, and use the energy elsewhere.

(Of course it could also mean that we did a terrible job getting the word out about new features and explaining how to use them, but either way, it would be good to know!)

I propose adding a basic cluster reporting feature to cassandra.yaml, enabled by default. It would send anonymous information about your cluster to an apache.org VM.

Information like, number (but not names) of keyspaces and columnfamilies, ks-level options like compression, cf options like compaction strategy, data types (again, not names) of columns, average row size (or better: the histogram data), and average sstables per read.

Thoughts?

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com