Re: [DISCUSS] Reporting tool for feeding back zone, pod and cluster information

2013-11-28 Thread Wido den Hollander



On 11/26/2013 10:42 PM, Steve Wilson wrote:

I built something like this for products at Sun Microsystems.  We embedded
it into nearly everything we built:

The Java Runtime Environment
Open Office
Solaris
MySQL
Even things like Server LOMs
(the list goes on)

By default, when each of these products was installed or first run, it would try
to bring the user into the program.  It was always possible to opt out,
but we really worked to get people to not opt out.  We got shockingly HUGE
piles of data (literally from millions of installed product instances).
We didn't get any complaints (EVER) in the years we ran this program.  It
was hugely useful to the product teams.



I wouldn't go for opt-out by default. We might ask the question during
the initial management server installation, but it shouldn't be opt-out
without informing the user.



BTW, we didn't even make this data anonymous.  You could obviously choose
to be anonymous, but if people want to give their names/companies then why
not let them?  You'd be surprised how many people wouldn't mind.



I wouldn't want a database with all that information in there. I'm also
aiming for data about how CloudStack is used, not about who uses it.


We might make something where you can claim your data, but I'd
anonymize it anyway.
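
One way the "claim your data" idea could coexist with anonymized reports is
sketched below. Nothing here comes from the thread: the function names and the
token-plus-hash scheme are assumptions made purely for illustration. A random
token stays on the management server, only its SHA-256 digest travels with the
report, and presenting the token later proves ownership of that report.

```python
import hashlib
import secrets


def new_claim_token():
    """Create a random token that never leaves the management server."""
    return secrets.token_hex(16)


def report_id(claim_token):
    """Derive the identifier sent with each report (hypothetical scheme)."""
    return hashlib.sha256(claim_token.encode("ascii")).hexdigest()


def can_claim(stored_report_id, presented_token):
    """Check whether the presented token matches a stored report identifier."""
    return secrets.compare_digest(stored_report_id, report_id(presented_token))
```

The collecting side would store only the digest, so it learns nothing about
who sent a report unless the sender chooses to present the token.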


Wido


-Steve

On 11/26/13 12:49 PM, Chiradeep Vittal chiradeep.vit...@citrix.com
wrote:


+1.
Of course we must ensure proper treatment of this data (anonymization,
retention, removal, copyrights)

On 11/23/13 11:01 AM, Wido den Hollander w...@widodh.nl wrote:


Hi,

I discussed this during CCCEU13 with David, Chip and Hugo and I promised
I would put it on the ml.

My idea is to come up with a reporting tool which users can run daily
which feeds us back information about how they are using CloudStack:

* Hypervisors
* Zone sizes
* Cluster sizes
* Primary Storage sizes and types
* Same for Secondary Storage
* Number of management servers
* Version

This would of course be anonymized: we would send one file with JSON
data back to our servers, where we can process it to produce statistics.

The tool will obviously be open source and participating in this will be
opt-in only.
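
A minimal sketch of what such a report could look like, assuming a
hypothetical inventory structure and hypothetical field names (the proposal
only lists the categories, not a schema). It measures zone size in clusters
and cluster size in hosts, which is also an assumption. Names of zones,
clusters and storage pools are dropped, and nothing is produced unless the
operator has opted in.

```python
import json

# Sample data for illustration only; the structure is an assumption.
SAMPLE_INVENTORY = {
    "version": "4.2.0",
    "management_servers": 2,
    "zones": [
        {"name": "zone-amsterdam", "clusters": [
            {"name": "cluster-a", "hypervisor": "KVM", "hosts": 8},
            {"name": "cluster-b", "hypervisor": "XenServer", "hosts": 4},
        ]},
    ],
    "primary_storage": [{"name": "pool-a", "type": "NFS", "size_gb": 2048}],
    "secondary_storage": [{"name": "store-a", "type": "NFS", "size_gb": 4096}],
}


def strip_storage(pools):
    """Keep only type and size for each storage pool, dropping its name."""
    return [{"type": p["type"], "size_gb": p["size_gb"]} for p in pools]


def build_report(inventory, opted_in):
    """Return the JSON text to send, or None when the operator has not opted in."""
    if not opted_in:
        return None
    clusters = [c for z in inventory["zones"] for c in z["clusters"]]
    report = {
        "version": inventory["version"],
        "management_servers": inventory["management_servers"],
        "hypervisors": sorted({c["hypervisor"] for c in clusters}),
        "zone_sizes": [len(z["clusters"]) for z in inventory["zones"]],
        "cluster_sizes": [c["hosts"] for c in clusters],
        "primary_storage": strip_storage(inventory["primary_storage"]),
        "secondary_storage": strip_storage(inventory["secondary_storage"]),
    }
    return json.dumps(report, sort_keys=True)
```

Calling build_report(SAMPLE_INVENTORY, opted_in=True) yields a single JSON
document with counts, types and sizes only; with opted_in=False it returns
None and nothing would be sent.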

We currently don't know what's running out there, so that would be great
to know.

Some questions remain:
* Who is going to maintain the data?
* Who has access to the data?
* How long do we keep it?
* Do we do logging of IPs sending the data to us?

I certainly do not want to spy on our users, so that's why it's opt-in
and the tool should be part of the main repo, but I think that for us as
a project it's very useful to know what our users are doing with
CloudStack.

Comments?

Wido






Re: [DISCUSS] Reporting tool for feeding back zone, pod and cluster information

2013-11-26 Thread Daan Hoogland
I'd say the opt in can be a check in the start-up wizard. You want to
make sure enough people run it. Or even an intermediate page on the
admin logins that would have to be disabled explicitly.

<rant target="not the idea for this tool!" size="little">
I do not believe in data collected by optional inquiries. Would kvm
users have more incentive to participate than vmware users? Do
noredist users have the liberty to participate, or is it so that oss
users have used up their management credit for openness by the time they
get to the tool? Only full coverage data is really useful.
</rant>

let's be extra careful in handling and interpreting the results,
Daan




Re: [DISCUSS] Reporting tool for feeding back zone, pod and cluster information

2013-11-26 Thread Chiradeep Vittal
+1. 
Of course we must ensure proper treatment of this data (anonymization,
retention, removal, copyrights)




Re: [DISCUSS] Reporting tool for feeding back zone, pod and cluster information

2013-11-26 Thread Steve Wilson
I built something like this for products at Sun Microsystems.  We embedded
it into nearly everything we built:

The Java Runtime Environment
Open Office
Solaris
MySQL
Even things like Server LOMs
(the list goes on)

By default, when each of these products was installed or first run, it would try
to bring the user into the program.  It was always possible to opt out,
but we really worked to get people to not opt out.  We got shockingly HUGE
piles of data (literally from millions of installed product instances).
We didn't get any complaints (EVER) in the years we ran this program.  It
was hugely useful to the product teams.

BTW, we didn't even make this data anonymous.  You could obviously choose
to be anonymous, but if people want to give their names/companies then why
not let them?  You'd be surprised how many people wouldn't mind.

-Steve





Re: [DISCUSS] Reporting tool for feeding back zone, pod and cluster information

2013-11-25 Thread Sebastien Goasguen

On Nov 23, 2013, at 5:01 AM, Wido den Hollander w...@widodh.nl wrote:

 Hi,
 
 I discussed this during CCCEU13 with David, Chip and Hugo and I promised I
 would put it on the ml.
 
 My idea is to come up with a reporting tool which users can run daily which 
 feeds us back information about how they are using CloudStack:
 
 * Hypervisors
 * Zone sizes
 * Cluster sizes
 * Primary Storage sizes and types
 * Same for Secondary Storage
 * Number of management servers
 * Version
 
 This would of course be anonymized: we would send one file with JSON data
 back to our servers, where we can process it to produce statistics.
 
 The tool will obviously be open source and participating in this will be 
 opt-in only.
 
 We currently don't know what's running out there, so that would be great to 
 know.
 
 Some questions remain:
 * Who is going to maintain the data?
 * Who has access to the data?
 * How long do we keep it?
 * Do we do logging of IPs sending the data to us?
 
 I certainly do not want to spy on our users, so that's why it's opt-in and 
 the tool should be part of the main repo, but I think that for us as a 
 project it's very useful to know what our users are doing with CloudStack.
 
 Comments?
 

+1

 Wido