On 2014-09-09 6:55, Spencer Krum wrote:
Thanks for the positive feedback Andy!


    I'm wondering if there would be a way of saying "all of these
    installations are for the same 'site'". That would remove a module
    looking popular simply because it is installed a lot, but only by
    two or three groups. Maybe that information is valuable, maybe
    not...I'm not sure yet.


One of the common practices when building a system such as this is
keeping the people who send you data anonymous. That makes filtering by
user hard. I can think of two ways we could deal with that: we could
allow users to set an anonymous=false flag in the JSON blob they
deliver, or we could hash the source IP address and keep that around.
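
To illustrate the second option, here is a minimal sketch of hashing the
source IP on the server side. It is only an assumption about how this
could be done, not what puppet-analytics does. It uses a keyed hash
(HMAC-SHA-256) rather than a bare hash, because the IPv4 address space is
small enough that an unkeyed hash could be reversed by trying every
address. The function name and the secret value are hypothetical.

```python
import hashlib
import hmac


def anonymize_ip(ip: str, secret: bytes) -> str:
    """Return a keyed SHA-256 digest of the IP address as a hex string.

    The same ip and secret always give the same token, so reports from
    one source can be grouped without storing the address itself.
    """
    return hmac.new(secret, ip.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical secret; in practice it would live only on the server.
SECRET = b"server-side-secret"

token_a = anonymize_ip("192.0.2.10", SECRET)
token_b = anonymize_ip("192.0.2.10", SECRET)
token_c = anonymize_ip("192.0.2.11", SECRET)
```

Grouping by the resulting token would let the site count distinct
sources per module, which is roughly the "same site" question Andy
raised, although several masters behind one NAT would still collapse
into a single token.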


How about a UUID for every master? (Using, say, UUID version 5, which is
SHA-1 based, to make it completely anonymous.) If we write this UUID to
the configuration, it would provide a long-term identity for the master.
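
A rough sketch of that idea, under stated assumptions: the identifier is
a version 5 UUID derived from a random seed (so it carries no host
information), and it is written to a small file the first time it is
needed. The file location, the namespace name and the function name are
all hypothetical and not part of Puppet or puppet-analytics.

```python
import os
import tempfile
import uuid
from pathlib import Path

# Hypothetical namespace, derived from the project's domain name.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "puppet-analytics.org")


def master_id(path: Path) -> str:
    """Return the stored master UUID, creating and saving it on first use."""
    if path.exists():
        return path.read_text(encoding="utf-8").strip()
    # The name is random, so the UUID cannot be traced back to the host.
    seed = os.urandom(16).hex()
    ident = str(uuid.uuid5(NAMESPACE, seed))
    path.write_text(ident + "\n", encoding="utf-8")
    return ident


# Demonstration in a temporary directory instead of a real config path.
with tempfile.TemporaryDirectory() as tmp:
    id_file = Path(tmp) / "analytics_id"
    first = master_id(id_file)
    second = master_id(id_file)  # read back from the file, unchanged
```

Because the value is read back on later calls, every report from the same
master would carry the same identifier until the file is deleted.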

The way I intended it to be used was for users doing CI to report that
in the purpose field. That way we could see total deployments, but also
per-usage deployments. I'm not sure users would be willing to
differentiate how they run a script between production and CI though,
since the goal of CI is to test as close to prod as you can.



    I'm personally a little cautious about making a deploy process
    depend on external services, but this could be fired off as a
    background job and it doesn't really matter too much if it works or not.


I agree that it is a big pill to swallow. This will likely change, but
right now every deploy must be reported in its own curl request; there
are no bulk updates. It is also not possible to 'back-fill' data, so
deploys are recorded at the time they are submitted to puppet-analytics.
I could see deploys for the day being written to a file or database on
the user's systems, then a nightly job running to fill in the day's
deploys on puppet-analytics, but it would require some changes to the
code.
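
As a sketch of the client side of that idea only: deploys are appended to
a local spool file, and a nightly job sends them one at a time and keeps
whatever could not be sent. The record fields, file layout and function
names are hypothetical, and the send step is passed in as a callable
because the server would still need the changes mentioned above before
it could accept back-filled dates.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def record_deploy(spool: Path, module: str, version: str) -> None:
    """Append one deploy record to the spool file as a JSON line."""
    with spool.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"module": module, "version": version}) + "\n")


def flush_spool(spool: Path, send: Callable[[dict], None]) -> int:
    """Send every spooled record; keep unsent ones if send() raises."""
    if not spool.exists():
        return 0
    pending = spool.read_text(encoding="utf-8").splitlines()
    sent = 0
    try:
        for line in pending:
            send(json.loads(line))
            sent += 1
    finally:
        remaining = pending[sent:]
        if remaining:
            # Something failed part-way: retry the rest on the next run.
            spool.write_text("\n".join(remaining) + "\n", encoding="utf-8")
        else:
            spool.unlink()
    return sent


# Demonstration with an in-memory "server" instead of a real HTTP call.
received = []
with tempfile.TemporaryDirectory() as tmp:
    spool_file = Path(tmp) / "deploys.jsonl"
    record_deploy(spool_file, "stdlib", "4.3.2")
    record_deploy(spool_file, "apache", "1.1.0")
    count = flush_spool(spool_file, received.append)
    spool_emptied = not spool_file.exists()
```

A deploy process written this way never waits on the network; only the
nightly job talks to the external service.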

That sounds like a very good improvement; it would increase the
willingness to submit reports.

I weighed the trade-offs of allowing arbitrary date insertion. I'm happy
to be convinced otherwise, but I think figuring out when a deploy
occurred, when it is reported through a global system with timezones and
all that, is very hard to get right.


Some experience from Eclipse, which used to have a usage collection
framework: over time it returned less and less value and just confirmed
what everyone already knew from simpler measures like "number of
downloads". It ended up only wasting cycles and disk space.

Some "module" authors did make use of the facilities to measure in more
detail which features of their "modules" were actually used, and how
often. They did this to ensure they focused on the right set of
features, and to prune old, unused features that were expensive to
maintain. This naturally required the module owners to make calls to the
API. It was only used by a few projects at Eclipse, and they were not
happy when the collection mechanism was turned off.

Just something worth considering.

- henrik

Thanks again,
Spencer



On Mon, Sep 8, 2014 at 11:21 AM, Andy Parker <a...@puppetlabs.com
<mailto:a...@puppetlabs.com>> wrote:

    On Sun, Sep 7, 2014 at 3:57 PM, Spencer Krum <krum.spen...@gmail.com
    <mailto:krum.spen...@gmail.com>> wrote:

        Hi Puppet-dev,

        I've been working, with a lot of help from some others, on a new
        project at http://puppet-analytics.org. It is very much in the
        experimental/development phase and I'm looking for feedback and
        help.

        The goal of this project is to enable module authors and users
        greater visibility into module use. The architecture is modeled
        after Debian's popularity contest, where a program on the debian
        system reports to a central server about package use. This means
        that Puppet users can submit (through a JSON/HTTP endpoint) 'hey
        I've deployed this version of stdlib!'. After a bunch of users
        have been reporting for a while, module maintainers can see the
        trends, identify which versions of the modules are being used,
        etc. Similarly users can see which modules are the most popular,
        which versions of those modules are the most popular, etc.


    This all looks awesome!

        There is an arbitrary tagging system built in that allows users
        to report that the deploy is being performed by their ci
        infrastructure, by a developer doing testing, or by an operator
        pushing code to production. This allows people viewing the data
        to see the 'true' numbers, unpolluted by ci systems or runaway
        webcrawlers.


    I'm wondering if there would be a way of saying "all of these
    installations are for the same 'site'". That would remove a module
    looking popular simply because it is installed a lot, but only by
    two or three groups. Maybe that information is valuable, maybe
    not...I'm not sure yet.

        Reporting can be done with curl, or with a script. Right now
        there is a script and example curl to report to puppet analytics
        at: https://github.com/nibalizer/puppet-analytics-client. I
        think everyone's infrastructure looks a little different, so
        writing a generic tool to report to PA would be pretty hard. I'd
        like puppet-analytics-client to become a place to put scripts
        and tools to hit PA.

        I'm interested in your thoughts and opinions. Especially around
        the opt-in architecture. Would you be willing to report to PA?
        Do you think we would ever be able to get enough people
        reporting that the data would be significant? All the code is
        open source on github
        (https://github.com/nibalizer/puppet-analytics). The website is
        hosted on digital ocean. I also have the mental model that
        people would report after every code change to their Puppet
        infrastructure, i.e. in the post-commit hook if using dynamic
        environments. Is this a model you agree with? Do you have a
        different idea?


    I think that is a great thing to shoot for. I'm personally a little
    cautious about making a deploy process depend on external services,
    but this could be fired off as a background job and it doesn't
    really matter too much if it works or not.

        We have had a lot of conversations, on this list, and in person,
        around 'what are people doing with puppet?' I think a tool like
        this could really help us figure out which modules are being
        used the most often.


    Currently I answer this by trawling through a dump of the forge that
    we have available internally. However, my questions often revolve
    around how people are using the language rather than what modules
    are in use. That said, knowing which modules are heavily used would
    help everyone to understand a lot more.

        Please note that PA is not nearly done yet. Much of the empty
        space I expect will be filled in with cool visualizations of the
        data. It is liable to break at any time, especially with actual
        users. One of the cool features that is currently in PR is the
        ability to have shields.io <http://shields.io> downloads tags
        come from PA and show up in the ReadMe's of our modules.

        Thanks everybody,
        Spencer

        --
        Spencer Krum
        (619)-980-7820 <tel:%28619%29-980-7820>

        --
        You received this message because you are subscribed to the
        Google Groups "Puppet Developers" group.
        To unsubscribe from this group and stop receiving emails from
        it, send an email to puppet-dev+unsubscr...@googlegroups.com
        <mailto:puppet-dev+unsubscr...@googlegroups.com>.
        To view this discussion on the web visit
        
https://groups.google.com/d/msgid/puppet-dev/CADt6FWPoK7N6pwPj4h6_84p-6WEwtz3N6zJbuJniRkHaMi9HBA%40mail.gmail.com
        
<https://groups.google.com/d/msgid/puppet-dev/CADt6FWPoK7N6pwPj4h6_84p-6WEwtz3N6zJbuJniRkHaMi9HBA%40mail.gmail.com?utm_medium=email&utm_source=footer>.
        For more options, visit https://groups.google.com/d/optout.




    --
    Andrew Parker
    a...@puppetlabs.com <mailto:a...@puppetlabs.com>
    Freenode: zaphod42
    Twitter: @aparker42
    Software Developer

    *Join us at PuppetConf 2014 <http://www.puppetconf.com/>, September
    22-24 in San Francisco*
    /Register by May 30th to take advantage of the Early Adopter
    discount <http://links.puppetlabs.com/puppetconf-early-adopter>
    //—//save $349!/





--
Spencer Krum
(619)-980-7820



--

Visit my Blog "Puppet on the Edge"
http://puppet-on-the-edge.blogspot.se/

