Hi Pierre,

You are absolutely right, and that's why I started collecting these
numbers.  Our numbers look slightly "off" because we have a few
committers who have been inactive for about 9 months (that is, since
shortly after the start of incubation) and have done nothing on Trafodion
in that time.  But we do know where we are now, and we can focus on adoption
and growing our user base. That's what leads to contributors and committers.

-Carol P.

---------------------------------------------------------------
Email:    [email protected]
Twitter:  @CarolP222
---------------------------------------------------------------

On Thu, Mar 31, 2016 at 1:40 AM, Pierre Smits <[email protected]>
wrote:

> Hi Carol, all,
>
> You are right, numbers without context mean nothing. It is all about
> correlation. Yet, one must start to measure first before the insights can
> be created. But it must not be the end goal. It must all be seen in
> relation to adoption, community growth and health.
>
> Best regards,
>
> Pierre Smits
>
> ORRTIZ.COM <http://www.orrtiz.com>
> OFBiz based solutions & services
>
> OFBiz Extensions Marketplace
> http://oem.ofbizci.net/oci-2/
>
> On Wed, Mar 30, 2016 at 10:13 AM, Carol Pearson <
> [email protected]>
> wrote:
>
> > Hi,
> >
> > I've looked at a bunch of things to get a handle on our users, growth, and
> > what some other Apache projects have for committers and community.
> > Trafodion is a database project, so I went looking for real data,
> > everything from participation on our email lists (and new posts there) to
> > Jira activity to GitHub forks and pulls and commits.  I also monitor some
> > more fanciful stats, looking for references to Trafodion on Twitter,
> > Stack Overflow, etc.
> >
> > As far as email list activity goes, I use the data from the mailing list
> > archive.  The user list was very quiet (fewer than 20 emails total) from
> > when Trafodion started incubating through December.  That's not very
> > inviting - users who drove by to check us out didn't see much activity,
> > even though there was a lot.  So I don't pay too much attention to data in
> > that range.  Our user list has shown a big jump in usage since then,
> > slightly cannibalizing the dev list.
> >
> > Here are the numbers I have for Jan/Feb/March.  Sorry for the funky ASCII
> > formatting, but mailing lists don't do attachments and tables very well:
> >
> > User List:
> >
> > MON      Total Posts  Distinct Posters  Non-Esgyn Posters
> > =========================================================
> > JAN2016           19                12                  2
> > FEB2016          291                42                  6
> > MAR2016          126                25                  1
> >
> > Dev List:
> >
> > MON      Total Posts  Distinct Posters  Non-Esgyn Posters
> > =========================================================
> > DEC2015          243                29                  6
> > JAN2016          199                24                  3
> > FEB2016          181                24                  4
> > MAR2016          200                31                  4
> >
> >
> > Note that Dec 2015 was a release month and the Non-Esgyn posters were mostly
> > IPMC posters helping guide our release with respect to things like
> > licensing guidance.
> >
> > So we're seeing some additional participation but it's still heavily
> > dominated by Esgyn.
> >
> > I count distinct posters by email address, so posters that use two
> > different emails count twice.
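
For anyone who wants to reproduce this kind of count, here is a minimal
sketch in Python of tallying posts per month by sender address. It is not
the script used for the numbers above. The function name, the input shape
(month, address pairs pulled from an archive) and the sample addresses are
all assumptions for illustration, and lowercasing the address before
comparing is an extra assumption on my part.

```python
from collections import defaultdict


def monthly_poster_stats(posts, home_domain):
    """Summarize list traffic per month.

    posts: iterable of (month, sender_email) pairs, one per archived message.
    home_domain: e-mail domain treated as the sponsoring company (an assumption).
    Returns {month: (total_posts, distinct_posters, outside_posters)}.
    """
    totals = defaultdict(int)
    senders = defaultdict(set)
    for month, email in posts:
        totals[month] += 1
        # Distinct posters are keyed on the address itself, so one person
        # posting from two addresses is counted twice.
        senders[month].add(email.strip().lower())

    suffix = "@" + home_domain.lower()
    stats = {}
    for month, addresses in senders.items():
        outside = [a for a in addresses if not a.endswith(suffix)]
        stats[month] = (totals[month], len(addresses), len(outside))
    return stats


# Hypothetical sample data; these addresses are made up for illustration.
sample_posts = [
    ("JAN2016", "alice@company.example"),
    ("JAN2016", "alice@company.example"),
    ("JAN2016", "alice@home.example"),   # same person, second address
    ("JAN2016", "bob@elsewhere.example"),
]
stats = monthly_poster_stats(sample_posts, "company.example")
print(stats["JAN2016"])
```

Classifying by domain also means a company employee posting from a personal
address shows up as an outside poster, so the outside count is an upper
bound at best.
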
> >
> > We have Google Analytics on the newly-redesigned website.  It shows similar
> > numbers of hits between new users and returning users, but I'm not sure how
> > significant that is, since many returning users from Esgyn don't need to
> > re-hit the website.
> >
> > Still, data is data, and here's a sample for the period from 29Feb through
> > today, 29Mar:
> >
> > Metric                New User  Returning User    Total
> > =======================================================
> > Sessions                   885             895     1780
> > %New Sessions             100%              0%   49.72%
> > Bounce Rate                60%          48.83%   54.38%
> > Pages/Session             2.09            2.39     2.24
> > Avg Session Duration     02:01           02:57    02:29
> >
> > And so on.
> >
> >
> > But one thing I've learned over the years is that numbers are just....
> > numbers.  These are nice (and I have plenty more), but the real question
> > is, "what's a good score?"  What's typical for Apache projects for
> > committer distribution? What's typical for user list activity?
> >
> > I started with the first question: Where do committers come from and what's
> > their distribution?  I used the Apache committer lists and the websites
> > that indicated committer affiliation. This wasn't perfect:  Some projects
> > don't list committer affiliation; I can't trust others to be perfectly
> > up-to-date.  Further, it doesn't indicate committer activity. Still, it
> > gives some targets.
> >
> > After I started, I refined the data a little bit by looking for projects
> > similar to Trafodion along a couple of possible vectors:  data management
> > or Hadoop/Big Data ecosystem and recently graduated.  The latter category
> > is particularly interesting to me because I would expect more diversity of
> > committers over time, if only because developers move around.
> >
> > I was not able to collect data on currently incubating projects because the
> > list of committers I worked from on ASF did not include incubating projects
> > in the phonebook, though the reports have them and many project websites
> > have them.  I was more interested in projects that climbed the mountain
> > we're trying to climb.
> >
> > Here's some of the data I collected back in February:
> >
> > Trafodion:
> > ORG                   Count     Pct
> > ===================================
> > Esgyn                    10  66.67%
> > orrtiz.com                1   6.67%
> > Unavailable/Inactive      4  26.67%
> > Total                    15
> >
> > HBase:
> > ========================
> > Cloudera 12 26%
> > Continuuity  1 2%
> > Dropbox 1 2%
> > Explorys 1  2%
> > Facebook 9  19%
> > Hortonworks  7  15%
> > IBM  1  2%
> > Intel 2  4%
> > Salesforce.com 3 6%
> > Scaled Risk 1 2%
> > Taobao 1 2%
> > unaffiliated 1 2%
> > WANdisco 1 2%
> > Xiaomi 4 9%
> > Yahoo! 1 2%
> > Yuantiku 1 2%
> >
> >
> > Formatting this is getting crazy and it's getting late since I was up early
> > travelling. I'll just C&P, and my apologies for the alignment.
> >
> > Ignite:  Graduated Sept 2015
> > ChronoTrack 1 4%
> > CyberAgent, Inc. 1 4%
> > Engiweb Security 1 4%
> > Evosent Consulting 1 4%
> > Fitech Source 1 4%
> > GridGain 14 58%
> > Pivotal 1 4%
> > Shoutlet 1 4%
> > Trend Micro 1 4%
> > WANdisco 2 8%
> > Grand Total 24
> >
> > Calcite:  Graduated Nov 2015
> > Dremio 1 7%
> > Hortonworks 7 47%
> > Intel 1 7%
> > MapR 3 20%
> > NetCracker 1 7%
> > NGData 1 7%
> > Salesforce 1 7%
> > Grand Total 15
> >
> > Spark:
> > Alibaba 1 2%
> > Bizo 1 2%
> > ClearStory Data 1 2%
> > Cloudera 4 9%
> > Databricks 15 34%
> > Databricks, MIT 1 2%
> > Facebook 1 2%
> > Hortonworks 1 2%
> > IBM 1 2%
> > Intel 2 5%
> > Mxit 1 2%
> > Netflix 1 2%
> > NTT Data 1 2%
> > Quantifind 1 2%
> > QuestTec B.V. 1 2%
> > Tachyon Nexus 1 2%
> > UC Berkeley 5 11%
> > University of Michigan, Ann Arbor 1 2%
> > Webtrends 1 2%
> > Yahoo! 3 7%
> > Grand Total 44
> >
> >
> >
> > I have a spreadsheet with a bunch more companies that I'm happy to send to
> > anyone who asks - the data was all gleaned publicly.
> >
> > The upshot from what I saw was that even recently graduated projects are
> > typically in the 50-60% range of committers from a single company (and I
> > would guess they are moving away from that as part of the Apache way). The
> > largest percent I saw was 76% on the Ambari project.
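
The single-company percentages can be rechecked from the raw counts. Below
is a minimal sketch in Python, using the Trafodion and Calcite counts
quoted in this message; the function names are made up for illustration,
and the spreadsheet itself may compute things differently.

```python
def affiliation_shares(counts):
    """Return {org: percent of committers}, rounded to two decimals."""
    total = sum(counts.values())
    return {org: round(100 * n / total, 2) for org, n in counts.items()}


def largest_share(counts):
    """Return (org, percent) for the organization with the most committers."""
    shares = affiliation_shares(counts)
    top = max(shares, key=shares.get)
    return top, shares[top]


# Counts copied from the tables in this message.
trafodion = {"Esgyn": 10, "orrtiz.com": 1, "Unavailable/Inactive": 4}
calcite = {"Dremio": 1, "Hortonworks": 7, "Intel": 1, "MapR": 3,
           "NetCracker": 1, "NGData": 1, "Salesforce": 1}

for name, counts in (("Trafodion", trafodion), ("Calcite", calcite)):
    org, pct = largest_share(counts)
    print(f"{name}: {org} holds {pct}% of {sum(counts.values())} committers")
```

Note that this counts every listed committer equally; it says nothing about
who is actually active, which is the caveat already raised for the source
lists.
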
> >
> > So that's some of the user data/growth data I have.  Apparently, I'm more
> > of a data junky than I thought....
> >
> > -Carol P.
> >
> >
> > ---------------------------------------------------------------
> > Email:    [email protected]
> > Twitter:  @CarolP222
> > ---------------------------------------------------------------
> >
> > On Tue, Mar 29, 2016 at 6:57 PM, Andrew Purtell <[email protected]>
> > wrote:
> >
> > > On Tue, Mar 29, 2016 at 10:01 AM, Pierre Smits <[email protected]
> >
> > > wrote:
> > >
> > > > A
> > > > distribution with Apache only elements (Hadoop, HBase, Zookeeper,
> > Ambari,
> > > > etc) would surely be a nice-to-have, and also a means to show
> > > cross-selling
> > > > Apache products that could lead to cross-pollination (adoption and
> > > > community growth wise).
> > > >
> > >
> > > That's known as Apache Bigtop.
> > >
> > >
> > >
> > > --
> > > Best regards,
> > >
> > >    - Andy
> > >
> > > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > > (via Tom White)
> > >
> >
>
