Hi all,

As those of you who have been around for a while might know, I am very interested in community metrics, and I think it's important that those metrics be as visible and transparent as possible.

I've been working on what needs to happen to get a community dashboard going, with a web page updated in real time (or close to it) with the data from the various sources. I'd like to support some more complex querying & reports, and ideally allow integration with office suites too.

The goal is to save time for the people who have been manually extracting this data every month for Dawn (and of course save Dawn some time in preparing her monthly reports), and also to ensure that everyone has timely access to this data in a nice way.

You can follow the progress and help out in the wiki: http://wiki.meego.com/Metrics/Dashboard

The basic idea is to automate this:

* Gather data from various data sources into a usable form:
  * Drupal (user accounts, active users, ...)
  * git (commits, active developers, ...)
  * Mailing lists (emails per list, per user, per month, ...)
  * Wikipedia (popular pages, active users, total edits, ...)
  * Bugzilla (new bugs, closed bugs, new comments, patch proposals, patch review, ...)
  * Forums (new posts, new users, active users, ...)
  * IRC (active users, total activity, across various channels)
  * Transifex (new translations, active translators, ...)
  * Community OBS (uploads, active users, popular downloads, ...)
  * SDK downloads and other web analytics (would be nice to get)
* Transform the data into useful metrics via queries
* Present the data in graphical form, showing the evolution of the various stats from week to week or month to month

The usual way of doing this is to use some kind of Business Intelligence platform: suck data in using an ETL engine, store it in some kind of local data store, create and store reports using a reports engine, and present the data in a dashboard, which can be a thin or thick client. The two open source BI engines worth considering are Pentaho and Jaspersoft.

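To make the gather / transform / present flow a bit more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not what Kettle or JasperETL actually do: the sample events are made up, and the table name `monthly_activity` and the functions `transform`, `load` and `report` are hypothetical names chosen for this example. A real setup would read from the git logs, list archives, Bugzilla database and so on.

```python
import sqlite3
from collections import Counter
from datetime import date

# Made-up sample records of the form (source, event date). In practice these
# would be extracted from a git log, a mailing list archive, Bugzilla, etc.
SAMPLE_EVENTS = [
    ("git", date(2010, 9, 3)),
    ("git", date(2010, 9, 17)),
    ("git", date(2010, 10, 1)),
    ("mailing-list", date(2010, 9, 5)),
    ("mailing-list", date(2010, 10, 2)),
    ("mailing-list", date(2010, 10, 9)),
]


def transform(events):
    """Aggregate raw events into counts per (source, 'YYYY-MM')."""
    return Counter((source, day.strftime("%Y-%m")) for source, day in events)


def load(conn, counts):
    """Store the monthly counts in a local table that reports can query."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS monthly_activity "
        "(source TEXT, month TEXT, events INTEGER, "
        "PRIMARY KEY (source, month))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO monthly_activity VALUES (?, ?, ?)",
        [(source, month, n) for (source, month), n in counts.items()],
    )
    conn.commit()


def report(conn, source):
    """Return [(month, events), ...] for one source, oldest month first."""
    rows = conn.execute(
        "SELECT month, events FROM monthly_activity "
        "WHERE source = ? ORDER BY month",
        (source,),
    )
    return rows.fetchall()


def main():
    # An in-memory SQLite database stands in for the local data store.
    conn = sqlite3.connect(":memory:")
    load(conn, transform(SAMPLE_EVENTS))
    for source in ("git", "mailing-list"):
        print(source, report(conn, source))


if __name__ == "__main__":
    main()
```

The month-by-month rows returned by `report` are the kind of series a dashboard would plot. A BI platform adds scheduling, connectors for each data source, and the presentation layer on top of this basic shape.
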
So far, I'm leaning towards Pentaho, primarily because there seems to be better support for dashboards within the community, but I am open to input & any experiences that people might have.

I am also open to any input people might have in helping integrate some of these services together. Right now, I have local instances of much (but not all) of the software, and we will need to figure out interchange formats for everything. Anything with a MySQL database should be straightforward, but for things like downloads, web analytics, mailing lists, IRC and the forum, where the data will be going through a different format, the integration might be a little trickier.

For the time being, I will be investigating Kettle and JasperETL (which, as far as I can tell, is just a rebranded Talend Open Studio) and figuring out how to get data into the BI server from the various apps.

So what specific feedback would I like?

* What data would people find useful to know about the various meego.com services?
* Does anyone have experience with Jasper & Pentaho, and can you give sensible feedback on the advantages & disadvantages of each?
* Does anyone want to help with the integration of the various services, once I get a public BI platform installed?
* Does anyone think that I'm smoking crack with this basic architecture, and care to suggest something simpler/quicker/easier?

Thanks!
Dave.

-- 
Email: dne...@maemo.org
Jabber: bo...@jabber.org
_______________________________________________
MeeGo-community mailing list
MeeGo-community@meego.com
http://lists.meego.com/listinfo/meego-community