RE: plugin to gather statistics, make a summary and store them into a DB

Vincent Massol Thu, 06 May 2004 11:59:29 -0700

Hi Jerome,

I'm also working on this subject, so it's great that someone else is
also interested!


The goal is to be able to gather data from all existing "quality"
plugins like junit, jdepend, checkstyle, PMD, etc into one single report
and to perform analysis of data over time, trends, etc.

I've been thinking about 2 ways to achieve this:

Solution 1:

This is what I have started with the dashboard plugin. As you've
probably seen, it provides an aggregated static view of this quality
data. It doesn't support history for a very simple reason: I wanted to
separate the 2 features. Thus, I've started a history plugin whose goal
is to save data over time. Then, my idea was to provide other plugins
that will do the analysis of data, correlate them, draw graphs, etc.

Thus, to summarize, my strategy was to do it in 3 steps:
- step1: create an aggregated data report plugin (this is the dashboard
plugin)
- step2: create a history plugin which allows saving the history of any
report (including the dashboard report, but also any other report like
the checkstyle report, etc)
- step3: create analysis plugins that show trends, with color
highlighting when thresholds are crossed, etc
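
As a rough illustration of what a step3 analysis could compute, here is a
minimal Java sketch. It assumes a metric's saved history is already
available as a chronological list of numbers. The class name, the method
names and the GREEN/RED colors are all hypothetical and do not come from
any existing plugin.

```java
import java.util.List;

/** Hypothetical analysis helper; the names are illustrative only. */
final class TrendCheck {

    /** Assumed highlight colors for a report cell. */
    enum Highlight { GREEN, RED }

    /** Change between the oldest and the newest value of a chronological series. */
    static double change(List<Double> series) {
        if (series.isEmpty()) {
            throw new IllegalArgumentException("series must not be empty");
        }
        return series.get(series.size() - 1) - series.get(0);
    }

    /** RED once the latest value exceeds the threshold, GREEN otherwise. */
    static Highlight highlight(List<Double> series, double threshold) {
        if (series.isEmpty()) {
            throw new IllegalArgumentException("series must not be empty");
        }
        double latest = series.get(series.size() - 1);
        return latest > threshold ? Highlight.RED : Highlight.GREEN;
    }
}
```

For example, a series of checkstyle violation counts could be passed to
highlight() with a project-specific threshold to decide how to color the
corresponding cell in a trend report.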

For the history plugin, here's how I imagine implementing it:
- history data is saved in the Maven repository under a "report" type,
i.e. under groupId/reports/artifactId-<report date>.<report type, say
xml>
- there is a goal to install/deploy reports to the repository
- there are goals to download reports matching some criteria (groupId,
artifactId, range of date, etc).
- there is also a need for a special file in groupId/reports/ that
provides information on what reports exist in the repository. Let's call
this file reportlist.xml. This file is automatically modified by the
history plugin when a report is uploaded/installed in the repository.
This is required because there's no way to query the repository for the
list of files available.
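
The layout and index described in the list above could be sketched
roughly as follows. This is only an illustration under assumptions: the
ReportIndex class and its methods are hypothetical, the yyyyMMdd date
format in the file name is a guess, and the index is kept in memory
instead of being written to reportlist.xml, whose format is not defined
here.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for the reportlist.xml index of one group. */
final class ReportIndex {

    /** One stored report: which artifact, when, and in which format. */
    static final class Entry {
        final String artifactId;
        final LocalDate date;
        final String type;

        Entry(String artifactId, LocalDate date, String type) {
            this.artifactId = artifactId;
            this.date = date;
            this.type = type;
        }
    }

    // Assumed format for the "<report date>" part of the file name (yyyyMMdd).
    private static final DateTimeFormatter DATE = DateTimeFormatter.BASIC_ISO_DATE;

    private final String groupId;
    private final List<Entry> entries = new ArrayList<>();

    ReportIndex(String groupId) {
        this.groupId = groupId;
    }

    /** Builds groupId/reports/artifactId-<report date>.<report type>. */
    private String pathOf(Entry e) {
        return groupId + "/reports/" + e.artifactId + "-" + DATE.format(e.date) + "." + e.type;
    }

    /** Records a report in the index and returns its repository path. */
    String install(String artifactId, LocalDate date, String type) {
        Entry e = new Entry(artifactId, date, type);
        entries.add(e);
        return pathOf(e);
    }

    /** Paths of the reports for one artifact dated within [from, to], inclusive. */
    List<String> find(String artifactId, LocalDate from, LocalDate to) {
        List<String> paths = new ArrayList<>();
        for (Entry e : entries) {
            if (e.artifactId.equals(artifactId)
                    && !e.date.isBefore(from) && !e.date.isAfter(to)) {
                paths.add(pathOf(e));
            }
        }
        return paths;
    }
}
```

The find() method plays the role of the "download reports matching some
criteria" goals: since the repository itself cannot be queried, every
lookup goes through the index that install() keeps up to date.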

Solution 2:

I've posted an entry about this one in my blog (I'm offline and I don't
remember the entry title but it's something like "wanted: a common API
for storing parsing data"). The idea is to do what CAST Software is
doing with their product: they have parsers that parse source code
(java, c#, SQL, etc) and that store this raw data (with dates) in a
database. Then they have analyzers that query the database and generate
reports.

The problem in open source is that each tool has its own format and we
don't have a common storage area and common API to store parsing data
in, say, a database. What we could do would be to create a project that
takes different input files (checkstyle report file, PMD report file,
Simian report file, Clover report files, etc) and saves them in a
common format in a database. Then we could wrap this project with a
Maven plugin so that it is integrated with Maven and knows the location
of the different plugin reports.

Then the second step would be to develop analyzer plugins that query
this database to generate reports. 
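
To make the two steps more concrete, here is a minimal Java sketch of a
pivot format and an analyzer-style query. Everything in it is an
assumption made for illustration: the PivotStore class, the
(tool, metric, date, value) shape of a measurement, and the in-memory
list that stands in for a real database table.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical store; an in-memory list stands in for a database table. */
final class PivotStore {

    /** One measurement in the assumed common ("pivot") format. */
    static final class Measurement {
        final String tool;
        final String metric;
        final LocalDate date;
        final double value;

        Measurement(String tool, String metric, LocalDate date, double value) {
            this.tool = tool;
            this.metric = metric;
            this.date = date;
            this.value = value;
        }
    }

    private final List<Measurement> rows = new ArrayList<>();

    /** Step 1: a tool-specific importer would call this for each parsed value. */
    void save(String tool, String metric, LocalDate date, double value) {
        rows.add(new Measurement(tool, metric, date, value));
    }

    /** Step 2: analyzer-style query returning one metric's values ordered by date. */
    Map<LocalDate, Double> history(String tool, String metric) {
        Map<LocalDate, Double> result = new TreeMap<>();
        for (Measurement m : rows) {
            if (m.tool.equals(tool) && m.metric.equals(metric)) {
                result.put(m.date, m.value);
            }
        }
        return result;
    }
}
```

In a real implementation, save() and history() would presumably become
an INSERT and a SELECT against the database, with one importer per
report format feeding save().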

This strategy is somewhat similar to solution 1 but is better structured
in the sense that there is a common pivot format and it goes through a
database, which makes it scalable in terms of queries.

I would prefer this solution.

Now that we have good pure Java databases, like Hypersonic SQL, I think
it would be easy to have a "database" artifact type, so that we can save
the database content in binary format in the repository (in order to
share it with other users of course). That would be instead of saving
the raw data as in solution 1. What would be saved is the data in the
pivot format, in a queryable form.

Of course all this means there is "someone" (a single person) who is in
charge of generating these reports and saving them in the repository. It
makes sense that this "someone" is the continuous build process or the
nightly build process.

Would you be interested in working on this?

Thanks
-Vincent

PS: I'm currently offline traveling to TSSS2004. I hope I can find some
internet connection to post this... It'll probably be at least 1 or 2
days old when you receive it... :-)

> -----Original Message-----
> From: Jerome Lacoste [mailto:[EMAIL PROTECTED]
> Sent: 04 May 2004 22:22
> To: Maven Developers List
> Subject: plugin to gather statistics, make a summary and store them
> into a DB
> 
> Hi,
> 
> I've had this idea for a while. I prefer to bring it up on the dev
> mailing list instead of issue tracker, because I work offline most of
> the time, and because I wanted to get some feedback before creating an
> issue.
> 
> My primary intent would be to gather info from the build and store
> them into a DB. The info would be generated from the different plugins.
> For example: nb of lines of code, number of bugs found by findbugs, etc...
> 
> At each build, this data is already gathered, but in different places
> by different plugins. At least in the maven version I am using
> (1.0rc2). I want to centralize and store it.
> 
> I've seen things in the dev list talking about multi-projects and
> dashboard. Maybe those are related? (I couldn't find much information
> about those in my personal archives). I've got the maven
> 1-0-rc3-snapshot PDF doc from the 30th of March but it says nothing
> about those.
> 
> 
> 
> One way I could see this working would be by:
> 
> - having this plugin create a raw property file in the target dir,
> with a property containing the data timestamp (perhaps need another one
> to contain the date format to parse it out?)
> 
> - give every plugin the possibility to add some data to this file. My
> understanding is that the plugins are not run in parallel, so there is
> no problem writing sequentially to the property file.
> 
> - a summary page is made out of that information.
> 
> - that information is stored into a DB.
> 
> There are probably different ways to architect that plugin, and using
> a property file is perhaps not the best thing. I've just tried to find a
> simple solution that could work. My main idea was to not make this new
> plugin dependent on the others, but rather the other way around.
> 
> From the DB, graphs can be generated to check the evolution of the
> project. (One cool graph could be to have the evolution of the code
> size against the effort put in, such as described in Mythical Man-Month,
> but that would require having the number of hours spent by the different
> developers. This diagram is probably not very appropriate for open
> source projects, but for more closed projects it would fit well.)
> 
> One thing to keep in mind, is that it would be nice to be able to
> script maven to retroactively build statistics, e.g. for the first time,
> or when a new type of data is to be added to the stored information.
> That means that the task requires a timestamp argument, same timestamp
> used to retrieve the code from the source repository.
> 
> 
> So now that I've described the idea, I would like to know:
> 
> - is something like that already feasible?
> 
> - if not would it make more sense for m2?
> 
> - comments? Especially whether or not I should open an issue.
> 
> Cheers,
> 
> Jerome
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]



