RE: plugin to gather statistics, make a summary and store them into a DB

2004-05-06 Thread Jerome Lacoste
On Thu, 2004-05-06 at 00:48, Vincent Massol wrote:
> Hi Jerome,
> 
> I'm also working on this subject so that's great someone else is also
> interested! 

Vincent, 

Thanks for the answer.

> The goal is to be able to gather data from all existing "quality"
> plugins like junit, jdepend, checkstyle, PMD, etc into one single report
> and to perform analysis of data over time, trends, etc.
> 
> I've been thinking about 2 ways to achieve this:
> 
> Solution1:
> 
> This is what I have started with the dashboard plugin. As you've
> probably seen it provides an aggregated static view of these quality
> data. It doesn't support history for a very simple reason: I wanted to
> separate the 2 features. Thus, I've started a history plugin whose goal
> is to save data over time. Then, my idea was to provide other plugins
> that will do the analysis of data, correlate them, draw graphs, etc.
> 
> Thus, to summarize, my strategy was to do it in 3 steps:
> - step1: create aggregated data report plugin (this is the dashboard
> plugin)
> - step2: create a history plugin which allows saving the history of any
> report (including the dashboard report, but also any other report like
> the checkstyle report, etc)
> - step3: create analysis plugins that show trends, with color
> highlighting when thresholds are crossed, etc
> 
> For the history plugin, here's how I imagine implementing it:
> - history data is saved in the Maven repository under a "report" type,
> i.e. under groupId/reports/artifactId-. xml>
> - there is a goal to install/deploy reports to the repository
> - there are goals to download reports matching some criteria (groupId,
> artifactId, range of date, etc).
> - there is also a need for a special file in groupId/reports/ that
> provides information on what reports exist in the repository. Let's call
> this file reportlist.xml. This file is automatically modified by the
> history plugin when a report is uploaded/installed in the repository.
> This is required because there's no way to query the repository for the
> list of files available.
> 
> Solution 2:
> 
> I've posted an entry about this one in my blog (I'm offline and I don't
> remember the entry title but it's something like "wanted: a common API
> for storing parsing data"). The idea is to do what CAST Software is
> doing with their product: they have parsers that parse source code
> (java, c#, SQL, etc) and that store this raw data (with dates) in a
> database. Then they have analyzers that query the database and generate
> reports.
> 
> The problem in open source is that each tool has its own format and we
> don't have a common storage area and common API to store parsing data
> in, say, a database. What we could do would be to create a project that
> takes different input files (checkstyle report file, PMD report file,
> Simian report file, Clover report files, etc) and save them under a
> common format in a database. Then we could wrap this project with a
> Maven plugin so that it is integrated with Maven and knows the location
> of the different plugin reports.
> 
> Then the second step would be to develop analyzer plugins that query
> this database to generate reports. 
> 
> This strategy is somewhat similar to solution 1 but is better structured
> in the sense that there is a common pivot format and it goes through a
> database, which makes it scalable in terms of queries.
> 
> I would prefer this solution.

Me too.

> Now that we have good pure java databases, like Hypersonic SQL, I think
> it would be easy to have a "database" artifact type, so that we can save
> the database content in binary format in the repository (in order to
> share it with other users of course). That would be instead of saving
> the raw data as in solution 1. What would be saved is the data in the
> pivot format and in queryable format.

Hmm, I wonder how that would interact with my idea of creating the data
retroactively.

> Of course all this means there is "someone" (a single person) who is in
> charge of generating these reports and saving them in the repository. It
> makes sense that this "someone" is the continuous build process or the
> nightly build process.

Agreed.

> Would you be interested in working on this?

Yes. Now I just have to find the time to do it...

I've already got 2-3 other projects I've promised to commit some time
to, so I won't be able to put in real engineering time until, let's say,
early June (I hope). But if we can keep the discussion going, maybe I
can start putting in some effort then. If by the end of June we see that
the idea has grown sufficiently, we may even be able to meet for real to
discuss it further, as I should be in Paris for a week.

> Thanks
> -Vincent
> 
> PS: I'm currently offline traveling to TSSS2004. I hope I can find some
> internet connection to post this... It'll probably be at least 1 or 2
> days old when you receive it... :-)

No problem: I am currently offline most of the time, travelling to PTTSA2004 (*).
I manage to connect usuall

RE: plugin to gather statistics, make a summary and store them into a DB

2004-05-06 Thread Vincent Massol
Hi Jerome,

I'm also working on this subject so that's great someone else is also
interested! 

The goal is to be able to gather data from all existing "quality"
plugins like junit, jdepend, checkstyle, PMD, etc into one single report
and to perform analysis of data over time, trends, etc.

I've been thinking about 2 ways to achieve this:

Solution1:

This is what I have started with the dashboard plugin. As you've
probably seen it provides an aggregated static view of these quality
data. It doesn't support history for a very simple reason: I wanted to
separate the 2 features. Thus, I've started a history plugin whose goal
is to save data over time. Then, my idea was to provide other plugins
that will do the analysis of data, correlate them, draw graphs, etc.

Thus, to summarize, my strategy was to do it in 3 steps:
- step1: create aggregated data report plugin (this is the dashboard
plugin)
- step2: create a history plugin which allows saving the history of any
report (including the dashboard report, but also any other report like
the checkstyle report, etc)
- step3: create analysis plugins that show trends, with color
highlighting when thresholds are crossed, etc

For the history plugin, here's how I imagine implementing it:
- history data is saved in the Maven repository under a "report" type,
i.e. under groupId/reports/artifactId-.
- there is a goal to install/deploy reports to the repository
- there are goals to download reports matching some criteria (groupId,
artifactId, range of date, etc).
- there is also a need for a special file in groupId/reports/ that
provides information on what reports exist in the repository. Let's call
this file reportlist.xml. This file is automatically modified by the
history plugin when a report is uploaded/installed in the repository.
This is required because there's no way to query the repository for the
list of files available.
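
A minimal sketch of how such a reportlist.xml index could be maintained
and queried, written in Python for brevity. The element and attribute
names (reportlist, report, artifactId, type, date) and the two function
names are assumptions made for illustration; nothing in this thread
fixes the actual format.

```python
import xml.etree.ElementTree as ET
from datetime import date


def add_report(index_root, artifact_id, report_type, day):
    """Register one uploaded report in the (hypothetical) reportlist.xml tree."""
    ET.SubElement(index_root, "report", {
        "artifactId": artifact_id,
        "type": report_type,
        "date": day.isoformat(),
    })


def find_reports(index_root, artifact_id, start, end):
    """Return the index entries for artifact_id dated within [start, end]."""
    return [
        r for r in index_root.findall("report")
        if r.get("artifactId") == artifact_id
        and start <= date.fromisoformat(r.get("date")) <= end
    ]


# The history plugin would update the index each time a report is installed.
index = ET.Element("reportlist")
add_report(index, "myproject", "checkstyle", date(2004, 5, 1))
add_report(index, "myproject", "checkstyle", date(2004, 5, 6))
add_report(index, "other", "pmd", date(2004, 5, 6))

# A download goal could then select reports by artifactId and date range.
matches = find_reports(index, "myproject", date(2004, 5, 2), date(2004, 5, 31))
print([m.get("date") for m in matches])
```

Since the repository cannot be listed, the index is the only place a
download goal can look, so it has to be updated in the same step as the
upload.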

Solution 2:

I've posted an entry about this one in my blog (I'm offline and I don't
remember the entry title but it's something like "wanted: a common API
for storing parsing data"). The idea is to do what CAST Software is
doing with their product: they have parsers that parse source code
(java, c#, SQL, etc) and that store this raw data (with dates) in a
database. Then they have analyzers that query the database and generate
reports.

The problem in open source is that each tool has its own format and we
don't have a common storage area and common API to store parsing data
in, say, a database. What we could do would be to create a project that
takes different input files (checkstyle report file, PMD report file,
Simian report file, Clover report files, etc) and save them under a
common format in a database. Then we could wrap this project with a
Maven plugin so that it is integrated with Maven and knows the location
of the different plugin reports.

Then the second step would be to develop analyzer plugins that query
this database to generate reports. 
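
A rough sketch of these two steps, using SQLite (through Python's
built-in sqlite3 module) as a stand-in for a database such as Hypersonic
SQL. The table layout (measurement with build_date, tool, metric, value)
and the functions store and trend are hypothetical; the pivot format
itself is not defined in this thread.

```python
import sqlite3

# In-memory database standing in for the shared storage area.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE measurement ("
    " build_date TEXT, tool TEXT, metric TEXT, value REAL)"
)


def store(build_date, tool, metrics):
    """Step 1: save values parsed from one tool's report in the common format."""
    conn.executemany(
        "INSERT INTO measurement VALUES (?, ?, ?, ?)",
        [(build_date, tool, name, value) for name, value in metrics.items()],
    )


def trend(tool, metric):
    """Step 2: an analyzer queries the database for one metric over time."""
    rows = conn.execute(
        "SELECT build_date, value FROM measurement"
        " WHERE tool = ? AND metric = ? ORDER BY build_date",
        (tool, metric),
    )
    return rows.fetchall()


# Made-up numbers, as if extracted from report files by per-tool parsers.
store("2004-05-05", "checkstyle", {"violations": 120})
store("2004-05-06", "checkstyle", {"violations": 95})
store("2004-05-06", "pmd", {"violations": 14})

print(trend("checkstyle", "violations"))
```

Each tool still needs its own parser to fill the table, but every
analyzer can then be written against a single schema.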

This strategy is somewhat similar to solution 1 but is better structured
in the sense that there is a common pivot format and it goes through a
database, which makes it scalable in terms of queries.

I would prefer this solution.

Now that we have good pure java databases, like Hypersonic SQL, I think
it would be easy to have a "database" artifact type, so that we can save
the database content in binary format in the repository (in order to
share it with other users of course). That would be instead of saving
the raw data as in solution 1. What would be saved is the data in the
pivot format and in queryable format.

Of course all this means there is "someone" (a single person) who is in
charge of generating these reports and saving them in the repository. It
makes sense that this "someone" is the continuous build process or the
nightly build process.

Would you be interested in working on this?

Thanks
-Vincent

PS: I'm currently offline traveling to TSSS2004. I hope I can find some
internet connection to post this... It'll probably be at least 1 or 2
days old when you receive it... :-)

> -----Original Message-----
> From: Jerome Lacoste [mailto:[EMAIL PROTECTED]
> Sent: 04 May 2004 22:22
> To: Maven Developers List
> Subject: plugin to gather statistics, make a summary and store them
> into a DB
> 
> Hi,
> 
> I've had this idea for a while. I prefer to bring it up on the dev
> mailing list instead of issue tracker, because I work offline most of
> the time, and because I wanted to get some feedback before creating an
> issue.
> 
> My primary intent would be to gather info from the build and store them
> into a DB. The info would be generated from the different plugins. For
> example: nb of lines of code, number of bugs found by findbugs, etc...
> 
> At each build, this data is already gathered, but in different places by
> different plugins. At least in the maven version I am using (1.0rc2). I
> want to centralize and store it.
> 
> I've seen things in the dev list talking about multi-project

Re: plugin to gather statistics, make a summary and store them into a DB

2004-05-06 Thread Jerome Lacoste
On Wed, 2004-05-05 at 02:47, Nicolas De Loof wrote:
> I was thinking about something like this. 
> I was trying to build a plugin that should run as a "site" postGoal.
> It works this way:
> 
> For each "*-report.xml" file in target/generated-doc, it searches for a
> corresponding stylesheet (jsl in plugin-resources) and applies it.
> It then concatenates the results of all those stylesheets into a
> merged-repport.xml that could be formatted by xdoc and become the project
> summary on the site. It could also be saved with a timestamp or in a DB.

My only problem with that is that you probably need a different parser for each
report. Each plugin probably knows best how to extract the relevant information, so I
would rather have the extraction functionality in each plugin, or at least make it
possible to put it there.
The drawbacks with that are:
- no centralization for information extraction
- the plugin defines what is extracted. Each user may want different info.

I would really appreciate some input/thoughts from the people who are working more on 
the core, such as Jason or Vincent.

Jerome


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: plugin to gather statistics, make a summary and store them into a DB

2004-05-04 Thread Nicolas De Loof

I was thinking about something like this. 
I was trying to build a plugin that should run as a "site" postGoal.
It works this way:

For each "*-report.xml" file in target/generated-doc, it searches for a
corresponding stylesheet (jsl in plugin-resources) and applies it.
It then concatenates the results of all those stylesheets into a
merged-repport.xml that could be formatted by xdoc and become the project
summary on the site. It could also be saved with a timestamp or in a DB.

Another plugin / tool could extract the reports' evolution from the timestamped info:
the number of checkstyle rule violations per line of code, for example...
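
A small sketch of that merge step, with plain Python functions standing
in for the per-report jsl stylesheets (a deliberate simplification, not
the plugin's actual mechanism). The report contents, the
linecount-report.xml name and the merged element names are made up for
the example.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up report contents; real report files differ per tool.
reports = {
    "checkstyle-report.xml":
        "<checkstyle><file name='A.java'><error/><error/></file></checkstyle>",
    "linecount-report.xml": "<linecount lines='400'/>",
}

# One small extractor per report, standing in for the per-report stylesheet.
extractors = {
    "checkstyle-report.xml":
        lambda root: {"violations": len(root.findall(".//error"))},
    "linecount-report.xml":
        lambda root: {"lines": int(root.get("lines"))},
}


def merge(report_texts):
    """Apply the matching extractor to each report and concatenate the results."""
    merged = ET.Element("merged-report")
    for name, xml_text in sorted(report_texts.items()):
        extract = extractors.get(name)
        if extract is None:
            continue  # no extractor available for this report
        for key, value in extract(ET.fromstring(xml_text)).items():
            ET.SubElement(merged, "metric", {"name": key, "value": str(value)})
    return merged


merged = merge(reports)
values = {m.get("name"): float(m.get("value")) for m in merged.findall("metric")}

# Derived figure of the kind mentioned above: violations per line of code.
print(values["violations"] / values["lines"])
```

Saving one merged file per build with its timestamp would give a second
tool everything it needs to plot this ratio over time.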

Nico.


> 
> One way I could see this working would be by:
> 
> - having this plugin create a raw property file in the target dir, with
> a property containing the data timestamp (perhaps need another one to
> contain the date format to parse it out?)
> 
> - give every plugin the possibility to add some data to this file. My
> understanding is that the plugins are not run in parallel, so there is
> no problem writing sequentially to the property file.
> 
> - a summary page is made out of that information.
> 
> - that information is stored into a DB.
> 
> There are probably different ways to architect that plugin, and using a
> property file is perhaps not the best thing. I've just tried to find a
> simple solution that could work. My main idea was to not make this new
> plugin dependent on the others, but rather the other way around.
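
The property-file idea quoted here could look roughly like the sketch
below. The file name statistics.properties, the "plugin.key=value" line
layout and both function names are assumptions; a temporary directory
stands in for the project's target dir.

```python
import tempfile
from pathlib import Path


def append_stats(stats_file, plugin, values):
    """Append one plugin's values as 'plugin.key=value' lines."""
    with stats_file.open("a", encoding="utf-8") as out:
        for key, value in values.items():
            out.write(f"{plugin}.{key}={value}\n")


def load_stats(stats_file):
    """Read the accumulated file back into a dict for the summary or DB step."""
    stats = {}
    for line in stats_file.read_text(encoding="utf-8").splitlines():
        if line and not line.startswith("#"):
            key, _, value = line.partition("=")
            stats[key] = value
    return stats


target = Path(tempfile.mkdtemp())  # stand-in for the build's target dir
stats_file = target / "statistics.properties"

# Plugins run one after another, so plain sequential appends are enough.
append_stats(stats_file, "build", {"timestamp": "2004-05-04T22:22:00"})
append_stats(stats_file, "findbugs", {"bugs": 7})
append_stats(stats_file, "loc", {"lines": 12500})

stats = load_stats(stats_file)
print(stats)
```

The statistics plugin only defines the file; each reporting plugin
decides what to write into it, which keeps the dependency pointing from
the other plugins to this one.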
> 
> From the DB, graphs can be generated to check the evolution of the
> project. (One cool graph could be the evolution of the code size
> against the effort put in, such as described in Mythical Man-Month, but
> that would require having the number of hours spent by the different
> developers. This diagram is probably not very appropriate for open
> source projects, but for more closed projects it would fit well.)
> 
> One thing to keep in mind, is that it would be nice to be able to script
> maven to retroactively build statistics, e.g. for the first time, or
> when a new type of data is to be added to the stored information. That
> means that the task requires a timestamp argument, same timestamp used
> to retrieve the code from the source repository. 
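
As a sketch of what scripting such a retroactive run might involve: the
loop below walks a range of timestamps and hands each one to a build
step. The build step is a placeholder (fake_build) with invented
numbers; in practice it would check out the sources as of that timestamp
and run the statistics task with the same timestamp. All names here are
hypothetical.

```python
from datetime import datetime, timedelta


def timestamps(start, end, step):
    """Yield start, start + step, ... up to and including end."""
    current = start
    while current <= end:
        yield current
        current += step


def backfill(start, end, step, build_at):
    """Call build_at(ts) for each timestamp and collect results by timestamp."""
    return {ts.isoformat(): build_at(ts) for ts in timestamps(start, end, step)}


def fake_build(ts):
    # Placeholder for "check out the code as of ts, build, gather statistics".
    return {"lines": 1000 + 50 * ts.day}


history = backfill(
    datetime(2004, 4, 1), datetime(2004, 4, 22), timedelta(days=7), fake_build
)
for ts, stats in history.items():
    print(ts, stats)
```

Passing the timestamp explicitly, rather than reading the clock, is what
lets the same task serve both the nightly build and a later backfill.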
> 
> 
> So now that I've described the idea, I would like to know:
> 
> - is something like that already feasible? 
> 
> - if not would it make more sense for m2?
> 
> - comments? Especially whether or not I should open an issue.
> 
> Cheers,
> 
> Jerome
> 
> 
> 


