Thanks, Cedric: that was excellent (and fast) detective work!

So, the net of it is that the dailyprojectdata reporting simply changed from non-comment source lines of code to total lines of code, which apparently include comments and blank lines. And I'm glad to hear that you compared the two analyses to each other: that increases the probability that things are OK.

As far as I'm concerned, we can leave it at total lines of code. All of our prior research indicates that, at least for Hackystat, it doesn't matter whether you count classes, methods, NCLOC, or simple LOC; they all co-vary with each other very precisely. And anyway, that makes the dailyprojectdata analysis easier because the totalLines field is guaranteed to be present in all FileMetrics data.

With respect to the drill down, currently there is a single column for total lines, which is good. It would be quite helpful to add columns that break out the subtotals for all the file types found in the data for that day. So, for our current data, there would be one additional column ("java"), and that number would currently be the same as the Total column. (I think, for consistency with the other drilldowns, the "Total" column should come last.) Once SCLC comes online, there will be additional columns for the other file types we're counting.
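The column layout described above can be sketched roughly as follows. This is only an illustration in Python, not the Hackystat implementation: the `drilldown` function, the record tuples, and the workspace names are all hypothetical, and the real analysis would read totalLines from the FileMetrics data.

```python
from collections import defaultdict


def drilldown(records):
    """Sum total lines per workspace and file type.

    records: iterable of (workspace, file_type, total_lines) tuples
             (a hypothetical shape, not the real FileMetrics format).
    Returns (columns, rows): columns lists every file type found in the
    data, followed by "Total" as the last column; rows maps each
    workspace to its counts in the same column order.
    """
    sums = defaultdict(lambda: defaultdict(int))
    file_types = set()
    for workspace, file_type, total_lines in records:
        sums[workspace][file_type] += total_lines
        file_types.add(file_type)

    types = sorted(file_types)
    columns = types + ["Total"]  # "Total" comes last, as in the other drilldowns
    rows = {}
    for workspace in sorted(sums):
        counts = [sums[workspace].get(t, 0) for t in types]
        rows[workspace] = counts + [sum(counts)]
    return columns, rows


# Made-up records: with only Java data, "java" equals "Total" in every row.
records = [
    ("moduleA", "java", 1200),
    ("moduleA", "java", 800),
    ("moduleB", "java", 500),
]
columns, rows = drilldown(records)
print(columns)
for workspace, counts in rows.items():
    print(workspace, counts)
```

Because the columns are derived from the file types actually present in the day's data, new columns would appear on their own once data for other file types arrives.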

Cheers,
Philip


--On Saturday, February 25, 2006 5:15 PM -1000 "(Cedric) Qin ZHANG" <[EMAIL PROTECTED]> wrote:

The generic file metric analysis is correct. I checked its output against
the old JavaFileMetric analysis, and the results differ by less than 1%.
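A check of this kind can be sketched as below. This is a minimal illustration in Python; the `relative_difference` helper and the two counts are made up for the example and are not the actual output of either analysis.

```python
def relative_difference(old, new):
    """Return |new - old| / old, the relative change between two counts."""
    if old == 0:
        raise ValueError("old count must be non-zero")
    return abs(new - old) / old


# Illustrative numbers only, not the real analysis results.
old_count = 125000  # hypothetical count from the old analysis
new_count = 125900  # hypothetical count from the new analysis

diff = relative_difference(old_count, new_count)
print(f"relative difference: {diff:.2%}")
print("within 1%:", diff < 0.01)
```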

Some recent developments:
(1) The old JavaFileMetric-related code has been deleted, since you don't
like deprecated code.
(2) The file metric data on the public server is still sent by LOCC
(i.e. there are only Java metrics).

If you run the following analysis:
  streams totalLines(fileType) = {"desc", "count", FileMetric(fileType, "totalLines", "**")};
  streams sourceLines(fileType) = {"desc", "count", FileMetric(fileType, "sourceLines", "**")};
  chart my(fileType) = {"size", totalLines(fileType), sourceLines(fileType)};

You can use either draw("*") or draw("Java") to get the chart. Since
there are only Java metrics on the server side, the two will give you the
same result. (See the attachment for the chart.)

The resulting chart has two lines:
(1) total lines: around 200K; this is what you see in the report. Note
that total lines includes comments and blank lines (ask Mike to confirm).
(2) source lines: around 125K for Java code; this is what you used to see.

As far as the drill down is concerned, we currently have a single column
(total lines). What other columns do you want to add?

Cheers,

Cedric




Philip Johnson wrote:
I've discovered an anomaly with Cedric's recent changes to FileMetrics
that appears to warrant some investigation.

I just received today's daily project summary, and while idly glancing
at it, I noticed something exciting:

The following alert(s) were triggered today:
* Project Daily Summaries
Hackystat-7 on 24-Feb-2006
  Dependency: 361 (inbound), 211 (outbound)
  Issues: 48/2/15/99/165 (o/i/r/c/t), 19/0/5/24/49 (yours)
  Unit Tests: 3990/50/4 (total p/f/e), 0/0/0 (yours)
  Churn: 3437/574 (total lines added/deleted), 667/173 (yours)
  Commits: 91 (total files committed), 8 (yours)
  CodeIssue: 6569 (Total Code Issues)
  Performance: 26 (total tests), 4 (failures)
  Build: 73 (total), 65 (successful)
  Generic File Metrics: 195912 (Total LOC)
  Active Time: 10 (total), 2.83 (yours)

No, it's not that 10 hours of Active Time were spent on Hackystat
yesterday (although that is, indeed, exciting).  It's the line before
that:

  Generic File Metrics: 195912 (Total LOC)

"Aha!", I said to myself (quietly, though, because it's 6:00am).
Austen has integrated SCLC into the daily build, and now we are
reporting on more than Java LOC!  I jumped to this conclusion because
I've been watching the Java SLOC recently, and it's been around 125
KLOC, so the jump to 195 KLOC seemed to indicate the move to
multi-language size counting.

I eagerly clicked on the link below this report, expecting to see in
the detailed report a breakout of the number of LOC associated with
each filetype found in each top-level workspace.  After waiting quite
a while (a couple of minutes), the report appeared, and to my surprise
the drilldown for Generic File Metrics just showed a count of Total
LOC per top-level module with no indication of the file type.

With growing confusion, I logged into the hackystat-l account and
looked at the FileMetric data for 24-Feb-2006.  There I found that the
data was all generated by LOCC as it has been forever.  That's not good.

I did one more analysis: I ran the Daily Project Details report for
20-Feb-2006---several days back when I know for a fact that the old
JavaFileMetric analysis was running and reporting the system size as
approximately 125 KLOC.  The new Generic File Metrics reports the
system size as being around 200K for that day.

Based on this data, I am led to believe that either (a)
JavaFileMetrics was broken and was undercounting Java size as reported
by LOCC, or (b) the new Generic File Metrics analysis is broken and is
overcounting the amount of Java size as reported by LOCC.

Cedric, could you investigate and let us know which of these two
possibilities is true?

Also, could you improve the drilldown for the Generic Size Analysis to
report, for each top-level workspace, the total LOC (as it does now)
as well as a breakdown by all file types found in the FileMetric data
for that particular day?  (The one line summary for Generic File
Metrics shown at the top of this email could also be improved to
report the overall total and the overall breakdown by file type.
Actually, there's no need to call it "Generic File Metrics"; "File
Metrics" will do just fine.)

Cheers,
Philip
