Hi,

Replying to myself, but it's only when you start talking to yourself that it's 
an issue, right? RIGHT?

--- On Tue, 22/9/09, Tony <td_mi...@yahoo.com> wrote:
>
--snip--
> 
> I've upgraded to the latest 12.0rc2 version and the results
> are a lot better. An example of the data is:
> 
--snip--
> 
> Now that adjb seems to be doing what it is supposed to do I
> will accumulate a few more days/weeks of data and compare
> the values from pmacct (with adjb) to those being recorded
> directly by the packeteer. Hopefully they will be a lot
> closer now.
> 

The stats below are from a SINGLE day's (23/09/2009) worth of data. I have some 
small concerns about the validity of the data set for comparison, given the way 
data is extracted from the packeteer: I'm not sure whether the daily report 
extracts data from 2300-2300 or 0000-0000. Regardless, the difference in the 
volume of data between 2300-0000 on different days shouldn't be that great 
anyway.

Here is the data:

adjb            pmacct          packeteer       (pack-adjb)     %
11037185152     10733168136     12957484242     1920299090      14.820%
4216446261      4112843092      4062920012      -153526249      -3.779%
5176360717      4945117219      5133601176      -42759541       -0.833%
1347873812      1318879176      1362592012      14718200        1.080%
955390004       923140839       952564475       -2825529        -0.297%
871276688       852006937       892911008       21634320        2.423%
703135346       673351910       695471238       -7664108        -1.102%
449624941       455719218       453788344       4163403         0.917%
339088025       324566192       338516514       -571511         -0.169%
148191479       144684695       149437506       1246027         0.834%
52648230        38364870        40825032        -11823198       -28.961%

adjb = Data from pmacct with adjb=26 applied
pmacct = Direct pmacct data (no adjust)
packeteer = Data exported from the packeteer
(pack-adjb) = packeteer column minus adjb column (3rd minus 1st)
% = (pack-adjb) column as a percent of packeteer column
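
For anyone who wants to re-check the last two columns, here is a minimal Python 
sketch of the arithmetic described in the legend. The byte counts are copied 
from the adjb and packeteer columns of the table; the names ROWS and 
diff_and_percent are just illustrative, nothing here comes from pmacct itself.

```python
# (adjb, packeteer) byte counts copied from the table in this mail.
ROWS = [
    (11037185152, 12957484242),
    (4216446261, 4062920012),
    (5176360717, 5133601176),
    (1347873812, 1362592012),
    (955390004, 952564475),
    (871276688, 892911008),
    (703135346, 695471238),
    (449624941, 453788344),
    (339088025, 338516514),
    (148191479, 149437506),
    (52648230, 40825032),
]


def diff_and_percent(adjb, packeteer):
    """Return (packeteer - adjb, that difference as a percent of packeteer)."""
    diff = packeteer - adjb
    return diff, 100.0 * diff / packeteer


for adjb, packeteer in ROWS:
    diff, pct = diff_and_percent(adjb, packeteer)
    print(f"{adjb:>12} {packeteer:>12} {diff:>12} {pct:8.3f}%")
```

A positive percentage means the packeteer counted more bytes than pmacct with 
adjb applied; a negative one means it counted fewer.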


If you were to score it like they do at the Olympics, discarding the highest & 
lowest and averaging the rest, it would come out at a very respectable -0.103%, 
which in anyone's language is near enough not to worry about. The concern 
I have is with the ones that are wildly different (roughly +14.8% & -29.0%) and 
the fact that they are in opposite directions. The -3.8% is a bit far off too, 
but that could just be due to the smallish sample size and might get better 
over a few days. The -29.0% could be the same, it's not a very large sample. 
The +14.8%, however, is on more than 10GB of data, which should be a big enough 
sample for the errors to average out, given that most of the smaller ones seem 
to.
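
The Olympic-style average is easy to reproduce. Below is a small Python sketch 
of it, using the percentages from the % column of the table (with the eighth 
row read as 0.917%, which is what the raw byte counts give). The function name 
olympic_mean is my own label for "drop one highest, drop one lowest, average 
the rest".

```python
# Percent differences from the % column of the table in this mail.
PERCENTS = [14.820, -3.779, -0.833, 1.080, -0.297, 2.423,
            -1.102, 0.917, -0.169, 0.834, -28.961]


def olympic_mean(values):
    """Average after discarding the single highest and single lowest value."""
    if len(values) < 3:
        raise ValueError("need at least three values to trim both ends")
    trimmed = sorted(values)[1:-1]
    return sum(trimmed) / len(trimmed)


print(f"{olympic_mean(PERCENTS):.3f}%")  # prints -0.103%
```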

I have some issues with the quality of the data extracted from the packeteer 
and I'm going to see if I can extract it in a better manner. At the moment it 
is grouped into subnets that are allocated to users, and it is on a daily basis. 
This means that I'm creating a spreadsheet with the info from MySQL for pmacct 
and then manually copying stuff from the packeteer output, with a lot of 
cross-referencing to match "names" to IP addresses. The above table took about 
3 hours' worth of time to create, which isn't conducive to continual testing as 
I make changes.
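
One way the cross-referencing could be automated is sketched below in Python. 
It is only an illustration under assumptions: the user names, subnets and 
per-host byte counts are made up, and I am assuming the pmacct data can be 
pulled out of MySQL as (IP address, bytes) pairs. It sums the bytes per 
user subnet so the totals can be lined up against the packeteer's per-user 
figures.

```python
import ipaddress
from collections import defaultdict

# Hypothetical mapping of packeteer "names" to the subnets allocated to them.
USER_SUBNETS = {
    "customer-a": ipaddress.ip_network("192.0.2.0/28"),
    "customer-b": ipaddress.ip_network("192.0.2.16/28"),
}


def bytes_per_user(host_rows, user_subnets):
    """Sum byte counts per user, matching each host IP to its user's subnet."""
    totals = defaultdict(int)
    for ip_text, byte_count in host_rows:
        ip = ipaddress.ip_address(ip_text)
        for name, net in user_subnets.items():
            if ip in net:
                totals[name] += byte_count
                break
        else:
            # No subnet matched: keep the bytes visible rather than drop them.
            totals["unmatched"] += byte_count
    return dict(totals)


# Made-up (ip, bytes) rows standing in for a per-host export of pmacct data.
rows = [("192.0.2.1", 1000), ("192.0.2.17", 500),
        ("192.0.2.2", 250), ("198.51.100.9", 10)]
print(bytes_per_user(rows, USER_SUBNETS))
```

Keeping an "unmatched" bucket makes it obvious when traffic falls outside every 
known subnet, which would otherwise quietly skew the comparison.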

I'm hopefully going to revisit this early next week and try and get some better 
information.


regards,
Tony.


      

_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists
