May I stick in $0.02 worth? (And to all those who figure that is what it is worth, I say, "So be it!" :)

You are measuring the 'cumulative byte count' as an indicator of 'load on the machine.' So far, so good. Can I surmise that at some point you (or a manager) are going to stand up at a meeting and say, "the current load is xxx, we expect it to increase to yyy in 6 months. Therefore, we recommend...."

My point is, you are measuring load now, with an eye to _predicting_ load at some future point. You may include specific conditions in the network that you find influence the load, but my point is the same.

Your measurements taken today are _estimates_ of the load under a specific set of conditions. As soon as you predict the load at a different time (i.e., an as yet unmeasured point) then your 'specific set of conditions' must be _defined_ to include the conditions you used now, and those you will use in future. A de facto definition perhaps, but a definition, nonetheless.

What is the mean load under those conditions? Your measurements at any moment will be near, but slightly different from, that mean. They will be different because of 'minor' variations in the conditions, minor variations in the source of the load, etc., not because the measurement is imprecise, as you point out.

I don't feel your proposed method of obtaining an 'error' estimate will get you home. More likely, you would do well to measure the load for multiple periods in a short time, and work out the estimated standard deviation from that. This could serve as your 'process capability,' as it were: an indication of how madly the load fluctuates over a short period of the day.
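
As a rough illustration of that short-term estimate, here is a minimal Python sketch. It assumes the cumulative byte count is read at equal intervals within one short window; the function name, the 10 s interval, and the byte counts are all made up for the example and are not taken from the original data.

```python
import statistics

def short_term_load_stats(cumulative_bytes, interval_s):
    """Mean and sample standard deviation of load (bytes/s) over one short window.

    Assumes cumulative_bytes holds readings taken every interval_s seconds.
    """
    # Load in each interval is the increase in the cumulative count per second.
    rates = [(b - a) / interval_s
             for a, b in zip(cumulative_bytes, cumulative_bytes[1:])]
    return statistics.mean(rates), statistics.stdev(rates)

# Hypothetical cumulative counts (bytes), read every 10 s.
counts = [0, 100_000, 210_000, 300_000, 405_000, 500_000]
mean_rate, sd_rate = short_term_load_stats(counts, 10)
print(f"mean load {mean_rate:.0f} B/s, short-term sd {sd_rate:.1f} B/s")
```

The standard deviation here describes how much the load itself moves about within the window, not how precisely any one reading was taken.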

I dare say that your network load is, on average, different at 9:30 am local time than at 12:30 pm, or at 11:00 pm or 3 am. It will possibly be different on Wednesday than on Sunday.

If you obtain measurements of load over a long period of time, say a week or two, then the standard deviation would be the equivalent of 'product variability.' This will clearly be larger than the 'process capability' or short term measure of variation. It would indicate the variation in load that a user could expect, whenever they try to do work on the network. Clearly, your selection of times to measure load will influence the significance of the stdev to the user. A system that reports an uptime of 98.5% does not tell a user much of interest, when the user is only involved for 8 hours a day and the machine idles for 16 hours.
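
The contrast between the two measures of variation can be sketched as below. This is only an illustration under assumptions: the readings, the time-of-day labels, and the function name are hypothetical, and the short-term figure is taken here as a pooled within-window standard deviation, which is one common choice rather than the only one.

```python
import math
import statistics

def within_and_overall_sd(windows):
    """windows maps a label (e.g. time of day) to load readings from one short window."""
    # Short-term variation: pool the within-window variances,
    # weighting each by its degrees of freedom.
    numerator = sum((len(v) - 1) * statistics.variance(v) for v in windows.values())
    dof = sum(len(v) - 1 for v in windows.values())
    within_sd = math.sqrt(numerator / dof)
    # Long-term variation: standard deviation of all readings taken together.
    all_readings = [x for v in windows.values() for x in v]
    overall_sd = statistics.stdev(all_readings)
    return within_sd, overall_sd

# Hypothetical load readings (arbitrary units) in three short windows.
readings = {
    "03:00": [1.0, 1.2, 0.8],
    "09:30": [9.0, 10.0, 11.0],
    "12:30": [6.0, 5.5, 6.5],
}
within_sd, overall_sd = within_and_overall_sd(readings)
print(f"short-term sd {within_sd:.2f}, long-term sd {overall_sd:.2f}")
```

With made-up numbers like these, the long-term figure is much larger because it also absorbs the shift in average load between times of day.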

Help any?
Jay

Don wrote:

Greetings,

I'm involved in a research project which measures the load on a
(computer) network. The response variable is the cumulative byte count,
which is measured at various times (which are determined by an
adaptive sampling technique).

The measurements taken at these times are assumed to be accurate, so I
am using the following technique to judge the accuracy of the
sampling:

Assuming we measure the cumulative byte count after 10s and 20s, and
record 100kb, and 200kb respectively....

1. Linearly interpolate between these 2 points to get

11s - 110kb
12s - 120kb
...

2. Calculate the difference between these interpolated values and the
actual values at 11s,12s,...

3. Use RMSE, SSE, or similar to get an overall measure of error
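
Steps 1 to 3 could be sketched in Python roughly as follows. This is a sketch under assumptions, not the code used in the project: the function name is hypothetical, the "actual" values between 10s and 20s are invented, and only the two sampled end points (100kb and 200kb) come from the example above.

```python
import math

def interpolation_rmse(times, counts, sampled_idx):
    """RMSE between actual counts and a linear interpolation through the sampled points.

    times, counts: the full off-line data set.
    sampled_idx:   increasing indices of the points the sampler actually took.
    """
    squared_errors = []
    for lo, hi in zip(sampled_idx, sampled_idx[1:]):
        t0, t1 = times[lo], times[hi]
        y0, y1 = counts[lo], counts[hi]
        # Compare every unsampled point between two neighbouring samples.
        for i in range(lo + 1, hi):
            estimate = y0 + (y1 - y0) * (times[i] - t0) / (t1 - t0)
            squared_errors.append((estimate - counts[i]) ** 2)
    if not squared_errors:
        raise ValueError("no unsampled points between the samples")
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical off-line data: one reading per second from 10s to 20s, in kb.
times = list(range(10, 21))
counts = [100, 112, 118, 130, 141, 150, 158, 171, 180, 189, 200]
rmse = interpolation_rmse(times, counts, sampled_idx=[0, 10])
print(f"RMSE = {rmse:.3f} kb")
```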


The obvious question is "How do you know the actual values at 11s, 12s, ...?" The answer is that I am using an off-line data set, rather than doing the experiment in real-time to test the sampling algorithm.

Anyway, my question is: how valid is this method of assessing the
accuracy of the sampling technique given that there is no estimate of
"pure error" at the sample points?

Thanks in Advance,
Dónal
.
.
=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
.                  http://jse.stat.ncsu.edu/                    .
=================================================================





--
Jay Warner
Principal Scientist
Warner Consulting, Inc.
4444 North Green Bay Road
Racine, WI 53404-1216
USA

Ph:     (262) 634-9100
FAX:    (262) 681-1133
email:  [EMAIL PROTECTED]
web:    http://www.a2q.com

The A2Q Method (tm) -- What do you want to improve today?



