A few questions for clarification:

1.  In this venue, nearly everyone will immediately equate "correlation
value" to "value of a Pearson correlation coefficient".  Is that in fact
what you have in mind?  I think it cannot be of the kind we ordinarily
think of, because you're "us[ing] a correlation threshold to determine
which servers and clients talk to each other well and which don't",
which implies that you have different correlation values for different
client-server dyads;  in most of our experience, a correlation
coefficient is computed for the entire sample in hand, or perhaps for
previously-identified subsamples (e.g., males and females) of the
sample.  I infer either that you have something quite different in mind,
or that you're calculating correlations row-wise rather than column-wise.
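
To make the row-wise/column-wise distinction concrete, here is a minimal
sketch in Python.  Everything in it is an assumption of mine for the
sake of illustration:  the two measurements ("sent" and "received"), the
dyad names, and the numbers are all invented, and I am guessing that
each dyad has a short series of paired observations.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: one row per client-server dyad, one entry per time slot.
# "sent" and "received" are invented measurements, not the poster's variables.
sent = {
    "dyad_A": [10, 20, 30, 40],
    "dyad_B": [10, 20, 30, 40],
}
received = {
    "dyad_A": [11, 19, 31, 41],
    "dyad_B": [40, 10, 35, 15],
}

# Row-wise: one coefficient per dyad, computed across that dyad's time slots.
per_dyad = {d: pearson(sent[d], received[d]) for d in sent}

# Column-wise (the usual sense): one coefficient for the whole sample,
# pooling every observation from every dyad.
all_sent = [v for d in sent for v in sent[d]]
all_received = [v for d in sent for v in received[d]]
pooled = pearson(all_sent, all_received)

print(per_dyad)
print(pooled)
```

Only the row-wise version yields a separate value for each dyad that
could then be compared with a threshold;  the pooled version yields a
single number for the whole sample.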

Then:  if you are using "correlation" in the sense most of us would
expect, what variables are being correlated to obtain the value that you
use to compare to the "threshold" value?  (You mention four possible
variables;  they would define six possible different zero-order
correlations;  which do you use for your criterion?)  Might be necessary
to see more concrete detail about what you're doing, in order to hope
for sensible advice...
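
The count of six can be checked by enumerating the pairs.  A minimal
sketch in Python;  the names v1 through v4 are placeholders of mine,
since the four variables were not identified:

```python
from itertools import combinations

# Placeholder names; the poster's four variables were not specified.
variables = ["v1", "v2", "v3", "v4"]

# Every unordered pair of distinct variables defines one zero-order correlation.
pairs = list(combinations(variables, 2))

for a, b in pairs:
    print(f"r({a}, {b})")

print(len(pairs))
```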

2.  You're "measuring the data flow":  are your four variables different
forms of this measurement?  And I'm curious as to what you mean by "data
flow".  In a chemical plant, "flow" would refer to the flow of fluids,
and would be measured in <volume of fluid transferred per unit time> or
<mass of fluid transferred per unit time> or <velocity of the fluid>,
etc.  (And it may be worth observing that water flowing at a gallon per
minute through a pipe a foot in diameter is moving rather languidly;
water flowing at a gallon per minute through a tube of capillary
dimensions is a lethal weapon).  What would be the corresponding
dimensions of "data flow"?  (Bytes per second?  Number of client/server
exchanges per minute?  Etc.)
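
To show why the choice of dimensions matters, here is a minimal sketch
in Python in which one and the same record of traffic produces three
different "data flow" figures.  The log and its numbers are invented for
illustration;  I do not know what your measurements actually look like.

```python
# Hypothetical log for one dyad: (seconds since start, bytes in that exchange).
exchanges = [(0, 1500), (12, 400), (30, 64000), (45, 200), (60, 900)]

duration_s = exchanges[-1][0] - exchanges[0][0]    # elapsed time in seconds
total_bytes = sum(size for _, size in exchanges)   # all bytes transferred

bytes_per_second = total_bytes / duration_s
exchanges_per_minute = len(exchanges) / (duration_s / 60)
mean_bytes_per_exchange = total_bytes / len(exchanges)

print(bytes_per_second, exchanges_per_minute, mean_bytes_per_exchange)
```

A dyad dominated by one large transfer looks busy in bytes per second
and quiet in exchanges per minute, so the definition needs to be settled
before any correlation among such measures can be interpreted.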

3.  You write of "how well a server and a client talk to each other".
What does "well" or "badly" mean, or how is this defined?  You "use a
correlation threshold":  how was that threshold value determined in the
first place?  (And if not by your research team, how do you know it's
germane to your project?)

4.  That you "only want to see servers and clients that talk to each
other well" implies that you have no interest in diagnosing cases in
which they don't, and no interest in trying to remedy the, shall I say,
relationship, when it's not up to snuff.  Really?

As you can tell, I have very little notion of what your project actually
entails.  'Twouldn't surprise me much if others were in the same boat;
and collectively we have very little likelihood of being able to be
helpful to you.  Hard to guess a correlation value if one doesn't really
know what sort of beast it is.  It's not difficult to imagine lots of
ways of arriving at a number that might reasonably be called a
correlation (and even a Pearson correlation, come to that);  it's harder
to figure out which of these multitudinous figments of the imagination
might correspond to what you have in mind.

On Fri, 2 Apr 2004, meredith wrote:

> Summary of research project: The basic idea is that on a computer
> network you have machines called servers and machines called clients
> that talk to each other.  The data flow is of interest to the research
> team I'm working with.  We're measuring the data flow and calculating
> a correlation value as a standard measure of "how well" a server and
> client talk to each other.  Naturally we only want to see servers and
> clients that talk to each other well, so we use a correlation
> threshold to determine which servers and clients talk to each other
> well and which don't.
>
> Currently our calculations are done after the fact, it's more a
> history report than a real time diagnostic tool.  I'm looking for a
> way to guess the correlation value (within 90% - 99%) based on the
> number of machines that passed the correlation threshold.  I have four
> possible variables to work with and I'm currently testing for
> interactions between the variables.
>
> Any tips or guidance is appreciated.

 ------------------------------------------------------------
 Donald F. Burrill                              [EMAIL PROTECTED]
 56 Sebbins Pond Drive, Bedford, NH 03110      (603) 626-0816
=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
.                  http://jse.stat.ncsu.edu/                    .
=================================================================
