Re: a form of censoring I have not met before

2001-06-25 Thread Donald Burrill

Hi, Margaret.  I've given some thought to your problem;  here's a 
restatement of it, and a few thoughts.

Recapitulating, in case I've misunderstood a small point or three, 
you have "a 3-factor experiment", by which I assume you mean a 
complete, balanced, crossed design:  R(AxBxC)  for observations (R) 
within factors A, B, C, which in turn are crossed with each other. 
R is a random effect;  A,B,C are fixed effects.
You have measurements (Y, response variable) taken at 97 equally spaced 
time points, 2 hours apart (T, whose values range from 0 to 192). 
It is not clear whether T is one of the three factors.
Y is known or assumed to be a monotonic increasing function of T.
There exists a value Ymax such that when Y >= Ymax for some value T, 
say T = Tmax, no observations are made for T > Tmax.

I had a similar situation once, years ago.
Investigating the error between a reported covariance and the known 
true value of the covariance, in an experiment where two factors were 
manipulated (A:  the number of digits carried in arithmetic computations; 
B:  the number of digits required to represent the data), I made the 
logical error of using data for which the true covariance was exactly 
expressible as an integer.  Consequently I couldn't tell, for some cells 
of the design (where A was large and B was small), whether the 
disagreement between computed and theoretical values was actually zero, 
or merely smaller than was detectable because the theoretical value was 
an integer.  In the event, I was fortunate that discarding the rows or 
columns in which the observed error was zero left me with the bulk of 
the design intact.

In your case, I gather (between the lines) that discarding cells that 
have missing data does rather more damage to the design, and is 
therefore not an option.  

First question:  If Y = f(T),  f a monotone increasing function,  
as you assert, is the form of  f  known?  
If not, may we assume it's of the same form for all cells of the design? 
(If  f  is linear, the problem reduces to an analysis of covariance 
of some sort, possibly with some missing cells, and can be addressed 
in a straightforward way by multiple regression analysis.
If for some transformation of  Y  and/or  T  the corresponding  f  
is linear, the same is true for the transformed variable(s).)  

(In my problem, when the precision of the reported covariance was 
expressed as the negative logarithm of the absolute relative error, 
the relationship between the transformed Y, A, and B turned out to be 
markedly linear (even with unit coefficients, when all three variables 
were expressed to the same base).  Took a while to discover that, 
though.) 

In each cell of your design, you have a number of values (<=97) of (Y,T). 
You can therefore estimate the parameters of the function  f  in each 
cell for which  n  is large enough;  this may not even entail discarding 
any cells for insufficient data, although if you have cells for which 
 Tmax << 192  you may not be able to get a _good_ estimate of some of the 
parameters of  f  in those cells.
(Although since  f  is monotone, it may be possible to find a 
transformation for which  f  is linear, which would vastly simplify the 
estimating process.)
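
A minimal sketch of the per-cell estimation step, assuming  f  is linear 
(or has already been transformed to be).  Nothing here comes from the 
actual experiment:  the cutoff y_max, the trace values, and the function 
names fit_line and observed_part are all made up for illustration.  It 
also assumes the reading that first reaches Ymax is itself recorded, as 
in the restatement above;  if only the exceedance is known, drop that 
last point before fitting.

```python
def fit_line(t, y):
    """Ordinary least squares for y = a + b*t; returns (a, b)."""
    n = len(t)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    b = sxy / sxx
    return y_bar - b * t_bar, b


def observed_part(times, trace, y_max):
    """Keep readings up to and including the first one with Y >= y_max."""
    t_obs, y_obs = [], []
    for ti, yi in zip(times, trace):
        t_obs.append(ti)
        y_obs.append(yi)
        if yi >= y_max:
            break  # the unit is not observed after this reading
    return t_obs, y_obs


times = list(range(0, 193, 2))           # 97 readings at 0, 2, ..., 192 hours
y_max = 100.0                            # hypothetical cutoff (assumption)
trace = [5.0 + 1.5 * t for t in times]   # made-up noiseless linear trace

t_obs, y_obs = observed_part(times, trace, y_max)
a_hat, b_hat = fit_line(t_obs, y_obs)
print(len(t_obs), a_hat, b_hat)
```

The point of the sketch is only that a censored trace still yields 
parameter estimates from the readings it does have;  how good they are 
depends on how many readings survive.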

At this point I don't know whether the cells of your design are for a 
2-way or a 3-way design (if T is one of the three factors, it's 2-way);  
but you can then carry out the obvious ANOVA (possibly unbalanced) on 
the parameters of  f  to see how they vary across values of your 
factors.  You can also discover how much it costs you to assume that one 
or more parameters remain constant across the design (like the slope of a 
regression line in analysis of covariance, under homogeneity of 
regression). 
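
The cost of assuming a common slope can be sketched as the usual 
extra-sum-of-squares F statistic, comparing a separate line in each cell 
with a model that has separate intercepts but one pooled slope.  This is 
an illustration under assumptions, not an analysis of the real data:  
linear  f,  made-up cells of unequal length (as censoring would produce), 
a deterministic "wobble" standing in for noise, and hypothetical function 
names throughout.

```python
def centered_sums(t, y):
    """Return (Sxx, Sxy, Syy) about the cell means."""
    n = len(t)
    t_bar, y_bar = sum(t) / n, sum(y) / n
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxx, sxy, syy


def slope_homogeneity_f(cells):
    """F statistic for H0: all cells share one slope (intercepts free).

    cells is a list of (t, y) pairs, one per cell of the design.
    Returns (F, numerator df, denominator df).
    """
    sums = [centered_sums(t, y) for t, y in cells]
    g = len(cells)
    n_total = sum(len(t) for t, _ in cells)
    df_num, df_den = g - 1, n_total - 2 * g
    if df_num < 1 or df_den < 1:
        raise ValueError("not enough cells or observations")
    # Full model: separate intercept and slope in every cell.
    sse_full = sum(syy - sxy ** 2 / sxx for sxx, sxy, syy in sums)
    # Reduced model: separate intercepts, one pooled slope.
    pooled_sxx = sum(s[0] for s in sums)
    pooled_sxy = sum(s[1] for s in sums)
    pooled_syy = sum(s[2] for s in sums)
    sse_reduced = pooled_syy - pooled_sxy ** 2 / pooled_sxx
    f_stat = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return f_stat, df_num, df_den


def wobble(i):
    """Deterministic stand-in for measurement noise (assumption)."""
    return (((i * 7) % 5) - 2) * 0.5


def make_cell(intercept, slope, n):
    """Made-up cell with n readings taken every 2 hours."""
    t = [2 * i for i in range(n)]
    y = [intercept + slope * ti + wobble(i) for i, ti in enumerate(t)]
    return t, y


same_slopes = [make_cell(0.0, 1.0, 40), make_cell(5.0, 1.0, 25)]
diff_slopes = [make_cell(0.0, 1.0, 40), make_cell(5.0, 2.0, 25)]
f_same, df1, df2 = slope_homogeneity_f(same_slopes)
f_diff, _, _ = slope_homogeneity_f(diff_slopes)
print(f_same, f_diff, df1, df2)
```

A large F relative to the F(df_num, df_den) reference distribution says 
the common-slope assumption is expensive;  a small one says it costs 
little.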

Hope this has been helpful.
-- Don.
 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: a form of censoring I have not met before

2001-06-21 Thread Rich Ulrich

On 21 Jun 2001 00:35:11 -0700, [EMAIL PROTECTED] (Margaret
Mackisack) wrote:

> I was wondering if anyone could direct me to a reference about the 
> following situation. In a 3-factor experiment, measurements of a continuous 
> variable, which is increasing monotonically over time, are made every 2 
> hours from 0 to 192 hours on the experimental units (this is an engineering 
> experiment). If the response exceeds a set maximum level the unit is not 
> observed any more (so we only know that the response is > that level). If 
> the measuring equipment could do so it would be preferred to observe all 
> units for the full 192 hours. The time to censoring is of no interest as 
> such, the aim is to estimate the form of the response for each unit which 
> is the trace of some curve that we observe every 2 hours. Ignoring the 
> censored traces in the time period after they are censored puts a huge 

Well, it certainly *sounds*  as if the "time to censoring" should be 
of great interest, if you had an adequate model.

Thus, when you say that "ignoring" them gives  "a huge 
downward bias",  it sounds to me as if you are admitting that 
you do not have an acceptable model.

Who can you blame for that?  What leverage do you 
have, if you try to toss out those bad results?  (Surely, 
you do have some ideas about forming estimates
that *do*  take the hours into account.  The problem
belongs in the hands of someone who does.)

 - maybe you want to segregate trials into those that ran the 
full 192 hours and those that ran less than 192 hours, and compute 
two (Maximum Likelihood) estimates for the parameters, which
you then combine.


> downward bias into the results and is clearly not the thing to do although 
> that's what has been done in the past with these experiments. Any 
> suggestions of where people have addressed data of this or related form 
> would be very gratefully received.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html



a form of censoring I have not met before

2001-06-20 Thread Margaret Mackisack

Hi folks,

I was wondering if anyone could direct me to a reference about the 
following situation. In a 3-factor experiment, measurements of a continuous 
variable, which is increasing monotonically over time, are made every 2 
hours from 0 to 192 hours on the experimental units (this is an engineering 
experiment). If the response exceeds a set maximum level the unit is not 
observed any more (so we only know that the response is > that level). If 
the measuring equipment could do so it would be preferred to observe all 
units for the full 192 hours. The time to censoring is of no interest as 
such, the aim is to estimate the form of the response for each unit which 
is the trace of some curve that we observe every 2 hours. Ignoring the 
censored traces in the time period after they are censored puts a huge 
downward bias into the results and is clearly not the thing to do although 
that's what has been done in the past with these experiments. Any 
suggestions of where people have addressed data of this or related form 
would be very gratefully received.
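
The downward bias can be seen in a small made-up illustration.  The 
slopes, the cutoff y_max, and the function names below are assumptions 
chosen only for the sketch, with linear traces standing in for the 
unknown curve.  At each time the "naive" mean averages only the units 
still under observation, so the fastest-rising units silently drop out.

```python
times = list(range(0, 193, 2))     # readings every 2 hours, 0..192
y_max = 100.0                      # hypothetical cutoff (assumption)
slopes = [0.3, 0.5, 0.8, 1.2]      # four made-up units, Y = slope * T


def last_observed_time(slope):
    """Time of the first reading with Y >= y_max, else the final time."""
    for t in times:
        if slope * t >= y_max:
            return t
    return times[-1]


def naive_mean(t):
    """Mean response at time t over units still being observed."""
    seen = [s * t for s in slopes if last_observed_time(s) >= t]
    return sum(seen) / len(seen)


def full_mean(t):
    """Mean response at time t if every unit had been followed."""
    return sum(s * t for s in slopes) / len(slopes)


for t in (60, 120, 192):
    print(t, naive_mean(t), full_mean(t))
```

Before any unit is censored the two means agree;  afterwards the naive 
mean falls below the full one, and the gap widens as more units drop out.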

TIA,

Margaret
___
Margaret Mackisack PhD CStat
Consulting Statistician
Stillman & Mackisack Pty Ltd
ABN 65 091 869 091
23 Sherwood St
Kurrajong NSW 2758 Australia


