Hi all,

as John pointed out, there is a way to create settings where the studentized residuals are undefined. However, after cross-checking it seems that the residuals are getting calculated without any error. The problem comes up when I use outlierTest to assign a p,q value,respectively.

Below is the code and some printouts:


non.zero.orga.vector <- c(1.00,1.00,1.00,1.00,1.00,1.00,1.00,1.00,1.00,1.00,13.98) #targervector

print(meta.Matrix[,3])
x1 x28 x2 x3 x4 x51 x52 x5 x6 x7 x82 4.6 35.8 4.1 5.4 5.2 27.9 48.2 4.5 5.7 4.2 100.5

print(meta.Matrix[,10])
     x1                      x28               x2     x3
0.2990466     1.0497156     0.2805028 0.3517993

      x4          x51                x52                  x5
0.3543678 1.0178604 0.4933182 0.2810865

    x6                   x7              x82
0.4349550 0.3269192 3.0889747

glModel <- glm(non.zero.orga.vector ~ meta.Matrix[,3]+meta.Matrix[,10]) #create glm for testing

print(glModel$residuals) #check if residuals are calculated for every entry in the target vector x1 x28 x2 x3 x4 x51 x52 0.6600696 -2.5816334 0.7438969 0.4129873 0.3976498 -2.5350861 0.3128240
       x5            x6                    x7           x82
 0.7465766  0.0102020  0.5181337  1.3143796



test.Res <- outlierTest(glModel,digits=4,cutoff=Inf,n.max=Inf) #test the glm for outlier, cutoff=Inf/n.max=Inf == report everything

print(test.Res$bonf.p) #check if q-value exists
x28 x51 x52 x5 x2 x1 x7 x3 0.1915995 0.2240759 2.1637438 6.0211575 6.0306110 6.4618792 7.1932613 7.7595619
      x4               x6
7.8394477 9.9437848



As you see the glm calculates residuals for x82 (which is in fact 1.314) but the outlierTest does not assign a p/q value to it. Does anyone know why?

Thanks in advance,

Phil

On 10/17/2014 08:54 PM, John Fox wrote:
Dear Phil,

Yes, that's a bit clearer. One can invent data configurations where certain 
studentized residuals are undefined. For example, try the following:

y <- c(0, 0, 0, 0, 0, 1)
x <- 1:6
xx <- (1:6 - 3.5)^2
rstudent(lm(y ~ x))
rstudent(lm(y ~ xx))
plot(x, y)
plot(xx, y)

The plots should clarify what's going on.

I'm copying to r-help since the discussion began there.

I hope this helps,
  John

On Fri, 17 Oct 2014 18:35:59 +0200
  Philip Stevens <chobop...@gmail.com> wrote:
Dear John,

Thank you for your fast reply. Unfortunately I am out of office right now
but I will get back to you on Monday asap with a toy example and some code.
Meanwhile let me try to explain further.

Basically not the glm but the outlierTest afterwards fails. And it only
fails if all values, used to set up the glm, are exactly 1 except for one
value which can be arbitrary large. I construct the glm from a vector of
doubles (the target values) and a vector of other numeric values(metadata).
(So this should be fit <- glm(targetVector ~ metadata) ) And want to check
if one of those doubles(in the target vector) is an outlier using
outlierTest(fit). The residuals are calculated by the glm but the
outlierTest does not report a p nor a q value for the one value in the
target vector which is not 1. And I can't figure out why...

I hope this makes it a bit clearer.
Anyways I will come back to you on Monday.

Best,

Phil
Am 17.10.2014 17:43 schrieb "John Fox" <j...@mcmaster.ca>:

Dear Phil,

After reading your posting several times, I still don't understand what
you did. As usual, having a reproducible example illustrating the error
would be a great help. I do have a guess about the source of the error:
glm() failed in some way for the problematic case.

Best,
  John

------------------------------------------------
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/


On Fri, 17 Oct 2014 12:53:30 +0200
  Phil <chobop...@gmail.com> wrote:
Hi guys,

I came across a strange phenomena and can't figure out why it happens by
myself so here we go.
I got a dataframe which consists of double numbers which I want to
check, row-wise if there are outliers in the rows.
So I iterate over the rows and create a glm using the numbers of that
particular row. Which might look like this:
case1)
          x1        x2        x3        x4        x5        x6 x7
x8        x9        x10        x11
      0.00     3.91     0.00     0.00     0.00   68.03   40.39 0.00
0.00      0.00       4.11

or like this:
case2)
          x1        x2        x3        x4        x5        x6 x7
x8        x9        x10        x11
       1.00     1.00    1.00     1.00     1.00     1.00     1.00 1.00
1.00     1.00      5.34

or any other combination of double numbers...

however, using a glm like this:

glModel <- glm(vector ~ some_other_meta_data_which_is_double_numbers)

and testing it with:

   test.Res <- outlierTest(glModel,digits=4,cutoff=Inf,n.max=Inf)

I always get a result consisting of the desired p and q values but not
if the vector I use looks like case2. There is no error message and the
computation does not stop either.
However, all p and q values are produced except for the last value x11.

Any idea why this particular value gets dropped from the output of the
outlierTest Method in the car package.
Here is the sessioninfo:

   sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-redhat-linux-gnu (64-bit)

locale:
   [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
   [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
   [5] LC_MONETARY=en_US.utf8    LC_MESSAGES=en_US.utf8
   [7] LC_PAPER=en_US.utf8       LC_NAME=C
   [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods base

other attached packages:
[1] ggplot2_1.0.0      car_2.0-21         RColorBrewer_1.0-5 iNEXT_1.0
[5] vegan_2.0-10       lattice_0.20-29    permute_0.8-3

loaded via a namespace (and not attached):
   [1] colorspace_1.2-4 compiler_3.1.1   digest_0.6.4 grid_3.1.1
   [5] gtable_0.1.2     labeling_0.3     MASS_7.3-33 munsell_0.4.2
   [9] nnet_7.3-8       plyr_1.8.1       proto_0.3-10 Rcpp_0.11.2
[13] reshape2_1.4     scales_0.2.4     stringr_0.6.2    tools_3.1.1

Any help is highly appreciated.

Thanks

Phil

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




------------------------------------------------
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/
        
        


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to