Re: [R] hypergeometric vs fisher.test

2010-08-13 Thread Peter Dalgaard
Andrea Franceschini wrote:
 I ask the question also because I found this line in Wikipedia:
 The test (see above) based on the hypergeometric distribution
 (hypergeometric test) is identical to the corresponding one-tailed
 version of Fisher's exact test.
 
 Is this wrong ?  May I kindly ask a friendly explanation for
 not-experts in statistics ?
 
 Thx a lot,
 

You never said you were a non-expert. A question like that might very
well have come from a student, or an incompetent claiming to have found
a bug in fisher.test...

The point was that Fisher's test takes the 2x2 table

a b
c d

and interprets the situation under the null hypothesis as taking a
sample of size a+c from an urn with a+b white balls and c+d black balls.

Once you read the docs for phyper properly (it _is_ tricky to get it
right), you arrive at

 phyper(16,17+181,449+19551,17+449, lower.tail=FALSE)
[1] 3.693347e-06


-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Unable to retrieve residual sum of squares from nls output

2010-08-13 Thread Dennis Fisher
Colleagues,

I am using nls successfully (2.11.1, OS X) but I am having difficulties 
retrieving part of the output - residual sum of squares.  I have assigned the 
output to FIT:
  FIT
 Nonlinear regression model
   model:  NEWY ~ PMESOR + PAMPLITUDE * cos(2 * pi * (NEWX - POFFSET)/PERIOD) 
data:  parent.frame() 
 PMESOR PAMPLITUDEPOFFSET 
1153.02   -1183.09  24.58 
  residual sum-of-squares: 1815056
 
 Number of iterations to convergence: 8 
 Achieved convergence tolerance: 1.643e-08 

As you can see, the object contains residual sum-of-squares: 1815056.  
However, I can't figure out how to grab that value.   The tried using str:
  str(FIT)
 List of 6
  $ m  :List of 16
   ..$ resid :function ()  
   ..$ fitted:function ()  
   ..$ formula   :function ()  
   ..$ deviance  :function ()  
   ..$ lhs   :function ()  
   ..$ gradient  :function ()  
   ..$ conv  :function ()  
   ..$ incr  :function ()  
   ..$ setVarying:function (vary = rep(TRUE, length(useParams)))  
   ..$ setPars   :function (newPars)  
   ..$ getPars   :function ()  
   ..$ getAllPars:function ()  
   ..$ getEnv:function ()  
   ..$ trace :function ()  
   ..$ Rmat  :function ()  
   ..$ predict   :function (newdata = list(), qr = FALSE)  
   ..- attr(*, class)= chr nlsModel
  $ convInfo   :List of 5
   ..$ isConv : logi TRUE
   ..$ finIter: int 8
   ..$ finTol : num 1.64e-08
   ..$ stopCode   : int 0
   ..$ stopMessage: chr converged
  $ data   : language parent.frame()
  $ call   : language nls(formula = NEWY ~ PMESOR + PAMPLITUDE * cos(2 * 
 pi * (NEWX -  POFFSET)/PERIOD), start = list(PMESOR = MESOR, PAMPLITUDE = 
 AMPLITUDE,  ...
  $ dataClasses: Named chr numeric
   ..- attr(*, names)= chr NEWX
  $ control:List of 2
   ..$ maxiter : num 1000
   ..$ warnOnly: logi TRUE
  - attr(*, class)= chr nls

However, none of these elements appears to contain the residual sum of squares. 
 
Help!  Thanks.

Dennis





Dennis Fisher MD
P  (The P Less Than Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Unable to retrieve residual sum of squares from nls output

2010-08-13 Thread Gabor Grothendieck
On Fri, Aug 13, 2010 at 5:21 PM, Dennis Fisher fis...@plessthan.com wrote:
 Colleagues,

 I am using nls successfully (2.11.1, OS X) but I am having difficulties 
 retrieving part of the output - residual sum of squares.  I have assigned the 
 output to FIT:
  FIT
 Nonlinear regression model
   model:  NEWY ~ PMESOR + PAMPLITUDE * cos(2 * pi * (NEWX - POFFSET)/PERIOD)
    data:  parent.frame()
     PMESOR PAMPLITUDE    POFFSET
    1153.02   -1183.09      24.58
  residual sum-of-squares: 1815056

 Number of iterations to convergence: 8
 Achieved convergence tolerance: 1.643e-08

 As you can see, the object contains residual sum-of-squares: 1815056.  
 However, I can't figure out how to grab that value.   The tried using str:
  str(FIT)
 List of 6
  $ m          :List of 16
   ..$ resid     :function ()
   ..$ fitted    :function ()
   ..$ formula   :function ()
   ..$ deviance  :function ()
   ..$ lhs       :function ()
   ..$ gradient  :function ()
   ..$ conv      :function ()
   ..$ incr      :function ()
   ..$ setVarying:function (vary = rep(TRUE, length(useParams)))
   ..$ setPars   :function (newPars)
   ..$ getPars   :function ()
   ..$ getAllPars:function ()
   ..$ getEnv    :function ()
   ..$ trace     :function ()
   ..$ Rmat      :function ()
   ..$ predict   :function (newdata = list(), qr = FALSE)
   ..- attr(*, class)= chr nlsModel
  $ convInfo   :List of 5
   ..$ isConv     : logi TRUE
   ..$ finIter    : int 8
   ..$ finTol     : num 1.64e-08
   ..$ stopCode   : int 0
   ..$ stopMessage: chr converged
  $ data       : language parent.frame()
  $ call       : language nls(formula = NEWY ~ PMESOR + PAMPLITUDE * cos(2 * 
 pi * (NEWX -      POFFSET)/PERIOD), start = list(PMESOR = MESOR, PAMPLITUDE 
 = AMPLITUDE,  ...
  $ dataClasses: Named chr numeric
   ..- attr(*, names)= chr NEWX
  $ control    :List of 2
   ..$ maxiter : num 1000
   ..$ warnOnly: logi TRUE
  - attr(*, class)= chr nls

 However, none of these elements appears to contain the residual sum of 
 squares.
 Help!  Thanks.


Try:

   sum(resid(FIT)^2)

or

   FIT$m$deviance()

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Unable to retrieve residual sum of squares from nls output

2010-08-13 Thread Jun Shen
Hi, Dennis,

Does this give what you want?

sum(resid(FIT)^2)

Jun

Senior Pharmacokineticist
Seventh Wave Labs
On Fri, Aug 13, 2010 at 4:21 PM, Dennis Fisher fis...@plessthan.com wrote:
 Colleagues,

 I am using nls successfully (2.11.1, OS X) but I am having difficulties 
 retrieving part of the output - residual sum of squares.  I have assigned the 
 output to FIT:
  FIT
 Nonlinear regression model
   model:  NEWY ~ PMESOR + PAMPLITUDE * cos(2 * pi * (NEWX - POFFSET)/PERIOD)
    data:  parent.frame()
     PMESOR PAMPLITUDE    POFFSET
    1153.02   -1183.09      24.58
  residual sum-of-squares: 1815056

 Number of iterations to convergence: 8
 Achieved convergence tolerance: 1.643e-08

 As you can see, the object contains residual sum-of-squares: 1815056.  
 However, I can't figure out how to grab that value.   The tried using str:
  str(FIT)
 List of 6
  $ m          :List of 16
   ..$ resid     :function ()
   ..$ fitted    :function ()
   ..$ formula   :function ()
   ..$ deviance  :function ()
   ..$ lhs       :function ()
   ..$ gradient  :function ()
   ..$ conv      :function ()
   ..$ incr      :function ()
   ..$ setVarying:function (vary = rep(TRUE, length(useParams)))
   ..$ setPars   :function (newPars)
   ..$ getPars   :function ()
   ..$ getAllPars:function ()
   ..$ getEnv    :function ()
   ..$ trace     :function ()
   ..$ Rmat      :function ()
   ..$ predict   :function (newdata = list(), qr = FALSE)
   ..- attr(*, class)= chr nlsModel
  $ convInfo   :List of 5
   ..$ isConv     : logi TRUE
   ..$ finIter    : int 8
   ..$ finTol     : num 1.64e-08
   ..$ stopCode   : int 0
   ..$ stopMessage: chr converged
  $ data       : language parent.frame()
  $ call       : language nls(formula = NEWY ~ PMESOR + PAMPLITUDE * cos(2 * 
 pi * (NEWX -      POFFSET)/PERIOD), start = list(PMESOR = MESOR, PAMPLITUDE 
 = AMPLITUDE,  ...
  $ dataClasses: Named chr numeric
   ..- attr(*, names)= chr NEWX
  $ control    :List of 2
   ..$ maxiter : num 1000
   ..$ warnOnly: logi TRUE
  - attr(*, class)= chr nls

 However, none of these elements appears to contain the residual sum of 
 squares.
 Help!  Thanks.

 Dennis





 Dennis Fisher MD
 P  (The P Less Than Company)
 Phone: 1-866-PLessThan (1-866-753-7784)
 Fax: 1-866-PLessThan (1-866-753-7784)
 www.PLessThan.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Unable to retrieve residual sum of squares from nls output

2010-08-13 Thread Gabor Grothendieck
On Fri, Aug 13, 2010 at 5:32 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
 On Fri, Aug 13, 2010 at 5:21 PM, Dennis Fisher fis...@plessthan.com wrote:
 Colleagues,

 I am using nls successfully (2.11.1, OS X) but I am having difficulties 
 retrieving part of the output - residual sum of squares.  I have assigned 
 the output to FIT:
  FIT
 Nonlinear regression model
   model:  NEWY ~ PMESOR + PAMPLITUDE * cos(2 * pi * (NEWX - POFFSET)/PERIOD)
    data:  parent.frame()
     PMESOR PAMPLITUDE    POFFSET
    1153.02   -1183.09      24.58
  residual sum-of-squares: 1815056

 Number of iterations to convergence: 8
 Achieved convergence tolerance: 1.643e-08

 As you can see, the object contains residual sum-of-squares: 1815056.  
 However, I can't figure out how to grab that value.   The tried using str:
  str(FIT)
 List of 6
  $ m          :List of 16
   ..$ resid     :function ()
   ..$ fitted    :function ()
   ..$ formula   :function ()
   ..$ deviance  :function ()
   ..$ lhs       :function ()
   ..$ gradient  :function ()
   ..$ conv      :function ()
   ..$ incr      :function ()
   ..$ setVarying:function (vary = rep(TRUE, length(useParams)))
   ..$ setPars   :function (newPars)
   ..$ getPars   :function ()
   ..$ getAllPars:function ()
   ..$ getEnv    :function ()
   ..$ trace     :function ()
   ..$ Rmat      :function ()
   ..$ predict   :function (newdata = list(), qr = FALSE)
   ..- attr(*, class)= chr nlsModel
  $ convInfo   :List of 5
   ..$ isConv     : logi TRUE
   ..$ finIter    : int 8
   ..$ finTol     : num 1.64e-08
   ..$ stopCode   : int 0
   ..$ stopMessage: chr converged
  $ data       : language parent.frame()
  $ call       : language nls(formula = NEWY ~ PMESOR + PAMPLITUDE * cos(2 * 
 pi * (NEWX -      POFFSET)/PERIOD), start = list(PMESOR = MESOR, PAMPLITUDE 
 = AMPLITUDE,  ...
  $ dataClasses: Named chr numeric
   ..- attr(*, names)= chr NEWX
  $ control    :List of 2
   ..$ maxiter : num 1000
   ..$ warnOnly: logi TRUE
  - attr(*, class)= chr nls

 However, none of these elements appears to contain the residual sum of 
 squares.
 Help!  Thanks.


 Try:

   sum(resid(FIT)^2)

 or

   FIT$m$deviance()

or deviance(FIT)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Bug in t.test?

2010-08-13 Thread Johannes W. Dietrich

Hello all,

due to unexplained differences between statistical results from 
collaborators and our lab that arose in the same large proteomics 
dataset we reevaluated the t.test() function. Here, we found a weird 
behaviour that is also reproducible in the following small test 
dataset:


Suppose, we have two vectors with numbers and some missing values 
that refer to the same individuals and that should therefore be 
evaluated with a paired t-test:



 testdata.A - c(1.15, -0.2, NA, 1, -2, -0.5, 0.1, 1.2, -1.4, 0.01);
 testdata.B - c(1.2, 1.1, 3, -0.1, 3, 1.1, 0, 1.3, 4, NA);


Then

 print(t.test(testdata.A, testdata.B, paired=TRUE, 
alternative=two.sided, na.action=na.pass))


and

 print(t.test(testdata.A, testdata.B, paired=TRUE, 
alternative=two.sided, na.action=na.exclude))


deliver the same p value (0.1162, identical to Excel's result).

However, after combining the two vectors with


 testdata - c(testdata.A, testdata.B);


and defining a criterion vector with


 criterion - c(0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1);


(that is the type of data layout we have in our proteomics project) 
we get a different p-value (0.01453) with


 print(t.test(testdata ~ criterion, paired=TRUE, 
alternative=two.sided, na.action=na.exclude)) .


The statement

 print(t.test(testdata ~ criterion, paired=TRUE, 
alternative=two.sided, na.action=na.pass))


however, delivers a p-value of 0.1162 again.

With

 print(t.test(testdata[criterion==0], testdata[criterion==1], 
paired=TRUE, alternative=two.sided, na.action=na.exclude))


that imitates the first form, we get again a p value of 0.1162.

What is the reason for the different p values? Should not all calls 
to t.test that exlude missing values be equivalent and therefore 
deliver the same results?


Excel, StatView and KaleidaGraph all display a p-value of 0.1162.

J. W. D.
--
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
-- Dr. Johannes W. Dietrich, M.D.
-- Laboratory XU44, Endocrine Research
-- Medical Hospital I, Bergmannsheil University Hospitals
-- Ruhr University of Bochum
-- Buerkle-de-la-Camp-Platz 1, D-44789 Bochum, NRW, Germany
-- Phone: +49:234:302-6400, Fax: +49:234:302-6403
-- eMail: j.w.dietr...@medical-cybernetics.de
-- WWW: http://medical-cybernetics.de
-- WWW: http://www.bergmannsheil.de
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Lattice xyplots plots with multiple lines per cell

2010-08-13 Thread Robin Jeffries
Hello,

I need to plot the means of some outcome for two groups (control vs
intervention) over time (discrete) on the same plot, for various subsets
such as gender and grade level. What I have been doing is creating all
possible subsets first, using the aggregate function to create the means
 over time, then plotting the means over time (as a simple line plot with
both control  intervention on one plot) for one subset. I then use par()
and repeat this plot for each gender x grade level subset so they all appear
on one page.


This appears to me to be very similar to an xyplot, something like
 mean(outcome) ~ gender + gradelevel. However, I can't figure out how I
could get both control and intervention lines in the same plot.

Any suggestions? What i'm doing now -works-, but just seems to be the long
way around.

-Robin

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Installing rJava fails on Gentoo (amd64) with Sun JDK - checking JNI data types... error

2010-08-13 Thread Anthony Staines
Installing rJava fails consistently, whether installed via
the command line as below, or through install.packages(
'rJava' ), and whether 0.84 or 0.85 is used, with the
message :-
checking JNI data types... configure: error: One or more
JNI types differ from the corresponding native type. You may
need to use non-standard compiler flags or a different
compiler in order to fix this.

I've tried to fix it using suggestions from the r-help list,
and the Gentoo java documentation, but had no joy so far.
Other Java stuff works fine on this box.

# R CMD INSTALL rJava_0.8-5.tar.gz
* installing to library ‘/usr/lib64/R/library’
* installing *source* package ‘rJava’ ...
checking for gcc... x86_64-pc-linux-gnu-gcc -std=gnu99
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether x86_64-pc-linux-gnu-gcc -std=gnu99 accepts
-g... yes
checking for x86_64-pc-linux-gnu-gcc -std=gnu99 option to
accept ISO C89... none needed
checking how to run the C preprocessor...
x86_64-pc-linux-gnu-gcc -std=gnu99 -E
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/wait.h that is POSIX.1 compatible... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for string.h... (cached) yes
checking sys/time.h usability... yes
checking sys/time.h presence... yes
checking for sys/time.h... yes
checking for unistd.h... (cached) yes
checking for an ANSI C-conforming const... yes
checking whether time.h and sys/time.h may both be
included... yes
configure: checking whether x86_64-pc-linux-gnu-gcc
-std=gnu99 supports static inline...
yes
checking whether setjmp.h is POSIX.1 compatible... yes
checking whether sigsetjmp is declared... yes
checking whether siglongjmp is declared... yes
checking Java support in R... present:
interpreter : '/usr/bin/java'
archiver: '/usr/bin/jar'
compiler:
'/home/astaines/.gentoo/java-config-2/current-user-vm/bin/javac'
header prep.: '/usr/bin/javah'
cpp flags   : '-I/opt/sun-jdk-1.6.0.20/jre/../include
-I/opt/sun-jdk-1.6.0.20/jre/../include/linux'
java libs   : '-L/opt/sun-jdk-1.6.0.20/jre/lib/amd64/server
-L/opt/sun-jdk-1.6.0.20/jre/lib/amd64
-L/opt/sun-jdk-1.6.0.20/jre/../lib/amd64 -L
-L/usr/java/packages/lib/amd64 -L/usr/lib64 -L/lib64 -L/lib
-L/usr/lib -ljvm'
checking whether JNI programs can be compiled... yes
checking JNI data types... configure: error: One or more JNI
types differ from the corresponding native type. You may
need to use non-standard compiler flags or a different
compiler in order to fix this.
ERROR: configuration failed for package ‘rJava’
* removing ‘/usr/lib64/R/library/rJava’


# R --version
R version 2.11.1 (2010-05-31)

# java -showversion
java version 1.6.0_20
Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed  mode)

The compiler mentioned in the R output :-
/home/astaines/.gentoo/java-config-2/current-user-vm
is a link pointing to
/usr/lib/jvm/sun-jdk-1.6
which in turn is a link pointing to
/opt/sun-jdk-1.6.0.20
however, any of the three will work.

# R CMD javareconf
*** JAVA_HOME is not a valid path, ignoring
Java interpreter : /usr/bin/java
Java version : 1.6.0_20
Java home path   : /opt/sun-jdk-1.6.0.20/jre
Java compiler:
/home/astaines/.gentoo/java-config-2/current-user-vm/bin/javac
Java headers gen.: /usr/bin/javah
Java archive tool: /usr/bin/jar
Java library path:
$(JAVA_HOME)/lib/amd64/server:$(JAVA_HOME)/lib/amd64:$(JAVA_HOME)/../lib/amd64::/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
JNI linker flags : -L$(JAVA_HOME)/lib/amd64/server
-L$(JAVA_HOME)/lib/amd64 -L$(JAVA_HOME)/../lib/amd64 -L
-L/usr/java/packages/lib/amd64 -L/usr/lib64 -L/lib64 -L/lib
-L/usr/lib -ljvm
JNI cpp flags: -I$(JAVA_HOME)/../include
-I$(JAVA_HOME)/../include/linux

Updating Java configuration in /usr/lib64/R
Done.

JAVA_HOME is unset by default on my system. As suggested, by
Godmar Back, on the list I've tried setting it. The Gentoo
way is to use
JAVA_HOME=$(java-config --jdk-home)
and this produces what I think is the desired result
#printenv | grep -i JAVA_HOME
JAVA_HOME=/opt/sun-jdk-1.6.0.20

BUT

# env JAVA_HOME=/opt/sun-jdk-1.6.0.20/ R CMD javareconf
*** JAVA_HOME is not a valid path, ignoring
Java interpreter : /usr/bin/java
Java version : 1.6.0_20
Java home path   : /opt/sun-jdk-1.6.0.20/jre
Java compiler:
/home/astaines/.gentoo/java-config-2/current-user-vm/bin/javac
Java headers gen.: 

Re: [R] Bug in t.test?

2010-08-13 Thread Thomas Lumley



Thanks for the clear example. However, if there is a bug it is only that 
t.test.formula doesn't throw an error when given the paired=TRUE option.

The documentation says The formula interface is only applicable for the 2-sample 
tests.,  but there probably should be an explicit check -- I didn't see that in the 
documentation until I realized that t.test.formula couldn't work for paired tests because 
you don't tell it which observations are paired.

   -thomas


On Fri, 13 Aug 2010, Johannes W. Dietrich wrote:


Hello all,

due to unexplained differences between statistical results from collaborators 
and our lab that arose in the same large proteomics dataset we reevaluated 
the t.test() function. Here, we found a weird behaviour that is also 
reproducible in the following small test dataset:


Suppose, we have two vectors with numbers and some missing values that refer 
to the same individuals and that should therefore be evaluated with a paired 
t-test:



 testdata.A - c(1.15, -0.2, NA, 1, -2, -0.5, 0.1, 1.2, -1.4, 0.01);
 testdata.B - c(1.2, 1.1, 3, -0.1, 3, 1.1, 0, 1.3, 4, NA);


Then

 print(t.test(testdata.A, testdata.B, paired=TRUE, alternative=two.sided, 
na.action=na.pass))


and

 print(t.test(testdata.A, testdata.B, paired=TRUE, alternative=two.sided, 
na.action=na.exclude))


deliver the same p value (0.1162, identical to Excel's result).

However, after combining the two vectors with


 testdata - c(testdata.A, testdata.B);


and defining a criterion vector with


 criterion - c(0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1);


(that is the type of data layout we have in our proteomics project) we get a 
different p-value (0.01453) with


 print(t.test(testdata ~ criterion, paired=TRUE, alternative=two.sided, 
na.action=na.exclude)) .


The statement

 print(t.test(testdata ~ criterion, paired=TRUE, alternative=two.sided, 
na.action=na.pass))


however, delivers a p-value of 0.1162 again.

With

 print(t.test(testdata[criterion==0], testdata[criterion==1], paired=TRUE, 
alternative=two.sided, na.action=na.exclude))


that imitates the first form, we get again a p value of 0.1162.

What is the reason for the different p values? Should not all calls to t.test 
that exlude missing values be equivalent and therefore deliver the same 
results?


Excel, StatView and KaleidaGraph all display a p-value of 0.1162.

J. W. D.
--
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
-- Dr. Johannes W. Dietrich, M.D.
-- Laboratory XU44, Endocrine Research
-- Medical Hospital I, Bergmannsheil University Hospitals
-- Ruhr University of Bochum
-- Buerkle-de-la-Camp-Platz 1, D-44789 Bochum, NRW, Germany
-- Phone: +49:234:302-6400, Fax: +49:234:302-6403
-- eMail: j.w.dietr...@medical-cybernetics.de
-- WWW: http://medical-cybernetics.de
-- WWW: http://www.bergmannsheil.de
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Thomas Lumley
Professor of Biostatistics
University of Washington, Seattle

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Bug in t.test?

2010-08-13 Thread Johannes W. Dietrich
Thank you for the fast reply! Although I have read the help page for 
t.test over and over again I have obviously overlooked the relevant 
sentence. The workaround that I have planned seems to be the 
correct use.


Thanks again,

J. W. D.

At 15:31 Uhr -0700 13.08.2010, Thomas Lumley wrote:
Thanks for the clear example. However, if there is a bug it is only 
that t.test.formula doesn't throw an error when given the 
paired=TRUE option.


The documentation says The formula interface is only applicable for 
the 2-sample tests.,  but there probably should be an explicit 
check -- I didn't see that in the documentation until I realized 
that t.test.formula couldn't work for paired tests because you don't 
tell it which observations are paired.


   -thomas


On Fri, 13 Aug 2010, Johannes W. Dietrich wrote:


Hello all,

due to unexplained differences between statistical results from 
collaborators and our lab that arose in the same large proteomics 
dataset we reevaluated the t.test() function. Here, we found a 
weird behaviour that is also reproducible in the following small 
test dataset:


Suppose, we have two vectors with numbers and some missing values 
that refer to the same individuals and that should therefore be 
evaluated with a paired t-test:



 testdata.A - c(1.15, -0.2, NA, 1, -2, -0.5, 0.1, 1.2, -1.4, 0.01);
 testdata.B - c(1.2, 1.1, 3, -0.1, 3, 1.1, 0, 1.3, 4, NA);


Then

 print(t.test(testdata.A, testdata.B, paired=TRUE, 
alternative=two.sided, na.action=na.pass))


and

 print(t.test(testdata.A, testdata.B, paired=TRUE, 
alternative=two.sided, na.action=na.exclude))


deliver the same p value (0.1162, identical to Excel's result).

However, after combining the two vectors with


 testdata - c(testdata.A, testdata.B);


and defining a criterion vector with


 criterion - c(0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1);


(that is the type of data layout we have in our proteomics project) 
we get a different p-value (0.01453) with


 print(t.test(testdata ~ criterion, paired=TRUE, 
alternative=two.sided, na.action=na.exclude)) .


The statement

 print(t.test(testdata ~ criterion, paired=TRUE, 
alternative=two.sided, na.action=na.pass))


however, delivers a p-value of 0.1162 again.

With

 print(t.test(testdata[criterion==0], testdata[criterion==1], 
paired=TRUE, alternative=two.sided, na.action=na.exclude))


that imitates the first form, we get again a p value of 0.1162.

What is the reason for the different p values? Should not all calls 
to t.test that exlude missing values be equivalent and therefore 
deliver the same results?


Excel, StatView and KaleidaGraph all display a p-value of 0.1162.

J. W. D.
--
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
-- Dr. Johannes W. Dietrich, M.D.
-- Laboratory XU44, Endocrine Research
-- Medical Hospital I, Bergmannsheil University Hospitals
-- Ruhr University of Bochum
-- Buerkle-de-la-Camp-Platz 1, D-44789 Bochum, NRW, Germany
-- Phone: +49:234:302-6400, Fax: +49:234:302-6403
-- eMail: j.w.dietr...@medical-cybernetics.de
-- WWW: http://medical-cybernetics.de
-- WWW: http://www.bergmannsheil.de
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Thomas Lumley
Professor of Biostatistics
University of Washington, Seattle


--
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
-- Dr. Johannes W. Dietrich, M.D.
-- Laboratory XU44, Endocrine Research
-- Medical Hospital I, Bergmannsheil University Hospitals
-- Ruhr University of Bochum
-- Buerkle-de-la-Camp-Platz 1, D-44789 Bochum, NRW, Germany
-- Phone: +49:234:302-6400, Fax: +49:234:302-6403
-- eMail: j.w.dietr...@medical-cybernetics.de
-- WWW: http://medical-cybernetics.de
-- WWW: http://www.bergmannsheil.de
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] merge function in R?

2010-08-13 Thread JesperHybel

I think it would be helpful if you could clarify youre question - do you want
distinct sets - maybe use 

unique()

but why (5,20) when its (5,10) in the row in youre example? What criteria do
you want the function to select the sets by and what kind of output do you
need? 

Maybe it's just me who dosn't get the question..sr
-- 
View this message in context: 
http://r.789695.n4.nabble.com/merge-function-in-R-tp2324684p2324844.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] merge function in R?

2010-08-13 Thread fishkbob

I too think I worded it incorrectly...

so the second two columns of the matrix are the start and end of an interval
however, because some of the intervals overlap, I want to limit the number
of intervals I have to deal with.

So therefore,
(5 10)should merge with(7 18)   making(5 18)
and then (518)   should merge with (1620)   giving   (520)
whereas  (1 4) has no overlap with any other interval and is therefore
left on its own

Ideal output would just be a collapsing of the matrix
sample   start end
#  5   20
#  14

I got this to work using unique(c(5:10,7:18,16:20,1:4)) which gives me a
c(1:4,5:20)
However, I have to do this on a very large dataset and the numbers are more
like
c(100542:100782,598322:598821,...)

any help would be appreciated
thanks 
-- 
View this message in context: 
http://r.789695.n4.nabble.com/merge-function-in-R-tp2324684p2324855.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] decision tree finetune

2010-08-13 Thread Olga Shaganova
I figured it out myself, here it is: control=rpart.control(cp=.001))

Thank you!


On Fri, Aug 13, 2010 at 12:58 PM, Olga Shaganova olga.sha...@gmail.comwrote:

 My decision tree grows only with one split and based on what I see in
 E-Miner it should split on more variables. How can I adjust splitting
 criteria in R?

 Also is there way to indicate that some variables are binary, like variable
 Info_G is binary so in the results would be nice to see 2) Info_G=0
 instead of 2) Info_G0.5.

 Thank you in advance!
 And thanks for Eric who helped with my previous question about starting
 rpart.

 Olga

  fit - rpart(Retention ~
 Info_G+AOPD+Mail+Xref_Umbr+Ins_Age+Discount+Xref_A   + Con6  +
 + Con5   +Con4 +
 + Con3  +  Con2  +
 + Con1 , data=Home,control=rpart.control(minsplit=5))
 
  fit
 n= 48407
 node), split, n, deviance, yval
   * denotes terminal node
 1) root 48407 4730.642 0.8902225
   2) Info_G 0.5 14280 1999.293 0.8316527 *
   3) Info_G=0.5 34127 2661.865 0.9147303 *
 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] merge function in R?

2010-08-13 Thread David Winsemius
Neither you nor your responder have continued the eamil chain very  
well so let me put things back together:

on  Aug 13, 2010; 03:54pm fishkbob wrote subj = merge function in R?

So I have a bunch of c(start,end) points and want to consolidate  
them into as few c(start,end) as possible.


For example:
sample   startend
A  5   10
B  7   18
C  14
D  16  20

I'd want the function to return the two distinct sets (1,4) and  
(5,20)


Is there an R function that already does this?
or should I write my own? (how would I go about that?)


In an effort to be be helpful but not copying the prior message on  
Aug 13, 2010; 06:46pm  JesperHybel wrote:


I think it would be helpful if you could clarify youre question -  
do you want distinct sets - maybe use


unique()

but why (5,20) when its (5,10) in the row in youre example? What  
criteria do you want the function to select the sets by and what  
kind of output do you need?


Maybe it's just me who dosn't get the question..sr


On Aug 13, 2010, at 7:01 PM, fishkbob wrote:



I too think I worded it incorrectly...

so the second two columns of the matrix are the start and end of an  
interval
however, because some of the intervals overlap, I want to limit the  
number

of intervals I have to deal with.

So therefore,
(5 10)should merge with(7 18)   making(5 18)
and then (518)   should merge with (1620)   giving   (520)
whereas  (1 4) has no overlap with any other interval and is  
therefore

left on its own

Ideal output would just be a collapsing of the matrix
sample   start end
#  5   20
#  14

I got this to work using unique(c(5:10,7:18,16:20,1:4)) which gives  
me a

c(1:4,5:20)
However, I have to do this on a very large dataset and the numbers  
are more

like
c(100542:100782,598322:598821,...)

any help would be appreciated
thanks
--
View this message in context: 
http://r.789695.n4.nabble.com/merge-function-in-R-tp2324684p2324855.html
Sent from the R help mailing list archive at Nabble.com.


Nabble is where I saw all of this, but Nabble is not r-help:

I suggest you sort your rows by the start variable and then examine  
where the breaks would remain by looking at the prior values of end:


 dd - rd.txt(sample   startend
+ A  5   10
+ B  7   18
+ C  14
+ D  16  20)
 dd[order(dd$start), ]
  sample start end
3  C 1   4
1  A 5  10
2  B 7  18
4  D16  20
 ndd - dd[order(dd$start), ]
 ndd$inprior - c(NA, ndd[1:nrow(ndd)-1,3] = ndd[2:nrow(ndd),2] )
 ndd
  sample start end inprior
3  C 1   4  NA
1  A 5  10   FALSE
2  B 7  18TRUE
4  D16  20TRUE

--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Kalman filter

2010-08-13 Thread Spencer Graves
Other terms for Kalman filtering, prediction and smoothing are 
state space modeling and dynamic linear models.


   Consider the following extension of Ben Bolker's suggestion to 
use the 'sos' package:


library(sos)
k - ???Kalman
ss - findFn('state space')
dlm - findFn('dynamic linear model')
dlms - findFn('dynamic linear models')
# NOT necessarily a subset of dlm
ksd - (k | ss | dlm | dlms)
# union of the above 4
installPackages(ksd)
# Install the primary packages to get addition info for writeFindFn2xls
writeFindFn2xls(ksd)
getwd() # directory containing ksd.xls


   The first sheet of ksd.xls provides a one-line summary of the 
packages including date last updated, any vignettes, etc.  This file 
says that ksd found 31 matches in the dse package by Paul Gilbert, 
who has been a major contributor in this area for many years, using his 
own code at the Bank of Canada.  I have not used the second and third 
packages, expsmooth and MARSS, so I can't speak about them.  However, 
the next package is dlm, for which a companion book appeared just over 
a year ago;  I've found it very powerful and relatively easy to use.  If 
your response variables are something other than the a linear 
combination of components of the state vector plus normal noise, then I 
suggest you consider the sspir package, which is number 6 on this list 
in terms of the number of help pages matching the search terms.


   Hope this helps.
   Spencer Graves


On 8/13/2010 2:06 PM, Bruno Mourato wrote:
 Take a look on this

 Packages:

 - KFTRACK
 - UKFSST
 - TRACKIT



 2010/8/13 FMHkagba2...@yahoo.com

 Dear All,

 Could anyone give me a hand to suggest few packages in R to running Kalman
 prediction and filtration ?

 Thanks
 Fir




 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Spencer Graves, PE, PhD
President and Chief Operating Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Bug in t.test?

2010-08-13 Thread Ted Harding
Hi Thomas,
I'm not too sure about your interpretation. Consider:

  set.seed(54321)
  X - rnorm(10) ; Y - rnorm(10)
  XY - c(X,Y); group-c(rep(0,10),rep(1,10))

  t.test(X,Y,paired=TRUE)
  # Paired t-test
  # data:  X and Y 
  # t = -1.5265, df = 9, p-value = 0.1612
  # 95 percent confidence interval:
  #  -2.1521562  0.4178885 
  # sample estimates: mean of the differences 
  #  -0.8671339 

  t.test(XY~group,paired=TRUE)
  # Paired t-test
  # data:  XY by group 
  # t = -1.5265, df = 9, p-value = 0.1612
  # 95 percent confidence interval:
  #  -2.1521562  0.4178885 
  # sample estimates: mean of the differences
  #  -0.8671339 

  t.test(X,Y,paired=FALSE)
  # Welch Two Sample t-test
  # data:  X and Y 
  # t = -1.4865, df = 17.956, p-value = 0.1545
  # 95 percent confidence interval:
  #  -2.0928836  0.3586159 
  # sample estimates: mean of x  mean of y
  #  -0.2646042  0.6025297 

  t.test(XY~group,paired=FALSE)
  # Welch Two Sample t-test
  # data:  XY by group 
  # t = -1.4865, df = 17.956, p-value = 0.1545
  # 95 percent confidence interval:
  #  -2.0928836  0.3586159 
  # sample estimates: mean in group 0 mean in group 1
  #  -0.2646042   0.6025297 

So in each case t.test(XY~group,...) has done the same as
the corresponding t.test(X,Y,...); thus it would not seem
that The formula interface is only applicable for the
2-sample tests excludes specifying the two samples by
a formula with a 2-level factor; and it would seem that
the pairing has been correctly carried out (i.e. in the
formula case t.test assumes that it is by sequential
position within the respective group, without having to
be told as you suggest). In other words, it acts as though
giving a formula with a 2-group factor on the right, and
equal numbers in the two groups, is equivalent to giving
the two samples separately.

Johannes' original query was about differences when there
are NAs, corresponding to different settings of na.action.
It is perhaps possible that 'na.action=na.pass' and
'na.action=na.exclude' result in different pairings in the
case paired=TRUE. However, it seems to me that the differences
he observed are, shall we say, obscure!

Ted.

On 13-Aug-10 22:31:45, Thomas Lumley wrote:
 Thanks for the clear example. However, if there is a bug it is
 only that t.test.formula doesn't throw an error when given the
 paired=TRUE option.
 
 The documentation says The formula interface is only applicable
 for the 2-sample tests.,  but there probably should be an explicit
 check -- I didn't see that in the documentation until I realized
 that t.test.formula couldn't work for paired tests because you don't
 tell it which observations are paired.
 -thomas
 
 On Fri, 13 Aug 2010, Johannes W. Dietrich wrote:
 Hello all,

 due to unexplained differences between statistical results from
 collaborators 
 and our lab that arose in the same large proteomics dataset we
 reevaluated 
 the t.test() function. Here, we found a weird behaviour that is also 
 reproducible in the following small test dataset:

 Suppose, we have two vectors with numbers and some missing values that
 refer 
 to the same individuals and that should therefore be evaluated with a
 paired 
 t-test:

  testdata.A - c(1.15, -0.2, NA, 1, -2, -0.5, 0.1, 1.2, -1.4, 0.01);
  testdata.B - c(1.2, 1.1, 3, -0.1, 3, 1.1, 0, 1.3, 4, NA);

 Then

  print(t.test(testdata.A, testdata.B, paired=TRUE,
  alternative=two.sided, 
 na.action=na.pass))

 and

  print(t.test(testdata.A, testdata.B, paired=TRUE,
  alternative=two.sided, 
 na.action=na.exclude))

 deliver the same p value (0.1162, identical to Excel's result).

 However, after combining the two vectors with

  testdata - c(testdata.A, testdata.B);

 and defining a criterion vector with

  criterion - c(0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1);

 (that is the type of data layout we have in our proteomics project) we
 get a 
 different p-value (0.01453) with

  print(t.test(testdata ~ criterion, paired=TRUE,
  alternative=two.sided, 
 na.action=na.exclude)) .

 The statement

  print(t.test(testdata ~ criterion, paired=TRUE,
  alternative=two.sided, 
 na.action=na.pass))

 however, delivers a p-value of 0.1162 again.

 With

  print(t.test(testdata[criterion==0], testdata[criterion==1],
  paired=TRUE, 
 alternative=two.sided, na.action=na.exclude))

 that imitates the first form, we get again a p value of 0.1162.

 What is the reason for the different p values? Should not all calls to
 t.test 
 that exlude missing values be equivalent and therefore deliver the
 same 
 results?

 Excel, StatView and KaleidaGraph all display a p-value of 0.1162.

 J. W. D.
 -- 
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
 -- --
 -- Dr. Johannes W. Dietrich, M.D.
 -- Laboratory XU44, Endocrine Research
 -- Medical Hospital I, Bergmannsheil University Hospitals
 -- Ruhr University of Bochum
 -- Buerkle-de-la-Camp-Platz 1, 

Re: [R] Bug in t.test?

2010-08-13 Thread Peter Dalgaard
(Ted Harding) wrote:

 Johannes' original query was about differences when there
 are NAs, corresponding to different settings of na.action.
 It is perhaps possible that 'na.action=na.pass' and
 'na.action=na.exclude' result in different pairings in the
 case paired=TRUE. However, it seems to me that the differences
 he observed are, shall we say, obscure!

Did you miss the point that the incorrect behaviour with na.exclude is
the default one? (Actually, na.omit is, but the effect is the same).

Arguably, t.test could just hardcode na.action=na.pass and begone with it.

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to add lines to lattice plot produced by rms::bplot

2010-08-13 Thread David Winsemius


I have a plot produced by function bplot (package = rms) that is  
really a lattice plot (class=trellis). It is similar to this plot  
produced by a very minor modification of the first example on the  
bplot help page:


requiere(rms)
n - 1000# define sample size
set.seed(17) # so can reproduce the results
age- rnorm(n, 50, 10)
blood.pressure - rnorm(n, 120, 15)
cholesterol- rnorm(n, 200, 25)
sex- factor(sample(c('female','male'), n,TRUE))
label(age)- 'Age'  # label is in Hmisc
label(cholesterol)- 'Total Cholesterol'
label(blood.pressure) - 'Systolic Blood Pressure'
label(sex)- 'Sex'
units(cholesterol)- 'mg/dl'   # uses units.default in Hmisc
units(blood.pressure) - 'mmHg'

# Specify population model for log odds that Y=1
L - .4*(sex=='male') + .045*(age-50) +
  (log(cholesterol - 10)-5.2)*(-2*(sex=='female') + 2*(sex=='male'))
# Simulate binary y to have Prob(y=1) = 1/[1+exp(-L)]
y - ifelse(runif(n)  plogis(L), 1, 0)

ddist - datadist(age, blood.pressure, cholesterol, sex)
options(datadist='ddist')

fit - lrm(y ~ blood.pressure + sex * (age + rcs(cholesterol,4)),
   x=TRUE, y=TRUE)
p - Predict(fit, age, cholesterol, sex='male', np=50) # vary sex last
bp.plot - bplot(p, lfun=contourplot)
bp.plot

I have tried a variety of efforts at using update (which I assume is a  
lattice function although I can find no help page for it. It does  
appear in some of the lattice hep pages and my understanding is that  
it pushes objects onto the list structure of a plot object. I've also  
tried adding to it with llines()

#-
Oh, never mind. I recovered a memory that I had seen a solution on  
rhelp and had saved it. Turns out it was from Peter Ehlers, to whom I  
offer thanks. I was trying to add a step function: ht and weight from  
a dataframe, bld:


trellis.focus(panel, 1, 1)
   panel.lines(x=bld$inches, y=bld$BMI28, type='s')
 trellis.unfocus()

Success!

Now... how do I control the color levels in levelplot or contourplot???


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Odp: help

2010-08-13 Thread milly

Please,when i export the output from R to excel.I am not getting all the
15,000 observations but only 2000.Thank you
-- 
View this message in context: 
http://r.789695.n4.nabble.com/help-tp2323542p2324459.html
Sent from the Export many data to Excel 2007 mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R execution

2010-08-13 Thread chantalm

So I have a for loop

for (i in 1 : 5) {
print(i)
}

and I want it to print 1, 2, 3, 4, 5 sequentially for each loop iteration.
Instead, it waits for the whole loop to finish and then prints. It's strange
because my R console seems to have two modes of execution. One where it will
iterate for each loop and print the message (and where I can easily abort an
R program with the escape key), and another mode where it doesn't print for
each loop iteration/just keeps executing even if I press escape key. I can't
figure out how to enable the first way of execution! Can someone shed some
light on this matter?
-- 
View this message in context: 
http://r.789695.n4.nabble.com/R-execution-tp2324852p2324852.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] how to eliminate an element of a list

2010-08-13 Thread fishkbob

 list-seq(2,10,2)
 list
[1]  2  4  6  8 10
 list[-which(2==list)]
[1]  4  6  8 10

using the which() will let you remove things from a list that have a
specified value... I usually use the 

blah- blah[-which(TRUE==is.na(blah)) ]

which will remove all NA values in your list
-- 
View this message in context: 
http://r.789695.n4.nabble.com/how-to-eliminate-an-element-of-a-list-tp2323329p2324696.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] hypergeometric vs fisher.test

2010-08-13 Thread TGS
or an incompetent

Such harsh words Peter. You're not making this a friendly environment for 
people to ask questions. Is there a less competent attribute mailing list so 
that some of us don't offend you with our questions?

On Aug 13, 2010, at 2:11 PM, Peter Dalgaard wrote:

Andrea Franceschini wrote:
 I ask the question also because I found this line in Wikipedia:
 The test (see above) based on the hypergeometric distribution
 (hypergeometric test) is identical to the corresponding one-tailed
 version of Fisher's exact test.
 
 Is this wrong ?  May I kindly ask a friendly explanation for
 not-experts in statistics ?
 
 Thx a lot,
 

You never said you were a non-expert. A question like that might very
well have come from a student, or an incompetent claiming to have found
a bug in fisher.test...

The point was that Fisher's test takes the 2x2 table

a b
c d

and interprets the situation under the null hypothesis as taking a
sample of size a+c from an urn with a+b white balls and c+d black balls.

Once you read the docs for phyper properly (it _is_ tricky to get it
right), you arrive at

 phyper(16,17+181,449+19551,17+449, lower.tail=FALSE)
[1] 3.693347e-06


-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] merge function in R?

2010-08-13 Thread fishkbob

So I have a bunch of c(start,end) points and want to consolidate them into as
few c(start,end) as possible.

For example:
sample   startend
A  5   10
B  7   18
C  14
D  16  20

I'd want the function to return the two distinct sets (1,4) and (5,20)

Is there an R function that already does this?
or should I write my own? (how would I go about that?)
-- 
View this message in context: 
http://r.789695.n4.nabble.com/merge-function-in-R-tp2324684p2324684.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to eliminate an element of a list

2010-08-13 Thread fishkbob

Try to do this

 list - seq(1,5,1) 
 list-list[-3]
 list
[1] 1 2 4 5

list[-x] will remove the xth value in your list. Also you can do something
like

 list[-c(1:4)]
[1] 5
To remove values at indexes 1-4

 list[-c(1,4)]
[1] 2 3 5
To remove values at indexes 1 and 4
-- 
View this message in context: 
http://r.789695.n4.nabble.com/how-to-eliminate-an-element-of-a-list-tp2323329p2324691.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Learning ANOVA

2010-08-13 Thread Stephen Liu
- Original Message 
From: Liaw, Andy andy_l...@merck.com
To: Stephen Liu sati...@yahoo.com; r-help@r-project.org
Sent: Sat, August 14, 2010 12:48:33 AM
Subject: RE: [R]  Learning ANOVA

Note the names of the variables here.  They don't match what you tried
to use in your boxplot() call below.  Where did you get the idea that
there are DO and Stream in the test01 data frame?


Hi Andy,


I followed following video to proceed;

ANOVA in R 
http://www.youtube.com/watch?v=Dwd3ha0P8uwfeature=related


I fixed it with;

 boxplot(test01$count ~ test01$spray)


Continued:

 InsectSprays.aov -(test01$count ~ test01$spray)
 summary(InsectSprays.aov)
Length   ClassMode 
  3 formulacall 

Seems having problem here.


 TukeyHSD(InsectSprays.aov)
Error in UseMethod(TukeyHSD) : 
  no applicable method for 'TukeyHSD' applied to an object of class formula

I was still stuck here.


B.R.
Stephen L




- Original Message 
From: Liaw, Andy andy_l...@merck.com
To: Stephen Liu sati...@yahoo.com; r-help@r-project.org
Sent: Sat, August 14, 2010 12:48:33 AM
Subject: RE: [R]  Learning ANOVA

From: Stephen Liu
 
 Hi folks,
 
 R on Ubuntu 10.04 64 bit.
 
 Performed following steps on R:-
 
 ### to access to the object
  data(InsectSprays)
 
 ### create a .csv file
  write.csv(InsectSprays, InsectSpraysCopy.csv)
 
 
 On another terminal
 $ sudo updatedb
 $ locate InsectSpraysCopy.csv
 /home/userA/InsectSpraysCopy.csv
 
 
 ### Read in some data
  test01 - read.csv(file.choose(), header=TRUE)
 
 Enter file name: /home/userA/InsectSpraysCopy.csv
 
 
 ### Look at the data
  test01
 X count spray
 1   110 A
[snipped]

Note the names of the variables here.  They don't match what you tried
to use in your boxplot() call below.  Where did you get the idea that
there are DO and Stream in the test01 data frame?

Andy


 ### Create a side-by-side boxplot of the data
 boxplot(test01$DO ~ test01$Stream)
 Error in model.frame.default(formula = test01$DO ~ test01$Stream) : 
   invalid type (NULL) for variable 'test01$DO'
 
 
 I was stucked here.  Pls help.  TIA
 
 
 B.R.
 Stephen L
 
 
 
 - Original Message 
 From: Stephen Liu sati...@yahoo.com
 To: r-help@r-project.org
 Sent: Fri, August 13, 2010 11:34:31 AM
 Subject: [R] Learning ANOVA
 
 Hi folks,
 
 
 File to be used is on;
 data(InsectSprays)
 
 
 I can't figure out where to insert it on following command;
 test01 - read.csv(fil.choose(), header=TRUE)
 
 Please help.  TIA
 
 B.R.
 
 
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
Notice:  This e-mail message, together with any attachments, contains
information of Merck  Co., Inc. (One Merck Drive, Whitehouse Station,
New Jersey, USA 08889), and/or its affiliates Direct contact information
for affiliates is available at 
http://www.merck.com/contact/contacts.html) that may be confidential,
proprietary copyrighted and/or legally privileged. It is intended solely
for the use of the individual or entity named on this message. If you are
not the intended recipient, and have received this message in error,
please notify us immediately by reply e-mail and then delete it from 
your system.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Polygon Graph in lattice/ lattice extra

2010-08-13 Thread Felix Andrews
OK, I've added a 'horizontal' argument to panel.xyarea(), which is
consistent with the meaning in panel.xyplot(). It is available from
R-forge via SVN now, or as a built package within a day or 2 probably.

Questions like this are best sent to package maintainers directly, I think.

Regards
# Felix

On 13 August 2010 22:59, Sebastian wurster.sebast...@googlemail.com wrote:

 Hi, I'd like to draw a polygon graph. I used the package lattice extra which
 includes a function for that (see:
 http://latticeextra.r-forge.r-project.org/#panel.xyareatheme=default). But
 i don't want the polygon ending with its filled border at the x-axis (like
 in my code). Instead it should be rotated about 90 degrees and end with its
 filled border at the y-axis.
 I'm pleased about any suggestions or ideas. Thanks, Sebastian
 Here's the code:

 library(latticeExtra)

 data -
 as.data.frame(cbind(c(1,2,3,4,1,2,3,4),c(1,1,1,1,2,2,2,2),c(2,3.8,3,3.5,2.2,4,3.2,3.5)))

 x - data$V1
 y - data$V3
 groups - data$V2

 xyplot(y ~ x, data, groups = groups,
       type='l',
       par.settings = simpleTheme(col = c(grey,
 rgb(166,27,30,maxColorValue = 255)), lwd = c(5,2)),
       superpose = TRUE,
       panel = panel.superpose,
       panel.groups = function(..., group.number) {
          if (group.number == 1)
          panel.xyarea(...) else panel.xyplot(...)
          panel.grid(h=FALSE, v=-1, col.line=grey)
       },
       auto.key=list(columns=2, space=bottom, cex=0.8, size=1.3, adj=1,
                     between=0.2, between.colums=0.1, points = FALSE,
 rectangles = TRUE))
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Polygon-Graph-in-lattice-lattice-extra-tp2324156p2324156.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Felix Andrews / 安福立
http://www.neurofractal.org/felix/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] detecting a key press

2010-08-13 Thread Martin Teicher
Hi Folks,

I'm relatively new to r.  I'd like to have a user respond by pressing a 1 or a 
2 and determining their choice and the time of response.  Previous postings 
have indicated that keyboard responses can be processed using scan and readline 
but both seem to wait for the user to also press return.  Is there a way to 
detect the initial key press without requiring them to hit return?  

Thank you so much.

Martin H. Teicher
Department of Psychiatry
McLean Hospital / Harvard Medical School
Belmont, MA USA
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] different outcomes of P values in SPSS and R

2010-08-13 Thread Ben Bolker
Leo Vorthoren L.Vorthoren at nioo.knaw.nl writes:

 I have been using generalized linear models in SPSS 18, in order to build
 models and to calculate the P values. When I was building models in Excel
 (using the intercept and Bs from SPSS), I noticed that the graphs differed
 from my expectations. When I ran the dataset again in R, I got totally
 different outcomes for both the P values as well as the Bs and the
 intercepts. The outcomes of R seem much more likely to be the correct ones,
 but I really cannot explain the differences.

  I appreciate/assume that you're asking on the off chance that someone
else has tried something very similar and gone to the trouble of figuring
out the differences between R's and SPSS's default setup, but you're
unlikely to get an answer without more detailed information.

  My best guess is that SPSS and R are using different contrasts
and/or different baseline levels.  R uses treatment contrasts by default,
and assumes that the first (alphabetical) level of a factor is the
baseline level.

  It's conceivable that you have a dataset where the results are
numerically unstable and sensitive to small details in the algorithms
used.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] what is scaling in plot

2010-08-13 Thread elaine kuo
Dear List,

Pls kindly advise what scaling is in plot.
Sometime it could be negative but sometimes it might be positive
.(I guess it is the proportion between the plot and the margin)

Thanks

Elaine

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] discerning species by color in cca biplot

2010-08-13 Thread elaine kuo
Dear List,

I am running constrained correspondence analysis for abundance data of 7
birds.

However, I would like to check which bird prefers which environment gradient
by showing the species with different colors of the dots in cca plot
(package vegan).

Please kindly help and thank you

Elaine

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] detecting a key press

2010-08-13 Thread Ben Bolker
Martin Teicher martin_teicher at hms.harvard.edu writes:

 I'm relatively new to r.  I'd like to have a user respond by 
 pressing a 1 or a 2 and determining their choice and
 the time of response.  Previous postings have indicated that
 keyboard responses can be processed using
 scan and readline but both seem to wait for the user to also 
 press return.  Is there a way to detect the initial
 key press without requiring them to hit return?  

  Probably not -- probably depends a lot on platform and
interface too (i.e. in a command-line interface, interaction
with the keyboard is likely to be handled by the OS rather
than by R itself).

  If you're on Windows you could look at ?getGraphicsEvent
(which will see keyboard responses, not from the console, but
from the graphics window).

  Alternatively, you might be able to put something together
with the tcltk package ...

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] what is scaling in plot

2010-08-13 Thread Ben Bolker
elaine kuo elaine.kuo.tw at gmail.com writes:

 Pls kindly advise what scaling is in plot.
 Sometime it could be negative but sometimes it might be positive
 .(I guess it is the proportion between the plot and the margin)

  Your question is unclear. Please give more context and/or details.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] different outcomes of P values in SPSS and R

2010-08-13 Thread Martin Teicher
R usesType I sequential SS, not the default Type III marginal SS reported by 
SPSS.  There is a good blog post explaining this difference along with some 
interesting comments -- 
http://myowelt.blogspot.com/2008/05/obtaining-same-anova-results-in-r-as-in.html

Best Wishes,

Martin H. Teicher
Dept of Psychiatry
McLean Hospital / Harvard Medical School
Belmont MA 02478


On Aug 13, 2010, at 10:32 PM, Ben Bolker wrote:

 Leo Vorthoren L.Vorthoren at nioo.knaw.nl writes:
 
 I have been using generalized linear models in SPSS 18, in order to build
 models and to calculate the P values. When I was building models in Excel
 (using the intercept and Bs from SPSS), I noticed that the graphs differed
 from my expectations. When I ran the dataset again in R, I got totally
 different outcomes for both the P values as well as the Bs and the
 intercepts. The outcomes of R seem much more likely to be the correct ones,
 but I really cannot explain the differences.
 
  I appreciate/assume that you're asking on the off chance that someone
 else has tried something very similar and gone to the trouble of figuring
 out the differences between R's and SPSS's default setup, but you're
 unlikely to get an answer without more detailed information.
 
  My best guess is that SPSS and R are using different contrasts
 and/or different baseline levels.  R uses treatment contrasts by default,
 and assumes that the first (alphabetical) level of a factor is the
 baseline level.
 
  It's conceivable that you have a dataset where the results are
 numerically unstable and sensitive to small details in the algorithms
 used.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] different outcomes of P values in SPSS and R

2010-08-13 Thread Ben Bolker
  Yes, but ... the original poster said the coefficients differed too.
  (The blog post
you refer to deals with ANOVA (i.e. linear models) rather than GLMs (generalized
linear models): it is true that the sequential/marginal
distinction still applies, but I don't think that can be the *only*
thing going on here.)


On Fri, Aug 13, 2010 at 10:50 PM, Martin Teicher
martin_teic...@hms.harvard.edu wrote:
 R usesType I sequential SS, not the default Type III marginal SS reported by 
 SPSS.  There is a good blog post explaining this difference along with some 
 interesting comments -- 
 http://myowelt.blogspot.com/2008/05/obtaining-same-anova-results-in-r-as-in.html

 Best Wishes,

 Martin H. Teicher
 Dept of Psychiatry
 McLean Hospital / Harvard Medical School
 Belmont MA 02478


 On Aug 13, 2010, at 10:32 PM, Ben Bolker wrote:

 Leo Vorthoren L.Vorthoren at nioo.knaw.nl writes:

 I have been using generalized linear models in SPSS 18, in order to build
 models and to calculate the P values. When I was building models in Excel
 (using the intercept and Bs from SPSS), I noticed that the graphs differed
 from my expectations. When I ran the dataset again in R, I got totally
 different outcomes for both the P values as well as the Bs and the
 intercepts. The outcomes of R seem much more likely to be the correct ones,
 but I really cannot explain the differences.

  I appreciate/assume that you're asking on the off chance that someone
 else has tried something very similar and gone to the trouble of figuring
 out the differences between R's and SPSS's default setup, but you're
 unlikely to get an answer without more detailed information.

  My best guess is that SPSS and R are using different contrasts
 and/or different baseline levels.  R uses treatment contrasts by default,
 and assumes that the first (alphabetical) level of a factor is the
 baseline level.

  It's conceivable that you have a dataset where the results are
 numerically unstable and sensitive to small details in the algorithms
 used.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Learning ANOVA

2010-08-13 Thread Stephen Liu
Hi JesperHybel,

Thanks for your advice.

If you're trying to follow the youtube video you have a typing mistake here:

InsectSprays.aov -(test01$count ~ test01$spray)

I think this should be:

InsectSprays.aov -aov(test01$count ~ test01$spray)


Your advice works for me.  Sorry I missed aov before(test01$count ~ 
test01$spray)

 InsectSprays.aov - aov(test01$count ~ test01$spray)
 summary(InsectSprays.aov)
 Df Sum Sq Mean Sq F valuePr(F)
test01$spray  5 2668.8  533.77  34.702  2.2e-16 ***
Residuals66 1015.2   15.38  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 


 TukeyHSD(InsectSprays.aov)
  Tukey multiple comparisons of means
95% family-wise confidence level

Fit: aov(formula = test01$count ~ test01$spray)

$`test01$spray`
   difflwr   upr p adj
B-A   0.833  -3.866075  5.532742 0.9951810
C-A -12.417 -17.116075 -7.717258 0.000
D-A  -9.583 -14.282742 -4.883925 0.014
E-A -11.000 -15.699409 -6.300591 0.000
F-A   2.167  -2.532742  6.866075 0.7542147
C-B -13.250 -17.949409 -8.550591 0.000
D-B -10.417 -15.116075 -5.717258 0.002
E-B -11.833 -16.532742 -7.133925 0.000
F-B   1.333  -3.366075  6.032742 0.9603075
D-C   2.833  -1.866075  7.532742 0.4920707
E-C   1.417  -3.282742  6.116075 0.9488669
F-C  14.583   9.883925 19.282742 0.000
E-D  -1.417  -6.116075  3.282742 0.9488669
F-D  11.750   7.050591 16.449409 0.000
F-E  13.167   8.467258 17.866075 0.000


I made a comparison of my result with example(InsectSprays).  They looks the 
same.

I also compared plot(InsectSprays.aov).


A further question how to plot 4 graphs simultaneously?  Instead of reading 
them, individually.  I read ?plot but unable to resolve.

Also how to save InsectSprays.aov?  I think I can only save it as 
InsectSprays.csv.  I can't find write.aov command. 


Thanks


TIA

B.R.
satimis






- Original Message 
From: JesperHybel jesperhy...@hotmail.com
To: r-help@r-project.org
Sent: Sat, August 14, 2010 2:09:48 AM
Subject: Re: [R] Learning ANOVA


If you're trying to follow the youtube video you have a typing mistake here:

InsectSprays.aov -(test01$count ~ test01$spray)

I think this should be:

InsectSprays.aov -aov(test01$count ~ test01$spray)


youre missing the functioncall aov on the right hand side of the
assignment operator '-'.

the results of the application of function aov() is stored in
InsectSprays.aov and is accessed through

summary(InsectSprays.aov)

since you missed the functioncall you cannot apply TukeyHSD() to
InsectSprays.aov

I think
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Learning-ANOVA-tp2323660p2324590.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to add lines to lattice plot produced by rms::bplot

2010-08-13 Thread Duncan Mackay

Hi David

I do not know if you have done something like this.
I tried str(bp.plot) which gave the section about the regions (for colours) as:
 $ panel.args.common:List of 8
  ..$ x  : num [1:2500] 27 28 29 29.9 30.9 ...
  ..$ y  : num [1:2500] 141 141 141 141 141 ...
  ..$ z  : num [1:2500] -1.43 -1.41 -1.39 -1.36 -1.34 ...
  ..$ at : num [1:10] -2 -1.5 -1 -0.5 0 0.5 1 1.5 2 2.5
  ..$ region : logi FALSE
  ..$ zlab   :List of 3
  .. ..$ label: chr log odds
  .. ..$ rot  : num 90
  .. ..$ cex  : num 1
  ..$ labels : logi TRUE
  ..$ contour: logi TRUE

I added the col.region and colours from a plot levelplot that I had done to 
see what would occur to the trellis parameters. No colours were produced 
when plotted.
bp.plot - bplot(p, lfun=contourplot, color.key = TRUE, col.regions = 
c(#FF,#00,#A9E2FF,#8080FF,#FF,#FFD18F,#FF) )


$ panel.args.common:List of 10
  ..$ x  : num [1:2500] 27 28 29 29.9 30.9 ...
  ..$ y  : num [1:2500] 141 141 141 141 141 ...
  ..$ z  : num [1:2500] -1.43 -1.41 -1.39 -1.36 -1.34 ...
  ..$ at : num [1:10] -2 -1.5 -1 -0.5 0 0.5 1 1.5 2 2.5
  ..$ region : logi FALSE
  ..$ color.key  : logi TRUE
  ..$ zlab   :List of 3
  .. ..$ label: chr log odds
  .. ..$ rot  : num 90
  .. ..$ cex  : num 1
  ..$ labels : logi TRUE
  ..$ contour: logi TRUE
  ..$ col.regions: chr [1:7] #FF #00 #A9E2FF #8080FF ...

So it has been added to the panel.args.common, whether you can access these 
are another matter.


I then tried
bp.plot - bplot(p, lfun=contourplot, par.settings = list(axis.text = 
list(cex = 0.65)), color.key = TRUE, col.regions = 
c(#FF,#00,#A9E2FF,#8080FF,#FF,#FFD18F,#FF) )


which changed the size of the axis text so it may be the case of having to 
add the col.regions etc to the appropriate list in par.settings


I'll leave you to amend and access the colours.
You may have to add values for the wireframe/levelplot arguments like 
at  etc. and col.regions (I think that is the function) to produce an 
appropriate colour range of your choice


It is a while since I have delved into these sorts of plots.
Need some sustenance.

Regards

Duncan

Duncan Mackay
Department of Agronomy and Soil Science
University of New England
ARMIDALE NSW 2351
Email home: mac...@northnet.com.au



At 10:33 14/08/2010, you wrote:


I have a plot produced by function bplot (package = rms) that is
really a lattice plot (class=trellis). It is similar to this plot
produced by a very minor modification of the first example on the
bplot help page:

require(rms)
n - 1000# define sample size
set.seed(17) # so can reproduce the results
age- rnorm(n, 50, 10)
blood.pressure - rnorm(n, 120, 15)
cholesterol- rnorm(n, 200, 25)
sex- factor(sample(c('female','male'), n,TRUE))
label(age)- 'Age'  # label is in Hmisc
label(cholesterol)- 'Total Cholesterol'
label(blood.pressure) - 'Systolic Blood Pressure'
label(sex)- 'Sex'
units(cholesterol)- 'mg/dl'   # uses units.default in Hmisc
units(blood.pressure) - 'mmHg'

# Specify population model for log odds that Y=1
L - .4*(sex=='male') + .045*(age-50) +
  (log(cholesterol - 10)-5.2)*(-2*(sex=='female') + 2*(sex=='male'))
# Simulate binary y to have Prob(y=1) = 1/[1+exp(-L)]
y - ifelse(runif(n)  plogis(L), 1, 0)

ddist - datadist(age, blood.pressure, cholesterol, sex)
options(datadist='ddist')

fit - lrm(y ~ blood.pressure + sex * (age + rcs(cholesterol,4)),
   x=TRUE, y=TRUE)
p - Predict(fit, age, cholesterol, sex='male', np=50) # vary sex last
bp.plot - bplot(p, lfun=contourplot)
bp.plot

I have tried a variety of efforts at using update (which I assume is a
lattice function although I can find no help page for it. It does
appear in some of the lattice hep pages and my understanding is that
it pushes objects onto the list structure of a plot object. I've also
tried adding to it with llines()
#-
Oh, never mind. I recovered a memory that I had seen a solution on
rhelp and had saved it. Turns out it was from Peter Ehlers, to whom I
offer thanks. I was trying to add a step function: ht and weight from
a dataframe, bld:

trellis.focus(panel, 1, 1)
   panel.lines(x=bld$inches, y=bld$BMI28, type='s')
 trellis.unfocus()

Success!

Now... how do I control the color levels in levelplot or contourplot???


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] different outcomes of P values in SPSS and R

2010-08-13 Thread Martin Teicher
What your saying is true.  The sequential/marginal difference can account for 
the discrepancy in p values but not necessarily the coefficients.  One thing 
I've found that can lead to differences in coefficients and p values between R 
and SPSS is whether or not you specify that a variable is a factor (e.g., 
group2 - factor(group) ).  

Marty Teicher

On Aug 13, 2010, at 10:58 PM, Ben Bolker wrote:

  Yes, but ... the original poster said the coefficients differed too.
  (The blog post
 you refer to deals with ANOVA (i.e. linear models) rather than GLMs 
 (generalized
 linear models): it is true that the sequential/marginal
 distinction still applies, but I don't think that can be the *only*
 thing going on here.)
 
 
 On Fri, Aug 13, 2010 at 10:50 PM, Martin Teicher
 martin_teic...@hms.harvard.edu wrote:
 R usesType I sequential SS, not the default Type III marginal SS reported by 
 SPSS.  There is a good blog post explaining this difference along with some 
 interesting comments -- 
 http://myowelt.blogspot.com/2008/05/obtaining-same-anova-results-in-r-as-in.html
 
 Best Wishes,
 
 Martin H. Teicher
 Dept of Psychiatry
 McLean Hospital / Harvard Medical School
 Belmont MA 02478
 
 
 On Aug 13, 2010, at 10:32 PM, Ben Bolker wrote:
 
 Leo Vorthoren L.Vorthoren at nioo.knaw.nl writes:
 
 I have been using generalized linear models in SPSS 18, in order to build
 models and to calculate the P values. When I was building models in Excel
 (using the intercept and Bs from SPSS), I noticed that the graphs differed
 from my expectations. When I ran the dataset again in R, I got totally
 different outcomes for both the P values as well as the Bs and the
 intercepts. The outcomes of R seem much more likely to be the correct ones,
 but I really cannot explain the differences.
 
  I appreciate/assume that you're asking on the off chance that someone
 else has tried something very similar and gone to the trouble of figuring
 out the differences between R's and SPSS's default setup, but you're
 unlikely to get an answer without more detailed information.
 
  My best guess is that SPSS and R are using different contrasts
 and/or different baseline levels.  R uses treatment contrasts by default,
 and assumes that the first (alphabetical) level of a factor is the
 baseline level.
 
  It's conceivable that you have a dataset where the results are
 numerically unstable and sensitive to small details in the algorithms
 used.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to eliminate an element of a list

2010-08-13 Thread David Winsemius


On Aug 13, 2010, at 4:06 PM, fishkbob wrote:




list-seq(2,10,2)
list

[1]  2  4  6  8 10

list[-which(2==list)]

[1]  4  6  8 10

using the which() will let you remove things from a list that have a
specified value... I usually use the

blah- blah[-which(TRUE==is.na(blah)) ]

which will remove all NA values in your list


Although which() is useful when used with [ because it avoids the NA  
indexing problem,  it is entirely superfluous here.


 blah- blah[-is.na(blah) ]   # will be more efficient





--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to add lines to lattice plot produced by rms::bplot

2010-08-13 Thread David Winsemius


On Aug 13, 2010, at 11:25 PM, Duncan Mackay wrote:


Hi David

I do not know if you have done something like this.


I had tried a few efforts like that, starting with an examination of  
str(bp.plot) as you demonstrate.


I tried str(bp.plot) which gave the section about the regions (for  
colours) as:

$ panel.args.common:List of 8
 ..$ x  : num [1:2500] 27 28 29 29.9 30.9 ...
 ..$ y  : num [1:2500] 141 141 141 141 141 ...
 ..$ z  : num [1:2500] -1.43 -1.41 -1.39 -1.36 -1.34 ...
 ..$ at : num [1:10] -2 -1.5 -1 -0.5 0 0.5 1 1.5 2 2.5
 ..$ region : logi FALSE
 ..$ zlab   :List of 3
 .. ..$ label: chr log odds
 .. ..$ rot  : num 90
 .. ..$ cex  : num 1
 ..$ labels : logi TRUE
 ..$ contour: logi TRUE


I tried (with a bplot object named bldLT40):

bldLT40$legend$right$args$key$at - c(-0.233, seq(.50, 2.25, by=0.25),  
seq(2.5, 5.0, by=0.5), 6:10)


... and then tried

bldLT40$panel.args$at - c(-0.233, seq(.50, 2.25, by=0.25), seq(2.5,  
5.0, by=0.5), 6:10)


Neither of these efforts changed the boundaries beteen colors in the  
plot area. The first effort changed the legend scal,e but that just  
created a misalignment of the colors of plot area and the legend. I  
would be interested in either a strategy that lets one alter the color  
level changes of the z variable (which in bplot-created objects is  
zhat, or lets one specify the values at which contour lines are  
drawn in contourplot.


Thanks for your efforts.

--
David.




I added the col.region and colours from a plot levelplot that I had  
done to see what would occur to the trellis parameters. No colours  
were produced when plotted.
bp.plot - bplot(p, lfun=contourplot, color.key = TRUE, col.regions  
=  
c 
(#FF 
,#00,#A9E2FF,#8080FF,#FF,#FFD18F,#FF) )


$ panel.args.common:List of 10
 ..$ x  : num [1:2500] 27 28 29 29.9 30.9 ...
 ..$ y  : num [1:2500] 141 141 141 141 141 ...
 ..$ z  : num [1:2500] -1.43 -1.41 -1.39 -1.36 -1.34 ...
 ..$ at : num [1:10] -2 -1.5 -1 -0.5 0 0.5 1 1.5 2 2.5
 ..$ region : logi FALSE
 ..$ color.key  : logi TRUE
 ..$ zlab   :List of 3
 .. ..$ label: chr log odds
 .. ..$ rot  : num 90
 .. ..$ cex  : num 1
 ..$ labels : logi TRUE
 ..$ contour: logi TRUE
 ..$ col.regions: chr [1:7] #FF #00 #A9E2FF  
#8080FF ...


So it has been added to the panel.args.common, whether you can  
access these are another matter.


I then tried
bp.plot - bplot(p, lfun=contourplot, par.settings = list(axis.text  
= list(cex = 0.65)), color.key = TRUE, col.regions =  
c 
(#FF 
,#00,#A9E2FF,#8080FF,#FF,#FFD18F,#FF) )


which changed the size of the axis text so it may be the case of  
having to add the col.regions etc to the appropriate list in  
par.settings


I'll leave you to amend and access the colours.
You may have to add values for the wireframe/levelplot arguments  
like at  etc. and col.regions (I think that is the function) to  
produce an appropriate colour range of your choice


It is a while since I have delved into these sorts of plots.
Need some sustenance.

Regards

Duncan

Duncan Mackay
Department of Agronomy and Soil Science
University of New England
ARMIDALE NSW 2351
Email home: mac...@northnet.com.au



At 10:33 14/08/2010, you wrote:


I have a plot produced by function bplot (package = rms) that is
really a lattice plot (class=trellis). It is similar to this plot
produced by a very minor modification of the first example on the
bplot help page:

require(rms)
n - 1000# define sample size
set.seed(17) # so can reproduce the results
age- rnorm(n, 50, 10)
blood.pressure - rnorm(n, 120, 15)
cholesterol- rnorm(n, 200, 25)
sex- factor(sample(c('female','male'), n,TRUE))
label(age)- 'Age'  # label is in Hmisc
label(cholesterol)- 'Total Cholesterol'
label(blood.pressure) - 'Systolic Blood Pressure'
label(sex)- 'Sex'
units(cholesterol)- 'mg/dl'   # uses units.default in Hmisc
units(blood.pressure) - 'mmHg'

# Specify population model for log odds that Y=1
L - .4*(sex=='male') + .045*(age-50) +
 (log(cholesterol - 10)-5.2)*(-2*(sex=='female') + 2*(sex=='male'))
# Simulate binary y to have Prob(y=1) = 1/[1+exp(-L)]
y - ifelse(runif(n)  plogis(L), 1, 0)

ddist - datadist(age, blood.pressure, cholesterol, sex)
options(datadist='ddist')

fit - lrm(y ~ blood.pressure + sex * (age + rcs(cholesterol,4)),
  x=TRUE, y=TRUE)
p - Predict(fit, age, cholesterol, sex='male', np=50) # vary sex  
last

bp.plot - bplot(p, lfun=contourplot)
bp.plot

I have tried a variety of efforts at using update (which I assume  
is a

lattice function although I can find no help page for it. It does
appear in some of the lattice hep pages and my understanding is that
it pushes objects onto the list structure of a plot object. I've also
tried adding to it with llines()
#-
Oh, never mind. I recovered a memory that I had seen a solution on
rhelp and had saved it. Turns out 

Re: [R] R execution

2010-08-13 Thread David Winsemius


On Aug 13, 2010, at 6:57 PM, chantalm wrote:



So I have a for loop

for (i in 1 : 5) {
print(i)
}

and I want it to print 1, 2, 3, 4, 5 sequentially for each loop  
iteration.
Instead, it waits for the whole loop to finish and then prints. It's  
strange
because my R console seems to have two modes of execution. One where  
it will
iterate for each loop and print the message (and where I can easily  
abort an
R program with the escape key), and another mode where it doesn't  
print for
each loop iteration/just keeps executing even if I press escape key.  
I can't
figure out how to enable the first way of execution! Can someone  
shed some

light on this matter?


This depends (if I remember correctly) on OS specifics which you have  
not provided. Aren't they teaching you to read the manuals (aka  
Posting Guide) at U-dub these days?


(perhaps search on :

??flush console  # ( again, IIRC)

--

David Winsemius, MD (UW '87)
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Learning ANOVA

2010-08-13 Thread Stephen Liu
Hi Chris,


Thanks for your advice which works for me here.

I'm reading;
An Introduction to R
http://h1.mg4.mail.yahoo.com/dc/launch?sysreq=ignore

NOT finish yet.  Also I found many tutorials on Internet.  I'll take me time 
going through all of them.

I'm interesting reading tutorial with demo.  Otherwise it won't be easy for a 
beginning remembering all commands and steps on manual.

B.R.
Stephen L






- Original Message 
From: Christopher W Ryan cr...@binghamton.edu
To: R-help@r-project.org
Sent: Sat, August 14, 2010 1:51:34 AM
Subject: Re: [R] Learning ANOVA

Read documentation for TukeyHSD by typing the command:

?TukeyHSD

The input to that function should usually be, a fitted model object,
usually an aov fit.

You have not created a fitted model object.

This seems to work:
model - aov(InsectSprays$count ~ InsectSprays$spray)
TukeyHSD(model)


The aov() makes the aov model object. Tacking .aov onto an object name
doesn't do that.

Which introductory R book(s) have you read? That would be a worthwhile
investment of your time.

--Chris Ryan

On Fri, Aug 13, 2010 at 12:43 PM, Stephen Liu sati...@yahoo.com wrote:

 Hi Erik,


 I followed following video as example;

 ANOVA in R
 http://www.youtube.com/watch?v=Dwd3ha0P8uwfeature=related


 Now I got it done;

  boxplot(test01$count ~ test01$spray)


 Continued:

  InsectSprays.aov -(test01$count ~ test01$spray)
  summary(InsectSprays.aov)
  Length   ClassMode
  3 formulacall

 Seems having problem here.


  TukeyHSD(InsectSprays.aov)
 Error in UseMethod(TukeyHSD) :
  no applicable method for 'TukeyHSD' applied to an object of class
 formula

 I'm still stuck here.


 B.R.
 Stephen L




 - Original Message 
  From: Erik Iverson er...@ccbr.umn.edu
 To: Stephen Liu sati...@yahoo.com
 Cc: r-help@r-project.org
 Sent: Sat, August 14, 2010 12:15:31 AM
 Subject: Re: [R]  Learning ANOVA



  Performed following steps on R:-
 
  ### to access to the object
  data(InsectSprays)
 
  ### create a .csv file
  write.csv(InsectSprays, InsectSpraysCopy.csv)
 
 
  On another terminal
  $ sudo updatedb
  $ locate InsectSpraysCopy.csv
  /home/userA/InsectSpraysCopy.csv
 
 
  ### Read in some data
  test01 - read.csv(file.choose(), header=TRUE)
 
  Enter file name: /home/userA/InsectSpraysCopy.csv

 I either don't understand what you're doing, or you seem
 very confused.

 R comes with many sample data sets for you to use.

 You can see a list of them using the ?data function.

 Calling data with an argument loads that dataset.

 So, when you type:

  data(InsectSprays)

 that data object is now available in R, see

  objects()

 You can look at it simply by printing it:

  InsectSprays

 If for some reason it makes sense to do it this way for your
 use case, then that's fine, I just want to make sure you understand
 that you don't have to if accessing built-in datasets is all you want.

  ### Look at the data
  test01
  X count spray
  1   110 A
  2   2 7 A
  3   320 A

 snip

  ### Create a side-by-side boxplot of the data
  boxplot(test01$DO ~ test01$Stream)
  Error in model.frame.default(formula = test01$DO ~ test01$Stream) :
 invalid
 type (NULL) for variable 'test01$DO'

 Why do you think test01 has an element called D0 or Stream?? The
 column names when you print the data tell you otherwise!



 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html

 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] merge function in R?

2010-08-13 Thread Charles C. Berry

On Fri, 13 Aug 2010, fishkbob wrote:



So I have a bunch of c(start,end) points and want to consolidate them into as
few c(start,end) as possible.

For example:
sample   startend
A  5   10
B  7   18
C  14
D  16  20

I'd want the function to return the two distinct sets (1,4) and (5,20)

Is there an R function that already does this?


Yes.

See the reduce() function in the IRanges package on BioConductor

See pages 11-12 of

http://www.bioconductor.org/packages/2.6/bioc/vignettes/IRanges/inst/doc/IRangesOverview.pdf


HTH,

Chuck



or should I write my own? (how would I go about that?)
--
View this message in context: 
http://r.789695.n4.nabble.com/merge-function-in-R-tp2324684p2324684.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


<    1   2