Re: [R] Import more than one sheet in a single excel file

2009-08-11 Thread Keo Ormsby

rajclinasia wrote:

Hi everyone,
My question is how to import more than one sheet from a single Excel file
(e.g. 10 sheets in one Excel file) into R and create a dataset for each
sheet without specifying the sheet names.

Thank you in advance.

Hello,
One way is to use the read.xls() function from the gdata package. You will
need to have Perl installed (one easy choice is
http://www.activestate.com/activeperl), and you can only read Excel 2003
or earlier (.xls) files, not Excel 2007 files. After running this code,
you can merge the resulting list of data frames with the merge() function,
which combines two data frames at a time. Be sure to have a column in each
sheet of the .xls file that identifies the subject/observation, with the
same name in every sheet [see ?merge]. Then you can use the write.table()
or write.csv() functions to export the resulting data frame to a
single-sheet file that Excel can read.


library(gdata)        # provides read.xls()

xls <- file.choose()  # choose the .xls file to convert
prl <- file.choose()  # choose the perl.exe location, e.g. "C:/Perl/bin/perl.exe"

# read the first 10 sheets into a list of data frames
sheet <- list()
for (i in 1:10) {
  sheet[[i]] <- read.xls(xls, sheet = i, perl = prl)
}
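
As a rough illustration of the merge step, the sketch below combines a list of data frames on a shared key column with Reduce() and merge(), then writes one CSV file. The three small data frames, the "id" column name, and the temporary output file are made up for the example; they stand in for the sheets read with read.xls().

```r
# Made-up stand-ins for the data frames read from each sheet;
# every one carries the same key column, here called "id"
sheets <- list(
  data.frame(id = c(1, 2, 3), height = c(170, 165, 180)),
  data.frame(id = c(1, 2, 3), weight = c(70, 60, 85)),
  data.frame(id = c(2, 3, 4), age = c(31, 45, 28))
)

# merge() joins two data frames at a time, so fold it over the list
merged <- Reduce(function(a, b) merge(a, b, by = "id"), sheets)

# Write the result to a single file that Excel can open
out_file <- tempfile(fileext = ".csv")
write.csv(merged, file = out_file, row.names = FALSE)
```

By default merge() keeps only the keys present in every data frame; passing all = TRUE inside the helper function would keep unmatched rows and fill the gaps with NA.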

If you plan to use Excel and R together a lot, I sincerely recommend RExcel,
from http://rcom.univie.ac.at/download.html; it is a great package.


Keo.
Mexico City.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] psi not functioning in nlrob?

2009-08-12 Thread Keo Ormsby

You have to install and load the MASS package first (library(MASS));
psi.huber and psi.bisquare are defined there.
Hope this does the trick.
Best,
Keo.
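
A minimal sketch of what loading MASS changes, assuming MASS is installed (it ships with standard R distributions). The input values are made up; the point is only that the two psi functions become visible and return weights.

```r
# psi.huber and psi.bisquare are defined in MASS, so attach it first
library(MASS)

# Both names now resolve to functions that can be passed as psi=
found_huber <- is.function(psi.huber)
found_bisquare <- is.function(psi.bisquare)

# Each returns robustness weights for scaled residuals;
# the values in u are made up for illustration
u <- c(0, 1, 5)
w_huber <- psi.huber(u)
w_bisquare <- psi.bisquare(u)
print(rbind(u, w_huber, w_bisquare))
```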

Xiao Xiao wrote:

Hi all,

I'm trying to fit a nonlinear regression by "nlrob":

 model3=nlrob(y~a1*x^a2,data=transient,psi=psi.bisquare,
start=list(a1=0.02,a2=0.7),maxit=1000)

However an error message keeps popping up saying that the function
psi.bisquare doesn't exist.

I also tried psi.huber, which is supposed to be the default for nlrob:

model3=nlrob(y~a1*x^a2,data=transient,psi=psi.huber,
start=list(a1=0.02,a2=0.7),maxit=1000)

But I still got the same error message - psi.huber doesn't exist.

Is the argument "psi" not available in nlrob?

Any help will be appreciated.

Best,
Xiao Xiao


Re: [R] psi not functioning in nlrob?

2009-08-13 Thread Keo Ormsby

I hadn't checked this; I had only used the default.
The problem stems from the weights being reassigned during the nls
iterations: the psi.bisquare function can return weights of exactly 0,
which then produce NaN values in the resid vector, because it is
calculated by dividing the residuals by the weights.

Since the default na.action value is na.fail, this causes nls to crash.
You can bypass this problem with
model3=nlrob(y~a1*x^a2,data=transient,psi=psi.bisquare, 
start=list(a1=0.02,a2=0.7), maxit=1000, na.action = na.omit)
This gives a lot of warning messages about dividing vectors of different
lengths in -residuals(out)/sqrt(w), but it will finish and calculate the
parameters.
I don't know enough statistics to assure you that omitting values with
weight=0 is sound from a mathematical standpoint; intuitively I would
replace them with 0.
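
The mechanism can be reproduced outside nlrob with a few made-up residuals. This is only a sketch: it assumes the residuals are recovered by dividing weighted residuals by sqrt(w), as the -residuals(out)/sqrt(w) expression in the warning suggests, and it does not call nlrob itself.

```r
library(MASS)

# Made-up scaled residuals; the last one is a gross outlier
r <- c(-0.5, 0.2, 1.0, 8.0)

# Bisquare weights are exactly 0 once |r| exceeds the tuning constant
w <- psi.bisquare(r)

# Weighted residuals, then the division that recovers the raw residuals
weighted <- sqrt(w) * r
recovered <- weighted / sqrt(w)   # 0/0 where the weight is 0

# na.fail() stops on the NaN; na.omit() drops it and shortens the vector
failed <- inherits(try(na.fail(recovered), silent = TRUE), "try-error")
kept <- na.omit(recovered)
```

With na.omit the vector ends up one element shorter than the weights, which may be why the warnings mention vectors of different lengths.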


Perhaps someone out there can help us with this?

Best wishes,
Keo.


Xiao Xiao wrote:

Thank you Keo!
After installing MASS the default "psi=psi.huber" is working now.
However I still can't get "psi=psi.bisquare" to work, and here's
another error message:
  

model3=nlrob(y~a1*x^a2,data=transient,psi=psi.bisquare, 
start=list(a1=0.02,a2=0.7),maxit=1000)


Error in na.fail.default(list(y = c(71.2600034232749, 148.175742933206,  :
  missing values in object

I don't know why there are missing values, I'm sure y is the right
length and there are no NAs in it. Could somebody help me with this
one please?

Thanks in advance,
Xiao

Re: [R] wilcox.test p-value = 0

2009-09-15 Thread Keo Ormsby

Hi Murat,
I am not an expert in either statistics or R, but I can imagine that
since the default is exact=TRUE, it numerically computes the
probability, and it may indeed be 0. If you use wilcox.test(x, y,
exact=FALSE) it will give you a normal approximation, which will most
likely be different from zero.

Hope this helps.
Keo.

Murat Tasan wrote:

hi, folks,

how have you gone about reporting a p-value from a test when the
returned value from a test (in this case a rank-sum test) is
numerically equal to 0 according to the machine?

the next lowest value greater than zero that is distinct from zero on
the machine is likely algorithm-dependent (the algorithm of the test
itself), but without knowing the explicit steps of the algorithm
implementation, it is difficult to provide any non-zero value.  i
initially thought to look at .Machine$double.xmin, but i'm not
comfortable with reporting p < .Machine$double.xmin, since without
knowing the specifics of the implementation, this may not be true!

to be clear, if i have data x, and i run the following line, the
returned value is TRUE.

wilcox.test(x)$p.value == 0

thanks for any help on this!


Re: [R] wilcox.test p-value = 0

2009-09-18 Thread Keo Ormsby

Hello Thomas and Bryan,
Thanks for the correction; sorry Murat, I was mistaken. Your answers
actually solved a problem I was having when running multiple fisher.test()
calls on nucleic acid sequences, where we come up with hundreds of
thousands of p values, a lot of which are 0. Since we have to correct for
multiple tests, even very, very small p values might end up not being
significant. I had assumed the 0's were tied p values, but now I know I can
use the numerator and the denominator to rank the 0's, even if I don't
have the exact p value.

Best,
Keo.

Marc Schwartz wrote:
Once one gets past the issue of the p value being extremely small, 
irrespective of the test being used, the OP has asked the question of 
how to report it.


Most communities will have standards for how to report p values, 
covering things like how many significant digits and a minimum p value 
threshold to report.


For example, in medicine, it is common to report 'small' p values as 
'p < 0.001' or 'p < 0.0001'.


Thus, below those numbers, the precision is largely irrelevant and one 
need not report the actual p value.


I just wanted to be sure that we don't lose sight of the forest for 
the trees...  :-)


The OP should consult a relevant guidance document or an experienced 
author in the domain of interest.


HTH,

Marc Schwartz
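
One way to apply such a reporting threshold in R is the eps argument of format.pval(). This is a small sketch with made-up p-values; the 0.001 cutoff is just the example threshold mentioned above, not a recommendation for any particular field.

```r
# Made-up p-values; the last two fall below the reporting threshold
p <- c(0.2, 0.013, 4e-05, 0)

# eps sets the smallest value that is printed as a number;
# anything smaller is shown as "<" followed by the threshold
reported <- format.pval(p, digits = 2, eps = 0.001)
print(reported)
```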


On Sep 16, 2009, at 9:54 AM, Bryan Keller wrote:

That's right, if the test is exact it is not possible to get a 
p-value of zero.  wilcox.test does not provide an exact p-value in 
the presence of ties so if there are any ties in your data you are 
getting a normal approximation.  Incidentally, if there are any ties 
in your data set I would strongly recommend computing the *exact* 
p-value because using the normal approximation on tied data sets will 
either inflate type I error rate or reduce power depending on how the 
ties are distributed.  Depending on the pattern of ties this can 
result in gross under or over estimation of the p-value.


I guess this is all by way of saying that you should always compute 
the exact p-value if possible.


The package exactRankTests uses the algorithm by Mehta, Patel and 
Tsiatis (1984).  If your sample sizes are larger, there is a freely 
available .exe by Cheung and Klotz (1995) that will compute exact 
p-values for sample sizes larger than 100 in each group!


You can find it at http://pages.cs.wisc.edu/~klotz/

Bryan


Hi Murat,
I am not an expert in either statistics or R, but I can imagine that
since the default is exact=TRUE, it numerically computes the probability,
and it may indeed be 0. If you use wilcox.test(x, y, exact=FALSE) it will
give you a normal approximation, which will most likely be different from
zero.


No, the exact p-value can't be zero for a discrete distribution. The 
smallest possible value in this case would, I think, be 
1/choose(length(x)+length(y),length(x)), or perhaps twice that.


More generally, the approach used by format.pval() is to display 
very small p-values as <2e-16, where 2e-16 is approximately machine 
epsilon (.Machine$double.eps).  I wouldn't want to claim optimality for 
this choice, but it seems a reasonable way to represent "very small".


-thomas
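
The lower bound can be checked on a small made-up example. The sketch below uses two completely separated samples without ties, so the exact two-sided test should hit the smallest attainable value; the data are invented for illustration.

```r
# Two made-up samples with complete separation and no ties
x <- 1:10
y <- 11:20

# Exact two-sided rank-sum test
p_exact <- wilcox.test(x, y, exact = TRUE)$p.value

# Smallest attainable exact p-values: the one-sided bound and twice that
p_min_one_sided <- 1 / choose(length(x) + length(y), length(x))
p_min_two_sided <- 2 * p_min_one_sided

# format.pval() shows values below machine epsilon as "<..." instead of 0
shown <- format.pval(0)

print(c(p_exact = p_exact, p_min_two_sided = p_min_two_sided))
print(shown)
```

For larger samples the bound becomes tiny but still nonzero mathematically, so a reported 0 reflects how the value was computed or stored rather than a true probability of zero.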



Hope this helps.
Keo.



Re: [R] Reading data

2009-09-23 Thread Keo Ormsby

Hello Ashta,
In a Windows path you need to use either forward slashes or double
backslashes, like:
"C:\\Documents and Settings\\ashta\\My Documents\\R_data\\rel.dat"

I usually use the following to avoid writing the path:

# select the file from a popup window
f <- file.choose()
# read the file; the ... stands for any other arguments, e.g. header,
# sep, quote, etc.
data <- read.table(f, ...)
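
For a non-interactive check, the sketch below writes a small made-up file to a temporary folder and reads it back with the same read.table() arguments as in the question. The file contents and the "C:\\R_data\\rel.dat" path are hypothetical and only serve the illustration.

```r
# Hypothetical stand-in for rel.dat, written to a temporary folder
path <- file.path(tempdir(), "rel.dat")
writeLines(c("1 0.52 0.61", "2 0.48 0.70", "3 0.55 0.66"), path)

# Check the path before reading; FALSE here would mean the path is wrong
found <- file.exists(path)

# Same arguments as in the question, with the path kept on a single line
rel <- read.table(path, quote = "", header = FALSE, sep = "",
                  col.names = c("id", "orel", "nrel"))
print(summary(rel))

# Double backslashes and forward slashes spell the same Windows path
same <- gsub("\\", "/", "C:\\R_data\\rel.dat", fixed = TRUE) ==
  "C:/R_data/rel.dat"
```

Calling file.exists() on the path first is a cheap way to tell a wrong path apart from a problem with the read.table() arguments.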

Good luck! and welcome to R.

Keo.

Ashta wrote:

Dear R-users,

I am a new R user and I am eager to learn about it.

I wanted to read a simple data file and summarize it.

I used the following:

rel <- read.table("C:/Documents and Settings/ashta/My Documents/R_data/rel.dat", quote="",header=FALSE,sep="",col.names=
c("id","orel","nrel"))
summary(rel)

Below is the error message:

rel <- read.table("C:/Documents and Settings/ashta/My Documents/R_data/rel.dat", quote="",header=FALSE,sep="",col.names=
+ c("id","orel","nrel"))
Error in file(file, "r") : cannot open the connection
In addition: Warning message:
In file(file, "r") :
  cannot open file 'file=C:/Documents and Settings/sewalem/My Documents/R_data/rel.dat': Invalid argument

summary(rel)
Error in summary(rel) : object 'rel' not found

Does it need a library? Where can I get the library?

Any help is highly appreciated.

Ashta
