Re: [R] Why are the number of coefficients varying? [mgcv][gam]

2013-01-27 Thread Simon Wood

Hi Andrew,

Do you know which coefficients are missing (i.e. which terms have 
missing coefficients)? This would help to narrow down the possibilities. 
Is it certain that all levels of all factors occur in all replicates? If 
not then you will get different numbers of coefficients (in the 
parametric part of the model).
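
The effect described above can be reproduced in a minimal sketch. It uses base R's model.matrix() rather than the gam() call from the post, and the data frame `d` and its values are made up for illustration. When a factor level is absent from one replicate, as.factor() produces fewer dummy columns, so the parametric part has fewer coefficients.

```r
# Hypothetical toy data: 'feedstock' has three levels in the full data.
d <- data.frame(
  y = c(1.2, 0.7, 1.9, 1.1, 0.4, 1.6),
  feedstock = c("wood", "manure", "straw", "wood", "manure", "wood")
)

# A replicate in which the only "straw" row happens to be absent.
d_sub <- d[d$feedstock != "straw", ]

X_full <- model.matrix(~ as.factor(feedstock), data = d)
X_sub  <- model.matrix(~ as.factor(feedstock), data = d_sub)

ncol(X_full)  # intercept plus two dummy columns
ncol(X_sub)   # intercept plus one dummy column

# Comparing the level sets across replicates shows which level went missing.
missing_levels <- setdiff(unique(d$feedstock), unique(d_sub$feedstock))
missing_levels
```

Running the same setdiff() comparison for each factor in each imputed dataset is one way to find the term responsible.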


btw. what is tryCatch catching here?

best,
simon

On 28/01/13 05:36, Andrew Crane-Droesch wrote:

Dear List,

I'm using gam in a multiple imputation framework -- specifying the knot
locations, and saving the results of multiple models, each of which is
fit with slightly different data (because some of it is predicted when
missing).  In MI, coefficients from multiple models are averaged, as are
variance-covariance matrices.  VCV's get an additional correction to
account for how variable they are between each other.

For this to work in the context of a penalized spline model, the knots
need to be specified identically for each model (this is assisted by
context knowledge), and each model needs to have the same number of
knots.  This is what I've done, below.  I run that code multiple times
with slightly different (imputed) datasets, but the number of
coefficients varies (between 263-265).

What gives?  Why don't all of my models have the same number of
coefficients?

Thanks in advance!

Best,
Andrew

BCAR.knots = c(2,15,60,120)
INAR.knots = c(50,100,200,300)
bcph.knots = c(7.5,8.5,9.5,10.5)
htt.knots = c(350,450,550,650)
bc.prc.C.knots = c(.3,.45,.6,.8)
phi.knots = c(4.5,5.5,6.5,7.5)
CEC.knots = c(5,12,19,26)
soc.knots = c(10,20,30,40)
sand.knots = c(.2,.4,.6,.8)
clay.knots = c(.15,.3,.45,.6)
abslat.knots = c(10,20,30,45)
lon.knots = c(-50,0,50,125)

dum = as.vector(rep(1,length(trialid)))

doyee = NA
r.ints= tryCatch(gam(RR ~
  as.factor(pot.trial)
  +as.factor(year)
  +as.factor(crop.legume)
  +as.factor(crop.fruit)
  +as.factor(feedstock)
  +s(trialid,bs="re",by=dum)
  + s(BCAR.imp,bs="cr",k=length(BCAR.knots),by=as.factor(pot.trial))
  +
s(INAR.imp,bs="cr",k=length(INAR.knots),by=as.factor(crop.legume))
  + s(bcph.imp,bs="cr",k=length(bcph.knots),by=as.factor(year))
  +s(htt.imp,bs="cr",k=length(htt.knots))
  + s(bc.prc.C.imp,bs="cr",k=length(bc.prc.C.knots))
  + s(phi.imp,bs="cr",k=length(phi.knots),by=as.factor(year))
  +s(CEC.imp,bs="cr",k=length(CEC.knots))
  +s(soc.imp,bs="cr",k=length(soc.knots))
  + s(sand.imp,bs="cr",k=length(sand.knots),by=as.factor(year))
  +s(clay.imp,bs="cr",k=length(clay.knots))
  +s(abslat,bs="cr",k=length(abslat.knots))
  +te(INAR.imp,soc.imp,CEC.imp,k=c(4,4,4))
  +te(soc.imp,BCAR.imp,k=c(4,4))
  +te(phi.imp,bcph.imp,k=c(4,4))
  +te(clay.imp,CEC.imp,k=c(4,4))
  + te(htt.imp,bc.prc.C.imp,k=c(4,4),by=as.factor(feedstock))
  ,data=grain.data
  ,knots=list(
BCAR.imp=BCAR.knots,INAR.imp=INAR.knots,bcph.imp=bcph.knots,htt.imp=htt.knots,bc.prc.C.imp=bc.prc.C.knots,
phi.imp=phi.knots,CEC.imp=CEC.knots,soc.imp=soc.knots,sand.imp=sand.knots,clay.imp=clay.knots,
  abslat.imp=abslat.knots
  )
  ,method = "REML"
  ,weights = 1/nsame
), error=function(e) e, finally=doyee)
if(inherits(r.ints, "error")) {r.ints=doyee; print("an error happened but it got handled.")}






--
Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
+44 (0)1225 386603   http://people.bath.ac.uk/sw283

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to create a random matrix with conditions

2013-01-27 Thread Simon Givoli
Hi!

I want to create a random matrix with 15 variables, each variable having
1000 observations.
Between each two variables, I want to define a specific (*not *random)
correlations between them, but still saving the "randomness" of each
variable (mean=zero, s.d=1).
How can I do this in R?

thanks,
Simon
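
One possible approach, sketched in base R under assumptions that are not in the question: the target correlation matrix `R` below (every pair correlated at 0.3) is a placeholder for whatever pairwise correlations are actually wanted, and it must be positive definite. Random normal draws are first whitened so that their sample covariance is the identity, then multiplied by the Cholesky factor of `R`.

```r
set.seed(1)                 # arbitrary seed, for reproducibility only
n <- 1000                   # observations per variable
p <- 15                     # number of variables

# Assumed target: every pair of variables has correlation 0.3.
R <- matrix(0.3, p, p)
diag(R) <- 1

# Random draws, centred so that every column has mean exactly zero.
Z <- scale(matrix(rnorm(n * p), n, p), center = TRUE, scale = FALSE)

# Whiten: the sample covariance of W is the identity matrix.
W <- Z %*% solve(chol(cov(Z)))

# Impose the target: the sample covariance (and correlation) of X equals R.
X <- W %*% chol(R)

max(abs(cor(X) - R))        # numerically zero
```

If the correlations only need to hold in expectation rather than exactly in the sample, the whitening step can be dropped and `matrix(rnorm(n * p), n, p) %*% chol(R)` used directly.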



Re: [R] scan not working

2013-01-27 Thread Emily Sessa
Thank you all - these suggestions have been very helpful! I've got it doing 
what I wanted now, and I appreciate the help!

Emily
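
For reference, the two suggestions quoted below (taking the input file name from commandArgs() and appending a short code to the output file names) can be combined roughly as follows. This is only a sketch: `parse_args()` is a hypothetical helper, not part of anyone's script in this thread, and it takes the suffix as a plain second argument rather than the `--ext:` form.

```r
# Hypothetical helper: turns the trailing command-line arguments into
# an input file name and suffixed output file names.
parse_args <- function(args = commandArgs(trailingOnly = TRUE)) {
  if (length(args) < 1) {
    stop("usage: Rscript get_q_values.R <input file> [suffix]")
  }
  suffix <- if (length(args) >= 2) paste0("_", args[2]) else ""
  list(
    input   = args[1],
    pvalues = paste0("pvalues", suffix),
    qvalues = paste0("qvalues", suffix)
  )
}

# Demonstration with a temporary file standing in for LRT_codeml_output.
tmp <- tempfile()
writeLines(c("1.5", "2.5", "3.5"), tmp)

opts <- parse_args(c(tmp, "cro"))
chidata <- scan(file = opts$input, quiet = TRUE)
opts$pvalues
opts$qvalues
```

In the real script, `opts <- parse_args()` with no argument would pick up whatever follows the script name, for example `Rscript get_q_values.R LRT_codeml_output cro`, and the write() calls would use `file = opts$pvalues` and `file = opts$qvalues`.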


On Jan 27, 2013, at 1:18 PM, Rui Barradas  wrote:

> Hello,
> 
> Try the following. Create a file called test.R with these instructions:
> 
> cmd <- commandArgs(TRUE)
> if(any(grepl("--ext", cmd))){
>   suffix <- cmd[grep("--ext", cmd)]
>   suffix <- unlist(strsplit(suffix, ":"))[2]
>   fl <- paste0("test_", suffix, ".txt") # underscore included
> }else{
>   fl <- "test.txt"
> }
> write(cmd, file = fl)
> 
> # Call like this
> #$ Rscript test.R other --options --ext:cro
> 
> 
> This creates a file test_cro.txt with the command line options written to it. 
> You can adapt this R script to your needs.
> Instead of 'suf', I've used option '--ext' for 'extension'. Change to fit 
> your taste.
> 
> 
> Hope this helps,
> 
> Rui Barradas
> 
> Em 27-01-2013 17:27, Emily Sessa escreveu:
>> Hello all (again),
>> 
>> I received a very helpful answer to this question, and would like to pose 
>> one more:
>> 
>> Right now I have this script, which is being called from the command line, 
>> writing output to two generically named files ("pvalues" and "qvalues") that 
>> are named in the script using the line:
>> 
>> write(pvalues, file="pvalues", ncol=1)
>> write(adjusted, file="qvalues", ncol=1)
>> 
>> However, ideally I would like those two files to have something appended to 
>> their names that make them separate from one another, so I can identify 
>> which input they went with and so they won't write over each other when I 
>> script this into a Perl pipeline that will process many input files, which 
>> is my ultimate goal. I know how to do this in Perl, but not R... is there 
>> some way I can add another argument on the command line that will get passed 
>> to the R.script, like a simple letter code (e.g. "cro"), and then have it 
>> append that to the output file names, so they are, for example: 
>> "qvalues_cro" and "pvalues_cro"?
>> 
>> Thank you very much,
>> Emily
>> 
>> On Jan 27, 2013, at 4:34 AM, peter dalgaard  wrote:
>> 
>>> 
>>> On Jan 27, 2013, at 08:33 , Emily Sessa wrote:
>>> 
>>>> Hi all,
>>>> 
>>>> I am trying to use the scan function in an R script that I am calling from 
>>>> the command line on a Mac; at the shell prompt I type:
>>>> 
>>>> $ Rscript get_q_values.R LRT_codeml_output
>>>> 
>>>> in the hope that LRT_codeml_output will get passed to the get_q_values R 
>>>> script. The first line of that script is:
>>>> 
>>>> chidata <- scan(file="")
>>>> 
>>>> which, as I understand how scan works, will read the contents of the file 
>>>> from the command line into the object chidata. I did this a few times and 
>>>> it worked like a charm. And then, it stopped working. Now, every time I 
>>>> try to do this, I get "Read 0 items" as the next line in the terminal 
>>>> window, and the output produced by the script is empty, because it's 
>>>> apparently no longer reading anything in. I don't think I changed anything 
>>>> in the script; it just stopped being able to execute the scan function. 
>>>> Does anyone have any idea how to fix this?? I did not have anything else 
>>>> in that scan line when it was working before. I've updated R and restarted 
>>>> my computer in the hope that it would help, but it hasn't. Any help would 
>>>> be much appreciated.
>>> 
>>> I don't see how that would ever work. The 2nd and further args to Rscript 
>>> are passed to R and accessible via commandArgs(). There's no way that scan() 
>>> can know what the arguments are. It might work with
>>> 
>>> Rscript get_q_values.R < LRT_codeml_output
>>> 
>>> though. Or you need to arrange explicitly for 
>>> scan(file=commandArgs(TRUE)[1]).
>>> 
 
>>>> -ES
>> 
>> 
>> 



[R] parse/eval and character encoded expressions: How to deal with non-encoding strings?

2013-01-27 Thread Johannes Graumann
Hi,

I am intending to save a path-describing character object in a slot of a 
class I'm working on. In order to have the option to use "system.file" etc 
in such string-saved path definitions, I wrote this

ExpressionEvaluator <- function(x){
  x <- tryCatch(
    expr=base::parse(text=x),
    error = function(e){return(as.expression(x))},
    finally=TRUE)
  return(x)
}

This produces

> ExpressionEvaluator("system.file(\"INDEX\")")
expression(system.file("INDEX"))
> eval(ExpressionEvaluator("system.file(\"INDEX\")"))
[1] "/usr/lib/R/library/base/INDEX"

Which is what I want. However, 

> eval(ExpressionEvaluator("Test"))
Error in eval(expr, envir, enclos) : object 'Test' not found

prevents me from general usage (also in cases where "x" does NOT encode an 
expression).

I don't understand why it is that
> base::parse(text="Test")
will return
[1] expression(Test)
while
> as.expression("Test")
produces
[1] expression("Test")
which would work with the eval call.

Can anyone point out to me how to solve this generally? How can I feed the 
function a character object and get back an eval-able expression independent 
of whether there was an expression "encoded" in the input or not.

Thank you for any hints.

Sincerely, Joh
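
One possible way around this, as a sketch rather than a definitive answer: the difficulty is that "Test" parses without error (as a symbol), so the tryCatch() around parse() never falls back, and the error only appears later at eval(). Wrapping the evaluation as well lets the original string be returned whenever either step fails. `evaluate_or_keep()` is a hypothetical name, and evaluating in the global environment is an assumption.

```r
# Hypothetical helper: parse and evaluate 'x'; if either step fails,
# return the original character string unchanged.
evaluate_or_keep <- function(x, envir = globalenv()) {
  tryCatch(
    eval(parse(text = x), envir = envir),
    error = function(e) x
  )
}

evaluate_or_keep("system.file(\"INDEX\")")  # evaluated as a call
evaluate_or_keep("no_such_object_xyz")      # parses, fails at eval: string kept
evaluate_or_keep("/usr/lib/R")              # fails at parse: string kept
```

A caveat with this design: a plain string that happens to match an existing object name (for example "c" or "Test" if such an object exists) is evaluated to that object rather than kept as text, so it suits inputs where that collision is unlikely.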



Re: [R] importing data

2013-01-27 Thread Ray Cheung
Thanks a million for all help provided!! I can do what I intend to using
the "for loop". However, I'm still eager to try the list.files approach.
Here is the error message that I got using Ivan's code:

> list_of_dataset <- do.call(read.table, file_names)
Error in do.call(read.table, file_names) : second argument must be a list

Please advise.

Ray
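
A likely cause of the error: do.call() calls read.table() once, using the second argument as the list of arguments, whereas the aim here is one read.table() call per file. A sketch of that with lapply() follows. The temporary directory and the two small files are made up for the demonstration, and the pattern is written as "\\.dat$" so the dot is matched literally.

```r
# Throwaway directory holding data1.dat and data3.dat (data2.dat is "missing").
data_dir <- file.path(tempdir(), "import_demo")
dir.create(data_dir, showWarnings = FALSE)
write.table(data.frame(a = 1:2, b = 3:4), file.path(data_dir, "data1.dat"))
write.table(data.frame(a = 5:6, b = 7:8), file.path(data_dir, "data3.dat"))

# list.files() only returns files that exist, so gaps need no special handling.
file_names <- list.files(path = data_dir, pattern = "\\.dat$",
                         full.names = TRUE)

# One read.table() call per file; the result is a list of data frames.
list_of_datasets <- lapply(file_names, read.table)
names(list_of_datasets) <- basename(file_names)

names(list_of_datasets)
```

Naming the list elements after the files keeps track of which dataset came from which file when some numbers are skipped.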

On Sun, Jan 27, 2013 at 10:57 PM, Ivan Calandra <
ivan.calan...@u-bourgogne.fr> wrote:

>  Hi Ray!
>
> I'm insisting with list.files...!
>
> What about like this (untested)?
> file_names <- list.files(path="C:/.../data", pattern=".dat$",
> full.names=TRUE)
> list_of_dataset <- do.call(read.table, file_names)
>
> Let me know if this helps!
> Ivan
>
> --
> Ivan CALANDRA
> Université de Bourgogne
> UMR CNRS/uB 6282 Biogéosciences
> 6 Boulevard Gabriel
> 21000 Dijon, FRANCE
> +33(0)3.80.39.63.06
> ivan.calan...@u-bourgogne.fr
> http://biogeosciences.u-bourgogne.fr/calandra
>
> Le 26/01/13 10:03, Ray Cheung a écrit :
>
> Thanks for your commands, Ivan and Michael! However, I am still not
> producing the right codes. Would you please help me on this? I've written
> the following codes. Please comment. Thank you very much.
>
> Task: Reading data1.dat to data1000.dat (with missing files) into R.
> Missing files can be omitted in the list.
>
> ###FUNCTION TO READ FILES
> little_helpful <- function(n) {
> file_name <- paste0("C:/.../data", n, ".dat")
> read.table(file_name)
> }
>
> ###RETURN AN OBJECT WHICH CHECKS FOR THE EXISTENCE OF FILES
> check  <- function(n) {
> a <- ifelse(file.exists(paste0("C:/.../data", n, ".dat")), 1, 0)
> a
> }
>  ###Combining the functions
> IMPORT <- function(n) {
>   L <- check(1:n)
>   for (i in 1:n) {
>     if (L[i] == 1)
>       list_of_datasets <- lapply(i, little_helpful) else list_of_datasets <- 0
>   }
>   list_of_datasets
> }
>
> Thanks for all comments.
>
> Best Regards,
> Ray
>
>  On Fri, Jan 25, 2013 at 5:48 PM, Ivan Calandra <
> ivan.calan...@u-bourgogne.fr> wrote:
>
>> Hi,
>>
>> Not sure this is what you need, but what about list.files()?
>> It can get you all the files from a given folder, and you could then work
>> this list with regular expressions for example.
>>
>> HTH,
>> Ivan
>>
>> --
>> Ivan CALANDRA
>> Université de Bourgogne
>> UMR CNRS/uB 6282 Biogéosciences
>> 6 Boulevard Gabriel
>> 21000 Dijon, FRANCE
>> +33(0)3.80.39.63.06
>> ivan.calan...@u-bourgogne.fr
>> http://biogeosciences.u-bourgogne.fr/calandra
>>
>> Le 25/01/13 10:00, R. Michael Weylandt a écrit :
>>
>>>  On Fri, Jan 25, 2013 at 6:11 AM, Ray Cheung  wrote:
>>>
>>>> Dear Michael,
>>>>
>>>> Thanks for your codes. However, lapply does not work in my case since I've
>>>> some files missing in the data (say, the file data101.dat). Do you have any
>>>> suggestions on this?? Thank you very much.
>>>
>>> You could simply add a test using file.exists() but I'm not sure what
>>> you want to do with the M matrix then -- omit the slice (so the others
>>> are all shifted down one) or fill it entirely with NA's.
>>>
>>> Michael
>>>
>>>
>>>
>>
>
>



Re: [R] Request for unsubscribe from this forum.

2013-01-27 Thread Berend Hasselman

On 28-01-2013, at 06:17, Purna chander  wrote:

> Dear admin members,
> 
> 
> My Inbox is being flooded with the posts every time. As a reason, I
> wish to unsubscribe from this forum.
> 
> Can you suggest me how to do that.
> 

At the bottom of every message is a link to the R-help mailing list page 
(https://stat.ethz.ch/mailman/listinfo/r-help)
Scroll to the very end of that page. Follow the instructions.

Berend


> Regards,
> Purna
> 



[R] Why are the number of coefficients varying? [mgcv][gam]

2013-01-27 Thread Andrew Crane-Droesch
Dear List,

I'm using gam in a multiple imputation framework -- specifying the knot 
locations, and saving the results of multiple models, each of which is 
fit with slightly different data (because some of it is predicted when 
missing).  In MI, coefficients from multiple models are averaged, as are 
variance-covariance matrices.  VCV's get an additional correction to 
account for how variable they are between each other.
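
The pooling described above is commonly done with Rubin's rules. A minimal base-R sketch follows; `pool_rubin()` and its argument names are hypothetical, and the inputs are assumed to be one coefficient vector and one variance-covariance matrix per imputed fit (for example from coef() and vcov()). It also shows why every fit must return the same number of coefficients in the same order: the vectors are stacked row by row.

```r
# Hypothetical helper implementing Rubin's rules for m imputed fits.
#   coefs: list of m coefficient vectors, all of the same length and order
#   vcovs: list of m variance-covariance matrices
pool_rubin <- function(coefs, vcovs) {
  m <- length(coefs)
  stopifnot(m >= 2, length(unique(lengths(coefs))) == 1)

  Q    <- do.call(rbind, coefs)   # m x p matrix, one row per imputation
  qbar <- colMeans(Q)             # pooled coefficients
  W    <- Reduce(`+`, vcovs) / m  # average within-imputation covariance
  B    <- cov(Q)                  # between-imputation covariance
  list(coef = qbar, vcov = W + (1 + 1 / m) * B)
}

# Tiny made-up example with m = 2 imputations and p = 2 coefficients.
pooled <- pool_rubin(
  coefs = list(c(1, 2), c(3, 4)),
  vcovs = list(diag(2), diag(2))
)
pooled$coef
pooled$vcov
```

The stopifnot() inside the helper fails as soon as one fit has 263 coefficients and another 265, which is the situation in question.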

For this to work in the context of a penalized spline model, the knots 
need to be specified identically for each model (this is assisted by 
context knowledge), and each model needs to have the same number of 
knots.  This is what I've done, below.  I run that code multiple times 
with slightly different (imputed) datasets, but the number of 
coefficients varies (between 263-265).

What gives?  Why don't all of my models have the same number of 
coefficients?

Thanks in advance!

Best,
Andrew

BCAR.knots = c(2,15,60,120)
INAR.knots = c(50,100,200,300)
bcph.knots = c(7.5,8.5,9.5,10.5)
htt.knots = c(350,450,550,650)
bc.prc.C.knots = c(.3,.45,.6,.8)
phi.knots = c(4.5,5.5,6.5,7.5)
CEC.knots = c(5,12,19,26)
soc.knots = c(10,20,30,40)
sand.knots = c(.2,.4,.6,.8)
clay.knots = c(.15,.3,.45,.6)
abslat.knots = c(10,20,30,45)
lon.knots = c(-50,0,50,125)

dum = as.vector(rep(1,length(trialid)))

doyee = NA
r.ints= tryCatch(gam(RR ~
 as.factor(pot.trial)
 +as.factor(year)
 +as.factor(crop.legume)
 +as.factor(crop.fruit)
 +as.factor(feedstock)
 +s(trialid,bs="re",by=dum)
 + s(BCAR.imp,bs="cr",k=length(BCAR.knots),by=as.factor(pot.trial))
 + 
s(INAR.imp,bs="cr",k=length(INAR.knots),by=as.factor(crop.legume))
 + s(bcph.imp,bs="cr",k=length(bcph.knots),by=as.factor(year))
 +s(htt.imp,bs="cr",k=length(htt.knots))
 + s(bc.prc.C.imp,bs="cr",k=length(bc.prc.C.knots))
 + s(phi.imp,bs="cr",k=length(phi.knots),by=as.factor(year))
 +s(CEC.imp,bs="cr",k=length(CEC.knots))
 +s(soc.imp,bs="cr",k=length(soc.knots))
 + s(sand.imp,bs="cr",k=length(sand.knots),by=as.factor(year))
 +s(clay.imp,bs="cr",k=length(clay.knots))
 +s(abslat,bs="cr",k=length(abslat.knots))
 +te(INAR.imp,soc.imp,CEC.imp,k=c(4,4,4))
 +te(soc.imp,BCAR.imp,k=c(4,4))
 +te(phi.imp,bcph.imp,k=c(4,4))
 +te(clay.imp,CEC.imp,k=c(4,4))
 + te(htt.imp,bc.prc.C.imp,k=c(4,4),by=as.factor(feedstock))
 ,data=grain.data
 ,knots=list( 
BCAR.imp=BCAR.knots,INAR.imp=INAR.knots,bcph.imp=bcph.knots,htt.imp=htt.knots,bc.prc.C.imp=bc.prc.C.knots,
phi.imp=phi.knots,CEC.imp=CEC.knots,soc.imp=soc.knots,sand.imp=sand.knots,clay.imp=clay.knots,
 abslat.imp=abslat.knots
 )
 ,method = "REML"
 ,weights = 1/nsame
), error=function(e) e, finally=doyee)
if(inherits(r.ints, "error")) {r.ints=doyee; print("an error happened but it got handled.")}



-- 

*Andrew Crane-Droesch*
Energy and Resources Group
UC Berkeley
+1 215 435 2644
andre...@berkeley.edu
skype: andrew.crane-droesch
http://andrewcd.berkeley.edu





[R] Request for unsubscribe from this forum.

2013-01-27 Thread Purna chander
Dear admin members,


My Inbox is being flooded with the posts every time. As a reason, I
wish to unsubscribe from this forum.

Can you suggest me how to do that.

Regards,
Purna



[R] R package for normalizing microarray with few samples

2013-01-27 Thread Gundala Viswanath
I have only *two* datasets from normal and cancer samples.

  CancerNormal
--
mRNA1  3049
mRNA2 199200
... ...   ...
mRNA1000   1340


Each sample contains several thousand mRNA microarray expression values.
The final aim is to identify mRNAs that are significantly
down/upregulated.

What is the best R package we can use for this?

- G.V.



Re: [R] confidence / prediction ellipse

2013-01-27 Thread David Winsemius


On Jan 27, 2013, at 8:16 PM, Giuseppe Amatulli wrote:


> Dear all,
> thanks for your input.
>
> Bert - yes you get the point, I would like to draw an ellipse contour
> for a population quantile.
> Indeed, as you mention data.ellipse() should draw that. In other words if i
> re-run my model for another prediction (getting a new vector b) i would
> have the 95% probability that my prediction fall inside the ellipse.


install.packages("hdrcde")  # from Rob Hyndman
require(hdrcde)
x <- c(rnorm(200,0,1),rnorm(200,4,1))
y <- c(rnorm(200,0,1),rnorm(200,4,1))
par(mfrow=c(1,2))
plot(x,y, pch="+", cex=.5)
hdr.boxplot.2d(x,y)


--
David.




Re: [R] Sparse dataframes?

2013-01-27 Thread Andrew Hoerner
Hi Karl! Thanks for writing!

Doesn't this format require a column for a factor for every variable
present in any observation, whether or not that variable is present in the
observation in question? I think I end up with data that consists mainly of
columns of variables that are NAs for all but a few years.

But let me take a closer look at the data format and see.

Again, thanks! --andrewH


On Tue, Jan 15, 2013 at 9:22 AM, andrewH  wrote:

> Dear Folks--
> Is there a data frame analog to sparse matrices? I am working with a panel
> data set that has a large number of variables that are redefined repeatedly
> or exist for only a few years (out of 48).  In my current structure, where
> variables are columns and rows are years, more than 90 percent of the cells
> and more than 3/4 of the total size of my file are NAs.
>
> I am wondering if there is an alternate file specification currently
> available that still allows numeric, character and factor data to be
> stored.
> Besides just using a database.
>
> A pointer in the right direction (or a solid "no" if that is the truth)
> would be greatly appreciated.
>
> Sincerely, andrewH
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Sparse-dataframes-tp4655614.html
> Sent from the R help mailing list archive at Nabble.com.
>
>



-- 
J. Andrew Hoerner
Director, Sustainable Economics Program
Redefining Progress
(510) 507-4820



Re: [R] confidence / prediction ellipse

2013-01-27 Thread Giuseppe Amatulli
Dear all,
thanks for your input.

Bert - yes you get the point, I would like to draw an ellipse contour
for a population quantile.
Indeed, as you mention data.ellipse() should draw that. In other words if i
re-run my model for another prediction (getting a new vector b) i would
have the 95% probability that my prediction fall inside the ellipse.

Thanks again
Giuseppe
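
For illustration, the kind of population-quantile contour discussed in this thread can be sketched in base R without car. The simulated data are made up, and the radius uses the formula quoted further down the thread with dfn = 2 and dfd = n - 1. Whether that is the right radius for a formal prediction region is not settled here, so the last lines simply check empirically what fraction of the points falls inside.

```r
set.seed(42)                      # arbitrary seed, for reproducibility only
n <- 2000
x <- rnorm(n)
y <- 0.6 * x + rnorm(n, sd = 0.8) # made-up correlated bivariate normal data
X <- cbind(x, y)

center <- colMeans(X)
shape  <- cov(X)
level  <- 0.95
radius <- sqrt(2 * qf(level, 2, n - 1))  # sqrt(dfn * qf(level, dfn, dfd))

# Map the unit circle through the Cholesky factor of 'shape', scale by
# 'radius', and shift to 'center' to get points on the ellipse boundary.
theta  <- seq(0, 2 * pi, length.out = 200)
circle <- cbind(cos(theta), sin(theta))
ell    <- sweep(radius * (circle %*% chol(shape)), 2, center, "+")

# A point is inside the ellipse when its squared Mahalanobis distance
# from the centre is at most radius^2.
inside <- mahalanobis(X, center, shape) <= radius^2
mean(inside)                      # close to 'level' for roughly normal data
```

The contour can be drawn with `plot(X, pch = "+", cex = 0.5)` followed by `lines(ell)`.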











On 27 January 2013 17:26, Bert Gunter  wrote:

> All:
>
> Aha! -- The light dawneth, methinks (maybe...)
>
> Giuseppe: I re-read the SAS link you sent and if I have parsed it
> correctly, what SAS chooses to call an ellipse for "prediction" -- a
> rather idiosyncratic way to describe it, imo -- I believe the rest of
> us would call a contour for a population quantile. The key phrase that
> indicates this is "It [the elliptical region] also approximates a
> region containing a specified percentage of the population. "
>
> So, if I'm right, I believe you just need to use car's data.ellipse()
> function.
>
> And if I'm wrong, my little corner of the globe remains cloaked in
> darkness confusion.
>
> Cheers,
> Bert
>
> On Sun, Jan 27, 2013 at 12:43 PM, John Fox  wrote:
> > Dear Giuseppe and Bert,
> >
> > I also didn't follow what's intended, more or less for the same reasons
> as
> > Bert mentioned, which is why I didn't reply to the initial posting. In
> the
> > car package, confidenceEllipse() draws confidence ellipses for a pair of
> > coefficients from a statistical model, and dataEllipse() draws
> > bivariate-normal concentration ellipses for the bivariate distribution of
> > two variables.
> >
> > I'm copying to Georges Monette and Michael Friendly, coauthors of these
> > functions, in case they have something to add.
> >
> > I hope that this helps, but I doubt that it does.
> >
> > John
> >
> > ---
> > John Fox
> > Senator McMaster Professor of Social Statistics
> > Department of Sociology
> > McMaster University
> > Hamilton, Ontario, Canada
> >
> >
> >
> >
> >> -Original Message-
> >> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org
> ]
> >> On Behalf Of Giuseppe Amatulli
> >> Sent: Sunday, January 27, 2013 11:41 AM
> >> To: Bert Gunter
> >> Cc: r-help@r-project.org
> >> Subject: Re: [R] confidence / prediction ellipse
> >>
> >> Hi,
> >> thanks for your reply.
> >> My values of a and b are respectively:
> >> a = observation of an event
> >> b = prediction of a model.
> >>
> >> Therefore i would like to draw the confidence region for predicting a
> >> new
> >> observation, and according to this
> >> http://v8doc.sas.com/sashtml/insight/chap40/sect35.htm the prediction
> >> ellipse should be more appropriate.
> >>
> >> But i'm not able to track back the function
> >> radius <- sqrt(dfn * qf(level, dfn, dfd))
> >> in order to change it and draw a prediction ellipses.
> >>
> >> Regards
> >> Giuseppe
> >>
> >>
> >>
> >>
> >> On 26 January 2013 17:19, Bert Gunter  wrote:
> >>
> >> > Well, I'd guess you have to first define what you mean by "prediction
> >> > ellipse," as the confidence ellipses are for the bivariate
> >> > distribution of 2 parameter estimates -- as I understand it --
> >> > whereas predictions depend on the covariate values and are for a
> >> > single response value (unless you have fitted multiple responses, I
> >> > suppose).
> >> >
> >> > -- Bert
> >> >
> >> > On Sat, Jan 26, 2013 at 1:12 PM, Giuseppe Amatulli
> >> >  wrote:
> >> > > Hi,
> >> > > I'm using the R library(car) to draw confidence/prediction ellipses
> >> in a
> >> > > scatterplot.
> >> > > From what i understood  the ellipse() function return an ellipse
> >> based
> >> > > parameters:  shape, center,  radius .
> >> > > If i read  dataEllipse() function i can see how these parameters are
> >> > > calculated for a confidence ellipse.
> >> > >
> >> > > library(car)
> >> > > library(MASS)   # provides cov.trob()
> >> > >
> >> > > a=c(12,12,4,5,63,63,23)
> >> > > b=c(13,15,7,10,73,83,43)
> >> > >
> >> > > v <- cov.trob(cbind(a, b))
> >> > > shape <- v$cov
> >> > > center <- v$center
> >> > >
> >> > > radius <- sqrt(2 * qf(0.95, 2, length(a) - 1))   # radius <-
> >> sqrt(dfn *
> >> > > qf(level, dfn, dfd))
> >> > >
> >> > > conf.elip = ellipse(center, shape, radius,draw = F)
> >> > > plot(conf.elip, type='l')
> >> > > points(a,b)
> >> > >
> >> > > My question is how I can calculate shape, center and radius  to
> >> obtain a
> >> > > prediction ellipses rather than a confidence ellipse?
> >> > > Thanks in Advance
> >> > > Giuseppe
> >> > >
> >> > > --
> >> > > Giuseppe Amatulli
> >> > > Web: www.spatial-ecology.net
> >> > >
> >> >
> >> >
> >> >
>

Re: [R] Equivalent of box() in grid graphics

2013-01-27 Thread Paul Murrell

Hi

On 17/01/13 13:19, p_conno...@slingshot.co.nz wrote:

> Paul Murrell's article "What's in a Name" in The R Journal Vol 4/2
> gives an interesting example of editing a stacked barplot of the barley
> data.  Using the method described in that article, it's easy to do
> something along the lines of
>
> grid.edit("plot_01.border.strip.1",
>           grep=TRUE, global=TRUE,
>           gp=gpar(col = "red"))
>
> That changes more than I'd like to change. I'd like to change only the
> bottom line of the rectangle.  How would I overwrite the unwanted red
> lines along the lines of what box() would do with base graphics?


You cannot modify just part of a basic shape (e.g., just the bottom line 
of a rectangle).  One way to do what I think you want is to specify a 
custom strip function that draws an extra line over the top of the 
existing rectangle, like this ...


library(lattice)
library(grid)
barchart(yield ~ variety | site, data = barley,
         groups = year, layout = c(1, 6),
         stack = TRUE,
         ylab = "Barley Yield (bushels/acre)",
         scales = list(x = list(rot = 45)),
         strip = function(...) {
             strip.default(...)
             # draw a red line along the bottom edge of each strip
             grid.segments(0, 0, 1, 0, gp = gpar(lwd = 1.5, col = "red"))
         })

Is that the sort of effect you want?

Paul


TIA
Patrick

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
p...@stat.auckland.ac.nz
http://www.stat.auckland.ac.nz/~paul/



Re: [R] Package: VennDiagram. Error in draw.pairwise.venn Impossible: cross section area too large

2013-01-27 Thread Paul Boutros
Hi Fabrice,

The cross.area parameter gives the size of the intersection, which cannot be 
larger than the size of either set 1 (area1 parameter) or set 2 (area2 
parameter).  You probably want:
venn.plot <- draw.pairwise.venn(
area1 = 3186 + 5880,
area2 = 325 + 5880,
cross.area = 5880);

Paul
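
A minimal sketch of the arithmetic behind this suggestion. It assumes, as the
reply does, that 3186 and 325 count the items found in only one set and that
5880 counts the items common to both; the names only1, only2 and both are
made up for illustration. The drawing call runs only if VennDiagram is
installed.

```r
# Counts as interpreted in the reply (an assumption about the poster's data)
only1 <- 3186   # items found only in set 1
only2 <- 325    # items found only in set 2
both  <- 5880   # items found in both sets

# draw.pairwise.venn() wants the full size of each set, overlap included
area1 <- only1 + both
area2 <- only2 + both

# The intersection can never exceed either set
stopifnot(both <= min(area1, area2))

if (requireNamespace("VennDiagram", quietly = TRUE)) {
  grid::grid.newpage()
  venn.plot <- VennDiagram::draw.pairwise.venn(
    area1 = area1,
    area2 = area2,
    cross.area = both)
}
```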


-Original Message-
From: hiek...@gmail.com [mailto:hiek...@gmail.com] On Behalf Of Fabrice Tourre
Sent: January-27-13 5:06 PM
To: r-help@r-project.org
Cc: Paul Boutros
Subject: Package: VennDiagram. Error in draw.pairwise.venn Impossible: cross 
section area too large

Dear list,

When I use VennDiagram package, I got a error as follow:

venn.plot <- draw.pairwise.venn(
area1 = 3186,
area2 = 325,
cross.area = 5880);


Error in draw.pairwise.venn(area1 = 3186, area2 = 325, cross.area = 588) :
  Impossible: cross section area too large.

Does anyone have suggestion?

Thank you.



Re: [R] how to extract values from a raster according to Lat and long of the values?

2013-01-27 Thread jim holtman
How do you get those values from the example header file that you included?

On Sun, Jan 27, 2013 at 9:15 PM, Pascal Oettli  wrote:
> ?extract
>
> HTH
> Pascal
>
>
> Le 27/01/2013 22:53, Jonsson a écrit :
>>
>> having 12 files with 12 hdrs for one year:these files are raster
>> (projected  WGS84,lat
>> long):https://echange-fichiers.inra.fr/get?k=rLSyoavrnifGyH5XrlO
>>
>>samples = 1440
>> lines   = 720
>> bands   = 1
>> header offset = 0
>> file type = ENVI Standard
>> data type = 4
>> interleave = bsq
>>  byte order = 0
>>map info = {  Geographic Lat/Lon, 1, 1, -180, 90, 0.25,
>> 0.25,WGS-84}
>>  coordinate system string =
>> GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",
>>   SPHEROID["WGS_1984",6378137,298.257223563]]
>> ,PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]]
>>   }
>> These lines will open the files as a list:
>>
>>a<-list.files("D:\\ECV\\2010", "*.envi", full.names = TRUE)
>> for(i in 1:length(a)){
>>  d <- raster(a[i])
>> }
>>
>> I would like to extract the values corresponding to 44.8386° N, 0.5783° W
>> from all files as a txt file
>>
>>
>>
>> --
>> View this message in context:
>> http://r.789695.n4.nabble.com/how-to-extract-values-from-a-raster-according-to-Lat-and-long-of-the-values-tp4656767.html
>> Sent from the R help mailing list archive at Nabble.com.
>>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



Re: [R] how to extract values from a raster according to Lat and long of the values?

2013-01-27 Thread Pascal Oettli

?extract

HTH
Pascal
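
A sketch of what ?extract points to, assuming the raster package can read the
ENVI files directly and that each file holds a single band. The helper name
extract_point and the output file name are made up for illustration; note
that a western longitude is negative, so 0.5783° W becomes x = -0.5783.

```r
# Hypothetical helper: value of each single-band raster file at one lon/lat point
extract_point <- function(files, lon, lat) {
  pt <- cbind(x = lon, y = lat)   # raster expects x = longitude, y = latitude
  vals <- vapply(files, function(f) {
    r <- raster::raster(f)        # open one file as a RasterLayer
    raster::extract(r, pt)[1]     # cell value at the point
  }, numeric(1), USE.NAMES = FALSE)
  data.frame(file = basename(files), value = vals)
}

# Path taken from the question; a literal-dot regex instead of the glob "*.envi"
files <- list.files("D:/ECV/2010", pattern = "\\.envi$", full.names = TRUE)
if (length(files) > 0 && requireNamespace("raster", quietly = TRUE)) {
  res <- extract_point(files, lon = -0.5783, lat = 44.8386)
  write.table(res, "extracted_values.txt", row.names = FALSE, quote = FALSE)
}
```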


Le 27/01/2013 22:53, Jonsson a écrit :

having 12 files with 12 hdrs for one year:these files are raster
(projected  WGS84,lat
long):https://echange-fichiers.inra.fr/get?k=rLSyoavrnifGyH5XrlO

   samples = 1440
lines   = 720
bands   = 1
header offset = 0
file type = ENVI Standard
data type = 4
interleave = bsq
 byte order = 0
   map info = {  Geographic Lat/Lon, 1, 1, -180, 90, 0.25,
0.25,WGS-84}
 coordinate system string =
GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",
  SPHEROID["WGS_1984",6378137,298.257223563]]
,PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]]
  }
These lines will open the files as a list:

   a<-list.files("D:\\ECV\\2010", "*.envi", full.names = TRUE)
for(i in 1:length(a)){
 d <- raster(a[i])
}

I would like to extract the values corresponding to 44.8386° N, 0.5783° W
from all files as a txt file



--
View this message in context: 
http://r.789695.n4.nabble.com/how-to-extract-values-from-a-raster-according-to-Lat-and-long-of-the-values-tp4656767.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] set.seed()

2013-01-27 Thread Jeff Newmiller
Depends what algorithm you are using. Read ?set.seed
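
A small illustration of what the number does, using the two seeds from the
question. The seed itself has no meaning; it only fixes the starting state of
the random number generator, so the same seed reproduces the same stream of
draws and a different seed gives a different stream.

```r
set.seed(135)
first <- runif(3)    # three uniform draws after seeding with 135

set.seed(135)
again <- runif(3)    # same seed, so the same three numbers

set.seed(930)
other <- runif(3)    # different seed, so a different stream

identical(first, again)
identical(first, other)
```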
---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

londonphd  wrote:

>Hi,
>I am learning R. I've been using set.seed() for a while, but without
>actually understanding the significance of the "number" we put in the
>brackets. e.g. set.seed(135) & set.seed(930). 
>
>Can anyone shed some light on this please?
>
>
>
>--
>View this message in context:
>http://r.789695.n4.nabble.com/set-seed-tp4656788.html
>Sent from the R help mailing list archive at Nabble.com.
>


Re: [R] an "echo" question

2013-01-27 Thread Erin Hodgess
Ok.  Thanks!


On Sun, Jan 27, 2013 at 7:44 PM, Jeff Newmiller
 wrote:
> ?cat
> ---
> Jeff NewmillerThe .   .  Go Live...
> DCN:Basics: ##.#.   ##.#.  Live Go...
>   Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
> /Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
> ---
> Sent from my phone. Please excuse my brevity.
>
> Erin Hodgess  wrote:
>
>>Dear R People:
>>
>>Here is an unusual question:  if you are using "source", it just
>>prints the output.  If you use source with "echo", you get both
>>commands and output.  Is there a way just to show the commands,
>>please?
>>
>>Thanks,
>>Erin
>



-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com



Re: [R] an "echo" question

2013-01-27 Thread Jeff Newmiller
?cat
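
One possible reading of this hint, sketched under the assumption that the aim
is simply to display the commands of a script without evaluating them. The
helper name show_commands and the two-line demo script are made up for
illustration.

```r
# Write a tiny demo script to a temporary file
script <- tempfile(fileext = ".R")
writeLines(c("x <- 1:3", "print(sum(x))"), script)

# Print the commands only; nothing is evaluated
show_commands <- function(file) {
  cat(readLines(file), sep = "\n")
}

show_commands(script)
```

To run the script as well, a plain source(script) afterwards would print only
the output, as described in the question.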
---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Erin Hodgess  wrote:

>Dear R People:
>
>Here is an unusual question:  if you are using "source", it just
>prints the output.  If you use source with "echo", you get both
>commands and output.  Is there a way just to show the commands,
>please?
>
>Thanks,
>Erin



[R] an "echo" question

2013-01-27 Thread Erin Hodgess
Dear R People:

Here is an unusual question:  if you are using "source", it just
prints the output.  If you use source with "echo", you get both
commands and output.  Is there a way just to show the commands,
please?

Thanks,
Erin



-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com



Re: [R] Package: VennDiagram. Error in draw.pairwise.venn Impossible: cross section area too large

2013-01-27 Thread Fabrice Tourre
Thank you all.

Yes, I just misunderstood this part. Now it is OK.

On Sun, Jan 27, 2013 at 7:41 PM, Paul Boutros  wrote:
> Hi Fabrice,
>
> The cross.area parameter gives the size of the intersection, which cannot be 
> larger than the size of either set 1 (area1 parameter) or set 2 (area2 
> parameter).  You probably want:
> venn.plot <- draw.pairwise.venn(
> area1 = 3186 + 5880,
> area2 = 325 + 5880,
> cross.area = 5880);
>
> Paul
>
>
> -Original Message-
> From: hiek...@gmail.com [mailto:hiek...@gmail.com] On Behalf Of Fabrice Tourre
> Sent: January-27-13 5:06 PM
> To: r-help@r-project.org
> Cc: Paul Boutros
> Subject: Package: VennDiagram. Error in draw.pairwise.venn Impossible: cross 
> section area too large
>
> Dear list,
>
> When I use VennDiagram package, I got a error as follow:
>
> venn.plot <- draw.pairwise.venn(
> area1 = 3186,
> area2 = 325,
> cross.area = 5880);
>
>
> Error in draw.pairwise.venn(area1 = 3186, area2 = 325, cross.area = 588) :
>   Impossible: cross section area too large.
>
> Does anyone have suggestion?
>
> Thank you.



Re: [R] Package: VennDiagram. Error in draw.pairwise.venn Impossible: cross section area too large

2013-01-27 Thread Ted Harding

On 27-Jan-2013 23:50:57 Ben Bolker wrote:
> Fabrice Tourre  gmail.com> writes:
> 
>> Dear list,
>> When I use VennDiagram package, I got a error as follow:
>> 
>> venn.plot <- draw.pairwise.venn(
>> area1 = 3186,
>> area2 = 325,
>> cross.area = 5880);
>> 
>> Error in draw.pairwise.venn(area1 = 3186, area2 = 325, cross.area = 588) :
>>   Impossible: cross section area too large.
>> 
>> Does anyone have suggestion?
>> 
>> Thank you.
> 
>   I don't know the package, but it looks like you're trying to 
> draw two bubbles with areas of 3186 and 325.  cross.area sounds like
> the area of the intersection.  You can't have an area of intersection
> that's bigger than one of the categories ...

According to the PDF manual for VennDiagram:

  Arguments
area1 The size of the first set
area2 The size of the second set
cross.area The size of the intersection between the sets

so Ben has hit the nail on the head! (Maybe "area2 = 325" was a typo?).

Best wishes to all,
Ted.

-
E-Mail: (Ted Harding) 
Date: 28-Jan-2013  Time: 00:09:32
This message was sent by XFMail



Re: [R] Package: VennDiagram. Error in draw.pairwise.venn Impossible: cross section area too large

2013-01-27 Thread Ben Bolker
Fabrice Tourre  gmail.com> writes:

> 
> Dear list,
> 
> When I use VennDiagram package, I got a error as follow:
> 
> venn.plot <- draw.pairwise.venn(
> area1 = 3186,
> area2 = 325,
> cross.area = 5880);
> 
> Error in draw.pairwise.venn(area1 = 3186, area2 = 325, cross.area = 588) :
>   Impossible: cross section area too large.
> 
> Does anyone have suggestion?
> 
> Thank you.
> 
> 

  I don't know the package, but it looks like you're trying to 
draw two bubbles with areas of 3186 and 325.  cross.area sounds like
the area of the intersection.  You can't have an area of intersection
that's bigger than one of the categories ...



Re: [R] confidence / prediction ellipse

2013-01-27 Thread Bert Gunter
All:

Aha! -- The light dawneth, methinks (maybe...)

Giuseppe: I re-read the SAS link you sent and if I have parsed it
correctly, what SAS chooses to call an ellipse for "prediction" -- a
rather idiosyncratic way to describe it, imo -- I believe the rest of
us would call a contour for a population quantile. The key phrase that
indicates this is "It [the elliptical region] also approximates a
region containing a specified percentage of the population. "

So, if I'm right, I believe you just need to use car's dataEllipse() function.

And if I'm wrong, my little corner of the globe remains cloaked in
darkness and confusion.

Cheers,
Bert
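
For comparison, a base-R sketch of a prediction ellipse for one new
observation, under the assumption of bivariate normal data. It uses the
textbook radius p(n+1)(n-1)/(n(n-p)) * F(p, n-p) and the classical mean and
covariance rather than cov.trob(), so it is not what car computes; the
variable names are made up for illustration.

```r
a <- c(12, 12, 4, 5, 63, 63, 23)
b <- c(13, 15, 7, 10, 73, 83, 43)
X <- cbind(a, b)

n <- nrow(X)           # number of observations
p <- ncol(X)           # number of variables (2)
level <- 0.95

center <- colMeans(X)  # classical estimates, not the robust cov.trob() ones
shape  <- cov(X)

# Squared radius for a region expected to contain one new observation
radius <- sqrt(p * (n + 1) * (n - 1) / (n * (n - p)) * qf(level, p, n - p))

# Map the unit circle through the Cholesky factor of shape, then shift to center
theta <- seq(0, 2 * pi, length.out = 200)
unit_circle <- cbind(cos(theta), sin(theta))
pred_ellipse <- sweep(radius * (unit_circle %*% chol(shape)), 2, center, "+")

plot(pred_ellipse, type = "l", xlab = "a", ylab = "b")
points(a, b)
```

Replacing the radius with sqrt(qchisq(level, p)) would give a large-sample
contour for a population quantile, which is closer to what a data ellipse
shows.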

On Sun, Jan 27, 2013 at 12:43 PM, John Fox  wrote:
> Dear Giuseppe and Bert,
>
> I also didn't follow what's intended, more or less for the same reasons as
> Bert mentioned, which is why I didn't reply to the initial posting. In the
> car package, confidenceEllipse() draws confidence ellipses for a pair of
> coefficients from a statistical model, and dataEllipse() draws
> bivariate-normal concentration ellipses for the bivariate distribution of
> two variables.
>
> I'm copying to Georges Monette and Michael Friendly, coauthors of these
> functions, in case they have something to add.
>
> I hope that this helps, but I doubt that it does.
>
> John
>
> ---
> John Fox
> Senator McMaster Professor of Social Statistics
> Department of Sociology
> McMaster University
> Hamilton, Ontario, Canada
>
>
>
>
>> -Original Message-
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
>> On Behalf Of Giuseppe Amatulli
>> Sent: Sunday, January 27, 2013 11:41 AM
>> To: Bert Gunter
>> Cc: r-help@r-project.org
>> Subject: Re: [R] confidence / prediction ellipse
>>
>> Hi,
>> thanks for your reply.
>> My values of a and b are respectively:
>> a = observation of an event
>> b = prediction of a model.
>>
>> Therefore i would like to draw the confidence region for predicting a
>> new
>> observation, and according to this
>> http://v8doc.sas.com/sashtml/insight/chap40/sect35.htm the prediction
>> ellipse should be more appropriate.
>>
>> But i'm not able to track back the function
>> radius <- sqrt(dfn * qf(level, dfn, dfd))
>> in order to change it and draw a prediction ellipses.
>>
>> Regards
>> Giuseppe
>>
>>
>>
>>
>> On 26 January 2013 17:19, Bert Gunter  wrote:
>>
>> > Well, I'd guess you have to first define what you mean by "prediction
>> > ellipse," as the confidence ellipses are for the bivariate
>> > distribution of 2 parameter estimates -- as I understand it --
>> > whereas predictions depend on the covariate values and are for a
>> > single response value (unless you have fitted multiple responses, I
>> > suppose).
>> >
>> > -- Bert
>> >
>> > On Sat, Jan 26, 2013 at 1:12 PM, Giuseppe Amatulli
>> >  wrote:
>> > > Hi,
>> > > I'm using the R library(car) to draw confidence/prediction ellipses
>> in a
>> > > scatterplot.
>> > > From what I understood, the ellipse() function returns an ellipse
>> > > based on the parameters shape, center and radius.
>> > > If i read  dataEllipse() function i can see how these parameters are
>> > > calculated for a confidence ellipse.
>> > >
>> > > library(car)
>> > > library(MASS)   # provides cov.trob()
>> > >
>> > > a=c(12,12,4,5,63,63,23)
>> > > b=c(13,15,7,10,73,83,43)
>> > >
>> > > v <- cov.trob(cbind(a, b))
>> > > shape <- v$cov
>> > > center <- v$center
>> > >
>> > > radius <- sqrt(2 * qf(0.95, 2, length(a) - 1))   # radius <-
>> sqrt(dfn *
>> > > qf(level, dfn, dfd))
>> > >
>> > > conf.elip = ellipse(center, shape, radius,draw = F)
>> > > plot(conf.elip, type='l')
>> > > points(a,b)
>> > >
>> > > My question is how I can calculate shape, center and radius  to
>> obtain a
>> > > prediction ellipses rather than a confidence ellipse?
>> > > Thanks in Advance
>> > > Giuseppe
>> > >
>> > > --
>> > > Giuseppe Amatulli
>> > > Web: www.spatial-ecology.net
>> > >
>> > > [[alternative HTML version deleted]]
>> > >
>> >
>> >
>> >
>> > --
>> >
>> > Bert Gunter
>> > Genentech Nonclinical Biostatistics
>> >
>> > Internal Contact Info:
>> > Phone: 467-7374
>> > Website:
>> >
>> > http://pharmadevelopment.roche.com/index/pdb/pdb-functional-
>> groups/pdb-biostatistics/pdb-ncb-home.htm
>> >
>>
>>
>>
>> --
>> Giuseppe Amatulli
>> Web: www.spatial-ecology.net
>>
>>   [[alternative HTML version deleted]]
>>
>



-- 

Bert Gunter
Genentech Nonclinic

[R] set.seed()

2013-01-27 Thread londonphd
Hi,
I am learning R. I've been using set.seed() for a while, but without
actually understanding the significance of the "number" we put in the
brackets. e.g. set.seed(135) & set.seed(930). 

Can anyone shed some light on this please?



--
View this message in context: 
http://r.789695.n4.nabble.com/set-seed-tp4656788.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Removing values containing a specific character

2013-01-27 Thread ypodeswa
Thanks for your help!  Your original solution would have worked just fine
too if I didn't have a number of other large data frames loaded.  This is
my first time working with such large data sets, so I'd never previously
run out of available memory.  I finished up the project, though I'm sure
I'll be back to this mailing list in the future!
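
For anyone finding this thread later, a small sketch of the approach that
ended up working, on toy data. The addresses are invented placeholders, since
the real ones are hidden in the archive.

```r
# Toy data; the e-mail addresses are made-up placeholders
df <- data.frame(
  names  = c("bob", "joe", "user1@example.com", "emily", "user2@example.com"),
  emails = c("a@example.com", "b@example.com", "c@example.com",
             "d@example.com", "e@example.com"),
  stringsAsFactors = FALSE)

# Flag the names that contain an "@"; fixed = TRUE skips regex matching
has_at <- grepl("@", df$names, fixed = TRUE)

df$names[has_at] <- ""        # first result: blank out those names
df_clean <- df[!has_at, ]     # second result: drop those rows entirely
```

When memory is tight, removing large objects that are no longer needed with
rm() and then calling gc() before this step may help.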


On Sun, Jan 27, 2013 at 11:54 AM, arun kirshna [via R] <
ml-node+s789695n4656784...@n4.nabble.com> wrote:

> Hi,
> I tried with bigger dataset.
>
> set.seed(25)
> names <- sample(c("bob", "joe", "[hidden email]", "emily", "[hidden email]"),
>                 5e6, replace=TRUE)
>
> set.seed(1651)
> emails <- sample(c("[hidden email]", "[hidden email]", "[hidden email]",
>                    "[hidden email]", "[hidden email]"), 5e6, replace=TRUE)
>
> df <- data.frame(names, emails)
> dim(df)
> #[1] 5000000       2
> df[] <- lapply(df, as.character)
> system.time(df[,1][grep("@",df$names)] <- "")
> #   user  system elapsed
> #  1.732   0.108   1.844
> system.time(dfNew1 <- df[grep("\\w+",df$names),])
> #   user  system elapsed
> #  0.896   0.024   0.923
> system.time(dfNew2 <- df[df$names!="",])
> #   user  system elapsed
> #  0.460   0.028   0.490
> A.K.
>
>
>
>
>
>
>
> 
> From: Yasha Podeswa <[hidden email]>
> To: arun <[hidden email]>
> Cc: R help <[hidden email]>; Uwe Ligges <[hidden email]>
> Sent: Sunday, January 27, 2013 2:05 PM
> Subject: Re: [R] Removing values containing a specific character
>
>
> You two were 100% right, it was just a memory issue.  This was part of a
> bigger project where I had a number of data frames loaded, all with 1-5
> million rows. Cleaned up my code to have less data frames loaded at once,
> and everything is working great.  Thanks for the help!
> On Jan 27, 2013 9:46 AM, "arun" <[hidden email]> wrote:
>
> Hi Yasha,
>
> >
> > I guess you got Uwe's response.
> >
> > I created `df2` with the intention of getting the two results from the
> original dataset.
> >For example, after you get the first result
> >df[,1][grep("@",df$names)]<- ""
> >#you can get the second result by:
> >df[df$names!="",]
> >#  names emails
> >#1   bob [hidden email]
> >#2   joe [hidden email]
> >#4 emily [hidden email]
> >
> >#or
> >df[grep("\\w+",df$names),]
> >#  names emails
> >#1   bob [hidden email]
> >#2   joe [hidden email]
> >#4 emily [hidden email]
> >
> >But I am not sure how this will work over 5.5 million rows.
> >A.K.
> >
> >
> >
> >
> >- Original Message -
> >From: ypodeswa <[hidden email]>
>
> >To: [hidden email]
> >Cc:
> >Sent: Sunday, January 27, 2013 1:11 AM
> >Subject: Re: [R] Removing values containing a specific character
> >
> >Actually, it worked perfectly for my sample data, but my actual data has
> >5.5 million rows, and grep doesn't seem to work with over a million rows.
> >Any idea on a workaround?
> >
> >
> >On Sat, Jan 26, 2013 at 9:37 PM, Yasha Podeswa <[hidden email]> wrote:
> >
> >> Awesome, thanks Arun, that's exactly what I was looking for!
> >>
> >>
> >> On Sat, Jan 26, 2013 at 9:21 PM, arun kirshna [via R] <[hidden email]> wrote:
> >>
> >>> Hi,
> >>> Try this:
> >>> df[]<-lapply(df,as.character)
> >>> df2<-df
> >>> df[,1][grep("@",df$names)]<- ""
> >>> df
> >>> #  names emails
> >>> #1   bob [hidden email]
> >>> #2   joe [hidden email]
> >>> #3       [hidden email]
> >>> #4 emily [hidden email]
> >>> #5       [hidden email]

Re: [R] Recursive file Download from FTP Error

2013-01-27 Thread Peter Maclean
I am trying to download multiple files from an FTP server. However, I get this
error after downloading several files: cannot open URL ''. The stoppage occurs
at random; it can be after the 10th or the 30th file, etc. The code I am using is:

#
#rm(list=ls(all=TRUE))
require(RCurl)
require("R.utils")
ftp.root <- "ftp://ftp.cpc.ncep.noaa.gov/fews/AFR_CLIM/ARC2/DATA/1987/"
dropbox.root <- "H:/UNZIPPEDATA/"

# Function to download
fdownload <- function(sourcelink) {
  # mirror the path below ftp.root under dropbox.root
  targetlink <- paste(dropbox.root,
                      substr(sourcelink, nchar(ftp.root) + 1, nchar(sourcelink)),
                      sep = '')

  # list the contents of the remote directory
  filenames <- getURL(sourcelink, ftp.use.epsv = FALSE, dirlistonly = TRUE)
  filenames <- strsplit(filenames, "\r\n*")
  filenames <- unlist(filenames)
  files <- filenames[grep('\\.', filenames)]
  dirs <- setdiff(filenames, files)
  if (length(dirs) != 0) {
    dirs <- paste(sourcelink, dirs, '/', sep = '')
  }
  # Download files
  for (filename in files) {
    sourcefile <- paste(sourcelink, filename, sep = '')
    targetfile <- paste(targetlink, filename, sep = '')
    download.file(sourcefile, targetfile, cacheOK = TRUE, quiet = FALSE)
  }
  # subfolders
  for (dirname in dirs) {
    fdownload(dirname)
  }
}
# Call the function
fdownload(ftp.root)

##

I could not find any solution from the web.

I will appreciate any help

Peter Maclean
Department of Economics
UDSM
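
One common way to cope with intermittent failures like this is to retry the
failing call a few times with a pause in between. The sketch below is only an
illustration of that idea, not a diagnosis of the error above; the helper
with_retry and the stand-in function flaky are made up for the example.

```r
# Hypothetical helper: call action() until it succeeds or tries run out
with_retry <- function(action, tries = 5, wait = 2) {
  for (attempt in seq_len(tries)) {
    result <- tryCatch(action(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    message("attempt ", attempt, " failed: ", conditionMessage(result))
    if (attempt < tries) Sys.sleep(wait)   # pause before the next attempt
  }
  stop("all ", tries, " attempts failed: ", conditionMessage(result))
}

# Stand-in for a flaky download: fails twice, then succeeds
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("cannot open URL")
  "ok"
}

outcome <- with_retry(flaky, tries = 5, wait = 0)
```

Inside fdownload() the download line could then be wrapped as
with_retry(function() download.file(sourcefile, targetfile)).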


[R] Package for multi-dimensional terminal value ODE solver

2013-01-27 Thread Robert A'gata
Hi,

I did some googling. I found multiple ODE solver packages. But I am
wondering if there is any terminal value ODE solver (linear case) for
multi-dimension case or not? I have not come across any. Any guidance would
be appreciated.

Robert



[R] Package: VennDiagram. Error in draw.pairwise.venn Impossible: cross section area too large

2013-01-27 Thread Fabrice Tourre
Dear list,

When I use the VennDiagram package, I get an error as follows:

venn.plot <- draw.pairwise.venn(
area1 = 3186,
area2 = 325,
cross.area = 5880);


Error in draw.pairwise.venn(area1 = 3186, area2 = 325, cross.area = 588) :
  Impossible: cross section area too large.

Does anyone have a suggestion?

Thank you.



Re: [R] Unexpected behavior with abbreviation of an argument to paste

2013-01-27 Thread David Winsemius


On Jan 27, 2013, at 4:10 PM, Dennis Fisher wrote:


R 2.15.1
OS X

Colleagues,

I encountered the following unexpected behavior today:

The following command yielded the expected result:
paste(c("TEXT1", "TEXT2"), collapse="|")
Result:
[1] "TEXT1|TEXT2"

However, abbreviating "collapse" by even one character:
paste(c("TEXT1", "TEXT2"), collaps="|")
yielded the following:
[1] "TEXT1 |" "TEXT2 |"

My experience has been that one can abbreviate these options as long  
as there is no ambiguity.  For example, the "ignore.case" argument  
for grep can be abbreviated to "ig" (shortening it to a single  
character creates an ambiguity).


Is there something special about "collapse" that I am missing?


What you are missing is not special about 'collapse', but is rather a  
feature shared by any argument after the "dots" in a function's  
argument list.
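
A short demonstration of the rule, using two made-up functions f and g that
differ only in where the dots sit. Before the dots an abbreviated argument
name is partially matched; after the dots it has to be spelled out in full,
otherwise the value is swallowed by the dots.

```r
# 'collapse' comes before the dots: partial matching applies
f <- function(collapse = NULL, ...) collapse

# 'collapse' comes after the dots: only an exact name matches
g <- function(..., collapse = NULL) collapse

f(collaps = "|")   # abbreviation matches 'collapse'
g(collaps = "|")   # abbreviation goes into the dots, 'collapse' stays NULL

# paste() is of the second kind, so "|" is treated as one more vector to paste
paste(c("TEXT1", "TEXT2"), collaps = "|")
paste(c("TEXT1", "TEXT2"), collapse = "|")
```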


--

David Winsemius, MD
Alameda, CA, USA



Re: [R] Unexpected behavior with abbreviation of an argument to paste

2013-01-27 Thread Mark Leeds
Hi Dennis: One of the function argument matching rules in R is that arguments
after the dotdotdot have to be matched exactly (see the code for paste below),
so that's why your attempt doesn't work. But I would have been surprised
too, so I'm not trying to imply that one should know which functions have
arguments after dotdotdot and which don't.

#=

paste
function (..., sep = " ", collapse = NULL)
.Internal(paste(list(...), sep, collapse))







[R] Unexpected behavior with abbreviation of an argument to paste

2013-01-27 Thread Dennis Fisher
R 2.15.1
OS X

Colleagues,

I encountered the following unexpected behavior today:

The following command yielded the expected result:
paste(c("TEXT1", "TEXT2"), collapse="|")
Result:
[1] "TEXT1|TEXT2"

However, abbreviating "collapse" by even one character:
paste(c("TEXT1", "TEXT2"), collaps="|") 
 yielded the following:
[1] "TEXT1 |" "TEXT2 |"

My experience has been that one can abbreviate these options as long as there 
is no ambiguity.  For example, the "ignore.case" argument for grep can be 
abbreviated to "ig" (shortening it to a single character creates an ambiguity).

Is there something special about "collapse" that I am missing?

Dennis

Dennis Fisher MD
P < (The "P Less Than" Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com



Re: [R] dummy encoding in metafor

2013-01-27 Thread Alma Wilflinger
Hi Michael!

Yes, I do use Cohen's d. As a matter of fact, my thesis supervisor told me to
use 1 as the value of the standard deviation for all of my studies.
Unfortunately, I am not entirely sure myself why to do this. Have you ever
used such an approach?

kind regards,
Alma





 From: Michael Dewey 

t.co.uk>; "r-help@r-project.org"  
Sent: Thursday, January 24, 2013 6:57 PM
Subject: Re: [R] dummy encoding in metafor

At 22:06 23/01/2013, Alma Wilflinger wrote:

> Hi Michael,
> 
> The supervisor for my Master's thesis told me that my means are the effect
> sizes and that, because of this, I have to use the value 1 for all standard
> deviations. So I hope that was the right information.

Alma
There is a fairly comprehensive list of all the things which might be an effect 
size on
http://en.wikipedia.org/wiki/Effect_size
Is what you call Mean one of them?


> 
> From: Michael Dewey 

wolfgang.viechtba...@maastrichtuniversity.nl>; Michael Dewey 
; "r-help@r-project.org" 
> Sent: Wednesday, January 23, 2013 10:22 AM
> Subject: Re: [R] dummy encoding in metafor
> 
> At 08:30 23/01/2013, Alma Wilflinger wrote:
> > Dear Wolfgang and Michael,
> >
> >
> > Concerning the Variance: I took the variance I used for CMA (which is 
> > always 1), so I think it should be the right one.
> 
> It seems unlikely to me that the variance from each study would be the same 
> although I suppose it could be possible. Are you sure you are supplying the 
> right values to CMA?
> 
> 
> > Thank you for noticing and mentioning though :)
> >
> > I really appreciate how helpful you both are.
> >
> > best,
> > Alma
> >
> >
> >
> > From: Viechtbauer Wolfgang (STAT) 
> > <wolfgang.viechtba...@maastrichtuniversity.nl>
> > To: Michael Dewey <i...@aghmed.fsnet.co
"r-help@r-project.org" 
<r-help@r-project.org>
> > Sent: Monday, January 21, 2013 11:10 AM
> > Subject: RE: [R] dummy encoding in metafor
> >
> > As Michael already mentioned, the error:
> >
> > Error in qr.solve(wX, diag(k)) : singular matrix 'a' in solve
> >
> > indeed indicates that your design matrix is not of full rank (i.e., there 
> > are linear dependencies among your predictors). With this many factors in 
> > the same model, this is not surprising if k is "only" 94 (which is actually 
> > quite large for a meta-analysis). One option is to leave out some of the 
> > predictors. You can also try collapsing some of the levels of the factors. 
> > Of course, you lose some "details" that way, but apparently you don't have 
> > enough data in the first place to carry out such a detailed analysis.
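
A rank-deficient design matrix of the kind described above can be checked before fitting. The following is only a sketch in base R: the data frame `dat` and its columns are made up for illustration and are not the poster's data. Here `region` is fully determined by `country`, so the dummy columns are linearly dependent.

```r
# Hypothetical moderator data: 'region' is fully determined by 'country',
# so the dummy columns for the two factors are linearly dependent.
dat <- data.frame(
  country = factor(c("AT", "AT", "DE", "DE", "NL", "NL")),
  region  = factor(c("south", "south", "north", "north", "north", "north")),
  year    = c(2001, 2005, 2003, 2007, 2002, 2006)
)

X <- model.matrix(~ country + region + year, data = dat)

# A full-rank design matrix has rank equal to its number of columns.
rank_X <- qr(X)$rank
c(columns = ncol(X), rank = rank_X)

# Columns placed after the numerical rank in the pivot order are the
# ones that qr() treated as redundant.
redundant <- colnames(X)[qr(X)$pivot[-seq_len(rank_X)]]
redundant
```

If the rank is smaller than the number of columns, dropping or collapsing the moderators behind the redundant columns is one way to obtain a model that can be fitted.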
> >
> > One other thing I noticed. You wrote:
> >
> > rma(yi=Mean, vi=Variance, ni=N.1, ...)
> >
> > I suspect that your variable "Variance" is actually the variance of the raw 
> > scores. However, the vi argument is used to pass the sampling variances of 
> > the yi values to the function -- not the variance of raw scores. The 
> > (estimated) sampling variance of a mean is s^2 / n, so if I am not 
> > mistaken, you really want to use:
> >
> > rma(yi=Mean, vi=Variance/N.1, ...)
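
The point about sampling variances can be illustrated with a small sketch. The data frame `studies` and its values are hypothetical; only the formula s^2 / n comes from the message above.

```r
# Hypothetical per-study summaries (not data from the thread).
studies <- data.frame(
  Mean = c(4.2, 3.9, 4.6),
  SD   = c(1.0, 1.2, 0.8),   # standard deviation of the raw scores
  N    = c(25, 40, 16)
)

# Estimated sampling variance of each mean: s^2 / n.
studies$vi <- studies$SD^2 / studies$N
studies
```

The resulting `vi` column, rather than the raw-score variance, is what would be supplied as the `vi` argument in a call such as the one quoted above.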
> >
> > Best,
> > Wolfgang
> >
> > --
> > Wolfgang Viechtbauer, Ph.D., Statistician
> > Department of Psychiatry and Psychology
> > School for Mental Health and Neuroscience
> > Faculty of Health, Medicine, and Life Sciences
> > Maastricht University, P.O. Box 616 (VIJV1)
> > 6200 MD Maastricht, The Netherlands
> > +31 (43) 388-4170 | http://www.wvbauer.com
> >
> > > -Original Message-
> > > From: 
> > > r-help-boun...@r-project.org
> > >  [mailto:r-help-boun...@r-project.org]
> > > On Behalf Of Michael Dewey
> > > Sent: Monday, January 21, 2013 10:40
> > > To: Alma Wilflinger; Michael Dewey; 
> > > r-help@r-project.org
> > > Subject: Re: [R] dummy encoding in metafor
> > >
> > > At 14:48 20/01/2013, Alma Wilflinger wrote:
> > > >Hi,
> > > >
> > > >thank you very much for your kind answer.
> > > >
> > > > >If you look a bit further down the manual page you will see
> > > > >### using a model formula to specify the same model
> > > > >rma(yi, vi, mods=~factor(alloc)+year+ablat, data=dat, method="REML",
> > > > >btt=c(2,3))
> > > >
> > > > >which is much easier.
> > > >
> > > >I have seen the possibility of using a model formula for dummy
> > > >encoding and you are right it is much easier than doing it by hand.
> > > >Thing is that if I include some moderator variables into the
> > > >parameters I get the error:
> > > >
> > > >Error in qr.solve(wX, diag(k)) : singular matrix 'a' in solve
> > >
> > > I suspect that you have a linear dependence between your moderator
> > > variables. Depending on how many levels there are for country,
> > > sample, and so on you do have a lot of predictors (you pre

Re: [R] confidence / prediction ellipse

2013-01-27 Thread John Fox
Dear Giuseppe and Bert,

I also didn't follow what's intended, more or less for the same reasons as
Bert mentioned, which is why I didn't reply to the initial posting. In the
car package, confidenceEllipse() draws confidence ellipses for a pair of
coefficients from a statistical model, and dataEllipse() draws
bivariate-normal concentration ellipses for the bivariate distribution of
two variables.

I'm copying to Georges Monette and Michael Friendly, coauthors of these
functions, in case they have something to add.

I hope that this helps, but I doubt that it does.

John

---
John Fox
Senator McMaster Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada




> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Giuseppe Amatulli
> Sent: Sunday, January 27, 2013 11:41 AM
> To: Bert Gunter
> Cc: r-help@r-project.org
> Subject: Re: [R] confidence / prediction ellipse
> 
> Hi,
> thanks for your reply.
> My values of a and b are respectively:
> a = observation of an event
> b = prediction of a model.
> 
> Therefore i would like to draw the confidence region for predicting a
> new
> observation, and according to this
> http://v8doc.sas.com/sashtml/insight/chap40/sect35.htm the prediction
> ellipse should be more appropriate.
> 
> However, I am not able to trace back where the expression
> radius <- sqrt(dfn * qf(level, dfn, dfd))
> comes from, in order to change it and draw a prediction ellipse.
> 
> Regards
> Giuseppe
> 
> 
> 
> 
> On 26 January 2013 17:19, Bert Gunter  wrote:
> 
> > Well, I'd guess you have to first define what you mean by "prediction
> > ellipse," as the confidence ellipses are for the bivariate
> > distribution of 2 parameter estimates -- as I understand it --
> > whereas predictions depend on the covariate values and are for a
> > single response value (unless you have fitted multiple responses, I
> > suppose).
> >
> > -- Bert
> >
> > On Sat, Jan 26, 2013 at 1:12 PM, Giuseppe Amatulli
> >  wrote:
> > > Hi,
> > > I'm using the R library(car) to draw confidence/prediction ellipses
> in a
> > > scatterplot.
> > > From what I understand, the ellipse() function returns an ellipse based on
> > > the parameters shape, center, and radius.
> > > If I read the dataEllipse() function, I can see how these parameters are
> > > calculated for a confidence ellipse.
> > >
> > > library(car)
> > > library(MASS)  # cov.trob() lives in MASS
> > >
> > > a=c(12,12,4,5,63,63,23)
> > > b=c(13,15,7,10,73,83,43)
> > >
> > > v <- cov.trob(cbind(a, b))
> > > shape <- v$cov
> > > center <- v$center
> > >
> > > # general form: radius <- sqrt(dfn * qf(level, dfn, dfd))
> > > radius <- sqrt(2 * qf(0.95, 2, length(a) - 1))
> > >
> > > conf.elip = ellipse(center, shape, radius, draw = FALSE)
> > > plot(conf.elip, type='l')
> > > points(a,b)
> > >
> > > My question is how I can calculate shape, center and radius to obtain a
> > > prediction ellipse rather than a confidence ellipse?
> > > Thanks in Advance
> > > Giuseppe
> > >
> > > --
> > > Giuseppe Amatulli
> > > Web: www.spatial-ecology.net
> >
> >
> >
> > --
> >
> > Bert Gunter
> > Genentech Nonclinical Biostatistics
> >
> > Internal Contact Info:
> > Phone: 467-7374
> > Website:
> >
> > http://pharmadevelopment.roche.com/index/pdb/pdb-functional-
> groups/pdb-biostatistics/pdb-ncb-home.htm
> >
> 
> 
> 
> --
> Giuseppe Amatulli
> Web: www.spatial-ecology.net

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] scan not working

2013-01-27 Thread Rui Barradas

Hello,

Try the following. Create a file called test.R with these instructions:

cmd <- commandArgs(TRUE)
if (any(grepl("--ext", cmd))) {
    suffix <- cmd[grep("--ext", cmd)]
    suffix <- unlist(strsplit(suffix, ":"))[2]
    fl <- paste0("test_", suffix, ".txt") # underscore included
} else {
    fl <- "test.txt"
}
write(cmd, file = fl)

# Call like this
#$ Rscript test.R other --options --ext:cro


This creates a file test_cro.txt with the command line options written 
to it. You can adapt this R script to your needs.
Instead of 'suf', I've used option '--ext' for 'extension'. Change to 
fit your taste.
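
Applied to the "pvalues" and "qvalues" files from the question, the same idea might look like the sketch below. The helper `suffixed_name()` is hypothetical, and the sketch assumes the script is called as `Rscript get_q_values.R LRT_codeml_output cro`, with the input file as the first trailing argument and the letter code as the second.

```r
# Hypothetical helper: append an optional suffix to a base file name.
suffixed_name <- function(base, suffix = NULL) {
  if (is.null(suffix) || is.na(suffix) || !nzchar(suffix)) {
    return(base)
  }
  paste0(base, "_", suffix)
}

# Assumed call: Rscript get_q_values.R LRT_codeml_output cro
# args[1] would be the input file, args[2] the letter code.
args <- commandArgs(trailingOnly = TRUE)
suffix <- if (length(args) >= 2) args[2] else NULL

pvalues_file <- suffixed_name("pvalues", suffix)
qvalues_file <- suffixed_name("qvalues", suffix)
```

The two names could then replace the fixed strings in the `write()` calls, for example `write(pvalues, file = pvalues_file, ncol = 1)`.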



Hope this helps,

Rui Barradas

Em 27-01-2013 17:27, Emily Sessa escreveu:

Hello all (again),

I received a very helpful answer to this question, and would like to pose one 
more:

Right now I have this script, which is being called from the command line, writing output to two 
generically named files ("pvalues" and "qvalues") that are named in the script 
using the line:

write(pvalues, file="pvalues", ncol=1)
write(adjusted, file="qvalues", ncol=1)

However, ideally I would like those two files to have something appended to their names that make them 
separate from one another, so I can identify which input they went with and so they won't write over each 
other when I script this into a Perl pipeline that will process many input files, which is my ultimate goal. 
I know how to do this in Perl, but not R... is there some way I can add another argument on the command line 
that will get passed to the R.script, like a simple letter code (e.g. "cro"), and then have it 
append that to the output file names, so they are, for example: "qvalues_cro" and 
"pvalues_cro"?

Thank you very much,
Emily

On Jan 27, 2013, at 4:34 AM, peter dalgaard  wrote:



On Jan 27, 2013, at 08:33 , Emily Sessa wrote:


Hi all,

I am trying to use the scan function in an R script that I am calling from the 
command line on a Mac; at the shell prompt I type:

$ Rscript get_q_values.R LRT_codeml_output

in the hope that LRT_codeml_output will get passed to the get_q_values R 
script. The first line of that script is:

chidata <- scan(file="")

which, as I understand how scan works, will read the contents of the file from the 
command line into the object chidata. I did this a few times and it worked like a charm. 
And then, it stopped working. Now, every time I try to do this, I get "Read 0 
items" as the next line in the terminal window, and the output produced by the 
script is empty, because it's apparently no longer reading anything in. I don't think I 
changed anything in the script; it just stopped being able to execute the scan function. 
Does anyone have any idea how to fix this?? I did not have anything else in that scan 
line when it was working before. I've updated R and restarted my computer in the hope 
that it would help, but it hasn't. Any help would be much appreciated.


I don't see how that would ever work. The 2nd and further args to Rscript are 
passed to R and accessible via commandArgs(). There's no way that scan() can 
know what the arguments are. It might work with

Rscript get_q_values.R < LRT_codeml_output

though. Or you need to arrange explicitly for scan(file=commandArgs(TRUE)[1]).
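
The two options just mentioned can be combined in one small function. This is a sketch only: `read_input()` is a hypothetical name, and the temporary file at the end stands in for LRT_codeml_output.

```r
# Read numeric data from the file named by the first trailing argument,
# or from standard input when no argument is given.
read_input <- function(args = commandArgs(trailingOnly = TRUE)) {
  if (length(args) >= 1) {
    scan(file = args[1], quiet = TRUE)
  } else {
    scan(file = file("stdin"), quiet = TRUE)
  }
}

# Demonstration with a temporary file standing in for the real input.
tmp <- tempfile()
writeLines(c("3.84", "6.63", "10.83"), tmp)
chidata <- read_input(tmp)
```

In the script itself, `chidata <- read_input()` would then pick up the file name given after the script name on the command line.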



-ES

Re: [R] Removing values containing a specific character

2013-01-27 Thread arun
Hi,
I tried with a bigger dataset.

set.seed(25)
names <- sample(c("bob", "joe", "cr...@gmail.com", "emily", 
"j...@yahoo.com"),5e6,replace=TRUE)
set.seed(1651)
emails <- sample(c("b...@cup.com", "joesm...@gmail.com", "cr...@gmail.com",
                   "emi...@yahoo.com", "j...@yahoo.com"),5e6,replace=TRUE)

 df <- data.frame(names, emails)
 dim(df)
#[1] 5000000       2
 df[]<-lapply(df,as.character)
 system.time(df[,1][grep("@",df$names)]<- "" )
#   user  system elapsed 
#  1.732   0.108   1.844 
 system.time(dfNew1<-df[grep("\\w+",df$names),])
#   user  system elapsed 
#  0.896   0.024   0.923 
 system.time(dfNew2<- df[df$names!="",])
#   user  system elapsed 
 # 0.460   0.028   0.490 
A.K.








From: Yasha Podeswa 
To: arun  
Cc: R help ; Uwe Ligges  
Sent: Sunday, January 27, 2013 2:05 PM
Subject: Re: [R] Removing values containing a specific character


You two were 100% right, it was just a memory issue.  This was part of a bigger 
project where I had a number of data frames loaded, all with 1-5 million rows. 
Cleaned up my code to have fewer data frames loaded at once, and everything is 
working great.  Thanks for the help!
On Jan 27, 2013 9:46 AM, "arun"  wrote:

Hi Yasha,
>
> I guess you got Uwe's response.
>
> I created `df2` with the intention of getting the two results from the 
>original dataset.
>For example, after you get the first result
>df[,1][grep("@",df$names)]<- ""
>#you can get the second result by:
>df[df$names!="",]
> # names emails
>#1   bob   b...@cup.com
>#2   joe joesm...@gmail.com
>#4 emily   emi...@yahoo.com
>
>#or
>df[grep("\\w+",df$names),]
>#  names emails
>#1   bob   b...@cup.com
>#2   joe joesm...@gmail.com
>#4 emily   emi...@yahoo.com
>
>But, I am  not sure how this will work over a 5.5 million rows.
>A.K.
>
>
>
>
>- Original Message -
>From: ypodeswa 
>To: r-help@r-project.org
>Cc:
>Sent: Sunday, January 27, 2013 1:11 AM
>Subject: Re: [R] Removing values containing a specific character
>
>Actually, it worked perfectly for my sample data, but my actual data has
>5.5 million rows, and grep doesn't seem to work with over a million rows.
>Any idea on a workaround?
>
>
>On Sat, Jan 26, 2013 at 9:37 PM, Yasha Podeswa  wrote:
>
>> Awesome, thanks Arun, that's exactly what I was looking for!
>>
>>
>> On Sat, Jan 26, 2013 at 9:21 PM, arun kirshna [via R] <
>> ml-node+s789695n4656749...@n4.nabble.com> wrote:
>>
>>> Hi,
>>> Try this:
>>> df[]<-lapply(df,as.character)
>>> df2<-df
>>> df[,1][grep("@",df$names)]<- ""
>>> df
>>>   #names             emails
>>> #1   bob      b...@cup.com
>>> #2   joe joesm...@gmail.com
>>> #3          cr...@gmail.com
>>> #4 emily  emi...@yahoo.com
>>> #5          j...@yahoo.com
>>>
>>> #2nd part:
>>>
>>>  df2[-grep("@",df2$names),]
>>>   names             emails
>>> #1   bob      b...@cup.com
>>> #2   joe joesm...@gmail.com
>>> #4 emily  emi...@yahoo.com
>>> A.K.
>>>


Re: [R] Converting column of strings to boolean

2013-01-27 Thread domcastro
Thanks all. I will give them all a go and let you know the outcome.

kind regards





Re: [R] scan not working

2013-01-27 Thread Steve Lianoglou
Hi,

It sounds like you just want to write a command line script using R
and you would pass the suffix/prefix as command line args, no?

Why not just go with what Peter has already suggested with
`commandArgs`, or if you want a more feature-rich command line arg
parser, you can try:

http://cran.r-project.org/web/packages/optparse/index.html

If the command line argument thing won't work for you, perhaps you can
elaborate more? For instance, you mention that you know how you might
do "this" in Perl ... perhaps you can clarify "this" a bit more.

-steve

On Sun, Jan 27, 2013 at 12:27 PM, Emily Sessa  wrote:
> Hello all (again),
>
> I received a very helpful answer to this question, and would like to pose one 
> more:
>
> Right now I have this script, which is being called from the command line, 
> writing output to two generically named files ("pvalues" and "qvalues") that 
> are named in the script using the line:
>
> write(pvalues, file="pvalues", ncol=1)
> write(adjusted, file="qvalues", ncol=1)
>
> However, ideally I would like those two files to have something appended to 
> their names that make them separate from one another, so I can identify which 
> input they went with and so they won't write over each other when I script 
> this into a Perl pipeline that will process many input files, which is my 
> ultimate goal. I know how to do this in Perl, but not R... is there some way 
> I can add another argument on the command line that will get passed to the 
> R.script, like a simple letter code (e.g. "cro"), and then have it append 
> that to the output file names, so they are, for example: "qvalues_cro" and 
> "pvalues_cro"?
>
> Thank you very much,
> Emily
>
> On Jan 27, 2013, at 4:34 AM, peter dalgaard  wrote:
>
>>
>> On Jan 27, 2013, at 08:33 , Emily Sessa wrote:
>>
>>> Hi all,
>>>
>>> I am trying to use the scan function in an R script that I am calling from 
>>> the command line on a Mac; at the shell prompt I type:
>>>
>>> $ Rscript get_q_values.R LRT_codeml_output
>>>
>>> in the hope that LRT_codeml_output will get passed to the get_q_values R 
>>> script. The first line of that script is:
>>>
>>> chidata <- scan(file="")
>>>
>>> which, as I understand how scan works, will read the contents of the file 
>>> from the command line into the object chidata. I did this a few times and 
>>> it worked like a charm. And then, it stopped working. Now, every time I try 
>>> to do this, I get "Read 0 items" as the next line in the terminal window, 
>>> and the output produced by the script is empty, because it's apparently no 
>>> longer reading anything in. I don't think I changed anything in the script; 
>>> it just stopped being able to execute the scan function. Does anyone have 
>>> any idea how to fix this?? I did not have anything else in that scan line 
>>> when it was working before. I've updated R and restarted my computer in the 
>>> hope that it would help, but it hasn't. Any help would be much appreciated.
>>
>> I don't see how that would ever work. The 2nd and further args to Rscript 
>> are passed to R and accessible via commandArgs(). There's no way that scan() 
>> can know what the arguments are. It might work with
>>
>> Rscript get_q_values.R < LRT_codeml_output
>>
>> though. Or you need to arrange explicitly for 
>> scan(file=commandArgs(TRUE)[1]).
>>
>>>
>>> -ES



-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] Removing values containing a specific character

2013-01-27 Thread arun
Hi Yasha,

 I guess you got Uwe's response. 

 I created `df2` with the intention of getting the two results from the 
original dataset.
For example, after you get the first result
df[,1][grep("@",df$names)]<- "" 
#you can get the second result by:
df[df$names!="",]
 # names emails
#1   bob   b...@cup.com
#2   joe joesm...@gmail.com
#4 emily   emi...@yahoo.com

#or
df[grep("\\w+",df$names),]
#  names emails
#1   bob   b...@cup.com
#2   joe joesm...@gmail.com
#4 emily   emi...@yahoo.com

However, I am not sure how well this will work on 5.5 million rows. 
A.K.
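
As a possible alternative to the integer indices returned by grep(), a logical vector from grepl() can serve for both steps. The sketch below uses made-up addresses in place of the ones in the thread. One known difference from `df[-grep(...), ]` is that logical indexing keeps all rows when nothing matches, whereas a negated empty index returns no rows.

```r
# Small stand-in for the data frame in the thread (values are made up).
df <- data.frame(
  names  = c("bob", "joe", "craig@example.com", "emily", "jon@example.com"),
  emails = c("bob@example.com", "joe@example.com", "craig@example.com",
             "emily@example.com", "jon@example.com"),
  stringsAsFactors = FALSE
)

# TRUE where the name contains a literal "@".
has_at <- grepl("@", df$names, fixed = TRUE)

# Blank the offending names in place ...
blanked <- df
blanked$names[has_at] <- ""

# ... or drop those rows entirely.
kept <- df[!has_at, ]

# With a pattern that matches nothing, logical indexing keeps every row.
no_match <- grepl("#", df$names, fixed = TRUE)
nrow(df[!no_match, ])
```

Using `fixed = TRUE` treats "@" as a literal string rather than a regular expression, which may also help on very large vectors.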




- Original Message -
From: ypodeswa 
To: r-help@r-project.org
Cc: 
Sent: Sunday, January 27, 2013 1:11 AM
Subject: Re: [R] Removing values containing a specific character

Actually, it worked perfectly for my sample data, but my actual data has
5.5 million rows, and grep doesn't seem to work with over a million rows.
Any idea on a workaround?


On Sat, Jan 26, 2013 at 9:37 PM, Yasha Podeswa  wrote:

> Awesome, thanks Arun, that's exactly what I was looking for!
>
>
> On Sat, Jan 26, 2013 at 9:21 PM, arun kirshna [via R] <
> ml-node+s789695n4656749...@n4.nabble.com> wrote:
>
>> Hi,
>> Try this:
>> df[]<-lapply(df,as.character)
>> df2<-df
>> df[,1][grep("@",df$names)]<- ""
>> df
>>   #names             emails
>> #1   bob      b...@cup.com
>> #2   joe joesm...@gmail.com
>> #3          cr...@gmail.com
>> #4 emily  emi...@yahoo.com
>> #5          j...@yahoo.com
>>
>> #2nd part:
>>
>>  df2[-grep("@",df2$names),]
>>   names             emails
>> #1   bob      b...@cup.com
>> #2   joe joesm...@gmail.com
>> #4 emily  emi...@yahoo.com
>> A.K.
>>


Re: [R] confidence / prediction ellipse

2013-01-27 Thread Giuseppe Amatulli
Hi,
thanks for your reply.
My values of a and b are, respectively:
a = observation of an event
b = prediction of a model.

Therefore I would like to draw the confidence region for predicting a new
observation, and according to
http://v8doc.sas.com/sashtml/insight/chap40/sect35.htm the prediction
ellipse should be more appropriate.

However, I am not able to trace back where the expression
radius <- sqrt(dfn * qf(level, dfn, dfd))
comes from, in order to change it and draw a prediction ellipse.
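
One common textbook form of a prediction ellipse for a single new observation, assuming bivariate normal data and the classical sample mean and covariance (not the robust cov.trob() estimates used in the original code), scales the F quantile by p(n-1)(n+1) / (n(n-p)). The sketch below uses base R only and is an illustration under those assumptions, not the method used inside car.

```r
a <- c(12, 12, 4, 5, 63, 63, 23)
b <- c(13, 15, 7, 10, 73, 83, 43)

xy     <- cbind(a, b)
n      <- nrow(xy)
p      <- ncol(xy)
center <- colMeans(xy)
shape  <- cov(xy)   # classical estimate, not the robust cov.trob() fit
level  <- 0.95

# Radius for a region expected to contain one new bivariate-normal
# observation with probability 'level'.
radius <- sqrt(p * (n - 1) * (n + 1) / (n * (n - p)) * qf(level, p, n - p))

# Trace the ellipse
#   { x : (x - center)' shape^{-1} (x - center) = radius^2 }.
theta  <- seq(0, 2 * pi, length.out = 200)
circle <- cbind(cos(theta), sin(theta))
pred_ellipse <- sweep(radius * circle %*% chol(shape), 2, center, "+")
```

The outline can be drawn with `plot(pred_ellipse, type = "l")` followed by `points(a, b)`. The same `center`, `shape` and `radius` could presumably also be handed to ellipse() as in the originally posted code.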

Regards
Giuseppe




On 26 January 2013 17:19, Bert Gunter  wrote:

> Well, I'd guess you have to first define what you mean by "prediction
> ellipse," as the confidence ellipses are for the bivariate
> distribution of 2 parameter estimates -- as I understand it --
> whereas predictions depend on the covariate values and are for a
> single response value (unless you have fitted multiple responses, I
> suppose).
>
> -- Bert
>
> On Sat, Jan 26, 2013 at 1:12 PM, Giuseppe Amatulli
>  wrote:
> > Hi,
> > I'm using the R library(car) to draw confidence/prediction ellipses in a
> > scatterplot.
> > From what I understand, the ellipse() function returns an ellipse based on
> > the parameters shape, center, and radius.
> > If I read the dataEllipse() function, I can see how these parameters are
> > calculated for a confidence ellipse.
> >
> > library(car)
> > library(MASS)  # cov.trob() lives in MASS
> >
> > a=c(12,12,4,5,63,63,23)
> > b=c(13,15,7,10,73,83,43)
> >
> > v <- cov.trob(cbind(a, b))
> > shape <- v$cov
> > center <- v$center
> >
> > # general form: radius <- sqrt(dfn * qf(level, dfn, dfd))
> > radius <- sqrt(2 * qf(0.95, 2, length(a) - 1))
> >
> > conf.elip = ellipse(center, shape, radius, draw = FALSE)
> > plot(conf.elip, type='l')
> > points(a,b)
> >
> > My question is how I can calculate shape, center and radius to obtain a
> > prediction ellipse rather than a confidence ellipse?
> > Thanks in Advance
> > Giuseppe
> >
> > --
> > Giuseppe Amatulli
> > Web: www.spatial-ecology.net
>
>
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
>
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
>



-- 
Giuseppe Amatulli
Web: www.spatial-ecology.net



Re: [R] scan not working

2013-01-27 Thread Emily Sessa
Hello all (again), 

I received a very helpful answer to this question, and would like to pose one 
more: 

Right now I have this script, which is being called from the command line, 
writing output to two generically named files ("pvalues" and "qvalues") that 
are named in the script using the lines:

write(pvalues, file="pvalues", ncol=1)
write(adjusted, file="qvalues", ncol=1)

However, ideally I would like those two files to have something appended to 
their names that distinguishes them, so I can identify which input they went 
with and so they won't overwrite each other when I script this into a Perl 
pipeline that will process many input files, which is my ultimate goal. I know 
how to do this in Perl, but not in R. Is there some way I can add another 
argument on the command line that will get passed to the R script, like a 
simple letter code (e.g. "cro"), and then have it append that to the output 
file names, so they are, for example, "qvalues_cro" and "pvalues_cro"? 

Thank you very much,
Emily

On Jan 27, 2013, at 4:34 AM, peter dalgaard  wrote:

> 
> On Jan 27, 2013, at 08:33 , Emily Sessa wrote:
> 
>> Hi all, 
>> 
>> I am trying to use the scan function in an R script that I am calling from 
>> the command line on a Mac; at the shell prompt I type: 
>> 
>> $ Rscript get_q_values.R LRT_codeml_output 
>> 
>> in the hope that LRT_codeml_output will get passed to the get_q_values R 
>> script. The first line of that script is: 
>> 
>> chidata <- scan(file="")
>> 
>> which, as I understand how scan works, will read the contents of the file 
>> from the command line into the object chidata. I did this a few times and it 
>> worked like a charm. And then, it stopped working. Now, every time I try to 
>> do this, I get "Read 0 items" as the next line in the terminal window, and 
>> the output produced by the script is empty, because it's apparently no 
>> longer reading anything in. I don't think I changed anything in the script; 
>> it just stopped being able to execute the scan function. Does anyone have 
>> any idea how to fix this?? I did not have anything else in that scan line 
>> when it was working before. I've updated R and restarted my computer in the 
>> hope that it would help, but it hasn't. Any help would be much appreciated.
> 
> I don't see how that would ever work. The 2nd and further args to Rscript are 
> passed to R and accessible via commandArgs(). There's no way that scan() can 
> know what the arguments are. It might work with 
> 
> Rscript get_q_values.R < LRT_codeml_output
> 
> though. Or you need to arrange explicitly for scan(file=commandArgs(TRUE)[1]).
> 
>> 
>> -ES


[R] rpart

2013-01-27 Thread carol white
Hi,
When I look at the summary of an rpart object fitted to my data, I get 7 nodes, 
but when I plot the rpart object, I get only 3 nodes. Should the number of 
nodes in the results of the two functions (summary and plot) not match, or is 
it not always the same?

Look forward to your reply,

Carol

 summary(rpart.res)
Call:
rpart(formula = mydata$class ~ ., data = as.data.frame(t(mydata)))
  n= 62 

         CP nsplit rel error    xerror      xstd
1 0.6363636      0 1.0000000 1.0000000 0.1712469
2 0.1363636      1 0.3636364 0.6818182 0.1532767
3 0.0100000      2 0.2272727 0.7727273 0.1596659

Variable importance
  Hsa.627   Hsa.692 Hsa.692.2  Hsa.3306   Hsa.601   Hsa.831  Hsa.1832  Hsa.2456 
       19        13        11        10        10         8         6         6 
 Hsa.8147  Hsa.1131 Hsa.692.1 
        6         5         5 

Node number 1: 62 observations,    complexity param=0.6363636
  predicted class=t  expected loss=0.3548387  P(node) =1
    class counts:    22    40
   probabilities: 0.355 0.645 
  left son=2 (14 obs) right son=3 (48 obs)
  Primary splits:
  Hsa.627   < 59.83    to the left,  improve=15.05376, (0 missing)
  Hsa.8147  < 1696.23  to the right, improve=14.46790, (0 missing)
  Hsa.37937 < 379.39   to the right, improve=13.75358, (0 missing)
  Hsa.692.2 < 842.305  to the right, improve=12.38710, (0 missing)
  Hsa.1832  < 735.805  to the right, improve=11.90495, (0 missing)
  Surrogate splits:
  Hsa.692.2 < 1086.655 to the right, agree=0.903, adj=0.571, (0 split)
  Hsa.3306  < 170.515  to the left,  agree=0.887, adj=0.500, (0 split)
  Hsa.601   < 88.065   to the left,  agree=0.887, adj=0.500, (0 split)
  Hsa.692   < 1251.99  to the right, agree=0.871, adj=0.429, (0 split)
  Hsa.831   < 281.54   to the left,  agree=0.871, adj=0.429, (0 split)

Node number 2: 14 observations
  predicted class=n  expected loss=0  P(node) =0.2258065
    class counts:    14 0
   probabilities: 1.000 0.000 

Node number 3: 48 observations,    complexity param=0.1363636
  predicted class=t  expected loss=0.167  P(node) =0.7741935
    class counts: 8    40
   probabilities: 0.167 0.833 
  left son=6 (7 obs) right son=7 (41 obs)
  Primary splits:
  Hsa.8147  < 1722.605 to the right, improve=4.915215, (0 missing)
  Hsa.1832  < 681.145  to the right, improve=4.915215, (0 missing)
  Hsa.1410  < 49.985   to the left,  improve=4.915215, (0 missing)
  Hsa.2456  < 186.195  to the right, improve=4.915215, (0 missing)
  Hsa.11616 < 969.085  to the right, improve=4.915215, (0 missing)
  Surrogate splits:
  Hsa.1832  < 681.145  to the right, agree=1.000, adj=1.000, (0 split)
  Hsa.2456  < 186.195  to the right, agree=1.000, adj=1.000, (0 split)
  Hsa.692   < 1048.375 to the right, agree=0.979, adj=0.857, (0 split)
  Hsa.692.1 < 1136.75  to the right, agree=0.979, adj=0.857, (0 split)
  Hsa.1131  < 1679.54  to the right, agree=0.979, adj=0.857, (0 split)

Node number 6: 7 observations
  predicted class=n  expected loss=0.2857143  P(node) =0.1129032
    class counts: 5 2
   probabilities: 0.714 0.286 

Node number 7: 41 observations
  predicted class=t  expected loss=0.07317073  P(node) =0.6612903
    class counts: 3    38
   probabilities: 0.073 0.927 
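
One possible reading of the summary above: it lists five nodes, numbered 1, 2, 3, 6 and 7, of which three (2, 6 and 7) are terminal. In rpart the children of node k are numbered 2k and 2k+1, so the highest node number is not a node count, and a plot shows one leaf per terminal node. The sketch below illustrates this with the `kyphosis` data that ships with rpart, standing in for the poster's data.

```r
library(rpart)

# 'kyphosis' ships with rpart; it stands in for the poster's data.
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)

# Row names of the frame are the node numbers used by summary().
node_ids <- as.integer(rownames(fit$frame))
is_leaf  <- fit$frame$var == "<leaf>"

node_ids             # node numbers, with gaps where a branch was not split
length(node_ids)     # how many nodes the tree actually has
sum(is_leaf)         # terminal nodes, i.e. the leaves drawn by plot()
```

Comparing `length(node_ids)` and `sum(is_leaf)` for the poster's `rpart.res` object would show whether the two displays really disagree.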


Re: [R] Removing values containing a specific character

2013-01-27 Thread Yasha Podeswa
You two were 100% right, it was just a memory issue.  This was part of a
bigger project where I had a number of data frames loaded, all with 1-5
million rows. I cleaned up my code to have fewer data frames loaded at once,
and everything is working great.  Thanks for the help!
On Jan 27, 2013 9:46 AM, "arun"  wrote:

> Hi Yasha,
>
>  I guess you got Uwe's response.
>
>  I created `df2` with the intention of getting the two results from the
> original dataset.
> For example, after you get the first result
> df[,1][grep("@",df$names)]<- ""
> #you can get the second result by:
> df[df$names!="",]
>  # names emails
> #1   bob   b...@cup.com
> #2   joe joesm...@gmail.com
> #4 emily   emi...@yahoo.com
>
> #or
> df[grep("\\w+",df$names),]
> #  names emails
> #1   bob   b...@cup.com
> #2   joe joesm...@gmail.com
> #4 emily   emi...@yahoo.com
>
> But I am not sure how this will work on 5.5 million rows.
> A.K.
>
>
>
>
> - Original Message -
> From: ypodeswa 
> To: r-help@r-project.org
> Cc:
> Sent: Sunday, January 27, 2013 1:11 AM
> Subject: Re: [R] Removing values containing a specific character
>
> Actually, it worked perfectly for my sample data, but my actual data has
> 5.5 million rows, and grep doesn't seem to work with over a million rows.
> Any idea on a workaround?
>
>
> On Sat, Jan 26, 2013 at 9:37 PM, Yasha Podeswa  wrote:
>
> > Awesome, thanks Arun, that's exactly what I was looking for!
> >
> >
> > On Sat, Jan 26, 2013 at 9:21 PM, arun kirshna [via R] <
> > ml-node+s789695n4656749...@n4.nabble.com> wrote:
> >
> >> Hi,
> >> Try this:
> >> df[]<-lapply(df,as.character)
> >> df2<-df
> >> df[,1][grep("@",df$names)]<- ""
> >> df
> >>   #names emails
> >> #1   bob  b...@cup.com
> >> #2   joe joesm...@gmail.com
> >> #3  cr...@gmail.com
> >> #4 emily  emi...@yahoo.com
> >> #5  j...@yahoo.com
> >>
> >> #2nd part:
> >>
> >>  df2[-grep("@",df2$names),]
> >>   names emails
> >> #1   bob  b...@cup.com
> >> #2   joe joesm...@gmail.com
> >> #4 emily  emi...@yahoo.com
> >> A.K.
> >>


Re: [R] confidence / prediction ellipse

2013-01-27 Thread Bert Gunter
You appear to be quite confused. I have no idea what your "a" and "b"
below mean. The SAS documentation that you quote is for prediction
from a bivariate normal. I don't know what that has to do with your
problem, nor with car's confidence ellipses for parameters for a
univariate regression. I would suggest you get local statistical help,
though perhaps someone with more time on their hands on this list than
I may help sort you out.

Cheers,
Bert
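
For what it is worth, here is a minimal sketch of one common reading of "prediction ellipse": the region expected to contain a single new (a, b) pair, assuming the pairs are a sample from a bivariate normal distribution. It uses the classical mean and covariance from base R rather than cov.trob(), and the names pred_radius and pred_ellipse are illustrative only. Whether such a region suits the observation-versus-model-prediction setting of the question is a separate issue that the code does not settle.

```r
# Data from the original post
a <- c(12, 12, 4, 5, 63, 63, 23)
b <- c(13, 15, 7, 10, 73, 83, 43)
X <- cbind(a, b)

n <- nrow(X)   # number of observed pairs
p <- ncol(X)   # dimension (2 for an ellipse)
level <- 0.95

center <- colMeans(X)  # classical estimates (assumption: no robust fit)
shape  <- cov(X)

# Radius used in the thread for the data/confidence ellipse
data_radius <- sqrt(p * qf(level, p, n - 1))

# Radius for a new observation under bivariate normality:
# r^2 = p (n + 1)(n - 1) / (n (n - p)) * F_level(p, n - p)
pred_radius <- sqrt(p * (n + 1) * (n - 1) / (n * (n - p)) * qf(level, p, n - p))

# Map the unit circle onto the ellipse via the Cholesky factor of shape
theta <- seq(0, 2 * pi, length.out = 200)
unit_circle <- cbind(cos(theta), sin(theta))
pred_ellipse <- sweep(pred_radius * unit_circle %*% chol(shape), 2, center, "+")

# plot(pred_ellipse, type = "l"); points(a, b)
```

Compared with the sqrt(dfn * qf(level, dfn, dfd)) radius quoted in the thread, only the scaling factor and the denominator degrees of freedom change; center and shape play the same role as before.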

On Sun, Jan 27, 2013 at 8:41 AM, Giuseppe Amatulli
 wrote:
> Hi,
> thanks for your reply.
> My values of a and b are respectively:
> a = observation of an event
> b = prediction of a model.
>
> Therefore i would like to draw the confidence region for predicting a new
> observation, and according to this
> http://v8doc.sas.com/sashtml/insight/chap40/sect35.htm the prediction
> ellipse should be more appropriate.
>
> But i'm not able to track back the function
> radius <- sqrt(dfn * qf(level, dfn, dfd))
> in order to change it and draw a prediction ellipses.
>
> Regards
> Giuseppe
>
>
>
>
> On 26 January 2013 17:19, Bert Gunter  wrote:
>>
>> Well, I'd guess you have to first define what you mean by "prediction
>> ellipse," as the confidence ellipses are for the bivariate
>> distribution of 2 parameter estimates -- as I understand it --
>> whereas predictions depend on the covariate values and are for a
>> single response value (unless you have fitted multiple responses, I
>> suppose).
>>
>> -- Bert
>>
>> On Sat, Jan 26, 2013 at 1:12 PM, Giuseppe Amatulli
>>  wrote:
>> > Hi,
>> > I'm using the R library(car) to draw confidence/prediction ellipses in a
>> > scatterplot.
>> > From what I understood, the ellipse() function returns an ellipse based on
>> > the parameters shape, center, and radius.
>> > If I read the dataEllipse() function, I can see how these parameters are
>> > calculated for a confidence ellipse.
>> >
>> > library(car)
>> > library(MASS)  # provides cov.trob()
>> >
>> > a=c(12,12,4,5,63,63,23)
>> > b=c(13,15,7,10,73,83,43)
>> >
>> > v <- cov.trob(cbind(a, b))
>> > shape <- v$cov
>> > center <- v$center
>> >
>> > # general form: radius <- sqrt(dfn * qf(level, dfn, dfd))
>> > radius <- sqrt(2 * qf(0.95, 2, length(a) - 1))
>> >
>> > conf.elip = ellipse(center, shape, radius, draw = FALSE)
>> > plot(conf.elip, type='l')
>> > points(a,b)
>> >
>> > My question is how I can calculate shape, center and radius  to obtain a
>> > prediction ellipses rather than a confidence ellipse?
>> > Thanks in Advance
>> > Giuseppe
>> >
>> > --
>> > Giuseppe Amatulli
>> > Web: www.spatial-ecology.net
>> >



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] Loops

2013-01-27 Thread arun


Hi,

You could use library(plyr) as well
library(plyr)
pnew<-colSums(aaply(laply(split(as.data.frame(p),((1:nrow(as.data.frame(p))-1)%/%
 25)+1),as.matrix),c(2,3),function(x) x))
res<-rbind(t(pnew),colSums(p))
row.names(res)<-1:nrow(res)
res<- 100-100*abs(res/rowSums(res)-(1/3))
A.K.


- Original Message -
From: Rui Barradas 
To: Francesca 
Cc: r-help@r-project.org
Sent: Sunday, January 27, 2013 6:17 AM
Subject: Re: [R] Loops

Hello,

I think there is an error in the expression

100-(100*abs(fa1[i]/sum(fa1[i])-(1/3)))

Note that fa1[i]/sum(fa1[i]) is always 1. If it's fa1[i]/sum(fa1), try 
the following, using lists to hold the results.


# Make up some data
set.seed(6628)
p <- matrix(runif(300), nrow = 100)

idx <- seq(1, 100, by = 25)
fa <- lapply(idx, function(i) colSums(p[i:(i + 24), ]))
fa[[5]] <- colSums(p)

fab <- lapply(fa, function(x) 100 - 100*abs(x/sum(x) - 1/3))
fab

You can give names to the lists elements, if you want to.


names(fa) <- paste0("fa", 1:5)
names(fab) <- paste0("fa", 1:5, "b")


Hope this helps,

Rui Barradas

Em 27-01-2013 08:02, Francesca escreveu:
> Dear Contributors,
> I am asking for help with a problem related to loops, which I always get
> confused by.
> I would like to perform the following procedure in a compact way.
>
> Consider that p is a matrix composed of 100 rows and three columns. I need
> to calculate the sum over some rows of each
> column separately, as follows:
>
> fa1<-(colSums(p[1:25,]))
>
> fa2<-(colSums(p[26:50,]))
>
> fa3<-(colSums(p[51:75,]))
>
> fa4<-(colSums(p[76:100,]))
>
> fa5<-(colSums(p[1:100,]))
>
>
>
> and then I need to  apply to each of them the following:
>
>
> fa1b<-c()
>
> for (i in 1:3){
>
> fa1b[i]<-(100-(100*abs(fa1[i]/sum(fa1[i])-(1/3))))
>
> }
>
>
> fa2b<-c()
>
> for (i in 1:3){
>
> fa2b[i]<-(100-(100*abs(fa2[i]/sum(fa2[i])-(1/3))))
>
> }
>
>
> and so on.
>
> Is there a more efficient way to do this?
>
> Thanks for your time!
>
> Francesca
>
> --
> Francesca Pancotto, PhD
> Università di Modena e Reggio Emilia
> Viale A. Allegri, 9
> 40121 Reggio Emilia
> Office: +39 0522 523264
> Web: https://sites.google.com/site/francescapancotto/


[R] lapply and SpatialGridDataFrame error

2013-01-27 Thread Irucka Embry
Hi all, I have a set of 54 files that I need to convert from ASCII grid
format to .shp files to .bnd files for BayesX.

I have the following R code to operate on those files:

library(maptools)
library(Grid2Polygons)
library(BayesX)
library(BayesXsrc)
library(R2BayesX)

readfunct <- function(x)
{
u <- readAsciiGrid(x)
}

modfilesmore <- paste0("MaxFloodDepth_", 1:54, ".txt")
modeldepthsmore <- lapply(modfilesmore, readfunct)

maxdepth.plys <- lapply(modeldepthsmore, Grid2Polygons(modeldepthsmore,
level = FALSE))

layers <- paste0("examples/floodlayers_", 1:54)
polyshapes <- lapply(writePolyShape(maxdepth.plys, layers))
shpName <- sub(pattern="(.*)\\.dbf", replacement="\\1",
x=system.file("examples/Flood/layer_.dbf", package="BayesX")) 
floodmaps <- lapply(shp2bnd(shpname=shpName, regionnames="SP_ID"))

## draw the map
drawmap(map=floodmaps)


This is the error message that I receive:
> maxdepth.plys <- lapply(modeldepthsmore,
Grid2Polygons(modeldepthsmore, level = FALSE))
Error in Grid2Polygons(modeldepthsmore, level = FALSE) : Grid object not
of class SpatialGridDataFrame
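
A likely reading of this error, offered as a hedged sketch: in lapply(X, FUN), the second argument must be a function, but here the call Grid2Polygons(modeldepthsmore, level = FALSE) is evaluated first, on the whole list rather than on one grid at a time. The base-R example below reproduces the pattern with a hypothetical stand-in, convert_one(), so that it runs without the spatial packages; it is not the Grid2Polygons API.

```r
# Hypothetical stand-in for a converter that accepts one object at a time
convert_one <- function(grid) {
  if (!is.numeric(grid)) stop("Grid object not numeric")
  sum(grid)
}

grids <- list(1:3, 4:6, 7:9)

# Wrong: convert_one(grids) is called on the entire list before lapply() starts
wrong <- tryCatch(
  lapply(grids, convert_one(grids)),
  error = function(e) conditionMessage(e)
)

# Right: pass the function itself; lapply() calls it once per list element
right <- lapply(grids, convert_one)
```

Extra arguments such as level = FALSE go after the function name, as in lapply(X, FUN, level = FALSE), and lapply() forwards them to every call. The same reasoning would apply to the later writePolyShape and shp2bnd lines, which also call lapply() without a list and a function.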


Can someone assist me in modifying the R code so that I can convert the
set of files to .shp files and then to .bnd files for BayesX?

Thank-you.

Irucka Embry 




Re: [R] positioning a light source within a rgl-plot

2013-01-27 Thread Duncan Murdoch

On 13-01-27 10:37 AM, Alexander Senger wrote:

> Hello useRs,
>
> I would like to draw a 3D-surface using rgl with a point-like
> light-source within the scene, that is with finite distance of the
> light-source to the surface to be lit.

The rgl package doesn't support that.

> From the help to the 'light3d' command I read:
>
> "They [the light-sources] are positioned either in world space or
> relative to the camera using polar coordinates."
>
> which *could* be understood as if such a thing would be possible. But
> probably this is wishful thinking as my naive approach:
>
> light3d(x = 0, y = 0, z = 1)

There are no x, y, z arguments to light3d.  Only directional sources at
infinite distance are supported.

> gives an error about un-used arguments in the function call. Also
> skimming the web does not produce any helpful examples.
>
> So please advise if there is a way to achieve my desired setting.
> Alternatively any hint how and where to make (moderate) modifications to
> the source code to get this functionality would be very welcome.

It's rather tedious to make the changes.  You need to change rgl.light,
the rgl_light function in api.cpp that it calls, and the parts of
light.hpp and light.cpp that are called by that.  For completeness you'd
also want to fix writeWebGL and scene3d and the functions they call.


Duncan Murdoch



[R] positioning a light source within a rgl-plot

2013-01-27 Thread Alexander Senger
Hello useRs,


I would like to draw a 3D-surface using rgl with a point-like
light-source within the scene, that is with finite distance of the
light-source to the surface to be lit.

From the help to the 'light3d' command I read:

"They [the light-sources] are positioned either in world space or
relative to the camera using polar coordinates."

which *could* be understood as if such a thing would be possible. But
probably this is wishful thinking as my naive approach:

light3d(x = 0, y = 0, z = 1)

gives an error about un-used arguments in the function call. Also
skimming the web does not produce any helpful examples.

So please advise if there is a way to achieve my desired setting.
Alternatively any hint how and where to make (moderate) modifications to
the source code to get this functionality would be very welcome.


Thanks in advance.


Alex



Re: [R] importing data

2013-01-27 Thread Ivan Calandra
Hi Ray!

I'm insisting with list.files...!

What about like this (untested)?
file_names <- list.files(path="C:/.../data", pattern="\\.dat$", 
full.names=TRUE)
list_of_dataset <- lapply(file_names, read.table)
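
If the file names must follow the data1.dat to data1000.dat numbering, another option is to build the expected names and keep only those that exist. The sketch below is self-contained: it writes three small files into a temporary directory (data3.dat and data5.dat are deliberately missing) so that it can be run anywhere. The directory name and the file contents are made up for illustration.

```r
# Create a small demo directory with some files missing on purpose
data_dir <- file.path(tempdir(), "import_demo")
dir.create(data_dir, showWarnings = FALSE)
for (n in c(1, 2, 4)) {
  write.table(data.frame(x = n, y = n^2),
              file.path(data_dir, paste0("data", n, ".dat")))
}

# Build every expected file name, then drop the ones that are absent
wanted  <- file.path(data_dir, paste0("data", 1:5, ".dat"))
present <- wanted[file.exists(wanted)]

# Read each remaining file; one list element per file
list_of_datasets <- lapply(present, read.table)
names(list_of_datasets) <- basename(present)
```

Naming the list elements after the files keeps track of which numbers were skipped.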

Let me know if this helps!
Ivan

--
Ivan CALANDRA
Université de Bourgogne
UMR CNRS/uB 6282 Biogéosciences
6 Boulevard Gabriel
21000 Dijon, FRANCE
+33(0)3.80.39.63.06
ivan.calan...@u-bourgogne.fr
http://biogeosciences.u-bourgogne.fr/calandra

Le 26/01/13 10:03, Ray Cheung a écrit :
> Thanks for your commands, Ivan and Michael! However, I am still not 
> producing the right codes. Would you please help me on this? I've 
> written the following codes. Please comment. Thank you very much.
> Task: Reading data1.dat to data1000.dat (with missing files) into R. 
> Missing files can be omitted in the list.
> ###FUNCTION TO READ FILES
> little_helpful <- function(n) {
> file_name <- paste0("C:/.../data", n, ".dat")
> read.table(file_name)
> }
> ###RETURN AN OBJECT WHICH CHECKS FOR THE EXISTENCE OF FILES
> check  <- function(n) {
> a <- ifelse(file.exists(paste0("C:/.../data", n, ".dat")), 1, 0)
> a
> }
> ###Combining the functions
> IMPORT <- function(n) {
>   L <- check(1:n)
>   for (i in 1:n) {
>     if (L[i] == 1)
>       list_of_datasets <- lapply(i, little_helpful) else
>       list_of_datasets <- 0
>   }
>   list_of_datasets
> }
> Thanks for all comments.
> Best Regards,
> Ray
>
> On Fri, Jan 25, 2013 at 5:48 PM, Ivan Calandra wrote:
>
> Hi,
>
> Not sure this is what you need, but what about list.files()?
> It can get you all the files from a given folder, and you could
> then work this list with regular expressions for example.
>
> HTH,
> Ivan
>
> --
> Ivan CALANDRA
> Université de Bourgogne
> UMR CNRS/uB 6282 Biogéosciences
> 6 Boulevard Gabriel
> 21000 Dijon, FRANCE
> +33(0)3.80.39.63.06 
> ivan.calan...@u-bourgogne.fr 
> http://biogeosciences.u-bourgogne.fr/calandra
>
> Le 25/01/13 10:00, R. Michael Weylandt a écrit :
>
> On Fri, Jan 25, 2013 at 6:11 AM, Ray Cheung wrote:
>
> Dear Michael,
>
> Thanks for your codes. However, lapply does not work in my
> case since I've
> some files missing in the data (say, the file
> data101.dat). Do you have any
> suggestions on this?? Thank you very much.
>
> You could simply add a test using file.exists() but I'm not
> sure what
> you want to do with the M matrix then -- omit the slice (so
> the others
> are all shifted down one) or fill it entirely with NA's.
>
> Michael
>


Re: [R] Loops

2013-01-27 Thread Uwe Ligges



On 27.01.2013 12:50, Richard D. Morey wrote:

Dear Contributors,
I am asking for help with a problem related to loops, which I always get
confused by.
I would like to perform the following procedure in a compact way.

Consider that p is a matrix composed of 100 rows and three columns. I need
to calculate the sum over some rows of each
column separately, as follows:

fa1<-(colSums(p[1:25,]))

fa2<-(colSums(p[26:50,]))

fa3<-(colSums(p[51:75,]))

fa4<-(colSums(p[76:100,]))

fa5<-(colSums(p[1:100,]))



and then I need to  apply to each of them the following:


fa1b<-c()

for (i in 1:3){

fa1b[i]<-(100-(100*abs(fa1[i]/sum(fa1[i])-(1/3))))

}



I think I'd do it this way (correcting for the presumed error that Rui Barradas 
noted):


dim( p ) = c(25,4,3)



p2 = apply(p, c(2,3), sum)
p3 = t(apply(p2, 1, function(fa) 100-(100*abs(fa/sum(fa)-(1/3))) ) )


But you actually want it without any (apply-)loop:

p3 <- 100 - 100 * abs(p2 / rowSums(p2) - (1/3))

For this small setup it is not too important, but should be several 
times faster.
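
As a self-contained check of the loop-free idea, the sketch below uses base R's rowsum() to form the four block sums (an alternative to reshaping p with dim()), appends the overall column sums, and applies the vectorised formula to all five rows at once. The data are random, as in Rui Barradas's example; the names block, all_sums and score are illustrative.

```r
# Make up some data: 100 rows, 3 columns
set.seed(6628)
p <- matrix(runif(300), nrow = 100)

# Column sums within each block of 25 rows (4 x 3 matrix)
block <- rep(1:4, each = 25)
block_sums <- rowsum(p, block)

# Add the sums over all 100 rows as a fifth row
all_sums <- rbind(block_sums, colSums(p))

# Dividing a matrix by a vector of length nrow() recycles down the columns,
# so each row is divided by its own row total
score <- 100 - 100 * abs(all_sums / rowSums(all_sums) - 1/3)
```

Row i of score corresponds to fa1b through fa5b in the original question, with the presumed sum(fa1) correction applied.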


Uwe Ligges





p3 now contains all your results except the one including all the data, which 
is trivial to compute.

--
Richard D. Morey
Assistant Professor
Psychometrics and Statistics
Rijksuniversiteit Groningen / University of Groningen
http://drsmorey.org/research/rdmorey



Re: [R] Removing values containing a specific character

2013-01-27 Thread Uwe Ligges



On 27.01.2013 07:11, ypodeswa wrote:

Actually, it worked perfectly for my sample data, but my actual data has
5.5 million rows, and grep doesn't seem to work with over a million rows.
  Any idea on a workaround?



It is not a matter of grep() but of available memory, I guess.
Hence try to reduce the number of copies of your data, e.g. by not 
generating an interim df2.
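
A minimal sketch of that idea, using a logical index from grepl() so that no second copy of the full data frame is needed. The addresses below are made-up placeholders, not the poster's data, and fixed = TRUE is an optional choice that treats "@" as a literal string.

```r
# Small stand-in for the real 5.5 million row data frame (hypothetical values)
df <- data.frame(
  names  = c("bob", "joe", "user3@example.com", "emily", "user5@example.com"),
  emails = c("bob@example.com", "joe@example.com", "user3@example.com",
             "emily@example.com", "user5@example.com"),
  stringsAsFactors = FALSE
)

# One logical vector marks the rows whose name contains "@"
has_at <- grepl("@", df$names, fixed = TRUE)

# Second result: keep only the rows with a proper name
clean <- df[!has_at, ]

# First result: blank the offending names in place, without a df2 copy
df$names[has_at] <- ""
```

A logical index also behaves sensibly when nothing matches, whereas df[-grep(...), ] returns zero rows in that case.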


Best,
Uwe Ligges





On Sat, Jan 26, 2013 at 9:37 PM, Yasha Podeswa  wrote:


Awesome, thanks Arun, that's exactly what I was looking for!


On Sat, Jan 26, 2013 at 9:21 PM, arun kirshna [via R] <
ml-node+s789695n4656749...@n4.nabble.com> wrote:


Hi,
Try this:
df[]<-lapply(df,as.character)
df2<-df
df[,1][grep("@",df$names)]<- ""
df
   #names emails
#1   bob   b...@cup.com
#2   joe joesm...@gmail.com
#3  cr...@gmail.com
#4 emily   emi...@yahoo.com
#5   j...@yahoo.com

#2nd part:

  df2[-grep("@",df2$names),]
   names emails
#1   bob   b...@cup.com
#2   joe joesm...@gmail.com
#4 emily   emi...@yahoo.com
A.K.



[R] how to extract values from a raster according to Lat and long of the values?

2013-01-27 Thread Jonsson
I have 12 raster files, each with its own .hdr header, covering one year.
The rasters are in geographic lat/long coordinates (WGS84):
https://echange-fichiers.inra.fr/get?k=rLSyoavrnifGyH5XrlO

The header contains:

  samples = 1440
  lines   = 720
  bands   = 1
  header offset = 0
  file type = ENVI Standard
  data type = 4
  interleave = bsq
  byte order = 0
  map info = {Geographic Lat/Lon, 1, 1, -180, 90, 0.25, 0.25, WGS-84}
  coordinate system string = {GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",
    SPHEROID["WGS_1984",6378137,298.257223563]],
    PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]]}
These lines will open the files as a list:

  library(raster)
  a <- list.files("D:\\ECV\\2010", "\\.envi$", full.names = TRUE)
  d <- lapply(a, raster)  # one raster layer per file

I would like to extract the values corresponding to 44.8386° N, 0.5783° W
from all files and write them to a text file.
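
As a hedged starting point, the grid position of that location can be worked out from the header alone, assuming the map info line means an upper-left corner at (-180, 90) with 0.25-degree pixels and that cells are numbered row by row from the top left. The function name cell_index is made up for this sketch.

```r
# Grid geometry taken from the header shown above
ncols <- 1440
nrows <- 720
x_min <- -180   # western edge of the grid
y_max <- 90     # northern edge of the grid
res   <- 0.25   # pixel size in degrees

# Row, column and running cell number for a longitude/latitude pair
cell_index <- function(lon, lat) {
  col <- floor((lon - x_min) / res) + 1
  row <- floor((y_max - lat) / res) + 1
  c(row = row, col = col, cell = (row - 1) * ncols + col)
}

# 44.8386 N, 0.5783 W: western longitudes are negative
idx <- cell_index(lon = -0.5783, lat = 44.8386)
```

With the raster package loaded, calling extract() on each layer with a one-row matrix of (longitude, latitude), for example cbind(-0.5783, 44.8386), should look up the same cell, and the twelve values could then be written out with write.table(). That route is not run here because it needs the data files.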





Re: [R] Loops

2013-01-27 Thread Richard D. Morey
> Dear Contributors,
> I am asking for help with a problem related to loops, which I always get
> confused by.
> I would like to perform the following procedure in a compact way.
> 
> Consider that p is a matrix composed of 100 rows and three columns. I need
> to calculate the sum over some rows of each
> column separately, as follows:
> 
> fa1<-(colSums(p[1:25,]))
> 
> fa2<-(colSums(p[26:50,]))
> 
> fa3<-(colSums(p[51:75,]))
> 
> fa4<-(colSums(p[76:100,]))
> 
> fa5<-(colSums(p[1:100,]))
> 
> 
> 
> and then I need to  apply to each of them the following:
> 
> 
> fa1b<-c()
> 
> for (i in 1:3){
> 
> fa1b[i]<-(100-(100*abs(fa1[i]/sum(fa1[i])-(1/3))))
> 
> }
> 

I think I'd do it this way (correcting for the presumed error that Rui Barradas 
noted):

> dim( p ) = c(25,4,3)

> p2 = apply(p, c(2,3), sum)
> p3 = t(apply(p2, 1, function(fa) 100-(100*abs(fa/sum(fa)-(1/3))) ) )

p3 now contains all your results except the one including all the data, which 
is trivial to compute.

--
Richard D. Morey
Assistant Professor
Psychometrics and Statistics
Rijksuniversiteit Groningen / University of Groningen
http://drsmorey.org/research/rdmorey



Re: [R] scan not working

2013-01-27 Thread peter dalgaard

On Jan 27, 2013, at 08:33 , Emily Sessa wrote:

> Hi all, 
> 
> I am trying to use the scan function in an R script that I am calling from 
> the command line on a Mac; at the shell prompt I type: 
> 
> $ Rscript get_q_values.R LRT_codeml_output 
> 
> in the hope that LRT_codeml_output will get passed to the get_q_values R 
> script. The first line of that script is: 
> 
> chidata <- scan(file="")
> 
> which, as I understand how scan works, will read the contents of the file 
> from the command line into the object chidata. I did this a few times and it 
> worked like a charm. And then, it stopped working. Now, every time I try to 
> do this, I get "Read 0 items" as the next line in the terminal window, and 
> the output produced by the script is empty, because it's apparently no longer 
> reading anything in. I don't think I changed anything in the script; it just 
> stopped being able to execute the scan function. Does anyone have any idea 
> how to fix this?? I did not have anything else in that scan line when it was 
> working before. I've updated R and restarted my computer in the hope that it 
> would help, but it hasn't. Any help would be much appreciated.

I don't see how that would ever work. The 2nd and further args to Rscript are 
passed to R and accessible via commandArgs(). There's no way that scan() can 
know what the arguments are. It might work with 

Rscript get_q_values.R < LRT_codeml_output

though. Or you need to arrange explicitly for scan(file=commandArgs(TRUE)[1]).
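
The commandArgs() route could look roughly like the sketch below. The helper name read_values and the usage message are illustrative additions, not part of the original script.

```r
# get_q_values.R (sketch): read numbers from the file named on the command line
read_values <- function(args = commandArgs(trailingOnly = TRUE)) {
  # trailingOnly = TRUE keeps only the arguments that follow the script name
  if (length(args) < 1L) {
    stop("usage: Rscript get_q_values.R <input-file>")
  }
  scan(file = args[1], quiet = TRUE)
}

# In the script itself one would then write:
# chidata <- read_values()
```

Invoked as Rscript get_q_values.R LRT_codeml_output, the first trailing argument is the file name, so scan() reads that file rather than waiting for console input.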

> 
> -ES

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] Converting column of strings to boolean

2013-01-27 Thread Rui Barradas

Hello,

Something like this?


x <- sample(c("red", "blue", "green", "yellow"), 100, replace = TRUE)
cnames <- unique(x)
sapply(cnames, function(.x) x == .x)
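
The call above returns TRUE/FALSE columns. If 1/0 values are wanted, as in the question, wrapping the comparison in as.integer() is one way to get them. The short vector below is a fixed example so that the result is easy to check by eye; the name indicators is illustrative.

```r
# Small fixed example instead of the random sample
x <- c("red", "blue", "green", "yellow", "blue", "red")
cnames <- unique(x)

# One 0/1 column per colour; sapply() names the columns after cnames
indicators <- sapply(cnames, function(level) as.integer(x == level))
```

Each row of indicators contains exactly one 1, in the column of that row's colour.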


Hope this helps,

Rui Barradas

Em 26-01-2013 22:25, domcastro escreveu:

Hi

I'm trying to convert a column of strings (nominal types) to a set of
boolean / binary / logical values. For example, in the column there is red,
blue, green and yellow. There are 100 rows and each has a colour. I want to
convert the column to 4 columns: red, blue, green,yellow and then either 1
or 0 put in the relevant row.
Thanks





Re: [R] Loops

2013-01-27 Thread Rui Barradas

Hello,

I think there is an error in the expression

100-(100*abs(fa1[i]/sum(fa1[i])-(1/3)))

Note that fa1[i]/sum(fa1[i]) is always 1. If it's fa1[i]/sum(fa1), try 
the following, using lists to hold the results.



# Make up some data
set.seed(6628)
p <- matrix(runif(300), nrow = 100)

idx <- seq(1, 100, by = 25)
fa <- lapply(idx, function(i) colSums(p[i:(i + 24), ]))
fa[[5]] <- colSums(p)

fab <- lapply(fa, function(x) 100 - 100*abs(x/sum(x) - 1/3))
fab

You can give names to the lists elements, if you want to.


names(fa) <- paste0("fa", 1:5)
names(fab) <- paste0("fa", 1:5, "b")


Hope this helps,

Rui Barradas

Em 27-01-2013 08:02, Francesca escreveu:

Dear Contributors,
I am asking for help with a problem related to loops, which I always get
confused by.
I would like to perform the following procedure in a compact way.

Consider that p is a matrix composed of 100 rows and three columns. I need
to calculate the sum over some rows of each
column separately, as follows:

fa1<-(colSums(p[1:25,]))

fa2<-(colSums(p[26:50,]))

fa3<-(colSums(p[51:75,]))

fa4<-(colSums(p[76:100,]))

fa5<-(colSums(p[1:100,]))



and then I need to  apply to each of them the following:


fa1b<-c()

for (i in 1:3){

fa1b[i]<-(100-(100*abs(fa1[i]/sum(fa1[i])-(1/3))))

}


fa2b<-c()

for (i in 1:3){

fa2b[i]<-(100-(100*abs(fa2[i]/sum(fa2[i])-(1/3))))

}


and so on.

Is there a more efficient way to do this?

Thanks for your time!

Francesca

--
Francesca Pancotto, PhD
Università di Modena e Reggio Emilia
Viale A. Allegri, 9
40121 Reggio Emilia
Office: +39 0522 523264
Web: https://sites.google.com/site/francescapancotto/
--



Re: [R] Testing continuous zero-inflated response

2013-01-27 Thread Kay Cichini
That said,

> wilcox_test(x ~ factor(y), distribution = "exact")

or the same with oneway_test, i.e would be ok?
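
For readers without the coin package at hand, the two steps suggested in this thread (Fisher's exact test on (y > 0) vs group, then a two-sample permutation test of y vs group) can be sketched in base R. The data below are invented purely for illustration, and the second step is a Monte Carlo approximation to the permutation distribution of the difference in means, not the algorithm that coin uses.

```r
set.seed(42)

# Invented data: N = 36, two balanced groups, mostly zeros
group <- factor(rep(c("A", "B"), each = 18))
y <- c(rep(0, 15), 2.1, 0.7, 3.4,                            # group A
       rep(0, 10), 1.2, 4.8, 0.9, 2.6, 5.3, 1.7, 3.9, 0.4)   # group B

# Step 1: is the chance of a non-zero value related to group?
p_fisher <- fisher.test(table(nonzero = y > 0, group = group))$p.value

# Step 2: permutation test on the difference in group means;
# ties (the many zeros) are no obstacle when labels are simply reshuffled
mean_diff <- function(values, labels) {
  unname(diff(tapply(values, labels, mean)))
}
observed <- mean_diff(y, group)
n_perm <- 9999
permuted <- replicate(n_perm, mean_diff(y, sample(group)))

# Two-sided Monte Carlo p-value, counting the observed statistic itself
p_perm <- (sum(abs(permuted) >= abs(observed)) + 1) / (n_perm + 1)
```

With only a handful of non-zero observations, as in the real data described in the thread, both p-values will be coarse, so they are better read as rough guides than as precise figures.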


2013/1/27 Achim Zeileis 

> On Sun, 27 Jan 2013, Kay Cichini wrote:
>
>> Thanks for the reply!
>>
>> Still, aren't there issues with 2-sample test vs y and excess zeroes
>> (->many ties), like for Mann-Whitney-U tests?
>
> If you use the (approximate) exact distribution, that is no problem.
>
> The problem with the Wilcoxon/Mann-Whitney test and ties is only that the
> simple recursion formula for computing the exact distribution only works
> without ties. Thus, it's not the exact distribution that is wrong but only
> the standard algorithm for evaluating it.
>
> Best,
> Z
>
>> Kind regards,
>> Kay
>>
>> 2013/1/26 Achim Zeileis
>>
>>> On Fri, 25 Jan 2013, Kay Cichini wrote:
>>>
>>>> Hello,
>>>>
>>>> I'm searching for a test that applies to a dataset (N=36) with a
>>>> continuous zero-inflated dependent variable
>>>
>>> In a regression setup, one can use a regression model with a response
>>> censored at zero. survreg() in survival fits such models, tobit() in AER
>>> is a convenience interface for this special case.
>>>
>>> If the effects of a regressor can be different for the probability of a
>>> zero and the mean of the non-zero observations, then a two-part model can
>>> be used. E.g. a probit fit (via glm) plus a truncated regression (via
>>> truncreg in the package of the same name).
>>>
>>> However:
>>>
>>>> and only one nominal grouping variable with 2 levels (balanced).
>>>
>>> In that case I would probably use no regression model but two-sample
>>> permutation tests, e.g. via the "coin" package.
>>>
>>>> In fact there are 4 response variables of this kind which I plan to test
>>>> separately - the amount of zeroes ranges from 75 to 97%..
>>>
>>> That means you have between one (!) and nine non-zero observations. In
>>> the former case, it will be hard to model anything. And even in the latter
>>> case it will be hard to investigate the probability of zero and the mean of
>>> the non-zero observations separately.
>>>
>>> I would start out with a simple two-way table of (y > 0) vs group and
>>> conduct Fisher's exact test.
>>>
>>> And then you might try also your favorite two sample test of y vs group,
>>> preferably using the approximate exact distribution.
>>>
>>> Hope that helps,
>>> Z
>>>
>>>> I searched the web and found several modelling approaches but have the
>>>> feeling that they are overly complex for my very simple dataset.
>>>>
>>>> Thanks in advance for any help!
>>>> Kay



-- 

Kay Cichini, MSc Biol

Grubenweg 22, 6071 Aldrans

Tel.: 0650 9359101

E-Mail: kay.cich...@gmail.com

Web: www.theBioBucket.blogspot.co.at

Re: [R] Testing continuous zero-inflated response

2013-01-27 Thread Achim Zeileis

On Sun, 27 Jan 2013, Kay Cichini wrote:

> Thanks for the reply!
>
> Still, aren't there issues with 2-sample test vs y and excess zeroes
> (->many ties), like for Mann-Whitney-U tests?


If you use the (approximate) exact distribution, that is no problem.

The problem with the Wilcoxon/Mann-Whitney test and ties is only that the 
simple recursion formula for computing the exact distribution only works 
without ties. Thus, it's not the exact distribution that is wrong but only 
the standard algorithm for evaluating it.
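
The point about ties can be illustrated in base R. What follows is only a minimal sketch, not the coin implementation: it approximates the exact permutation distribution of the rank-sum statistic by Monte Carlo re-randomisation of the group labels, using midranks so that ties need no special treatment. The function name `perm_ranksum`, the simulated data and the number of resamples are assumptions made for illustration.

```r
# Monte Carlo approximation of the exact permutation distribution of the
# rank-sum statistic; midranks (ties.method = "average") accommodate ties.
perm_ranksum <- function(y, group, nresample = 9999) {
  group <- factor(group)
  stopifnot(nlevels(group) == 2, length(y) == length(group))
  r <- rank(y, ties.method = "average")
  first <- group == levels(group)[1]
  # Observed rank sum in the first group, centred at its permutation mean
  centre <- sum(first) * mean(r)
  obs <- sum(r[first]) - centre
  # Re-randomise the group labels and recompute the centred statistic
  sims <- replicate(nresample, sum(r[sample(first)]) - centre)
  # Two-sided p-value; the +1 counts the observed labelling itself
  (sum(abs(sims) >= abs(obs) - 1e-12) + 1) / (nresample + 1)
}

# Hypothetical zero-inflated data: N = 36, two balanced groups, 75% zeroes
set.seed(42)
group <- rep(c("a", "b"), each = 18)
y <- c(rep(0, 15), 1.2, 0.7, 2.4, rep(0, 12), 3.1, 0.4, 1.9, 2.2, 0.9, 1.5)
perm_ranksum(y, group)
```

Because the reference distribution is built from the observed (tied) ranks themselves, the many zeroes only make the distribution coarser; they do not invalidate it.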


Best,
Z


> Kind regards,
> Kay
>
> 2013/1/26 Achim Zeileis
>
>> On Fri, 25 Jan 2013, Kay Cichini wrote:
>>
>>> Hello,
>>>
>>> I'm searching for a test that applies to a dataset (N=36) with a
>>> continuous zero-inflated dependent variable
>>
>> In a regression setup, one can use a regression model with a response
>> censored at zero. survreg() in survival fits such models, tobit() in AER is
>> a convenience interface for this special case.
>>
>> If the effects of a regressor can be different for the probability of a
>> zero and the mean of the non-zero observations, then a two-part model can
>> be used. E.g. a probit fit (via glm) plus a truncated regression (via
>> truncreg in the package of the same name).
>>
>> However:
>>
>>> and only one nominal grouping variable with 2 levels (balanced).
>>
>> In that case I would probably use no regression model but two-sample
>> permutation tests, e.g. via the "coin" package.
>>
>>> In fact there are 4 response variables of this kind which I plan to test
>>> separately - the amount of zeroes ranges from 75 to 97%.
>>
>> That means you have between one (!) and nine non-zero observations. In the
>> former case, it will be hard to model anything. And even in the latter case
>> it will be hard to investigate the probability of zero and the mean of the
>> non-zero observations separately.
>>
>> I would start out with a simple two-way table of (y > 0) vs group and
>> conduct Fisher's exact test.
>>
>> And then you might try also your favorite two sample test of y vs group,
>> preferably using the approximate exact distribution.
>>
>> Hope that helps,
>> Z
>>
>>> I searched the web and found several modelling approaches but have the
>>> feeling that they are overly complex for my very simple dataset.
>>>
>>> Thanks in advance for any help!
>>> Kay



Re: [R] Testing continuous zero-inflated response

2013-01-27 Thread Kay Cichini
Thanks for the reply!

Still, aren't there issues with 2-sample test vs y and excess zeroes
(->many ties), like for Mann-Whitney-U tests?

Kind regards,
Kay


2013/1/26 Achim Zeileis 

> On Fri, 25 Jan 2013, Kay Cichini wrote:
>
>> Hello,
>>
>> I'm searching for a test that applies to a dataset (N=36) with a
>> continuous zero-inflated dependent variable
>
> In a regression setup, one can use a regression model with a response
> censored at zero. survreg() in survival fits such models, tobit() in AER is
> a convenience interface for this special case.
>
> If the effects of a regressor can be different for the probability of a
> zero and the mean of the non-zero observations, then a two-part model can
> be used. E.g. a probit fit (via glm) plus a truncated regression (via
> truncreg in the package of the same name).
>
> However:
>
>
>> and only one nominal grouping variable with 2 levels (balanced).
>
> In that case I would probably use no regression model but two-sample
> permutation tests, e.g. via the "coin" package.
>
>
>> In fact there are 4 response variables of this kind which I plan to test
>> separately - the amount of zeroes ranges from 75 to 97%.
>
> That means you have between one (!) and nine non-zero observations. In the
> former case, it will be hard to model anything. And even in the latter case
> it will be hard to investigate the probability of zero and the mean of the
> non-zero observations separately.
>
> I would start out with a simple two-way table of (y > 0) vs group and
> conduct Fisher's exact test.
>
> And then you might try also your favorite two sample test of y vs group,
> preferably using the approximate exact distribution.
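
The suggested first step, a two-way table of (y > 0) versus group followed by Fisher's exact test, can be sketched in base R as below. The data are simulated purely for illustration (N = 36, balanced groups, 75% zeroes); the variable names `y` and `group` are assumptions.

```r
# Hypothetical data: N = 36, balanced groups, mostly zeroes
group <- factor(rep(c("a", "b"), each = 18))
y <- c(rep(0, 15), 1.2, 0.7, 2.4, rep(0, 12), 3.1, 0.4, 1.9, 2.2, 0.9, 1.5)

# Two-way table of (y > 0) versus group
tab <- table(nonzero = y > 0, group = group)
tab

# Fisher's exact test of association between "non-zero" and group
ft <- fisher.test(tab)
ft$p.value
```

This only addresses whether the proportion of non-zero observations differs between the groups; it says nothing about the size of the non-zero values.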
>
> Hope that helps,
> Z
>
>> I searched the web and found several modelling approaches but have the
>> feeling that they are overly complex for my very simple dataset.
>>
>> Thanks in advance for any help!
>> Kay


-- 

Kay Cichini, MSc Biol

Grubenweg 22, 6071 Aldrans

Tel.: 0650 9359101

E-Mail: kay.cich...@gmail.com

Web: www.theBioBucket.blogspot.co.at


[R] scan not working

2013-01-27 Thread Emily Sessa
Hi all, 

I am trying to use the scan function in an R script that I am calling from the 
command line on a Mac; at the shell prompt I type: 

$ Rscript get_q_values.R LRT_codeml_output 

in the hope that LRT_codeml_output will get passed to the get_q_values R 
script. The first line of that script is: 

chidata <- scan(file="")

which, as I understand how scan works, will read the contents of the file from 
the command line into the object chidata. I did this a few times and it worked 
like a charm. And then, it stopped working. Now, every time I try to do this, I 
get "Read 0 items" as the next line in the terminal window, and the output 
produced by the script is empty, because it's apparently no longer reading 
anything in. I don't think I changed anything in the script; it just stopped 
being able to execute the scan function. Does anyone have any idea how to fix 
this?? I did not have anything else in that scan line when it was working 
before. I've updated R and restarted my computer in the hope that it would 
help, but it hasn't. Any help would be much appreciated.

-ES
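
According to the documentation for scan, file = "" takes input from the keyboard (or from whatever stdin() reads), not from the command line, so a file name given after the script name has to be picked up explicitly. A minimal sketch of one way to do that with commandArgs(), assuming the input is a whitespace-separated file of numbers; the helper name `read_values` is hypothetical:

```r
# Read numeric values from the file named as the first trailing argument,
# e.g.  Rscript get_q_values.R LRT_codeml_output
read_values <- function(args = commandArgs(trailingOnly = TRUE)) {
  if (length(args) < 1) stop("usage: Rscript get_q_values.R <input file>")
  if (!file.exists(args[1])) stop("file not found: ", args[1])
  scan(file = args[1], quiet = TRUE)
}

# Demonstration with a temporary file standing in for LRT_codeml_output
tmp <- tempfile()
writeLines(c("3.84 6.63", "10.83"), tmp)
chidata <- read_values(tmp)
chidata
```

Inside the script itself the first line would then be `chidata <- read_values()`, which fails with a clear message when no file name is supplied instead of silently reading zero items.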


Re: [R] Converting column of strings to boolean

2013-01-27 Thread arun
Hi,

Another possibility may be to use:

library(MatrixModels)

# d as defined in Pete Brecknock's message quoted below
d <- factor(c("red","green","red","blue","green","yellow","red"))

model.Matrix(~d-1,sparse=FALSE)
#7 x 4 Matrix of class "ddenseModelMatrix"
#  dblue dgreen dred dyellow
#1     0      0    1       0
#2     0      1    0       0
#3     0      0    1       0
#4     1      0    0       0
#5     0      1    0       0
#6     0      0    0       1
#7     0      0    1       0
model.Matrix(~d-1,sparse=TRUE)
#"dsparseModelMatrix": 7 x 4 sparse Matrix of class "dgCMatrix"
#  dblue dgreen dred dyellow
#1     .      .    1       .
#2     .      1    .       .
#3     .      .    1       .
#4     1      .    .       .
#5     .      1    .       .
#6     .      .    .       1
#7     .      .    1       .
#@ assign:  1 1 1 1 
#@ contrasts:
#$d
#[1] "contr.treatment"

A.K.


- Original Message -
From: Pete Brecknock 
To: r-help@r-project.org
Cc: 
Sent: Saturday, January 26, 2013 6:27 PM
Subject: Re: [R] Converting column of strings to boolean

domcastro wrote
> Hi
> 
> I'm trying to convert a column of strings (nominal types) to a set of
> boolean / binary / logical values. For example, in the column there is
> red, blue, green and yellow. There are 100 rows and each has a colour. I
> want to convert the column to 4 columns: red, blue, green,yellow and then
> either 1 or 0 put in the relevant row.
> Thanks

maybe model.matrix will help 

# d is my understanding of your data
d<-factor(c("red","green","red","blue","green","yellow","red"))
model.matrix(~d -1)
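
To get columns named after the colours themselves and stored as logicals next to the original column, the indicator matrix can be post-processed. A minimal sketch in base R; the data frame name `df` and the column name `colour` are assumptions:

```r
# Hypothetical data frame with one nominal column
df <- data.frame(colour = c("red","green","red","blue","green","yellow","red"),
                 stringsAsFactors = TRUE)

# One 0/1 indicator column per level (no intercept, so no level is dropped)
ind <- model.matrix(~ colour - 1, data = df)
colnames(ind) <- levels(df$colour)   # columns follow the order of the levels

# Attach as logical columns alongside the original data
out <- cbind(df, as.data.frame(ind == 1))
out
```

Dropping the intercept with `- 1` is what keeps all four levels; with the default intercept one level would become the reference and lose its column.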

HTH 

Pete





[R] Converting column of strings to boolean

2013-01-27 Thread domcastro
Hi

I'm trying to convert a column of strings (nominal types) to a set of
boolean / binary / logical values. For example, the column contains red,
blue, green and yellow. There are 100 rows and each has a colour. I want to
convert the column to 4 columns (red, blue, green, yellow) and then put
either 1 or 0 in the relevant row.
Thanks





Re: [R] different legends in lattice panels

2013-01-27 Thread Tito de Morais Luis
Thank you _very much_ Ilai for the rapid and accurate answer.

It works and indeed helps a lot. Both to solve the question and to help 
me progress !
Possibly this will also help others.

Thanks again

Tito

On 27/01/2013 01:11, ilai wrote:
>
>
> On Sat, Jan 26, 2013 at 10:26 AM, Tito de Morais Luis
> <luis.tito-de-mor...@ird.fr> wrote:
>
> Hi listers,
>
> I want to make lattice xyplots with legends inside each panel that show
> only the points and lines actually plotted in that panel, according to
> the group(ing) factor.
>
> The code below shows what I have achieved so far and I hope will make
> clear what I want to have.
> It seems to me that my solution is a very "dirty hack" and there
> certainly is a much simpler and "cleaner" way to do it.
> Besides, there is no concordance in lty and pch between the legend above
> the graph and those inside the panels.
>
> No. Look again. It is your panel legends that don't correspond to the 
> actual plot. The plot symbols and line types for the chosen theme != 
> pch[1:10] and lty[1:10]. You can either explicitly set the pch and lty 
> in the plot and auto.key to be 1:10 and proceed with trellis.focus or 
> insert draw.key in the panel function to automate the procedure and 
> query the groups and graphical parameters of each panel :
>
> library(lattice)
> library(grid)   # viewport() is provided by grid
>
> xyplot(lbt ~ de | type, data=dataf, groups=sta,
>   type=c("p","g","r"), layout=c(4,1),
>   par.settings=standard.theme(color=FALSE),
>   auto.key=list(space="top", columns=5, lines=TRUE),
>   panel=function(x,y,groups,subscripts,...){
>     panel.xyplot(x,y,groups=groups,subscripts=subscripts,...)
>     # levels of the grouping factor that actually occur in this panel
>     pug <- levels(groups)[levels(groups) %in% groups[subscripts]]
>     draw.key(key=list(text=list(as.character(pug)),
>         lines=list(lty=rep(trellis.par.get('superpose.line')$lty,10)[as.numeric(pug)]),
>         points=list(pch=rep(trellis.par.get('superpose.symbol')$pch,10)[as.numeric(pug)])),
>       draw=TRUE, vp=viewport(x=0.25,y=0.9))
>   })
>
> HTH
> (...snip...)

-- 
Luis Tito de Morais
IRD - UMR LEMAR (IRD/UBO/CNRS/IFREMER)




[R] Loops

2013-01-27 Thread Francesca
Dear Contributors,
I am asking for help with a problem involving loops, which I always find
confusing.
I would like to perform the following procedure in a compact way.

Consider that p is a matrix with 100 rows and three columns. I need
to calculate the sum over some rows of each column separately, as follows:

fa1<-(colSums(p[1:25,]))

fa2<-(colSums(p[26:50,]))

fa3<-(colSums(p[51:75,]))

fa4<-(colSums(p[76:100,]))

fa5<-(colSums(p[1:100,]))



and then I need to apply the following to each of them:


fa1b<-c()
for (i in 1:3){
  fa1b[i]<-(100-(100*abs(fa1[i]/sum(fa1[i])-(1/3))))
}


fa2b<-c()
for (i in 1:3){
  fa2b[i]<-(100-(100*abs(fa2[i]/sum(fa2[i])-(1/3))))
}

and so on.

Is there a more efficient way to do this?
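
One possible vectorised version is sketched below. It rests on two assumptions that are not in the post: the matrix `p` is simulated here, and the denominator is taken to be the block total, i.e. sum(fa1) rather than sum(fa1[i]) as written, since the latter would make every ratio equal to 1. The names `blocks`, `fa` and `fab` are made up for the example.

```r
# Hypothetical data standing in for the real 100 x 3 matrix
set.seed(1)
p <- matrix(runif(300), nrow = 100, ncol = 3)

# Row blocks: the four quarters plus the full matrix
blocks <- list(1:25, 26:50, 51:75, 76:100, 1:100)

# Column sums per block, collected in a 5 x 3 matrix (rows fa1..fa5)
fa <- t(sapply(blocks, function(rows) colSums(p[rows, , drop = FALSE])))
rownames(fa) <- paste0("fa", 1:5)

# Share of each column within its block, scored against 1/3
# (assumes the intended denominator is the block total)
fab <- 100 - 100 * abs(fa / rowSums(fa) - 1/3)
fab
```

Row k of `fab` then plays the role of the vector built by the k-th loop, so no per-block variables or explicit loops are needed.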

Thanks for your time!

Francesca

--
Francesca Pancotto, PhD
Università di Modena e Reggio Emilia
Viale A. Allegri, 9
40121 Reggio Emilia
Office: +39 0522 523264
Web: https://sites.google.com/site/francescapancotto/
--
