[R] MCMClogit: Cannot calculate marginal likelihood with improper prior
I'm an undergrad who is new to MCMCpack, and I haven't been able to find an answer to my problem online. I'm trying to run MCMClogit with a proper Cauchy prior, but I get the warning "Cannot calculate marginal likelihood with improper prior". My purposes require the marginal likelihood, so I understand that I need to use a proper prior.

I'm trying to use the user-defined independent Cauchy prior with additional arguments, as specified in the MCMCpack User Manual (p. 76, April 2013 version). My input data have been standardized (mean = 0 and sd = 0.5 for non-binary variables; binary variables with a mean of 0 and a difference of 1 between the upper and lower values) following the Gelman 2008 paper on logistic regression (www.stat.columbia.edu/~gelman/research/published/priors11.pdf).

When I run the example data set (birthwt) from the User Manual, logpriorfun works correctly and the marginal likelihood is generated. However, when I run my own data with logpriorfun, I get a warning that the prior is improper. Here is the code I am running:

    logpriorfun = function(beta, location, scale){
      sum(dcauchy(beta, location, scale, log = TRUE))
    }

    MCMC.2 = MCMClogit(DEAD ~ YEARS + MALE + x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9,
                       tune = 0.65, burnin = 500, mcmc = 5000, data = dat,
                       marginal.likelihood = "Laplace",
                       user.prior.density = logpriorfun, logfun = TRUE,
                       location = 0, scale = 2.5)

    The Metropolis acceptance rate was 0.27418

    Warning message:
    In MCMClogit(DEAD ~ YEARS + MALE + x1 + x2 + x3 + :
      Cannot calculate marginal likelihood with improper prior

Any advice on how to fix my arguments so that the prior is proper and I can generate a marginal likelihood using the Laplace approximation? Or how should I be coding a proper Cauchy prior? I'm having problems defining the priors.

Thanks,
B.
-- View this message in context: http://r.789695.n4.nabble.com/MCMClogit-Cannot-calculate-marginal-likelihood-with-improper-prior-tp4672561.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Help with prefmod
Hello,

I'm using the prefmod package, with a pattPC fit, and I'm having some trouble interpreting the results. I am giving two different species of animal a choice between two of three different patterns: V, H, and 45. I have run a number of paired tests with different combinations of these, and my output is as follows:

               estimate       se       z  p-value
    V           0.07196  0.18992   0.379   0.7047
    H           0.10790  0.19009   0.568   0.5700
    V:Species2  0.17084  0.24317   0.703   0.4821
    H:Species2  0.13490  0.23965   0.563   0.5734
    U          -0.93468  0.19993  -4.675   0.

I'm assuming that the baseline is the 45 pattern, but I'm not sure what the term labelled U refers to. In my model this is particularly interesting because the other p-values are quite high, but the U p-value is 0.003. I haven't been able to find any information about this online so far.

Thank you,
Qamar

Qamar Schuyler, PhD Research Student, University of Queensland, Moreton Bay Research Station, Dunwich, QLD 4183
[R] Declare BASH Array Using R System Function
Hello,

It is difficult to search for previous posts about this because the keywords are short and ambiguous, so I hope this is not a duplicate question.

I can easily declare an array on the command line:

    $ names=(X Y)
    $ echo ${names[0]}
    X

I am unable to do the same from within R:

    > system("names=(X Y)")
    sh: Syntax error: "(" unexpected

Reading the documentation for the system function, it appears to be relevant only for executing commands. What can I do instead to declare a BASH array?

Thanks.

--
Dario Strbenac, PhD Student, University of Sydney, Camperdown NSW 2050, Australia
[R] surface plot
Dear R users,

I have a question about a surface plot that would show the spatial variability of a parameter. I have a data frame with 6 variables, plus X and Y for the coordinate system:

    X Y pH ..... ...

I want to create a surface plot for my data. Please help me.

Many thanks.
[R] R function
Dear R users,

I am an MSc student and I want to write my own function, but I can't get it to work. Please help me solve it. Here is my code:

    pah1$P = (pah1$Fluoranthene/pah1$Pyrene)
    T = function(x){
      for (i in 1:length(pah1$P))
        if (i >= 1) print("Combustion")
        if (i < 1) print("Petroleum")
    }
    T(pah1$P[c(1:83),])

I would like R to give me a column that contains "Combustion" if the value is greater than or equal to one and "Petroleum" if the value is less than one, but my function does not work.

Thank you so much for your help.
Re: [R] Declare BASH Array Using R System Function
You seem confused. You are programming in R, and asking questions about bash on an R mailing list. You seem to need to learn the difference between environment variables and bash variables, and how processes acquire and transfer environment variables, which is really an operating system concept and off topic here. Once you do understand this difference, you might be interested in reading the R help file on Sys.setenv().

--
Jeff Newmiller, Research Engineer (Solar/Batteries/Software/Embedded Controllers)
Re: [R] Declare BASH Array Using R System Function
On Jul 29, 2013, at 08:27, Jeff Newmiller wrote:

> You seem confused.

Not particularly, but he needs to be aware of _which_ shell R is executing in system() calls. These things work for me:

    > system("foo=(bar baz); echo ${foo[1]}")
    baz

Dario's issue is suggested by his error message:

    > system("names=(X Y)")
    sh: Syntax error: "(" unexpected

The shell is (Bourne) sh, not bash, so bash extensions won't work. This is highly system dependent: on OS X Snow Leopard, for example, /bin/sh really is GNU bash, which is why it works for me. Others have the more sane setup where /bin/sh really is Bourne sh.

The next question is of course how to ensure that bash gets used. I must admit that I have long forgotten...

-Peter D.

--
Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School, Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com
Re: [R] R function
Hello,

Try the following.

    T <- function(x){
      ifelse(pah1$P >= 1, "Combustion", "Petroleum")
    }
    T(pah1$P[1:83])

Hope this helps,

Rui Barradas
Re: [R] R function
Hello,

Sorry, that should use the function argument x rather than pah1$P:

    T <- function(x){
      ifelse(x >= 1, "Combustion", "Petroleum")
    }

Rui Barradas

On 29-07-2013 09:32, Rui Barradas wrote:

> Try the following.
>
>     T <- function(x){
>       ifelse(pah1$P >= 1, "Combustion", "Petroleum")
>     }
>     T(pah1$P[1:83])
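As a minimal sketch of the corrected approach: the three-row data frame below is made up to stand in for pah1, and the function name classify_source is hypothetical (it is used instead of T because T is also R's shorthand for TRUE, so redefining it can confuse later code).

```r
# Made-up stand-in for the poster's pah1 data frame (values are illustrative only).
pah1 <- data.frame(Fluoranthene = c(2.0, 0.5, 3.0),
                   Pyrene       = c(1.0, 1.0, 3.0))

# Ratio used in the thread.
pah1$P <- pah1$Fluoranthene / pah1$Pyrene

# Vectorized classification: ifelse() tests every element of x at once,
# so no explicit loop is needed.
classify_source <- function(x) {
  ifelse(x >= 1, "Combustion", "Petroleum")
}

# Store the result as a new column rather than printing it.
pah1$Source <- classify_source(pah1$P)
print(pah1)
```

Because ifelse() returns a vector of the same length as its input, the result can be assigned directly to a column, which is what the original question asked for.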
Re: [R] surface plot
On 07/29/2013 03:31 PM, javad bayat wrote:

> Dear R users; I have a question about surface plot that show me spatial variability of parameter. I have a data frame with 6 variables and X and Y for coordinate system. X Y pH ..... ... so I want to create a surface plot for my data. please help me.

Hi javad,

There are a number of packages that will produce plots that might be useful to you. I suggest that you search Google for "surface plot r" and see which plot suits you.

Jim
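One possible base-R route, offered only as a sketch and not as the method anyone in the thread recommended: smooth the scattered points with loess(), predict on a regular grid, and draw the grid with persp(). The simulated data are made up, the column names X, Y and pH are taken from the question, and the choice of loess as the smoother is an assumption; dedicated spatial interpolation packages may suit real data better.

```r
# Simulated stand-in for the poster's data: scattered coordinates and a pH value.
set.seed(42)
dat <- data.frame(X = runif(100), Y = runif(100))
dat$pH <- 6 + dat$X - 0.5 * dat$Y + rnorm(100, sd = 0.1)

# Smooth the scattered observations with a local regression surface.
fit <- loess(pH ~ X + Y, data = dat)

# Regular grid spanning the observed coordinate range.
gx <- seq(min(dat$X), max(dat$X), length.out = 30)
gy <- seq(min(dat$Y), max(dat$Y), length.out = 30)
grid <- expand.grid(X = gx, Y = gy)

# Predicted surface; rows correspond to gx and columns to gy.
z <- matrix(predict(fit, newdata = grid), nrow = length(gx), ncol = length(gy))

# Perspective plot of the fitted surface.
persp(gx, gy, z, xlab = "X", ylab = "Y", zlab = "pH", theta = 30, phi = 25)
```

The same z matrix can be passed to image() or filled.contour() if a flat map is preferred to a 3-D view.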
Re: [R] Help R
Hola Maria Teresa,

For multiple imputation, I would suggest the Amelia package by Gary King, James Honaker, and Matthew Blackwell, which uses the expectation-maximisation (EM) algorithm with bootstrapping. Its excellent vignette has examples that should be more than enough for your needs. Do read the section on imputation-improving transformations very carefully to deal with ordinal, nominal, and similar variables.

As to which variables to select, this goes beyond the R-help group, but the package vignette gives this short answer:

"It is crucial to include at least as much information as will be used in the analysis model. That is, any variable that will be in the analysis model should also be in the imputation model. This includes any transformations or interactions of variables that will appear in the analysis model. In fact, it is often useful to add more information to the imputation model than will be present when the analysis is run. Since imputation is predictive, any variables that would increase predictive power should be included in the model, even if including them in the analysis model would produce bias in estimating a causal effect (such as for post-treatment variables) or collinearity would preclude determining which variable had a relationship with the dependent variable (such as including multiple alternate measures of GDP)."

Hope this helps!

José

Prof. José Iparraguirre, Chief Economist, Age UK

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Mª Teresa Martinez Soriano
Sent: 26 July 2013 09:36
To: r-help@r-project.org
Subject: [R] Help R

Hi to everyone,

First of all, thanks for this service; it is very useful for me. I am new to R, so I may ask really naive questions, and I'm sorry for that.

I have to impute some missing values, and I am trying to do it with the VIM library through hot-deck imputation. I run vmGUImenu(), which opens a small window titled "Visualization and Imputation of Missing Values". I select Imputation and Hot Deck, and then one of the things I have to fill in is "Select Variables to Build Domains". I don't know which variables I have to select; I don't understand this. I have tried leaving it empty and I get:

    hotdeck(dataframe,
            variable = c("CRV.IE.2005", "CRV.IE.2006", "CRV.IE.2007", "CRV.IE.2008", "CRV.IE.2009", "CRV.IE.2010"),
            ord_var = c("CRV.IE.2001", "CRV.IE.2002", "CRV.IE.2003", "CRV.IE.2004", "CRV.IE.2005",
                        "CRV.IE.2006", "CRV.IE.2007", "CRV.IE.2008", "CRV.IE.2009", "CRV.IE.2010"),
            domain_var = NULL, imp_suffix = "_imp")

    Warning messages:
    In hotdeck(data, variable = vars, ord_var = sort, domain_var = domain, :
      Some NAs remained, maybe due to a too restrictive domain building!?
    In hotdeck(b, variable = c("CRV.IE.2005", "CRV.IE.2006", "CRV.IE.2007", :
      Some NAs remained, maybe due to a too restrictive domain building!?

What should I put in this variable?

Thanks in advance.

Best regards,
Teresa
Re: [R] Declare BASH Array Using R System Function
On 29/07/2013 08:49, peter dalgaard wrote:

> On Jul 29, 2013, at 08:27, Jeff Newmiller wrote:
>
>> You seem confused.
>
> Not particularly, but he needs to be aware of _which_ shell R is executing in system() calls. These things work for me:
>
>     > system("foo=(bar baz); echo ${foo[1]}")
>     baz
>
> Dario's issue is suggested by his error message:
>
>     > system("names=(X Y)")
>     sh: Syntax error: "(" unexpected
>
> The shell is (Bourne) sh, not bash, so bash extensions won't work.

See below: the shell should always be 'sh'.

> This is highly system dependent: on OS X Snow Leopard, for example, /bin/sh really is GNU bash, which is why it works for me. Others have the more sane setup where /bin/sh really is Bourne sh.

On recent OS X, /bin/sh is *a variant of* bash. For example, shopt xpg_echo differs depending on whether it is invoked as sh or as bash. Where sh is a link to bash, the behaviour usually differs depending on how it is invoked.

There are quite a lot of systems for which /bin/sh is not based on either bash or Bourne sh. As I understand it, Debian/Ubuntu nowadays use dash by default, and some other Linuxen use ash. zsh is also seen as a system shell. And in many cases this is configurable.

Note too that there is quite a lot of flexibility in how bash is configured.

> The next question is of course how to ensure that bash gets used. I must admit that I have long forgotten...

From ?system:

    'command' is parsed as a command plus arguments separated by spaces. So if the path to the command (or an argument) contains spaces, it must be quoted e.g. by 'shQuote'. Unix-alikes pass the command line to a shell (normally '/bin/sh', and POSIX requires that shell), so 'command' can be anything the shell regards as executable, including shell scripts, and it can contain multiple commands separated by ';'.

So you do not have a choice of shell, and the command line you pass needs to invoke a different shell if that is what you want.

But apart from knowing that R's system() calls the system(3) library function (on a Unix-alike), there is nothing relevant to R-help here.

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK, Fax: +44 1865 272595
Re: [R] MCMClogit: Cannot calculate marginal likelihood with improper prior
Hi,

As far as I can see, you have specified your user.prior.density correctly. The warning comes from the prior precision matrix B0 in combination with marginal.likelihood being set to "Laplace". If B0 is not explicitly specified, it defaults to zero, which gives eigenvalues of zero. When "Laplace" is requested for marginal.likelihood, the algorithm usually runs an optimization over logpost.logit in BayesianFactors.R, where it tries to invert B0 with solve(B0). Because B0 is a zero matrix, the linear system is exactly singular and cannot be solved. MCMClogit knows about this and therefore issues the warning "Cannot calculate marginal likelihood with improper prior" while changing marginal.likelihood to "none".

So, to conclude: use your user.prior.density with marginal.likelihood = "none" and all is fine (implicitly this is what happens anyway).

Best,
Simon

P.S. Using a name on a community help list will certainly improve the number of answers to your questions.
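For readers who still need a marginal likelihood under a Cauchy prior, the Laplace approximation itself can be sketched in base R. This is not MCMCpack's implementation and says nothing about what MCMClogit does internally; the simulated data, the two-coefficient model, and the names log_post and log_marg are all assumptions made for illustration. The approximation used is log p(y) ≈ log p(y | b) + log p(b) + (d/2) log(2π) − (1/2) log|H|, where b is the posterior mode, d the number of coefficients, and H the Hessian of the negative log posterior at b.

```r
# Simulated data; sizes and coefficients are illustrative only.
set.seed(1)
n <- 200
X <- cbind(1, rnorm(n, sd = 0.5))               # intercept + one standardized predictor
y <- rbinom(n, 1, plogis(drop(X %*% c(-0.5, 1))))

# Unnormalized log posterior: logistic log likelihood plus
# independent Cauchy(0, 2.5) log prior on each coefficient.
log_post <- function(beta) {
  eta <- drop(X %*% beta)
  sum(dbinom(y, 1, plogis(eta), log = TRUE)) +
    sum(dcauchy(beta, location = 0, scale = 2.5, log = TRUE))
}

# Posterior mode and Hessian of the negative log posterior at the mode.
fit <- optim(c(0, 0), function(b) -log_post(b), method = "BFGS", hessian = TRUE)
d <- length(fit$par)

# Laplace approximation to the log marginal likelihood.
log_det_H <- as.numeric(determinant(fit$hessian, logarithm = TRUE)$modulus)
log_marg <- -fit$value + (d / 2) * log(2 * pi) - 0.5 * log_det_H
print(log_marg)
```

Because the Cauchy density integrates to one, the prior here is proper, so the quantity being approximated is well defined; how good the approximation is depends on how close the posterior is to a normal shape around its mode.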
Re: [R] Declare BASH Array Using R System Function
Hi,

    system("names=(X Y); echo ${names[0]}")
    # sh: 1: Syntax error: "(" unexpected

    # this worked for me:
    system("bash -c 'names=(X Y); echo ${names[0]}'")
    # X

A.K.
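A variant of the bash -c approach, as a hedged sketch: system2() takes the program and its arguments separately, and shQuote() protects the bash command string from the intermediate sh. It assumes a Unix-alike with bash on the PATH; the variable names bash_path, cmd and out are illustrative.

```r
# Locate bash; Sys.which() returns "" if it is not on the PATH.
bash_path <- Sys.which("bash")

if (nzchar(bash_path)) {
  # The array exists only inside this one bash process.
  cmd <- "names=(X Y); echo ${names[0]}"

  # shQuote() wraps the command in single quotes so that sh passes it to bash unchanged.
  out <- system2(bash_path, args = c("-c", shQuote(cmd)), stdout = TRUE)
  print(out)
} else {
  message("bash was not found on the PATH; nothing was run")
}
```

Each call starts a new process, so a bash array declared in one call is gone by the next; anything that must persist has to be declared and used within the same command string, or kept on the R side.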
Re: [R] How to split two levels several times?
Hi Arun,

Thanks. Great help. I tested the code on several tables and your function works well.

Dennis

Sent: Friday, 26 July 2013 at 15:43
From: arun smartpink...@yahoo.com
To: dennis1...@gmx.net
Cc: R help r-help@r-project.org, Rui Barradas ruipbarra...@sapo.pt
Subject: Re: [R] How to split two levels several times?

It would be better to wrap it in a function.

    fun1 <- function(x, colName, N, value){
      rl <- rle(as.character(x[, colName]))
      dat <- do.call(rbind, lapply(seq_along(rl$lengths), function(i){
        x1 <- if (rl$values[i] == value & (rl$lengths[i] %/% N > 1)) rep(N, rl$lengths[i] %/% N) else rl$lengths[i]
        data.frame(Len = x1, Val = rl$values[i])
      }))
      lst1 <- split(cumsum(dat[, 1]), ((seq_along(dat[, 1]) - 1) %/% 2) + 1)
      vec1 <- sapply(lst1, max)
      vec2 <- c(1, vec1[-length(vec1)] + 1)
      res <- lapply(seq_along(lst1), function(i) {x1 <- lst1[[i]]; x[seq(vec2[i], max(x1)), ]})
      res
    }
    fun1(XXX, "electrode", 6, "electrode4")
    # Using previous datasets XXX, XXX1, XXX2
    fun1(XXX, "electrode", 3, "electrode1")
    fun1(XXX1, "electrode", 3, "electrode1")
    fun1(XXX2, "electrode", 3, "electrode1")

A.K.

----- Original Message -----
From: arun smartpink...@yahoo.com
To: dennis1...@gmx.net
Cc: R help r-help@r-project.org; Rui Barradas ruipbarra...@sapo.pt
Sent: Friday, July 26, 2013 9:26 AM
Subject: Re: [R] How to split two levels several times?

Hi Dennis,

I guess in this case, instead of electrode1 occurring 3 times, it is electrode4 occurring 6 times. If that is the situation, just change the following (XXX is the data):

    rl <- rle(as.character(XXX$electrode))
    dat <- do.call(rbind, lapply(seq_along(rl$lengths), function(i){
      x1 <- if (rl$values[i] == "electrode4" & (rl$lengths[i] %/% 6 > 1)) rep(6, rl$lengths[i] %/% 6) else rl$lengths[i]
      data.frame(Len = x1, Val = rl$values[i])
    }))
    lst1 <- split(cumsum(dat[, 1]), ((seq_along(dat[, 1]) - 1) %/% 2) + 1)
    vec1 <- sapply(lst1, max)
    vec2 <- c(1, vec1[-length(vec1)] + 1)
    res <- lapply(seq_along(lst1), function(i) {x1 <- lst1[[i]]; XXX[seq(vec2[i], max(x1)), ]})
    res

    [[1]]
       electrode length
    1 electrode1    206
    2 electrode1    194
    3 electrode1    182
    4 electrode1    172
    5 electrode1    169
    6 electrode2     82
    7 electrode2     78
    8 electrode2     70
    9 electrode2     58

    [[2]]
        electrode length
    10 electrode1    206
    11 electrode1    194
    12 electrode1    182
    13 electrode1    172
    14 electrode1    169
    15 electrode3    260
    16 electrode3    176
    17 electrode3    137

    [[3]]
        electrode length
    18 electrode1    206
    19 electrode1    194
    20 electrode1    182
    21 electrode1    172
    22 electrode1    169
    23 electrode4     86
    24 electrode4     66
    25 electrode4     64
    26 electrode4     52
    27 electrode4     27
    28 electrode4     26

    [[4]]
        electrode length
    29 electrode2     82
    30 electrode2     78
    31 electrode2     70
    32 electrode2     58
    33 electrode1    206
    34 electrode1    194
    35 electrode1    182
    36 electrode1    172
    37 electrode1    169

    [[5]]
        electrode length
    38 electrode2     82
    39 electrode2     78
    40 electrode2     70
    41 electrode2     58
    42 electrode3    260
    43 electrode3    176
    44 electrode3    137

    [[6]]
        electrode length
    45 electrode2     82
    46 electrode2     78
    47 electrode2     70
    48 electrode2     58
    49 electrode4     86
    50 electrode4     66
    51 electrode4     64
    52 electrode4     52
    53 electrode4     27
    54 electrode4     26

    [[7]]
        electrode length
    55 electrode3    260
    56 electrode3    176
    57 electrode3    137
    58 electrode1    206
    59 electrode1    194
    60 electrode1    182
    61 electrode1    172
    62 electrode1    169

    [[8]]
        electrode length
    63 electrode3    260
    64 electrode3    176
    65 electrode3    137
    66 electrode2     82
    67 electrode2     78
    68 electrode2     70
    69 electrode2     58

    [[9]]
        electrode length
    70 electrode3    260
    71 electrode3    176
    72 electrode3    137
    73 electrode4     86
    74 electrode4     66
    75 electrode4     64
    76 electrode4     52
    77 electrode4     27
    78 electrode4     26

    [[10]]
        electrode length
    79 electrode4     86
    80 electrode4     66
    81 electrode4     64
    82 electrode4     52
    83 electrode4     27
    84 electrode4     26
    85 electrode1    206
    86 electrode1    194
    87 electrode1    182
    88 electrode1    172
    89 electrode1    169

    [[11]]
        electrode length
    90 electrode4     86
    91 electrode4     66
    92 electrode4     64
    93 electrode4     52
    94 electrode4     27
    95 electrode4     26
    96 electrode2     82
    97 electrode2     78
    98 electrode2     70
    99 electrode2     58

    [[12]]
         electrode length
    100 electrode4     86
    101 electrode4     66
    102 electrode4     64
    103 electrode4     52
    104 electrode4     27
    105 electrode4     26
    106 electrode3    260
    107 electrode3    176
    108 electrode3    137

A.K.

----- Original Message -----
From: dennis1...@gmx.net
To: Rui Barradas ruipbarra...@sapo.pt; r-help@r-project.org
Sent: Friday, July 26, 2013 6:07 AM
Subject: Re: [R] How to split two levels several times?

Hi Rui & Arun, really thanks for investing so much
[R] How to double integrate a function in R
I would like to express my gratitude for the great help given by David and Hans regarding my last query. Thank you very much for your time, people. All the best, Tiago

---

Hello, R users! I am trying to double integrate the following expression:

# expression
(1/(2*pi))*exp(-y2/2)*sqrt((y1/(y2-y1)))

for y2 > y1 > 0. I am trying the following approach:

# first attempt
library(cubature)
fun <- function(x) { (1/(2*pi))*exp(-x[2]/2)*sqrt((x[1]/(x[2]-x[1]))) }
adaptIntegrate(fun, lower = c(0,0), upper = c(5, 6), tol = 1e-8)

However, I don't know how to constrain the integration so that y2 > y1 > 0. Any ideas?

Tiago

-- Tiago V. Pereira, MSc, PhD Center for Studies of the Human Genome Department of Genetics and Evolutionary Biology University of São Paulo Rua do Matão, 277 CEP 05508-900 São Paulo - SP, Brazil
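One way to impose y2 > y1 > 0 is to integrate iteratively, letting the inner limits depend on the outer variable rather than integrating over a rectangle. The sketch below uses only base R's integrate() and assumes the region of interest is the whole wedge 0 < y1 < y2 < Inf, which is not what the post's rectangle (0,5) x (0,6) covers. It also assumes the substitution y1 = y2*sin(theta)^2, under which sqrt(y1/(y2-y1)) dy1 becomes 2*y2*sin(theta)^2 dtheta, so the inner integrand no longer blows up at y1 = y2. The names g, inner_one, inner and total are illustrative only.

```r
# Inner integrand after substituting y1 = y2 * sin(theta)^2, theta in (0, pi/2).
# The constant and the exp() factor come straight from the posted expression.
g <- function(theta, y2) (1 / (2 * pi)) * exp(-y2 / 2) * 2 * y2 * sin(theta)^2

# Inner integral for one fixed y2: covers all y1 in (0, y2).
inner_one <- function(y2) integrate(g, lower = 0, upper = pi / 2, y2 = y2)$value

# integrate() hands the outer integrand a vector of y2 values, so vectorise.
inner <- function(y2) vapply(y2, inner_one, numeric(1))

# Outer integral over y2 in (0, Inf).
total <- integrate(inner, lower = 0, upper = Inf)$value
total
```

Over this wedge the integral can be worked out by hand: the inner integral is y2*exp(-y2/2)/4, which integrates to 1, so total should come out very close to 1. For a truncated region the same pattern applies with different limits, although the substitution then needs adjusting.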
Re: [R] Alternative method for range-matching within 2 nested loops in R?
Look into the findInterval function.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of John Helly
Sent: Saturday, July 27, 2013 3:29 PM
To: r-help@r-project.org
Subject: [R] Alternative method for range-matching within 2 nested loops in R?

Hi. I've been puzzling about how to replace the nested loops below. The idea is that the B dataframe has rows with a POSIX datetime and the C dataframe has POSIX Start and End times. I want to assign a value to the observations in B based on intersecting the appropriate time interval in C. I haven't been able to discern a more efficient way to do this. Any suggestions would be most appreciated.

brows = dim(B)[1]
mrows = dim(C)[1]
for (i in 1:brows) {
  for (j in 1:mrows) {
    if (B$Datetime[i] >= C$DT_Start[j] & B$Datetime[i] <= C$DT_End[j]) {
      B$Site[i] = C$Proximity[j]
    }
  }
}

-- John Helly, University of California, San Diego / San Diego Supercomputer Center / Scripps Institution of Oceanography / 760 840 8660 mobile / stonesteps (Skype) / stonesteps7 (iChat) / http://www.sdsc.edu/~hellyj
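To make the findInterval suggestion concrete, here is a minimal sketch. It assumes the intervals in C are sorted by DT_Start and do not overlap; the toy values in B and C are made up for illustration, and only the column names come from the post.

```r
# Toy data standing in for the poster's B and C (values are invented).
B <- data.frame(Datetime = as.POSIXct(c("2013-07-01 00:30:00",
                                        "2013-07-01 02:15:00",
                                        "2013-07-01 05:00:00"), tz = "UTC"))
C <- data.frame(
  DT_Start  = as.POSIXct(c("2013-07-01 00:00:00", "2013-07-01 02:00:00"), tz = "UTC"),
  DT_End    = as.POSIXct(c("2013-07-01 01:00:00", "2013-07-01 03:00:00"), tz = "UTC"),
  Proximity = c("near", "far"),
  stringsAsFactors = FALSE
)

# For each Datetime: index of the last interval starting at or before it (0 if none).
idx <- findInterval(as.numeric(B$Datetime), as.numeric(C$DT_Start))

# A candidate only counts if the Datetime is also at or before that interval's end.
hit <- idx > 0
hit[hit] <- B$Datetime[hit] <= C$DT_End[idx[hit]]

# Rows that fall in no interval stay NA.
B$Site <- NA_character_
B$Site[hit] <- C$Proximity[idx[hit]]
B
```

This replaces the brows x mrows comparisons with one binary search per row of B. If the intervals in C can overlap, findInterval alone is not enough, because it returns only one candidate interval per observation.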
[R] legend in ggmap
Hello, how can I visualize the legend in the following ggmap plot? In the legend I just want to show the size ranges of the data: ggmap(map) + geom_point(aes(x=Longitude,y=Latitude), size = dataRd[,2]/12, col=2, data=dataRd, alpha=0.7, lwd=2) Thanks Edoardo
Re: [R] R function
Rui has shown you a much more efficient way to code your function in R. To fix the code you posted, you need to add brackets around the loop, test x[i] instead of i (which is always >= 1), and get the length of the loop from x, not pah1$P. Without the brackets only the first if() is included in the for loop:

T <- function(x) {
  for (i in 1:length(x)) {
    if (x[i] >= 1) print("Combustion")
    if (x[i] < 1) print("Petroleum")
  }
}

- David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Rui Barradas
Sent: Monday, July 29, 2013 3:46 AM
To: javad bayat
Cc: r-help@r-project.org
Subject: Re: [R] R function

Hello, Sorry, that should be

T <- function(x){ ifelse(x >= 1, "Combustion", "Petroleum") }

Rui Barradas

On 29-07-2013 09:32, Rui Barradas wrote:

Hello, Try the following.

T <- function(x){ ifelse(pah1$P >= 1, "Combustion", "Petroleum") }
T(pah1$P[1:83])

Hope this helps, Rui Barradas

On 29-07-2013 06:35, javad bayat wrote:

Dear R users; I am an MSc student and I want to write my own function, but I can't complete it. Please help me solve it. Here is my code:

pah1$P = (pah1$Fluoranthene/pah1$Pyrene)
T = function(x){ for (i in 1:length(pah1$P)) if (i >= 1) print("Combustion") if (i < 1) print("Petroleum") }
T(pah1$P[c(1:83),])

I want R to give me a column that says Combustion if the value is greater than or equal to one and Petroleum if the value is less than one, but my function does not work. Thank you so much for your help.
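The vectorised version can be tried end to end on a small made-up data frame. In this sketch the numbers are invented and the function is named classify_source rather than T, which is my own choice: T is also R's shorthand for TRUE, so redefining it can cause confusing side effects.

```r
# Invented example values; only the column names follow the original post.
pah <- data.frame(Fluoranthene = c(2.0, 0.5, 1.2),
                  Pyrene       = c(1.0, 1.0, 1.2))
pah$P <- pah$Fluoranthene / pah$Pyrene

# ifelse() tests every element at once, so no explicit loop is needed.
classify_source <- function(ratio) ifelse(ratio >= 1, "Combustion", "Petroleum")

# Store the labels as a new column instead of printing them.
pah$Source <- classify_source(pah$P)
pah
```

Assigning the result to a column gives the poster the requested column directly, whereas print() inside a loop only writes to the console.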
Re: [R] Error: Line starting ' ...' is malformed!
A recent version of the R extension manual says, "For maximal portability, the 'DESCRIPTION' file should be written entirely in ASCII — if this is not possible it must contain an 'Encoding' field (see below)."

It also says, regarding the DESCRIPTION file, "Fields start with an ASCII name immediately followed by a colon: the value starts after the colon and a space."

-- Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 7/26/13 11:08 AM, Ross Boylan r...@biostat.ucsf.edu wrote:

A DESCRIPTION file begins with 0xFFFE and

$ file DESCRIPTION
DESCRIPTION: Little-endian UTF-16 Unicode text, with CRLF, CR line terminators

I think it was created on Windows. In R (2.15.1 on Debian GNU/Linux), using roxygen2, I get

> roxygenize("../GitHub/mice")
Error: Line starting '��P ...' is malformed!
Enter a frame number, or 0 to exit
1: roxygenize("../GitHub/mice")
2: read.description(DESCRIPTION)
3: read.dcf(file)
Selection: 3
Called from: read.description(DESCRIPTION)
Browse[1]> Q

I'm not sure if the first 2 characters after "line starting '", which are octal 377, 376, will survive email; I stripped them out of the subject line. The files (DESCRIPTION isn't the only one) have also caused trouble for git (even on Windows 7), since it thinks they are binary. Any advice about what to do? I'm reluctant to change the format of the files because it's not my package.

Ross Boylan
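Since several files in the package are affected, it may help to check them for a byte-order mark before handing them to read.dcf(). The helper below is a hypothetical sketch and not part of roxygen2 or base R; it only inspects the first bytes of a file and does not convert anything. The demo file contains the bytes FF FE (octal 377, 376, as in the post) followed by the letter P encoded as UTF-16LE.

```r
# Hypothetical helper: report which byte-order mark, if any, a file starts with.
detect_bom <- function(path) {
  b <- readBin(path, what = "raw", n = 3L)
  starts_with <- function(sig) {
    length(b) >= length(sig) && all(b[seq_along(sig)] == sig)
  }
  if (starts_with(as.raw(c(0xEF, 0xBB, 0xBF)))) return("UTF-8")
  if (starts_with(as.raw(c(0xFF, 0xFE)))) return("UTF-16LE")
  if (starts_with(as.raw(c(0xFE, 0xFF)))) return("UTF-16BE")
  "none"
}

# Demo file: BOM FF FE followed by "P" (0x50 0x00) in little-endian UTF-16.
tmp <- tempfile()
writeBin(as.raw(c(0xFF, 0xFE, 0x50, 0x00)), tmp)
detect_bom(tmp)
```

A file reported as UTF-16LE could then presumably be re-read through a connection opened with a matching encoding argument and written back out as ASCII or UTF-8, though how the mark itself is handled depends on the platform's iconv. Note that this simple check would also label a UTF-32LE file as UTF-16LE, because both start with FF FE.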
Re: [R] Chinese characters in html source captured by download.file() are garbled code , how to convert it readable
Try adding mode="wb" to download.file(), or just use downloadFile() of R.utils.

/Henrik

On Sun, Jul 28, 2013 at 8:32 PM, Yong Wang wangyo...@gmail.com wrote:

Dear list, I am working with R to download numerous HTML source files, from which the extracted data will be further processed. The problem is that the Chinese characters in the HTML source are all garbled, and I can't find a way to convert them to something readable. This problem persists on ubuntu-10 and win-7, English environment. I have not yet tried an operating system in Chinese. I know literally nothing about encoding, and a comprehensive search online has not saved me from this woe.

# the code
download.file("https://www.google.com.hk/finance/company_news?q=SHA:601857&gl=cn&num=200", destfile="tmp.txt")
test <- readLines("tmp.txt", encoding="UTF-8")

# the garbled code in tmp.txt and test is like below
#��#22269;�۪o�ѵM�a�ѥq�]�

Any help is highly appreciated.

yong
Re: [R] Multiple interaction terms in GAMM-model
Thanks for your extended reply. Application of the splines seems to be very plausible. I have now added random effects to the GAMM model: random= list( objectID= ~1|doy ) (is that defined correctly?) But I am wondering how to include both time and space in one te() function? Or would it be better not to use by= factor(region), but a special spatial autocorrelation with the x and y coordinates per objectID? Thus something like: correlation= corAR1( form= ~ objectX + objectY ) Thus resulting in the GAMM model:

model.formula <- formula( tau ~ te( x1, doy, bs= c('cr','cc' ) ) + ... + te( x4, doy, bs= c('cr', 'cc' ) ) )
model <- gamm( formula= model.formula, random= list( farmID= ~1|dayOfTheYear ), correlation= corAR1( form= ~ farmX+farmY ), control= ctrl, na.action= na.omit )

Concerning the memory, yes this will be an issue. I have a 16 GB server available with 6 processors. Maybe it would be wise to run 4 separate GAMM models, i.e. with x1, x2, x3, and x4 separated. Thanks in advance.

Jeroen
[R] package ridge-how to obtain R squared
Dear all, I'm using package ridge to deal with multicollinearity. It's been convenient to have lambda chosen automatically. However, how do I tell whether the OLS results have been improved by applying ridge regression? I only notice that more variables become statistically significant and that some variables' standard errors have decreased. What is the code to compute R squared for linearRidge() in package ridge? Thanks a lot!

Hermia
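R squared can be computed by hand from observed and fitted values for any fitted model. The sketch below defines a generic helper and checks it against lm() on the built-in mtcars data; the helper name r_squared is illustrative. Applying it to a linearRidge() fit would additionally require that model's fitted values, for example from a predict method if the package provides one, which is an assumption here and not shown.

```r
# R^2 = 1 - (residual sum of squares) / (total sum of squares about the mean).
r_squared <- function(y, fitted) {
  1 - sum((y - fitted)^2) / sum((y - mean(y))^2)
}

# Sanity check against an ordinary least-squares fit on built-in data.
fit <- lm(mpg ~ wt + hp, data = mtcars)
r2 <- r_squared(mtcars$mpg, fitted(fit))
r2
```

One caveat: on the data used for fitting, OLS minimises the residual sum of squares, so a ridge fit cannot have a higher in-sample R squared than OLS. Comparing the two on held-out data or by cross-validation is therefore a more informative way to judge whether ridge helped.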
[R] replace Na values with the mean of the column which contains them
Hi everyone

I have a problem with replacing the NA values with the mean of the column which contains them. If I replace NA with the mean of the remaining values in the column, the mean of the whole column will still be the same as if I had omitted the NA values. I have the following data

de
             [,1]        [,2]       [,3]
 [1,]          NA -0.26928087 -0.1192078
 [2,]          NA  1.20925752  0.9325334
 [3,]          NA  0.38012008 -1.8927164
 [4,]          NA -0.41778861  1.4330507
 [5,]          NA -0.49677462  0.2892706
 [6,]          NA -0.13248754  1.3976522
 [7,]          NA -0.54179054  0.2295291
 [8,]          NA  0.35788624 -0.5009389
 [9,]  0.27500571 -0.41467591 -0.3426560
[10,] -3.07568579 -0.59234248 -0.8439027
[11,] -0.42240954  0.73642396 -0.4971999
[12,] -0.26901731 -0.06768044 -1.6127122
[13,]  0.01766284 -0.40321968 -0.6508823
[14,] -0.80999580 -1.52283305  1.4729576
[15,]  0.20805934  0.25974308 -1.6093478
[16,]  0.03036708 -0.04013730  0.1686006

and I wrote the code

de[which(is.na(de))] <- sapply(seq_len(ncol(de)), function(i) {mean(de[,i], na.rm=TRUE)})

I get as the result

             [,1]        [,2]       [,3]
 [1,] -0.50575168 -0.26928087 -0.1192078
 [2,]     -0.1376  1.20925752  0.9325334
 [3,] -0.13412312  0.38012008 -1.8927164
 [4,] -0.50575168 -0.41778861  1.4330507
 [5,]     -0.1376 -0.49677462  0.2892706
 [6,] -0.13412312 -0.13248754  1.3976522
 [7,] -0.50575168 -0.54179054  0.2295291
 [8,]     -0.1376  0.35788624 -0.5009389
 [9,]  0.27500571 -0.41467591 -0.3426560
[10,] -3.07568579 -0.59234248 -0.8439027
[11,] -0.42240954  0.73642396 -0.4971999
[12,] -0.26901731 -0.06768044 -1.6127122
[13,]  0.01766284 -0.40321968 -0.6508823
[14,] -0.80999580 -1.52283305  1.4729576
[15,]  0.20805934  0.25974308 -1.6093478
[16,]  0.03036708 -0.04013730  0.1686006

It has replaced the first NA in the first column with the mean of the first column (-0.505...), the second cell with the mean of the second column, etc.

I want to have the result like this:

             [,1]        [,2]       [,3]
 [1,] -0.50575168 -0.26928087 -0.1192078
 [2,] -0.50575168  1.20925752  0.9325334
 [3,] -0.50575168  0.38012008 -1.8927164
 [4,] -0.50575168 -0.41778861  1.4330507
 [5,] -0.50575168 -0.49677462  0.2892706
 [6,] -0.50575168 -0.13248754  1.3976522
 [7,] -0.50575168 -0.54179054  0.2295291
 [8,] -0.50575168  0.35788624 -0.5009389
 [9,]  0.27500571 -0.41467591 -0.3426560
[10,] -3.07568579 -0.59234248 -0.8439027
[11,] -0.42240954  0.73642396 -0.4971999
[12,] -0.26901731 -0.06768044 -1.6127122
[13,]  0.01766284 -0.40321968 -0.6508823
[14,] -0.80999580 -1.52283305  1.4729576
[15,]  0.20805934  0.25974308 -1.6093478
[16,]  0.03036708 -0.04013730  0.1686006

Thanks in advance
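The repeating pattern in the first column arises because the right-hand side has only three values (one mean per column), and R recycles them down the NA positions in order, regardless of which column each NA sits in. A minimal sketch of one way around this, on a small made-up matrix, is to look up each NA's own column with col(); the names m, na_pos and col_means are illustrative.

```r
# Small invented matrix: two NAs in column 1, one NA in column 2.
m <- matrix(c(NA, NA, 1, 3,
              2,  NA, 4, 6), ncol = 2)

na_pos    <- is.na(m)                  # logical matrix marking the NAs
col_means <- colMeans(m, na.rm = TRUE) # one mean per column: 2 and 4

# col(m) gives the column number of every cell, so each NA picks up
# the mean of the column it actually belongs to.
m[na_pos] <- col_means[col(m)[na_pos]]
m
```

After the replacement the column means are unchanged, which matches the observation in the post that imputing the mean gives the same column mean as omitting the NA values.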
[R] discontinous ssa forecast
Hello, I compute a singular spectrum analysis of a time series using ssa() from the Rssa package. Then I compute the forecast based on the results of the singular spectrum analysis (ssa). Here I observe that the original time series and the forecast are discontinuous. How can I force the forecast to start at the last value (x,y) of the original time series? This minimal setup should show the (my) problem:

library(Rssa)
md=data.frame(time=1:2000,val=runif(1000))
sdd = ts(md[,2], start=0, freq=1)
s <- ssa(sdd)
f1 <- forecast(s,groups=list(1:4),len=60)
plot(f1,xlim=c(1950,2100))

I use the latest version of Rssa and R on Linux. Many greetings and TIA

ingo
[R] Course R for Beginners, September 3-6, Barcelona, Spain
Dear colleagues: This post is to inform you that some places are still available in the course below, which may be of interest to some people on the list. 2% of each fee will be donated to the R project.

Course: R FOR BEGINNERS - Second edition.
INSTRUCTORS: Dr. Klaus Langohr (Universitat Politécnica de Catalunya, Spain) and Dr. Joan Valls (Biomedical Research Institute of Lleida, Spain).
DATES: September 3-6, 2013; 24 teaching hours.
PLACE: Premises of Sabadell of the Institut Català de Paleontologia Miquel Crusafont, Sabadell, Barcelona (Spain).
Organized by: Transmitting Science and the Institut Catalá de Paleontologia Miquel Crusafont.
More information: http://www.transmittingscience.org/courses/stats/r-for-beginners/ or by writing to cour...@transmittingscience.org

The aim of this course is to give an introduction to R to people who have never used R. By the end of the course, the participants should be able to do the following in R: import/export databases to/from R, manage data sets, carry out basic statistical analysis, draw high-quality graphs, and program specific functions.

Please feel free to distribute this information among your colleagues if you consider it appropriate.

With best regards
Soledad De Esteban Trivigno, PhD.
Course Director
Transmitting Science
http://www.transmittingscience.org/
Re: [R] replace Na values with the mean of the column which contains them
On 29-07-2013, at 18:39, iza.ch1 iza@op.pl wrote:

Hi everyone I have a problem with replacing the NA values with the mean of the column which contains them. [...] I want to have the result like this: [...]

This seems to do what you want:

library(plyr)
de.res <- t(aaply(de,2,.fun=function(x) {x[which(is.na(x))] <- mean(x,na.rm=TRUE);x}))
dimnames(de.res) <- NULL

Berend
Re: [R] replace Na values with the mean of the column which contains them
Dear iza.ch1,

I hesitate to say this, because mean imputation is such a bad idea, but it's easy to do what you want with a loop, rather than puzzling over a cleverer way to accomplish the task. Here's an example using the Freedman data set in the car package:

> colSums(is.na(Freedman))
population   nonwhite    density      crime
        10          0         10          0
> means <- colMeans(Freedman, na.rm=TRUE)
> for (j in 1:ncol(Freedman)){
+     Freedman[is.na(Freedman[, j]), j] <- means[j]
+ }
> colSums(is.na(Freedman))
population   nonwhite    density      crime
         0          0          0          0
> colMeans(Freedman)
population   nonwhite    density      crime
1135.99000   10.80273  765.67000 2714.08182
> means
population   nonwhite    density      crime
1135.99000   10.80273  765.67000 2714.08182

Now you should probably think about whether you really want to do this...

Best, John

On Mon, 29 Jul 2013 18:39:48 +0200 iza.ch1 iza@op.pl wrote:

Hi everyone I have a problem with replacing the NA values with the mean of the column which contains them. [...]
Re: [R] replace Na values with the mean of the column which contains them
Consider the following:

f <- function(x){
  m <- mean(x, na.rm = TRUE)
  x[is.na(x)] <- m
  x
}
apply(de, 2, f)

HTH, Jorge.-

On Tue, Jul 30, 2013 at 2:39 AM, iza.ch1 iza@op.pl wrote:

Hi everyone I have a problem with replacing the NA values with the mean of the column which contains them. [...]
Re: [R] replace Na values with the mean of the column which contains them
On 29-07-2013, at 18:39, iza.ch1 iza@op.pl wrote:

Hi everyone I have a problem with replacing the NA values with the mean of the column which contains them. [...] I want to have the result like this: [...]

or this:

apply(de,2, function(x) {x[which(is.na(x))] <- mean(x,na.rm=TRUE);x})

Berend
Re: [R] replace Na values with the mean of the column which contains them
Hi,

de <- structure(c(NA, NA, NA, NA, NA, NA, NA, NA, 0.27500571, -3.07568579, -0.42240954, -0.26901731, 0.01766284, -0.8099958, 0.20805934, 0.03036708, -0.26928087, 1.20925752, 0.38012008, -0.41778861, -0.49677462, -0.13248754, -0.54179054, 0.35788624, -0.41467591, -0.59234248, 0.73642396, -0.06768044, -0.40321968, -1.52283305, 0.25974308, -0.0401373, -0.1192078, 0.9325334, -1.8927164, 1.4330507, 0.2892706, 1.3976522, 0.2295291, -0.5009389, -0.342656, -0.8439027, -0.4971999, -1.6127122, -0.6508823, 1.4729576, -1.6093478, 0.1686006), .Dim = c(16L, 3L))

Your code should be:

sapply(seq_len(ncol(de)), function(i) {de[,i][is.na(de[,i])] <- mean(de[,i],na.rm=TRUE); de[,i]})

A.K.

Hi everyone I have a problem with replacing the NA values with the mean of the column which contains them. [...]
Re: [R] add different regression lines for groups on ggplot
Thanks John, yes, you are right: I had to add separate smooth statements. Here is the code from Dennis for my case:

library(ggplot2)
ggplot(data = df, aes(x=Var1, y=log(Var2), color=SiteID, group=SiteID)) +
  geom_point() +
  geom_smooth(data = subset(df, SiteID != "AL3"), method='lm', formula= y ~ I(1/x), se=FALSE, size=2) +
  geom_smooth(data = subset(df, SiteID == "AL3"), method = "lm", formula = y ~ log(x), se = FALSE, size = 2)

On Sat, Jul 27, 2013 at 9:14 AM, John Kane jrkrid...@inbox.com wrote:

> I have not tried anything like that but have a look at
> http://stackoverflow.com/questions/7476022/geom-point-and-geom-line-for-multiple-datasets-on-same-graph-in-ggplot2
> You may be able to use two smooth statements to do what you want.
>
> John Kane
> Kingston ON Canada
>
>> -----Original Message-----
>> From: ye...@lbl.gov
>> Sent: Fri, 26 Jul 2013 12:21:23 -0700
>> To: r-help@r-project.org
>> Subject: [R] add different regression lines for groups on ggplot
>>
>> Hey All,
>>
>> I need to apply different regression lines to different groups on my ggplot, and here is the code I use:
>>
>> qplot(x=Var1,y=Var2,data=df,color=SiteID,group=SiteID)+geom_point()+geom_smooth(method='lm',formula=log(y)~I(1/x),se=FALSE,size=2)
>>
>> However the regression for the different groups is as below:
>>
>> AL1/AL2: log(y)~I(1/x)
>> AL3: log(y)~log(x)
>>
>> How can I apply each regression equation on the same ggplot? I also have an issue: if I use the code above, the regression lines do not lie on top of my data points.
>>
>> Thanks for your help!
[R] Customized interpolating spline?
Dear useRs,

I have a univariate spline application where I have a fixed set of independent variable values. I would like to be able to generate an interpolating spline, given a set of dependent variables, which can easily return the value of the interpolant over its entire range, for use in an intensive variational problem.

I am familiar with stats::spline, stats::splinefun and splines::interpSpline. The reason I cannot simply use those functions is that there are caveats on the interpolant. I would like the degree of the interpolant to vary from cubic over part of the range, while still keeping the maximum orders of derivative continuity possible at the changepoints. In other words, over some parts of the domain the interpolant will be piecewise cubic, while over other parts it might be piecewise linear and/or quadratic.

I can kludge some splines::bs() calls and their subsequent predict() calls together, by going under their hoods and manipulating their parts and attributes, then using these as the right hand side of an lm() call. There are a couple of issues with this. First, while I may get this kludge to work at first, under the hood these objects are no longer correctly specified as proper bs objects. The next application of these functions might fail badly. Second, I do not have a handle on the scope of the variational problem, and thus do not know of a permanent fine grid to use. Therefore this method requires reevaluating the bs() and predict() calls for different grids, and thus performing the subsequent kludges numerous times.

So may I be privy to efficient ways, if any, to customize an interpolating spline in this way in R? My ideal solution would easily produce a function analogous to the output of stats::splinefun, but which accounts for the varied orders. Websearching was not helpful.

Thanks,
John

John Szumiloski, Ph.D.
Associate Principle Scientist, Biostatistics
Biometrics Research
WP53B-120
Merck Research Laboratories
P.O. Box 0004
West Point, PA 19486-0004
(215) 652-7346 (PH)
(215) 993-1835 (FAX)
Re: [R] replace Na values with the mean of the column which contains them
Replacements are a case where I think an explicit for-loop is better than sapply or any other *apply function. The for-loop will make the output resemble the input, while sapply and friends will mangle the class, dimnames, and other attributes of the input. Also, if you want to replace the NA's by the mean of the containing row then you have to use t() on sapply's output. E.g.

d <- cbind(AllNAs=NA, NoNAs=c(i=1,ii=2,iii=3,iv=4,v=5), SomeNAs=rep(c(100,NA),len=5))
f1 <- function(de) sapply(seq_len(ncol(de)), function(i) {de[,i][is.na(de[,i])] <- mean(de[,i],na.rm=TRUE); de[,i]})
f2 <- function(de) { for(i in seq_len(ncol(de))) de[is.na(de[,i]),i] <- mean(de[,i], na.rm=TRUE) ; de }

str(f1(d)) # no column names
 num [1:5, 1:3] NaN NaN NaN NaN NaN 1 2 3 4 5 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:5] "i" "ii" "iii" "iv" ...
  ..$ : NULL

str(f2(d))
 num [1:5, 1:3] NaN NaN NaN NaN NaN 1 2 3 4 5 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:5] "i" "ii" "iii" "iv" ...
  ..$ : chr [1:3] "AllNAs" "NoNAs" "SomeNAs"

df <- data.frame(AllNAs=NA, NoNAs=c(i=1,ii=2,iii=3,iv=4,v=5), SomeNAs=rep(c(100+1i,NA),len=5))

str(f1(df)) # matrix of complex, not data.frame
 cplx [1:5, 1:3] NaN+0i NaN+0i NaN+0i ...

str(f2(df))
'data.frame': 5 obs. of 3 variables:
 $ AllNAs : num NaN NaN NaN NaN NaN
 $ NoNAs  : num 1 2 3 4 5
 $ SomeNAs: cplx 100+1i 100+1i 100+1i ...
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
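The remark about row means can be sketched in the same for-loop style. This is an illustration only: the helper name fill_row_means and the toy matrix are hypothetical, and the function is deliberately restricted to matrices (a one-row slice of a data.frame is itself a data.frame, so mean() would not apply directly).

```r
# Hypothetical helper: replace each NA by the mean of its own row.
# Works on numeric matrices only.
fill_row_means <- function(m) {
  stopifnot(is.matrix(m))
  for (i in seq_len(nrow(m))) {
    m[i, is.na(m[i, ])] <- mean(m[i, ], na.rm = TRUE)
  }
  m
}

# Toy data with row and column names, to show that they survive.
m <- rbind(a = c(x = 1, y = NA, z = 3),
           b = c(NA, 4, 6))
filled <- fill_row_means(m)
filled
```

Because the matrix is modified in place, no t() is needed and the dimnames are kept.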
[R] Greek symbols in study labels and custom summary lines in forest plot (meta)
Dear R helpers,

Is there a way to display mathematical notation (e.g. Greek characters, subscripts) properly in study (studlab) and group (byvar) labels in a forest plot created using the meta package?

# Example:
library(meta)
logHR <- log(runif(10,0.5,2))
selogHR <- log(runif(10,0.05,0.2))
study=c(0.1,.2,.3,.4,.5,0.1,.2,.3,.4,.5)
group=c(rep('alpha',5),rep('beta',5))
meta1=metagen(logHR, selogHR, sm="HR", studlab=paste("Fixed",expression(beta[w]),study), byvar=group)
forest(meta1, print.byvar=F)

Question 2

Is there a way to add a line to this plot at my preferred location? For example, I want to add a within-group combined estimate line (the default here is just an overall group line by random or fixed effects). I know I need to use grid.lines, e.g.

grid.lines(x = 3, y = c(0.5,1), gp = gpar(col = 5))

But for the life of me I can't work out the co-ordinate system in grid graphics!

Thank you for ANY help or tips! I've run out of things to try :(

Eleni

Eleni Rapsomaniki
Research Associate/Statistics, PhD
Clinical Epidemiology Group
Department of Epidemiology and Public Health
University College London Medical School
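One observation on the first question, as a hedged sketch rather than an answer from the list: paste() coerces its arguments to character, so the expression is flattened to the literal text "beta[w]" before the plotting function ever sees it. The snippet below only demonstrates that coercion. The Unicode variant is an untested assumption about a possible workaround; whether it renders depends on the graphics device and font.

```r
study <- c(0.1, 0.2)

# paste() turns the expression into plain text, so no plotmath survives.
lab_plain <- paste("Fixed", expression(beta[w]), study)
lab_plain

# Assumption: a Unicode escape for the Greek letter inside an ordinary
# string. This gives a beta but no true subscript.
lab_unicode <- paste0("Fixed \u03b2w ", study)
```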
[R] triangular color plot of array
Hello,

I've encountered an interesting situation and can't seem to find an applicable solution. I've got a multivariate synthetic dataset I generated in order to explore various statistical techniques. In my dataset I vary three things: sample size, effect size, and the number of variables that are affected. As these are varied I've output my results into a three-dimensional array. So for each possible combination (think xyz location) I have an output value.

What I would like to do is to create a somewhat unique style of plot, very similar to a triangular soil texture plot, except that rather than dropping a point at a given coordinate, I have all possible coordinates on the grid, and I would like to overlay a color map in which combinations that yield high values shade towards one color and low values towards another, or some other such color scheme. This would display under what conditions certain accuracies are achieved for the test.

I've explored both the soil texture plotting solutions in R and, as best I can with little background in image work, the 3d plotting solutions offered by various packages. I haven't found anything that seems to be able to handle an array this way. I was wondering if anyone could point me in the right direction.

P~
Re: [R] replace Na values with the mean of the column which contains them
On Jul 29, 2013, at 9:39 AM, iza.ch1 wrote:

> Hi everyone
>
> I have a problem with replacing the NA values with the mean of the column which contains them. If I replace NA with the mean of the remaining values in the column, the mean of the whole column will still be the same as if I had omitted the NA values. I have the following data
>
> de
>              [,1]        [,2]       [,3]
>  [1,]          NA -0.26928087 -0.1192078
>  [2,]          NA  1.20925752  0.9325334
>  [3,]          NA  0.38012008 -1.8927164
>  [4,]          NA -0.41778861  1.4330507
>  [5,]          NA -0.49677462  0.2892706
>  [6,]          NA -0.13248754  1.3976522
>  [7,]          NA -0.54179054  0.2295291
>  [8,]          NA  0.35788624 -0.5009389
>  [9,]  0.27500571 -0.41467591 -0.3426560
> [10,] -3.07568579 -0.59234248 -0.8439027
> [11,] -0.42240954  0.73642396 -0.4971999
> [12,] -0.26901731 -0.06768044 -1.6127122
> [13,]  0.01766284 -0.40321968 -0.6508823
> [14,] -0.80999580 -1.52283305  1.4729576
> [15,]  0.20805934  0.25974308 -1.6093478
> [16,]  0.03036708 -0.04013730  0.1686006

Why not replace with a result that would have both the same mean and standard deviation as the existing data?

set.seed(123)
de[,1][is.na(de[,1])] <- rnorm(sum(is.na(de[,1])),        # specify the number of random values
                               mean(de[,1], na.rm=TRUE),
                               sd(de[,1], na.rm=TRUE))

--
David.
David Winsemius
Alameda, CA, USA
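To make the trade-off between the two suggestions concrete, here is a small worked sketch on an invented vector x (not the poster's data). Filling with the column mean keeps the mean exactly but shrinks the standard deviation, while filling with random normal draws matches the observed mean and sd only in expectation, not exactly.

```r
# Invented example: three missing values, four observed ones.
x <- c(NA, NA, NA, 2, 4, 6, 8)
obs_mean <- mean(x, na.rm = TRUE)   # mean of the observed values
obs_sd   <- sd(x, na.rm = TRUE)     # sd of the observed values

# Mean imputation: mean unchanged, spread reduced.
mean_filled <- replace(x, is.na(x), obs_mean)
c(mean = mean(mean_filled), sd = sd(mean_filled))

# Random-draw imputation: values vary with the seed.
set.seed(123)
n_missing <- sum(is.na(x))
draw_filled <- replace(x, is.na(x), rnorm(n_missing, mean = obs_mean, sd = obs_sd))
c(mean = mean(draw_filled), sd = sd(draw_filled))
```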
[R] about R stat.table function
Hi R Help group,

I am trying to use the stat.table function in my R script. When I run stat.table in R, it shows that No stat.function found, and when I try to get help using ?stat.table it shows:

No documentation for 'stat.table' in specified packages and libraries: you could try '??stat.table'.

It seems there is no stat.table function in my R. Do I need to install this function? From which website could I install it? Could you please guide me on how to do this?

Many thanks,
Lan
[R] split beanplot in ggplot2 - adjusting bandwidth
Dear R Users,

I'm attempting to create a split beanplot in ggplot2 and have figured most of it out except how to adjust the bandwidth for the density statistic. I read online that geom_violin will not plot 2 separate sets of data and that geom_ribbon should be used to create a split beanplot. So, I was able to create this as a quick example that does a split beanplot of mpg for 6-cylinder vehicles vs 8-cylinder vehicles:

p <- ggplot(mtcars)
p + geom_ribbon(data=subset(mtcars,mtcars$cyl==6), aes(x=mpg,ymax=..density..,ymin=0), stat="density") +
    geom_ribbon(data=subset(mtcars,mtcars$cyl==8), aes(x=mpg,ymax=0,ymin=-..density..), stat="density")

What I can't figure out is how to adjust the bandwidth (e.g. as with 'adjust' in geom_density) for the density statistic in the ribbon plot to make the density plots either more or less smooth. Or, is there a better way to go about this than what I've currently tried? Any advice you might have on this would be greatly appreciated!

Thank you for your help!

cheers,
Andy
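One way to get full control over the smoothing, offered as a sketch rather than the definitive ggplot2 answer, is to compute the two densities up front with stats::density(), which has an adjust argument, and hand the results to geom_ribbon with the default identity stat. The names adj, top and bottom are made up for this example. Passing adjust straight through geom_ribbon(stat = "density", adjust = ...) may also work, but that is an assumption not checked here.

```r
mpg6 <- mtcars$mpg[mtcars$cyl == 6]
mpg8 <- mtcars$mpg[mtcars$cyl == 8]

adj <- 2                                # >1 smoother, <1 wigglier
d6 <- density(mpg6, adjust = adj)       # bandwidth = adj * default bandwidth
d8 <- density(mpg8, adjust = adj)

# One data frame per half of the bean: 6 cylinders above zero, 8 below.
top    <- data.frame(mpg = d6$x, ymin = 0,     ymax = d6$y)
bottom <- data.frame(mpg = d8$x, ymin = -d8$y, ymax = 0)

# Build the plot only if ggplot2 is available; nothing is drawn until
# the object p is printed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot() +
    geom_ribbon(data = top,    aes(x = mpg, ymin = ymin, ymax = ymax)) +
    geom_ribbon(data = bottom, aes(x = mpg, ymin = ymin, ymax = ymax))
}
```

Precomputing the densities also sidesteps differences between ggplot2 versions in how computed variables such as ..density.. are referenced.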
Re: [R] triangular color plot of array
A triangular plot can represent three dimensions in a two dimensional plane because the three dimensions are constrained by the fact that they sum to 100% (e.g. sand/silt/clay composition of a soil). That does not seem to apply to your example. It sounds like you might need a 3d contour plot. You might look at package misc3d, function contour3d. See the R Graphical Manual for some examples:

http://rgm3.lab.nig.ac.jp/RGM/R_image_list?package=misc3d

Alternatively you can use lattice to draw 2d contours for a series of slices. That may not be as snazzy, but it will probably be easier to interpret.

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352
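The slice-by-slice idea can be sketched as follows. The array arr, its dimension names (n, effect, nvars) and the value column name accuracy are all invented stand-ins for the poster's results; the approach is simply to reshape the 3d array into a long data frame and let lattice draw one colored panel per slice.

```r
set.seed(1)
# Hypothetical results array: 4 sample sizes x 3 effect sizes x 2 counts
# of affected variables, filled with random "accuracy" values.
arr <- array(runif(4 * 3 * 2), dim = c(4, 3, 2),
             dimnames = list(n      = c("10", "20", "50", "100"),
                             effect = c("0.2", "0.5", "0.8"),
                             nvars  = c("1", "5")))

# One row per cell of the array, with the dimnames as factor columns.
long <- as.data.frame(as.table(arr), responseName = "accuracy")
head(long)

# lattice ships with R as a recommended package; the trellis object is
# only drawn when printed.
if (requireNamespace("lattice", quietly = TRUE)) {
  p <- lattice::levelplot(accuracy ~ n * effect | nvars, data = long)
}
```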
[R] cross-correlation with R
Dear R users,

I am a student at the TU Bergakademie Freiberg and am using R for the first time. I have created cross-correlations of air pressure, outside temperature, laboratory temperature and X-ray radiation intensity. However, I do not know how to interpret the graphs. Can someone help me?

Best regards,
Tina Weigel
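On the purely mechanical side of reading such a graph, a toy sketch may help; it assumes the plots came from stats::ccf(), which the post does not say. According to the ccf documentation, the value at lag k estimates the correlation between x[t+k] and y[t]. In the simulated series below, y is a copy of x delayed by two steps, so the largest bar should sit at a negative lag, meaning x leads y.

```r
set.seed(1)
x <- rnorm(200)
y <- c(0, 0, x[1:198])          # y[t] = x[t-2]: y lags x by two steps

# plot = FALSE returns the estimates instead of drawing them.
r <- ccf(x, y, lag.max = 5, plot = FALSE)

# Lag at which the cross-correlation is largest.
peak_lag <- r$lag[which.max(r$acf)]
peak_lag
```

What a peak means for real measurements such as pressure or temperature is a subject-matter question that this sketch does not address.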
Re: [R] Jul 26, 2013; 12:34am
Dear Frank,

you can use grconvertY and grconvertX to convert from user coordinates to inches etc. and vice versa. E.g.

lines(rep(.25*1500,2), grconvertY(grconvertY(par("usr")[3],"user","inches")+c(0,.1), from="inches", to="user"))

should produce a tick of 0.1 inches height, regardless of the actual plot height.

Hope this helps.

On 26.07.2013 13:57, Frank Harrell wrote:

> Thanks Rich and Jim and apologies for omitting the line
>
> x <- c(285, 43.75, 94, 150, 214, 375, 270, 350, 41.5, 210, 30, 37.6, 281, 101, 210)
>
> But the fundamental problem remains that vertical spacing is not correct unless I waste a lot of image space at the top.
>
> Frank

--
Eik Vettorazzi
Department of Medical Biometry and Epidemiology
University Medical Center Hamburg-Eppendorf
Martinistr. 52
20246 Hamburg
T ++49/40/7410-58243
F ++49/40/7410-57790
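A self-contained sketch of the same conversion, with a check that the tick really is 0.1 inches tall. The plot of 1:10 and the x position 5 are arbitrary stand-ins for Frank's data, and pdf(NULL) is used only so that the example runs without writing a file.

```r
pdf(NULL)                       # off-screen device, no file is written
plot(1:10)

y0 <- par("usr")[3]             # bottom of the plot region, user coordinates

# Go to inches, add 0.1 inch, and come back to user coordinates.
y1 <- grconvertY(grconvertY(y0, "user", "inches") + 0.1, "inches", "user")
lines(rep(5, 2), c(y0, y1))     # tick at x = 5

# Height of the tick measured in inches on the device.
tick_in <- grconvertY(y1, "user", "inches") - grconvertY(y0, "user", "inches")
invisible(dev.off())
tick_in
```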
Re: [R] Aggregate
Hi,

You could try:

dat1 <- read.table(text="
ID Group1
1 1
1 0
1 1
1 0
2 1
2 1
2 0
2 1
5 1
5 1
5 1
5 0
", sep="", header=TRUE)

library(plyr)
res <- ddply(dat1, .(ID), summarize, yes=sum(Group1), no=length(Group1)-sum(Group1))
res
#  ID yes no
#1  1   2  2
#2  2   3  1 #need to check
#3  5   3  1 ###

#or
do.call(rbind, by(dat1$Group1, dat1$ID, table))
#  0 1
#1 2 2
#2 1 3
#5 1 3

#or
do.call(rbind, with(dat1, tapply(Group1, ID, FUN=table)))
#  0 1
#1 2 2
#2 1 3
#5 1 3

A.K.

> From: farnoosh sheikhi farnoosh...@yahoo.com
> To: smartpink...@yahoo.com smartpink...@yahoo.com
> Sent: Monday, July 29, 2013 4:37 PM
> Subject: Aggregate
>
> Hi Arun,
>
> I have a question about aggregation in R. I have the following data set:
>
> ID Group1
> 1 1
> 1 0
> 1 1
> 1 0
> 2 1
> 2 1
> 2 0
> 2 1
> 5 1
> 5 1
> 5 1
> 5 0
>
> I want to aggregate the data for each ID to get the number of zeros and the number of ones, something like the following data set:
>
> ID yes no
> 1 2 2
> 2 3 0
> 5 3 0
>
> I thought I could put the number of ones as YES and the number of zeroes as NO.
>
> Thanks a lot.
>
> Best,
> Farnoosh Sheikhi
Re: [R] about R stat.table function
Hello,

Where did you get that script from? You should ask the person who gave it to you for the missing function, about which we know nothing.

Hope this helps,

Rui Barradas
Re: [R] about R stat.table function
Where did you find out about stat.table()? There is one in package Epi, but who knows if it is the one you are looking for? If you don't know about packages and the library() function, you need to work through a basic tutorial on R. You can't really expect to download a script file and run it without knowing anything about R.

The main webpage for R is at http://www.r-project.org/
The official documentation is at http://cran.r-project.org/manuals.html
User contributed manuals and tutorials in multiple languages at http://cran.r-project.org/other-docs.html
90 two minute R tutorials at http://www.twotorials.com/

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352
Re: [R] cross-correlation with R
Homework help is off-topic on this list (see the Posting Guide). You should use the assistance provided by your educational institution. In addition, even if this is not homework, in most cases discussions of theoretical background (interpretation) are off-topic here as well. See stats.stackexchange.com for an example of a forum where such discussions may not be off-topic.

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.
Re: [R] Aggregate
Or just

table(dat1$ID, dat1$Group1)
#     0 1
#   1 2 2
#   2 1 3
#   5 1 3

Or

xtabs(~ID+Group1, dat1)
#    Group1
# ID  0 1
#   1 2 2
#   2 1 3
#   5 1 3

Or with labeling

dat1$Group1 <- factor(dat1$Group1, labels=c("No", "Yes"))
xtabs(~ID+Group1, dat1)
#    Group1
# ID  No Yes
#   1  2   2
#   2  1   3
#   5  1   3

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352
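If the result is wanted as a data frame with the yes/no columns from the original question, the table can be reshaped by hand. This is a base-R sketch, not code from the thread; dat1 is rebuilt here from the posted values so that the snippet runs on its own.

```r
# Same 12 rows as in the question: 4 observations for each of IDs 1, 2, 5.
dat1 <- data.frame(ID     = rep(c(1, 2, 5), each = 4),
                   Group1 = c(1, 0, 1, 0,  1, 1, 0, 1,  1, 1, 1, 0))

tab <- table(dat1$ID, dat1$Group1)   # rows: ID, columns: "0" and "1"

res <- data.frame(ID  = as.numeric(rownames(tab)),
                  yes = as.vector(tab[, "1"]),   # number of ones
                  no  = as.vector(tab[, "0"]))   # number of zeros
res
```

The counts agree with the ddply() output earlier in the thread: IDs 2 and 5 each have one zero, not none as in the hand-written example in the question.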
Re: [R] Aggregate
To add:

library(reshape2)
dcast(dat1, ID~Group1, length, value.var="Group1") # ID would be a column
#  ID No Yes
#1  1  2   2
#2  2  1   3
#3  5  1   3

A.K.
Re: [R] about R stat.table function
Hi,
Try:

library(Epi)
stat.table(tension, list(count(), mean(breaks)), data=warpbreaks)
# ---------------------------------
# tension   count()  mean(breaks)
# ---------------------------------
# L              18         36.39
# M              18         26.39
# H              18         21.67
# ---------------------------------

A.K.

----- Original Message -----
From: Gu, LanYing lanying...@cancercare.on.ca
To: r-help@r-project.org r-help@r-project.org
Cc:
Sent: Monday, July 29, 2013 3:23 PM
Subject: [R] about R stat.table function

Hi R Help group,
I am trying to use the stat.table function in my R script. When I run stat.table in R, it reports that the function cannot be found, and when I try to get help using ?stat.table it says: No documentation for 'stat.table' in specified packages and libraries: you could try '??stat.table'. It seems there is no stat.table function in my R. Do I need to install this function? From which website can I install it? Could you please guide me on how to do this?

Many thanks,
Lan
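The "not found" messages in the question are what one would expect when the Epi package, which provides stat.table, is not installed or not loaded. Here is a minimal sketch, assuming Epi can be installed from CRAN; the base-R cross-check and the name by_tension are illustrative additions and not part of the original reply.

```r
# stat.table() comes from the Epi package, not from base R.
# Installing it from CRAN is a one-time step (left commented out here):
# install.packages("Epi")

# Run the call from the reply only if Epi is actually available.
if (requireNamespace("Epi", quietly = TRUE)) {
  library(Epi)
  print(stat.table(tension, list(count(), mean(breaks)), data = warpbreaks))
}

# Base-R cross-check of the same summary (count and mean of breaks per tension).
by_tension <- data.frame(
  tension = levels(warpbreaks$tension),
  count   = as.vector(tapply(warpbreaks$breaks, warpbreaks$tension, length)),
  mean    = as.vector(tapply(warpbreaks$breaks, warpbreaks$tension, mean))
)
print(by_tension)
```

The cross-check should agree with the table in the reply: 18 observations per tension level, with means of roughly 36.39, 26.39 and 21.67.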
Re: [R] Aggregate
Thanks a lot Arun. I like the do.call command a lot. So easy to use and fast :-)

Best,
Farnoosh Sheikhi

Cc: R help r-help@r-project.org
Sent: Monday, July 29, 2013 1:58 PM
Subject: Re: Aggregate

Hi,
You could try:

dat1 <- read.table(text="
ID Group1
1 1
1 0
1 1
1 0
2 1
2 1
2 0
2 1
5 1
5 1
5 1
5 0
",sep="",header=TRUE)
library(plyr)
res <- ddply(dat1,.(ID),summarize,yes=sum(Group1),no=length(Group1)-sum(Group1))
res
#  ID yes no
#1  1   2  2
#2  2   3  1   #need to check
#3  5   3  1
###
#or
do.call(rbind,by(dat1$Group1,dat1$ID,table))
#  0 1
#1 2 2
#2 1 3
#5 1 3
#or
do.call(rbind,with(dat1,tapply(Group1,ID,FUN=table)))
#  0 1
#1 2 2
#2 1 3
#5 1 3

A.K.

Sent: Monday, July 29, 2013 4:37 PM
Subject: Aggregate

Hi Arun,
I have a question about aggregation in R. I have the following data set:

ID Group1
1 1
1 0
1 1
1 0
2 1
2 1
2 0
2 1
5 1
5 1
5 1
5 0

I want to aggregate the data for each ID to get the number of zeros and the number of ones, something like the following data set:

ID yes no
1 2 2
2 3 0
5 3 0

I thought I could put the number of ones as YES and the number of zeros as NO.
Thanks a lot.

Best,
Farnoosh Sheikhi
[R] Intersecting two matrices
Dear all,

I would like to know whether there is a faster matrix-intersection package for R that handles the intersection of two integer matrices with ncol=2. Currently I am using my homemade code, adapted from a previous thread:

intersectMat <- function(mat1, mat2) {
  # mat1 and mat2 are both deduplicated
  nr1 <- nrow(mat1)
  nr2 <- nrow(mat2)
  mat2[duplicated(rbind(mat1, mat2))[(nr1 + 1):(nr1 + nr2)], ]
}

which handles:

size A= 10578373
size B= 9519807
expected intersecting time= 251.2272
intersecting for corssing MPRs took 409.602 seconds.

It scales only a little worse than linearly, but the basic operation itself is slow. I wonder whether a super-fast C/C++ extension exists for this task. Your ideas are appreciated.

Thanks!
[R] tm (text mining) package persistent storage
Hi,

My corpus is a bunch of XML files in a single directory. Each XML file contains a bunch of documents. I can create persistent storage using the PCorpus constructor by specifying a directory source (DirSource). PCorpus writes this as a key-value database. My problem is that the next time I start R, I want to read from the persistent storage created in my last session. I don't see a constructor or class in the tm package that will just take this persistent storage as input and initialize itself. Currently, I always have to process my XML corpus directory in each new R session.

--
Ashwin
Re: [R] Intersecting two matrices
I haven't looked at the size-time relationship, but im2 (below) is faster than your function on at least one example:

intersectMat <- function(mat1, mat2) {
  # mat1 and mat2 are both deduplicated
  nr1 <- nrow(mat1)
  nr2 <- nrow(mat2)
  mat2[duplicated(rbind(mat1, mat2))[(nr1 + 1):(nr1 + nr2)], , drop=FALSE]
}

im2 <- function(mat1, mat2) {
  stopifnot(ncol(mat1)==2, ncol(mat1)==ncol(mat2))
  # turn each row into a single string so that rows can be matched as keys
  toChar <- function(twoColMat) paste(sep="\1", twoColMat[,1], twoColMat[,2])
  mat1[match(toChar(mat2), toChar(mat1), nomatch=0), , drop=FALSE]
}

> m1 <- cbind(1:1e7, rep(1:10, len=1e7))
> m2 <- cbind(1:1e7, rep(1:20, len=1e7))
> system.time(r1 <- intersectMat(m1,m2))
   user  system elapsed
 430.37    1.96  433.98
> system.time(r2 <- im2(m1,m2))
   user  system elapsed
  27.89    0.20   28.13
> identical(r1, r2)
[1] TRUE
> dim(r1)
[1] 5000000       2

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of c char
Sent: Monday, July 29, 2013 4:04 PM
To: r-help@r-project.org
Subject: [R] Intersecting two matrices

Dear all,

I would like to know whether there is a faster matrix-intersection package for R that handles the intersection of two integer matrices with ncol=2. Currently I am using my homemade code, adapted from a previous thread:

intersectMat <- function(mat1, mat2) {
  # mat1 and mat2 are both deduplicated
  nr1 <- nrow(mat1)
  nr2 <- nrow(mat2)
  mat2[duplicated(rbind(mat1, mat2))[(nr1 + 1):(nr1 + nr2)], ]
}

which handles:

size A= 10578373
size B= 9519807
expected intersecting time= 251.2272
intersecting for corssing MPRs took 409.602 seconds.

It scales only a little worse than linearly, but the basic operation itself is slow. I wonder whether a super-fast C/C++ extension exists for this task. Your ideas are appreciated.

Thanks!
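The key-based idea in the reply above can be checked on a small case before trusting it on millions of rows. The following is only a sketch under stated assumptions: the function name intersect_rows, the "_" separator and the tiny test matrices are illustrative and do not come from the thread. It uses %in% instead of match, so the result keeps the row order of the second matrix.

```r
# Row-wise intersection of two two-column matrices via string keys.
# Assumes both matrices hold integers and are already deduplicated.
intersect_rows <- function(mat1, mat2) {
  stopifnot(ncol(mat1) == 2, ncol(mat2) == 2)
  # Paste the two columns into one key per row; "_" cannot occur in an integer.
  key <- function(m) paste(m[, 1], m[, 2], sep = "_")
  mat2[key(mat2) %in% key(mat1), , drop = FALSE]
}

# Small example: only the rows (1,1) and (2,2) occur in both matrices.
m1 <- cbind(1:6, c(1, 2, 3, 1, 2, 3))
m2 <- cbind(1:6, c(1, 2, 1, 2, 1, 2))
res <- intersect_rows(m1, m2)

# Reference result from the duplicated()-based approach used in the question.
ref <- m2[duplicated(rbind(m1, m2))[(nrow(m1) + 1):(nrow(m1) + nrow(m2))], ,
          drop = FALSE]
print(res)
print(identical(res, ref))
```

Comparing against the slower duplicated() version on small inputs is a cheap way to confirm that a faster variant returns the same rows.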
Re: [R] Declare BASH Array Using R System Function
Thank you. This answers my question. I am using Linux, too.

From: arun [smartpink...@yahoo.com]
Sent: Monday, 29 July 2013 11:11 PM
To: Dario Strbenac
Cc: R help
Subject: Re: [R] Declare BASH Array Using R System Function

Hi,

system("names=(X Y); echo ${names[0]}")
#sh: 1: Syntax error: "(" unexpected

#this worked for me:
system("bash -c 'names=(X Y); echo ${names[0]}'")
#X

A.K.

----- Original Message -----
From: Dario Strbenac dstr7...@uni.sydney.edu.au
To: r-help@r-project.org r-help@r-project.org
Cc:
Sent: Sunday, July 28, 2013 10:00 PM
Subject: [R] Declare BASH Array Using R System Function

Hello,

It is difficult to search for previous posts about this, since the keywords are short and ambiguous, so I hope this is not a duplicate question. I can easily declare an array on the command line.

$ names=(X Y)
$ echo ${names[0]}
X

I am unable to do the same from within R.

> system("names=(X Y)")
sh: Syntax error: "(" unexpected

Reading the documentation for the system function, it appears to only be relevant for executing commands. What can I do instead to declare a BASH array?

Thanks.

--
Dario Strbenac
PhD Student
University of Sydney
Camperdown NSW 2050
Australia
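The error in this thread comes from system() handing the command to sh, which on many Linux systems does not understand bash array syntax. A minimal sketch of the workaround follows, assuming bash is installed and on the PATH; the variable names bash_script and out are illustrative. It uses system2() with stdout = TRUE so that the shell output comes back to R as a character vector.

```r
# Each system()/system2() call starts a fresh shell, so the array must be
# declared and used within the same bash invocation.
bash_script <- "names=(X Y); echo ${names[1]}; echo ${#names[@]}"

if (nzchar(Sys.which("bash"))) {
  # shQuote() protects the script from the intermediate sh that launches bash.
  out <- system2("bash", c("-c", shQuote(bash_script)), stdout = TRUE)
  print(out)
} else {
  out <- NULL
  message("bash was not found on this system")
}
```

Because every call spawns its own shell, an array declared in one call is gone by the next; anything that needs the array has to be part of the same script string.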
Re: [R] tm (text mining) package persistent storage
On Jul 29, 2013, at 4:47 PM, Ashwani Rao wrote:

> Hi,
> My corpus is a bunch of XML files in a single directory. Each XML file contains a bunch of documents. I can create persistent storage using the PCorpus constructor by specifying a directory source (DirSource). PCorpus writes this as a key-value database. My problem is that the next time I start R, I want to read from the persistent storage created in my last session. I don't see a constructor or class in the tm package that will just take this persistent storage as input and initialize itself.

Identifying exactly what error you are making will require a copy of your history file leading up to saving and loading the data-objects in question.

--
David.

> Currently, I always have to process my XML corpus directory in each new R session.
> --
> Ashwin
> [[alternative HTML version deleted]]

And do read the posting Guide. HTML is deprecated.

--
David Winsemius
Alameda, CA, USA
Re: [R] triangular color plot of array
On 07/30/2013 04:54 AM, White, William Patrick wrote:

> Hello,
> I've encountered an interesting situation and can't seem to find an applicable solution. I've got a multivariate synthetic dataset that I generated in order to explore various statistical techniques. In my dataset I vary three things: sample size, effect size, and the number of variables that are affected. As these are varied, I output my results into a three-dimensional array. So for each possible combination (think xyz location) I have an output value.
> What I would like to do is create a somewhat unusual style of plot, very similar to a triangular soil texture plot, except that, rather than dropping a point at a given coordinate, I have all possible coordinates on the grid, and I would like to overlay a color map in which combinations that yield high values shade towards one color and low values towards another, or some other such color scheme. This would display under what conditions certain accuracies are achieved for the test.
> I've explored both the soil texture plotting solutions in R and, as best I can with little background in image work, the 3D plotting solutions offered by various packages. I haven't found anything that seems able to handle an array this way. I was wondering if anyone could point me in the right direction.
> P~

Hi William,
If you haven't looked at the triax.fill function in the plotrix package, that might be helpful. I'm not sure what resolution you want on the plot, but triax.fill will display colors down to a few pixels.

Jim