Re: [R] Anova, F statistics, P-values
Thank you for the replies. I am trying to obtain p-values and F values, and tried the script below without success.

1) I read in another forum that I should use the package lmer, but apparently it does not exist.
2) Then I tried pvals.fnc, but R could not find that function.
3) I also read that a glm model must be created first, but I was not able to get a p-value through summary(); I only got F statistics.
4) Is there a way to get both the p-value and the F statistics? Or does this require two different executions?
5) I am afraid to update my R version because I may lose ALL my saved scripts. Should I do so?

Please advise,
Jean

install.packages("lmer")
Installing package(s) into ‘/Library/Frameworks/R.framework/Versions/2.13/Resources/library’
(as ‘lib’ is unspecified)
Warning in install.packages :
  package ‘lmer’ is not available (for R version 2.13.1)

pvals.fnc(HSuccess ~ VegIndex, data = data.to.analyze)
Error: could not find function "pvals.fnc"

Model1.glm <- glm(cbind(Shells, TotalEggs-Shells) ~ HTL, data=data.to.analyze, family = binomial)
summary(Model1.glm)

--
View this message in context: http://r.789695.n4.nabble.com/Anova-tp4645130p4645242.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
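A note on the two messages above: lmer is a function in the lme4 package rather than a package of its own, and pvals.fnc came from the languageR package, which would explain both failures. As for getting F statistics and p-values from one call, here is a minimal sketch. It uses the built-in cars data and a made-up toy data frame (toy, successes, trials and dose are illustrative names, not the poster's variables), so treat it as an illustration rather than the analysis Jean needs.

```r
# Ordinary linear model on a built-in data set (not the poster's data)
fit <- lm(dist ~ speed, data = cars)
tab <- anova(fit)                 # one table holding both columns
print(tab)
f_value <- tab[["F value"]][1]    # F statistic for 'speed'
p_value <- tab[["Pr(>F)"]][1]     # matching p-value

# For a binomial glm the dispersion is fixed, so a chi-square
# (likelihood-ratio) test is the usual choice instead of an F test.
# Toy data: 'successes' out of 'trials' at each dose level.
toy <- data.frame(dose = 1:6,
                  successes = c(1, 3, 4, 6, 8, 9),
                  trials = 10)
gfit <- glm(cbind(successes, trials - successes) ~ dose,
            data = toy, family = binomial)
gtab <- anova(gfit, test = "Chisq")
print(gtab)
```

So a single anova() call on the fitted model is usually enough; no second execution is needed.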
[R] SPM/SemiPar -- Plotting additive interactions
I'm taking the residual-regression approach to semiparametric estimation (Robinson 1988, Econometrica), and using SemiPar simply as a convenient means of fitting multivariate nonparametric additive models. The final bit of code is here:

finalfit <- spm(res ~ f(V3, basis="trunc.poly") + f(V5, basis="trunc.poly") + f(V6, basis="trunc.poly"))
summary(finalfit)
par(mfrow = c(2,2))
plot(finalfit)

You can see the plot here: http://i.imgur.com/qaPc8.png

V3 is a main effect; V5 and V6 are interactions between dummy variables and V3. What I want to do is combine V3 and V5, and V3 and V6. Put differently, V5 shows the additive effect of the dummy variable on V3 when the dummy equals 1, so V3+V5 shows the effect of interest when the dummy equals 1.

I would first need to extract the fitted values for each plot, then add them, then add the confidence bands (according to the rules for adding Gaussians). How would I begin to do this based on the objects that SemiPar gives me, and how would I then plot them?

I'm new to R, so I appreciate any help.

Thanks,
Andrew
[R] Presence/ absence data from matrix to single column
I've been trying to reshape this database but haven't succeeded. I tried using loops but can't get it right. I just want to reshape my database from this matrix to the one below, with only one column of data.

Year  Route  Point   Sp1  Sp2  Sp3
2004  123    123-1   0    1    0
2004  123    123-2   0    1    1
2004  123    123-10  1    1    0

Year  Route  Point
2004  123    123-1   Sp1  0
2004  123    123-2   Sp1  0
2004  123    123-10  Sp1  1
2004  123    123-1   Sp2  1
2004  123    123-2   Sp2  1
2004  123    123-10  Sp2  1
2004  123    123-1   Sp3  0
2004  123    123-2   Sp3  1
2004  123    123-10  Sp3  0
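One possible way to do this without loops, sketched in base R on the three rows shown above. The output column names Species and Presence are my own choice (the post does not name them), and the object names wide and long are hypothetical.

```r
# The wide table from the post, typed in by hand
wide <- data.frame(
  Year  = c(2004, 2004, 2004),
  Route = c(123, 123, 123),
  Point = c("123-1", "123-2", "123-10"),
  Sp1   = c(0, 0, 1),
  Sp2   = c(1, 1, 1),
  Sp3   = c(0, 1, 0),
  stringsAsFactors = FALSE
)

species <- c("Sp1", "Sp2", "Sp3")

# Repeat the identifying columns once per species, then stack the
# species columns on top of each other in the same order.
long <- data.frame(
  wide[rep(seq_len(nrow(wide)), times = length(species)),
       c("Year", "Route", "Point")],
  Species  = rep(species, each = nrow(wide)),
  Presence = unlist(wide[species], use.names = FALSE),
  row.names = NULL
)

print(long)
```

The same idea scales to many species columns by building the species vector from names(wide) instead of typing it.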
[R] Expected number of events, Andersen-Gill model fit via coxph in package survival
Hello,

I am interested in producing the expected number of events in a recurring-events setting. I am using the Andersen-Gill model, as fit by the function coxph in the package survival.

I need to produce expected numbers of events for a cohort, cumulatively, at several fixed times. My ultimate goal is to fit an AG model to a reference sample, then use that fitted model to generate expected numbers of events for a new cohort; comparing the expected vs. the observed numbers of events would then give us some idea of whether the new cohort differs from the reference one.

From my reading of the documentation and the text by Therneau and Grambsch, it seems that the function survexp is what I need. But using it I am not able to obtain expected numbers of events that match the observed numbers reasonably well *even for the same reference population.* So I think I am misunderstanding something quite badly.

Below is an example that illustrates the situation. At the end I include the sessionInfo().

Thank you!
Omar.

# Example of unexpected behavior in computing estimated number of events
# in using package survival for fitting the Andersen-Gill model
require(survival)
head(bladder2) # this is the data, in interval format

# Fit Andersen-Gill model
cphfit = coxph(Surv(start,stop,event)~rx+number+size+cluster(id),data=bladder2)

# Choose some arbitrary time horizons
t.horiz = seq(min(bladder2$start),max(bladder2$stop),length=6)

# Compute the cohort expected survival
s = survexp(~1,data=bladder2,ratetable=cphfit,times=t.horiz)

# These are the expected survival values:
s$surv

# We are interested in the rate of events
e.r = as.vector( 1 - s$surv )

# How does this compare to the actual number of events, cumulative at
# each time horizon?
observed = numeric(length(t.horiz))
for (i in 1:length(t.horiz)){
  observed[i] = sum(bladder2$event[bladder2$stop <= t.horiz[i]])
}
print(observed)

# We would like to compute expected numbers of events that approximately
# match these observed values.
# We should multiply the expected survival rates by the number of individuals.
# Now, one would think that this is the number of at-risk individuals:
s$n.risk
# But that is actually the total number of rows in the data. In any case,
# these numbers do not match:
rbind(expected = s$n.risk*e.r,observed=observed)

# What if we multiply by the number of individuals?
rbind(expected = length(unique(bladder2$id))*e.r,observed=observed)

# This does not work either! The required factor seems to be about 133, but
# I don't see an explanation for that.
# In this example, multiplying by 133.182 gives a good match between observed
# and expected values, but in other examples even the shapes of the curves
# are different.
# Multiplying by the number of individuals at risk at each time point
# (number of individuals for which there is a time interval containing
# the time horizon) does not work either.

# sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines stats graphics grDevices utils datasets methods base

other attached packages:
[1] survival_2.36-14

loaded via a namespace (and not attached):
[1] tools_2.15.1
Re: [R] geoRglm with factor variable as covariable
On 10/4/2012 10:39 PM, Filoche wrote:
> Dear R users.
>
> I'm trying to fit a generalised linear spatial model using the geoRglm package. To do so, I'm preparing my data (geodata) as follows:
>
> geoData9093 = as.geodata(data9093, coords.col = 17:18, data.col = 15, covar.col = 16)
>
> where covar.col is a factor variable (years, in this case 90-91-92-93).
> Then I run the model as follows:
>
> model.5 = list(cov.pars=c(1,1), cov.model='exponential', beta=1, family="poisson")
> mcmc.5 = mcmc.control(S.scale = 0.25, n.iter = 3, burn.in=5, thin = 100) #trial error
> outmcmc.5 = glsm.mcmc(geoData9093, model= model.5, mcmc.input = mcmc.5)
> mcmcobj.5 = prepare.likfit.glsm(outmcmc.5)
> lik.5 = likfit.glsm(mcmcobj.5, ini.phi = 0.3, fix.nugget.rel = F)
>
> And the summary of lik.5 is:
>
> likfit.glsm: estimated model parameters:
>    beta sigmasq     phi tausq.rel
>  1.2781  0.5193  0.0977    0.0069
> likfit.glsm : maximised log-likelihood = 43.62
>
> I'm fairly new to geostatistics, but I thought using a factor variable as a covariate would give me 4 intercepts (beta), as I have 4 levels in my covariate. But looking at the summary, I only have 1 beta, which is equal to 1.28.
>
> I guess I made mistakes in specifying the model, but I can't find where.
>
> Any advice would be welcome.
>
> With regards,
> Phil

You may have covariates in your data, but your model (model.5) is set up as a model without covariates. You put beta=1; thus, the model has only a constant mean.

HTH
Ruben

--
Ruben H. Roa-Ureta, Ph. D.
Senior Scientist
Marine Studies Section, Center for Environment and Water, Research Institute,
King Fahd University of Petroleum and Minerals, KFUPM Box 1927, Dhahran 31261, Saudi Arabia
Office Phone : 966-3-860-7850
Cellular Phone : 966-5-61151014
[R] BCA Package Doubts
Hi,

In the create.samples() function, how are the Estimation, Validation and Holdout groups calculated? I get output from this function, but I don't know how the function works. Please reply; I am waiting for an answer. See the example below.

Example:

data(cars)
create.samples(cars, est=0.4, val=0.4)
 [1] Validation Holdout    Estimation Validation Validation
 [6] Estimation Validation Estimation Estimation Validation
[11] Estimation Validation Validation Estimation Estimation
[16] Validation Estimation Validation Validation Holdout
[21] Validation Validation Estimation Holdout    Holdout
..[50]

--
View this message in context: http://r.789695.n4.nabble.com/BCA-Package-Doubts-tp4645247.html
Sent from the R help mailing list archive at Nabble.com.
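I have not checked the BCA source, so the following is only a sketch of how a function of this kind could work, not a description of what create.samples() actually does. The idea is to build the right number of labels for each group and then shuffle them with sample(). The helper name assign_samples is hypothetical.

```r
# Hypothetical helper: randomly label n rows as Estimation,
# Validation or Holdout in the requested proportions.
assign_samples <- function(n, est = 0.4, val = 0.4) {
  stopifnot(est >= 0, val >= 0, est + val <= 1)
  n_est <- round(est * n)
  n_val <- min(round(val * n), n - n_est)   # never exceed what is left
  labels <- c(rep("Estimation", n_est),
              rep("Validation", n_val),
              rep("Holdout",    n - n_est - n_val))
  # sample() with no size argument returns a random permutation
  factor(sample(labels), levels = c("Estimation", "Validation", "Holdout"))
}

set.seed(123)                      # only to make the shuffle repeatable
grp <- assign_samples(nrow(cars))  # cars has 50 rows
print(table(grp))
```

With est = 0.4 and val = 0.4 on 50 rows this gives 20 Estimation, 20 Validation and 10 Holdout rows; which row gets which label changes with the random seed, which is why repeated calls give different output.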
Re: [R] Calculating the mean in one column with empty cells
Hi,

the first command was bringing the numbers into R directly:

testdata <- c(0.2006160108532920, 0.1321167173880490, 0.0563941428921262, 0.0264198664609803, 0.0200581303857603, -0.2971754213679500, -0.2353086361784190, 0.0667195538296534, 0.1755852636926560)
mean(testdata)
[1] 0.0161584

Here I tried to calculate the mean with the same numbers as given above, but taken from my dataset.

str(dataSet2$ac_bhar_60d_4d_after_ann[1:9])
 num [1:9] 0.2 0.13 0.06 0.03 0.02 -0.3 -0.24 0.07 0.18
mean(dataSet2$ac_bhar_60d_4d_after_ann[1:9])
[1] 0.0167

It seems that in the second case R calculates the mean with rounded numbers (0.2 and not 0.20061601085...). Could it be that R imports only the rounded numbers? How can I build a CSV file with numbers showing all decimal places? I think my current CSV file only has numbers with 2 decimal places.

Kind Regards,
Felix

--
View this message in context: http://r.789695.n4.nabble.com/Calculating-the-mean-in-one-column-with-empty-cells-tp4645135p4645252.html
Sent from the R help mailing list archive at Nabble.com.
[R] Download limit
Hi all,

I am trying to use in RStudio the latest code given in https://github.com/systematicinvestor/SIT/blob/master/R/bt.test.r, which seems to work fine but gives the following warning about download limits (one for each of the tickers). I searched options() for something related to this setting, without success.

Any hint on how to raise or remove these limits? Where is this limit set? I am using R 2.15-1 on RStudio 0.96.331 in Windows 7.

Best
Andrea

tickers = spl('SPY,TLT,GLD,SHY')
data <- new.env()
getSymbols(tickers, src = 'yahoo', from = '1980-01-01', env = data, auto.assign = T)
<environment: 0x0b49ba98>
Warning messages:
1: In download.file(paste(yahoo.URL, s=, Symbols.name, a=, from.m, :
  downloaded length 261497 != reported length 200
[R] warning in summary(aov())
Hi R-listers,

I am receiving an error (see below). Aeventexhumed is the event in which nesting occurred, so it is defined by A, B, C. I thought having it as a factor was OK; I tried to change it with as.character but it still gave me the same error. Is there something I should do about this error, or should I just ignore it?

Please advise,
Jean

summary(aov(EDI ~ HTLIndex + Aeventexhumed + HTLIndex:Aeventexhumed, data=data.to.analyze))
                       Df  Sum Sq Mean Sq F value Pr(>F)
HTLIndex                6   2.435 0.40575  0.2027 0.9752
Aeventexhumed           2   4.652 2.32601  1.1619 0.3172
HTLIndex:Aeventexhumed 11   7.941 0.72192  0.3606 0.9680
Residuals              98 196.193 2.00197
5 observations deleted due to missingness
Warning message:
In model.matrix.default(mt, mf, contrasts) :
  variable 'Aeventexhumed' converted to a factor

--
View this message in context: http://r.789695.n4.nabble.com/warning-in-summary-aov-tp4645253.html
Sent from the R help mailing list archive at Nabble.com.
[R] vector is not assigned correctly in for loop
Hi there,

Here is a minimum working example:

lower = 0
upper = 1
n_bins = 50
interval = (upper - lower) / n_bins
bins = vector(mode="numeric", length=n_bins)
breaks = seq(from=lower + interval, to=upper, by=interval)
for(idx in breaks) {
  bins[idx / interval] = idx
}
print(bins)

which outputs:

 [1] 0.02 0.04 0.06 0.08 0.10 0.14 0.00 0.16 0.20 0.00 0.22 0.24 0.26 0.28
[15] 0.30 0.32 0.34 0.36 0.38 0.40 0.42 0.44 0.46 0.48 0.50 0.52 0.54 0.56
[29] 0.58 0.60 0.62 0.64 0.66 0.68 0.70 0.72 0.74 0.76 0.78 0.80 0.82 0.84
[43] 0.86 0.88 0.90 0.92 0.94 0.96 0.98 1.00

It turns out that some elements are incorrect, such as the 6th element 0.14, which should in fact be 0.12. Is this a bug, or am I missing something?

And here is the output of sessionInfo():

R version 2.15.0 (2012-03-30)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=Chinese (Simplified)_People's Republic of China.936
[2] LC_CTYPE=Chinese (Simplified)_People's Republic of China.936
[3] LC_MONETARY=Chinese (Simplified)_People's Republic of China.936
[4] LC_NUMERIC=C
[5] LC_TIME=Chinese (Simplified)_People's Republic of China.936

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] cubature_1.1-1 tools_2.15.0

Thanks in advance.

Regards,
Guo
Re: [R] Creating vegetation distance groups from one column
Hello,

My example with 'x' was just that, an example. Inline.

On 06-10-2012 00:03, Jhope wrote:
> Hi,
>
> I have tried the script posted but received the following errors. I hope I copied it correctly. I'm sorry but I don't know how to alter the script myself.
>
> Please advise,
> Jean
>
> x <- 0:30 + runif(124)
> data.to.analyze$VegIndex <- cut(x, breaks = seq(0, 35, 5))
> Error in `$<-.data.frame`(`*tmp*`, VegIndex, value = c(1L, 1L, 1L, 1L, :
>   replacement has 124 rows, data has 123

Use

cut(data.to.analyze$VegIndex, breaks = seq(0, 35, 5))

> l <- levels(data.to.analyze$VegIndex)
> l1 <- sub("\\]", ")", l[1])
> l2 <- as.numeric(sub("\\(([[:digit:]]+),.*", "\\1", l[-1])) + 1
> l3 <- sub(".*,([[:digit:]]+).*", "\\1", l[-1])
> l.new <- c(l1, paste0("(", l2, ",", l3, ")"))
> Error: could not find function "paste0"

paste0 was introduced with R 2.15.0. Update your version of R, and in the meantime use paste(...etc..., sep = "").

Rui Barradas

> levels(data.to.analyze$VegIndex) <- l.new
> Error: object 'l.new' not found
> str(data.to.analyze$VegIndex)
>  NULL
> barplot(table(data.to.analyze$VegIndex))
> Error in plot.window(xlim, ylim, log = log, ...) :
>   need finite 'xlim' values
> In addition: Warning messages:
> 1: In min(w.l) : no non-missing arguments to min; returning Inf
> 2: In max(w.r) : no non-missing arguments to max; returning -Inf
> 3: In min(x) : no non-missing arguments to min; returning Inf
> 4: In max(x) : no non-missing arguments to max; returning -Inf
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Creating-vegetation-distance-groups-from-one-column-tp4644970p4645230.html
> Sent from the R help mailing list archive at Nabble.com.
Re: [R] vector is not assigned correctly in for loop
On 06-10-2012, at 08:14, 周果 guo.c...@gmail.com wrote:
> Hi there,
>
> Here is a minimum working example:
>
> lower = 0
> upper = 1
> n_bins = 50
> interval = (upper - lower) / n_bins
> bins = vector(mode="numeric", length=n_bins)
> breaks = seq(from=lower + interval, to=upper, by=interval)
> for(idx in breaks) {
>   bins[idx / interval] = idx
> }
> print(bins)
>
> which outputs:
>
>  [1] 0.02 0.04 0.06 0.08 0.10 0.14 0.00 0.16 0.20 0.00 0.22 0.24 0.26 0.28
> [15] 0.30 0.32 0.34 0.36 0.38 0.40 0.42 0.44 0.46 0.48 0.50 0.52 0.54 0.56
> [29] 0.58 0.60 0.62 0.64 0.66 0.68 0.70 0.72 0.74 0.76 0.78 0.80 0.82 0.84
> [43] 0.86 0.88 0.90 0.92 0.94 0.96 0.98 1.00
>
> It turns out that some elements are incorrect, such as the 6th element 0.14, which should in fact be 0.12.

And the 7th is also incorrect.

> Is this a bug, or am I missing something?

It is not a bug in R. Yes, you are indeed missing something. Read R FAQ 7.31. The answer is: floating-point inaccuracy.

Insert

print(formatC(idx/interval, format="f", digits=17))
print(as.integer(idx/interval))

immediately after the opening { of the for loop. If you insist on copying breaks to bins in the way you are doing, you could use round(idx/interval, 3), for example.

Berend
Re: [R] vector is not assigned correctly in for loop
Forgot to cc the list.

RMW

On Sat, Oct 6, 2012 at 11:29 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote:
> A case study of a good question! Would that all posters did such a good job.
>
> On Sat, Oct 6, 2012 at 7:14 AM, 周果 guo.c...@gmail.com wrote:
>> Hi there,
>>
>> Here is a minimum working example:
>>
>> lower = 0
>> upper = 1
>> n_bins = 50
>> interval = (upper - lower) / n_bins
>> bins = vector(mode="numeric", length=n_bins)
>> breaks = seq(from=lower + interval, to=upper, by=interval)
>> for(idx in breaks) {
>>   bins[idx / interval] = idx
>> }
>
> Note that this could slightly more idiomatically be done as
>
> bins[breaks / interval] = breaks
>
>> print(bins)
>>
>> which outputs:
>>
>>  [1] 0.02 0.04 0.06 0.08 0.10 0.14 0.00 0.16 0.20 0.00 0.22 0.24 0.26 0.28
>> [15] 0.30 0.32 0.34 0.36 0.38 0.40 0.42 0.44 0.46 0.48 0.50 0.52 0.54 0.56
>> [29] 0.58 0.60 0.62 0.64 0.66 0.68 0.70 0.72 0.74 0.76 0.78 0.80 0.82 0.84
>> [43] 0.86 0.88 0.90 0.92 0.94 0.96 0.98 1.00
>>
>> It turns out that some elements are incorrect, such as the 6th element 0.14, which should in fact be 0.12.
>> Is this a bug, or am I missing something?
>
> Take a look at
>
> as.integer(breaks / interval)
>
> You're running into floating-point issues (see the link in R FAQ 7.31 for the definitive reference, but it's a large and complicated field with many little manifestations like this).
>
> What's basically happening is that the 7 you see in breaks / interval is actually 6. (or so), which gets printed as a 7 by print() but truncated to a 6 for subsetting, as mentioned in ?`[`. If you were to turn on more digits for printing, you'd see it's not really a 7.
>
> You'd probably rather have
>
> bins[round(breaks / interval)] = breaks
>
> Cheers, and thanks again for spending so much time to make a good question,
> Michael
>
>> And here is the output of sessionInfo():
>>
>> R version 2.15.0 (2012-03-30)
>> Platform: x86_64-pc-mingw32/x64 (64-bit)
>>
>> locale:
>> [1] LC_COLLATE=Chinese (Simplified)_People's Republic of China.936
>> [2] LC_CTYPE=Chinese (Simplified)_People's Republic of China.936
>> [3] LC_MONETARY=Chinese (Simplified)_People's Republic of China.936
>> [4] LC_NUMERIC=C
>> [5] LC_TIME=Chinese (Simplified)_People's Republic of China.936
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> loaded via a namespace (and not attached):
>> [1] cubature_1.1-1 tools_2.15.0
>>
>> Thanks in advance.
>>
>> Regards,
>> Guo
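To make the truncation visible, here is a small sketch that reuses the variables from the original post (written with <- instead of =). The helper names idx and trunc_idx are mine. It prints the 7th computed index with extra digits, shows that positions 6 and 7 collapse onto the same slot after truncation, and then applies the round() fix suggested above.

```r
lower    <- 0
upper    <- 1
n_bins   <- 50
interval <- (upper - lower) / n_bins
breaks   <- seq(from = lower + interval, to = upper, by = interval)

idx <- breaks / interval        # looks like 1, 2, ..., 50 when printed
print(idx[7], digits = 17)      # slightly below 7 in double precision

# Subsetting with a non-integer index truncates toward zero,
# which is what as.integer() does as well.
trunc_idx <- as.integer(idx)
print(trunc_idx[6:7])           # both elements land in slot 6

# Rounding to the nearest integer before indexing avoids the problem.
bins <- numeric(n_bins)
bins[round(idx)] <- breaks
print(identical(bins, breaks))
```

When the goal is only to copy breaks into bins, plain assignment (bins <- breaks) or indexing with seq_along(breaks) sidesteps floating-point indices altogether.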
Re: [R] Download limit
On Sat, Oct 6, 2012 at 8:19 AM, agiani99 agian...@hotmail.com wrote:
> Hi all,
>
> I am trying to use in RStudio the latest code given in https://github.com/systematicinvestor/SIT/blob/master/R/bt.test.r, which seems to work fine but gives the following warning about download limits (one for each of the tickers). I searched options() for something related to this setting, without success.
>
> Any hint on how to raise or remove these limits? Where is this limit set? I am using R 2.15-1 on RStudio 0.96.331 in Windows 7.
>
> Best
> Andrea

I don't believe this is an R or an RStudio problem so much as a connectivity problem. I'd be willing to guess you're behind a firewall of some sort?

Cheers,
Michael

> tickers = spl('SPY,TLT,GLD,SHY')
> data <- new.env()
> getSymbols(tickers, src = 'yahoo', from = '1980-01-01', env = data, auto.assign = T)
> <environment: 0x0b49ba98>
> Warning messages:
> 1: In download.file(paste(yahoo.URL, s=, Symbols.name, a=, from.m, :
>   downloaded length 261497 != reported length 200
Re: [R] warning in summary(aov())
On Sat, Oct 6, 2012 at 9:12 AM, Jhope jeanwaij...@gmail.com wrote:
> Hi R-listers,
>
> I am receiving an error (see below). Aeventexhumed is the event in which nesting occurred, so it is defined by A, B, C. I thought having it as a factor was OK; I tried to change it with as.character but it still gave me the same error. Is there something I should do about this error, or should I just ignore it?
>
> Please advise,
> Jean
>
> summary(aov(EDI ~ HTLIndex + Aeventexhumed + HTLIndex:Aeventexhumed, data=data.to.analyze))
>                        Df  Sum Sq Mean Sq F value Pr(>F)
> HTLIndex                6   2.435 0.40575  0.2027 0.9752
> Aeventexhumed           2   4.652 2.32601  1.1619 0.3172
> HTLIndex:Aeventexhumed 11   7.941 0.72192  0.3606 0.9680
> Residuals              98 196.193 2.00197
> 5 observations deleted due to missingness
> Warning message:
> In model.matrix.default(mt, mf, contrasts) :
>   variable 'Aeventexhumed' converted to a factor

I think you should only have seen this when Aeventexhumed was a character. It's nothing to worry about; it is just letting you know that a factor conversion had to happen (which is almost surely what you wanted).

If you see this when Aeventexhumed is already a factor, that's somewhat surprising. What does str(data.to.analyze) show you?

Michael

> --
> View this message in context: http://r.789695.n4.nabble.com/warning-in-summary-aov-tp4645253.html
> Sent from the R help mailing list archive at Nabble.com.
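A small sketch of the point above, on made-up data (the data frame d and its columns y and event are illustrative, not Jean's variables): check the column type with str(), convert the character column explicitly with factor(), and then fit the model. Whether the warning itself appears depends on the R version, so the example does not rely on it.

```r
set.seed(1)   # only to make the random toy data repeatable

# Toy data with the grouping variable stored as character
d <- data.frame(
  y     = rnorm(30),
  event = rep(c("A", "B", "C"), each = 10),
  stringsAsFactors = FALSE
)
str(d$event)                 # chr: a character column

# Explicit conversion, so the model code never has to do it silently
d$event <- factor(d$event)
str(d$event)                 # Factor w/ 3 levels "A","B","C"

fit <- aov(y ~ event, data = d)
print(summary(fit))
```

Converting up front also lets you control the level order with the levels argument of factor(), which the automatic conversion does not.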
Re: [R] vector is not assigned correctly in for loop
Hello,

This seems to be a case for FAQ 7.31, "Why doesn't R think these numbers are equal?" See this example:

3/5 - 1/5 - 2/5   # not zero
3/5 - (1/5 + 2/5) # not zero, different from above

In your case, try

for(idx in breaks){
  print(idx / interval, digits = 16) # see problem indices
  bins[idx / interval] = idx
}

b2 <- breaks
identical(bins, b2) # FALSE

What happens is that instead of 7, the value of idx/interval is 6.999 with integer part 6. So bins[6] is assigned twice: first 0.12, then this value is overwritten by 0.14, and bins[7] is never written to. The same goes for indices 9 and 10.

Avoid this type of indexing. And if possible use the vectorized instruction b2 <- breaks.

Hope this helps,

Rui Barradas

On 06-10-2012 07:14, 周果 wrote:
> Hi there,
>
> Here is a minimum working example:
>
> lower = 0
> upper = 1
> n_bins = 50
> interval = (upper - lower) / n_bins
> bins = vector(mode="numeric", length=n_bins)
> breaks = seq(from=lower + interval, to=upper, by=interval)
> for(idx in breaks) {
>   bins[idx / interval] = idx
> }
> print(bins)
>
> which outputs:
>
>  [1] 0.02 0.04 0.06 0.08 0.10 0.14 0.00 0.16 0.20 0.00 0.22 0.24 0.26 0.28
> [15] 0.30 0.32 0.34 0.36 0.38 0.40 0.42 0.44 0.46 0.48 0.50 0.52 0.54 0.56
> [29] 0.58 0.60 0.62 0.64 0.66 0.68 0.70 0.72 0.74 0.76 0.78 0.80 0.82 0.84
> [43] 0.86 0.88 0.90 0.92 0.94 0.96 0.98 1.00
>
> It turns out that some elements are incorrect, such as the 6th element 0.14, which should in fact be 0.12.
> Is this a bug, or am I missing something?
>
> And here is the output of sessionInfo():
>
> R version 2.15.0 (2012-03-30)
> Platform: x86_64-pc-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=Chinese (Simplified)_People's Republic of China.936
> [2] LC_CTYPE=Chinese (Simplified)_People's Republic of China.936
> [3] LC_MONETARY=Chinese (Simplified)_People's Republic of China.936
> [4] LC_NUMERIC=C
> [5] LC_TIME=Chinese (Simplified)_People's Republic of China.936
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> loaded via a namespace (and not attached):
> [1] cubature_1.1-1 tools_2.15.0
>
> Thanks in advance.
>
> Regards,
> Guo
Re: [R] Calculating the mean in one column with empty cells
On Sat, Oct 6, 2012 at 9:11 AM, fxen3k f.seha...@gmail.com wrote:
> Hi,
>
> the first command was bringing the numbers into R directly:
>
> testdata <- c(0.2006160108532920, 0.1321167173880490, 0.0563941428921262, 0.0264198664609803, 0.0200581303857603, -0.2971754213679500, -0.2353086361784190, 0.0667195538296534, 0.1755852636926560)
> mean(testdata)
> [1] 0.0161584
>
> Here I tried to calculate the mean with the same numbers as given above, but taken from my dataset.
>
> str(dataSet2$ac_bhar_60d_4d_after_ann[1:9])
>  num [1:9] 0.2 0.13 0.06 0.03 0.02 -0.3 -0.24 0.07 0.18
> mean(dataSet2$ac_bhar_60d_4d_after_ann[1:9])
> [1] 0.0167
>
> It seems that in the second case R calculates the mean with rounded numbers (0.2 and not 0.20061601085...). Could it be that R imports only the rounded numbers? How can I build a CSV file with numbers showing all decimal places? I think my current CSV file only has numbers with 2 decimal places.

That's something you need to figure out with whatever software is writing the CSV.

Cheers,
Michael

> Kind Regards,
> Felix
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Calculating-the-mean-in-one-column-with-empty-cells-tp4645135p4645252.html
> Sent from the R help mailing list archive at Nabble.com.
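To check that the rounding happens in the file and not in R, here is a sketch that writes Felix's nine numbers to a temporary CSV twice, once rounded to 2 decimals and once unrounded, and reads each back. The column name x and the object names are arbitrary choices for the example.

```r
full <- c(0.2006160108532920, 0.1321167173880490, 0.0563941428921262,
          0.0264198664609803, 0.0200581303857603, -0.2971754213679500,
          -0.2353086361784190, 0.0667195538296534, 0.1755852636926560)

tmp <- tempfile(fileext = ".csv")

# Case 1: the file itself only holds 2 decimals
write.csv(data.frame(x = round(full, 2)), tmp, row.names = FALSE)
imported <- read.csv(tmp)$x
print(mean(full))       # the mean of the unrounded numbers
print(mean(imported))   # the mean of what the file actually contained

# Case 2: the file holds the unrounded values
write.csv(data.frame(x = full), tmp, row.names = FALSE)
reread <- read.csv(tmp)$x
print(all.equal(reread, full))

unlink(tmp)
```

R reads whatever digits are in the file, so a mean of 0.0167 points at a CSV that was exported with 2 decimals. In a spreadsheet this typically means the cell display format was applied on export.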
Re: [R] svyhist
?ylim says "numeric vectors of length 2", so just the beginning and end.

?svyhist doesn't specifically mention the ylim parameter, meaning you should look for a ... in the arguments list and click through to the page for ?hist

?hist has an example that shows the ylim parameter containing only the beginning and end values. Try using ylim = c( 0 , 0.030 )

If you're looking to set the tick marks, look at ?axis ;)

On Fri, Oct 5, 2012 at 11:18 PM, Muhuri, Pradip (SAMHSA/CBHSQ) pradip.muh...@samhsa.hhs.gov wrote:

Dear Anthony and David,

Sorry, the earlier-sent plots were mislabeled, which I have corrected and attached. But the y-lim issue is yet to be resolved.

Thanks,
Pradip Muhuri

From: Anthony Damico [ajdam...@gmail.com]
Sent: Friday, October 05, 2012 7:29 PM
To: David Winsemius
Cc: Muhuri, Pradip (SAMHSA/CBHSQ); R help
Subject: Re: [R] svyhist

this worked for me -- and doesn't require removing the PSUs from the design :)

    options( survey.lonely.psu = "adjust" )
    svyhist( ~dthage, subset( nhis, xspd2 == 'No SPD' ), breaks = MyBreaks,
             main = " ", col = "grey80", xlab = "Age at Death Distribution" )
    lines( svysmooth( ~dthage, bandwidth = 5, subset( nhis, xspd2 == 'No SPD' ) ), lwd = 2 )

Dr. Lumley has written quite a bit about single-PSU strata here: http://faculty.washington.edu/tlumley/survey/exmample-lonely.html

On Fri, Oct 5, 2012 at 7:16 PM, David Winsemius dwinsem...@comcast.net wrote:

On Oct 5, 2012, at 3:33 PM, Muhuri, Pradip (SAMHSA/CBHSQ) wrote:

Hello, I was trying to draw histograms of age at death and got the following 2 error messages:

1) Error in tapply(1:NROW(x), list(factor(strata)), function(index) { : arguments must have same length

This is the top of the output of str applied to the data argument you offered to svyhist:

    str(subset (nhis, xspd2==2) )
    List of 9
     $ cluster :'data.frame': 0 obs. of 1 variable:
      ..$ psu: Factor w/ 47 levels "109.1","115.2",..:
      ..- attr(*, "terms")=Classes 'terms', 'formula' length 2 ~psu
      .. .. ..- attr(*, "variables")= language list(psu)
      .. .. ..- attr(*, "factors")= int [1, 1] 1
      .. .. .. ..- attr(*, "dimnames")=List of 2
      .. .. .. .. ..$ : chr "psu"
      .. .. .. .. ..$ : chr "psu"

At least one problem seems pretty clear. No data. That can be corrected by wrapping as.numeric() around the factor on which you are subsetting in two places.

Another problem may arise when you restrict to one class only, namely there won't be any design to work with. All the clusters there would be only one no longer have any multiplicity, and svyhist apparently isn't built to handle that situation, at least with that design argument.

    Error in onestrat(x[index, , drop = FALSE], clusters[index], nPSU[index][1], :
      Stratum (2) has only one PSU at stage 1

Taking the 'stratum' argument out of the design() spec allows it to proceed, but I do not know if that is introducing invalidity in the analysis.

-- David.

2) Error in findInterval(mm[, i], gx) : 'vec' contains NAs
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf

I would appreciate it if someone could help me resolve these issues. Below is a reproducible example.

Thanks,
Pradip Muhuri

    setwd("E:/RDATA")
    options(width = 120)
    library(survey)
    library(KernSmooth)
    xd1 <- "dthage ypll_75 xspd2 psu stratum wt8
    56 19 2 2 33 1512.7287
    86 0 2 2 129 1830.6400
    81 0 2 1 67 536.1400
    47 28 2 1 17 519.8350
    71 4 1 1 225 254.4087
    72 3 1 1 238 424.4787
    75 0 2 2 115 407.0987
    83 0 2 2 46 622.5137
    79 -4 2 1 300 509.1212
    78 -3 2 1 133 517.3325
    71 4 2 2 328 1179.3063
    64 11 2 1 2 301.5250
    78 -3 2 1 62 253.9025
    65 10 2 2 260 932.6575
    75 0 2 1 247 145.5900
    63 12 2 2 156 247.0650
    71 4 2 1 146 829.4787
    76 -1 2 2 234 432.5437
    76 0 2 1 109 859.6888
    68 7 2 1 228 1236.2975
    64 11 2 2 167 347.5788
    62 13 2 2 312 354.0500
    77 0 2 2 275 882.1938
    78 -3 2 1 28 481.5975
    81 0 2 1 180 1285.5425
    79 0 2 2 205 576.
    70 5 2 1 173 128.3725
    75 0 2 2 189 359.3863
    78 0 2 1 332 512.8062
    74 1 2 2 14 449.0800
    77 0 2 1 242 283.0013
    92 0 2 1 152 915.3200
    69 6 2
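A minimal sketch of the ylim advice in this thread, using base hist() on simulated ages rather than svyhist() on the NHIS design. The simulated data, the break points, and the tick spacing are assumptions made for illustration only. The point is that ylim takes just the two end points, while tick positions are set separately with axis().

```r
set.seed(42)
# Simulated ages at death (hypothetical data, not the NHIS file)
age <- pmin(pmax(round(rnorm(500, mean = 72, sd = 15)), 20), 100)
my_breaks <- seq(20, 100, by = 5)

pdf(NULL)  # null device, so the sketch runs without opening a window
h <- hist(age, breaks = my_breaks, freq = FALSE,
          ylim = c(0, 0.030),  # only the lower and upper limits
          yaxt = "n",          # suppress the default y-axis ticks
          main = "", col = "grey80",
          xlab = "Age at Death Distribution")
axis(2, at = seq(0, 0.030, by = 0.005))  # custom tick marks
invisible(dev.off())
```

Since the thread says svyhist() passes its ... arguments on to hist(), the same ylim and yaxt arguments should carry over, though that was not checked here against a survey design object.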
Re: [R] arrange data
Hello,

Using Arun's data example, instead of creating a factor, convert to 4-digit years.

    set.seed(1)
    dat1 <- data.frame(Tahun=rep(c(98:99,00),each=36),
                       Bahun=rep(rep(1:12,times=3),each=3),
                       x=sample(1:500,108,replace=TRUE))
    dat2 <- dat1  # operate on a copy
    dat2$Tahun <- with(dat2, ifelse(Tahun < 71, 2000 + Tahun, 1900 + Tahun))
    agg_dt1 <- aggregate(x=dat2[,3],by=dat2[,c(1,2)],FUN=sum)
    head(agg_dt1)

Hope this helps,
Rui Barradas

On 06-10-2012 03:38, arun wrote:

Hi,

I hope this helps you. I created a small dataset: 3 replications per month for 1998:2000.

    set.seed(1)
    dat1 <- data.frame(Tahun=rep(c(98:99,00),each=36),Bahun=rep(rep(1:12,times=3),each=3),
                       x=sample(1:500,108,replace=TRUE))
    dat2 <- within(dat1,{Tahun <- factor(Tahun,levels=c(98,99,0))})
    agg_dt1 <- aggregate(x=dat2[,3],by=dat2[,c(1,2)],FUN=sum)
    head(agg_dt1)
    #  Tahun Bahun    x
    #1    98     1 1252
    #2    99     1  680
    #3     0     1  687
    #4    98     2  761
    #5    99     2  860
    #6     0     2  786

I guess this is what you wanted. In addition, you can also use ddply(), which groups differently but gives the same result.

    library(plyr)
    dd_dt1 <- ddply(dat2,.(Tahun,Bahun),summarize, sum(x))
    head(dd_dt1)
    #  Tahun Bahun  ..1
    #1    98     1 1252
    #2    98     2  761
    #3    98     3  440
    #4    98     4  597
    #5    98     5  987
    #6    98     6  692
    tail(dd_dt1)
    #   Tahun Bahun  ..1
    #31     0     7  685
    #32     0     8  504
    #33     0     9  633
    #34     0    10  553
    #35     0    11  914
    #36     0    12 1039

A.K.

----- Original Message -----
From: Roslina Zakaria zrosl...@yahoo.com
To: r-help@r-project.org
Sent: Friday, October 5, 2012 8:09 PM
Subject: [R] arrange data

Dear r-users,

I have daily rainfall data from 1971 to 2000. I use aggregate to form monthly rainfall data. What I don't understand is why the data for the year 2000 end up at the top, instead of year 1971. Here is some code and output:

    agg_dt1 <- aggregate(x=dt1[,4],by=dt1[,c(1,2)],FUN=sum)
    head(agg_dt1,20); tail(agg_dt1,20)
       Tahun Bulan     x
    1      0     1 398.6
    2     71     1 934.9
    3     72     1 107.2
    4     73     1 236.4
    5     74     1  10.5
    6     75     1 744.6
    7     76     1   9.2
    8     77     1 108.7
    9     78     1 251.5
    10    79     1 197.3
    11    80     1 144.1
    12    81     1 104.5
    13    82     1  17.7
    14    83     1 151.8
    ...

Thank you so much for your help.
Roslina
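The ordering problem in this thread comes from the year being stored as two digits, so 2000 is the number 0 and sorts first. A minimal sketch of the four-digit conversion followed by an explicit sort, on simulated data: the column names follow Roslina's post, the pivot value 71 is an assumption based on her stated 1971 to 2000 range, and the formula interface to aggregate() is used here in place of the by= form shown in the thread.

```r
set.seed(1)
# Simulated data: 3 observations per month for years stored as 98, 99, 0
dat <- data.frame(Tahun = rep(c(98, 99, 0), each = 36),
                  Bulan = rep(rep(1:12, times = 3), each = 3),
                  x     = sample(1:500, 108, replace = TRUE))

# Two-digit years below the pivot belong to the 2000s, the rest to the 1900s
# (pivot of 71 assumed from the 1971-2000 span)
dat$Year <- ifelse(dat$Tahun < 71, 2000 + dat$Tahun, 1900 + dat$Tahun)

# Monthly totals, then sort chronologically
agg <- aggregate(x ~ Bulan + Year, data = dat, FUN = sum)
agg <- agg[order(agg$Year, agg$Bulan), ]
head(agg)
```

With a four-digit numeric year, 2000 sorts after 1999 without needing a factor with hand-ordered levels.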
Re: [R] Dúvida função Anova pacote car - Medidas repetidas
Hello,

Yes, your Spanish is close enough to Portuguese for you to understand it. I thought it was homework and didn't read until the end. Apologies to Diego, and thanks to John.

Rui Barradas

On 05-10-2012 22:48, John Fox wrote:

Dear Diego,

This is close enough to Spanish for me to understand it (I think).

Using Anova() in the car package for repeated-measures designs requires a multivariate linear model for all of the responses, which in turn requires that the data set be in "wide" format, with each response as a variable. In your case, there are two crossed within-subjects factors and no between-subjects factors. If this understanding is correct (but see below), then you could proceed as follows, where the crucial step is reshaping the data from long to wide:

--- snip ---

    Pa2$type.day <- with(Pa2, paste(Type, Day, sep="."))
    (Wide <- reshape(Pa2, direction="wide", v.names="logbiovolume",
                     idvar="Replicate", timevar="type.day", drop=c("Type", "Day")))
    day <- ordered(rep(c(0, 2, 4), each=2))
    type <- factor(rep(c("c", "t"), 3))
    (idata <- data.frame(day, type))
    mod <- lm(cbind(logbiovolume.c.0, logbiovolume.t.0, logbiovolume.c.2,
                    logbiovolume.t.2, logbiovolume.c.4, logbiovolume.t.4) ~ 1, data=Wide)
    Anova(mod, idata=idata, idesign=~day*type)

--- snip ---

This serves to analyze the data that you showed; you'll have to adapt it for the full data set. I'm assuming that the replicates are independent units, and that the design is therefore entirely within replicate. If that's wrong, then the analysis I've suggested is also incorrect.

I hope this helps,
John

John Fox
Senator McMaster Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Diego Pujoni
Sent: Friday, October 05, 2012 9:57 AM
To: r-help@r-project.org
Subject: [R] Dúvida função Anova pacote car - Medidas repetidas

Hello everyone,

I am running a repeated-measures ANOVA using the Anova function from the car package. I measured the biovolume of algae every two days for 10 days (in the data set below I only included up to day 4). I have 2 treatments (c, t) and the experiment was carried out in triplicate (A, B, C).

    Pa2
       Day Type Replicate logbiovolume
    1    0    c         A        19.34
    2    0    c         B        18.27
    3    0    c         C        18.56
    4    0    t         A        18.41
    5    0    t         B        18.68
    6    0    t         C        18.86
    7    2    c         A        18.81
    8    2    c         B        18.84
    9    2    c         C        18.52
    10   2    t         A        18.29
    11   2    t         B        17.91
    12   2    t         C        17.67
    13   4    c         A        19.16
    14   4    c         B        18.85
    15   4    c         C        19.36
    16   4    t         A        19.05
    17   4    t         B        19.09
    18   4    t         C        18.26
    . . .

    Pa2.teste = within(Pa2, {
      group = factor(Type)
      time = factor(Day)
      id = factor(Replicate)
    })
    matrix = with(Pa2.teste, cbind(Pa2[,VAR][group=="c"], Pa2[,VAR][group=="t"]))
    matrix
           [,1]  [,2]
     [1,] 19.34 18.41
     [2,] 18.27 18.68
     [3,] 18.56 18.86
     [4,] 18.81 18.29
     [5,] 18.84 17.91
     [6,] 18.52 17.67
     [7,] 19.16 19.05
     [8,] 18.85 19.09
     [9,] 19.36 18.26
    [10,] 19.63 18.96
    [11,] 19.94 18.06
    [12,] 19.54 18.37
    [13,] 19.98 17.96
    [14,] 20.99 17.93
    [15,] 20.45 17.74
    [16,] 21.12 17.60
    [17,] 21.66 17.33
    [18,] 21.51 18.12

    model <- lm(matrix ~ 1)
    design <- factor(c("c","t"))
    options(contrasts=c("contr.sum", "contr.poly"))
    aov <- Anova(model, idata=data.frame(design), idesign=~design, type="III")
    summary(aov, multivariate=F)

    Univariate Type III Repeated-Measures ANOVA Assuming Sphericity

                     SS num Df Error SS den Df         F    Pr(>F)
    (Intercept) 12951.2      1   6.3312     17 34775.336 < 2.2e-16 ***
    design         19.1      1  17.3901     17    18.697 0.0004606 ***
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The problem is that I think this function is not taking the days or the replicates into account. How do I include them in the analysis? Do you know of a corresponding nonparametric function for this test? Something like a Friedman test with two groups (treatment and replicate) and one block (time)?

Thank you very much,
Diego PJ
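The long-to-wide step that John Fox calls crucial can be tried on its own with base R. The sketch below rebuilds the 18 rows shown in Diego's post (days 0, 2 and 4 only) and applies the same reshape() call; it stops before the lm() and Anova() steps, which need the car package. The idata frame is included only to show that its rows line up with the response columns.

```r
# The 18 observations listed in the original post (days 0, 2, 4)
Pa2 <- data.frame(
  Day       = rep(c(0, 2, 4), each = 6),
  Type      = rep(rep(c("c", "t"), each = 3), times = 3),
  Replicate = rep(c("A", "B", "C"), times = 6),
  logbiovolume = c(19.34, 18.27, 18.56, 18.41, 18.68, 18.86,
                   18.81, 18.84, 18.52, 18.29, 17.91, 17.67,
                   19.16, 18.85, 19.36, 19.05, 19.09, 18.26)
)

# One combined within-subject label per treatment-by-day cell
Pa2$type.day <- with(Pa2, paste(Type, Day, sep = "."))

# One row per replicate, one response column per type.day cell
Wide <- reshape(Pa2, direction = "wide", v.names = "logbiovolume",
                idvar = "Replicate", timevar = "type.day",
                drop = c("Type", "Day"))
print(Wide)

# Within-subject design: one row per response column, in column order
idata <- data.frame(day  = ordered(rep(c(0, 2, 4), each = 2)),
                    type = factor(rep(c("c", "t"), 3)))
print(idata)
```

The response columns come out in the order c.0, t.0, c.2, t.2, c.4, t.4, which is why idata repeats each day twice and alternates c and t.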
Re: [R] Dúvida função Anova pacote car - Medidas repetidas
Sorry, Phone, daughter, forgot to sign.

Rui Barradas

On 06-10-2012 12:28, Rui Barradas wrote:

Hello,

Yes, your Spanish is close enough to Portuguese for you to understand it. I thought it was homework and didn't read until the end. Apologies to Diego, and thanks to John.

Rui Barradas
Re: [R] Download limit
Hi Michael,

I am not behind a firewall, but I noticed that when I call setInternet2(FALSE) the problem disappears. setInternet2(TRUE) is required to download the Systematic Investor Toolbox (SIT). I don't know why or whether it makes sense, but yes it seems a connection problem and no it seems to have something to do with R or RStudio. Thanks for your time, though.

Best,
Andrea

On 06.10.2012 12:31, R. Michael Weylandt wrote:

On Sat, Oct 6, 2012 at 8:19 AM, agiani99 agian...@hotmail.com wrote:

Hi all,

I am trying to use in RStudio the latest code given in https://github.com/systematicinvestor/SIT/blob/master/R/bt.test.r, which seems to work fine but with the following warning about download limits (one for each of the tickers). I searched options() for something related to this setting, without success. Any hint on how to raise or remove these limits? Where is this limit set? I am using R 2.15-1 on RStudio 0.96.331 in W7.

Best,
Andrea

I don't believe this is an R or an RStudio problem as much as it is a connectivity problem. I'd be willing to guess you're behind a firewall of some sort?

Cheers,
Michael

    tickers = spl('SPY,TLT,GLD,SHY')
    data <- new.env()
    getSymbols(tickers, src = 'yahoo', from = '1980-01-01', env = data, auto.assign = T)
    <environment: 0x0b49ba98>
    Warning messages:
    1: In download.file(paste(yahoo.URL, s=, Symbols.name, a=, from.m, :
      downloaded length 261497 != reported length 200
Re: [R] LaTeX consistent publication graphics from R and Comparison of GLE and R
Hi Marc,

It would be interesting to compare with tikz for ease of use. As an aside, I've been wishing that someone would write an R function for creating clinical trial disposition charts using tikz or pstricks ...

Best,
Frank

Marc Schwartz-3 wrote:

On Oct 5, 2012, at 3:32 PM, clangkamp <christian.langkamp@> wrote:

Hi Everyone,

I am at the moment preparing my thesis and am looking at producing a few organigrams / flow charts (unrelated to the calculations in R) as well as a range of charts (barcharts, histograms, ...) based on calculations in R.

For the organigrams I am looking at an open-source package called GLE at sourceforge, which produces the text part in LaTeX figures. That is very neat and also in the same style as the thesis, which I wrote in LaTeX. It also offers a range of graphical features, and I am quite tempted. It also produces barcharts and histograms with the options of legends etc.

I have done most of my graphs so far with R, but with organigrams and flow charts I am at a loss (a pointer here would also be very welcome). For some charts I have used MS Visio, but it would be convenient to use just one program for graphing throughout the thesis (i.e. same colour coding etc.).

Does anybody have any experience with GLE, ideally working with it with CSV tables generated within R? Or does there exist another way to generate 'visually LaTeX consistent' graphics within R? Any takers?

If you are comfortable in LaTeX, I would suggest that you look at PSTricks: http://tug.org/PSTricks/main.cgi

I use that for creating subject disposition flow charts for clinical trials with Sweave. I can then use \Sexpr{}'s to fill in various annotations in the boxes, etc. so that all content is programmatically created in a reproducible fashion.

There are some examples of flow charts and tree diagrams here: http://tug.org/PSTricks/main.cgi?file=pst-node/psmatrix/psmatrix#flowchart and there are various other online resources for using PSTricks.

Keep in mind that since this is PostScript based, you need to use a latex + dvips + ps2pdf sequence, rather than just pdflatex.

Regards,
Marc Schwartz

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: http://r.789695.n4.nabble.com/LaTeX-consistent-publication-graphics-from-R-and-Comparison-of-GLE-and-R-tp4645218p4645269.html
Re: [R] vector is not assigned correctly in for loop
But the OP should not be doing this **at all.** He apparently has not bothered to read the Intro to R tutorial as he appears not to know about vectorized calculations.

-- Bert

On Sat, Oct 6, 2012 at 3:29 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote:

Forgot to cc the list.

RMW

On Sat, Oct 6, 2012 at 11:29 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote:

A case study of a good question! Would that all posters did such a good job.

On Sat, Oct 6, 2012 at 7:14 AM, 周果 guo.c...@gmail.com wrote:

Hi there,

Here is a minimum working example:

    lower = 0
    upper = 1
    n_bins = 50
    interval = (upper - lower) / n_bins
    bins = vector(mode="numeric", length=n_bins)
    breaks = seq(from=lower + interval, to=upper, by=interval)
    for(idx in breaks) {
        bins[idx / interval] = idx
    }

Note that this could slightly more idiomatically be done as

    bins[breaks / interval] = breaks

    print(bins)

which outputs:

     [1] 0.02 0.04 0.06 0.08 0.10 0.14 0.00 0.16 0.20 0.00 0.22 0.24 0.26 0.28
    [15] 0.30 0.32 0.34 0.36 0.38 0.40 0.42 0.44 0.46 0.48 0.50 0.52 0.54 0.56
    [29] 0.58 0.60 0.62 0.64 0.66 0.68 0.70 0.72 0.74 0.76 0.78 0.80 0.82 0.84
    [43] 0.86 0.88 0.90 0.92 0.94 0.96 0.98 1.00

It turns out that some elements are incorrect, such as the 6th element 0.14, which should in fact be 0.12. Is this a bug or am I missing something?

Take a look at

    as.integer(breaks / interval)

You're running into floating-point issues (see the link in R FAQ 7.31 for the definitive reference, but it's a large and complicated field with many little manifestations like this).

What's basically happening is that the 7 you see in breaks / interval is actually 6.999... (or so), which gets printed as a 7 by print() but truncated to a 6 for subsetting, as mentioned in ?`[`. If you were to turn on more digits for printing, you'd see it's not really a 7.

You'd probably rather have

    bins[round(breaks / interval)] = breaks

Cheers and thanks again for spending so much time to make a good question,
Michael

And here is the output of sessionInfo():

    R version 2.15.0 (2012-03-30)
    Platform: x86_64-pc-mingw32/x64 (64-bit)

    locale:
    [1] LC_COLLATE=Chinese (Simplified)_People's Republic of China.936
    [2] LC_CTYPE=Chinese (Simplified)_People's Republic of China.936
    [3] LC_MONETARY=Chinese (Simplified)_People's Republic of China.936
    [4] LC_NUMERIC=C
    [5] LC_TIME=Chinese (Simplified)_People's Republic of China.936

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    loaded via a namespace (and not attached):
    [1] cubature_1.1-1 tools_2.15.0

Thanks in advance.

Regards,
Guo

--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info: Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
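The truncation-versus-rounding point in this thread can be checked directly. A minimal sketch using the same bin set-up as Guo's example: idx_trunc and idx_round are names introduced here for illustration, and whether any particular quotient falls just below a whole number depends on the platform's floating-point arithmetic, so no specific truncated values are asserted.

```r
lower <- 0
upper <- 1
n_bins <- 50
interval <- (upper - lower) / n_bins
breaks <- seq(from = lower + interval, to = upper, by = interval)

# Quotients that print as whole numbers may be stored slightly below them
print(breaks / interval, digits = 17)

# Non-integer subscripts are truncated towards zero, like as.integer()
idx_trunc <- as.integer(breaks / interval)
# Rounding first recovers the intended positions 1..n_bins
idx_round <- round(breaks / interval)

# Vectorized assignment, no loop needed
bins <- numeric(n_bins)
bins[idx_round] <- breaks

# Compare the two index vectors to see where truncation goes wrong
which(idx_trunc != idx_round)
```

In this particular example the index is just the position within breaks, so bins <- breaks would do the same job; the round() form matters when the index really has to be computed from the value.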
[R] Presence/ absence data from matrix to single column
I've been trying to reshape this database but haven't succeeded. I tried using loops but can't get it right. I just want to reshape my database from this matrix to the one below, with only one column of data.

    Year Route Point  Sp1 Sp2 Sp3
    2004 123   123-1    0   1   0
    2004 123   123-2    0   1   1
    2004 123   123-10   1   1   0

What I want:

    Year Route Point
    2004 123   123-1  Sp1 0
    2004 123   123-2  Sp1 0
    2004 123   123-10 Sp1 1
    2004 123   123-1  Sp2 1
    2004 123   123-2  Sp2 1
    2004 123   123-10 Sp2 1
    2004 123   123-1  Sp3 0
    2004 123   123-2  Sp3 1
    2004 123   123-10 Sp3 0

--
View this message in context: http://r.789695.n4.nabble.com/Presence-absence-data-from-matrix-to-single-column-tp4645271.html
Re: [R] Presence/ absence data from matrix to single column
Hi,

Try this:

    dat1 <- read.table(text="
    Year Route Point Sp1 Sp2 Sp3
    2004 123 123-1 0 1 0
    2004 123 123-2 0 1 1
    2004 123 123-10 1 1 0
    ",header=TRUE,sep="",stringsAsFactors=FALSE)
    library(reshape)
    melt(dat1,id=c("Year","Route","Point"))
      Year Route  Point variable value
    1 2004   123  123-1      Sp1     0
    2 2004   123  123-2      Sp1     0
    3 2004   123 123-10      Sp1     1
    4 2004   123  123-1      Sp2     1
    5 2004   123  123-2      Sp2     1
    6 2004   123 123-10      Sp2     1
    7 2004   123  123-1      Sp3     0
    8 2004   123  123-2      Sp3     1
    9 2004   123 123-10      Sp3     0

A.K.

----- Original Message -----
From: agoijman agoij...@cnia.inta.gov.ar
To: r-help@r-project.org
Sent: Saturday, October 6, 2012 11:03 AM
Subject: [R] Presence/ absence data from matrix to single column

I've been trying to reshape this database but haven't succeeded. I tried using loops but can't get it right. I just want to reshape my database from this matrix to the one below, with only one column of data.

    Year Route Point  Sp1 Sp2 Sp3
    2004 123   123-1    0   1   0
    2004 123   123-2    0   1   1
    2004 123   123-10   1   1   0

What I want:

    Year Route Point
    2004 123   123-1  Sp1 0
    2004 123   123-2  Sp1 0
    2004 123   123-10 Sp1 1
    2004 123   123-1  Sp2 1
    2004 123   123-2  Sp2 1
    2004 123   123-10 Sp2 1
    2004 123   123-1  Sp3 0
    2004 123   123-2  Sp3 1
    2004 123   123-10 Sp3 0
Re: [R] Presence/ absence data from matrix to single column
Try the reshape2 package. You will probably have to install the package.

    install.packages("reshape2")

With your data as xx:

    library(reshape2)
    melt(xx, id = c("Year", "Route", "Point"))

seems to do what you want.

John Kane
Kingston ON Canada

-----Original Message-----
From: agoij...@cnia.inta.gov.ar
Sent: Sat, 6 Oct 2012 08:03:11 -0700 (PDT)
To: r-help@r-project.org
Subject: [R] Presence/ absence data from matrix to single column

I've been trying to reshape this database but haven't succeeded. I tried using loops but can't get it right. I just want to reshape my database from this matrix to the one below, with only one column of data.

    Year Route Point  Sp1 Sp2 Sp3
    2004 123   123-1    0   1   0
    2004 123   123-2    0   1   1
    2004 123   123-10   1   1   0

What I want:

    Year Route Point
    2004 123   123-1  Sp1 0
    2004 123   123-2  Sp1 0
    2004 123   123-10 Sp1 1
    2004 123   123-1  Sp2 1
    2004 123   123-2  Sp2 1
    2004 123   123-10 Sp2 1
    2004 123   123-1  Sp3 0
    2004 123   123-2  Sp3 1
    2004 123   123-10 Sp3 0
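For readers without reshape or reshape2 installed, the same wide-to-long result can be sketched in base R with stack(). This is a swapped-in alternative, not the melt() call recommended in the thread; the names id_cols, sp_cols and long are introduced here for illustration, and the output columns are labelled variable and value to mirror what melt() prints.

```r
dat1 <- read.table(text = "Year Route Point Sp1 Sp2 Sp3
2004 123 123-1 0 1 0
2004 123 123-2 0 1 1
2004 123 123-10 1 1 0", header = TRUE, stringsAsFactors = FALSE)

id_cols <- c("Year", "Route", "Point")
sp_cols <- setdiff(names(dat1), id_cols)

# stack() puts all species columns into one 'values' column,
# with the source column name recorded in 'ind'
stacked <- stack(dat1[sp_cols])

# Repeat the identifier rows once per species column and bind them on
long <- data.frame(
  dat1[rep(seq_len(nrow(dat1)), times = length(sp_cols)), id_cols],
  variable = stacked$ind,
  value    = stacked$values,
  row.names = NULL
)
print(long)
```

Rows come out grouped by species (all Sp1 rows, then Sp2, then Sp3), matching the layout requested in the original post.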
Re: [R] arrange data
Hi Roslina,

Extending Rui's solution, if you want only the last two digits for the year:

    agg_dt1$Tahun <- as.numeric(gsub("\\d{2}(\\d+)","\\1",agg_dt1$Tahun))
    head(agg_dt1)
    #  Tahun Bahun    x
    #1    98     1  607
    #2    99     1  814
    #3     0     1  580
    #4    98     2 1006
    #5    99     2  941
    #6     0     2 1075

A.K.

----- Original Message -----
From: Rui Barradas ruipbarra...@sapo.pt
To: arun smartpink...@yahoo.com
Cc: Roslina Zakaria zrosl...@yahoo.com; R help r-help@r-project.org
Sent: Saturday, October 6, 2012 7:22 AM
Subject: Re: [R] arrange data

Hello,

Using Arun's data example, instead of creating a factor, convert to 4-digit years.

    set.seed(1)
    dat1 <- data.frame(Tahun=rep(c(98:99,00),each=36),
                       Bahun=rep(rep(1:12,times=3),each=3),
                       x=sample(1:500,108,replace=TRUE))
    dat2 <- dat1  # operate on a copy
    dat2$Tahun <- with(dat2, ifelse(Tahun < 71, 2000 + Tahun, 1900 + Tahun))
    agg_dt1 <- aggregate(x=dat2[,3],by=dat2[,c(1,2)],FUN=sum)
    head(agg_dt1)

Hope this helps,
Rui Barradas
Re: [R] Presence/ absence data from matrix to single column
Hi John,

Thanks for your comments. I have both packages. I am using R 2.15. Maybe reshape is out of date. I don't load reshape2 that much (maybe I'm too lazy to add the 2 at the end) except when I need dcast(). I tried the code with only reshape2 loaded, and I get the same result.

A.K.

----- Original Message -----
From: John Kane jrkrid...@inbox.com
To: arun smartpink...@yahoo.com
Sent: Saturday, October 6, 2012 11:24 AM
Subject: Re: [R] Presence/ absence data from matrix to single column

I think reshape is out of date. reshape2 has been out for about a year I think.

John Kane
Kingston ON Canada
Re: [R] Calculating the mean in one column with empty cells
Where is the csv data coming from? If it is an export from a spreadsheet, Excel (and others?) has a nasty habit of exporting as displayed rather than the actual number as it's default. John Kane Kingston ON Canada -Original Message- From: f.seha...@gmail.com Sent: Sat, 6 Oct 2012 01:11:11 -0700 (PDT) To: r-help@r-project.org Subject: Re: [R] Calculating the mean in one column with empty cells Hi, the first command was bringing the numbers into R directly: * testdata - c(0.2006160108532920, 0.1321167173880490, 0.0563941428921262, 0.0264198664609803, 0.0200581303857603, -0.2971754213679500, -0.2353086361784190, 0.0667195538296534, 0.1755852636926560) mean(testdata) [1] 0.0161584* Here I tried to calculate the mean with the same numbers as given above, but taken from my dataset. * str(dataSet2$ac_bhar_60d_4d_after_ann[1:9]) num [1:9] 0.2 0.13 0.06 0.03 0.02 -0.3 -0.24 0.07 0.18 mean(dataSet2$ac_bhar_60d_4d_after_ann[1:9]) [1] 0.0167 * It seems that in the second case he calculates the mean with rounded numbers (0.2 and not 0.20061601085...) Could it be that R imports only the rounded numbers? How can I build a CSV-file with numbers showing all decimal places? Because I think my current CSV-file only has numbers with 2 decimal places. Kind Regards, Felix -- View this message in context: http://r.789695.n4.nabble.com/Calculating-the-mean-in-one-column-with-empty-cells-tp4645135p4645252.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. FREE 3D MARINE AQUARIUM SCREENSAVER - Watch dolphins, sharks orcas on your desktop! 
Re: [R] Download limit
On Sat, Oct 6, 2012 at 12:38 PM, agiani99 <agian...@hotmail.com> wrote:
> Hi Michael,
> I am not under firewall, but I noticed that when I setInternet2=FALSE the problem disappears. SetInternet2=TRUE is required to download Systematic Investor Toolbox (SIT).
> I don't know why or whether it makes sense, but yes it seems a connection problem and no it seems to have something to do with R or RStudio.
> Thanks for your time, though.
> Best
> Andrea

I think this is one of those situations where non-Windows folks just shake their heads and sigh. I'm afraid I don't know enough about Windows internet settings to comment (though BDR, Duncan M, Uwe, or many of the other folks on this list much smarter than I could likely explain it) but for now, I'm just happy to hear you got it working.

Cheers,
Michael
Re: [R] smoothScatter plot
Hi Zhengyu,

You might want to have a look at http://gallery.r-enthusiasts.com/graph/Scatterplots_with_smoothed_densities_color_representation,139 which seems to be showing a smoothScatter() that looks like what you want. I've never used the function so I am probably not much help.

Something else that I thought of, late yesterday, was the ggplot2 approach shown using this code.

library(ggplot2)
d <- ggplot(diamonds, aes(carat, price))
d + geom_point()              # graph all points with similar colour
d + geom_point(alpha = 1/10)  # graph points with transparency setting

The alpha settings may give you something similar to smoothScatter() but probably without the colours, though a question on the google groups ggplot2 group might help.

Good luck,
John Kane
Kingston ON Canada

-Original Message-
From: zhyjiang2...@hotmail.com
Sent: Sat, 6 Oct 2012 01:01:41 +0800
To: jrkrid...@inbox.com
Subject: RE: [R] smoothScatter plot

Hi John,

Thanks for your link. Those plots look pretty but way too complicated in terms of making R code. Maybe my description is not clear. But could you take a look at the attached png? I saw several publications showing smoothed plots like this but not sure how to make one...

Thanks,
Best,
Zhengyu

Date: Fri, 5 Oct 2012 06:36:38 -0800
From: jrkrid...@inbox.com
Subject: RE: [R] smoothScatter plot
To: zhyjiang2...@hotmail.com
CC: r-help@r-project.org

In line

John Kane
Kingston ON Canada

-Original Message-
From: zhyjiang2...@hotmail.com
Sent: Fri, 5 Oct 2012 05:41:29 +0800
To: jrkrid...@inbox.com
Subject: RE: [R] smoothScatter plot

Hi John,

Thanks for your email. Your way works well. However, I was wondering if you can help with a smoothed scatter plot that has shadows, with darker blue colors representing a higher density of points.

Zhengyu

Do you mean something like what is being discussed here?
http://andrewgelman.com/2012/08/graphs-showing-uncertainty-using-lighter-intensities-for-the-lines-that-go-further-from-the-center-to-de-emphasize-the-edges/

If so I think there has been some discussion and accompanying ggplot2 code on the google groups ggplot2 site. Otherwise can you explain a bit more clearly?

Date: Thu, 4 Oct 2012 05:46:46 -0800
From: jrkrid...@inbox.com
Subject: RE: [R] smoothScatter plot
To: zhyjiang2...@hotmail.com
CC: r-help@r-project.org

Hi,

Do you mean something like this?

scatter.smooth(x,y)

It looks like invoking dcols <- densCols(x,y) is calling in some package that is masking graphics::smoothScatter() and applying some other version of smoothScatter, but I am not expert enough to be sure.

Another way to get the same result as mine with smoothScatter is to use the ggplot2 package. It looks a bit more complicated but it is very good and in some ways easier to see exactly what is happening. To try it you would need to install the ggplot2 package (install.packages("ggplot2")), then with your original x and y data frames:

library(ggplot2)
xy <- cbind(x, y)
names(xy) <- c("xx", "yy")
p <- ggplot(xy, aes(xx, yy)) + geom_point() + geom_smooth(method = "loess", se = FALSE)
p

Thanks for the data set. However it really is easier to use dput(). To use dput() simply issue the command dput(myfile) where myfile is the file you are working with.
It will give you something like this:

> dput(x)
structure(c(0.4543462924, 0.2671718761, 0.1641577016, 1.1593356462, 0.0421177346, 0.3127782861, 0.4515537795, 0.5332559665, 0.0913911528, 0.1472054054, 0.1340672893, 1.2599304224, 0.3872026125, 0.0368560053, 0.0371828779, 0.3999714282, 0.0175815783, 0.8871547761, 0.2706762487, 0.7401904063, 0.0991320236, 0.2565567348, 0.5854167363, 0.7515717421, 0.7220388222, 1.3528297744, 0.9339971349, 0.0128652431, 0.4102527051), .Dim = c(29L, 1L), .Dimnames = list(NULL, "V1"))

> dput(y)
structure(list(V1 = c(0.8669898448, 0.6698647266, 0.1641577016, 0.4779091929, 0.2109900366, 0.2915241414, 0.2363116664, 0.3808731568, 0.379908928, 0.2565868263, 0.1986675964, 0.7589866876, 0.6496236922, 0.1327986663, 0.4196107999, 0.3436442638, 0.1910728051, 0.5625817464, 0.1429791079, 0.6441837334, 0.1477153617, 0.369079266, 0.3839842979, 0.39044223, 0.4186374286, 0.7611640016, 0.446291999, 0.2943343355, 0.3019098386)), .Names = "V1", class = "data.frame", row.names = c(NA, -29L))

That is your x and y in dput() form. You just copy it from the R terminal and paste it into your email message. It is handy if you add the x <- and y <- to the output. Your method works just fine but it's a bit more cumbersome with a lot of data.

Also, please reply to the R-help list as well. It is a source of much more
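For reference, the kind of plot Zhengyu describes (darker blue where points are denser) can be sketched with base R alone. This is an illustration on simulated data, not code from the thread; the colour ramp is an arbitrary choice, and both functions rely on the recommended KernSmooth package being installed.

```r
set.seed(42)
x <- rnorm(2000)
y <- x + rnorm(2000, sd = 0.5)

# Smoothed 2-d density drawn as shades of blue (graphics::smoothScatter)
blues <- colorRampPalette(c("white", "lightblue", "darkblue"))
smoothScatter(x, y, colramp = blues)

# Alternative: keep the individual points but colour each one by local density
cols <- densCols(x, y, colramp = colorRampPalette(c("lightblue", "darkblue")))
plot(x, y, col = cols, pch = 20)
```

scatter.smooth(), mentioned earlier in the thread, is a different function: it draws an ordinary scatter plot with a loess curve and does no density shading.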
Re: [R] Expected number of events, Andersen-Gill model fit via coxph in package survival
On Oct 5, 2012, at 8:48 PM, Omar De la Cruz C. wrote:

> Hello,
>
> I am interested in producing the expected number of events, in a recurring events setting. I am using the Andersen-Gill model, as fit by the function coxph in the package survival. I need to produce expected numbers of events for a cohort, cumulatively, at several fixed times.
>
> My ultimate goal is: To fit an AG model to a reference sample, then use that fitted model to generate expected numbers of events for a new cohort; then, comparing the expected vs. the observed numbers of events would give us some idea of whether the new cohort differs from the reference one.
>
> From my reading of the documentation and the text by Therneau and Grambsch, it seems that the function survexp is what I need. But using it I am not able to obtain expected numbers of events that match reasonably well the observed numbers *even for the same reference population.* So, I think I am misunderstanding something quite badly.
>
> Below is an example that illustrates the situation. At the end I include the sessionInfo().
>
> Thank you!
> Omar.
>
> # Example of unexpected behavior in computing estimated number of events
> # in using package survival for fitting the Andersen-Gill model
> require(survival)
> head(bladder2) # this is the data, in interval format
> # Fit Andersen-Gill model
> cphfit = coxph(Surv(start,stop,event)~rx+number+size+cluster(id),data=bladder2)
> # Choose some arbitrary time horizons
> t.horiz = seq(min(bladder2$start),max(bladder2$stop),length=6)
> # Compute the cohort expected survival
> s = survexp(~1,data=bladder2,ratetable=cphfit,times=t.horiz)
> # These are the expected survival values:
> s$surv
> # We are interested in the rate of events
> e.r = as.vector( 1 - s$surv )

Rates are events/n-exposed/time, so those are not rates as I understand the term. And I do not see any accounting for the length of intervals at risk in the rest of your code. That vector does not even calculate interval event expectations as I read it.

-- David

> # How does this compare to the actual number of events, cumulative at
> # each time horizon?
> observed = numeric(length(t.horiz))
> for (i in 1:length(t.horiz)){
>   observed[i] = sum(bladder2$event[bladder2$stop <= t.horiz[i]])
> }
> print(observed)
> # We would like to compute expected numbers of events that approximately
> # match these observed values.
> # We should multiply the expected survival rates by the number of individuals.
> # Now, one would think that this is the number of at-risk individuals:
> s$n.risk
> # But that is actually the total number of rows in the data. In any case,
> # these numbers do not match:
> rbind(expected = s$n.risk*e.r,observed=observed)
> # What if we multiply by the number of individuals?
> rbind(expected = length(unique(bladder2$id))*e.r,observed=observed)
> # This does not work either! The required factor seems to be about 133, but
> # I don't see an explanation for that.
> # In this example, multiplying by 133.182 gives a good match between observed
> # and expected values, but in other examples even the shapes of the curves
> # are different.
> # Multiplying by a number of individuals at risk at each time point
> # (number of individuals for which there is a time interval containing the
> # time horizon) does not work either.
>
> sessionInfo()
> R version 2.15.1 (2012-06-22)
> Platform: i386-apple-darwin9.8.0/i386 (32-bit)
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> attached base packages:
> [1] splines stats graphics grDevices utils datasets methods base
> other attached packages:
> [1] survival_2.36-14
> loaded via a namespace (and not attached):
> [1] tools_2.15.1

David Winsemius, MD
Alameda, CA, USA
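As a possible starting point for readers, and not something proposed in the thread itself, the survival package can return expected event counts per row directly through predict() with type = "expected". The sketch below refits the model from the post and compares totals on the fitting data. It is an assumption on my part that this quantity is what the original poster needs, and the cumulative-by-horizon version would still require splitting each interval at the chosen times.

```r
library(survival)

# Same Andersen-Gill fit as in the post
cphfit <- coxph(Surv(start, stop, event) ~ rx + number + size + cluster(id),
                data = bladder2)

# Expected number of events for each (start, stop] row, given its covariates
exp.row <- predict(cphfit, type = "expected")

# Expected and observed event counts per subject
exp.id <- tapply(exp.row, bladder2$id, sum)
obs.id <- tapply(bladder2$event, bladder2$id, sum)
head(cbind(expected = exp.id, observed = obs.id))

# Cohort totals; on the data used for the fit these are expected to agree closely,
# because the martingale residuals (observed - expected) sum to about zero
c(expected = sum(exp.row), observed = sum(bladder2$event))
```

For a new cohort, predict() takes a newdata argument, which for type = "expected" also has to contain the start, stop and event columns.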
[R] sample
Hello,

If I have x=c(3,2,6,1) and n=length(x), are the following two calls equivalent?

sample(x,1,replace=TRUE)

and

sample(x,1,replace=TRUE,prob=rep(1/n, n))

Regards
Re: [R] Calculating the mean in one column with empty cells
On Oct 6, 2012, at 1:11 AM, fxen3k wrote:

> Hi,
>
> the first command was bringing the numbers into R directly:
>
> testdata <- c(0.2006160108532920, 0.1321167173880490, 0.0563941428921262, 0.0264198664609803, 0.0200581303857603, -0.2971754213679500, -0.2353086361784190, 0.0667195538296534, 0.1755852636926560)
> mean(testdata)
> [1] 0.0161584
>
> Here I tried to calculate the mean with the same numbers as given above, but taken from my dataset.
>
> str(dataSet2$ac_bhar_60d_4d_after_ann[1:9])
>  num [1:9] 0.2 0.13 0.06 0.03 0.02 -0.3 -0.24 0.07 0.18
> mean(dataSet2$ac_bhar_60d_4d_after_ann[1:9])
> [1] 0.0167

This is something that has happened in data processing:

dat <- read.csv2(text="0,2006160108532920
0,1321167173880490
0,0563941428921262
0,0264198664609803
0,0200581303857603
-0,2971754213679500
-0,2353086361784190
0,0667195538296534
0,1755852636926560
", header=FALSE)
mean(dat[[1]])
[1] 0.0161584

> It seems that in the second case he calculates the mean with rounded numbers (0.2 and not 0.20061601085...) Could it be that R imports only the rounded numbers? How can I build a CSV-file with numbers showing all decimal places? Because I think my current CSV-file only has numbers with 2 decimal places.

That is more likely the fault of Excel than it is something R is responsible for.

-- 
David Winsemius, MD
Alameda, CA, USA
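The arithmetic behind this diagnosis can be checked directly. A short sketch using the nine values from the post: rounding them to two decimals before averaging reproduces the 0.0167 seen in the data set, which is consistent with the CSV file holding already rounded numbers.

```r
testdata <- c(0.2006160108532920, 0.1321167173880490, 0.0563941428921262,
              0.0264198664609803, 0.0200581303857603, -0.2971754213679500,
              -0.2353086361784190, 0.0667195538296534, 0.1755852636926560)

full    <- mean(testdata)            # mean of the full-precision values
rounded <- mean(round(testdata, 2))  # mean after rounding each value to 2 decimals

print(c(full = full, rounded = rounded), digits = 7)
```

The two-decimal values are exactly the ones str() displayed for the imported column (0.2, 0.13, 0.06, ...), so the rounding happened before R ever saw the data.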
Re: [R] sample
Yes and no. Same effect, but you won't get the same random numbers because -- I believe -- a different algorithm is used. grep the source for sample and sample2 if you're interested.

Cheers,
Michael

On Sat, Oct 6, 2012 at 5:02 PM, solafah bh <solafa...@yahoo.com> wrote:
> Hello
> If I have x=c(3,2,6,1) and n=length(x), are the following codes equivalent??
> sample(x,1,replace=TRUE)
> and
> sample(x,1,replace=TRUE,prob=rep(1/n , n) )
> Regards
Re: [R] sample
Hi,

They get different results with the same set.seed():

x=c(3,2,6,1)
n=length(x)
set.seed(1)
sample(x,1,replace=TRUE)
#[1] 2
set.seed(1)
sample(x,1,replace=TRUE,prob=rep(1/n , n) )
#[1] 6
identical(sample(x,1,replace=TRUE),sample(x,1,replace=TRUE,prob=rep(1/n , n) ))
#[1] FALSE

A.K.

- Original Message -
From: solafah bh <solafa...@yahoo.com>
To: R help mailing list <r-help@r-project.org>
Cc:
Sent: Saturday, October 6, 2012 12:02 PM
Subject: [R] sample

Hello
If I have x=c(3,2,6,1) and n=length(x), are the following codes equivalent??
sample(x,1,replace=TRUE)
and
sample(x,1,replace=TRUE,prob=rep(1/n , n) )
Regards
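The two replies fit together once the distribution is separated from the individual draws. The sketch below is an illustration, not code from the thread: it draws many values with each call and tabulates relative frequencies, which should all be near 1/n = 0.25, while the draws themselves need not coincide even under the same seed. Note that the identical() call above compares two successive draws without resetting the seed, so it could return FALSE even for two literally identical calls.

```r
x <- c(3, 2, 6, 1)
n <- length(x)
N <- 1e5  # number of draws, chosen arbitrarily for this sketch

set.seed(1)
draws.default <- sample(x, N, replace = TRUE)
set.seed(1)
draws.prob <- sample(x, N, replace = TRUE, prob = rep(1 / n, n))

# Relative frequency of each value under both calls
freq <- rbind(
  default = table(factor(draws.default, levels = sort(x))) / N,
  prob    = table(factor(draws.prob,    levels = sort(x))) / N
)
print(round(freq, 3))

# Share of positions where the two sequences happen to agree
mean(draws.default == draws.prob)
```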
Re: [R] LaTeX consistent publication graphics from R and Comparison of GLE and R
Hi Frank,

I have not used tikz, so am not sure. I have been hand coding the TeX markup in the .Rnw files to date, since each study has been somewhat different in terms of various characteristics and the sponsors, in some cases, have requested some customizations to the flow charts. That has typically been done with psmatrix constructs (http://tug.org/PSTricks/main.cgi?file=pst-node/psmatrix/psmatrix).

I have also used PSTricks, with pst-tree constructs (http://tug.org/PSTricks/main.cgi?file=pst-tree/pst-tree), to create branching trees for stratified randomization flow charts. So you have a top level with all enrolled subjects, then branches from there showing each stratification level, each box showing the sample size (using \Sexpr{}s) within each strata level. Similar concept to the matrix-like orgchart style used for disposition charts, but just a different implementation, which allows for an imbalance in the tree structure (e.g. differing strata in each arm based upon various criteria, etc.).

I suppose that if one were to think about it conceptually, R's list structures would be a suitable substrate for creating an object that could be passed to a print method of sorts and generate the TeX markup during Sweave (or knitr) processing. I just have not spent the time to consider how that would be done generically enough and still allow for some of the customizations that might be encountered. Food for thought.

Best regards,
Marc

On Oct 6, 2012, at 8:14 AM, Frank Harrell <f.harr...@vanderbilt.edu> wrote:

> Hi Marc,
> It would be interesting to compare with tikz for ease of use. As an aside I've been wishing that someone would write an R function for creating clinical trial disposition charts using tikz or pstricks ...
> Best,
> Frank

Marc Schwartz-3 wrote

On Oct 5, 2012, at 3:32 PM, clangkamp <christian.langkamp@> wrote:

> Hi Everyone
> I am at the moment preparing my thesis and am looking at producing a few Organigrams / Flow charts (unrelated to the calculations in R) as well as a range of charts (barcharts, histograms, ...) based on calculations in R.
> For the Organigrams I am looking at an Opensource package called GLE at sourceforge, which produces the text part in Latex figures which is very neat and also in the same style of the thesis, which I wrote in LaTeX. It also offers a range of graphical features, and I am quite tempted. It also produces barcharts and histograms with the options of legends etc.
> I have done most of my graphs so far with R, but with Organigrams and flow charts I am at a loss (A pointer here would also be very welcome). For some charts I have used MS Visio, but it would be convenient to use just one program for graphing throughout the thesis (i.e. same colour coding etc.).
> Does anybody have any experience with GLE, ideally working with it with CSV tables generated within R ? Or does there exist another way to generate 'visually LaTeX consistent' graphics within R ? Any takers ?

If you are comfortable in LaTeX, I would suggest that you look at PSTricks:

http://tug.org/PSTricks/main.cgi

I use that for creating subject disposition flow charts for clinical trials with Sweave. I can then use \Sexpr{}'s to fill in various annotations in the boxes, etc. so that all content is programmatically created in a reproducible fashion.

There are some examples of flow charts and tree diagrams here:

http://tug.org/PSTricks/main.cgi?file=pst-node/psmatrix/psmatrix#flowchart

and there are various other online resources for using PSTricks.

Keep in mind that since this is PostScript based, you need to use a latex + dvips + ps2pdf sequence, rather than just pdflatex.

Regards,
Marc Schwartz
Re: [R] Calculating the mean in one column with empty cells
For nine numbers, R-helpers should recommend that people show their data with dput(obj) instead of str(obj). dput() shows everything in the object to full precision. str() shows a summary of the object and rounds numbers to 2 digits -- it is good for an overview of the data, but when the question is "why did I get a mean of .06 instead of .06547494 from my 9 numbers", str() is not useful.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius
Sent: Saturday, October 06, 2012 9:08 AM
To: fxen3k
Cc: r-help@r-project.org
Subject: Re: [R] Calculating the mean in one column with empty cells

On Oct 6, 2012, at 1:11 AM, fxen3k wrote:

> Hi,
>
> the first command was bringing the numbers into R directly:
>
> testdata <- c(0.2006160108532920, 0.1321167173880490, 0.0563941428921262, 0.0264198664609803, 0.0200581303857603, -0.2971754213679500, -0.2353086361784190, 0.0667195538296534, 0.1755852636926560)
> mean(testdata)
> [1] 0.0161584
>
> Here I tried to calculate the mean with the same numbers as given above, but taken from my dataset.
>
> str(dataSet2$ac_bhar_60d_4d_after_ann[1:9])
>  num [1:9] 0.2 0.13 0.06 0.03 0.02 -0.3 -0.24 0.07 0.18
> mean(dataSet2$ac_bhar_60d_4d_after_ann[1:9])
> [1] 0.0167

This is something that has happened in data processing:

dat <- read.csv2(text="0,2006160108532920
0,1321167173880490
0,0563941428921262
0,0264198664609803
0,0200581303857603
-0,2971754213679500
-0,2353086361784190
0,0667195538296534
0,1755852636926560
", header=FALSE)
mean(dat[[1]])
[1] 0.0161584

> It seems that in the second case he calculates the mean with rounded numbers (0.2 and not 0.20061601085...) Could it be that R imports only the rounded numbers? How can I build a CSV-file with numbers showing all decimal places? Because I think my current CSV-file only has numbers with 2 decimal places.

That is more likely the fault of Excel than it is something R is responsible for.

-- 
David Winsemius, MD
Alameda, CA, USA
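To make the dput() versus str() point concrete, here is a small sketch using three of the values from the thread. It is an illustration only; deparse() is used because it returns the same text that dput() prints, which lets the round trip be checked in code.

```r
x <- c(0.2006160108532920, 0.1321167173880490, 0.0563941428921262)

str(x)   # compact overview; the values are abbreviated for display
dput(x)  # R code that recreates the object

# Round trip: turn the dput() text back into an object and compare
x.text <- paste(deparse(x), collapse = "")
x.back <- eval(parse(text = x.text))
all.equal(x, x.back)
```

By default the deparsed text carries about 15 significant digits; if bit-for-bit reproduction matters, dput(x, control = "digits17") asks for 17.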
Re: [R] Multiple graphs boxplot
Does something like this make any sense?

library(reshape2)
library(ggplot2)
yy <- structure(list(A = c(23, 21, 21, 20, 19, 19), B = c(20, 18, 20, 19, 20, 18), C = c(15, 15, 15, 12, 13, 13)), .Names = c("A", "B", "C"), class = "data.frame", row.names = c(NA, -6L))
y1 <- melt(yy) # using reshape2
ggplot(y1, aes(variable, value)) + geom_boxplot()
# or
ggplot(y1, aes(variable, value)) + geom_boxplot() + facet_grid(variable ~ .)

John Kane
Kingston ON Canada

-Original Message-
From: dagr...@hotmail.com
Sent: Fri, 5 Oct 2012 18:01:39 +0200
To: r-help@r-project.org
Subject: [R] Multiple graphs boxplot

Dear all

I am trying to represent a dependent variable (treatment) against different independent variables (v1, v2, v3, ..., v20). I am using the following command:

boxplot(v1~treatment,data=y, main="xx",xlab="xx", ylab="xx")

However, it provides me only one graph for v1~treatment. For the other comparisons, I have to repeat the same command but changing the parameters. My intention is to get different plots in just one sheet using only one command. Is it possible to apply the same command to all the comparisons at once?

Thanks
David
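A base-graphics alternative to the ggplot2 suggestion, sketched here on made-up data because the original data frame y was not posted: loop over the response columns and draw one boxplot per panel. The column names v1 to v4 and the treatment levels are placeholders for this sketch.

```r
set.seed(1)
# Simulated stand-in for the poster's data frame `y`
y <- data.frame(
  treatment = factor(rep(c("ctrl", "trtA", "trtB"), each = 10)),
  v1 = rnorm(30), v2 = rnorm(30, 1), v3 = rnorm(30, 2), v4 = rnorm(30, 3)
)

vars <- setdiff(names(y), "treatment")

op <- par(mfrow = c(2, 2))  # 2 x 2 grid of panels on one page
for (v in vars) {
  # reformulate() builds the formula v ~ treatment from the column name
  boxplot(reformulate("treatment", response = v), data = y,
          main = v, xlab = "treatment", ylab = v)
}
par(op)  # restore the previous layout
```

For 20 variables the grid would need to be larger, for example par(mfrow = c(4, 5)), possibly with a bigger device.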
Re: [R] help with making figures
On 05.10.2012 21:59, megalops wrote:
> Bert,
> Can you help me understand your suggestion?

Megalops31, which suggestion? You failed to quote former messages!

> I don't understand how I can include all 30 sites under the label called site in the xypot

What is an xypot example? Please read the posting guide for this *mailing list*.

Uwe Ligges

> code example you provided.
> -- View this message in context: http://r.789695.n4.nabble.com/help-with-making-figures-tp4645074p4645216.html Sent from the R help mailing list archive at Nabble.com.
[R] sample with equal probabilities
Hello,

If I have this vector x=c(5,1,2,9) and n=length(x), and I want to sample one value from x, where each value of x has equal probability (1/n) of appearing, are the following two calls equivalent?

sample(x,1,replace=TRUE)

and

sample(x,1,replace=TRUE,prob=rep(1/n, n))

Regards
Re: [R] sample with equal probabilities
Please don't double post. And see my response to you here:

https://stat.ethz.ch/pipermail/r-help/2012-October/325470.html

Michael

On Sat, Oct 6, 2012 at 6:51 PM, solafah bh <solafa...@yahoo.com> wrote:
> Hello
> If I have this vector x=c(5,1,2,9) and n=length(x) and I want to sample one value from x , and each value of x has equal probability to appear (1/n). Are the following codes equivalent??
> sample(x,1,replace=TRUE)
> and
> sample(x,1,replace=TRUE,prob=rep(1/n , n))
> Regards
Re: [R] vector is not assigned correctly in for loop
On Sat, Oct 6, 2012 at 3:23 PM, Bert Gunter <gunter.ber...@gene.com> wrote:
> But the OP should not be doing this **at all.** He apparently has not bothered to read the Intro to R tutorial as he appears not to know about vectorized calculations.
> -- Bert

I don't really think that's relevant or constructive here. Yes, it would be more idiomatic, as I and Rui already noted, to vectorize _this_ trivial example, but it obviously is just a minimal example. If the OP really wanted the result 0.02, 0.04, etc., and vectorization really was the issue at hand, we would have just directed him to seq(0.02, 1.00, 0.02) and that would have been that.

The meat of the question -- which I feel the OP nicely isolated -- was the eternally surprising behavior of floating point numbers, specifically in regards to vector subsetting. It's a fact of life that many of us have seen before and know to deal with, but it's surprising and profoundly counterintuitive for beginning programmers. So much so, in fact, that the Python folks none too recently considered changing the language so that non-integer literals would use infinite precision (or at least accurate decimal implementations). It didn't really get off the ground, but it was a serious consideration by several intelligent people.

Vectorization is orthogonal to all that and rather than berating the OP for his question and asserting that he hasn't bothered, I would re-commend him for a well posed and deep question, though perhaps he didn't know quite how deep this particular rabbit hole actually goes. It shows, to me, strong evidence of someone taking the time to isolate the problem from a larger codebase before asking the list, and I appreciate that fully.

Michael
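The floating point behaviour Michael refers to can be reproduced in a few lines. The original poster's loop is not shown in this part of the archive, so the sketch below uses a different, self-chosen example of the same effect: a subscript computed from decimal fractions lands just below a whole number and is truncated, not rounded, when used for indexing.

```r
# Decimal fractions are stored as binary approximations
(0.1 + 0.2) == 0.3                 # FALSE
isTRUE(all.equal(0.1 + 0.2, 0.3))  # TRUE: comparison with a tolerance

# Non-integer subscripts are truncated toward zero
v   <- letters[1:10]
idx <- (0.1 + 0.7) * 10            # prints as 8 but is stored slightly below 8
print(idx, digits = 17)

v[idx]         # picks the 7th element, "g"
v[round(idx)]  # picks the 8th element, "h"
```

Rounding the computed subscript, or looping over an integer counter and deriving the decimal value from it, avoids the surprise.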
Re: [R] Test for Random Points on a Sphere
Lorenzo Isella <lorenzo.ise...@gmail.com> writes:

> Dear All,
> I implemented an algorithm for (uniform) random rotations. In order to test it, I can apply it to a unit vector (0,0,1) in Cartesian coordinates. The result is supposed to be a set of random, uniformly distributed, points on a sphere (not the point of the algorithm, but a way to test it). This is what the points look like when I plot them, but other than eyeballing them, can anyone suggest a test to ensure that I am really generating uniform random points on a sphere?

There is a substantial literature on this topic and more than one (metaphorical?) direction you could follow. I suggest you Google 'directional statistics' and start reading.

Visit http://www.rseek.org and enter 'directional statistics' in the search box and click on the search button to see if there is something in R to meet your needs.

A post to r-sig-geo might get more helpful responses once you can focus the question a bit more.

HTH,
Chuck

> Many thanks
> Lorenzo

-- 
Charles C. Berry
Dept of Family/Preventive Medicine
cberry at ucsd edu
UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/
La Jolla, San Diego 92093-0901
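Short of the directional-statistics literature, a quick sanity check is possible in base R. This is a sketch rather than a complete test of uniformity: for points uniform on the unit sphere the z coordinate is uniform on (-1, 1) and the longitude is uniform on (-pi, pi), so both marginals can be compared with ks.test(). The reference sample below is generated by normalising Gaussian vectors and stands in for the output of the rotation code, which was not posted.

```r
set.seed(123)
n <- 5000

# Reference sample: normalised Gaussian vectors are uniform on the sphere.
# Replace `pts` with the n x 3 matrix produced by the code under test.
g   <- matrix(rnorm(3 * n), ncol = 3)
pts <- g / sqrt(rowSums(g^2))

z   <- pts[, 3]
phi <- atan2(pts[, 2], pts[, 1])

# Marginal checks: z ~ U(-1, 1) and phi ~ U(-pi, pi) under uniformity
p.z   <- ks.test(z,   "punif", min = -1,  max = 1)$p.value
p.phi <- ks.test(phi, "punif", min = -pi, max = pi)$p.value
c(z = p.z, phi = p.phi)

# The mean direction should be close to the origin
colMeans(pts)
```

Small p-values would point to non-uniformity; large ones only mean that these particular checks found nothing, so dedicated tests remain worth a look.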
Re: [R] ffbase, help with %in%
Hello,

You are trying to index based on a logical ff vector instead of an integer ff vector. Indexing based on logical ff vectors is only allowed since version 0.6 of the package, which is not yet on CRAN.

Jan

-- View this message in context: http://r.789695.n4.nabble.com/ffbase-help-with-in-tp4644730p4645268.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] LaTeX consistent publication graphics from R and Comparison of GLE and R
I had a brief look at PSTricks, and really quite like it. There is however one catch: like Sweave etc., it is assumed to be processed along with the LaTeX. I find these things rather annoying, as it is just a major and unnecessary error source. I think it is much better to produce single objects (e.g. PNGs) and then embed them, like R does, without needing to actually embed the R into the main text.

But many thanks for the pointer - I think it is probably the best bet to continue formatting data from R as well as being able to produce organigrams etc.

- Christian Langkamp
christian.langkamp-at-gmxpro.de

-- View this message in context: http://r.789695.n4.nabble.com/LaTeX-consistent-publication-graphics-from-R-and-Comparison-of-GLE-and-R-tp4645218p4645293.html Sent from the R help mailing list archive at Nabble.com.
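The "single objects, embedded later" workflow Christian prefers might look like the following on the R side. This is a sketch with assumed values: the file name, the 5 x 3.5 inch size, the 10 pt base size and the serif family are placeholders that would need to match the actual thesis layout and fonts.

```r
# Hypothetical output file; a thesis would normally use a figures/ directory
fig.file <- file.path(tempdir(), "hist-example.pdf")

# Sizes are in inches; matching the LaTeX text width avoids rescaling the fonts
pdf(fig.file, width = 5, height = 3.5, pointsize = 10, family = "serif")
set.seed(1)
hist(rnorm(200), main = "", xlab = "value", col = "grey80")
dev.off()

file.exists(fig.file)
```

The resulting PDF is then included from LaTeX with \includegraphics{hist-example.pdf}, independently of how the R script was run.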
Re: [R] Calculating the mean in one column with empty cells
I created a Microsoft Excel spreadsheet. As you said, the export only contained the numbers as displayed. I just solved the problem by showing 25 decimal places in Excel and then exporting the data into a CSV file. Is there a better way to solve this?

Regards,
Felix

-- View this message in context: http://r.789695.n4.nabble.com/Calculating-the-mean-in-one-column-with-empty-cells-tp4645135p4645278.html Sent from the R help mailing list archive at Nabble.com.
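Whatever is done on the Excel side, the result can be verified from R. The sketch below does not answer the Excel question; it only shows, with the nine numbers from the thread, what a full-precision CSV looks like and how to confirm that nothing was lost on import. The temporary file stands in for the real export.

```r
testdata <- c(0.2006160108532920, 0.1321167173880490, 0.0563941428921262,
              0.0264198664609803, 0.0200581303857603, -0.2971754213679500,
              -0.2353086361784190, 0.0667195538296534, 0.1755852636926560)

# Write a CSV from R; numbers are written with up to 15 significant digits
csv.file <- tempfile(fileext = ".csv")
write.csv(data.frame(value = testdata), csv.file, row.names = FALSE)

# Look at the raw text of the file, which is what R will actually read
readLines(csv.file, n = 3)

# Read it back and confirm the values and the mean survive the round trip
back <- read.csv(csv.file)
all.equal(back$value, testdata)
mean(back$value)
```

Running readLines() on the Excel export in the same way shows immediately whether the file contains 0.2 or 0.200616010853292.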
[R] R Update Failed !!!
Hi R-listers,

I just tried updating my R and now I can't even open it. It prompts me to relaunch, and then the relaunch prompt just reappears, so R will not open. I am afraid I may have lost my scripts. What should I do? I am running a MacBook with OS X version 10.5.8.

1) Restore the entire system to an earlier date?
2) Drag R from Applications and reload the R program (updated version)?

Will my script files still show up if I do either of these?

Please advise.
Jean

-- View this message in context: http://r.789695.n4.nabble.com/R-Update-Failed-tp4645297.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Presence/ absence data from matrix to single column
Works fantastic!!! thank you SO much

-- View this message in context: http://r.789695.n4.nabble.com/Presence-absence-data-from-matrix-to-single-column-tp4645271p4645302.html Sent from the R help mailing list archive at Nabble.com.
[R] multinomial MCMCglmm
Dear all, I would like to add mixed effects in a multinomial model and I am trying to use MCMCglmm for that. The main problem I face: my data set is a trapping data set, where the observations at each trap (1 or 0 for several species) have been aggregated per trapline (i.e. 25 traps). Therefore we have a proportion of presence/absence for each species per trapline, e.g.:

   ID_line mesh habitat Apsy Mygl Crle Crru Miag Miar Mimi Mumu Misu Soar Somi empty
11 028S6A    28 copse      2    0    0    0    0    0    0    0    0    0    0    28
12 028S6B    28 copse      1    1    0    0    0    0    0    0    0    0    0    28
13 028S6C    28 hedge      2    0    0    4    0    0    0    0    0    0    0    24
14 028S6D    28 hedge      1    0    0    7    0    0    0    0    1    0    0    21
15 028S6E    28 hedge      7    0    0    1    0    0    0    0    0    0    0    22

When I run the following:

test1 <- MCMCglmm(fixed=cbind(Apsy,Mygl,Crle,Crru,Miag,Miar,Mimi,Mumu,Misu,Soar,Somi,empty)~habitat, random=~mesh, family="multinomial12", data=metalSmA[,c(2,9,23:34)], rcov=~us(trait):units)

I get an error concerning the variance structure:

ill-conditioned G/R structure: use proper priors if you haven't or rescale data if you have

I guess that the problem comes from the nature of my observations, which are frequencies instead of 0/1 per unit. Does someone know whether a multinomial model fitted with MCMCglmm can handle such frequency tables, and how to specify suitable G/R variance structures? Regards, Amélie Vaniscotte
Re: [R] R Update Failed !!!
This is being handled on R-SIG-Mac. Please disregard here. Michael
[R] Quantile Granger causality
Good evening (from Italy), Has anyone here read anything about quantile cointegration? I want to use the test statistic described in Chuang et al. (2009), which essentially follows the suggestion of Koenker and Machado (1999). This is a Wald test used for quantile cointegration, proposed by Xiao (2009). I am not sure whether anova.rq in the quantreg package can do this or not. Thank you, Massimo
Re: [R] Calculating the mean in one column with empty cells
On Oct 6, 2012, at 4:54 PM, fxen3k f.seha...@gmail.com wrote:

> I created a Microsoft Excel spreadsheet. As you said, I only have as displayed numbers. I just solved the problem by showing 25 decimal places in Excel and then exported the data into a CSV-file. Is there a better way to solve this?

Don't use Excel (or at least find a way to get reasonable defaults). This isn't sarcastic: just acknowledging that instances like this show Excel really isn't a suitable tool for real data analysis (cf. Pat Burns' 'Spreadsheet Addiction' paper). Michael

> Regards, Felix
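For the question in the subject line (a column mean when some cells are empty), here is a minimal sketch in base R. It assumes the data have been exported to CSV with the empty cells left blank; the column name `value` and the numbers are made up for illustration, and in practice the text would come from a file rather than an inline string.

```r
# Hypothetical CSV content; with a real file this would be
# read.csv("yourfile.csv", na.strings = c("", "NA")).
csv_text <- "id,value
1,0.125
2,
3,0.375
4,
5,0.25"

# Treat empty cells as missing values (NA) when reading.
dat <- read.csv(text = csv_text, na.strings = c("", "NA"))

# Average only the cells that actually hold a number.
col_mean <- mean(dat$value, na.rm = TRUE)
print(col_mean)
```

Without `na.rm = TRUE`, mean() returns NA as soon as one cell is missing, which is usually the first surprise with partly empty columns.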
[R] two indirect effects of path analysis
Hello, This is Elaine. I am trying a path analysis using the lavaan package. There are three explanatory variables: X, Z, and M. The response variable is Y. A, b, and c have direct effects on Y. On the other hand, X and Z also have direct effects on M. In other words, X and Z have indirect effects on Y. The code example in the lavaan package describes only one indirect effect, as below. Please kindly advise how to modify it for two indirect effects. Thank you. Elaine

library(lavaan)
set.seed(1234)
X <- rnorm(100)
M <- 0.5*X + rnorm(100)
Y <- 0.7*M + rnorm(100)
Data <- data.frame(X = X, Y = Y, M = M)
model <- ' # direct effect
             Y ~ c*X
           # mediator
             M ~ a*X
             Y ~ b*M
           # indirect effect (a*b)
             ab := a*b
           # total effect
             total := c + (a*b)
         '
fit <- sem(model, data = Data)
summary(fit)
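One possible way to extend the example to two indirect effects is sketched below. It assumes that X, Z, and M all have direct paths to Y and that X and Z both affect M, which is one reading of the description above. The simulated data, the path coefficients, and the labels `a1`, `a2`, `c1`, `c2`, `indX`, `indZ`, `totalX`, `totalZ` are made up for illustration. Each indirect effect is defined with `:=` as the product of the two path labels it runs through.

```r
library(lavaan)

# Simulated data under assumed path coefficients (illustration only).
set.seed(1234)
X <- rnorm(100)
Z <- rnorm(100)
M <- 0.5 * X + 0.3 * Z + rnorm(100)
Y <- 0.7 * M + 0.2 * X + 0.1 * Z + rnorm(100)
Data <- data.frame(X = X, Z = Z, M = M, Y = Y)

model <- '
  # direct effects on Y
    Y ~ c1*X + c2*Z + b*M
  # effects of X and Z on the mediator
    M ~ a1*X + a2*Z
  # indirect effects through M
    indX := a1*b
    indZ := a2*b
  # total effects
    totalX := c1 + (a1*b)
    totalZ := c2 + (a2*b)
'

fit <- sem(model, data = Data)
summary(fit)
```

The defined parameters appear in the "Defined Parameters" part of the summary; `parameterEstimates(fit)` returns them as rows with operator `:=`.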
Re: [R] Presence/ absence data from matrix to single column
On 10/07/2012 01:03 AM, agoijman wrote:

> I've been trying to reshape this database but haven't succeeded at it. I tried using loops but can't get it right. I just want to reshape my database from this matrix to the one below, with only one column of data.
>
> Year Route Point  Sp1 Sp2 Sp3
> 2004 123   123-1    0   1   0
> 2004 123   123-2    0   1   1
> 2004 123   123-10   1   1   0
>
> What I want:
>
> Year Route Point
> 2004 123   123-1  Sp1 0
> 2004 123   123-2  Sp1 0
> 2004 123   123-10 Sp1 1
> 2004 123   123-1  Sp2 1
> 2004 123   123-2  Sp2 1
> 2004 123   123-10 Sp2 1
> 2004 123   123-1  Sp3 0
> 2004 123   123-2  Sp3 1
> 2004 123   123-10 Sp3 0

Hi agoijman, You can do this using the rep_n_stack function.

adat<-data.frame(Year=rep(2004,3),Route=rep(123,3), Point=c("123-1","123-2","123-10"),Sp1=c(0,0,1), Sp2=c(1,1,1),Sp3=c(0,1,0))
library(prettyR)
rep_n_stack(adat,c("Sp1","Sp2","Sp3"), stack.names=c("Sp-names","Sp-values"))

Jim
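If installing an extra package is not an option, the same stacking can be sketched in base R. This is only an illustration of the idea (repeat the identifying columns once per species column, then append the stacked values); the names `sp_cols`, `id_cols`, `Sp_name` and `Sp_value` are made up here and are not part of Jim's answer.

```r
adat <- data.frame(Year = rep(2004, 3), Route = rep(123, 3),
                   Point = c("123-1", "123-2", "123-10"),
                   Sp1 = c(0, 0, 1), Sp2 = c(1, 1, 1), Sp3 = c(0, 1, 0))

sp_cols <- c("Sp1", "Sp2", "Sp3")        # columns to stack
id_cols <- c("Year", "Route", "Point")   # columns to repeat

# Repeat the id rows once per species column (rows 1,2,3,1,2,3,1,2,3).
long <- adat[rep(seq_len(nrow(adat)), times = length(sp_cols)), id_cols]

# Species label for each block of rows, then the values column by column.
long$Sp_name  <- rep(sp_cols, each = nrow(adat))
long$Sp_value <- unlist(adat[sp_cols], use.names = FALSE)
rownames(long) <- NULL

print(long)
```

The result has one row per Point and species, in the same order as the "What I want" table in the question.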
Re: [R] svyhist
Hi Anthony,

The ylim argument has been added to the code (please see below), and I got 4 plots that have the same y-dimension. Each plot displays 2 distributions: one as a histogram from the data and another as a line (i.e., an idealized theoretical normal distribution?). My question is: is there a way to change the distribution in the lines() call and try another theoretical distribution to approximate the observed distribution?

Thanks, Pradip Muhuri

MyBreaks <- c(18,35,45,55,65,75,85,95)
png("svyhist_no_spd_age_at_inteview.png")
options( survey.lonely.psu = "adjust" )
svyhist (~age_p, subset (nhis, xspd2=='No SPD'), breaks=MyBreaks, ylim = c(0,0.035), main=" ", col="grey80", xlab="Age at Interview among those Who had no SPD" )
lines (svysmooth(~dthage, bandwidth=5,subset(nhis, xspd2=='No SPD')), lwd=2)
dev.off ()

From: Anthony Damico [ajdam...@gmail.com]
Sent: Saturday, October 06, 2012 6:56 AM
To: Muhuri, Pradip (SAMHSA/CBHSQ)
Cc: David Winsemius; R help
Subject: Re: [R] svyhist

?ylim says numeric vectors of length 2, so just the beginning and end. ?svyhist doesn't specifically mention the ylim parameter, meaning you should look for a ... in the arguments list and click through to the page for ?hist. ?hist has an example that shows the ylim parameter only containing the beginning and end values. Try using ylim = c( 0 , 0.030 ). If you're looking to set the tick marks, look at ?axis ;)

On Fri, Oct 5, 2012 at 11:18 PM, Muhuri, Pradip (SAMHSA/CBHSQ) pradip.muh...@samhsa.hhs.gov wrote:

Dear Anthony and David, Sorry, the earlier-sent plots were mislabeled, which I have corrected and attached. But the ylim issue is yet to be resolved. Thanks, Pradip Muhuri

From: Anthony Damico [ajdam...@gmail.com]
Sent: Friday, October 05, 2012 7:29 PM
To: David Winsemius
Cc: Muhuri, Pradip (SAMHSA/CBHSQ); R help
Subject: Re: [R] svyhist

This worked for me, and doesn't require removing the PSUs from the design :)

options( survey.lonely.psu = "adjust" )
svyhist (~dthage, subset (nhis, xspd2=='No SPD'), breaks=MyBreaks, main=" ", col="grey80", xlab="Age at Death Distribution" )
lines (svysmooth(~dthage, bandwidth=5,subset(nhis, xspd2=='No SPD')), lwd=2)

Dr. Lumley has written quite a bit about single-PSU strata here: http://faculty.washington.edu/tlumley/survey/exmample-lonely.html

On Fri, Oct 5, 2012 at 7:16 PM, David Winsemius dwinsem...@comcast.net wrote:

On Oct 5, 2012, at 3:33 PM, Muhuri, Pradip (SAMHSA/CBHSQ) wrote:

> Hello, I was trying to draw histograms of age at death and got the following 2 error messages:
>
> 1) Error in tapply(1:NROW(x), list(factor(strata)), function(index) { : arguments must have same length

This is the top of the output of str applied to the data argument you offered to svyhist:

str(subset (nhis, xspd2==2) )
List of 9
 $ cluster :'data.frame': 0 obs. of 1 variable:
  ..$ psu: Factor w/ 47 levels "109.1","115.2",..:
  ..- attr(*, "terms")=Classes 'terms', 'formula' length 2 ~psu
  .. .. ..- attr(*, "variables")= language list(psu)
  .. .. ..- attr(*, "factors")= int [1, 1] 1
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : chr "psu"
  .. .. .. .. ..$ : chr "psu"

At least one problem seems pretty clear: no data. That can be corrected by wrapping as.numeric() around the factor on which you are subsetting, in two places. Another problem may arise when you restrict to one class only, namely that there won't be any design to work with. Clusters in which there would be only one member no longer have any multiplicity, and svyhist apparently isn't built to handle that situation, at least with that design argument.

Error in onestrat(x[index, , drop = FALSE], clusters[index], nPSU[index][1], : Stratum (2) has only one PSU at stage 1

Taking the 'stratum' argument out of the design() spec allows it to proceed, but I do not know if that is introducing invalidity in the analysis.

-- David.

> 2) Error in findInterval(mm[, i], gx) : 'vec' contains NAs
> In addition: Warning messages:
> 1: In min(x) : no non-missing arguments to min; returning Inf
> 2: In max(x) : no non-missing arguments to max; returning -Inf
>
> I would appreciate it if someone could help me resolve these issues. Below is a reproducible example. Thanks, Pradip Muhuri
>
> setwd ("E:/RDATA")
> options(width = 120)
> library (survey)
> library (KernSmooth)
> xd1 <- "dthage ypll_75 xspd2 psu stratum wt8
> 56 19 2 2 33 1512.7287
> 86 0 2 2 129 1830.6400
> 81 0 2 1 67 536.1400
> 47 28 2 1 17 519.8350
> 71 4 1 1 225 254.4087
> 72 3
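The "no data" point in David's reply can be reproduced on a toy data frame. This is a sketch with made-up rows, not the NHIS data; only the column names `xspd2` and `dthage` and the label 'No SPD' are borrowed from the thread, and the second level label "SPD" is an assumption. Comparing a factor with a number tests the level labels, not the underlying integer codes, so the subset comes back empty unless the codes are extracted with as.numeric() or the label itself is used.

```r
# Toy data: xspd2 is a factor with two levels ("No SPD" sorts before "SPD").
toy <- data.frame(
  xspd2  = factor(c("No SPD", "SPD", "No SPD", "SPD")),
  dthage = c(56, 86, 81, 47)
)

# Factor compared with a number: labels are compared with "2", nothing matches.
n_factor_vs_number <- nrow(subset(toy, xspd2 == 2))

# Integer codes compared with a number: picks the second level.
n_codes <- nrow(subset(toy, as.numeric(xspd2) == 2))

# Labels compared with a label: the form used later in the thread.
n_label <- nrow(subset(toy, xspd2 == "SPD"))

print(c(n_factor_vs_number, n_codes, n_label))
```

Subsetting by label is less fragile than relying on the integer codes, because the codes depend on the order of the factor levels.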
[R] what exactly is the dim of data set yarn in package pls?
Hi list, I am looking at the data set yarn in package pls, and I don't understand what the dimension of this data set is. I did the following:

library(pls)
data(yarn)
dim(yarn)
[1] 28 3
head(yarn)
    NIR.1   NIR.2   NIR.3   NIR.4   NIR.5   NIR.6   NIR.7   NIR.8   NIR.9  NIR.10  NIR.11
1 3.06630 3.08610 3.10790 3.09720 2.99790 2.82730 2.62330 2.40390 2.19310 2.00580 1.83790
2 3.06750 3.08570 3.09580 3.06920 2.98180 2.84080 2.67600 2.50590 2.35060 2.22300 2.11920
3 3.07500 3.09660 3.09160 3.02880 2.88490 2.68850 2.47640 2.26940 2.08240 1.91950 1.77470
4 3.08280 3.09730 3.10100 3.07350 2.99130 2.87090 2.73920 2.61020 2.5     2.42370 2.37740
5 3.10290 3.10340 3.08480 3.02280 2.89270 2.71590 2.53840 2.37640 2.23970 2.13460 2.05340
6 3.08150 3.08490 3.04870 2.93050 2.73230 2.50890 2.29440 2.09950 1.93280 1.79250 1.66930

There were 270 columns; I only copied and pasted the first 11. But either way, this is NOT 3 columns. Could anyone let me know what is wrong here? Thanks in advance, Mike
Re: [R] what exactly is the dim of data set yarn in package pls?
Perhaps what is wrong is that you need to learn when to use the str() function.

str(yarn)

Jeff Newmiller
Re: [R] what exactly is the dim of data set yarn in package pls?
I just looked at it again, and I did:

is.list(yarn)
[1] TRUE

So, I guess it's a list with data inside. M
Re: [R] what exactly is the dim of data set yarn in package pls?
Thanks, str() gave me all the info I am looking for. I am too used to dim(), and forgot about list() and str(). Thanks again, -M
Re: [R] what exactly is the dim of data set yarn in package pls?
Start by using str() to get an idea of the structure of this dataset:

str(pls::yarn)
'data.frame': 28 obs. of 3 variables:
 $ NIR    : num [1:28, 1:268] 3.07 3.07 3.08 3.08 3.1 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ density: num 100 80.2 79.5 60.8 60 ...
 $ train  : logi TRUE TRUE TRUE TRUE TRUE TRUE ...

I.e., it contains 3 things, each with 28 observations: a 268-column matrix of numbers, NIR; a numeric vector, density; and a logical vector, train. data.frames can contain matrices. They are not common, but can be used for grouping purposes or because it is easier to refer to NIR[,j] than to paste("NIR", j, sep="."). help(yarn) tells about the meaning of the components.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
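The behaviour described above, a data.frame holding a whole matrix as one column, can be tried on a small stand-in object without loading pls. This is only a sketch: the object `toy` and its numbers are made up, with the component names borrowed from yarn.

```r
# Toy stand-in for yarn: 4 observations, two ordinary columns ...
toy <- data.frame(density = c(100, 80.2, 79.5, 60.8),
                  train   = c(TRUE, TRUE, FALSE, TRUE))

# ... plus a 4 x 5 matrix stored as a single column named NIR.
toy$NIR <- matrix(seq(3.00, by = 0.01, length.out = 20), nrow = 4, ncol = 5)

dim(toy)       # counts top-level columns only: the matrix counts once
names(toy)     # "density" "train" "NIR"
dim(toy$NIR)   # the matrix keeps its own dimensions
head(toy)      # printing expands the matrix into NIR.1, NIR.2, ...
str(toy)       # shows the real structure
```

This is why dim() and head() appear to disagree: dim() reports the structure, while printing flattens the matrix column for display.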
Re: [R] Presence/ absence data from matrix to single column
Also reshape() will work:

adat<-data.frame(Year=rep(2004,3),Route=rep(123,3), Point=c("123-1","123-2","123-10"),Sp1=c(0,0,1), Sp2=c(1,1,1),Sp3=c(0,1,0))
reshape(adat, varying=4:6, v.names="Sp-value", times=c("Sp1", "Sp2", "Sp3"), idvar="Point", timevar="Sp-name", direction="long")

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352