Re: [R] Creating a Matrix from a vector with some conditions
Hi,

embed() seemed well suited, but I couldn't figure out a more elegant way to use it than this:

    embed(c(A, A), 4)[1:4, 4:1]

HTH,

baptiste

On 6 January 2011 22:34, ADias diasan...@gmail.com wrote:
> Hi
>
> Suppose we have an object with strings:
>
>     A <- c("a", "b", "c", "d")
>
> Now I do:
>
>     B <- matrix(A, 4, 4, byrow = FALSE)
>
> and I get
>
>     a a a a
>     b b b b
>     c c c c
>     d d d d
>
> But what I really want is:
>
>     a b c d
>     b c d a
>     c d a b
>     d a b c
>
> How can I do this?
>
> thank you
> A. Dias
> --
> View this message in context: http://r.789695.n4.nabble.com/Creating-a-Matrix-from-a-vector-with-some-conditions-tp3178219p3178219.html
> Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
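To see why the embed() call produces the rotated rows, here is a small self-contained sketch. It assumes A is the character vector from the question with its quotes restored. The second construction, based on modular indexing, is not from the thread and is shown only as a cross-check.

```r
A <- c("a", "b", "c", "d")
n <- length(A)

# embed() builds rows of lagged values. Doubling A lets each row wrap around,
# and reversing the columns restores left-to-right order.
m_embed <- embed(c(A, A), n)[1:n, n:1]

# Cross-check (not from the thread): cell (i, j) holds element
# ((i + j - 2) mod n) + 1 of A.
idx <- (outer(0:(n - 1), 0:(n - 1), "+") %% n) + 1
m_index <- matrix(A[idx], n, n)

print(m_embed)
```

Both constructions shift the vector by one position per row, so they should agree for any vector length.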
Re: [R] Operating on count lists of non-equal lengths
Hi:

This is an abridged version of the reply I sent privately to the OP.

Generate an artificial data frame:

    # Function to randomly generate one of the Q* columns, with length 1000
    mysamp <- function() sample(c(-1, 0, 1, NA), 1000, prob = c(0.35, 0.2, 0.4, 0.05), replace = TRUE)

    # Use the function above to randomly generate 10 questions and assign them names in the workspace
    for (i in 1:10) assign(paste('Q', i, sep = ''), mysamp())

    # Create a data frame from the generated questions
    C <- data.frame(time = rep(1:4, each = 250),
                    sector = sample(LETTERS[1:6], 1000, replace = TRUE),
                    Q1, Q2, Q3, Q4, Q5, Q6, Q7, Q8, Q9, Q10)

    # A function to generate the scores from the combined questions
    # for an arbitrary input data frame d:
    scorefun <- function(d) {
        dm <- matrix(unlist(apply(d, 2, table)[-(1:2)]), nrow = 3)
        tsums <- cbind(rowSums(dm[, 1:3]), dm[, 4], rowSums(dm[, 5:6]),
                       rowSums(dm[, 7:8]), rowSums(dm[, 9:10]))
        dprop <- function(x) (x[3] - x[1])/sum(x)
        100 * (1 + apply(tsums, 2, dprop))
    }

    library(plyr)
    # Apply scorefun() to each sub-data frame corresponding to time-sector combinations
    ddply(C, .(time, sector), scorefun)

Dennis

On Sat, Jan 8, 2011 at 10:19 PM, Kari Manninen k...@econadvisor.com wrote:
> This is my first post to R-help and I look forward to receiving some advice for a novice like me...
>
> I've got simple repeated (4 periods so far) 10-question survey data that is very easy to work on in Excel. However, I'd like to move the compilation to R, but I'm having some trouble operating on count list data in a neat way.
>
> The data C:
>
>     str(C)
>     'data.frame': 551 obs. of 13 variables:
>      $ TIME  : int 1 1 1 1 1 1 1 1 1 1 ...
>      $ Sector: Factor w/ 6 levels "D","F","G","H",..: 1 1 1 1 1 1 1 1 1 1 ...
>      $ COMP  : Factor w/ 196 levels (_ __ _) ,..: 73 133 128 109 153 147 56 26 142 34 ...
>      $ Q1    : int 0 0 1 1 0 -1 -1 1 1 -1 ...
>      $ Q2    : int 0 0 0 -1 0 -1 0 0 1 -1 ...
>      $ Q3    : int 0 0 0 1 0 -1 -1 1 1 -1 ...
>      $ Q4    : int -1 0 0 0 0 -1 0 -1 0 -1 ...
>      $ Q5    : int 0 0 0 -1 0 -1 0 -1 0 0 ...
>      $ Q6    : int 0 0 0 1 0 -1 0 -1 0 0 ...
>      $ Q7    : int 0 1 1 0 0 0 1 0 1 1 ...
>      $ Q8    : int 0 0 0 0 0 -1 0 0 1 0 ...
>      $ Q9    : int 0 1 0 0 0 -1 0 -1 1 -1 ...
>      $ Q10   : int 0 0 0 0 -1 -1 0 -1 0 0 ...
>
>     summary(C)
>          TIME         Sector    COMP         Q1                Q2
>      Min.   :1.000    D:130     A: 4         Min.   :-1.000    Min.   :-1.
>      1st Qu.:2.000    F:126     B: 4         1st Qu.: 0.000    1st Qu.: 0.
>      Median :3.000    G:158     C: 4         Median : 1.000    Median : 0.
>      Mean   :2.684    H: 26     D: 4         Mean   : 0.446    Mean   : 0.2178
>      3rd Qu.:4.000    I: 20     E: 4         3rd Qu.: 1.000    3rd Qu.: 1.
>      Max.   :4.000    J: 91     F: 4         Max.   : 1.000    Max.   : 1.
>                                 (Other):527  NA's   :60.000    NA's   :69.
>
> The aim is to produce balance scores between the shares of positive and negative answers in the data.
>
> First, counts of -1, 0 and 1 (negative, neutral, positive) and missing NA (it would be so much simpler without the missing values) for each question Q1-Q10, for each period (TIME), in 6 Sectors:
>
>     b <- apply(C[, 4:13], 2, function (x) tapply(x, C[, 1:2], count))
>
> I know that b is a list of data.frames of dim 4x6, one for each question, where each cell is a count list. For example, for Question 1, Time period 2, Sector 1:
>
>     str(b$Q1[2,1])
>     List of 1
>      $ :'data.frame': 4 obs. of 2 variables:
>       ..$ x   : int [1:4] -1 0 1 NA
>       ..$ freq: int [1:4] 3 9 12 2
>
> Now I would like to group questions (C[, 4:6], C[, 7], C[, 8:9], C[, 10:11] and C[, 12:13]), sum the counts (-1, 0, 1) for these groups, and present them in percentage terms. I don't know how to do this efficiently for the whole data. I would not like to go through each cell separately.
>
> Then I'd give each group a balance score based on something like:
>
>     Score = 100 + 100*[pos% - neg%]
>
> for each group by TIME and Sector, while excluding the missing observations.
>
>     ### This is not working
>     Score <- 100 + 100*[sum(count( ==1)/sum(count(list( -1, 0,1) - sum(count( ==-1)/sum(count(list( -1, 0,1)]
>
> for each of the 5 groups defined above, by TIME and Sector.
>
> I would greatly appreciate your help on this.
>
> Regards,
> - Kari Manninen
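The balance score described in the question can also be computed directly from the raw answers, without building count tables first. The following is a minimal sketch, not the code from the reply above: the data frame is a small made-up stand-in, only the first question group (Q1 to Q3) is scored, and the helper name balance() is hypothetical.

```r
# Balance score for answers coded -1 / 0 / 1, ignoring NA:
# 100 + 100 * (share of positives - share of negatives)
balance <- function(x) {
  x <- x[!is.na(x)]
  100 + 100 * (mean(x == 1) - mean(x == -1))
}

set.seed(42)
n <- 240
# Made-up stand-in for the survey data; column names follow the question
C <- data.frame(
  TIME   = rep(1:4, each = n / 4),
  Sector = sample(c("D", "F", "G"), n, replace = TRUE),
  Q1     = sample(c(-1, 0, 1, NA), n, replace = TRUE),
  Q2     = sample(c(-1, 0, 1, NA), n, replace = TRUE),
  Q3     = sample(c(-1, 0, 1, NA), n, replace = TRUE)
)

# Pool the answers of the group Q1-Q3 within each TIME x Sector cell
cells  <- split(C[, c("Q1", "Q2", "Q3")], list(C$TIME, C$Sector))
scores <- sapply(cells, function(d) balance(unlist(d)))
print(round(scores, 1))
```

Because missing answers are dropped before the shares are computed, a cell with no valid answers would yield NaN rather than a score.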
Re: [R] how to calculate this natural logarithm
You could redo part of your code to run with mpfr-class variables (from the Rmpfr package) and use this function:

    # mpfr version of choose(n, k)
    rmpfac <- function(n, k, prec = 50) factorial(mpfr(n, prec))/factorial(mpfr(k, prec))/factorial(mpfr(n - k, prec))

Converting into and out of mpfr may not be worth it, but you will get all the precision you want without any nasty Inf results :-)

Carl
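If only the logarithm of the binomial coefficient is needed, base R may already be enough. The sketch below is an alternative to the mpfr approach, not a replacement for it, and the values of n and k are made up for illustration.

```r
n <- 2000
k <- 1000

# The coefficient itself exceeds the range of a double
big <- choose(n, k)
print(big)   # Inf

# Its natural logarithm is computed on the log scale and stays finite
log_direct <- lchoose(n, k)

# The same quantity assembled from log-factorials
log_parts <- lfactorial(n) - lfactorial(k) - lfactorial(n - k)

print(c(log_direct, log_parts))
```

Working on the log scale avoids the overflow, but it keeps only double precision, so the mpfr route is still the one to use when more digits are required.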
[R] Post hoc analysis for ANOVA with repeated measures
Dear all,

how can I perform a post hoc analysis for ANOVA with repeated measures (with a balanced design)? I have not been able to find a good example in R on the internet. Would someone be so kind as to give me a hint with an R example, please?

For example, the aov result of my analysis says that there is a statistically significant difference between stimuli (there are 7 different stimuli). I would like to know which stimuli are involved.

    aov1 = aov(response ~ stimulus*condition + Error(subject/(stimulus*condition)), data=scrd)
    summary(aov1)

    Error: subject
              Df Sum Sq Mean Sq F value Pr(>F)
    Residuals 14 227.57  16.255

    Error: subject:stimulus
              Df Sum Sq Mean Sq F value  Pr(>F)
    stimulus   6 11.695 1.94921  2.3009 0.04179 *
    Residuals 84 71.162 0.84717
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Error: subject:condition
              Df Sum Sq Mean Sq F value   Pr(>F)
    condition  1 42.076  42.076  12.403 0.003386 **
    Residuals 14 47.495   3.393
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Error: subject:stimulus:condition
                       Df Sum Sq Mean Sq F value Pr(>F)
    stimulus:condition  6  9.124  1.5206  1.4465 0.2068
    Residuals          84 88.305  1.0513

Attached you will find the table I am using. You can read it with:

    scrd <- read.csv(file='/Users/../tables_for_R/table_quality_gravel.csv', sep=',', header=T)

The data are from an experiment in which participants rated the realism of some stimuli on a seven-point Likert scale. The stimuli were presented both in condition A and in condition AH.

Thanks in advance

Best regards
[R] Bartlett HAC covariance matrix estimator
Dear everyone:

I am doing research on several stock markets, and I need to construct a Bartlett HAC covariance matrix estimator for Sigma(Cov(Y0,Yj)), where j runs from 0 to T. Can you tell me how to do it?

Yours sincerely,
Nigel Gregory
01/09/11
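One way to read the question is as a request for the Newey-West form of the estimator, in which the lag-j autocovariance receives the Bartlett weight 1 - j/(L + 1). The sketch below hand-rolls that form in base R under this assumption. The function name bartlett_hac(), the lag truncation L, and the simulated data are all hypothetical, and an established package implementation is likely a better choice for real work.

```r
# Bartlett-kernel (Newey-West style) long-run covariance estimate:
#   S = G0 + sum_{j=1}^{L} (1 - j/(L+1)) * (Gj + t(Gj)),
# where Gj is the lag-j sample autocovariance matrix of the demeaned series.
bartlett_hac <- function(Y, L) {
  Y  <- as.matrix(Y)
  n  <- nrow(Y)
  Yc <- sweep(Y, 2, colMeans(Y))          # remove column means
  S  <- crossprod(Yc) / n                 # lag-0 term
  for (j in seq_len(L)) {
    # sum over t of y_t y_{t-j}', divided by n
    G <- crossprod(Yc[(j + 1):n, , drop = FALSE],
                   Yc[1:(n - j), , drop = FALSE]) / n
    S <- S + (1 - j / (L + 1)) * (G + t(G))
  }
  S
}

set.seed(1)
Y <- matrix(rnorm(200 * 3), 200, 3)       # made-up returns for 3 markets
S <- bartlett_hac(Y, L = 4)
print(S)
```

The linearly declining weights are what keep the estimate positive semi-definite; the choice of L is left to the user here.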
Re: [R] Post hoc analysis for ANOVA with repeated measures
My suggestion is that you use the weekend lull in readership to read the Posting Guide, which it appears you have not yet done. Had you done so, you would have encountered the advice about how to attach data files successfully. (None came through.) You would also have read the request that, before posting, you search the R-help archives for previous questions that match yours. (There are several using the search terms "post-hoc repeated measures": http://search.r-project.org/cgi-bin/namazu.cgi?query=post-hoc+repeated+measures&max=100&result=normal&sort=score&idxname=functions&idxname=Rhelp08&idxname=Rhelp10&idxname=Rhelp02 )

--
David.

On Jan 9, 2011, at 7:10 AM, Frodo Jedi wrote:
> Dear all,
> how can I perform a post hoc analysis for ANOVA with repeated measures (in presence of a balanced design)? I am not able to find a good example over internet in R...is there among you someone so kind to give me an hint with a R example please?
>
> For example, the aov result of my analysis says that there is a statistical difference between stimuli (there are 7 different stimuli). ...I would like to know which stimuli are involved.
>
>     aov1 = aov(response ~ stimulus*condition + Error(subject/(stimulus*condition)), data=scrd)
>     summary(aov1)
>
> In attachment you find the table I am using [...]

David Winsemius, MD
West Hartford, CT
Re: [R] Heat map in R
Make sure your data is a matrix. There are many examples of expression heatmaps available on the Bioconductor list. After checking out these examples, I would post to the Bioconductor list if you are still having problems. Also consider a small example to get you a working heatmap. You have to install two Bioconductor packages for this by using:

    source("http://bioconductor.org/biocLite.R")
    biocLite(c("ALL", "genefilter"))

This will also install other Bioconductor packages that are needed. Then try:

    library(ALL)
    library(genefilter)
    data(ALL)
    # Create the example data
    bcell <- grep("^B", as.character(ALL$BT))
    types <- c("NEG", "BCR/ABL")
    mysub <- which(as.character(ALL$mol.biol) %in% types)
    bc <- ALL[, intersect(bcell, mysub)]
    bc$BT <- factor(bc$BT)
    bc$mol.biol <- factor(bc$mol.biol)
    filter_bc <- nsFilter(bc, var.cutoff = 0.9)
    myfilt <- filter_bc$eset
    e <- exprs(myfilt)
    # End of data creation
    dim(e)
    # [1] 880  79
    class(e)
    # [1] "matrix"
    heatmap(e)

On Wed, Jan 5, 2011 at 4:33 PM, lraeb...@sfu.ca lraeb...@sfu.ca wrote:
> Hello,
>
> I am trying to make a heatmap in R and am having some trouble. I am very new to the world of R, but have been told that what I am trying to do should be possible. I want to make a heat map that looks like a gene expression heatmap (see http://en.wikipedia.org/wiki/Heat_map). I have 43 samples and 900 genes (yes, I know this will be a huge map). I also have copy numbers associated with each gene/sample and need these to be represented as the colour intensities on the heat map. There are multiple genes per sample with different copy numbers. I think my trouble may be how I am setting up my data frame.
>
> My data frame was created in Excel as a tab-delimited text file:
>
>     Gene  Copy Number  Sample ID
>     A     1935         01
>     B     2057         01
>     C     2184         02
>     D     1498         03
>     E     2294         03
>     F     2485         03
>     G     1560         04
>     H     3759         04
>     I     2792         05
>     J     7081         05
>     K     1922         06
>     .     .            .
>     .     .            .
>     .     .            .
>     ZZZ   1354         43
>
> My code in R is something like this:
>
>     data <- read.table("/Users/jsmt/desktop/test.txt", header=T)
>     data_matrix <- data.matrix(data)
>     data_heatmap <- heatmap(data_matrix, Rowv=NA, Colv=NA, col = cm.colors(256), scale="column", margins=c(5,10))
>
> I end up getting a heat map split into 3 columns (sample, depth, gene), and the colours are just in big blocks that don't mean anything.
>
> Can anyone help me with my data frame or my R code? Again, I am fairly new to R, so if you can help, please give me very detailed help :)
>
> Thanks in advance!
> --
> View this message in context: http://r.789695.n4.nabble.com/Heat-map-in-R-tp3176478p3176478.html
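The three-column heat map in the question arises because data.matrix() turns the long table (one row per gene/sample pair) into a 3-column numeric matrix. A minimal sketch of the reshaping step follows, using the first six rows shown in the question. The column names CopyNumber and SampleID are assumed spellings, and filling empty cells with 0 is an assumption that may not suit real copy-number data.

```r
# First rows of the long table from the question
long <- data.frame(
  Gene       = c("A", "B", "C", "D", "E", "F"),
  CopyNumber = c(1935, 2057, 2184, 1498, 2294, 2485),
  SampleID   = c("01", "01", "02", "03", "03", "03")
)

# Wide matrix: one row per gene, one column per sample
mat <- with(long, tapply(CopyNumber, list(Gene, SampleID), sum))

# Gene/sample pairs that do not occur come out as NA; 0 is used here
# only so that heatmap() has a complete matrix to draw
mat[is.na(mat)] <- 0

heatmap(mat, Rowv = NA, Colv = NA, scale = "none", col = cm.colors(256))
```

With scale = "none" the colours reflect the raw copy numbers, whereas scale = "column" in the original call rescales within each column.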
Re: [R] Different LLRs on multinomial logit models in R and SPSS
Hello,

thanks for all your replies, it was a helpful lesson for me (and hopefully for my colleagues, too).

Bests,
Sören

On 11-01-07 11:23, David Winsemius wrote:
> Date: Fri, 7 Jan 2011 11:23:04 -0500
> From: David Winsemius dwinsem...@comcast.net
> To: sovo0...@gmail.com
> Cc: r-help@r-project.org
> Subject: Re: [R] Different LLRs on multinomial logit models in R and SPSS
>
> On Jan 7, 2011, at 8:26 AM, sovo0...@gmail.com wrote:
>> On Thu, 6 Jan 2011, David Winsemius wrote:
>>> On Jan 6, 2011, at 11:23 AM, Sören Vogel wrote:
>>>> Thanks for your replies. I am no mathematician or statistician by far; however, it appears to me that the actual value of either of the two LLs is indeed important when it comes to the calculation of pseudo-R-squareds. If Rnagel divides by (some transformation of) the actual value of llnull, then any calculation of Rnagel should differ. How come? Or is my function wrong? And if my function is right, how can I calculate an R-squared independent of the software used?
>>>
>>> You have two models in that function, the null one with . ~ 1 and the original one, and you are getting a ratio on the likelihood scale (which is a difference on the log-likelihood or deviance scale).
>>
>> If this is the case, calculating 'fit' indices for those models must end up in different fit indices depending on software:
>>
>>     n <- 143
>>     ll1 <- 135.02
>>     ll2 <- 129.8
>>     # Rcs
>>     (Rcs <- 1 - exp( (ll2 - ll1) / n ))
>>     # Rnagel
>>     Rcs / (1 - exp(-ll1/n))
>>
>>     ll3 <- 204.2904
>>     ll4 <- 199.0659
>>     # Rcs
>>     (Rcs <- 1 - exp( (ll4 - ll3) / n ))
>>     # Rnagel
>>     Rcs / (1 - exp(-ll3/n))
>>
>> The Rcs values are equal; the Rnagel values, however, are not. Of course, this is not a question, but I am rather confused. When publishing results I am required to use fit indices, and editors would complain that they differ.
>
> It is well known that editors are sometimes confused about statistics, and if an editor is insistent on publishing indices that are in fact arbitrary, then that is a problem. I would hope that the editor were open to education. (And often there is a statistical associate editor who will be more likely to have a solid grounding and to whom one can appeal in situations of initial obstinacy.) Perhaps you will be doing the world of science a favor by suggesting that said editor first review a first-year calculus text regarding the fact that indefinite integrals are only calculated up to an arbitrary constant and that one can only use the results in a practical setting by specifying the limits of integration. So it is with likelihoods. They are only meaningful when comparing two nested models. Sometimes the software obscures this fact, but it remains a statistical _fact_.
>
> Whether your code is correct (and whether the Nagelkerke R^2 remains invariant with respect to such transformations) I cannot say. (I suspect that it would be, but I have never liked the NagelR2 as a measure, and didn't really like R^2 as a measure in linear regression for that matter, either.) I focus on fitting functions to trends, examining predictions, and assessing confidence intervals for parameter estimates. The notion that model fit is well summarized in a single number blinds one to other critical issues, such as the linearity and monotonicity assumptions implicit in much of regression (mal-)practice.
>
> So, if someone who is more enamored of (or even more knowledgeably scornful of) the Nagelkerke R^2 measure wants to take over here, I will read what they say with interest and appreciation.
>
>> Sören
>
> David Winsemius, MD
> West Hartford, CT

--
Sören Vogel, sovo0...@gmail.com, http://sovo0815.wordpress.com/
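The arithmetic in the quoted exchange can be rerun as a small function, which makes the source of the discrepancy visible. This is only a sketch of the formulas exactly as they appear in the post; the function name pseudo_r2() and its argument names are hypothetical, and which software produced which pair of values is not stated in the message.

```r
# Formulas from the post: 'null' and 'model' are the two fit values
# reported for the null model and the fitted model, n the sample size.
pseudo_r2 <- function(null, model, n) {
  rcs    <- 1 - exp((model - null) / n)   # depends only on the difference
  rnagel <- rcs / (1 - exp(-null / n))    # also depends on the null value itself
  c(Rcs = rcs, Rnagel = rnagel)
}

n <- 143
first  <- pseudo_r2(null = 135.02,   model = 129.8,    n = n)
second <- pseudo_r2(null = 204.2904, model = 199.0659, n = n)

print(rbind(first, second))
```

Both pairs differ by about 5.22, so Rcs agrees, while the rescaling denominator uses the null value on its own and therefore changes when a constant is added to both values.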
Re: [R] Post hoc analysis for ANOVA with repeated measures
Please read the maiz example in HH:

    install.packages("HH")
    library(HH)
    ?MMC

Read all the way to the end of the maiz example.

On Sun, Jan 9, 2011 at 7:10 AM, Frodo Jedi frodo.j...@yahoo.com wrote:
> Dear all,
> how can I perform a post hoc analysis for ANOVA with repeated measures (in presence of a balanced design)? I am not able to find a good example over internet in R...is there among you someone so kind to give me an hint with a R example please?
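For readers who want a quick first look before working through the MMC example, one common base R route is a set of paired t-tests between stimuli with a multiplicity correction. The sketch below is not the HH approach recommended above. It uses simulated data with the same layout as the question (15 subjects, 7 stimuli, 2 conditions), and averaging over condition before testing is an assumption about what is wanted.

```r
set.seed(1)
# Simulated stand-in for 'scrd': every subject rates every stimulus in both conditions
scrd <- expand.grid(subject   = factor(1:15),
                    stimulus  = factor(1:7),
                    condition = factor(c("A", "AH")))
scrd$response <- sample(1:7, nrow(scrd), replace = TRUE)

# One value per subject x stimulus cell (mean over the two conditions)
cell <- aggregate(response ~ subject + stimulus, data = scrd, FUN = mean)

# Paired tests match observations by position, so sort by subject within stimulus
cell <- cell[order(cell$stimulus, cell$subject), ]

ph <- pairwise.t.test(cell$response, cell$stimulus,
                      paired = TRUE, p.adjust.method = "holm")
print(ph)
```

The result is a lower-triangular table of adjusted p-values, one for each pair of stimuli.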
[R] question about the chow test of poolability
Good day R-listers,

My question is more a statistical question than an R question, so please bear with me. I am currently applying the Chow test of poolability to a panel with N = 17 and T = 5, and my model looks like this:

    Yit = a0 + B1X1 + B2X2 + B3X3 + B4X4 + eit

My question is the following: when I test for the equality of the coefficients of the unpooled data (the last stage), many of my constraints get dropped. This affects the degrees of freedom of my F statistic, and I would like to understand the reason. Is it because the time dimension of my panel is too small, or because the number of constraints is too high?

Any hint will be highly appreciated.

Ama
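A quick degrees-of-freedom count may help frame the question. The sketch below assumes the test compares one pooled regression against 17 separate unit-by-unit regressions in which all five coefficients (intercept included) are free to vary; if the test in use is set up differently, the numbers change.

```r
N <- 17   # cross-sectional units
T <- 5    # time periods per unit
k <- 5    # coefficients per regression: a0, B1, B2, B3, B4

# Numerator df: equality restrictions imposed by pooling
df_num <- (N - 1) * k

# Denominator df: residual df of the unpooled (unit-by-unit) fits
df_den <- N * (T - k)

print(c(df_num = df_num, df_den = df_den))
```

Under this assumption each unit has exactly as many observations as coefficients, so the separate regressions leave no residual degrees of freedom, which would be consistent with software dropping terms.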
[R] Lattice, combine histogram and line graph
Hello everyone,

I have a simple histogram of gasoline prices going back a few years, and I want to overlay a line graph of the consumer price index (cpi) on it. I have looked through the Lattice book by Deepayan Sarkar but don't see anything there. How might this be done? An example would be wonderful.

My current code snippet follows. An additional field to add as a line graph would be, for example, a cpi calculation like gas_data$regular * (2010_cpi / gas_data$year).

    xyplot(regular ~ as.Date(gas_data$dates, "%b %d, %Y"), data = gas_data, type = c("g", "h"))

Thanks,
Jim Burke
[R] Aggragating subsets of data in larger vector with sapply
I have 40,000 rows of buy/sell trade data and am trying to add up the buys for each second. The code works, but it is very slow. Any suggestions on how to improve the sapply function?

    # Vector of the last index for each second on an xts time series object
    # with multiple entries per second
    secEP = endpoints(xSym$Direction, "secs")
    d = xSym$Direction
    s = xSym$Size
    buySize = sapply(1:(length(secEP)-1), function(y) {
        i = (secEP[y] + 1):secEP[y+1]   # index of rows between consecutive endpoints
        return(sum(as.numeric(s[i][d[i] == "buy"])))
    })

Object details:

secEP = numeric vector of one-second endpoints in xSym$Direction.

    head(xSym$Direction)
                        Direction
    2011-01-05 09:30:00 unkn
    2011-01-05 09:30:02 sell
    2011-01-05 09:30:02 buy
    2011-01-05 09:30:04 buy
    2011-01-05 09:30:04 buy
    2011-01-05 09:30:04 buy

    head(xSym$Size)
                        Size
    2011-01-05 09:30:00 865
    2011-01-05 09:30:02 100
    2011-01-05 09:30:02 100
    2011-01-05 09:30:04 100
    2011-01-05 09:30:04 100
    2011-01-05 09:30:04 41

Thanks,
Chris
--
View this message in context: http://r.789695.n4.nabble.com/Aggragating-subsets-of-data-in-larger-vector-with-sapply-tp3206445p3206445.html
[R] R Audio Problems when using Vista/Windows 7
I've been trying to help someone with using R scripts with audio on Windows Vista/7. The problem is that all the packages I've tried fail to work properly on Windows 7. I couldn't get audio to work at all. It kept coming up with the message that wmm output could not happen. It worked fine on Windows XP and OS X, but not on 7.

I tried tuneR, and the default worked all right, but I couldn't get the window to embed (that is, to not appear). I then copied over mplayer and sndrec32, and those worked, but I got tons of warnings about them returning error codes.

What I'd like help with is either an audio package for R that works under Windows 7, or a way of getting rid of the warnings that show up when using one of the old XP applications.
--
View this message in context: http://r.789695.n4.nabble.com/R-Audio-Problems-when-using-Vista-Windows-7-tp3206400p3206400.html
[R] Shortcut key to get to beginniing of line in R?
This is a general question. Say you are typing a long command line in the R console, and then you realize you forgot to add something at the beginning. Is there a way to get to the beginning of the line without using the mouse, or pressing the left arrow key and waiting for the cursor to get there? I'm using the Windows version of R.

Thanks!
--
View this message in context: http://r.789695.n4.nabble.com/Shortcut-key-to-get-to-beginniing-of-line-in-R-tp3206303p3206303.html
[R] Getting total bar's label value labels in a barplot
Hi,

I have been trying to get the label under the total column (i.e. a mean value of columns 2 to 6) in a barplot I generate with this script:

    t1 <- tapply(A, B, sum)
    t1[8] <- mean(t1[2:6])
    t1 <- as.table(t1)
    barplot(t1, ylim=c(0,3000))
    mtext("Var1", side = 1, line = 3)
    mtext("Var2", side = 2, line = 3)

I have been trying to use

    axis(1, at=1:8, labels=c(1,2,3,4,5,6,7,8))

but the labels do not line up underneath the columns. Can someone help me out on this one?

Also, I would like to plot onto each bar the corresponding numerical value, e.g. 1824 on the first bar, etc.

Please note that str(t1) looks like:

    Named num [1:8] 1824 2339 2492 2130 2360 ...
    - attr(*, "names")= chr [1:8] "1" "2" "3" "4" ...

Thanks,
Luca

Mr. Luca Meyer
www.lucameyer.com
IBM SPSS Statistics release 19.0.0
R version 2.12.1 (2010-12-16)
Mac OS X 10.6.5 (10H574) - kernel Darwin 10.5.0
Re: [R] Shortcut key to get to beginniing of line in R?
Try the Home key. Better yet, use a text editor and cut and paste your commands; that lets you type a lot of shorter lines that are more readable and easy to change.

Sent from my iPad

On Jan 9, 2011, at 16:00, casshyr cass...@hotmail.com wrote:
> This is a generalize question: basically, say you are typing a long line of command in R console, and then you realize you forgot to add something in the beginning, is there a way to get to the beginning of the line without pressing the left key on the keyboard and waiting for the cursor to get to the beginning, or using the mouse? I'm using windows version of R.
>
> Thanks!
Re: [R] Lattice, combine histogram and line graph
On Jan 9, 2011, at 8:13 PM, Jim Burke wrote:
> Hello everyone,
>
> I have a simple histogram of gasoline prices going back a few years that I want to insert a line graph of consumer price index (cpi) over the histogram. I have looked through the Lattice book by Deepayan Sarkar but don't see anything there. How might this be done? An example would be wonderful.
>
> Current code snippet follows. For example additional field to add as a line graph would be a cpi calculation like gas_data$regular * (2010_cpi / gas_data$year ).
>
>     xyplot(regular ~ as.Date(gas_data$dates, "%b %d, %Y"), data = gas_data, type = c("g", "h"))

http://finzi.psych.upenn.edu/R/library/latticeExtra/html/doubleYScale.html

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
Re: [R] Shortcut key to get to beginniing of line in R?
On Jan 9, 2011, at 4:00 PM, casshyr wrote:
> This is a generalize question: basically, say you are typing a long line of command in R console, and then you realize you forgot to add something in the beginning, is there a way to get to the beginning of the line without pressing the left key on the keyboard and waiting for the cursor to get to the beginning, or using the mouse? I'm using windows version of R.

As far as I know, all versions accept Ctrl-A for that purpose.

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
Re: [R] Shortcut key to get to beginniing of line in R?
On 01/09/2011 04:00 PM, casshyr wrote:
> This is a generalize question: basically, say you are typing a long line of command in R console, and then you realize you forgot to add something in the beginning, is there a way to get to the beginning of the line without pressing the left key on the keyboard and waiting for the cursor to get to the beginning, or using the mouse? I'm using windows version of R.
>
> Thanks!

Try Ctrl+A or the Home key. Ctrl+E should put you at the end of the line.

Jason
Re: [R] Getting total bar's label value labels in a barplot
Try looking at the barplot function and notice that it returns the values of the bar midpoints. You should use those instead of at=1:8.

On Jan 9, 2011, at 12:03 PM, Luca Meyer wrote:
> Hi,
>
> I have been trying to get the label under the total column - i.e. a mean value of columns 2 to 6 - in a barplot I generate with this script:
>
>     t1 <- tapply(A, B, sum)
>     t1[8] <- mean(t1[2:6])
>     t1 <- as.table(t1)
>     barplot(t1, ylim=c(0,3000))
>     mtext("Var1", side = 1, line = 3)
>     mtext("Var2", side = 2, line = 3)
>
> I have been trying to use axis(1, at=1:8, labels=c(1,2,3,4,5,6,7,8)) but I get labels not standing underneath the columns...can someone help me out on this one?
>
> Also, I would like to plot onto each bar the corresponding numerical value - e.g. 1824 on the first bar, etc.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
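The advice above can be sketched as follows. The first five bar heights are the ones shown in the question; the values for bars 6 and 7 are made up because the original output elides them, and the label "Mean 2-6" is a hypothetical name for the total bar.

```r
# Bars 1-5 from the question; bars 6 and 7 are made-up placeholders
t1 <- c("1" = 1824, "2" = 2339, "3" = 2492, "4" = 2130, "5" = 2360,
        "6" = 2200, "7" = 1900)
t1["Mean 2-6"] <- mean(t1[2:6])

# barplot() returns the x positions of the bar midpoints
mids <- barplot(t1, ylim = c(0, 3000), names.arg = rep("", length(t1)))

# Category labels centred under each bar, using the midpoints rather than 1:8
axis(1, at = mids, labels = names(t1), tick = FALSE)

# Numeric value printed just above each bar
text(mids, t1, labels = round(t1), pos = 3)

mtext("Var1", side = 1, line = 3)
mtext("Var2", side = 2, line = 3)
```

With the default bar width and spacing the midpoints are not the integers 1 to 8, which is why labels placed at 1:8 drift away from the bars.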
Re: [R] Aggragating subsets of data in larger vector with sapply
Split the data by truncating the time to a second, then process each group. This will save the subsetting you are doing. Also merge the data so that direction and size are in the same frame. It looks like you can subset by "buy" to begin with.

Sent from my iPad

On Jan 9, 2011, at 19:10, rivercode aqua...@gmail.com wrote:
> Have 40,000 rows of buy/sell trade data and am trying to add up the buys for each second, the code works but it is very slow. Any suggestions how to improve the sapply function ?
>
>     secEP = endpoints(xSym$Direction, "secs")
>     d = xSym$Direction
>     s = xSym$Size
>     buySize = sapply(1:(length(secEP)-1), function(y) {
>         i = (secEP[y] + 1):secEP[y+1]
>         return(sum(as.numeric(s[i][d[i] == "buy"])))
>     })
>
> Thanks,
> Chris
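The suggestion above can be sketched in base R without xts. The six rows are the ones printed in the question; holding them in plain vectors (times, direction, size) instead of an xts object is an assumption made to keep the example self-contained.

```r
# The six trades shown in the question
times <- as.POSIXct(c("2011-01-05 09:30:00", "2011-01-05 09:30:02",
                      "2011-01-05 09:30:02", "2011-01-05 09:30:04",
                      "2011-01-05 09:30:04", "2011-01-05 09:30:04"),
                    tz = "UTC")
direction <- c("unkn", "sell", "buy", "buy", "buy", "buy")
size      <- c(865, 100, 100, 100, 100, 41)

# Subset to buys once, up front
buys <- direction == "buy"

# Truncate each timestamp to the second and use it as the grouping key
sec <- format(times[buys], "%Y-%m-%d %H:%M:%S")

# One vectorised pass: total buy size per second
buySize <- tapply(size[buys], sec, sum)
print(buySize)
```

Unlike the sapply() version, seconds with no buys do not appear in the result at all, so they would need to be filled in with zeros if a complete series is required.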
Re: [R] Lattice, combine histogram and line graph
Hi Jim,

Some example data would help us. I typically think of a histogram as the frequency of values falling within a certain range (determined by bins). Since they are univariate plots, I'm not sure how you are planning on adding a line graph to that. If you just want bars of the average gasoline price at different years, perhaps something along these lines would work for you:

    ## Load required packages
    require(lattice)
    require(latticeExtra)

    ## Sample Data
    dat <- data.frame(year = 1996:2010, x1 = rnorm(15, 3, .2), x2 = rnorm(15, 200, 1))

    ## Base xyplot (not a histogram) adding a layer with different y axis
    xyplot(x1 ~ year, data = dat, type = "h") +
      as.layer(xyplot(x2 ~ year, data = dat, type = "l", col = "black"), y.same = FALSE)

    ## See
    ?xyplot
    ?as.layer
    ?hist  # for info about histograms

HTH, Josh

On Sun, Jan 9, 2011 at 5:13 PM, Jim Burke j.bu...@earthlink.net wrote:

Hello everyone, I have a simple histogram of gasoline prices going back a few years that I want to insert a line graph of consumer price index (cpi) over the histogram. I have looked through the Lattice book by Deepayan Sarkar but don't see anything there. How might this be done? An example would be wonderful. Current code snippet follows. For example additional field to add as a line graph would be a cpi calculation like gas_data$regular * (2010_cpi / gas_data$year ).

    xyplot(regular ~ as.Date(gas_data$dates, "%b %d, %Y"), data = gas_data, type = c("g", "h"))

Thanks, Jim Burke

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
Re: [R] Normal Distribution Quantiles
Just to add to the silly solutions, here's how I would have done it...

    mu <- 40
    sdev <- 10
    days <- 100:120  # range to explore
    p <- 0.8
    days[ match(TRUE, qnorm(0.2, mu*days, sqrt(sdev * sdev * days)) >= 4000) ]

Michael

On 9 January 2011 08:48, Bert Gunter gunter.ber...@gene.com wrote:

If I understand what you have said below, it looks like you do NOT have the problem solved manually. You CAN use qnorm, and when you do so, your equation yields a simple quadratic which, of course, has an exact solution that you can calculate in R. Of course, one can use uniroot or whatever to solve the quadratic; or simulation or interpolation using pnorm. But other than the R practice, these are unnecessary and, in this case, a bit silly.

Cheers, Bert

On Sat, Jan 8, 2011 at 6:25 AM, Rainer Schuermann rainer.schuerm...@gmx.net wrote:

Sounds like homework, which is not an encouraged use of the Rhelp list. You can either do it in theory...

It is _from_ a homework but I have the solution already (explicitly got that done first!) - this was the pasted LaTeX code (apologies for that, but in plain text it looks unreadable [1], and I thought everybody here has his / her favorite LaTeX editor open all the time anyway...). I'm just looking, for my own advancement and programming training, for a way of doing that in R - which, from your and Dennis' reply, doesn't seem to exist. I would _not_ misuse the list for getting homework done easily, I will not ask learning statistics questions here, and I will always try to find the solution myself before posting something here, I promise!

Thanks anyway for the simulation advice, Rainer

    [1]  (4000 - (40*n))      -329
         ---------------  =  -----
          (10*(n^(1/2)))      200

On Saturday 08 January 2011 14:56:20 you wrote:

On Jan 8, 2011, at 6:56 AM, Rainer Schuermann wrote:

This is probably embarrassingly basic, but I have spent quite a few hours in Google and RSeek without getting a clue - probably I'm asking the wrong questions...

There is this guy who has decided to walk through Australia, a total distance of 4000 km. His daily portion (mean) is 40 km with an sd of 10 km. I want to calculate the number of days it takes to arrive with 80, 90, 95, 99% probability. I know how to do this manually, e.g. for 95%

    $\Phi \left( \frac{4000-40n}{10 \sqrt{n}} \right) \leq 0.05$

find the z score... but how would I do this in R? Not qnorm(), but what is it?

Sounds like homework, which is not an encouraged use of the Rhelp list. You can either do it in theory or you can simulate it. Here's a small step toward a simulation approach.

    cumsum(rnorm(100, mean=40, sd=10))
     [1]   41.90617   71.09148  120.55569  159.56063  229.73167  255.35290  300.74655
    snipped
    [92] 3627.25753 3683.24696 3714.11421 3729.41203 3764.54192 3809.15159 3881.71016
    [99] 3917.16512 3932.00861

    cumsum(rnorm(100, mean=40, sd=10))
     [1]   38.59288   53.82815  111.30052  156.58190  188.15454  207.90584  240.64078
    snipped
    [92] 3776.25476 3821.90626 3876.64512 3921.16797 3958.83472 3992.33155 4045.96649
    [99] 4091.66277 4134.45867

The first realization did not make it in the expected 100 days so further efforts should extend the simulation runs to maybe 120 days. The second realization had him making it on the 98th day. There is an R replicate() function available once you get a function running that will return a specific value for an instance. This one might work:

    min(which(cumsum(rnorm(120, mean=40, sd=10)) >= 4000))
    [1] 97

If you wanted a forum that does not explicitly discourage homework and would be a better place to ask theory and probability questions, there is CrossValidated: http://stats.stackexchange.com/faq

Thanks in advance, and apologies for the level of question... Rainer

David Winsemius, MD
West Hartford, CT

--
Bert Gunter
Genentech Nonclinical Biostatistics
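For reference, a sketch of the exact quadratic solution Bert alludes to; the function name days_needed and its argument names are made up here. It assumes independent daily distances that are normal with mean 40 and sd 10, so the total after n days is normal with mean 40n and sd 10*sqrt(n). Requiring P(total >= 4000) >= p gives 40n - z*10*sqrt(n) - 4000 >= 0 with z = qnorm(p), which is a quadratic in sqrt(n):

```r
# Days needed so that P(total distance >= target) >= p, assuming independent
# normal daily distances with mean mu and standard deviation sdev.
days_needed <- function(p, target = 4000, mu = 40, sdev = 10) {
  z <- qnorm(p)
  # Solve mu * s^2 - z * sdev * s - target = 0 for s = sqrt(n); keep the positive root
  s <- (z * sdev + sqrt((z * sdev)^2 + 4 * mu * target)) / (2 * mu)
  s^2
}

probs <- c(0.80, 0.90, 0.95, 0.99)
n_exact <- days_needed(probs)
data.frame(p = probs, n = n_exact, whole_days = ceiling(n_exact))
```

The fractional n is where the inequality holds with equality; rounding up gives the first whole day that meets the target probability.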
Re: [R] Lattice, combine histogram and line graph
On Sun, Jan 9, 2011 at 8:13 PM, Jim Burke j.bu...@earthlink.net wrote:

Hello everyone, I have a simple histogram of gasoline prices going back a few years that I want to insert a line graph of consumer price index (cpi) over the histogram. I have looked through the Lattice book by Deepayan Sarkar but don't see anything there. How might this be done? An example would be wonderful. Current code snippet follows. For example additional field to add as a line graph would be a cpi calculation like gas_data$regular * (2010_cpi / gas_data$year ).

    xyplot(regular ~ as.Date(gas_data$dates, "%b %d, %Y"), data = gas_data, type = c("g", "h"))

xyplot.zoo in the zoo package has facilities for drawing multiple time series using lattice graphics, either on one panel or in different panels. The documentation has many examples.

    library(zoo)
    library(lattice)
    z1 <- zoo(1:6)
    z2 <- z1^2
    z <- cbind(z1, z2)
    xyplot(z)              # different panels
    xyplot(z, screen = 1)  # all on one panel
    ?xyplot.zoo
    example(xyplot.zoo)
    vignette(package = "zoo")  # lists available vignettes

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Re: [R] Anova with repeated measures for unbalanced design
Dear list members,

I searched the previous threads of this mailing list extensively for an example that I could understand and that fits my problem, but I didn't succeed in finding a solution I feel certain about. At the moment I am stuck with this solution for my unbalanced design, and I don't know if it is correct or not:

    library(nlme)
    scrd.lme <- lme(response ~ stimulus*condition, random = ~1|subject, data = scrd)

Now at this point is it correct to simply run the command anova(scrd.lme)? Or should I do something different using aov? As I said, for a balanced case I would use the command

    aov1 = aov(response ~ stimulus*condition + Error(subject/(stimulus*condition)), data=scrd)

Now in the R prompt I get this output, which is very different from the one aov gives for a balanced case:

    anova(scrd.lme)
                       numDF denDF   F-value p-value
    (Intercept)            1   182 178.56833  <.0001
    stimulus               6   182   1.57851  0.1557
    condition              1   182  39.68822  <.0001
    stimulus:condition     6   182   0.67992  0.6660

Now, (if the previous point is correct) I am burning with curiosity to solve another problem. I found that there are significant differences for condition. For condition I have only 2 levels so there is no need to do a post-hoc analysis. But if I had 4 levels, I would need one. Now, which is the right approach for a post-hoc test for the ANOVA with repeated measures with an UNBALANCED design? Is there anyone who can provide an R code example to solve this problem so I can better understand? I know, I should read some books to understand the subject better. I am doing my best.

Thanks for any suggestion

From: Ben Bolker bbol...@gmail.com
To: r-h...@stat.math.ethz.ch
Sent: Sat, January 8, 2011 9:39:20 PM
Subject: Re: [R] Anova with repeated measures for unbalanced design

Frodo Jedi frodo.jedi at yahoo.com writes:

Dear all, I need help because I am really not able to find on the internet a good example in R of analyzing an unbalanced table with Anova with repeated measures. By unbalanced table I mean that the questions are not all answered by the same number of subjects. For a balanced case I would use the command

    aov1 = aov(response ~ stimulus*condition + Error(subject/(stimulus*condition)), data=scrd)

I recommend that you find a copy of Pinheiro and Bates 2000 ...

Does the same code still work for unbalanced design?

No, it doesn't.

Can anyone provide an R example of code in order to get the same analysis?

Something like

    library(nlme)
    lme1 <- lme(response ~ stimulus*condition, random = ~1|subject, data = scrd)

or possibly

    lme1 <- lme(response ~ stimulus*condition, random = ~stimulus*condition|subject, data = scrd)

if your experimental design would allow you to estimate stimulus and condition effects for each subject. Further questions along these lines should probably go to the r-sig-mixed-model mailing list.
Re: [R] Lattice, combine histogram and line graph
Thanks Josh, Gabor, and David,

I appreciate your suggestions and the time you took to think about this. This was all most helpful. Gabor, I will look at the zoo package soon. Sounds interesting. Below is what worked for me, from Josh, to overlay a line graph on a histogram.

    obj1 <- xyplot(regular ~ as.Date(gas_data$dates, "%b %d, %Y"),
                   data = gas_data, type = c("g", "h"))
    obj2 <- xyplot((gas_data$regular * (cpi_2010 / gas_data$cpi)) ~ as.Date(gas_data$dates, "%b %d, %Y"),
                   data = gas_data, type = c("l"), col = "black")
    obj1 + as.layer(obj2, style = 2, axes = NULL)

Have a great week, Jim
Re: [R] Operating on count lists of non-equal lengths
Dear Dennis,

Thank you so very much for your quick reply. What an introduction to R-help!! I especially appreciated the time you took to explain the code privately. After a few hiccups I got it working on my data as well.

Regards, - Kari

Quoting Dennis Murphy djmu...@gmail.com:

Hi: This is an abridged version of the reply I sent privately to the OP.

Generate an artificial data frame

    # function to randomly generate one of the Q* columns with length 1000
    mysamp <- function() sample(c(-1, 0, 1, NA), 1000, prob = c(0.35, 0.2, 0.4, 0.05), replace = TRUE)

    # use above function to randomly generate 10 questions and assign them names in the workspace
    for(i in 1:10) assign(paste('Q', i, sep = ''), mysamp())

    # create a data frame from the generated questions
    C <- data.frame(time = rep(1:4, each = 250),
                    sector = sample(LETTERS[1:6], 1000, replace = TRUE),
                    Q1, Q2, Q3, Q4, Q5, Q6, Q7, Q8, Q9, Q10)

    # A function to generate the scores from the combined questions
    # for an arbitrary input data frame d:
    scorefun <- function(d) {
        dm <- matrix(unlist(apply(d, 2, table)[-(1:2)]), nrow = 3)
        tsums <- cbind(rowSums(dm[, 1:3]), dm[, 4], rowSums(dm[, 5:6]),
                       rowSums(dm[, 7:8]), rowSums(dm[, 9:10]))
        dprop <- function(x) (x[3] - x[1])/sum(x)
        100 * (1 + apply(tsums, 2, dprop))
    }

    library(plyr)
    # Apply scorefun() to each sub-data frame corresponding to time-sector combinations
    ddply(C, .(time, sector), scorefun)

Dennis

On Sat, Jan 8, 2011 at 10:19 PM, Kari Manninen k...@econadvisor.com wrote:

This is my first post to R-help and I look forward to receiving some advice for a novice like me...

I've got simple repeated (4 periods so far) 10-question survey data that is very easy to work on in Excel. However, I'd like to move the compilation to R but I'm having some trouble operating on count list data in a neat way.

The data C

    str(C)
    'data.frame': 551 obs. of 13 variables:
     $ TIME   : int 1 1 1 1 1 1 1 1 1 1 ...
     $ Sector : Factor w/ 6 levels D,F,G,H,..: 1 1 1 1 1 1 1 1 1 1 ...
     $ COMP   : Factor w/ 196 levels (_ __ _) ,..: 73 133 128 109 153 147 56 26 142 34 ...
     $ Q1     : int 0 0 1 1 0 -1 -1 1 1 -1 ...
     $ Q2     : int 0 0 0 -1 0 -1 0 0 1 -1 ...
     $ Q3     : int 0 0 0 1 0 -1 -1 1 1 -1 ...
     $ Q4     : int -1 0 0 0 0 -1 0 -1 0 -1 ...
     $ Q5     : int 0 0 0 -1 0 -1 0 -1 0 0 ...
     $ Q6     : int 0 0 0 1 0 -1 0 -1 0 0 ...
     $ Q7     : int 0 1 1 0 0 0 1 0 1 1 ...
     $ Q8     : int 0 0 0 0 0 -1 0 0 1 0 ...
     $ Q9     : int 0 1 0 0 0 -1 0 -1 1 -1 ...
     $ Q10    : int 0 0 0 0 -1 -1 0 -1 0 0 ...

    summary(C)
          TIME       Sector       COMP           Q1               Q2
     Min.   :1.000   D:130   A      :  4   Min.   :-1.000   Min.   :-1.
     1st Qu.:2.000   F:126   B      :  4   1st Qu.: 0.000   1st Qu.: 0.
     Median :3.000   G:158   C      :  4   Median : 1.000   Median : 0.
     Mean   :2.684   H: 26   D      :  4   Mean   : 0.446   Mean   : 0.2178
     3rd Qu.:4.000   I: 20   E      :  4   3rd Qu.: 1.000   3rd Qu.: 1.
     Max.   :4.000   J: 91   F      :  4   Max.   : 1.000   Max.   : 1.
                             (Other):527   NA's   :60.000   NA's   :69.

The aim is to produce balance scores between the shares of positive and negative answers in the data.

First, counts of -1, 0 and 1 (negative, neutral, positive) and missing NA (it would be so much simpler without the missing values) for each question Q1-Q10 for each period (TIME) in 6 Sectors:

    b <- apply(C[,4:13], 2, function (x) tapply(x, C[,1:2], count))

I know that b is a list of data.frames dim(4x6) for each question, where each cell is a count list. For example, for Question 1, Time period 2, Sector 1:

    str(b$Q1[2,1])
    List of 1
     $ :'data.frame': 4 obs. of 2 variables:
      ..$ x    : int [1:4] -1 0 1 NA
      ..$ freq : int [1:4] 3 9 12 2

Now I would like to group questions (C[, 4:6], C[, 7], C[8:9], C[10:11] and C[, 12:13]) and sum counts (-1, 0, 1) for these groups and present them in percentage terms. I don't know how to do this efficiently for the whole data. I would not like to go through each cell separately.

Then I'd give each group a balance score based on something like: Score = 100 + 100*[pos% - neg%] for each group by TIME, Sector, while excluding the missing observations.

    ### This is not working
    Score <- 100 + 100*[sum(count( ==1)/sum(count(list( -1, 0,1) - sum(count( ==-1)/sum(count(list( -1, 0,1)]

for each of the 5 groups defined above and by TIME, Sector.

I would greatly appreciate your help on this.

Regards, - Kari Manninen
Re: [R] Problems with glht function for lme object
Dear all, Thanks for your input. It works fine now. All it took was for me to tidy up the workspace. Simple solution, that I should have considered earlier. Regards, Andreas
Re: [R] Normal Distribution Quantiles
Altogether I got five "more or less silly" solutions (not my judgment!), some of them further discussed in private mail, for a problem where my expectation was to get a simple one-liner back: "Check ?clt" or so... Fortunately, with all of them I seem to arrive at a result that is consistent with what my expression evaluates to (104.25), which gives me a great opportunity to play around with the various approaches - brain fodder for quite a few days. Great experience. Thanks to all, Rainer
Re: [R] YourCast Data Format
Thomas Jensen-6 wrote:

... data set ... The data set has in total 27 countries for the years 1999 to 2008, but with unbalanced panels. I want to be able to estimate a model and do forecasting for each country in the data set. I have been looking into the YourCast package from King et al. but since I have all my data in a single file, I am at a loss as to how to create a data object that the yourcast() function will accept.

The base R method uses by() followed by do.call():

    # dt = your data structure from the mail, which has only one country,
    # so the result is a bit confusing
    dt.by = by(dt, dt$Country, function(x) {
        # put your own calculation here
        data.frame(Abstention.Neg = mean(x$Abstention.Neg))
    })
    do.call(rbind, dt.by)

This sequence is not really intuitive, so an add-on industry has evolved, for example in the packages doBy (fast, straightforward) and plyr (can be slow, but comprehensive and consistent). It is best to try the base method first, and work with the packages later.

Dieter
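To make the by() / do.call() sequence concrete, here is a small self-contained sketch with made-up panel data. The country codes, years and values are invented, and the per-country calculation is just a mean standing in for a real model fit:

```r
# Made-up unbalanced panel; column names follow the ones mentioned in the thread
dt <- data.frame(
  Country = rep(c("AT", "BE", "DK"), times = c(3, 2, 4)),
  Year = c(1999:2001, 2005:2006, 2004:2007),
  Abstention.Neg = c(0.10, 0.20, 0.30, 0.50, 0.70, 0.20, 0.20, 0.40, 0.40)
)

# by() hands each country's rows to the function as a data frame
dt.by <- by(dt, dt$Country, function(x) {
  data.frame(Country = x$Country[1],
             n.years = nrow(x),
             Abstention.Neg = mean(x$Abstention.Neg))
})

# do.call(rbind, ...) stacks the one-row results into a single data frame
res <- do.call(rbind, dt.by)
res
```

The function passed to by() can return anything, for instance a fitted model per country; do.call(rbind, ...) only makes sense when each piece is a row-like data frame.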
[R] Getting stable solution while applying 3 parameter model (tpm) on response data
I have started exploring the potential of R for applying IRT to a dataset. I have data on about 20k students who took a Maths test, diagnostic in nature. I find that I don't get a stable solution while using tpm(), even after passing the argument start.val = "random". What should be done in this case to achieve a stable solution? The estimated item parameters would not be sensible when a stable solution is not reached. Moreover, I find that the estimated discrimination parameter of one of the items is negative in that case. (I used tpm() from the ltm package.)
[R] Capturing error messages while running R through an external program
I am trying to automate scoring done by applying the 3 parameter model of IRT to the response data. I call R from PHP to do this using exec(). Since I face convergence issues while using R, it is important that the program captures error messages like "the solution is not stable" given by R while running tpm(). How does one do that? I find sink() of no use for this.

Regards, Maulik
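One possible approach, offered as a sketch rather than a tested recipe for tpm(): wrap the fit in R's condition handlers so that warnings and errors are collected as text and written out explicitly. The helper name run_safely and the messages below are made up; the stand-in expressions would be replaced by the real model call.

```r
# Evaluate an expression, returning its value plus any warning and error text
run_safely <- function(expr) {
  warns <- character(0)
  result <- withCallingHandlers(
    tryCatch(list(value = expr, error = NA_character_),
             error = function(e) list(value = NULL, error = conditionMessage(e))),
    warning = function(w) {
      warns <<- c(warns, conditionMessage(w))   # remember the warning text
      invokeRestart("muffleWarning")            # then carry on with the evaluation
    }
  )
  result$warnings <- warns
  result
}

# Stand-ins for a fit that warns and a fit that fails (made-up messages)
fit <- run_safely({ warning("solution is not stable (made-up message)"); 42 })
bad <- run_safely(stop("fit failed (made-up message)"))

# message() writes to stderr, so the caller can separate it from normal output
if (length(fit$warnings) > 0) message("WARNING: ", fit$warnings)
if (!is.na(bad$error)) message("ERROR: ", bad$error)
```

On the PHP side, exec() only collects standard output, so stderr would need to be redirected (for example by appending 2>&1 to the command) or the messages printed with cat() instead.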
[R] Step command failing for lm function
Hi, I have a fairly simple linear regression using the lm function. There are about 100 variables and 30,000 rows of data. It runs fine and produces a decent looking R2 value. I'm interested in performing a stepwise variable selection to see if things can be cleaned up a bit. Calling the step function returns ONE iteration (all the variables) and then stops. No errors are reported. Can someone suggest why this might not be working as expected? (Normally this function steps through all the variables to find the best combination.) Thanks! -N