Re: [R] R exponential regression
Chris, I haven't seen anyone post a reply yet so thought I'd throw in my thoughts. I'm no R expert! When you talk about an exponential trend line, are you referring to: 1) y = a*x^b or 2) y = a*e^(bx)? If 1), take base-10 logs of both y and x and then fit them with simple linear regression. Then take the antilog of the fitted values and plot these as your trendline. If 2), take the natural log of y only (leave x as it is) and follow the rest of the procedure described in 1). Hope this helps.

Murray M Cooper, Ph.D. Richland Statistics 9800 N 24th St Richland, MI, USA 49083 Mail: richs...@earthlink.net

- Original Message - From: "chrisli1223" To: Sent: Thursday, January 07, 2010 10:33 PM Subject: [R] R exponential regression

Hi all, I have a dataset which consists of 2 columns. I'd like to plot them on an x-y scatter plot and fit an exponential trendline. I'd like R to determine the equation for the trendline and display it on the graph. Since I am new to R (and statistics), any advice on how to achieve this will be greatly appreciated. Many thanks, Chris

-- View this message in context: http://n4.nabble.com/R-exponential-regression-tp1009449p1009449.html Sent from the R help mailing list archive at Nabble.com.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
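Murray's recipe for case 2) can be sketched in a few lines of R. The data below are simulated stand-ins for Chris's two columns; `x` and `y` are assumed names.

```r
# Fit y = a*exp(b*x) by regressing log(y) on x, then back-transform
# the fitted curve. Simulated stand-in data:
set.seed(1)
x <- 1:20
y <- 2 * exp(0.3 * x) * exp(rnorm(20, sd = 0.1))

fit <- lm(log(y) ~ x)            # ln(y) = ln(a) + b*x
a <- exp(coef(fit)[[1]])
b <- coef(fit)[[2]]

plot(x, y)
lines(x, a * exp(b * x))         # the exponential trendline
title(main = sprintf("y = %.2f * exp(%.2f * x)", a, b))
```

Note that fitting on the log scale minimizes multiplicative (relative) error; for additive error, `nls(y ~ a * exp(b * x), ...)` would be the alternative.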
[R] Thanks to all who commented on my code
As I said, I am new to R after spending far too many years using SAS. I'm slowly getting the hang of R and like it very much. Thanks for your insights and help.

Murray M Cooper, Ph.D. Richland Statistics 9800 N 24th St Richland, MI, USA 49083 Mail: richs...@earthlink.net
[R] Using sapply to build a count matrix
Dear All, I am new to R and slowly learning how to use the system. The following code is an exercise I was trying. The intent is to generate 10 random samples of size 5 from a vector with the integers 1:10 and 2 missing values. I then want to generate a matrix which, for each sample, shows the frequency of missing values (NA) in that sample. My solution, using sapply, is at the end. If anyone has the time and/or interest to critique my method I'd be very grateful. I'm especially interested in knowing if there is a better way to accomplish this problem.

> (x <- replicate(10, sample(c(1:10, rep(NA, 2)), 5)))
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    3   NA    3    4    2   10   NA    4    5     4
[2,]    5    7    7    3    9    2    8   NA    7     9
[3,]   NA    8    1    5   NA    7   10    2   NA     6
[4,]    2   NA    6   10    8    4    4    7    4     7
[5,]    7    9   10    8    3    6    1   NA    9    NA

# Since table will return only a single item of value FALSE
# if there are no missing values (NA) in a sample, sapply
# will return a list and not a matrix.
# So to get a matrix, the factor function needs to be used
# to identify the possible results (FALSE, TRUE) for the table
# function.

> sapply(1:10, function(i) table(factor(is.na(x[,i]), c(FALSE, TRUE))))
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
FALSE    4    3    5    5    4    5    4    3    4     4
TRUE     1    2    0    0    1    0    1    2    1     1

Thanks for your thoughts.

Murray M Cooper, Ph.D. Richland Statistics 9800 N 24th St Richland, MI, USA 49083 Mail: richs...@earthlink.net
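The sapply approach works; as a sketch of two alternatives, `apply()` over columns avoids the `1:10` indexing, and if only the NA count per sample is needed, `colSums()` is simpler still:

```r
set.seed(42)
x <- replicate(10, sample(c(1:10, rep(NA, 2)), 5))

# Same FALSE/TRUE count matrix, applying over columns directly:
counts <- apply(x, 2, function(col) table(factor(is.na(col), c(FALSE, TRUE))))

# If only the number of NAs per sample is needed:
na.per.sample <- colSums(is.na(x))
```

The `"TRUE"` row of `counts` and `na.per.sample` agree, and each column of `counts` sums to the sample size, 5.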
Re: [R] Function recommendation for this study...
Paul, I suggest looking up "observer agreement". The description of your study sounds like a classical categorical observer agreement problem. I can't give a reference off the top of my head, but if you get stuck, e-mail me and I'll try and find a ref to get you started. Murray M Cooper, Ph.D. Richland Statistics 9800 N 24th St Richland, MI, USA 49083 Mail: richs...@earthlink.net - Original Message - From: "Paul Heinrich Dietrich" To: Sent: Sunday, May 10, 2009 8:25 AM Subject: [R] Function recommendation for this study... Hi, I'm not used to thinking along these lines, and wanted to ask your advice: Suppose you have a sample of around 100, consisting of patients according to doctors, in which patients and doctors are given a questionnaire with categorical responses. Each patient somehow has roughly 3 doctors, or 3 rows of data. The goal is to assess by category of each question or DV the agreement between the patient and 3 doctors. For example, a question may be asked about how well the treatment is understood by the patient, and the patient answers with their perception, while the 3 doctors each answer with their perception. The person currently working on this has used a Wilcoxon Sign Rank test, and asked what I thought. Personally, I shy away from nonparametrics and prefer parametric Bayesian methods, but of course am up for whatever is most appropriate. I was concerned about using multiple Wilcoxon tests, one for each question, and wondering if there is a parametric method in R for something like this, and a method which is multivariate? Thanks for any suggestions. -- View this message in context: http://www.nabble.com/Function-recommendation-for-this-study...-tp23469646p23469646.html Sent from the R help mailing list archive at Nabble.com. 
Re: [R] large factorials
You don't say what the error was for the R factorial function, but it is probably irrelevant for your question. Factorials get to be big numbers rather quickly, and unless you are using a program that does arbitrary precision arithmetic you will quickly exceed the precision limits for storing a number. If you have Maple, do 170! and count the number of digits in the result. You will see what I mean. There are some tricks when working with large factorials, depending on what you are doing with them. I'd first try the log factorial function in R; I think it's called lfactorial. Just do ?factorial and you'll find the documentation. If this doesn't work for you, repost with a clear description of what you're trying to do and someone may be able to help.

Murray M Cooper, Ph.D. Richland Statistics 9800 N 24th St Richland, MI, USA 49083 Mail: richs...@earthlink.net

- Original Message - From: "molinar" To: Sent: Wednesday, April 22, 2009 3:21 PM Subject: [R] large factorials

I am working on a project that requires me to do very large factorial evaluations. On R the built-in factorial function and the one I created both are not able to do factorials over 170. The first gives an error and mine returns Inf. Is there a way to have R do these larger calculations (the calculator in accessories can do 1 factorial and Maple can do even larger)

-- View this message in context: http://www.nabble.com/large-factorials-tp23175816p23175816.html Sent from the R help mailing list archive at Nabble.com.
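A short sketch of the log-scale trick Murray mentions: `lfactorial(n)` returns log(n!), which stays well within double precision even where `factorial(n)` overflows (anything above 170).

```r
# factorial() overflows double precision above 170!
stopifnot(is.infinite(suppressWarnings(factorial(171))))

# lfactorial() works on the log scale and does not overflow:
log200 <- lfactorial(200)              # log(200!)
digits <- floor(log200 / log(10)) + 1  # decimal digits in 200!
digits                                 # 200! has 375 digits
```

Ratios of big factorials (e.g. binomial coefficients) can likewise be computed as `exp(lfactorial(n) - lfactorial(k) - lfactorial(n - k))` without ever forming the huge intermediates.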
Re: [R] [OT ?] rant (was : Re: Conversions From standard to metricunits)
For science, yes. For pleasure I'll still take a pint instead of 570 ml! Murray

- Original Message - From: "Rolf Turner" To: "Emmanuel Charpentier" Cc: Sent: Friday, April 03, 2009 6:18 PM Subject: Re: [R] [OT ?] rant (was : Re: Conversions From standard to metric units)

On 4/04/2009, at 10:37 AM, Emmanuel Charpentier wrote:

On Friday, April 3, 2009 at 14:17 -0400, stephen sefick wrote: I am starting to use R for almost any sort of calculation that I need. I am a biologist that works in the states, and there is often a need to convert from standard units to metric units.

US/Imperial units are *not* standard units. The former "metric system" is now called "Système International" (International System) for a reason, which is *not* gallocentrism of a "few" 6e7 frogs, but rather laziness of about 5.6e9 losers who refuse to load their memories with meaningless conversion factors...

Right on, Red Freak!!! cheers, Rolf
Re: [R] Burt table from word frequency list
The usual approach is to count the co-occurrence of words within so many words of each other. Typical is between 5 words before and 5 words after a given word. So for each word in the document, you look for the occurrence of all other words within -5 -4 -3 -2 -1 0 1 2 3 4 5 words. Depending on the language and the question being asked, certain words may be excluded. This is not a simple function! I don't know if anyone has done a package for this type of analysis, but with over 2000 packages floating around you might get lucky.

Murray M Cooper, Ph.D. Richland Statistics 9800 N 24th St Richland, MI, USA 49083 Mail: richs...@earthlink.net

- Original Message - From: "Ted Harding" To: "Joan-Josep Vallbé" Cc: Sent: Sunday, March 29, 2009 2:46 PM Subject: Re: [R] Burt table from word frequency list

On 29-Mar-09 16:32:11, Joan-Josep Vallbé wrote: Ok, thank you. And is there any function to get the table directly from the original corpus? best, joan-josep vallbé

You will have to think about what you are doing. As Duncan said, you need "counts of pairs of words" or, more precisely, of co-occurrence. But co-occurrence within what? Adjacent? Within the same sentence? Within the same paragraph? Within the same chapter? Within the same document (if your corpus incorporates several documents)? Within documents by the same author? If so, then is there an additional classification by individual document? Etc., etc., etc. In short, what is the structure of your corpus, and how do you wish this to be represented in the Burt table? Hoping this helps to move you forward, Ted.

On Mar 29, 2009, at 2:00 PM, Duncan Murdoch wrote: On 29/03/2009 7:02 AM, Joan-Josep Vallbé wrote: Dear all, I have a word frequency list from a corpus (say, in .csv), where the first column is a word and the second is the occurrence frequency of that word in the corpus.
Is it possible to obtain a Burt table (a table crossing all words with each other, i.e., where rows and columns are the words) from that frequency list with R? I'm exploring the "ca" package but I'm not able to solve this detail.

No, because you don't have any information on that. You only have marginal counts. You need counts of pairs of words (from the original corpus, or already summarized.) Duncan Murdoch

E-Mail: (Ted Harding) Fax-to-email: +44 (0)870 094 0861 Date: 29-Mar-09 Time: 18:46:40 -- XFMail --
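Murray's ±5-word window can be sketched directly in base R (this is an illustration, not a package function; the toy `words` vector is made up):

```r
# Count co-occurrences of words within a +/- 5-word window,
# building a term-by-term (Burt-style) matrix.
words <- c("the", "cat", "sat", "on", "the", "mat", "the", "cat")
vocab <- unique(words)
cooc  <- matrix(0, length(vocab), length(vocab),
                dimnames = list(vocab, vocab))
win <- 5
for (i in seq_along(words)) {
  lo <- max(1, i - win)
  hi <- min(length(words), i + win)
  for (j in setdiff(lo:hi, i)) {
    cooc[words[i], words[j]] <- cooc[words[i], words[j]] + 1
  }
}
# The window is symmetric, so cooc is a symmetric matrix; the
# diagonal counts a word co-occurring with other copies of itself.
```

On a real corpus one would tokenize first and drop stop words, exactly as Murray notes; the double loop would also be replaced by something vectorized for speed.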
Re: [R] Fourier Analysis Help
Just a word of caution. Having done a lot of work with 24-hour blood pressure and ECG recordings, these series are seldom stationary, which presents problems for spectral analysis. I don't know what your ultimate goal is, but in my work I found it often better to work with subsets of the series, chosen according to the needs of the study. There is an extensive literature on this type of study. If you haven't done so, I encourage you to do a literature search. Murray

- Original Message - From: "Vittorio Colagrande" To: Sent: Friday, March 13, 2009 2:46 PM Subject: [R] Fourier Analysis Help

Dear R-help members, To whom it may concern, our research group is conducting a study to evaluate the predictive value of 24 hour blood pressure variability. We are looking for an R routine that performs a fast Fourier transform spectral analysis (with an output of the approximation function of the Fourier and estimates the validity of the model for the various harmonics). Thanks Vittorio Colagrande
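Base R can already produce the periodogram Vittorio asks about; here is a minimal sketch on simulated blood-pressure-like data (the 5-minute sampling scheme and the signal itself are assumptions for illustration):

```r
# Periodogram via spectrum(), which uses the FFT internally.
# Simulated 24-h series sampled every 5 minutes (288 points),
# with one circadian harmonic plus noise:
set.seed(1)
tm <- 1:288
bp <- 100 + 10 * sin(2 * pi * tm / 288) + rnorm(288, sd = 2)

sp   <- spectrum(bp, plot = FALSE)   # raw periodogram (spec.pgram)
peak <- sp$freq[which.max(sp$spec)]  # frequency in cycles per sample
1 / peak                             # dominant period, in samples
```

For the raw transform itself, `fft(bp)` gives the complex Fourier coefficients; note `spectrum()` detrends and tapers by default, which matters for non-stationary series like these.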
Re: [R] windows vs. linux code
I'm deeply disappointed! I keep checking the mailing list to see if you guys are posting answers to questions I haven't asked yet. It would save me a lot of time! Best, Murray

- Original Message - From: "Rolf Turner" To: "Sherri Heck" Cc: Sent: Wednesday, February 25, 2009 8:16 PM Subject: Re: [R] windows vs. linux code

On 26/02/2009, at 2:08 PM, Sherri Heck wrote: Dear All- I have been given some R code that was written using a Linux OS, but I use Windows-based R. The person that is giving it to me said that it needs to run on a Linux system. Does anyone have any insight and/or can verify this? I haven't yet obtained the code, so I haven't been able to try it yet.

Despite the knowledge, wisdom, insight, skill, good looks, and other admirable characteristics of the members of the R-help list, few of us are skilled in telepathy or clairvoyance. cheers, Rolf Turner
Re: [R] Bootstrap or Wilcoxons' test?
First of all, sorry for my typing mistakes. Second, the WRS test is most certainly not a test for unequal medians, although under specified models it would be, just as under specified models it can be a test for other measures of location. Perhaps I did not word my explanation correctly, but I did not mean to imply that it would be a test of equality of variance. It is plain and simple a test for the equality of distributions. When the results of a properly applied parametric test do not agree with the WRS, it is usually due to a difference in the empirical density functions of the two samples.

Murray M Cooper, Ph.D. Richland Statistics 9800 N 24th St Richland, MI, USA 49083 Mail: richs...@earthlink.net

- Original Message - From: "David Winsemius" To: "Murray Cooper" Cc: "Charlotta Rylander" ; Sent: Friday, February 13, 2009 9:19 PM Subject: Re: [R] Bootstrap or Wilcoxons' test?

I must disagree with both this general characterization of the Wilcoxon test and with the specific example offered. First, we ought to spell the author's name correctly and then clarify that it is the Wilcoxon rank-sum test that is being considered. Next, the WRS test is a test for differences in the location parameter of independent samples conditional on the samples having been drawn from the same distribution. The WRS test would have no discriminatory power for samples drawn from the same distribution having equal location parameters but only differing with respect to unequal dispersion. Look at the formula, for Pete's sake. It summarizes differences in ranking, so it is in fact designed NOT to be sensitive to the spread of the values in the sample. It would have no power, for instance, to test the variances of two samples, both with a mean of 0, and one having a variance of 1 with the other having a variance of 3. One can think of the WRS as a test for unequal medians. -- David Winsemius, MD.
MPH Heritage Laboratories

On Feb 13, 2009, at 7:48 PM, Murray Cooper wrote:

Charlotta, I'm not sure what you mean when you say simple linear regression. From your description you have two groups of people, for which you recorded contaminant concentration. Thus, I would think you would do something like a t-test to compare the mean concentration level. Where does the regression part come in? What are you regressing? As for the Wilcoxon test, it is often thought of as a nonparametric t-test equivalent. This is only true if the observations were drawn from a population with the same probability distribution. The null hypothesis of the Wilcoxon test is actually "the observations were drawn from the same probability distribution". Thus if your two samples had, say, different variances, their means could be the same, but since the variances are different, the Wilcoxon could give you a significant result. Don't know if this all makes sense, but if you have more questions, please e-mail your data and a more detailed description of what analysis you used and I'd be happy to try and help out.

Murray M Cooper, Ph.D. Richland Statistics 9800 N 24th St Richland, MI, USA 49083 Mail: richs...@earthlink.net

- Original Message - From: "Charlotta Rylander" To: Sent: Friday, February 13, 2009 3:24 AM Subject: [R] Bootstrap or Wilcoxons' test?

Hi! I'm comparing the differences in contaminant concentration between 2 different groups of people (N=36, N=37). When using a simple linear regression model I found no differences between groups, but when evaluating the diagnostic plots of the residuals I found my independent variable to have deviations from normality (even after log transformation). Therefore I have used bootstrap on the regression parameters (R= 1000 & R=1) and this confirms my results, i.e., no differences between groups (and the distribution is log-normal). However, when using Wilcoxon's rank sum test on the same data set I find differences between groups.
Should I trust the results from bootstrapping or from Wilcoxon's test? Thanks! Regards Lotta Rylander
Re: [R] Bootstrap or Wilcoxons' test?
Charlotta, I'm not sure what you mean when you say simple linear regression. From your description you have two groups of people, for which you recorded contaminant concentration. Thus, I would think you would do something like a t-test to compare the mean concentration level. Where does the regression part come in? What are you regressing? As for the Wilcoxon test, it is often thought of as a nonparametric t-test equivalent. This is only true if the observations were drawn from a population with the same probability distribution. The null hypothesis of the Wilcoxon test is actually "the observations were drawn from the same probability distribution". Thus if your two samples had, say, different variances, their means could be the same, but since the variances are different, the Wilcoxon could give you a significant result. Don't know if this all makes sense, but if you have more questions, please e-mail your data and a more detailed description of what analysis you used and I'd be happy to try and help out.

Murray M Cooper, Ph.D. Richland Statistics 9800 N 24th St Richland, MI, USA 49083 Mail: richs...@earthlink.net

- Original Message - From: "Charlotta Rylander" To: Sent: Friday, February 13, 2009 3:24 AM Subject: [R] Bootstrap or Wilcoxons' test?

Hi! I'm comparing the differences in contaminant concentration between 2 different groups of people (N=36, N=37). When using a simple linear regression model I found no differences between groups, but when evaluating the diagnostic plots of the residuals I found my independent variable to have deviations from normality (even after log transformation). Therefore I have used bootstrap on the regression parameters (R= 1000 & R=1) and this confirms my results, i.e., no differences between groups (and the distribution is log-normal). However, when using Wilcoxon's rank sum test on the same data set I find differences between groups. Should I trust the results from bootstrapping or from Wilcoxon's test? Thanks!
Regards Lotta Rylander
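The two-group comparison being discussed looks like this in R (simulated stand-in data; `conc` and `group` are assumed names, with the log transformation from the original post):

```r
# Two groups of 36 and 37 with log-normal concentrations:
set.seed(1)
conc  <- c(rlnorm(36, meanlog = 1.0, sdlog = 0.5),
           rlnorm(37, meanlog = 1.2, sdlog = 0.5))
group <- factor(rep(c("A", "B"), c(36, 37)))

t.test(log(conc) ~ group)   # parametric comparison on the log scale
wilcox.test(conc ~ group)   # Wilcoxon rank-sum test
```

When the two give conflicting answers, comparing the groups' empirical distributions (e.g. with `qqplot()` or `ks.test()`) helps identify whether shape or spread, rather than location, is driving the Wilcoxon result.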
Re: [R] OT: A test with dependent samples.
David, If you really want to do a test on this data, I would suggest a Fisher's Exact test, but you want to use hypergeometric probabilities. You would probably want to try the CMH test, if the function allows a single table and actually uses hypergeometric probabilities. My suggestion would be to calculate the frequency of vomiting for the treated animals, calculate the CIs, and then use some historical data on the vomiting rate for non-treated cats and see whether it falls inside the CIs for your treated animals. If it does, then you might conclude that the vomiting rate for treated cats is similar to non-treated cats.

Murray M Cooper, Ph.D. Richland Statistics 9800 N 24th St Richland, MI, USA 49083 Mail: richs...@earthlink.net

- Original Message - From: "David Winsemius" To: "Rolf Turner" Cc: "R-help Forum" Sent: Tuesday, February 10, 2009 4:50 PM Subject: Re: [R] OT: A test with dependent samples.

In the biomedical arena, at least as I learned from Rosner's introductory text, the usual approach to analyzing paired 2 x 2 tables is McNemar's test. ?mcnemar.test

> mcnemar.test(matrix(c(73,0,61,12),2,2))

McNemar's Chi-squared test with continuity correction
data: matrix(c(73, 0, 61, 12), 2, 2)
McNemar's chi-squared = 59.0164, df = 1, p-value = 1.564e-14

The help page has a citation to Agresti. -- David Winsemius

On Feb 10, 2009, at 4:33 PM, Rolf Turner wrote: I am appealing to the general collective wisdom of this list in respect of a statistics (rather than R) question. This question comes to me from a friend who is a veterinary oncologist. In a study that she is writing up there were 73 cats who were treated with a drug called piroxicam. None of the cats were observed to be subject to vomiting prior to treatment; 12 of the cats were subject to vomiting after treatment commenced. She wants to be able to say that the treatment had a ``significant'' impact with respect to this unwanted side-effect. Initially she did a chi-squared test.
(Presumably on the matrix matrix(c(73,0,61,12),2,2) --- she didn't give details and I didn't pursue this.) I pointed out to her that because of the dependence --- same 73 cats pre- and post- treatment --- the chi-squared test is inappropriate. So what *is* appropriate? There is a dependence structure of some sort, but it seems to me to be impossible to estimate. After mulling it over for a long while (I'm slow!) I decided that a non-parametric approach, along the following lines, makes sense: We have 73 independent pairs of outcomes (a,b) where a or b is 0 if the cat didn't barf, and is 1 if it did barf. We actually observe 61 (0,0) pairs and 12 (0,1) pairs. If there is no effect from the piroxicam, then (0,1) and (1,0) are equally likely. So given that the outcome is in {(0,1),(1,0)} the probability of each is 1/2. Thus we have a sequence of 12 (0,1)-s where (under the null hypothesis) the probability of each entry is 1/2. Hence the probability of this sequence is (1/2)^12 = 0.00024. So the p-value of the (one-sided) test is 0.00024. Hence the result is ``significant'' at the usual levels, and my vet friend is happy. I would very much appreciate comments on my reasoning. Have I made any goof-ups, missed any obvious pit-falls? Gone down a wrong garden path? Is there a better approach? Most importantly (!!!): Is there any literature in which this approach is spelled out? (The journal in which she wishes to publish will almost surely demand a citation. They *won't* want to see the reasoning spelled out in the paper.) I would conjecture that this sort of scenario must arise reasonably often in medical statistics and the suggested approach (if it is indeed valid and sensible) would be ``standard''. It might even have a name! But I have no idea where to start looking, so I thought I'd ask this wonderfully learned list. Thanks for any input. 
cheers, Rolf Turner
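Rolf's calculation can be reproduced directly: conditional on the 12 discordant pairs, the number of (0,1) outcomes is Binomial(12, 1/2) under the null, and this conditional sign test is what McNemar's test computes in its exact form.

```r
# Exact one-sided version of Rolf's reasoning: all 12 discordant
# (pre, post) pairs were (0,1), i.e. vomiting only after treatment.
p <- binom.test(12, 12, p = 0.5, alternative = "greater")$p.value
all.equal(p, 0.5^12)   # p = 1/4096, about 0.000244
```

So the hand-computed p-value of (1/2)^12 is exactly the one-sided exact McNemar (sign) test, which gives the approach a citable name.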
[R] Package functions to be included in R
A recent thread on summary statistics got me thinking. (Note: this may not happen often.) A function that would do summaries as described below (similar to SAS PROC UNIVARIATE) might be a nice addition to the main R system. Is there a process by which functions from packages can eventually be incorporated into R? The reason I ask is that having them in R would guarantee they get adequate testing. This would be helpful for GLP and GCP validation.

Murray M Cooper, Ph.D. Richland Statistics 9800 N 24th St Richland, MI, USA 49083 Mail: richs...@earthlink.net

- Original Message - From: "William Revelle" To: "David Winsemius" ; "phoebe kong" Cc: Sent: Monday, February 09, 2009 9:06 PM Subject: Re: [R] summary statistics

At 6:41 PM -0500 2/9/09, David Winsemius wrote: describe() in Hmisc provides much of the rest of what you asked for:

> describe(pref900$TCHDL)
pref900$TCHDL
n missing unique Mean .05 .10 .25 .50 .75 .90 .95
9061904469 16051 4.123 2.320 2.557 3.061 3.841 4.886 6.054 6.867
lowest : 0.9342 1.0200 1.0522 1.1008 1.1061, highest: 19.8696 20.1667 20.7619 21.6364 21.7200

As does describe in the psych package:

> describe(sat.act)
          var   n   mean     sd median trimmed    mad min max range  skew kurtosis   se
gender      1 700   1.65   0.48      2    1.68   0.00   1   2     1 -0.61    -1.62 0.02
education   2 700   3.16   1.43      3    3.31   1.48   0   5     5 -0.68    -0.07 0.05
age         3 700  25.59   9.50     22   23.86   5.93  13  65    52  1.64     2.42 0.36
ACT         4 700  28.55   4.82     29   28.84   4.45   3  36    33 -0.66     0.53 0.18
SATV        5 700 612.23 112.90    620  619.45 118.61 200 800   600 -0.64     0.33 4.27
SATQ        6 687 610.22 115.64    620  617.25 118.61 200 800   600 -0.59    -0.02 4.41

See also describe.by to break this down by some grouping variable. Bill

On Feb 9, 2009, at 6:04 PM, phoebe kong wrote: Hi all, I'm wondering if there is a function that can return summary statistics: N=total number of observation, # missing, mean, median, range, standard deviation. As I know, summary() returns some of info I've mentioned above.
Thanks, SY

-- William Revelle http://personality-project.org/revelle.html Professor http://personality-project.org/personality.html Department of Psychology http://www.wcas.northwestern.edu/psych/ Northwestern University http://www.northwestern.edu/ Use R for psychology http://personality-project.org/r
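For what it's worth, a PROC UNIVARIATE-style summary along the lines Murray describes takes only a few lines of base R (the function and field names below are my own choices, not an existing base-R function):

```r
# A sketch of a summary with N, # missing, mean, median, sd, range:
summ <- function(x) {
  c(n       = sum(!is.na(x)),
    missing = sum(is.na(x)),
    mean    = mean(x, na.rm = TRUE),
    median  = median(x, na.rm = TRUE),
    sd      = sd(x, na.rm = TRUE),
    min     = min(x, na.rm = TRUE),
    max     = max(x, na.rm = TRUE))
}

summ(c(1, 2, 3, 4, NA))
```

Applied to a data frame, `sapply(df, summ)` gives one column of statistics per variable, much like the describe() output quoted above.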
Re: [R] Upgrade R program (version 2.6.2) ???
Nidhi, You neglect to say what your OS is. If you are using Windows, see the "R for Windows FAQ", FAQ 2.8. I haven't looked, but I'm sure there is a similar item for other OSes. Murray

Murray M Cooper, Ph.D. Richland Statistics 9800 N 24th St Richland, MI, USA 49083 Mail: richs...@earthlink.net

- Original Message - From: "Nidhi Kohli" To: "r-help" ; "r-help" Sent: Friday, February 06, 2009 10:41 AM Subject: [R] Upgrade R program (version 2.6.2) ???

Hi All, I downloaded the R program (version 2.6.2) in Jan 2008. I now want to upgrade the program to its latest version, but I don't want to go through the process of deleting the existing version and downloading the new version. This is because my existing R installation has numerous packages that I downloaded for my research work. I want to upgrade my R program with those packages in it. Is there a way I can do this? I would appreciate it if someone can help me with this issue. Thank you Regards Nidhi
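One common recipe (a sketch, not the only way; the file name is arbitrary) is to save the list of installed packages from the old version and reinstall whatever is missing under the new one:

```r
# In the OLD R version: record the installed packages.
pkgs <- rownames(installed.packages())
f <- file.path(tempdir(), "mypkgs.rds")  # use a permanent path in practice
saveRDS(pkgs, f)

# In the NEW R version: reinstall anything not already present.
old <- readRDS(f)
missing <- setdiff(old, rownames(installed.packages()))
# install.packages(missing)   # uncomment to actually reinstall
```

This re-downloads the packages rather than copying the old library directory, which is the safer route across R versions since packages may need rebuilding.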
Re: [R] eval and as.name
I am new to R, so maybe I'm missing the point of your question. But why wouldn't you just use sum(a,b)? Murray M Cooper, Ph.D. Richland Statistics 9800 N 24th St Richland, MI, USA 49083 Mail: richs...@earthlink.net - Original Message - From: "Fuchs Ira" To: Sent: Thursday, February 05, 2009 5:10 PM Subject: [R] eval and as.name I'm sure there is a more general way to ask this question but how do you use the elements of a character vector as names of objects in an expression? For example, say you have: a = c(1,3,5,7) b = c(2,4,6,8) n=c("a","b") and you want to use the names a and b in a function (e.g. sum) sum(eval(as.name(n[1])),eval(as.name(n[2]))) works but what is a simpler way to effect this level of indirection? 
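A shorter idiom for the indirection Ira asks about (a sketch using the question's own data): mget() fetches several objects by name into a list, and do.call() passes that list on as arguments, so the last line is equivalent to sum(a, b).

```r
a <- c(1, 3, 5, 7)
b <- c(2, 4, 6, 8)
n <- c("a", "b")

# Look the objects up by name and sum them in one step:
do.call(sum, mget(n))  # 36, the same as sum(a, b)
```

This scales to any number of names in n, which sum(eval(as.name(n[1])), eval(as.name(n[2]))) does not.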
Re: [R] Chi-squared test adjusted for multiple comparisons? Harbe'stest?
Categorical data analysis is definitely the way you want to go. Which test you use depends on how you are going to use the results. For "quick and dirty" I would suggest using Fisher's exact test on all 2x2 submatrices of counts. In this case, with 4 treatments you have 6 possible 2x2 submatrices. See the "fisher.test" function. Another possibility would be a log-linear model, to model Ln(p/q). Murray M Cooper, Ph.D. Richland Statistics 9800 N 24th St Richland, MI, USA 49083 Mail: richs...@earthlink.net - Original Message - From: "Laura Lucia Prieto Godino" To: Sent: Thursday, February 05, 2009 7:06 AM Subject: [R] Chi-squared test adjusted for multiple comparisons? Harbe'stest? Hi! I have some data that looks like this: up down percentage uew_21 20 14 58.82 uew_20_5 27 40 40.29 uew_20 8 13 38.09 uew_19_5 17 42 28.81 So I have 4 experimental conditions and I am counting the number of animals in the up and down compartments and then calculating the percentage. I want to know which of the conditions differ from each other. If the data weren't percentages I would run a Kruskal-Wallis test to check for general differences and then, when significant, a post-hoc test comparing pairs with Mann-Whitney (the wilcox.test function in R) with a Bonferroni correction for multiple comparisons. But as the data are percentages, I know I need to analyze them with either a chi-squared or a G-test, and I have no idea whether I can do such a test with many comparisons or how to do it in R. I have also seen a paper in which they do something similar using Haber's chi-squared test. Does anybody know how to do that in R? Thank you very much for your help, and thanks to Jim and Chuck for answering my previous statistical question! 
Lucia
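A runnable sketch of the pairwise approach Murray suggests, using the counts from the question (the object names are the sketch's own, not anything from the thread): Fisher's exact test on each of the 6 possible 2x2 submatrices, with a Bonferroni correction for the 6 comparisons.

```r
# Counts from the question: animals in the up/down compartments
# for each of the 4 conditions.
up   <- c(20, 27,  8, 17)
down <- c(14, 40, 13, 42)
tab  <- rbind(up, down)
colnames(tab) <- c("uew_21", "uew_20_5", "uew_20", "uew_19_5")

# All 6 pairs of columns, Fisher's exact test on each 2x2 submatrix:
pairs <- combn(ncol(tab), 2)
p.raw <- apply(pairs, 2, function(ij) fisher.test(tab[, ij])$p.value)
names(p.raw) <- apply(pairs, 2, function(ij)
  paste(colnames(tab)[ij], collapse = " vs "))

# Bonferroni-adjusted p-values for the 6 comparisons:
p.adjust(p.raw, method = "bonferroni")
```

Note the tests are run on the raw counts, not the percentages, which sidesteps the problem Lucia raises about analyzing percentage data directly.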
Re: [R] The Origins of R
Consider yourself lucky! I'm sure there are many people who would prefer not to see their name in the NYT. ;-) Murray Cooper - Original Message - From: "Duncan Murdoch" To: "Mark Difford" Cc: Sent: Thursday, February 05, 2009 10:16 AM Subject: Re: [R] The Origins of R On 2/5/2009 1:05 AM, Mark Difford wrote: I think that all appeared on January 8 in Vance's blog posting, with a comment on it by David M Smith on Jan 9. So those people have -27 days Then there was no need for vituperative comments (not from you, of course): simply point doubters to the right place, as you have done. But Mr. Vance's comments only deepen the "mystery." If Mr. Vance was aware of the true origins of R, why did he choose to misrepresent them in his article, which is what got the publicity and which is the item that most people saw/read? Most right-thinking people don't, wouldn't, or haven't taken the matter further than that. Their criticisms, as mine have been, have been aimed at the NY Times and Mr. Vance's lack of ethics. It also seems clear from Mr. Vance's comments that there was no editorial or sub-editorial meddling. That's not what I read in the posting to this list that I cited. I doubt if Ashlee Vance is reading this list, so it doesn't really seem fair to blame him if he doesn't respond to your attacks. So I'm not complaining, but the main problem I saw in his article was that it didn't mention me. I knew Robert Gentleman (even had an office next to him!) before he started R: surely that must have been a key influence. Why else did he move to the far side of the globe? And not only that, but to compound the insult, the NY Times has failed to mention me every day since then! Duncan Murdoch The knee-jerk reaction ? Well, it is almost amusing to see how sensitive some very hard-nosed individuals on this list can be, or have become. Regards, Mark. still to wait. Duncan Murdoch-2 wrote: On 2/4/2009 3:53 PM, Mark Difford wrote: >>> Indeed. 
The postings exuded a tabloid-esque level of slimy nastiness. Hi Rolf, It is good to have clarification, for you wrote "..,the postings...," tarring everyone with the same brush. And it was quite a nasty brush. It also is conjecture that "this was due to an editor or sub-editor," i.e. the botched article. I think that what some people are waiting for are factual statements from the parties concerned. Conjecture is, well, little more than conjecture. I think that all appeared on January 8 in Vance's blog posting, with a comment on it by David M Smith on Jan 9. So those people have -27 days still to wait. Duncan Murdoch Regards, Mark. Rolf Turner-3 wrote: On 4/02/2009, at 8:15 PM, Mark Difford wrote: Indeed. The postings exuded a tabloid-esque level of slimy nastiness. Indeed, indeed. But I do not feel that that is necessarily the case. Credit should be given where credit is due. And that, I believe is the issue that is getting (some) people hot and bothered. Certainly, Trevor Hastie in his reply to the NY Times article, was not too happy with this aspect of the story. Granted, his comments were not made on this list, but the objection is essentially the same. I would not call what he had to say "Mischief making" or smacking of a "tabloid-esque level of slimy nastiness." The knee-jerk reaction seems to be that this is a criticism of R. It is not. It is a criticism of a poorly researched article. It also is an undeniable and inescapable fact that most S code runs in R. The problem is not with criticism of the NY Times article, although as Pat Burns and others have pointed out this criticism was somewhat misdirected and unrealistic considering the exigencies of newspaper editing. The problem was with a number of posts that cast aspersions upon the integrity of Ihaka and Gentleman. It is these posts that exuded tabloid-esque slimy nastiness. 
I am sure that Ross and Robert would never dream of failing to give credit where credit is due and it is almost certainly the case that they explained the origins of R in the S language to the writer of the NYT article (wherefrom the explanation was cut in the editing process). Those of us on this list (with the possible exception of one or two nutters) would take it that it goes without saying that R was developed on the basis of S --- we all ***know*** that. To impugn the integrity of Ihaka and Gentleman, because an article which *they didn't write* failed to mention this fact, is unconscionable. cheers, Rolf Turner ## Attention:\ This e-mail message is privileged and confid...{{dropped:9}}
Re: [R] eliminating control characters from formatted data files
David, This may be a case of "If all you have is a hammer, everything looks like a nail". If all you want to do is remove the last line if it contains a CONTROL-Z, why not use something like perl to process the files? Murray M Cooper, Ph.D. Richland Statistics 9800 N 24th St Richland, MI, USA 49083 Mail: richs...@earthlink.net - Original Message - From: "David Epstein" To: Sent: Thursday, February 05, 2009 4:01 AM Subject: [R] eliminating control characters from formatted data files I have a few hundred files of formatted data. Unfortunately most of them end with a spurious CONTROL-Z. I want to rewrite the files without the spurious character. Here's what I've come up with so far, but my code is unsafe because it assumes without justification that the last row of df contains a control character (and some NAs to fill up the record). options(warn=-1) #turn off irritating warning from read.table() df<-read.table(file=filename) df.new<-df[1:nrow(df)-1,] write.table(df.new,file=filename.new, quote=F) Before defining df.new, I want to check that the last line really does contain a control character. I've tried various methods, but none of them work. I have been wondering if I should use a function (scan?) that reads in the file line by line and checks each line for control characters, but I don't know how to do this either. Thanks for any help David -- View this message in context: http://www.nabble.com/eliminating-control-characters-from-formatted-data-files-tp21847583p21847583.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
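The check David asks for can also stay inside R (a sketch; the demo file below stands in for his real data files): readLines() pulls a file in as a character vector, so the last element can be tested for the CONTROL-Z byte (0x1A) before anything is dropped, instead of assuming the last row is always spurious.

```r
# Demo setup: a small file whose last line is a stray CONTROL-Z,
# mimicking the files described in the question.
infile  <- tempfile(fileext = ".txt")
outfile <- tempfile(fileext = ".txt")
writeLines(c("1 2 3", "4 5 6", "\x1a"), infile)

lines <- readLines(infile, warn = FALSE)

# Drop the final line only if it really contains a CONTROL-Z:
if (length(lines) > 0 && grepl("\x1a", lines[length(lines)], fixed = TRUE)) {
  lines <- lines[-length(lines)]
}
writeLines(lines, outfile)

readLines(outfile)  # "1 2 3" "4 5 6" -- safe to read.table() now
```

This makes the guard explicit, where df[1:nrow(df)-1,] in the original code silently assumes the last record is bad (and, as written, only works because the stray 0 index produced by 1:nrow(df)-1 is ignored; df[-nrow(df), ] would be the idiomatic form).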
Re: [R] How do I get my IT department to "bless" R?
I was about to post a similar reply. Stavros's reply was very eloquent and should be taken to heart! Murray M Cooper, Ph.D. Richland Statistics 9800 N 24th St Richland, MI, USA 49083 Mail: richs...@earthlink.net - Original Message - From: "Stavros Macrakis" To: Sent: Sunday, February 01, 2009 6:11 PM Subject: Re: [R] How do I get my IT department to "bless" R? Though there are certainly some *ir*rational reasons for IT departments' behavior, there are also many rational reasons that IT departments try to control the software running in their organizations. Condescendingly assuming that the IT department is run by idiots whose decisions are ruled by emotional attachments (as one correspondent suggested), or that they are irrationally prejudiced against free/open source, and that it is obvious and irrefutable that you know better than them (as was implied by some correspondents), may make you feel better, but probably won't help much. It also won't help much if you don't explain clearly and calmly *why* exactly you need to use R for your work. You can use many kinds of arguments, including technical (functionality, efficiency, capacity), economic (no license fees), scientific-community (widely used in the statistics community), and so on. It *will* help to think a bit about some of the concerns that the IT department may have. Many of these concerns apply both to free/open software and to commercial software: 1) Security. They probably don't want you to install software which risks exposing company data to the outside world either intentionally or unintentionally. For example, they probably don't want you to run code that mirrors your disk drive on an external server, even if it claims to be secured cryptographically etc. Some companies will be more careful, wanting to vet any software that can open a TCP connection (which most non-trivial software systems, including both Excel and R, can). 2) Protection against malware (also a security issue). 
Some software which appears innocuous may contain a variety of malware. I'm pretty sure that R+CRAN is free of malware, but I don't know what measures are taken to ensure that. 3) Support and maintenance. Not only do they not want to be in a situation where they're asked to support software they don't know, they certainly don't want to be responsible for bad *interactions* between your add-on software and the standard software. 4) Licensing. Besides the question of proper use of commercial licenses, some licenses (notably GPL) have "contagion" clauses which affect other software which is linked to them. Though this doesn't affect the vast majority of users of R (because they neither modify R nor redistribute it), your company's legal department will probably want to know what's going on. 5) Interoperability, maintainability, and continuity. What happens when the user of a particular non-supported software package leaves the company or takes a vacation? Who is going to take over the work he was doing? If s/he's developed programs/scripts on a non-standard infrastructure to solve business problems, do the solutions leave as soon as he's out of the building? Even if the IT department *is* behaving irrationally, responding irrationally yourself probably won't help your cause. -s
[R] Question about contributed packages
I am working on a methodology for qualifying R, for GLP and GCP. If I qualify only the base R install, with no contributed packages, it seems relatively simple to qualify R. However, from time to time I will want to use a contributed package. If I use a contributed package, does it leave anything behind that will be loaded with the next invocation of R? Suppose I run R and use a contributed package and then exit. Next time I want to run R for GLP work and will only use base R. Can I be sure I am only working with base R? Or do I need to maintain two installations of R, one for use with GLP/GCP and one for when I want to use contributed packages? I hope this is clear. Thanks, Murray M Cooper, Ph.D. Richland Statistics 9800 N 24th St Richland, MI, USA 49083 Mail: richs...@earthlink.net
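One way to reassure yourself at run time (a sketch, not a GLP validation procedure): a session started with R --vanilla skips saved workspaces and profile files, and from inside any session you can list exactly what is attached or loaded. Using a contributed package in one session does not, by itself, make a later session load it; the things to watch are startup files (.Rprofile, Rprofile.site) and a saved workspace (.RData).

```r
# What is attached to the search path in this session?
search()

# Which namespaces are loaded (attached or merely loaded)?
loadedNamespaces()

# Base R itself is always among them:
"base" %in% loadedNamespaces()  # TRUE
```

Logging the output of these two calls at the start of a GLP run gives a documented record that only base packages were in play.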
Re: [R] Where to find the source codes for the internal function in stats package
Dear Dr Murdoch, I understand in principle your explanation, but specifically where in the source distribution are these functions found? For instance, I would like to look at the code for model.matrix. Ex: ans <- .Internal(model.matrix(t, data)) I have looked at the source distribution but have been unable to locate the file which contains model.matrix. Thanks for your help. Murray Cooper - Original Message - From: "Duncan Murdoch" To: "zhijie zhang" Cc: Sent: Saturday, January 17, 2009 6:00 AM Subject: Re: [R] Where to find the source codes for the internal function in stats package On 17/01/2009 2:23 AM, zhijie zhang wrote: Dear all, I want to see the source code for "dchisq(x, df, ncp=0, log = FALSE)", but cannot find it. I input "dchisq" in the R interface and press enter, and the following message returns: dchisq /*/ function (x, df, ncp = 0, log = FALSE) { if (missing(ncp)) .Internal(dchisq(x, df, log)) else .Internal(dnchisq(x, df, ncp, log)) } /*/ It seems that dchisq() is an internal function in the stats package, so I went to "C:\Program Files\R\R-2.7.2\library\stats" to look for it. I browsed the files in this directory, but it seems I missed it. Can anybody tell me how and where to find the code? Thanks a lot. Uwe Ligges wrote a nice article on finding source in R News: Ligges, U. (2006): R Help Desk: Accessing the Sources. R News 6 (4), 43-45. http://cran.r-project.org/doc/Rnews/ As it explains, .Internal() calls functions in the main R binary, not in a package DLL. Duncan Murdoch
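A pointer that may help locate such code (stated from memory, so treat it as an assumption to verify against the Ligges article): names passed to .Internal() are mapped to C functions by the table in src/main/names.c of the R source distribution, so grepping that file for the name shows which C routine implements it. The R-level wrappers, by contrast, can always be printed directly:

```r
# The visible R wrapper for the default method prints its own source:
stats::model.matrix.default

# For the C side, search the R *source* distribution (not the installed
# library directory), e.g. from a shell:
#   grep -n '"model.matrix"' src/main/names.c
```

The installed library under C:\Program Files\R\...\library\stats contains only compiled code and lazy-loaded R objects, which is why browsing it turns up nothing readable.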
Re: [R] How to compute Bootstrap p-values
Do you really need the p-value, or do you want to test at one of the socially acceptable levels (i.e. .05 or .01)? If all you want is the test, use: quantile(bootsample,c(0.025,0.975)) If the quantile range includes 0 then you decide there is no evidence that the mean is different from zero, at the .05 level. If the quantile range does not include 0 then you decide there is evidence that the mean is different from zero, at the .05 level. If you wanted to use the .01 level then use: quantile(bootsample,c(0.005,0.995)) Murray M Cooper Richland Statistics 9800 N 24th St Richland, MI, USA 49083 Mail: richs...@earthlink.net - Original Message - From: "Andreas Klein" To: Sent: Friday, January 09, 2009 4:36 AM Subject: [R] How to compute Bootstrap p-values Hello. How can I compute the Bootstrap p-value for a two-sided test problem like H_0: beta=0 vs. H_1: beta!=0 ? Example for the sample mean: x <- rnorm(100) bootsample <- numeric(1000) for(i in 1:1000) { idx <- sample(1:100,100,replace=TRUE) bootsample[i] <- mean(x[idx]) } How can I compute the Bootstrap p-value for the mean of x? H_0: "mean of x" = 0 vs. H_1: "mean of x" != 0 Thank you in advance. Sincerely, Andreas Klein.
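A runnable sketch of both answers (the seed and the p-value recipe are the sketch's own choices, not from the thread): the percentile interval implements Murray's fixed-level test, and one common way to get an actual two-sided bootstrap p-value is to shift the bootstrap distribution so the null hypothesis holds, then count how often it is at least as extreme as the observed mean.

```r
set.seed(42)  # for reproducibility; any seed will do
x <- rnorm(100)
bootsample <- replicate(1000, mean(sample(x, replace = TRUE)))

# Murray's suggestion: test at the .05 level via the percentile interval.
quantile(bootsample, c(0.025, 0.975))

# One common two-sided bootstrap p-value: recentre the bootstrap
# distribution at 0 (so H0: mean = 0 holds) and count exceedances.
shifted <- bootsample - mean(x)
p.boot  <- mean(abs(shifted) >= abs(mean(x)))
p.boot
```

With only 1000 resamples the smallest nonzero p-value this can produce is 0.001, so increase the number of resamples if you need finer resolution.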
[R] VCOV Source Code
Mostly I am interested in using R for statistics. I am also interested in being able to look at source code. I hope to be able to write extensions. I tried the suggestion below but was unable to access vcov.lm. methods(vcov) [1] vcov.Arima* vcov.glm* vcov.lm* vcov.mlm* vcov.nls* Non-visible functions are asterisked stats::vcov.lm Error: 'vcov.lm' is not an exported object from 'namespace:stats' I am currently using version 2.7.2 on XP. Any suggestions on how to proceed next? Thank You, Murray M Cooper Richland Statistics 9800 N 24th St Richland, MI, USA 49083 Mail: richs...@earthlink.net - Original Message - From: "Carlos J. Gil Bellosta" To: "Yang Wan" Cc: Sent: Thursday, January 08, 2009 4:44 AM Subject: Re: [R] VCOV Source Code Hello, You can do stats:::vcov.lm to see the source code for that particular method. In order to see which are the methods supported by vcov, write methods("vcov") Best regards, Carlos J. Gil Bellosta http://www.datanalytics.com On Wed, 2009-01-07 at 21:37 -0600, Yang Wan wrote: Dear R Help, I wonder how to show the source code of the [vcov] command. Usually, the source code is shown after you input the command name and press enter. But for [vcov], it shows function (object, ...) UseMethod("vcov") I appreciate your help. Best wishes. Christina
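Note the likely culprit in the transcript above: the failing command uses two colons (stats::vcov.lm), while Carlos's suggestion uses three. The `::` operator reaches only exported objects; non-exported S3 methods such as vcov.lm need `:::` or getAnywhere(). A minimal illustration:

```r
# ':::' reaches inside a namespace, so this prints the method's source:
stats:::vcov.lm

# getAnywhere() searches all loaded namespaces and the search path
# and reports where the object was found:
getAnywhere("vcov.lm")
```

getAnywhere() is the more general tool, since it works even when you don't know which package holds the method.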