Re: [R] Too many warnings when updating R
A Lenzo wrote:

> Hello friends, I loaded R 2.4.1 onto a Fedora Core 6 Linux box (taking all defaults). Then I ran these commands from within R:
>
>     options(CRAN = "http://cran.stat.ucla.edu")
>     install.packages(CRAN.packages()[,1])
>
> As a new user of R, I was shocked when I finished loading R and discovered the following message:
>
>     There were 50 or more warnings (use warnings() to see the first 50)

Let me get this straight: you install last year's R on last year's Fedora, then install over 1000 unspecified packages, and you are shocked that you get warnings?

> In addition to this, I saw errors such as this one:
>
>     ERROR: lazy loading failed for package 'PerformanceAnalytics'
>
> What is this lazy loading? More importantly, do I have to worry about all these warnings? I am intimidated by the idea that I have to go back and fix each and every one in order to have a clean R update. Shouldn't the update with CRAN just work? Or is there something really important that I am missing?

Well, you need to know what you're doing. At the very least, notice what the warnings say and decide whether they point to real trouble or are just what they say they are: warnings. If you are worried about investigating all the packages, maybe install what you really need first. And no, you can't expect a repository like CRAN to keep track of all versions of R on all versions of all OSes. In each individual case, a human maintainer is responsible for fixing problems, and he or she may or may not be around to fix issues.

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                     FAX: (+45) 35327907

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] MLE Function
Terence Broderick wrote:

> I am just trying to teach myself how to use the mle function in R because it is much better than what is provided in MATLAB. I am following tutorial material from the internet; however, it gives the following errors. Does anybody know what is happening to cause such errors, or does anybody know any better tutorial material on this particular subject?
>
>     x.gam <- rgamma(200, rate = 0.5, shape = 3.5)
>     x <- x.gam
>     library(stats4)
>     ll <- function(lambda, alfa) {n <- 200; x <- x.gam
>       -n*alfa*log(lambda)+n*log(gamma(alfa))-9alfa-1)*sum(log(x))+lambda*sum(x)}
>     Error: syntax error, unexpected SYMBOL, expecting '\n' or ';' or '}' in
>       "ll <- function(lambda,alfa){n <- 200;x <- x.gam
>        -n*alfa*log(lambda)+n*log(gamma(alfa))-9alfa"
>     ll <- function(lambda, alfa) {n <- 200; x <- x.gam
>       -n*alfa*log(lambda)+n*log(gamma(alfa))-(alfa-1)*sum(log(x))+lambda*sum(x)}
>     est <- mle(minuslog = ll, start = list(lambda = 2, alfa = 1))
>     Error in optim(start, f, method = method, hessian = TRUE, ...) :
>       objective function in optim evaluates to length 200 not 1

Er, not what I get. Did your version have that linefeed after x <- x.gam? If not, then you'll get your negative log-likelihood added to x.gam, and the resulting likelihood becomes a vector of length 200 instead of a scalar.

In general, the first piece of advice for mle() is to check that the likelihood function really is what it should be; otherwise there is no telling what the result might mean. Secondly, watch out for parameter constraints. With your function, it very easily happens that alfa tries to go negative, in which case the gamma function in the likelihood will do crazy things. A common trick in such cases is to reparametrize by log-parameters, i.e.

    ll <- function(lambda, alfa) {n <- 200; x <- x.gam
      -n*alfa*log(lambda) + n*lgamma(alfa) - (alfa-1)*sum(log(x)) + lambda*sum(x)}
    ll2 <- function(llam, lalf) ll(exp(llam), exp(lalf))
    est <- mle(minuslog = ll2, start = list(llam = log(2), lalf = log(1)))
    par(mfrow = c(2,1))
    plot(profile(est))

Notice, incidentally, the use of lgamma rather than log(gamma(.)), which is prone to overflow. In fact, you could also write this likelihood directly as

    -sum(dgamma(x, rate = lambda, shape = alfa, log = TRUE))

audaces fortuna iuvat
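Pulling the pieces above together, here is a self-contained sketch of the same fit, using the dgamma() form of the likelihood and log-parameters to keep rate and shape positive (starting values as in the thread; the seed is only for reproducibility):

```
## Fit a gamma distribution by ML via stats4::mle, per the advice above.
library(stats4)
set.seed(1)
x <- rgamma(200, rate = 0.5, shape = 3.5)
nll <- function(llam, lalf)   # negative log-likelihood in log-parameters
    -sum(dgamma(x, rate = exp(llam), shape = exp(lalf), log = TRUE))
fit <- mle(nll, start = list(llam = log(2), lalf = log(1)))
exp(coef(fit))                # back-transform to rate and shape
```

The log-parametrization means the optimizer can roam freely over the whole real line while the distribution parameters stay positive.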
Re: [R] Lisp-like primitives in R
François Pinard wrote:

> [Roland Rau]
> > [François Pinard]
> > > I wonder what happened, for R to hide the underlying Scheme so fully, at least at the level of the surface language (though there are hints).
> > To further foster portability, we chose to write R in ANSI C.
> Yes, of course. Scheme is also (often) implemented in C. I meant that R might have implemented a Scheme engine (or part of a Scheme engine, extended with appropriate data types) with a surface language (nearly the S language) which is purposely not Scheme, but could have been. If the gap is not extreme, one could dare dreaming that the Scheme engine in R be completed, and Scheme offered as an alternate extension language. If you allow me to continue dreaming awake -- they told me they will let me free as long as I do not get dangerous! :-) -- part of the interest lies in the fact that there are excellent Scheme compilers. If we could only find or devise some kind of marriage between a mature Scheme and R, so as to speed up the non-vectorisable parts of R scripts...

Well, depending on what you want, this is either trivial or impossible... The internal storage of R is still pretty much equivalent to Scheme. E.g., try this:

    r2scheme <- function(e)
        if (!is.recursive(e)) deparse(e)
        else c("(", unlist(lapply(as.list(e), r2scheme)), ")")
    paste(r2scheme(quote(for(i in 1:4) print(i))), collapse = " ")
    [1] "( for i ( : 1 4 ) ( print i ) )"

and a parser that parses a similar language to R internal format is not a very hard exercise (some care needed in places). However, replacing the front end is not going to make anything faster, and the evaluation engine in R does a couple of tricks which are not done in Scheme, notably lazy evaluation and other forms of non-local evaluation, which drive optimizers crazy. Look up the writings of Luke Tierney on the matter to learn more.

> If we are lucky and one of the original authors reads this thread, they might explain the situation further and better [...].

In r-devel, maybe! We would be lucky if the authors really had time to read r-help. :-)
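A tiny illustration of the lazy evaluation mentioned above: an argument is evaluated only when (and if) the function actually uses it, which is one of the R evaluation tricks with no direct Scheme counterpart.

```
## Lazy evaluation: an unused argument's expression never runs.
f <- function(x, y) x            # y is never touched
f(1, stop("never evaluated"))    # returns 1; the stop() call never fires
```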
Re: [R] ploting missing data
Markus Schmidberger wrote:

> Hello, I have this kind of dataframe and have to plot it.
>
>     data <- data.frame(
>       sw = c(1,2,3,4,5,6,7,8,9,10,11,12,15),
>       zehn = c(33.44,20.67,18.20,18.19,17.89,19.65,20.05,19.87,20.55,22.53,NA,NA,NA),
>       zwanzig = c(61.42,NA,26.60,23.28,NA,24.90,24.47,24.53,26.41,28.26,NA,29.80,35.49),
>       fuenfzig = c(162.51,66.08,49.55,43.40,NA,37.77,35.53,36.46,37.25,37.66,NA,42.29,47.80))
>
> The plot should have lines:
>
>     lines(fuenfzig ~ sw, data = data)
>     lines(zwanzig ~ sw, data = data)
>
> But now I have holes in my lines for the missing values (NA). How can I plot the lines without the holes? The missing values should be interpolated, or the left and right points directly connected. The function approx interpolates the whole dataset; that's not my goal! Is there no plotting function to do this directly?

Just get rid of the NAs:

    lines(fuenfzig ~ sw, data = data, subset = !is.na(fuenfzig))
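A runnable sketch of that answer, on a cut-down version of the question's data: dropping the NA rows per series makes each line connect straight across the gaps.

```
## Plot two series, skipping NA rows so the lines have no holes.
d <- data.frame(sw       = c(1, 2, 3, 4, 5),
                zwanzig  = c(61.42, NA, 26.60, 23.28, NA),
                fuenfzig = c(162.51, 66.08, NA, 43.40, 47.80))
plot(fuenfzig ~ sw, data = d, type = "n")           # set up the axes
lines(fuenfzig ~ sw, data = d, subset = !is.na(fuenfzig))
lines(zwanzig  ~ sw, data = d, subset = !is.na(zwanzig))
```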
Re: [R] kendall test
Stefan Grosse wrote:

> On Thursday 06 September 2007 09:48:22 elyakhlifi mustapha wrote:
> > I thought that there was a function which does the Kendall test in R. I typed apropos("kendall") at the console and didn't find anything. Can you tell me how I could do the Kendall test?
> ?cor.test
> btw.: rseek.org is a very good help for such questions

Interesting site! However, I don't see that leading to cor.test. Rather, it points to the Kendall package, which would seem to be a bit of an overkill. However, help.search("kendall") gets you there immediately.
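For completeness, a sketch of Kendall's tau via cor.test, as the answer suggests (toy data):

```
## Kendall rank correlation test with cor.test.
x <- c(1, 2, 3, 4, 5)
y <- c(2, 1, 4, 3, 5)
ct <- cor.test(x, y, method = "kendall")
unname(ct$estimate)   # tau = 0.6 for these data (8 concordant, 2 discordant pairs)
```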
Re: [R] order intervals in a data.frame
Paul Smith wrote:

> On 9/6/07, João Fadista wrote:
> > I would like to know how I can order a data.frame by increasing dat$Interval (dat$Interval is a factor). There is an example below.
> >
> > Original data.frame:
> >
> >     Interval Number_reads
> >     0-100             685
> >     200-300           744
> >     100-200          1082
> >     300-400          4213
> >
> > Desired_dat:
> >
> >     Interval Number_reads
> >     0-100             685
> >     100-200          1082
> >     200-300           744
> >     300-400          4213
>
> What about
>
>     Desired_dat <- dat[match(dat$Interval, sort(dat$Interval)), ]
>
> ?

dat[order(dat$Interval), ] would be more to the point, but it is a bit fortuitous that it works at all (split the first group at 50 and you'll see). This (or at least something like it) should sort according to left endpoints:

    o <- order(as.numeric(sub("-.*", "", dat$Interval)))
    dat[o, ]
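The left-endpoint sort above, run end to end on the question's data:

```
## Order interval labels numerically by their left endpoint.
dat <- data.frame(Interval     = c("0-100", "200-300", "100-200", "300-400"),
                  Number_reads = c(685, 744, 1082, 4213))
o <- order(as.numeric(sub("-.*", "", dat$Interval)))  # strip "-..." suffix
dat[o, ]   # rows now run 0-100, 100-200, 200-300, 300-400
```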
Re: [R] read.table
Ingo Holz wrote:

> Hi, I want to read an ASCII file using the function read.table. With 'skip' and 'nrows' I can select the rows to read from this file. Is there a way to select columns (in the selected rows)?

Yes, use the colClasses argument. (I won't rewrite the help page here; I expect that you can read it once you know where to look.)
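Concretely, a "NULL" entry in colClasses drops that column entirely; a sketch with a throwaway file:

```
## Keep only columns 1 and 3 of a four-column file via colClasses.
tf <- tempfile()
writeLines(c("a b c d",
             "1 x 2.5 TRUE",
             "2 y 3.5 FALSE"), tf)
d <- read.table(tf, header = TRUE,
                colClasses = c("integer", "NULL", "numeric", "NULL"))
names(d)   # only "a" and "c" remain
```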
Re: [R] problems in read.table
[EMAIL PROTECTED] wrote:

> Dear R-users, I have encountered the following problem every now and then, but I was dealing with a very small dataset before, so it wasn't a problem (I just edited the dataset in an OpenOffice spreadsheet). This time I have to deal with many large datasets containing commuting flow data. I would appreciate it if anyone could give me a hint or clue to get out of this problem. I have a .dat file called 1081.dat: 1001 means Birmingham, AL. I imported this .dat file using read.table like
>
>     tmp <- read.table('CTPP3_ANSI/MPO3441_ctpp3_sumlv944.dat', header=T)
>
> Then I got this error message:
>
>     Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>       line 9499 did not have 209 elements
>
> Since I got an error message saying other rows did not have 209 elements, I added skip=c(205,9499,9294), hoping that R would take care of this problem. But I got a similar error message:
>
>     tmp <- read.table('CTPP3_ANSI/MPO3441_ctpp3_sumlv944.dat', header=T, skip=c(205,9499,9294))
>     Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>       line 9294 did not have 209 elements
>     In addition: Warning message:
>     the condition has length > 1 and only the first element will be used in: if (skip > 0) readLines(file, skip)
>
> Is there any way to let R code automatically skip problematic rows? Thank you very much! Taka

skip is the NUMBER of rows to skip before reading; it has to be a single number. You can use fill and flush to read lines with too few or too many elements, but it might be better to investigate the cause of the problem. What is in those lines? Quote and comment characters are common culprits.
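Before resorting to fill/flush, count.fields() will locate the offending lines so they can be inspected; a sketch on a toy file:

```
## Find lines whose field count differs from the first (header) line.
tf <- tempfile()
writeLines(c("a b c", "1 2 3", "4 5", "6 7 8"), tf)
n <- count.fields(tf)
which(n != n[1])   # line numbers with the wrong field count: here, line 3
```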
Re: [R] How to extract part of an array?
Lauri Nikkinen wrote:

> Hi, how can I extract part of an array? I would like to extract the table "Supported" from this array. If this is not possible, how do I convert an array to a list? I'm sorry this is not a reproducible example.
>
>     spl <- tapply(temp$var1, list(temp$var2, temp$var3, temp$var3), mean)
>     spl
>     , , Supported
>                07       08
>     A    68.38710 71.48387
>     B    21.67742 20.83871
>     C    55.74194 61.12903
>     ALL  26.19816 27.39631
>
>     , , Not_supported
>          07       08
>     A    NA 82.38710
>     B    NA 24.0
>     C    NA 68.77419
>     ALL  NA 29.97984

How about spl[, , "Supported"]?

-p
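Both operations from the question, on toy numbers: selecting one slice of a 3-d array by dimension name, and turning the third dimension into a list.

```
## Slice a 3-d array by name, and convert its slices to a list.
a <- array(1:8, dim = c(2, 2, 2),
           dimnames = list(NULL, NULL, c("Supported", "Not_supported")))
a[, , "Supported"]                                     # a 2 x 2 matrix
lst <- lapply(dimnames(a)[[3]], function(k) a[, , k])  # array -> list
length(lst)   # 2
```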
Re: [R] Table and ftable
David Barron wrote:

> There might be simpler ways, but you can certainly do this with the reshape package, like this:
>
>     library(reshape)
>     dta <- read.table("clipboard", header = TRUE)
>     dta
>       sic level area
>     1   a   211  2.4
>     2   b   311  2.3
>     3   b   322  0.2
>     4   b   322  0.5
>     5   c   100  3.0
>     6   c   100  1.5
>     7   c   242  1.5
>     8   d   222  0.2
>     mlt.dta <- melt(dta)
>     cst.dta <- cast(mlt.dta, sic ~ level, sum)
>     cst.dta
>       sic 100 211 222 242 311 322
>     1   a  NA 2.4  NA  NA  NA  NA
>     2   b  NA  NA  NA  NA 2.3 0.7
>     3   c 4.5  NA  NA 1.5  NA  NA
>     4   d  NA  NA 0.2  NA  NA  NA
>
> Then just replace the NAs with 0s.

tapply() will do this too:

    with(d, tapply(area, list(sic, level), sum))
        100 211 222 242 311 322
    a    NA 2.4  NA  NA  NA  NA
    b    NA  NA  NA  NA 2.3 0.7
    c   4.5  NA  NA 1.5  NA  NA
    d    NA  NA 0.2  NA  NA  NA

This has the same awkwardness of giving NA for empty cells, and there is no easy way to circumvent it, since the FUN of tapply is simply not called for such cells. Replacing NA by zero is a bit dangerous (albeit not in the present case), since you can get an NA cell for more than one reason. A more careful approach is like this:

    with(d, {
        t1 <- tapply(area, list(sic, level), sum)
        t2 <- table(sic, level)
        t1[t2 == 0] <- 0
        t1
    })
        100 211 222 242 311 322
    a   0.0 2.4 0.0 0.0 0.0 0.0
    b   0.0 0.0 0.0 0.0 2.3 0.7
    c   4.5 0.0 0.0 1.5 0.0 0.0
    d   0.0 0.0 0.2 0.0 0.0 0.0

> HTH, David Barron
>
> On 9/4/07, Giulia Bennati wrote:
> > Dear list members, I have a little question. I have my data organized as follows:
> >
> >     sic level area
> >     a   211   2.4
> >     b   311   2.3
> >     b   322   0.2
> >     b   322   0.5
> >     c   100   3.0
> >     c   100   1.5
> >     c   242   1.5
> >     d   222   0.2
> >
> > where level and sic are factors. I'm trying to obtain a matrix like this:
> >
> >         211  311  322  100  242  222
> >     a   2.4    0    0    0    0    0
> >     b     0  2.3  0.7    0    0    0
> >     c     0    0    0  4.5  1.5    0
> >     d     0    0    0    0    0  0.2
> >
> > I tried the table function as table(sic, level), but I obtained only a contingency table. Have you any suggestions? Thank you very much, Giulia
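The careful tapply approach above, runnable end to end on the question's data:

```
## Sum area by sic x level, zeroing only the truly empty cells.
d <- data.frame(sic   = c("a", "b", "b", "b", "c", "c", "c", "d"),
                level = factor(c(211, 311, 322, 322, 100, 100, 242, 222)),
                area  = c(2.4, 2.3, 0.2, 0.5, 3.0, 1.5, 1.5, 0.2))
t1 <- with(d, tapply(area, list(sic, level), sum))
t1[with(d, table(sic, level)) == 0] <- 0   # empty cells become 0; real NAs would survive
t1["c", "100"]   # 4.5
```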
Re: [R] Confusion using functions to access the function call stack example section
Leeds, Mark (IED) wrote:

> I was going through the example below, which is taken from the example section in the R documentation for accessing the function call stack. I am confused, and I have 3 questions that I was hoping someone could answer.
>
> 1) Why is y equal to zero even though the call was done with gg(3)?

There are multiple nested calls to gg, and y is counted down. You're not calling ggg when y > 0, and that is what does the printing.

> 2) What does "parents are 0,1,2,0,4,5,6,7" mean? I understand what a parent frame is, but how do the numbers relate to this particular example? Why is the current frame #8?

How did you get that?? Did you miss the part where it said that the example gives different results when run by example()? I get

    gg(3)
    current frame is 5
    parents are 0 1 2 3 4
    function() {
        cat("current frame is", sys.nframe(), "\n")
        cat("parents are", sys.parents(), "\n")
        print(sys.function(0)) # ggg
        print(sys.function(2)) # gg
    }
    <environment: 0x8bb8e10>
    function(y) {
        ggg <- function() {
            cat("current frame is", sys.nframe(), "\n")
            cat("parents are", sys.parents(), "\n")
            print(sys.function(0)) # ggg
            print(sys.function(2)) # gg
        }
        if (y > 0) gg(y - 1) else ggg()
    }

which should make somewhat better sense. (My versions, 2.5.1 and pre-2.6.0, don't seem to print y either?) As a general matter, frames make a tree: two of them can have the same parent; e.g., this happens whenever an argument expression is being evaluated as part of evaluating a function call. Try, e.g.,

    f <- function(x) {x; print(sys.status())}
    f(f(1))

> 3) It says that sys.function(2) should be gg, but I would think that sys.function(1) would be gg, since it's one up from where the call is being made.

There are multiple calls to gg(), so both could be true.

> Thanks a lot. If the answers are too complicated and someone knows of a good reference that goes into more detail about the sys functions, that's appreciated also.

The best way is to just poke around with some simple examples until you get the hang of it. Possibly modify the examples you have already seen, but print the entire sys.status().
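In that spirit, a minimal example to poke at (the frame numbers shown assume the call is made from the top level):

```
## Inspect the call stack from inside a nested call.
f <- function() g()
g <- function() list(frame = sys.nframe(), parents = sys.parents())
f()   # from the top level: g runs in frame 2, with parents 0 (global) and 1 (f)
```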
Re: [R] Q: selecting a name when it is known as a string
D. R. Evans wrote:

> I am 100% certain that there is an easy way to do this, but after experimenting off and on for a couple of days, and searching everywhere I could think of, I haven't been able to find the trick. I have this piece of code:
>
>     ...
>     attach(d)
>     if (ORDINATE == 'ds') {
>       lo <- loess(percent ~ ncms * ds, d, control = loess.control(trace.hat = 'approximate'))
>       grid <- data.frame(expand.grid(ds = MINVAL:MAXVAL, ncms = MINCMS:MAXCMS))
>     ...
>
> Then there are several almost-identical if statements for different values of ORDINATE. For example, the next if statement starts with:
>
>     ...
>     if (ORDINATE == 'dsl') {
>       lo <- loess(percent ~ ncms * dsl, d, control = loess.control(trace.hat = 'approximate'))
>       grid <- data.frame(expand.grid(dsl = MINVAL:MAXVAL, ncms = MINCMS:MAXCMS))
>     ...
>
> This is obviously pretty silly code (although of course it does work). I imagine that my question is obvious: given that I have a variable, ORDINATE, whose value is a string, how do I rewrite statements such as the "lo <-" and "grid <-" statements above so that they use ORDINATE instead of the hard-coded names ds and dsl? I am almost sure (almost) that it has something to do with deparse(), but I couldn't find the right incantation, and the ?deparse help left my head swimming.

    myvar <- 12345
    vname <- "myvar"
    eval(substitute(X + 54321, list(X = as.name(vname))))

However, this does not work for argument names as in expand.grid(ds=.), so for that part you may need to patch up names afterwards. It is (paraphrasing Thomas Lumley) often a good idea to reconsider the question if the answer involves this sort of trickery. Perhaps it is better handled by a loop or lapply over a list of variables?
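Both halves of that answer in runnable form: using a string-named variable in an expression, and patching up argument names afterwards (the names here are the toy ones from the reply):

```
## Use a variable whose name is held in a string, two ways.
myvar <- 12345
vname <- "myvar"
eval(substitute(X + 54321, list(X = as.name(vname))))  # 66666
get(vname) + 54321                                     # simpler alternative

## For argument names (as in expand.grid), build the call's argument
## list with the names set programmatically.
g <- do.call(expand.grid, setNames(list(1:3, 1:2), c(vname, "ncms")))
names(g)   # "myvar" "ncms"
```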
Re: [R] sin(pi)?
Nguyen Dinh Nguyen wrote:

> Dear all, I found something strange when calculating the sine of pi:
>
>     > sin(pi)
>     [1] 1.224606e-16
>     > pi
>     [1] 3.141593
>     > sin(3.141593)
>     [1] -3.464102e-07
>
> Any help and comments would be appreciated. Regards, Nguyen Dinh Nguyen, Garvan Institute of Medical Research, Sydney, Australia

Well, sin(pi) is theoretically zero, so you are just seeing zero at two different levels of precision. The built-in pi has more digits than it displays:

    > pi
    [1] 3.141593
    > pi - 3.141593
    [1] -3.464102e-07
    > print(pi, digits = 20)
    [1] 3.141592653589793
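A short sketch of the idiomatic way to treat such values: the residue is zero to machine precision, and all.equal() is the standard tolerant comparison.

```
## sin(pi) is zero up to floating-point rounding error.
abs(sin(pi)) < 1e-15            # TRUE: tiny compared to any real signal
isTRUE(all.equal(sin(pi), 0))   # TRUE: the idiomatic tolerant comparison
```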
Re: [R] size limitations in R
Daniel Lakeland wrote:

> On Fri, Aug 31, 2007 at 01:31:12PM +0100, Fabiano Vergari wrote:
> > I am a SAS user currently evaluating R as a possible addition to or even replacement for SAS. The difficulty I have come across straight away is R's apparent difficulty in handling relatively large data files. While I would not expect it to handle datasets with millions of records, I still really need to be able to work with datasets with 100,000+ records and 100+ variables. Yet, when reading a .csv file with 180,000 records and about 200 variables, the software virtually ground to a halt (I stopped it after 1 hour). Are there guidelines, or maybe a limitations document, anywhere that helps me assess the size...
>
> 180k records with 200 variables = 36 million entries; if they're numeric, then they're doubles taking up 8 bytes, so 288 MB of RAM. This should be perfectly fine for R, as long as you have that much free RAM. However, the routines that read CSV and tab-delimited files are relatively inefficient for such large files. In order to handle large data files, it is better to use one of the database interfaces. My preference would be SQLite, unless I already had the data on a MySQL or other database server.

Yes. However, for an intermediate solution, notice that much of the inefficiency comes from storing data as character vectors before deciding what to do with them. Character vectors have an overhead of one SEXP per string stored, i.e. 20-28 bytes in addition to the actual string. There are options for telling the read routines explicitly that data are numeric/integer/logical: 'colClasses' for read.table(), 'what' for scan(). This will bypass the intermediate storage.

> The documentation for the packages RSQLite and SQLiteDF should be helpful, as well as the documentation for SQLite itself, which has a facility for efficiently importing CSV and similar files directly into a SQLite database. E.g.: http://netadmintools.com/art572.html
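The colClasses shortcut mentioned above, as a small sketch (tiny file for illustration; for a real 180k-row file the declared types are what saves the intermediate character storage):

```
## Declare column types up front so read.csv skips character staging.
tf <- tempfile()
write.csv(data.frame(a = 1:3, b = c(1.5, 2.5, 3.5)), tf, row.names = FALSE)
d <- read.csv(tf, colClasses = c("integer", "numeric"))
sapply(d, class)   # a: integer, b: numeric
```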
Re: [R] in cor.test, difference between exact=FALSE and exact=NULL
Andrew Yee wrote:

> Thanks for the clarification. I should have recognized the difference between a warning and an error. But if I may take this a step further, shouldn't it then be exact=TRUE instead of exact=NULL? Thanks, Andrew

Nope. The two are equivalent for the Spearman test, but not for Kendall's tau. The logic in that case is that NULL implies exact testing if n < 50 and asymptotic otherwise; TRUE and FALSE enforce one or the other (if possible).

> On 8/31/07, Peter Dalgaard wrote:
> > Andrew Yee wrote:
> > > Pardon my ignorance, but is there a difference in cor.test between exact=FALSE and exact=NULL when method="spearman"? Take for example:
> > >
> > >     x <- c(1,2,2,3,4,5)
> > >     y <- c(1,2,2,10,11,12)
> > >     cor.test(x, y, method = "spearman", exact = NULL)
> > >
> > > This gives an error message:
> > >
> > >     Warning message:
> > >     Cannot compute exact p-values with ties in: cor.test.default(x, y, method = "spearman", exact = NULL)
> > >
> > > However, when exact is changed to FALSE, this seems to run okay:
> > >
> > >     cor.test(x, y, method = "spearman", exact = FALSE)
> > >
> > > Question: should this be exact=FALSE in the documentation and/or the code?
> >
> > No. The default is indeed NULL. This implies that calculation of exact p-values will be attempted, and when there are ties you get a warning (NB: not error) message. Setting exact=FALSE, no attempt is made and no warning is given.
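A sketch confirming the distinction: with ties, exact = NULL warns and falls back to the asymptotic test, so the resulting p-value matches exact = FALSE; only the warning differs.

```
## exact = NULL (warn, then asymptotic) vs. exact = FALSE (silent).
x <- c(1, 2, 2, 3, 4, 5)
y <- c(1, 2, 2, 10, 11, 12)
r1 <- suppressWarnings(cor.test(x, y, method = "spearman", exact = NULL))
r2 <- cor.test(x, y, method = "spearman", exact = FALSE)
isTRUE(all.equal(r1$p.value, r2$p.value))   # TRUE
```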
Re: [R] Comparing transform to with
Muenchen, Robert A (Bob) wrote:

> Hi All, I've been successfully using the with function for analyses and the transform function for multiple transformations. Then I thought, why not use with for both? I ran into problems I couldn't figure out from help files or books, so I created a simplified version of what I'm doing:
>
>     rm(list = ls())
>     x1 <- c(1,3,3)
>     x2 <- c(3,2,1)
>     x3 <- c(2,5,2)
>     x4 <- c(5,6,9)
>     myDF <- data.frame(x1, x2, x3, x4)
>     rm(x1, x2, x3, x4)
>     ls()
>     myDF
>
> This creates two new variables just fine:
>
>     transform(myDF, sum1 = x1 + x2, sum2 = x3 + x4)
>
> This next code does not see sum1, so it appears that transform cannot see the variables that it creates. Would I need to transform new variables in a second pass?
>
>     transform(myDF, sum1 = x1 + x2, sum2 = x3 + x4, total = sum1 + sum2)
>
> Next I tried the same thing using with. It does not work, but also does not generate error messages, giving me the impression that I'm doing something truly idiotic:
>
>     with(myDF, {
>       sum1 <- x1 + x2
>       sum2 <- x3 + x4
>       total <- sum1 + sum2
>     })
>     myDF
>     ls()
>
> Then I thought, perhaps one of the advantages of transform is that it works on the left side of the equation without using a longer name like myDF$sum1. with probably doesn't do that, so I used the longer form below. It also does not work and generates no error messages:
>
>     # Try it again, writing vars to myDF explicitly.
>     # It generates no errors, and no results.
>     with(myDF, {
>       myDF$sum1 <- x1 + x2
>       myDF$sum2 <- x3 + x4
>       myDF$total <- myDF$sum1 + myDF$sum2
>     })
>     myDF
>     ls()
>
> I would appreciate some advice about the relative roles of these two functions and why my attempts with with have failed.

Yes, transform() calculates all its new values, then assigns to the given names. This is expedient, but it has the drawback that new variables are not usable inside the expressions. A possible alternative implementation would be equivalent to a series of nested calls to transform, which of course you could also do manually:

    transform(transform(myDF, sum1 = x1 + x2, sum2 = x3 + x4),
              total = sum1 + sum2)

The problem with with() on data frames and lists is that it, like the eval family of functions, _converts_ the object to an environment and then evaluates the expression in the converted environment. The environment is temporary, so assignments to it get lost. The current development sources have a new (experimental) function within(), which is like with() but stores any modified variables back. (This is very recent and may or may not make it into 2.6.0.)
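The nested-transform workaround, runnable on the question's data:

```
## Derived columns that depend on other derived columns need a second
## transform() pass, since transform can't see the names it creates.
myDF <- data.frame(x1 = c(1, 3, 3), x2 = c(3, 2, 1),
                   x3 = c(2, 5, 2), x4 = c(5, 6, 9))
myDF2 <- transform(transform(myDF, sum1 = x1 + x2, sum2 = x3 + x4),
                   total = sum1 + sum2)
myDF2$total   # 11 16 15
```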
Re: [R] in cor.test, difference between exact=FALSE and exact=NULL
Andrew Yee wrote:

> Pardon my ignorance, but is there a difference in cor.test between exact=FALSE and exact=NULL when method="spearman"? Take for example:
>
>     x <- c(1,2,2,3,4,5)
>     y <- c(1,2,2,10,11,12)
>     cor.test(x, y, method = "spearman", exact = NULL)
>
> This gives an error message:
>
>     Warning message:
>     Cannot compute exact p-values with ties in: cor.test.default(x, y, method = "spearman", exact = NULL)
>
> However, when exact is changed to FALSE, this seems to run okay:
>
>     cor.test(x, y, method = "spearman", exact = FALSE)
>
> Question: should this be exact=FALSE in the documentation and/or the code? Thanks, Andrew, MGH Cancer Center

No. The default is indeed NULL. This implies that calculation of exact p-values will be attempted, and when there are ties you get a warning (NB: not error) message. Setting exact=FALSE, no attempt is made and no warning is given.
Re: [R] Time conversion problems
[EMAIL PROTECTED] wrote: Hi there, I have precipitation data from 2004 to 2006 in varying resolutions (10 to 20 min intervals), with time in seconds from the beginning of the year (cumulative) and a second variable for the year. I applied the following code to convert the time into a date:

times <- strptime("2004-01-01", "%Y-%m-%d", tz="GMT") + precipitation$time1

Everything went well, except that every year the seconds counter starts at zero, so I now have three 2004 series instead of continuing from '04 to '05 etc. I tried to add the last seconds value of 2004 to the first of 2005 with an if statement like:

if (year == 2005) time2 <- time1 + 632489  # seconds

but it doesn't work. Thanks for a solution. Can't you just do strptime(paste(year, "01-01", sep="-"), ...)? (Or use ISOdatetime(year, 1, 1, 0, 0, 0, tz="GMT").)
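A sketch of the ISOdatetime() suggestion, with made-up example data (the vectors below are hypothetical): each observation's seconds are added to the start of its own year, so the per-year counters never collide.

```r
## Hypothetical data: per-year second counters and the matching years
year  <- c(2004, 2004, 2005, 2006)
time1 <- c(0, 1200, 600, 1800)        # seconds since Jan 1 of 'year'

## ISOdatetime() is vectorized over its arguments, so this converts
## every row in one step, with no need for per-year offsets.
times <- ISOdatetime(year, 1, 1, 0, 0, 0, tz = "GMT") + time1
```

This removes the need for the if/offset bookkeeping entirely, since the year column carries the information that the seconds counter discards.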
Re: [R] Sphericity test in R for repeated measures ANOVA
Orou Gaoue wrote: Hi, Is there a way to do a sphericity test in R for repeated measures ANOVA (using aov or lme)? I can't find anything about it in the help. Thanks Orou There is for lm() with a multivariate response (mauchly.test). For lme(), you can compare models with a corSymm correlation structure to ones with corCompSymm. This is a similar test, but not quite the same. For aov() it doesn't really make sense, partly because the repeatedness is ambiguous in some models, partly because aov's internal algorithms rely strongly on orthogonality with respect to a particular covariance structure. If you relax the assumptions, orthogonality no longer holds.
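A hedged sketch of the model-comparison idea, assuming a hypothetical long-format data frame d with response y, within-subject factor time and subject identifier Subj (all names made up for illustration):

```r
library(nlme)

## Unstructured within-subject covariance: general correlation
## plus possibly different variances per time point.
fit.sym <- gls(y ~ time, data = d,
               correlation = corSymm(form = ~ 1 | Subj),
               weights = varIdent(form = ~ 1 | time))

## Compound symmetry: equal correlation between all pairs of
## time points (the structure that sphericity-style assumptions imply).
fit.cs  <- gls(y ~ time, data = d,
               correlation = corCompSymm(form = ~ 1 | Subj))

## Likelihood ratio comparison; similar in spirit to, but not
## the same as, Mauchly's sphericity test.
anova(fit.cs, fit.sym)
```

A small p-value here suggests the compound-symmetric structure is too restrictive, which is the practical question a sphericity test is meant to answer.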
Re: [R] Question about unicode characters in tcltk
R Help wrote: hello list, Can someone help me figure out why the following code doesn't work? I'm trying to put both Greek letters and subscripts into a tcltk menu. The code creates all the mu's, and the 1 and 2 subscripts, but it won't create the 0. Is there a certain set of characters that R won't recognize the unicode for? Or am I entering the \u2080 incorrectly?

library(tcltk)
m <- tktoplevel()
frame1 <- tkframe(m)
frame2 <- tkframe(m)
frame3 <- tkframe(m)
entry1 <- tkentry(frame1, width=5, bg='white')
entry2 <- tkentry(frame2, width=5, bg='white')
entry3 <- tkentry(frame3, width=5, bg='white')
tkpack(tklabel(frame1, text='\u03bc\u2080'), side='left')
tkpack(tklabel(frame2, text='\u03bc\u2081'), side='left')
tkpack(tklabel(frame3, text='\u03bc\u2082'), side='left')
tkpack(frame1, entry1, side='top')
tkpack(frame2, entry2, side='top')
tkpack(frame3, entry3, side='top')

thanks -- Sam Which OS was this? I can reproduce the issue on SuSE, but NOT Fedora 7.
Re: [R] Mann-Whitney U
Lucke, Joseph F wrote: R and SPSS are using different but equivalent statistics. R is using the rank sum of group1 adjusted for the mean rank. SPSS is using the rank sum of group2 adjusted for the mean rank. Close: it is the _minimum_ possible rank sum that is getting subtracted. If everyone in group1 is less than everyone in group2, R's W statistic will be zero. The other way around in SPSS. Example:

G1 <- group1
G2 <- group2[-length(group2)]  # get rid of the NA
n1 <- length(G1)  # n1 = 28
n2 <- length(G2)  # n2 = 27
# convert to ranks
W <- rank(c(G1, G2))
R1 <- W[1:n1]        # put the ranks back into the groups
R2 <- W[n1 + 1:n2]
# get the sum of the ranks for each group
W1 <- sum(R1)
W2 <- sum(R2)
# adjust for group 1
W1 - n1*(n1+1)/2
[1] 405.5
# adjust for group 2
W2 - n2*(n2+1)/2
[1] 350.5

W1 - n1*(n1+1)/2 gives R's result; W2 - n2*(n2+1)/2 gives SPSS's result. Ties throw a wrench in the works. R uses a continuity correction by default; SPSS does not. Taking out the continuity correction:

wilcox.test(G1, G2, correct=FALSE)

        Wilcoxon rank sum test

data:  G1 and G2
W = 405.5, p-value = 0.6433
alternative hypothesis: true location shift is not equal to 0

Warning message:
cannot compute exact p-value with ties in: wilcox.test.default(G1, G2, correct = FALSE)

This p-value is the same as SPSS's. Consult a serious non-parametrics text. I used Lehmann, E. L. (1975), Nonparametrics: Statistical Methods Based on Ranks. Holden-Day, San Francisco, CA. -----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Natalie O'Toole Sent: Wednesday, August 15, 2007 1:07 PM To: r-help@stat.math.ethz.ch Subject: Re: [R] Mann-Whitney U Hi, I do want to use the Mann-Whitney test, which ranks my data and then uses those ranks rather than the actual data. 
Here is the R code I am using:

group1 <- c(1.34,1.47,1.48,1.49,1.62,1.67,1.7,1.7,1.7,1.73,1.81,1.84,1.9,1.96,2,2,2.19,2.29,2.29,2.41,2.41,2.46,2.5,2.6,2.8,2.8,3.07,3.3)
group2 <- c(0.98,1.18,1.25,1.33,1.38,1.4,1.49,1.57,1.72,1.75,1.8,1.82,1.86,1.9,1.97,2.04,2.14,2.18,2.49,2.5,2.55,2.57,2.64,2.73,2.77,2.9,2.94,NA)
result <- wilcox.test(group1, group2, paired=FALSE, conf.level = 0.95, na.action)

paired = FALSE so that the Wilcoxon rank sum test, which is equivalent to the Mann-Whitney test, is used (my samples are NOT paired). conf.level = 0.95 to specify the confidence level. na.action is used because I have an NA value (I suspect I am not using na.action in the correct manner). When I use this code I get the following error message:

Error in arg == choices : comparison (1) is possible only for atomic and list types

When I use this code:

group1 <- c(1.34,1.47,1.48,1.49,1.62,1.67,1.7,1.7,1.7,1.73,1.81,1.84,1.9,1.96,2,2,2.19,2.29,2.29,2.41,2.41,2.46,2.5,2.6,2.8,2.8,3.07,3.3)
group2 <- c(0.98,1.18,1.25,1.33,1.38,1.4,1.49,1.57,1.72,1.75,1.8,1.82,1.86,1.9,1.97,2.04,2.14,2.18,2.49,2.5,2.55,2.57,2.64,2.73,2.77,2.9,2.94,NA)
result <- wilcox.test(group1, group2, paired=FALSE, conf.level = 0.95)

I get the following result:

        Wilcoxon rank sum test with continuity correction

data:  group1 and group2
W = 405.5, p-value = 0.6494
alternative hypothesis: true location shift is not equal to 0

Warning message:
cannot compute exact p-value with ties in: wilcox.test.default(group1, group2, paired = FALSE, conf.level = 0.95)

The W value here is 405.5 with a p-value of 0.6494. In SPSS, I am ranking my data and then performing a Mann-Whitney U by selecting Analyze - Nonparametric Tests - 2 Independent Samples and then checking off the Mann-Whitney U test. For the Mann-Whitney test in SPSS I am getting the following results: Mann-Whitney U = 350.5, 2-tailed p value = 0.643. I think maybe the discrepancy has to do with the specification of the NA values in R, but I'm not sure. 
If anyone has any suggestions, please let me know! I hope I have provided enough information to convey my problem. Thank-you, Nat __ Natalie, It's best to provide at least a sample of your data. Your field names suggest that your data might be collected in units of mm^2 or some similar measurement of area. Why do you want to use Mann-Whitney, which will rank your data and then use those ranks rather than your actual data? Unless your sample is quite small, why not use a two-sample t-test? Also, are your samples paired? If they aren't, did you use the paired = FALSE option? JWDougherty
Re: [R] R 2.5.1 configure problem
Andreas Hey wrote: Hi, I have the following problem: I will install R-2.5.1 on a Linux machine (64-bit) Which? (CPU and OS, please. There are about four likely possibilities, half a dozen less likely ones...) and I will use the R GUI JGR (Jaguar). So I took the following steps:

./configure --with-gnu-ld --enable-R-shlib VAR=fPIC VAR=TCLTK_LIBS

What are those options supposed to be good for??? You appear to be setting VAR twice, and I don't recall VAR as anything used by configure. And it is unlikely that a Linux system would use anything but GNU ld by default. Did configure terminate successfully??? What did the output summary say?

make

I become following error messages: (that's not how to translate "Ich bekomme"...) ... Something must have come before this! A linker error perhaps?

Entering directory `/root/R-2.5.1/tests'
make[2]: Entering directory `/root/R-2.5.1/tests'
make[3]: Entering directory `/root/R-2.5.1/tests/Examples'
make[4]: Entering directory `/root/R-2.5.1/tests/Examples'
make[4]: `Makedeps' is up to date.
make[4]: Leaving directory `/root/R-2.5.1/tests/Examples'
make[4]: Entering directory `/root/R-2.5.1/tests/Examples'
make[4]: *** No rule to make target `../../lib/libR.so', needed by `base-Ex.Rout'. Stop.

Did you really run make? This looks like make check output.

make[4]: Leaving directory `/root/R-2.5.1/tests/Examples'
make[3]: *** [test-Examples-Base] Error 2
make[3]: Leaving directory `/root/R-2.5.1/tests/Examples'
make[2]: *** [test-Examples] Error 2
make[2]: Leaving directory `/root/R-2.5.1/tests'
make[1]: *** [test-all-basics] Error 1
make[1]: Leaving directory `/root/R-2.5.1/tests'
make: *** [check] Error 2

Make check all - I become following messages:

Entering directory `/root/R-2.5.1/tests'
make[2]: Entering directory `/root/R-2.5.1/tests'
make[3]: Entering directory `/root/R-2.5.1/tests/Examples'
make[4]: Entering directory `/root/R-2.5.1/tests/Examples'
make[4]: `Makedeps' is up to date. 
make[4]: Leaving directory `/root/R-2.5.1/tests/Examples'
make[4]: Entering directory `/root/R-2.5.1/tests/Examples'
make[4]: *** No rule to make target `../../lib/libR.so', needed by `base-Ex.Rout'. Stop.
make[4]: Leaving directory `/root/R-2.5.1/tests/Examples'
make[3]: *** [test-Examples-Base] Error 2
make[3]: Leaving directory `/root/R-2.5.1/tests/Examples'
make[2]: *** [test-Examples] Error 2
make[2]: Leaving directory `/root/R-2.5.1/tests'
make[1]: *** [test-all-basics] Error 1
make[1]: Leaving directory `/root/R-2.5.1/tests'
make: *** [check] Error 2

Can you help me? With best regards Andreas Hey Tel: 030/2093-1463 Email: [EMAIL PROTECTED]
Re: [R] Question about unicode characters in tcltk
R Help wrote: hello list, Can someone help me figure out why the following code doesn't work? I'm trying to put both Greek letters and subscripts into a tcltk menu. The code creates all the mu's, and the 1 and 2 subscripts, but it won't create the 0. Is there a certain set of characters that R won't recognize the unicode for? Or am I entering the \u2080 incorrectly?

library(tcltk)
m <- tktoplevel()
frame1 <- tkframe(m)
frame2 <- tkframe(m)
frame3 <- tkframe(m)
entry1 <- tkentry(frame1, width=5, bg='white')
entry2 <- tkentry(frame2, width=5, bg='white')
entry3 <- tkentry(frame3, width=5, bg='white')
tkpack(tklabel(frame1, text='\u03bc\u2080'), side='left')
tkpack(tklabel(frame2, text='\u03bc\u2081'), side='left')
tkpack(tklabel(frame3, text='\u03bc\u2082'), side='left')
tkpack(frame1, entry1, side='top')
tkpack(frame2, entry2, side='top')
tkpack(frame3, entry3, side='top')

Odd, but I think not an R issue. I get weirdness in wish too. Try this:

% toplevel .a
.a
% label .a.b -text \u03bc\u2080 -font {Roman -10}
.a.b
% pack .a.b
% .a.b configure
{-activebackground ...} ... {-text text Text {} μ₀} {-textvariable textVariable Variable {} {}} {-underline underline Underline -1 -1} {-width width Width 0 0} {-wraplength wrapLength WrapLength 0 0}
% .a.b configure -font {Helvetica -12 bold}
# the default, now shows \u2080
% .a.b configure -font {Roman -10}
# back to Roman, *still* shows \u2080 ???!!!
Re: [R] memory allocation glitches
Ben Bolker wrote: (not sure whether this is better for R-devel or R-help ...) Hardcore debugging is usually better off in R-devel. I'm leaving it in R-help though. I am currently trying to debug someone else's package (they're not available at the moment, and I would like it to work *now*), which among other things allocates memory for a persistent buffer that gets used by various functions. The first symptoms of a problem were that some things just didn't work under Windows but were (apparently) fine on Linux. I don't have all the development tools installed for Windows, so I started messing around under Linux, adding Rprintf() statements to the main code. Once I did that, strange pointer-error-like inconsistencies started appearing -- e.g., the properties of some of the persistent variables would change if I did debug(function). I'm wondering if anyone has any tips on how to tackle this -- figure out how to use valgrind? Do straight source-level debugging (R -d gdb etc.) and look for obvious problems? The package uses malloc/realloc rather than Calloc/Realloc -- does it make sense to go through the code replacing these all and see if that fixes the problem? Valgrind is a good idea to try and as I recall it, the basic incantations are not too hard to work out (now exactly where is it that we wrote them down?). It only catches certain error types though, mostly use of uninitialized data and read/write off the ends of allocated blocks of memory. If that doesn't catch it, you get to play with R -d gdb. However, my experience is that line-by-line tracing is usually a dead end, unless you have the trouble spot pretty well narrowed down. Apart from that, my usual procedure would be 1) find a minimal script reproducing the issue and hang onto it. Or at least as small as you can get it without losing the bug. Notice that any change to either the script or R itself may allow the bug to run away and hide somewhere else. 
2) if memory corruption is involved, run under gdb, set a hardware watchpoint on the relevant location (this gets a little tricky sometimes because it might be outside the initial address space, in which case you need to somehow run the code for a while, break to gdb, and then set the watchpoint). 3) It is not unlikely that the watchpoint triggers several thousand times before the relevant one. You can conditionalize it; a nice trick is to use the gc_count.
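For reference, the valgrind incantation alluded to above is documented in the "Writing R Extensions" manual; roughly (the script name is a placeholder for the minimal reproducing script from step 1):

```sh
# Run R under valgrind's memcheck tool on the minimal script.
R -d valgrind --vanilla < myscript.R

# Slower, but reports where uninitialised values originated
# and checks for leaks as well.
R -d "valgrind --track-origins=yes --leak-check=full" --vanilla < myscript.R
```

As noted, this mainly catches use of uninitialised data and reads/writes off the ends of allocated blocks; corruption through a stale but still-valid pointer will usually need the gdb watchpoint approach instead.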
Re: [R] Mann-Whitney U
Prof Brian Ripley wrote: On Tue, 14 Aug 2007, Natalie O'Toole wrote: Hi, Could someone please tell me how to perform a Mann-Whitney U test on a dataset with 2 groups where one group has more data values than another? I have split up my 2 groups into 2 columns in the .txt file I'm using with R. Here is the code I have so far:

group1 <- c(LeafArea2)
group2 <- c(LeafArea1)
wilcox.test(group1, group2)

This code works for datasets with the same number of data values in each column, but not when one column has a different number of data values than the other. There is an example of that scenario on the help page for wilcox.test, so it does 'work'. What exactly went wrong for you? Is the solution that I have to have a null value in the data column with the fewer data values? I'm testing for significant differences between the 2 groups, and the result I'm getting in R with the uneven values is different from what I'm getting in SPSS. We need a worked example. As the help page says, definitions do differ. If you can provide a reproducible example in R and the output from SPSS we may be able to tell you how to relate that to what you see in R. [...] PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. As it says, we really need such code (and the output you get) to be able to help you. Also, two variables of different length in two columns is not a good idea. If you read in things in parallel columns, it would usually imply paired data. If one column is shorter, you may be reading different data than you think. Check e.g. the sleep data for a better format.
Re: [R] Combining two ANOVA outputs of different lengths
Christoph Scherber wrote: Dear R users, I have been trying to combine two anova outputs into one single table (for later publication). The outputs are of different length, and share only some common explanatory variables. Using merge() or melt() (from the reshape package) did not work out. Here are the model outputs and what I would like to have:

anova(model1)
            numDF denDF  F-value p-value
(Intercept)     1    74 0.063446  0.8018
days            1    74 6.613997  0.0121
logdiv          1    74 1.587983  0.2116
leg             1    74 4.425843  0.0388

anova(model2)
             numDF denDF   F-value p-value
(Intercept)      1    73 165.94569  <.0001
funcgr           1    73   7.91999  0.0063
grass            1    73  42.16909  <.0001
leg              1    73   4.72108  0.0330
funcgr:grass     1    73   8.49068  0.0047

#merge(anova(model1),anova(model2),...)
             F-value 1   p-val1 F-value 2 p-value 2
(Intercept)   0.063446   0.8018 165.94569    <.0001
days          6.613997   0.0121        NA        NA
logdiv        1.587983   0.2116        NA        NA
leg           4.425843   0.0388   4.72108    0.0330
funcgr              NA       NA   7.91999    0.0063
grass               NA       NA  42.16909    <.0001
funcgr:grass        NA       NA   8.49068    0.0047

I would be glad if someone would have an idea of how to do this in principle. The main problems are that the merge key is the rownames and that you want to keep entries that are missing in one of the analyses. There are ways to deal with that:

example(anova.lm)
...
merge(anova(fit2), anova(fit4), by=0, all=T)
  Row.names Df.x  Sum Sq.x Mean Sq.x F value.x     Pr(>F).x Df.y  Sum Sq.y
1      ddpi   NA        NA        NA        NA           NA    1  63.05403
2       dpi   NA        NA        NA        NA           NA    1  12.40095
3     pop15    1 204.11757 204.11757 13.211166 0.0006878681    1 204.11757
4     pop75    1  53.34271  53.34271  3.452517 0.0694253851    1  53.34271
5 Residuals   47 726.16797  15.45038        NA           NA   45 650.71300
  Mean Sq.y  F value.y     Pr(>F).y
1  63.05403  4.3604959 0.0424711387
2  12.40095  0.8575863 0.3593550848
3 204.11757 14.1157322 0.0004921955
4  53.34271  3.6889104 0.0611254598
5  14.46029         NA           NA

Presumably, you can take it from here. 
Re: [R] Rcmdr window border lost
Andy Weller wrote: OK, I tried completely removing and reinstalling R, but this has not worked - I am still missing window borders for Rcmdr. I am certain that everything is installed correctly and that all dependencies are met - there must be something trivial I am missing?! Thanks in advance, Andy Andy Weller wrote: Dear all, I have recently lost my Rcmdr window borders (all my other programs have borders)! I am unsure of what I have done, although I have recently run update.packages() in R... How can I reclaim them? I am using: Ubuntu Linux (Feisty), R version 2.5.1, R Commander version 1.3-5. This sort of behaviour is usually the fault of the window manager, not R/Rcmdr/tcltk. It's the WM's job to supply the various window decorations on a new window, so either it never got told that there was a window, or it somehow got into a confused state. Did you try restarting the WM (i.e., log out/in or reboot)? And which WM are we talking about? The same combination works fine on Fedora 7, except for a load of messages saying "Warning: X11 protocol error: BadWindow (invalid Window parameter)". I have deleted the folder /usr/local/lib/R/site-library/Rcmdr and reinstalled Rcmdr with install.packages("Rcmdr", dep=TRUE). This has not solved my problem though. Maybe I need to reinstall something else as well? Thanks in advance, Andy
Re: [R] Need Help: Installing/Using xtable package
M. Jankowski wrote: Hi all, Let me know if I need to ask this question of the bioconductor group. I used the bioconductor utility to install this package and also the CRAN install.packages function. My computer crashed a week ago. Today I reinstalled all my bioconductor/R packages. One of my scripts is giving me the following error: in my script I set library(xtable) and call print.xtable(... and receive this error:

Error : could not find function "print.xtable"

This is a new error and I cannot find the source. Looks like the current xtable is no longer exporting its print methods. Why were you calling print.xtable explicitly in the first place?
Re: [R] Invert Likert-Scale Values
(Ted Harding) wrote: On 04-Aug-07 22:02:33, William Revelle wrote: Alexis and John, To reverse a Likert-like item, subtract the item from the maximum acceptable value + the minimum acceptable value. That is, if

x <- 1:8
xreverse <- 9 - x

Bill A few of us have suggested this, but Alexis's welcome for the recode() suggestion indicates that by the time he gets round to this, his Likert scale values have already become levels of a factor. Levels 1, 2, ... of a factor may look like integers, but they're not; and R will not let you do arithmetic on them:

x <- factor(c(1,1,1,2,2,2))
x
[1] 1 1 1 2 2 2
Levels: 1 2
y <- (3 - x)
Warning message:
- not meaningful for factors in: Ops.factor(3, x)
y
[1] NA NA NA NA NA NA

However, you can turn them back into integers, reverse, and then turn the results back into a factor:

y <- factor(3 - as.integer(x))
y
[1] 2 2 2 1 1 1
Levels: 1 2

So, even for factors, the insight underlying our suggestion of "max + min - x" is still valid! :) Er, wouldn't y <- factor(x, levels=2:1, labels=1:2) be more to the point?
Re: [R] lme and aov
Gang Chen wrote: I have a mixed balanced ANOVA design with a between-subject factor (Grp) and a within-subject factor (Rsp). When I tried the following two commands, which I thought were equivalent,

fit.lme <- lme(Beta ~ Grp*Rsp, random = ~1|Subj, Model)
fit.aov <- aov(Beta ~ Rsp*Grp + Error(Subj/Rsp) + Grp, Model)

I got totally different results. What did I do wrong? Except for not telling us what your data are and what you mean by "totally different"? One model has a random interaction between Subj and Rsp, the other does not. This may make a difference, unless the interaction term is aliased with the residual error. If your data are unbalanced, aov is not guaranteed to give meaningful results. -pd
Re: [R] lme and aov
Gang Chen wrote: Thanks a lot for the clarification! I just started to learn programming in R a week ago, and wanted to try a simple mixed design of balanced ANOVA with a between-subject factor (Grp) and a within-subject factor (Rsp), but I'm not sure whether I'm modeling the data correctly with either of the command lines. Here is the result. Any help would be highly appreciated.

fit.lme <- lme(Beta ~ Grp*Rsp, random = ~1|Subj, Model)
summary(fit.lme)

Linear mixed-effects model fit by REML
 Data: Model
      AIC      BIC    logLik
  233.732 251.9454 -108.8660

Random effects:
 Formula: ~1 | Subj
        (Intercept)  Residual
StdDev:    1.800246 0.3779612

Fixed effects: Beta ~ Grp * Rsp
                 Value Std.Error DF    t-value p-value
(Intercept)  1.1551502 0.5101839 36  2.2641837  0.0297
GrpB        -1.1561248 0.7215090 36 -1.6023706  0.1178
GrpC        -1.2345321 0.7215090 36 -1.7110417  0.0957
RspB        -0.0563077 0.1482486 36 -0.3798196  0.7063
GrpB:RspB   -0.3739339 0.2096551 36 -1.7835665  0.0829
GrpC:RspB    0.3452539 0.2096551 36  1.6467705  0.1083
 Correlation:
          (Intr)   GrpB   GrpC   RspB GrB:RB
GrpB      -0.707
GrpC      -0.707  0.500
RspB      -0.145  0.103  0.103
GrpB:RspB  0.103 -0.145 -0.073 -0.707
GrpC:RspB  0.103 -0.073 -0.145 -0.707  0.500

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-1.72266114 -0.41242552  0.02994094  0.41348767  1.72323563

Number of Observations: 78
Number of Groups: 39

fit.aov <- aov(Beta ~ Rsp*Grp + Error(Subj/Rsp) + Grp, Model)
fit.aov

Call: aov(formula = Beta ~ Rsp * Grp + Error(Subj/Rsp) + Grp, data = Model)

Grand Mean: 0.3253307

Stratum 1: Subj
Terms:                Grp
Sum of Squares   5.191404
Deg. of Freedom         1
1 out of 2 effects not estimable
Estimated effects are balanced

Stratum 2: Subj:Rsp
Terms:                    Rsp
Sum of Squares   7.060585e-05
Deg. of Freedom             1
2 out of 3 effects not estimable
Estimated effects are balanced

Stratum 3: Within
Terms:              Rsp      Grp  Rsp:Grp Residuals
Sum of Squares  0.33428 36.96518  1.50105 227.49594
Deg. of Freedom       1        2        2        70

Residual standard error: 1.802760
Estimated effects may be unbalanced

This looks odd. 
It is a standard split-plot layout, right? 3 groups of 13 subjects, each measured with two kinds of Rsp = 3x13x2 = 78 observations. In that case you shouldn't see the same effect allocated to multiple error strata. I suspect you forgot to declare Subj as a factor. Also: summary(fit.aov) is nicer, and anova(fit.lme) should be informative.
Re: [R] - round() strange behaviour
Monica Pisica wrote: Hi, I am getting some strange results using round - it seems that it depends on whether the number before the decimal point is odd or even. For example:

round(1.5)
[1] 2
round(2.5)
[1] 2

While I would expect round(2.5) to be 3 and not 2. Do you have any explanation for that? http://www.google.com/search?q=round+to+even
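This is "round half to even" (banker's rounding), the IEC 60559/IEEE 754 convention that R follows; exact halves go to the nearest even integer, which keeps rounding unbiased over many values:

```r
## Exact .5 values round to the nearest even integer:
round(0.5)  # 0
round(1.5)  # 2
round(2.5)  # 2
round(3.5)  # 4

## Non-halves round normally:
round(2.4)  # 2
round(2.6)  # 3
```

Note also that many apparent .5 values (e.g. from arithmetic on decimal fractions) are not exactly representable in binary floating point, so their rounding can look inconsistent for a different reason.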
Re: [R] the large dataset problem
(Ted Harding) wrote: On 30-Jul-07 11:40:47, Eric Doviak wrote: [...] Sympathies for the constraints you are operating in! The Introduction to R manual suggests modifying input files with Perl. Any tips on how to get started? Would Perl Data Language (PDL) be a good choice? http://pdl.perl.org/index_en.html I've not used SIPP files, but it seems that they are available in delimited format, including CSV. For extracting a subset of fields (especially when large datasets may stretch RAM resources) I would use awk rather than perl, since it is a much lighter program, transparent to code for, efficient, and it will do that job. On a Linux/Unix system (see below), say I wanted to extract fields 1, 1000, 1275, ..., 5678 from a CSV file. Then the awk line that would do it would look like

awk 'BEGIN{FS=","} {print $(1) "," $(1000) "," $(1275) "," ... "," $(5678)}' sippfile.csv > newdata.csv

Awk reads one line at a time, and does with it what you tell it to do. Yes, but notice that there are also options within R. If you use a carefully constructed colClasses= argument to read.table()/read.csv()/etc. or a what= argument to scan(), you don't get more columns than you ask for. The basic trick is to use "NULL" for each of the columns that you do NOT want, and preferably "numeric" or "character" or whatever for those that you want (NA lets read.table do its usual trickery of guessing type from contents). However... I wrote a script which loads large datasets a few lines at a time, writes the dozen or so variables of interest to a CSV file, removes the loaded data and then (via a for loop) loads the next few lines. I managed to get it to work with one of the SIPP core files, but it's S-L-O-W. See above ... Looking at the actual data files and data dictionaries (we're talking about http://www.bls.census.gov/sipp_ftp.html, right?), it looks like SIPP files are in a fixed-width format, which suggests that you might want to employ read.fwf(). 
If you want to get really smart about it, extract the 'D' fields from the dictionary files. Try this:

dict <- readLines("ftp://www.sipp.census.gov/pub/sipp/2004/l04puw1d.txt")
D.lines <- grep("^D ", dict)
vdict <- read.table(con <- textConnection(dict[D.lines])); close(con)
head(vdict)

A little bit of further fiddling and you have the list of field widths and variable names to feed to read.fwf(). Just subset the name list and set the field width negative for those variables that you wish to skip. Extracting value labels from the 'V' fields looks like it could be done, but requires more thinking, especially where they straddle multiple lines (but hey, it's your job, not mine...) -Peter D.
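A small sketch of the colClasses trick described above (the file name and column count are hypothetical): "NULL" entries make read.csv skip a column entirely, so only the requested fields are ever stored.

```r
## Suppose sippfile.csv has 5 columns and we only want columns 1 and 3.
cc <- rep("NULL", 5)   # skip everything...
cc[c(1, 3)] <- NA      # ...except columns 1 and 3 (NA = guess the type)

small <- read.csv("sippfile.csv", colClasses = cc)
## 'small' now has just 2 columns, and the skipped fields were
## never converted or kept in memory.
```

For the real SIPP files the vector would of course be thousands of entries long, but it can be built programmatically from the variable names in the data dictionary.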
Re: [R] manipulating arrays
Henrique Dallazuanna wrote: Hi, I don't know if it is the most elegant way, but:

X <- c(1,2,3,4,5)
X <- c(X[1], 0, X[2:5])

append(X, 0, 1)
Re: [R] About infinite value
arigado wrote: Hi everyone, I have a problem with infinite values. If I type 10^308, R shows 1e+308. When I type 10^309, R shows Inf. So we know that if a value is larger than 1.XXXe+308, R will show Inf. How can I make a value like 10^400, typed into R, show as 1e+400 and not Inf? 1. You can't, due to the computer representation of floating point numbers. 2. Package brobdingnag lets you do it anyway.
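A quick illustration of point 2 (assuming the brobdingnag package is installed from CRAN): it represents numbers by their logarithm, sidestepping the ~1.8e+308 ceiling of double precision.

```r
library(brobdingnag)

## Ordinary doubles overflow:
10^400            # Inf

## Brobdingnagian numbers do not; the value is stored as log(x),
## so 10^400 is representable and arithmetic still works.
x <- as.brob(10)^400
x * 10            # 1e+401, printed in brobdingnag's exp() notation
```

The trade-off is that brob numbers are S4 objects, so they are slower than doubles and not accepted by most numerical code.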
Re: [R] summary of linear fixed effects model is different than the HSAUR book
Christopher W. Ryan wrote: But on page 169, summary() is shown to produce additional columns in the fixed effects section, namely degrees of freedom and the P-value (with significance stars). How can I produce that output? Am I doing something wrong? Has lme4 changed? The latter. To make a long story short, the author got so fed up with the reliability of the DF heuristics that he decided to remove them altogether.
Re: [R] RAM, swap, Error: cannot allocate vector of size, Linux:
Feldman, Maximilian Jeffrey wrote: Dear Community, I am very new to the world of Linux and R and I have stumbled upon a problem that I cannot seem to resolve on my own. Here is the relevant background: I am working on a 64-bit Linux Fedora Core 6 OS. I am using R version 2.5.1. I have 3.8 Gb of RAM and 1.9 Gb of swap. As I see it, there are no restraints on the amount of memory that R can use imposed by this particular OS build. When I type in the 'ulimit' command at the command line the response is 'unlimited'. Here is the problem: I have uploaded and normalized 48 ATH1 microarray slides using the justRMA function.

library(affy)
setwd("/Data/cel")
Data <- justRMA()

The next step in my analysis is to calculate a distance matrix for my dataset using the bioDist package. This is where I get my error.

library(bioDist)
x <- cor.dist(exprs(Data))
Error: cannot allocate vector of size 3.9 Gb

I used the following function to examine my memory limitations: mem.limits() gives nsize NA, vsize NA. I believe this means there isn't any specified limit to the amount of memory R can allocate to my task. I realize I only have 3.8 Gb of RAM, but I would expect that R would use my 1.9 Gb of swap. It does, if swap works at all on your machine. However, the error message relates to the object that R fails to create, not the total memory usage. I.e. this might very well be the _second_ object of size 3.9 Gb that you are trying to fit into 5.7 Gb of memory. You could try increasing the swap space (the expedient, although perhaps not efficient, way is to find a file system with a few tens of Gb to spare and create a large swapfile on it.) Does R not use my swap space? Can I explicitly tell R to use my swap space for large tasks such as this? I was not able to find any information regarding this particular issue in the R Linux manual, Linux FAQ, or on previous listserv threads. Many of the users who had similar questions resolved their problems in a different manner.
Thanks to anyone who thinks they can provide assistance! Max Graduate Student Molecular Plant Sciences Washington State University
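As a rough check on where the 3.9 Gb figure comes from: the ATH1 chip has on the order of 22810 probe sets (a detail assumed here, not stated in the thread), and a square double-precision matrix of that order costs just about what the error message reports:

```r
n <- 22810          # approx. number of ATH1 probe sets (assumption)
bytes <- n^2 * 8    # one double is 8 bytes
bytes / 2^30        # about 3.9 GiB, matching "cannot allocate vector of size 3.9 Gb"
```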
Re: [R] Strange warning in summary.lm
ONKELINX, Thierry wrote: The problem also exists in a clean workspace. But I've found the troublemaker. I had set options(OutDec = ","). Resetting this to options(OutDec = ".") solved the problem. Thanks, Thierry Oups. That sounds like there's a bug somewhere. Can you cook up a minimal example which shows the behaviour?
Re: [R] Strange warning in summary.lm
Prof Brian Ripley wrote: On Thu, 19 Jul 2007, Peter Dalgaard wrote: ONKELINX, Thierry wrote: The problem also exists in a clean workspace. But I've found the troublemaker. I had set options(OutDec = ","). Resetting this to options(OutDec = ".") solved the problem. Thanks, Thierry Oups. That sounds like there's a bug somewhere. Can you cook up a minimal example which shows the behaviour? Any use of summary.lm will do it (e.g. example(lm)). The problem is in printCoefmat, at

x0 <- (xm[okP] == 0) != (as.numeric(Cf[okP]) == 0)

and yes, it looks like an infelicity to me. Ick. Any better ideas than

printsAs0 <- scan(con <- textConnection(Cf[okP]), dec = getOption("OutDec")); close(con)
x0 <- (xm[okP] == 0) != printsAs0

?
Re: [R] tapply
sigalit mangut-leiba wrote: I'm sorry for the unfocused questions, I'm new here... the output should be:

class  aps_mean
1      NA
2      11.5
3      8

the mean aps of every class, where every id counts *once*; for example: class 2, mean = (11+12)/2 = 11.5. Hope it's clearer. Much... Get the first record for each individual from (e.g.)

icul.redux <- subset(icul, !duplicated(id))

then use tapply as before using variables from icul.redux. Or in one go:

with(subset(icul, !duplicated(id)), tapply(aps, class, mean, na.rm = TRUE))
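A self-contained version of the suggestion, on made-up data shaped like the poster's (ids repeat, and aps is constant within id; note that a class with only missing aps comes out as NaN rather than NA):

```r
# Hypothetical stand-in for the poster's icul data frame
icul <- data.frame(
  id    = c(1, 1, 2, 2, 2, 3),
  class = c(1, 1, 2, 2, 2, 2),
  aps   = c(NA, NA, 11, 11, 11, 12)
)
# keep only the first record per id, then average within class
with(subset(icul, !duplicated(id)), tapply(aps, class, mean, na.rm = TRUE))
#    1    2
#  NaN 11.5
```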
Re: [R] R equivalent to Matlab's Bayes net toolbox
On Wed, 2007-07-18 at 03:52, Jose wrote: The thing that I don't understand in the gR page is why there are so many different packages and why they are not very integrated: You have to understand the gR project for that. It started from a number of completely separate pieces of software within the general field of graphical models, and tried to bring people together and make the existing pieces of software accessible from R. Given that the active core of the group was really just a handful of people with limited R programming experience (much of the original code was written in dialects of Pascal/Delphi), the project must be said to have had some success. However, the most pronounced effect has been to bring those old codes out into the open; seamless integration is still quite far over the horizon.
Re: [R] Sorting data frame by a string variable
Dimitri Liakhovitski wrote: I have a data frame MyData with 2 variables. One of the variables (String) contains a string of letters. How can I re-sort MyData by MyData$String (alphabetically) and then save the output as a sorted data file? I tried:

o <- order(MyData$String)
SortedData <- rbind(MyData$String[o], MyData$Value[o])
write.table(SortedData, file = "Sorted.txt", sep = "\t", quote = FALSE, row.names = FALSE)

However, all strings get replaced with digits (1 for the first string, 2 for the second string etc.). How can I keep the strings instead of digits? Why on earth are you trying to rbind() things together? Anything wrong with

SortedData <- MyData[o, ]
write.table(SortedData, ...whatever...)

? Thank you! Dimitri
Re: [R] Hmisc variable labels as vector?
Steve Powell wrote: Dear members, I have imported an SPSS data file using Hmisc, so label(mydata[[1]]) gives me the first variable label. Just wondering how I can access all the variable labels as a vector? Something like label(mydata[[1:3]]), but that doesn't work. Something like sapply(mydata, label) should work.
Re: [R] table function
sigalit mangut-leiba wrote: Hello all, I want to use the table function, but for every id I have a different no. of rows (obs.). If I write table(x$class, x$infec), I don't get the right frequencies, because I should count every id once: if id 1 has 20 observations it should count as one. Can I use the unique function here? Hope it's clear. Almost. I assume that class and infec are constant over id? (If people change infection status during the trial, you have a more complex problem). You could then use unique() like this

with(unique(x[c("id", "class", "infec")]), table(class, infec))

but I'd prefer using duplicated(), as in

with(subset(x, !duplicated(id)), table(class, infec))

(notice that the latter tabulates the first record for each id, whereas the former will count ids multiple times if they change class or infec).
Re: [R] The $ operator and vectors
Gustaf Rydevik wrote: Hi all, I've run into a slightly illogical (to me) behaviour with the $ subsetting function. Consider:

> Test
  A B
1 1 Q
2 2 R
> Test$A
[1] 1 2
> vector <- "A"
> Test$vector
NULL
> Test$A
[1] 1 2
> Test[, vector]
[1] 1 2

Is there a reason for the $ operator not evaluating the vector before executing? Yes, the evaluation rule for $ is like that. Notice that it also didn't go looking for an object called A when you said Test$A.
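A small illustration of the rule Peter states: `$` treats its right-hand side as a literal name, while `[[` (and `[`) evaluate it first. (The variable is called vec here rather than the poster's vector, to avoid masking base::vector.)

```r
Test <- data.frame(A = 1:2, B = c("Q", "R"))
vec <- "A"       # the name of the column we want, as a string

Test$vec         # NULL: '$' looks for a column literally called "vec"
Test[[vec]]      # '[[' evaluates vec first, so this returns column A
Test[, vec]      # same result with matrix-style indexing
```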
Re: [R] filling a list faster
Balazs Torma wrote: Thank you all for your answers! The problem is that I don't know the length of the list in advance! I hoped for a convenience structure which reallocates once the preallocated list (or matrix) becomes full. That's not massively hard to do yourself, is it? As in

if (i > N) { l <- c(l, vector("list", N)); N <- N*2 }

i.e.

N <- 1; l <- vector("list", N)
system.time(for(i in (1:1e5)) {
  if (i > N) { l <- c(l, vector("list", N)); N <- N*2 }
  l[[i]] <- c(i, i+1, i)
})
   user  system elapsed
  1.508   0.012   1.520
l[(i+1):N] <- NULL
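A runnable sketch of the same doubling idiom, with the final length made explicit (n.items stands in for the unknown number of elements; capacity N doubles whenever the list fills up, and the unused tail is dropped at the end):

```r
n.items <- 1000
N <- 1
l <- vector("list", N)
for (i in 1:n.items) {
  if (i > N) {                    # list is full: double its capacity
    l <- c(l, vector("list", N))
    N <- N * 2
  }
  l[[i]] <- c(i, i + 1, i)
}
length(l) <- n.items              # drop the unused tail slots
```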
Re: [R] charset in graphics
Donatas G. wrote: How do I make Lithuanian characters display correctly in R graphics? Instead of the special characters for the Lithuanian language I get question marks... I use Ubuntu Feisty, the locale is utf-8... Do I need to specify somewhere the locale for R, or a default font for the graphics? You mean as in

plot(0, main = "\u104\u116\u0118\u012e\u0172\u016a\u010c\u0160\u017d")
plot(0, main = tolower("\u104\u116\u0118\u012e\u0172\u016a\u010c\u0160\u017d"))

? This works fine for me on OpenSUSE 10.2, so I don't think the issue is in R. More likely, this has to do with X11 fonts (Unicode is handled via a rather complicated mechanism involving virtual fonts). Postscript/PDF is a bit more difficult. See ?postscript and the reference to Murrell and Ripley's R News article inside. The correct incantation seems to be

postscript(family = "URWHelvetica", encoding = "ISOLatin7")
plot(0, main = tolower("\u104\u116\u0118\u012e\u0172\u016a\u010c\u0160\u017d"))
dev.off()
Re: [R] charset in graphics
Prof Brian Ripley wrote: On Fri, 13 Jul 2007, Peter Dalgaard wrote: The correct incantation seems to be

postscript(family = "URWHelvetica", encoding = "ISOLatin7")
plot(0, main = tolower("\u104\u116\u0118\u012e\u0172\u016a\u010c\u0160\u017d"))
dev.off()

The encoding should happen automagically in a Lithuanian UTF-8 locale, and does for me. But suitable fonts (e.g. URW ones) are needed. OK, I sort of suspected that, although it wasn't entirely clear to me whether autoconversion would cover cases like en_LT.utf8, if that even exists. Still, the explicit (portable?) way of doing it is probably worth knowing too (there could be a few pitfalls with scripts getting run outside their usual domain).
Re: [R] Correlation matrix
Caskenette, Amanda wrote: I have a model with 5 parameters that I am optimising, where the (best) value of the objective function is negative. I would like to use the Hessian matrix (from the genoud and/or optim functions) to construct the covariance and correlation matrices. This is the code that I am using:

est <- out$par                            # Parameter estimates
H   <- out$hessian                        # Hessian
V   <- solve(H)                           # Covariance matrix
s   <- sqrt(abs(diag(V)))                 # Vector of standard deviations
cor <- V/(s %o% s)                        # Correlation coefficient matrix
ci  <- est + qnorm(0.975)*s %o% c(-1, 1)  # 95% CI's

However I am getting values that are greater than 1 (1.05, 2.34, etc.) in the correlation matrix. Might this be due to the fact that out$val is negative? Not by itself (just add a large enough constant to the objective function and the value becomes positive without changing the Hessian). More likely, you have not actually found the minimum (Hessian not positive definite), or there is a code error. Print out and review the following items: H, eigen(H), V, s, s %o% s and see if that makes you any wiser (why are you taking abs(diag(V))? Negative elements?)
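One way to sanity-check the pieces, sketched on a made-up positive definite Hessian. Base R's cov2cor() does the V/(s %o% s) step in one call, and the eigen() check is exactly the diagnostic Peter suggests: at a true minimum all eigenvalues of H are positive, and then no correlation can exceed 1 in absolute value.

```r
H <- matrix(c(2, 0.5, 0.5, 1), 2, 2)  # toy positive definite Hessian (assumption)
stopifnot(all(eigen(H)$values > 0))   # minimum found: H must be positive definite
V <- solve(H)                         # covariance matrix
R <- cov2cor(V)                       # same as V / (sqrt(diag(V)) %o% sqrt(diag(V)))
max(abs(R))                           # never exceeds 1 for a valid covariance matrix
```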
Re: [R] Compute rank within factor groups
Ken Williams wrote: Hi, I have a data.frame which is ordered by score, and has a factor column:

Browse[1]> wc[c("report","score")]
        report score
9         ADEA  0.96
8         ADEA  0.90
11 Asylum_FED9  0.86
3         ADEA  0.75
14 Asylum_FED9  0.60
5         ADEA  0.56
13 Asylum_FED9  0.51
16 Asylum_FED9  0.51
2         ADEA  0.42
7         ADEA  0.31
17 Asylum_FED9  0.27
1         ADEA  0.17
4         ADEA  0.17
6         ADEA  0.12
10        ADEA  0.11
12 Asylum_FED9  0.10
15 Asylum_FED9  0.09
18 Asylum_FED9  0.07
Browse[1]>

I need to add a column indicating rank within each factor group, which I currently accomplish like so:

wc$rank <- 0
for(report in as.character(unique(wc$report))) {
  wc[wc$report == report, ]$rank <- 1:sum(wc$report == report)
}

I have to wonder whether there's a better way, something that gets rid of the for() loop using tapply() or by() or similar. But I haven't come up with anything. I've tried these:

by(wc, wc$report, FUN = function(pr){ pr$rank <- 1:nrow(pr) })
by(wc, wc$report, FUN = function(pr){ wc[wc$report %in% pr$report, ]$rank <- 1:nrow(pr) })

But in both cases the effect of the assignment is lost; there's no $rank column generated for wc. Any suggestions? There's a little known and somewhat unfortunately named function called ave() which does just that sort of thing.

> ave(wc$score, wc$report, FUN = rank)
 [1] 10.0  9.0  8.0  8.0  7.0  7.0  5.5  5.5  6.0  5.0  4.0  3.5  3.5  2.0  1.0
[16]  3.0  2.0  1.0
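On a small stand-in data frame (not the poster's real wc), ave() fills in a within-group statistic of the same length as each group. With FUN = seq_along it reproduces the 1..n counter the poster's loop computes, provided the frame is already sorted by score:

```r
# Hypothetical miniature of the poster's data, already sorted by score
wc <- data.frame(
  report = c("ADEA", "ADEA", "FED9", "ADEA", "FED9"),
  score  = c(0.96, 0.90, 0.86, 0.75, 0.60)
)
# 1..n within each report group, in the current (score-sorted) row order
wc$rank <- ave(seq_along(wc$report), wc$report, FUN = seq_along)
wc$rank
# [1] 1 2 1 3 2
```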
Re: [R] inquiry about anova and ancova
Anderson, Mary-Jane wrote: Dear R users, I have a rather knotty analysis problem and I was hoping that someone on this list would be able to help. I was advised to try this list by a colleague who uses R, but it is a statistical inquiry, not about how to use R. In brief I have a 3x2 ANOVA: 2 tasks under 3 conditions, within subjects. I also took a variety of personality measures that might influence the results under the different conditions. I had thought that an ANCOVA would be the best test, but it might be the case that this would not work with a within-subjects design. I have not found anything that explicitly states whether or not it would, but all the examples I have read are between-subjects designs. I also thought of investigating a MANOVA, but it is not really the case that I have more than one DV; it is the same DV in 6 different combinations of task and condition. There were 4 personality measures and I wanted to look at the degree to which they affected the task/condition interaction. I have explained this briefly here, but I can of course provide more details to anyone who can advise me further with this. This sounds like a job for a Multivariate Linear Model (assuming that you have complete data for each subject or are prepared to throw away subjects with missing values). This lets you decompose the response into mean, effects of task and condition, and the interaction effect. Each component can then be separately tested for effect of predictors, using multivariate tests, or F tests under sphericity assumptions. Have a look at example(anova.mlm); this mostly looks at cases where effects are tested against zero, but the last example involves a (bogus) between-subject factor f.
Re: [R] type III ANOVA for a nested linear model
Carsten Jaeger wrote: Hello Peter, thanks for your help. I'm quite sure that I specified the right model. Factor C is indeed nested within factor A. I think you were confused by the numbering of C (1..11), and it is easier to understand when I code it as you suggested (1, 2, 3 within each level of A, as in mydata1 [see below]). However, it does not matter which numbering I choose for carrying out the analysis, as

anova(lm(resp ~ A * B + (C %in% A), mydata))
anova(lm(resp ~ A * B + (C %in% A), mydata1))

both give the same results (as at least I had expected because of the nesting). However, I found that Anova() from the car package only accepts the second version. So

Anova(lm(resp ~ A * B + (C %in% A), mydata))

does not work (giving an error) but

Anova(lm(resp ~ A * B + (C %in% A), mydata1))

does. This behaviour is rather confusing, or is there anything I'm missing? You're not listening to what I told you! A term C %in% A (or A/C) is not a _specification_ that C is nested in A, it is a _directive_ to include the terms A and C:A. Now, C:A involves a term for each combination of A and C, of which many are empty if C is strictly coarser than A. This may well be what is confusing Anova(). In fact, with this (c(1:3, 6:11)) coding of C, A:C is completely equivalent to C, but if you look at summary(lm()) you will see a lot of NA coefficients in the A:C case. If you use resp ~ A*B+C, then you still get a couple of missing coefficients in the C terms because of collinearity with the A terms. (Notice that this is one case where the order inside the model formula will matter; C+A*B is not the same.) Whether you'd want C as a random factor is a different matter. It is often the natural model if C is subject and A is group. Let's assume that this is the case: In an ordinary linear model, you can test whether you can remove C (or A:C), which implies that all subjects in the same group have the same level of the response. In your case, the hypothesis is accepted, but the F statistic is around 3 (on (6, 6) DF), which suggests that there might be some variation of subjects within groups. In a mixed-effects model, you assume that this variation exists and therefore you use the SSD for C as the denominator when testing A, which is arguably safer than pooling it with the somewhat smaller residual SSD. Thanks for your help again, Carsten

R> mydata
   A B  C  resp
1  1 1  1 34.12
2  1 1  2 32.45
3  1 1  3 44.55
4  1 2  1 20.88
5  1 2  2 22.32
6  1 2  3 27.71
7  2 1  6 38.20
8  2 1  7 31.62
9  2 1  8 38.71
10 2 2  6 18.93
11 2 2  7 20.57
12 2 2  8 31.55
13 3 1  9 40.81
14 3 1 10 42.23
15 3 1 11 41.26
16 3 2  9 28.41
17 3 2 10 24.07
18 3 2 11 21.16

R> mydata1
   A B C  resp
1  1 1 1 34.12
2  1 1 2 32.45
3  1 1 3 44.55
4  1 2 1 20.88
5  1 2 2 22.32
6  1 2 3 27.71
7  2 1 1 38.20
8  2 1 2 31.62
9  2 1 3 38.71
10 2 2 1 18.93
11 2 2 2 20.57
12 2 2 3 31.55
13 3 1 1 40.81
14 3 1 2 42.23
15 3 1 3 41.26
16 3 2 1 28.41
17 3 2 2 24.07
18 3 2 3 21.16

On Tue, 2007-07-10 at 13:54 +0200, Peter Dalgaard wrote: Carsten Jaeger wrote: Hello, is it possible to obtain type III sums of squares for a nested model as in the following:

lmod <- lm(resp ~ A * B + (C %in% A), mydata)

I have tried

library(car)
Anova(lmod, type="III")

but this gives me an error (and I also understand from the documentation of Anova as well as from a previous request (http://finzi.psych.upenn.edu/R/Rhelp02a/archive/64477.html) that it is not possible to specify nested models with car's Anova). anova(lmod) works, of course. My data (given below) is balanced so I expect the results to be similar for both type I and type III sums of squares. But are they *exactly* the same? The editor of the journal which I'm sending my manuscript to requests what he calls conventional type III tests and I'm not sure if I can convince him to accept my type I analysis. In balanced designs, type I-IV SSDs are all identical. However, I don't think the model does what I think you think it does. Notice that nesting is used with two different meanings; in R it would be that the codings of C only make sense within levels of A - e.g. if they were numbered 1:3 within each group, but with C==1 when A==1 having nothing to do with C==1 when A==2. SAS does something. er. else... What I think you want is a model where C is a random term so that main effects of A can be tested, like in

summary(aov(resp ~ A * B + Error(C), dd))
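Peter's point that `C %in% A` is a directive to include A and the A:C interaction, not a declaration of nesting, can be seen directly from how R expands the formula (no data needed; A, B, C are just symbols here):

```r
# Expand the formula into its constituent model terms
tl <- attr(terms(resp ~ A * B + (C %in% A)), "term.labels")
tl
# contains "A", "B", "A:B" and an A-by-C interaction term,
# but no standalone "C" term: C only enters via the interaction
```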
Re: [R] make error R-5.1 on sun solaris
Dan Powers wrote: I hope this is enough information to determine the problem. Thanks in advance for any help. Configure goes ok (I think):

./configure --prefix=$HOME --without-iconv

R is now configured for sparc-sun-solaris2.9

  Source directory:          .
  Installation directory:    /home/dpowers
  C compiler:                gcc -g -O2
  Fortran 77 compiler:       f95 -g
  C++ compiler:              g++ -g -O2
  Fortran 90/95 compiler:    f95 -g
  Obj-C compiler:            -g -O2
  Interfaces supported:      X11
  External libraries:        readline
  Additional capabilities:   NLS
  Options enabled:           shared BLAS, R profiling, Java
  Recommended packages:      yes

Make ends after the gcc:

make
. . .
gcc -I. -I../../src/include -I../../src/include -I/usr/openwin/include -I/usr/local/include -DHAVE_CONFIG_H -g -O2 -c system.c -o system.o
system.c: In function `Rf_initialize_R':
system.c:144: parse error before `char'
system.c:216: `localedir' undeclared (first use in this function)
system.c:216: (Each undeclared identifier is reported only once
system.c:216: for each function it appears in.)
*** Error code 1
make: Fatal error: Command failed for target `system.o'
Current working directory /home/dpowers/R-2.5.1/src/unix
*** Error code 1
make: Fatal error: Command failed for target `R'
Current working directory /home/dpowers/R-2.5.1/src/unix
*** Error code 1
make: Fatal error: Command failed for target `R'
Current working directory /home/dpowers/R-2.5.1/src
*** Error code 1
make: Fatal error: Command failed for target `R'

I have tried setting localedir directly in configure options, but get the same error. Any ideas? Hmm, which version of gcc is this?
The problem seems to be around line 144, which reads

140     Rstart Rp = rstart;
141     cmdlines[0] = '\0';
142
143 #ifdef ENABLE_NLS
144     char localedir[PATH_MAX+20];
145 #endif
146
147 #if defined(HAVE_SYS_RESOURCE_H) && defined(HAVE_GETRLIMIT)
148 {
149     struct rlimit rlim;

I seem to remember that it used to be non-kosher to mix declarations and ordinary code like that, but the current compiler doesn't seem to care (I do have #define ENABLE_NLS 1 in Rconfig.h, as I assume you do too). Could you perhaps try moving line 141 down below the #endif? Thanks, Dan =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Daniel A. Powers, Ph.D. Department of Sociology University of Texas at Austin 1 University Station A1700 Austin, TX 78712-0118 phone: 512-232-6335 fax: 512-471-1748 [EMAIL PROTECTED]
Re: [R] Repeated Measure different results to spss
mb2 wrote: Hi, I have some problems with my repeated measures analysis. When I compute it with SPSS I get different results than with R. Probably I am doing something wrong in R. I have two groups (1, 2), both having to solve a task under two conditions (1, 2). That is one between-subject factor (group) and one within-subject factor (task). I tried the following:

aov(Score ~ factor(Group)*factor(Task) + Error(Id))
aov(Score ~ factor(Group)*factor(Task))

but it leads to different results than my SPSS. I definitely miss some point here. Did you mean Error(factor(Id))? With that modification, things look sane. Can't vouch for SPSS... (As a general matter, I prefer to do the factor conversions up front, rather than inside model formulas.)
Re: [R] type III ANOVA for a nested linear model
Carsten Jaeger wrote: Hello, is it possible to obtain type III sums of squares for a nested model as in the following:

lmod <- lm(resp ~ A * B + (C %in% A), mydata)

I have tried

library(car)
Anova(lmod, type="III")

but this gives me an error (and I also understand from the documentation of Anova as well as from a previous request (http://finzi.psych.upenn.edu/R/Rhelp02a/archive/64477.html) that it is not possible to specify nested models with car's Anova). anova(lmod) works, of course. My data (given below) is balanced so I expect the results to be similar for both type I and type III sums of squares. But are they *exactly* the same? The editor of the journal which I'm sending my manuscript to requests what he calls conventional type III tests and I'm not sure if I can convince him to accept my type I analysis. In balanced designs, type I-IV SSDs are all identical. However, I don't think the model does what I think you think it does. Notice that nesting is used with two different meanings; in R it would be that the codings of C only make sense within levels of A - e.g. if they were numbered 1:3 within each group, but with C==1 when A==1 having nothing to do with C==1 when A==2. SAS does something. er. else... What I think you want is a model where C is a random term so that main effects of A can be tested, like in

summary(aov(resp ~ A * B + Error(C), dd))

Error: C
          Df  Sum Sq Mean Sq F value Pr(>F)
A          2  33.123  16.562  0.4981 0.6308
Residuals  6 199.501  33.250

Error: Within
          Df Sum Sq Mean Sq F value   Pr(>F)
B          1 915.21  915.21 83.7846 9.57e-05 ***
A:B        2  16.13    8.07  0.7384   0.5168
Residuals  6  65.54   10.92
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(This is essentially the same structure as Martin Bleichner had earlier today, also @web.de. What is this? An epidemic? ;-))
Re: [R] Building R on Interix 6.0
length of command line arguments... 262144
checking command to parse /bin/nm -B output from gcc object... ok
checking for objdir... .libs
checking for ranlib... (cached) ranlib
checking for strip... strip
checking if gcc static flag works... yes
checking if gcc supports -fno-rtti -fno-exceptions... no
checking for gcc option to produce PIC... -fPIC
checking if gcc PIC flag -fPIC works... yes
checking if gcc supports -c -o file.o... yes
checking whether the gcc linker (/opt/gcc.3.3/i586-pc-interix3/bin/ld) supports shared libraries... no
checking dynamic linker characteristics... no
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... yes
checking if libtool supports shared libraries... no
checking whether to build shared libraries... no
checking whether to build static libraries... yes
configure: creating libtool
appending configuration tag "CXX" to libtool
checking for ld used by g++... /opt/gcc.3.3/i586-pc-interix3/bin/ld
checking if the linker (/opt/gcc.3.3/i586-pc-interix3/bin/ld) is GNU ld... yes
checking whether the g++ linker (/opt/gcc.3.3/i586-pc-interix3/bin/ld) supports shared libraries... no
sed: 1: "s/\*/\\\*/g": invalid command code
checking for g++ option to produce PIC... -fPIC
checking if g++ PIC flag -fPIC works... yes
checking if g++ supports -c -o file.o... yes
checking whether the g++ linker (/opt/gcc.3.3/i586-pc-interix3/bin/ld) supports shared libraries... no
checking dynamic linker characteristics... no
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... yes
appending configuration tag "F77" to libtool
checking if libtool supports shared libraries... no
checking whether to build shared libraries... no
checking whether to build static libraries... yes
checking for g77 option to produce PIC... -fPIC
checking if g77 PIC flag -fPIC works... yes
checking if g77 supports -c -o file.o... yes
checking whether the g77 linker (/opt/gcc.3.3/i586-pc-interix3/bin/ld) supports shared libraries... no
checking dynamic linker characteristics... no
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... yes
./configure: : bad substitution
Re: [R] using the function unique(), but asking it to ignore a column of a data.frame
Andrew Yee wrote: Thanks. But in this specific case, I would like the output to include all three columns, including the ignored column (in this case, I'd like it to ignore column a).

df[!duplicated(df[, c("a", "c")]), ]

or perhaps

df[!duplicated(df[-2]), ]

Thanks, Andrew

On 7/9/07, hadley wickham [EMAIL PROTECTED] wrote: On 7/9/07, Andrew Yee [EMAIL PROTECTED] wrote: Take for example the following data.frame:

a <- c(1, 1, 5)
b <- c(3, 2, 3)
c <- c(5, 1, 5)
sample.data.frame <- data.frame(a = a, b = b, c = c)

I'd like to be able to use unique(sample.data.frame), but have unique() ignore column a when determining the unique elements. I figured that this would be a setting for incomparables=, but it appears that this functionality hasn't been incorporated. Is there a workaround for this, i.e. to get unique() to look only at selected columns of a data frame?

unique(df[, c("a", "c")]) ? Hadley
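A minimal self-contained sketch of the duplicated() approach (data from the thread; to ignore column a, uniqueness is judged on the remaining columns):

```r
# Example data from the thread
a <- c(1, 1, 5)
b <- c(3, 2, 3)
c <- c(5, 1, 5)
df <- data.frame(a = a, b = b, c = c)

# Keep all three columns, but decide uniqueness on columns b and c only,
# i.e. ignore column a:
df[!duplicated(df[, c("b", "c")]), ]
```

duplicated() keeps the first occurrence of each combination, whereas unique(df[, c("b", "c")]) would return only the two inspected columns.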
Re: [R] ANOVA: Does a Between-Subjects Factor belong in the Error Term?
Alex Baugh wrote: I am executing a Repeated Measures Analysis of Variance with 1 DV (LOCOMOTOR RESPONSE), 2 Within-Subjects Factors (AGE, ACOUSTIC CONDITION), and 1 Between-Subjects Factor (SEX). Does anyone know whether the between-subjects factor (SEX) belongs in the Error Term of the aov or not? And if it does belong, where in the Error Term does it go? The 3 possible scenarios are listed below:

1. Omit Sex from the Error Term:
My.aov = aov(Locomotor.Response ~ (Age*AcousticCond*Sex) + Error(Subject/(Timepoint*Acx.Cond)), data=locomotor.tab)
note: Placing SEX outside the double parentheses of the Error Term has the same statistical outcome as omitting it altogether from the Error Term (as shown above in #1).

2. Include SEX inside the Error Term (inside double parentheses):
My.aov = aov(Locomotor.Response ~ (Age*AcousticCond*Sex) + Error(Subject/(Timepoint*Acx.Cond+Sex)), data=locomotor.tab)

3. Include SEX inside the Error Term (inside single parentheses):
My.aov = aov(Locomotor.Response ~ (Age*AcousticCond*Sex) + Error(Subject/(Timepoint*Acx.Cond)+Sex), data=locomotor.tab)
note: Placing SEX inside the single parentheses (as shown above in #3) generates no main effect of Sex. Thus, I'm fairly confident that option #3 is incorrect.

Scenarios 1, 2, and 3 yield different results in the aov summary.

You don't generally want terms with systematic effects to appear as error terms also, so 3 is wrong. In 2 you basically have a random effect of sex within subject, which is nonsensical since the subjects presumably have only one sex each. This presumably generates an error stratum with 0 DF, which may well be harmless. That leaves 1 as the likely solution. You'll probably do yourself a favour if you learn to expand error terms, a/b == a + a:b, etc.; that's considerably more constructive than trying to think in terms of whether things are inside or outside parentheses. Thanks for your help!
Alex
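The a/b == a + a:b expansion rule can be checked directly with terms(); a small sketch (the variable names are taken from the thread, and the response y is a placeholder):

```r
# The nesting operator expands as a/b == a + a:b, so the error formula
# Subject/(Timepoint * Acx.Cond) is shorthand for four error strata:
f1 <- y ~ Subject / (Timepoint * Acx.Cond)
f2 <- y ~ Subject + Subject:Timepoint + Subject:Acx.Cond +
      Subject:Timepoint:Acx.Cond

# Both formulas generate the same model terms:
attr(terms(f1), "term.labels")
attr(terms(f2), "term.labels")
```

Writing the strata out this way makes it obvious which error terms a given Error() specification actually requests.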
Re: [R] How does the r-distribution function work
pieter claassen wrote: I am trying to understand what the rbinom function does. Here is some sample code. Are both invocations of bfunc effectively doing the same thing, or am I missing the point?

There are some newbie issues with your code (you are extending a on every iteration, and your bfunc is just rbinom with the parameters in a different order), but basically, yes: they are conceptually the same. Both give 1 independent binomial samples. In fact, if you reset the random number generator in between, they also give the same results (this is an implementation issue and not obviously guaranteed for any distribution). Here's an example with smaller values than 1 and 30.

set.seed(123)
rbinom(10, 1, .5)
[1] 0 1 0 1 1 0 1 1 1 0
set.seed(123)
for (i in 1:10) print(rbinom(1, 1, .5))
[1] 0
[1] 1
[1] 0
[1] 1
[1] 1
[1] 0
[1] 1
[1] 1
[1] 1
[1] 0
set.seed(123)
replicate(10, rbinom(1, 1, .5))
[1] 0 1 0 1 1 0 1 1 1 0

Thanks, Pieter

bfunc <- function(n1, p1, sims) {
  c <- rbinom(sims, n1, p1)
  c
}
a = c()
b = c()
p1 = .5
for (i in 1:1) {
  a[i] = bfunc(30, p1, 1)
}
b = bfunc(30, p1, 1)
Re: [R] t.test
matthew wrote: Hi, how can I solve a problem without the function t.test? For example:

x <- c(1, 3, 5, 7)
y <- c(2, 4, 6)
t.test(x, y, alternative = "less", paired = FALSE, var.equal = TRUE, conf.level = 0.95)

Homework? Hints: Take out your statistics textbook and look up the formulas for the two-sample t. You'll probably (there can be some variation depending on the book) find that you need to compute

- difference of means
- sd for each group
- pooled sd
- s.e. of the difference of means

all of which you can do easily in R, once you have the formulas. Then calculate the t statistic and the corresponding p value, either using a table or R's function for the t distribution.
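A worked sketch of the hand calculation outlined above, using the example data from the question (this is one standard textbook formulation, not code from the thread; compare the result against t.test() itself):

```r
x <- c(1, 3, 5, 7)
y <- c(2, 4, 6)
nx <- length(x); ny <- length(y)

md <- mean(x) - mean(y)                   # difference of means
s2 <- ((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2)  # pooled variance
se <- sqrt(s2 * (1/nx + 1/ny))            # s.e. of the difference of means
t.stat <- md / se
p.val  <- pt(t.stat, df = nx + ny - 2)    # one-sided, alternative = "less"

c(t = t.stat, p = p.val)
# These should agree with t.test(x, y, alternative = "less", var.equal = TRUE)
```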
Re: [R] Warning message: cannot create HTML package index
Leo wrote: On 06/07/2007, Prof Brian Ripley wrote: On Fri, 6 Jul 2007, Leo wrote: I have set R_LIBS=~/R_lib as I don't have root access. The following message shows up every time after installing a package: ... The downloaded packages are in /tmp/RtmpBoIPoz/downloaded_packages Warning message: cannot create HTML package index in: tools:::unix.packages.html(.Library) Any ideas?

It is a correct warning. What is the problem with being warned? R tries to maintain an HTML page of installed packages, but you don't have permission to update it.

Where is that HTML page located on a GNU/Linux system? Is it possible to maintain a user HTML page of installed packages? Thanks,

This confuses me a bit too. I had gotten used to the warning without thinking about it. It tries to update $RHOME/doc/html/packages.html, which starts like this:

<h3>Packages in the standard library</h3>

However, if I run help.start(), I get

> help.start()
Making links in per-session dir ...
If 'firefox' is already running, it is *not* restarted, and you must switch to its window. Otherwise, be patient ...

and then it opens (say) file:///tmp/RtmpXyp5Cg/.R/doc/html/index.html which has a link to file:///tmp/RtmpXyp5Cg/.R/doc/html/packages.html which looks like this:

<h3>Packages in /home/bs/pd/Rlibrary</h3>
<h3>Packages in /usr/lib64/R/library</h3>

I.e. it is autogenerated by help.start() and doesn't even look at the file in $RHOME. So what puzzles me is (a) why we maintain $RHOME/doc/html/packages.html at all. One argument could be that this is browsable for everyone on a system, even without starting R. But then (b) why do we even try updating it when packages are installed in a private location?
Re: [R] Me again, about the horrible documentation of tcltk
Alberto Monteiro wrote: How on Earth can I know what the arguments of any of the functions of the tcltk package are? I tried hard to find out, using all search engines available, looking deep into keywords of R, python's tkinter and tcl/tk, but nowhere did I find anything remotely similar to a help. For example, what are the possible arguments to tkgetOpenFile? I know that this works:

library(tcltk)
filename <- tclvalue(tkgetOpenFile(
  filetypes = "{{Porn Files} {.jpg}} {{All files} {*}}"))
if (filename != "") cat("Selected file:", filename, "\n")

but, besides filetypes, what are the other arguments to tkgetOpenFile? I would like to force the files to be sorted by time, with most recent files coming first (and no, the purpose is not to use for porn files).

man n tk_getOpenFile

or if you are not on Unix/Linux, find it online with Google

Alberto Monteiro
Re: [R] Lookups in R
mfrumin wrote: Hey all; I'm a beginner++ user of R, trying to use it to do some processing of data sets of over 1M rows, and running into a snafu. Imagine that my input is a huge table of transactions, each linked to a specific user id. As I run through the transactions, I need to update a separate table for the users, but I am finding that the traditional ways of doing a table lookup are way too slow to support this kind of operation. I.e.:

for (i in 1:100) {
  userid <- transactions$userid[i]
  amt <- transactions$amounts[i]
  users[users$id == userid, "amt"] <- users[users$id == userid, "amt"] + amt
}

I assume this is a linear lookup through the users table (in which there are 10's of thousands of rows), when really what I need is O(constant time), or at worst O(log(# users)). Is there any way to manage a list of IDs (be they numeric, string, etc.) and have them efficiently mapped to some other table index? I see the CRAN package for SQLite hashes, but that seems to be going a bit too far.

Sometimes you need a bit of lateral thinking. I suspect that you could do it like this:

tbl <- with(transactions, tapply(amount, userid, sum))
users$amt <- users$amt + tbl[users$id]

One catch is that there could be users with no transactions, in which case you may need to replace userid by factor(userid, levels=users$id). None of this is tested, of course.
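A tiny runnable version of the tapply suggestion, including the no-transaction catch mentioned above (the data here are invented for illustration):

```r
transactions <- data.frame(userid = c("u1", "u2", "u1"),
                           amount = c(10, 5, 2))
users <- data.frame(id = c("u1", "u2", "u3"), amt = 0,
                    stringsAsFactors = FALSE)

# One vectorized pass: sum the amounts per user, with a level for every
# known user so that users without transactions still get an entry:
tbl <- with(transactions,
            tapply(amount, factor(userid, levels = users$id), sum))
tbl[is.na(tbl)] <- 0          # users with no transactions (here u3)
users$amt <- users$amt + tbl[users$id]
users
```

The per-user aggregation replaces the whole for loop, which is where the quadratic cost of the repeated `users$id == userid` scans came from.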
Re: [R] Lookups in R
Michael Frumin wrote: I wish it were that simple. Unfortunately the logic I have to do on each transaction is substantially more complicated, and involves referencing the existing values of the user table through a number of conditions. Any other thoughts on how to get better-than-linear performance? Is there a recommended binary searching/sorting (i.e. BTree) module that I could use to maintain my own index?

The point remains: To do anything efficient in R, you need to get rid of that for loop and use something vectorized. Notice that you can expand values from the user table into the transaction table by indexing with transactions$userid, or you can use a merge operation.

thanks, mike
Re: [R] focus to tkwindow after a PDF window pop up
Hao Liu wrote: Dear All: I currently have a Tk window start an acroread window. However, when the acroread window is open, I can't get back to the Tk window unless I close acroread. I invoked the acroread window using:

system(paste("acroread ", file, sep = ""))

Anything I can do to make them both available to users?

Tell system() not to _wait_ for the command to complete.
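Concretely, that is the wait argument of system(); a one-line sketch (file is assumed to hold the path to the PDF, as in the thread):

```r
# Launch the viewer in the background so the R/Tk session stays responsive:
system(paste("acroread ", file, sep = ""), wait = FALSE)
```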
Re: [R] compute time span in months between two dates
Aydemir, Zava (FID) wrote: Hi, I am just starting to play with R. What is the recommended manner for calculating time spans between 2 dates? In particular, should I be using the chron or the date package (so far I have just found how to calculate a timespan in terms of days)? Thanks

I'd recommend something along these lines:

d1 <- "11/03-1959"
d2 <- "2/7-2007"
f <- "%d/%m-%Y"
as.numeric(as.Date(d2, f) - as.Date(d1, f), units = "days")

(The format in f needs to be adjusted to the actual format, of course. For some formats, it can be omitted altogether.)

Zava
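Since the subject line asks for a span in months rather than days, here is one possible convention (an assumption, not from the thread): count whole calendar months via the POSIXlt fields, ignoring the day of the month.

```r
d1 <- "11/03-1959"; d2 <- "2/7-2007"
f  <- "%d/%m-%Y"

p1 <- as.POSIXlt(as.Date(d1, f))
p2 <- as.POSIXlt(as.Date(d2, f))

# $year is years since 1900, $mon is 0-11, so the calendar-month span is:
12 * (p2$year - p1$year) + (p2$mon - p1$mon)
```

Other conventions (e.g. rounding by day of month, or average month length) give different answers, which is exactly why "months between dates" needs a stated definition.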
Re: [R] exaustive subgrouping or combination
David Duffy wrote: Waverley [EMAIL PROTECTED] asked: Dear Colleagues, I am looking for a package or previously implemented R code to subgroup and exhaustively divide a vector of sequence into 2 groups. -- Waverley @ Palo Alto

Google "[R] Generating all possible partitions" and you will find some R code from 2002 or so. In 2002 this wasn't already in R. These days, help(combn) is more to the point:

mn <- sort(zapsmall(combn(sleep$extra, 10, mean)))
plot(unique(mn), table(mn))
abline(v = mean(sleep$extra[1:10]))
Re: [R] Comparison: glm() vs. bigglm()
Benilton Carvalho wrote: Hi, until now I thought that the results of glm() and bigglm() would coincide. Probably a naive assumption? Anyway, I've been using bigglm() on some datasets I have available. One of the sets has 15M observations. I have 3 continuous predictors (A, B, C) and a binary outcome (Y), and tried the following:

m1 <- bigglm(Y ~ A + B + C, family = binomial(), data = dataset1, chunksize = 10e6)
m2 <- bigglm(Y ~ A * B + C, family = binomial(), data = dataset1, chunksize = 10e6)
imp <- m1$deviance - m2$deviance

To my surprise, imp was negative. I then tried the same models using glm() instead... and as I expected, imp was positive. I also noticed differences in the coefficients estimated by glm() and bigglm() - small differences, though, and CIs for the coefficients (a given coefficient compared across methods) overlap. Are such incongruities expected? What can I use to check for convergence with bigglm(), as this might be one plausible cause for a negative difference of the deviances?

It doesn't sound right, but I cannot reproduce your problem on a similar sized problem (it pretty much killed my machine...). Some observations:

A: You do realize that you are only using 1.5 chunks? (15M vs. 10e6 chunksize)

B: Deviance changes are O(1) under the null hypothesis but the deviances themselves are O(N). In a smaller variant (N=1e5), I got

> m1$deviance
[1] 138626.4
> m2$deviance
[1] 138626.4
> m2$deviance - m1$deviance
[1] -0.05865785

This does leave some scope for roundoff to creep in. You may want to play with a lower setting of tol=...
Re: [R] logistic regression and dummy variable coding
Li, Bingshan wrote: Hi Frank, I do not quite get you. What do you mean by simulation and speed issues? I do not see why they have to be considered in logistic regression.

Exactly. So don't use techniques that are only needed when such issues do have to be considered.

Thanks. Bingshan

From: Frank E Harrell Jr [mailto:[EMAIL PROTECTED]] Sent: Fri 6/29/2007 7:40 AM To: Li, Bingshan Cc: Seyed Reza Jafarzadeh; r-help@stat.math.ethz.ch Subject: Re: [R] logistic regression and dummy variable coding

Bingshan Li wrote: Hi All, now it works. Thanks for all your answers, and the explanations are very clear. Bingshan

But note that you are not using R correctly unless you are doing a simulation and have some special speed issues. Let the model functions do all this for you. Frank

On Jun 28, 2007, at 7:44 PM, Seyed Reza Jafarzadeh wrote:

NewVar <- relevel(factor(OldVar), ref = "b")

should create a dummy variable, and change the reference category for the model. Reza

On 6/28/07, Bingshan Li [EMAIL PROTECTED] wrote: Hello everyone, I have a variable with several categories and I want to convert this into dummy variables and do logistic regression on it. I used model.matrix to create dummy variables but it always picked the smallest one as the reference. For example,

model.matrix(~ ., data = as.data.frame(letters[1:5]))

will code 'a' as '0 0 0 0'. But I want to code another category as reference, say 'b'. How do I do that in R using model.matrix? Is there another way to do it if model.matrix has no such functionality? Thanks!
-- Frank E Harrell Jr, Professor and Chair, Department of Biostatistics, School of Medicine, Vanderbilt University
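Both routes discussed in the thread, in one self-contained sketch (the five-level factor mirrors the letters example from the question):

```r
x <- factor(letters[1:5])

# Route 1 (Reza's suggestion): relevel, then let model.matrix / the model
# functions build the dummies with "b" as the reference level:
x2 <- relevel(x, ref = "b")
model.matrix(~ x2)

# Route 2: set treatment contrasts explicitly, with level 2 ("b") as base:
contrasts(x) <- contr.treatment(5, base = 2)
model.matrix(~ x)
```

In both cases the "b" column disappears from the design matrix and every other level is coded relative to it, which is what changing the reference category means.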
Re: [R] Dominant eigenvector displayed as third (Marco Visser)
Marco Visser wrote: Dear R users / experts, this is just a curiosity: I was wondering why the dominant eigenvector and eigenvalue of the following matrix is given as the third. I guess this could complicate automatic selection procedures.

0 0 0 0 0 5
1 0 0 0 0 0
0 1 0 0 0 0
0 0 1 0 0 0
0 0 0 1 0 0
0 0 0 0 1 0

Please copy-paste the following into R:

a <- c(0,0,0,0,0,5, 1,0,0,0,0,0, 0,1,0,0,0,0, 0,0,1,0,0,0, 0,0,0,1,0,0, 0,0,0,0,1,0)
mat <- matrix(a, ncol = 6, byrow = TRUE)
eigen(mat)

The matrix is a population matrix for a plant pathogen (Powell et al 2005). Basically I would really like to know why this happens so I will know if it can occur again. Thanks for any comments, Marco Visser

Comment: In Matlab the dominant eigenvector and eigenvalue of the described matrix are given as the sixth. Again, no idea why.

I get

> eigen(mat)$values
[1] -0.65383+1.132467i -0.65383-1.132467i  0.65383+1.132467i  0.65383-1.132467i
[5] -1.30766+0.000000i  1.30766+0.000000i
> Mod(eigen(mat)$values)
[1] 1.307660 1.307660 1.307660 1.307660 1.307660 1.307660

So all the eigenvalues are equal in modulus. What makes you think one of them is dominant?
Re: [R] exaustive subgrouping or combination
David Duffy wrote: On Fri, 29 Jun 2007, Peter Dalgaard wrote: David Duffy wrote: Waverley [EMAIL PROTECTED] asked: Dear Colleagues, I am looking for a package or previously implemented R code to subgroup and exhaustively divide a vector of sequence into 2 groups. -- Waverley @ Palo Alto Google "[R] Generating all possible partitions" and you will find some R code from 2002 or so. In 2002 this wasn't already in R. These days, help(combn) is more to the point:

mn <- sort(zapsmall(combn(sleep$extra, 10, mean)))
plot(unique(mn), table(mn))
abline(v = mean(sleep$extra[1:10]))

As I read it, the original query is about partitioning the set, e.g. ((1 2) 3), ((1 3) 2), (1 (2 3)).

Yes, and

> combn(3, 2)
     [,1] [,2] [,3]
[1,]    1    1    2
[2,]    2    3    3

gives you the first group of each of the three partitions.
[R] R 2.5.1 is released
I've rolled up R-2.5.1.tar.gz a short while ago. This is a maintenance release and fixes a number of mostly minor bugs and platform issues. See the full list of changes below. You can get it (in a short while) from

http://cran.r-project.org/src/base/R-2/R-2.5.1.tar.gz

or wait for it to be mirrored at a CRAN site nearer to you. Binaries for various platforms will appear in due course.

For the R Core Team
Peter Dalgaard

These are the md5sums for the freshly created files, in case you wish to check that they are uncorrupted:

a8efde35b940278de19730d326f58449  AUTHORS
eb723b61539feef013de476e68b5c50a  COPYING
a6f89e2100d9b6cdffcea4f398e37343  COPYING.LIB
24ad9647e525609bce11f6f6ff9eac2d  FAQ
70447ae7f2c35233d3065b004aa4f331  INSTALL
f04bdfaf8b021d046b8040c8d21dad41  NEWS
88bbd6781faedc788a1cbd434194480c  ONEWS
4f004de59e24a52d0f500063b4603bcb  OONEWS
162f6d5a1bd7c60fd652145e050f3f3c  R-2.5.1.tar.gz
162f6d5a1bd7c60fd652145e050f3f3c  R-latest.tar.gz
433182754c05c2cf7a04ad0da474a1d0  README
020479f381d5f9038dcb18708997f5da  RESOURCES
4eaf8a3e428694523edc16feb0140206  THANKS

Here is the relevant bit of the NEWS file:

CHANGES IN R VERSION 2.5.1

NEW FEATURES

o density(1:20, bw = "SJ") now works, as bw.SJ() now tries a larger search interval than the default (lower, upper) if it does not find a solution within the latter.

o The output of library() (no arguments) is now sorted by library trees in the order of .libPaths() and not alphabetically.

o R_LIBS_USER and R_LIBS_SITE feature possible expansion of specifiers for R version specific information as part of the startup process.

o C-level warning calls now print a more informative context, as C-level errors have for a while.

o There is a new option rl_word_breaks to control the way the input line is tokenized in the readline-based terminal interface for object- and file-name completion. This allows it to be tuned for people who use their space bar vs those who do not. The default now allows filename-completion with +-* in the filenames.
o If the srcfile argument to parse() is not NULL, it will be added to the result as a srcfile attribute.

o It is no longer possible to interrupt lazy-loading (which was only at all likely when lazy-loading environments), which would leave the object being loaded in an unusable state. This is a temporary measure: error-recovery when evaluating promises will be tackled more comprehensively in 2.6.0.

INSTALLATION

o 'make check' will work with --without-iconv, to accommodate building on AIX where the system iconv conflicts with libiconv and is not compatible with R's requirements.

o There is support for 'DESTDIR': see the R-admin manual.

o The texinfo manuals are now converted to HTML with a style sheet: in recent versions of makeinfo the markup such as @file was being lost in the HTML rendering.

o The use of inlining has been tweaked to avoid warnings from gcc >= 4.2.0 when compiling in C99 mode (which is the default from configure).

BUG FIXES

o as.dendrogram() failed on objects of class dendrogram.

o plot(type = "s") (or "S") with many (hundreds of thousands) of points could overflow the stack. (PR#9629)

o Coercing an S4 classed object to matrix (or other basic class) failed to unset the S4 bit.

o The 'useS4' argument of print.default() had been broken by an unrelated change prior to 2.4.1. This allowed print() and show() to bounce badly constructed S4 objects between themselves indefinitely.

o Prediction of the seasonal component in HoltWinters() was one step out at one point in the calculations. decompose() incorrectly computed the 'random' component for a multiplicative fit.

o Wildcards work again in unlink() on Unix-alikes (they did not in 2.5.0).

o When qr() used pivoting, the coefficient names in qr.coef() were not pivoted to match. (PR#9623)

o UseMethod() could crash R if the first argument was not a character string.

o R and Rscript on Unix-alikes were not accepting spaces in -e arguments (even if quoted).

o Hexadecimal integer constants (e.g. 0x10L) were not being parsed correctly on platforms where the C function atof did not accept hexadecimal prefixes (as required by C99, but not implemented in MinGW as used by R on Windows). (PR#9648)

o libRlapack.dylib on Mac OS X had no version information and sometimes an invalid identification name.

o Rd conversion of \usage treated '\\' as a single backslash in all but latex: it now acts consistently with the other verbatim-like environments (it was never 'verbatim
Re: [R] : regular expressions: escaping a dot
Prof Brian Ripley wrote: This is explained in ?regexp (in the See Also of ?regexpr): "Patterns are described here as they would be printed by 'cat': _do remember that backslashes need to be doubled when entering R character strings from the keyboard_." and in the R FAQ and

Hmm, that's not actually correct, is it? Perhaps this is better: "...entering R character string literals (i.e., between quote symbols)." The counterexample would be

> readLines(n = 1)
\abc
[1] "\\abc"

(of course it is more important to get people to read the documentation at all...)
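The doubled-backslash point in a couple of self-contained lines (an illustration, not code from the thread):

```r
# In a string literal, "\\." is two characters: backslash + dot.
nchar("\\.")                      # 2

# As a regex, "." matches any character; "\\." matches only a literal dot:
grep("a.c",   c("abc", "a.c"))    # both elements match
grep("a\\.c", c("abc", "a.c"))    # only the second element matches
```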
Re: [R] aov and lme differ with interaction in oats example of MASS?
Karl Knoblick wrote: Dear R-Community! The example oats in MASS (2nd edition, 10.3, p.309) is calculated for aov and lme without interaction term and the results are the same. But I have problems reproducing the aov example with interaction in MASS (10.2, p.301) with lme. Here is the script:

library(MASS)
library(nlme)
options(contrasts = c("contr.treatment", "contr.poly"))
# aov: Y ~ N + V
oats.aov <- aov(Y ~ N + V + Error(B/V), data = oats, qr = TRUE)
summary(oats.aov)
# now lme
oats.lme <- lme(Y ~ N + V, random = ~1 | B/V, data = oats)
anova(oats.lme, type = "m")  # OK!
# aov: Y ~ N * V + Error(B/V)
oats.aov2 <- aov(Y ~ N * V + Error(B/V), data = oats, qr = TRUE)
summary(oats.aov2)
# now lme - my trial!
oats.lme2 <- lme(Y ~ N * V, random = ~1 | B/V, data = oats)
anova(oats.lme2, type = "m")  # differences!!! (except for the interaction term)

My questions: 1) Is there a possibility to reproduce the result of aov with interaction using lme? 2) If not, which result of the above is the correct one for the oats example?

The issue is that you are using marginal tests, which will do strange things when contrasts are not coded right, and in particular treatment contrasts are not. Switch to e.g. contr.helmert and the results become similar. Marginal tests of main effects in the presence of interaction are not necessarily a good idea, and they have been debated here and elsewhere a number of times before. People don't agree entirely, but the dividing line is essentially whether it is uniformly or just mostly a bad idea. It is essentially the discussion of type III SS.

> fortune("type III")
Some of us feel that type III sum of squares and so-called ls-means are statistical nonsense which should have been left in SAS. -- Brian D. Ripley, s-news (May 1999)

Thanks a lot! Karl
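The contr.helmert suggestion can be sketched as follows (a sketch, assuming the MASS and nlme packages and the oats data from the thread; not a run from the original poster):

```r
library(MASS)   # provides the oats data set
library(nlme)

## Helmert contrasts are orthogonal, so the marginal main-effect tests
## from lme() line up much better with the aov() strata table than
## treatment contrasts do when an interaction is present:
options(contrasts = c("contr.helmert", "contr.poly"))

oats.aov2 <- aov(Y ~ N * V + Error(B/V), data = oats)
summary(oats.aov2)

oats.lme2 <- lme(Y ~ N * V, random = ~1 | B/V, data = oats)
anova(oats.lme2, type = "marginal")
```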
Re: [R] Repeat if
Birgit Lemcke wrote: Thanks, that was really a quick answer. It works but I get this warning message anyway:

1: kein nicht-fehlendes Argument für min; gebe Inf zurück
   (no non-missing argument to min; returning Inf)
2: kein nicht-fehlendes Argument für max; gebe -Inf zurück
   (no non-missing argument to max; returning -Inf)

What does this mean?

Same as this:

> max(c(NA, NA), na.rm = T)
[1] -Inf
Warning message:
no non-missing arguments to max; returning -Inf

which is related to the issues of empty sum(), prod(), any(), and all() in that it allows a consistent concatenation rule: max(c(x1, x2)) == max(max(x1), max(x2))
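The identity elements behind that rule can be seen directly on empty vectors:

```r
## Empty vectors give the identity elements of the reductions
## (max and min warn while doing so):
max(numeric(0))    # -Inf, with a warning
min(numeric(0))    #  Inf, with a warning
sum(numeric(0))    #  0
prod(numeric(0))   #  1

## This is what makes the concatenation rule hold even when one
## part is empty (or, after na.rm, contains nothing but NAs):
x1 <- c(2, 5)
x2 <- numeric(0)
max(c(x1, x2)) == max(max(x1), max(x2))  # TRUE
```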
Re: [R] ANOVA non-sphericity test and corrections (eg, Greenhouse-Geisser)
DarrenWeber wrote: I'm an experimental psychologist, and when I run an ANOVA in SPSS I normally ask for a test of non-sphericity (Box's M-test). I also ask for output of the corrections for non-sphericity, such as Greenhouse-Geisser and Huynh-Feldt. These tests and correction factors are commonly used in journals for experimental and other psychology reports. I have been switching from SPSS to R for over a year now, but I realize now that I don't have the non-sphericity test and correction factors.

This can be done using anova.mlm() and mauchly.test(), which work on mlm objects, i.e., lm() output where the response is a matrix. There is no theory, to my knowledge, to support it for general aov() models, the catch being that you need to have a within-subject covariance matrix.
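A minimal sketch of the mlm route (simulated data; the subject count, condition names and effect sizes are invented). The response is a matrix with one column per within-subject condition, and the Greenhouse-Geisser and Huynh-Feldt corrections appear in the anova.mlm() output:

```r
set.seed(1)
## 10 subjects x 3 within-subject conditions, as a response matrix:
Y <- matrix(rnorm(30, mean = rep(c(10, 11, 12), each = 10)),
            nrow = 10,
            dimnames = list(NULL, c("cond1", "cond2", "cond3")))

mlmfit  <- lm(Y ~ 1)            # an "mlm" object
mlmfit0 <- update(mlmfit, ~ 0)  # the same, without the intercept

mauchly.test(mlmfit, X = ~ 1)   # test of sphericity

## Within-subject test; test = "Spherical" reports the GG and HF
## epsilons and the corrected p-values:
anova(mlmfit, mlmfit0, X = ~ 1, test = "Spherical")
```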
Re: [R] Source code for rlogis
Anup Nandialath wrote: Dear friends, I was trying to read the source code for rlogis but ran into a roadblock. It shows

[[1]]
function (n, location = 0, scale = 1)
.Internal(rlogis(n, location, scale))
<environment: namespace:stats>

Is it possible to access the source code for the same?

Yes, but as it is .Internal, you have to look in the (C code) sources for R itself. You can access that either by getting the source files for R and unpacking them somewhere on your computer, or by browsing e.g. https://svn.R-project.org/R/tags/R-2-5-0 or https://svn.r-project.org/R/branches/R-2-5-branch. Specifically, src/nmath/rlogis.c.
Re: [R] FW: Suse RPM installation problem
Stephen Henderson wrote: Thanks for your help. As you suggested I do indeed have a 64-bit version called exactly the same:

PC5-140:/home/rmgzshd # rpm -qf /usr/lib/libpng12.so.0
libpng-32bit-1.2.8-19.5
PC5-140:/home/rmgzshd # rpm -qf /usr/lib64/libpng12.so.0
libpng-1.2.8-19.5

So how do I tell rpm to find this and not the 32-bit file? Or do I need to edit something in the rpm file? Thanks

Odd... Do you actually _have_ /usr/lib64/libpng12.so.0 (whereis didn't seem to find it) --- as opposed to rpm -qf telling you which package contains the file? If not, try (re)installing libpng, possibly with --force.

-Original Message-
From: Peter Dalgaard [mailto:[EMAIL PROTECTED]
Sent: Thu 6/21/2007 6:34 PM
To: Stephen Henderson
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] FW: Suse RPM installation problem

Stephen Henderson wrote: Hello. I am trying to install the R RPM for Suse 10.0 on an x86_64 PC. However I am failing a dependency for libpng12.so.0 straight away:

PC5-140:/home/rmgzshd # rpm -i R-base-2.5.0-2.1.x86_64.rpm
error: Failed dependencies:
        libpng12.so.0(PNG12_0)(64bit) is needed by R-base-2.5.0-2.1.x86_64

I do seem to have this file:

PC5-140:/home/rmgzshd # whereis libpng12.so.0
libpng12.so: /usr/lib/libpng12.so.0 /usr/local/lib/libpng12.so

but presuming that it is not the 64-bit version mentioned, I went looking for a 64-bit version but could not find it through Google. However, reading the Installation manual I noted that libpng is mentioned in the context of a source build. I therefore downloaded libpng-1.2.18 (v1.2.8 or later is specified in the manual) and successfully compiled it. This did not, however, help with my problem. Any suggestions?
I have:

viggo:~> rpm -qf /usr/lib/libpng12.so.0
libpng-32bit-1.2.12-25
viggo:~> rpm -qf /usr/lib64/libpng12.so.0
libpng-1.2.12-25
viggo:~> rpm -q R-base
R-base-2.5.0-2.1

Thanks, Stephen Henderson
Re: [R] Replace number with month
Don MacQueen wrote: You can get the names using

month.name[MM]

And it may be necessary to use

factor(month.name[MM], levels = month.name[1:12])

to get them to show up in the correct order in the barchart.

You're crossing the creek to fetch water there, and getting yourself soaked in the process... (by an unnecessary conversion to character, which is subject to alphabetical sorting). I think the canonical way is

factor(MM, levels = 1:12, labels = month.name)

(and the levels = 1:12 may not even be necessary when all 12 months are present)
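The canonical conversion in action (a sketch, assuming MM is a numeric month column as in the thread):

```r
MM <- c(3, 1, 12, 1)   # numeric months, values invented for illustration
m  <- factor(MM, levels = 1:12, labels = month.name)

as.character(m)   # "March" "January" "December" "January"
levels(m)         # calendar order, so barchart panels sort correctly
table(m)          # months with no data are kept as zero counts
```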
Re: [R] FW: Suse RPM installation problem
Stephen Henderson wrote: Hello. I am trying to install the R RPM for Suse 10.0 on an x86_64 PC. However I am failing a dependency for libpng12.so.0 straight away:

PC5-140:/home/rmgzshd # rpm -i R-base-2.5.0-2.1.x86_64.rpm
error: Failed dependencies:
        libpng12.so.0(PNG12_0)(64bit) is needed by R-base-2.5.0-2.1.x86_64

I do seem to have this file:

PC5-140:/home/rmgzshd # whereis libpng12.so.0
libpng12.so: /usr/lib/libpng12.so.0 /usr/local/lib/libpng12.so

but presuming that it is not the 64-bit version mentioned, I went looking for a 64-bit version but could not find it through Google. However, reading the Installation manual I noted that libpng is mentioned in the context of a source build. I therefore downloaded libpng-1.2.18 (v1.2.8 or later is specified in the manual) and successfully compiled it. This did not, however, help with my problem. Any suggestions?

I have:

viggo:~> rpm -qf /usr/lib/libpng12.so.0
libpng-32bit-1.2.12-25
viggo:~> rpm -qf /usr/lib64/libpng12.so.0
libpng-1.2.12-25
viggo:~> rpm -q R-base
R-base-2.5.0-2.1

Thanks, Stephen Henderson
Re: [R] anova on data means
Ronaldo Reis Junior wrote: On Thursday, 21 June 2007 at 16:56, Thomas Miller wrote: I am transitioning from SAS to R and am struggling with a relatively simple analysis. I have tried Venables and Ripley and other guides but can't find a solution. I have an experiment with 12 tanks. Each tank holds 10 fish. The 12 tanks have randomly been assigned one of 4 food treatments: S(tarve), L(ow), M(edium) and H(igh). There are 3 reps of each treatment. I collect data on the size of each fish at the end of the experiment. So my data look like:

Tank Trt Fish Size
   1   S    1  3.4
   1   S    2  3.6
   1   S   10  3.5
   2   L    1  3.4
  12   M   10  2.1

To do the correct test of hypothesis using anova, I need to calculate the tank means and use those in the anova. I have tried using the tapply() and by() functions, but when I do so I lose the treatment level because it is categorical. I have used

Meandat <- tapply(Size, list(Tank, Trt), mean)

But that doesn't give me a data frame that I can then use to do the actual aov analysis. So what is the most efficient way to accomplish the analysis? Thanks, Tom Miller

Tom, try the aggregate function. Something like this:

meandat <- aggregate(Size, list(Tank, Trt), mean)

Why not just include an error term for Tank in the model?

summary(aov(Size ~ Trt + Error(Tank)))

Inte, Ronaldo

-- Prof. Ronaldo Reis Júnior | .''`. UNIMONTES/Depto. Biologia Geral/Lab. de Ecologia | : :' : Campus Universitário Prof. Darcy Ribeiro, Vila Mauricéia | `. `'` CP: 126, CEP: 39401-089, Montes Claros - MG - Brasil | `- Fone: (38) 3229-8187 | [EMAIL PROTECTED] | [EMAIL PROTECTED] | http://www.ppgcb.unimontes.br/ | ICQ#: 5692561 | LinuxUser#: 205366
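Both suggestions from the thread can be sketched on simulated data in the structure Tom describes (12 tanks, 4 treatments x 3 reps, 10 fish per tank; the numbers are invented):

```r
set.seed(42)
fish <- data.frame(
  Tank = factor(rep(1:12, each = 10)),
  Trt  = factor(rep(c("S", "L", "M", "H"), each = 30)),
  Size = rnorm(120, mean = 3)
)

## aggregate() keeps the grouping columns, unlike tapply():
meandat <- aggregate(list(Size = fish$Size),
                     list(Tank = fish$Tank, Trt = fish$Trt), mean)
summary(aov(Size ~ Trt, data = meandat))   # anova on the 12 tank means

## Or keep the raw data and declare the tank stratum directly:
summary(aov(Size ~ Trt + Error(Tank), data = fish))
```

The two give the same F test for Trt (3 and 8 df), since the tank means carry all the between-tank information.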
Re: [R] Got Unexpected ELSE error
Shiazy Fuzzy wrote: Dear R-users, I have a problem with the IF-ELSE syntax. Please look at the following code and tell me what's wrong:

a <- TRUE
if ( a ) {
  cat("TRUE", "\n")
}
else {
  cat("FALSE", "\n")
}

If I try to execute this with R I get:

Error: syntax error, unexpected ELSE in "else"

The strange thing is neither cat instruction is executed!!

For some odd reason this is not actually a FAQ... It is an anomaly of the R (and S) language (or maybe a necessary consequence of its interactive usage) that it tries to complete parsing of expressions as soon as possible, so

2 + 2
+ 5

prints 4 and then 5, whereas

2 + 2 +
5

prints 9. Similarly, when encountered on the command line, if (foo) bar will result in the value of bar if foo is TRUE and otherwise NULL. A subsequent else baz will be interpreted as a new expression, which is invalid because it starts with else. To avoid this effect you can either move the else to the end of the previous line, or put braces around the whole if construct. I.e.

if (foo) {
  bar
} else {
  baz
}

or

if (foo) bar else baz

or

{
  if (foo) bar
  else baz
}

should all work.
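The parsing difference can be demonstrated without the interactive prompt; a sketch using parse() on the two layouts:

```r
## At top level, an 'if' with the 'else' on a new line parses as two
## expressions, and the second one is a syntax error:
bad  <- "if (TRUE) 1\nelse 2"
good <- "if (TRUE) 1 else 2"

try(parse(text = bad))       # Error: unexpected 'else'
length(parse(text = good))   # 1: a single complete expression

## Inside braces the parser keeps reading past the newline,
## so the 'else' is still attached to the 'if':
eval(parse(text = "{ if (FALSE) 1\nelse 2 }"))  # 2
```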
Re: [R] How to compute Wilk's Lambda
Dietrich Trenkler wrote: Dear helpeRs, the following data set comes from Johnson/Wichern: Applied Multivariate Statistical Analysis, 6th ed., pp. 304-306.

X <- structure(c(9, 6, 9, 3, 2, 7), .Dim = as.integer(c(3, 2)))
Y <- structure(c(0, 2, 4, 0), .Dim = as.integer(c(2, 2)))
Z <- structure(c(3, 1, 2, 8, 9, 7), .Dim = as.integer(c(3, 2)))

I would like to compute Wilks' Lambda in R, which I know is 0.0385. How can I do that? I tried

U <- rbind(X, Y, Z)
m <- manova(U ~ rep(1:3, c(3, 2, 3)))
summary(m, test = "Wilks")

which gives

                     Df Wilks approx F num Df den Df  Pr(>F)
rep(1:3, c(3, 2, 3))  1 0.162   12.930      2      5 0.01057 *
Residuals             6
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

I suppose the argument rep(1:3, c(3, 2, 3)) in manova() is not appropriate.

Exactly. If intended as a grouping, you need to turn it into a factor:

m <- manova(U ~ factor(rep(1:3, c(3, 2, 3))))
summary(m, test = "Wilks")
                             Df  Wilks approx F num Df den Df   Pr(>F)
factor(rep(1:3, c(3, 2, 3)))  2 0.0385   8.1989      4      8 0.006234 **
Residuals                     5
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Or, for that matter:

anova(lm(U ~ factor(rep(1:3, c(3, 2, 3)))), test = "Wilks")
Analysis of Variance Table

                             Df Wilks approx F num Df den Df   Pr(>F)
(Intercept)                   1 0.048   39.766      2      4 0.002293 **
factor(rep(1:3, c(3, 2, 3)))  2 0.038    8.199      4      8 0.006234 **
Residuals                     5
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Any help is very much appreciated. Dietrich
Re: [R] how to obtain the OR and 95%CI with 1 SD change of a continue variable
felix wrote: Dear all, how to obtain the odds ratio (OR) and 95% confidence interval (CI) for a 1 standard deviation (SD) change of a continuous variable in logistic regression? For example, to investigate the risk of obesity for stroke: I choose the occurrence of stroke (positive) as the dependent variable, and waist circumference as an independent variable. Then I want to obtain the OR and 95% CI for a 1 SD change of waist circumference. How? Any default package(s) or options in glm available now? If not, how to calculate them by hand?

Unless you want to do something advanced like factoring in the sampling error of the SD (I don't think anyone bothers with that), probably the easiest way is to scale() the predictor and look at the relevant line of exp(confint(glm(....))). As in (library(MASS); example(confint.glm)):

budworm.lg0 <- glm(SF ~ sex + scale(ldose), family = binomial)
exp(confint(budworm.lg0))
Waiting for profiling to be done...
                  2.5 %     97.5 %
(Intercept)  0.2652665  0.7203169
sexM         1.5208018  6.1747207
scale(ldose) 4.3399952 10.8983903

Or, if you insist on getting asymptotic Wald-statistic based intervals:

exp(confint.default(budworm.lg0))
                 2.5 %     97.5 %
(Intercept)  0.269864  0.7294944
sexM         1.496808  6.0384756
scale(ldose) 4.220890 10.5546837

You can also get it from the coefficients of the unscaled analysis, as in:

budworm.lg0 <- glm(SF ~ sex + ldose, family = binomial)
confint(budworm.lg0)
Waiting for profiling to be done...
                 2.5 %    97.5 %
(Intercept) -4.4582430 -2.613736
sexM         0.4192377  1.820464
ldose        0.8229072  1.339086
exp(confint(budworm.lg0)[3,] * sd(ldose))
Waiting for profiling to be done...
    2.5 %    97.5 %
 4.339995 10.898390
Re: [R] removing values from a vector, where both the value and its name are the same?
Patrick Burns wrote: In case it matters, the given solution has a problem if the data look like:

x <- c(sum = 77, test = 99, sum = 99)

By the description all three elements should be kept, but the duplicated solution throws out the last element. A more complicated solution is

unique(data.frame(x, names(x)))

(and then put the vector back together again).

Yes, I was about to say the same.

x[!duplicated(cbind(x, names(x)))]

looks like it might cut the mustard.
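The one-liner works because duplicated() on a matrix operates row-wise, so each row of cbind(value, name) identifies a (value, name) pair:

```r
x <- c(sum = 77, test = 99, sum = 99, sum = 77)  # last element repeats a pair

## cbind(x, names(x)) is a character matrix with one (value, name) row
## per element; duplicated() flags repeated rows:
x[!duplicated(cbind(x, names(x)))]
## keeps sum=77, test=99, sum=99 and drops only the repeated sum=77
```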
Re: [R] Preserving dates in Excel.
Patnaik, Tirthankar wrote: Hi, quick question: Say I have a date variable in a data frame or matrix, and I'd like to preserve the date format when using write.table. However, when I export the data, I get the generic number underlying the date, not the date per se, and numbers such as 11323, 11324, etc. are not meaningful in Excel. Is there any way I can preserve the format of a date on writing into a text file?

Er, what exactly is the problem here?

> d <- data.frame(date = as.Date("2007-6-1") + 1:5, x = rnorm(5))
> d
        date             x
1 2007-06-02  0.7987635130
2 2007-06-03 -0.7381623316
3 2007-06-04 -1.3626708691
4 2007-06-05  0.0007668082
5 2007-06-06  0.6719088533
> write.table(d)
date x
1 2007-06-02 0.798763513018864
2 2007-06-03 -0.738162331606612
3 2007-06-04 -1.36267086906438
4 2007-06-05 0.000766808196322155
5 2007-06-06 0.671908853312511
> write.csv(d)
,date,x
1,2007-06-02,0.798763513018864
2,2007-06-03,-0.738162331606612
3,2007-06-04,-1.36267086906438
4,2007-06-05,0.000766808196322155
5,2007-06-06,0.671908853312511
Re: [R] Problems with na.rm=T
Lucke, Joseph F wrote: Suddenly (e.g. yesterday) all my functions that have na.rm= as a parameter (e.g., mean(), sd(), range(), etc.) have been reporting warnings with na.rm=T. The message is

Warning message:
the condition has length > 1 and only the first element will be used in:
 if (na.rm) x <- x[!is.na(x)]

This has never happened before. I don't recall having done anything that might generate this message. How do I fix this?

Rename the object that you suddenly called T... (And notice that some people will advise you to use na.rm=TRUE to avoid this.)
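How the shadowing produces that message; a sketch (the condition was a warning in 2007-era R and is an error in current versions, so it is caught here with tryCatch()):

```r
## T is an ordinary variable that can be shadowed; TRUE is a reserved word.
x <- c(1, NA, 3)
T <- c(TRUE, FALSE)        # some earlier assignment shadows base::T

## na.rm = T now passes a length-2 logical into `if (na.rm) ...` inside
## mean.default -- a warning in old R, an error in current versions:
tryCatch(mean(x, na.rm = T),
         warning = function(w) conditionMessage(w),
         error   = function(e) conditionMessage(e))

rm(T)                      # remove the shadowing object...
mean(x, na.rm = T)         # 2; T falls through to base::T again
mean(x, na.rm = TRUE)      # 2; immune to shadowing in the first place
```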
Re: [R] Error: bad value ? what is that?
Jose Quesada wrote: Hi, I'm finding a very strange error. For no good reason my R console (Rgui.exe, R 2.5.0, under Win XP) stops producing anything meaningful and just returns

Error: bad value

to _whatever_ I enter. It starts doing this after a while, not immediately when launched. I have to restart R when this happens. No idea why. I didn't change anything in the R config that I remember. Any thoughts? Thanks.

Hmm, that message comes from deep down inside SETCAR() and friends. I can't see other reasons for it than memory corruption. Are you running some rogue C code? Is the machine flaky in other respects?
Re: [R] R Book Advice Needed
Roland Rau wrote: Hi, [EMAIL PROTECTED] wrote: I am new to using R and would appreciate some advice on which books to start with to get up to speed on using R. My background: 1. C# programmer. 2. Programmed directly using IMSL (now Visual Numerics). 3. Used SPSS and Statistica in the past. I put together a list but would like to pick the best of it and avoid redundancy. Any suggestions on these books would be helpful (i.e. too much overlap, poorly written, etc.?) Books:

1. Analysis of Integrated and Co-integrated Time Series with R (Use R) - Bernhard Pfaff
2. An Introduction to R - W. N. Venables
3. Statistics: An Introduction using R - Michael J. Crawley
4. R Graphics (Computer Science and Data Analysis) - Paul Murrell
5. A Handbook of Statistical Analyses Using R - Brian S. Everitt
6. Introductory Statistics with R - Peter Dalgaard
7. Using R for Introductory Statistics - John Verzani
8. Data Analysis and Graphics Using R - John Maindonald
9. Linear Models with R (Texts in Statistical Science) - Julian J. Faraway
10. Analysis of Financial Time Series (Wiley Series in Probability and Statistics), 2nd edition - Ruey S. Tsay

As one other message says, it depends a lot on your ideas of what you want to do with R. And, I'd like to add, on how familiar you are with statistics. One book I am missing in your list is Venables / Ripley: Modern Applied Statistics with S. I can highly recommend it. If you are going to buy yourself only one book, then I would say: buy Venables/Ripley.

And given the programming background, also check out the other V&R book, S Programming. (This is about R too.)
Re: [R] Rounding?
jim holtman wrote: Your number 6.6501 is too large to fit in a floating point number. It takes 56 bits and there are only 54 in a real number, so the system sees it as 6.65 and does the rounding to an even digit: 6.6. 6.651 does fit into a real number (takes 54 bits) and this will now round to 6.7.

Actually, a bit more insidious than that, because 6.65 does not have an exact binary representation. Hence

> round(66.5)
[1] 66
> round(6.65, 1)
[1] 6.7
> round(0.665, 2)
[1] 0.66

Notice that these are from Linux and differ from what you get on Windows.
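The inexact representation can be made visible directly; a sketch (the results for 6.65 and 0.665 are those reported in the thread for Linux and may differ by platform):

```r
## None of these decimals is exactly representable in binary;
## printing more digits shows the value round() actually sees:
sprintf("%.20f", 6.65)
sprintf("%.20f", 0.665)

## round() works on the stored value, not the printed one:
round(6.65, 1)    # 6.7 in the thread's Linux run (stored value above 6.65)
round(0.665, 2)   # 0.66 (stored value just below 0.665)

## Round-half-to-even behaves predictably only for exactly
## representable halves:
round(0.5)        # 0
round(1.5)        # 2
round(2.5)        # 2
```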
Re: [R] Tools For Preparing Data For Analysis
Douglas Bates wrote: Frank Harrell indicated that it is possible to do a lot of difficult data transformation within R itself if you try hard enough, but that sometimes means working against the S language and its whole object view to accomplish what you want, and it can require knowledge of subtle aspects of the S language.

Actually, I think Frank's point was subtly different: It is *because* of the differences in view that it sometimes seems difficult to find the way to do something in R that is apparently straightforward in SAS. I.e. the solutions exist and are often elegant, but may require some lateral thinking. Case in point: finding the first or the last observation for each subject when there are multiple records per subject. The SAS way would be a data step with IF-THEN-DELETE, and a RETAIN statement so that you can compare the subject ID with the one from the previous record, working with data that are sorted appropriately. You can do the same thing in R with a for loop, but there are better ways, e.g. subset(df, !duplicated(ID)) and subset(df, rev(!duplicated(rev(ID)))), or maybe do.call(rbind, lapply(split(df, df$ID), head, 1)), resp. tail. Or something involving aggregate(). (The latter approaches generalize better to other within-subject functionals like cumulative doses, etc.) The hardest cases that I know of are the ones where you need to turn one record into many, such as occurs in survival analysis with time-dependent, piecewise constant covariates. This may require transposing the problem, i.e. for each interval you find out which subjects contribute and with what, whereas the SAS way would be a within-subject loop over intervals containing an OUTPUT statement. Also, there are some really weird data formats, where e.g. the input format is different in different records.
Back in the 80's, when punched-card input was still common, it was quite popular to have one card with background information on a patient plus several cards detailing visits, and you'd get a stack of cards containing both kinds. In R you would most likely split on the card type using grep() and then read the two kinds separately and merge() them later.
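The grep()-and-merge() approach can be sketched as follows (the card layout, field names and values are invented for illustration):

```r
## Mixed-format input: "P" cards carry patient background,
## "V" cards carry one visit each.
lines <- c("P 001 M 1948",   # patient card: id, sex, birth year
           "V 001 1 130",    # visit card: id, visit no., blood pressure
           "V 001 2 128",
           "P 002 F 1952",
           "V 002 1 142")

## Split on the card type, read each kind separately, drop the type column:
patients <- read.table(textConnection(lines[grep("^P", lines)]),
                       col.names = c("type", "id", "sex", "byear"))[-1]
visits   <- read.table(textConnection(lines[grep("^V", lines)]),
                       col.names = c("type", "id", "visit", "bp"))[-1]

## merge() repeats the background information on every visit row:
merge(patients, visits, by = "id")
```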
Re: [R] lme vs. SAS proc mixed. Point estimates and SEs are the same, DFs are different
John Sorkin wrote: R 2.3, Windows XP. I am trying to understand lme. My aim is to run a random effects regression in which the intercept and jweek are random effects. I am comparing output from SAS PROC MIXED with output from R. The point estimates and the SEs are the same; however, the DFs and the p values are different. I am clearly doing something wrong in my R code. I would appreciate any suggestions of how I can change the R code to get the same DFs as are provided by SAS.

This has been hashed over a number of times before. In short:

1) You're not necessarily doing anything wrong
2) SAS PROC MIXED is not necessarily doing it right
3) lme() is _definitely_ not doing it right in some cases
4) both work reasonably in large-sample cases (but beware that this is not equivalent to having many observation points)

SAS has an implementation of the method by Kenward and Roger, which could be the most reliable general DF-calculation method around (I don't trust their Satterthwaite option, though). Getting this or equivalent into lme() has been on the wish list for a while, but it is not a trivial thing to do.

SAS code:

proc mixed data=lipids2;
  model ldl=jweek/solution;
  random int jweek/type=un subject=patient;
  where lastvisit ge 4;
run;

SAS output:

Solution for Fixed Effects
                         Standard
Effect       Estimate       Error     DF   t Value   Pr > |t|
Intercept      113.48      7.4539     25     15.22     <.0001
jweek         -1.7164      0.5153     24     -3.33     0.0028

Type 3 Tests of Fixed Effects
          Num   Den
Effect     DF    DF   F Value   Pr > F
jweek       1    24     11.09   0.0028

R code:

LesNew3 <- groupedData(LDL ~ jweek | Patient, data = as.data.frame(LesData3), FUN = mean)
fit3 <- lme(LDL ~ jweek, data = LesNew3[LesNew3[, "lastvisit"] >= 4, ], random = ~1 + jweek)
summary(fit3)

R output:

Random effects:
 Formula: ~1 + jweek | Patient
 Structure: General positive-definite, Log-Cholesky parametrization

Fixed effects: LDL ~ jweek
                Value Std.Error DF   t-value p-value
(Intercept) 113.47957  7.453921 65 15.224144  0.0000
jweek        -1.71643  0.515361 65 -3.330535  0.0014

John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics, University of Maryland School of Medicine, Division of Gerontology, Baltimore VA Medical Center, 10 North Greene Street, GRECC (BT/18/GR), Baltimore, MD 21201-1524. (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing)
Re: [R] lme vs. SAS proc mixed. Point estimates and SEs are the same, DFs are different
Peter Dalgaard wrote: John Sorkin wrote: R 2.3, Windows XP. I am trying to understand lme. My aim is to run a random effects regression in which the intercept and jweek are random effects. I am comparing output from SAS PROC MIXED with output from R. The point estimates and the SEs are the same; however, the DFs and the p values are different. [...]

This has been hashed over a number of times before. In short:

1) You're not necessarily doing anything wrong
2) SAS PROC MIXED is not necessarily doing it right
3) lme() is _definitely_ not doing it right in some cases
4) both work reasonably in large-sample cases (but beware that this is not equivalent to having many observation points)

SAS has an implementation of the method by Kenward and Roger, which could be the most reliable general DF-calculation method around (I don't trust their Satterthwaite option, though). Getting this or equivalent into lme() has been on the wish list for a while, but it is not a trivial thing to do.

Forgot to say: All DF-based corrections are wrong if you have non-normally distributed data (they depend on the 3rd and 4th moments of the error distribution(s)), although they can be useful as warning signs even in those cases. I also forgot to point to the simulate.lme() function, which can simulate the LR statistics directly.
Re: [R] Lines to plots with a for-loop
Saanisto, Taija wrote: Hello all, I'm plotting several graphs with a for-loop with this code:

par(mfrow=c(3,4))
for(i in levels(fHCGB$code))
  with(subset(fHCGB,code==i),
       plot(pooledPlateIntra, type="b", ylim=ylim, xlab="code", ylab="CV%"))

With which I have no problems. However, I need to add lines to all of these 12 plots, but I cannot get it to work. I've tried for example

par(mfrow=c(3,4))
for(i in levels(fHCGB$code))
  with(subset(fHCGB,code==i),
       plot(pooledPlateIntra, type="b", ylim=ylim, xlab="code", ylab="CV%")
       points(fHCGB$limitVarC, type="b", col="green")))

But run into errors. How can the lines be added?

The with() construct gets a little more complicated if you want to do more than one thing inside:

for(i in levels(fHCGB$code))
  with(subset(fHCGB,code==i), {
    plot(pooledPlateIntra, type="b", ylim=ylim, xlab="code", ylab="CV%")
    points(fHCGB$limitVarC, type="b", col="green")
  })

or, since with() is really only needed for the plot():

for(i in levels(fHCGB$code)) {
  with(subset(fHCGB,code==i),
       plot(pooledPlateIntra, type="b", ylim=ylim, xlab="code", ylab="CV%"))
  points(fHCGB$limitVarC, type="b", col="green")
}

(You might have used lines() rather than points() if you think of it as an added line, but that's a matter of taste, since the two functions only differ in the default for type=.)

-p

Taija Saanisto Biostatistician Quality assurance, Process Development PerkinElmer Life and Analytical Sciences / Wallac Oy Phone: +358-2-2678 741

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
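To make the fix concrete, here is a self-contained version of the second pattern with made-up data; the object names and values are illustrative only, not the poster's actual data. It draws to a scratch PDF so it runs non-interactively:

```r
## Hedged sketch of the loop-plus-lines pattern; all data are invented.
set.seed(1)
dat <- data.frame(
  code = factor(rep(letters[1:4], each = 10)),
  cv   = runif(40, 2, 8)
)
limit <- runif(10, 4, 6)          # a common reference series for every panel

f <- tempfile(fileext = ".pdf")   # draw to a scratch file
pdf(f)
par(mfrow = c(2, 2))
for (i in levels(dat$code)) {
  with(subset(dat, code == i),
       plot(cv, type = "b", ylim = c(0, 10), xlab = i, ylab = "CV%"))
  lines(limit, col = "green")     # added to the panel just drawn
}
dev.off()
```

The key point is that points()/lines() must run after the corresponding plot() call, i.e. inside the loop body, so each addition lands on the panel that was just opened.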
Re: [R] R CMD BATCH command
Austin, Peter wrote: The version of R on our unix system has been updated to version 2.5.0. When I type the following command at the unix prompt:

R CMD BATCH filename

I receive the following error message:

Error in Sys.unsetenv("R_BATCH") : 'Sys.unsetenv' is not available on this system
Execution halted

'R CMD BATCH filename' used to work with the prior version of R that I had installed (version 2.2.0). Is there something that I need to modify for it to work now? Thanks, Peter

A similar problem was found on an old version of Solaris and discussed on this very list on May 14 (use the list archive and look for the thread started by Simon Penel). This could be similar to your problem (but you omitted to tell us what system you are on). __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] getting t.test to work with apply()
Petr Klasterecky wrote: Andrew Yee napsal(a): Hi, I'm interested in using apply() with t.test() on a data.frame. Specifically, I'd like to use apply() to do the following:

t.test(raw.sample[1,alive], raw.sample[1,dead])
t.test(raw.sample[2,alive], raw.sample[2,dead])
t.test(raw.sample[3,alive], raw.sample[3,dead])
etc.

I tried the following,

apply(raw.sample, 1, function(x) t.test(raw.sample[,alive], raw.sample[,dead]))

Two comments:

1) apply() works on arrays. If your dataframe only has numeric values, turn it (or a copy of it) into a matrix via as.matrix(). If it has mixed variables, take only the numeric part for the t-tests. The conversion is made implicitly, but asking for it explicitly cannot hurt.

2) The main problem: you are using the wrong argument to t.test. The call should look like

apply(as.matrix(raw.sample), 1, function(x){t.test(x[alive], x[dead])})

assuming 'alive' and 'dead' are logical vectors of the same length as 'x'.

Petr

Notice also that the other apply-style functions may give an easier route to the goal:

lapply(1:N, function(i) t.test(raw.sample[i,alive], raw.sample[i,dead]))

or (maybe, depending on raw.sample being a data frame and alive/dead being indexing vectors)

mapply(t.test, raw.sample[,alive], raw.sample[,dead])

but it gives me a list of identical results. Thanks, Andrew

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph.
K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] getting t.test to work with apply()
Andrew Yee wrote: Thanks for everyone's suggestions. I did try

results <- apply(raw.sample, 1, function(x) t.test(x[,alive], x[,dead]))

However, I get:

Error in x[, alive] : incorrect number of dimensions

Full disclosure: raw.sample is a data.frame, and I am using alive and dead as indexing vectors. On the other hand, the lapply suggestion works better:

results <- lapply(1:nrow(raw.sample), function(i) t.test(raw.sample[i,alive], raw.sample[i,dead]))

nrow()? Oops, yes. I didn't notice that your data are transposed relative to the usual cases x variables layout. So mapply() is not going to work unless you use as.data.frame(t(raw.sample)) first. -pd

Thanks, Andrew

On 6/4/07, Peter Dalgaard [EMAIL PROTECTED] wrote: Petr Klasterecky wrote: Andrew Yee napsal(a): Hi, I'm interested in using apply() with t.test() on a data.frame. Specifically, I'd like to use apply() to do the following: t.test(raw.sample[1,alive], raw.sample[1,dead]) t.test(raw.sample[2,alive], raw.sample[2,dead]) t.test(raw.sample[3,alive], raw.sample[3,dead]) etc. I tried the following, apply(raw.sample, 1, function(x) t.test(raw.sample[,alive], raw.sample[,dead])) Two comments: 1) apply() works on arrays. If your dataframe only has numeric values, turn it (or a copy of it) into a matrix via as.matrix(). If it has mixed variables, take only the numeric part for the t-tests. The conversion is made implicitly, but asking for it explicitly cannot hurt. 2) The main problem: you are using the wrong argument to t.test. The call should look like apply(as.matrix(raw.sample), 1, function(x){t.test(x[alive], x[dead])}) assuming 'alive' and 'dead' are logical vectors of the same length as 'x'.
Petr

Notice also that the other apply-style functions may give an easier route to the goal: lapply(1:N, function(i) t.test(raw.sample[i,alive], raw.sample[i,dead])) or (maybe, depending on raw.sample being a data frame and alive/dead being indexing vectors) mapply(t.test, raw.sample[,alive], raw.sample[,dead]) but it gives me a list of identical results. Thanks, Andrew

-- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
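Putting the thread's advice together, here is a runnable sketch with invented data; 'alive' and 'dead' are column index vectors here, matching the poster's description of a transposed (cases-in-rows) layout. The unlist() in the lapply route is an addition needed so that a one-row data frame becomes a plain numeric vector before t.test() sees it:

```r
## Hedged sketch: rows are cases, columns are samples; alive/dead select columns.
set.seed(1)
raw.sample <- as.data.frame(matrix(rnorm(50), nrow = 5))
alive <- 1:5     # illustrative column selectors
dead  <- 6:10

## row-wise t-tests via apply() on the matrix version
res <- apply(as.matrix(raw.sample), 1,
             function(x) t.test(x[alive], x[dead]))

## the lapply route; unlist() collapses each one-row data frame to a vector
res2 <- lapply(1:nrow(raw.sample),
               function(i) t.test(unlist(raw.sample[i, alive]),
                                  unlist(raw.sample[i, dead])))

pvals <- sapply(res, function(tt) tt$p.value)   # one p-value per row
```

Both routes run one test per row; extracting components such as p.value from the list of htest objects with sapply() is the usual follow-up.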
Re: [R] recompile R using ActiveTcl
James Foadi wrote: Dear all, while running some code requiring the tcltk package, I have realised that my version of R was compiled with the Tcl/Tk libraries included in Fedora 6. It would be better for me to use the ActiveTcl libraries (which I have under /usr/local), and I'm aware that this probably means recompiling R with the proper configuration variables. But... is it by any chance possible to recompile just the bit affected by Tcl/Tk, for instance to install tcltk with some environment variable pointing at the right ActiveTcl library?

Maybe, but I don't think it is worth the trouble compared to a full rebuild. There are obstacles, e.g. that the Makefile in the package is created from Makefile.in by the top-level configure script. I.e., better to waste some computer resources than your own.

Many thanks for your suggestions and help. J __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
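For the full rebuild, R's configure script can be pointed at specific Tcl/Tk configuration files via the --with-tcl-config and --with-tk-config options (see the R Installation and Administration manual). A sketch, assuming ActiveTcl's tclConfig.sh/tkConfig.sh live under /usr/local/ActiveTcl; the paths are an assumption and should be checked on the actual system:

```shell
# Run from an unpacked R source tree; the ActiveTcl paths below are
# assumptions, not verified, so adjust them to your installation.
./configure \
  --with-tcl-config=/usr/local/ActiveTcl/lib/tclConfig.sh \
  --with-tk-config=/usr/local/ActiveTcl/lib/tkConfig.sh
make
```

After the build, capabilities("tcltk") in the new R should report TRUE if the libraries were picked up.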
Re: [R] Subscript in axis label
Tobias Verbeke wrote: [EMAIL PROTECTED] wrote: Dear R help list members, I am experiencing difficulty in trying to generate a subscript '2' in an axis label. Although I can get the '2' into a subscript using expression(), R then forces me to leave at least one space between the '2' and the following character. My label is supposed to read 'N2O concentration (ppm)', and the space between the '2' and the 'O' makes it look rather inelegant! My code is the following (the comments in it are there to stop me forgetting what I have done; I am new to R):

postscript(file="/Users/patrickmartin/Documents/York Innova Precision/N2Oinnova.eps", horizontal=FALSE, onefile=FALSE, height=4, width=5, pointsize=10)
plot(n2o, lty=0, las=1, xlab="Time", ylab=expression(N[2]~"O concentration (ppm)"))
points(n2o, pch=16) # suppresses line but adds points
dev.off() # turns postscript device off again

Is this better?

plot(1:10, ylab = expression(paste(N[2], "O concentration (ppm)", sep = "")))

Or,

plot(1:10, ylab = expression(N[2]*O~"concentration (ppm)"))

(Because of the ~, you can even do away with expression(), but I think that would be overly sneaky.)

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
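The variants in this thread can be compared side by side. A small self-contained sketch (drawing to a scratch PDF so it runs non-interactively); in plotmath, `~` inserts a space between its operands while `*` juxtaposes them with no space:

```r
## Hedged sketch comparing the plotmath label variants from the thread.
f <- tempfile(fileext = ".pdf")
pdf(f)
par(mfrow = c(3, 1))
plot(1:10, ylab = expression(N[2]~"O concentration (ppm)"))    # '~' adds the unwanted space
plot(1:10, ylab = expression(paste(N[2], "O concentration (ppm)", sep = "")))
plot(1:10, ylab = expression(N[2]*O~"concentration (ppm)"))    # '*' juxtaposes: no space
dev.off()
```

See ?plotmath for the full list of juxtaposition and spacing operators.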
Re: [R] about lex/yacc
elyakhlifi mustapha wrote: hello, what about these functions lex/yacc which can parse and recognize a syntax? thanks

What about them? There are books, notably an O'Reilly one by D. Brown et al., as well as works on parser theory (Aho, Sethi and Ullman, e.g.). (This is more than a bit off-topic for this list.)

-- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.